Abstract
We assessed the pan-cancer predictability of multi-omic biomarkers from haematoxylin and eosin (H&E)-stained whole slide images (WSIs) using deep learning (DL) in a systematic study. A total of 13,443 DL models predicting 4,481 multi-omic biomarkers across 32 cancer types were trained and validated. The investigated biomarkers covered a broad range of genetic, transcriptomic, proteomic, and metabolic alterations, as well as established markers relevant for prognosis, molecular subtypes, and clinical outcomes. Overall, we established the general feasibility of predicting multi-omic markers directly from routine histology images with DL across solid cancer types: 50% of the models achieved an area under the curve (AUC) of more than 0.633, and 25% achieved an AUC larger than 0.711. A wide range of biomarkers was detectable from routine histology images across all investigated cancer types, with a mean AUC of at least 0.62 in almost all malignancies. Strikingly, biomarker predictability was mostly consistent and not dependent on sample size or class ratio, suggesting a degree of true predictability inherent in histomorphology. Together, our results show the potential of DL to predict a multitude of biomarkers across the omics spectrum using only routine slides. This paves the way for accelerating diagnosis and developing more precise treatments for cancer patients.
Competing Interest Statement
S.A., D.M., S.S., X.L., J.H., J.S., A.G., C.B., and P.R-L. are employees of Panakeia Technologies. J.N.K. declares consulting services for Owkin, France and Panakeia Technologies, UK. No other potential conflicts of interest are reported by any of the authors.