PT  - JOURNAL ARTICLE
AU  - Salim Arslan
AU  - Debapriya Mehrotra
AU  - Julian Schmidt
AU  - Andre Geraldes
AU  - Shikha Singhal
AU  - Julius Hense
AU  - Xiusi Li
AU  - Cher Bass
AU  - Pandu Raharja-Liu
TI  - Large-scale systematic feasibility study on the pan-cancer predictability of multi-omic biomarkers from whole slide images with deep learning
AID  - 10.1101/2022.01.21.477189
DP  - 2022 Jan 01
TA  - bioRxiv
PG  - 2022.01.21.477189
4099  - http://biorxiv.org/content/early/2022/01/23/2022.01.21.477189.short
4100  - http://biorxiv.org/content/early/2022/01/23/2022.01.21.477189.full
AB  - We assessed the pan-cancer predictability of multi-omic biomarkers from haematoxylin and eosin (H&E)-stained whole slide images (WSIs) using deep learning and standard evaluation measures in a systematic study. A total of 13,443 deep learning (DL) models predicting 4,481 multi-omic biomarkers across 32 cancer types were trained and validated. The investigated biomarkers included genetic mutations, transcriptomic (mRNA) and proteomic under- and over-expression status, metabolomic pathways, and established markers relevant for prognosis, including gene expression signatures, molecular subtypes, clinical outcomes and response to treatment. Overall, we established the general feasibility of predicting multi-omic markers across solid cancer types, where 50% of the models could predict biomarkers with an area under the curve (AUC) of more than 0.633 (with 25% of the models having an AUC larger than 0.711). Aggregating across the omic types, our deep learning models achieved the following performance: mean AUC of 0.634 ±0.117 in predicting driver SNV mutations; 0.637 ±0.108 for over-/under-expression of transcriptomic genes; 0.666 ±0.108 for over-/under-expression of proteomes; 0.564 ±0.081 for metabolomic pathways; 0.653 ±0.097 for gene signatures and molecular subtypes; 0.742 ±0.120 for standard of care biomarkers; and 0.671 ±0.120 for clinical outcomes and treatment responses. The biomarkers were shown to be detectable from routine histology images across all investigated cancer types, with aggregate mean AUC exceeding 0.62 in almost all cancers. In addition, we observed that predictability is reproducible within-marker and less dependent on sample size and positivity ratio, indicating a degree of true predictability inherent to the biomarker itself. Competing Interest Statement: The authors have declared no competing interest.