RT Journal Article
SR Electronic
T1 Transcriptomic learning for digital pathology
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 760173
DO 10.1101/760173
A1 Benoît Schmauch
A1 Alberto Romagnoni
A1 Elodie Pronier
A1 Charlie Saillard
A1 Pascale Maillé
A1 Julien Calderaro
A1 Meriem Sefta
A1 Sylvain Toldo
A1 Thomas Clozel
A1 Matahi Moarii
A1 Pierre Courtiol
A1 Gilles Wainrib
YR 2019
UL http://biorxiv.org/content/early/2019/09/11/760173.abstract
AB Deep learning methods for digital pathology analysis have proved an effective way to address multiple clinical questions, from diagnosis to prognosis and even to prediction of treatment outcomes. They have also recently been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We propose a novel approach based on the integration of multiple data modes, and show that our deep learning model, HE2RNA, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without the need for expert annotation. HE2RNA is interpretable by design, opening up new opportunities for virtual staining. In fact, it provides virtual spatialization of gene expression, as validated by double-staining on an independent dataset. Moreover, the transcriptomic representation learned by HE2RNA can be transferred to improve predictive performance for other tasks, particularly for small datasets. As an example of a task with direct clinical impact, we studied the prediction of microsatellite instability from hematoxylin & eosin stained images and our results show that better performance can be achieved in this setting.