Abstract
Hematoxylin and eosin (H&E) is a common and inexpensive histopathology assay. Though widely used and information-rich, it does not directly reveal specific molecular markers, which require additional experiments to assess. To address this gap, we present ROSIE, a deep-learning framework that computationally imputes the expression and localization of dozens of proteins from H&E images. Our model is trained on a dataset of over 1000 paired and aligned H&E and multiplex immunofluorescence (mIF) samples from 20 tissues and disease conditions, spanning over 16 million cells. Validation of our in silico mIF staining method on held-out H&E samples demonstrates that the predicted biomarkers are effective in identifying cell phenotypes, particularly in distinguishing lymphocytes such as B cells and T cells, which are not readily discernible with H&E staining alone. Additionally, ROSIE facilitates the robust identification of stromal and epithelial microenvironments and immune cell subtypes such as tumor-infiltrating lymphocytes (TILs), which are important for understanding tumor-immune interactions and can help inform treatment strategies in cancer research.
Competing Interest Statement
Several authors are affiliated with Enable Medicine as employees (M.B., Z.W., A.E.T., A.T.M.), consultants (E.W.), or scientific advisor (J.Z.).
Footnotes
Minor modifications to the manuscript, including references and data availability.