RT Journal Article
SR Electronic
T1 Interactive phenotyping of large-scale histology imaging data with HistomicsML
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 140236
DO 10.1101/140236
A1 Michael Nalisnik
A1 Mohamed Amgad
A1 Sanghoon Lee
A1 Sameer H. Halani
A1 Jose Velazquez Vega
A1 Daniel J Brat
A1 David A Gutman
A1 Lee AD Cooper
YR 2017
UL http://biorxiv.org/content/early/2017/05/19/140236.abstract
AB Whole-slide imaging of histologic sections captures tissue microenvironments and cytologic details in expansive high-resolution images. These images can be mined to extract quantitative features that describe histologic elements, yielding measurements for hundreds of millions of objects. A central challenge in utilizing this data is enabling investigators to train and evaluate classification rules for identifying objects related to processes like angiogenesis or immune response. Here we present HistomicsML, an interactive machine-learning framework for large whole-slide imaging data. HistomicsML uses active learning to direct user feedback, making classifier training efficient and scalable in datasets containing 10^8+ histologic objects. We demonstrate how HistomicsML can be used to phenotype microvascular structures in gliomas to predict survival, and to explore the molecular pathways associated with these phenotypes. Our approach enables researchers to unlock phenotypic information from digital pathology datasets to investigate prognostic image biomarkers and genotype-phenotype associations.