PT - JOURNAL ARTICLE
AU - Lauren Schiff
AU - Bianca Migliori
AU - Ye Chen
AU - Deidre Carter
AU - Caitlyn Bonilla
AU - Jenna Hall
AU - Minjie Fan
AU - Edmund Tam
AU - Sara Ahadi
AU - Brodie Fischbacher
AU - Anton Geraschenko
AU - Christopher J. Hunter
AU - Subhashini Venugopalan
AU - Sean DesMarteau
AU - Arunachalam Narayanaswamy
AU - Selwyn Jacob
AU - Zan Armstrong
AU - Peter Ferrarotto
AU - Brian Williams
AU - Geoff Buckley-Herd
AU - Jon Hazard
AU - Jordan Goldberg
AU - Marc Coram
AU - Reid Otto
AU - Edward A. Baltz
AU - Laura Andres-Martin
AU - Orion Pritchard
AU - Alyssa Duren-Lubanski
AU - Ameya Daigavane
AU - Kathryn Reggio
AU - NYSCF Global Stem Cell Array® Team
AU - Phillip C. Nelson
AU - Michael Frumkin
AU - Susan L. Solomon
AU - Lauren Bauer
AU - Raeka S. Aiyar
AU - Elizabeth Schwarzbach
AU - Scott A. Noggle
AU - Frederick J. Monsma, Jr.
AU - Daniel Paull
AU - Marc Berndl
AU - Samuel J. Yang
AU - Bjarki Johannesson
TI - Integrating deep learning and unbiased automated high-content screening to identify complex disease signatures in human fibroblasts
AID - 10.1101/2020.11.13.380576
DP - 2022 Jan 01
TA - bioRxiv
PG - 2020.11.13.380576
4099 - http://biorxiv.org/content/early/2022/03/16/2020.11.13.380576.short
4100 - http://biorxiv.org/content/early/2022/03/16/2020.11.13.380576.full
AB - Drug discovery for diseases such as Parkinson’s disease is impeded by the lack of screenable cellular phenotypes. We present an unbiased phenotypic profiling platform that combines automated cell culture, high-content imaging, Cell Painting, and deep learning. We applied this platform to primary fibroblasts from 91 Parkinson’s disease patients and matched healthy controls, creating the largest publicly available Cell Painting image dataset to date at 48 terabytes. We use fixed weights from a convolutional deep neural network trained on ImageNet to generate deep embeddings from each image and train machine learning models to detect morphological disease phenotypes.
Our platform’s robustness and sensitivity allow the detection of individual-specific variation with high fidelity across batches and plate layouts. Lastly, our models confidently separate LRRK2 and sporadic Parkinson’s disease lines from healthy controls (receiver operating characteristic area under the curve of 0.79, with a standard deviation of 0.08), supporting the capacity of this platform for complex disease modeling and drug screening applications.
Competing Interest Statement: Y.C., M.F., S.A., A.G., S.V., A.N., Z.A., B.W., J.K., M.C., E.A.B., O.P., A.D., P.C.N., M.F., M.B., and S.J.Y. were employed by Google. M.F., A.G., S.V., A.N., Z.A., B.W., J.K., M.C., E.A.B., O.P., P.C.N., M.F., M.B., and S.J.Y. own Alphabet stock. The remaining authors declare no competing interests.