RT Journal Article
SR Electronic
T1 Self-Supervised Deep Learning Encodes High-Resolution Features of Protein Subcellular Localization
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.03.29.437595
DO 10.1101/2021.03.29.437595
A1 Hirofumi Kobayashi
A1 Keith C. Cheveralls
A1 Manuel D. Leonetti
A1 Loic A. Royer
YR 2021
UL http://biorxiv.org/content/early/2021/07/03/2021.03.29.437595.abstract
AB Elucidating the diversity and complexity of protein localization is essential to fully understand cellular architecture. Here, we present cytoself, a deep learning-based approach for fully self-supervised protein localization profiling and clustering. cytoself leverages a self-supervised training scheme that does not require pre-existing knowledge, categories, or annotations. Applying cytoself to images of 1311 endogenously labeled proteins from the recently released OpenCell database creates a highly resolved protein localization atlas. We show that the representations derived from cytoself encapsulate highly specific features that can be used to derive functional insights for proteins on the sole basis of their localization. Finally, to better understand the inner workings of our model, we dissect the emergent features from which our clustering is derived, interpret these features in the context of the fluorescence images, and analyze the performance contributions of the different components of our approach. Competing Interest Statement: The authors have declared no competing interest.