PT - JOURNAL ARTICLE
AU - Hirofumi Kobayashi
AU - Keith C. Cheveralls
AU - Manuel D. Leonetti
AU - Loic A. Royer
TI - Self-Supervised Deep-Learning Encodes High-Resolution Features of Protein Subcellular Localization
AID - 10.1101/2021.03.29.437595
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.03.29.437595
4099 - http://biorxiv.org/content/early/2021/03/29/2021.03.29.437595.short
4100 - http://biorxiv.org/content/early/2021/03/29/2021.03.29.437595.full
AB - Elucidating the diversity and complexity of protein localization is essential to fully understand cellular architecture. Here, we present cytoself, a deep learning-based approach for fully self-supervised protein localization profiling and clustering. cytoself leverages a self-supervised training scheme that does not require pre-existing knowledge, categories, or annotations. Applying cytoself to images of 1311 endogenously labeled proteins from the recently released OpenCell database creates a highly resolved protein localization atlas. We show that the representations derived from cytoself encapsulate highly specific features that can be used to derive functional insights for proteins on the sole basis of their localization. Finally, to better understand the inner workings of our model, we dissect the emergent features from which our clustering is derived, interpret these features in the context of the fluorescence images, and analyze the performance contributions of the different components of our approach. Competing Interest Statement: The authors have declared no competing interest.