RT Journal Article
SR Electronic
T1 Towards Universal Cell Embeddings: Integrating Single-cell RNA-seq Datasets across Species with SATURN
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2023.02.03.526939
DO 10.1101/2023.02.03.526939
A1 Rosen, Yanay
A1 Brbić, Maria
A1 Roohani, Yusuf
A1 Swanson, Kyle
A1 Li, Ziang
A1 Leskovec, Jure
YR 2023
UL http://biorxiv.org/content/early/2023/09/24/2023.02.03.526939.abstract
AB Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes’ biological properties using protein language models. By coupling protein embeddings from language models with RNA expression, SATURN integrates datasets profiled from different species regardless of their genomic similarity. SATURN has a unique ability to detect functionally related genes co-expressed across species, redefining differential expression for cross-species analysis. We apply SATURN to three-species whole-organism atlases and frog and zebrafish embryogenesis datasets. We show that cell embeddings learnt in SATURN can be effectively used to transfer annotations across species and identify both homologous and species-specific cell types, even across evolutionarily remote species. Finally, we use SATURN to reannotate the five-species Cell Atlas of Human Trabecular Meshwork and Aqueous Outflow Structures and find evidence of potentially divergent functions between glaucoma-associated genes in humans and other species. Competing Interest Statement: The authors have declared no competing interest.