RT Journal Article
SR Electronic
T1 Construction of continuously expandable single-cell atlases through integration of heterogeneous datasets in a generalized cell-embedding space
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.04.06.438536
DO 10.1101/2021.04.06.438536
A1 Lei Xiong
A1 Kang Tian
A1 Yuzhe Li
A1 Qiangfeng Cliff Zhang
YR 2021
UL http://biorxiv.org/content/early/2021/04/08/2021.04.06.438536.abstract
AB Single-cell RNA-seq and ATAC-seq analyses have been widely applied to decipher cell-type and regulation complexities. However, experimental conditions often confound biological variations when comparing data from different samples. For integrative single-cell data analysis, we have developed SCALEX, a deep generative framework that maps cells into a generalized, batch-invariant cell-embedding space. We demonstrate that SCALEX accurately and efficiently integrates heterogeneous single-cell data using multiple benchmarks. It outperforms competing methods, especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We demonstrate the advantages of SCALEX by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19, which were assembled from multiple data sources and can keep growing through the inclusion of new incoming data. Analyses based on these atlases revealed the complex cellular landscapes of human and mouse tissues and identified multiple peripheral immune subtypes associated with COVID-19 disease severity. Competing Interest Statement: The authors have declared no competing interest.