scAdapt: Virtual adversarial domain adaptation network for single cell RNA-seq data classification across platforms and species

Xiang Zhou; Hua Chai; Yuansong Zeng; Huiying Zhao; Ching-Hsing Luo; Yuedong Yang

doi:10.1101/2021.01.18.427083

Abstract

Motivation In single-cell analyses, cell types are conventionally identified from the expression of known marker genes. Such approaches are time-consuming and irreproducible. Many supervised methods have therefore been developed that draw on the rapidly accumulating public datasets to identify cell types in target datasets. However, these methods are sensitive to batch effects and biological variation, because the data distributions of the source and target datasets differ in cross-platform or cross-species prediction.

Results We developed scAdapt, a virtual adversarial domain adaptation network that transfers cell labels between datasets with batch effects. scAdapt uses both the labeled source data and the unlabeled target data to train an enhanced classifier, and aligns the labeled source centroid with the pseudo-labeled target centroid to generate a joint embedding. We demonstrate that scAdapt outperforms existing methods for classification on simulated, cross-platform, cross-species, and spatial transcriptomic datasets. Further quantitative evaluation and visualization of the aligned embeddings confirm its superiority in cell mixing and in preserving the discriminative cluster structure present in the original datasets.
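The centroid alignment step can be illustrated with a minimal NumPy sketch. It is not the scAdapt implementation. It assumes that centroids are computed per cell type, that target pseudo-labels are the argmax of the classifier output, and that alignment is measured by the mean squared Euclidean distance between matching centroids. The function names `class_centroids` and `centroid_alignment_loss` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def class_centroids(embeddings, labels, n_classes):
    """Mean embedding per class; classes with no cells are left as NaN rows."""
    centroids = np.full((n_classes, embeddings.shape[1]), np.nan)
    for k in range(n_classes):
        mask = labels == k
        if mask.any():
            centroids[k] = embeddings[mask].mean(axis=0)
    return centroids


def centroid_alignment_loss(src_emb, src_labels, tgt_emb, tgt_logits, n_classes):
    """Mean squared distance between source and pseudo-labeled target centroids.

    Assumption for this sketch: pseudo-labels are the argmax of the
    classifier output on the unlabeled target cells.
    """
    tgt_pseudo = tgt_logits.argmax(axis=1)
    src_c = class_centroids(src_emb, src_labels, n_classes)
    tgt_c = class_centroids(tgt_emb, tgt_pseudo, n_classes)
    # Compare only the classes that are present in both domains.
    valid = ~np.isnan(src_c).any(axis=1) & ~np.isnan(tgt_c).any(axis=1)
    if not valid.any():
        return 0.0
    sq_dist = ((src_c[valid] - tgt_c[valid]) ** 2).sum(axis=1)
    return float(sq_dist.mean())
```

In a trainable model, a term of this kind would be computed on the network's embedding layer and added to the classification loss, so that cells of the same type from both datasets are pulled toward a shared location in the joint embedding.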

Availability https://github.com/zhoux85/scAdapt.

Contact angyd25{at}mail.sysu.edu.cn or luojinx5{at}mail.sysu.edu.cn

Competing Interest Statement

The authors have declared no competing interest.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.