Abstract
Rapidly developing single-cell multi-omics sequencing technologies generate increasingly large bodies of multimodal data. Integrating multimodal data from different sequencing technologies, i.e., mosaic data, permits larger-scale investigation with more modalities and can help to better reveal cellular heterogeneity. However, mosaic integration involves major challenges, particularly regarding modality alignment and batch effect removal. Here we present a deep probabilistic framework for the mosaic integration and knowledge transfer (MIDAS) of single-cell multimodal data. MIDAS simultaneously achieves dimensionality reduction, imputation, and batch correction of mosaic data by employing self-supervised modality alignment and information-theoretic latent disentanglement. We demonstrate its reliability and its superiority to other methods by evaluating its performance in full trimodal integration and various mosaic tasks. We also constructed a single-cell trimodal atlas of human peripheral blood mononuclear cells (PBMCs), and tailored transfer learning and reciprocal reference mapping schemes to enable flexible and accurate knowledge transfer from the atlas to new data.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
More comparisons are included.