Abstract
Clone assignment in single-cell genomics remains a challenge due to its diverse mutation macrostructures and many missing signals. Existing statistical methods, for the sake of numerical convergence, pose strong constraints on the form of predicted mutation patterns, so they easily identify sub-optimally fitted clones that overlook weak and rare mutations. To solve this problem, we developed SNPmanifold, a Python package that learns flexible mutation patterns using a shallow binomial variational autoencoder. The latent space of SNPmanifold can effectively represent and visualize complex mutations of SNPs (single-nucleotide polymorphisms) in the form of geometrical manifolds. Based on nuclear or mitochondrial SNPs, we demonstrated that SNPmanifold can effectively identify a large number of multiplexed donors of origin (k = 18) that all existing unsupervised methods fail and lineages of somatic clones with promising biological interpretation. Therefore, SNPmanifold can reveal insights into single-cell SNPs more comprehensively than other existing methods, especially in complex datasets.
Competing Interest Statement
The authors have declared no competing interest.