RT Journal Article
SR Electronic
T1 Interpretable variational encoding of genotypes identifies comprehensive clonality and lineages in single cells geometrically
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2024.07.04.602109
DO 10.1101/2024.07.04.602109
A1 Chung, Hoi Man
A1 Huang, Yuanhua
YR 2024
UL http://biorxiv.org/content/early/2024/07/09/2024.07.04.602109.abstract
AB Clone assignment in single-cell genomics remains challenging because mutation macrostructures are diverse and many signals are missing. Existing statistical methods, for the sake of numerical convergence, impose strong constraints on the form of predicted mutation patterns, so they tend to identify sub-optimally fitted clones that overlook weak and rare mutations. To solve this problem, we developed SNPmanifold, a Python package that learns flexible mutation patterns using a shallow binomial variational autoencoder. The latent space of SNPmanifold can effectively represent and visualize complex mutations of single-nucleotide polymorphisms (SNPs) in the form of geometrical manifolds. Based on nuclear or mitochondrial SNPs, we demonstrated that SNPmanifold can effectively identify a large number of multiplexed donors of origin (k = 18), a task at which all existing unsupervised methods fail, as well as lineages of somatic clones with promising biological interpretation. Therefore, SNPmanifold can reveal insights into single-cell SNPs more comprehensively than other existing methods, especially in complex datasets. Competing Interest Statement: The authors have declared no competing interest.