Abstract
Copy number variation is crucial for deciphering the mechanisms of, and finding cures for, complex disorders and cancers. Recent advances in scDNA sequencing technology make it possible to address intratumor heterogeneity, detect rare subclones, and reconstruct tumor evolution lineages at single-cell resolution. Nevertheless, the current circular binary segmentation based approach fails to identify copy number shifts efficiently and effectively in some exceptional cases. Here, we propose SCYN, a CNV segmentation method powered by dynamic programming. SCYN resolves the precise segmentation on two in silico datasets. We then verified that SCYN infers copy numbers accurately on triple-negative breast cancer scDNA data, using array comparative genomic hybridization results from purified bulk samples as ground truth. We also tested SCYN on two datasets from the newly emerged 10x Genomics CNV solution, where it successfully recognizes gastric cancer cells in the 1% and 10% spike-in 10x datasets. Moreover, SCYN is about 150 times faster than the state-of-the-art tool on datasets of approximately 2000 cells. SCYN robustly and efficiently detects segmentations and infers copy number profiles on single-cell DNA sequencing data, and it serves to reveal intratumor heterogeneity. The source code of SCYN is available at https://github.com/xikanfeng2/SCYN. The visualization tools are hosted at https://sc.deepomics.org/.
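To illustrate the general idea of segmentation by dynamic programming, the following is a minimal sketch and not SCYN's actual implementation. It assumes a single one-dimensional read-depth signal, a squared-error cost per segment, and a fixed per-segment penalty; the abstract does not specify SCYN's objective or model selection, and the function name `segment` and its parameters are hypothetical.

```python
def segment(values, penalty):
    """Return interior breakpoints of a penalized least-squares segmentation.

    Assumption: squared-error segment cost plus a constant penalty per
    segment (an illustrative choice, not the objective used by SCYN).
    """
    n = len(values)

    # Prefix sums of values and squared values give O(1) segment costs.
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(values):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        # Sum of squared deviations from the mean over values[i:j].
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j]: minimal penalized cost of segmenting values[:j].
    # prev[j]: start index of the last segment in that optimal solution.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    # Walk back through prev to recover the segment end positions.
    ends = []
    j = n
    while j > 0:
        ends.append(j)
        j = prev[j]
    ends.reverse()
    return ends[:-1]  # drop the final end (n); keep interior breakpoints


# Toy signal with three constant-depth regions of five bins each.
signal = [2.0] * 5 + [4.0] * 5 + [1.0] * 5
print(segment(signal, penalty=1.0))  # [5, 10]
```

This sketch runs in quadratic time in the number of bins. A larger penalty yields fewer segments, so in practice the penalty would be chosen by a model selection criterion rather than fixed by hand.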
List of abbreviations
- CNV: Copy Number Variation
- scDNA-seq: Single Cell DNA sequencing
- scRNA-seq: Single Cell RNA sequencing
- aCGH: array Comparative Genomic Hybridization
- CBS: Circular Binary Segmentation
- HMM: Hidden Markov Model
- ARI: Adjusted Rand Index
- NMI: Normalized Mutual Information
- JI: Jaccard Index
- mBIC: modified Bayesian Information Criterion