Abstract
Background Copy number alterations (CNAs), because of their large impact on the genome, are an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from shallow-sequencing data of a low-purity tumor sample remains a challenging task.
Results We introduce Accucopy, a CNA-calling method that extends our previous Accurity model with an additional layer to predict both total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) for the tumor genome. Accucopy adopts a tiered Gaussian mixture model coupled with an innovative autocorrelation-guided EM algorithm to find the optimal solution quickly. The Accucopy model utilizes information from both total sequencing coverage and allelic sequencing coverage. Accucopy is implemented in C++/Rust and is available at http://www.yfish.org/software/.
Conclusions We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage, low-purity tumor sequencing data. Through comparative analyses of both simulated and real sequencing samples, we demonstrate that Accucopy is more accurate than existing methods.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Provided additional benchmarks.
Abbreviations
- CNAs: copy number alterations
- TCN: total copy number
- ASCN: allele-specific copy numbers
- SCNAs: somatic copy number alterations
- HGSNVs: heterozygous single-nucleotide loci
- SKY: spectral karyotyping
- LOH: Loss-of-Heterozygosity
- BRCA: breast invasive carcinoma
- COAD: Colon adenocarcinoma
- GBM: Glioblastoma multiforme
- HNSC: Head and Neck squamous cell carcinoma
- PRAD: Prostate adenocarcinoma
- CBS: circular binary segmentation
- MACN: Major Allele Copy Number
- TRE: Tumor Read Enrichment
- LAR: Log ratio of Allelic-coverage Ratios
- CNVs: copy number variations