Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL

Abstract

Single-cell barcoding technologies enable genome sequencing of thousands of individual cells in parallel, but with extremely low sequencing coverage (<0.05×) per cell. While the total copy number of large multi-megabase segments can be derived from such data, important allele-specific mutations—such as copy-neutral loss of heterozygosity (LOH) in cancer—are missed. We introduce copy-number haplotype inference in single cells using evolutionary links (CHISEL), a method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across hundreds or thousands of individual cells. We applied CHISEL to ten single-cell sequencing datasets of ~2,000 cells from two patients with breast cancer. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOHs, whole-genome duplications (WGDs) and mirrored-subclonal CNAs. These allele-specific CNAs affect genomic regions containing well-known breast-cancer genes. We also refined the reconstruction of tumor evolution, timing allele-specific CNAs before and after WGDs, identifying low-frequency subpopulations distinguished by unique CNAs and uncovering evidence of convergent evolution.
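The idea of aggregating sparse allelic signal across cells can be illustrated with a small sketch. This is not CHISEL's algorithm; it is a toy example that assumes each cell contributes a handful of reads already assigned to one of two haplotypes (A or B) within a genomic bin, and that the total copy number of the bin is known. The function names and the toy read counts are hypothetical. Pooling reads over cells gives a B-allele frequency (BAF) stable enough to separate a balanced state (1, 1) from copy-neutral LOH (2, 0), even though both have a total copy number of 2.

```python
from typing import List, Tuple


def pooled_baf(counts: List[Tuple[int, int]]) -> float:
    """Pool per-cell (reads_hap_a, reads_hap_b) counts for one bin into a single BAF."""
    total_a = sum(a for a, _ in counts)
    total_b = sum(b for _, b in counts)
    depth = total_a + total_b
    if depth == 0:
        raise ValueError("no allele-informative reads in this bin")
    return total_b / depth


def nearest_allele_pair(total_cn: int, baf: float) -> Tuple[int, int]:
    """Return (a, b) with a + b == total_cn whose expected BAF, b / total_cn,
    is closest to the observed BAF. Ties go to the smaller b."""
    if total_cn == 0:
        return (0, 0)
    b = min(range(total_cn + 1), key=lambda k: abs(k / total_cn - baf))
    return (total_cn - b, b)


# Hypothetical toy data: each cell carries only 0-2 informative reads,
# so no single cell can be classified on its own.
balanced_cells = [(1, 0), (0, 1), (1, 1), (0, 0), (0, 1), (1, 0)]
loh_cells = [(1, 0), (2, 0), (0, 0), (1, 0), (1, 0), (1, 0)]

balanced_state = nearest_allele_pair(2, pooled_baf(balanced_cells))
loh_state = nearest_allele_pair(2, pooled_baf(loh_cells))
print(balanced_state, loh_state)
```

With the toy counts above, both groups of cells have a total copy number of 2, yet the pooled BAF distinguishes them. A real analysis would also have to decide which cells to pool together and how to phase SNPs into haplotypes, which this sketch takes as given.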

Fig. 1: The CHISEL algorithm.
Fig. 2: CHISEL reliably identifies allele-specific copy numbers.
Fig. 3: CHISEL reveals haplotype-specific CNAs and WGDs that shape tumor evolution.
Fig. 4: Reconstruction of tumor heterogeneity and evolution across multiple tumor sections.

Data availability

The sequencing data from 10x Genomics Chromium Single Cell CNV Solution for patient S0 are available at https://support.10xgenomics.com/single-cell-dna/datasets. Raw read counts and phased SNP counts for patient S0 are available at https://doi.org/10.5281/zenodo.3817605 and for patient S1 at https://doi.org/10.5281/zenodo.3817536. The DOP-PCR sequencing data of 89 breast tumor cells are available from the NCBI Sequence Read Archive under accession SRA: SRP114962. All the processed data for all datasets of patients S0 and S1 and for the DOP-PCR data, as well as all the results of CHISEL, are available on GitHub at https://github.com/raphael-group/chisel-data.

Code availability

CHISEL is available on GitHub at https://github.com/raphael-group/chisel and on Code Ocean at https://doi.org/10.24433/CO.6796686.v1.

Acknowledgements

We thank L. Hepler and K. Ganapathy from 10x Genomics for providing additional data for our study, for providing access to the published data of the total copy-number analysis, and for useful feedback. This work is supported by US National Institutes of Health (NIH) grants R01HG007069 and U24CA211000, a US National Science Foundation (NSF) CAREER Award (CCF-1053753) and Chan Zuckerberg Initiative DAF grants 2018-182608 (B.J.R.). Additional support was provided by NIH grant (Rutgers) 2P30CA072720-20, the O’Brien Family Fund for Health Research and the Wilke Family Fund for Innovation (B.J.R.).

Author information

Contributions

S.Z. and B.J.R. conceived the project, developed the theory and algorithms and wrote the paper. S.Z. implemented the algorithms and performed the analyses.

Corresponding author

Correspondence to Benjamin J. Raphael.

Ethics declarations

Competing interests

B.J.R. is a cofounder of, and consultant to, Medley Genomics.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Figs. 1–28, Results 1–4 and Methods 1–12.

Reporting Summary

Cite this article

Zaccaria, S., Raphael, B.J. Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL. Nat Biotechnol 39, 207–214 (2021). https://doi.org/10.1038/s41587-020-0661-6

  • DOI: https://doi.org/10.1038/s41587-020-0661-6
