New Results

Bayesian non-parametric clustering of single-cell mutation profiles

Nico Borgsmüller, Jose Bonet, Francesco Marass, Abel Gonzalez-Perez, Nuria Lopez-Bigas, Niko Beerenwinkel
doi: https://doi.org/10.1101/2020.01.15.907345
Nico Borgsmüller (1, 2)
Jose Bonet (3, 4)
Francesco Marass (1, 2)
Abel Gonzalez-Perez (3, 4)
Nuria Lopez-Bigas (3, 5)
Niko Beerenwinkel (1, 2); for correspondence: niko.beerenwinkel@bsse.ethz.ch

1 Department of Biosystems Science and Engineering, ETH Zürich, 4058 Basel, Switzerland
2 SIB, Swiss Institute of Bioinformatics, 4058 Basel, Switzerland
3 Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac, 10, 08028 Barcelona, Spain
4 Research Program on Biomedical Informatics, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
5 Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

Abstract

The high resolution of single-cell DNA sequencing (scDNA-seq) offers great potential to resolve intra-tumor heterogeneity by distinguishing clonal populations based on their mutation profiles. However, the increasing size of scDNA-seq data sets and technical limitations, such as high error rates and a large proportion of missing values, complicate this task and limit the applicability of existing methods. Here we introduce BnpC, a novel non-parametric method to cluster individual cells into clones and infer their genotypes based on their noisy mutation profiles. BnpC employs a Dirichlet process mixture model coupled with a Markov chain Monte Carlo sampling scheme, including a modified non-conjugate split-merge move and a novel posterior estimator to predict clones and genotypes. We benchmarked our method comprehensively against state-of-the-art methods on simulated data of various sizes and applied it to three cancer scDNA-seq data sets. On simulated data, BnpC compared favorably against current methods in terms of accuracy, runtime, and scalability. On tumor scDNA-seq data, BnpC identified clonal populations that were missed by the original cluster analysis but supported by supplementary experimental data. As scDNA-seq data sets continue to grow, scalable, efficient, and accurate methods such as BnpC will become increasingly relevant, not only to resolve intra-tumor heterogeneity but also as a pre-processing step to reduce data size. BnpC is freely available under the MIT license at https://github.com/cbg-ethz/BnpC.
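
To make the idea of Dirichlet process clustering of noisy mutation profiles more concrete, the following is a minimal illustrative sketch and not the BnpC implementation. It shows the kind of quantity a Gibbs-style sampling step could evaluate: the probability that one cell joins each existing clone or opens a new one. Everything named here is an assumption made for illustration: the function names, binary clone genotypes, a single false-positive rate (`fp_rate`) and false-negative rate (`fn_rate`), a uniform prior over the genotype of a new clone, and missing values encoded as `None`. The split-merge move and the posterior estimator mentioned in the abstract are not covered.

```python
import math


def crp_prior(cluster_sizes, concentration):
    """Chinese restaurant process prior for the next cell.

    Returns the probability of joining each existing cluster, followed by
    the probability of opening a new cluster (last entry).
    """
    total = sum(cluster_sizes) + concentration
    probs = [size / total for size in cluster_sizes]
    probs.append(concentration / total)
    return probs


def site_prob(obs, true_state, fp_rate, fn_rate):
    """P(observed call | true mutation state) under a simple error model."""
    if true_state == 1:
        return 1.0 - fn_rate if obs == 1 else fn_rate
    return fp_rate if obs == 1 else 1.0 - fp_rate


def log_likelihood(observed, genotype, fp_rate, fn_rate):
    """Log-likelihood of one cell's profile given a clone genotype.

    Missing entries (None) are skipped, i.e. treated as uninformative.
    """
    return sum(
        math.log(site_prob(obs, true_state, fp_rate, fn_rate))
        for obs, true_state in zip(observed, genotype)
        if obs is not None
    )


def log_marginal_new(observed, fp_rate, fn_rate):
    """Log-likelihood under a new clone, averaging each site over an
    assumed uniform prior on the unknown genotype (0 or 1)."""
    return sum(
        math.log(0.5 * site_prob(obs, 1, fp_rate, fn_rate)
                 + 0.5 * site_prob(obs, 0, fp_rate, fn_rate))
        for obs in observed
        if obs is not None
    )


def assignment_posterior(observed, genotypes, cluster_sizes,
                         concentration, fp_rate, fn_rate):
    """Posterior over clusters for one cell: prior times likelihood,
    normalized in log space for numerical stability."""
    prior = crp_prior(cluster_sizes, concentration)
    log_w = [
        math.log(prior[k]) + log_likelihood(observed, g, fp_rate, fn_rate)
        for k, g in enumerate(genotypes)
    ]
    log_w.append(math.log(prior[-1])
                 + log_marginal_new(observed, fp_rate, fn_rate))
    peak = max(log_w)
    weights = [math.exp(w - peak) for w in log_w]
    norm = sum(weights)
    return [w / norm for w in weights]
```

For example, `assignment_posterior([1, 1, 0, None], [[1, 1, 0, 0], [0, 0, 1, 1]], [3, 1], 1.0, 0.01, 0.2)` returns three probabilities (two existing clones plus a new one), with most of the mass on the first clone because its genotype matches the observed calls. Sampling an index from this distribution, cell by cell, is one common way to explore clusterings without fixing the number of clones in advance.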

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 15, 2020.
Subject Area

  • Bioinformatics