Reconstructing tumor evolutionary histories and clone trees in polynomial-time with SubMARine

Linda K. Sundermann, Jeff Wintersinger, Gunnar Rätsch, Jens Stoye, Quaid Morris
doi: https://doi.org/10.1101/2020.06.11.146100
Linda K. Sundermann
1Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, Canada
2Vector Institute for Artificial Intelligence, Toronto, Canada
Jeff Wintersinger
1Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, Canada
2Vector Institute for Artificial Intelligence, Toronto, Canada
Gunnar Rätsch
3Department of Computer Science, ETH Zurich, Switzerland
4Biomedical Informatics, University Hospital Zurich, Switzerland
Jens Stoye
5Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Germany
Quaid Morris
1Donnelly Centre for Cellular + Biomolecular Research, University of Toronto, Canada
2Vector Institute for Artificial Intelligence, Toronto, Canada
6Ontario Institute for Cancer Research, Toronto, Canada
7Memorial Sloan Kettering Cancer Center, New York City, USA
  • For correspondence: morrisq@mskcc.org

Abstract

Tumors contain multiple subpopulations of genetically distinct cancer cells. Reconstructing their evolutionary history can improve our understanding of how cancers develop and respond to treatment. Subclonal reconstruction methods cluster mutations into groups that co-occur within the same subpopulations, estimate the frequency of cells belonging to each subpopulation, and infer the ancestral relationships among the subpopulations by constructing a clone tree. However, multiple clone trees are often consistent with the data, and current methods neither capture this uncertainty efficiently nor scale to clone trees with a large number of subclonal populations.

Here, we formalize the notion of a partial clone tree that defines a subset of the pairwise ancestral relationships in a clone tree, thereby implicitly representing the set of all clone trees that have these defined pairwise relationships. We also introduce a special partial clone tree, the Maximally-Constrained Ancestral Reconstruction (MAR), which summarizes all clone trees fitting the input data equally well. Finally, we extend commonly used clone tree validity conditions to apply to partial clone trees and describe SubMARine, a polynomial-time algorithm that produces the subMAR, which approximates the MAR and guarantees that its defined relationships are a subset of those present in the MAR. We also extend SubMARine to work with subclonal copy number aberrations and define equivalence constraints for this purpose. In contrast with other clone tree reconstruction methods, SubMARine runs in time and space that scale polynomially in the number of subclones.
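
As an illustration of what it means to summarize all clone trees that fit the data, the following is a minimal brute-force sketch in Python. It is not SubMARine: it enumerates every tree explicitly and therefore scales exponentially, which is exactly what a polynomial-time method avoids. Several things here are assumptions of the sketch rather than details taken from the paper: the function names (`ancestry`, `valid_trees`, `summarize`), the use of the widely used sum rule (the frequencies of a clone's children may not exceed the clone's own frequency in any sample) as the only validity condition, the requirement that a parent has a smaller index than its child, the encoding 1 / 0 / -1 for "ancestor", "not ancestor", and "undefined", and the toy frequencies.

```python
from itertools import product

def ancestry(parents):
    """anc[i][j] is True when clone i is a proper ancestor of clone j.
    parents[j] is the parent of clone j; clone 0 is the root (parent None)."""
    k = len(parents)
    anc = [[False] * k for _ in range(k)]
    for j in range(1, k):
        p = parents[j]
        while p is not None:
            anc[p][j] = True
            p = parents[p]
    return anc

def valid_trees(freqs, eps=1e-9):
    """Enumerate all trees that satisfy the sum rule in every sample.
    freqs[c][s] is the fraction of cells in sample s that carry the
    mutations of clone c (assumed input format). Parents are restricted
    to smaller indices, so every candidate is a tree rooted at clone 0."""
    k, n_samples = len(freqs), len(freqs[0])
    trees = []
    for choice in product(*(range(j) for j in range(1, k))):
        parents = (None,) + choice
        ok = all(
            sum(freqs[c][s] for c in range(1, k) if parents[c] == p)
            <= freqs[p][s] + eps
            for p in range(k)
            for s in range(n_samples)
        )
        if ok:
            trees.append(parents)
    return trees

def summarize(trees):
    """Keep a relationship only if all trees agree on it; otherwise -1."""
    k = len(trees[0])
    mats = [ancestry(t) for t in trees]
    summary = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            values = {m[i][j] for m in mats}
            summary[i][j] = -1 if len(values) > 1 else int(values.pop())
    return summary

# Toy input: four clones (clone 0 is the root), two samples.
freqs = [[1.0, 1.0], [0.6, 0.5], [0.5, 0.3], [0.2, 0.3]]
trees = valid_trees(freqs)
print(trees)             # [(None, 0, 1, 0), (None, 0, 1, 2)]
for row in summarize(trees):
    print(row)
```

In this toy input, two trees satisfy the sum rule. They agree that clone 1 is an ancestor of clone 2 but disagree on whether clones 1 and 2 are ancestors of clone 3, so those two entries of the summary are left undefined (-1).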

We show through extensive simulation and a large lung cancer dataset that the subMAR equals the MAR in > 99.9% of cases where only a single clone tree exists and that it is a perfect match to the MAR in most of the other cases. Notably, SubMARine runs in less than 70 seconds on a single thread with less than one GB of memory on all datasets presented in this paper, including ones with 50 nodes in a clone tree.

The freely-available open-source code implementing SubMARine can be downloaded at https://github.com/morrislab/submarine.

Author summary

Cancer cells accumulate mutations over time and consist of genetically distinct subpopulations. Their evolutionary history (as represented by tumor phylogenies) can be inferred from bulk cancer genome sequencing data. Current tumor phylogeny reconstruction methods have two main issues: they are slow, and they do not efficiently represent uncertainty in the reconstruction.

To address these issues, we developed SubMARine, a fast algorithm that summarizes all valid phylogenies in an intuitive format. SubMARine solved all reconstruction problems in this manuscript in less than 70 seconds, orders of magnitude faster than other methods. These reconstruction problems included those with up to 50 subclones, which are too large for other algorithms to even attempt. SubMARine achieves these results because, unlike other algorithms, it performs its reconstruction by identifying an upper bound on the solution set of trees. In the vast majority of cases, this upper bound is tight: when only a single solution exists, SubMARine converges to it > 99.9% of the time; when multiple solutions exist, our algorithm correctly recovers the uncertain relationships in more than 80% of cases.

In addition to solving these two major challenges, we introduce new concepts and open research problems in the field of tumor phylogeny reconstruction. Specifically, we formalize the concept of a partial clone tree, which provides a set of constraints on the solution set of clone trees, and we provide a complete set of conditions under which a partial clone tree is valid. These conditions guarantee that all trees in the solution set satisfy the constraints implied by the partial clone tree.
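
To make concrete how a partial clone tree acts as a set of constraints on a solution set, here is a small Python sketch that lists every tree compatible with a given partial ancestry matrix. It ignores the frequency data and the validity conditions described above, and it enumerates trees by brute force, so it only illustrates the concept. The matrix encoding (1 = ancestor, 0 = not an ancestor, -1 = undefined), the names `is_ancestor` and `compatible_trees`, the restriction that a parent has a smaller index than its child, and the example matrix are all assumptions of this sketch.

```python
from itertools import product

UNDEFINED = -1

def is_ancestor(parents, i, j):
    """True when clone i lies on the path from clone j up to the root."""
    p = parents[j]
    while p is not None:
        if p == i:
            return True
        p = parents[p]
    return False

def compatible_trees(partial):
    """Return every tree whose ancestry agrees with all defined entries.
    A tree is a tuple of parents; clone 0 is the root (parent None)."""
    k = len(partial)
    result = []
    for choice in product(*(range(j) for j in range(1, k))):
        parents = (None,) + choice
        if all(
            partial[i][j] == UNDEFINED
            or partial[i][j] == int(is_ancestor(parents, i, j))
            for i in range(k)
            for j in range(k)
        ):
            result.append(parents)
    return result

# Hypothetical partial clone tree over four clones: clone 1 is known to be
# an ancestor of clone 2, while the placement of clone 3 below clones 1 and 2
# is left undefined.
partial = [
    [0, 1, 1, 1],
    [0, 0, 1, UNDEFINED],
    [0, 0, 0, UNDEFINED],
    [0, 0, 0, 0],
]
for tree in compatible_trees(partial):
    print(tree)
# (None, 0, 1, 0)
# (None, 0, 1, 1)
# (None, 0, 1, 2)
```

Each undefined entry widens the set of trees the partial clone tree stands for; setting both undefined entries to 0 would leave only the first tree.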

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • * MorrisQ@mskcc.org

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted June 12, 2020.

Subject Area

  • Bioinformatics