Reconstruction of clonal trees and tumor composition from multi-sample sequencing data

Bioinformatics. 2015 Jun 15;31(12):i62-70. doi: 10.1093/bioinformatics/btv261.

Abstract

Motivation: DNA sequencing of multiple samples from the same tumor provides data to analyze the process of clonal evolution in the population of cells that give rise to a tumor.

Results: We formalize the problem of reconstructing the clonal evolution of a tumor using single-nucleotide mutations as the variant allele frequency (VAF) factorization problem. We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete. We derive an integer linear programming solution to the VAF factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors. The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data, where high read counts for mutations yield high-confidence VAFs.
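
To make the factorization idea concrete, the following is a minimal sketch in Python and not the AncesTree implementation. It assumes details the abstract does not state: each mutation is heterozygous in a diploid region, so that in error-free data the expected VAF matrix can be written as F = 1/2 · U · B, where U (hypothetical name) holds the fraction of each clone in each sample and B is a binary matrix derived from a clonal tree. The function names, the parent-array encoding of the tree, and the example numbers are all illustrative assumptions.

```python
from fractions import Fraction


def vaf(variant_reads, total_reads):
    """Observed VAF of one mutation in one sample: variant reads / total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return Fraction(variant_reads, total_reads)


def clonal_matrix(parent):
    """Binary matrix B for a rooted tree given as parent[j] (None for the root).

    B[i][j] = 1 when clone i carries mutation j, i.e. j is i or an ancestor of i.
    Assumes one new mutation per clone (an illustrative simplification).
    """
    n = len(parent)
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j is not None:
            B[i][j] = 1
            j = parent[j]
    return B


def expected_vaf(U, B):
    """F = 1/2 * U * B under the assumed heterozygous, diploid model."""
    n = len(B)
    half = Fraction(1, 2)
    return [[half * sum(row[i] * B[i][j] for i in range(n)) for j in range(n)]
            for row in U]


def satisfies_sum_condition(F, parent):
    """Check that, in every sample, each mutation's frequency is at least the
    sum of its children's frequencies in the candidate tree."""
    n = len(parent)
    children = {j: [c for c in range(n) if parent[c] == j] for j in range(n)}
    return all(row[j] >= sum(row[c] for c in children[j])
               for row in F for j in range(n))


# Illustrative data: mutation 0 is the root; mutations 1 and 2 are its children.
parent = [None, 0, 0]
# Clone fractions per sample (rows); any remainder is normal cells.
U = [
    [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)],
    [Fraction(1, 10), Fraction(0), Fraction(3, 5)],
]
B = clonal_matrix(parent)
F = expected_vaf(U, B)
```

With these numbers, `satisfies_sum_condition(F, parent)` holds for the branching tree but fails for the chain `[None, 0, 1]`, which shows how multi-sample frequencies can rule out candidate ancestral relationships. Exact fractions are used here only to keep the comparison free of floating-point rounding; real read counts carry sampling error, which is why the abstract describes a probabilistic extension.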

Availability and implementation: An implementation of AncesTree is available at: http://compbio.cs.brown.edu/software.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Clonal Evolution / genetics*
  • Gene Frequency
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Models, Statistical
  • Mutation / genetics*
  • Neoplasms / classification*
  • Neoplasms / genetics*
  • Sequence Analysis, DNA / methods*