THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data

Genome Biol. 2013 Jul 29;14(7):R80. doi: 10.1186/gb-2013-14-7-r80.

Abstract

Tumor samples are typically heterogeneous, containing an admixture of normal, non-cancerous cells and one or more subpopulations of cancerous cells. Whole-genome sequencing of a tumor sample yields reads from this mixture, but does not directly reveal the cell of origin for each read. We introduce THetA (Tumor Heterogeneity Analysis), an algorithm that infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations in real and simulated sequencing data. THetA is available at http://compbio.cs.brown.edu/software/.
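
To make the inference problem concrete, here is a minimal sketch in Python of estimating normal admixture from read counts. It is not THetA's implementation, and it rests on simplifying assumptions that do not come from the abstract: the sample is a mixture of normal cells (copy number 2 everywhere) and a single tumor subpopulation, the tumor's integer copy number in each genomic interval is already known, all intervals have equal length, and read counts per interval follow a multinomial distribution with probabilities proportional to the mixture-averaged copy number. The function names, the grid search, and the toy counts are hypothetical. In the setting the abstract describes, the copy numbers are unknown as well and must be inferred together with the proportions.

```python
import math

NORMAL_COPY_NUMBER = 2  # assumption: diploid normal genome in every interval


def expected_fractions(tumor_cn, normal_frac):
    """Expected share of reads per interval for a normal/tumor mixture.

    Assumes equal-length intervals, so read share is proportional to the
    average copy number across the cells in the sample.
    """
    mixed = [normal_frac * NORMAL_COPY_NUMBER + (1 - normal_frac) * c
             for c in tumor_cn]
    total = sum(mixed)
    return [m / total for m in mixed]


def log_likelihood(counts, tumor_cn, normal_frac):
    """Multinomial log-likelihood, omitting the constant coefficient."""
    total = 0.0
    for n, p in zip(counts, expected_fractions(tumor_cn, normal_frac)):
        if n == 0:
            continue  # zero reads contribute nothing
        if p <= 0:
            return float("-inf")  # reads observed where none are possible
        total += n * math.log(p)
    return total


def estimate_normal_fraction(counts, tumor_cn, steps=100):
    """Grid search for the normal fraction that maximizes the likelihood."""
    grid = [i / steps for i in range(steps)]  # 0.00, 0.01, ..., 0.99
    return max(grid, key=lambda f: log_likelihood(counts, tumor_cn, f))


# Toy data (hypothetical): three intervals, tumor copy numbers 2, 3 and 1,
# with counts generated at a normal fraction of 0.4.
tumor_cn = [2, 3, 1]
counts = [2000, 2600, 1400]
estimate = estimate_normal_fraction(counts, tumor_cn)
```

A grid search is used here only because it is easy to read. With unknown copy numbers, the search space also covers candidate copy number profiles, which is the harder part of the problem.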

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Breast Neoplasms / genetics*
  • Computer Simulation
  • Female
  • Genetic Heterogeneity*
  • Genome, Human / genetics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Likelihood Functions
  • Software*
  • Statistics as Topic