Sparse representation and Bayesian detection of genome copy number alterations from microarray data

Bioinformatics. 2008 Feb 1;24(3):309-18. doi: 10.1093/bioinformatics/btm601. Epub 2008 Jan 18.

Abstract

Motivation: Genomic instability in cancer leads to abnormal genome copy number alterations (CNA) that are associated with the development and behavior of tumors. Advances in microarray technology have allowed for greater resolution in the detection of DNA copy number changes (amplifications or deletions) across the genome. However, the increase in the number of measured signals and the accompanying noise from the array probes make accurate and fast identification of the breakpoints that define CNA challenging. This article proposes a novel detection technique that uses piecewise constant (PWC) vectors to represent genome copy number and sparse Bayesian learning (SBL) to detect CNA breakpoints.

Methods: First, a compact linear algebra representation for the genome copy number is developed from normalized probe intensities. Second, SBL is applied and optimized to infer locations where copy number changes occur. Third, a backward elimination (BE) procedure is used to rank the inferred breakpoints, and a cut-off point in this procedure can be adjusted efficiently to control the false discovery rate (FDR).
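As a rough illustration of the first and third steps, the Python sketch below (not the GADA implementation) writes a PWC vector as a starting value plus a sparse vector of jumps, then prunes candidate breakpoints by backward elimination. The function names, the t-like score, and the parameters `sigma` and `t_min` are assumptions made for this example. The SBL inference step is not implemented here; every probe position is simply treated as a candidate breakpoint.

```python
import numpy as np


def pwc_to_jumps(x):
    """Sparse view of a PWC vector: first value plus adjacent differences."""
    x = np.asarray(x, dtype=float)
    return x[0], np.diff(x)  # the jump vector is nonzero only at breakpoints


def jumps_to_pwc(x0, w):
    """Inverse mapping: a cumulative sum of the jumps restores the PWC vector."""
    return x0 + np.concatenate(([0.0], np.cumsum(w)))


def backward_elimination(y, candidates, sigma, t_min):
    """Repeatedly drop the weakest breakpoint until all scores reach t_min.

    A breakpoint b means the level changes between y[b-1] and y[b], so
    candidates must lie in 1..len(y)-1. The score is an assumed t-like
    statistic: difference of adjacent segment means over its standard error.
    """
    y = np.asarray(y, dtype=float)
    bps = sorted(set(candidates))
    while bps:
        edges = [0] + bps + [len(y)]
        scores = []
        for k in range(1, len(edges) - 1):
            left = y[edges[k - 1]:edges[k]]
            right = y[edges[k]:edges[k + 1]]
            se = sigma * np.sqrt(1.0 / len(left) + 1.0 / len(right))
            scores.append(abs(right.mean() - left.mean()) / se)
        weakest = int(np.argmin(scores))
        if scores[weakest] >= t_min:
            break  # every remaining breakpoint is strong enough
        del bps[weakest]  # merge the two segments around the weakest one
    return bps


# Simulated example: a gain of one unit between probes 40 and 59.
rng = np.random.default_rng(0)
truth = np.concatenate([np.zeros(40), np.ones(20), np.zeros(40)])
x0, w = pwc_to_jumps(truth)
print("nonzero jumps:", np.count_nonzero(w), "of", w.size)

y = truth + rng.normal(0.0, 0.2, truth.size)  # noisy probe intensities
kept = backward_elimination(y, range(1, y.size), sigma=0.2, t_min=6.0)
print("breakpoints kept:", kept)
```

Raising `t_min` keeps fewer breakpoints, which mirrors the adjustable cut-off described above for trading sensitivity against false discoveries.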

Results: The performance of our algorithm is evaluated using simulated and real genome datasets and compared with other existing techniques. Among the methods compared, our approach achieves the highest accuracy and lowest FDR while improving computational speed by several orders of magnitude. The proposed algorithm has been developed into a free-standing software application (GADA, Genome Alteration Detection Algorithm).

Availability: http://biron.usc.edu/~piquereg/GADA

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Artificial Intelligence
  • Bayes Theorem
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis / methods*
  • DNA, Neoplasm / genetics*
  • Gene Dosage / genetics*
  • Genetic Predisposition to Disease / genetics
  • Genetic Variation / genetics
  • Humans
  • Mutation
  • Neuroblastoma / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated / methods
  • Sequence Analysis, DNA / methods*

Substances

  • DNA, Neoplasm