Denoising array-based comparative genomic hybridization data using wavelets

Biostatistics. 2005 Apr;6(2):211-26. doi: 10.1093/biostatistics/kxi004.

Abstract

Array-based comparative genomic hybridization (array-CGH) provides a high-throughput, high-resolution method to measure relative changes in DNA copy number simultaneously at thousands of genomic loci. Typically, these measurements are reported and displayed linearly on chromosome maps, and gains and losses are detected as deviations from normal diploid cells. We propose that one may consider denoising the data to uncover the true copy number changes before drawing inferences on the patterns of aberrations in the samples. Nonparametric techniques are particularly suitable for data denoising as they do not impose a parametric model in finding structures in the data. In this paper, we employ wavelets to denoise the data as wavelets have sound theoretical properties and a fast computational algorithm, and are particularly well suited for handling the abrupt changes seen in array-CGH data. A simulation study shows that denoising data prior to testing can achieve greater power in detecting the aberrant spot than using the raw data without denoising. Finally, we illustrate the method on two array-CGH data sets.
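As an illustration of the general idea, the following is a minimal sketch of wavelet denoising applied to a vector of log-ratios ordered along a chromosome. The abstract does not state which wavelet family, transform, or thresholding rule the authors use, so every specific here is an assumption made for illustration: an orthonormal Haar transform, hard thresholding, the universal threshold sigma * sqrt(2 log n), and a noise estimate taken from the median absolute deviation of the finest-level coefficients. The function names and the simulated profile are hypothetical and are not taken from the paper.

```python
import math
import random
from statistics import median


def haar_forward(signal):
    """Orthonormal Haar transform of a power-of-two-length sequence.

    Returns (approx, details), where details[0] holds the finest-level
    coefficients and details[-1] the coarsest.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward, rebuilding from the coarsest level down."""
    approx = list(approx)
    for level in reversed(details):
        rebuilt = []
        for s, w in zip(approx, level):
            rebuilt.append((s + w) / math.sqrt(2))
            rebuilt.append((s - w) / math.sqrt(2))
        approx = rebuilt
    return approx


def wavelet_denoise(log_ratios):
    """Hard-threshold Haar detail coefficients (illustrative choices only)."""
    n = len(log_ratios)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two (pad the profile first)")
    approx, details = haar_forward(log_ratios)
    # Assumed noise estimate: MAD of the finest-level coefficients,
    # rescaled to be consistent with a Gaussian standard deviation.
    sigma = median(abs(w) for w in details[0]) / 0.6745
    # Assumed threshold rule: universal threshold.
    threshold = sigma * math.sqrt(2.0 * math.log(n))
    kept = [[w if abs(w) > threshold else 0.0 for w in level] for level in details]
    return haar_inverse(approx, kept)


if __name__ == "__main__":
    # Made-up profile: 64 loci with a single-copy-style gain at loci 24..39.
    random.seed(0)
    truth = [1.0 if 24 <= i < 40 else 0.0 for i in range(64)]
    noisy = [t + random.gauss(0.0, 0.1) for t in truth]
    smooth = wavelet_denoise(noisy)

    def mse(xs):
        return sum((x - t) ** 2 for x, t in zip(xs, truth)) / len(truth)

    print(f"MSE raw: {mse(noisy):.4f}  MSE denoised: {mse(smooth):.4f}")
```

The Haar basis is chosen here only because its piecewise-constant shape matches the abrupt, step-like copy number changes the abstract describes. A decimated transform like this one is sensitive to where a breakpoint falls relative to the dyadic grid, so a real analysis would likely prefer a translation-invariant variant and a tested wavelet library.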

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Breast Neoplasms / genetics
  • Chromosome Aberrations
  • Chromosomes, Artificial, Bacterial / genetics
  • Computer Simulation
  • DNA, Neoplasm / genetics
  • Data Interpretation, Statistical*
  • Female
  • Fibroblasts
  • Gene Dosage*
  • Humans
  • Nucleic Acid Hybridization / methods*
  • Oligonucleotide Array Sequence Analysis / methods*

Substances

  • DNA, Neoplasm