Variance stabilization applied to microarray data calibration and to the quantification of differential expression

Bioinformatics. 2002:18 Suppl 1:S96-104. doi: 10.1093/bioinformatics/18.suppl_1.s96.

Abstract

We introduce a statistical model for microarray gene expression data that comprises data calibration, the quantification of differential expression, and the quantification of measurement error. In particular, we derive a transformation h for intensity measurements, and a difference statistic Δh whose variance is approximately constant along the whole intensity range. This forms a basis for statistical inference from microarray data, and provides a rational data pre-processing strategy for multivariate analyses. For the transformation h, the parametric form h(x) = arsinh(a + bx) is derived from a model of the variance-versus-mean dependence for microarray intensity data, using the method of variance stabilizing transformations. For large intensities, h coincides with the logarithmic transformation, and Δh with the log-ratio. The parameters of h together with those of the calibration between experiments are estimated with a robust variant of maximum-likelihood estimation. We demonstrate our approach on data sets from different experimental platforms, including two-colour cDNA arrays and a series of Affymetrix oligonucleotide arrays.
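
The relationship between h and the logarithm can be illustrated with a minimal Python sketch. It relies on the identity arsinh(y) = log(y + sqrt(y^2 + 1)), which behaves like log(2y) for large y, so the constant log 2 cancels in a difference of two transformed values. The parameter values A and B, the function names, and the example intensities are hypothetical choices for illustration only; the abstract states that the parameters are estimated by a robust variant of maximum likelihood, which is not shown here.

```python
import math

# Hypothetical calibration parameters, chosen for illustration only.
# In the method described above they are estimated from the data.
A = -0.5
B = 0.01


def h(x, a=A, b=B):
    """Transformation of the parametric form h(x) = arsinh(a + b*x)."""
    return math.asinh(a + b * x)


def delta_h(x1, x2, a=A, b=B):
    """Difference statistic: h(x1) - h(x2)."""
    return h(x1, a, b) - h(x2, a, b)


def log_ratio(x1, x2):
    """Conventional log-ratio, defined only for positive intensities."""
    return math.log(x1 / x2)


if __name__ == "__main__":
    # High intensities: the difference statistic is close to the log-ratio.
    print("high:", delta_h(40000.0, 10000.0), log_ratio(40000.0, 10000.0))

    # Low intensities: the two statistics diverge, and the difference
    # statistic is much smaller in magnitude than the log-ratio.
    print("low: ", delta_h(40.0, 10.0), log_ratio(40.0, 10.0))

    # Unlike the logarithm, h is defined at zero and at negative values,
    # which can arise after background subtraction.
    print("h(0):", h(0.0), " h(-20):", h(-20.0))
```

With these assumed parameters, the two statistics agree closely for the high-intensity pair and differ substantially for the low-intensity pair, where the log-ratio amplifies small absolute differences.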

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms
  • Analysis of Variance
  • Calibration / standards
  • Data Interpretation, Statistical
  • Gene Expression Profiling / instrumentation*
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / standards
  • Likelihood Functions
  • Models, Genetic*
  • Models, Statistical*
  • Oligonucleotide Array Sequence Analysis / instrumentation*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / standards
  • Reference Standards
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software