FDR control by the BH procedure for two-sided correlated tests with implications to gene expression data analysis

Biom J. 2007 Feb;49(1):107-26. doi: 10.1002/bimj.200510313.

Abstract

The multiple testing problem attributed to gene expression analysis is challenging not only by its size, but also by possible dependence between the expression levels of different genes resulting from coregulations of the genes. Furthermore, the measurement errors of these expression levels may be dependent as well since they are subjected to several technical factors. Multiple testing of such data faces the challenge of correlated test statistics. In such a case, the control of the False Discovery Rate (FDR) is not straightforward, and thus demands new approaches and solutions that will address multiplicity while accounting for this dependency. This paper investigates the effects of dependency between bormal test statistics on FDR control in two-sided testing, using the linear step-up procedure (BH) of Benjamini and Hochberg (1995). The case of two multiple hypotheses is examined first. A simulation study offers primary insight into the behavior of the FDR subjected to different levels of correlation and distance between null and alternative means. A theoretical analysis follows in order to obtain explicit upper bounds to the FDR. These results are then extended to more than two multiple tests, thereby offering a better perspective on the effect of the proportion of false null hypotheses, as well as the structure of the test statistics correlation matrix. An example from gene expression data analysis is presented.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Interpretation, Statistical*
  • Gene Expression Profiling / statistics & numerical data*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data