Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach

BMC Bioinformatics. 2005 Jun 29;6:165. doi: 10.1186/1471-2105-6-165.

Abstract

Background: In testing for differential gene expression involving multiple serial analysis of gene expression (SAGE) libraries, it is critical to account for both between-library and within-library variation. Several methods have been proposed, including the t test, the tw test, and an overdispersed logistic regression approach. The merits of these tests, however, have not been fully evaluated, and it remains unclear whether further improvements can be made.

Results: In this article, we introduce an overdispersed log-linear model approach to analyzing SAGE data, and we compare its performance with that of three other tests: the two-sample t test, the tw test, and a test based on overdispersed logistic regression. Analysis of simulated and real datasets shows that both the log-linear and logistic overdispersion methods generally perform better than the t and tw tests. The log-linear method further outperforms the logistic method, showing equal or higher statistical power over a range of parameter values and with different data distributions.
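As an illustration of the general idea, the following is a minimal sketch of an overdispersed log-linear comparison for a single tag across two groups of libraries. It is not the authors' implementation: it assumes a Poisson log-linear model with log(library size) as an offset and a single group indicator, and it estimates the overdispersion by the Pearson chi-square divided by its degrees of freedom (a quasi-likelihood choice that the abstract does not specify). The function name, argument names, and the decision to floor the dispersion at 1 are all hypothetical.

```python
import math


def overdispersed_loglinear_test(counts_a, sizes_a, counts_b, sizes_b):
    """Compare one tag between two groups of SAGE libraries (illustrative sketch).

    Assumed model: log E[count_i] = log(size_i) + b0 + b1 * group_i, with
    Var[count_i] = phi * E[count_i]. For this two-group design the Poisson
    maximum-likelihood fit has a closed form, so no iterative solver is needed.
    Returns (log_fold_change, dispersion, t_statistic, degrees_of_freedom).
    """
    sum_a, sum_b = sum(counts_a), sum(counts_b)
    if sum_a == 0 or sum_b == 0:
        raise ValueError("each group needs at least one observed tag count")
    df = len(counts_a) + len(counts_b) - 2
    if df <= 0:
        raise ValueError("need more than two libraries in total")

    # Fitted per-group rates: pooled tag count over pooled library size.
    rate_a = sum_a / sum(sizes_a)
    rate_b = sum_b / sum(sizes_b)
    log_fold_change = math.log(rate_b / rate_a)  # estimate of b1

    # Pearson chi-square measures between-library variation beyond Poisson.
    pearson = 0.0
    for counts, sizes, rate in ((counts_a, sizes_a, rate_a),
                                (counts_b, sizes_b, rate_b)):
        for y, n in zip(counts, sizes):
            mu = n * rate
            pearson += (y - mu) ** 2 / mu

    # Assumption: never let the dispersion fall below the Poisson value of 1.
    dispersion = max(1.0, pearson / df)

    # Poisson variance of b1 is 1/sum_a + 1/sum_b; inflate it by the dispersion.
    std_error = math.sqrt(dispersion * (1.0 / sum_a + 1.0 / sum_b))
    return log_fold_change, dispersion, log_fold_change / std_error, df


# Hypothetical data: two libraries per group, 1000 tags sequenced in each.
result = overdispersed_loglinear_test([5, 15], [1000, 1000], [20, 20], [1000, 1000])
print(result)
```

The returned statistic would typically be referred to a t distribution with the reported degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.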

Conclusion: Overdispersed log-linear models provide an attractive and reliable framework for analyzing SAGE experiments involving multiple libraries. For convenience, an implementation of this method is available through a user-friendly web interface at http://www.cbcb.duke.edu/sage.

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Carcinoma, Pancreatic Ductal / genetics
  • Cell Line, Tumor
  • Gene Expression Profiling / methods*
  • Gene Expression Profiling / statistics & numerical data*
  • Gene Library*
  • Humans
  • Internet
  • Linear Models*
  • Pancreatic Neoplasms / genetics
  • RNA, Messenger / analysis
  • ROC Curve
  • User-Computer Interface

Substances

  • RNA, Messenger