Systematic integration of RNA-Seq statistical algorithms for accurate detection of differential gene expression patterns

Nucleic Acids Res. 2015 Feb 27;43(4):e25. doi: 10.1093/nar/gku1273. Epub 2014 Dec 1.

Abstract

RNA-Seq is gradually becoming the standard tool for transcriptomic expression studies in biological research. Although considerable progress has been made in the development of statistical algorithms for detecting differentially expressed genes from RNA-Seq data, the lists of detected genes can differ significantly between algorithms. We present a new method (PANDORA) that combines multiple algorithms into a summarized result that more efficiently reflects true experimental outcomes. This is achieved by systematically combining several analysis algorithms and weighting their outcomes according to their performance on realistically simulated data sets generated from real data. Analyses of both simulated and real data from different organisms, together with correlation with RNA polymerase II (PolII) occupancy, demonstrate that PANDORA improves the detection of differential expression. It accomplishes this by optimizing the tradeoff between standard performance measurements, such as precision and sensitivity.
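
The weighting idea described above can be illustrated with a minimal sketch. The abstract does not state the combination rule or the performance measure used, so everything specific here is an assumption made for illustration: each algorithm is assumed to report one p-value per gene, each algorithm is assumed to have a single non-negative performance score obtained on simulated data, and the combined value is taken to be a weighted arithmetic mean of the p-values. The function names and the algorithm labels (algo_a, algo_b, algo_c) are hypothetical and do not come from the paper or its software.

```python
from typing import Dict, List, Sequence, Tuple


def performance_weights(scores: Dict[str, float]) -> Dict[str, float]:
    """Normalize per-algorithm performance scores so the weights sum to 1.

    Assumption: a higher score means better performance on simulated data.
    """
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("performance scores must sum to a positive value")
    return {name: score / total for name, score in scores.items()}


def combine_p_values(
    p_values: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Return one combined value per gene as a weighted mean of p-values.

    Assumption: the weighted arithmetic mean is an illustrative choice,
    not necessarily the rule used by the method described above.
    """
    lengths = {len(values) for values in p_values.values()}
    if len(lengths) != 1:
        raise ValueError("every algorithm must report one p-value per gene")
    n_genes = lengths.pop()
    return [
        sum(weights[name] * p_values[name][gene] for name in p_values)
        for gene in range(n_genes)
    ]


def precision_and_sensitivity(
    called: Sequence[bool], truth: Sequence[bool]
) -> Tuple[float, float]:
    """Compute precision TP/(TP+FP) and sensitivity TP/(TP+FN)."""
    tp = sum(1 for c, t in zip(called, truth) if c and t)
    fp = sum(1 for c, t in zip(called, truth) if c and not t)
    fn = sum(1 for c, t in zip(called, truth) if not c and t)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, sensitivity


if __name__ == "__main__":
    # Hypothetical performance scores for three unnamed algorithms.
    weights = performance_weights({"algo_a": 0.9, "algo_b": 0.6, "algo_c": 0.5})
    # Hypothetical p-values for two genes from each algorithm.
    p_values = {
        "algo_a": [0.01, 0.50],
        "algo_b": [0.02, 0.80],
        "algo_c": [0.04, 0.20],
    }
    combined = combine_p_values(p_values, weights)
    called = [p < 0.05 for p in combined]
    print(combined, called)
    print(precision_and_sensitivity(called, [True, False]))
```

In this sketch an algorithm that performs better on simulated data contributes more to the combined value for each gene, and the precision and sensitivity function shows how the tradeoff mentioned above could be measured when the true differential expression status is known, as it is for simulated data.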

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Data Interpretation, Statistical
  • Gene Expression Profiling / methods*
  • RNA Polymerase II / metabolism
  • Sensitivity and Specificity
  • Sequence Analysis, RNA / methods*

Substances

  • RNA Polymerase II