RT Journal Article SR Electronic T1 Generalised empirical Bayesian methods for discovery of differential data in high-throughput biology JF bioRxiv FD Cold Spring Harbor Laboratory SP 011890 DO 10.1101/011890 A1 Thomas J. Hardcastle YR 2015 UL http://biorxiv.org/content/early/2015/04/06/011890.abstract AB Motivation High-throughput data are now commonplace in biological research. Rapidly changing technologies and application mean that novel methods for detecting differential behaviour that account for a ‘large P, small n’ setting are required at an increasing rate. The development of such methods is, in general, being done on an ad hoc basis, requiring further development cycles and a lack of standardization between analyses.Results We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs.Availability The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html.Contact tjh48{at}cam.ac.uk