TY  - JOUR
T1  - ShapeCluster: Applying parametric regression to analyse time-series gene expression data
JF  - bioRxiv
DO  - 10.1101/035782
SP  - 035782
AU  - Philip Law
AU  - Vicky Buchanan-Wollaston
AU  - Andrew Mead
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/01/01/035782.abstract
N2  - High-throughput technologies have made it possible to perform genome-scale analyses to investigate a variety of research areas. From these analyses, vast amounts of data are generated. However, this data can be noisy, which could obscure the underlying signal. Here, a high-throughput regression analysis approach was developed, where a variety of linear and nonlinear parametric models were fitted to gene expression profiles from time course experiments. These models include the logistic, Gompertz, exponential, critical exponential, linear+exponential, Gaussian and linear functions. The fitted parameters from these models reflect aspects of the model shape, and thus allowed for the interpretation of gene expression profiles in terms of the underlying biology, such as the time of initial gene expression. This provides a potentially more mechanistic approach to studying the genetic responses to stimuli. Together with a cluster analysis, termed ShapeCluster, it was possible to group genes based on these aspects of the expression profiles. By investigating different combinations of parameters, this added flexibility to the analysis and allowed for the investigation of the data in multiple ways, including the identification of groups of genes that may be co-regulated, or participate in response to the biological stress in question. Clusters from these methods were assessed for significance through the use of over-represented annotation terms and motifs, and found to produce biologically relevant sets of genes. The ShapeCluster package is available from https://sourceforge.net/projects/shapecluster/.
ER  - 