Analyzing circadian expression data by harmonic regression based on autoregressive spectral estimation

Bioinformatics. 2010 Jun 15;26(12):i168-74. doi: 10.1093/bioinformatics/btq189.

Abstract

Motivation: Circadian rhythms are prevalent in most organisms. Identification of circadian-regulated genes is a crucial step in discovering underlying pathways and processes that are clock-controlled. Such genes are largely detected by searching for periodic patterns in microarray data. However, temporal gene expression profiles are usually short time-series with low sampling frequency and high levels of noise. This makes circadian rhythmic analysis of temporal microarray data very challenging.

Results: We propose an algorithm named ARSER, which combines time domain and frequency domain analysis for extracting and characterizing rhythmic expression profiles from temporal microarray data. ARSER employs autoregressive spectral estimation to predict an expression profile's periodicity from the frequency spectrum and then models the rhythmic patterns by fitting a harmonic regression model to the time-series. ARSER describes the rhythmic patterns by four parameters: period, phase, amplitude and mean level, and assesses significance under multiple testing using false discovery rate q-values. When tested on well-defined periodic and non-periodic short time-series data, ARSER was superior to two existing and widely used methods, COSOPT and Fisher's G-test, at identifying sinusoidal and non-sinusoidal periodic patterns in short, noisy and non-stationary time-series. Finally, analysis of Arabidopsis microarray data using ARSER led to identification of a novel set of previously undetected non-sinusoidal periodic transcripts, which may lead to new insights into molecular mechanisms of circadian rhythms.
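
The two-stage idea described above (estimate the period from an autoregressive spectrum, then fit a harmonic regression at that period) can be illustrated with a minimal sketch. This is not the ARSER source code: the function names `ar_peak_period` and `harmonic_fit`, the Yule-Walker estimator, the AR order, the 20 to 28 h search window and the synthetic data are all assumptions chosen for illustration, and the published tool may differ in its AR fitting method, order selection and preprocessing.

```python
import numpy as np


def ar_peak_period(y, dt, order=4, period_range=(20.0, 28.0), n_grid=401):
    """Return the period (same units as dt) at which a Yule-Walker AR
    spectrum peaks inside period_range. Illustrative helper, not ARSER."""
    x = np.asarray(y, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocovariance estimates for lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Yule-Walker equations: Toeplitz(r[0..order-1]) @ a = r[1..order].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    noise_var = r[0] - a @ r[1:]
    # AR power spectrum: noise_var / |1 - sum_k a_k exp(-i 2 pi f k dt)|^2,
    # evaluated on a grid of candidate periods.
    periods = np.linspace(period_range[0], period_range[1], n_grid)
    lags = np.arange(1, order + 1)
    z = np.exp(-2j * np.pi * np.outer(1.0 / periods, lags) * dt)
    psd = noise_var / np.abs(1.0 - z @ a) ** 2
    return float(periods[np.argmax(psd)])


def harmonic_fit(t, y, period):
    """Least-squares fit of y = mean + A*cos(2*pi*(t - phase)/period).
    Returns (mean, amplitude, phase), with phase as the peak time."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    mean, b_cos, b_sin = np.linalg.lstsq(X, y, rcond=None)[0]
    amplitude = float(np.hypot(b_cos, b_sin))
    phase = float((np.arctan2(b_sin, b_cos) / w) % period)
    return float(mean), amplitude, phase


# Synthetic example (assumed data): 48 h sampled every 4 h, peak at t = 6 h.
rng = np.random.default_rng(0)
t = np.arange(0, 48, 4.0)
y = 5.0 + 2.0 * np.cos(2 * np.pi * (t - 6.0) / 24.0) + rng.normal(0, 0.2, t.size)

period = ar_peak_period(y, dt=4.0)
mean, amplitude, phase = harmonic_fit(t, y, period)
print(f"period={period:.2f} h, mean={mean:.2f}, "
      f"amplitude={amplitude:.2f}, phase={phase:.2f} h")
```

The sketch returns the same four descriptors named in the abstract (period, phase, amplitude and mean level). On series this short, a Yule-Walker period estimate can be noticeably biased, so the harmonic fit is only as good as the period passed to it.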

Availability: ARSER is implemented in Python and R. All source code is available from http://bioinformatics.cau.edu.cn/ARSER.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Arabidopsis / genetics
  • Circadian Rhythm / genetics*
  • Computational Biology / methods*
  • Gene Expression Profiling / methods*
  • Oligonucleotide Array Sequence Analysis / methods
  • Pattern Recognition, Automated / methods
  • Regression Analysis