Modes and clustering for time-warped gene expression profile data

Bioinformatics. 2003 Oct 12;19(15):1937-44. doi: 10.1093/bioinformatics/btg257.

Abstract

Motivation: The study of the dynamics of regulatory processes has led to increased interest in the analysis of temporal gene expression level data. To address the dynamics of regulation, expression data are collected repeatedly over time. The resulting high-dimensional data are difficult to represent statistically. When regulatory processes determine gene expression, time-warping is likely to be present, i.e. the sample of gene expression trajectories reflects variation not only in the expression amplitudes, but also in the temporal structure of gene expression.

Results: A non-parametric time-synchronized iterative mean updating technique is proposed to find an overall representation that corresponds to a mode of a sample of expression profiles, viewed as a random sample in function space. The proposed algorithm explores the application of previous work by Hall and Heckman to genome-wide expression data and provides an extension that includes random time-warping, with the aim of synchronizing timescales across genes. The proposed algorithm is universally applicable for the construction of modes for functional data with time-warping. We demonstrate the construction of mode functions for a sample of Drosophila gene expression data. As illustrated in the numerical example, the algorithm can be applied to define clusters among the observed trajectories of gene expression without any prior non-time-warped clustering.
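The general idea of iterative mean updating with time synchronization can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the warping family, the kernel or the stopping rule, so the sketch assumes that warping is a pure time shift chosen from a grid, that weights come from a Gaussian kernel on the root-mean-square distance after alignment, and that all curves are sampled on a common time grid. The names `align_by_shift` and `warped_mode`, and all parameter values, are hypothetical.

```python
import numpy as np


def align_by_shift(curve, target, t, shifts):
    """Return the time-shifted copy of `curve` closest to `target`, and its distance.

    Assumption: warping is restricted to a pure shift s, i.e. t -> t + s.
    """
    best, best_dist = curve, np.inf
    for s in shifts:
        # np.interp holds the end values constant outside the range of t.
        shifted = np.interp(t + s, t, curve)
        dist = np.sqrt(np.mean((shifted - target) ** 2))
        if dist < best_dist:
            best, best_dist = shifted, dist
    return best, best_dist


def warped_mode(curves, t, start, shifts, bandwidth, n_iter=50, tol=1e-8):
    """Iterate a kernel-weighted mean of curves aligned to the current estimate."""
    estimate = np.asarray(start, dtype=float).copy()
    for _ in range(n_iter):
        # Synchronize every curve to the current estimate.
        pairs = [align_by_shift(c, estimate, t, shifts) for c in curves]
        aligned = np.array([p[0] for p in pairs])
        dists = np.array([p[1] for p in pairs])
        # Gaussian kernel weights (an assumption): nearby curves count more.
        weights = np.exp(-0.5 * (dists / bandwidth) ** 2)
        update = weights @ aligned / weights.sum()
        if np.max(np.abs(update - estimate)) < tol:
            return update
        estimate = update
    return estimate


# Synthetic example: one bump-shaped profile observed with different delays.
t = np.linspace(0.0, 1.0, 101)
delays = [-0.10, -0.05, 0.0, 0.05, 0.10]
curves = np.array([np.exp(-((t - 0.5 - d) / 0.1) ** 2) for d in delays])
shifts = np.linspace(-0.2, 0.2, 41)

mode = warped_mode(curves, t, start=curves[0], shifts=shifts, bandwidth=0.5)
```

In this synthetic example the curves differ only by a delay, so after alignment they coincide and the estimate stays at the shape of the starting curve. One plausible way to obtain clusters, again an assumption and not taken from the abstract, is to start the iteration from each observed curve and group curves whose iterations end at the same mode.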

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Adaptation, Physiological / genetics
  • Aging / genetics
  • Algorithms*
  • Animals
  • Cluster Analysis*
  • Databases as Topic
  • Drosophila / genetics*
  • Drosophila / growth & development*
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / physiology*
  • Life Cycle Stages / genetics*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Pattern Recognition, Automated
  • Time Factors