Unveiling clusters of RNA transcript pairs associated with markers of Alzheimer's disease progression

PLoS One. 2012;7(9):e45535. doi: 10.1371/journal.pone.0045535. Epub 2012 Sep 21.

Abstract

Background: One primary goal of transcriptomic studies is identifying gene expression patterns correlating with disease progression. This is usually achieved by considering transcripts that independently pass an arbitrary threshold (e.g. p<0.05). In diseases involving severe perturbations of multiple molecular systems, such as Alzheimer's disease (AD), this univariate approach often results in a large list of seemingly unrelated transcripts. We utilised a powerful multivariate clustering approach to identify clusters of RNA biomarkers strongly associated with markers of AD progression. We discuss the value of considering pairs of transcripts which, in contrast to individual transcripts, helps avoid natural human transcriptome variation that can overshadow disease-related changes.
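The benefit of pairs over single transcripts can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration: the data are simulated, `severity` is a hypothetical progression score, and the pair is summarised as a simple difference, which the abstract does not specify. Two transcripts share a large subject-specific component (standing in for natural inter-individual variation) and shift in opposite directions with disease; the difference cancels the shared component.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    yc = np.asarray(y, dtype=float) - np.mean(y)
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

rng = np.random.default_rng(0)
n_subjects = 31                            # 9 controls + 22 patients, as in the dataset
severity = np.linspace(0.0, 1.0, n_subjects)   # hypothetical progression score

# Large variation shared by both transcripts within each subject (simulated).
subject_effect = rng.normal(0.0, 2.0, n_subjects)

# Small, opposite disease-related shifts plus measurement noise (simulated).
transcript_a = subject_effect + 0.5 * severity + rng.normal(0.0, 0.05, n_subjects)
transcript_b = subject_effect - 0.5 * severity + rng.normal(0.0, 0.05, n_subjects)

r_a = pearson(transcript_a, severity)
r_b = pearson(transcript_b, severity)
# Assumed pair summary: the difference removes the shared subject effect.
r_pair = pearson(transcript_a - transcript_b, severity)

print(f"|r| transcript A: {abs(r_a):.2f}")
print(f"|r| transcript B: {abs(r_b):.2f}")
print(f"|r| pair (A - B): {abs(r_pair):.2f}")
```

In this simulation each transcript alone is dominated by the shared variation, whereas the pair tracks the progression score closely. Real expression data need not behave this cleanly.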

Methodology/principal findings: We re-analysed a dataset of hippocampal transcript levels in nine controls and 22 patients with varying degrees of AD. A large-scale clustering approach determined groups of transcript probe sets that correlate strongly with measures of AD progression, including both clinical and neuropathological measures and quantifiers of the characteristic transcriptome shift from control to severe AD. This enabled identification of restricted groups of highly correlated probe sets from an initial list of 1,372 previously published by our group. We repeated this analysis on an expanded dataset that included all pair-wise combinations of the 1,372 probe sets. As clustering of this massive dataset is unfeasible using standard computational tools, we adapted and re-implemented a clustering algorithm that uses an external-memory algorithmic approach. This identified various pairs that strongly correlated with markers of AD progression and highlighted important biological pathways potentially involved in AD pathogenesis.
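To give a sense of scale, the sketch below counts the unordered pairs among 1,372 probe sets and shows one way to process pair features in fixed-size chunks so that the full pair matrix is never held in memory. It is not the external-memory clustering algorithm used in the study. The function names (`pair_feature_chunks`, `top_pairs`), the use of a difference as the pair feature, the ranking by absolute Pearson correlation, and the toy data are all assumptions made for illustration.

```python
import heapq
from itertools import combinations, islice
from math import comb

import numpy as np

N_PROBE_SETS = 1372
n_pairs = comb(N_PROBE_SETS, 2)   # unordered pairs, assuming order does not matter
print(n_pairs)                    # 940506

def pair_feature_chunks(expr, chunk_size):
    """Yield (index_pairs, features) blocks of at most chunk_size pairs.

    expr is a (probe sets x samples) array; the feature for pair (i, j)
    is assumed here to be expr[i] - expr[j].
    """
    pair_iter = combinations(range(expr.shape[0]), 2)
    while True:
        pairs = list(islice(pair_iter, chunk_size))
        if not pairs:
            return
        idx = np.array(pairs)
        yield idx, expr[idx[:, 0]] - expr[idx[:, 1]]

def top_pairs(expr, marker, k, chunk_size=1000):
    """Return the k pairs whose feature has the largest |Pearson r| with marker."""
    mc = marker - marker.mean()
    best = []
    for idx, feats in pair_feature_chunks(expr, chunk_size):
        fc = feats - feats.mean(axis=1, keepdims=True)
        denom = np.sqrt((fc ** 2).sum(axis=1) * (mc @ mc))
        # Constant features have zero variance; give them r = 0.
        r = np.divide(fc @ mc, denom, out=np.zeros_like(denom), where=denom > 0)
        best.extend((float(abs(v)), int(i), int(j)) for v, (i, j) in zip(r, idx))
        best = heapq.nlargest(k, best)   # keep only the running top k
    return best

# Toy data: 6 probe sets, 31 samples; pair (0, 1) is built to track the marker.
rng = np.random.default_rng(1)
marker = np.linspace(0.0, 1.0, 31)        # hypothetical progression marker
expr = rng.normal(size=(6, 31))
expr[1] = expr[0] - marker                # so expr[0] - expr[1] equals the marker
top = top_pairs(expr, marker, k=3, chunk_size=4)
print(top[0])
```

Only one chunk of pair features and the running top-k list are in memory at any time, which is the basic idea behind out-of-core processing; clustering such data involves considerably more than this ranking step.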

Conclusions/significance: Our analyses demonstrate that, although there exists a relatively large molecular signature of AD progression, only a small number of transcripts recurrently cluster with different markers of AD progression. Furthermore, considering the relationship between two transcripts can highlight important biological relationships that are missed when considering either transcript in isolation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alzheimer Disease / genetics*
  • Alzheimer Disease / pathology
  • Biomarkers
  • Cluster Analysis
  • Computational Biology / methods
  • Databases, Genetic
  • Disease Progression
  • Gene Expression Profiling*
  • Humans
  • Molecular Sequence Annotation
  • Reproducibility of Results
  • Transcriptome*

Substances

  • Biomarkers

Grants and funding

The authors acknowledge the support of the ARC Centre of Excellence in Bioinformatics and of the University of Newcastle, through its funding of the Priority Research Centre for Bioinformatics, Biomarker Discovery and Information-Based Medicine. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.