Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics

J Mol Biol. 2004 Sep 3;342(1):19-30. doi: 10.1016/j.jmb.2004.07.018.

Abstract

Facing the ever-growing list of newly discovered classes of functional RNAs, it can be expected that further types of functional RNAs are still hidden in recently completed genomes. The computational identification of such RNA genes is, therefore, of major importance. While most known functional RNAs have characteristic secondary structures, their free energies are generally not statistically significant enough to distinguish RNA genes from the genomic background. Additional information is required. Considering the wide availability of new genomic data of closely related species, comparative studies seem to be the most promising approach. Here, we show that prediction of consensus structures of aligned sequences can be a significant measure to detect functional RNAs. We report a new method to test multiple sequence alignments for the existence of an unusually structured and conserved fold. We show for alignments of six types of well-known functional RNA that an energy score consisting of free energy and a covariation term significantly improves sensitivity compared to single sequence predictions. We further test our method on a number of non-coding RNAs from Caenorhabditis elegans/Caenorhabditis briggsae and seven Saccharomyces species. Most RNAs can be detected with high significance. We provide a Perl implementation that can be used readily to score single alignments and discuss how the methods described here can be extended to allow for efficient genome-wide screens.
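The significance test outlined in the abstract can be illustrated with a minimal sketch: score an alignment, score randomized versions of it, and express the native score as a z-score against that background. This is not the authors' Perl implementation. The abstract does not describe the randomization procedure, so plain column shuffling is an assumption, and `toy_score` is a hypothetical stand-in for the real energy score (free energy plus covariation term), which would come from a consensus folding program. All function names are illustrative.

```python
import random
import statistics

# Canonical and wobble base pairs accepted by the toy score.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def shuffle_columns(alignment, rng):
    """Return a copy of the alignment with its columns randomly permuted.

    Assumption: simple column shuffling; the actual method may use a
    more conservative randomization.
    """
    columns = list(zip(*alignment))
    rng.shuffle(columns)
    return ["".join(row) for row in zip(*columns)]


def toy_score(alignment):
    """Hypothetical stand-in for a consensus folding energy.

    Counts mirrored column pairs (i, n-1-i) in which every sequence can
    form a base pair, and returns the negative count so that lower means
    "more structured", like a free energy.
    """
    n = len(alignment[0])
    paired = 0
    for i in range(n // 2):
        j = n - 1 - i
        if all((seq[i], seq[j]) in PAIRS for seq in alignment):
            paired += 1
    return -float(paired)


def z_score(alignment, score_fn, n_samples=100, seed=0):
    """Z-score of the native score against shuffled alignments."""
    rng = random.Random(seed)
    native = score_fn(alignment)
    background = [
        score_fn(shuffle_columns(alignment, rng)) for _ in range(n_samples)
    ]
    mean = statistics.mean(background)
    sd = statistics.stdev(background)
    if sd == 0:
        return 0.0  # degenerate background: no spread to measure against
    return (native - mean) / sd


# Two aligned toy hairpins; the second carries a compensatory change
# (G-C replaced by C-G) that keeps the stem intact.
alignment = ["GGGGAAAACCCC", "GCGGAAAACCGC"]
z = z_score(alignment, toy_score)
```

A strongly negative `z` means the native alignment scores far better than its shuffled versions, which is the kind of signal the abstract describes as an unusually structured and conserved fold. In a real analysis, `score_fn` would wrap a consensus structure prediction rather than this toy pairing count.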

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Base Sequence*
  • Caenorhabditis / genetics
  • Genome
  • Genomics*
  • Molecular Sequence Data
  • Nucleic Acid Conformation*
  • RNA, Untranslated* / chemistry
  • RNA, Untranslated* / genetics
  • RNA, Untranslated* / metabolism
  • Random Allocation
  • Saccharomyces / genetics
  • Sequence Alignment*

Substances

  • RNA, Untranslated