Mapping of conserved RNA secondary structures predicts thousands of functional noncoding RNAs in the human genome

Nat Biotechnol. 2005 Nov;23(11):1383-90. doi: 10.1038/nbt1144.

Abstract

In contrast to the fairly reliable and complete annotation of the protein-coding genes in the human genome, comparable information is lacking for noncoding RNAs (ncRNAs). We present a comparative screen of vertebrate genomes for structural noncoding RNAs, which evaluates conserved genomic DNA sequences for signatures of structural conservation of base-pairing patterns and exceptional thermodynamic stability. We predict more than 30,000 structured RNA elements in the human genome, almost 1,000 of which are conserved across all vertebrates. Roughly a third are found in introns of known genes, a sixth are potential regulatory elements in untranslated regions of protein-coding mRNAs and about half are located far away from any known gene. Only a small fraction of these sequences has been described previously. A comparison with recent tiling array data shows that more than 40% of the predicted structured RNAs overlap with experimentally detected sites of transcription. The widespread conservation of secondary structure points to a large number of functional ncRNAs and cis-acting mRNA structures in the human genome.
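As a rough illustration of what "structural conservation of base-pairing patterns" can mean, the sketch below checks whether two columns of a small RNA alignment can pair in every sequence, and whether the pairing nucleotides differ between sequences (a compensatory change that keeps the pair intact). This is a toy example only: the abstract does not describe the scoring used in the screen, and the function name `column_pair_support`, the set of allowed pairs (Watson-Crick plus G-U wobble) and the three-sequence alignment are all assumptions made for illustration.

```python
# Toy sketch, not the authors' pipeline: all names and data here are hypothetical.

# Pairs treated as able to form a base pair (Watson-Crick plus G-U wobble).
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def column_pair_support(alignment, i, j):
    """Summarise how well alignment columns i and j support a conserved base pair.

    Returns a tuple (fraction, n_types):
      fraction: share of sequences whose nucleotides at i and j can pair
      n_types:  number of distinct pairing combinations observed; a value
                above 1 means the pair is kept while the nucleotides change.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    paired = [p for p in pairs if p in CANONICAL]
    return len(paired) / len(pairs), len(set(paired))


# Invented alignment of a small hairpin: columns 0-8, 1-7 and 2-6 form the stem,
# columns 3-5 form the loop.
alignment = [
    "GGCAAAGCC",
    "GACAAAGUC",  # G-C replaced by A-U at columns 1 and 7
    "GGUAAAACC",  # C-G replaced by U-A at columns 2 and 6
]

for i, j in [(0, 8), (1, 7), (2, 6), (3, 5)]:
    fraction, n_types = column_pair_support(alignment, i, j)
    print(f"columns {i}-{j}: pairable in {fraction:.0%} of sequences, "
          f"{n_types} distinct pair type(s)")
```

In this invented alignment, columns 1-7 and 2-6 pair in every sequence even though the nucleotides differ, which is the kind of signal a comparative screen can use as evidence that a structure is conserved. A real screen would also need a measure of thermodynamic stability, which this sketch does not attempt.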

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Pairing
  • Base Sequence
  • Chromosome Mapping
  • Computational Biology / methods
  • Conserved Sequence
  • Genome, Human*
  • Humans
  • Introns
  • Models, Statistical
  • Nucleic Acid Conformation*
  • Phylogeny
  • RNA / chemistry
  • RNA, Messenger / metabolism
  • RNA, Untranslated / chemistry*
  • Regulatory Elements, Transcriptional
  • Sensitivity and Specificity
  • Sequence Analysis, DNA
  • Thermodynamics
  • Transcription, Genetic

Substances

  • RNA, Messenger
  • RNA, Untranslated
  • RNA