Small open reading frames: current prediction techniques and future prospect

Curr Protein Pept Sci. 2011 Sep;12(6):503-7. doi: 10.2174/138920311796957667.

Abstract

Evidence is accumulating that small open reading frames (sORFs, <100 codons) play key roles in many important biological processes. Yet they are generally ignored in gene annotation, even though they are far more abundant than genes with more than 100 codons. Here, we demonstrate that popular homology search and codon-index techniques perform poorly for small genes relative to larger genes, while a method dedicated to sORF discovery has a level of accuracy similar to that of homology search. This result is largely due to the small dataset of experimentally verified sORFs available for homology search and for training ab initio techniques. It highlights the urgent need for both experimental and computational studies to further advance the accuracy of sORF prediction.
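
To make the sORF definition above concrete, the following is a minimal Python sketch of how candidate sORFs might be enumerated before any homology or codon-index scoring is applied. It is not one of the methods evaluated in the article. The function name `find_sorfs`, the forward-strand-only scan, the ATG-only start codon, and the convention of counting the start codon but not the stop codon toward the 100-codon limit are all assumptions made for illustration.

```python
# Illustrative sketch only: enumerates candidate sORFs on the forward strand.
# Assumptions (not from the article): ATG is the only start codon, the stop
# codon is not counted toward ORF length, and the first ATG in a frame after
# a stop opens the ORF.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, max_codons=100, min_codons=2):
    """Return (start, end, frame) tuples for ORFs with fewer than max_codons codons.

    start is the 0-based index of the ATG; end is the index just past the stop codon.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None
        # Step through the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3  # start codon included, stop excluded
                if min_codons <= n_codons < max_codons:
                    hits.append((start, i + 3, frame))
                start = None
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence: ATG AAA TTT TAG embedded at offset 2.
    print(find_sorfs("CCATGAAATTTTAGGG"))
```

A scan of this kind yields a very large number of candidates in a real genome, most of which are unlikely to be functional; this is why the scoring step, and the size of the verified training set discussed above, matter for prediction accuracy.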

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Codon / genetics*
  • Computational Biology / methods*
  • Computational Biology / trends
  • Databases, Protein
  • Forecasting
  • Molecular Sequence Annotation / methods*
  • Molecular Sequence Annotation / trends
  • Open Reading Frames / genetics*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae Proteins / genetics

Substances

  • Codon
  • Saccharomyces cerevisiae Proteins