Computational identification of rare codons of Escherichia coli based on codon pairs preference

BMC Bioinformatics. 2010 Jan 28:11:61. doi: 10.1186/1471-2105-11-61.

Abstract

Background: Codon bias is believed to play an important role in the control of gene expression. In Escherichia coli, some rare codons, which can limit the expression level of exogenous proteins, have been identified through genetic engineering experiments. Previous studies have confirmed the existence of codon-pair preference in many genomes, but the underlying cause of this bias has not been well established. Here we focus on the patterns of rarely used synonymous codons and introduce a novel method that identifies rare codons in Escherichia coli from codon-pair bias alone.

Results: In Escherichia coli, we defined "rare codon pairs" by calculating the frequency of occurrence of all codon pairs in coding sequences. Rare codons, which are avoided in genes, contribute substantially to the formation of rare codon pairs. Our investigation also showed that many of these rare codon pairs contain termination codons and the recognition sites of restriction enzymes. Furthermore, we developed a new index, F(rare). Comparison with the classical indices revealed a significant negative correlation between F(rare) and the indices that depend on reference datasets.
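The codon-pair counting step described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, and the `max_count` cutoff are assumptions made for illustration, and the abstract does not state the threshold the paper uses to call a pair "rare".

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# All 64 codons, so that pairs never observed in the data can also be reported.
CODONS = ["".join(c) for c in product(BASES, repeat=3)]


def codon_pair_counts(cds_list):
    """Count adjacent in-frame codon pairs across a list of coding sequences."""
    counts = Counter()
    for seq in cds_list:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        codons = [seq[i:i + 3] for i in range(0, usable, 3)]
        counts.update(zip(codons, codons[1:]))
    return counts


def rare_codon_pairs(counts, max_count=0):
    """Return the pairs seen at most max_count times (hypothetical cutoff)."""
    return [pair for pair in product(CODONS, repeat=2)
            if counts[pair] <= max_count]


# Toy input, not real E. coli genes.
toy_cds = ["ATGAAAGGGTAA", "ATGAAATAA"]
pair_counts = codon_pair_counts(toy_cds)
unobserved = rare_codon_pairs(pair_counts, max_count=0)
```

On a real genome, `cds_list` would hold every annotated coding sequence, and the cutoff would need to be chosen against the expected pair frequencies rather than a raw count.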

Conclusions: Our approach suggests that rare codons can be identified by studying the context in which a codon lies. In addition, the frequency of rare codons, F(rare), could be a useful index of codon bias even when expression abundance information is unavailable.
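As a rough illustration of an index of this kind, the sketch below computes the fraction of codons in a gene that belong to a given rare-codon set. The abstract does not give the exact formula for F(rare), so this definition is an assumption, as are the function name `f_rare`, the decision to count the stop codon, and the two-codon example set (AGG and AGA are widely cited rare arginine codons in E. coli, but this is not the paper's list).

```python
def f_rare(cds, rare_codons):
    """Fraction of in-frame codons in cds that are in rare_codons.

    Assumed definition: rare codon count / total codon count.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    return sum(codon in rare_codons for codon in codons) / len(codons)


# Illustrative rare-codon set and toy gene, not taken from the paper.
example_rare = {"AGG", "AGA"}
value = f_rare("ATGAGGAAAAGATAA", example_rare)  # 2 of 5 codons are in the set
```

Because the calculation needs only the gene sequence and a rare-codon set, it requires no reference set of highly expressed genes, which matches the motivation stated in the Conclusions.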

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Codon / genetics*
  • Computational Biology / methods*
  • Databases, Genetic
  • Escherichia coli / genetics*
  • Gene Expression Regulation, Bacterial

Substances

  • Codon