Enhanced effective codon numbers to understand codon usage bias

Reginald Smith

doi:10.1101/644609

Abstract

Codon usage bias is a well-recognized phenomenon, but the relative influences of its major causes (G+C content, mutational biases, and selection) are often difficult to disentangle. This paper presents methods to calculate modified effective codon numbers that allow investigation of the relative strength of each of these forces and of how they shape the codon biases of genes or organisms. In particular, it demonstrates that variation in codon usage bias across organisms is likely driven more by mutational forces, while variation in codon usage bias within genomes is likely driven by selective forces.

Author summary

This paper describes a new method for disaggregating the influences on codon bias (G+C content, mutational biases, and selection). I show how different values of the effective codon number, following Wright’s N_c, can be used as ratios to reveal whether codon biases across genes or organisms have similar or different causes. By calculating ratios of the different types of effective codon numbers, one can easily compare organisms or different genes while controlling for G+C content or relative mutational biases. The driving forces behind the variation in codon usage bias across or within organisms thus become much clearer.
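
For orientation, the baseline quantity that the modified effective codon numbers build on can be sketched in a few lines. The block below is a minimal illustration of Wright's original N_c for the standard genetic code only; it is not the modified calculation introduced in this paper. The function name `wright_nc`, the choice to skip amino acids with zero homozygosity, and the fallback for a missing three-fold class (the mean of the two-fold and four-fold values, a common convention) are assumptions made for this sketch.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def wright_nc(seq):
    """Sketch of Wright's effective number of codons for an in-frame sequence."""
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # Count sense codons only; stop codons and unrecognized triplets are ignored.
    counts = Counter(c for c in codons if CODON_TO_AA.get(c, "*") != "*")

    # Group observed codon counts by the amino acid they encode.
    by_aa = defaultdict(list)
    for codon, n in counts.items():
        by_aa[CODON_TO_AA[codon]].append(n)

    # Number of synonymous codons per amino acid in the standard code.
    degeneracy = Counter(aa for aa in CODON_TO_AA.values() if aa != "*")

    # Homozygosity F = (n * sum(p_i^2) - 1) / (n - 1), collected per degeneracy class.
    f_by_class = defaultdict(list)
    for aa, ns in by_aa.items():
        n, k = sum(ns), degeneracy[aa]
        if k > 1 and n > 1:
            f = (n * sum((x / n) ** 2 for x in ns) - 1) / (n - 1)
            if f > 0:  # assumption: skip F = 0 to avoid division by zero
                f_by_class[k].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in f_by_class.items()}
    for k in (2, 4, 6):
        if k not in mean_f:
            raise ValueError(f"no usable amino acids with {k} synonymous codons")
    # Assumed fallback when isoleucine is absent or too rare to estimate F.
    f3 = mean_f.get(3, (mean_f[2] + mean_f[4]) / 2)

    nc = 2 + 9 / mean_f[2] + 1 / f3 + 5 / mean_f[4] + 3 / mean_f[6]
    return min(nc, 61.0)  # N_c is capped at the 61 sense codons
```

Under this sketch, a sequence that uses exactly one codon per amino acid gives the minimum value of 20, and perfectly even synonymous usage gives the capped maximum of 61. A ratio of two such numbers computed under different null assumptions is the kind of comparison the summary above describes.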

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.