TY  - JOUR
T1  - Frameshift and wild-type proteins are highly similar because the genetic code and genomes were optimized for frameshift tolerance
JF  - bioRxiv
DO  - 10.1101/067736
SP  - 067736
AU  - Xiaolong Wang
AU  - Quanjiang Dong
AU  - Gang Chen
AU  - Jianye Zhang
AU  - Yongqiang Liu
AU  - Yujia Cai
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/05/03/067736.abstract
N2  - Frameshift protein sequences encoded by alternative reading frames of coding genes have been considered meaningless, and frameshift mutations have been considered of little importance for the molecular evolution of coding genes and proteins. However, functional frameshifts have been found to exist widely. It was puzzling how a frameshift protein kept its structure and functionality while its amino-acid sequence was changed substantially. Here we show that frame similarities between frameshifts and wild types are higher than random similarities and are defined at the genetic code, gene, and genome levels. In the standard genetic code, frameshift codon substitutions are more conservative than random substitutions. The frameshift tolerability of the standard genetic code ranks in the top 2.0-3.5% of alternative genetic codes, showing that the genetic code is nearly optimal for frameshift tolerance. Furthermore, frameshift-resistant codons (codon pairs) appear more frequently than expected in many genes and certain genomes, showing that frameshift optimality is reflected not only in the genetic code but, more importantly, in its allowance for further optimizing the frameshift tolerance of a particular gene or genome, which sheds light on the role of frameshift mutations in molecular and genomic evolution. Competing Interest Statement: The authors have declared no competing interest.
ER  - 