Abstract
Synonymous codon choice can have dramatic effects on ribosome speed, RNA stability, and protein expression. Ribosome profiling experiments have underscored that ribosomes do not move uniformly along mRNAs, exposing a need for models of coding sequences that capture the full range of empirically observed variation. We present a method, Ixnos, that models this variation in translation elongation using a feedforward neural network to predict the translation elongation rate at each codon as a function of its sequence neighborhood. Our approach revealed sequence features affecting translation elongation and quantified the impact of large technical biases in ribosome profiling. We applied our model to design synonymous variants of a fluorescent protein spanning the range of possible predicted translation speeds. We found that levels of the fluorescent protein in yeast closely tracked the predicted translation speeds across their full range. We therefore demonstrate that our model captures information determining translation dynamics in vivo, and that control of translation elongation alone is sufficient to produce large, quantitative differences in protein output.