Abstract
De novo protein-coding innovations sometimes emerge from ancestrally non-coding DNA, despite the expectation that translating random sequences is overwhelmingly likely to be deleterious. The “pre-adapting selection” hypothesis claims that emergence is facilitated by prior, low-level translation of non-coding sequences via molecular errors. It predicts that selection on polypeptides translated only in error is strong enough to matter, and is strongest when erroneous expression is high. To test this hypothesis, we examined non-coding sequences located downstream of stop codons (i.e., those potentially translated by readthrough errors) in Saccharomyces cerevisiae genes. We identified a class of “fragile” proteins under strong selection to reduce readthrough, which are unlikely substrates for co-option. Among the remainder, sequences showing evidence of readthrough translation, as assessed by ribosome profiling, encoded C-terminal extensions with higher intrinsic structural disorder, supporting the pre-adapting selection hypothesis. The cryptic sequences beyond the stop codon, rather than spillover effects from the regular C-termini, are primarily responsible for the higher disorder. These results are robust to controlling for the fact that stronger selection also reduces the length of C-terminal extensions. These findings indicate that selection acts on 3′ UTRs in S. cerevisiae to purge potentially deleterious variants of cryptic polypeptides, acting more strongly in genes that experience more readthrough errors.
Footnotes
Switched ribohits from a binary to a quantitative variable, added analyses for the other two reading frames, and added a supplement. Figure 1 added; Figures 4 and 5 (formerly 3 and 4) revised. Added credibility interval analyses for the probability of seeing a ribohit when all three frames do or do not have a backup stop codon; added an analysis of the probability of seeing a backup stop codon in frame X given that a backup stop codon is missing in frame Y; added an analysis of Dom34 targets. Revised Methods for clarification and added a figshare link to the raw data we analyzed.
https://github.com/MaselLab/Kosinski-and-Masel-CTerminalExtensions