Abstract
Little is known about the rate at which genes emerge de novo, how they spread in populations and what their initial properties are. We examined wild Saccharomyces paradoxus populations to characterize the diversity and turnover of intergenic ORFs over short evolutionary time-scales. We identified ∼34,000 intergenic ORFs per individual genome, for a total of ∼64,000 orthogroups, which resulted from an estimated turnover rate lower than the rate of gene duplication in yeast. Hundreds of intergenic ORFs show translation signatures similar to those of canonical genes, but with lower translation efficiency, which could reduce their potential production cost or simply reflect a lack of optimization. Translated intergenic ORFs tend to display low expression levels, with sequence properties that are on average closer to expectations based on intergenic sequences. However, some predicted de novo polypeptides with gene-like properties emerged from ancient as well as recent birth events, illustrating that the raw material for functional innovations may appear even over short evolutionary time-scales. Our results suggest that variation in the mutation rate along the genome impacts the turnover of random polypeptides, which may in turn influence their early evolutionary trajectory. Whereas regions with a low mutation rate allow more time for random intergenic ORFs to evolve and become functional before being lost, mutation hotspots allow for the rapid exploration of the molecular landscape, thereby increasing the probability of acquiring a polypeptide with immediate gene-like properties and thus functional potential.