PT - JOURNAL ARTICLE
AU - N Nikolaos Vakirlis
AU - Alex S Hebert
AU - Dana A Opulente
AU - Guillaume Achaz
AU - Chris Todd Hittinger
AU - Gilles Fischer
AU - Joshua J Coon
AU - Ingrid Lafontaine
TI - A Molecular Portrait of de novo Genes
AID - 10.1101/119768
DP - 2017 Jan 01
TA - bioRxiv
PG - 119768
4099 - http://biorxiv.org/content/early/2017/04/27/119768.short
4100 - http://biorxiv.org/content/early/2017/04/27/119768.full
AB - What is the source of new genes? Identifying de novo genes is hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. Here we used a systematic approach that selects de novo candidates among genes taxonomically restricted to yeast genomes. We predict 703 de novo genes in 15 yeast genomes whose phylogeny spans at least 100 million years of evolution. We have validated 82 candidates, by providing new translation evidence for 25 of them through mass spectrometry experiments, in addition to those whose translation has been independently reported. We established that de novo gene emergence is a widespread phenomenon in the yeast subphylum, only a few being ultimately maintained by selection. We showed that de novo genes preferentially arise in GC-rich intergenic regions transcribed from divergent promoters, such as recombination hotspots, and propose a model for the early stages of de novo gene emergence and evolution. Significance Statement: New genes with novel protein functions can evolve “from scratch” out of nonsense genomic sequences. These “de novo” genes can become essential and drive important phenotypic innovations. Understanding how and why the transition from noncoding to coding happens is therefore crucial. By developing a comprehensive approach we were able to accurately identify hundreds of de novo genes in a set of 15 yeast genomes.
Our results support a model of de novo gene emergence from GC-rich, divergently transcribed regions that are associated with recombination hotspots. Only a few de novo genes are maintained by selection, mature, and finally integrate into the cell's network.