RT Journal Article
SR Electronic
T1 Synteny-based analyses indicate that sequence divergence is not the dominant source of orphan genes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 735175
DO 10.1101/735175
A1 Nikolaos Vakirlis
A1 Anne-Ruxandra Carvunis
A1 Aoife McLysaght
YR 2019
UL http://biorxiv.org/content/early/2019/08/14/735175.abstract
AB The origin of “orphan” genes, species-specific sequences that lack detectable homologues, has remained mysterious since the dawn of the genomic era. There are two dominant explanations for orphan genes: complete sequence divergence from ancestral genes, such that homologues are not readily detectable; and de novo emergence from ancestral non-genic sequences, such that homologues genuinely do not exist. The relative contribution of the two processes remains unknown. Here, we harness the special circumstance of conserved synteny to estimate the contribution of complete divergence to the pool of orphan genes. We find that complete divergence accounts for at most a third of eukaryotic orphan and taxonomically restricted genes. We observe that complete divergence occurs at a stable rate within a phylum, but different rates between phyla, and is frequently associated with gene shortening akin to pseudogenization. Two cancer-related human genes, DEC1 and DIRC1, have likely originated via this route in a primate ancestor.