Long terminal repeats power evolution of genes and gene expression programs in mammalian oocytes and zygotes

  Petr Svoboda²
  1. Bioinformatics Group, Division of Molecular Biology, Department of Biology, Faculty of Science, University of Zagreb, 10000, Zagreb, Croatia;
  2. Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, 142 20 Prague 4, Czech Republic;
  3. Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;
  4. Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-8562, Japan;
  5. Department of Integrated Biosciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, 277-8562, Japan
  • Present addresses: ⁶Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, 13125 Berlin, Germany; ⁷Department of Anatomy, Physiology and Cell Biology, School of Veterinary Medicine, University of California, Davis, CA 95616, USA

  • Corresponding authors: svobodap{at}img.cas.cz, kristian{at}bioinfo.hr
  • Abstract

    Retrotransposons are “copy-and-paste” insertional mutagens that substantially contribute to mammalian genome content. Retrotransposons often carry long terminal repeats (LTRs) for retrovirus-like reverse transcription and integration into the genome. We report an extraordinary impact of a group of LTRs from the mammalian endogenous retrovirus-related ERVL retrotransposon class on gene expression in the germline and beyond. In mouse, we identified more than 800 LTRs from ORR1, MT, MT2, and MLT families, which resemble mobile gene-remodeling platforms that supply promoters and first exons. The LTR-mediated gene remodeling also extends to hamster, human, and bovine oocytes. The LTRs function in a stage-specific manner during the oocyte-to-embryo transition by activating transcription, altering protein-coding sequences, producing noncoding RNAs, and even supporting evolution of new protein-coding genes. These functions result, for example, in recycling processed pseudogenes into mRNAs or lncRNAs with regulatory roles. The functional potential of the studied LTRs is even higher, because we show that dormant LTR promoter activity can rescue loss of an essential upstream promoter. We also report a novel protein-coding gene evolution—D6Ertd527e—in which an MT LTR provided a promoter and the 5′ exon with a functional start codon while the bulk of the protein-coding sequence evolved through a CAG repeat expansion. Altogether, ERVL LTRs provide molecular mechanisms for stochastically scanning, rewiring, and recycling genetic information on an extraordinary scale. ERVL LTRs thus offer means for a comprehensive survey of the genome's expression potential, tightly intertwining with gene expression and evolution in the germline.

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.216150.116.

    • Freely available online through the Genome Research Open Access option.

    • Received September 19, 2016.
    • Accepted May 15, 2017.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
