Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes
- David Thybert1,2,
- Maša Roller1,
- Fábio C.P. Navarro3,
- Ian Fiddes4,
- Ian Streeter1,
- Christine Feig5,
- David Martin-Galvez1,
- Mikhail Kolmogorov6,
- Václav Janoušek7,
- Wasiu Akanni1,
- Bronwen Aken1,
- Sarah Aldridge5,8,
- Varshith Chakrapani1,
- William Chow8,
- Laura Clarke1,
- Carla Cummins1,
- Anthony Doran8,
- Matthew Dunn8,
- Leo Goodstadt9,
- Kerstin Howe3,
- Matthew Howell1,
- Ambre-Aurore Josselin1,
- Robert C. Karn10,
- Christina M. Laukaitis10,
- Lilue Jingtao8,
- Fergal Martin1,
- Matthieu Muffato1,
- Stefanie Nachtweide11,
- Michael A. Quail8,
- Cristina Sisu3,
- Mario Stanke11,
- Klara Stefflova5,
- Cock Van Oosterhout12,
- Frederic Veyrunes13,
- Ben Ward2,
- Fengtang Yang8,
- Golbahar Yazdanifar10,
- Amonida Zadissa1,
- David J. Adams8,
- Alvis Brazma1,
- Mark Gerstein3,
- Benedict Paten4,
- Son Pham14,
- Thomas M. Keane1,8,
- Duncan T. Odom5,8 and
- Paul Flicek1,8
- 1European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom;
- 2Earlham Institute, Norwich Research Park, Norwich NR4 7UH, United Kingdom;
- 3Yale University Medical School, Computational Biology and Bioinformatics Program, New Haven, Connecticut 06520, USA;
- 4Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA;
- 5University of Cambridge, Cancer Research UK Cambridge Institute, Robinson Way, Cambridge CB2 0RE, United Kingdom;
- 6Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92092, USA;
- 7Department of Zoology, Faculty of Science, Charles University in Prague, 128 44 Prague, Czech Republic;
- 8Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom;
- 9Wellcome Trust Centre for Human Genetics, Oxford OX3 7BN, United Kingdom;
- 10Department of Medicine, College of Medicine, University of Arizona, Tucson, Arizona 85724, USA;
- 11Institute of Mathematics and Computer Science, University of Greifswald, Greifswald 17487, Germany;
- 12School of Environmental Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom;
- 13Institut des Sciences de l'Evolution de Montpellier, Université Montpellier/CNRS, 34095 Montpellier, France;
- 14Bioturing Inc, San Diego, California 92121, USA
Abstract
Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.234096.117.
- Freely available online through the Genome Research Open Access option.
- Received December 29, 2017.
- Accepted March 5, 2018.
This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.