Less is more in mammalian phylogenomics: AT-rich genes minimize tree conflicts and unravel the root of placental mammals

Mol Biol Evol. 2013 Sep;30(9):2134-44. doi: 10.1093/molbev/mst116. Epub 2013 Jun 29.

Abstract

Despite the rapid increase in the size of phylogenomic data sets, a number of important nodes in the animal phylogeny are still unresolved. Among these, the rooting of the placental mammal tree remains controversial. One difficulty lies in the pervasive phylogenetic conflicts among genes, each of which tells its own story, which may or may not be reliable. Here, we identified a simple criterion, the GC content, which substantially helps in determining which gene trees best reflect the species tree. We assessed the ability of 13,111 coding sequence alignments to correctly reconstruct the placental phylogeny. We found that GC-rich genes induced more conflict among gene trees and performed worse than AT-rich genes in retrieving well-supported, consensual nodes on the placental tree. We interpret this GC effect mainly as a consequence of genome-wide variations in recombination rate. Indeed, recombination is known to drive GC-content evolution through GC-biased gene conversion and might be problematic for phylogenetic reconstruction, for instance, in an incomplete lineage sorting context. When we focused on the AT-richest fraction of the data set, the resolution of the placental phylogeny was greatly increased, and strong support was obtained in favor of an Afrotheria rooting, that is, Afrotheria as the sister group of all other placentals. We show that in mammals most conflicts among gene trees, which have so far hampered the resolution of the placental tree, are concentrated in the GC-rich regions of the genome. We argue that the GC content, because it is a reliable indicator of the long-term recombination rate, is an informative criterion that could help in identifying the most reliable molecular markers for species tree inference.
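
As an illustration of the criterion described above, the following is a minimal sketch of how coding sequence alignments might be ranked by GC content so that the AT-richest fraction can be retained. It is not the authors' pipeline: the function names `gc_content` and `at_richest`, the use of overall GC content across all codon positions, the handling of gaps and ambiguous bases, and the toy alignments are all assumptions made for illustration. The abstract does not specify the exact GC measure or cutoff used.

```python
from typing import Dict, List


def gc_content(sequences: List[str]) -> float:
    """Fraction of G/C among unambiguous A/C/G/T bases in an alignment.

    Gaps and ambiguity codes are ignored (an assumption of this sketch).
    """
    gc = at = 0
    for seq in sequences:
        for base in seq.upper():
            if base in "GC":
                gc += 1
            elif base in "AT":
                at += 1
    if gc + at == 0:
        raise ValueError("alignment contains no unambiguous bases")
    return gc / (gc + at)


def at_richest(alignments: Dict[str, List[str]], fraction: float) -> List[str]:
    """Names of the AT-richest `fraction` of alignments, lowest GC first."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Sort by GC content; ties are broken by gene name for reproducibility.
    scored = sorted((gc_content(seqs), name) for name, seqs in alignments.items())
    keep = max(1, int(len(scored) * fraction))
    return [name for _, name in scored[:keep]]


# Hypothetical toy alignments, not data from the study.
toy = {
    "geneA": ["ATATAT", "ATATAA"],
    "geneB": ["GCGCGC", "GCGCGG"],
    "geneC": ["ATGCAT", "ATG-AA"],
    "geneD": ["GGGCAT", "GGCCAT"],
}
selected = at_richest(toy, 0.5)
print(selected)
```

In a real analysis the selected genes would then be passed to gene tree or species tree inference; the fraction to keep is a free parameter of this sketch.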

Keywords: Afrotheria; GC-content; biased gene conversion; incomplete lineage sorting; phylogenomics; placental mammal.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • AT Rich Sequence*
  • Animals
  • Base Composition*
  • Evolution, Molecular*
  • Female
  • Genome*
  • Mammals / classification*
  • Mammals / genetics
  • Molecular Sequence Data
  • Phylogeny*
  • Placenta / physiology
  • Pregnancy
  • Recombination, Genetic
  • Sequence Alignment
  • Sequence Homology, Nucleic Acid