Abstract
Genomes contain mosaics of discordant evolutionary histories, which complicates accurate inference of the tree of life. Although genome-wide data are routinely used for discordance-aware phylogenomic analyses, modeling and scalability limitations mean that current practice leaves out large portions of each genome. As more high-quality genomes become available, we urgently need discordance-aware methods that infer the tree directly from a multiple genome alignment. Here, we introduce CASTER, a site-based method that eliminates the need to predefine recombination-free loci. CASTER is statistically consistent under incomplete lineage sorting and scales to hundreds of mammalian whole genomes. We show, both in simulations and on real data, that CASTER is scalable and accurate and that its per-site scores can reveal interesting patterns of evolution across the genome.
Competing Interest Statement
The authors have declared no competing interest.