Abstract
As multi-individual, population-scale genome-wide data become available, more complex modeling strategies are needed to quantify patterns of nucleotide usage and the associated mechanisms of evolution. Recently, the multivariate neutral Moran model was proposed; however, it was shown to be insufficient to explain the distribution of alleles in great apes. Here, we propose a new model that includes allelic selection. Our theoretical results constitute the basis of a new Bayesian framework to estimate mutation rates and selection coefficients from population data, which we employed to quantify the patterns of genome-wide GC-biased gene conversion (gBGC) in great apes. Importantly, we show that great apes have patterns of allelic selection that vary in intensity, a feature that we correlate with the great apes’ distinct demographies. We also demonstrate that the AT/GC toggling effect decreases the probability of a substitution, which promotes more polymorphisms in the base composition of great ape genomes. We assessed the impact of GC-bias in molecular analyses and found that estimates of mutation rates and genetic distances are biased when gBGC is not properly accounted for. Our results stress the need for gBGC-aware models in population genetics and phylogenetics.