Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change

Mol Biol Evol. 2009 Sep;26(9):2097-108. doi: 10.1093/molbev/msp119. Epub 2009 Jun 17.

Abstract

The prevalence of adaptive evolution relative to genetic drift is a central problem in molecular evolution. Methods to estimate the fraction of adaptive nucleotide substitutions (alpha) have been developed, based on the McDonald-Kreitman test, that contrast polymorphism and divergence between selectively and neutrally evolving sites. However, these methods are expected to give downwardly biased estimates of alpha if there are slightly deleterious mutations, because these inflate polymorphism relative to divergence. Here, we estimate alpha by simultaneously estimating the distribution of fitness effects of new mutations at selected sites from the site frequency spectrum and the number of adaptive substitutions. We test the method using simulations. If data meet the assumptions of the analysis model, estimates of alpha show little bias, even when there is little or no recombination. However, population size differences between the divergence and polymorphism phases may cause alpha to be over- or underestimated by a predictable factor that depends on the magnitude of the population size change and the shape of the distribution of effects of deleterious mutations. We analyze several data sets of protein-coding genes and noncoding regions from hominids and Drosophila. In Drosophila genes, we estimate that approximately 50% of amino acid substitutions and approximately 20% of substitutions in introns are adaptive. In protein-coding and noncoding data sets of humans, comparison to macaque sequences reveals little evidence for adaptive substitutions. However, the true frequency of adaptive substitutions in human coding DNA could be as high as 40%, because estimates based on current polymorphism may be strongly downwardly biased by a decrease in the effective population size along the human lineage.
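
For reference, the McDonald-Kreitman-based estimators mentioned in the abstract can be illustrated with the simple count-based estimate alpha = 1 - (Ds * Pn) / (Dn * Ps), where Dn and Ds are divergence counts and Pn and Ps are polymorphism counts at selected and neutral sites. The sketch below is not the method developed in this paper, which instead fits the distribution of fitness effects from the site frequency spectrum. The function name `mk_alpha` and all counts are hypothetical, chosen only to show how extra polymorphism at selected sites lowers the estimate.

```python
def mk_alpha(dn, ds, pn, ps):
    """Simple McDonald-Kreitman-style estimate of alpha.

    dn, ds: divergence counts at selected and neutral sites.
    pn, ps: polymorphism counts at selected and neutral sites.
    Illustrative only; this is not the estimator proposed in the paper.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts (assumed for illustration, not taken from the paper).
baseline = mk_alpha(dn=100, ds=200, pn=50, ps=200)

# Same divergence, but with additional selected-site polymorphism of the kind
# that segregating slightly deleterious mutations would contribute.
inflated = mk_alpha(dn=100, ds=200, pn=70, ps=200)

print(f"alpha without extra polymorphism: {baseline:.2f}")
print(f"alpha with extra polymorphism:    {inflated:.2f}")
```

With these made-up counts, raising Pn from 50 to 70 while holding divergence fixed lowers the estimate, which is the direction of bias the abstract attributes to slightly deleterious mutations.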

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Physiological / genetics*
  • Amino Acid Substitution / genetics
  • Animals
  • Bias
  • Computer Simulation
  • DNA, Intergenic / genetics
  • Databases, Genetic
  • Drosophila / genetics
  • Evolution, Molecular*
  • Humans
  • Likelihood Functions
  • Macaca / genetics
  • Models, Genetic
  • Mutation / genetics*
  • Open Reading Frames / genetics
  • Population Density*
  • Recombination, Genetic / genetics

Substances

  • DNA, Intergenic