Inferring speciation and extinction rates under different sampling schemes

Mol Biol Evol. 2011 Sep;28(9):2577-89. doi: 10.1093/molbev/msr095. Epub 2011 Apr 11.

Abstract

The birth-death process is widely used in phylogenetics to model speciation and extinction. Recent studies have shown that the inferred rates are sensitive to assumptions about the sampling probability of lineages. Here, we examine the effect of the method used to sample lineages. Whereas previous studies have assumed random sampling (RS), we consider two extreme cases of biased sampling: "diversified sampling" (DS), where tips are selected to maximize diversity and "cluster sampling (CS)," where sample diversity is minimized. DS appears to be standard practice, for example, in analyses of higher taxa, whereas CS may occur under special circumstances, for example, in studies of geographically defined floras or faunas. Using both simulations and analyses of empirical data, we show that inferred rates may be heavily biased if the sampling strategy is not modeled correctly. In particular, when a diversified sample is treated as if it were a random or complete sample, the extinction rate is severely underestimated, often close to 0. Such dramatic errors may lead to serious consequences, for example, if estimated rates are used in assessing the vulnerability of threatened species to extinction. Using Bayesian model testing across 18 empirical data sets, we show that DS is commonly a better fit to the data than complete, random, or cluster sampling (CS). Inappropriate modeling of the sampling method may at least partly explain anomalous results that have previously been attributed to variation over time in birth and death rates.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Bayes Theorem
  • Birds
  • Computer Simulation
  • Extinction, Biological*
  • Genetic Speciation*
  • Models, Genetic*
  • Phylogeny
  • Selection Bias*