TY - JOUR
T1 - StarBEAST2 brings faster species tree inference and accurate estimates of substitution rates
JF - bioRxiv
DO - 10.1101/070169
SP - 070169
AU - Huw A. Ogilvie
AU - Remco R. Bouckaert
AU - Alexei J. Drummond
Y1 - 2017/01/01
UR - http://biorxiv.org/content/early/2017/04/02/070169.abstract
N2 - Fully Bayesian multispecies coalescent (MSC) methods like *BEAST estimate species trees from multiple sequence alignments. Today thousands of genes can be sequenced for a given study, but using that many genes with *BEAST is intractably slow. An alternative is to use heuristic methods which compromise accuracy or completeness in return for speed. A common heuristic is concatenation, which assumes that the evolutionary history of each gene tree is identical to the species tree. This is an inconsistent estimator of species tree topology, a worse estimator of divergence times, and induces spurious substitution rate variation when incomplete lineage sorting is present. Another class of heuristics directly motivated by the MSC avoids many of the pitfalls of concatenation but cannot be used to estimate divergence times. To enable fuller use of available data and more accurate inference of species tree topologies, divergence times, and substitution rates, we have developed a new version of *BEAST called StarBEAST2. To improve convergence rates we add analytical integration of population sizes, novel MCMC operators and other optimisations. Computational performance improved by 13.5× to 13.8× when analysing empirical data sets, and an average of 33.1 × across 30 simulated data sets. To enable accurate estimates of per-species substitution rates we introduce species tree relaxed clocks, and show that StarBEAST2 is a more powerful and robust estimator of rate variation than concatenation. StarBEAST2 is available through the BEAUTi package manager in BEAST 2.4 and above.
ER -