PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment

Syst Biol. 2013 Jul;62(4):611-5. doi: 10.1093/sysbio/syt022. Epub 2013 Apr 5.

Abstract

Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • Models, Genetic
  • Phylogeny*
  • Software*