PT - JOURNAL ARTICLE
AU - Lam-Tung Nguyen
AU - Arndt von Haeseler
AU - Bui Quang Minh
TI - Complex Models of Sequence Evolution Require Accurate Estimators as Exemplified with the Invariable Site plus Gamma Model
AID - 10.1101/185652
DP - 2017 Jan 01
TA - bioRxiv
PG - 185652
4099 - http://biorxiv.org/content/early/2017/11/24/185652.short
4100 - http://biorxiv.org/content/early/2017/11/24/185652.full
AB - The invariable site plus Γ model is widely used to model rate heterogeneity among alignment sites in maximum likelihood and Bayesian phylogenetic analyses. The proof that the invariable site plus continuous Γ model is identifiable (model parameters can be inferred correctly given enough data) has increased the credibility of its application to phylogeny reconstruction. However, most phylogenetic software implements the invariable site plus discrete Γ model, whose identifiability is likely but unproven. How well the parameters of the invariable site plus discrete Γ model are estimated is still disputed. In particular, the correlation between the fraction of invariable sites and the fraction of sites with a slow evolutionary rate is considered problematic. We show that optimization heuristics as implemented in frequently used phylogenetic software cannot always reliably estimate the shape parameter, the proportion of invariable sites and the tree length. We propose an improved optimization heuristic that accurately estimates these three parameters. While research efforts mainly focus on tree search methods, our results highlight the equal importance of verifying and developing effective estimation methods for complex models of sequence evolution.