Abstract
Motivation Maximum Likelihood (ML) is a widely used model for inferring phylogenies. ML implementations rely heavily on numerical optimization routines that use internal numerical thresholds to determine convergence. We systematically analyze the impact of these threshold settings on the log-likelihood scores (LnL scores) and runtimes of ML tree inferences with RAxML-NG, IQ-TREE, and FastTree on empirical datasets.
Results We provide empirical evidence that tree inferences with RAxML-NG and IQ-TREE can be substantially accelerated by changing the default values of two such numerical thresholds. At the same time, according to statistical significance tests, changing these settings does not significantly influence the quality of the inferred trees. For RAxML-NG, increasing the likelihood thresholds ϵLnL and ϵbrlen to 10 and 10³, respectively, results in an average speedup of 1.9 ± 0.6 on Data collection 1 and 1.8 ± 1.3 on Data collection 2. For IQ-TREE, increasing the likelihood threshold ϵLnL to 10 results in an average speedup of 1.3 ± 0.4 on Data collection 1 and 1.3 ± 0.9 on Data collection 2.
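As a reading aid for the speedup figures reported above, the following minimal sketch shows one way an "average speedup ± spread" can be computed from paired runtimes. It assumes that speedup is the per-dataset ratio of default runtime to tuned runtime and that the ± value is a sample standard deviation; the abstract states neither. The function name `speedup_summary` and the runtimes are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def speedup_summary(default_runtimes, tuned_runtimes):
    """Return (mean, sample standard deviation) of per-dataset speedups.

    Assumption: speedup = runtime with default thresholds divided by
    runtime with changed thresholds, computed separately per dataset.
    """
    if len(default_runtimes) != len(tuned_runtimes):
        raise ValueError("runtime lists must have the same length")
    if len(default_runtimes) < 2:
        raise ValueError("need at least two datasets for a standard deviation")
    speedups = [d / t for d, t in zip(default_runtimes, tuned_runtimes)]
    return mean(speedups), stdev(speedups)


# Hypothetical runtimes in seconds for three datasets (illustrative only).
default_runtimes = [120.0, 300.0, 90.0]
tuned_runtimes = [60.0, 200.0, 45.0]

avg, sd = speedup_summary(default_runtimes, tuned_runtimes)
print(f"average speedup: {avg:.1f} ± {sd:.1f}")
```

A standard deviation close to the mean speedup, as for Data collection 2, indicates that the gain varies considerably from dataset to dataset.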
Availability and Implementation All MSAs and results on which our analyses are based are available for download at https://cme.h-its.org/exelixis/material/freeLunch_data.tar.gz. Our data generation scripts are available at https://github.com/tschuelia/ml-numerical-analysis.
Contact julia.haag{at}h-its.org
Supplementary information Supplementary data are available online.
Competing Interest Statement
The authors have declared no competing interest.