Abstract
Should we build our own phylogenetic trees from gene sequence data, or can we simply use available synthesis phylogenies? Any study that relies on a phylogenetic framework faces this fundamental question at the outset. Building a phylogeny from gene sequence data (a purpose-built phylogeny) requires more effort and expertise than subsetting an already available phylogeny (a synthesis-based phylogeny). If phylogenetic diversity estimates based on these two types of phylogenies are highly correlated, then using readily available synthesis-based phylogenies is justified for comparing phylogenetic diversity among communities. However, how these two approaches to building phylogenetic trees influence the calculation of phylogenetic diversity has not been explicitly tested. We generated three purpose-built phylogenies and their corresponding synthesis-based trees (two from Phylomatic and one from the Open Tree of Life). We then used a simulation approach to generate 1000 communities with a fixed number of species per site and compared the effects of different trees on estimates of phylogenetic alpha and beta diversity using Spearman’s rank-based correlation and linear mixed models. Synthesis-based phylogenies generally overestimated phylogenetic diversity compared with purpose-built ones. However, the measures of phylogenetic diversity derived from the two types of phylogenies were highly correlated (Spearman’s r > 0.8 in most cases). Mean pairwise distance (both alpha and beta) was the index most robust to the differences in tree construction that we tested. Measures of phylogenetic diversity based on the Open Tree of Life showed the highest correlation with measures based on the purpose-built phylogenies. For comparing phylogenetic diversity among communities, our results justify taking advantage of recently developed and continuously improving synthesis trees such as the Open Tree of Life.
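The comparison described above can be illustrated with a minimal sketch. It is not the analysis code of this study: the two distance matrices are made-up toy data (random values, not patristic distances from real trees), the "synthesis" matrix is simply a noisy copy of the "purpose-built" one, and all function names, the species pool size, and the noise level are hypothetical choices for illustration. The sketch draws 1000 communities with a fixed number of species, computes alpha mean pairwise distance (MPD) under each matrix, and correlates the two sets of values with Spearman's rank correlation.

```python
import itertools
import random


def mean_pairwise_distance(dist, community):
    """Alpha MPD: mean distance over all species pairs in one community."""
    pairs = list(itertools.combinations(community, 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation, computed as Pearson's r on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def toy_distances(n_species, rng):
    """Toy symmetric distance matrix (assumption: random, not from a tree)."""
    dist = [[0.0] * n_species for _ in range(n_species)]
    for a, b in itertools.combinations(range(n_species), 2):
        dist[a][b] = dist[b][a] = rng.uniform(1.0, 10.0)
    return dist


def noisy_copy(dist, rng, noise=1.0):
    """Perturbed copy standing in for a second tree (assumption)."""
    n = len(dist)
    out = [[0.0] * n for _ in range(n)]
    for a, b in itertools.combinations(range(n), 2):
        out[a][b] = out[b][a] = max(0.0, dist[a][b] + rng.uniform(-noise, noise))
    return out


rng = random.Random(1)
n_species, n_communities, richness = 30, 1000, 10  # hypothetical sizes

purpose_built = toy_distances(n_species, rng)
synthesis = noisy_copy(purpose_built, rng)

# Fixed number of species per simulated community.
communities = [rng.sample(range(n_species), richness) for _ in range(n_communities)]

mpd_purpose = [mean_pairwise_distance(purpose_built, c) for c in communities]
mpd_synthesis = [mean_pairwise_distance(synthesis, c) for c in communities]

rho = spearman(mpd_purpose, mpd_synthesis)
print(f"Spearman correlation of MPD between the two toy matrices: {rho:.3f}")
```

In practice, the distance matrices would be patristic distances extracted from the actual purpose-built and synthesis-based trees, and an established statistics library would normally replace the hand-written rank correlation.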