A Critical Review on the Use of Support Values in Tree Viewers and Bioinformatics Toolkits

Lucas Czech; Jaime Huerta-Cepas; Alexandros Stamatakis

doi:10.1101/035360

Abstract

Phylogenetic trees are routinely visualized to present and interpret the evolutionary relationships of species. Virtually all empirical studies of evolutionary data include a visualization of the inferred tree with branch support values. Ambiguous semantics in tree file formats can lead to erroneous tree visualizations and therefore to incorrect interpretations of phylogenetic analyses.

Here, we discuss problems that can and do arise when displaying branch values on trees after re-rooting. Branch values are typically stored as node labels in the widely used Newick tree format. However, such values are attributes of branches, not of nodes. Storing them as node labels can therefore yield errors when re-rooting trees. Whether such errors occur depends on the mostly implicit semantics that tools use to interpret node labels.
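The following minimal sketch illustrates the kind of mismatch described above. It is not taken from any of the reviewed tools: the toy tree, the node names (`r`, `w`, `v`, `u`), and the helper functions `branch_values` and `reroot` are hypothetical and chosen only for illustration. The tree corresponds to the Newick string `(A,B,(C,(D,(E,F)60)75)90);`, where each inner node label is meant as the value of the branch leading to that node's parent.

```python
# Toy tree (A,B,(C,(D,(E,F)60)75)90); stored as child -> parent pointers.
# All names here are illustrative assumptions, not part of any real tool.
parent = {"A": "r", "B": "r", "w": "r", "C": "w", "v": "w",
          "D": "v", "u": "v", "E": "u", "F": "u"}
node_label = {"u": 60, "v": 75, "w": 90}


def branch_values(parent, node_label):
    """Read each node label as the value of the branch to that node's parent."""
    return {frozenset((node, parent[node])): value
            for node, value in node_label.items() if node in parent}


def reroot(parent, new_root):
    """Return new parent pointers with the path new_root -> old root reversed."""
    new_parent = dict(parent)
    node, prev = new_root, None
    while node is not None:
        nxt = parent.get(node)          # next node towards the old root
        if prev is None:
            new_parent.pop(node, None)  # the new root has no parent
        else:
            new_parent[node] = prev     # flip the direction of this branch
        node, prev = nxt, node
    return new_parent


# Branch semantics in the original rooting: the intended values.
true_values = branch_values(parent, node_label)

# Re-root at inner node u but leave the labels attached to the nodes.
new_parent = reroot(parent, "u")
naive_values = branch_values(new_parent, node_label)

# Correct handling: move each value to the child end of its branch.
correct_labels = {child: true_values[frozenset((child, par))]
                  for child, par in new_parent.items()
                  if frozenset((child, par)) in true_values}

print(true_values[frozenset(("u", "v"))])   # value intended for branch u-v
print(naive_values[frozenset(("u", "v"))])  # value shown if labels stay on nodes
print(correct_labels)
```

In this sketch, leaving the labels on the nodes shifts every value along the reversed path by one branch: the branch between `u` and `v` is displayed with 75 instead of 60, the branch between `v` and `w` with 90 instead of 75, and the branch between `w` and `r` loses its value altogether. Treating the values as branch attributes and reassigning them after re-rooting avoids the shift.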

We reviewed 10 tree viewers and 10 bioinformatics toolkits that can display and re-root trees. We found that 14 of these 20 tools do not permit users to select the semantics of node labels. Thus, unaware users might obtain incorrect results when rooting trees inferred by common phylogenetic inference programs. We illustrate such incorrect mappings for several test cases and real examples taken from the literature. This review has already led to improvements and workarounds in 8 of the tested tools. We suggest that tools provide an option that explicitly forces users to define the semantics of node labels.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.