Abstract
The fossilized birth-death (FBD) process provides an ideal model for inferring phylogenies from both extant and fossil taxa. Using this approach, fossils (with or without character data) are directly considered as part of the tree. This leads to a statistically coherent prior on divergence times, where the variance associated with node ages reflects uncertainty in the placement of fossil taxa in the phylogeny. Since fossils are typically not associated with molecular sequences, additional information is required to place fossils in the tree. Previously, this information has been provided in two different forms: topological constraints, where the user specifies monophyletic clades based on established taxonomy, or so-called total-evidence analyses, which use a morphological data matrix with data for both fossil and extant specimens in addition to the molecular alignment. In this work, we use simulations to evaluate these different approaches to handling fossil placement in FBD analyses, both in ideal conditions and in datasets including uncertainty or even errors. We also explore how rate variation in fossil recovery or diversification rates impacts these approaches. We find that the extant topology is well recovered under all methods of fossil placement. Divergence times are similarly well recovered across all methods, with the exception of constraints that contain errors. These results are consistent with expectations: in FBD inferences, divergence times are mostly informed by fossil ages, so variations in the position of fossils strongly impact these estimates. On the other hand, the placement of extant taxa in the phylogeny is driven primarily by the molecular alignment.
We see similar patterns in datasets that include rate variation. One notable difference, however, is that relative errors in extant divergence times increase when more variation is included in the dataset, for all approaches using topological constraints, and particularly for constraints with errors. Finally, we show that trees recovered under the FBD model are more accurate than those estimated using non-FBD (i.e., non-time-calibrated) inference. This result holds even with the use of erroneous fossil constraints and model misspecification under the FBD. Overall, our results underscore the importance of core taxonomic research, including morphological data collection and species descriptions, irrespective of the approach to handling phylogenetic uncertainty using the FBD process.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Corrected a spelling mistake in the name of one of the authors (De Baets instead of de Baets)