Abstract
Symbiosis is a key driver of evolutionary novelty and ecological diversity, but our understanding of how macroevolutionary processes give rise to extant symbiotic associations is still very incomplete. Cophylogenetic tools are used to assess the congruence between the phylogenies of two groups of organisms related by extant associations. If phylogenetic congruence is higher than expected by chance, we conclude that there is cophylogenetic signal in the system under study. However, how to quantify cophylogenetic signal is still an open issue. We present a novel approach, Random Tanglegram Partitions (Random TaPas), which applies a given global-fit method to random partial tanglegrams of a fixed size to identify the associations, terminals and nodes that maximize phylogenetic congruence. By means of simulations, we show that the output value produced is inversely proportional to the number and proportion of cospeciation events employed to build simulated tanglegrams. In addition, with time-calibrated trees, Random TaPas is efficient at distinguishing cospeciation from pseudocospeciation. Random TaPas can handle large tanglegrams in affordable computational time and incorporates phylogenetic uncertainty in the analyses. We demonstrate its application with two real examples: passerine birds and their feather mites, and orchids and their bee pollinators. In both systems, Random TaPas revealed low cophylogenetic signal, but mapping its variation onto the tanglegram pointed to two different coevolutionary processes. We suggest that the recursive partitioning of the tanglegram buffers the effect of phylogenetic nonindependence occurring in current global-fit methods, and that Random TaPas is therefore more reliable than regular global-fit methods at identifying the host-symbiont associations that contribute most to cophylogenetic signal. Random TaPas can be implemented in the open-source statistical software R with scripts provided herein. A User’s Guide is also available on GitHub.
Footnotes
The most significant change in this revised version concerns the way the frequencies of the host-symbiont associations are computed. The Random TaPas algorithm favors the sampling of one-to-one associations, which can lead to an over-representation of their occurrence in the percentile set. To avoid this, we generate a null model based on the frequency of each association in the whole frequency distribution. We then compute a residual, Ri = (frequency actually observed in the percentile set) − (expected frequency from the null model), which measures whether association i is represented more or less frequently than expected by chance. We expect large positive and large negative residuals to represent congruent and incongruent associations, respectively. We now recommend this strategy for systems with multiple host-symbiont associations, whereas the original Random TaPas is still appropriate for the less frequent one-to-one association systems.
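The residual described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R scripts distributed with Random TaPas. Several details are assumptions that the text does not specify: the function name `association_residuals`, the representation of each random partial tanglegram as a tuple of association labels, the convention that lower global-fit values indicate higher congruence, and an expected frequency obtained by scaling each association's overall frequency by the fraction of runs retained in the percentile set.

```python
from collections import Counter


def association_residuals(runs, fit_values, percentile=0.01):
    """Residual R_i = observed frequency in the percentile set - expected frequency.

    runs        -- list of tuples; each tuple holds the association labels
                   sampled in one random partial tanglegram (hypothetical format)
    fit_values  -- global-fit statistic for each run; ASSUMED lower = more congruent
    percentile  -- fraction of runs kept as the percentile set
    """
    if len(runs) != len(fit_values):
        raise ValueError("runs and fit_values must have the same length")

    n_runs = len(runs)
    n_keep = max(1, int(n_runs * percentile))

    # Indices of the runs with the lowest (assumed best) global-fit values.
    order = sorted(range(n_runs), key=lambda k: fit_values[k])
    kept = order[:n_keep]

    # Frequency of each association over all runs (basis of the null model)
    # and within the percentile set (observed frequency).
    total = Counter(a for run in runs for a in run)
    observed = Counter(a for k in kept for a in runs[k])

    # ASSUMED null model: an association is expected in the percentile set
    # in proportion to how often it was sampled overall.
    share = n_keep / n_runs
    return {a: observed[a] - total[a] * share for a in total}


# Toy data: four runs of two associations each (labels are made up).
runs = [("h1-s1", "h2-s2"), ("h1-s1", "h3-s3"),
        ("h2-s2", "h3-s3"), ("h1-s1", "h2-s2")]
fit_values = [0.1, 0.5, 0.9, 0.2]
residuals = association_residuals(runs, fit_values, percentile=0.5)
```

In this toy case the association labeled `h3-s3` never appears in the two best-fitting runs and so receives a negative residual, while the other two receive positive ones. Because the expected frequency scales with how often each association was sampled overall, an association that is simply sampled more often is not automatically over-represented.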