Modeling phylogenetic biome shifts on a planet with a past

The spatial distribution of biomes has changed considerably over deep time, so the geographical opportunity for an evolutionary lineage to shift into a new biome depends on how the availability and connectivity of biomes has varied temporally. To better understand how lineages shift between biomes in space and time, we developed a phylogenetic biome shift model in which each lineage shifts between biomes and disperses between regions at rates that depend on the lineage’s biome affinity and location relative to the spatiotemporal distribution of biomes at any given time. To study the behavior of the biome shift model in an empirical setting, we developed a literature-based representation of paleobiome structure for three mesic forest biomes, six regions, and eight time strata, ranging from the Late Cretaceous (100 Ma) through the present. We then fitted the model to a time-calibrated phylogeny of 119 Viburnum species to compare how the results responded to various realistic or unrealistic assumptions about paleobiome structure. Ancestral biome estimates that account for paleobiome dynamics reconstructed a warm temperate (or tropical) origin of Viburnum, which is consistent with previous fossil-based estimates of ancestral biomes. In Viburnum, imposing unrealistic paleobiome distributions led to ancestral biome estimates that eliminated support for tropical origins, and instead inflated support for cold temperate ancestry during the warmer Paleocene and Eocene. The biome shift model we describe is applicable to the study of evolutionary systems beyond Viburnum, and the core mechanisms of our model are extensible to the design of richer phylogenetic models of historical biogeography and/or lineage diversification. We conclude that biome shift models that account for dynamic geographical opportunities are important for inferring ancestral biomes that are compatible with our understanding of Earth history.

Introduction temperatures and precipitation, but average minimum temperatures drop below freezing in 142 at least one of the coldest months. 143 Because we are interested in how biome states and regional states evolve in tandem, 144 we constructed a set of 3 × 6 = 18 compound states that we call biome-region states. 145 Throughout the paper, we identify the biome-region state for a lineage in biome state i and 146 region state k with the notation (i,k). However, in practice, we encode biome-region states Our aim is to model a regional biome shift process that allows changes in the 155 spatiotemporal distribution of biomes to influence the likelihood of a lineage shifting 156 between biomes and dispersing between regions. This process can be described in terms of  Figure 1: Cartoon of the relationship between paleobiome structure and a regional biome shift process. The left and right panels are aligned to the same geological time scale that is divided into a Hot (red) interval followed by a Cold (blue) interval. (A) Maps of paleobiome structure with two regions, East (E) and West (W), and two focal biomes of interest, Hot (H) and Cold (C), in which the expansive Hot biome is replaced by the Cold biome as the East and West regions separate. (B) Paleobiome adjacency matrices encode the availability of biomes within regions and the connectivity of biomes between regions based on whether paleobiome features are strong (dark) or weak (light). Diagonal elements reflect biome availability within regions while off-diagonal elements report biome connectivity between regions. (C) Two possible regional biome shift histories for a phylogeny with a western, hot-adapted (HW) origin. Lineages shift between biomes at rates that depend on the availability of biomes within the lineage's current region and disperse between regions at rates that depend on connectivity of the lineage's current biome between regions. The two histories require (top) or do not require (bottom) evolutionary events to be congruent with paleobiome structure. America. Together, the tropical and warm temperate forests formed a beltway of 222 boreotropical forests around the northern hemisphere (Wolfe 1985   Adjacency matrices are used as empirical priors to shape the time-stratified phylogenetic biome shift process. Rows correspond to eight time intervals, while columns correspond to regional features, specifically full (or null) connectivity (black), simple geographical connectivity (brown), or features involving the tropical (red), warm temperate (green), and cold temperate (blue) forest biomes. The matrix for each time and feature encodes the availability of (the diagonal) and the connectivity between (off-diagonal) regions for that feature at that time, where matrix rows and columns correspond to source and destination regions, respectively. Availability/connectivity is marked as being strong (dark), weak (medium), or marginal (light).
The three matrices on the right-hand side of Equation 1 are the uniform rate 283 matrix, Q 1 , the geographical rate matrix, Q G , and the biome rate matrix, Q B . In reference 284 to Figure 2, we wish to learn the relative influence of the uniform (first column), geography 285 (second), and biome (third, fourth, or fifth) matrix features on the biome shift process.

286
The first rate matrix (Q 1 ) may be considered a "null" rate matrix that sets the 287 relative transition rates between all pairs of regions, and separately between all pairs of 288 biomes, as equal (to one).
if biome and region shift (i = j and k = l) The effect is that biome shifts between biomes i and j follow the rates β i,j and dispersal 290 events follow the rates δ k,l regardless of the age of a lineage or the lineage's biome-region 291 state. As we develop rate matrices for geography (Q G ) and and biomes (Q B ) below, the 292 second role for Q 1 is that it allows for lineages to disperse or shift regardless of whether the connectivity/availability of the involved regions or biomes are scored as strong, weak, or 294 marginal.

295
The second rate matrix (indexed G for "geography", Q G ) is structured according to  Specifically, we set y strong = 1 and y marg = 0, then treat y weak as an estimated parameter 302 that satisfies y marg < y weak < y strong . Referring to Figure 2 again, these parameters control 303 the degree of contrast between cells across all matrices.

304
[Q G (m)] (i,k),(j,l) = The third rate matrix (indexed B for "biome", Q B ) defines the shift rates between 305 biomes and the dispersal rates between regions to depend on the spatiotemporal 306 distribution of biomes. A lineage's biome shift rate depends on whether the receiving 307 biome, j, has a strong, weak, or marginal presence in the region it currently occupies, k.

308
Likewise, the dispersal rate for a lineage that is currently adapted to biome type i depends 309 on whether the source region, k, and destination region, l, share a strong, weak, or marginal connection.  where µ is a rate taken to be sufficiently large that stationarity is reached.  . Stationary probabilities were computed assuming that biome and region shifts occur in roughly equal proportion (β = δ = 0.5), that lineages disperse primarily through the appropriate biome graph (w B = 0.8, w G = 0.16, and w 1 = 0.04), and that dominant biomes primarily define the structure of biome graphs (y strong = 1.0, y weak = 0.1, y marg = 0.0). Parameters were chosen to show interesting variation. Note, all stationary probabilities would be equal over all times if w 1 = 1.
The time-dependent source-sink dynamics in Figure 3 show how the availability of 340 and connectivity between regional biomes structures each time interval's stationary distribution. Stationary probabilities before the Oligocene tend to favor tropical biomes in all regions, but favor cold temperate biomes afterwards. This means that if the historical 343 spatial structure of biomes is relevant to biogeography, then lineages originating in the 344 Paleogene would more likely be adapted to tropical than to cold temperate forests simply 345 because cold temperate forests were a more marginal biome during that period of Earth's 346 history. 347 We can now completely define the time stratified rate matrix, Q(m), and the Biome setting that used the graphical structure of "Present" to represent all time intervals; 373 and the Null Biome setting that ignored all regional and biome structure by fixing w 1 = 1.

374
Departing from the general model description above, we re-parameterized our stochastically mapped state triplets (event series of length two) in our phylogenetic tree using a simple root-to-tip recursion. We processed each posterior sample by taking the 422 stochastically mapped root state to be the second state in the triplet, X root , then sampling 423 the preceding state, X subroot , from the sampling distribution obtained by Bayes rule where Q(m root ) is the root node's rate matrix and π(m root ) is its stationary distribution Simulation experiment 447 We measured how reliably we can select models in which biome structure influences the Biome shift rates were set to equal 1, except transitions between cold temperate and 457 tropical forests, which were set to 0. The event clock was set to µ = 0.03, except for the 458 null condition, which was assigned a slower rate of µ = 0.01 to account for the fact that 459 fewer event rate penalties are applied to it than the non-null conditions. We then 460 simulated 100 replicate datasets in RevBayes for each of the four conditions under the 461 regional biome-shift model described above, and estimated the posterior density for each 462 simulated dataset using MCMC in RevBayes. 463 We were primarily concerned with how our posterior estimates of w B respond to with the proportion of simulations that award no support to the w B > 0 model (Fig 4B).

492
Ancestral biomes for Viburnum

531
To illuminate why the Paleobiome analysis produces distinctly warmer ancestral   for the most probable biome-region states per node. Pie charts for root state probabilities are magnified to improve visibility. Vertical white and gray bands correspond to major geological timeframes referenced in this study.        at least moderately strong, even though w B is difficult to estimate precisely (Fig. 4). We  (Figures 2 and 7). The lesson 640 we take from this is that inferring the fundamental behavior of the process is not always 641 sufficient for estimating ancestral states; inferring if and how that behavior responds to 642 changing historical conditions is also necessary. 643 We note that an East Asian origin in warm temperate or tropical forests is Modern Biome). In the case of Viburnum, it appears that several key regional shifts 707 between Eastern Asia and North America occurred a relatively long time ago, when 708 northern latitudes were still primarily covered by warm temperate forests (Fig. 8A) in the evolutionary process as free parameters, whose values we estimate from the  Nonetheless, future studies should explore more quantitative approaches to defining 754 paleobiome structures for use with the time-stratified regional biome shift model.

755
Our simple model of regional biome shifts lacks several desired features. Perhaps

801
Finally, although we have compared inferences of event series under several biome 802 structure models, and have argued that paleobiome models can influence such inferences, 803 we caution that event series themselves may not be accurate descriptors of some relevant 804 evolutionary scenarios. For example, it is entirely possible that a shift into a new biome 805 could occur during a transition from one region into another (e.g., adaptation to cold 806 forests during range expansion through Beringia, or the long-distance dispersal of an 807 organism already pre-adapted to occupy a novel biome). Such scenarios highlight that the 808 model we have presented here is simplistic in some of its basic assumptions. We view it as 809 a start in the right direction, and look forward to extensions that will allow us to test a 810 variety of more nuanced hypotheses.  More generally, we hope that our analyses will motivate biogeographers who wish to 823 estimate ancestral biomes to account for variation in the spatial distribution of biomes