## Abstract

Isolation by distance is a widespread pattern in nature that describes the reduction of genetic correlation between subpopulations with increased geographic distance. In the population ancestral to modern sister species, this pattern may hypothetically inflate population divergence time estimation due to the potential for allele frequency differences in subpopulations at the ends of the ancestral population. In this study, we analyze the relationship between the time to the most recent common ancestor and the population divergence time when the ancestral population model is a linear stepping-stone. Using coalescent simulations, we compare the coalescent time to the population divergence time for various ratios of the divergence time over the product of the population size and the deme number. Next, we simulate whole genomes to obtain SNPs, and use the Bayesian coalescent program SNAPP to estimate divergence times. We find that as the rate of migration between neighboring demes decreases, the coalescent time becomes significantly greater than the population divergence time when sampled from end demes. Divergence-time overestimation in SNAPP becomes severe when the divergence-to-population size ratio < 10 and migration is low. We conclude that studies estimating divergence times be cognizant of the potential ancestral population structure in an explicitly spatial context or risk dramatically overestimating the timing of population splits.

## Introduction

A major goal in phylogenetic and phylogeographic studies is the estimation of species divergence times. The topic has a long and contentious history largely centered around questions of how to appropriately apply fossil calibrations (e.g., Heath et al. 2014; Brown and Smith 2018), rate heterogeneity (Pond and Muse 2005), rate of morphological evolution (Lynch 1990), and selecting an adequate clock model (Douzery et al. 2004; Lepage et al. 2007).

Beyond methodological concerns are those that emerge from the nature of the data itself. Most phylogenetic models assume that fixed differences between species are the result of genetic drift, and under the neutral theory of molecular evolution (Kimura 1968; King and Jukes 1969) the rate of evolution (or substitution rate) is equal to the per generation neutral mutation rate, *μ* (Kimura 1983). For well-calibrated molecular clocks (e.g., Knowlton and Weigt 1998; Weir and Schluter 2008; Herman et al. 2018), we can estimate the time of divergence (usually in years) as π_{12} / 2*μ*, where π_{12} is the pairwise sequence divergence between species 1 and 2. However, in general we are not interested in estimating the divergence time of specific genetic variants, but rather the time of population divergence (T_{D}). For example, we might be interested in estimating the timing of a vicariant event that we suspect corresponds to a past geological upheaval.

There is a known discrepancy between the coalescent time of neutral genetic variants (T_{MRCA}) and T_{D} (Nei and Li 1979). The degree of this discrepancy is determined by the ratio of T_{D} / *N*_{e}, where *N*_{e} is the effective population size (Edwards and Beerli 2000; Rosenberg and Feldman 2002). First, lineages must be within the same population, which occurs T_{D} generations in the past; second, these lineages must then coalesce, which on average requires 2*N*_{e} generations. Therefore, for a completely panmictic population: T_{MRCA} = T_{D} + 2*N*_{e}. The expected amount of pairwise sequence divergence is

(Wakeley 2000). When the ratio of T_{D} / *N*_{e} is large, the bias in coalescent time in the ancestral population is minimal compared to T_{D} (Edwards and Beerli 2000). However, as T_{D} / *N*_{e} becomes small, 2*N*_{e} plays a major role in the overall sequence divergence between species. Nordberg and Feldman (2002) evaluated the relationship between T_{MRCA} and T_{D} in a simple two population split model using coalescent simulations. They found that T_{MRCA} converged on T_{D} when the ratio of T_{D} / *N*_{e} ≈ 5. Importantly, the *N*_{e} in these models is that of the ancestral population; therefore, the extent of overestimation is the result of demographic conditions present in the ancestor. Demographic conditions that inflate *N*_{e}, such as ancestral population structure or a bottleneck following the split, is expected to have a major impact on divergence-time estimation (Gaggiotti and Excoffier 2000; Edwards and Beerli 2000; Wakeley 2000).

Wakeley (2000) demonstrated that in descendant species who share an ancestor whose population dynamics are characterized by an island model (Wright 1931) with free migration between demes, overestimation of divergence-times are on the order of 2*N*_{e}*D*[1 + (1/2M)] where M = 2*N*_{e}*mD*/(*D* - 1) and *m* is the migration rate. The expected amount of pairwise sequence divergence is therefore
where *D* is the number of demes.

Population subdivision initially leads to shallow coalescent times where individuals within a shared deme rapidly find ancestors (the “scattering phase”). However, since ancestral lineages must be in the same deme to coalesce, the rate in the “collecting phase” is characterized by the migration rate that shuffles ancestors around the range, reducing the probability that lineages coalesce (Wakeley 1998; 1999).

In the context of real populations, the island model of migration rarely applies (Meirmans 2012). Instead, population structure is the product of the spatial distribution and dispersal potential of the organism in question. Often this structure is in the form of isolation-by-distance (IBD). IBD is a widespread pattern in natural systems, characterized by a reduction in the probability of identity by descent (Wright 1943) or genetic correlation (Malécot 1968) with geographic distance. Patterns of IBD are most pronounced in stepping-stone models (Kimura 1953; Kimura and Weiss 1964) in which migration is restricted to neighboring demes. In this way, demes in close proximity share a greater proportion of migrants than they do with more distant demes. Distributions of coalescent times in stepping-stone models have been studied both in the context of one dimensional and two-dimensional models that are circular or toroidal (Maruyama 1970a; 1970b; Slatkin 1991), and in continuous models with joined ends (Maruyama 1971) or with discrete edges (Wilkins and Wakeley 2002). Slatkin (1991), using a circular stepping-stone model, showed that the probability for two genes sampled *i* steps apart have an average coalescent time:

Therefore, the amount of expected pairwise sequence divergence is:

The circular stepping-stone model should overestimate T_{D} more dramatically as the number of demes becomes large and the distance between them increases. However, like the island model of free migration, circular ranges are likely rare in nature. Instead, natural populations are characterized by discrete range edges where end demes may only receive migrants from one direction (e.g., Peterson and Denno 1998; Broquet et al. 2006; Aguillon et al. 2017). Hey (1991) showed analytically in the case of a linear stepping-stone model that the distribution of coalescent times of two alleles from demes at the extremes of the range should coalesce much deeper than any two alleles chosen randomly from the population.

Ultimately, the degree to which T_{MRCA} impacts phylogenetic inference and divergence-time estimation is dependent on its impact on π_{12}. Given that lower migration rates lead to greater T_{MRCA} (Hey 1991), we expect that differentiation (π_{12}) between end demes compared to central will become more pronounced at smaller *m*. If the difference between the T_{MRCA} of central demes and end demes is dramatic enough, we expect that divergence dating of species that arose from ancestral end demes may significantly overestimate T_{D}. ##examples

In this study, we estimate mean T_{MRCA} for two genes sampled in descendant species (either from the ends or the center of the ancestral range) in which the ancestral population is characterized by a stepping-stone model with discrete ends using a simulation approach. In particular, we are interested in what value of T_{D} / *ND* we expect T_{MRCA} to converge on T_{D}. Next, we examine the distribution of π_{12} across the genome under different simulated migration conditions to compare with expectations under a panmictic model. Finally, we test the performance of the phylogenetic inference program SNAPP (Bryant et al. 2012) on simulated single nucleotide polymorphism (SNP) data to evaluate how these trends may bias our inference of species divergence times.

## Methods

### Coalescent simulations

Using *fastsimcoal2* (Excoffier et al. 2013), we simulated sister species with a shared ancestor whose population dynamics are characterized by a stepping-stone model. Specifically, each simulation consisted of 10 demes (*D*) with no shared migration between them until time T_{D}. At T_{D}, migration resumes between demes in a linear stepping-stone fashion. In *fastsimcoal2*, the migration rate is the probability of an individual from deme *i* migrating to deme *j*, where *i* and *j* are neighboring demes. Center demes receive migrants from neighboring demes at rate 2*m*, whereas demes at the end of the range receive migrants at rate *m*. This is due to the fact that end demes have only a single neighbor, whereas all center demes have two neighbors (Fig. 1A). Throughout, we will use “end demes” to represent species descending from the ends of the ancestral range; “center demes” are those that descend from the center. We sampled *k* = 2 individuals to coalesce—in one run, we sample the end demes, and in the following we sample central neighboring demes. This was performed for migration rates of 0.1, 0.01, and 0.001, and a range of T_{D} / *ND* values. In addition, we simulated an island model of migration for comparison with the stepping-stone model. In the island model, the ancestral population consisted of 10 demes with free migration between each at rate *m*. This resulted in a total of 84 distinct simulation scenarios, and each were replicated 1,000 times.

To statistically compare between the three models (end deme sampled in stepping-stone, center deme in stepping-stone, and the island model), we subset ratios of T_{D} / *ND* to values of 10, 5, 2, 1, 0.5, and 0.1. Resulting T_{MRCA} distributions for each population model were compared using a pairwise Wilcoxon test in the R platform (R Core Team 2019), as the resulting distributions were non-normal.

### Genome simulations

To evaluate how ancestral IBD impacts pairwise sequence divergence (π_{12}), genome-wide coalescent times (T_{MRCA}), and divergence-time estimation, we performed hybrid simulations that combined the coalescent simulator *msprime* (Kelleher et al. 2016) and the forward-time simulator SLiM v3.3 (Haller and Messer 2019). Since forward-time simulators begin with individuals that are completely unrelated, often a neutral burn-in period is required to allow coalescence or mutation-drift equilibrium to occur (Haller et al. 2019). This can be computationally costly and time consuming; however, using tree-sequence recording methods in SLiM (Haller et al. 2019) we can bypass the need to equilibrate during the forward-time simulation. To generate a panmictic ancestral population with a coalescent history, we simulated 2000 individuals (*N*_{e} = 4000) using *msprime* with genome sizes of 10 Mb and a recombination rate of 10^{−8} (∼0.1 recombination events per individual per generation). The resulting coalescent trees were then imported into SLiM as the basis for the starting population.

In SLiM, the initial population was split into two populations of *N* = 1000: 1) an outgroup that remained panmictic (“sp3” in Fig. 1A) and 2) the ancestral population, which was subdivided into 10 demes (*N* = 100 per deme) in a linear stepping-stone model. These dynamics persisted for 50,000 generations after which the ancestral population was split into either “end” demes or “center” demes (see Fig. 1A). Population sizes of each deme following the split was increased to 1000 to maintain *N* throughout the simulation. Five different T_{D} values were simulated which correspond to T_{D} / *ND* ratios of 50, 25, 10, 5, and 1. These values of T_{D} / *ND* were chosen based on the results from the coalescent simulations (see Results); for values >10, T_{MRCA} is expected to converge on T_{D}, whereas values <10 are expected to overestimate T_{D} regardless of migration rates.

The resulting tree-sequences from the SLiM simulation were imported into Python 3 using *pyslim*, and we overlaid neutral mutations (*μ* = 10^{−7}) onto the trees using *msprime*. Pairwise divergence (π_{12}) was then estimated across the genome in windows of 100 kb for both end demes and center demes. These values were also converted into generations using π_{12} / 2*μ*, which gives a rough estimate of divergence time per window.

By rearranging equation 1, we can naively calculate *N*_{e} for the ancestral population from genome-wide π_{12} as:

From this, we plot estimated ancestral *N*_{e} within 100 kb windows across the genome to compare with the known census population size (*N*_{c} = 1000), and to evaluate the relationship between *N*_{e} and *N*_{c} in the presence of IBD.

Next, we plotted the distribution of coalescent times (T_{MRCA}) across the genome to visualize differences between T_{MRCA} of end and center demes. Median T_{MRCA} for each ratio and migration rate was compared via a Kruskal-Wallis test and a pairwise Wilcoxon rank test in R due to the data violating normality.

Each simulation produced >200,000 SNPs. For divergence-time analysis, we randomly sampled 3000 SNPs—a number found by Strange et al. (2018) to optimally perform in SNAPP (Bryant et al. 2012). Each run consisted of 10 individuals from species sp1 and sp2, and 1 individual from the outgroup population, sp3 (Fig. 1). Unlike other fully coalescent models, SNAPP does not sample from gene trees directly to estimate the species tree, but instead integrates over all possible gene trees using biallelic SNPs. The method has been found previously to perform well on both simulated and empirical data (Bryant et al. 2012; Strange et al. 2018). We designated a gamma-distributed prior on *θ* (=4*N*_{e}*μ*) with a mean equal to the expected π_{12} (equation 1). Forward (*u*) and backward (*v*) mutation rates were estimated within BEAUti (Bouckaert et al. 2014) from the empirical SNP matrix using the tab *Calc_mutation_rates*, and these values were sampled during the MCMC. The rate parameter λ, which is the birth-rate on the Yule tree prior, was gamma-distributed with α = 2 and β = 200, where the mean is α / β (Leaché and Bouckaert 2018).

SNAPP is designed to handle incomplete lineage sorting (ILS), but to minimize its effects—since we are not interested in the program’s ability to estimate topology but rather branch-lengths—we applied a fixed species tree. Branch-lengths in SNAPP do not scale to time, but instead are measured in number of substitutions. Given a fixed mutation rate, we convert the number of substitutions separating sp1 and sp2 to the number of generations as *g* = *s* / *μ*, where *s* is branch-lengths in units of substitutions (Bouckaert and Bryant 2015). The MCMC chain length was 10–50 million sampling every 1000 with a burn-in of 10%, ensuring that ESS values of interest were all >200. Runs were performed on the high-performance computing cluster CIPRES (www.phylo.org; Miller et al. 2010).

MCMC log files were then downloaded and analyzed in R. The performance of SNAPP was evaluated by comparing traces of end and center demes across migration rates and T_{D} / *ND* values. Results were evaluated using a two-way ANOVA followed by Tukey’s HSD post hoc test in R. Trees from the MCMC were summarized in TreeAnnotator v.2.6.0 (Bouckaert et al. 2014) and visualized in R using the package *ggtree* (Yu et al. 2017). Branch colors and widths were scaled by estimated median *θ* per branch.

## Results

### Coalescent simulation results

The coalescent simulations produced trends superficially similar to those found by Rosenberg and Feldman (2002). At the lowest T_{D} / *ND*, the proportion of deep coalescence was dramatically greater than at higher values with the curve producing a similar logarithmic relationship (Fig. 2). However, T_{D} and T_{MRCA} did not necessarily converge when T_{D} / *ND* = 5. Instead, the rate of convergence was dependent on both the deme sampled and the migration rate.

When migration was high (*m* = 0.1) and T_{D} / *ND* was less than 0.5, there was no significant difference between center or end demes in the stepping-stone model or the island model. However, for values of T_{D} / *ND* > 0.5, the T_{MRCA} of end demes became significantly different from both island (*p* < 0.02) and center demes (*p* < 0.01; see Table S1). When migration was reduced below 0.1, this pattern became more extreme. End demes were significantly different in all pairwise comparisons of models (*p* < 0.000001), and center demes differed from the island model at T_{D} / *ND* ratios of 0.5, 2, and 10 (*p* < 0.03) when *m* = 0.01. At the lowest migration rate simulated (*m* = 0.001), all pairwise model comparisons were significantly different when T_{D} / *ND* > 0.5 (*p* < 0.001; see Table S1).

### Genome simulation results

Results from the genome simulation approach corroborated those found with *fastsimcoal2*. Regardless of T_{D} / *ND*, when *m* = 0.1 the difference between center and end demes was less severe and only marginally significant (*p* = 0.001) relative to when *m* < 0.1 (Table S2). Across the simulated genomes, T_{MRCA} became dramatically deeper between end than center demes as migration fell below 0.01. For the genome-wide divergence estimates, the degree of overestimation depended on the ratio of T_{D} / *ND*. While all scenarios where *m* = 0.001 overestimated the true T_{D}, when T_{D} / *ND* < 10 end demes were 5–60 times more diverged than expected (Fig. 3). This is a direct result of the deeper coalescent times between end demes when *m* < 0.1, as these longer branches provide more time for mutations to occur and accumulate (Fig. 4).

Genome-wide coalescent times (T_{MRCA}) are shown in Fig. 4. When *m* = 0.1, only T_{D} / *ND* = 25 and 10 were significantly different between end and center demes (*p* < 0.005). Regardless of T_{D} / *ND*, the variance in T_{MRCA} steadily increased with decreasing *m*. Indeed, the increase in mean T_{MRCA} when *m* = 0.001 appears largely driven by an increase in the variance at this lower rate. Due to this, we find that ancestral *N*_{e} dramatically exceeds *N*_{c} when *m* = 0.001 (Fig. 6).

Despite the potential for divergence-time overestimation to be extreme, SNAPP was relatively resilient when T_{D} / *ND* > 10 and when *m* > 0.001. When T_{D} / *ND* = 50, SNAPP was overly conservative and underestimated the number of substitutions expected to occur (Fig. 5). When T_{D} / *ND* = 25, the mean estimate of both center and end demes when *m* > 0.001 either underestimated the true age or was within 5%. However, for end demes where *m* = 0.001 the estimated divergence time exceeded the true age by ∼80% (Table S3). A similar trend occurred when T_{D} / *ND* = 10 and 5. Here, both center and ends overestimated the true age, but the end demes did so more dramatically (138% the true age versus 81% for 10; 184% versus 67% for 5). The most dramatic overestimation occurred between end demes when T_{D} / *ND* = 1 at ∼700% the true age. Importantly, this was not merely the result of a low T_{D} / *ND* ratio, as the other migration regimes performed well. In fact, most were closer to the true T_{D} than the expected π_{12} accounting for 2*N* (Table S3).

Estimated *θ* for each branch is shown in Fig. 7 for T_{D} / *ND* = 10, and in Figs. S1–S4 for the remaining ratios. For all T_{D} / *ND* values except 1, the median ancestral *θ* was higher for end demes than center when *m* = 0.001, and the estimated *θ* for the descendant species (sp1 and sp2 in Fig. 1) was considerably lower than for the ancestor or the outgroup, sp3 (Fig. 7; Figs. S1–S3). These patterns are consistent with a population bottleneck, despite *N* being maintained throughout the simulation.

## Discussion

Macroevolutionary patterns are ultimately governed by microevolutionary processes (Li et al. 2018), an observation Lynch (2007), extending Dobzhansky’s (1973) maxim, summed up as “nothing in evolution makes sense except in light of population genetics”. In this light, we have demonstrated that the population genetic environment of the ancestor shapes the genetic landscape of descendant species. This has been known to impact tree topology when ILS is common (Kubatko and Degnan 2007) and overestimate divergence times in the presence of population structure caused by an island model of migration (Edwards and Beerli 2000; Wakeley 2000). Extensive prior work has shown that the stepping-stone model of migration reduces genetic correlation between demes (Kimura and Weiss 1964; Maruyama 1970a) and that demes farther apart should coalesce deeper in time than those geographically closer (Slatkin 1991; Hey 1991). However, to our knowledge, the impact of ancestral IBD has not been evaluated in the context of divergence-time estimation previously.

Rosenberg & Feldman (2002) found previously that when T_{D} / *N* = 5, T_{MRCA} and T_{D} largely converged in a simple population split model. However, when in the presence of ancestral IBD we found that convergence was dependent on the migration rate (i.e., the strength of ancestral IBD) and whether surviving demes neighbored each other or were at the range ends in the ancestral population.

When T_{D} / *ND* > 10, the ancestral dynamics contribute little to the divergence-time estimate differences between center and end demes. However, as this ratio decreases the contribution of 2*N*_{e} to overall sequence divergence becomes non-trivial. The probability that genetic variants share an ancestor just prior to the population split is higher between demes that are geographically closer than those more distant. This is mediated by the migration rate, which, when high enough, can largely erase the differences between center and end demes. When migration is high (10%, or *m* = 0.1), individuals move well between demes and the coalescent times largely converge (though deeper in time depending on the ratio of T_{D} / *ND*). However, as *m* falls below 1% (*m* = 0.01), or less than one migrant per generation being shared between demes, dispersal cannot keep up with genetic differentiation. Despite all migration regimes producing similar patterns of IBD (Fig. S5), *F*_{ST} becomes dramatically higher as migration drops below 1%. This differentiation in the ancestor contributes to the overall sequence divergence (π_{12}) between species, which drives an overestimation of the time of the population split (T_{D}) when end demes are the surviving lineages.

As expected, ancestral IBD skews π_{12} and T_{MRCA} away from expected values in a panmictic population, and this caused an inflation in *N*_{e} relative to *N*_{c}. For T_{D} / *ND* = 50 and *m* = 0.1, the mean π_{12} for end demes was 0.010459 and 0.010419 for center demes. Using equation 5, *N*_{e} = 1147.5 for end demes and 1047.5 for central. However, when *m* = 0.001, π_{12} for end demes was 0.012948, an *N*_{e} = 7370. Center demes, on the other hand, only increased to *N*_{e} = 1255. As with the coalescent times, at lower migration rates the variance in *N*_{e} becomes exceedingly large, driving up the mean. Importantly, mean genome-wide *N*_{e} always exceeds *N*_{c} in the presence of ancestral IBD at a level dictated by the migration rate.

This feature of ancestral IBD has important consequences for conservation genetics. Many studies use *N*_{e} as a rough biological measure of population size (Turner et al. 2002; Rieman and Allendorf, 2011; Hare et al. 2011), and therefore a metric of the health of a population. However, a common phenomenon in range contractions is fragmentation and isolation (Ceballos et al. 2017), which may result in IBD. If many of the demes once contributing to the connectivity of the population have become extinct, and *N*_{e} is estimated based on the surviving demes, it will overestimate the actual number of individuals within the population (i.e., the census size, *N*_{c}). Thus, we might incorrectly conclude that a population has a larger population size than it actually does, which may lead to mismanagement.

Since *N*_{e} is inflated in the ancestral lineage, the descendant species appear to pass through a bottleneck despite *N* remaining constant (Fig. 7). Estimated *θ* in SNAPP captured this dynamic with more extreme differences in *θ* (i.e., more dramatic bottlenecks) being inferred between end demes and when *m* = 0.001. Population bottlenecks have been found to cause divergence-time overestimation due to random differential survival of ancestral alleles into the descendant species (Gaggiotti and Excoffier 2000). In the presence of IBD, this differential allelic persistence between demes is mimicking a bottleneck—when demes are far apart this pattern is more extreme as they already maintain different allelic patterns ancestrally. However, because this pattern is recognizable (Fig. 7; Figs. S1–S3) it can be used to signal when ancestral IBD may be impacting our divergence-time estimation. Unfortunately, without prior range-size knowledge it may be impossible to differentiate between ancestral IBD and a bottleneck since these produce virtually identical genetic patterns. However, it may not be necessary to do so for simple divergence estimates.

The broader impact of ancestral IBD on divergence-time estimation when in the context of large phylogenies is beyond the scope of this work, but it is conceivable that the longer than expected branches between sister species might bias rate estimation (Aris-Brosou and Excoffier 1996; Magallón 2010). In the case of ancestral IBD, the inflated *N*_{e} is mimicking a pattern of substitution rate increase. Under neutrality, the rate of substitution is equal to the per generation mutation rate, *μ* (Kimura 1983); however, in the presence of population structure, substitutions may occur in the ancestral lineages between demes separated by large geographic distances. If the true age of the sister taxa is known but ancestral structure is not accounted for, the substitution rate will be upwardly biased.

Ancestral structured populations leave their imprint on descendent species in the form of greater coalescent times, and therefore larger than expected pairwise divergences between species. Further, these patterns cause inflated *N*_{e} relative to census sizes. Since ancestral IBD mimics the signature of a population bottleneck, coalescent methods that co-estimate *θ* along with the topology and π_{12}, such as SNAPP and *BEAST (Bouckaert et al., 2014), may be the best suited to reveal this potential source of bias. However, fully coalescent models such as these are infamously computationally costly and not presently used for whole-genome sequence data or for phylogenies with large numbers of tips. Indeed, SNAPP becomes prohibitively slow when the number of tips is ∼30 (Leaché and Bouckaert, 2018).

In the context of larger phylogenies or organisms in which little is known about their ancestral range, it may be impossible to know if extant species descend from range centers or ends, or the level of IBD present in the ancestor. The genetic consequences of ancestral structure therefore behave much like “ghost” populations (Slatkin 2005); despite being extinct, their influence haunts our ability to adequately assess the phylogenetic history of their descendants.

## Data accessibility statement

SLiM recipes, R and python code, and .XML files have been uploaded to https://github.com/hancockzb/ancestralIBD.

## Supplementary Material

## Acknowledgements

Thanks to Ben Haller and Wesley Brashear for coding help.