## Abstract

Fragmented landscapes pose a significant threat to the persistence of species: small, isolated populations face a heightened risk of extinction due to the combined effects of genetic and demographic factors such as genetic drift and demographic stochasticity. This paper explores the interplay between genetic load and extinction risk within metapopulations, with a focus on understanding the impact of eco-evolutionary feedback mechanisms. We distinguish between two models of selection: soft selection, characterised by subpopulations maintaining carrying capacity despite load, and hard selection, where load can significantly affect population size. Within the soft selection framework, we investigate the impact of gene flow on genetic load at a single locus, while also considering the effect of selection strength and dominance coefficient. We then build on this to examine how gene flow influences both population size and load under hard selection, and to identify critical thresholds for metapopulation persistence. Our analysis employs the diffusion, semi-deterministic and effective migration approximations. Our findings reveal that under soft selection, even modest levels of migration can significantly alleviate the burden of load. In sharp contrast, with hard selection, a much higher degree of gene flow is required to mitigate load and prevent the collapse of the metapopulation. Overall, this study sheds light on the crucial role migration plays in shaping the dynamics of genetic load and extinction risk in fragmented landscapes, offering valuable insights for conservation strategies and the preservation of diversity in a changing world.

## Introduction

The long-held belief that ecological (eco) and evolutionary (evo) processes occur on timescales too different to influence each other has come under scrutiny, as many studies (Thompson, 1998, Hairston Jr et al., 2005) have documented that these timescales can be comparable: this can be the case when selection varies sharply over space and time, or when populations are too small to adapt efficiently, resulting in rapid loss in fitness (e.g., due to the accumulation of deleterious alleles).

Such eco-evolutionary feedback is particularly important in fragmented landscapes with small-sized patches, where mating between genetically related individuals within patches (i.e., inbreeding) as well as stochastic changes in genetic composition (due to genetic drift) can compromise the efficiency of natural selection. As a consequence, populations in such landscapes may become more susceptible to fixing deleterious mutations, resulting in an increase in genetic load and a decline in population numbers, which can further exacerbate both genetic drift and inbreeding. The resultant positive feedback loop between increasing load and falling population sizes may drive populations towards extinction through a process termed *mutational meltdown* (Lynch et al., 1995b). Evidence of such extinction events has been found in plant populations (Matthies et al., 2004) and in some vertebrate (Fagan and Holmes, 2006) and invertebrate (e.g., *M. cinxia*; Saccheri et al., 1998) populations. This kind of feedback may be further aggravated by demographic and environmental stochasticity, both of which can limit the potential of populations to adapt to changing conditions. However, despite some understanding of individual factors contributing to extinction risk, a comprehensive picture of how deterministic and stochastic processes together structure genetic diversity and influence the persistence of populations remains lacking. This understanding is crucial for formulating effective conservation strategies that can safeguard biodiversity and promote long-term survival of endangered species (Lande, 1993, Frankham, 1998), particularly as human activities continue to exert significant pressures on natural environments. Gaining such quantitative understanding requires us to consider how fragmented the landscape is: how many local populations it contains, what their sizes are, to what extent they are connected by gene flow, etc.

Migration can affect various aspects of the eco-evo feedback loop, in essence homogenizing both genetic composition and population density across fragmented populations. Investigations of natural populations yield examples of both the beneficial and deleterious effects of migration. For instance, Finger et al. (2011) demonstrated how augmented gene flow reduced the negative effects of inbreeding in a critically endangered and isolated jellyfish tree (*Medusagyne oppositifolia*) population. Similarly, Land and Lacy (2000) showed that introducing eight wild-caught Texan female panthers into a small isolated Florida panther population caused a tripling of its size, bringing it back from near extinction within just 12 years of re-introduction. In contrast, Herman et al. found that gene flow between surface and cave populations of Mexican tetra, *Astyanax mexicanus*, had a swamping effect on cave-related traits. These examples show how migration can have very complex effects - it can have a ‘rescue effect’ when migrants are introduced into a vulnerable population, or it can have negative consequences, or more generally engender a tension between inbreeding and outbreeding depression, in particular when migration occurs between populations that are adapted to different environmental conditions or between distinct species (Frankham et al., 2011, Edmands, 2007, Templeton, 1986). These varied effects of migration introduce complexities to practical decisions in conservation such as those concerning assisted gene flow (Aitken and Whitlock, 2013), underscoring the need for a more quantitative and theoretical understanding of gene flow in fragmented landscapes.

Migration can have rather intricate effects on genetic diversity even when there is no local adaptation i.e., when different local populations are subject to uniform selection pressures, e.g., due to purifying selection against unconditionally deleterious alleles. Small and somewhat isolated local populations may be nearly fixed for deleterious alleles at different loci; migration between such populations can result in hybrid offspring of increased vigour due to masking of recessive (deleterious) mutations. Migration can also increase fitness variation, which has two competing effects - it may alleviate drift load by preventing fixation of deleterious alleles (especially at low levels of migration), but also prevent purging of recessive alleles by increasing heterozygosity (at higher levels of migration) (Glémin et al., 2003).

The effect of migration on genetic diversity and adaptation in structured populations depends crucially on whether selection is “soft” or “hard”, i.e., whether different local populations contribute equally to the next generation regardless of fitness, or if the contribution of fitter populations is higher (e.g., when these are larger and send out more migrants). In the latter case, adaptation is expected to be most efficient at intermediate migration rates (Uecker et al., 2014, Gomulkiewicz et al., 1999), i.e., when gene exchange between demes is sufficiently low so as not to displace exceptionally fit local populations from the highest “adaptive peaks”, but high enough to eventually cause the entire population to move towards these peaks (Wright, 1931, Rouhani and Barton, 1993).

However, most work on hard vs. soft selection in subdivided populations assumes a rather specialised life cycle in which all (adult or juvenile) individuals across all demes join a common pool prior to reproduction, followed by uniform redistribution of zygotes back into demes (Levene, 1953, Christiansen, 1975, Ravigné et al., 2004), making it difficult to assess the role of limited dispersal. In particular, models of hard selection typically assume that while different local demes can vary in size and thus contribute differentially to the common pool of reproducing adults, the metapopulation as a whole is under global density regulation and is moreover at carrying capacity (see, e.g., Whitlock (2002)). This makes it difficult to use such models to assess how genetic load can decrease local populations over multiple generations due to the positive feedback between declining population size and increasing load (or vice versa) and to what extent this kind of meltdown may be arrested by (limited) dispersal between populations – questions that are crucial for a quantitative understanding of how migration ameliorates extinction risk in fragmented populations. This therefore calls for more realistic models that allow for variation in both local and global population sizes and explicitly incorporate eco-evolutionary feedback.

Lynch et al. (1995a, 1995b) considered such a model, in which load due to the accumulation of deleterious mutations leads to increased extinction risk. Using extensive computer simulations, they showed that populations with low effective size (typically less than 100 individuals) are at a significant risk of extinction via mutational meltdown within only about 100 generations (Lynch et al., 1995a), a conclusion with potential implications for management programs such as captive breeding programs. However, their analysis was limited to single randomly mating populations. Higgins and Lynch (2001) went beyond this to consider the dynamics of a metapopulation. Using metapopulation simulations with global and nearest-neighbour dispersal, they demonstrated that the eventual fate of a metapopulation (as measured by the median time to extinction) is critically dependent on the number of demes or metapopulation patches (see also Lande et al. (2003), Ch. 4, Table 4.1) as well as on the dispersal neighbourhood. Their analysis, however, primarily focused on extinction times and was restricted to scenarios where patch sizes were relatively small, resulting in early population extinctions. Furthermore, their study was entirely simulation-based, making it hard to generalize their conclusions.

Going beyond simulations, Szép et al. (2021) introduced a stochastic polygenic model that explicitly captures the coupling between population size and allele frequency at multiple loci, and investigated how this influences local adaptation and extinction in metapopulations, as well as the role of gene flow and stochastic events, i.e., demographic stochasticity and drift. They found that the extinction risk of metapopulations is higher when population sizes are small and the coupling between population size and allele frequency is strong. They also showed that local adaptation is more difficult under hard selection and if locally adaptive traits are more polygenic, causing populations to become extinct under much lower levels of gene flow than would be expected from single-locus theory. However, their conclusions raise the question of how these results change when selection is uniform across space. Using a similar theoretical model, but with spatially uniform selection, Sachdeva et al. (2022) investigated whether or not asymmetric gene flow (from a mainland to an island) can help arrest mutational meltdown due to eco-evo feedback and thus prevent extinction of the island population. They found that migration can have qualitatively different effects on the island: a positive effect (i.e., reducing load and thus the risk of extinction) when the island population is small and isolated (i.e., in a sink state) and the opposite effect (i.e., increasing load by hindering purging) when it is large and connected.

However, beyond marginal populations that are plagued by asymmetric gene flow from central populations, it is important to understand how eco-evo feedback influences extinction risk and load in a metapopulation where migration is random and where the different islands (demes) can fix for different subsets of deleterious alleles that mask one another. We therefore extend the work of Sachdeva et al. (2022) to explore the eco-evolutionary dynamics of a metapopulation made up of a large number of demes exchanging migrants at random (i.e., under the island model). Our goal is to investigate the effect of gene flow on equilibrium load and population size, as well as the critical thresholds required for the persistence (i.e., for non-extinction) of the metapopulation. We further analyse how these thresholds are influenced by the architecture of load (the genomewide deleterious mutation rate and the distribution of selective effects and dominance coefficients of deleterious mutations) as well as features of the metapopulation landscape (carrying capacities and baseline growth rate of local demes). As we see below, a key parameter that determines the fate of the metapopulation is the total mutation rate relative to the baseline growth rate, which determines the extent to which genetic load depresses population growth and can thus be considered a proxy for the “hardness of selection”. Besides the advantage of explicitly incorporating the relationship between population size and load, our theoretical approach explicitly accounts for the indirect coupling between the dynamics of different loci: a small increase in drift load at very many weakly selected loci can have rather strong effects on total load which can further exacerbate the effects of drift per locus, resulting in an indirect coupling between different loci even in the absence of epistasis. 
This indirect coupling adds more complexity to the eco-evo dynamics and can have significant implications for population fitness and evolutionary outcomes in the metapopulation. We later go on to discuss the application of our results to conservation issues.

## Model and Methods

We consider a metapopulation with *n*_{D} local populations or patches (indexed by *i*=1, …, *n*_{D}), all connected to each other via migration. In each patch *i*, mating occurs randomly and adult individuals undergo only one breeding season per generation. Individuals are diploid and subject to deleterious mutations at *L* unlinked loci at a rate *u* per haploid locus, resulting in a genome-wide deleterious mutation rate of 2*Lu*. We also allow for mutations in the reverse direction at rate *v*. The fitness of an individual is multiplicative across loci: at any locus, the wild-type homozygote has fitness 1, the heterozygote has fitness *e*^{−hs} and the mutant homozygote has fitness *e*^{−s}, with *h* representing the dominance coefficient and *s* the strength of selection against the mutant allele. We assume here that *s* and *h* are the same across all loci, a gross simplification that is relaxed in the supplementary material, appendix G. Patches are also allowed to have different population sizes and mean fitness.

In each generation, a fraction *m* of individuals in each patch *i* migrate into a common pool and individuals from this pool are then evenly dispersed back to the patches. We assume that all patches experience the same environment, so that the fitness of any genotype is independent of the patch. We further assume a logistic model of population growth in each patch *i*, in which the population size *n*_{i}(*t*+1) in *i* after reproduction and density regulation depends on its previous size *n*_{i}(*t*), its carrying capacity *K*, the baseline growth rate *r*_{0}, as well as on the mean fitness of individuals in *i* (hard selection), which holds true in a number of natural populations. Note that the carrying capacity *K* and the baseline growth rate *r*_{0} are assumed to be the same across all patches.
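The explicit growth equation is not reproduced in the paragraph above. As an illustrative assumption only, one standard discrete-time (Ricker-type) form that matches the verbal description is given below; the symbol $\overline{W}_i(t)$ for the mean fitness in patch $i$ is introduced here for illustration and may differ from the authors' notation.

$$
\mathbb{E}\left[n_i(t+1)\right] \;=\; n_i(t)\,\overline{W}_i(t)\,\exp\!\left[r_0\left(1-\frac{n_i(t)}{K}\right)\right]
$$

Under this form, a patch at low density grows at rate roughly $r_0 + \ln \overline{W}_i$, so that growth is depressed by an amount set by the load, which is the sense in which selection is hard.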

The parameters of our model are therefore the number of loci *L*, the forward mutation rate *u* and the reverse mutation rate *v*=*gu* (where *g* represents the degree of mutational bias; we mostly assume *g*=1), the strength of selection *s*, the dominance coefficient *h*, the migration rate *m*, the baseline growth rate *r*_{0} in each patch and the carrying capacity *K*. As will be argued below, the behaviour of our model is governed largely by the composite parameters *Km* (which represents the strength of migration scaled by the carrying capacity), *Ks* (which represents the per-locus strength of selection scaled by the carrying capacity), *Ku* (which is the deleterious mutation rate scaled by the carrying capacity), *h*, 2*LU*=2*Lu/r*_{0} (which is a measure of the hardness of selection) and *r*_{0}*K* (whose inverse represents the strength of demographic fluctuations).

To guide the choice of parameter ranges, we look to some estimates in the literature. For example, the number of deleterious mutations arising in the genome per individual per generation has been estimated to be of the order of 0.1−1 in multicellular eukaryotes (Lynch et al., 1999), 1.2−1.4 in *Drosophila* (Haag-Liautard et al., 2007), 0.25−2.5 in *C. elegans* (Denver et al., 2004) and 2.2 in humans (Keightley, 2012), the latter being an unusually high value. Similarly, data on the distribution of fitness effects remain controversial, as different studies have generated varied selection estimates on phenotypic traits. Although some experimental studies have shown that a sizeable fraction (<15%) of mutations are likely to be lethal (Mukai et al., 1972, Eyre-Walker et al., 2006), there is consensus that small-effect mutations (with effect <5% on quantitative traits) are more abundant (Mukai, 1964, Mukai et al., 1972, Lyman et al., 1996, Lynch et al., 1999). Dominance coefficients are harder to estimate for weakly deleterious mutations but may be ∼0.2 for moderately deleterious variants (Charlesworth and Charlesworth, 1987). In addition, several studies point to a negative relationship between *s* and *h*, with mutations of small effect being almost additive and those of large effect being almost recessive (Greenberg and Crow, 1960, Simmons and Crow, 1977, Lynch et al., 1999, Gillespie, 2004). The baseline growth rate, *r*_{0}, is a very challenging parameter to estimate as it can fluctuate due to a multitude of interrelated factors. Statistical and mathematical models that incorporate factors like the carrying capacity of the population in question, birth and death rates, as well as other ecological parameters, are often used to predict *r*_{0}, but they can only provide an approximation at best due to the complexities involved.

Finally, indirect measures (e.g., *F*_{ST}) suggest moderate to high levels of gene flow in plants and mammals, with typical values between closely related species being of the order 0.05−0.2. For example, *F*_{ST} is ∼0.099 between North American and Eurasian populations of gray wolf and ∼0.018 between gray and red wolf populations, which corresponds roughly to 2 and 14 migrants respectively per generation between local populations. Similarly, average *F*_{ST} was found to be ∼0.07 in a study of 12 populations of *Camelina sativa* and ∼0.077 in 337 species of seed plants (Gamba and Muchhala, 2020). Studies have however found that *F*_{ST} values are higher in plants with limited dispersal mechanisms, such as those that are self-pollinated or wind-dispersed. For instance, Hamrick et al. (1990) found *F*_{ST} values to be of the order of 0.32 in a population of trees that dispersed their seeds by gravity and of the order of 0.15 in those that dispersed their seeds by wind, corresponding to approximately 5 and 14 migrants every ten generations. It is important to note that in populations that are on the verge of extinction, gene flow may be much less than this (Frankham et al., 2002, Casas-Marce et al., 2013, Szczecińska et al., 2016), and indeed one of the goals of this study is to identify critical levels of migration required to prevent population collapse. Thus, most of the study will focus on low to moderate levels of gene flow. In addition, we will consider a genome-wide mutation rate (scaled relative to the growth rate) 2*LU*<1 (which corresponds to a <63% reduction in fitness); otherwise, populations will not grow. We will also consider both recessive and additive alleles and growth rates of the order of 0.1.
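The migrant numbers quoted above can be checked with a short calculation. The sketch below assumes Wright's infinite-island relation *F*_{ST} = 1/(1 + 4*Nm*), which the text does not state explicitly; the function name is hypothetical.

```python
def migrants_per_generation(fst: float) -> float:
    """Number of migrants per generation, Nm, implied by a given F_ST.

    Assumption: Wright's infinite-island relation F_ST = 1 / (1 + 4 Nm).
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("F_ST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    examples = [
        ("gray wolf, North America vs Eurasia", 0.099),
        ("gray vs red wolf", 0.018),
        ("trees with gravity-dispersed seeds", 0.32),
        ("trees with wind-dispersed seeds", 0.15),
    ]
    for label, fst in examples:
        nm = migrants_per_generation(fst)
        print(f"{label}: F_ST = {fst:.3f}, Nm = {nm:.2f}")
```

Under this relation the four *F*_{ST} values give roughly 2.3, 13.6, 0.53 and 1.4 migrants per generation, in line with the rounded figures in the text (the last two correspond to about 5 and 14 migrants per ten generations).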

Given this set-up, we are generally interested in understanding the varied effects migration has on population outcomes and the role that parameters such as dominance, selective effects and mutational bias play in this. Using simulations and analytical approximations, we explore this problem from two different angles.

First, we consider a soft selection model where the maximum possible genetic load, 2*Lu* in any deme is much less than the intrinsic growth rate (i.e., 2*Lu/r*_{0}≪1) so that demic population size (and hence that of the metapopulation) is always constant over time (even at high levels of maladaptation) thus ignoring the possibility of local extinctions. With this model, we explore the effect of gene flow on load due to a single locus and how this depends on the selection strength and dominance coefficient at the given locus.

Secondly, we also consider a hard selection model where the maximum possible genetic load in a population or deme is appreciable relative to the intrinsic growth rate (i.e., 2*Lu/r*_{0}≳1) so that population size now depends on the degree of maladaptation through load. This model therefore takes into account the interplay between population size and allele frequency, thus accounting for the possibility of local (and by extension global) extinction. Also, since population size differences exist among demes under this kind of model, larger and more fit islands would contribute more to the migrant pool and would be less influenced by incoming migrants. Using this model, we focus on understanding how migration influences population size and load (and hence metapopulation persistence); how it impacts critical extinction thresholds; and finally what the role of the hardness of selection is (where hardness of selection is simply a measure of the reduction in the growth rate of a population due to load and is quantified by the value of 2*LU* =2*Lu/r*_{0}; the larger this value, the more strongly is growth rate reduced, resulting in harder selection).

To gain insight into the above questions, we make use of a number of analytical approximations which are described below. First, we assume the diffusion approximation upon which we base our soft selection analysis. This will be used to obtain the equilibrium distribution of allele frequency at any given locus in any deme conditional on the mean allele frequency across the entire metapopulation (Wright, 1937).

In using the diffusion approximation (Wright, 1937), we make a simplifying assumption that alleles at different loci evolve independently (which is valid when all evolutionary and ecological processes are much weaker than the strength of recombination). However, to account for the effect of multilocus LD on load, we will consider a diffusion approximation with effective migration rates *m*_{e} called the ‘effective migration approximation’ (Sachdeva, 2022).

Under hard selection, population sizes can vary between demes and also within a deme over time, so that the metapopulation is characterised by a distribution of population sizes, and we hence need to follow the joint distribution of population size and allele frequencies in any given deme. It turns out that the diffusion approximation cannot be used to obtain an explicit solution for such a joint distribution in the presence of mutation. Hence, to make analytical progress we look to a different approximation, *the semi-deterministic approximation* (see also Sachdeva et al. (2022)). This assumes that the population size of any deme is reduced with respect to carrying capacity by an amount that depends on the genetic load and is thus *determined* by load, rather than following a distribution. It further assumes that the load is that expected at mutation-selection-migration-drift equilibrium for the population size, allowing us to solve self-consistently for the population size (and load). Over the time scale required to reach such an equilibrium, population size remains relatively constant due to weak demographic fluctuations, and thus the approximation accounts for genetic drift but not demographic stochasticity.

The different approximations are discussed below. First we describe the dynamics of the joint evolution of population size and allele frequencies.

### Evolution of allele frequencies and population sizes in continuous time

If ecological and evolutionary processes are weak relative to recombination, so that they cause only minor changes per generation, and if, in addition, we ignore non-random associations and correlations among loci, then one can describe the dynamics of the allele frequency, *p*_{i,j}, at any locus *j* and the population size, *N*_{i}, in any deme *i* in continuous time by a pair of coupled stochastic equations, eqs. (1) and (2), in which *R*_{g,i} is the load in deme *i*, obtained as the sum of the load over all loci *j*.

Equations (1) and (2) are non-dimensional equations that have been obtained by re-scaling population size, *n*, relative to carrying capacity, *K* (i.e., *N*=*n/K*), re-scaling load, *R*_{g}, relative to the rate of increase in the population from low numbers, i.e., the characteristic growth rate, *r*_{0} (i.e., *R*_{g}=*r*_{g}*/r*_{0}), as well as re-scaling all other parameters relative to *r*_{0}, i.e., *S*=*s/r*_{0}, *U*=*u/r*_{0}, *M*=*m/r*_{0} and *τ*=*r*_{0}*t*. Here *q*_{i,j}=1−*p*_{i,j}, $\bar{N}$ is the population size averaged across all demes of the metapopulation and $\overline{Np_j}$ is the allele copy number at locus *j* averaged across all demes of the metapopulation. Finally, $\lambda_{p}$ and $\lambda_{N}$ are the stochastic parts of the equations. $\lambda_{p}$ represents the process of drift (i.e., stochastic fluctuation in allele frequencies due to random mating) and has mean $\mathbb{E}[\lambda_{p}]=0$ and variance $p_{i,j}q_{i,j}/(2\xi N_i)$. $\lambda_{N}$ represents demographic stochasticity (population size fluctuation due to randomness in birth and death) and has mean $\mathbb{E}[\lambda_{N}]=0$ and variance $N_i/\xi$. Note that *ξ*=*r*_{0}*K* (its inverse determines the strength of demographic fluctuations).
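The two displayed equations themselves are not reproduced above. As a hedged sketch, the following pair is consistent with the fitness scheme, the logistic growth assumption and the rescalings described in this section, taking *p* to be the frequency of the deleterious allele. The noise symbols $\lambda_{p}$ and $\lambda_{N}$, the bar notation, $V = v/r_0$ for the scaled reverse mutation rate, and the ordering of the terms are assumptions made here and may differ from the authors' eqs. (1) and (2).

$$
\begin{aligned}
\frac{dp_{i,j}}{d\tau} &= -S\,p_{i,j}\,q_{i,j}\left[h + (1-2h)\,p_{i,j}\right] + \left(U q_{i,j} - V p_{i,j}\right) + \frac{M}{N_i}\left(\overline{Np_j} - \bar{N}\,p_{i,j}\right) + \lambda_{p}, \\
\frac{dN_i}{d\tau} &= N_i\left(1 - N_i - R_{g,i}\right) + M\left(\bar{N} - N_i\right) + \lambda_{N}, \\
R_{g,i} &= S\sum_{j=1}^{L}\left(2h\,p_{i,j}\,q_{i,j} + p_{i,j}^{2}\right).
\end{aligned}
$$

The selection term follows from $\tfrac{1}{2}\,p\,q\;\partial \ln \overline{W}/\partial p$ with $\ln \overline{W} = -s\,(2hpq + p^{2})$ per locus, and the migration term from mixing a fraction *m* of each deme with a common pool whose allele copy number is weighted by deme size.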

### The diffusion approximation for allele frequencies

The diffusion approximation has a long history in population genetics, being an effective mathematical tool for understanding how the long-term behaviour of populations is shaped by evolutionary (and ecological) processes.

Under soft selection, each deme is at carrying capacity, so that the population size in each deme is *n*=*K* and is not affected by load; hence, we can ignore eq. (2). The third term in eq. (1) then depends only on the scaled migration rate, *M*, the allele frequency at locus *j* and the mean allele frequency at *j* (i.e., averaged across all demes of the metapopulation). We can now write down equations for the time evolution of the joint distribution of allele frequencies for a fixed *N* (i.e., under soft selection). In the spirit of Wright, and under soft selection, one can find an equilibrium solution for this joint distribution (eq. (4)), in which *ψ* denotes the marginal distribution at locus *j*.
The expected allele frequency and heterozygosity at any locus *j* in the population can be obtained by numerically integrating over eq. (4).
The expected load, 𝔼[*R*_{g}], in the population can then be obtained as the sum of the expected load over all loci.
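The calculation described above can be sketched numerically for a single locus. The sketch assumes that the equilibrium density takes Wright's classical form, ψ(p | p̄) ∝ p^{4K(u+m p̄)−1} q^{4K(v+m q̄)−1} exp[−2Ks(2hpq+p²)], with *p* the deleterious-allele frequency and p̄ the metapopulation mean; this explicit form, the function names and the numerical scheme (a midpoint rule after a change of variable, then bisection for p̄) are assumptions made here for illustration, not the authors' implementation.

```python
import math


def _half_integral(g, expo, n=1000):
    """Integrate x**(expo - 1) * g(x) over 0 < x < 1/2 by the midpoint rule.

    The substitution x = 0.5 * t**(1/expo) absorbs the endpoint singularity.
    """
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += g(0.5 * t ** (1.0 / expo))
    return (0.5 ** expo / expo) * total / n


def moments(pbar, K, s, h, u, v, m):
    """Return (E[p], E[pq], E[p^2]) under the assumed density psi(p | pbar)."""
    a = 4.0 * K * (u + m * pbar)          # exponent attached to p
    b = 4.0 * K * (v + m * (1.0 - pbar))  # exponent attached to q = 1 - p

    def sel(p):  # selection factor: mean fitness raised to the power 2K
        return math.exp(-2.0 * K * s * (2.0 * h * p * (1.0 - p) + p * p))

    def integrate(f):
        # Lower half is integrated in p, upper half in q = 1 - p.
        low = _half_integral(
            lambda p: (1.0 - p) ** (b - 1.0) * sel(p) * f(p), a)
        up = _half_integral(
            lambda q: (1.0 - q) ** (a - 1.0) * sel(1.0 - q) * f(1.0 - q), b)
        return low + up

    z = integrate(lambda p: 1.0)
    return (integrate(lambda p: p) / z,
            integrate(lambda p: p * (1.0 - p)) / z,
            integrate(lambda p: p * p) / z)


def solve_soft_selection(K, s, h, u, v, m, iters=40):
    """Bisection for pbar = E[p | pbar]; returns (pbar, expected load per locus).

    Assumes a single self-consistent solution on (0, 1) and u, v > 0.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moments(mid, K, s, h, u, v, m)[0] > mid:
            lo = mid
        else:
            hi = mid
    pbar = 0.5 * (lo + hi)
    _, e_pq, e_p2 = moments(pbar, K, s, h, u, v, m)
    return pbar, s * (2.0 * h * e_pq + e_p2)


if __name__ == "__main__":
    # Illustrative values only: Ks = 1, Ku = 0.01, Km = 0.5, additive allele.
    pbar, load = solve_soft_selection(K=100, s=0.01, h=0.5,
                                      u=1e-4, v=1e-4, m=5e-3)
    print(f"mean frequency = {pbar:.4f}, load per locus = {load:.6f}")
```

In the neutral case (*s*=0) the assumed density is a Beta distribution, so the self-consistent mean reduces to *u*/(*u*+*v*), which gives a simple check on the numerics.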
Under hard selection, however, we cannot find a direct solution for the joint distribution of population size and allele frequency with mutation. Hence, we resort to a semi-deterministic approximation as described below.

### The semi-deterministic approximation

As stated earlier, under the model of hard selection and in the presence of mutation, the diffusion approximation introduced above does not give an explicit formula for the equilibrium population size and load in any deme of a metapopulation. Hence, we introduce here a new approximation - the semi-deterministic approximation (Szép et al., 2021, Sachdeva et al., 2022).

This allows us to get straightforward and accurate solutions for the expected size and load in any population at equilibrium. It is called semi-deterministic because it takes into consideration one kind of stochasticity (genetic drift) but ignores the other (demographic stochasticity). It assumes that allele frequencies in a population change rather slowly and that demographic fluctuations in the population are very weak (implying a fairly steady size), so that at equilibrium, the load can be approximated by the expectation at mutation-migration-selection-drift balance and population size can then be calculated by a simple expression that depends on the load.

Mathematically, this can be obtained by setting the l.h.s. of eq. (2) to zero and assuming that at selection-mutation-migration-drift equilibrium, *R*_{g}∼𝔼[*R*_{g}|*N*_{*}], where *N*_{*} represents the equilibrium population size. If we further assume that population sizes are roughly similar across patches (so that no island acts as a demographic source or sink), then population size on any island is just reduced (w.r.t. carrying capacity) by an amount proportional to load, i.e., *N*_{*}∼1−𝔼[*R*_{g}|*N*_{*}]. This, together with the equilibrium expression for 𝔼[*R*_{g}|*N*_{*}] (given in eq. (5)), allows us to solve for *N*_{*}.
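The self-consistency condition *N*_{*}∼1−𝔼[*R*_{g}|*N*_{*}] can be solved numerically once the expected load is available as a function of the scaled population size. The sketch below is a generic root finder, not the authors' code: the function name, the grid scan and the choice to return the largest solution (with 0 standing for extinction when no positive solution exists) are assumptions made here, and `expected_load` is a user-supplied placeholder for 𝔼[*R*_{g}|*N*].

```python
from typing import Callable


def equilibrium_size(expected_load: Callable[[float], float],
                     grid: int = 1000, iters: int = 60) -> float:
    """Largest N in (0, 1] with N = 1 - expected_load(N); 0.0 if none exists.

    expected_load(N) should return E[R_g | N], the scaled load expected at
    mutation-selection-migration-drift balance for scaled size N.
    """
    def g(n: float) -> float:
        return 1.0 - expected_load(n) - n

    hi = 1.0
    if g(hi) >= 0.0:              # no load at carrying capacity
        return 1.0
    for k in range(grid - 1, 0, -1):
        lo = k / grid
        if g(lo) >= 0.0:          # sign change between lo and hi: bisect
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if g(mid) >= 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        hi = lo
    return 0.0                    # load too high everywhere: extinction


if __name__ == "__main__":
    # Placeholder load functions, for illustration only.
    print(equilibrium_size(lambda n: 0.3))   # load independent of size
    print(equilibrium_size(lambda n: 1.2))   # load exceeds growth rate
```

Scanning downwards from *N*=1 picks out the large-population equilibrium when several solutions exist; any smaller solutions would have to be located separately.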

The above assumptions imply that we account for the genetic effects of migration on load but ignore its demographic effects, which can be important in source-sink dynamics (i.e., when population sizes differ across demes). Our approximation is thus limited in this respect. The situation is, however, not too dire: as we shall see later in the results section, even in scenarios with source-sink dynamics, our approximation accurately predicts the population sizes of the non-extinct patches, though it fails to predict the proportion that are extinct.

There are five important parameters that arise from our semi-deterministic approximation. They are: the dominance coefficient, *h*, that determines the degree of dominance or recessiveness of an allele; the selection strength scaled by the carrying capacity, *Ks*, that determines the strength of selection against the deleterious homozygote at a locus relative to local drift when demes are at carrying capacity; the mutation rate scaled by the carrying capacity, *Ku*; the migration rate scaled by the carrying capacity, *Km*, that determines the degree of subdivision; and finally, the genome-wide mutation rate relative to the baseline growth rate, 2*LU*=2*Lu/r*_{0}, that determines the hardness of selection (see Sachdeva et al. (2022)). 2*LU*=0 is equivalent to pure soft selection and 2*LU*=1 represents selection that is very hard.

We now describe our simulation setup.

### Simulations

We run individual-based simulations only for soft selection to identify regimes in which multilocus interactions are important and also to test the validity of approximations based on effective migration rates, *m*_{e} (i.e., the effective migration approximation). However, for hard selection, we perform allele frequency simulations (that assume linkage equilibrium) since in general, multilocus interactions only have limited effects which are captured reasonably well by simple extensions of the diffusion approximation, that include *m*_{e}. We describe the allele frequency simulation below.

Under soft selection, since population size is always constant, we only follow allele frequencies, **p**_{i,j}, at all *L* loci (i.e., *j*=1, …, *L*) and in all *n*_{D} demes (i.e., *i*=1, …, *n*_{D}) of the metapopulation. Under hard selection, by contrast, where population sizes vary across patches, we follow both the size, *N*_{i}, and allele frequencies, **p**_{i,j}, in all patches of the metapopulation.

The simulation is initialized such that the patches comprising the metapopulation are initially perfectly fit (i.e., the mutant allele is assumed to be absent so that its frequency in each patch is initially zero) and individuals then gradually accumulate deleterious mutations over time. In each generation or time step, allele frequencies (under soft and hard selection) and population size (under hard selection) undergo changes due to the processes of migration, mutation, selection, and stochastic events (i.e., drift and demographic stochasticity), as described in appendix A of the supplementary material. The metapopulation is allowed to equilibrate and the equilibrium load and population size (under hard selection) are then computed.
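To make the update concrete, the following is a minimal sketch of one generation of an allele-frequency simulation at a single locus under soft selection. It is not the authors' Fortran code: the order of events (migration, mutation, selection, then drift), the binomial sampling of 2*K* gene copies and all function and variable names are assumptions made here, since appendix A is not reproduced in this section.

```python
import math
import random


def one_generation(p, K, s, h, u, v, m, rng):
    """Advance deleterious-allele frequencies in all demes by one generation.

    p is a list of per-deme frequencies at one locus; K is the fixed deme size.
    Assumed event order: migration, mutation, selection, drift.
    """
    pbar = sum(p) / len(p)                      # common migrant pool
    w_het, w_hom = math.exp(-h * s), math.exp(-s)
    new_p = []
    for x in p:
        x = x + m * (pbar - x)                  # migration
        x = x + u * (1.0 - x) - v * x           # forward and reverse mutation
        q = 1.0 - x
        mean_w = q * q + 2.0 * x * q * w_het + x * x * w_hom
        x = (x * q * w_het + x * x * w_hom) / mean_w   # viability selection
        # Drift: sample 2K gene copies, each deleterious with probability x.
        copies = sum(rng.random() < x for _ in range(2 * K))
        new_p.append(copies / (2 * K))
    return new_p


if __name__ == "__main__":
    rng = random.Random(42)
    freqs = [0.0] * 20                          # initially perfectly fit demes
    for _ in range(500):
        freqs = one_generation(freqs, K=50, s=0.02, h=0.5,
                               u=1e-3, v=1e-3, m=0.01, rng=rng)
    load = sum(0.02 * (2 * 0.5 * x * (1 - x) + x * x) for x in freqs) / len(freqs)
    print(f"mean frequency = {sum(freqs) / len(freqs):.3f}, mean load = {load:.5f}")
```

Extending this to hard selection would additionally require updating the size of each deme every generation and weighting each deme's contribution to the migrant pool by its size.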

The simulation was implemented in Fortran (see Olusanya et al. (2023) for code).

### Soft selection

We first consider soft selection, where each subpopulation has a fixed number *K* of individuals, regardless of mean fitness. Assuming loci evolve independently (i.e., under linkage and identity equilibrium), the distribution of allele frequencies (at a given locus) within subpopulations at mutation-drift-selection-migration equilibrium depends only on the scaled parameters *Ks, Ku, Km* and the dominance coefficient *h*, and is given by eq. (4) (where *n*≡*K*), conditional on $\bar{p}$, the mean allele frequency across all subpopulations (where $\bar{p}$ can be obtained self-consistently by numerically solving $\mathbb{E}[p \mid \bar{p}] = \bar{p}$).

The soft selection model has been analysed in earlier studies that assume either very weak or strong selection, i.e., *Ks*≪1 (Whitlock, 2002), or *Ksh*+*Km* at least 5 (Glémin et al., 2003, Roze, 2015). These limits are useful to consider as they yield simpler intuition and allow for explicit analytical results: for instance, when selection on the deleterious homozygote is much weaker than local drift (i.e., *Ks*≪1), probabilities of identity by descent involving two or more genes can be approximated by their expectation under the *neutral* island model, allowing one to solve explicitly for $\bar{p}$ in terms of (essentially) neutral higher cumulants of the allele frequency distribution (Whitlock, 2002). In this limit, selection is only efficient at the level of the population as a whole and not within local demes, some of which may be nearly fixed for the deleterious allele (if *Km* is small enough). By contrast, for large *Ksh*, deleterious alleles segregate at low frequencies within any deme and the allele frequency distribution is concentrated about $\bar{p}$. Thus, in this limit, gene flow only serves to further narrow the distribution of allele frequencies, i.e., further increase the effective size of local demes (Glémin et al., 2003, Roze, 2015). Here we focus on *moderately* deleterious alleles with *Ks*∼1 (which contribute the most to drift load in a single isolated deme). As we argue in appendix B (see supplementary material), in this regime, selection has more subtle effects, which are not captured by either intuitive description above.

Beyond these more conceptual issues, the concrete question we address here is: how much gene flow between subpopulations is required to alleviate load at a single locus under soft selection, and how does this depend on the (scaled) selection strength *Ks* and dominance coefficient *h* at the locus? We then ask: to what extent is load per locus affected by multilocus associations when multiple unlinked loci across the genome are under selection? In the next section, we build upon this to investigate similar questions for the case of hard selection, where positive feedback between increasing drift load at large numbers of loci and declining local deme sizes can drive the entire metapopulation to extinction.

To begin with, consider the case of fully isolated demes (*Km*=0). For *Ku*≪1, any deme will be close to fixation for one or other allele at any locus. Under this “fixed state approximation” (see also Szép et al. (2021)), the probability that the deme is nearly fixed for the wildtype allele is proportional to 1, while the corresponding probability for the deleterious allele is proportional to $e^{-2Ks}$, regardless of *h*. Thus, for *Ku*≪1, the expected deleterious allele frequency is $\mathbb{E}[p] \approx 1/(1+e^{2Ks})$ and the expected load *G* (scaled by the deterministic expectation 2*u*) is proportional to $(Ks/Ku)\,\mathbb{E}[p]$, assuming 𝔼 [*pq*] ≈ 0. It follows from this that the maximum contribution to load is from loci with *Ks*≈0.64, independent of *h*: thus drift can inflate the load associated with moderately selected loci by a factor of several hundreds or thousands in an isolated population (see also Kondrashov (1995)).
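The location of this maximum can be checked numerically. The sketch below assumes that, in the fixed state approximation, demes are fixed for the wildtype and deleterious alleles with relative weights 1 and e^{-2Ks}, so that the expected frequency is 1/(1+e^{2Ks}) and the load per locus varies with *Ks* as *Ks*·𝔼[*p*] up to a constant factor. These forms are assumptions made here; they reproduce the values *Ks*≈0.64 and an expected frequency of about 0.22 quoted in this section. Function names are illustrative.

```python
import math


def expected_frequency(Ks):
    """E[p] when demes are fixed for wildtype (weight 1) or deleterious (weight exp(-2Ks))."""
    return 1.0 / (1.0 + math.exp(2.0 * Ks))


def load_shape(Ks):
    """Ks * E[p]: the Ks-dependence of load per locus, up to a constant factor."""
    return Ks * expected_frequency(Ks)


def argmax_golden(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
    return 0.5 * (a + b)


Ks_star = argmax_golden(load_shape, 0.0, 5.0)
p_star = expected_frequency(Ks_star)
```

Setting the derivative of *Ks*/(1+e^{2Ks}) to zero gives the condition 1+e^{-2Ks}=2*Ks*, which the search solves numerically; dominance does not enter because heterozygotes are neglected in this limit.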

Let us now consider how gene flow changes these simple expectations: fig. 1a shows load per locus (scaled by 2*u*) as a function of *Km* for *Ks*=0.64 for various values of *h*; the inset shows the expected deleterious allele frequency vs. *Km*. Symbols depict predictions of the full diffusion approximation (eqs. (4)-(6), *j*=1). In addition, we also show predictions of a simpler ‘moderate selection’ approximation (lines), which is outlined below (details in appendix B).

Figure 1a shows that a very low level of gene flow (*Km*∼0.1) is enough to dramatically reduce drift load, regardless of dominance. For instance, *G/*(2*u*) falls from 140 at *Km*=0 to 4−5 at *Km*=0.1 for both nearly recessive (*h*=0.02) and co-dominant (*h*=0.5) deleterious alleles, for *Ku*=0.001. In both cases, the expected allele frequency declines from 0.22 at *Km*=0 to ∼0.01 at *Km*=0.1 (inset), while heterozygosity hardly increases (from ∼0.002 to 0.003−0.004). Thus the reduction in load at low levels of migration is almost entirely due to the decline in the number of fixed deleterious alleles, with little to no change in the number of segregating alleles.

A further increase in migration further reduces load, though in the case of recessive alleles, this is partially offset by an increase in heterozygosity (resulting in less efficient purging). Thus, for recessive alleles, load is minimum at *Km*≈4 and rises again as *Km* increases further, approaching the level expected in a panmictic population at large *Km*. For the parameters shown in fig. 1a, the reduction in load due to purging at intermediate *Km* is rather modest and hardly visible on the scale of the plot; purging has a more substantial effect when *u/*(*hs*) and *Ks* are smaller, and provided *h<*1*/*4 (Whitlock, 2002).

Increasing gene flow also shifts the frequency spectrum of alleles that contribute to load – for *Ks*≲0.1, the predominant contribution at *Km*=0.1 is from fixed or nearly fixed alleles; however, one migrant per deme per generation (*Km*=1) is already enough to prevent fixation of deleterious alleles, so that load is entirely due to alleles segregating at intermediate frequencies at this higher migration level (dashed vs. dotted red curves in fig. 1b). As expected, gene flow has a much weaker effect at strongly selected loci: for *Ks*=6.4, load is entirely due to segregating alleles regardless of *Km* (blue curves). Thus, increasing gene flow reduces load only very weakly at strongly selected loci (fig. 1c), and may even be detrimental if it hinders purging (in the case of recessive deleterious alleles).

Moreover, even low levels of gene flow tend to “even out” the contributions of alleles with different selection coefficients to load, so that moderately deleterious alleles no longer contribute disproportionately (as in isolated populations). For instance, with *Km*=0.1, load is maximum for *Ks* ≈ 0.1 (regardless of *h*), for which it is only about 3 times larger than the deterministic expectation (fig. 1c); by contrast, in isolated populations, it may be several hundred or thousand times larger (see above). With modest levels of gene flow (*Km*=0.5), there is an even weaker inflation of load for intermediate *Ks*, while ≳2 migrants per generation are enough for load to be more or less independent of *Ks* as in a large, essentially panmictic population.

One can ask further: are the processes underlying this dramatic reduction in load (even with low levels of gene flow) qualitatively different for alleles with different selective effects? More concretely, expressing load as $G \propto \bar{p} - (1-2h)\,\bar{p}\,\bar{q}\,(1-F_{ST})$ (with $\bar{q}=1-\bar{p}$), it follows that for recessive alleles (*h<*1*/*2), load declines if the mean deleterious allele frequency $\bar{p}$ and/or *F*_{ST} at the selected locus decrease. A decline in *F*_{ST} (for a fixed $\bar{p}$), in turn, reflects a change in the allele frequency distribution from a more *U*-shaped (wherein a fraction of demes are nearly fixed for the deleterious allele) to a more unimodal or concentrated distribution (wherein the deleterious allele segregates at a low frequency close to $\bar{p}$ in almost all demes). Thus, in effect, we ask: to what extent is the reduction in the mean allele frequency across the entire population (as measured by $\bar{p}$) and/or the reduction in probability of local fixation of the deleterious alleles (as measured by *F*_{ST}) sensitive to the selection coefficient of the deleterious allele?
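To make the dependence of load on the mean allele frequency and *F*_{ST} concrete, the sketch below assumes the standard weak-selection load per locus, *s*·𝔼[2*hpq*+*p*²], together with the usual definition 𝔼[*pq*] = p̄q̄(1−*F*_{ST}), where p̄ is the mean deleterious allele frequency across demes and q̄=1−p̄. Both are assumptions stated here for illustration, and the function name is hypothetical.

```python
def load_per_s(p_bar, f_st, h):
    """Load per locus divided by s, from the mean frequency and F_ST."""
    q_bar = 1.0 - p_bar
    het = p_bar * q_bar * (1.0 - f_st)   # E[pq] within demes
    hom = p_bar - het                    # E[p^2] = p_bar - E[pq]
    return 2.0 * h * het + hom           # equals p_bar - (1 - 2h) * E[pq]


# Nearly recessive allele at a fixed mean frequency: load rises with F_ST.
low_fst = load_per_s(p_bar=0.1, f_st=0.2, h=0.02)
high_fst = load_per_s(p_bar=0.1, f_st=0.8, h=0.02)

# Co-dominant allele: load depends on the mean frequency only.
codominant = load_per_s(p_bar=0.1, f_st=0.8, h=0.5)
```

For *h*=1/2 the heterozygosity term cancels, so any effect of gene flow on load must act through the mean allele frequency alone; for *h*<1/2 a lower *F*_{ST} at fixed mean frequency shelters more deleterious alleles in heterozygotes and lowers load.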

Figure 1d shows $\bar{p}$ vs. *Ks* (main plot) and *F*_{ST} vs. *Ks* (inset), for various values of *Km*, for co-dominant and nearly recessive alleles (filled vs. open symbols). Note that while selection is quite efficient in reducing the mean number of deleterious alleles already for *Ks*≳0.05 (say, under low gene flow, i.e., *Km*=0.1), it has little effect on *F*_{ST} (i.e., on the probability of local fixation within demes, given $\bar{p}$) unless *Ks*≳0.5. For example, *F*_{ST} is reduced by only about 15% with respect to its neutral value for alleles with *Ks*=0.5, but by *>*50% for *Ks*=2, for both values of *h*. Thus, selection for increased heterozygosity within demes (as reflected in a decrease in *F*_{ST}) does not markedly influence the evolutionary dynamics of alleles with *Ks*≲1 (which suffer the most severe inflation of load in isolated populations) but can be important for moderately deleterious alleles with 1≲*Ks*≲5 (which may still contribute substantially to load).

In Appendix B, we introduce a ‘moderate selection’ approximation, which allows *F*_{ST} to be affected by selection, but assumes that the relationship between (appropriately scaled) higher cumulants of the allele frequency distribution and *F*_{ST} is the same as under the neutral island model. The predictions of this approximation are shown by lines in fig. 1a, 1c and 1d, and appear to match the full diffusion (symbols) quite well for 0.5≲*Ks*≲5. This suggests that moderate selection essentially changes pairwise coalescence times (to different extents within and between demes), without appreciably affecting other statistical properties of the genealogy, e.g., the degree to which branching is skewed or asymmetric, at loci subject to purifying selection (see also Appendix B).

#### Effect of multilocus interactions on load

So far, we have considered the equilibrium load at a *single* locus, neglecting linkage and identity disequilibrium between deleterious alleles segregating across multiple loci in the genome. These may be substantial if, for instance, different subpopulations are nearly fixed for partially recessive deleterious alleles at very many different loci, so that F1 offspring of migrants and residents (and their descendants) have higher fitness than residents, giving rise to heterosis. In such a scenario, the extent of gene flow at any one locus depends not only on *Km*, the average number of migrants exchanged between demes, but also on their relative fitness, which may differ significantly from that of residents if deleterious alleles segregate at multiple loci in the population as a whole (i.e., if 2*Lu* is large and *Ks* small), allele frequencies differ markedly between subpopulations (migration is low or *F*_{ST} high), and alleles are recessive (*h* is small).

More specifically, heterosis implies that deleterious alleles that enter a given deme on a migrant genetic background are more likely to be transmitted to subsequent generations than deleterious alleles on resident backgrounds, resulting in an *effective* migration rate that is higher than the raw migration rate (see also Ingvarsson and Whitlock (2000)). For a genome with *L* equal-effect unlinked loci with selective effect *s* and dominance *h* per locus, and assuming weak selective effects (*s*≪1), an approximate expression for the effective migration rate at any locus is given by Zwaenepoel et al. (2023) (eq. (7)).

Following Sachdeva (2022) and Zwaenepoel et al. (2023), we can incorporate the effects of multi-locus heterosis by assuming that allele frequencies at any locus follow the equilibrium distribution in eq. (4), but with the raw migration rate replaced by an effective migration rate which itself depends on the expected allele frequency $\bar{p}$ and the expected heterozygosity 𝔼 [*pq*] within demes (or equivalently, $\bar{p}$ and *F*_{ST}). As before, this allows for a numerical solution, yielding the theoretical predictions (solid lines) in fig. 2.

Figure 2 shows load per locus scaled by 2*u* as a function of *Ks* for low migration (*Km*=0.1), for two values of the total mutation rate (2*Lu*=0.2 in red and 2*Lu*=0.5 in blue) and two dominance coefficients (*h*=0.02 in fig. 2a and *h*=0.2 in fig. 2b). For each value of 2*Lu*, we further simulate with two values of *L*, scaling down *s, u* and *m*, and scaling up *K* as we increase *L*, so that *Ks, Km, Ku* and *Ls* remain constant. Symbols depict results of individual-based simulations for a metapopulation with 100 demes; dashed curves show single-locus predictions (obtained using eq. (6)) that do not account for multilocus heterosis; solid curves represent predictions that account for interference via effective migration rates: note that these depend on *L* only via the combination *Ls* (or alternatively, *Lu* for a given *u/s*). As expected, there is better agreement between simulations and theory for smaller values of *s* (or alternatively, larger *L*), for a given total mutation rate, as the expression for effective migration rate in eq. (7) becomes more accurate as *s*→0. As can be seen in fig. 2, load per locus is most strongly reduced by multilocus heterosis when 2*Lu* is large, and for loci with small *h* and intermediate *Ks*. Moreover, the effects of multilocus heterosis become weaker with increasing migration (which corresponds to lower *F*_{ST} ) which reduces allele frequency differences across demes and consequently the extent of heterosis (see also Roze, 2015).

## Hard selection

In our soft selection analysis, we showed that the load in a metapopulation can be quite significant (l.h.s. of fig. 1a) when migration is limited (in particular, when *Km<*1). We also demonstrated that a minimal amount of gene flow (e.g., *Km* = 0.1 in fig. 1a) is enough to purge or alleviate such load (by about 96% in fig. 1a) irrespective of the value of *h*. This holds true even for moderately deleterious alleles that would otherwise contribute disproportionately to load. This purging advantage however decreases as *Km* becomes very large (for example, *Km>*4 in fig. 1a). Here, we would like to understand how this changes under hard selection, where there is a feedback between population size and genetic load so that population sizes are not fixed at *K* but decline with increasing load, placing populations burdened by high load at an increased risk of extinction.

More concretely, we aim to understand the impact of gene flow on drift load and purging in the context of hard selection and what the attendant consequences of these are for metapopulation outcomes, for example, on critical extinction thresholds. We also aim to understand the role of dominance and how critical thresholds are influenced by the strength of eco-evo feedback. The strength of such feedback depends largely on 2*LU* =2*L*(*u/r*_{0}), which is the expected load (in the absence of drift) divided by the baseline growth rate. If 2*LU* =0.5, for example, this indicates a 50% reduction in growth rate due to genetic load (assuming load is primarily deterministic); we therefore take 2*LU* as a measure of the hardness of selection.

For simplicity, our analysis will focus on scenarios where loci have equal effects; more complicated but realistic scenarios involving a distribution of fitness effects are explored in appendix G of supplementary material. As introduced in The semi-deterministic approximation, the scaled effect per locus is denoted by *Ks* and gives the strength of selection at a deleterious homozygous locus relative to local drift, when population sizes are at carrying capacity *K*. However in general, under hard selection and depending on parameter values, population sizes will be less than *K*, so that drift will usually be stronger relative to selection, than indicated by the value of *Ks* (or *Ksh*).

The rest of this section is organised as follows. First, we identify critical migration thresholds for extinction by exploring how metapopulation outcomes, as measured by the equilibrium mean population size per island scaled by the carrying capacity *K* (i.e., *N*), depend on the magnitude of gene flow (as measured by *Km*). We then test the validity of the semi-deterministic approximation by contrasting semi-deterministic results for *N* (under different *Km* values) with outcomes derived from our simulations, distinguishing between two sets of simulation runs, each with a different strength of demographic fluctuations, while keeping the other (scaled) parameters, *Ks, Km, Ku*, fixed. We do this to see how well the results from our simulations converge to the semi-deterministic results as demographic fluctuations become weaker (i.e., as *r*_{0}*K* increases, since the strength of demographic fluctuations scales as 1*/*(*r*_{0}*K*)); our expectation is that the semi-deterministic approximation should become accurate in the limit *r*_{0}*K*→∞. We later build on this in a more general way to explore how the thresholds depend on homozygous selective effects relative to drift, dominance and the ‘hardness’ of selection. Finally, we explore parameter regimes which allow the metapopulation to persist (i.e., avoid extinction) and examine how equilibrium load and population size per deme in this regime depend on various parameters.

### Extinction thresholds with low gene flow

Figures 3a and 3b show the mean population size across the metapopulation vs. *Km* (the average number of migrants exchanged between any deme and the metapopulation, when all demes are at carrying capacity) for *Ks*=1 and *Ks*=10, respectively, for both recessive and additive alleles, assuming fairly hard selection (i.e., 2*LU* =0.6). The solid lines are results from the semi-deterministic approximation and the symbols connected by dashed lines are results from simulations. Keeping all other scaled parameters fixed, simulations are run for various values of *r*_{0}*K* (where *r*_{0} is fixed at 0.1 and *K* is varied) to see the impact of the strength of demographic stochasticity. Circular symbols denote simulations run with a higher *K* (here, *K*=3000) and triangles denote simulations run with a lower *K* (here, *K*=500). Note that to keep *Ks* fixed, increasing (decreasing) *K* means reducing (increasing) *s*.
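The rescaling used in these comparisons can be sketched as follows: for fixed scaled parameters, the raw rates are divided by *K*, and the strength of demographic fluctuations is taken to scale as 1/(*r*_{0}*K*), as stated earlier in this section. The function name and the values chosen for *Km* and *Ku* are illustrative assumptions; only *r*_{0}=0.1, *K*=500 and *K*=3000 come from the text.

```python
def simulation_settings(Ks, Km, Ku, K, r0=0.1):
    """Raw rates that keep Ks, Km and Ku fixed for a given carrying capacity K."""
    return {
        "K": K,
        "s": Ks / K,                          # larger K implies weaker selection per locus
        "m": Km / K,
        "u": Ku / K,
        "demographic_noise": 1.0 / (r0 * K),  # scaling of demographic fluctuations
    }


# Hypothetical scaled values; only K and r0 match the simulations described above.
small_K = simulation_settings(Ks=1.0, Km=1.0, Ku=0.01, K=500)
large_K = simulation_settings(Ks=1.0, Km=1.0, Ku=0.01, K=3000)
noise_ratio = small_K["demographic_noise"] / large_K["demographic_noise"]
```

With *r*_{0}=0.1, moving from *K*=500 to *K*=3000 reduces this measure of demographic noise sixfold while leaving *Ks*, *Km* and *Ku* unchanged, which is why the larger-*K* runs are expected to lie closer to the semi-deterministic prediction.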

Let us first concentrate on the analytical semi-deterministic prediction (solid lines). Figures 3a and 3b show that there exists a critical *Km* threshold (which we will call *Km*_{c}) below which the metapopulation collapses (i.e., below which *N* =0) and this threshold is highest for additive alleles (blue vs red colors). We call this a critical threshold because it represents a tipping point in the fate of the metapopulation wherein a slight variation (for example the introduction of an additional migrant) can change the expected metapopulation outcome. The extinction of the metapopulation occurs at very low *Km* values because such low values typically imply very little gene flow among the different patches. As such, these isolated patches dwindle in size (due to inbreeding) faster than they can be rescued by migration, thus leading to their extinction and the collapse of the metapopulation. However, for *Ks*=1 (fig. 3a), the situation is less gloomy for patches with recessive alleles (red solid line) because they are able to purge some of their load, thereby requiring less migration to escape extinction; we will come to this later in more detail.

An important aspect to focus on in figs. 3a and 3b is the disparity between the critical threshold, *Km*_{c}, observed in simulations and that predicted by our analytical approximation (symbols vs solid lines): *Km*_{c} from simulations is lower than the semi-deterministic prediction. However, it is worth noting that these thresholds approach the semi-deterministic expectation as *K* (and consequently *r*_{0}*K*, for fixed *r*_{0}) increases, suggesting that our analytical approximation remains valid when demographic fluctuations become less pronounced.

Interestingly, we observe two qualitatively different behaviours for *Ks*=1 (fig. 3a) and *Ks*=10 (fig. 3b). In the case of *Ks*=1 (fig. 3a), we observe that for low *Km*, near the semi-deterministic threshold, the mean population size, *N*, is higher and the critical migration threshold lower when *K* is smaller (*K*=500 here; represented by triangles with dashed lines) compared to when we have a higher value of *K* (*K*=3000; represented by circles with dashed lines). To illustrate this point, fig. 3c shows the equilibrium distribution of deme sizes for different values of *K* just at the semi-deterministic threshold (*Km*=1.1). At this value of *Km*, we see that large-sized patches (*K*=2000; black line) are extinct whereas patches with smaller sizes (e.g., *K*=250; blue line) are still able to avoid extinction. This observation is puzzling as it indicates that smaller patches are more stable, requiring fewer migrants per generation (i.e., lower *Km*) to prevent extinction (all other parameters, in particular *Ks*, the strength of selection relative to drift, remaining constant). Usually, we would expect the opposite outcome since larger patches would be expected to be less prone to the effects of demographic stochasticity (hence more stable than small-sized patches). This unusual result however highlights the role of the demographic effect of migration in small-sized patches where a few migrants (say, 1 per generation) can go a long way in preventing extinction (see also figs. 3b and 3d of Sachdeva et al. (2022)).

In contrast, for larger *Ks*, we observe the more expected behaviour where mean sizes are higher for higher *K* (compare red triangles and circles). Figure 3d further illustrates this point, where at low *K* (e.g. *K*=250 (blue line)) we have a bimodal distribution of population size with a fraction of demes close to extinction and the remaining fraction peaked at a stable non-zero *N*. With increasing *K* however, there is a reduction in the weight of the distribution near extinction due to a decline in demographic fluctuations, which then further narrows the distribution around *N*. With *K*=2000, we obtain a unimodal distribution where the population peaks at *N* =0.55, which coincides with the semi-deterministic approximation (dashed vertical black line). Similar figures are shown in appendix C for *h*=0.5. Hereon we will only show results from the semi-deterministic approximation with the understanding that these should be accurate if *r*_{0}*K* is sufficiently large.

Now we investigate what determines the thresholds discussed above. In particular, we explore how the selection strength per locus influences the critical migration rates, *Km*_{c}, below which the entire metapopulation collapses, starting from a state in which it is stable, i.e., where all demes are at the stable population equilibrium. We then build on this to understand the role of dominance and the sensitivity of the thresholds to the hardness of selection. It is important to note that the corresponding thresholds for the population to grow starting from a state where there are only a small number of individuals could be much more stringent because small populations suffer from genetic Allee effects (see also Sachdeva et al., 2022).

For simplicity, we begin by assuming equal effect loci; scenarios involving a distribution of fitness effects are explored in appendix G. Figures 4a and 4b show that there is a non-monotonic relationship between *Km*_{c} (the critical migration threshold required to prevent a complete meltdown of the metapopulation) and the scaled strength of selection, *Ks*. In particular, when *Ks* is low, the fitness differences between genotypes in each subpopulation are minimal so that low rates of migration are sufficient to maintain diversity and prevent extinction. As *Ks* increases from low values, the fitness differences between genotypes become more pronounced and the fitness cost of carrying harmful mutations becomes higher (as was also seen in the soft selection results; figs. 2a and 2b). Consequently, more migration is necessary to counteract this accumulation of load and prevent extinction, leading to a rise in *Km*_{c} until a maximum value at intermediate *Ks*. Beyond this peak, as *Ks* increases further, selection becomes strong enough to eliminate individuals with high load so that we again require less migration to prevent extinction.

When alleles are at least (partially) recessive, it is much easier to maintain the metapopulation as load in each subpopulation is lower, due to more efficient purging of deleterious recessive alleles. Furthermore, we see that when selection is moderately hard (i.e., 2*LU* =0.4 in fig. 4a), critical migration thresholds are much lower compared to when selection is much harder, emphasizing the impact of the strength of the coupling between population size and load. In addition, we see that the metapopulation never collapses beyond *Km*_{c}=1 (corresponding here to *NKm*∼1, i.e., one migrant per generation). On the other hand, with harder selection (i.e., 2*LU* =0.8 in fig. 4b), we require very high migration between subpopulations to prevent the metapopulation from going extinct. In essence, we see that the conditions for metapopulation persistence for both recessive and additive alleles are much more stringent as selection gets harder (compare gray regions in figs. 4a and 4b). It is important to emphasize, however, that these results assume equal effect loci and equal forward and backward mutation rates between deleterious and wild-type alleles. We explore the role of asymmetric mutation rates in appendix F (see supplementary material).

The relatively small difference in load per locus between co-dominant and recessive alleles that we observe at a given level of migration under soft selection (fig. 1c) can translate into rather different critical migration thresholds (depending on whether load is primarily due to recessive or additive alleles) for metapopulation persistence under hard selection. This is especially marked when selection is harder (fig. 4b) as a small increase in load (e.g., due to a small decline in gene flow) can set in motion a very strong positive feedback between increasing load and declining population size (which results in stronger drift), culminating in extinction.

To further quantify the role of the hardness of selection, we look at the dependence of the critical migration threshold on the type of selection, going from very soft to very hard selection (i.e., as 2*LU* increases). We see clearly from fig. 4c that the critical migration threshold below which populations go extinct due to high load (fig. 4d) is always higher when selection is hard. This holds true for both additive and recessive alleles, with the threshold for recessive alleles lower than that for additive alleles as drift always increases load with additive alleles. In addition, when selection is soft and per locus strength of selection is stronger (compare dashed lines corresponding to *Ks*=5 with solid lines corresponding to *Ks*=1), populations are a bit more stable, having a lower load (lhs of fig. 4d) and hence requiring a lower *Km*_{c} (lhs of fig. 4c) to prevent extinction.

Finally, for a metapopulation to persist (i.e., for *N* =1−𝔼 [*R*_{g}]*>*0), fig. 6 (in appendix H of supplementary material) shows that we need a reasonable amount of gene flow (*Km>*0.2) and that, under this level of gene flow, moderately deleterious alleles contribute most to load. However, this is numerically only a very modest (5−10%) effect (see also fig. 2). This makes sense because large effect alleles, though having the potential to drastically increase load, are less likely to fix. Small effect alleles on the other hand, which are more likely to fix, have less effect on load when they do fix.

### Load and population size under high gene flow

Having analysed the low *Km* behaviour, we now consider the behaviour at large *Km*. Going back to figs. 3a and 3b, we observe that above the critical migration threshold, *Km*_{c}, and in particular for large *Km*, our simulation results depend only weakly on *K*. For example, for *Ks*=1 (fig. 3a) and with *h* = 0.02 and *Km* = 2, *K* = 500 is already sufficient for our simulations to match the semi-deterministic approximation. To further investigate this, we explore the behaviour of the equilibrium distribution of deme sizes for different values of *K* and for *h* = 0.02. We do this with *Ks*=1 (and *Km*=2.0; fig. 2a in appendix D) as well as with *Ks*=10 (and *Km*=5.0; fig. 2b in appendix D). The peaking of these distributions at the semi-deterministic expectation for *K*∼250−500 further confirms the weak dependence of our simulation results on *K*.

In addition to the above, beyond *Km*_{c}, there exists only a weak dependence of population size on *Km*, especially when alleles are recessive (fig. 3a), with load close to the deterministic prediction above *Km*_{c} (fig. 5b). This is contrary to previous findings (Whitlock, 2002) where low gene flow is thought to substantially reduce load, due to increased expression and consequently stronger purging of deleterious alleles in more isolated populations. To better understand our results, we look at how the equilibrium allele frequency in the metapopulation (relative to that in an undivided population) as well as how the equilibrium mean load (relative to the deterministic load) depend on the level of gene flow (i.e., *Km*) as shown in figs. 5a and 5b respectively. The solid lines are results from the semi-deterministic approximation and colored circles represent simulation results. Here, we have obtained the deterministic expectation of allele frequency, *p*_{det}, at any given locus and load, *R*_{g,det}, in an undivided population by respectively solving eq. (1) at equilibrium and plugging the result into *R*_{g,i} in eq. (2) to get *R*_{g,det}; this is as opposed to assuming *p*_{det}∼*u/hs* and *R*_{g,det}∼2*Lu* which both overestimate the expectation when alleles are nearly fully recessive.
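The difference between the exact deterministic equilibrium and the approximations *u*/(*hs*) and 2*u* can be illustrated with a short numerical sketch. Since eqs. (1) and (2) are not reproduced in this section, the code assumes the standard weak-selection recursion with symmetric mutation, Δ*p* = *u*(*q*−*p*) − *spq*[*h*+(1−2*h*)*p*], and the load per locus *s*(2*hpq*+*p*²); these forms, the function names and the parameter values are assumptions made for illustration.

```python
def equilibrium_frequency(s, h, u, tol=1e-14):
    """Deleterious allele frequency at mutation-selection balance, by bisection."""
    def net_change(p):
        q = 1.0 - p
        mutation = u * (q - p)                              # symmetric mutation
        selection = s * p * q * (h + (1.0 - 2.0 * h) * p)   # loss due to selection
        return mutation - selection

    lo, hi = 0.0, 0.5          # net_change > 0 at p = 0 and < 0 at p = 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_change(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def load_per_locus(s, h, p):
    """Weak-selection load per locus at allele frequency p."""
    q = 1.0 - p
    return s * (2.0 * h * p * q + p * p)


s, u = 0.01, 1e-5                                   # hypothetical values
p_recessive = equilibrium_frequency(s, h=0.02, u=u)
p_additive = equilibrium_frequency(s, h=0.5, u=u)
load_recessive = load_per_locus(s, 0.02, p_recessive)
load_additive = load_per_locus(s, 0.5, p_additive)
```

Under these assumptions, the nearly recessive case (*h*=0.02) yields an equilibrium frequency well below *u*/(*hs*) and a load per locus below 2*u*, whereas the co-dominant case stays close to both approximations, in line with the statement above.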

We see from fig. 5a that when alleles are (partially) recessive, there is a decrease in the frequency of the deleterious allele (as migration decreases) due to purging. However, fig. 5b shows that this purging effect is not strong enough to counter the negative effect of inbreeding as load on the whole increases^{1} with decreasing migration (solid lines and circles), hence the low population size observed at low *Km* in fig. 3a. A similar analysis for strongly selected alleles (*Ks*=10) is shown in appendix E of supplementary material.

## Discussion

In this study, we have explored the varied roles of gene flow, selective effects and dominance on load and extinction dynamics in a metapopulation, explicitly capturing the interaction between population size (in each patch) and allele frequencies at multiple loci. In particular, we distinguished between two models of selection: a soft selection model, which holds true when the total deleterious mutation rate in a population is much less than the baseline growth rate (i.e., 2*LU* ≪1) and size of each subpopulation is assumed to always be at carrying capacity, and a hard selection model (with 2*LU* ∼1), that captures the feedback between population size and allele frequency. Our results provide useful insights into the intricate relationship between genetic diversity, eco-evolutionary processes and the long-term persistence of populations, thus contributing to our understanding of biodiversity and conservation in fragmented landscapes.

Under the soft selection model, we showed that independent of the dominance of deleterious alleles, very little migration (as little as one migrant per approximately ten generations, *Km*=0.1) between a set of interconnected patches is enough to reduce the load due to slightly deleterious mutations (i.e., those with *Ks<*1) by a factor of 100, bringing it close to the deterministic expectation. We show that this reduction happens due to a decrease in the fixation probability of deleterious alleles. With one migrant per generation (*Km*=1) and for partially recessive mutations, load is reduced below this deterministic limit and is only due to alleles segregating at intermediate frequencies. Our findings are consistent with those of Whitlock (2002), who demonstrated that at an intermediate level of migration, characterized by medium variance among local populations, load would be lowest under soft selection when mutations are partially recessive and exhibit mildly deleterious effects. This is also corroborated by Roze and Rousset (2004), who similarly observed minimum load at intermediate migration rates due to the purging of weakly deleterious and partially recessive mutations (see also Zhou and Pannell (2010)).

It has long been recognized that the reduction in the mean population fitness due to weakly deleterious mutations (*Ks*∼1) reaching high frequencies or even fixation just by chance greatly exceeds the reduction due to strongly deleterious mutations that are efficiently kept at deterministic mutation-selection balance frequencies (Kimura et al., 1963; Kondrashov, 1995). Our results suggest that as long as soft selection operates, in order for drift load to be a real issue, populations have to be very strongly fragmented (*Km<*0.1).
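For orientation, the deterministic benchmark against which drift load is compared can be illustrated with the classical single-locus mutation-selection balance. The sketch below is not the paper's calculation: it assumes one-way mutation at rate `u`, genotype fitnesses 1, 1−*hs* and 1−*s*, and weak mutation relative to selection; the function name and parameter values are purely illustrative.

```python
import math

def deterministic_load(u, s, h):
    """Equilibrium frequency and load at deterministic mutation-selection balance.

    Assumes one-way mutation at rate u and genotype fitnesses 1, 1 - h*s, 1 - s,
    with u << s so that back mutation can be ignored.
    """
    # Balance between selection and mutation, s*p*(h + (1 - 2h)*p) = u,
    # written as the quadratic a*p**2 + b*p - u = 0.
    a = s * (1.0 - 2.0 * h)
    b = s * h
    if abs(a) < 1e-15:                 # exactly additive: the equation is linear
        p = u / b
    else:
        p = (-b + math.sqrt(b * b + 4.0 * a * u)) / (2.0 * a)
    q = 1.0 - p
    load = 2.0 * p * q * h * s + p * p * s   # expected reduction in fitness
    return p, load

for h in (0.5, 0.0):
    p, load = deterministic_load(u=1e-5, s=0.01, h=h)
    print(f"h={h}: p={p:.4g}, load={load:.3g}")
```

With these illustrative values the load is close to 2*u* for additive alleles and close to *u* for fully recessive ones, the classical result that the deterministic load is set by the mutation rate rather than by the selection coefficient.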

Soft selection models are interesting for two reasons. Firstly, some fitness components affect population size more than others (adult viability versus male mating success, say). Secondly, the estimates of load obtained under the soft selection model can be an upper bound to the load in real populations, answering the following question: if the effective population size stays the same, what fraction of offspring will fail to survive under the predicted burden of deleterious mutations?

In reality however, the accumulation of deleterious mutations is likely to decrease the size of the population, or even lead to its extinction (Kondrashov, 1995). Soft selection models do not take into account the complete positive feedback loop when the population size decreases due to genetic load, which in turn leads to further increase in genetic load. Therefore, extinction due to drift load is not possible under soft selection (Charlesworth 2013; Keightley and Eyre-Walker 2010), and we must explicitly model hard selection (where population size declines with increasing load) to investigate how much gene flow is required to prevent metapopulation collapse. Such a feedback loop has been studied in the mutational meltdown literature (e.g., Lynch et al. 1995a,b). In the second part of the paper, we therefore extended our results by considering hard selection.

Under hard selection (i.e., when the intrinsic growth rate is comparable to the total deleterious mutation rate) and when local deme sizes (and hence typical values of *Ks*) are small, we find that much more gene flow is required to ensure persistence. In this case, one may need as many as ∼ 2 and 5 migrants per generation for recessive and additive alleles, respectively, to ensure metapopulation persistence. These thresholds are highly sensitive to 2*LU* (a proxy for the hardness of selection), with higher values of 2*LU* necessitating higher thresholds. Such higher migration thresholds, *Km*_{c}, are typical of metapopulations with small local population sizes, as dynamics in these populations are largely driven by stochastic events (drift and demographic stochasticity), resulting in a positive feedback between population size and load (via allele frequency changes at different loci) which exacerbates extinction risk. What insights might we glean from these high threshold values? These underscore the importance of preserving connectivity (and hence genetic diversity) within metapopulations and the necessity of increased conservation measures to maintain the viability of such populations.

A second interesting result we found was that the range of *Km*_{c} below which extinction is possible is highest for intermediate values of *Ks* (∼1). Moreover, with much harder selection, the condition for the metapopulation to persist becomes more restrictive as total load (due to a large number of loci) is higher in this case and hence a higher rate of migration is required to counteract the negative effect of deleterious mutation accumulation.

Bimodal distributions (see figs. 3c and 3d) where some demes teeter on the brink of extinction while others thrive with larger populations often emerge naturally close to critical migration thresholds even in the absence of heterogeneity (i.e., when patch qualities and carrying capacities are uniform) due to the stochasticity inherent in our model. Such bimodality can serve as an important signal of impending population collapse, where minor variations in migration rate or external factors can push some demes past the point of no return, causing local extinctions and possibly cascading towards a collapse of the metapopulation (Scheffer et al., 2001, Drake and Griffen, 2010).

Overall, we identified several parameters that govern the fate of a metapopulation. One such key parameter is the genome-wide mutation rate scaled by the intrinsic growth rate, 2*Lu/r*_{0} which is a measure of the hardness of selection. How might one estimate such a parameter? To do this, we would need to know the total deleterious mutation rate, 2*LU* in the population (some estimates of which exist in the literature) as well as the intrinsic growth rate *r*_{0}, which is a much harder parameter to estimate.

Our findings, while intuitive, often become obscured by the prevailing confusion surrounding the concept of “hard selection”. This term is invoked in two distinct classes of eco-evo models. The first class of models meticulously tracks the coevolution of population size and allele frequencies, and accounts for the positive feedback between increasing load and declining numbers (Szép et al., 2021), while the second class of models primarily focuses on the consequences of frequency and/or density-dependent selection for the preservation of genetic diversity (Whitlock, 2002). In the latter class of models, hard vs. soft selection (which refers to whether or not local populations contribute to the next generation in proportion to their fitness) is often confounded with local vs. global density regulation (which determines, among other factors, whether changes in deleterious allele frequency can accumulate over multiple generations within local populations). Moreover, these models assume that populations are consistently at their carrying capacity (Whitlock, 2002), thereby neglecting any influence of genetic load on total population size and consequently, on the extent of drift. It is these complexities that underscore the need for models that explicitly incorporate eco-evolutionary feedback and allow for changes in both local and global population sizes. These models not only provide a fresh perspective but also yield results that diverge from the existing body of research. For example, we showed that under such a model with explicit regulation, beyond a critical threshold, reduced *Km* has only a slight effect on metapopulation outcomes (causing a minor decline in average size); this is contrary to results from constant metapopulation size models, where reduced gene flow is thought to be beneficial for metapopulations (Whitlock, 2002).

How might the results of our analysis compare with results from models that account for the explicit arrangement of patches (and hence different migration patterns) such as the one and two dimensional stepping-stone models? Bascompte and Solé (1996) posited that metapopulations with explicit spatial considerations might exhibit heightened vulnerability to habitat fragmentation compared to their spatially implicit counterparts, suggesting that non-spatial models such as the one considered in this study may tend to underestimate critical thresholds and the impact of fragmentation on metapopulation persistence. This raises the question of the role of different spatial configurations or arrangements on critical thresholds for persistence; are there specific arrangements that prove to be more efficient than others? In general, since the amount of habitat loss that a population can withstand may depend on its spatial distribution, it will be interesting in future to extend our model to incorporate more complex effects of landscape structure and habitat configurations (such as habitat corridors and stepping stone habitats (Bennett, 1999)). Such models may not only provide a more realistic depiction of how fragmentation affects populations but may also play a crucial role in moderating its impacts on the long-term persistence of metapopulations.

What are some of the limitations of our study? Our theoretical framework is based on the infinite island model. Although this is a useful simplification that ensures analytical tractability, in reality, no natural population is truly infinite. Our analysis may therefore benefit from considering finite size populations as this may provide us with better theoretical insights (see Barton and Olusanya (2022)).

Secondly, in our analysis, we make the simplifying assumption that all demes of the metapopulation are at one equilibrium state *N* * so that in eq. (2). We therefore fail to account for scenarios where the different demes can end up in different equilibrium states so that (the bimodality we saw in fig. 3c and fig. 3d was obtained in the absence of this heterogeneity, and such bimodality occurs close to the critical threshold for extinction). In addition to this, we defined load as where we have essentially assumed that the optimal genotype in a population is one with fitness 1. Although this is a theoretically sound way to think about the problem, in reality we cannot always measure the perfect genotype.

There are other extensions to our model that can be considered in the future. For one, we can explore the role of negative epistasis, which may help reduce load (Kimura and Maruyama, 1966). The idea is that with such negative epistasis, the more deleterious mutations an individual carries, the less fit it becomes, because each mutation becomes worse in the presence of the others. Thus, as individuals die out, they take with them a lot of deleterious mutations, which consequently helps to reduce load.

Secondly, in this work, we have assumed that offspring gametes are formed merely by freely recombining parental gametes i.e., assuming a recombination rate, *r*=0.5. Such high rates of recombination can be more potent at disrupting the linkage between genetic loci, fostering the formation of novel allelic combinations. This can lead to a more rapid reduction in load, as deleterious mutations are more likely to be separated and ultimately purged from the population (Kimura and Maruyama, 1966, Crow, 1970, 2017). How might this change with lower rates of recombination or if recombination is allowed to vary across the genome and how much more will load be exacerbated in this case? Providing answers to these questions may further contribute to our understanding of the maintenance of genetic variation and the adaptation of populations to changing environmental conditions.
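To make the free-recombination assumption concrete, the following minimal sketch forms a gamete by choosing, independently at every locus, the allele carried by one of the two parental haplotypes with probability 1/2 (i.e., *r*=0.5 between all pairs of loci). The haplotype representation (lists of 0/1 alleles) and the function name are assumptions made for illustration and are not taken from the simulation code of the paper.

```python
import random

def make_gamete(hap_a, hap_b, rng):
    """Form a gamete from two parental haplotypes under free recombination.

    Each locus is inherited independently from either haplotype with
    probability 1/2, so no linkage between loci is retained.
    """
    return [a if rng.random() < 0.5 else b for a, b in zip(hap_a, hap_b)]

rng = random.Random(42)
maternal = [0, 0, 0, 0, 0, 0]    # 0 = wild-type allele
paternal = [1, 1, 1, 1, 1, 1]    # 1 = deleterious allele
print(make_gamete(maternal, paternal, rng))
```

Lower recombination rates would instead require tracking crossover positions along the genome, so that neighbouring loci tend to be inherited together.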

In addition, we can also extend the investigation of the role of multilocus interactions (i.e., linkage and identity disequilibrium) on load to the case of hard selection using the effective migration approximation. However, validating this with individual-based simulations would be computationally intensive.

Finally, we have assumed a metapopulation landscape where selection is uniform across space. It would be interesting to extend our framework to explore what happens when some loci are under spatially heterogeneous selection (as in Szép et al. (2021)) while others are subject to unconditionally deleterious mutation. What would be the impact of gene flow under such scenarios? While gene flow may potentially swamp local adaptation, it could also concurrently alleviate the burden of deleterious mutations. Determining the relative strengths of these contrasting effects under realistic parameter regimes remains a crucial endeavor worth exploring.

## Funding

This research was funded by the Austrian Science Fund (FWF) [FWF P-32896B] and the DOC Fellowship of the Austrian Academy of Sciences [Grant nos: 26293; 26380].

## SUPPLEMENTARY INFORMATION

### A. How events in the life cycle of individuals change patch size and load

To account for the effect of migration on population size, we first determine the net number of migrants, *netM* (*t*) (i.e., the difference between the total number of immigrants and emigrants in a given patch) and add this to the existing patch size. Under soft selection, we assume a zero net migration rate (i.e., a balance between the number of immigrants and emigrants) so as to keep the patch size fixed. However, under hard selection, the net number of migrants is assumed non-negative. Migration therefore changes the population size according to (under soft selection) and (under hard selection). Similarly, to account for the effect of migration on allele frequencies in any patch, we add the net number of migrant alleles, *netp*(*t*) to the existing allele copy number in the patch and divide this by the population size . In essence, migration changes allele frequencies according to .

With regard to the effect of mutation, we assume equal mutation rate to and from the deleterious allele so that mutation changes allele frequency according to . However, mutation has no effect on population size (under hard selection) so that .

Following mutation, adults in each patch mate to produce offspring that survive to be next generation parents. Under soft selection, the mating process in each patch involves sampling with replacement, *N*_{i}(*t*) pairs of individuals based on their fitness and freely recombining their gametes to form offspring gametes. This means that we do not explicitly distinguish between the male and female sexes and there is also the possibility of self mating. Under hard selection on the other hand, load and density-dependent regulation change population size according to and demographic stochasticity (randomness in birth and death) is further imposed on the population to determine how many surviving offspring, can be formed. The latter is achieved by sampling from a Poisson distribution with parameter . Offspring are then formed by choosing with replacement pairs of parents also based on their fitness and freely recombining their gametes. For both soft and hard selection, selection changes allele frequency according to where is the mean fitness in the *i*^{th} patch and .

Finally, the allele frequency at the end of the generation, is obtained by sampling from a Binomial distribution with parameters, thus accounting for genetic drift.
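The order of events described in this section can be sketched for a single locus under soft selection, where every patch stays at carrying capacity *K*. This is only an illustration under stated assumptions, not the simulation code used for the results: it assumes a finite set of demes exchanging migrants through a common pool, symmetric mutation at rate `u`, genotype fitnesses 1, 1−*hs* and 1−*s*, and binomial sampling of 2*K* allele copies; the function and variable names are hypothetical, and the hard-selection steps (load-dependent growth and Poisson sampling of patch size) are omitted.

```python
import random

def next_generation(freqs, K, m, u, s, h, rng):
    """One generation of migration, mutation, selection and drift (soft selection)."""
    p_bar = sum(freqs) / len(freqs)              # allele frequency in the migrant pool
    new_freqs = []
    for p in freqs:
        p = (1 - m) * p + m * p_bar              # migration (patch size unchanged)
        p = p + u * (1 - 2 * p)                  # symmetric mutation
        q = 1 - p
        w_bar = 1 - 2 * p * q * h * s - p * p * s              # mean fitness in the patch
        p = (p * p * (1 - s) + p * q * (1 - h * s)) / w_bar    # selection
        # Genetic drift: binomial sampling of 2K allele copies.
        copies = sum(rng.random() < p for _ in range(2 * K))
        new_freqs.append(copies / (2 * K))
    return new_freqs

rng = random.Random(1)
freqs = [0.1] * 20                               # 20 demes, same starting frequency
for _ in range(50):
    freqs = next_generation(freqs, K=50, m=0.02, u=1e-4, s=0.02, h=0.5, rng=rng)
print(sum(freqs) / len(freqs))                   # mean frequency across demes
```

The illustrative values above correspond to *Km*=1 and *Ks*=1; drift is implemented by summing Bernoulli draws only to keep the sketch free of external dependencies.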

### B. ‘Moderate selection’ approximation under soft selection

In principle, any quantity such as load, *F*_{ST}, etc. for a single locus can be computed from the equilibrium allele frequency distribution , by first numerically solving to obtain the mean allele frequency , then plugging this into the equilibrium distribution and finally integrating over the distribution to obtain higher moments. However, it is also useful to consider an alternative approach based on equations for moments (or cumulants) of the allele frequency distribution.

In general, any moment will depend on higher moments, resulting in a set of recursions that is not closed. Approximations thus rely on closing this set of recursions in different ways, depending on assumptions about the relative magnitudes of *Ks* (or *Ksh*) and *Km* (Whitlock (2002); Glémin et al. (2003); Roze (2015)). Here, we introduce another such moment closure approximation, which applies also for recessive (*h*=0) alleles and intermediate selection coefficients (*Ks*∼1). We also attempt to discuss the biological meaning of the underlying assumptions.

As in the main text, let *p* denote the frequency of the deleterious allele at a given locus in a given deme and the mean across all demes in the population. We denote the expected change in allele frequency per unit time by *M* (*p*) and the variance of the change by *V* (*p*). These are:
where *K* is the number of individuals per deme. Under the diffusion approximation, the expectation (denoted by 𝔼_{ψ}[…]) of any function *f* (*p*) of the allele frequency *p* over the allele frequency distribution *ψ*[*p*] satisfies (see also Ohta and Kimura (1971)):
Setting *f* (*p*) to be *p* and *p*^{2} yields the following equations for the first and second moments of the allele frequency distribution respectively (see also Whitlock (2002), Glémin et al. (2003)):
It will be useful to express the above equations in terms of appropriately scaled cumulants (rather than moments) of the allele frequency distribution:
Here, *F*_{ST}, *γ* and *κ* denote respectively the variance, skew and kurtosis of the frequency distribution within any deme, scaled by the corresponding cumulant calculated for the distribution of alleles across the entire population.

At equilibrium, moments of the allele frequency distribution are constant in time, i.e., *d* 𝔼 [ *p* ]*/dt* = *d* 𝔼 [ *p*^{2} ]*/dt* = 0; further, . Combining equations (3) and (4), and assuming deleterious alleles to be sufficiently rare *overall* in the population that terms can be neglected, we have at equilibrium:
This *pair* of equations is underdetermined as it involves *four* variables , *F*_{ST}, *γ* and *κ*. To obtain an approximate solution, we further assume that third and higher cumulants of the allele frequency distribution, i.e., the skew and kurtosis are related to *F*_{ST} as in the *neutral* infinite-island model. Note that *F*_{ST} itself is not taken to be neutral (or unaffected by selection), as is assumed by Whitlock (2002), but only that:
Substituting into eq. (5) and expressing load as , we obtain:
Equation (7b) is cubic in *F*_{ST} and can be solved (e.g., numerically); this solution for *F*_{ST} can then be substituted into eq. (7a) to obtain the average deleterious allele frequency *p* and load *G*. Thus, in essence, our approximation allows for selection that is strong enough to change *F*_{ST} but not the relationship between higher cumulants (*γ, κ* etc.) and *F*_{ST} .

It is useful to juxtapose this approximation with those used in earlier work. Whitlock (2002) assumes that not just the relationship between higher cumulants (*γ, κ* etc.) and *F*_{ST}, but also *F*_{ST} is unaffected by selection and is essentially neutral, i.e., equal to 1*/*(1+4*Km*), where *K* is the population size per deme. Thus, his is a *weak selection* approximation and applies when *Ks*≪1, so that the term involving *Ks* in eq. (7b) (or equivalently in eq. (5b)) can be neglected. On the other extreme, Roze (2015) and Glémin et al. (2003) consider a parameter regime where the local deme size *K* is large enough that allele frequency distributions are essentially concentrated around the mean , which is relatively low. In practice, this means that *F*_{ST} is small enough that terms in eq. (7b) can be neglected, which gives: .

Theirs is thus a *strong selection, strong migration* approximation; in practice, it applies for *Ksh*+*Km* greater than 5. By contrast, the approximation for *F*_{ST} introduced above applies also for intermediate *Ks* (see fig. 1d in the main text) and is thus a *moderate selection* approximation: in essence, it captures the key effect of selection which is to change coalescence times within and between demes (to *different* extents) without necessarily changing the extent to which the branching structure of genealogies is (a-)symmetric. This is, however, only a rough interpretation, and a more rigorous analysis will be required to fully justify such approximations and more generally understand intermediate *Ks* regimes where neither selection nor drift can be treated as minor perturbations (to neutral or deterministic predictions respectively).
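As a point of reference for the weak-selection limit discussed above, the neutral infinite-island value *F*_{ST}=1/(1+4*Km*) is easy to tabulate. The snippet below is a small worked example; the function name and the chosen values of *Km* are illustrative only.

```python
def neutral_fst(Km):
    """Neutral F_ST in the infinite-island model, 1 / (1 + 4*Km)."""
    return 1.0 / (1.0 + 4.0 * Km)

for Km in (0.1, 1.0, 5.0):
    print(f"Km={Km}: neutral F_ST = {neutral_fst(Km):.3f}")
```

Neutral *F*_{ST} thus falls from about 0.71 at *Km*=0.1 to 0.2 at *Km*=1; the moderate selection approximation replaces this fixed value with one that also depends on *Ks*.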

#### Relaxing the low mean allele frequency assumption

All the approximations described above assume that the expected deleterious allele frequency is sufficiently low (i.e., deleterious alleles are very far from global fixation) that terms can be neglected. When this is no longer true (but if (6) still applies, i.e., if third and higher cumulants depend on *F*_{ST} as in the neutral model), equations (3), (4) and (6) together yield a cubic equation for :
Since deleterious allele frequencies are expected to be high only for very weak selection, we can typically neglect the effects of selection on *F*_{ST} in this parameter regime, and simply solve equation (8) for under the assumption .

### C. Demographic effect of migration in small and large patches and dependence on Ks

Just like for recessive alleles (i.e., *h*=0.02 in fig. 3c and fig. 3d of the main text), we see that with *Ks*=1 (fig. 1a), small-sized patches (e.g., *K*=250) benefit from the demographic effect of migration and are more stable. With increasing *K*, we observe a bimodal distribution of population sizes with most of the weight close to *N* =0 (e.g., see *K*=2000 (black line)), as many of the patches are already extinct and the remaining non-extinct patches lie at a fraction of carrying capacity. With much larger *K* however, e.g., *K*=3000 (brown line), all the demes go extinct. In the case of *Ks*=10, larger populations are more stable.

### D. Weak dependence of simulation results on *K* beyond *Km*_{c}

### E. Effect of gene flow on equilibrium allele frequency and equilibrium mean load for *Ks*=10

### F. Exploring the role of asymmetric mutation on critical migration thresholds

Here, we explore the effect of asymmetric mutation (i.e., mutation biased towards generating more deleterious alleles) on the critical migration threshold necessary to prevent the extinction of the metapopulation. We do this for additive alleles (i.e., *h*=0.5) and with 2*LU* =0.4. Looking at fig. 4, we observe some interesting dynamics. First, we see that for weak to mildly deleterious effects (*Ks*≤1), the critical migration threshold increases with the degree of mutational bias (lhs of fig. 4). This makes sense: the more mutation is biased towards the formation of deleterious alleles, the more deleterious alleles segregate in the population and the higher the load. Thus, a higher migration rate is needed to bring in new variation and alleviate this burden of load. In contrast, with moderate to strong selective effects, this difference in the critical migration threshold owing to mutational bias disappears.
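The direction of this effect can be seen in a minimal sketch of two-way mutation alone, ignoring selection, migration and drift. The parameterisation of the bias is an assumption made here for illustration (a forward rate `u_fwd` towards the deleterious allele and a backward rate `u_back`); the way bias is specified in the simulations may differ.

```python
def mutate(p, u_fwd, u_back):
    """Deleterious allele frequency after one round of two-way mutation."""
    return p + u_fwd * (1.0 - p) - u_back * p

def mutation_equilibrium(u_fwd, u_back):
    """Frequency at which forward and backward mutation balance."""
    return u_fwd / (u_fwd + u_back)

for bias in (1, 3, 10):                      # ratio of forward to backward rate
    u_back = 1e-5
    print(bias, mutation_equilibrium(bias * u_back, u_back))
```

With symmetric rates the neutral equilibrium is 1/2, whereas a threefold bias moves it to 3/4, so weakly selected loci, which sit close to this mutational equilibrium, carry more deleterious alleles as the bias grows.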

### G. Distribution of fitness effects and dominance coefficient

Here, we explore the impact of gene flow on the mean population size in the metapopulation when we have a distribution of fitness effects and dominance coefficients. To do this, we assume that a fraction *a* of loci are additive and have fitness effects *Ks*_{A} and the remaining fraction, (1−*a*) of loci are recessive with fitness effects *Ks*_{R} as shown in fig. 5.
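The two-class setup described above can be written down as a list of per-locus parameters. In the sketch below, the function name, the tuple layout and the default dominance coefficients (0.5 for additive and 0 for recessive loci) are assumptions chosen for illustration; the values used for fig. 5 may differ.

```python
def build_loci(L, a, Ks_A, Ks_R, h_A=0.5, h_R=0.0):
    """Return a list of (Ks, h) pairs for L loci.

    A fraction a of the loci is additive with scaled effect Ks_A; the
    remaining loci are recessive with scaled effect Ks_R.
    """
    n_additive = round(a * L)
    return [(Ks_A, h_A)] * n_additive + [(Ks_R, h_R)] * (L - n_additive)

loci = build_loci(L=100, a=0.25, Ks_A=0.1, Ks_R=1.0)
print(len(loci), loci[0], loci[-1])
```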

We see from fig. 5a - 5c that independent of the value of *a*, when additive alleles are nearly neutral and recessive alleles are non-neutral, increasing the selective effect of the recessive alleles, i.e., making them more strongly deleterious has little or no effect on the critical migration threshold above which the metapopulation survives. This also holds for the case where the additive alleles are mildly deleterious and occupy a higher proportion of loci (dashed lines in fig. 5c).

On the other hand, when additive alleles occupy a lower or equal proportion of loci as the recessive alleles, we see somewhat different dynamics. We observe a (slightly) higher critical migration threshold when additive and recessive alleles are mildly deleterious (black dashed line in fig. 5a and 5b). As the recessive alleles become moderately deleterious, the threshold reduces, and increasing the selective effect further has little or no effect on the threshold migration rate (compare blue and red dashed lines in both fig. 5b and 5c). Finally, for all combinations considered, we see the critical migration threshold at least doubling as *a* increases.

### H. Alleles that contribute most to load

The above plot shows that under our model of hard selection, when we have intermediate levels of connection in the metapopulation (i.e., with *Km*=2, orange color), alleles with mildly deleterious effects contribute most to load.

## Footnotes

https://git.ista.ac.at/oolusany/genetic-load-and-extinction-in-a-metapopulation

^{1}We see a slight non-monotonic behaviour with *h*=0.02 but this is essentially negligible.