Abstract
Classical models that ignore linkage predict that deleterious recessive mutations should purge or fix within inbred populations, yet inbred populations often retain moderate to high segregating load. True overdominance could generate balancing selection strong enough to sustain inbreeding depression even within inbred populations, but this is considered rare. However, arrays of deleterious recessives linked in repulsion could generate appreciable pseudo-overdominance that would also sustain segregating load. We used simulations to explore how long pseudo-overdominant (POD) zones persist once created (e.g., by hybridization between populations fixed for alternative mildly deleterious mutations). Balanced haplotype loads, tight linkage, and moderate to strong cumulative selective effects all serve to maintain POD zones. Tight linkage is key, suggesting that such regions are most likely to arise and persist in low recombination regions (like inversions). Selection and drift unbalance the load, eventually eliminating POD zones, but this process is quite slow under strong pseudo-overdominance. Background selection accelerates the loss of weak POD zones but reinforces strong ones in inbred populations by disfavoring homozygotes. Models and empirical studies of POD dynamics within populations help us understand how POD zones may allow the load to persist, greatly affecting load dynamics and mating systems evolution.
1 Introduction
Inbreeding depression (δ) is defined as the lower fitness of inbred compared to outbred individuals (Darwin, 1876). It is now generally accepted that δ is mainly due to the expression of segregating deleterious recessive mutations (Charlesworth and Charlesworth, 1987; Crow, 1993; Bataillon and Kirkpatrick, 2000; Roze, 2015). As direct selection, background selection, genetic drift and inbreeding all act to reduce diversity at such loci, the persistence of non-negligible levels of inbreeding depression is difficult to explain (Byers and Waller, 1999; Winn et al, 2011). Examples of such persistence include inbred lines of Zea mays (Kardos et al, 2014; Larièpe et al, 2012), Arabidopsis (Seymour et al, 2016), Mimulus (Brown and Kelly, 2020) and C. elegans (Chelo et al, 2019; Bernstein et al, 2019). Such observations led many to conclude that overdominant selection, i.e. a higher fitness of heterozygotes compared to either homozygote, was operating (Kimura and Ohta, 1971; Charlesworth and Charlesworth, 1987). But truly overdominant loci are rare, and most effects previously attributed to overdominance (such as heterosis and hybrid vigor) can be explained by simple dominance interactions (Crow, 1999a). Curiously, analyses of inbreeding depression often detect evidence of overdominance (see for example Baldwin and Schoen 2019). These apparent overdominant effects, however, probably reflect the effects of many deleterious recessive mutations linked in repulsion, a phenomenon termed pseudo-overdominance (hereafter POD, introduced by Ohta and Kimura 1969; reviewed by Waller 2021). We have known for half a century that a single strong overdominant locus can generate enough selection against homozygotes to persist even under complete self-fertilization (Kimura and Ohta, 1971). Could such strong effects also arise and persist via pseudo-overdominance?
Pseudo-overdominant selection will only emerge in genomic regions where many deleterious alleles are clustered together and often linked in repulsion, generating complementary haplotypes that express similar inbreeding loads as homozygotes. Genomic regions with reduced recombination, such as centromeric regions and chromosomal inversions, often maintain higher than expected heterozygosity. Centromeric regions in Zea mays, for example, maintain heterozygosity even after repeated generations of inbreeding (McMullen et al, 2009). This has also been found in 22 centromeric regions in the human genome (Gilbert et al, 2020). Kremling et al (2018) confirmed that many rare variants in maize express deleterious effects confirming that “even intensive artificial selection is insufficient to purge genetic load.” Brandenburg et al (2017) identified 6,978 genomic segments (≈ 9% of the genome) with unexpectedly high heterozygosity in land races of maize. These heterozygous segments contained more deleterious mutations than other parts of the genome, with several deeply conserved across multiple land races. Inversions, which halt recombination, also appear to accumulate lasting loads of deleterious mutations. Jay et al (2021) found that ancient inversions contribute greatly to heterosis in Heliconius butterflies. Kirkpatrick (2010) concluded that although the genetic basis for inversion overdominance has not yet been clearly determined, POD is plausible.
Pseudo-overdominance (POD) at many loci of small effect should mimic overdominant selection at a single locus, favouring heterozygosity for load within particular genomic regions. This could sustain inbreeding depression even in the face of purifying selection and drift. For POD to influence species evolution, it must exist for long enough and generate enough overdominant selection to leave a signature. Recombination, however, acts to break up such regions by unbalancing haplotype loads, allowing selection and drift to purge or fix their mutations. It is thus remarkable that polymorphic inversions expressing balancing selection date back to ancient hybridization events in Heliconius butterflies (Jay et al, 2021). Similarly, five ancient polymorphic zones predate the divergence of Arabidopsis from Capsella (approx. 8 million generations ago, Wu et al, 2017). These observations suggest that polymorphic regions may generate enough selection to sustain themselves for long periods of time. Could this selection derive from POD?
Several mechanisms might generate enough initial overdominance to create a POD zone including crosses between independently inbred lineages or sub-populations (generating high heterosis in the F1), a truly overdominant (e.g., self-incompatibility) locus, or chromosomal inversions where recombination is strongly suppressed, allowing mutations to accumulate. Here, we use simulations to study the evolutionary dynamics of POD zones generated initially by admixture between two populations fixed for different sets of deleterious mutations. In this scenario, high fitness emerges in the F1 where mutations fixed within each population are ‘masked’ as heterozygotes in hybrid offspring (Kim et al, 2018). We extend existing theory regarding the stable polymorphism that can exist at a single bi-allelic overdominant locus to examine the conditions necessary for POD to maintain two haplotypes containing many linked recessive deleterious mutations as heterozygotes. Because pseudo-overdominance depends on tight linkage among these loci, we expect that over time such zones will be vulnerable to being broken up by recombination. We therefore also explore how varying levels of linkage, dominance, selection and selfing rates affect POD zone stability and decay. Finally, we test how selection elsewhere in the genome affects the ability of POD zones to persist and the reciprocal effects of POD zones on load dynamics elsewhere in the genome.
2 Approaches
2.1 Load needed to generate a POD
Kimura and Ohta (1971) demonstrated that when the selective effects generating true overdominance are strong enough, a stable equilibrium can exist that perpetuates the two overdominant alleles indefinitely even within a fully self-fertilizing population. Consider a scenario in which two haplotypes, noted H1 and H2, occur within a diploid population self-fertilizing at rate σ. Each homozygote suffers a fitness reduction (s1 or s2) compared to the heterozygote fitness. In the case of true overdominance, Kimura and Ohta (1971) showed that a stable polymorphism will persist at an overdominant locus when:

$$\sigma < \frac{2 s_x (1 - s_x)}{s_1 + s_2 - 2 s_1 s_2} \qquad (1)$$

where sx = min(s1, s2) < 0.5. When both segregating homozygotes reduce fitness by at least half (s1, s2 > 0.5), selection acts to maintain overdominance even as the selfing rate approaches one, as selection removes homozygotes faster than they are generated (Rocheleau and Lessard, 2000). For situations with stable polymorphism, setting s1 = s2 results in both alleles being maintained at a frequency of 0.5.
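The threshold can be evaluated numerically. The sketch below assumes the condition takes the form σ < 2sx(1 – sx)/(s1 + s2 – 2s1s2), which is what a rare-allele invasion analysis of the mixed-mating model yields and which agrees with the verbal statements above (symmetric selection is stable for any σ < 1; s1, s2 > 0.5 is always stable). The function names are illustrative and are not taken from the authors' code.

```python
def critical_selfing_rate(s1: float, s2: float) -> float:
    """Selfing rate below which both haplotypes are maintained.

    Assumed form: 2*sx*(1 - sx) / (s1 + s2 - 2*s1*s2), with sx = min(s1, s2).
    A returned value >= 1 means the polymorphism is stable for every sigma < 1.
    """
    if not (0.0 < s1 < 1.0 and 0.0 < s2 < 1.0):
        raise ValueError("s1 and s2 must lie strictly between 0 and 1")
    sx = min(s1, s2)
    return 2.0 * sx * (1.0 - sx) / (s1 + s2 - 2.0 * s1 * s2)


def is_stable(s1: float, s2: float, sigma: float) -> bool:
    """True if selfing rate sigma is below the assumed stability threshold."""
    return sigma < critical_selfing_rate(s1, s2)


if __name__ == "__main__":
    for s1, s2 in [(0.45, 0.45), (0.47, 0.35), (0.43, 0.53), (0.6, 0.7)]:
        print(s1, s2, round(critical_selfing_rate(s1, s2), 4))
```

Under this form, unequal selection against the two homozygotes lowers the critical selfing rate, so asymmetry matters most in highly selfing populations.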
We use this threshold under true overdominance to estimate the number of load loci within a pseudo-overdominant (POD) zone required to generate the level of overdominance needed to maintain a stable equilibrium (see Eq. 1). For the sake of simplicity, we assume that each haplotype carries the same number nL of deleterious mutations, all with the same coefficient of selection s and dominance h. We assume initial complete linkage, as it can then be broken by recombination, with loci evenly spaced, occurring at intervals of ℓ Morgans between alternating trans-mutations on opposing haplotypes (Fig. 1). As fitness effects are considered multiplicative across loci, an individual’s fitness is:

$$W = (1 - hs)^{h_e} (1 - s)^{h_o} \qquad (2)$$

where he and ho are the number of heterozygous and homozygous mutations, respectively, carried by the individual. In the case of complete linkage, homozygosity at these loci only occurs in individuals carrying two copies of the same haplotype (genotype H1H1 or H2H2). As both haplotypes carry the same number of mutations, the coefficient of selection acting against either homozygote (sH = s1 = s2), relative to the fitness of the heterozygote H1H2 (WAA/WAa) is:

$$s_H = 1 - \frac{(1 - s)^{n_L}}{(1 - hs)^{2 n_L}} \qquad (3)$$
This expression allows us to determine the number of deleterious alleles per haplotype necessary to sustain enough overdominance to preserve both haplotypes via stable balancing selection (see Supp. File 1):

$$n_L = \frac{\ln(1 - s_H)}{\ln(1 - s) - 2 \ln(1 - hs)} \qquad (4)$$
As expected, the number of loci required to obtain a strength of selection against homozygotes sH decreases for higher values of s and h. For s = 0.01 and h = 0.2, nL = 115 for sH to be at least 0.5, which should sustain POD selection indefinitely (Supp. File 1, Fig. S1).
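The worked value above can be checked with a short calculation. This sketch assumes multiplicative fitness, so that a homozygote for a haplotype with nL mutations has fitness (1 – s)^nL and the H1H2 heterozygote has fitness (1 – hs)^(2nL); the function names are illustrative.

```python
import math


def homozygote_selection(n_loci: int, s: float, h: float) -> float:
    """Selection against either homozygote relative to the H1H2 heterozygote."""
    return 1.0 - (1.0 - s) ** n_loci / (1.0 - h * s) ** (2 * n_loci)


def loci_needed(s_h: float, s: float, h: float) -> int:
    """Smallest number of loci per haplotype giving at least s_h."""
    # Log fitness ratio (homozygote / heterozygote) contributed by one locus.
    per_locus = math.log(1.0 - s) - 2.0 * math.log(1.0 - h * s)
    if per_locus >= 0.0:
        raise ValueError("no heterozygote advantage for these s and h")
    return math.ceil(math.log(1.0 - s_h) / per_locus)


if __name__ == "__main__":
    print(loci_needed(0.5, 0.01, 0.2))                    # 115
    print(round(homozygote_selection(100, 0.01, 0.2), 3))  # 0.454
```

With s = 0.01 and h = 0.2 this gives 115 loci for sH = 0.5, matching the value quoted in the text, and 100 loci per haplotype give sH of about 0.45.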
2.2 Inbreeding depression
Inbreeding depression δ is a population specific variable, reflecting the number of heterozygotes maintained in a population. The general equation used to estimate inbreeding depression is:

$$\delta = 1 - \frac{W_s}{W_o} \qquad (5)$$

where Ws is the fitness of selfed offspring and Wo that of outcrossed offspring (Charlesworth and Charlesworth, 1987). If there is a POD zone, we can consider that there are two potential forms of selection contributing to inbreeding depression: 1) selection against deleterious mutations that are scattered throughout the genome (noted δs) and 2) overdominant selection generated by POD zones (noted δod). If we assume that selection against deleterious mutations elsewhere in the genome and overdominant selection do not interfere with one another (i.e. no associative overdominance or effects of background selection) and fitness effects remain multiplicative (see for example Kirkpatrick and Jarne 2000), the upper limit of the expected level of inbreeding depression will be:

$$\delta = 1 - (1 - \delta_s)(1 - \delta_{od}) \qquad (6)$$
When mutations are deleterious, and accounting for drift, δs depends on the haploid mutation rate U, the coefficient of selection s and the dominance of mutations h (see equation 3 from Bataillon and Kirkpatrick 2000): where F = σ/(2 – σ) is the equilibrium inbreeding coefficient (expected deviation from Hardy-Weinberg equilibrium of genotype frequencies). Though this expression for F remains true for weak overdominance (Glémin, 2021), when there is strong overdominance, the inbreeding coefficient depends on the coefficients of selection and allelic frequencies (Appendix A4 from Kimura and Ohta, 1971). In our case with symmetrical selection against homozygotes, this term is given as: will tend to zero with increasing sH (see Fig. A1 in Supp. File 1). Selfing populations subject to strong overdominant selection thus tend to behave like outcrossing ones as low fitness homozygotes are eliminated. In the presence of POD selection, we set F in Eq. 7 to .
At equilibrium, the contribution of POD to inbreeding depression δod can, for symmetrical overdominance, be written as: where , which simplifies to when s1 = s2 = sH - see Eq. A2 from Supp. File 1 and Kimura and Ohta (1971). We provide the general expressions for and δod in Supp. File 1 (see Eq. A3).
As previously shown, δod increases with the selfing rate σ for strong overdominant selection and δs decreases with σ (Charlesworth and Charlesworth, 1987, 1990). It is therefore possible to have similar δ (given in Eq. 6) in outcrossers and selfers, depending on the rates of background mutation U and the strength of POD selection (i.e. the value of sH).
2.3 Recombination and PODs
Thus far, we have assumed complete linkage in order to apply one-locus overdominance theory to infer the strength of selection against homozygotes necessary to sustain a stable equilibrium. However, some recombination will occur, allowing the strong linkage disequilibrium among loci within a POD to erode over time. In order to examine the effect of recombination on the stability of POD, we propose a system of Ordinary Difference Equations (ODEs) representing the change in frequencies of the two initial haplotypes (ΔP1 and ΔP2) and that of a newly introduced recombinant haplotype (ΔPc):
The mean fitness of the population is the sum of the expected genotypic frequencies after selection (see Supp. File 2, Eq. (A4)), and sc, sc,1 and sc,2 are the coefficients of selection associated respectively with haplotypes HcHc, HcH1 and HcH2. We resolve this system of equations to determine the conditions necessary for a recombinant haplotype Hc to increase in frequency (ΔPc > 0).
3 Simulations
To confirm expectations from the analytical model given above and explore the dynamics of POD selection, we develop an individual-based simulation program in C++, uploaded to Zenodo.org (Abu Awad and Waller, 2022). We consider a scenario where POD selection arises after an admixture event between two initially isolated populations fixed for different mutations within the same genomic region (a “proto-POD” zone). Each population is made up of N sexual diploid individuals, self-fertilizing at a fixed rate, σ. Each individual is represented by two vectors, each carrying the positions (between 0 and 1) of deleterious mutations along a single chromosome with map length R Morgans. Recombination occurs uniformly throughout the genome. Mutations within and outside of the POD zone have a fixed effect, with respective coefficients of selection, s and sd, and dominances, h and hd. Individual fitness is calculated as shown in Eq. 2. The number of new mutations is sampled from a Poisson distribution with parameter U, the haploid mutation rate, and their positions are uniformly distributed along the genome (infinite-locus model). Generations are discrete (no overlap) and consist of three phases: i) introducing new mutations, ii) selection, and iii) recombination and gamete production.
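The representation of an individual as two vectors of mutation positions lends itself to a compact fitness calculation. The following is a minimal sketch rather than the authors' C++ implementation: it assumes multiplicative fitness, with each heterozygous mutation contributing a factor (1 – hs) and each homozygous mutation a factor (1 – s), and it uses a single (s, h) pair for simplicity, whereas the simulations distinguish POD-zone from background mutations.

```python
from typing import Iterable


def fitness(hap_a: Iterable[float], hap_b: Iterable[float],
            s: float, h: float) -> float:
    """Multiplicative fitness of a diploid carrying haplotypes hap_a and hap_b.

    Each haplotype is a collection of mutation positions. A position present
    on both haplotypes is homozygous; a position present on one is heterozygous.
    """
    a, b = set(hap_a), set(hap_b)
    n_hom = len(a & b)   # mutations shared by both haplotypes
    n_het = len(a ^ b)   # mutations carried by exactly one haplotype
    return (1.0 - h * s) ** n_het * (1.0 - s) ** n_hom


if __name__ == "__main__":
    # One homozygous mutation (0.2) and two heterozygous ones (0.1, 0.3).
    print(fitness([0.1, 0.2], [0.2, 0.3], s=0.1, h=0.5))
```

Matching positions exactly is adequate under an infinite-locus model, where two copies share a position only if they descend from the same mutation.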
3.1 POD zone architecture and initiation
Two types of simulation are run, one with an arbitrary ideal haplotype structure expected to favour POD persistence and one with a more realistic distribution of mutations within the POD zone. The former consists of constructing two perfectly complementary haplotypes, H1 and H2. Cis-mutations occur at regular intervals (every 2ℓ M) along each haplotype and mutations are staggered, spreading the load evenly through the POD and ensuring pseudo-overdominance (Fig. 1). The expected number of recombination events occurring between two trans-mutations is then ℓ. The second type of POD zone architecture is one with randomly placed mutations in a predefined genomic region, their positions sampled from a uniform distribution, while ensuring that a locus with the same position is not sampled for both haplotypes. In both cases the center of the POD zone is kept constant for both haplotypes and the size of the POD zone is 2ℓnL M, with nL potentially different for each haplotype. The POD zone is arbitrarily positioned around the center of the genome, its exact center at position 0.5 along the chromosome.
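The idealized architecture can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: adjacent loci are ℓ Morgans apart and alternate between H1 and H2, and positions on the unit-length chromosome are obtained by dividing map distances by the map length R (an assumed conversion). The function and parameter names are hypothetical.

```python
from typing import List, Tuple


def build_ideal_haplotypes(n_loci: int, ell: float, map_length: float = 1.0,
                           center: float = 0.5) -> Tuple[List[float], List[float]]:
    """Return two staggered, perfectly complementary haplotypes (H1, H2).

    n_loci     -- mutations per haplotype (nL)
    ell        -- map distance in Morgans between adjacent trans-mutations
    map_length -- chromosome map length R in Morgans (positions span 0..1)
    center     -- position of the middle of the POD zone
    """
    spacing = ell / map_length          # assumed Morgan-to-position conversion
    total = 2 * n_loci                  # loci in the zone, both haplotypes
    start = center - spacing * (total - 1) / 2.0
    positions = [start + i * spacing for i in range(total)]
    # Even-indexed loci go to H1, odd-indexed to H2, so cis-mutations on the
    # same haplotype are 2*ell apart and no position is shared.
    return positions[0::2], positions[1::2]


if __name__ == "__main__":
    h1, h2 = build_ideal_haplotypes(n_loci=3, ell=1e-4)
    print(h1)
    print(h2)
```

Because the two haplotypes share no position, every mutation in an H1H2 individual is heterozygous, which is what produces the pseudo-overdominance.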
After a burn-in period of 4000 generations, allowing the two source populations (each fixed for a given haplotype in the proto-POD zone) to reach mutation-selection-drift equilibrium, a new population of size N is created by randomly sampling individuals from both populations. We arbitrarily consider that each source population contributes 50% of individuals to the new population. The new population is then allowed to evolve for a further 4000 generations. Samples of 100 individuals are taken every 10 generations to estimate inbreeding depression, which we compare to the theoretical expectations presented above (Eqs. 6, 7 and 9). We also use these samples to estimate heterozygosity within and outside the POD zone (POD He and genome He, respectively) as:

$$H_e = \frac{1}{100} \sum_{j=1}^{100} \frac{he_j}{L} \qquad (11)$$

where hej is the number of heterozygous mutations carried by individual j (out of a sample of 100) and L is the total number of segregating sites in the genomic region of interest. A decrease of He with time signals the erosion of the POD zone, either through loss or fixation of mutations.
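The heterozygosity estimate can be sketched as below. The sketch assumes that He is the mean, over sampled individuals, of the number of heterozygous mutations divided by the number of segregating sites L in the region, and that a site counts as segregating when it is present in the sample but not on every sampled haplotype. That definition of "segregating", and the function name, are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Individual = Tuple[Iterable[float], Iterable[float]]


def heterozygosity(sample: List[Individual],
                   lo: float = 0.0, hi: float = 1.0) -> float:
    """Mean per-individual heterozygosity in the region [lo, hi]."""
    copies = Counter()          # number of sampled haplotypes carrying each site
    het_counts = []             # heterozygous mutations per individual
    for hap_a, hap_b in sample:
        a = {x for x in hap_a if lo <= x <= hi}
        b = {x for x in hap_b if lo <= x <= hi}
        copies.update(a)
        copies.update(b)
        het_counts.append(len(a ^ b))
    n_haplotypes = 2 * len(sample)
    # Sites fixed in the sample are excluded from L (assumed definition).
    n_segregating = sum(1 for c in copies.values() if c < n_haplotypes)
    if n_segregating == 0:
        return 0.0
    return sum(het_counts) / (len(sample) * n_segregating)


if __name__ == "__main__":
    pod = ([0.1, 0.3], [0.2, 0.4])      # an H1H2 heterozygote
    print(heterozygosity([pod, pod, pod]))
```

A sample made up only of H1H2 heterozygotes returns 1, and the value falls toward 0 as mutations are lost or fixed.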
Unless stated otherwise, all variables plotted are values obtained 4000 generations after the hybridisation event. Figures are made using the ggplot2 package (v3.3.6, Wickham 2016), with, in most cases, lines generated using the geom_smooth option. When this gave results that diverged too much from the plotted mean, the mean was used.
3.2 Simulations run
Simulations are run for population sizes N = 100, 1000 and 5000 and for selfing rates σ between 0 and 0.95. The haploid background mutation rate U is set to 0, 0.1 or 0.5, with new mutations outside the POD zone having a fixed coefficient of selection (sd = 0.01) and dominance (hd = 0.2 or 0.5). We explore the effect of genome map length R, choosing R = 1 and 10 Morgans for tight and loose linkage respectively, and we examine different strengths of linkage between loci in the POD zone, with ℓ = 10⁻⁴, 10⁻⁵ and 10⁻⁶. We consider both weak and strong selection against homozygotes, setting sH = 0.14, 0.26 and 0.45. These correspond to stable (polymorphic) overdominant selection when σ = 0, 0.5 or even (with a narrow range of stability) 0.95 (Fig. A2, dotted lines). To determine the effects of POD selection on heterozygosity elsewhere in the genome, we also run simulations where all alleles within the initial POD zone are neutral for all parameter sets mentioned above (achieved by setting s = 0 and h = 0 within the POD). We run 100 repetitions for each parameter set.
4 Results
4.1 POD persistence and degradation
We first examine how recombination, the strength of selection against linked load loci, and their arrangement within the POD zone, influence POD persistence.
4.1.1 Recombination and POD degradation
Under the assumption that recombination within the POD block is rare (reflecting tight linkage), any new haplotype Hc will be generated by a single recombination event. This is reflected in the ODEs introduced in Eq. (10) which compute changes in frequency of the two initial haplotypes (H1 and H2) and a recombinant (Hc). For simplicity, we initially assume an ideal case where mutations are arranged alternately within the POD zone (see Fig 1). Positions of deleterious alleles in H1 H2 heterozygotes alternate in trans relative to flanking mutations on the same chromosome (Fig. 1). Each haplotype carries nL deleterious mutations. Consider two cases: 1) the recombinant haplotype Hc (and its complement) each carry nL deleterious mutations; 2) Hc carries nL – 1 mutations because recombination has cleaved one from one end of the POD zone.
Given arbitrary values of sc, sc,1 and sc,2 (the coefficients of selection against HcHc, HcH1 and HcH2 genotypes, respectively), the only possible equilibria involve fixing one of the three haplotypes or maintaining only two of them. Hence any rare haplotype, Hc, should either be lost, go to fixation, or replace one of the initial haplotypes (co-existing with the other). For Hc to increase in frequency, ΔPc (Eq. (10)) must be positive when it enters the population (or it would be eliminated). Assuming the frequency of a recombinant Pc is of order ϵ (ϵ being very small), the expression for ΔPc to leading order in Pc can be derived. In a population at equilibrium with P1 = P2 = (1 – ϵ)/2 and setting s1 = s2 = sH:
The denominator of this expression is always greater than 0 for sH < 1. To understand the behavior of the leading-order ΔPc, we simplify the above equation by setting the equilibrium inbreeding coefficient to 0 (no self-fertilisation or very strong overdominant selection with sH ≈ 1, see Supp. Fig. A1). In this case Eq. 12 simplifies to 2(sH – sc,1 – sc,2)/(2 – sH). If no mutations have been cleaved off by recombination (i.e. Hc carries nL mutations), the numerator 2(sH – sc,1 – sc,2) ≤ 0 (see Eq. B1 in Supp. File 2 for expressions of sc,1 and sc,2), making the leading-order ΔPc negative (Fig. B2 in Supp. File 2). Hence Hc haplotypes will be selected against. This is because recombinant Hc haplotypes will share mutations with both the initial H1 and H2 haplotypes and a proportion of loci in HcH1 and HcH2 genotypes will inevitably be homozygous, resulting in a lower fitness of these genotypes compared to H1H2 heterozygotes. In this case neither the homozygous nor heterozygous genotypes with a recombinant haplotype present a selective advantage. If instead Hc carries nL – 1 mutations, the resulting coefficients of selection (Eq. B2, Supp. File 2) lead to a positive leading-order ΔPc (the numerator in this case can be positive). The larger the equilibrium inbreeding coefficient (or the selfing rate σ), the more positive the resulting leading-order ΔPc.
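The sign argument can be illustrated numerically with the simplified expression 2(sH – sc,1 – sc,2)/(2 – sH) quoted above. The expressions for sc,1 and sc,2 in Supp. File 2 are not reproduced here, so this sketch derives them directly from the sets of mutations each genotype carries, assuming the alternating architecture and multiplicative fitness. Loci are indexed 0 to 2nL – 1 with H1 on even and H2 on odd indices; that indexing and the function names are assumptions made for illustration.

```python
def relative_cost(hap_a: set, hap_b: set, n_loci: int, s: float, h: float) -> float:
    """Selection coefficient of genotype hap_a/hap_b relative to H1H2."""
    n_hom = len(hap_a & hap_b)
    n_het = len(hap_a ^ hap_b)
    w = (1.0 - h * s) ** n_het * (1.0 - s) ** n_hom
    w_ref = (1.0 - h * s) ** (2 * n_loci)     # H1H2: all 2*nL loci heterozygous
    return 1.0 - w / w_ref


def recombinant_advantage(n_loci: int, s: float, h: float, breakpoint: int) -> float:
    """Simplified leading-order change 2(sH - sc1 - sc2)/(2 - sH).

    The recombinant Hc carries the H2 alleles left of the breakpoint and the
    H1 alleles from the breakpoint onward.
    """
    h1 = set(range(0, 2 * n_loci, 2))
    h2 = set(range(1, 2 * n_loci, 2))
    hc = {i for i in h2 if i < breakpoint} | {i for i in h1 if i >= breakpoint}
    s_h = relative_cost(h1, h1, n_loci, s, h)
    s_c1 = relative_cost(hc, h1, n_loci, s, h)
    s_c2 = relative_cost(hc, h2, n_loci, s, h)
    return 2.0 * (s_h - s_c1 - s_c2) / (2.0 - s_h)


if __name__ == "__main__":
    # breakpoint=2: Hc keeps nL mutations; breakpoint=1: Hc loses one end mutation.
    print(recombinant_advantage(100, 0.01, 0.2, breakpoint=2))
    print(recombinant_advantage(100, 0.01, 0.2, breakpoint=1))
```

With nL = 100, s = 0.01 and h = 0.2, a recombinant that keeps nL mutations gives a negative value and one that has shed a terminal mutation gives a positive value, in line with the argument in the text.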
This result leads us to predict that if a POD is initially stable, its eventual loss will usually occur gradually as recombination events near the distal ends of the POD cleave off mutations creating haplotypes with improved relative fitness. The reduced zones of stable equilibria for sc = sH in selfing populations (Fig. A2, in Supp. File 1) means that selection will more easily act to destabilise the POD zone by eroding mutations. This should fix one of the original haplotypes or a recombinant with the strength of selection affecting the rate at which this occurs.
Using simulations, we confirm results from single locus overdominance that stronger selection is more likely to result in stable polymorphism even for high selfing rates (Supp. Fig. S2). Drift and selection can both act to erode POD (shown by the rate of decrease of heterozygosity in Supp. Fig. S2). Strong drift renders selection neutral when NesH << 1, accelerating the loss of supposedly stable POD selection (N = 100 in Supp Fig. S2). Increasing the efficacy of selection will also favour the loss of POD selection, but unlike for strong drift, this is due to a more efficient purging (and higher effective recombination rate) of loci contributing to POD selection (N = 5000 in Supp Fig. S2). As the differences between population sizes are quantitative, and sH is a good predictor of mid/long-term stability of POD zones, in the following, we examine simulations only for N = 1000, for which both drift and selection act on POD stability, and sH = 0.45, for which overdominant selection is stable for all self-fertilisation rates simulated.
4.1.2 Effect of the strength of selection against individual loci
As mutations are progressively lost from POD zones, recombinants can go to fixation. This will eventually destabilize the POD zone. We next assess how varying the coefficients of selection s and dominance h against individual loci affects POD persistence. For a fixed value of selection against homozygotes, sH, varying s, h and nL (obtained using Eq. (4)), we calculate the expected increase in frequency of a recombinant haplotype using Eq. (12). If no mutation is lost (Hc also carries nL mutations), the leading-order ΔPc remains negative except under high rates of self-fertilisation, when it can be positive (though close to 0). However, a mutation lost through recombination generates a positive leading-order ΔPc that increases with increasing strengths of selection and dominance of the mutations for all rates of self-fertilisation (Figs. 2a and b for sH = 0.45). We confirm this prediction via simulations. These show that most losses of diversity (fixation or loss of mutations) occur at the ends of the POD zone (Figs. 2c and d for selfing rate σ = 0.95). Losses of diversity within the POD zone intensify as s and h increase.
Stronger selection against individual mutations sustains heterozygosity more effectively as fewer mutations suffice to generate the same amount of balancing selection. However, the loss of a stronger mutation as a result of recombination will more likely unbalance and destabilise the POD zone. This accelerates the fixation or loss of mutations (Fig.2c). Increasing the dominance of load loci has similar effects as increasing s but requires more mutations to reach the same sH (i.e. nL = 60 and 150 for h = 0 and 0.3 respectively, Fig. 2f). This is because increased dominance increases the relative fitness of both the fitter homozygote (i.e. the haplotype with one less mutation due to recombination) and the heterozygote, increasing the overall fitness advantage of losing a mutation. The same patterns are observed in outcrossing populations to a lesser extent (Supp. Fig. S3). Increased linkage within the POD zone reduces the rate at which these higher fitness recombinants occur, slowing this process (dashed lines, Figs. 2e and f; see Supp. Fig. S4 for patterns of mutation loss within the POD zone).
4.1.3 POD region architecture
So far, we have considered only an ideal genetic architecture that favours maintaining POD, namely homozygotes of both haplotypes having identical fitness disadvantages relative to the heterozygote and equally spaced cis and trans mutations within the POD zone. We now relax these assumptions by considering initial haplotypes carrying different numbers of mutations, nL, within the POD region (while maintaining equal spacing) and then by placing randomly spaced mutations within the POD zone.
To unbalance the segregating homozygotes, consider alternative POD zone haplotypes with nL = 80, 100, or 120 mutations paired with a haplotype H1 with nL = 100 mutations (denoted by relative lengths of 0.8, 1 and 1.2 respectively in Figs. 3a and c). These generate substantial fitness differentials with relative selection coefficients against homozygotes s1 = 0.47 and s2 = 0.35 (blue lines), s1 = s2 = 0.45 (black lines), or s1 = 0.43 and s2 = 0.53 (green lines). In outcrossing populations, selection trims down longer, more loaded haplotypes as recombination makes variants available. This shrinks more loaded haplotypes to sizes close to the smaller haplotype (Fig. 3a, solid lines). Overdominant selection, however, sustains the core POD region’s heterozygosity, He (Fig. 3b, solid lines). Self-fertilising populations, in contrast, show less POD zone stability under asymmetric selection despite the fact that populations with balanced loads showed only slight observed losses or fixations of mutations (dashed black lines in Figs. 3a and c). When the alternative haplotype has less load (a relative size of 0.8), it quickly goes to fixation (dashed blue lines in Figs. 3a and c). This result matches the theoretical expectation that no overdominant polymorphism can be maintained with these coefficients of selection against homozygotes when the selfing rate is 0.95 (see Fig. A2 in Supp. File 1). When the total load of the second haplotype increases to a relative size of 1.2, the POD zone is more commonly sustained as mutations are trimmed off the ends of the POD zone (Fig. 3a, c). This difference in behavior reflects the need for segregating load to exceed a threshold to sustain a POD zone. As for outcrossing, most mutations of the larger haplotype will be trimmed off the edges, but there is some fixation and/or loss of mutations along the whole POD region (dashed green line in Fig. 3a), lowering the mean observed He (dashed green line in Fig. 3c). This is most probably due to a larger range of recombinants having a higher selective advantage, provided that they trim the larger haplotype and thus help destabilize POD selection.
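The selection coefficients quoted for the unbalanced haplotypes can be approximately recovered from the multiplicative fitness model. The sketch below assumes s = 0.01 and h = 0.2 for mutations in the POD zone (values consistent with sH ≈ 0.45 for nL = 100, but not stated explicitly for these runs) and a heterozygote that carries every mutation of both haplotypes in the heterozygous state. The function name is illustrative.

```python
from typing import Tuple


def homozygote_costs(n1: int, n2: int, s: float, h: float) -> Tuple[float, float]:
    """Selection against H1H1 and H2H2 relative to the H1H2 heterozygote.

    n1, n2 -- number of mutations carried by haplotypes H1 and H2
    """
    w_het = (1.0 - h * s) ** (n1 + n2)     # all n1 + n2 mutations heterozygous
    s1 = 1.0 - (1.0 - s) ** n1 / w_het     # H1H1: n1 homozygous mutations
    s2 = 1.0 - (1.0 - s) ** n2 / w_het     # H2H2: n2 homozygous mutations
    return s1, s2


if __name__ == "__main__":
    for n2 in (80, 100, 120):
        s1, s2 = homozygote_costs(100, n2, s=0.01, h=0.2)
        print(n2, round(s1, 3), round(s2, 3))
```

Under these assumptions the three pairings give roughly (0.475, 0.358), (0.454, 0.454) and (0.431, 0.535), close to the values reported above. Shortening the alternative haplotype raises the relative cost of H1H1 because the heterozygote carries less load.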
When the mutations are not in an ideal configuration, but randomly positioned throughout the designated POD zone, stability of the POD zone is barely affected in outcrossing populations (solid lines in Figs. 3b and d), even when the haplotypes are initially uneven. Selfing populations, however, require stronger linkage to retain the POD zone (compare dashed lines in Fig. 3 for ℓ = 10⁻⁶ M to Fig. S5 for ℓ = 10⁻⁵ M). Despite more frequent fixations/losses of mutations, some heterozygosity nonetheless persists for approximately 1000 generations even with lower linkage (Supp. Fig. S5).
4.2 Background mutations
Mutations introduced elsewhere in the genome influence POD selection dynamics and persistence and, conversely, PODs affect purifying selection across the genome. In general, when a POD zone is stable, background mutations will not destabilise it. Background selection does, however, affect heterozygosity within and outside the POD zone. Let us compare heterozygosity within the POD zone in simulations with background mutations to simulations lacking them (i.e. U > 0 vs. U = 0; Fig. 4a). Interestingly, in self-fertilising populations, He within the POD zone rises when background selection occurs elsewhere in the genome. These effects increase when mutation rates rise (green vs. blue lines, U = 0.5 and 0.1 respectively) and linkage increases (full vs. dashed lines reflecting map lengths of R = 1 and 10 Morgans respectively).
Similarly, the presence of a stable POD zone affects the heterozygosity of deleterious mutations observed elsewhere in the genome. When mutation rates are low (U = 0.1), POD selection slightly decreases the mutational heterozygosity elsewhere in the genome (blue lines Fig. 4b). Conversely, a higher genomic mutation rate (U = 0.5, green lines) results in increased heterozygosity, especially in highly selfing populations with small map lengths (implying tight linkage - solid green line in Fig. 4b). Effects of POD selection on effective population size are complex but in most cases, POD selection tends to decrease Ne (Supp. Fig. S6).
To confirm that these effects derive from overdominance rather than some other effect of background selection, we simulated effects of co-dominant background mutations (hd = 0.5). Because such mutations are expressed in heterozygotes and thus easily removed by selection, they generate few associations with other loci. Co-dominant background mutations have little effect on within-POD zone heterozygosity in contrast to simulations with more recessive mutations (hd = 0.2). This is true even within selfing populations (Supp. Fig. S7a). This confirms that it is associative overdominance between the POD zone and other load loci that increases heterozygosity (Supp. Fig. S7b). Varying rates of background mutation and POD zone length also have complex effects on effective population size Ne (Supp. Fig. S7c).
4.3 Inbreeding depression
As expected, the overdominance generated in a POD zone increases the inbreeding depression, δ, that populations express (Supp. Fig. S8). Observed δ in outcrossing populations can be predicted using Eq. (6), which accounts for overdominant selection and unlinked deleterious mutations. In selfing populations, variable erosion of the POD zone and POD selection dynamics generate bimodal distributions of δ (see Supp. Fig. S9 for clearer representations). Some simulations generate values of δ close to those predicted by Eq. (6) (dashed lines in Fig. 5) while others generate values predicted when selection acts only against the unlinked recessive deleterious mutations (Eq. (7), dotted lines in Fig. 5). This may reflect loss of the POD zone. Genomes with smaller map lengths (e.g., R = 1 Morgan) generally increase the observed δ, especially in selfing populations (see Supp. Figs. S8 and S10).
5 Discussion
Given that purging, drift, and background selection all reduce segregating variation and thus inbreeding depression, we face the question of what force perpetuates these, even within small and inbred populations. Waller (2021) emphasized this enigma and reviewed mechanisms that might account for it. Selective interference among loci might act to slow or block purging (Lande and Schemske, 1985a; Winn et al, 2011). Recurrent mutations might also replenish the load fast enough to regenerate δ (Fisher, 1930; Charlesworth, 2018). A third possibility is that clusters of recessive mutations linked in repulsion emerge, creating enough balancing selection via pseudo-overdominance (POD) to counter purging and drift, sustaining selection for outcrossing or mixed mating systems (Waller, 2021). Our goals here were to explore the dynamic stability of POD zones (initially ignoring how they arise) using both classical one-locus overdominant theory (Kimura and Ohta, 1971) and simulations. We found that strong and balanced POD zones can persist for hundreds to many thousands of generations.
Whether POD zones are fragile or robust depends critically on several genetic parameters. These include the number and severity of deleterious mutations, their proximity and cis-/trans-positions, and their levels of dominance/recessivity (Figs. 2 and S3). Strong and balanced selection plus tight linkage allow POD zones to persist, as these conditions enhance the associations (linkage disequilibria) that generate POD effects. Recombination dissolves these associations, allowing purifying selection and drift to disrupt POD zones by purging and fixing mutations. Mutations erode from either end of the POD zone, or the load becomes unbalanced enough to fix one haplotype. The importance of linkage and small mutational effects is evident in the radically enhanced purging seen in models that ignore linkage and assume major mutational effects (Lande and Schemske, 1985b). We also found that new recessive mutations that occur elsewhere in the genome generate associations with load alleles within POD zones that enhance POD zone heterozygosity and persistence (Fig. 4). Such mutations add to the segregating load, increasing heterozygote advantage. Because levels of heterozygosity are correlated across the genome in partially inbred populations (identity disequilibrium), the background selection generated by mutations outside the POD zone tends to reinforce the balancing selection favoring heterozygotes in the POD zone. POD zones also exert reciprocal effects, enhancing the heterozygosity of mutations occurring elsewhere in the genome when mutation rates are moderate (U = 0.5, Fig. 4b). This effect was amplified within selfing populations, presumably reflecting how selection against POD zone homozygotes favors heterozygosity across the genome when more identity disequilibrium occurs. These effects would be further enhanced if mutations were to have varying dominance effects, a scenario which we did not consider here.
However, recent work has shown that POD selection can be generated in a single population by the clustering of mutations in repulsion, even without heterogeneous recombination rates along the chromosome (Sianta et al, 2021). These results, coupled with ours, lead us to hypothesize that any genomic region displaying reduced recombination could provide a haven for POD zones to emerge and persist.
5.1 How do POD zones originate?
Many empirical observations could be explained by the existence of POD zones (see Introduction and Waller 2021). Whether POD zones that are conserved across populations exist in sufficient number and strength to affect evolutionary dynamics hinges on the relative rates at which they are created and destroyed. We focused on POD zone erosion and loss, not how they arise. As our results show, a requirement for POD stability is strong linkage within a given genomic region in which mutations can accumulate through the actions of selection and genetic drift. Inversions and centromeric regions with restricted recombination provide preconditions favoring POD zone emergence, as do genomic regions neighbouring loci currently or previously under overdominant selection, where recombination is suppressed. Examples where this has been observed include self-incompatibility loci (Takebayashi, 2003; Igic et al, 2008; Mable, 2008), MHC loci (Garrigan and Hedrick, 2003; Gemmell and Slate, 2006), and loci with balanced polymorphisms generated by ecological selection (van Oosterhout et al, 2000; Jay et al, 2021). In such regions, mutations of small effect become effectively neutral when the product of the effective population size and the selection coefficient is small, N_e s << 1 (Crow and Kimura, 1970; Hedrick et al, 2016). These will drift in frequency and often fix, increasing the “drift load” to the point where it may compromise population viability (Whitlock et al, 2000; Charlesworth, 2018). Selection against strongly deleterious mutations will accentuate fixation of milder mutations linked in repulsion via “background selection” (Charlesworth et al, 1997; Zhao and Charlesworth, 2016). Pairwise and higher associations (linkage disequilibria) also increase within small and inbred populations, even among alleles at unlinked loci, limiting selection (Hill and Robertson, 1966; Sved, 1971; Ohta and Cockerham, 1974; Lewontin, 1974).
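As a rough illustration of the effective-neutrality criterion above, the following sketch evaluates Kimura's diffusion approximation for the fixation probability of a new mutation and compares it with the neutral expectation 1/(2N). It rests on assumptions that are simpler than the setting of this paper: additive (semi-dominant) fitness effects rather than recessive ones, a single unlinked locus, and a population in which census and effective size are equal. The function names are hypothetical and chosen only for this example.

```python
import math


def fixation_probability(ne, s):
    """Kimura's diffusion approximation for one new mutant copy.

    Assumptions (not taken from the paper): genotype fitnesses 1, 1+s, 1+2s
    (s < 0 for a deleterious mutation), census size equal to ne, no linkage.
    """
    p0 = 1.0 / (2.0 * ne)          # initial frequency of a single new copy
    x = 4.0 * ne * s               # scaled strength of selection
    if abs(x) < 1e-12:
        return p0                  # neutral limit
    # (1 - exp(-x * p0)) / (1 - exp(-x)), written with expm1 for accuracy
    return math.expm1(-x * p0) / math.expm1(-x)


def relative_to_neutral(ne, s):
    """Fixation probability divided by the neutral value 1 / (2 * ne)."""
    return fixation_probability(ne, s) * 2.0 * ne


if __name__ == "__main__":
    # Same population size, weak versus stronger deleterious effect.
    for s in (-0.001, -0.05):
        print(s, relative_to_neutral(100, s))
```

With an effective size of 100, a mutation with s = -0.001 (N_e |s| = 0.1) fixes almost as often as a neutral one under these assumptions, whereas one with s = -0.05 (N_e |s| = 5) almost never does.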
The scenario we suggested for creating POD zones involves drift fixing alternative sets of recessive deleterious mutations among isolated populations. When such populations hybridize, their F1 progeny experience high heterosis reflecting the cumulative effects of POD across the whole genome (Crow, 1999b). Under free recombination, this heterosis is expected to erode by 50% in the F2 and each subsequent generation as recombination dissipates the associations generating the POD (Harkness et al, 2019), ignoring the presence of epistatic Dobzhansky-Muller incompatibilities (Ehiobu et al, 1989). However, where clumps of mutations occur within short genomic regions (or in low recombination zones), POD zones may be spawned. Inter-population crosses often reveal high heterosis (Willi et al, 2013; Spigler et al, 2017), as do crosses between low-fitness inbred lines in plant and animal breeding programs. Theory suggests that any incipient POD zone generating heterozygous progeny at least twice as fit as homozygous progeny will allow that POD zone to persist even in highly selfing populations. Dramatic examples of “hybrid vigor” in F1 crosses include cases where progeny have up to 35 times the fitness of parental lineages (Tallmon et al, 2004; Hedrick and Garcia-Dorado, 2016), easily satisfying this condition.
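The twofold fitness threshold mentioned above can be illustrated with a deliberately simplified sketch. It treats a POD zone as a single symmetric overdominant locus (two non-recombining haplotypes with equal loads) in an infinite, completely selfing population, with heterozygote fitness 1 and homozygote fitness 1 - s. These simplifications and the function names are assumptions made for illustration; they are not the simulation model used in this study. Under them, heterozygotes persist only when s > 0.5, that is, when heterozygotes are more than twice as fit as homozygotes, and the adult heterozygote frequency settles at (2s - 1)/s.

```python
def next_het_freq(h, s):
    """One generation of complete selfing followed by viability selection.

    h is the heterozygote frequency among adults. Selfed heterozygotes
    produce 1/2 heterozygous offspring; homozygotes breed true.
    Assumed fitnesses: heterozygote 1, both homozygotes 1 - s.
    """
    het_offspring = h / 2.0
    hom_offspring = 1.0 - het_offspring
    mean_fitness = het_offspring + hom_offspring * (1.0 - s)
    return het_offspring / mean_fitness


def long_run_het_freq(s, h0=0.5, generations=2000):
    """Iterate the recursion from an arbitrary starting frequency h0."""
    h = h0
    for _ in range(generations):
        h = next_het_freq(h, s)
    return h


if __name__ == "__main__":
    for s in (0.4, 0.6, 0.8):
        print(s, long_run_het_freq(s))
```

Because the sketch has no recombination, drift, or load imbalance, it gives an upper bound on persistence; the erosion processes examined in our simulations are exactly what it leaves out.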
Proto-POD zones may be fragile. Our models show that recombination and selection eliminate proto-POD zones with weak, unbalanced, or loosely linked loads. However, in some regions, cumulative selective effects from localized mutations may be large and balanced enough to allow a persistent POD zone to emerge. Such zones eliminate many homozygous progeny, reducing effective rates of inbreeding (Eq. 8). This, in turn, reduces rates at which deleterious recessive mutations are lost both within POD zones and elsewhere in the genome (Fig. 4). Selection against low-fitness recombinants might even favor the evolution of reduced rates of recombination within POD zones, providing another mechanism to stabilize POD zones (cf. Olito et al 2022). We ignore the potential of POD zones to gain strength over time by accumulating additional internal mutations sheltered from selection as heterozygotes, which would augment the overdominance, as observed at the S-locus in Arabidopsis halleri (Llaurens et al, 2009).
5.2 Evolutionary consequences of POD selection
POD zones could affect the architecture and the dynamics of the genetic load in various ways. Most conspicuously, our simulations of background selection show how POD zones could increase the segregational load elsewhere in the genome and vice versa. Such effects imply that mutations both within and outside the POD zone could reinforce the selection maintaining POD zones, sustaining more variability and larger segregating loads than otherwise expected. Such loads could favor self-incompatibility mechanisms for their ability to produce fewer low-fitness homozygous genotypes. Our scenario where population hybridization spawns POD zones suggests a mechanism whereby fixed drift loads might regularly be converted into segregating loads, which then persist in regions expressing strong overdominance.
Although we expect positive heterozygosity-fitness correlations within partially inbred populations (given that heterozygosity inversely measures inbreeding), heterozygosity and variation within POD zones reflect the opposite: non-adaptive variation emerging from sustained mutational and segregational genetic loads. This may help to explain why heterozygosity-fitness correlations can be weak and inconsistent (David, 1998). POD zones might increase loads within populations by creating safe havens within which new deleterious mutations could accumulate while increasing the load of mutations segregating elsewhere in the genome. Small, inbred populations might also become vulnerable to “mutational meltdown”, threatening population viability (Gabriel et al, 1993). Conversely, POD zones may provide individual or population advantages by sustaining inbreeding depression and favoring outcrossing in ways that better sustain adaptive genetic variability.
5.3 POD effects on mating system evolution
The presence of POD conspicuously affects the evolution of plant and animal mating systems by sustaining more segregational load and higher inbreeding depression than expected, especially in small, inbred populations. Early models of mating system evolution sought to explain variable levels of self-fertilization as equilibria reflecting how selection acted on progeny with more or less inbreeding depression. In these simple static models, inbreeding depression less than 0.5 would result in exclusive selfing, while higher levels would favor exclusive outcrossing. Simple dynamic models that allow selection to act on the load make mixed mating systems even more improbable, because inbreeding purges deleterious mutations, generating “run-away” selection for ever-increasing levels of selfing (Lande and Schemske, 1985b). If drift instead fixes many segregating mutations, similar effects emerge, as this, too, causes inbreeding depression to decline. The ability of many small, inbred populations to nevertheless retain genetic variation and inbreeding depression, plus the absence of purely inbreeding taxa, thus poses a paradox (Byers and Waller, 1999; Winn et al, 2011). More complex and realistic models that incorporate effects of linkage, drift, and the associations among loci that arise in small, inbred populations show far more complex dynamics (Charlesworth and Charlesworth, 1987; Uyenoyama et al, 1993). One relevant model showed that a single unlinked overdominant viability locus anywhere in the genome generates positive associations with modifier alleles enhancing outcrossing (Uyenoyama and Waller, 1991). Such associations favor a persistently mixed mating system. Because POD also favors heterozygotes, we expect POD zones to exert similar effects. The presence of POD zones might thus help to account for the paradoxes of persistent segregating loads and of populations and species that maintain mixed mating systems.
If, instead, POD zones regularly arise and then deteriorate, selection could alternately favor selfing and outcrossing. This might provide an entirely different mechanism favoring mixed mating systems.
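The 0.5 threshold in the simple static models described above follows from counting gene copies transmitted per ovule. The sketch below reproduces that textbook accounting under assumptions that we state explicitly: inbreeding depression δ is a fixed constant (no purging and no POD dynamics), selfing does not reduce pollen export (no pollen discounting), and all outcrossed offspring have fitness 1. The function names are hypothetical.

```python
def gene_copies_transmitted(selfing_rate, delta):
    """Expected gene copies a parent transmits, scaled per ovule.

    Assumed accounting (static model, no pollen discounting):
      selfed seed     -> 2 parental copies, surviving with fitness 1 - delta
      outcrossed seed -> 1 parental copy
      exported pollen -> 1 parental copy, regardless of selfing rate
    """
    selfed = 2.0 * selfing_rate * (1.0 - delta)
    outcrossed = 1.0 - selfing_rate
    pollen = 1.0
    return selfed + outcrossed + pollen


def favored_strategy(delta):
    """Compare complete selfing with complete outcrossing for a given delta."""
    full_selfing = gene_copies_transmitted(1.0, delta)
    full_outcrossing = gene_copies_transmitted(0.0, delta)
    if full_selfing > full_outcrossing:
        return "selfing"
    if full_selfing < full_outcrossing:
        return "outcrossing"
    return "neutral"


if __name__ == "__main__":
    for delta in (0.3, 0.5, 0.7):
        print(delta, favored_strategy(delta))
```

Because the transmitted total equals 2 + r(1 - 2δ), it is linear in the selfing rate r, so this static accounting never favors an intermediate selfing rate. Mixed mating requires additional ingredients, such as the associations with overdominant or POD regions discussed in this section.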
5.4 Conclusions
Understanding the mechanisms that create and sustain POD zones casts light on how commonly POD zones may arise and persist and on the genetic and demographic circumstances that enhance their longevity. Comparative genomic data will be particularly useful for searching for POD zones and analyzing their structure and history. Our models demonstrate how several genetic, demographic, and mating system parameters may affect load dynamics within and beyond POD zones. Any POD zones that persist are likely to strongly affect mating system evolution by reducing both purifying selection and drift, sapping the power these forces would otherwise have to reduce inbreeding depression. Our models also show that POD zones can persist given the right conditions. We encourage further research to extend and refine our understanding of this phenomenon.
6 Acknowledgements
We thank Sylvain Glémin, Lei Zhao, Yaniv Brandvain and an anonymous reviewer for useful comments. Diala Abu Awad was funded by the Alexander von Humboldt Stiftung.