Diversification or collapse of self-incompatibility haplotypes as outcomes of evolutionary rescue

Self-incompatibility systems in angiosperms are exemplars of extreme allelic polymorphism maintained by long-term balancing selection. Pollen that shares an allele with the pollen recipient at the self-incompatibility locus is rejected, and this rejection favors rare alleles as well as preventing self-fertilization. Advances in molecular genetics reveal that an ancient, deeply conserved, and well-studied incompatibility system functions through multiple tightly linked genes encoding separate pollen-expressed F-box proteins and pistil-expressed ribonucleases. We show that certain recombinant haplotypes at the incompatibility locus can drive collapse in the number of incompatibility types. We use a modified evolutionary rescue model to calculate the relative probabilities of increase and collapse in number of incompatibility types given the initial collection of incompatibility haplotypes and the population rate of gene conversion. We find that expansion in haplotype number is possible when population size or the rate of gene conversion is large, but large contractions are likely otherwise. By iterating a Markov chain model derived from these expansion and collapse probabilities, we find that a stable haplotype number distribution in the realistic range of 10–40 is possible under plausible parameters. However, small or moderate-sized populations should be susceptible to substantial additional loss of haplotypes beyond those lost by chance during bottlenecks. The same processes that can generate many incompatibility haplotypes in large populations may therefore be crushing haplotype diversity in smaller populations.


Introduction 22
Self-incompatibility (SI) is a common strategy by which plants ensure outcrossing, and it provides a classic example of extreme allelic polymorphism maintained by long-term balancing selection. 24 An SI plant rejects self pollen, which is identified by a specificity phenotype encoded by a highly polymorphic self-incompatibility locus (S-locus). This type of inbreeding avoidance mechanism 26 is widespread in plants: distinct non-homologous SI systems have been discovered in the Poaceae (Li et al. 1997), Papaveraceae (Foote et al. 1994), Solanaceae (McClure et al. 1989), Brassicaceae 28 (Stein et al. 1991), and Asteraceae (Hiscock et al. 2003), and some form of SI is thought to be present in nearly 40% of plant species across 100 families (Igić et al. 2008). Rejection occurs 30 when the pollen specificity matches the pistil specificity, which is likewise encoded by the Slocus. Pollen with a rare specificity has an advantage because it is less likely to encounter a 32 pistil with a matching specificity and is thus less likely to be rejected. This advantage of rarity results in balancing selection, which maintains polymorphism at the S-locus by protecting S-locus 34 alleles (S-alleles) from loss through drift (Wright 1939). While it is "fairly obvious that selection would tend to increase the frequency of any additional alleles that may appear" (Wright 1939), 36 it is not obvious how novel S-alleles do appear. Molecular genetic understanding of SI has advanced rapidly, and modern theory to explain the diversification of S-alleles must account for 38 what is now known about the structure of the S-locus. We develop a population genetic model of the expansion, collapse, and long-term evolution of S-allele number under the widespread 40 solanaceous SI system.
The SI system originally discovered in Nicotiana (East and Mangelsdorf 1925) is particularly but rather through a form of nonself-recognition (Kubo et al. 2010). In nonself-recognition systems, pollen is rejected by default, but complementary pollen-and pistil-products react to prevent 82 rejection of cross pollen. Results from models of self-recognition systems may not be directly applicable to nonself-recognition. Despite the ubiquity of solanaceous SI, few models of S-allele 84 evolution explicitly incorporate its nonself-recognition system. Fujii et al. (2016) proposed that a pollen-function mutation complementary to a novel pistil-function mutation could arise on a sin-86 gle background and then spread to other backgrounds through gene conversion. Bod'ová et al. (2018) enumerated several possible evolutionary pathways to new S-alleles, and their results also 88 showed that the pathway hypothesized by Fujii et al. (2016) could either expand or reduce S-allele number. An explanation of the long-term trajectory of S-allele number must therefore account 90 not only for the steps needed to create new fully-functioning alleles, but also for dynamics that can lead to their loss.

92
A stochastic process involving the risk of collapse raises radically different questions than one of continual growth. Are S-allele numbers repeatedly expanding and contracting, or do 94 they reach equilibria where changes are rare? We may sometimes observe changes in progress if they are frequent, but infrequent changes may be buried in the past. When a contraction 96 occurs, is it usually small or catastrophic? Large contractions, followed by re-expansion, should result in more turnover in alleles and fewer shared alleles among species than would small 98 contractions. Since contraction is possible whether SI is maintained throughout (Bod'ová et al. 2018) or is temporarily lost in self-compatible intermediates (Uyenoyama et al. 2001;Gervais 100 et al. 2011;Bod'ová et al. 2018), neither pathway is unidirectional. Is the pathway maintaining SI still sufficient for expansion, or are self-compatible intermediates necessary? If new proto-S-102 alleles are segregating in natural populations, are they self-compatible, -incompatible, or both? Answers to these questions will aid interpretation of S-allele genealogies, which document both 104 similarities and differences between species in their complement of S-alleles. For example, the large historical contraction in S-allele number in the common ancestor of Physalis and Witheringia 106 has been interpreted as a demographic bottleneck (Paape et al. 2008), but a firmer theoretical investigation could assess whether a collapse restricted to the S-locus is a plausible alternative 108 explanation.
We use an evolutionary rescue model to approximate the relative probabilities of collapse 110 and expansion in S-allele number as well as the distribution of collapse magnitudes. We find a negative relationship between initial S-allele number and the probability of further expansion. By 112 constructing and iterating a Markov chain out of these contraction and expansion probabilities, we find stable long-term distributions of haplotype number. Realistic S-allele numbers (20-40) 114 are possible through this process alone for large but plausible values of population size and rate of gene conversion. However, contractions can be very large when they occur, often eliminating 116 the majority of S-alleles. These results suggest that contractions in S-allele number are ubiquitous unless new S-alleles are prevented from invading in the first place, as might occur if they cause 118 pollen limitation.
Our results point to three major predictions. First, we predict that collapse is very likely when 120 many existing S-alleles are pre-adapted to being compatible with novel S-alleles. This is because the pre-adapted alleles eliminate their competitors when a novel specificity arises, thereby re-122 ducing the total number of surviving alleles. Second, we predict that intraspecific variation in S-allele number can sometimes be explained by an invasion of a runaway self-incompatible al-124 lele rather than a bottleneck or a transition to self-compatibility. We should therefore expect to find allopatric populations with large disparities in S-allele number, and we should expect the 126 population with the smaller number to harbor an S-allele that, if introduced into the other population, would trigger a collapse. Third, populations that have recently undergone such a collapse 128 should be characterized by self-incompatibility but low S-allele diversity, and they should show a signature of a selective sweep at the S-locus but greater polymorphism outside it.

Model and Results
The risk of contraction in S-allele number in the solanaceous incompatibility system is rooted 132 in the system's biological details. This system involves two kinds of complementary products: pistil-expressed ribonucleases (RNases) (McClure et al. 1989) and pollen-expressed F-box pro-134 teins (Lai et al. 2002;Entani et al. 2003;Sijacic et al. 2004). Together, these products achieve a form of collaborative nonself-recognition first described by Kubo et al. (2010). Under this sys-136 tem, each "allele" at the S-locus is actually a tightly linked haplotype containing one RNase gene and a collection of paralogous F-box genes. The RNase gene is highly polymorphic, and each 138 functionally distinct haplotype possesses a different RNase allele. Nonself pollen is recognized through the complement of F-box proteins it expresses. Each F-box paralog produces a function-140 ally distinct product, and each of these products is capable of detoxifying one or more forms of RNase. Pollen is only successful if it expresses both of the two F-box proteins to match the 142 diploid pollen recipient's two RNase alleles. Self-fertilization is prevented because each fully functional haplotype lacks a functional copy of the F-box gene that corresponds to the RNase 144 on the same haplotype, and so each pollen grain necessarily lacks one of the two F-box proteins required to fertilize the plant that produced it. Rejection is not restricted to self-pollination: any 146 two individuals that share one haplotype will reject half of each other's pollen, while individuals that share both haplotypes are completely incompatible. Rejection is therefore more likely 148 between closely related individuals. Tight genetic linkage across all components of the S-locus reduces the probability that recombination will cause a haplotype to lose a functional F-box par-150 alog (reducing its siring opportunities) or gain the F-box paralog that detoxifies its own RNase (inducing self-compatibility).

152
This system presents two novel challenges that are absent in self-recognition systems. The first is a "chicken-egg problem." A novel F-box specificity alone is at best neutral because it 154 detoxifies an RNase that does not yet exist. A novel RNase alone is deleterious because it degrades all pollen and renders the plant ovule-sterile. Both of these mutations must invade in 156 order to generate a new, fully functional S-haplotype. Second, for every novel RNase that arises, the corresponding F-box specificity must appear on every other haplotypic background in order 158 to restore cross-compatibility among all haplotypes. Building on the hypothesis for expansion of haplotype number proposed by Fujii et al. (2016), Bod'ová et al. (2018 showed how cross-160 compatibility could be restored among all incompatibility classes after novel RNase and F-box mutations have invaded. If an initially neutral F-box mutation already exists when its com-162 plementary RNase mutation arises, the F-box mutation will then invade because it confers the advantage of compatibility with the new RNase. To restore full cross-compatibility among hap-164 lotypes, all haplotypes must acquire F-box paralogs complementary to all RNase alleles other than their own either through gene conversion (Fujii et al. 2016)  Based on this biological background, we introduce a set of metaphors to make the interactions among haplotypes more intuitive. Each form of pistil-function RNase is a lock, and each form 178 of pollen-function F-box protein is a key. A diploid plant codominantly expresses two different locks in its pistils. Each pollen grain expresses every paralogous key in its haploid genome, 180 and these keys collectively form that pollen grain's key ring. The pollen must unlock both of the pollen recipient's locks in order to fertilize it, which requires keys to both locks. Each haplotype is 182 self-incompatible because its key ring lacks the key to the lock on the same haplotype. Our model must explain two processes: the invasion of a novel lock and the evolution of the complementary 184 key on other haplotypes. 186 We focus on a population without any self-compatible haplotypes. Several models of incompatibility evolution include a self-compatible intermediate (Uyenoyama et al. 2001;Gervais et al. 188 2011;Bod'ová et al. 2018), which can either lead to expansion or to total collapse of incompatibility. Self-compatible haplotypes do not lose siring opportunities as their frequency increases, 190 so they are not under the same negative frequency-dependence as functional incompatibility haplotypes. If inbreeding depression is low, self-compatibility is often universally superior to 192 self-incompatiblity, and the population will eventually become completely self-compatible at equilibrium (Uyenoyama et al. 2001;Gervais et al. 2011;Bod'ová et al. 2018). We do not re-194 tread this well-established result. Our population can instead be interpreted as one in which inbreeding depression is so great that any self-compatible haplotypes produced by mutation or 196 gene conversion are effectively instantly removed by selection, although we do not explicitly model inbreeding depression.

198
Evolution of a new incompatibility class requires both a new lock and a new key. A lock mutation without a corresponding key would cause ovule sterility to the mutant carrying it and 200 would thus be quickly eliminated, but it would not otherwise affect the population. In contrast, a key to a not-yet-extant lock would be neutral as long as there were no cost of expressing it. There 202 should therefore be conditionally neutral functional diversity among duplicate keys. If a series of unsuccessful lock mutations culminates in one that corresponds to an extant key, the final 204 mutation will appear to have been anticipated by that key. We therefore model the key appearing before its complementary lock with the knowledge that this pair of mutations may have been 206 preceded by a series of pairs that occurred in the "wrong" order. The new lock allele then arises from its ancestral lock allele on one of the existing haplotypes. Biologically, this corresponds to 208 a mutation changing the specificity of one of the existing RNase alleles such that its product is now recognized by a different F-box protein and no longer by the old one. Note that there are 210 two other possible results of gene conversion besides adding a key to a key ring. First, gene conversion may replace a functional F-box paralog with a non-functional or deleted allele: i.e., 212 it may remove a key. Such a conversion event should reduce the siring success of the haplotype and be unconditionally deleterious. Second, gene conversion may induce self-compatibility by 214 uniting a key with its complementary lock. This conversion event should also be deleterious if SI is maintained by intense inbreeding depression. These convertants certainly occur, but we do not 216 track them because they should be removed rapidly. Note that our gene conversion parameter, which includes only conversions that add a key without inducing self-compatibility, is therefore 218 smaller than the raw rate of functional gene conversion, which also includes the production of short-lived deleterious convertants. Rather than speculate on the empirical proportions of these 220 kinds of conversion events, we simply vary the rate of key additions that maintain SI.
A major barrier to expansion without a self-compatible intermediate is that, unless all other 222 haplotypes possess the new key, the new lock mutation will necessarily reduce the amount of compatible pollen the mutant plant receives. If this reduced pollen supply reduces the mutant's 224 seed set, the mutation will be eliminated by selection because it provides no countervailing advantage. However, plants in large populations with efficient pollination may be so pollen-226 saturated that even a small compatible proportion is sufficient to achieve full seed set. We model such a population because it provides a biologically plausible (though probably not general) 228 scenario in which a self-incompatible lock mutant can persist. Large populations are also the most conducive to expansion for reasons discussed below, so the absence of pollen limitation of 230 seed set may be most realistic for the populations in which expansion is most likely. Strictly, though, the lock mutation can be neutral even if the mutant suffers some pollen limitation: it 232 must merely suffer no more pollen limitation than average.
We consider a population with n initial complete haplotypes, each at frequency 1/n (Wright 234 1939). When one new lock mutation appears, we identify five classes of haplotypes (summarized in fig. 1). The mutation of the new lock is the first in a four-step evolutionary pathway to 236 expansion or collapse ( fig. 2). First, there are n L haplotypes in the class that possesses one of the initial n locks and a key ring that is "complete," sensu Bod'ová et al. (2018). That is, they possess the keys to all locks but their own, including the mutant lock. This first class is denoted L for "lucky" because its members fortuitously possess the key to the new lock. Each haplotype in this 240 class initially occurs at frequency d L /n, where d L is an arbitrary initial fraction. Second, for each of these n L complete haplotypes, there exists one haplotype that is identical except that it lacks 242 the key to the new lock. These could represent ancestral versions of the complete haplotypes that have not yet gained the key to the new lock. This second class is denoted C for "chump" because 244 members of this class are simply inferior versions of the corresponding "lucky" haplotypes. Each haplotype in this class occurs at frequency (1 − d L )/n. Third, there are n U haplotypes that have 246 one of the initial n locks and an incomplete key ring that lacks only the key to the mutant lock.
Unlike the previous class, there exists no complete version of these haplotypes initially. This third 248 class is denoted U for "unlucky" because members of this class cannot unlock the new lock like the "lucky" haplotypes can, but neither must they compete with superior version of themselves 250 like the "chump" haplotypes must. Each haplotype in this class occurs at frequency 1/n. Fourth, there is the ancestral haplotype on which the lock mutation arose. It is like any other incomplete 252 haplotype (e.g., a U haplotype) except that its key ring is identical to that of the haplotype bearing the lock mutation. This fourth class is denoted A for "ancestral" because it is ancestral to the lock 254 mutant. It occurs at frequency (1 − d S )/n, where d S is the arbitrary initial fraction of ancestral haplotypes that have been replaced with lock-mutant haplotypes. Fifth, there is the haplotype 256 bearing the lock mutation. Unlike all other incomplete haplotype classes, the key that it lacks is the key to its ancestor rather than the key to the new lock. It is denoted S for "spiteful" because 258 the lock mutation does not increase the haplotype's own fitness, but it does decrease the fitness of the "unlucky" and "chump" haplotypes. (Strictly, this is a case of ammensalism rather than 260 spite.) There are thus n L + n U + 1 = n pre-existing haplotypes (including the mutant's ancestor) along with the mutant itself for a total of n + 1 haplotypes.

262
These classes occur at initial frequencies, where each p represents the frequency of all haplotypes of a given class. That is, the population 264 begins with lock alleles at equal frequencies (as expected at equilibrium before the origin of the new lock) except that a fraction d S of copies of A are replaced by copies of S, which induces the 266 distinction between the L and C classes.

Invasion of key to novel lock 268
The population has now passed through the "mutation" step, the first of the four-step pathway where the subscript indicates a diploid genotype. Genotypes AA, AS and SS do not exist because the A and S haplotypes are mutually cross-incompatible as well as self-incompatible.

282
Since all individuals are assumed to be equally fecund as ovule parents and equally viable, all variation in fitness is determined by siring success. The siring success of a given haplotype 284 on a given dam genotype is equal to the frequency of the dam genotype times the frequency of the sire haplotype among all pollen compatible with the dam. We therefore define D XY of 286 diploid genotype XY as the frequency of XY divided by the fraction of pollen compatible with XY. Comparing marginal fitnesses, w, where each P represents the frequency of a diploid genotype defined in terms of the two classes of haplotypes it possesses. An L haplotype can fertilize everything the corresponding C haplotype 290 can, and it can also fertilize individuals carrying the S haplotype, so Although U haplotypes are compatible with all L haplotypes, they are otherwise like C haplo-292 types: The A and S haplotypes have equal fitnesses because, although they have different locks, they 294 have identical key rings and thus identical siring prospects: We find that w L > w C as long as D LS , D CS , or D US > 0 (i.e., as long as S exists). That is, as long 296 as L haplotypes have a functional key that C haplotypes lack, L haplotypes have greater fitness.
When S does not exist, w C = w L . Therefore, w C < w L as long as S persists throughout, and all 298 C haplotypes are lost at equilibrium. At the initial allele frequencies, the fitnesses w A and w S conveniently reduce to S is greater than the initial mean fitness across the haplotypes, This condition essentially expresses the requirement that there is a sufficient number (n L or n U ) U ) haplotypes such that the lock mutation benefits from 304 their decrease in frequency, and that this quantity must be greater the more copies of the lock mutant there are initially (d S ). Numerical iteration suggests that this condition is not stringent:

306
we considered values of n L , n U ≥ 2 and always found the lock mutation to increase at least slightly.

Equilibration of haplotype frequencies
The genotype frequency recursion equations (see Appendix) are not linear functions of genotype 310 frequencies, so we cannot derive a vector of equilibrium genotype frequencies by simply solving for an eigenvector of a recursion matrix. Instead, we numerically iterate the genotype frequency 312 recursions for 4000 generations to generate the equilibrium genotype frequencies given n U , n L , d L , and d S . We held d L and d S both constant at 0.01, and varied n U and n L through all combinations 314 of the values 2, 5, 10, and 20 haplotypes. For all these scenarios, the haplotype classes L, A, and S increased in frequency, while classes C and U decreased. A representative example with 316 n L = n U = 2 is plotted in figure 3 for the first 1000 generations. Class C approached extinction, and almost all gains went to class L. Classes A and S increased only slightly in frequency.

Acquiring the new key through gene conversion
The population has now passed through the "invasion" step, the second in the four-step pathway 320 ( fig. 2) be extremely unlikely for each haplotype to acquire the new key by an independent mutation.
The new key is also unlikely to be copied to other key rings through recombination because 328 recombinational exchange is rare within the S-locus. However, the low observed allelic diversity at a given F-box paralog locus suggests that these loci have undergone recent gene conversion 330 events (Kubo et al. 2015). Fujii et al. (2016) have shown that if repeated gene conversion events copy the new key to other key rings in series, the products of gene conversion will replace their 332 ancestors without the new key.
We follow Fujii et al. (2016) in modeling the evolution of cross-compatibility through gene 334 conversion. However, as Bod'ová et al. (2018) showed, adding a new key to a haplotype can in some cases give it a universal advantage over others (i.e., an advantage present at any frequency) 336 and potentially drive the others extinct. In particular, if the S haplotype acquires the key to its ancestor's lock, the resulting haplotype is compatible as a sire with every other haplotype, 338 but only a subset of haplotypes are compatible with it. Numerical iteration shows that such a haplotype will rise to high frequencies and eliminate every haplotype that lacks the key to the 340 new lock (fig. 4). These lost haplotypes include all of the U haplotypes along with the lock alleles they carry. Therefore, the overall process results in a collapse in haplotype number at equilibrium 342 rather than expansion.

Evolutionary rescue of doomed haplotypes 344
The population has now passed through the "conversion" step, the third in the four-step path- before any one of them is lost, then the number of lock alleles at equilibrium is one greater than the initial number, and expansion has occurred. Bod'ová et al. (2018)  It is also possible that the lock mutant's ancestral haplotype acquires the key to the new 372 lock before the mutant acquires the key to its ancestor's lock. The main difference is that, since all other extant haplotypes already possess the key to the ancestor's lock, only the novel lock 374 must be rescued. Failure to rescue the novel lock will result in a return to the status quo rather than a contraction. However, if the novel lock is rescued, the resulting haplotype will begin to 376 exclude the haplotypes lacking its key The process is nearly identical to the case in which the mutant acquires its ancestor's lock before the ancestor acquires its lock, except that the ancestor 378 is already rescued at the beginning of the process, and the novel lock begins at a lower frequency.
the rescue of the novel lock, which may or may not occur before its invasion.
The probability that any one of them will survive. The probability that the population is rescued is the probability that at least one of these mutations survives. Similarly, the probability that a 390 haplotype is rescued is the probability that at least one copy adds the new key to its key ring and survives. Orr and Unckless (2008) used a model of survival probability in a population of 392 decreasing size (Otto and Whitlock 1997), which is itself a modification of the branching-process model used by Fisher (1923) and Haldane (1927). However, we modify the original model in The survival probability for a single gene convertant is thus frequency-dependent, which com-410 plicates the calculation of the probability that at least one gene convertant survives.
In the model of Orr and Unckless (2008), the population declines at a predictable rate deter- The complement of this probability is the probability that one doomed haplotype is rescued, 444 assuming the expected number of gene convertants under the expected genotype frequency trajectory. We approximated the probability that a given number of haplotypes were rescued as the 446 probability that each of them was rescued independently. The rescue probabilities are not strictly independent because one rescue event alters the trajectory of genotype frequencies. However, 448 they may be approximately independent if the rescue events occur in a relatively short period before genotype frequencies can be greatly altered. Orr and Unckless (2008) found that rescue 450 is most likely to occur early in the process when mutation supply is highest, so it is plausible that rescues also occur in a short period in our model. This procedure resulted in a probability 452 distribution of the number of surviving haplotypes after rescue or collapse. A natural question is, are large collapses guaranteed, or is there a non-trivial probability of expansion?

454
For all iterations, we assumed that at least two haplotypes had already acquired the key to the new lock. This is the minimum number for the lock-mutation not to confer ovule-sterility because, if only one haplotype could fertilize it, all ovules carrying the lock-mutation would be fertilized by the same pollen haplotype. The maternal haplotype of the resulting offspring would be incompatible with all pollen except the paternal haplotype, which would be incompatible with itself. Therefore, when there is exactly one haplotype of class L, all ovules carrying the lock 460 mutation grow up to be ovule-sterile adults. We calculated the probability that all haplotypes survived for all combinations of n L = 2, 5, 10, 20, n U = 2, 5, 10, 20, and R conversion in the range of probability never reached 0.5 for n L ≥ 3. The effect of n U , which was to decrease the expansion probability, was larger for larger n L .

470
The distribution of haplotype number after rescue/collapse responded similarly to R conversion for different values of n L , gradually shifting rightward toward maintenance of the initial number 472 of haplotypes, but never reaching high probabilities of expansion for R conversion ≤ 0.1 ( fig. 6). For n L = 2 and n L = 5, the modal outcome was loss of all or almost all doomed haplotypes.

Long-term behavior
The population has now passed through the final "rescue/collapse" step of the four-step path-476 way ( fig. 2). However, natural populations may traverse this pathway repeatedly, and they must have undergone successive rounds of expansion to produce their current haplotypic diversity.

478
The distribution of haplotype number in nature should therefore be governed by the long-term balance between collapse and expansion. We can make crude predictions for the long-term evo-480 lution of haplotype number by treating different haplotype numbers as states in a Markov chain.
The transition process works as follows. In the initial state, there are n = n L + n U + 1 unique 482 lock alleles, including the one carried by the single haplotype of class A. After the new lock invades, there are n + 1 locks, now also including that of the single haplotype of class S. After 484 the rescue/collapse phase, the new state is the number of surviving lock alleles, ranging from n = n − n U = n L + 1 (no locks are rescued; only those sharing a haplotype with a complete key 486 ring survive) to n = n + 1 = n L + n U + 2 (all locks are rescued). Assuming that haplotype number changes only through the foregoing pathway, the stationary distribution of this Markov chain 488 represents the probability distribution for haplotype number at the expansion-collapse equilibrium. Other unmodeled processes, such as bottlenecks or alternative diversification pathways, 490 may decrease or increase the number observed in nature relative to the values predicted by the Markov chain.

492
At the end of one transition, all haplotypes possess the key to the lock that just invaded.
However, this does not imply anything about whether they possess the key to the next lock to 494 invade. The haplotypes that were lucky with respect to one lock may be unlucky with respect to the next. The outcome of the rescue process therefore tells us n but not n L or n U individually. We 496 assume that n L remains a constant parameter between states, and that all changes in haplotype number are represented by changes in n U .

498
Our rescue probability calculations showed that expansion was likely even from large initial haplotype numbers when n L = 2 and R conversion The time steps in this Markov process are somewhat abstract: they represent the waiting times between the invasion of one lock and the next. In reality, they would be variable in length.

508
However, neither rescue nor collapse can occur during these periods. The length of each waiting period is therefore irrelevant to the diversification process we modeled, though it may affect the 510 opportunity for bottlenecks or alternative diversification pathways. Since we were unable to map the waiting periods onto chronological time, we ignored the dynamics of the process and focused 512 on the ultimate stationary distribution. Preliminary trials showed that the final distribution was unaffected by the initial distribution, and we set the initial distribution such that all probability 514 was concentrated at n U = 1.
Our numerical iterations required a transition matrix of finite size. We therefore enforced an 516 upper boundary of 40 incomplete haplotypes, near the upper range observed in nature (Lawrence 2000). With the fixed value of n L = 2, this resulted in up to 42 total haplotypes. Any non-zero 518 probability of expanding beyond this maximum was instead treated as additional probability of remaining at the maximum. We also enforced a lower boundary at the biological minimum of 520 n = 3: below this value, all plants reject all pollen and the population is completely sterile. Any non-zero probability of collapsing to a state in which n < 3 was instead treated as additional 522 probability of transitioning to n = 3.
We found that the stationary distribution shifted toward greater numbers of haplotypes as 524 R conversion increased ( fig. 7). At R conversion = 0.01, almost all probability was concentrated at the biological minimum of three haplotypes, while at R conversion = 1, almost all probability was 526 concentrated at 40 haplotypes, the maximum we allowed. In the range from R conversion = 0.1-0.3, the stable distribution was centered around intermediate values in the range 3-40.

Discussion
We used an evolutionary rescue model to estimate the relative probabilities of expansion and 530 contraction in S-haplotype number, calculate the distribution of contraction magnitudes, and predict the long-term evolution of haplotype number. We find that expansion from low haplo-532 type number is possible if gene conversion is frequent or the population is large, but that the possibility of contraction is ubiquitous. These collapses can be large, easily resulting in the loss 534 of the majority of S-haplotypes. A unique prediction of this model is that recently bottlenecked populations should have a "debt" such that, once the appropriate gene conversion event occurs, 536 they will experience a sudden reduction in haplotype number in addition to the reduction directly caused by the bottleneck. The long-term maintenance of haplotype number is therefore 538 more precarious than previously appreciated.
This vulnerability is a necessary consequence two very reasonable premises along with what 540 is known empirically about the control of self-incompatibility. First, we assumed that gene conversion can copy any F-box specificity from one haplotype to another. This is consistent with 542 the low observed polymorphism among alleles of a given F-box paralog despite their long coexistence (Kubo et al. 2015). Second, we assumed that the supply of compatible pollen did In this case, we should predict that S-haplotype number will drop precipitously in P. pyraster in the (possibly distant) future.

564
The rate at which gene conversion copies a functional F-box from one haplotype to another is unknown. Per-nucleotide rates of gene conversion have been estimated to be 400 times the rates 566 of crossovers in Drosophila (Gay et al. 2007). Assuming a crossover rate of 10 −8 per meiosis per nucleotide, this yields a gene conversion rate on the order of 10 −6 . A functional gene conversion 568 rate of 10 −6 would produce few haplotypes (≈ 3) for populations smaller than 100,000, ≈ 10 − 40 for populations on the order of 100,000, and larger numbers for larger populations. Such 570 populations are large but not unrealistic. Note, however, that the parameter in our model is not merely the per-nucleotide rate, but the rate at which a functional F-box is gained. This rate 572 excludes gene conversion events that eliminate rather than transfer a specificity as well as events that fail to affect specificity at all.

574
The amount of nucleotide divergence between functionally distinct F-box paralogs could serve as an upper bound on how many changes are required to convert one specificity to an- Other than large populations are rapid gene conversion, another possible explanation of high 592 S-haplotype diversity is that gene flow buoys species-wide diversity despite local contractions. Uyenoyama et al. (2001) found that, similar to our results, S-allele number was unlikely to ex-594 pand within a single population because of the threat of collapse into self-compatibility. They proposed that expansion may instead occur through local turnover in S-alleles followed by in-596 troduction of the novel alleles into the broader metapopulation. However, this model assumed that incompatibility operates through self-recognition, in which introgressed S-alleles gain an 598 advantage because they are compatible with resident plants that do not recognize them as self. It is thus not obvious whether the conclusion translates to nonself-recognition. Castric et al. (2008) 600 found less divergence between putative shared ancestral variants of S-locus genes in Arabidopsis lyrata and A. halleri than between non-S orthologs, consistent with elevated introgression at the 602 S-locus in this self-recognition system. Under collaborative nonself-recognition, however, each novel RNase requires a corresponding F-box. A novel introgressed RNase allele will not nec-604 essarily be detoxified by local F-box proteins, and it may thus inflict ovule-sterility. If there is insufficient compatible pollen to support the migrant haplotype, self-incompatibility may act as 606 a barrier to introgression not just at the S-locus, but genome-wide. This barrier could be overcome, but it would require either that the key corresponding to the foreign lock already exists 608 in the population (e.g., as a dual-function key or as a segregating neutral variant), or that there is sufficient migration to supply the corresponding foreign keys. Reintroduction of recently lost 610 RNases, unlike introduction of truly novel RNases, should function identically under self-and nonself-recognition. Populations that have undergone a recent contraction meet this condition 612 because they have lost RNases but not necessarily the corresponding F-box genes. As long as the lost RNases were reintroduced from another population before the F-box genes became pseudogenized, the population could recover all lost S-haplotypes. A metapopulation can therefore still provide a "safety net" that prevents permanent contraction even if truly novel RNases cannot 616 easily spread.
Highly differentiated sets of RNases may even act to generate reproductive isolation. If 618 nonself-recognition poses greater barriers to introgression than self-recognition, this tendency might partly explain species selection for self-incompatibility: clades with nonself-recognition may more rapidly undergo reproductive isolation and thus speciation. Landis et al. (2018) found either a positive effect or no effect of autogamy on net diversification in Polemoniaceae, in con-622 trast to the negative effect of self-incompatibility on net diversification found by Goldberg et al. (2010) in Solanaceae. Although Goldberg et al. (2010) attributed the greater net diversification 624 rate in self-incompatible species to a lesser extinction rate compared to self-compatible species rather than a greater speciation rate, it is difficult to estimate these quantities separately. If the 626 increased diversification rate in the self-incompatible species of Solanaceae is due to isolating effects of nonself-recognition rather than from self-incompatibility per se, we should expect a 628 reduced effect of self-incompatibility on diversification in families with other systems like Brassicaceae and Asteraceae. 630 We found that the expansion probability greatly decreased as the number of complete haplotypes increased. Greater numbers of complete haplotypes reduce the expansion probability in 632 at least two ways. First, by increasing the number and thus the frequency of fit haplotypes, they increase the mean siring success. When compatibility with the new RNase is more common, the 634 competitive advantage of this compatibility is reduced. Second, with more complete haplotypes, the equilibrium frequency of the novel RNase is lesser. These two effects reduce the advantage 636 of compatibility with the new RNase and thus the survival probability of the rescuing gene convertant. Expansion was most likely when exactly two haplotypes were initially complete. This 638 condition may be more biologically plausible than it first appears. Most F-box mutations should exist on only one haplotypic background because all new mutations fall into this class. How-640 ever, these alleles should contribute neither to expansion nor contraction: they cannot support a corresponding RNase mutation in the first place unless they exist on at least two backgrounds.

642
If F-box mutations on two backgrounds are the second-most-common, then they are the most common within the subset that can actually contribute to contraction or expansion.

644
An essential parameter of this model, the number of haplotypes pre-adapted to the novel RNase specificity, has not been systematically estimated. discovery of the fine-scale genetic basis of nonself-recognition (Kubo et al. 2010(Kubo et al. , 2015 necessitated new theoretical explanation of the expansion process, and this theory now points to the 678 unanticipated possibility of S-allele collapse through runaway gene convertants. The following genotype frequency recursions were used to determine the equilibrium genotype frequencies after the invasion of the key to the new lock ( fig. 3). They were also used to produce 682 the trajectories leading to extinction of doomed haplotypes ( fig. 4), which were in turn used to calculate rescue probabilities. The genotype frequency recursions are 684 P LL = 2p L n L − 2 n L P LL + n L − 1 2n L (P LC + P LU + P LA + P LS )  Pollen F-box = Figure 1: Haplotype classes. In this example, the population begins with three complete haplotypes, and then a fourth lock allele is introduced. The new lock arises from the "ancestral" haplotype, creating a "spiteful" haplotype which is neutral itself but decreases the fitness of the other haplotypes. All haplotypes lack the key to the new lock except for those "lucky" enough to carry it in advance, which have an advantage over the "chump" haplotypes that bear their corresponding locks. Haplotypes that are "unlucky" merely lack the new key but don't compete against other haplotypes with their same locks.  Figure 2: Model steps. These four steps describe a hypothetical pathway to expansion in the number of incompatibility haplotypes. In the "mutation" step, a lock allele mutates to a new specificity (from four to five in this case), one that can be unlocked by an existing, previously neutral key variant. In the "invasion" step, haplotypes with the key to the new lock spread and replace otherwise identical haplotypes lacking the key. In the "conversion" step, haplotypes acquire new keys. This step ends when either the lock mutant acquires the key to its ancestor's lock, or when its ancestor acquires the key to the new lock. In the "rescue/collapse" step, the remaining haplotypes lacking the key to the new lock are driven to extinction unless they can acquire the missing key in their remaining time. Vertical arrows indicate gene conversion events.
Each * represents a haplotype that has acquired a missing key through gene conversion. Question marks indicate gene conversion events that may or may not produce a surviving gene convertant haplotype before the potential recipient is driven extinct. Gray boxes indicate haplotypes resulting from evolutionary rescue events that may or may not occur.