Diversification or collapse of self-incompatibility haplotypes as a rescue process

In angiosperm self-incompatibility systems, pollen with an allele matching the pollen recipient at the self-incompatibility locus is rejected. Extreme allelic polymorphism is maintained by frequency-dependent selection favoring rare alleles. However, two challenges limit the spread of a new allele (a tightly linked haplotype in this case) under the widespread “collaborative non-self recognition” mechanism. First, there is no obvious selective benefit for pollen compatible with non-existent stylar incompatibilities, which themselves cannot spread if no pollen can fertilize them. However, a pistil-function mutation complementary to a previously neutral pollen mutation may spread if it restores self-incompatibility to a self-compatible intermediate. Second, we show that novel haplotypes can drive elimination of existing ones with fewer siring opportunities. We calculate relative probabilities of increase and collapse in haplotype number given the initial collection of incompatibility haplotypes and the population gene conversion rate. Expansion in haplotype number is possible when population gene conversion rate is large, but large contractions are likely otherwise. A Markov chain model derived from these expansion and collapse probabilities generates a stable haplotype number distribution in the realistic range of 10–40 under plausible parameters. However, smaller populations might lose many haplotypes beyond those lost by chance during bottlenecks.

the Brassicaceae (Kachroo et al. 2002), pollen is accepted by default, but self pollen carries a product that induces rejection in the maternal plant. However, the SI system in question functions 92 not through self-recognition, but rather through a "collaborative nonself-recognition mechanism" (Kubo et al. 2010). In a nonself-recognition system, pollen is rejected by default, but cross pollen 94 carries a product that allows it to bypass rejection. The crucial difference from a theoretical perspective is in the behavior of a mutant that expresses a new, unrecognized specificity. A 96 new specificity would be compatible with existing phenotypes under self-recognition (it is new, but still not self), whereas it would be incompatible with existing phenotypes under nonself-98 recognition (it is no longer recognized as nonself). Results from models of self-recognition systems might therefore not be directly applicable to nonself-recognition. Despite the ubiquity of 100 collaborative non-self recognition-based SI, few models of S-allele evolution explicitly consider this system. Fujii et al. (2016) proposed that a pollen-function mutation complementary to a 102 novel pistil-function mutation could arise on a single background and then spread to other backgrounds through gene conversion. Bod'ová et al. (2018) enumerated several possible evolutionary 104 pathways to new S-alleles and demonstrated that the pathway hypothesized by Fujii et al. (2016) could either expand or reduce S-allele number. An explanation of the long-term trajectory of S-106 allele number must therefore account not only for the steps needed to create new fully functional alleles but also for dynamics that can lead to their loss. 108 The stochastic process involving the risk of collapse, which follows from the collaborative non-self recognition system, raises radically different questions than one where drift is the only 110 force opposing continual growth. Are S-allele numbers repeatedly expanding and contracting, or do they reach equilibria where changes are rare? We might sometimes observe changes in 112 progress if they are frequent, but infrequent changes might be buried in the past. When a con-traction occurs, is it usually small or catastrophic? Large contractions, followed by re-expansion, should result in more turnover in alleles and fewer shared alleles among species than would small contractions. Since contraction is possible whether SI is maintained throughout (Bod'ová 116 et al. 2018) or is temporarily lost in SC intermediates (Uyenoyama et al. 2001;Gervais et al. 2011;Bod'ová et al. 2018), neither pathway is unidirectional. Is the pathway maintaining SI still suf-118 ficient for expansion, or are SC intermediates necessary? If new proto-S-alleles are segregating in natural populations, are they self-compatible, self-incompatible, or both? Answers to these 120 questions will aid interpretation of S-allele genealogies, which document both similarities and differences between species in their complement of S-alleles. For example, the large historical 122 contraction in S-allele number in the common ancestor of Physalis and Witheringia has been interpreted as a demographic bottleneck (Paape et al. 2008), but a firmer theoretical investigation 124 could assess whether a collapse restricted to the S-locus is a plausible alternative explanation.
We find that diversification and collapse of S-alleles result from a rescue process similar to 126 evolutionary or genetic rescue of a population. To approximate the relative probabilities of collapse and expansion in S-allele number as well as the distribution of collapse magnitudes, we 128 therefore expand the analogy of evolutionary rescue (Orr and Unckless 2008) to model the extinction of haplotypes and to incorporate the frequency-dependent dynamics associated with SI. 130 We find a negative relationship between initial S-allele number and the probability of further expansion. By constructing and iterating a Markov chain out of these contraction and expansion 132 probabilities, we find stable long-term distributions of S-allele number. We show that for large but plausible values of population size and rate of gene conversion, this process can generate 134 numbers of S-allele numbers comparable to those found in nature (20-40). However, contractions can be very large when they occur, often eliminating the majority of S-alleles. These results 136 suggest that an evolutionary opportunity to expand the number of S-alleles is intrinsically connected to the risk of a contraction in S-allele number. 138 Our results yield numerous novel predictions. First we predict that novel stylar rejection alleles will arise on SC haplotypes which are maintained by mutation-selection balance. Second, 140 we predict that, when many but not all S-alleles are fortuitously capable of fertilizing plants carrying the novel S-allele, the S-alleles lacking this capability are very likely to be lost. This 142 is because the alleles possessing this fortuitous compatibility eliminate their competitors when a novel specificity arises, thereby reducing the total number of surviving alleles. However, the 144 fortuitously compatible S-alleles themselves are never expected to be lost at equilibrium. The joint effect of increasing the number of fortuitously compatible S-alleles is to increase the probability that some S-alleles are lost but also to increase the minimum number of surviving S-alleles.
Third, we predict that intraspecific variation in S-allele number can sometimes be explained by an 148 invasion of a runaway SI allele rather than a bottleneck or a transition to SC. Counterintuitively, we therefore predict that when allopatric populations exhibit large disparities in S-allele number, 150 the population with the smaller number of S-alleles will harbor a stylar allele absent from and incompatible with pollen from the population with more S-alleles, while all S-alleles from the 152 population with more S-alleles can be fertilized by pollen in the population with fewer S-alleles.

Collaborative Nonself-Recognition
The risk of contraction in S-allele number in the "collaborative nonself-recognition" incompati- 156 bility system is rooted in the system's biological details. This system involves two kinds of complementary products: pistil-expressed ribonucleases (RNases) (McClure et al. 1989) and pollen-158 expressed F-box proteins (Lai et al. 2002;Entani et al. 2003;Sijacic et al. 2004). Together, these products achieve a form of collaborative nonself-recognition first described by Kubo et al. (2010).

160
Under this system, each "allele" at the S-locus is actually a tightly linked haplotype containing one RNase gene and a collection of paralogous F-box genes. The RNase gene is highly polymor-162 phic, and each functionally distinct haplotype possesses a different RNase allele. Nonself pollen is recognized through the complement of F-box proteins it expresses. Each F-box paralog pro-164 duces a functionally distinct product, and each of these products is capable of detoxifying one or more forms of RNase. Pollen is only successful if it expresses both of the two F-box proteins 166 to match the diploid pollen recipient's two RNase alleles. Self-fertilization is prevented because each fully functional haplotype lacks a functional copy of the F-box gene that corresponds to 168 the RNase on the same haplotype, and so each pollen grain necessarily lacks one of the two F-box proteins required to fertilize the plant that produced it. Rejection is not restricted to self-170 pollination: any two individuals that share one haplotype will reject half of each other's pollen, while individuals that share both haplotypes are completely incompatible. Rejection is therefore 172 more likely between closely related individuals. Tight genetic linkage across all components of the S-locus reduces the probability that recombination will cause a haplotype to lose a functional F-box paralog (reducing its siring opportunities) or gain the F-box paralog that detoxifies its own This system presents two novel challenges that are absent in self-recognition systems. The first is a "chicken-egg problem." A novel F-box specificity alone is at best neutral because it 178 detoxifies an RNase that does not yet exist. A novel RNase alone is deleterious because it degrades all pollen and renders the plant ovule-sterile. Both of these mutations must invade in 180 order to generate a new, fully functional S-haplotype. Second, for every novel RNase that arises, the corresponding F-box specificity must appear on every other haplotypic background in order 182 to restore cross-compatibility among all haplotypes. Building on the hypothesis for expansion of haplotype number proposed by Fujii et al. (2016), Bod'ová et al. (2018 showed how cross-184 compatibility could be restored among all incompatibility classes after novel RNase and F-box mutations have invaded. If an initially neutral F-box mutation already exists when its com-186 plementary RNase mutation arises, the F-box mutation will then invade because it confers the advantage of compatibility with the new RNase. To restore full cross-compatibility among hap-188 lotypes, all haplotypes must acquire F-box paralogs complementary to all RNase alleles other than their own either through gene conversion (Fujii et al. 2016)  cess along with several other expansion pathways and found that up to 14 haplotypes could be maintained.

200
Based on this biological background, we introduce a set of metaphors to make the interactions among haplotypes more intuitive. Each form of pistil-function RNase is a lock, and each form 202 of pollen-function F-box protein is a key. A diploid plant codominantly expresses two different locks in its pistils. Each pollen grain expresses every paralogous key in its haploid genome, 204 and these keys collectively form that pollen grain's key ring. The pollen must unlock both of the pollen recipient's locks in order to fertilize it, which requires keys to both locks. Each fully 206 functional haplotype is SI because its key ring lacks the key to the lock on the same haplotype ( fig. 1).

208
In contrast to Bod'ová et al. (2018) our model addresses the realistic possibility that novel RNase alleles (locks) will have ovular fitness below the population average because keys to this 210 lock are initially rare. Thus, while Bod'ová et al. (2018) assumed the novel RNase was neutral, we model the case when this RNase suffers more intense pollen limitation than the rest of 212 the population. How could such a stylar mutation, which increases pollen limitation, increase ovular fitness (as is required for its adaptive spread)? We suggest that novel locks can be fa-214 vored on self-compatible genetic backgrounds otherwise maintained by the balance of selection against self-compatibility and gene conversion of F-box alleles which restore self-compatibility.

216
By restoring SI and preventing self-fertilization, a new lock can provide an ovular advantage and can spread by natural selection. Concurrently, the complementary key spreads within and among 218 haplotypes ("key rings"), and the number of S-allele increases if all S-alleles are "rescued" before S-haplotypes lacking this novel key are driven to extinction due to their mating disadvantage.

Model outline
We identify six haplotype classes with regard to their phenotypes expressed in pollen and pistils 222 ( fig. 2), and we evaluate a three-step evolutionary pathway to expansion or contraction of Shaplotype number ( fig. 3). We define "contraction" as any transition to a state with fewer lock 224 alleles. We define "expansion" as the transition from an initial state in which all haplotypes are cross-compatible with all other haplotypes to a final state that is the same but with one additional 226 cross-compatible SI haplotype.

In
Step 1, the "conversion" step, a haplotype acquires the key to its own lock through gene 228 conversion, rendering it SC. This ensures that when a lock mutation occurs on this haplotype, the resulting mutant will already possess every key but its own and will be cross-compatible as a 230 sire with all other haplotypes as soon as it arises. A supply of such SC haplotypes is maintained at the balance between gene conversion and strong selection against inbreeding. The invasion 232 and fixation of an SC haplotype should function identically under self-and nonself-recognition, and this process has already been thoroughly covered by Uyenoyama et al. (2001) and Gervais 234 et al. (2011). We return to it briefly below. However, we focus on cases in which inbreeding depression is strong enough that SC is at a net disadvantage-ours is a model of diversification 236 of S-haplotypes, not the transition to SC. For such cases, there is no risk that the population will become entirely SC: the only possible outcomes are increase, decrease, and stasis in the number 238 of SI haplotypes.

In
Step 2, the "mutation" step, the lock allele on the SC intermediate haplotype mutates to a new specificity, one that can be unlocked by an existing, previously neutral key variant. The new lock is a necessary component of a new haplotype. The SC intermediate previously had the 242 key to every lock, but now that its own lock has mutated, the resulting mutant has the key to every lock but its own and is once again SI. The pre-existence of a complementary key on other 244 haplotypes ensures there is at least some pollen compatible with the new lock. However, in the likely event that the complementary key is present on only a subset of other haplotypes, the new 246 lock will reject more pollen than other locks. If the amount of compatible pollen received is a limiting factor on seed set, the lock mutation might reduce ovule success. But since the mutant 248 is compatible as a sire with all other haplotypes (thanks to the "conversion" step), its increased pollen success can compensate for lost ovule success. This might seem to be asking a lot of nature 250 in this step -the mutation to a new lock coincidentally occurs on a rare SC haplotype, and the lock mutation's matching key already exists in the population despite being previously neutral.

252
However, these events appear less contrived when we consider the alternatives and realize that there is always an opportunity for these rare events to occur. If the lock mutation occurs on 254 an SI haplotype, the resulting mutant will derive no advantage because it was already SI, but it will still suffer additional pollen limitation. If the lock mutation lacks a complementary key, 256 it will reject all pollen and suffer complete ovule failure. Such mutations should surely occur, but they should remain at low frequencies because they are strongly deleterious. Since there is a 258 constant supply of SC haplotypes at gene conversion-selection balance, the population can wait indefinitely until one such haplotype mutates to a lock with a pre-existing key.

260
Finally, in Step 3, the "rescue/collapse" step, the remaining haplotypes lacking the key to the new lock are driven to extinction unless they can acquire the missing key through gene 262 conversion in their remaining time. This restores mutual cross-compatibility among all surviving haplotypes, whether there are more (expansion) or fewer (contraction) than the initial number.

264
The six relevant haplotype classes are described in figure 2. Table 1 gives notation for all parameters and variables. There are initially n lock alleles, but diversity arises from which 266 keys each haplotype possesses. The first haplotype class consists of n L haplotypes that possess one of the initial n locks and a key ring that is "complete" (Bod'ová et al. 2018). That is, they 268 possess the keys to all extant locks but their own, including the mutant lock. This first class is denoted L for "lucky" because its members fortuitously possess the key to the new lock.

270
Each haplotype in this class initially occurs at frequency d L /n, where the parameter d L is an arbitrary initial proportion. In nature, the parameter n L would be determined by the extent 272 of gene conversion that has already occurred: if the neutral novel key has been copied from its original background to many others, then n L is high. Second, for each of these n L complete 274 haplotypes, there exists one haplotype that is identical except that it lacks the key to the new lock.
These could represent ancestral versions of the complete haplotypes that have not yet gained the 276 key to the new lock. This second class is denoted C for "chump" because members of this class are simply inferior versions of the corresponding "lucky" haplotypes. Each haplotype in this 278 class occurs at frequency (1 − d L )/n. Third, there are n U haplotypes that have one of the initial n locks and an incomplete key ring that lacks only the key to the mutant lock. Unlike the previous 280 class, there exists no complete version of these haplotypes initially. This third class is denoted U for "unlucky" because members of this class cannot unlock the new lock like the "lucky" 282 haplotypes can, but neither must they compete with a superior version of themselves like the "chump" haplotypes must. Each haplotype in this class occurs at frequency 1/n. Fourth, there 284 is the ancestral haplotype from which the SC intermediate arose. It is like any other incomplete haplotype (e.g., a U haplotype), except that it has the same lock as the SC intermediate. This

286
fourth class is denoted A for "ancestral" because its lock is ancestral to that of the lock mutant. It occurs at frequency 1/n − p (0) it is the background on which the lock mutation arises. We lump all other SC haplotypes into the class from which they originated (L or U). Sixth, there is the haplotype bearing the lock 296 mutation. It is identical to the intermediate haplotype, except that it has a novel lock and is thus SI. It is denoted M for "mutant" because of its novel lock mutation. This haplotype increases its 298 own fitness over its SC predecessor by preventing selfing, but it also decreases the fitness of the "unlucky" and "chump" haplotypes by presenting a lock to which they lack the key. It occurs 300 at initial frequency p (0) M There are thus n L + n U + 1 = n pre-existing haplotypes (including the mutant's ancestor) along with the mutant itself for a total of n + 1 haplotypes.
Throughout out model, we assume that selection dominates random genetic drift, so we use deterministic equations to describe the change in haplotype frequency by selection. As 304 such, the frequency of the SC intermediate after Step 1 ( fig. 3), "conversion," is controlled by the deterministic equilibrium between selection and gene conversion, and the decay of doomed 306 haplotypes after Step 2, "mutation," occurs through deterministic genotype frequency trajectories (see Appendix). However, our model includes numerous stochastic forces that play a critical role 308 in mediating the diversification or collapse of S-allele diversity. The first event we model as a random variable is the distribution of compatible pollinations on the lock mutant after Step 2.

310
This distribution is used to demarcate parameter values in which compatible pollen supply does or does not limit ovule success. Most critically, we model the probability of survival of new gene 312 convertants in Step 3, the "rescue/collapse" step, as a stochastic process. Specifically, we use the branching processes approximation of Fisher (1923) and Haldane (1927) to approximate this 314 survival probability based on the fitness benefit of the convertant, though we check the accuracy of this approximation through stochastic simulation. The third and final random variable is 316 the number of surviving haplotypes after rescue/collapse, which is itself calculated from the survival probability and the expected supply of gene convertants. We use this distribution to 318 parameterize a transition matrix, and iterate this as a Markov Chain to determine the long-run stable distribution of haplotype number numerically. Though we do not model long-term drift

328
Step 1, the "conversion" step, consists of the generation and equilibration of the SC intermediate ( fig. 3). The SC intermediate provides the advantage of compatibility with other individuals car-fix in the population. We therefore focus on cases in which SI is maintained by extreme inbreed-334 ing depression: specifically, all selfed offspring are eliminated before they can reproduce. In this case, for many values of S-haplotype number and primary selfing rates, the SC intermediate is 336 maintained at low frequency by the balance between gene conversion and selection. What is this frequency? 338 We determine conversion-selection balance of the SC intermediate using a simple model. A population contains n SI haplotypes that are all mutually cross-compatible. Since all haplotypes 340 are cross-compatible, a diploid's maternal and paternal haplotype carry the key to the other's lock. This state does not normally result in SC pollen because, though the diploid possesses keys 342 to both of its own locks, these keys are on separate haplotypes. The absence of recombination at the S-locus then ensures that none of the plant's pollen carries both keys. However, gene 344 conversion allows a lock to be copied from one haplotype to another, potentially generating an SC haplotype from any diploid carrying at least one SI haplotype. In this population, gene 346 conversion in zygotes generates SC haplotypes at rate µ. Individuals carrying two SC haplotypes self at rate σ, and those carrying one self at rate σ/2. We model extreme inbreeding depression 348 such that all selfed offspring are inviable, and so the selfing rate is equivalent to a fecundity penalty. However, SC haplotypes gain an advantage in pollen. Assuming all individuals are 350 heterozygous at the lock locus, SI haplotypes are rejected with probability n−2 n , whereas SC haplotypes are never rejected because they carry all keys. If SC haplotypes are rare, the expected 352 change in frequency of all SC haplotypes, p, is If n, µ, and σ are treated as constant parameters, ∆p is a cubic function of p. We determine the 354 equilibrium frequency by setting ∆p = 0 and solving numerically using Mathematica's NSolve function. We solve for the equilibrium frequency of SC haplotypes, varying µ from 10 −4 to 10 −3 356 in steps of 10 −4 , σ from 0.1 to 1 in steps of 0.1, and n from 3 to 40 in steps of 1 ( fig. 4). The biologically meaningful domain of ∆p is the real numbers between zero and one, so we ignore 358 solutions outside this interval.
We find two broad categories of outcomes depending on the parameter values. First are 360 cases in which ∆p is positive from p = 0 to p = 1, and there are no internal equilibria. In such cases, SC haplotypes are expected to rise to fixation despite intense inbreeding depression.
This occurs when SC confers a large pollen advantage because n is small, or when it incurs a small ovule disadvantage because σ is small (Charlesworth and Charlesworth 1979;Uyenoyama 364 et al. 2001). Second are cases in which there are two solutions between p = 0 and p = 1.
In these cases, the smaller solution is stable, and the larger is unstable. The unstable solution 366 tends to be fairly large, usually between 0.1 and 0.5, so we warn that predicted frequencies of unstable equilibria are likely imprecise. Because this internally unstable equilibrium cannot be 368 approached by any biological process, we report and investigate only the stable solution. Stable equilibria for µ = 0.0003, with σ from 0.1-1.0, and n from 0-40 are plotted in (fig. 4).

Invasion of lock mutant
With the existence of SC haplotypes at conversion-selection balance, we can generate a new SI 372 haplotype through mutation. If an SC haplotype's lock mutates to a novel specificity, it will no longer have the key to its own lock, making it SI. It will, however, possess the key to every other 374 lock, making it universally cross-compatible as a sire. The SC haplotype therefore acts as an SC haplotypes. These haplotypes are compatible as sires with every other haplotype, but only a subset of haplotypes are compatible with them. In I, the siring advantage partially compensates 382 for the loss of ovule success through selfing, resulting in a smaller but still deleterious net effect on fitness. The mutation from I to M restores SI while maintaining the siring advantage.

384
However, this mutation also potentially increases pollen limitation because the new lock rejects the majority of pollen in the population. In order for M to invade and for the population to 386 progress to the rescue/collapse step, M must have fitness above the population average despite the tradeoff of pollen limitation. 388 We use a simple model of pollen limitation to determine if M will invade when rare. As SC haplotypes are rare and of fitness below the population average, so the invasion of M depends 390 only on whether M is superior to L and U haplotypes. We assume L, U, and M haplotypes are equally efficient at rejecting self pollen, and so the quantity of self pollen is irrelevant in 392 determining these haplotypes' fitnesses relative to each other, and since self pollen acts simply to 'waste' ovules, we do not consider self-pollen as a source that alleviates pollen limitation.

394
For simplicity, we assume that each flower carries 10 ovules, and receives a number of pollen grains equal to the parameter G. The proportion of G that is compatible with the maternal flower 396 is a binomial random variable, but the expectation of this variable is sufficient to determine fitness, so we treat ovule success as a deterministic function of the proportion of the compatible 398 pollen. The proportion of pollen that is compatible with the maternal flower is n L d L / (n L + n U ) for UM genotypes, d L (n L − 1) / (n L + n U ) for LM genotypes, and (n L + n U − 2) / (n L + n U ) for 400 all other genotypes. Fertilization occurs through the following process: one ovule subtracts one compatible pollen grain from the flower's stock of compatible pollen to produce a seed, then the 402 next ovule does the same, and this process continues until either ovules run out (full seed set) or compatible pollen runs out (partial seed set). We therefore define a flower's ovule success as We note that if M is disfavored, this novel specificity goes extinct, and we await a mutation to 420 another novel specificity (or a recurrent mutation to this specificity which could be favored if the number of "'lucky' haplotypes' is, by chance, higher). As such, with every novel specificity, this 422 process repeats itself until one increases in frequency.
Numerical iteration of genotype frequencies when M is beneficial shows that M will rise to high frequencies and eliminate every haplotype that lacks the key to the new lock ( fig. 6). These lost 426 haplotypes include all of the U haplotypes along with the lock alleles they carry. Therefore, the overall process results in a collapse in haplotype number at equilibrium rather than expansion. lock, as occurred in the previous step at rate µ. We do not track these gene convertants in this step because they should be maintained at low frequency at conversion-selection balance. The 490 parameter R conversion is therefore smaller than the raw rate of functional gene conversion, which also includes the production of short-lived deleterious convertants. Once the genotype frequencies, the expected number of gene convertants, and the survival 512 probability of a new gene convertant were known for every generation, we calculated the probability that none of them survived as the product of the complements of the survival probabilities.

514
The complement of this probability is the probability that one doomed haplotype is rescued at least once, assuming the expected number of gene convertants under the expected genotype 516 frequency trajectory. We approximated the probability that a given number of haplotypes were rescued as the probability that each of them was rescued independently. The rescue proba-518 bilities are not strictly independent because one rescue event alters the trajectory of genotype frequencies. Rescuing one haplotype essentially creates another lucky haplotype, which could 520 outcompete the remaining doomed haplotypes. However, if the rescues happen in relatively rapid succession, each rescued haplotype will have less effect on subsequent rescues because it 522 is still at low frequency. Orr and Unckless (2008) found that rescue is most likely to occur while the population is still relatively large and mutation supply is highest. That is, there is a short For all iterations, we assumed that at least two haplotypes had already acquired the key to the new lock. This is the minimum number for the lock-mutation not to confer ovule-sterility 532 because, if only one haplotype could fertilize it, all ovules carrying the lock-mutation would be fertilized by the same pollen haplotype. The maternal haplotype of the resulting offspring 534 would reject all pollen except the paternal haplotype, which would reject itself. Therefore, when there is exactly one haplotype of class L, all ovules carrying the lock mutation grow up to be 536 ovule-sterile adults. In this case, the lock mutation is deleterious rather than neutral and should be rapidly lost. Though this scenario might often occur, it results in no change in the number of 538 S-haplotypes and can be safely ignored in this model.
We calculated the probability that all haplotypes survived for all combinations of n L = 540 2, 5, 10, 20, n U = 2, 5, 10, 20, and R conversion in the range of 0.1 -1 in increments of 0.01 and the range of 1 -10 in increments of 0.1. We found that the probability that all haplotypes were 542 rescued (the expansion probability) decreased with the number of doomed haplotypes and increased with the population rate of gene conversion R conversion (fig. 7). For n L = 2, the expansion 544 probability rapidly saturated near 1.0 as R conversion increased regardless of n U . The expansion probability greatly decreased for n L = 3 relative to n L = 2 and continued decreasing gradually 546 for larger values of n L . The effect of n U was also to decrease the expansion probability, though this effect was less than that of n L .

548
The distribution of haplotype number after rescue/collapse responded similarly to R conversion for different values of n L , gradually shifting rightward toward maintenance of the initial number 550 of haplotypes, but never reaching high probabilities of expansion for R conversion ≤ 0.1 (fig. S2).
For n L = 2 and n L = 5, the modal outcome was loss of all or almost all doomed haplotypes.

Long-term behavior
The population has now passed through the final "rescue/collapse" step of the three-step path- laborative non-self recognition (Lawrence 2000). We then iterate each transition matrix for 1000 introductions of a new lock to estimate its stationary distribution.

570
Our previous rescue probability calculations showed that expansion was likely even at large haplotype numbers when n L = 2 and R conversion ≥ 0.1 ( fig. 7). Such large expansion probabilities 572 are necessary to produce the many haplotypes observed in nature. To cover a range of expansion probabilities, we generate transition matrices for R conversion = 0.1-0.4 in intervals of 0.1, as well 574 as R conversion = 0.01 and 1. We again assume that the initial frequency of each L haplotype is low, d L = 0.01. We also assume that n L = 2 remains a constant parameter between states, and 576 that all changes in haplotype number are represented by changes in n U . This value of n L allowed non-trivial expansion probabilities. In reality, the transition matrix should be partly determined 578 by several random variables: the number of L haplotypes and the initial frequency of each. The number would be determined by the extent to which gene conversion had already spread the new 580 key before the corresponding lock originated. Likewise, the frequencies would be determined by the outcome of drift between corresponding L and C haplotypes. We sidestep this complication 582 and assume constant parameters n L and d L .
We found that the stationary distribution shifted toward greater numbers of haplotypes as 584 R conversion increased ( fig. 8). At R conversion = 0.01, the modal haplotype number was eight, while at R conversion = 1, it was 37. In the interval from R conversion = 0.2-0.4, the stable distribution was The forgoing model only considers changes in the number of S-haplotypes, but another rel-594 evant process is the transition from SI to SC. A transition to SC is expected when SC alleles are beneficial and their frequency at conversion-selection balance is at fixation. We repeat the 596 previous model of long-term haplotype number evolution, but additionally assume that any population below a threshold number of S-haplotypes will transition to SC, losing all other S-598 haplotypes as the SC haplotype rises to fixation. We assume transitions to SC are irreversible, as appears to be the case for the collaborative nonself-recognition system (Igić et al. 2004), and so SC 600 populations leave the Markov process entirely. Thus, any probability mass below this threshold is eliminated every generation, and the remaining probabilities are divided by their sum so they 602 add to one. The resulting distribution describes the probability of each haplotype number conditional on remaining SI. We set this lower threshold at 11 haplotypes, the minimum number for 604 which we found an internal equilibrium frequency of an SC intermediate given µ = 0.0003 and σ = 0.5 ( fig. 4). We also set the starting haplotype number at 11 haplotypes and note that, though 606 these parameters would not allow diversification to 11 haplotypes in the first place, initial diversification could have occurred if parameters were originally even less favorable to SC (e.g., larger 608 σ). All long-run distributions were unchanged after adding this threshold except for R = 0.01, for which the probability was strongly concentrated at the new minimum of 11 ( fig. S3).

Discussion
We developed a new model of 'haplotypic rescue' to estimate the relative probabilities of expan-612 sion and contraction in S-haplotype number, calculate the distribution of contraction magnitudes, and predict the long-term evolution of haplotype number. We find that expansion from low hap-614 lotype number is possible if gene conversion is frequent or the population is large. When collapses occur instead, they can be large, easily resulting in the loss of the majority of S-haplotypes.

616
A unique prediction of this model is that recently bottlenecked populations should have a "debt" such that, once the appropriate gene conversion event occurs, they will experience a sudden re-618 duction in haplotype number in addition to the reduction directly caused by the bottleneck.
Despite the possibility of collapse, we found that stable distributions of haplotype number (2018) assumed that pollen limitation was binary: plants that received any pollen had the max-644 imum ovule success, and plants that received no pollen had no ovule success. That is, RNase mutants have ovule success equal to that of non-mutants despite rejecting the majority of pollen.

646
Instead, we explore the gradation between total ovule-sterility when pollen is limiting and equal undiminished ovule success when pollen is abundant. We find that a novel RNase can invade 648 despite causing increased pollen limitation as long as compatible pollen is sufficiently abundant.
We also find that the relative fitness of an RNase mutation decreases as the number extant RNases 650 increases: the frequency-dependent advantage of rarity is less when the competing RNases are numerous and rare than when they are few and common. This diminishing advantage of RNase 652 novelty could act as a negative feedback that limits how many RNase alleles can be produced in a single population. We therefore predict a negative relationship between pollen limitation and 654 RNase number in nature. Nevertheless, pollen limitation appears to be a surmountable barrier to RNase diversification.

656
Once the novel RNase invades, S-haplotype number might still either contract or expand. The probability of expansion depends on the population size and gene conversion rate. However, 658 the rate at which gene conversion copies a functional F-box from one haplotype to another is unknown. Per-nucleotide rates of gene conversion have been estimated to be 400 times the rates 660 of crossovers in Drosophila (Gay et al. 2007). Assuming a crossover rate of 10 −8 per meiosis per nucleotide, this yields a gene conversion rate on the order of 10 −6 . A functional gene conversion 662 rate of 10 −6 would produce few haplotypes (≈ 8) for populations on the order of 10,000, ≈ 10 − 30 for populations on the order of 100,000, and larger numbers for larger populations. Such 664 populations are large but not unrealistic. Note, however, that relevant quantity is not merely the per-nucleotide rate, but the rate at which a functional F-box is gained. This rate excludes gene 666 conversion events that eliminate a pollen specificity as well as events that fail to affect specificity at all. It is therefore smaller than the per-nucleotide rate of gene conversion, though how much 668 smaller is unknown. The rate at which F-box paralogs are copied would constrain not only the probability of rescue (through R conversion ) but also the input of SC gene convertants (through µ).

670
It might be possible to estimate the rate of functional gene conversion indirectly. To perform such an estimate, a study would need to measure both the frequency and seed set of wild SC 672 individuals relative to SI individuals in a predominantly SI population. Since the equilibrium frequency of SC haplotypes is determined by their ovule fitness and the gene conversion rate, it 674 should be possible to estimate the gene conversion rate given the observed frequency and fitness of SC.

676
Other than large populations and rapid gene conversion, another possible explanation of high S-haplotype diversity is that gene flow buoys species-wide diversity despite local contrac-678 tions. Uyenoyama et al. (2001) found that, similar to our results, S-allele number was unlikely to expand within a single population because coexistence was rarely possible between a mu-680 tant SC intermediate and its ancestral haplotype. They proposed that expansion might instead occur through local turnover in S-alleles followed by introduction of the novel alleles into the 682 broader metapopulation. However, this model assumed that incompatibility operates through self-recognition, in which introgressed S-alleles gain an advantage because they are compatible 684 with resident plants that do not recognize them as self. It is thus not obvious whether the conclusion translates to nonself-recognition. Castric et al. (2008) found less divergence between putative 686 shared ancestral variants of S-locus genes in Arabidopsis lyrata and A. halleri than between non-S orthologs, consistent with elevated introgression at the S-locus in this self-recognition system. An

762
Even though we assume maximal inbreeding depression throughout, we find that populations are susceptible to invasion and fixation of SC haplotypes unless selfing rates are also high 764 ( fig. 4). The possibility of transition to SC does not seem to affect the distribution of haplotype numbers of those populations that remain SI unless the stable distribution crosses the minimum 766 for maintaining SI ( fig. S3), likely because there is a strong upward pull when haplotype numbers start below the stable range ( fig. 9, fig. 10). However, populations can only diversify from 768 low to moderate numbers of haplotypes in the first place if selfing rate is high. Thus, though diversification is possible through this pathway alone, transitions to SC might be very common at 770 low haplotype numbers in nature. A trait that is frequently lost but rarely or never regained can still common among many species if species with the trait speciate more frequently or species 772 without the trait go extinct more frequently. This macroevolutionary process, species-level selection, is hypothesized to maintain SI among species despite frequent and possibly irreversible 774 transitions to SC Goldberg et al. 2010 locus ought to answer these questions, but the present model is best viewed as one of continued diversification rather than one of origin.

780
SI has developed into a rich study system through the continued interaction between theory and empirical research. Experimental demonstration of the genetic control of SI  gelsdorf 1925) and field research showing the number of S-alleles in natural populations (Emerson 1938(Emerson , 1939 inspired theoretical explanations of the balancing selection capable of maintaining 784 this diversity (Wright 1939;Charlesworth and Charlesworth 1979). The theoretical potential for balancing selection indicated the S-locus as a candidate for long-term polymorphism, and S-allele 786 phylogenies confirmed this possibility (Igić and Kohn 2001;Steinbachs and Holsinger 2002). The discovery of the fine-scale genetic basis of nonself-recognition (Kubo et al. 2010(Kubo et al. , 2015 necessi-788 tated new theoretical explanation of the expansion process, and this theory now points to the unanticipated possibility of S-allele collapse through runaway gene convertants.

Appendix: genotype frequency recursions
The following genotype frequency recursions governed changes in the frequencies of the classes 792 of SI haplotypes. They were used to produce the trajectories leading to extinction of doomed haplotypes ( fig. 6), which were in turn used to calculate rescue probabilities. SC haplotypes were 794 assumed to be rare and were neglected in these recursions. Each P XY represents the frequency of a diploid genotype consisting of one haplotype of class X and one of class Y. The genotype 796 frequency recursions are P LL = 2p L n L − 2 n L P LL + n L − 1 2n L (P LC + P LU + P LA + P LM )  Figure 1: Rejection of self pollen in the style. Each haplotype contains a single lock and multiple paralogous keys. Each pistil expresses the two locks in its diploid genotype, and each pollen grain expresses all keys in its haplotype. A pollen tube is arrested in the style unless it contains keys to both of the pistil's locks. Each haplotype lacks the key to its own lock, but contains the keys to all other locks. Self-fertilization is prevented because pollen necessarily lacks one of the keys expressed by the plant that produced it. In this example, the population begins with three complete haplotypes (those with the key to every lock but their own). Gene conversion generates an SC "intermediate" haplotype from the "ancestral" haplotype by introducing the key to the ancestral haplotype's own lock. The SC intermediate acquires a novel lock by mutation, restoring SI and generating the "mutant" haplotype. All haplotypes lack the key to the new lock except for those "lucky" enough to carry it in advance, which have an advantage over the "chump" haplotypes that bear their corresponding locks. Haplotypes that are "unlucky" merely lack the new key but don't compete against other haplotypes with their same locks.   Figure 4: Conversion-selection balance of SC gene convertants. Equilibrium frequency was calculated assuming the gene conversion rate µ = 0.0003. Equilibrium frequency declined with increasing RNase number and selfing rate. The left-most point of each curve represents the threshold RNase number below which an SC haplotype will rise to fixation rather than equilibrate at low frequency. This threshold appears as an abrupt stop rather than an asymptote because RNase number can only take on integer values. No internal equilibrium was found for selfing rate σ = 0.1. from an initial state (horizontal axis) to the next state (vertical axis), where each state is a number of S-haplotypes. For all states, n L = 2, and only n U changes between states. The diagonal from bottom-left to top-right (white line) represents no change in haplotype number but could result from either loss of a novel haplotype (i.e., stasis) or maintenance of a novel haplotype and loss of an old one (i.e., turnover). The off-diagonal immediately above that represents gain and maintenance of single novel haplotype. The probability of diversification decreases with increasing haplotype number. It remains greater than 0.5 until n L + n U = 18 and remains the most likely outcome until n L + n U = 22. for 100 transitions from the transition matrix for R conversion = 0.3 and n L = 2. The initial state was three haplotypes (n L = 2, n U = 1). Haplotype number increased rapidly at first and continued to fluctuate after entering the long-term stable range.  Total haplotypes Probability Figure S2: Distribution of haplotype number after a single rescue/collapse event. This probability distribution of outcomes is conditional on the population rate of gene conversion R conversion , and on the current numbers of lucky n L and unlucky n U haplotypes. For all panels, n U = 10.
Expansion only occurs when the final number of haplotypes is n + 1 (the rightmost bin in each histogram). For n L > 2 and small population gene conversion rates (R conversion ), the most likely outcomes is either the loss of all doomed haplotypes or the loss of all but one.  . 8), except that all probability mass below 11 haplotypes was removed each generation and the remaining probabilities were divided by their sum.