Establishment Process of a Magic Trait Allele subject to Both Divergent Selection and Assortative Mating

Sexual selection and divergent selection are among the major driving forces of reproductive isolation, which could eventually result in speciation. A magic trait is defined such that a single trait is subject to both divergent selection and sexual selection through phenotype-based assortative mating. We are here interested in the evolutionary behavior of alleles in a genetic locus responsible for a magic trait. We assume that, in a pair of homogeneous subpopulations, a mutant allele arises at the magic trait locus, and theoretically obtain the probability that the new allele establishes in the population. We also consider the trajectory of allele frequency along the establishment. Divergent selection simply favors the new allele to fix where it is beneficial, whereas assortative mating works against rare alleles. It is theoretically demonstrated that the fate of the new allele is determined by the relative contributions of the two selective forces, divergent selection and assortative mating, when the allele is rare so that the two selective forces counteract. We also show that random genetic drift also plays an important role. The theoretical results would contribute to improve our understanding of how natural selection initiates speciation.

S peciation occurs when reproductive isolation establishes between different populations. Sexual 9 selection is one of the major forces driving reproductive isolation (Coyne and Orr 2004). Specia-10 tion driven by sexual selection could occur when phenotypic difference is involved in mate choice. 11 Several theoretical models indicated that sexual selection alone can lead speciation even in the face 12 of gene flow (Wu 1985;Turner and Burrows 1995;Higashi et al. 1999;Takimoto et al. 2000), but these 13 results largely rely on their assumptions such as ample genetic variation, symmetric distribution 14 of female preference or strong female choice (Arnegard and Kondrashov 2004; Gavrilets 2004), and 15 empirically not well supported yet as reviewed in Ritchie (2007). At the moment, it is considered 16 that speciation by sexual selection alone is difficult to occur (Gavrilets 2004). This is largely because 17 diversity in female preference is difficult to maintain, rather disappears by genetic drift as female 18 preference is not directly subject to selection (i.e., selection works through male phenotype). Ritchie 19 (2007) pointed out that sexual selection should work efficiently together with niche specialization or 20 local adaptation. 21 Synergy between local adaptation and assortative mating can be a powerful driver of speciation 22 (Gavrilets 2004). Establishment of a locally adaptive mutation could lead stable genetic divergence 23 between local populations in different environments, even in the face of gene flow between them. 24 If there is another locus that is involved in sexual selection, it also reduces gene flow between 25 populations. This effect is particularly strong when the locus is genetically linked to the target 26 locus of local adaptation and an extreme case is that a single locus is pleiotropically subject to 27 both divergent selection and sexual selection. Models that handle sexual selection at a locus under 28 assuming that mating pairs of haploid individuals with different alleles produce fewer offsprings 23 than pairs with the same alleles. Although this mode of selection is categorized into fecundity 24 selection, it is mathematically identical to assortative mating, where females dislike to mate with 25 males having different alleles. Slatkin (1982) analytically showed that successful invasion of a new 26 allele requires a larger selection coefficient of adaptive selection than the sum of the migration 27 rate and strength of assortative mating, both of which have an effect against the invasion of new 28 alleles. Additionally, the author derived the critical migration rate, over which polymorphic state 29 is unstable so that new alleles likely become extinct. Recently, Kisdi and Priklopil (2011) explored 30 the evolutionary branching of a magic trait in an infinite population, although this model is not 31 very relevant to our interest because the magic trait is quantitative determined by multiple genes. 32 While the analysis assuming an infinite population gives a great deal of insight, it is also important 33 to consider this process in a finite population with the stochasticity of random genetic drift. To our 34 best knowledge, there has been no research which explored analytically the establishment process  To understand the effect of random genetic drift, we here consider a two-population model with . The authors developed a 1 heuristic method, which essentially focuses on the process in the adaptive subpopulation, rather than 2 considering the dual process in the adaptive and maladaptive subpopulations. This is because the 3 establishment probability is largely determined by how the focal allele increases in frequency when 4 it invades into the adaptive population, and the process in the maladaptive subpopulation does not 5 play a crucial role. The authors showed that Kimura's formula (Kimura 1962) well approximates the 6 establishment probability when an adaptive allele arises in the adapted subpopulation if the selection 7 coefficient is replaced by the leading eigenvalue of transition matrix of deterministic dynamics. In 8 this work, following the idea of Yeaman and Otto (2011), we successfully derive the establishment 9 probability in haploid and diploid models, not only when an adaptive allele arises in the adapted 10 subpopulation but also when an adaptive allele arises in the maladapted subpopulation.

12
We use a discrete-generation two-population model, between which bidirectional migration is 13 allowed, and we consider both haploid and diploid cases. The assumptions and model settings 14 shared by the haploid and diploid models are described here. The sizes of subpopulations I and II 15 are assumed to be N 1 and N 2 , respectively. On average, N 1 m 1 = N 2 m 2 individuals are exchanged 16 per generation where m 1 and m 2 are backward migration rate of subpopulation I and II, respectively. 17 There are two alleles, A and a, on which selection works. Allele A is favored in subpopulation I, and 18 disfavored in subpopulation II. Assuming no recurrent mutation between alleles A and a, we focus 19 on the behavior of allele frequencies to obtain the establishment probability of allele A forward in 20 time. Life cycle is assumed to be in the order of selection, mating and migration in each generation. 21 The major difference between the haploid and diploid models is in how selection works. We assume 22 that the fitness of allele a is 1, and the fitness of allele A is given depending on the model, as will be 23 explained below.

24
Haploid Model 25 In the haploid model, we simply assume that the fitness of allele A is 1 + s 1 > 1 in subpopulation I, 26 but 1 + s 2 < 1 in subpopulation II. Let p i be the allele frequency of allele A in subpopulation i. Then, 27 the expected frequency of allele A after the selection event is In the mating event, reproductive opportunity of a female is independent of her genotype. Assor-29 tative mating occurs such that a female avoids mating with male who has a different allele from hers 30 with probability α. Then, the expected frequency of allele A after the mating event is given by We assume that individuals that constitute the next generation are randomly produced by binomial 32 sampling with p i . After this mating event, N 1 m 1 = N 2 m 2 individuals are exchanged between 33 subpopulations in the migration event.

35
In the diploid model, we consider the dominance effect on the phenotype, based on which assortative 36 mating works. We assume allele A has a dominance coefficient, h, which means that if genotype AA 37 and genotype aa have trait values P AA and P aa , respectively, the trait value of genotype Aa is given 1 by hP AA + (1 − h)P aa . We put p i , q i and r i as the genotype frequency of AA, Aa and aa, respectively, 2 in subpopulation i.

3
In the selection event, we assume the fitness of genotype AA, Aa and aa are 1 + s 1 , 1 + hs 1 and 4 1 in subpopulation I, and 1 + s 2 , 1 + hs 2 and 1 in subpopulation II (s 1 > 0 and s 2 < 0). Then, the 5 expectation of frequencies of AA, Aa after the selection event are given by respectively, and the expected frequency of aa is given by In the mating event, a female avoids mating with a male who has a different trait from hers. To 8 incorporate the effect of dominance, we assume that a female avoids mating proportional to the trait 9 difference, and that genotype AA avoids mating with genotype aa with probability α. Then, the 10 expected frequencies of genotype AA and aa after the mating event are respectively, and the expected frequency of Aa are given by q i = 1 − p i − r i . According to these 12 probabilities p i and q i , individuals for the next generation is produced by multinomial sampling.

13
After that, N 1 m 1 = N 2 m 2 individuals are exchanged between subpopulations (migration event).

15
Establishment Probability in the Haploid Model 16 The initial state is that all individuals have allele a in both subpopulations. First, we derive the 17 establishment probability of allele A which arises in subpopulation I with initial frequency 1 N 1 . We 18 assume m i |s i |, α to ensure the stable maintenance of divergence (see below). By assuming that 19 |s i |, α, m i 1 and ignoring the second order of these parameters, the expected changes in allele 20 frequency in one generation are given by The change of allele frequencies, p 1 and p 2 , can be well described by a two-dimensional diffusion 22 equation, which is unfortunately very difficult to solve. Alternatively, we use the heuristic approach 23 developed by Yeaman and Otto (2011), which stemmed from their demonstration that the establish-1 ment probability of an allele which arises in the adapted subpopulation can be approximated by the 2 establishment probability of an allele under directional selection if the selection intensity is replaced 3 by the leading eigenvalue of transition matrix of deterministic process. This means that the leading 4 eigenvalue is considered to represent the 'effective' strength of natural selection on allele A which 5 arises in subpopulation I. 6 The deterministic process considering divergent selection and migration (but not assortative 7 mating) is described as The leading eigenvalue of the transition matrix in Equation 6, λ, is given by Following Yeaman and Otto (2011), the expected change in allele frequency in one generation is 10 approximated in the one-population system: Then, the establishment probability of allele A which newly arises in subpopulation I, u 1 , is given 12 along Kimura's formula (Kimura 1962) as where 14 Next, we derive the establishment probability of allele A that newly arises in subpopulation II 15 with initial frequency 1 N 2 . Because allele A is maladaptive in subpopulation II, we can assume that 16 the frequency of allele A in subpopulation II should be kept low due to divergent selection, so 17 that the process should be described by the branching process. The selection coefficient against 18 allele A is s 2 − α 2 in subpopulation II. Then, the establishment probability of allele A which arises in 19 subpopulation II is approximated by For the details of the derivation, see APPENDIX 21 A.

22
The accuracy of our derivation was checked by simulations ( Figure 1). Forward simulations 23 were carried out under the haploid model to obtain u 1 and u 2 , with initial frequency of allele A 24 (p 1 , p 2 ) = (1/N 1 , 0) and (0, 1/N 2 ), respectively. For each parameter set, we ran ≥100,000 replications 25 and counted the number of replications in which A establishes in the simulated population. The 26 "establishment" is defined such that the new introduced mutation is still present after 5N 1 generations 27 passed. It should be noted that, according to our definition, the established replications include two 28 cases; case C where alleles A and a coexist, namely, allele A is nearly fixed in subpopulation I while 29 allele a is nearly fixed in subpopulation II, typical to strong divergent selection, and case F where 1 allele A is fixed in both subpopulations. Let Pc be the relative proportion of case C in established 2 replications. In Figure 1, a gray region is placed such that Pc > 0.9 in the left, while Pc < 0.1 in the 3 right. According to our derivation, u 1 and u 2 from Equations 9 and 10 evaluate the sum of these two 4 cases, so that they are comparable to the result of our simulations, although our interest is in case C 5 where strong divergent selection is maintaining the two alleles (i.e., left of the gray region). where λ = s 1 if we ignore assortative mating, while u 2 ∼ 0. As the migration rate increases, u 1 16 decreases and u 2 increases, and they become similar to each other for a large migration rate. Because 17 allele A is advantageous only in subpopulation I, this beneficial effect would be reduced with 18 increasing the migration rate, and vise versa for allele a. When the migration rate is very large 19 (m ∼ 0.5), the two subpopulations can be considered as a single random-mating population, and the 20 fixation probability of a single mutation is mainly determined by the average selection coefficient, where and N T = N 1 + N 2 (presented by a triangle in 22 Figure 1). The gray region for 0.1 ≤ Pc ≤ 0.9 is placed in a narrow window of the migration rate, 23 indicating that a fairly small difference in the migration rate around the gray region could change the 24 typical outcome dramatically. In the right of the gray region, the established probability essentially 25 means that allele A is fixed in both subpopulations, whereas in the left, the two alleles are nearly 26 fixed in each subpopulation (i.e., divergent selection is maintaining them). 27 In the following, the effect of α on u i is explained along with the symmetric case (N 1 = N 2 = 2000) 28 shown in Figure 1A. Allele A is favored in subpopulation I through divergent selection, which 29 is a positive force for its establishment, although this positive effect is weakened by migration.

30
Mathematically, λ in Equation 7 can be considered as the effective intensity of divergent selection, 31 where migration is taken into account. In contrast, assortative mating is a negative force when allele 32 A is rare, which works to make allele A difficult to increase in frequency. This is because allele 33 A has to mate with allele a in most cases (i.e., reproduction is less successful). Thus, the relative 34 contributions of divergent selection and assortative mating largely determines the fate of allele A.

35
Let us first assume a very small migration rate (see the left end at m 1 = 5 × 10 −5 in each panel 36 in Figure 1A). If we ignore migration, λ is simply given by s i and the total strength of selection 37 (divergent selection and assortative mating) on allele A is roughly given by s i − α 2 in subpopulation 38 i. As expected, the established probability u 1 and u 2 decrease with increasing α. When assortative 39 mating is weak (α = 0.01 ∼ 0.03 in Figure 1A), u 1 could be roughly approximated by u 1 = 2s 1 − α. 40 When α = 0.04, we have 2s 1 − α = 0, where the two selective forces should roughly cancel each other, 41 but it seems that u 1 exceeds the neutral expectation (1/N 1 ) because the negative effect of assortative 42 mating is relaxed when p 1 increases. The establishment probability u 1 is very low for α > 0.04, where 43 selection in total works against allele A due to strong assortative mating. It should be interesting to 1 point out that, even with strong assortative mating, allele A can establish if A happens to increase in 2 frequency by random genetic drift. Once allele A becomes common, the negative effect of assortative 3 mating is somehow reduced, and it could successfully go to the establishment by the positive effect 4 of divergent selection.

5
As the migration rate increases, the intensity of divergent selection is effectively weaken, so that 6 u 1 decreases. At the limit of free migration, we have u 1 = u 2 is given by Equation 11 as mentioned 7 above. The location of the gray region is quite constant over the range of α. This robustness to 8 α should be because the eventual fate of A (i.e., cases C or F) is mainly determined by divergent 9 selection because assortative mating works efficiently only when allele A is rare, namely, shortly 10 after it arises in the population. Once allele A increases and exceed a certain frequency, allele A may 11 not be so deleterious because there are many allele A to mate with no reproductive reduction.

12
Focusing on u 1 , Figure 2   Again, the overall agreement of colors in the circle and those in the background suggest excellent 19 performance of Equation 9 for a wide range of the parameter set. It seems that our analytical result 20 slightly overestimates the establishment probability when s i and α are so large that the establishment 21 probability is sensitive to the second order of s i , α, which were ignored in our derivation. The blue 22 dashed line represents 2λ = α, where the two selective forces roughly cancel each other so that it 23 can be theoretically considered as a threshold of establishment in an infinite population. When the 24 population size is very large, u 1 drops quickly above the line, whereas allele A may establish even 25 above the line when the population size is small, because of random genetic drift.

27
The initial state is that all individuals are aa homozygotes in both subpopulations. We first consider 28 the establishment probability of allele A which newly arises in subpopulation I with initial frequency 29 1 2N 1 . As well as our derivation for the haploid model, we approximate the two-population process by 30 a one-population system by focusing the establishment process in subpopulation I alone (Yeaman 31 and Otto 2011). To do so, the fitnesses of genotypes AA, Aa and aa are given by 1 + λ, 1 + hλ and 1, 32 respectively, where λ is the effective strength of natural selection as defined above. Allele A can be 33 considered dominant over allele a when h = 1, and recessive when h = 0. Following Yeaman and 34 Otto (2011), we assume that the leading eigenvalue of transition matrix approximates the growth 35 rate of allele A in a virtual one-population system (when allele A is rare). Then, λ is given by where we assume fairly strong selection, say hs i is as large as on the order of 1 N i .

37
Given Equation 12, M p and M q , the expected changes in the genotype frequencies of AA and Aa 38 in one generation, are obtained by assuming that λ, α 1 and ignoring the second order of these, 39 and the Kolmogorov's forward equation is given by where and φ is an arbitrary transition density. By changing 1 the variables, Equation 13 is rewritten as a function of allele frequency of A, x, and frequency of 2 heterozygote q as This forward equation with two variables, x and q, is still difficult to solve, so that we attempt to 4 reduce the dimension with approximation, that is, the frequency of heterozygotes, q, is approximated 5 by a function of x. If we ignore assortative mating so that the Hardy-Weinberg equilibrium holds, we 6 can simply assume q(x) = 2x(1 − x). Even with assortative mating, Yamamichi and Sasaki (2013) 7 demonstrated that the assumption of the Hardy-Weinberg equilibrium (i.e., q(x) = 2x(1 − x)) works Then, by using Kimura's formula (Kimura 1962), the establishment probability of allele A that newly 15 arises in subpopulation I, u 1 , is given by where Next, we derive the establishment probability of allele A that newly arises in subpopulation II. 18 Following the haploid case, the establishment probability u 2 is given by  In the haploid model, assuming a low migration rate and strong selection, p * 1 is given by p * 1 ≈ 7 1 − m 1 s 1 + α 2 . Following Ewens (1973), we can derive the conditional mean sojourn time at frequency x, where We then obtain the time required for a newly arisen 10 allele A to reach allele frequency x: so that t(p * 1 ) provides the establishment time. Note that although this argument holds for a new allele 12 arisen in subpopulation I, it can be applied to an allele arisen in subpopulation II after it migrated 13 into subpopulation I. 14 In the diploid model, we can approximate p * for other cases h is relatively smaller than 1. We then obtain the conditional mean sojourn time at . 1 Our derivation works well when h is not small.

2
The theoretical result is quite simple and there is no marked difference between the haploid and 3 diploid models. Therefore, we demonstrate the point by using the haploid model. Figure 5A shows 4 the theoretical trajectory of t(x) by the red line when strong assortative mating is working (α = 0.04), 5 compared with the case with no assortative mating (α = 0, Figure 5B). For each panel, we also show 6 ten independent established runs by black lines. The major difference is in that the allele frequency 7 likely stays long in low frequency with assortative mating (Figure 5A), in comparison with the 8 symmetric function in Figure 5B. This difference is easy to understand: As mentioned above, the 9 negative effect on the newly arisen allele is strong only when its allele frequency is low, and the effect 10 is relaxed once it increases. This pattern is globally observed both in the haploid and diploid models. 11

12
We are interested in how a newly arisen allele of a magic trait (allele A) behaves and contributes 13 to ecological speciation. A magic trait is defined such that a single trait is subject to both divergent 14 selection and assortative mating. Divergent selection simply favors the new allele to fix where it is 15 beneficial, thereby creating a genetic difference between subpopulations. Assortative mating works 16 in a more trick way: it also prefers difference between subpopulations but a new allele is not always 17 advantageous. This is because when allele A is still rare after it arises, allele A is disadvantageous 18 because allele A has to mate with allele a in most cases with a reduction in reproductive success. 19 Once allele A becomes common so that A and A can mate with no reduction in reproduction, allele A 20 is not very deleterious any more and would fix in the adapted subpopulation, thereby contributing 21 to genetic divergence between subpopulations.

22
In this work, we investigate the establishment probability of such a magic trait allele under the 23 haploid and diploid models. We successfully obtained the establishment probability by using the 24 approximation method of Yeaman and Otto (2011). We confirmed that our derivation agreed well 25 with our simulation results. Our theory mainly focuses on the early phases, that is, when allele A is 26 still rare so that divergent selection and assortative mating counteract. It is theoretically demonstrated 27 that the relative contributions of divergent selection and assortative mating largely determine the 28 fate of allele A. 29 In the haploid model, λ in Equation 7 explains the effective intensity of divergent selection 30 with migration taken into account, and the intensity of assortative mating is parameterized by α. 1 Theoretically, 2λ − α is the key quantity to determine the fate of allele A when its frequency (p 1 ) is 2 rare. If 2λ − α > 0, allele A is on average selected for, and selected against if 2λ − α < 0 in an infinite 3 population. In a finite population, we show that allele A can establish even when 2λ − α < 0. This is 4 especially true for a small population (Figures 2, S1-S3) because random genetic drift occasionally 5 increase p 1 , and the negative effect of assortative mating is relaxed once allele A becomes common. 6 Then, it is likely that allele A goes to establishment. 2 We here derive the establishment probability of a single allele A which arises in the maladapted 3 subpopulation II. First, we derive the probability distribution of the number of migrant alleles to 4 subpopulation I which originate from a single allele A in subpopulation II by using the branching 5 process. Then, by assuming that migrant alleles behave independently, we derive an approximate 6 formula of the establishment probability. The derivation works for both haploid and diploid models. 7 We put the probability variables X t as the number of allele A in subpopulation II at generation 8 t and Y t as the number of allele A which have migrated to subpopulation I until generation t. 9 Because we consider the case where negative selection works on allele A in subpopulation II, we 10 can assume that the frequency of allele A is kept low so that allele A mostly exist in heterozygotes 11 in the diploid case. Therefore, X t is almost same as the number of heterozygotes in the diploid 12 model. We define a probability generating function at generation t as f t (x, y) = E[x X t y Y t ]. We set

19
Then, the establishment probability of a single allele A that arises in the maladapted subpopulation 20 II, u 2 , is approximately given by For the diploid case, we should substitute s 2 and α by hs 2 and hα in the above equations.