Unpacking conditional neutrality: genomic signatures of selection on conditionally beneficial and conditionally deleterious mutations

It is common to look for signatures of local adaptation in genomes by identifying loci with extreme levels of allele frequency divergence among populations. This approach to finding genes associated with local adaptation often assumes antagonistic pleiotropy, wherein alternative alleles are strongly favoured in alternative environments. Conditional neutrality has been proposed as an alternative to antagonistic pleiotropy, but conditionally neutral polymorphisms are transient and it is unclear how much outlier signal would be maintained under different forms of conditional neutrality. Here, we use individual-based simulations and a simple analytical heuristic to show that a pattern that mimics local adaptation at the phenotypic level, where each genotype has the highest fitness in its home environment, can be produced by the accumulation of mutations that are neutral in their home environment and deleterious in non-local environments. Because conditionally deleterious mutations likely arise at a rate many times higher than conditionally beneficial mutations, they can have a significant cumulative effect on fitness even when individual effect sizes are small. We show that conditionally deleterious mutations driving non-local maladaptation may be undetectable by even the most powerful genome scans, as differences in allele frequency between populations are typically small. We also explore the evolutionary effects of conditionally-beneficial mutations and find that they can maintain significant signals of local adaptation, and they would be more readily detectable than conditionally deleterious mutations using conventional genome scan approaches. We discuss implications for interpreting outcomes of transplant experiments and genome scans that are used to study the genetic basis of local adaptation.

[...] have opposite fitness profiles, with each having high fitness in one environment and low [...] rates (Felsenstein 1976; Bürger 2014). By contrast, for the CB and CD cases, only monomorphic single-locus equilibria will be stable under deterministic models with high migration; the mutant will replace the ancestral allele in the CB scenario, whereas the mutant will be lost in the CD scenario (Whitlock and Gomulkiewicz 2005; Bürger 2014). Therefore, with sufficient gene flow, polymorphism at any given locus will be transient, persisting only for a short time until the monomorphic equilibrium is regained. In contrast to antagonistic pleiotropy, in order for CB and CD to yield a phenotypic signature resembling local adaptation in a transplant experiment, it is necessary for polymorphisms to segregate at multiple loci with fitness effects conditional in opposite environments (as shown in Figure 1), and new polymorphisms must be renewed continually by mutation.
Thus, for both CB and CD, the maintenance of genotype-by-environment interactions for fitness depends upon sufficient spatial structure to delay the eventual return to [...] should more accurately be seen as causing "non-local maladaptation" due to the accumulation of conditionally deleterious mutation load (which we refer to as "non-local load" for simplicity). If we are interested in studying local adaptation because it gives some insight into longer term patterns of species-wide adaptation that drive macroevolution, then it is important to understand the relative contributions of CB, CD, and AP. [...] likely because distinguishing between these types of mutation in an empirical setting requires identifying which allele was ancestral. Indeed, if only the relative fitness effects of two alleles are considered without any knowledge of which allele was ancestral, it is not possible to discern whether a CD or CB fitness profile is operating. For example, if genotypes AA and aa have realized fecundities (e.g. number of seeds) of w_AA,1 = 20 and w_aa,1 = 20 in environment 1 and w_AA,2 = 40 and w_aa,2 = 50 in environment 2, it is unclear whether a is conditionally beneficial or A is conditionally deleterious, without knowing which was ancestral. However, the distinction between CB and CD mutations becomes important when considering that universally deleterious mutations occur much more frequently than universally beneficial mutations (Bataillon 2000; Keightley and Eyre-Walker 2010) and that this difference would presumably also extend to their conditionally-dependent counterparts (i.e. to CB or CD mutations). Hence, even mutations with very small conditionally deleterious effects (e.g. as in the infinitesimal model, sensu Bulmer 1980) may accumulate to a sufficient extent to cause substantial levels of non-local load; this appears to be the case for mutational load involving non-conditional mutations (Agrawal and Whitlock 2012).
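The dependence of the CB/CD label on ancestry in the fecundity example can be made concrete with a small sketch. The function name `classify_mutation` and its category strings are hypothetical and introduced only for illustration; the sketch compares homozygote fecundities environment by environment and is not part of the study's methods.

```python
def classify_mutation(ancestral, derived):
    """Classify a derived allele from homozygote fitness in each environment.

    `ancestral` and `derived` are sequences of fitness values (e.g. seed
    counts), one entry per environment, for the two homozygotes.
    """
    diffs = [d - a for a, d in zip(ancestral, derived)]
    better = any(x > 0 for x in diffs)   # derived allele favoured somewhere
    worse = any(x < 0 for x in diffs)    # derived allele disfavoured somewhere
    equal = any(x == 0 for x in diffs)   # derived allele neutral somewhere

    if better and worse:
        return "antagonistic pleiotropy"
    if better:
        return "conditionally beneficial" if equal else "unconditionally beneficial"
    if worse:
        return "conditionally deleterious" if equal else "unconditionally deleterious"
    return "neutral"


# Fecundities from the example in the text: (environment 1, environment 2).
w_AA = (20, 40)
w_aa = (20, 50)

# Same data, two different labels depending on which allele is ancestral.
print(classify_mutation(ancestral=w_aa, derived=w_AA))  # if a is ancestral
print(classify_mutation(ancestral=w_AA, derived=w_aa))  # if A is ancestral
```

With a taken as ancestral, A is labelled conditionally deleterious; with A taken as ancestral, a is labelled conditionally beneficial. The fitness data are identical in both calls.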

Evolution by CB mutations should result in a continual increase in absolute fitness at the species scale as CB mutations accumulate and fix. In contrast, patterns of sequence evolution with CD mutations would be dynamically more similar to purifying selection, as the new mutations would eventually be purged such that there would be no gradual increase in absolute fitness. If there are high mutation rates, small effect sizes, and short sojourn times for CD mutations, there may be very little measurable allele frequency divergence between populations evolving in response to different selective regimes. Hence, CD mutations may be important in explaining cases where no significant allelic differentiation is detected (e.g. in a genome scan) despite a significant genotype-by-environment interaction for fitness or variation in fitness-related traits.

In this study, we used individual-based population genetic simulations to study the accumulation of conditionally neutral mutations in two patches connected by relatively high migration (i.e. such that the effect of migration exceeds the effect of drift: m >> 1/Ne). While it is also biologically interesting to explore the dynamics of CB and CD mutations when drift is strong relative to migration (i.e. m << 1/Ne), we expect such mutations to accumulate at a rate that is approximately clock-like in the patch where they are neutral (as per the molecular clock; Kimura 1968). As such, we consider this case only briefly, and focus instead on the somewhat less easily predictable question of whether CB and CD mutations can accumulate when migration is high. To this end, we simulated separate cases with either CD or CB mutations under a range of mean selection coefficients and mutation rates, compared their capacity to generate significant genotype-by-environment [...]

Analytical prediction for non-local load via CD mutations
A simple approximation for the amount of non-local load contributed by CD mutations at equilibrium can be derived by combining several results from classical population genetics, and assuming a large population size (i.e. ignoring drift). Consider a given type of locus (CD2) where both alleles (a and A) have equal fitness in patch 1, allele A is deleterious in patch 2, and the rate of mutation from wild type a to A is µ (with the opposite scenario for locus type CD1, where a is wild type, and A is deleterious in patch 1). Under this scenario, the total non-local load for the population inhabiting patch 1 (which would be realized if its inhabitants migrated to patch 2) can be calculated as the summation across all n CD2 loci:

(1)

where s_i is the selection coefficient and p_i,1 is the frequency of A at the i-th locus in patch 1 (and q_i,1 = frequency of a in patch 1), with a similar formula applying for patch 2 by summing across CD1 loci.

To estimate the equilibrium frequency of CD2 mutations in patch 1 (p_i,1), we can assume that new mutations that occur in patch 2 will contribute comparatively little to the evolutionary dynamics among the two patches (because they will be purged from patch 2 quickly after arising) and can be ignored. Mutations from a to A that occur in patch 1 will behave neutrally within the patch, but patch 1 will experience persistent immigration of alleles from patch 2 at rate m, all of which will be a because of the above assumption that q_i,2 = 1 (i.e. we are ignoring the short periods of time when a recently mutated A in patch 2 has not yet been purged). Because migration will exert a deterministic forcing of allele frequencies in much the same way as natural selection in classical single-population models (i.e. the frequency in the next generation, p' = p(1 - m)), we can substitute m for s in Haldane's classic model of mutation load (p ≈ µ/s; Bürger 2000), so that A will reach a mutation-selection equilibrium at p_i,1 ≈ µ/m. Substituting this into Eq. (1), the total non-local load will be:

(2)

where s is the mean selection coefficient.

This is an interesting result, as unlike the classic single-population genetic load that is independent of the selection coefficient, where L = nµ (Bürger 2000), it predicts that non-local load should depend linearly on s. However, this result will tend to underestimate the amount of non-local load because we ignored mutations from a to A in patch 2, which could migrate to patch 1 and establish before they are purged from patch 2, thereby increasing the non-local load above that accounted for by Eq. (2). Also, some segregation of A in patch 2 may occur due to immigration of A from patch 1, which will reduce the effect of migration from patch 2 to patch 1 on p_1 (as some migrants would be A), thereby increasing the equilibrium frequency over µ/m. This simple approximation may underestimate the non-local load at equilibrium, but provides some insight into how the processes governing non-local load differ from simple genetic load due to unconditionally deleterious mutations.
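The migration-mutation balance described above can be checked numerically with a minimal sketch. It rests on two assumptions that are not taken from the text: that migration acts before mutation within a generation, and that load at a locus is linear in the frequency of A (additive effects), so that Eq. (1) reduces to a sum of s_i·p_i,1 and Eq. (2) to n·s·µ/m. The function names are hypothetical.

```python
def equilibrium_frequency(mu, m, generations=5000, p0=0.0):
    """Iterate the frequency of A in patch 1 until it settles.

    A is neutral in patch 1; all immigrants from patch 2 carry a
    (the text's assumption that q_i,2 = 1).
    """
    p = p0
    for _ in range(generations):
        p = p * (1.0 - m)        # migration replaces a fraction m with allele a
        p = p + mu * (1.0 - p)   # mutation converts a to A at rate mu
    return p


def approx_nonlocal_load(n, s_mean, mu, m):
    """Assumed additive form: n loci, each contributing s_mean * (mu / m)."""
    return n * s_mean * mu / m


mu, m = 1e-8, 0.01
p_hat = equilibrium_frequency(mu, m)
print(p_hat, mu / m)  # iterated frequency next to the mu/m approximation
print(approx_nonlocal_load(n=1_000_000, s_mean=0.01, mu=mu, m=m))
```

Under these assumptions the iterated frequency sits very close to µ/m whenever µ is much smaller than m, and the approximate load scales linearly with n, s, and µ and inversely with m.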

Individual-based simulations
To simulate the evolution of individuals distributed in two patches linked by migration, we used SLiM v.1.8 (Messer 2013) with a modification that allowed the inclusion of conditionally neutral mutations (modified code available at https://github.com/samyeaman/slim_condneut). Each individual in these simulations was hermaphroditic and had a diploid genome consisting of ten 100kb chromosomes, for a total of 2,000,000 potential mutational targets (i.e. loci) in each genome. The recombination rate was set to 10⁻⁶ between adjacent loci, except where otherwise specified. At every locus, the ancestral allele was neutral in both patches. When a mutation arose at a given locus, it had a 1/3 chance of being universally neutral, a 1/3 chance of being neutral in patch 1 and selected in patch 2, and a 1/3 chance of being selected in patch 1 and neutral in patch 2. The selection coefficients were drawn from an exponential distribution, with a mean value of s specified for each parameter set. In [...] study. The per-locus mutation rate (µ) was set to 10⁻⁸ or 10⁻⁹ for simulations involving conditionally deleterious mutations, and to 10⁻¹⁰ or 10⁻¹¹ for simulations involving conditionally beneficial mutations. Empirical estimates of deleterious mutation rates are highly variable, but, in humans, the rate is likely bracketed by the mutation rates used in our simulations (Agrawal and Whitlock 2012). The beneficial mutation rates used in our simulations are simply based on an assumption that the rate is at least one or two orders of magnitude lower than the deleterious rate.

The number of individuals per patch (N) was set to 1000 or 10,000. The migration rate (m) between the two patches was set to 0.5 for the first 50,000 generations of the simulation and then reduced to 0.01 or 0.001 for the last 50,000 generations.
The genome of every individual in both patches, including mutational information for every basepair position, was sampled every 10,000 generations. We ran at least 25 replicates of each parameter combination; in a few cases where mutation rates were lowest (10⁻¹⁰ and 10⁻¹¹), we ran additional replicates (up to 100) when there were insufficient mutations across the smaller number of replicates to compute summary statistics.
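The mutation scheme and migration schedule described in this section can be sketched in a few lines of Python. This is an illustration only and not the modified SLiM code: the function names, the tuple representation of a mutation, and the sign convention (negative s for conditionally deleterious, positive for conditionally beneficial) are assumptions made for the sketch.

```python
import random


def draw_mutation(rng, mean_s, beneficial=False):
    """Return (s_patch1, s_patch2) for one new mutation.

    Each of the three mutation types has probability 1/3; the magnitude of
    selection is exponentially distributed with mean `mean_s`.
    """
    kind = rng.choice(("neutral", "selected_in_patch1", "selected_in_patch2"))
    if kind == "neutral":
        return (0.0, 0.0)
    magnitude = rng.expovariate(1.0 / mean_s)  # rate = 1 / mean
    s = magnitude if beneficial else -magnitude
    return (s, 0.0) if kind == "selected_in_patch1" else (0.0, s)


def migration_rate(generation, m_late, switch=50_000, m_early=0.5):
    """High migration for the first `switch` generations, then `m_late`."""
    return m_early if generation < switch else m_late


rng = random.Random(1)
sample = [draw_mutation(rng, mean_s=0.01) for _ in range(5)]
print(sample)
print(migration_rate(10_000, m_late=0.01), migration_rate(60_000, m_late=0.01))
```

Every non-neutral draw is selected in exactly one patch and neutral in the other, which is what makes the mutations conditionally neutral rather than antagonistically pleiotropic.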

Maintenance of GxE for fitness with conditionally neutral alleles
To compare the extent of genotype-by-environment fitness interactions evolving due to either CD or CB mutations (i.e. non-local load in the case of CD mutations, or transient local adaptation in the case of CB mutations), we calculated the average home-away fitness difference over the last 40,000 generations of our simulations. We note that in our two-deme two-habitat model, with symmetric fitness effects, there is little to no difference between "home-away" and "local-foreign" comparisons on average (sensu Kawecki and Ebert 2004), and both criteria yield similar quantification of local adaptation (supplementary material, Figure S1). In the simulations with CD mutations, home and away fitness tended to stabilize shortly after the reduction in migration rate and before the last 40,000 generations (supplementary material, Figures S2 and S3). Whereas many of the simulations with CB mutations also stabilized before the last 40,000 generations (supplementary material, Figures S4 and S5), fitness tended to increase indefinitely for some simulations with higher values of s, and fluctuate extensively for smaller values of N, rather than reach an apparent stable state. Hence, our simulation results regarding the dynamics of CD mutations are more easily interpretable than those regarding the dynamics of CB mutations.

To compare the analytical prediction for non-local load from Eq. (2) with results from simulations, we used our calculation of home-away fitness difference as a measure of non-local load. We found that this quantity increased linearly with n, s, and µ, and [...]

Non-local maladaptation when migration is weak relative to drift
While we mainly focused on cases where migration was strong relative to drift (i.e. m >> 1/Ne), we briefly explore the case where migration is weak relative to drift. As long as the migration rate is sufficiently low (i.e. m << 1/Ne), it is expected that the conditionally neutral mutations will be purged (for CD mutations) or will fix (for CB mutations) in the patch where s ≠ 0, and will operate as neutral mutations in the patch where s = 0. For CD mutations, we would expect them to accumulate in the patch where they are neutral at a rate that is roughly proportional to the mutation rate, as per the classic neutral molecular clock result (Kimura 1968). When we simulated cases with CD mutations, we indeed saw that the rate of accumulation of GxE for fitness was approximately linear (Figure S8), in contrast to the apparent state of equilibrium that was observed at high migration (e.g. Figure S2). The rate of accumulation of non-local load in this case scales with mutation rate, consistent with the above prediction, and also increases with increasing strength of selection. It is clear from these simulations that the accumulation of CD mutations when drift is strong relative to migration will eventually lead to speciation, when the average fitness difference between home and away is large enough to be lethal.

[...] FST is elevated while they are transiently segregating. In contrast, CD simulations tended to have relatively fewer fitness-affecting loci falling in the 1% tail of the distribution of [...] C&D). While the particular CD alleles responsible for driving most of the home vs. away fitness effect did tend to have the highest FST values (as this is necessary to cause the fitness difference), these alleles still did not usually surpass the FST found for neutral loci (Figure 4 E&F).
Thus, it will be very difficult to use conventional genome scan approaches to identify the loci responsible for non-local maladaptation, but potentially much easier for such methods to identify causal CB loci (but see Yoder and Tiffin 2018).

In the CD simulations, the proportion of fitness-affecting loci falling into the 1% tail of the neutral FST distribution declined below 1% as the strength of selection increased (Figure 4 C&D). We also observed a decline in the ratio of segregating CD to neutral mutations with increasing home-away fitness differences (Figure 5 A&B), and a decline in sojourn time and density with increasing mean selection coefficients (supporting material, Figure S9). Hence, despite the fact that increasing the mean selection coefficient in our simulations resulted in more mutations with strong deleterious effects, the largest mutations (with the strongest effects) were purged very quickly and did not accumulate sufficiently to contribute substantially to home-away fitness differences. So, the relevant effect of a higher mean selection coefficient was an increase in the number of moderately deleterious mutations with sufficiently long sojourn times and densities to have a cumulative effect on home-away fitness differences. The decline in the proportion of fitness-affecting loci falling into the 1% tail of the neutral FST distribution was, therefore, driven by purging of large-effect loci (which were more abundant when mean s was high), while the extent of home-away fitness differences was driven by the accumulation of fitness-affecting loci with moderate fitness effects (which were also more abundant when mean s was high).
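The outlier comparison discussed above (the share of fitness-affecting loci that fall in the upper 1% tail of the neutral FST distribution) can be sketched as follows. The FST estimator here is a simple heterozygosity-based form for two equally sized patches, chosen for illustration; it is an assumption and may differ from the estimator used in the study. The function names and the example frequencies are hypothetical.

```python
def fst(p1, p2):
    """Heterozygosity-based FST for one biallelic locus in two equal-sized patches."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic overall: no differentiation to measure
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def fraction_in_neutral_tail(neutral_fst, selected_fst, tail=0.01):
    """Share of selected loci at or above the cutoff marking the neutral upper tail."""
    ranked = sorted(neutral_fst)
    n_tail = max(1, round(len(ranked) * tail))
    cutoff = ranked[len(ranked) - n_tail]  # smallest value inside the tail
    return sum(x >= cutoff for x in selected_fst) / len(selected_fst)


# Hypothetical allele frequencies (patch 1, patch 2), for illustration only.
neutral = [fst(0.50, 0.50 + 0.004 * i) for i in range(100)]
selected = [fst(0.02, 0.00), fst(0.01, 0.00), fst(0.60, 0.10)]
print(fraction_in_neutral_tail(neutral, selected))
```

A value near or below the tail size (0.01 here) means the selected loci are no more likely to appear as outliers than neutral loci, which is the pattern reported for CD mutations.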
The robustness of these patterns to different assumptions about the distribution of fitness effects is a potential avenue for future research, but we expect that any realistic distribution will have an increased proportion of large-effect mutations when the mean selection coefficient is higher. As [...] are involved in non-local maladaptation. Our results reinforce the conclusion that genome scans have limited power to detect conditionally neutral mutations that contribute to local adaptation (Yoder and Tiffin 2018).

The amounts of non-local maladaptation observed here are relatively minor in absolute terms (<10% difference in fitness between home and away) and are unlikely to explain a large fraction of observed fitness differences commonly attributed to local adaptation on their own. We have illustrated, nonetheless, that CD mutations may be a factor contributing to apparent signatures of local adaptation that is commonly overlooked in the focus on antagonistic pleiotropy (or beneficial mutations in general). Conditionally deleterious mutations may be a particularly potent factor when combined with AP alleles, since the latter will tend to "protect" flanking regions from gene flow [...]

The role of conditionally deleterious mutations in patterns of local adaptation may be quite important because these types of mutations are likely common. Deleterious mutations are much more common than beneficial mutations, and it is reasonable to assume that there would be some condition-dependence in their deleterious effects. For simplicity, we studied conditional mutations that are exactly neutral in one environment (as shown in Figure 1).
In practice, our results would hold if these mutations were nearly-neutral in one environment and relatively strongly selected in the other: as long as the selection coefficient in the "neutral" environment is below the selection-drift threshold, these mutations will behave like exactly neutral mutations. So, we contend that, taken together, conditionally nearly-neutral and conditionally exactly-neutral mutations may be common relative to antagonistic pleiotropy mutations with strong and opposite effects in alternative environments.

As an empirical strategy, it has been common for studies of local adaptation to infer conditional neutrality when they fail to observe a fitness effect in one environment [...]

[Figure axis label: Fitness difference at T = 100,000 (log10)]

Figure S7: Changing the number of chromosomes and recombination rate among adjacent loci does not substantially affect the linear relationship between the prediction of equation (1) and the simulation results, except at very low predicted fitness differences. Simulations have N = 1000, m = 0.01, n = 10⁶ sites, and µ = 10⁻⁹, and either 1 chromosome with recombination rate of r = 10⁻⁵ or 10⁻⁶, or 10 chromosomes with r = 10⁻⁶ between adjacent sites.

Figure S8: Accumulation of CD mutations causing differences in fitness between home (solid) and away (dashed) environments, when migration is weak relative to drift (m = N/10) for different mutation rates and mean strengths of selection.