The Behavior of Molecular Measures of Natural Selection after a Change in Population Size

A common model to describe natural selection at the molecular level is the nearly neutral theory, which emphasizes the importance of mutations with slightly deleterious fitness effects, as they have a chance to become fixed due to genetic drift. Since genetic drift is stronger in smaller than in larger populations, a negative relationship between molecular measures of selection and population size is expected within the nearly neutral theory. Originally, this hypothesis was formulated under equilibrium conditions. A change in population size, however, pushes the selection-drift balance off equilibrium, leading to alterations in the efficacy of selection. To investigate the nonequilibrium behavior, we relate measures of natural selection and genetic drift to each other, considering measures of both micro- and macroevolution. Specifically, we use a Poisson random field framework to model πN/πS and ω as time-dependent measures of selection and assess genetic drift by an effective population size. This analysis reveals a clear deviation from the expected equilibrium selection-drift balance during nonequilibrium periods. Moreover, we find that microevolutionary measures react quickly to a change in population size and reflect a recent change well, but they also quickly lose the signal of it. Macroevolutionary measures, on the other hand, react more slowly to a change in population size but capture the influence of ancient changes for longer. We therefore conclude that it is important to be aware of the different behaviors of micro- and macroevolutionary measures when making inferences in empirical studies, in particular when comparing results between studies.

INTRODUCTION

[...] are formulated as functions of time, $(\pi_N/\pi_S)(t)$ and $\omega(t)$. Equipped with the analytical model, we discuss the following questions. First, how does [...]

[...] parameter. In other words, the population size over time is a step function [...]

[...] the mutation intensity is $\kappa\theta$ per generation and $\kappa\theta N$ per time unit. Each mutation is assigned a population selection intensity $\gamma$.

We use the Wright-Fisher model with selection (Fisher 1930; Wright 1931) for two alleles segregating at one site to model reproduction and then study the population dynamics of the collection of all $L$ independent sites. In the limit as $L$ tends to infinity and $N$ is large but fixed, the Poisson random field [...]

[...] with initial value $\xi^s_s = y \in (0, 1)$, where $B_t$ is a standard Brownian motion. Here, $N_\kappa(t)/N = 1$ whenever $t \le t^*$ and $N_\kappa(t)/N = \kappa$ for $t > t^*$. We denote such a Markov process by $(\xi^s_t)_{t \ge s}$ or simply $(\xi_t)_{t \ge 0}$ when the initial time is $s = 0$. Furthermore, let $P^{\gamma,\kappa}_y$ and $E^{\gamma,\kappa}_y$ be the law and expectation of processes $(\xi_t)_t$ that start in $y$, have selective pressure $\gamma$, and evolve in a population of size $\kappa N$. Let $\tau_1$ be the time to fixation of the derived allele. The fixation probability is given by (Kimura 1962)

$$q_{\gamma,\kappa}(y) = P^{\gamma,\kappa}_y(\tau_1 < \infty) = \frac{1 - e^{-2\gamma\kappa y}}{1 - e^{-2\gamma\kappa}}, \quad \gamma \neq 0, \qquad q_{0,\kappa}(y) = y.$$
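As a numerical illustration of the fixation probability above, the following minimal sketch evaluates $q_{\gamma,\kappa}(y)$ directly from the stated formula. The function name `fixation_probability` and the tolerance used to detect the neutral case are illustrative assumptions, not part of the original model description.

```python
import math


def fixation_probability(gamma: float, kappa: float, y: float) -> float:
    """Evaluate q_{gamma,kappa}(y) = (1 - exp(-2*gamma*kappa*y)) / (1 - exp(-2*gamma*kappa)).

    gamma: population selection intensity (negative values are deleterious)
    kappa: relative population size after the change (kappa = 1 means no change)
    y:     initial frequency of the derived allele, in [0, 1]
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("initial frequency y must lie in [0, 1]")
    s = 2.0 * gamma * kappa
    # Neutral case: the formula reduces to q(y) = y.
    # The threshold 1e-12 is an arbitrary numerical tolerance (assumption).
    if abs(s) < 1e-12:
        return y
    # expm1(x) = exp(x) - 1 avoids cancellation for small |s|;
    # the two minus signs of (1 - exp(.)) cancel in the ratio.
    return math.expm1(-s * y) / math.expm1(-s)


if __name__ == "__main__":
    y0 = 0.01
    for kappa in (0.5, 1.0, 2.0):
        q = fixation_probability(-2.0, kappa, y0)
        print(f"kappa={kappa}: q={q:.6f} (neutral value {y0})")
```

For a deleterious mutation ($\gamma < 0$) the computed probability falls below the neutral value $y$ and decreases as $\kappa$ grows, in line with selection acting more efficiently in larger populations.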

Formally, we construct our model $X^{N_\kappa}_t$ using stochastic Poisson integrals.

The detailed presentation and most of the technical aspects are deferred to [...] for suitable functions $f \in F$, specified in Appendix A, satisfying sufficient conditions for these integrals to be well defined. In particular, $f(0) = 0$.

The class $D$ is the path space for the diffusion processes $t \mapsto \xi_t$, consisting of [...]

Here, conditional on $X_{t^*}$, $M^{X_{t^*}}_{\gamma,\kappa}(dy, d\xi^{t^*,y})$ is a Poisson random measure on $(0, 1] \times D$ with intensity $X^{N_\kappa}_{t^*}(dy)\, P^{\gamma,\kappa}_y(d\xi^{t^*,y})$ [...] see Lemma 3 iii) in Appendix A. In particular, taking $\kappa = 1$, the limiting expected value of relation (5) is given by this linear function, as long as $t \le t^*$.

The linear term in $t$ represents the effect of constant rate fixations and the integral term independent of time represents the steady-state spectrum of polymorphic frequencies.

In order to analyze the nonequilibrium AFS caused by applying popula- [...] While in these representations we do not see directly a spectrum of frequen- [...] see Theorem 1 in Appendix B for a detailed derivation.

We make two remarks: for $f \in F$ with the additional property $f(1) = 0$, the time-dependent AFS provides a bridge between the "marginal" limit [...] For the case of neutral evolution, $\gamma = 0$, we have $\omega_{0,1}\psi_{0,1}(y) - \omega_{0,\kappa}\psi_{0,\kappa}(y) = 2(1 - \kappa)/y$ and hence, for $f \in F$, $t \ge t^*$, and $N \to \infty$, [...] Application of selected functions $f$ to the nonequilibrium AFS in Eq. (10) allows us to retrieve expressions for relevant summary statistics, such as nucleotide diversity and the fixation rate.

The ratio of nucleotide diversity during nonequilibrium

We derive and investigate the ratio of nucleotide diversity, $\pi_N/\pi_S$ (Nei and [...] More generally, by applying Eq. (10), we obtain the time-dependent nonsynonymous nucleotide diversity measure $\pi^{\gamma,\kappa}$ [...] nonequilibrium.

In order to allow for variation in selection across sites for the nonsynonymous diversity, we integrate the previous expressions over a distribution of fitness effects (DFE). We will denote the random variable generating the [...] with shape parameter $a > 0$, scale parameter $b > 0$, and mean $-ab$. Integration of the expression in Eq. (11) and $\pi^{\gamma,\kappa}_N(t)$ over this density yields an averaged diversity measure $\pi^\kappa_N$. Taken together it holds [...] see Appendix C for details. The expectations in the above expression are used to indicate integration over the DFE. We observe that $\pi^\kappa_N(t)$ approaches a new equilibrium, $\pi^\kappa$ [...] we obtain the synonymous diversity as [...] with $\pi^\kappa_S(t) \to 2\theta\kappa$ as $t \to \infty$. The ratio of nonsynonymous and synonymous diversity, which we denote $(\pi_N/\pi_S)^\kappa(t) := \pi^\kappa_N(t)/\pi^\kappa_S(t)$, is also time dependent and determined by Eqs. (13) and (14). The ratio in equilibrium will be denoted $(\pi_N/\pi_S)^\kappa$.
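The DFE described above has shape parameter $a$, scale parameter $b$, and mean $-ab$, which matches a gamma distribution reflected onto the negative axis. The following sketch draws selection intensities under that assumption using only the Python standard library. The function name `sample_dfe`, the parameter values, and the seed are hypothetical choices made for illustration; the density itself is not reproduced in the text above, so the reflected-gamma form is an assumption.

```python
import random


def sample_dfe(a: float, b: float, n: int, seed: int = 0) -> list[float]:
    """Draw n selection intensities gamma <= 0 from a reflected gamma DFE.

    Assumption: -gamma follows a gamma distribution with shape a and scale b,
    so that the mean of gamma is -a*b, as stated for the DFE in the text.
    """
    if a <= 0.0 or b <= 0.0:
        raise ValueError("shape a and scale b must be positive")
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    # random.gammavariate(alpha, beta) uses alpha as shape and beta as scale.
    return [-rng.gammavariate(a, b) for _ in range(n)]


if __name__ == "__main__":
    a, b = 0.5, 10.0  # hypothetical values, chosen only for the demonstration
    gammas = sample_dfe(a, b, n=200_000, seed=1)
    empirical_mean = sum(gammas) / len(gammas)
    print(f"empirical mean {empirical_mean:.3f}, theoretical mean {-a * b:.3f}")
```

Averaging any per-site quantity over such a sample is a Monte Carlo stand-in for the integration over the DFE that the expectations in the text denote.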

The fixation rate ratio during nonequilibrium

The nonsynonymous to synonymous fixation rate ratio of selected and [...] To obtain an explicit representation of $Z^{\gamma,\kappa}(t)$, we note that $f_{\mathrm{fix}}(1) = 1$ and that the expectation operator applied to $f_{\mathrm{fix}}(y)$ can be rewritten in terms of the fixation time distribution, [...] Thus, [...] compare Appendix D for technical details. Fixations that originate from nonsynonymous mutations are averaged over the DFE in Eq. (12); for synonymous fixations $\gamma$ is set to zero. Finally, the fixation rate ratio in nonequilibrium is defined as [...] The nonequilibrium quantity $\omega^\kappa(t)$ is consistent with the equilibrium fixation [...] We note that measures modeled in this study are population functionals [...] for an increase in size ($\kappa > 1$) it takes longer. Also the extent of change determines how fast $(\pi_N/\pi_S)^\kappa(t)$ attains the new equilibrium value. The more the population size is reduced, the faster $(\pi_N/\pi_S)^\kappa(t)$ reaches its new equilibrium value; the more a population increases in size, the longer it takes. Also, given a DFE restricted to deleterious mutations, $(\pi_N/\pi_S)^\kappa(t)$ is negatively correlated with population size as predicted by the nearly neutral theory of molecular evolution.
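The asymmetry described above, with faster equilibration after a reduction than after an increase, can be illustrated for neutral diversity alone with a textbook recursion. This sketch is not the Poisson random field model of this paper: it assumes a haploid Wright-Fisher population under the infinite-sites approximation, where expected pairwise diversity follows $\pi_{g+1} = (1 - 1/N)\,\pi_g + 2\mu$ with equilibrium $2N\mu$. The function name and all parameter values are hypothetical.

```python
def generations_to_halve_gap(n_old: float, kappa: float, mu: float,
                             max_generations: int = 10**6):
    """Count generations until neutral diversity has covered half the distance
    between its old equilibrium (2*n_old*mu) and its new one (2*kappa*n_old*mu).

    Assumption: expected diversity follows pi' = (1 - 1/N) * pi + 2 * mu,
    a standard haploid infinite-sites recursion, not the model of this paper.
    Returns None if the gap is not halved within max_generations.
    """
    n_new = kappa * n_old
    target = 2.0 * n_new * mu   # equilibrium at the new population size
    pi = 2.0 * n_old * mu       # equilibrium before the size change
    initial_gap = abs(pi - target)
    for generation in range(1, max_generations + 1):
        pi = (1.0 - 1.0 / n_new) * pi + 2.0 * mu
        if abs(pi - target) <= 0.5 * initial_gap:
            return generation
    return None


if __name__ == "__main__":
    n_old, mu = 1000.0, 1e-4  # hypothetical values for the demonstration
    for kappa in (0.1, 0.5, 2.0, 10.0):
        g = generations_to_halve_gap(n_old, kappa, mu)
        print(f"kappa={kappa}: gap halved after {g} generations")
```

In this recursion the gap to the new equilibrium shrinks by a factor $1 - 1/(\kappa N)$ per generation, so the relaxation time scales with the new population size. That is the same qualitative ordering in $\kappa$ that the text reports for $(\pi_N/\pi_S)^\kappa(t)$.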

The behavior of $\omega^\kappa(t)$ after a change in population size is depicted in Fig. 2B and is similar to the behavior of $(\pi_N/\pi_S)^\kappa(t)$. The ratio decreases for $\kappa > 1$, which means that fewer deleterious nonsynonymous mutations went to fixation, in accordance with observations about selection acting more efficiently in larger populations. However, $\omega^\kappa(t)$ is a cumulative measure over the time interval $[0, t]$, while $(\pi_N/\pi_S)^\kappa(t)$ reflects a snapshot of the strength of selection at time $t$. As a consequence, $\omega^\kappa(t)$ takes longer to reach its new equilibrium than $(\pi_N/\pi_S)^\kappa(t)$.

[...] mental to discuss the differences and to assess which of the definitions are relevant to relate to $\omega^\kappa(t)$ and $(\pi_N/\pi_S)^\kappa(t)$ in our modeling approach. The [...]

We investigate the selection-drift relationship after a change in population size and compare it to the equilibrium behavior. Therefore, we relate the two estimates $N^h_{\mathrm{eff}}(t)$ and $N^\pi_{\mathrm{eff}}(t)$ to the fixation rate ratio, $\omega^\kappa(t)$, and the [...]

To evaluate the discrepancy between the nonequilibrium behavior and the expected equilibrium balance between genetic drift and selection, we indicate $\omega^\kappa$ and $(\pi_N/\pi_S)^\kappa$ as functions of $\kappa$.

First, we consider an ancient change in population size (Fig. 4). This [...] ing $\omega^\kappa(t)$ and $N^\pi_{\mathrm{eff}}(t)$ than $\omega^\kappa(t)$ and $N^h_{\mathrm{eff}}(t)$. For $(\pi_N/\pi_S)^\kappa(t)$ the deviation from equilibrium is less striking, even when relating to $N^h_{\mathrm{eff}}(t)$ in Fig. 4C.

Still, a deviation exists, and notably it runs in the opposite direction to that of $\omega^\kappa(t)$. While for $\kappa < 1$ the measure $\omega^\kappa(t)$ falls below the equilibrium expectation, the measure $(\pi_N/\pi_S)^\kappa(t)$ for a given $N_{\mathrm{eff}}$ is larger than in equilibrium. The reverse holds for $\kappa > 1$.

For a recent change in population size in Fig. 5, when using the fixa- [...] becomes clearly visible when using $N^\pi_{\mathrm{eff}}(t)$ as an estimate of genetic drift.

The measures $(\pi_N/\pi_S)^\kappa(t)$ and $N^\pi_{\mathrm{eff}}(t)$ react much faster to a change in population size, which is especially striking in the comparison of Figs. 5A and 5D. Figure 5D shows that there is hardly any deviation from the expected equilibrium relationship since $(\pi_N/\pi_S)^\kappa(t)$ and $N^\pi_{\mathrm{eff}}(t)$ both react quickly to [...]

In fact, this AFS is not restricted to periodically changing environments, and hence may be used to develop the further case of allowing a prescribed, con- [...] for a suitable domain $D$ of twice differentiable functions on the unit interval.
Let $\tau$ be the absorption time of $(\xi_t)$. Then, since $Q_\gamma f \in F_0$ by assumption, [...] where $G_\gamma(x, y)$ is the Green function of the Wright-Fisher diffusion. Hence, [...] and therefore, uniformly in $t$, [...] As the two integrals on the right hand side are finite by assumption, it remains to show that [...] (4), and

$$\psi_\gamma(y) = \frac{e^{2\gamma y} - 1}{\gamma y (1 - y)}, \quad \gamma \neq 0, \qquad \psi_0(y) = \frac{2}{1 - y}.$$

[...] (30)), we now have [...] Since $q_\gamma(1) = 1$, the integrand function $y \mapsto Q_\gamma f(y) - Q_\gamma f(1)\, q_\gamma(y)$ belongs to $F_0$. Hence we can proceed as above and obtain as replacement of Eq. (A.1), [...] where the supremum is the same factor already treated under i). The obser- [...] completes the proof.

The stationary AFS arises in the Poisson random field approach as a [...] sity $\omega_\gamma \psi_\gamma(y)\,dy$ and evolve as Wright-Fisher diffusions $(\xi^{0,y}_t)_{t \ge 0}$.

Lemma 2. For $f \in F_0$ we have as $N \to \infty$ the convergence in distribution, [...] and convergence of expected values, [...] iii) E [...] Here, conditioning on $\{\xi^s_0\}_{s<0}$, [...] Since $f \in F_0$ we have $e^f - 1 \in F_0$. By Lemma 1 i), $T_t(e^f - 1) \in F_0$, $t \ge 0$. [...]

The modified Poisson random measure, denoted $N^{N_\kappa}(ds, dy, d\xi^{s,y})$, applies Poisson points for which the dynamics of the paths $(\xi^s_t)_{t \ge s}$ change with the current size of the population.

Theorem 1. For $f \in F$ the nonequilibrium AFS at time $t \ge t^*$ after a change in population size at $t^*$ is given by [...]

Proof. The allele frequencies originating from mutations before $t^*$ form a stationary AFS with Poisson intensity $\omega_{\gamma,1}\psi_{\gamma,1}(y)\,dy$ at time $t^*$, compare [...] Conditioning on $X^{N_\kappa}_{t^*} = N_\gamma$, by Lemma 3 ii) (with $t^*$ replacing $t = 0$ as initial [...] By Lemma 3 iii), [...]
The integral term vanishes and $\kappa = 1$, since for $t \le t^*$ the population size is $N_\kappa = N$.