A dynamical limit to evolutionary adaptation

Natural selection makes evolutionary adaptation possible even if the overwhelming majority of new mutations are deleterious. However, in rapidly evolving populations where numerous linked mutations occur and segregate simultaneously, clonal interference and genetic hitchhiking can limit the efficiency of selection, allowing deleterious mutations to accumulate over time. This can in principle overwhelm the fitness increases provided by beneficial mutations, leading to an overall fitness decline. Here, we analyze the conditions under which evolution will tend to drive populations to higher versus lower fitness. Our analysis focuses on quantifying the boundary between these two regimes, as a function of parameters such as population size, mutation rates, and selection pressures. This boundary represents a state in which adaptation is precisely balanced by Muller’s ratchet, and we show that it can be characterized by rapid molecular evolution without any net fitness change. Finally, we consider the implications of global fitness-mediated epistasis, and find that under some circumstances this can drive populations towards the boundary state, which can thus represent a long-term evolutionary attractor.

and $b$ is defined by [...]. Note that to apply the MSSM approximation, $\rho(s)$ must fall off exponentially, or faster than exponentially, at large positive $s$; otherwise, the above integrals do not converge and $T_c$ is not well-defined. The quantity $T_c$ is related to the underlying parameters $N$ and $U\rho(s)$ through the following system of equations for $T_c$ and the interference threshold $x_c$ (the fitness advantage above which individuals can fix largely unhindered by clonal interference and, roughly speaking, the typical fitness advantage of the most-fit individuals in the population): [...] where $c \equiv U \int \rho(s) \left[ T_c s\, e^{T_c s} + 1 - e^{T_c s} \right] ds$.
Eq. (4) and Eq. (5) are used throughout this work to obtain theory predictions under the MSSM approximation. While numerical solution of these equations is straightforward, it is sometimes useful to further approximate these two equations as [...] and

$$\log(N x_c) \approx T_c (x_c - U) - \frac{v T_c^2}{2} + U \int \frac{\rho(s)}{s} \left( e^{T_c s} - 1 \right) ds,$$

respectively, which hold in the regime of validity of the MSSM approximation (which we discuss below).

Although it is defined by Eq. (2), the quantity $T_c$ approximately corresponds to $\langle T_2 \rangle / 2$ (one-half the average time since two randomly chosen individuals in a population shared a common ancestor) and can be interpreted as a coalescence timescale. In particular, the pairwise neutral heterozygosity $\pi_{\mathrm{neu}}$ (the average number of neutral genetic differences among pairs of individuals in a population) evaluates approximately to $4 T_c U_n$; the relation $\langle T_2 \rangle / 2 \approx T_c$ follows since $\pi_{\mathrm{neu}} = 2 \langle T_2 \rangle U_n$. The pairwise selected heterozygosity (which counts selected genetic differences, occurring at rate $U$, with fitness effects drawn from $\rho(s)$) is given by [...]. Results have also been obtained for the full site frequency spectrum (which gives the expected number of mutations present in a sample at a given frequency, and from which additional statistics of genetic diversity can be computed) for both neutral mutations and selected mutations. The fixation rate of neutral mutations is simply given by $U_n$, while the fixation rate of selected mutations is given by $F = U \int \rho(s)\, e^{T_c s}\, ds$; from these predictions, a prediction for $\alpha$ (as defined in the main text) follows.

Approximate validity of the MSSM approximation requires that $T_c b \gg 1$ and that [...]. For populations subject purely to beneficial mutations, or purely to deleterious mutations, the second condition is roughly encapsulated by the condition $\tilde{s} \ll b$, where $\tilde{s} \equiv \langle s_f \rangle + 2 \Delta s_f$ denotes the largest "typical" effect size of a fixed mutation, as well as by the requirement that $\rho(s)$ falls off exponentially, or faster than exponentially, at large positive $s$. The quantity $\langle s_f \rangle$ denotes the average effect size of a fixed mutation, and the quantity $\Delta s_f$ denotes the standard deviation in effect sizes of fixed mutations. When beneficial and deleterious mutations both occur in a population, separate largest "typical" effects $\tilde{s}_b$ and $\tilde{s}_d$ can be defined for beneficial and deleterious mutations, and both $\tilde{s}_b \ll b$ and $\tilde{s}_d \ll b$ must be satisfied.
These conditions of validity must be satisfied self-consistently; provided that these conditions are met, the distribution $\rho_f(s)$ of fixed mutational effects follows as $\rho_f(s) \propto \rho(s)\, e^{T_c s}$ within the "bulk" of $\rho_f(s)$, that is, the region dominating $\int \rho_f(s)\, ds$. These conditions ensure that the distribution of relative fitnesses $f(x)$ is well-approximated by [...] and that the fixation probability $w(x)$ of an individual with relative fitness $x$ is well-approximated by [...] within the important region of fitness space (the "fixation class", near the "nose" of the fitness distribution), which produces the majority of future common ancestors of the population. This region of fitness space (which dominates the integral $\int f(x)\, w(x)\, dx$) is particularly important because Eq. (8) is obtained by enforcing that $1/N = \int f(x)\, w(x)\, dx$ (i.e., that on average, one individual in the population at any given time will eventually fix).
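The relations $F = U \int \rho(s)\, e^{T_c s}\, ds$, $\rho_f(s) \propto \rho(s)\, e^{T_c s}$ and $\tilde{s} \equiv \langle s_f \rangle + 2 \Delta s_f$ can be evaluated numerically once $T_c$ is known. The following is a minimal sketch, not the procedure used in this work: the function name, the trapezoid-rule quadrature, and the example parameter values are all illustrative assumptions, and $T_c$ is supplied by hand rather than solved for self-consistently.

```python
import math

def fixed_effect_summary(rho, T_c, U, s_min, s_max, n=20001):
    """Trapezoid-rule estimates of F, <s_f>, Delta s_f and s_tilde.

    rho is a normalized DFE density. The density of fixed effects is taken
    to be proportional to rho(s) * exp(T_c * s), as stated in the text.
    T_c is an input here (hypothetical value), not solved for.
    """
    h = (s_max - s_min) / (n - 1)
    m0 = m1 = m2 = 0.0
    for i in range(n):
        s = s_min + i * h
        w = h * (0.5 if i in (0, n - 1) else 1.0)  # trapezoid weights
        g = w * rho(s) * math.exp(T_c * s)
        m0 += g
        m1 += g * s
        m2 += g * s * s
    mean = m1 / m0
    std = math.sqrt(max(m2 / m0 - mean * mean, 0.0))
    return {"F": U * m0, "mean_sf": mean, "std_sf": std,
            "s_tilde": mean + 2.0 * std}

# Illustrative case: purely beneficial, exponentially distributed effects
# with mean s_b = 0.01 and an assumed T_c = 50, so that T_c * s_b = 0.5.
s_b = 0.01
summary = fixed_effect_summary(lambda s: math.exp(-s / s_b) / s_b,
                               T_c=50.0, U=1e-3, s_min=0.0, s_max=1.0)
print(summary)
```

For this exponential example the integrals are elementary: the fixed effects are again exponentially distributed with mean $s_b / (1 - T_c s_b) = 0.02$, so $F = 2U$ and $\tilde{s} = 0.06$, which the quadrature should reproduce closely.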

II. APPLICABILITY OF THE MSSM APPROXIMATION ALONG THE v = 0 RIDGELINE
In the main text, we argued that the $v = 0$ ridgeline (in the space of scaled fitness effects) is given by the curve $\gamma_d^* = 1$ and $\gamma_b^* = W\!\left(\frac{1}{e \eta}\right)$, parameterized by $\eta$. Here, we evaluate the quantities $T_c b$, $\tilde{s}_d / b$ and $\tilde{s}_b / b$ along this $v = 0$ ridgeline, which we then use to comment on the validity of the MSSM approximation in describing the $v = 0$ ridgeline.
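The ridgeline formula can be evaluated without special-function libraries. The sketch below assumes that $W$ is the principal branch of the Lambert W function, that $\eta$ denotes the ratio $U_b / U_d$, and that the $v = 0$ condition for two-effect DFEs reads $\eta\, \gamma_b e^{\gamma_b} = \gamma_d e^{-\gamma_d}$ in scaled form. These are assumptions inferred from the ridgeline formula, since the main-text derivation is not reproduced here. The function names are illustrative.

```python
import math

def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal branch W_0(z) for z >= 0, via Newton's method on w*exp(w) = z."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting point at or above the root for z >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def ridgeline_gamma_b(eta):
    """gamma_b^* = W(1 / (e * eta)) on the v = 0 ridgeline (gamma_d^* = 1)."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    return lambert_w0(1.0 / (math.e * eta))

for eta in (1.0, 0.1, 0.01):
    g_b = ridgeline_gamma_b(eta)
    # Under the assumed balance condition, both sides equal exp(-1) at gamma_d = 1.
    print(eta, g_b, eta * g_b * math.exp(g_b), math.exp(-1.0))
```

Under these assumptions, $\gamma_b^*$ grows only logarithmically as $\eta$ decreases, so rarer beneficial mutations need a modestly larger scaled effect to offset the maximally impactful deleterious effect.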
From the definition of $b$ in Eq. (3), it follows that [...] on the $v = 0$ ridgeline, which in turn simplifies to [...]. As a consequence, $T_c b \gg 1$ and $\tilde{s}_d / b \ll 1$ if $T_c U_b \gg 1$ (assuming $\eta < 1$). Furthermore, given Eq. (14) we have [...], with the right-hand side of Eq. (15) bounded above by 0

bioRxiv preprint doi: https://doi.org/10.1101/2023.07.31.551320; this version posted August 2, 2023. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

III. DISTRIBUTIONS OF FITNESS EFFECTS
Here, we consider simple classes of distributions of fitness effects, simplifying expressions obtained in the main text and justifying the use of the MSSM approximation in describing the $v = 0$ ridgeline. We focus our attention on the case in which effect sizes of both beneficial mutations and deleterious mutations are drawn from gamma distributions, potentially differing in shape and scale. That is, we assume [...] with $s_b$ and $s_d$ the mean available effect size of a beneficial mutation or a deleterious mutation, respectively; that is, $s_b \equiv \int \rho_b(s)\, s\, ds$ and $s_d \equiv \int \rho_d(s)\, |s|\, ds$. The parameters $\alpha_b$ and $\alpha_d$ are the shape parameters of the respective gamma distributions; the choice $\alpha_b = 1$ (or $\alpha_d = 1$) corresponds to an exponential distribution of beneficial (or deleterious) fitness effects; the limit $\alpha_b \to \infty$ (or $\alpha_d \to \infty$) corresponds to a delta distribution of beneficial (or deleterious) fitness effects.

Given the $U\rho(s)$ specified in Eq. (16), the scaled rate of adaptation $T_c v$ follows as [...] where the quantities $\gamma_b$ and $\gamma_d$ denote the average scaled effect sizes of beneficial mutations and of deleterious mutations, respectively. Note that the requirement $\gamma_b < \alpha_b$ emerges as a consequence of Eq. (2), which defines $T_c$. The $v = 0$ surface then follows as [...]. The fixation rate $F$ can also be computed, using Eq. (1), with the result [...] so that [...]. Finally, the quantity $(T_c b)^3$ simplifies to [...] where in Eq. (21), we do not impose the constraint that $v = 0$. Below, we focus our attention on the case in which both beneficial mutations and deleterious mutations are drawn from exponential DFEs (so that $\alpha_b = \alpha_d = 1$).
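For gamma-distributed effects the relevant integrals have closed forms, which makes the constraint $\gamma_b < \alpha_b$ easy to see. The sketch below assumes that Eq. (2) takes the form $v = U \int \rho(s)\, s\, e^{T_c s}\, ds$, that $\gamma_b \equiv T_c s_b$ and $\gamma_d \equiv T_c s_d$, and that $\eta \equiv U_b / U_d$. None of these forms is reproduced in this section, so they are assumptions, and the function name is illustrative. With a gamma density of mean $s_b$ and shape $\alpha_b$, one has $\int \rho_b(s)\, s\, e^{T_c s}\, ds = s_b \left(1 - \gamma_b / \alpha_b\right)^{-(\alpha_b + 1)}$, which diverges as $\gamma_b \to \alpha_b$; the deleterious analogue is $s_d \left(1 + \gamma_d / \alpha_d\right)^{-(\alpha_d + 1)}$.

```python
import math

def scaled_rate_gamma(gamma_b, gamma_d, eta, alpha_b=1.0, alpha_d=1.0):
    """T_c * v / U_d for gamma-distributed effects.

    Assumes v = U * integral of rho(s) * s * exp(T_c * s) ds (an assumed
    form of Eq. (2)) and eta = U_b / U_d (also an assumption).
    """
    if not 0.0 <= gamma_b < alpha_b:
        raise ValueError("the beneficial integral converges only for gamma_b < alpha_b")
    if gamma_d < 0.0:
        raise ValueError("gamma_d is a magnitude and must be non-negative")
    gain = eta * gamma_b * (1.0 - gamma_b / alpha_b) ** (-(alpha_b + 1.0))
    loss = gamma_d * (1.0 + gamma_d / alpha_d) ** (-(alpha_d + 1.0))
    return gain - loss

# Exponential DFEs (alpha = 1): this parameter choice sits on the v = 0 surface.
print(scaled_rate_gamma(0.5, 1.0, eta=0.125))

# Large shape parameters approach the single-effect forms
# eta * gamma_b * exp(gamma_b) - gamma_d * exp(-gamma_d).
approx = scaled_rate_gamma(0.3, 1.0, eta=0.5, alpha_b=1e6, alpha_d=1e6)
print(approx, 0.5 * 0.3 * math.exp(0.3) - math.exp(-1.0))
```

Under these assumptions, the deleterious term $\gamma_d (1 + \gamma_d / \alpha_d)^{-(\alpha_d + 1)}$ is maximized at $\gamma_d = 1$ for every shape parameter, consistent with the ridgeline condition $\gamma_d^* = 1$.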

A. Exponentially-distributed fitness effects
In the case $\alpha_b = \alpha_d = 1$, the $v = 0$ surface (in the space spanned by axes $\eta$, $\gamma_b$ and $\gamma_d$) is described by [...] and the $v = 0$ ridgeline is given by [...]. Because $p_{\mathrm{fix}}(s) \propto e^{T_c s}$ (at least for the majority of fixed mutations, assuming validity of the MSSM approximation), the distribution of fixed beneficial effects is also an exponential distribution, to a good approximation (and likewise for deleterious mutations); we have [...]. At the extremal point, we thus have [...] and [...]. From Eq. (25) and Eq. (27) it follows that at the extremal point, $\tilde{s}_b \ll b$, $\tilde{s}_d \ll b$ and $T_c b \gg 1$ (and thus the conditions of validity of the MSSM approximation are satisfied) if $T_c U_b \gg 1$ (under the additional assumption that $U_d \geq U_b$). We note that in the case of exponential DFEs, Eq. (20) simplifies to [...] and $\eta|_{\alpha = 0}$ can be computed using Eq. (9), with the result [...].
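For exponential DFEs the ridgeline can be located in closed form under the same working assumptions used for the gamma case: that $v = U \int \rho(s)\, s\, e^{T_c s}\, ds$ and $\eta \equiv U_b / U_d$, so that $v = 0$ reads $\eta\, \gamma_b / (1 - \gamma_b)^2 = \gamma_d / (1 + \gamma_d)^2$. These forms are assumptions rather than equations quoted from this section, and the function names are illustrative. Setting $\gamma_d = 1$ gives the quadratic $4 \eta \gamma_b = (1 - \gamma_b)^2$, whose root below 1 is the one compatible with $\gamma_b < \alpha_b = 1$.

```python
import math

def exp_ridgeline_gamma_b(eta):
    """Root with gamma_b < 1 of 4*eta*gamma_b = (1 - gamma_b)**2 (assumed v = 0 form)."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    return 1.0 + 2.0 * eta - 2.0 * math.sqrt(eta * (1.0 + eta))

def fixed_scaled_means(gamma_b, gamma_d):
    """Scaled mean magnitudes of fixed effects when rho_f(s) is proportional to rho(s)*exp(T_c*s).

    An exponential DFE reweighted by exp(T_c * s) stays exponential, with
    scaled mean gamma_b / (1 - gamma_b) for beneficial effects and
    gamma_d / (1 + gamma_d) for deleterious effects.
    """
    return gamma_b / (1.0 - gamma_b), gamma_d / (1.0 + gamma_d)

eta = 0.125
g_b = exp_ridgeline_gamma_b(eta)
print(g_b, fixed_scaled_means(g_b, 1.0))
```

At $\gamma_d = 1$ the mean scaled fixed deleterious effect is $1/2$ for any $\eta$, while the mean scaled fixed beneficial effect depends on $\eta$ through $\gamma_b$.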

B. Arbitrary DFE
More generally, we can consider arbitrary DFEs $\rho_b(s)$ and $\rho_d(s)$ of beneficial and deleterious mutations. If we assume these DFEs have fixed shapes and consider changes in their scale, we can visualize the $v = 0$ constraint as a 2-dimensional surface within a 3-dimensional parameter space parameterized by $\eta$, $\gamma_b$ and $\gamma_d$ (where here $\gamma_b$ denotes the average scaled effect size of a beneficial mutation, and likewise for $\gamma_d$). As in the case of single effects, a larger $\gamma_b$ implies a more positive rate of adaptation, for fixed $\eta$ and $\gamma_d$. Likewise, an intermediate scale of deleterious effects is maximally impactful. For a particular shape of deleterious DFE, the maximally impactful deleterious (scaled) DFE $\rho_d^*(\gamma)$, which lies on a $v = 0$ ridgeline, satisfies [...].

Because $e^{-\gamma} / N$ gives the fixation probability of a deleterious mutation with scaled effect $\gamma$, Eq. (30) has a simple interpretation in terms of the deleterious scaled effects which fix: on the $v = 0$ ridgeline, the mean-squared fixed scaled deleterious effect equals the mean fixed scaled deleterious effect. This further implies that $\langle T_c s_f \rangle_d^* \leq 1$ on the $v = 0$ ridgeline (where $s_f$ denotes a fixed mutational effect, and the expectation value averages over all deleterious fixed effects); note that this can be seen by ensuring positivity of the variance in fixed scaled deleterious effects, which equals $\langle T_c s_f \rangle_d^* \left( 1 - \langle T_c s_f \rangle_d^* \right)$, given Eq. (30). Thus, the average fixed effect of the maximally impactful deleterious DFE, $\rho_d^*(s)$, is subject only to moderate or weak selection. The upper bound $\langle T_c s_f \rangle_d^* = 1$ is achieved for the two-effect DFE considered above, while for the two-exponential DFE we have $\langle T_c s_f \rangle_d^* = 1/2$. For the corresponding beneficial DFE, $\rho_b^*(s)$, lying on the $v = 0$ surface, a bound on $\langle T_c s_f \rangle_b^*$ is less easily established without specifying the shape of $\rho_b(s)$.
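The ridgeline property described above can be checked numerically for any DFE shape. The sketch below reads Eq. (30) as the statement that the $e^{-\gamma}$-weighted mean of $\gamma^2$ equals that of $\gamma$; this reading is based on the verbal description, since the equation itself is not reproduced here. It also takes "maximally impactful" to mean the scale that maximizes $\int \rho_d(\gamma)\, \gamma\, e^{-\gamma}\, d\gamma$, which is an assumption. The shape functions, function names, quadrature grid and search bracket are illustrative choices.

```python
import math

def weighted_moments(shape, lam, x_max=12.0, n=6001):
    """Integrals of shape(x) * (lam*x)**k * exp(-lam*x) dx for k = 0, 1, 2.

    shape is a unit-scale density of effect magnitudes; lam rescales it, so
    gamma = lam * x is the scaled effect and exp(-gamma) is its fixation weight.
    """
    h = x_max / (n - 1)
    k0 = k1 = k2 = 0.0
    for i in range(n):
        x = i * h
        w = h * (0.5 if i in (0, n - 1) else 1.0)  # trapezoid weights
        g = lam * x
        f = w * shape(x) * math.exp(-g)
        k0 += f
        k1 += f * g
        k2 += f * g * g
    return k0, k1, k2

def most_impactful_scale(shape, lo=1e-3, hi=50.0, iters=80):
    """Golden-section search for the scale maximizing k1 (assumes one interior maximum)."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    fc, fd = weighted_moments(shape, c)[1], weighted_moments(shape, d)[1]
    for _ in range(iters):
        if fc < fd:   # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = weighted_moments(shape, d)[1]
        else:         # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = weighted_moments(shape, c)[1]
    return 0.5 * (a + b)

shapes = {
    "exponential": lambda x: math.exp(-x),
    "half-normal": lambda x: math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x),
}
for name, shape in shapes.items():
    lam = most_impactful_scale(shape)
    k0, k1, k2 = weighted_moments(shape, lam)
    print(name, "mean fixed:", k1 / k0, "mean-squared fixed:", k2 / k0)
```

The two printed moments agree at the optimum because the derivative of $\int \rho_d\, \gamma\, e^{-\gamma}$ with respect to the scale is proportional to the difference between them; for the exponential shape both should be close to $1/2$, matching the value quoted for the two-exponential DFE.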

IV. EVOLUTIONARILY STABLE DFE
In the main text, we identified constraints on U ρ(s) (or perhaps on U ρ(γ), the distribution of scaled fitness effects) that yield v = 0 and analyzed the resulting v = 0 surface. In this fashion, ρ b (s) and ρ d (s) are treated as parameters which could in principle be varied independently. In some of the simplest models of genome evolution, however, ρ b (s) and ρ d (s) are not independent. For example, assuming a genome of finite length L with no epistasis, the fixation of a beneficial mutation creates an opportunity for a deleterious back-mutation of the same magnitude, and vice versa. Rice et al. (2015) consider the resulting "evolutionarily stable" DFE that is reached at long times once, for each magnitude of fitness effect |s|, beneficial and deleterious mutations reach a state of detailed balance, in which ρ b (s)p fix (s) = ρ d (s)p fix (−s) (and consequently, the distribution of fixed beneficial mutations matches that of fixed deleterious mutations). A consequence of this detailed balance is that v = 0 for the evolutionarily stable DFE; here we discuss the application of the MSSM approximation to this state.
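The detailed-balance condition above fixes the shape of the distribution of fixed effects, which can be checked numerically. The sketch below assumes $N p_{\mathrm{fix}}(s) \approx e^{T_c s}$ and that beneficial and deleterious densities at each magnitude sum to an underlying density $\rho_0(|s|)$; the latter is an interpretation, not a quoted equation. Detailed balance then gives fixed effects distributed in proportion to $\rho_0(|s|) / \cosh(T_c |s|)$. If $\rho_0$ is effectively flat over the contributing range (the $T_c \sigma \to \infty$ limit considered in this section), the scaled fixed effects $\gamma = T_c |s|$ follow a density proportional to $\mathrm{sech}(\gamma)$. The function name and quadrature settings are illustrative.

```python
import math

CATALAN = 0.915965594177219  # Catalan's constant

def sech_fixed_effect_moments(g_max=60.0, n=60001):
    """Mean and standard deviation of gamma under a density proportional to sech(gamma), gamma >= 0."""
    h = g_max / (n - 1)
    m0 = m1 = m2 = 0.0
    for i in range(n):
        g = i * h
        w = h * (0.5 if i in (0, n - 1) else 1.0)  # trapezoid weights
        f = w / math.cosh(g)
        m0 += f
        m1 += f * g
        m2 += f * g * g
    mean = m1 / m0
    std = math.sqrt(m2 / m0 - mean * mean)
    return mean, std

mean, std = sech_fixed_effect_moments()
print("scaled mean fixed effect:", mean, "vs 4C/pi =", 4.0 * CATALAN / math.pi)
print("scaled std of fixed effects:", std)
```

Under these assumptions the scaled mean equals $4C / \pi$ with $C$ Catalan's constant, and the scaled standard deviation is $\sqrt{\pi^2 / 4 - (4C / \pi)^2}$, which rounds to the value $T_c \Delta s_f \approx 1.1$ given in this section.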
Given the underlying distribution $\rho_0(|s|)$ of absolute effects $|s|$, $N p_{\mathrm{fix}}(s) \approx e^{T_c s}$ implies that [...] and [...] so that [...]. Below, we evaluate $\langle s_f \rangle$ and related expressions for the class of stretched exponential DFEs, $\rho_0(|s|) = \frac{1}{\sigma \Gamma(1 + \beta^{-1})} e^{-(s/\sigma)^{\beta}}$, in the $T_c \sigma \to \infty$ limit. The quantity $\langle s_f \rangle$ simplifies to [...] where $C$ denotes Catalan's constant. The mean-squared fixed effect (of beneficial mutations, and also of deleterious mutations) is given by [...] from which it follows that $T_c \Delta s_f \approx 1.1$. In the same limit, [...] and so $\tilde{s} \ll b$ and $T_c b \gg 1$ in the $a \to \infty$ limit if $U \gg \sigma$.

[...] $T_c v$ observed in simulations. For panels colored with a dark grey background (first, third and fifth rows) populations are subject to a two-effect DFE; for panels colored with a light grey background (second, fourth and sixth rows) populations are subject to a two-exponential DFE. The parameters $NU$ and $\eta$ for simulated populations in a given panel are as denoted on the left-hand side and top of the figure, respectively. Populations with $T_c U_b < 1/2$ (which, at the $v = 0$ ridgeline, suggests the MSSM approximation may break down) are denoted by points of smaller size.
Solid curves denote predictions of the MSSM approximation, dashed lines denote predictions obtained using the standard formula for fixation probabilities assuming independently evolving loci, and dotted lines are the lines $\gamma_d = \eta \gamma_b$.
Populations in the first, third, and fifth rows are subject to two-effect DFEs; populations in the second, fourth, and sixth rows are subject to two-exponential DFEs. Solid curves denote MSSM predictions for the $F = U$ surface, with dotted lines denoting these predictions beyond the regime of validity of the MSSM approximation.
Populations in the first, third, and fifth rows are subject to two-effect DFEs; populations in the second, fourth, and sixth rows are subject to two-exponential DFEs. Solid curves denote MSSM predictions for the $\alpha = 0$ surface, with dotted lines denoting these predictions beyond the regime of validity of the MSSM approximation. Dashed lines denote predictions obtained by using our "$N_e$-based heuristic" to compute $T_c$ instead of the MSSM approximation.