Punctuated equilibrium as the default mode of evolution of large populations on fitness landscapes dominated by saddle points in the weak-mutation limit

Yuri Bakhtin; Mikhail I. Katsnelson; Yuri I. Wolf; Eugene V. Koonin

doi:10.1101/2020.07.20.212241

Abstract

Punctuated equilibrium is a mode of evolution in which phenetic change occurs in rapid bursts that are separated by much longer intervals of stasis during which mutations accumulate but no major phenotypic change occurs. Punctuated equilibrium was originally proposed within the framework of paleobiology, to explain the lack of transitional forms that is typical of the fossil record. Theoretically, punctuated equilibrium has been linked to self-organized criticality (SOC), a model in which the size of ‘avalanches’ in an evolving system is power-law distributed, resulting in increasing rarity of major events. We show here that, under the weak-mutation limit, a large population would spend most of the time in stasis in the vicinity of saddle points in the fitness landscape. The periods of stasis are punctuated by fast transitions, in ln N_e time (N_e, effective population size), when a new beneficial mutation is fixed in the evolving population, which moves to a different saddle, or on much rarer occasions, from a saddle to a local peak. Thus, punctuated equilibrium is the default mode of evolution under a simple model that does not involve SOC or other special conditions.

Significance

The gradual character of evolution is a key feature of the Darwinian worldview. However, macroevolutionary events are often thought to occur in a non-gradualist manner, in a regime known as punctuated equilibrium, whereby extended periods of evolutionary stasis are punctuated by rapid transitions between states. Here we analyze a mathematical model of population evolution on fitness landscapes and show that, for a large population in the weak-mutation limit, the process of adaptive evolution consists of extended periods of stasis, which the population spends around saddle points on the landscape, interrupted by rapid transitions to new saddle points when a beneficial mutation is fixed. Thus, punctuated equilibrium appears to be the default regime of biological evolution.

Introduction

Phyletic gradualism, that is, evolution occurring via a succession of mutations with infinitesimally small fitness effects, is a central tenet of Darwin’s theory (1). However, the validity of gradualism has been questioned already by Darwin’s early, fervent adept, T.H. Huxley (2), and subsequently, many non-gradualist ideas and models have been proposed, to account, primarily, for macroevolution. Thus, Goldschmidt (in)famously championed the hypothesis of “hopeful monsters”, macromutations that would be deleterious in a stable environment but might give their carriers a chance for survival after a major environmental change (3). Arguably, the strongest motivation behind non-gradualist evolution concepts was the notorious paucity of intermediate forms in the fossil record. It is typical in paleontology that a species persists without any major change for millions of years, but then, is abruptly replaced by a new one. The massive body of such observations prompted Simpson, one of the founding fathers of the Modern Synthesis of evolutionary biology, to develop the concept of quantum evolution (4), according to which species, and especially, higher taxa emerged abruptly, in ‘quantum leaps’, when an evolving population rapidly moves to a new ‘adaptive zone’, or using the language of mathematical population genetics, a new peak on the fitness landscape. Simpson proposed that the quantum evolution mechanism involved fixation of unusual allele combinations in a small population by genetic drift, followed by selection driving the population to the new peak.

The idea of quantum evolution received a more systematic development in the concept of punctuated equilibrium (PE) proposed by Eldredge and Gould (5–8). The abrupt appearance of species in the fossil record prompted Eldredge and Gould to postulate that evolving populations of any species spend most of the time in the state of stasis, in which no major phenotypic changes occur (9, 10). The long intervals of stasis are punctuated by short periods of rapid evolution during which speciation occurs, and the previous dominant species is replaced by a new one. Gould and Eldredge emphasized that PE was not equivalent to the “hopeful monsters” idea, in that no macromutation or saltation was proposed to occur, but rather, a major acceleration of evolution via rapid succession of ‘regular’ mutations that resulted in the appearance of instantaneous speciation, on geological scale.

A distinct but related view of macroevolution is encapsulated in the concept of evolutionary transitions developed by Szathmary and Maynard Smith (11–13). Under this concept, major evolutionary transitions, such as, for instance, emergence of multicellular organisms, involve emergence of new levels of selection (new Darwinian individuals), in this case, selection affecting ensembles of multiple cells rather than individual cells. These evolutionary transitions resemble phase transitions in physics (14) and appear to occur rapidly, compared to the intervals of evolution within the same level of selection. The concept of evolutionary transitions can be generalized to apply to the emergence of any complex feature (15).

Punctuated equilibrium has been explicitly linked to the physical theory of self-organized criticality (SOC). Self-organized criticality, a concept developed by Bak and colleagues (16), is an intrinsic property of dynamical systems with multiple degrees of freedom and strong nonlinearity. Such systems experience serial ‘avalanches’ separated in time by intervals of stability (the avalanche metaphor comes from Bak’s depiction of SOC on the toy example of a sand pile, on which additional sand is poured, but generally denotes major changes in a system). A distinctive feature of the critical dynamics under the SOC concept is self-similar (power law) scaling of avalanche sizes (16–22). The close analogy between SOC and PE was noticed and explored by Bak and colleagues, the originators of the SOC concept, who developed models directly inspired by evolving biological systems and intended to describe their behavior (16, 19, 20, 22). In particular, the popular Bak-Sneppen model (19) explores how ecological connections between organisms (physical proximity in the model space) drive co-evolution of the entire community. Extinction of the organisms with the lowest fitness disrupts the local environments and results in concomitant extinction of their closest neighbors. It has been shown that, after a short burn-in, such systems self-organize in a critical quasi-equilibrium interrupted by avalanches of extinction, with the power law distribution of avalanche sizes.

We asked whether SOC is a prerequisite for PE and, more broadly, what are the necessary and sufficient conditions for PE. To address this question, we analyze mathematically a simple model of population evolution on a rugged fitness landscape (23). We show that, under the assumptions of a large population size and low mutation rate (weak-mutation limit), an evolving population spends most of the time in stasis, i.e. percolating in near-neutral mutational networks around saddle points on the landscape. The intervals of stasis are punctuated by rapid transitions to new saddle points after fixation of beneficial mutations. Thus, contrary to the general perception of the weak-mutation limit as an equivalent of gradualism (24), PE appears to be the default mode of evolution of large populations in this regime.

Results

Agent-based model of competitive exclusion

We consider a population of a large constant size N consisting of individuals, each with a specific genotype. To avoid dealing with the overwhelming complexity of the space of all genotypes, we work with a coarse-grained model that groups similar genotypes into ‘types’. The genotypes within the same type are considered to be homogeneous and densely connected by the mutation network. The only homogeneity assumption we need to make is that, within each type, the variations in fitness and available transitions to other classes due to mutations are negligible. We also assume that sizes of different types are comparable. The set of all types is denoted by .

The evolution of a population within the model involves reproduction and mutation. Reproduction of individuals occurs under the Moran model widely used in population genetics, that is, individuals reproduce at rates proportional to their fitness, and each reproduction is accompanied by removal of a random individual to keep N constant (25). Mutations are modeled by transitions in a mutational network E. The individual mutation rate λ is assumed to be low compared to the reproduction rates. The evolutionary regime depends on: i) the geometry of the mutational network E, ii) the fitness function f, iii) the values of parameters N and λ, iv) the initial configuration.

Let us now describe our basic model in more detail. We assume that the population size is a large number N, constant in time. The set of all possible types is finite or countable. It can be viewed as a graph with adjacency matrix E = (E_ij). Two distinct types i, j are connected by an edge if they differ by a mutation (at the scale of the model, a mutation is assumed to occur instantaneously and without intermediate steps). In that case, we set E_ij = 1. Otherwise, E_ij = 0.

Each type i is assigned a fitness value f_i > 0, which is identified with the reproduction rate. The numbers f_i are assumed to be distinct and of the order of 1 (more precisely, bounded), so essentially, time is measured in reproductions. It is convenient to work with relative sizes y_i of type populations (fractions) with respect to the total population size N. We denote by Δ the space of sequences y = (y_i) such that y_i ≥ 0 for all i and Σ_i y_i = 1. Denoting the fraction of individuals of type i present in the population at time t ∈ ℝ by x_i(t) (taking values 0, N⁻¹, 2N⁻¹,…), we define random evolution of the vector x(t) = (x_i(t)) as a continuous-time pure jump Δ-valued Markov process, by specifying the transition rates. A single individual of type i produces new individuals of the same type i at the rate f_i. Each reproduction is accompanied by removal of one individual that is randomly and uniformly chosen from the entire population. Thus, the total rate of reproduction of individuals of type i is Nx_if_i. Given that an individual of type i is reproducing, the probability that the child individual will replace an individual of type j is x_j. Thus, the total rate of simultaneous change x_i → x_i + N⁻¹ and x_j → x_j − N⁻¹ is Nf_ix_ix_j. Let us now introduce mutations. We will assume that mutation rates are much lower than the reproduction rates. To model this, we introduce a small parameter λ > 0. The rate of replacement of an individual of type i ∈ I(x), where I(x) = {i : x_i > 0} is the set of types present in the population, by an individual of type j is given by λE_ij ∈ {0, λ}. The total rate of such transitions occurring in a population is NλE_ijx_i.
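The transition rates above can be turned into a small event-driven (Gillespie-type) simulation. The following is a minimal sketch, not the authors' implementation: the function name `simulate`, the dictionary `edges` that encodes the nonzero entries of E, and the seed handling are illustrative assumptions. Counts n_i = N x_i are used instead of fractions, so the reproduction rate N f_i x_i x_j becomes f_i n_i n_j / N and the mutation rate N λ E_ij x_i becomes λ n_i.

```python
import random

def simulate(counts, fitness, edges, lam, t_max, seed=0):
    """Run the jump process up to time t_max and return the final type counts.

    counts[i]  : initial number of individuals of type i (the sum N stays constant)
    fitness[i] : reproduction rate f_i
    edges      : dict mapping i to the list of types j with E_ij = 1 (assumed encoding)
    lam        : mutation rate per individual and per edge
    """
    rng = random.Random(seed)
    counts = list(counts)
    n_total = sum(counts)
    t = 0.0
    while True:
        events, rates = [], []
        for i, n_i in enumerate(counts):
            if n_i == 0:
                continue
            # Reproduction of type i replacing a type-j individual: N f_i x_i x_j.
            for j, n_j in enumerate(counts):
                if j != i and n_j > 0:
                    events.append((i, j))      # (type that gains, type that loses)
                    rates.append(fitness[i] * n_i * n_j / n_total)
            # Mutation of a type-i individual into type j: N lam E_ij x_i.
            for j in edges.get(i, []):
                events.append((j, i))
                rates.append(lam * n_i)
        total = sum(rates)
        if total == 0.0:
            break                              # absorbed: nothing can change any more
        t += rng.expovariate(total)            # exponential waiting time to next event
        if t > t_max:
            break
        gain, loss = rng.choices(events, weights=rates, k=1)[0]
        counts[gain] += 1
        counts[loss] -= 1
    return counts
```

For instance, `simulate([50, 50], [1.0, 2.0], {}, 0.0, t_max=5.0)` runs the mutation-free process for two types; the total count stays at 100 throughout because every event moves exactly one individual between types.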

In what follows, we derive the PE evolutionary regime from certain reasonable assumptions on the geometry of the graph, the fitness function, population size, mutation rates and the initial state. Our results can be viewed as similar to those in previous work (26–28), where more sophisticated models were considered. However, our simple model allows for a more transparent analysis that is conducive to biological implications and we use it here to tie the PE concept to noisy dynamics near heteroclinic networks (29, 30) and emphasize the importance of saddle points on the landscape for the evolutionary process.

Evolution without mutations in the infinite population size limit

In this section, we examine the case where λ = 0, i.e., there are no mutations, in the infinite population size limit, and approximate the dynamics of our stochastic model by that of a deterministic ODE

$$\dot{x}_i = b_i(x) = x_i \left( f_i - \bar{f}(x) \right), \tag{1}$$

where f̄(x) = Σ_j f_j x_j is the average fitness for the population state x. The system (1) is a well-known competitive exclusion system (see, e.g., (2.15)–(2.16) of (31)) restricted to nonzero components of x. Equation (1) emerges due to the averaging effect and can be viewed as a law of large numbers for our model: the expected change of x_i per unit time is Σ_j f_i x_i x_j − Σ_j f_j x_j x_i = x_i(f_i − f̄(x)).
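The deterministic limit can be explored numerically. The sketch below assumes that the right-hand side of ODE (1) has the standard competitive exclusion form b_i(x) = x_i(f_i − f̄(x)), with f̄(x) the average fitness, and integrates it with a plain forward Euler scheme. The function names, the step size `dt` and the example fitness values are illustrative choices.

```python
def replicator_rhs(x, f):
    """Right-hand side b(x) with b_i = x_i * (f_i - mean fitness)."""
    mean_f = sum(fi * xi for fi, xi in zip(f, x))
    return [xi * (fi - mean_f) for fi, xi in zip(f, x)]

def integrate(x0, f, t_max, dt=1e-2):
    """Forward Euler integration of dx/dt = b(x) from x0 up to time t_max."""
    x = list(x0)
    for _ in range(int(round(t_max / dt))):
        b = replicator_rhs(x, f)
        x = [xi + dt * bi for xi, bi in zip(x, b)]
    return x

# Four types with increasing fitness; every type is present initially.
final_state = integrate([0.4, 0.3, 0.2, 0.1], [1.0, 1.5, 2.0, 2.5], t_max=40.0)
```

Because the components of b(x) sum to zero on the simplex, the Euler steps keep the fractions summing to 1 up to rounding, and in this example almost all of the mass ends up on the fittest type.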

To state our results, we need to introduce some notations and definitions. We denote I = I(x(0)) for brevity and note that, given the absence of mutations, our stochastic model and ODE (1) are defined on the simplex Δ_I = {y ∈ Δ : y_i = 0 for all i ∉ I}. This simplex is the convex hull of its vertices e⁽ⁱ⁾, i ∈ I, corresponding to pure states where only one type is present:

$$e^{(i)}_j = \delta_{ij} = \begin{cases} 1, & j = i, \\ 0, & j \neq i. \end{cases}$$

One of these vertices plays a special role. Let i* be the type with maximum fitness f* (within I), that is, f* = f_i* = max_i∈I f_i. We will see that e^(i*) is an attractor for both the deterministic dynamical system defined by (1) and our stochastic model. For the approximation result, we need to define the discrepancy D(t) = x(t) − Φ^t x(0), where x(t) is the Markov process without mutations and, for any y, Φ^t y is the solution of ODE (1) with the initial condition y, at time t. We are going to estimate the maximal discrepancy up to time t, i.e., D*(t) = sup_s∈[0,t] || D(s) ||, where ||⋅|| is the L¹ norm in ℝ^I defined by

$$\| y \| = \sum_{i \in I} | y_i |.$$

We assume that the number of types |I| is small compared to the population size, more precisely, there is μ < 1/2 such that

$$|I| \le N^{\mu}. \tag{4}$$

Because this model does not include mutations, if a type i becomes extinct at time s, i.e., x_i(s) = 0, then x_i(t) = 0 for all t ≥ s. We denote the event on which no type i ∈ I becomes extinct before time t by B_t = {I(x(s)) = I for all s ∈ [0, t]}. Events from a sequence (A_N)_N∈ℕ are stretch-exponentially unlikely (SE-unlikely) if for some C, γ > 0,

$$\mathsf{P}(A_N) \le C \exp\left( -N^{\gamma} \right) \quad \text{for all } N \in \mathbb{N}.$$

This is fast decay in N, just short of being truly exponentially fast. We are now ready to state our main result for the system without mutations and to comment on the meaning of each of its parts.

Theorem 1.

Assume (4). Then:

1. There are constants c, β > 0 such that events B_clnN ∩ {D*(clnN) > N^−β} are SE-unlikely.
2. Let β be as defined in Part 1 of the Theorem. Then, for any δ < β, there is a constant C > 0 such that, conditioned on the nonextinction of type i*, and up to a SE-unlikely event, |x(ClnN) − e^(i*)| ≤ N^−δ.
3. There are constants C′, α > 0 such that, if |x(0) − e^(i*)| ≤ N^−δ, then P{x(C′lnN) = e^(i*)} ≥ 1 − N^−α.
4. There is a number p > 0 that does not depend on N, such that the probability of nonextinction of type i* is bounded below by p for all initial conditions x(0) satisfying x_i*(0) > 0.
5. For any δ ∈ (0,1), if x_i*(0) > N^−δ, then extinction of type i* is SE-unlikely.

Part 1 of the theorem shows that, up to time clnN, if no type gets extinct, the stochastic process x(t) follows the deterministic trajectory Φ^tx(0) very closely, deviating from it at most by N^−β. This happens with a probability very close to 1, exceptions being stretch-exponentially unlikely.

Part 2 shows that, if type i* does not die out, then, with high probability, by time ClnN, it will dominate the population and all other types will be almost extinct.

Part 3 means that, after realization of the scenario described in Part 2 and an additional logarithmic time, i* will be the only surviving type.

Part 1 is conditioned on the nonextinction of any type, whereas Part 2 is conditioned on the nonextinction of type i*. If any type i dies out, Part 1 still applies to the continuation of the process on the simplex Δ_I\{i} of a lower dimension. By contrast, for Part 2 to be meaningful, we need to provide a bound on the nonextinction of i*. This is done in Parts 4 and 5.

Part 4 states that there is a positive probability (independent of the population size) that the progeny of even a single individual of type i* will drive out all other types.

Part 5 states that, once the fraction of the individuals of type i* reaches a (small) threshold N^−δ, it is almost certain that i* will dominate the population. To summarize these results, the chance of extinction for the fittest type is non-negligible only when there are very few individuals of this type, that is, when the initial state involves a recent mutation that produced a single individual of this type. Once the number of individuals reaches a certain modest threshold, the typical, effectively deterministic, behavior is to follow the trajectory of (1) closely, eventually reaching the pure state of fixation where only individuals of type i* are present. The proof of Theorem 1 is given in the Appendix. Now, we turn to the analysis of the dynamics generated by ODE (1).
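The content of Parts 4 and 5 can be illustrated in the simplest special case of only two types, where the number of individuals of the fitter type performs a birth-death walk whose up and down rates have the constant ratio r = f*/f. For this case the classical Moran fixation probability, (1 − r^−n)/(1 − r^−N) for n initial individuals, applies. The sketch below evaluates this textbook formula; it is not the argument used in the Appendix, and the function name and parameter values are illustrative assumptions.

```python
import math

def fixation_probability(r, n_total, n_start=1):
    """Two-type Moran fixation probability for a type with fitness ratio r > 1.

    r        : ratio of the fitness of the focal type to that of the resident type
    n_total  : population size N
    n_start  : initial number of individuals of the focal type
    """
    if r <= 1.0:
        raise ValueError("this sketch covers only an advantageous type, r > 1")
    log_r = math.log(r)
    # (1 - r**(-n_start)) / (1 - r**(-n_total)), written with expm1 for accuracy.
    return math.expm1(-n_start * log_r) / math.expm1(-n_total * log_r)

# A single mutant with a 50% fitness advantage in populations of growing size.
single_mutant = [fixation_probability(1.5, n) for n in (10**2, 10**4, 10**6)]
# The same mutant type once it has reached 1000 copies in a population of 10**6.
established = fixation_probability(1.5, 10**6, n_start=1000)
```

For a single mutant the values approach 1 − 1/r, which does not depend on N, in line with the N-independent lower bound of Part 4. Starting from a modest number of copies, the probability is indistinguishable from 1, in line with Part 5.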

Behavior of the deterministic system

In this section, we explore the behavior of the system (1). Our basic analysis is only a minor extension of previous work (31) (Section 2.2.1), and we include it here for completeness and to stress the points central to the concept of evolution in the PE regime that is developed in this paper. The first statement characterizes the survival of the fittest under this dynamic.

Theorem 2.

Let x(t) be a solution of Eq. (1). If x_i*(0) > 0, then x(t) converges to e^(i*) exponentially fast.

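One possible approach to the proof of this theorem is to define c = f* − max_{i∈I, i≠i*} f_i and note that the average fitness f̄(x) = Σ_i f_i x_i satisfies f̄(x) ≤ f* x_i* + (f* − c)(1 − x_i*) = f* − c(1 − x_i*), which together with equation (1) implies

$$\dot{x}_{i^*} = x_{i^*} \left( f^* - \bar{f}(x) \right) \ge c \, x_{i^*} \left( 1 - x_{i^*} \right).$$

Therefore, y(t) = 1 − x_i*(t) satisfies ẏ ≤ −c(1 − y)y, where y(0) = 1 − x_i*(0) < 1. Thus, y(t) is dominated by the solution of the equation ż = −c(1 − z)z with z(0) = y(0), which converges to zero exponentially fast, so 1 − x_i*(t) ≤ Ke^−ct for some K > 0 depending on the initial condition, which completes the proof.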

Here, our assumption that f takes distinct values was used to ensure that the constant c, the gap between the maximum value of f and the second highest value (this constant also plays the role of the convergence rate), is positive. If the maximum fitness is attained by several distinct types (as opposed to essentially indistinguishable microstates within a type), then, a similar estimate shows that, in the limit, only those maximum fitness types survive.

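Although the analysis above already allows us to conclude that points e^(k) are hyperbolic critical points (saddles) of various indices (the index of a saddle is understood here as the number of positive eigenvalues of the linearization of the vector field at the saddle, i.e., the number of unstable directions, so that a sink has index 0), we can show this more explicitly. It is easy to compute the linearization (∂_jb_i(e^(k))) of b at e^(k):

$$\partial_j b_i \left( e^{(k)} \right) = \delta_{ij} \, (f_i - f_k) - \delta_{ik} \, f_j.$$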

Therefore, for each i ∈ I such that i ≠ k, there is an eigenvalue f_i − f_k of (∂_jb_i(e^(k))) with an eigenvector e⁽ⁱ⁾ − e^(k) pointing along the simplex edge connecting e^(k) and e⁽ⁱ⁾. These eigenvectors span the directions tangent to the simplex Δ_I, so the additional eigenvalue −f_k with eigenvector e^(k) that is transversal to Δ_I can be ignored. To demonstrate explicitly that the vertex e^(k) is a saddle, we note that the eigendirections given by e⁽ⁱ⁾ − e^(k) are stable or unstable, depending on the sign of the associated eigenvalue, i.e., on whether f_i < f_k or f_i > f_k. Moreover, there is a heteroclinic connection (a trajectory connecting two distinct saddle points) between e⁽ⁱ⁾ and e^(k). This trajectory coincides with the simplex edge between e⁽ⁱ⁾ and e^(k) and corresponds to the presence of exactly two types i, k. The dynamics on it is described by the logistic equation (see Figure 1 for the phase portrait). The key feature of this dynamics is a heteroclinic network formed by trajectories connecting saddle points to one another. The vertex e^(i*) is a sink (a saddle of index 0) if considered in Δ_I but it can also be viewed as a saddle in simplices of higher dimensions based on coordinates (types) that include those with higher fitness than f*. The types with higher fitness will appear if we include mutations into the model.
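The eigenvalue statement can be checked directly on a small example. The sketch below assumes the vector field b_i(x) = x_i(f_i − Σ_j f_j x_j), whose Jacobian at the vertex e^(k) has entries δ_ij(f_i − f_k) − δ_ik f_j, and verifies numerically that each edge direction e⁽ⁱ⁾ − e^(k) is an eigenvector with eigenvalue f_i − f_k. The helper names and the fitness values are illustrative.

```python
def jacobian_at_vertex(f, k):
    """Jacobian of b_i(x) = x_i * (f_i - sum_j f_j x_j) evaluated at x = e^(k)."""
    n = len(f)
    return [[(f[i] - f[k]) * (i == j) - f[j] * (i == k) for j in range(n)]
            for i in range(n)]

def matvec(a, v):
    """Plain matrix-vector product for lists of lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def edge_eigenpairs(f, k):
    """(eigenvalue, eigenvector) pairs along the simplex edges at e^(k)."""
    pairs = []
    for i in range(len(f)):
        if i != k:
            v = [0.0] * len(f)
            v[i], v[k] = 1.0, -1.0             # direction e^(i) - e^(k)
            pairs.append((f[i] - f[k], v))
    return pairs

def count_unstable_directions(f, k):
    """Number of edge directions at e^(k) with a positive eigenvalue."""
    return sum(1 for eigenvalue, _ in edge_eigenpairs(f, k) if eigenvalue > 0)

fitness = [1.0, 2.0, 3.0, 4.0]
a = jacobian_at_vertex(fitness, 1)
residuals = [max(abs(x - lam * y) for x, y in zip(matvec(a, v), v))
             for lam, v in edge_eigenpairs(fitness, 1)]
```

In this example all residuals vanish, the vertex of the second type has two unstable directions (toward the two fitter types), and the vertex of the fittest type has none, which is the sink described in the text.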

Figure 1. The phase portrait of the dynamical system (1).

Four types 1, 2, 3, 4 are shown such that f₁ < f₂ < f₃ < f₄. The dynamics is defined on the simplex Δ_{1,2,3,4} with vertices e⁽¹⁾, e⁽²⁾, e⁽³⁾, e⁽⁴⁾, corresponding to pure states where the population consists entirely of individuals of one type. These vertices are critical points of the vector field b. The edges of the simplex are heteroclinic orbits connecting these critical points to each other. Several other orbits are also plotted as arrows. The vertex e⁽⁴⁾ attracts every initial condition with nonzero fraction of individuals of the fittest type i* = 4.

Evolutionary process with mutations

We now consider the full process with positive but small rate λ and recall that, for each type i ∈ I(x), the rate of mutation to type j is given by λE_ij. We consider here only relatively late stages of evolution that are preceded by extensive evolutionary optimization so that the overwhelming majority of the mutations are either deleterious or at best neutral. More precisely, we assume that there is a constant M such that for each i ∈ I(x), the total number of available fitness-increasing (beneficial) mutations, that is, vertices j such that E_ij = 1 and f_j > f*, is bounded by M. Our first assumption on the magnitude of λ is that

$$\lambda \ll \frac{1}{N \ln N}.$$

Then, for a fixed C > 0, large N, and any time interval of length ClnN, the total rate of beneficial mutations in the population is at most MλN, so the probability of a beneficial mutation is bounded by

$$1 - \exp\left( -C M \lambda N \ln N \right) \le C M \lambda N \ln N \ll 1. \tag{6}$$

According to Theorem 1, if the evolutionary process is conditioned on the survival of type i*, then, typically, it takes time ClnN for the process x_i*(t) to reach 1 (fixation). Thus, the estimate (6) shows that the population is unlikely to produce a new beneficial mutation before it reaches the state of fixation where type i* is the only surviving one. Once a new beneficial mutation occurs and, accordingly, a new best-fit type emerges, it either goes extinct quickly or gets fixed in the population, in time of the order lnN. The trajectory, driven by differential reproduction of random mutations, closely follows the heteroclinic connection, i.e., the line connecting two vertices of the simplex Δ. The entire process can be described as follows: there is a moment when i* is the only type present, after which it takes time of order (kλN)⁻¹ to produce a new beneficial mutation, where k is the number of beneficial mutations that are available from i*. Then, it takes a much shorter time ClnN for this fittest type to take over the entire population, after which the process repeats.

Now consider deleterious mutations. There are N individuals, and each produces a suboptimal (lower fitness) type with the rate λL, where L is the number of available deleterious mutations. Using the Poisson distribution, we obtain that, by time t, it is highly unlikely to produce more than tNλL new suboptimal individuals. If t = ClnN, then this number is CλLNlnN, so requiring

$$\lambda L \ll \frac{1}{\ln N}, \tag{7}$$

we obtain λLNlnN ≪ N, that is, over the travel time between saddles, the emerging individuals with deleterious mutations constitute an asymptotically negligible fraction of the entire population. Thus, the trajectory x(t) will be altered only by a term converging to 0 as N → ∞.
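The two smallness conditions, λkN ≪ 1/lnN for beneficial mutations and λL ≪ 1/lnN for deleterious ones, are easy to evaluate for concrete numbers. The helper below is a sketch only: the function name, the returned dictionary keys, the constant `c` in the sweep time C lnN, and the parameter values are hypothetical and chosen purely for illustration.

```python
import math

def weak_mutation_checks(n_total, lam, k, n_deleterious, c=1.0):
    """Compare the time scales and mutation loads discussed in the text.

    n_total       : population size N
    lam           : per-individual, per-edge mutation rate (lambda)
    k             : number of available beneficial mutations
    n_deleterious : number of available deleterious mutations L
    c             : constant in the saddle-to-saddle travel time C ln N
    """
    sweep_time = c * math.log(n_total)
    return {
        # Mean waiting time for the next beneficial mutation, (k lam N)^-1.
        "waiting_time": 1.0 / (k * lam * n_total),
        "sweep_time": sweep_time,
        # Expected number of beneficial mutations arising during one sweep.
        "beneficial_during_sweep": k * lam * n_total * sweep_time,
        # Expected fraction of the population hit by deleterious mutations
        # during one sweep, C lam L ln N.
        "deleterious_fraction": lam * n_deleterious * sweep_time,
    }

checks = weak_mutation_checks(n_total=10**6, lam=1e-10, k=10, n_deleterious=10**6)
```

With these illustrative values the waiting time is about 1000 time units against a sweep time of about 14, and both dimensionless quantities are far below 1, so both conditions hold.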

Thus, the emerging picture is as follows: the evolving population spends most of the time in a ‘dynamic stasis’ near saddle points. During this stage, a dynamic equilibrium emerges under purifying selection: deleterious mutations constantly produce individuals with fitness lower than the current maximum, and these individuals or their progeny die out. On the time scale of (kλN)⁻¹, a new beneficial mutation will occur, and then, either the new type will go extinct fast (in which case, the population has to wait for another beneficial mutation) or will get fixed such that, in time lnN, the new type (followed by a small, dynamic cloud of suboptimal types) will dominate the population. The transition from one dominant type to the next occurs along the heteroclinic orbit coinciding with the edge of the infinite-dimensional simplex connecting the two vertices corresponding to monotypic populations. This iterative process of fast transitions between long stasis periods spent near saddle points is typical of noisy heteroclinic networks, as demonstrated in early, semi-heuristic work (32–34), and later, rigorously (29, 30). However, the two types of noisy contributions, from reproduction and mutation, play distinct roles here, so although the general punctuated character of the process that we describe here is the same as in the previous studies, their results do not apply to our case straightforwardly.
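This alternation of long waits and short sweeps can be mimicked by a coarse-grained renewal process. The sketch below is a caricature under explicit assumptions, not the full model: beneficial mutations arrive at the constant rate kλN, each new lineage escapes early extinction with a fixed probability `p_fix`, failed lineages are assumed to take negligible time, and every successful sweep lasts exactly C lnN. All names and parameter values are hypothetical.

```python
import math
import random

def stasis_fraction(n_total, lam, k, p_fix, n_transitions, c=1.0, seed=0):
    """Fraction of total time spent in stasis over a number of transitions."""
    rng = random.Random(seed)
    rate = k * lam * n_total                   # arrival rate of beneficial mutations
    stasis_time = 0.0
    completed = 0
    while completed < n_transitions:
        stasis_time += rng.expovariate(rate)   # wait for the next beneficial mutation
        if rng.random() < p_fix:               # the new lineage survives and sweeps
            completed += 1
    transition_time = n_transitions * c * math.log(n_total)
    return stasis_time / (stasis_time + transition_time)

fraction = stasis_fraction(n_total=10**6, lam=1e-10, k=10, p_fix=0.3,
                           n_transitions=200)
```

With these illustrative parameters the mean wait between successful sweeps is several thousand time units against a sweep of about 14, so the population spends the overwhelming majority of the time in stasis.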

Because the process is random, deviations from this general description eventually will occur. Stretch-exponentially unlikely, extremely rare events can be ignored. However, the right-hand side of Eq. (6), albeit small, does not decay stretch-exponentially, and so, with a non-negligible frequency, a new beneficial mutation would appear before the current fittest type takes over the entire population. The result will be clonal interference such that the current fittest type starts being replaced with the new one before reaching fixation.

Taking the structure of the landscape into account

In general, the structure of the landscape can be complicated. The available information on the structure of complex landscapes is limited, and there are few mathematical results. Several rigorous results based on random matrix theory have been obtained for centered Gaussian fields on Euclidean spheres of growing dimension with rotationally invariant covariances of polynomial type (35, 36). For those models, the average numbers of saddles of different indices at various levels of the landscape have been shown to grow exponentially with respect to the dimension of the model, and a variational characterization of the exponential rates has been obtained. Although formally limited to concrete models, these results indicate that there are many local maxima and many more saddle points in such complex landscapes. In the context of the evolutionary process, this indicates that the evolutionary path through a sequence of temporarily dominant types is likely to end up not in a global but in a local maximum. Consider now what transpires near a local fitness peak. Suppose the current dominant genotype differs in k₀ sites from the locally optimal genotype, and sequential beneficial mutations in these sites in an arbitrary order produce a succession of increasing fitness values. Ignoring shorter times of order ln N of transitioning between saddles and only taking into account the leading contributions (that is, the sum of the waiting times for the beneficial mutations), the time it takes to reach the peak is then of the order of (k₀λN)⁻¹ + ((k₀ − 1)λN)⁻¹ + ⋯ + (2λN)⁻¹ + (λN)⁻¹ ≈ (λN)⁻¹ln k₀ (recall that our time units are comparable with reproduction rates). Once the peak is reached, it is extremely unlikely that the population moves anywhere else on the landscape. More specifically, the waiting time for the appearance of a new dominant genotype is exponentially large in N as follows from the metastability theory at the level of large deviations estimates.
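The estimate of the time to reach the peak is a harmonic sum, which can be compared with its logarithmic approximation. In the sketch below the function name and the values of k₀ and λN are illustrative. The exact sum equals (λN)⁻¹ H_k₀, where H_k₀ is the k₀-th harmonic number, and H_k₀ − ln k₀ tends to the Euler-Mascheroni constant (about 0.577), so the logarithm captures the leading order for large k₀.

```python
import math

def time_to_peak(k0, lam, n_total):
    """Sum of mean waiting times (j * lam * N)^-1 for j = k0, k0 - 1, ..., 1."""
    return sum(1.0 / (j * lam * n_total) for j in range(1, k0 + 1))

k0, lam, n_total = 1000, 1e-9, 10**6           # hypothetical values, lam * N = 1e-3
exact = time_to_peak(k0, lam, n_total)
leading_order = math.log(k0) / (lam * n_total)
# Difference in units of (lam * N)^-1; close to the Euler-Mascheroni constant.
offset = (exact - leading_order) * lam * n_total
```

The sum is dominated by the last few steps, when only one or two beneficial mutations remain available, which is why the total grows only logarithmically with k₀.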

Discussion

Fossil record analysis suggests that PE dominates organismal evolution (7, 8, 10). Here we examine mathematically a simple population-genetic model and show that PE is the default regime of population evolution under basic, realistic assumptions, namely, large effective population size, low mutation rate and rarity of beneficial mutations. In the weak-mutation limit, large populations spend most of their time in ‘dynamic stasis’ in the vicinity of saddle points on the fitness landscape, i.e. performing short-range random walks within their local neutral networks without shifting to a new distinct state. The stasis periods are punctuated by rapid transitions between saddle points upon emergence of new beneficial mutations; these transitions appear effectively instantaneous compared to the duration of stasis (Figure 2). Eventually, the population might reach a local fitness peak where no beneficial mutations are available. This would lead to indefinite stasis as long as the fitness landscape does not change and the population size stays large (drift to a different peak is exponentially rare in N_e, that is, impractical for large N_e).

Figure 2. Evolution under punctuated equilibrium on a fitness landscape dominated by saddles: stasis around saddle points punctuated by fast adaptive transitions.

Planar shapes depict distinct classes of genotypes. The color scale shows a range of fitness values. Gray “ramp” strips show available transitions between the genotype classes (k transitions leading to classes with higher fitness and L transitions leading to classes with lower fitness, k ≪ L). The two blue circles indicate the original and the current states of the population; blue arrows show succession of genotypes within the same class, occurring within the effectively neutral network during the “dynamic stasis” phase; red arrows indicate fast adaptive transitions from a lower-fitness genotype to one with a higher fitness.

Two conditions determine the behavior described by this model: i) smallness of the overall mutation rate (dominated by the deleterious mutations), eq (7), λL ≪ 1/lnN and ii) smallness of the beneficial mutation rate, which results in the difference in scale between the waiting time (λkN)⁻¹ and the saddle-to-saddle transition time lnN, i.e. λkN ≪ 1/lnN. Comparison of the expressions for these conditions suggests that, for PE to be pronounced, deleterious mutations should outnumber the beneficial mutations by at least a factor of N. This is a large but not unrealistic difference in the case of ‘highly adapted’ organisms, that is, in situations, most common in the extant biosphere, where the pool of trivial optimizations that presumably were available at the earliest stages of the evolution of life is exhausted.

For example, with population and genomic parameters characteristic of animals, N of ~10⁵ and ~10⁷ amino acid-encoding sites in the genome, the local mutational neighborhood in the sequence space consists of 19×10⁷ mutations. Assuming that about half of these mutations are deleterious and noting that the number of beneficial mutations should be less by a factor of 10⁵, there must be 1 < k < 1000 beneficial mutations, apparently, a realistic value.
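The arithmetic behind this estimate can be written out explicitly. The function below only restates the numbers given in the text (19 alternative amino acids per site, half of the neighborhood deleterious, and a required ratio of at least N between deleterious and beneficial mutations); its name and parameter names are illustrative.

```python
def beneficial_upper_bound(sites=10**7, alternatives=19,
                           deleterious_fraction=0.5, n_total=10**5):
    """Upper bound on k implied by requiring L / k to be at least N."""
    neighborhood = sites * alternatives        # single amino acid replacements
    n_deleterious = deleterious_fraction * neighborhood
    return n_deleterious / n_total

k_max = beneficial_upper_bound()               # 0.5 * 19e7 / 1e5 = 950.0
```

The result, 950, is the origin of the upper limit of about 1000 beneficial mutations quoted above.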

The condition on the overall mutation rate (λL ≪ 1/lnN) is more difficult to assess because both λ and L depend on the clustering of the whole sequence space into a coarse-grained network of distinct types. Note, however, that, as the first approximation, λ is bounded by the sequence-level mutation rate μ (only some of the sequence-level mutations lead to transitions between distinct types) and L is bounded by the genome size G (the number of available sequence-level single-position mutations is on the order of the genome size, but only some of these mutations have a detectable deleterious effect). Thus, λL < μG, where μG is the expected number of sequence-level mutations per genome per generation. It has been shown that the values of μG tend to stay of the order of 1/N under ‘normal’ conditions (37, 38), so that the weak-mutation regime is likely to hold under a broad range of conditions.

Thus, our model suggests that the PE regime is common in the evolution of natural populations. The probable exceptions include stress-induced mutagenesis (39), whereby the mutation rate can rise by orders of magnitude, locally blooming microbial populations that might violate the kN ≪ L condition, and abrupt changes in the fitness landscape that might temporarily increase the number of immediately beneficial mutations k. All of these situations, however, are likely to be transient.

Theoretically, PE has been linked to SOC as the underlying mechanism (16, 19). However, we show here that PE naturally emerges in extremely simple models of population evolution that do not involve any criticality. The major conclusion from this analysis is that PE and not gradualism is the fundamental characteristic of sufficiently large populations in the weak-mutation limit which is, arguably, the most common evolutionary regime across the entire diversity of life. The parameter values that lead to PE appear to hold for evolving populations of all organisms, including viruses, under ‘normal’ conditions. Situations can emerge in the course of evolution when the PE regime breaks through disruption of the stasis phase. This could be the case in very small populations that rapidly evolve via drift or in cases of a dramatically increased mutation rate, such as stress-induced mutagenesis, and especially, when these two conditions combine (39–41). In many cases, disruption of stasis will lead to extinction but, on occasion, a population could move to a different part of the landscape, potentially, the basin of attraction of a higher peak. The evolution of cancers, at least, at advanced stages, does not appear to include stasis either, due to the high rate of nearly neutral and deleterious mutations, and low effective population size (39). Furthermore, the PE regime is characteristic of ‘normal’ evolution of well-adapted populations in which the fraction of beneficial mutations is small. If many, perhaps, the majority of the mutations are beneficial, there will be no stasis but rather a succession of rapid transitions in a fast adaptive evolution regime. Conceivably, this was the mode of evolution of primordial replicators at pre-cellular stages of evolution.

One of the most fundamental, and most difficult, problems in biology is the origin of major biological innovations (more or less synonymous with macroevolution). In modern evolutionary biology, Darwin’s central idea of survival of the fittest has been transformed into the concept of a fitness landscape with numerous peaks, where each stable form occupies one of the peaks (23, 42). A fundamental problem then arises: if a population has reached a local peak, further adaptive evolution is possible only via a stage of temporarily decreased fitness. How can this happen? A common answer is based on Wright’s concept of random genetic drift: the smaller the effective population size N_e, the greater the probability of random drift through (not excessively deep) valleys in the fitness landscape (42–44). This notion implies that major evolutionary transitions occur through narrow population bottlenecks. As formalized in our previous work, the evolutionary ‘innovation potential’ is inversely proportional to N_e (14). There are, however, multiple indications that drift cannot be the only mode of evolutionary innovation and that novelty often arises in large populations thanks to their high mutational diversity (45–48). Nevertheless, it remains unclear, within the tenets of classical population genetics, how a large population can cross a valley on the landscape. One obvious way to overcome this conundrum is to assume that the landscape changes in time due to environmental changes, so that a population can find itself in the basin of attraction of a new fitness peak (49, 50).

The analysis presented here suggests a greater innovation potential of large populations than usually assumed, stemming from the fact that a typical landscape in a multidimensional space contains many more saddle points than peaks. On the one hand, this intuitively obvious claim follows from the observation that, for any two peaks, the path connecting the peaks and maximizing the minimum height must pass through a saddle point. On the other hand, it is justified by precise computations of exponential (with respect to the model dimension) growth rates of the expected numbers of saddle points of various indices (including peaks) for random Gaussian landscapes under certain restrictions on covariance (35, 36). Thus, typical fitness landscapes are likely to allow numerous transitions and extensive, innovative evolution without the need for valley crossing.

In biological terms, it seems impossible to maximize fitness in all of the numerous directions at once (their number being at least on the order of the genome size). Therefore, the probability of beneficial mutations is (almost) never zero, however small it might be. In general, this pertains not only to single point mutations but also to beneficial epistatic combinations of mutations as well as large-scale genomic changes, such as gene gain, loss and duplication. In other words, the landscape is dominated by saddle points, which are far more common than peaks, so that there is almost always an upward path that an evolving population will follow, provided it is large enough to afford a long wait at saddles without risking extinction due to fluctuations.
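As a rough numerical illustration of why peaks should be rare when the number of directions is large, one can consider a toy uncorrelated ('house of cards') landscape. This is an assumption made here for illustration only and is not the Gaussian-landscape setting of (35, 36). If the fitness values of a genotype and of its n single-mutant neighbours are independent draws from the same continuous distribution, the genotype is a local peak with probability 1/(n + 1), so nearly every genotype retains at least one beneficial mutation when n is large. The function name `peak_fraction` and all parameter values below are hypothetical.

```python
import random


def peak_fraction(num_sites, num_samples=20000, seed=0):
    """Estimate the chance that a genotype is a local fitness peak.

    Assumption (not from the paper): the fitness of the focal genotype and
    of each of its `num_sites` single-mutant neighbours are i.i.d. uniform
    draws, i.e. a fully uncorrelated toy landscape.
    """
    rng = random.Random(seed)
    peaks = 0
    for _ in range(num_samples):
        focal = rng.random()
        # A local peak has no single mutant that is fitter than the focal type.
        if all(rng.random() < focal for _ in range(num_sites)):
            peaks += 1
    return peaks / num_samples


# Compare the Monte Carlo estimate with the exact value 1 / (n + 1).
for n in (1, 2, 5, 10, 50):
    print(n, peak_fraction(n), 1.0 / (n + 1))
```

In this sketch the fraction of genotypes with no beneficial single mutation shrinks roughly as the inverse of the number of mutable sites, which is the qualitative point made above. Correlated landscapes would change the numbers but are not modelled here.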

Results similar to ours have been reported in the mathematical biology literature (26–28). Specifically, it has been proven that a trait substitution sequence process (sequential transition from one dominant trait to another) occurs in the limit of large population size and small beneficial mutation rate. Here we employ a very simple model to demonstrate the fundamental character of the concept of punctuated equilibrium, to tie it to the noisy dynamics near heteroclinic networks (29, 30) and to stress the key role of saddle points, in contrast to the widespread perception of peaks as the central structural elements of fitness landscapes.

To conclude, the results presented here show that PE is not only characteristic of speciation or evolutionary transitions but rather is the default mode of evolution in the weak-mutation limit, which is the most common evolutionary regime (24). In our previous work, we identified conditions under which saltational evolution becomes feasible in the strong-mutation limit (41). Here we show that, even for evolution in the weak-mutation limit, which is generally perceived as gradual (24), PE is the default regime. Even during periods of stasis in phenotypic evolution, the underlying microevolutionary process appears to be punctuated.

Author contributions

YB, MIK, YIW, and EVK jointly incepted the project; YB performed the mathematical analysis; YB, MIK, YIW, and EVK analyzed the results; YB and EVK wrote the manuscript that was edited and approved by all authors.

Acknowledgements

YIW and EVK are supported by the Intramural Research Program of the National Institutes of Health of the USA. YB is partially supported by the National Science Foundation, grant DMS-1811444. MIK was supported by Spinoza Prize funds.

Appendix Proof of Theorem 1

To prove Part 1, our first goal is to represent the discrepancy D(t) defined in (2) in a convenient way. We can write the solution Φ^tx(0) of ODE (1) with initial value x(0) as

$$\Phi^t x(0) = x(0) + \int_0^t b(\Phi^s x(0))\,ds. \tag{8}$$

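It is useful to represent x(t) in a similar form. To that end, we recall that every Markov process solves the martingale problem associated with its own generator. Therefore, introducing the projection function π_i(x) = x_i, we obtain that there is a martingale M_i such that

$$x_i(t) = x_i(0) + \int_0^t \mathcal{L}\pi_i(x(s))\,ds + M_i(t), \tag{9}$$

where the generator is defined by

$$\mathcal{L}f(x) = \lim_{t \downarrow 0} \frac{\mathsf{E}_x f(x(t)) - f(x)}{t}.$$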

For our pure jump process the generator is determined by transition rates: where σ^ijx denotes the state obtained from state x by adding an individual of type i displacing an individual of type j:

We can compute directly:

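Plugging this into (9), we obtain

$$x_i(t) = x_i(0) + \int_0^t b_i(x(s))\,ds + M_i(t). \tag{10}$$

Subtracting (8) from (10), we obtain

$$x_i(t) - (\Phi^t x(0))_i = \int_0^t \left( b_i(x(s)) - b_i(\Phi^s x(0)) \right) ds + M_i(t). \tag{11}$$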

We will view M(t) = (M_i(t))_i∈I as a vector-valued martingale. To estimate the integral term, we recall the definition (3) and prove the following statement:

Lemma 1.

Let . Then, for all , ||b(x) − b(y)|| ≤ 3F ||x − y||, x, y ∈ Δ_I.

Proof. We have where and

Combining three displays above, we complete the proof. □

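Taking the absolute value in (11), then taking the sum over i ∈ I and applying Lemma 1, we obtain

$$\|x(t) - \Phi^t x(0)\| \le 3F \int_0^t \|x(s) - \Phi^s x(0)\|\,ds + M^*(t),$$

where M*(t) = sup_s∈[0,t] || M(s) ||.

Using the Gronwall inequality, we obtain

$$\|x(t) - \Phi^t x(0)\| \le M^*(t)\, e^{3Ft}. \tag{12}$$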

To estimate M*(t), we first use (4) to write for any β > 0: where . Next, we will apply an exponential martingale inequality from (51)(Appendix B6) in the form given by van de Geer (52)(Lemma 2.1):

Lemma 2.

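If jumps of a locally square integrable cadlag martingale (M(t))_t≥0 are uniformly bounded by a constant K > 0, then, for all A, B > 0,

$$\mathsf{P}\left\{ M(t) \ge A \ \text{and}\ \langle M \rangle_t \le B^2 \ \text{for some } t \ge 0 \right\} \le \exp\left( -\frac{A^2}{2(AK + B^2)} \right).$$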

Each M_i is a piece-wise linear martingale with jumps of size 1/N (its jumps coincide with those of x_i(t)). Since, in addition, the total jump rate is bounded by NF, we obtain that the predictable quadratic variation of M_i satisfies 〈M_i〉_t ≤ tNF/N² = tF/N. Thus, we can apply Lemma 2 with B² = tF/N, K = 1/N, and A = N^−β−μ:

Combining this with (13), choosing β so that β + μ < 1/2 and using t = clnN, we can find constants C, γ > 0 such that

Using this in (12), we complete the proof of Part 1 of the theorem. To prove Part 2, we notice that according to Part 1, up to a SE-unlikely event, the stochastic process follows the deterministic trajectory N^−β-closely up to time τ_e ∧ clnN, where τ_e is the first moment when one of the types goes extinct. We can restart the process at τ_e ∧ clnN treating x(τ_e ∧ clnN) as a new starting point and apply the same estimate to the restarted process (in case τ_e < clnN, with fewer nonzero coordinates involved). Patching several ODE trajectories together in this way and noting that, conditioned on nonextinction of type i*, the total time it takes to travel from any point x ∈ Δ_I with x_i* ≥ N⁻¹ to the neighborhood of e^(i*) of size N^−δ is bounded by ClnN for some C, we obtain Part 2.
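The ingredients of Parts 1 and 2 can be explored numerically in the simplest two-type case. The sketch below is illustrative and rests on assumptions that are not taken from the text: the fraction x of the fitter type jumps up by 1/N with rate N f1 x(1 − x) and down with rate N f2 x(1 − x), so that the limiting ODE is the logistic equation dx/dt = (f1 − f2) x(1 − x). The tail bound is evaluated in the Freedman-type form exp(−A²/(2(AK + B²))) with the values of A, K and B² used above. All function names and parameter values are hypothetical.

```python
import math
import random


def logistic(x0, s, t):
    """Solution of dx/dt = s*x*(1 - x) started at x0 (two-type replicator ODE)."""
    return 1.0 / (1.0 + (1.0 - x0) / x0 * math.exp(-s * t))


def max_discrepancy(N, f1, f2, x0, T, seed=0):
    """Largest gap, sampled at jump times, between a simulated two-type
    Moran-like jump process and the logistic curve on [0, T]."""
    rng = random.Random(seed)
    n = round(x0 * N)                    # number of type-1 individuals
    start = n / N
    t, worst = 0.0, 0.0
    while 0 < n < N:
        x = n / N
        up = N * f1 * x * (1.0 - x)      # assumed rate of x -> x + 1/N
        down = N * f2 * x * (1.0 - x)    # assumed rate of x -> x - 1/N
        t += rng.expovariate(up + down)  # waiting time until the next jump
        if t > T:
            break
        n += 1 if rng.random() * (up + down) < up else -1
        worst = max(worst, abs(n / N - logistic(start, f1 - f2, t)))
    return worst


def transit_time(N, s, delta):
    """Time the logistic curve needs to go from 1/N to 1 - N**(-delta)."""
    return (math.log(N - 1.0) + math.log(N ** delta - 1.0)) / s


def martingale_tail_bound(N, beta, mu, F, c):
    """exp(-A^2 / (2*(A*K + B^2))) with A = N^-(beta+mu), K = 1/N,
    B^2 = t*F/N and t = c*ln(N)."""
    t = c * math.log(N)
    A = N ** (-(beta + mu))
    K = 1.0 / N
    B2 = t * F / N
    return math.exp(-A * A / (2.0 * (A * K + B2)))


for N in (10**3, 10**4):
    print(N,
          max_discrepancy(N, f1=1.5, f2=1.0, x0=0.2, T=5.0),
          transit_time(N, s=0.5, delta=0.5) / math.log(N),
          martingale_tail_bound(N, beta=0.2, mu=0.1, F=1.0, c=1.0))
```

Under these assumptions the simulated trajectory stays close to the deterministic curve, the deterministic transit time divided by ln N approaches (1 + δ)/s, and the tail bound decays faster than any power of N once β + μ < 1/2. A single simulated run is of course only suggestive and is no substitute for the estimate proved above.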

The remaining parts follow from an auxiliary statement. To state it, we define a jump Markov process y(t) with values in {0, N⁻¹, 2N⁻¹…,1} such that y(0) = x(0) and y(t) makes a jump from x to x + N⁻¹ with rate Nf*x(1 − x) and to x − N⁻¹ with rate , where was defined in (5).

Lemma 3.

1. The process y(t) is stochastically dominated by x_i*(t). 2. The process y(t) considered only at times of jumps is an asymmetric random walk on {0, N⁻¹, 2N⁻¹, …, 1} with absorption at 0 and 1 and probabilities of a step to the right and left being p and 1 − p where p ∈ (1/2,1) solves

Proof. The coordinate x_i* jumps to the right with rate Nf_ix_i*(1 − x_i*) and to the left with rate

So, the jump rates to the left for both processes coincide and the jump rates to the right for process y(t) do not exceed those for process x_i*(t), and Part 1 of the lemma follows. To prove Part 2, it suffices to note that the ratio of the jump right rate to the jump left rate for process y(t) is equal to everywhere (except the absorbing points 0 and 1). □

To prove Part 3, we can use this lemma and the fact that if m ≥ N/2, then which implies that (except for an exponentially improbable event that x_i* hits level N/2 before 1), the time it takes for all non-i* types to die out is stochastically dominated by the extinction time for the linear birth-and-death process with birth rate λ_k = Ak and death rate μ_k = Bk where . The probability p_k(t) of extinction by time t starting with k individuals was probably first computed in (53). There is a misprint in formula (78) in (53) but one can use formula (68) of that paper (for generating functions) to obtain

$$p_k(t) = \left( \frac{B\left(1 - e^{-(B-A)t}\right)}{B - A e^{-(B-A)t}} \right)^{k}.$$

Plugging t = C′lnN and k = N^1−δ into this formula we obtain and since α = C′(B − A) − 1 + δ > 0 if we choose C′ large enough, the desired result follows.
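The arithmetic of this step can be checked numerically. The sketch below uses the standard closed form for a linear birth-and-death process with per-capita birth rate A and death rate B > A, for which the probability of extinction by time t from a single individual is B(1 − e^{−(B−A)t})/(B − A e^{−(B−A)t}), and k independent lineages multiply. It then compares the survival probability at t = C′lnN and k = N^{1−δ} with N^{−α}, where α = C′(B − A) − 1 + δ. The function names and the numerical values of A, B, C′ and δ are hypothetical.

```python
import math


def survival_probability(k, t, A, B):
    """P(not extinct by time t) for a linear birth-death process started
    from k individuals, with per-capita birth rate A and death rate B > A."""
    E = math.exp(-(B - A) * t)
    q = (B - A) * E / (B - A * E)    # survival probability of one lineage
    if q >= 1.0:                     # happens only at t = 0
        return 1.0
    # 1 - (1 - q)**k, computed stably when q is tiny.
    return -math.expm1(k * math.log1p(-q))


def alpha(C_prime, A, B, delta):
    """Exponent alpha = C'(B - A) - 1 + delta from the text."""
    return C_prime * (B - A) - 1.0 + delta


A, B, C_prime, delta = 1.0, 2.0, 2.0, 0.5
for N in (10**3, 10**6, 10**9):
    k = round(N ** (1.0 - delta))
    t = C_prime * math.log(N)
    print(N, survival_probability(k, t, A, B), N ** (-alpha(C_prime, A, B, delta)))
```

Since one lineage survives with probability at most e^{−(B−A)t}, and k lineages with at most k times that, the survival probability is bounded by N^{1−δ} N^{−C′(B−A)} = N^{−α}, which the printed values are expected to respect.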

The last two parts of Theorem 1 follow from Lemma 3, and similar well-known statements for asymmetric random walks. □
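One of the well-known statements for asymmetric random walks that is relevant here is the gambler's-ruin formula: a walk on {0, 1, …, N} that steps right with probability p ≠ 1/2 and left with probability 1 − p, started at k, reaches N before 0 with probability (1 − ρ^k)/(1 − ρ^N), where ρ = (1 − p)/p. The sketch below evaluates this formula. The function name is hypothetical, and the value of p is an arbitrary illustration rather than the solution of the equation in Lemma 3.

```python
def hit_top_probability(k, N, p):
    """Probability that an asymmetric random walk on {0, ..., N}, started
    at k, is absorbed at N rather than at 0 (requires p != 1/2)."""
    rho = (1.0 - p) / p
    return (1.0 - rho ** k) / (1.0 - rho ** N)


p = 0.6  # illustrative step-right probability in (1/2, 1)
for N in (10, 100, 1000):
    # Chance that a single initial individual (k = 1) eventually takes over.
    print(N, hit_top_probability(1, N, p), (2.0 * p - 1.0) / p)
```

For p > 1/2 and large N, the probability of absorption at the top starting from a single individual approaches 1 − ρ = (2p − 1)/p, which stays bounded away from zero. Starting from k individuals, the probability of absorption at 0 is close to ρ^k and therefore decays exponentially in k.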

References

1. Darwin C (1859) On the Origin of Species (John Murray, London).
2. Huxley TH (1860) Darwin on the origin of Species. Westminster Review:541–570.
3. Goldschmidt RB (1940) The Material Basis of Evolution (Yale Univ Press, New Haven, CT).
4. Simpson GG (1983) Tempo and Mode in Evolution (Columbia University Press, New York).
5. Eldredge N & Gould SJ (1972) Punctuated equilibria: an alternative to phyletic gradualism. Models in Paleobiology, ed Schopf TJM (Freeman Cooper, San Francisco), pp 193–223.
6. Gould SJ & Eldredge N (1977) Punctuated equilibrium: the tempo and mode of evolution reconsidered. Paleobiology 3:115–151.
7. Gould SJ & Eldredge N (1993) Punctuated equilibrium comes of age. Nature 366(6452):223–227.
8. Eldredge N & Gould SJ (1997) On punctuated equilibria. Science 276(5311):338–341.
9. Gould SJ (1994) Tempo and mode in the macroevolutionary reconstruction of Darwinism. Proc Natl Acad Sci U S A 91(15):6764–6771.
10. Gould SJ (2002) The Structure of Evolutionary Theory (Harvard Univ. Press, Cambridge, MA).
11. Szathmary E & Smith JM (1995) The major evolutionary transitions. Nature 374(6519):227–232.
12. Maynard Smith J & Szathmary E (1997) The Major Transitions in Evolution (Oxford University Press, Oxford).
13. Szathmary E (2015) Toward major evolutionary transitions theory 2.0. Proc Natl Acad Sci U S A 112(33):10104–10111.
14. Katsnelson MI, Wolf YI, & Koonin EV (2018) Towards physical principles of biological evolution. Physica Scripta 93:043001.
15. Wolf YI, Katsnelson MI, & Koonin EV (2018) Physical foundations of biological complexity. Proc Natl Acad Sci U S A 115(37):E8678–E8687.
16. Bak P (1996) How Nature Works. The Science of Self-Organized Criticality (Springer, New York).
17. Bak P, Tang C, & Wiesenfeld K (1987) Self-organized criticality: An explanation of the 1/f noise. Phys Rev Lett 59(4):381–384.
18. Bak P, Tang C, & Wiesenfeld K (1988) Self-organized criticality. Phys Rev A Gen Phys 38(1):364–374.
19. Bak P & Sneppen K (1993) Punctuated equilibrium and criticality in a simple model of evolution. Phys Rev Lett 71(24):4083–4086.
20. Maslov S, Paczuski M, & Bak P (1994) Avalanches and 1/f noise in evolution and growth models. Phys Rev Lett 73(16):2162–2165.
21. Maslov S & Zhang YC (1995) Exactly Solved Model of Self-Organized Criticality. Phys Rev Lett 75(8):1550–1553.
22. Bak P & Paczuski M (1995) Complexity, contingency, and criticality. Proc Natl Acad Sci U S A 92(15):6689–6696.
23. Gavrilets S (2004) Fitness Landscapes and the Origin of Species (Princeton University Press, Princeton).
24. Gillespie JH (1994) The Causes of Molecular Evolution (Oxford University Press, Oxford).
25. Moran PA (1958) Random processes in genetics. Proc. Philos. Soc. Math. and Phys. Sci. 54:60–71.
26. Champagnat N (2006) A microscopic interpretation for adaptive dynamics trait substitution sequence models. Stochastic Processes and their Applications 116:1127–1160.
27. Champagnat N & Méléard S (2011) Polymorphic evolution sequence and evolutionary branching. Probability Theory and Related Fields 151:45–94.
28. Kraut A & Bovier A (2019) From adaptive dynamics to adaptive walks. J Math Biol 79(5):1699–1747.
29. Bakhtin Y (2010) Small noise limit for diffusions near heteroclinic networks. Dyn Syst 25:413–431.
30. Bakhtin Y (2011) Noisy heteroclinic networks. Probability Theory and Related Fields 150:1–42.
31. Nowak MA (2006) Evolutionary Dynamics: Exploring the Equations of Life (Belknap Press, Cambridge, MA).
32. Stone E & Holmes P (1990) Random perturbation of heteroclinic attractors. SIAM J. Appl. Math. 50:726–743.
33. Stone E & Armbruster D (1999) Noise and O(1) amplitude effects on heteroclinic cycles. Chaos: An Interdisciplinary Journal of Nonlinear Science 9:499–506.
34. Armbruster D, Stone E, & Kirk V (2003) Noisy heteroclinic networks. Chaos: An Interdisciplinary Journal of Nonlinear Science 13:71–86.
35. Auffinger A & Ben Arous G (2013) Complexity of random smooth functions on the high-dimensional sphere. Ann Probab 41:4214–4247.
36. Ben Arous G, Mei S, Montanari A, & Nica M (2019) The landscape of the spiked tensor model. Comm. Pure Appl. Math. 72:2282–2330.
37. Lynch M (2010) Evolution of the mutation rate. Trends Genet 26(8):345–352.
38. Lynch M, et al. (2016) Genetic drift, selection and the evolution of the mutation rate. Nat Rev Genet 17(11):704–714.
39. Fitzgerald DM, Hastings PJ, & Rosenberg SM (2017) Stress-Induced Mutagenesis: Implications in Cancer and Drug Resistance. Annu Rev Cancer Biol 1:119–140.
40. Ram Y & Hadany L (2019) Evolution of Stress-Induced Mutagenesis in the Presence of Horizontal Gene Transfer. Am Nat 194(1):73–89.
41. Katsnelson MI, Wolf YI, & Koonin EV (2019) On the feasibility of saltational evolution. Proc Natl Acad Sci U S A 116(42):21068–21075.
42. Wright S (1949) Adaptation and selection. Genetics, Paleontology and Evolution (Princeton Univ. Press, Princeton, NJ).
43. Lynch M (2007) The origins of genome architecture (Sinauer Associates, Sunderland, MA).
44. Lynch M & Conery JS (2003) The origins of genome complexity. Science 302(5649):1401–1404.
45. Masel J (2006) Cryptic genetic variation is enriched for potential adaptations. Genetics 172(3):1985–1991.
46. Rajon E & Masel J (2013) Compensatory evolution and the origins of innovations. Genetics 193(4):1209–1220.
47. Lynch M & Abegg A (2010) The rate of establishment of complex adaptations. Mol Biol Evol 27(6):1404–1414.
48. Lynch M (2018) Phylogenetic divergence of cell biological features. Elife 7.
49. Gavrilets S & Vose A (2005) Dynamic patterns of adaptive radiation. Proc Natl Acad Sci U S A 102(50):18040–18045.
50. Mustonen V & Lassig M (2009) From fitness landscapes to seascapes: non-equilibrium dynamics of selection and adaptation. Trends Genet 25(3):111–119.
51. Shorack GR & Wellner JA (2009) Empirical processes with applications to statistics (Society for Industrial and Applied Mathematics, Philadelphia, PA).
52. Van de Geer S (1995) Exponential inequalities for martingales, with application to maximum likelihood estimation for counting processes. Ann Statist 23:1779–1801.
53. Bartholomay AF (1958) On the linear birth and death processes of biology as Markoff chains. Bull Math Biophys 20:97–118.

Posted July 20, 2020.
