## Abstract

Fitness effects of mutations depend on environmental parameters. For example, mutations that increase fitness of bacteria at high antibiotic concentration often decrease fitness in the absence of antibiotic, exemplifying a tradeoff between adaptation to environmental extremes. We develop a mathematical model for fitness landscapes generated by such tradeoffs, based on experiments that determine the antibiotic dose-response curves of *Escherichia coli* strains, and previous observations on antibiotic resistance mutations. Our model generates a succession of landscapes with predictable properties as antibiotic concentration is varied. The landscape is nearly smooth at low and high concentrations, but the tradeoff induces a high ruggedness at intermediate antibiotic concentrations. Despite this high ruggedness, however, all the fitness maxima in the landscapes are evolutionarily accessible from the wild type. This implies that selection for antibiotic resistance in multiple mutational steps is relatively facile despite the complexity of the underlying landscape.

## Introduction

Sewall Wright introduced the concept of fitness landscapes in 1932 (Wright, 1932), and for decades afterwards it persisted chiefly as a metaphor, due to lack of sufficient data. This has changed considerably in recent decades (de Visser and Krug, 2014). There are now a large number of experimental studies that have constructed fitness landscapes for combinatorial sets of mutations relevant to particular phenotypes, such as the resistance of bacteria to antibiotics (Weinreich et al., 2006; Marcusson et al., 2009; Schenk et al., 2013; Mira et al., 2015; Knopp and Andersson, 2018). Mathematical modeling of fitness landscapes has also seen a revival, motivated partly by the need to quantify and interpret the ruggedness of empirical fitness landscapes (Szendro et al., 2013; Weinreich et al., 2013; Neidhart et al., 2014; Ferretti et al., 2016; Crona et al., 2017; Hwang et al., 2018). Conceptual breakthroughs, such as the notion of sign epistasis (where a mutation is beneficial in some genetic backgrounds but deleterious in others), have shed light on how ruggedness can constrain evolutionary trajectories (Weinreich et al., 2005; Poelwijk et al., 2007, 2011; Franke et al., 2011).

Despite this progress, a limitation of current studies of fitness landscapes is that they focus mostly on *G* × *G* (gene-gene) interactions, and little on *G* × *G* × *E* (where *E* stands for environment) interactions, i.e. on how changes in environment modify gene-gene interactions. A few recent studies have begun to address this question (Flynn et al., 2013; Taute et al., 2014; Gorter et al., 2018; de Vos et al., 2018). In the context of antibiotic resistance, it has been realized that the fitness landscape of resistance genes depends quite strongly on antibiotic concentration (Mira et al., 2015; Ogbunugafor et al., 2016). This is highly relevant to the clinical problem of resistance evolution, since the concentration of antibiotics can vary widely in a patient’s body as well as in various non-clinical settings (Kolpin et al., 2004; Andersson and Hughes, 2014). Controlling the evolution of resistance mutants thus requires an understanding of fitness landscapes as a function of antibiotic concentration. Empirical investigations of such scenarios are still limited, and systematic theoretical work on this question is also lacking.

In the present work, we aim to develop a theory of *G* × *G* × *E* interactions for a specific class of landscapes, with particular focus on applications to antibiotic resistance. The key feature of the landscapes we study is that every mutation comes with a tradeoff between adaptation to the two extremes of an environmental parameter. For example, it has been known for some time that antibiotic resistance often comes with a fitness cost, such that a bacterium that can tolerate high drug concentrations grows slowly in drug-free conditions. While such tradeoffs are not universal, they certainly occur for a large number of mutations (Melnyk et al., 2015).

Our starting point for understanding these landscapes is the knowledge of two phenotypes that are well studied: the drug-free growth rate (which we call the null-fitness) and the IC_{50} (the drug concentration that reduces growth rate by half), which is a measure of antibiotic resistance. These two phenotypes correspond to the two extreme regimes of an environmental parameter, i.e. zero and highly inhibitory antibiotic concentrations. The function that describes the growth rate of a bacterium for antibiotic concentrations between these two extremes is called the dose-response curve or the inhibition curve (Regoes et al., 2004). When tradeoffs are present, the dose-response curves of different mutants must intersect as the concentration is varied (Gullberg et al., 2011). This is schematically shown in Figure 1. The dose-response curves of the wild type and the mutant intersect at point A, where the rank order of the two fitness values is swapped. The intersection point is known as the minimum selective concentration (MSC), and it defines the lower boundary of the mutant selection window (MSW) within which the resistance mutant has a selective advantage relative to the wild type (Khan et al., 2017; Alexander and MacLean, 2018).

When there are several possible mutations and multiple combinatorial mutants, a large number of such intersections occur as the concentration of the antibiotic increases. This leads to a succession of different fitness landscapes. Whenever the curves of two mutational neighbors (genotypes that differ by one mutation) intersect, there can be an alteration in the evolutionary trajectory towards resistance, whereby a forward (reverse) mutation now becomes more likely to fix in the population than the corresponding reverse (forward) mutation. These intersections change the ruggedness of landscapes and the accessibility of fitness maxima. In this way a rich and complex structure of selective constraints emerges in the MSW. To explore the evolutionary consequences of these constraints, we construct a theoretical model based on existing empirical studies as well as our own work on ciprofloxacin resistance in *E. coli*. Specifically, we address two fundamental questions: (i) How does the ruggedness of the fitness landscape vary as a function of antibiotic concentration? (ii) How accessible are the fitness optima as a function of antibiotic concentration?

We find that even when the null-fitness and resistance values of the mutations combine in a simple, multiplicative manner, the intersections of the curves produce a highly epistatic landscape at intermediate concentrations of the antibiotic. This is an example of a strong *G* × *G* × *E* interaction, where changes in the environmental variable drastically alter the interactions between genes. Despite the high ruggedness at intermediate concentrations, however, the topology of the landscapes is systematically different from the oft-studied random landscape models, such as the House-of-Cards model (Kauffman and Levin, 1987; Kingman, 1978), the Kauffman NK model (Kauffman and Weinberger, 1989; Hwang et al., 2018) or the Rough Mt. Fuji model (Neidhart et al., 2014). For example, most fitness maxima have similar numbers of mutations that depend logarithmically on the antibiotic concentration. Importantly, all the fitness maxima remain highly accessible through adaptive paths with sequentially fixing mutations. In particular, any fitness maximum (including the global maximum) is accessible from the wild type as long as the wild type is viable. As a consequence, the evolution of high levels of antibiotic resistance by multiple mutations (Hughes and Andersson, 2017; Wistrand-Yuen et al., 2018; Rehman et al., 2019) is much less constrained by the tradeoff-induced epistatic interactions than might have been expected on the basis of existing models.

## Results

### Mathematical model of tradeoff-induced fitness landscapes

The chief goal of this paper is to develop and explore a mathematical framework to study tradeoff-induced fitness landscapes. We consider a total of *L* mutations, each of which increases antibiotic resistance. A fitness landscape is a real-valued function defined on the set of 2^{L} genotypes made up of all combinations of these mutations. A genotype can be represented by a binary string of length *L*, where a 1 (0) at each position represents the presence (absence) of a specific mutation. Alternatively, any genotype is uniquely identified as a subset of the *L* mutations (the wild type is the null subset, i.e. the subset with no mutations).

In this paper, unless mentioned otherwise, we define the fitness *f* as the exponential growth rate of a microbial population. The fitness is a function of antibiotic concentration. This function has two parameters: the growth rate at zero concentration, which we refer to as the null-fitness and denote by *r*, and a measure of resistance such as IC_{50}, which we denote by *m*. Each single mutation is described by the pair (*r*_{i}, *m*_{i}), where *r*_{i} and *m*_{i} are the null-fitness and resistance values respectively of the *i*th single mutant. We further rescale our units such that for the wild type, *r* = 1 and *m* = 1. We consider mutations that come with a fitness-resistance tradeoff, i.e. a single mutant has an increased resistance (*m*_{i} > 1) and a reduced null-fitness (*r*_{i} < 1) compared to the wild type. To proceed we need to specify two things: (i) how the *r* and *m* values of the combinatorial mutants depend on those of the individual mutations, and (ii) how the fitness of the wild type and the mutants depends on antibiotic concentration, and in particular whether this dependence exhibits a pattern common to various mutant strains. To address these issues we take guidance from two empirical observations.

#### Scaling of dose-response curves

Marcusson et al. (2009) have constructed a series of *E. coli* strains with single, double and triple mutations conferring resistance to the fluoroquinolone antibiotic ciprofloxacin (CIP), which inhibits DNA replication (Drlica et al., 2009). In their study they measured MIC (minimum inhibitory concentration) values and null-fitness but did not report dose-response curves. Some of the present authors have recently shown that the dose-response curve of the wild-type *E. coli* (strain K-12 MG1655) in the presence of ciprofloxacin can be fitted reasonably well by a Hill function (Ojkic et al., 2019).

Here we expand on this work and determine dose-response curves for a range of single- and double-mutants with mutations restricted to five specific loci known to confer resistance to CIP (Marcusson et al., 2009) (see Materials and Methods). Figure 2A shows the measured curves for the wild type, the five single mutants, and eight double-mutant combinations. The genotypes are represented as binary strings, where a 1 or 0 at each position denotes respectively the presence or absence of a particular mutation. If we rescale the concentration *c* of CIP by the IC_{50} of the corresponding strain, *x* = *c*/IC_{50}, and the growth rate by the null-fitness *f*(0), the curves collapse to a single curve that can be approximated by the Hill function (1 + *x*^{4})^{−1} (Figure 2B). The precise shape of the curve is not important for further analysis. However, the data collapse suggests that we can assume that the dose-response curve of a mutant with (relative) null-fitness *r* and (relative) resistance *m* is

$$f(x) = r\,w\!\left(\frac{x}{m}\right), \tag{1}$$

i.e. it has the same shape as the wild-type curve *w* except for a rescaling of the fitness and concentration axes. Similar scaling relations have been reported previously by Wood et al. (2014) and Chevereau et al. (2015). A good biological understanding of the conditions underlying this feature is presently lacking, but it seems intuitively plausible that the shape *w*(*x*) would be robust to changes that do not qualitatively alter the basic physiology of growth and resistance.
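The data collapse described above can be illustrated with a minimal sketch. It assumes the Hill form (1 + *x*^{4})^{−1} for *w* and uses made-up (*r*, *m*) values; the function names `hill` and `growth_rate` and the strain parameters are hypothetical and not taken from the measurements.

```python
def hill(x, n=4):
    """Rescaled wild-type curve w(x) = 1 / (1 + x**n); w(0) = 1 and w(1) = 1/2."""
    return 1.0 / (1.0 + x**n)


def growth_rate(c, r, m, w=hill):
    """Scaling form f(c) = r * w(c / m) for null-fitness r and resistance (IC50) m."""
    return r * w(c / m)


# Hypothetical strains (not measured values): name -> (r, m) relative to the wild type.
strains = {"wild type": (1.0, 1.0), "mutant": (0.8, 5.0)}

for name, (r, m) in strains.items():
    # Rescale the concentration by the strain's IC50 and the growth rate by its null-fitness.
    collapsed = [growth_rate(x * m, r, m) / r for x in (0.0, 0.5, 1.0, 2.0)]
    print(name, [round(v, 4) for v in collapsed])
```

After rescaling, both strains trace out the same values of *w*(*x*), which is the content of the collapse in Figure 2B under these assumptions.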

#### Limited epistasis in *r* and *m*

An interesting recent finding reported by Knopp and Andersson (2018) is that chromosomal resistance mutations in *Salmonella typhimurium* mostly alter the null-fitness as well as the MIC of various antibiotics in a non-epistatic, multiplicative manner, i.e. if a particular mutation increases (decreases) the resistance (null-fitness) by a factor *k*_{1}, and another mutation does the same with a factor *k*_{2}, then the mutations jointly alter these phenotypes roughly by a factor of *k*_{1}*k*_{2} (with a few exceptions). We have done a similar comparison for the data on the null-fitness and MIC for *E. coli* strains in Marcusson et al. (2009). We have analyzed a subset of 4 mutations for which the complete data set for all combinatorial mutants is available from Marcusson et al. (2009). The data are shown in Table 1. Out of 11 multiple-mutants, only 2 show epistasis in *r* and 4 show epistasis in *m*. Moreover, in all cases where significant epistasis occurs it is negative, i.e. the effect of the multiple mutants is weaker than expected from the single mutation effects.

#### Formulation of the model

The above observations suggest a model where one assumes, as an approximation, that all the *r* and *m* values of individual mutations combine multiplicatively. A genotype with *n* mutations (*r*_{1}, *m*_{1}), (*r*_{2}, *m*_{2}), …, (*r*_{n}, *m*_{n}) has a null-fitness *r* and a resistance value *m* given by

$$r = \prod_{i=1}^{n} r_i, \qquad m = \prod_{i=1}^{n} m_i. \tag{2}$$

Moreover, the dose-response curves of the genotypes are taken to be of the scaling form (1), where the function *w*(*x*) does not depend on the genotype. As indicated before, and without any loss of generality, we choose units such that, for the wild type, *r* = 1 and *m* = 1. Therefore the dose-response curve of the wild type is *w*(*x*) with *w*(0) = 1, and choosing IC_{50} as a measure of resistance we have *w*(1) = 1/2. Henceforth, we refer to *x* simply as the concentration. We also recall that the condition of adaptational tradeoff means that *r*_{i} < 1 and *m*_{i} > 1 for all mutations.

If the *r*_{i} and *m*_{i} values combine non-epistatically, and if the shape of the dose-response curve is known, it is thus possible to construct the entire concentration-dependent landscape of size 2^{L} from just 2*L* measurements (of the *r*_{i} and *m*_{i} values of the single mutants) instead of the measurement of 2^{L} fitness values at every concentration. In practice we do not expect a complete lack of epistasis among all mutations of interest, and the dose-response curve is also an approximation obtained by fitting a curve through a finite set of fitness values known only with limited accuracy. However, the fitness rank order of genotypes, and related topographic features such as fitness peaks, are robust to a certain amount of error in fitness values (Crona et al., 2017), and our model may be used to construct these to a good approximation.

Lastly, we require that the dose-response curves of the wild type and a mutant intersect at most once, which implies that the equation *w*(*x*) = *r* *w*(*x*/*m*) with *r* < 1 and *m* > 1 has at most one solution. This then also implies that the curves of any genotype *σ* and a proper superset of it (i.e. a genotype which contains all the mutations in *σ* and some more) intersect at most once. This property holds for all functions that have been used to represent dose-response curves in the literature, such as the Hill function, the half-Gaussian or the exponential function, as well as for all concave functions with negative second derivative (see Materials and Methods for details).
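As a rough illustration of how the full landscape follows from the single-mutant values, the sketch below builds all 2^{L} fitness values at a given concentration from a list of (*r*_{i}, *m*_{i}) pairs. It assumes multiplicative combination and a Hill-shaped *w*; the function name `landscape` and the parameter values are hypothetical.

```python
from itertools import product
from math import prod


def hill(x, n=4):
    """Assumed wild-type curve w(x) = 1 / (1 + x**n)."""
    return 1.0 / (1.0 + x**n)


def landscape(singles, x, w=hill):
    """Return {genotype: fitness} for all 2**L genotypes at concentration x.

    singles is a list of (r_i, m_i) pairs; a genotype is a tuple of 0/1 flags.
    """
    fitness = {}
    for genotype in product((0, 1), repeat=len(singles)):
        present = [pair for bit, pair in zip(genotype, singles) if bit]
        r = prod(ri for ri, _ in present)  # multiplicative null-fitness
        m = prod(mi for _, mi in present)  # multiplicative resistance
        fitness[genotype] = r * w(x / m)
    return fitness


singles = [(0.9, 2.0), (0.8, 3.0)]  # hypothetical values with r_i < 1 < m_i
for x in (0.0, 1.0, 100.0):
    fit = landscape(singles, x)
    print(x, max(fit, key=fit.get))
```

Only the 2*L* numbers in `singles` and the shape of *w* enter the construction; every concentration-dependent landscape is derived from them.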

### Properties of tradeoff-induced fitness landscapes

To understand the evolutionary implications of our model, we first describe how the fitness landscape topography changes with the environmental parameter represented by the antibiotic concentration. Next we analyze the properties of mutational pathways leading to highly fit genotypes.

#### Intersection of curves and changing landscapes

We start with a simple example of *L* = 2 mutations and a Hill-shaped dose-response curve (Figure 3). At *x* = 0, the rank ordering is determined by the null-fitness. The wild type has maximal fitness, and the double mutant is less fit than the single mutants. As *x* increases, the fitness curves start to intersect, and each intersection switches the rank of two genotypes. In the present example we find a total of six intersections and therefore seven different rank orders across the full range of *x*. This is actually the maximum number of rank orders that can be found by scanning through *x* for *L* = 2 (see Materials and Methods). The final fitness rank order (to the right of the point F in Figure 3A) is the reverse of the original rank order at *x* = 0.
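The counting of intersections and rank orders can be reproduced in a small sketch. It assumes a Hill curve with exponent 4, for which the crossing point of two curves has a closed form, and uses hypothetical (*r*, *m*) values that are not those of Figure 3; the helper names are illustrative.

```python
from itertools import combinations

N = 4  # Hill exponent, as in the fitted curve (1 + x**4)**-1

# Hypothetical (r, m) values; the double mutant is the product of the singles.
genotypes = {"00": (1.0, 1.0), "10": (0.9, 2.0), "01": (0.8, 3.0), "11": (0.72, 6.0)}


def crossing(a, b, n=N):
    """Concentration at which two Hill-shaped curves r * w(x / m) intersect, or None."""
    (ra, ma), (rb, mb) = a, b
    num = rb - ra
    den = ra * mb ** (-n) - rb * ma ** (-n)
    if den == 0 or num / den <= 0:
        return None
    return (num / den) ** (1.0 / n)


def rank_order(x, n=N):
    """Genotypes sorted from fittest to least fit at concentration x."""
    fit = {g: r / (1.0 + (x / m) ** n) for g, (r, m) in genotypes.items()}
    return tuple(sorted(fit, key=fit.get, reverse=True))


crossings = []
for a, b in combinations(genotypes.values(), 2):
    x = crossing(a, b)
    if x is not None:
        crossings.append(x)
crossings.sort()

# Probe one concentration inside every interval delimited by the crossings.
probes = [crossings[0] / 2]
probes += [(lo + hi) / 2 for lo, hi in zip(crossings, crossings[1:])]
probes += [2 * crossings[-1]]
orders = [rank_order(x) for x in probes]
print(len(crossings), "intersections,", len(set(orders)), "rank orders")
```

For these parameters every pair of curves crosses exactly once, so the scan passes through seven rank orders, ending with the reverse of the order at *x* = 0.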

Figure 3B depicts the concentration-dependent fitness landscape of the 2-locus system in the form of fitness graphs. A fitness graph represents a fitness landscape as a directed graph, where neighboring nodes are genotypes that differ by one mutation, and arrows point toward the genotypes with higher fitness (de Visser et al., 2009; Crona et al., 2013). A fitness graph does not uniquely specify the rank order in the landscape (Crona et al., 2017). For example, the region BE has a single fitness graph, but three different rank orders in the segments BC, CD and DE.

Because selection drives an evolving population towards higher fitness, a fitness graph can be viewed as a roadmap of possible evolutionary trajectories. In particular, a fitness peak (marked in red in Figure 3B) is identified from the fitness graph as a node with only incoming arrows. Fitness graphs also contain the complete information about the occurrences of sign epistasis. Sign epistasis with respect to a certain mutation occurs when the mutation is beneficial in some backgrounds but deleterious in others (Weinreich et al., 2005; Poelwijk et al., 2007). It is easy to read off sign epistasis for a mutation from the fact that parallel arrows (i.e. arrows corresponding to the gain or loss of the same mutation) in a fitness graph point in opposite directions. For example, in the graph for the region AB there is sign epistasis in the first position, since the parallel arrows 00 → 10 and 01 ← 11 point in opposite directions. Notice that in the current example, we start with a smooth landscape at *x* = 0 (as seen in the fitness graph for OA), and the number of peaks and the degree of sign epistasis both reach a maximum in the intermediate region BE. This fitness graph displays reciprocal sign epistasis, which is a necessary condition for the existence of multiple fitness peaks (Poelwijk et al., 2011). Beyond the point E, the landscape starts to become smooth again, with only one fitness maximum and a lower degree of sign epistasis. In the last region FG, the landscape is smooth with only one peak (the double mutant 11) and no sign epistasis.

These qualitative properties generalize to larger landscapes. To show this, we consider a statistical ensemble of landscapes with *L* mutations, where the parameters *r*_{i}, *m*_{i} of single mutations are independently and identically distributed according to a joint probability density *P*(*r*, *m*). Figure 4 shows the result of numerical simulations of these landscapes for *L* = 16. The mean number of fitness peaks with *n* mutations reaches a maximum at *x*_{max}(*n*) where to leading order log *x*_{max}(*n*) ∼ *n*⟨log *m*⟩, independent of any further details of the system, as argued in Materials and Methods. The asymptotic expression works well already for *L* = 16 (see inset of Figure 4A). Figure 4B shows the mean number of mutations in a fitness peak. This is well approximated by the curve *n* = log *x*/⟨log *m*⟩, showing that the mean number of mutations in a fitness peak grows logarithmically in the concentration. This is consistent with what we would expect from the variation in the number of peaks with *n* mutations as shown in Figure 4A.
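A minimal sketch of such an ensemble simulation is given below. The sampling distribution is an assumption made for illustration (uniform *r*_{i} in [0.5, 1) and uniform *m*_{i} in [1.5, 3], chosen so that every mutation eventually becomes beneficial under a Hill curve with exponent 4); it is not the density *P*(*r*, *m*) used for Figure 4, and the smaller *L* keeps the run short.

```python
import random
from itertools import product
from math import prod


def hill(x, n=4):
    """Assumed wild-type curve w(x) = 1 / (1 + x**n)."""
    return 1.0 / (1.0 + x**n)


def count_peaks(singles, x, w=hill):
    """Number of genotypes that are fitter than all L single-mutation neighbors."""
    L = len(singles)
    fit = {}
    for g in product((0, 1), repeat=L):
        r = prod(ri for b, (ri, _) in zip(g, singles) if b)
        m = prod(mi for b, (_, mi) in zip(g, singles) if b)
        fit[g] = r * w(x / m)
    peaks = 0
    for g, f in fit.items():
        # Neighbors differ from g by flipping exactly one position.
        neighbors = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(L))
        if all(f > fit[nb] for nb in neighbors):
            peaks += 1
    return peaks


rng = random.Random(0)
L, samples = 8, 20
for x in (0.0, 1.0, 10.0, 100.0, 1e6):
    total = 0
    for _ in range(samples):
        # Hypothetical distribution of single-mutant parameters (an assumption).
        singles = [(rng.uniform(0.5, 1.0), rng.uniform(1.5, 3.0)) for _ in range(L)]
        total += count_peaks(singles, x)
    print(x, total / samples)
```

At *x* = 0 and at very large *x* the count is one (the wild type and the all-mutant, respectively), while intermediate concentrations can produce several peaks.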

As another indicator of ruggedness, we consider the number of backgrounds in which a mutation is beneficial as a function of *x*. At *x* = 0, any mutation is deleterious in all backgrounds, whereas at very large *x* it is beneficial in all backgrounds. Therefore there is no sign epistasis in either case. Sign epistasis is maximized when a mutation is beneficial in exactly 1/2 of all backgrounds. Figure 5 shows the mean number of backgrounds *n*_{b} (with *n* mutations each) in which a mutation is beneficial, for two different values of *n*. The curves have a sigmoidal shape, starting from zero and saturating at $\binom{L-1}{n}$, which is the total number of backgrounds with *n* mutations. The blue curve shows the mean total number of backgrounds (with any *n*) in which a mutation is beneficial, which has a similar shape.

Since every mutation in every background goes from being initially deleterious to eventually beneficial, there must be some *x* at which every mutation is beneficial in exactly half the backgrounds. The inset of Figure 5 shows that for backgrounds with *n* mutations, the average concentration at which a mutation is beneficial in 1/2 the backgrounds is given by log *x* ≃ *n* ⟨log *m*⟩, which is the same concentration at which the largest number of fitness peaks was found in Figure 4. A derivation of this relation is given in Materials and Methods. Similarly, when summed over all mutation numbers *n*, the fraction of beneficial backgrounds reaches 1/2 around the same concentration at which the total number of fitness peaks is maximal. Since the number of backgrounds is largest at *n* = *L*/2 for combinatorial reasons, this concentration is approximately given by log *x* ≃ (*L*/2) ⟨log *m*⟩.
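The background count discussed above can be sketched as follows, again assuming a Hill-shaped *w* and multiplicative parameters. The function name `beneficial_backgrounds` and the single-mutant values are hypothetical.

```python
from itertools import combinations
from math import comb, prod


def beneficial_backgrounds(singles, i, n, x, hill_n=4):
    """Count backgrounds with n mutations (excluding locus i) in which adding
    mutation i raises the fitness r * w(x / m) at concentration x."""
    def w(y):
        return 1.0 / (1.0 + y**hill_n)

    others = [k for k in range(len(singles)) if k != i]
    ri, mi = singles[i]
    count = 0
    for background in combinations(others, n):
        r = prod(singles[k][0] for k in background)
        m = prod(singles[k][1] for k in background)
        if ri * r * w(x / (m * mi)) > r * w(x / m):
            count += 1
    return count


singles = [(0.9, 2.0), (0.8, 3.0), (0.85, 2.5), (0.7, 4.0)]  # hypothetical values
for x in (0.0, 1.0, 2.0, 5.0, 1e6):
    print(x, beneficial_backgrounds(singles, 0, 2, x), "of", comb(len(singles) - 1, 2))
```

The count rises from zero to the binomial coefficient as *x* grows, because each background switches from deleterious to beneficial at a single concentration.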

#### Accessibility of fitness peaks

Having shown that tradeoff-induced fitness landscapes display a large number of fitness peaks at intermediate concentrations, we now ask how these peaks affect the evolutionary dynamics. We base the discussion on the concept of evolutionary accessibility, which effectively assumes a regime of weak mutation and strong selection (Gillespie, 1984). In this regime the evolutionary trajectory consists of a series of fixation events of beneficial single-step mutations represented by a directed path in the fitness graph of the landscape (Weinreich et al., 2005, 2006; Franke et al., 2011). We say that a genotype is *accessible* from another genotype if a directed path exists from the initial to the final genotype.

The accessibility of peaks in a fitness landscape is determined by the rank ordering of the genotypes. We now show that the rank orders of tradeoff-induced fitness landscapes are constrained in a way that gives rise to unusually high accessibility. Consider two distinct sets of mutations *A*_{i} and *A*_{j} that can occur on the genetic background *W*, and the four genotypes *W*, *WA*_{i}, *WA*_{j} and *WA*_{i}*A*_{j}, where a concatenation of symbols represents the genotype which contains all the mutations referred to by the symbols. The **ordering condition** (derived in Materials and Methods) says that whenever *W* is the fittest among these four genotypes, *WA*_{i}*A*_{j} must be the least fit, and whenever *WA*_{i}*A*_{j} is the fittest, *W* must be the least fit. For the case of two single mutations this situation is illustrated by the fitness graphs in Figure 3B, where the background genotype *W* = 00 is the fittest in the first segment OA and the genotype *WA*_{i}*A*_{j} = 11 is the fittest in the last segment FG. The ordering condition has the immediate consequence that the fittest genotype is *always* accessible from the background genotype *W*. If the fittest genotype is one of the single mutants (segments AB, BE and EF), then it is of course accessible. If it is the double mutant *WA*_{i}*A*_{j} (segment FG), then the background genotype must be the least fit genotype (from the ordering condition), and therefore *WA*_{i} and *WA*_{j} should be fitter than *W*. Then *WA*_{i}*A*_{j} is accessible from the wild type through the path *W* → *WA*_{i} → *WA*_{i}*A*_{j} and the path *W* → *WA*_{j} → *WA*_{i}*A*_{j}.
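Accessibility in the sense used here can be checked numerically with a breadth-first search over fitness-increasing single-mutation steps. The sketch below is illustrative only: it assumes a Hill-shaped *w*, multiplicative parameters, and a hypothetical uniform distribution for the single-mutant values, and the helper names are not from the paper.

```python
import random
from collections import deque
from itertools import product
from math import prod


def build_landscape(singles, x, n=4):
    """Fitness r * w(x / m), with Hill-shaped w, for every genotype (tuple of 0/1)."""
    fit = {}
    for g in product((0, 1), repeat=len(singles)):
        r = prod(ri for b, (ri, _) in zip(g, singles) if b)
        m = prod(mi for b, (_, mi) in zip(g, singles) if b)
        fit[g] = r / (1.0 + (x / m) ** n)
    return fit


def neighbors(g):
    """All genotypes that differ from g by the gain or loss of one mutation."""
    return [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]


def peaks(fit):
    return [g for g in fit if all(fit[g] > fit[nb] for nb in neighbors(g))]


def accessible(fit, start, target):
    """True if fitness-increasing single-mutation steps lead from start to target."""
    seen, queue = {start}, deque([start])
    while queue:
        g = queue.popleft()
        if g == target:
            return True
        for nb in neighbors(g):
            if nb not in seen and fit[nb] > fit[g]:
                seen.add(nb)
                queue.append(nb)
    return False


rng = random.Random(1)
# Hypothetical single-mutant parameters with r_i < 1 < m_i (an assumption).
singles = [(rng.uniform(0.5, 1.0), rng.uniform(1.5, 3.0)) for _ in range(6)]
wild_type = (0,) * len(singles)
for x in (0.5, 2.0, 5.0, 20.0, 100.0):
    fit = build_landscape(singles, x)
    found = peaks(fit)
    print(x, len(found), all(accessible(fit, wild_type, p) for p in found))
```

The search allows both gains and losses of mutations, so it tests accessibility by any adaptive path rather than only by direct paths.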

To fully exploit the consequences of the ordering property we need to introduce some notation. Let *σ* be a genotype with *n* mutations. We define a *subset* of *σ* as a genotype with *l* mutations, *l* ≤ *n*, which are all contained in *σ* as well. Likewise, a *superset* of *σ* is a genotype with *l* mutations, *l* ≥ *n*, that contains all the mutations in *σ*. With this, the ordering condition can be seen to imply that the superset of a fitness peak is accessible from its own supersets. For example, if *W* is the fittest genotype, then *WA*_{i} is a superset of it, and because of the ordering condition, *WA*_{i} must be fitter than its superset *WA*_{i}*A*_{j}, and therefore accessible from it. Similarly, it is easy to show that the subset of a fitness peak is accessible from its own subsets. This property can be generalized and constitutes our main result on accessibility of fitness peaks.

### Accessibility property

*Any genotype* Σ *that is a superset of a local fitness peak σ is accessible from all the superset genotypes of* Σ. *Similarly, any genotype* Σ′ *that is a subset of a local fitness peak σ is accessible from all the subset genotypes of* Σ′.

The proof is given in Materials and Methods. Three particularly important consequences are:

1. Any fitness peak is accessible from all its subset and superset genotypes.
2. **Any fitness peak is accessible from the wild type.** This is because the wild type is a subset of every genotype.
3. For the same reason, when the wild type is a fitness peak, it is accessible from every genotype, and is therefore also the only fitness peak in the landscape. The same holds for the all-mutant, which is a superset of every genotype.

These properties are illustrated by the fitness graph in Figure 6. We assume that the landscape has (at least) two peaks at the genotypes 1001 (marked in red) and 0111 (marked in blue). The colored arrows point towards mutational neighbors with higher fitness and are enforced by the accessibility property. The edges without arrowheads are not constrained by the accessibility property and the corresponding arrows (which are not shown in the figure) could point in either direction.

Consider the genotype 0111 (marked in blue). It is accessible from all its subsets, namely 0000, 0100, 0010, 0001, 0110, 0101 and 0011, following the upward pointing blue arrows. These subsets are in turn accessible from their subsets. For example, 0011 is accessible from all its subsets: 0000, 0010, and 0001. The fitness peak is also accessible from its superset 1111. The same property holds for the other fitness peak. The subsets or supersets may access the fitness peaks using other (unmarked) paths as well, which would include one or more of the undirected lines in conjunction with some of the arrows. Moreover, other genotypes, which are neither supersets nor subsets, may also access these fitness peaks through paths that incorporate some of the undirected edges.

A fitness peak together with its subset and superset genotypes defines a sub-landscape with remarkable properties. It is a smooth landscape with only one peak which is accessible from any genotype via all direct paths, i.e. paths where the number of mutations monotonically increases or decreases. For example, the fitness peak 1001 is accessible from the all-mutant 1111 by the two direct paths 1111 → 1101 → 1001 and 1111 → 1011 → 1001. Likewise, the peak 0111 is accessible from its subset 0001 via the paths 0001 → 0101 → 0111 and 0001 → 0011 → 0111. In general, a peak with *n* mutations is accessible from a subset genotype with *m* mutations by (*n* − *m*)! direct paths, and from a superset genotype with *m* mutations by (*m* − *n*)! direct paths. This gives a lower bound on the total number of paths by which a fitness peak is accessible from a subset or superset genotype.
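The factorial count of direct paths can be checked by brute force on a small example. The sketch below assumes a Hill-shaped *w* and multiplicative parameters, represents genotypes as sets of mutation indices, and uses hypothetical single-mutant values; it enumerates every ordering of the missing mutations between the wild type and each peak.

```python
from itertools import combinations, permutations
from math import factorial, prod


def fitness(genotype, singles, x, n=4):
    """Fitness r * w(x / m) of a genotype given as a set of mutation indices."""
    r = prod(singles[i][0] for i in genotype)
    m = prod(singles[i][1] for i in genotype)
    return r / (1.0 + (x / m) ** n)


def is_peak(genotype, singles, x):
    """True if toggling any single mutation lowers the fitness."""
    f = fitness(genotype, singles, x)
    return all(f > fitness(genotype ^ {i}, singles, x) for i in range(len(singles)))


def increasing_direct_paths(subset, peak, singles, x):
    """Count direct paths subset -> peak along which fitness rises at every step."""
    count = 0
    for order in permutations(peak - subset):
        current, ok = set(subset), True
        for i in order:
            nxt = current | {i}
            if fitness(nxt, singles, x) <= fitness(current, singles, x):
                ok = False
                break
            current = nxt
        count += ok
    return count


singles = [(0.9, 2.0), (0.8, 3.0), (0.85, 2.5), (0.7, 4.0)]  # hypothetical values
x = 5.0
for k in range(len(singles) + 1):
    for combo in combinations(range(len(singles)), k):
        peak = frozenset(combo)
        if is_peak(peak, singles, x):
            paths = increasing_direct_paths(frozenset(), peak, singles, x)
            print(sorted(peak), paths, factorial(len(peak)))
```

Each printed line compares the number of fitness-increasing direct paths from the wild type with the factorial of the number of mutations in the peak.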

Importantly, the accessibility property formulated above holds under more general conditions than stipulated in the model. We show in Materials and Methods that it holds whenever the null fitness and resistance values of the mutations, *r* and *m*, do not show *positive* epistasis. This is a weaker requirement than our original assumption of a strict lack of epistasis in these two phenotypes. In this context it should be noted that the rank orderings forbidden by the ordering condition all show positive epistasis for the fitness values, whereas all the allowed orderings can be constructed without positive epistasis. Therefore, any landscape where positive epistasis in the fitness is absent will also display the accessibility property. However, whereas the lack of positive epistasis is a sufficient condition, it is not necessary. In particular, our model does allow for cases of positive epistasis in the fitness values.

#### Reachability of the fittest and the most resistant genotype

The preceding analyses have shown that within the mutant selection window, where mutants with higher fitness than the wild type exist, every fitness peak is accessible from the wild type. This includes in particular the fittest genotype at a given concentration. However, in general there will be many peaks in the fitness landscape, and it is not guaranteed that evolution will reach the fittest genotype. One can ask for the probability that the fittest genotype is actually accessed under the evolutionary dynamics, which we call its reachability. We assume that the dynamics is in the strong selection weak mutation (SSWM) regime, and the population is large enough such that the fixation probability of a mutant with selection coefficient *s* is 1 − *e*^{−2s} for *s* > 0, and 0 for *s* ≤ 0 (Gillespie, 1984). In our setting the selection coefficient is *s* = *f*_{1}/*f*_{0} − 1, where *f*_{1} is the growth rate of a mutant appearing in a population of cells with growth rate *f*_{0}.
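A minimal sketch of how reachability could be estimated under SSWM dynamics is given below. Several ingredients are assumptions made for illustration: the selection coefficient is taken as *s* = *f*_{1}/*f*_{0} − 1, all single mutations (gains and losses) are taken to arise at equal rates, *w* is Hill-shaped, and the single-mutant parameters are drawn from a hypothetical uniform distribution rather than the one used for Figure 7.

```python
import math
import random
from itertools import product


def build_landscape(singles, x, n=4):
    """Fitness r * w(x / m), with Hill-shaped w, for every genotype (tuple of 0/1)."""
    fit = {}
    for g in product((0, 1), repeat=len(singles)):
        r = math.prod(ri for b, (ri, _) in zip(g, singles) if b)
        m = math.prod(mi for b, (_, mi) in zip(g, singles) if b)
        fit[g] = r / (1.0 + (x / m) ** n)
    return fit


def sswm_walk(fit, start, rng):
    """Adaptive walk: each step fixes one beneficial neighbor with weight 1 - exp(-2s)."""
    g = start
    while True:
        nbs = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]
        # Assumed selection coefficient s = f1 / f0 - 1; deleterious steps never fix.
        weights = [
            -math.expm1(-2.0 * (fit[nb] / fit[g] - 1.0)) if fit[nb] > fit[g] else 0.0
            for nb in nbs
        ]
        if sum(weights) == 0.0:
            return g  # no beneficial neighbor: g is a local peak
        g = rng.choices(nbs, weights=weights)[0]


def reachability(fit, start, runs, rng):
    """Fraction of walks from start that end at the globally fittest genotype."""
    fittest = max(fit, key=fit.get)
    return sum(sswm_walk(fit, start, rng) == fittest for _ in range(runs)) / runs


rng = random.Random(2)
# Hypothetical single-mutant parameters with r_i < 1 < m_i (an assumption).
singles = [(rng.uniform(0.5, 1.0), rng.uniform(1.5, 3.0)) for _ in range(6)]
wild_type = (0,) * len(singles)
for x in (0.0, 2.0, 10.0, 1e6):
    print(x, reachability(build_landscape(singles, x), wild_type, 500, rng))
```

The estimate equals 1 whenever the landscape has a single peak, and can drop below 1 at intermediate concentrations where walks may end on other peaks.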

Figure 7 shows the numerically obtained reachability for *L* = 10, averaged over the distribution *P*(*r*, *m*) given in Eq. (8). The reachability of the highest peak is 1 at very low and very high concentrations, since there is only one peak, the wild type or the all-mutant, at these extremes. The reachability is lower at intermediate concentrations, where there are multiple peaks, all of which are accessible from the wild type. The dashed blue line is the mean of the reciprocal of the total number of fitness peaks, and is therefore the mean reachability of fitness peaks. The reachability of the highest peak follows the qualitative behavior of the mean reachability, but remains higher than the mean reachability everywhere. The green curve is the reachability of the most resistant genotype, i.e. the all-mutant. It is extremely low at low and moderate concentrations and grows steeply and saturates quickly at a very large concentration. The all-mutant genotype is less-than-average reachable everywhere except at very high concentration, when it is the only fitness peak and accessible from every other genotype.

We have compared the reachability to two other widely studied landscape models. One is the House-of-Cards (HoC) model (Kauffman and Levin, 1987; Kingman, 1978), where each genotype is independently assigned a fitness value drawn from a continuous distribution. The reachability is found to be around 0.018, an order of magnitude smaller than the lowest reachability seen in the tradeoff-induced landscape. The mean number of fitness maxima in the HoC landscape is $2^L/(L+1)$, which in this case is approximately 93.1, much higher than the maximum mean number of peaks in the tradeoff-induced landscape (inset of Figure 7). We would therefore naturally expect a smaller fraction of adaptive walks to terminate at the fittest peak. A more illuminating comparison is with the NK model (Kauffman and Weinberger, 1989; Hwang et al., 2018). Here, once again, *L* = 10, and the mutations are divided into two blocks of 5 mutations each. As per the usual definition of the model, the fitness of a genotype is the sum over the contributions of each of the 10 mutations, and the contribution of each mutation depends only on the state of the block to which it belongs. The fitness contribution of each mutation for any state of the block is an independent random number. The mean number of fitness maxima here is ≃ 28.44 (Perelson and Macken, 1995; Schmiegelt and Krug, 2014), which is comparable to the maximum mean number in the tradeoff-induced landscapes (see inset of Figure 7). Nonetheless, the reachability of the fittest peak (dotted pink line) is found to be nearly 4 times smaller than the lowest reachability in our landscape. We found that in a fraction of about 0.64 of the landscapes, the fittest maximum is not reached in any of 32000 dynamical runs, indicating the absence of an accessible path in most of these cases (Schmiegelt and Krug, 2014; Hwang et al., 2018). In contrast, an evolutionary path always exists to any fitness peak in the tradeoff-induced landscapes, as we saw in the previous subsection. This endows the tradeoff-induced landscapes with the unusual property of being highly rugged and at the same time having a much higher evolutionary reachability of the global fitness maximum compared to other models with similar ruggedness.
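The peak counts quoted for the two reference models can be checked with a short sketch. It assumes the standard HoC result that a genotype on the *L*-dimensional hypercube is a local maximum with probability 1/(*L*+1), and treats the block model as independent HoC landscapes whose peak counts multiply. The function names and the Monte Carlo sample size are illustrative choices, not taken from the paper.

```python
import random

def hoc_mean_maxima(L):
    """Exact mean number of local maxima of a House-of-Cards landscape."""
    return 2**L / (L + 1)

def count_maxima_hoc(L, rng):
    """Count local maxima in one randomly drawn HoC landscape."""
    fitness = [rng.random() for _ in range(2**L)]
    count = 0
    for g in range(2**L):
        # Neighbors of g differ from it in exactly one of the L bits.
        if all(fitness[g] > fitness[g ^ (1 << i)] for i in range(L)):
            count += 1
    return count

def block_model_mean_maxima(L, blocks):
    """Mean number of maxima for L loci split into equal, independent HoC blocks."""
    return hoc_mean_maxima(L // blocks) ** blocks

rng = random.Random(0)
L = 10
samples = [count_maxima_hoc(L, rng) for _ in range(200)]
mc_estimate = sum(samples) / len(samples)

print(hoc_mean_maxima(L))             # exact HoC value, about 93.1
print(mc_estimate)                    # Monte Carlo estimate of the same quantity
print(block_model_mean_maxima(L, 2))  # two blocks of 5 loci, about 28.44
```

The product form for the block model works because a single mutation changes the state of only one block, so a genotype is a local maximum exactly when each block sits at a local maximum of its own sub-landscape.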

## Discussion

Fitness landscapes depend on the environment, and gene-gene interactions can be modified by the environment. Systematic studies of such *G* × *G* × *E* interactions are rare, but they are clearly of relevance to scenarios such as the evolution of antibiotic resistance, where the antibiotic concentration can vary substantially in space and time. In this paper we have explored the structure of such landscapes in the presence of tradeoffs between fitness and resistance. We summarize the main findings of our work.

We have shown experimental evidence that the dose-response curves of various mutant strains of *E. coli* exposed to the antibiotic ciprofloxacin have the same shape, except for a rescaling of the fitness and concentration values. If this shape is known, the fitness of a strain can be estimated at any antibiotic concentration simply by measuring its null-fitness and IC_{50} (or MIC). This makes it possible to construct empirical fitness landscapes at any antibiotic concentration from a limited set of data.

Under the assumptions of our model, the degree of epistasis, particularly sign epistasis, is low at zero and high antibiotic concentrations, but high in the intermediate concentration regime. The number of local fitness peaks scales exponentially in the number of mutations at these intermediate concentrations. Epistasis is often discussed as a property intrinsic to mutations and their genetic backgrounds, with limited consideration of environmental parameters. But in the landscapes studied here, the environmental parameter is of paramount importance, since changes in it can dramatically alter gene-gene interactions.

The expected number of mutations at a fitness peak increases logarithmically with the antibiotic concentration. This implies that, at a given concentration, the highly fit genotypes that make up the fitness peaks carry an *optimal number of mutations* that arises from the tradeoff between fitness cost and resistance.

Despite the high ruggedness, the landscape displays strong non-random patterns. A rank ordering condition between sets of mutations holds at all concentrations. A remarkable and unexpected consequence of this is that any fitness peak is evolutionarily accessible from the wild type. This is contrary to the common intuition about highly rugged landscapes, where one expects any genotype to have access to only a fraction of the fitness peaks and adaptive walks to terminate after a small number of steps.

It is well known from experimental studies of antimicrobial resistance evolution that highly resistant genotypes often require multiple mutations which can be acquired along different evolutionary trajectories. Epistatic interactions constrain these trajectories and are generally expected to impede the evolution of high resistance. We find that strong and complex epistatic interactions inevitably arise in the mutant selection window, but at the same time the evolution of the most resistant genotype (the identity of which changes with concentration) remains facile and can occur along many different pathways.

All of these conclusions follow from three basic assumptions that are readily generalizable beyond the context of antimicrobial resistance evolution: the existence of tradeoffs between two *marginal phenotypes* that govern the adaptation at extreme values of an environmental parameter; the scaling property of the shape of the tradeoff function; and the condition of limited epistasis for the marginal phenotypes. How generally these assumptions are valid is a matter of empirical investigation. We have shown that they hold for certain cases, and the interesting evolutionary implications of our results indicate that more empirical research in this direction will be useful.

In the case of antimicrobial resistance, there can be fitness compensatory mutations (Durão et al., 2018; Levin et al., 2000) that do not exhibit any adaptational tradeoffs. These mutations are generally found in a population in the later stages of the evolution of antibiotic resistance, which implies that they emerge in a genetic background of mutations with adaptational tradeoffs. An understanding of tradeoff-induced landscapes is therefore a prerequisite for predicting the emergence of compensatory mutations.

In the formulation of our model we have assumed for convenience that the marginal phenotypes combine multiplicatively, but this assumption is in fact not necessary. As shown in Materials and Methods, our key results on accessibility only require the absence of positive epistasis. These results therefore hold without exception for the combinatorially complete data set in Table 1, where epistasis is either absent or negative. More generally, our analysis remains valid in the presence of the commonly observed pattern of diminishing returns epistasis among beneficial mutations (Chou et al., 2011; Schoustra et al., 2016; Wünsche et al., 2017). In addition, we expect our results to hold approximately when there is a small degree of epistasis (positive or negative) in *r* and *m*, but we do not explore that question quantitatively in this paper.

We conclude with some possible directions for future work. Our model provides a principled framework for predicting how microbial fitness landscapes vary across different antibiotic concentrations. This could be exploited to describe situations where the antibiotic concentration varies on a time scale comparable to the evolution of resistance, either due to the degradation of the drug or by an externally imposed treatment protocol (Marrec and Bitbol, 2018). From the broader perspective of evolutionary systems with adaptational tradeoffs mediated by an environmental parameter, our study makes the important conceptual point that it is impossible to have non-epistatic fitness landscapes for all environments. Using the terminology of Gorter et al. (2016), the tradeoffs enforce reranking *G* × *E* interactions which in turn, as we have shown, induce sign-epistatic *G* × *G* interactions at intermediate values of the environmental parameter. Notably, this general conclusion does not depend on the scaling property of the tradeoff function. It would nevertheless be of great interest to identify instances of scaling for other types of adaptational tradeoffs, in which case the detailed predictions of our model could be applied as well.

## Materials and Methods

### Experiments

#### Bacterial strains

We used strains from Marcusson et al. (2009) (courtesy of Douglas Huseby and Diarmaid Hughes). The strains are isogenic derivatives of MG1655, a K12 strain of the bacterium *E. coli*, with specific point mutations or gene deletions in five different loci: *gyrA:S83L, gyrA:D87N, parC:S80I*, Δ*marR*, and Δ*acrR*. There are 32 possible combinations of these alleles, but we only used the wild type, single mutants (5 strains) and double mutants (8 strains of 10 possible combinations): LM179 (00000), LM378 (10000), LM534 (01000), LM792 (00100), LM202 (00010), LM351 (00001), LM625 (11000), LM862 (10100), LM421 (10010), LM647 (10001), LM1124 (01100), LM538 (01010), LM592 (01001), LM367 (00011). A binary sequence after the strain’s name represents the presence/absence of a particular mutated allele (order as in the above list of genetic alterations).

#### Growth media and antibiotics

LB growth medium was prepared according to Miller’s formulation (10 g tryptone, 5 g yeast extract, 10 g NaCl per litre). The pH was adjusted to 7.2 with NaOH, and the medium was autoclaved at 121°C for 20 min. Ciprofloxacin (CIP) solutions were prepared from a frozen stock (10 mg/ml ciprofloxacin hydrochloride, pharmaceutical grade, AppliChem, Darmstadt, in sterile, ultra-pure water) by diluting into LB to achieve the desired concentrations.

#### Dose-response curves

We incubated bacteria in 96-well clear flat bottom micro-plates (Corning Costar) inside a plate reader (BMG LABTECH FLUOstar Optima with a stacker) starting from two different initial cell densities (half a plate for each), and measured the optical density (OD) of each culture every 2-5 min to obtain growth curves. Plates were prepared automatically using a BMG LABTECH CLARIOstar plate reader equipped with two injectors connected to a bottle containing LB and a bottle with a solution of CIP in LB. The injectors were programmed to create different concentrations of CIP in each column of the 96-well plate. The injected volumes of the CIP solution were 0, 20, 25, 31, 39, 49, 62, 78, 98, 124, 155, 195 *μ*l, and an appropriate volume of LB was added to bring the total volume to 195 *μ*l per well. Since different strains had MICs spanning almost two decades of CIP concentrations, we used a different maximum concentration of the CIP solution for each strain (approximately 1.5-2 times the expected MIC). Bacteria were diluted from a thawed frozen stock 10^{3} and 10^{4} times in PBS (phosphate buffered saline), and 5 *μ*l of the suspension was added to each well (10^{3} dilution to rows A-D, 10^{4} dilution to rows E-H). We used one strain per plate and up to 4 plates per strain (typically 1-2). After adding the suspension of bacteria to each well, the plates were immediately sealed with a transparent film to prevent evaporation, and put into a stacker (37°C, no shaking), from which they were periodically fed into the FLUOstar Optima plate reader (37°C, orbital shaking at 200 rpm for 10 s prior to OD measurement). We then used the time-shift method to obtain exponential growth rates for each strain and different concentrations of CIP; see Ojkic et al. (2019) for further details.

### Mathematical Methods

#### Rank orders and fitness graphs

The total number of possible rank orders with *L* mutations is $(2^L)!$, which is 24 for *L* = 2. Not all these rank orders, however, can be realized as one scans through *x*. Since any two curves intersect at most once, the maximum number of distinct rank orders that can be reached is the rank order at *x* = 0 plus the total number of possible intersections, which is $\binom{2^L}{2} = 2^{L-1}(2^L - 1)$. Thus the upper bound on the number of rank orders found by scanning through *x* is $2^{L-1}(2^L - 1) + 1$, which is smaller than $(2^L)!$ for *L* ≥ 2.

It is also instructive to determine the number of fitness graphs that can be found by varying *x* for a system with *L* mutations. This can be computed as follows: At *x* = 0 every mutation is deleterious, and every mutational neighbor with one fewer mutation is fitter; but due to the tradeoff condition, at sufficiently large *x* every mutation is beneficial and any mutational neighbor with one fewer mutation is less fit. In order for this reversal of fitness order to happen, the dose-response curves of any two mutational neighbors must intersect at some *x*, and each such intersection changes the fitness graph. Therefore, the number of changes of the fitness graph is equal to the number of distinct pairs of mutational neighbors, which is $2^{L-1}L$, and the number of distinct fitness graphs encountered is $2^{L-1}L + 1$. For *L* = 2, this number is 5, as seen in the example in the main text.
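The counting arguments above are easy to tabulate. The following sketch evaluates the three quantities for small *L*; the function names are illustrative and not part of the original text.

```python
from math import comb, factorial

def num_rank_orders(L):
    """All conceivable rank orders of the 2**L genotypes."""
    return factorial(2**L)

def max_rank_orders_scanned(L):
    """Upper bound on rank orders met while scanning x: one per pairwise
    curve intersection, plus the rank order at x = 0."""
    return comb(2**L, 2) + 1

def num_fitness_graphs(L):
    """Fitness graphs met while scanning x: one change per pair of
    mutational neighbors (hypercube edge), plus the graph at x = 0."""
    return 2**(L - 1) * L + 1

for L in range(1, 5):
    print(L, num_rank_orders(L), max_rank_orders_scanned(L), num_fitness_graphs(L))
```

For *L* = 2 this gives 24 possible rank orders, at most 7 of which can be visited, and 5 fitness graphs.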

#### Condition for two dose-response curves to intersect at most once

Consider two DR curves characterized by (*r, m*) and (*r*′, *m*′), where *r* < *r*′ and *m* > *m*′. We need to show that for the commonly observed cases, the curves $r\,w(x/m)$ and $r'\,w(x/m')$ intersect at most once. First, notice that it is sufficient to prove this for the case *r*′ = 1, *m*′ = 1, because any rescaling of the *x* and *w* axes does not alter the number or ordering of intersection points. Therefore we require *r* < 1 and *m* > 1.

Let us consider the case where the dose-response curve is of the form of a Hill function, i.e., $w(x) = 1/(1 + x^a)$, with *a* > 0. The intersection of curves happens at the solution of $w(x) = r\,w(x/m)$, which we denote by *x*^{*}(*r, m*). In this case the solution is given by

$$x^*(r, m) = m \left[ \frac{1 - r}{r m^a - 1} \right]^{1/a},$$

which is positive and unique if *rm*^{a} > 1; otherwise no solution with *x*^{*} > 0 exists. It is similarly easy to show that at most one intersection point exists for exponentials, stretched exponentials, and half-Gaussians.
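As a numerical check of the Hill-function case, the sketch below assumes the normalized form $w(x) = 1/(1+x^a)$ and solves $w(x) = r\,w(x/m)$ in closed form, $x^* = m\,[(1-r)/(r m^a - 1)]^{1/a}$. The parameter values are arbitrary illustrations, not measured data.

```python
def hill(x, a):
    """Assumed normalized Hill-type dose-response curve w(x) = 1 / (1 + x**a)."""
    return 1.0 / (1.0 + x**a)

def x_star_hill(r, m, a):
    """Concentration where w(x) and r * w(x / m) cross, or None if they never do."""
    if not (0.0 < r < 1.0 and m > 1.0):
        raise ValueError("expects a tradeoff: 0 < r < 1 and m > 1")
    if r * m**a <= 1.0:
        return None  # the mutant curve stays below the wild type for all x > 0
    return m * ((1.0 - r) / (r * m**a - 1.0)) ** (1.0 / a)

r, m, a = 0.8, 3.0, 2.0  # illustrative null-fitness, resistance, Hill exponent
xs = x_star_hill(r, m, a)
print(xs)                                # crossing concentration
print(hill(xs, a), r * hill(xs / m, a))  # both curves agree at the crossing
```

Below the crossing the wild type is fitter, and above it the mutant is fitter, which is the reversal that the tradeoff requires.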

The property also holds for any concave dose-response curve with *w*″(*x*) < 0. We prove this as follows. Any intersection point *x*^{*} is the solution of

$$F(x) = r,$$

where $F(x) = w(x)/w(x/m)$. We will show that *F*(*x*) is monotonic and therefore the above equation has at most one solution. We have

$$F'(x) = \frac{w'(x)\,w(x/m) - \frac{1}{m}\,w(x)\,w'(x/m)}{w(x/m)^2},$$

and *F*′(*x*) has the same sign as the numerator $\mathcal{N}(x) = w'(x)\,w(x/m) - \frac{1}{m}\,w(x)\,w'(x/m)$. Since *w*(*x*) is a decreasing function and *m* > 1, $w(x/m) > w(x)$. When *w*″(*x*) < 0, we also have $w'(x/m) > w'(x)$. Since *w*′(*x*) < 0, this implies $|w'(x)|\,w(x/m) > \frac{1}{m}\,|w'(x/m)|\,w(x)$, and 𝒩(*x*) < 0. Therefore *F*(*x*) is monotonically decreasing.
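The monotonicity argument can be illustrated numerically. The sketch below uses a hypothetical concave curve, $w(x) = 1 - x^2$ on $0 \le x < 1$, chosen only because it satisfies $w'' < 0$, and evaluates the ratio $F(x) = w(x)/w(x/m)$ on a grid.

```python
def w_concave(x):
    """Hypothetical concave, decreasing dose-response curve on 0 <= x < 1."""
    return 1.0 - x**2

def F(x, m):
    """Ratio whose level set F(x) = r gives the intersection points."""
    return w_concave(x) / w_concave(x / m)

m = 2.5  # illustrative resistance factor, m > 1
grid = [i / 1000 for i in range(1000)]
values = [F(x, m) for x in grid]

# F starts at 1 and decreases strictly, so F(x) = r has at most one solution.
is_decreasing = all(later < earlier for earlier, later in zip(values, values[1:]))
print(values[0], values[-1], is_decreasing)
```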

#### Proof of the accessibility property

To derive the ordering condition, let us start with the simplest case of two single mutations *A*_{i}, *A*_{j} occurring on the wild type background. There are correspondingly four different genotypes *W, WA*_{i}, *WA*_{j}, *WA*_{i}*A*_{j}, which are listed in decreasing order of fitness at *x* = 0. Let the intersection of the DR curves of two genotypes *σ*_{1} and *σ*_{2} occur at $x_{\sigma_1,\sigma_2}$. Then $x_{W,WA_j}$ is given by the solution *x*^{*}(*r*_{j}, *m*_{j}) of

$$w(x) = r_j\, w(x/m_j),$$

and $x_{WA_i,WA_iA_j}$ is given by the solution of

$$r_i\, w(x/m_i) = r_i r_j\, w\!\left(\frac{x}{m_i m_j}\right).$$

This last equation can be re-written as $w(x') = r_j\,w(x'/m_j)$, where $x' = x/m_i$. Comparing this with the first equation above, we have

$$x_{WA_i,WA_iA_j} = m_i\, x_{W,WA_j} > x_{W,WA_j}. \tag{3}$$

This equation tells us that whenever the double mutant is fitter than one of the single mutants, the wild type must be less fit than the *other* single mutant. Consequently, when the double mutant is fitter than both the single mutants, the WT must be less fit than both the single mutants. In other words, the number of single mutants fitter than the wild type cannot be less than the number of single mutants less fit than the double mutant. This is the ordering condition given in the main text. Any ordering that violates this condition is a *forbidden ordering*. For greater clarity, we list all the possible forbidden orderings (up to interchange of indices *i* and *j*).
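The forbidden orderings can also be enumerated by brute force. The sketch below applies the ordering condition as stated in words above (the number of single mutants fitter than the wild type cannot be less than the number of single mutants less fit than the double mutant) to all 24 rank orders of the four genotypes. The genotype labels are plain strings chosen for illustration.

```python
from itertools import permutations

GENOTYPES = ("W", "WAi", "WAj", "WAiAj")
SINGLES = ("WAi", "WAj")

def violates_ordering_condition(order):
    """`order` lists the four genotypes from fittest to least fit."""
    rank = {g: k for k, g in enumerate(order)}  # smaller rank means fitter
    singles_above_wt = sum(rank[s] < rank["W"] for s in SINGLES)
    singles_below_double = sum(rank["WAiAj"] < rank[s] for s in SINGLES)
    return singles_above_wt < singles_below_double

forbidden = [p for p in permutations(GENOTYPES) if violates_ordering_condition(p)]
for p in forbidden:
    print(" > ".join(p))
print(len(forbidden), "of 24 rank orders are forbidden")
```

Orderings that differ only by interchanging *i* and *j* appear as separate entries in this enumeration.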

Although we showed this for two single mutations in the wild type background, the same arguments hold for any two sets of mutations in any background, since the succession of orderings is independent of the rescalings of the fitness and concentration axes. To put it more precisely, *W, A*_{i} and *A*_{j} are any three non-overlapping sets of mutations, where *A*_{i} and *A*_{j} are non-empty sets.

Next we use this to prove the accessibility property. Let *σ* be a local fitness peak with *n* mutations. It is sufficient to prove that (i) any superset of *σ* with *m* or fewer mutations is accessible from all its own supersets with *m* or fewer mutations, for all *m* ≥ *n* (the statement follows from the case *m* = *L*); and that (ii) any subset of *σ* with *m*′ or more mutations is accessible from any of its own subsets with *m*′ or more mutations, for all *m*′ ≤ *n* (the statement corresponds to *m*′ = 0). Here *m* and *m*′ count mutations and do not denote resistance values. We prove this by induction.

Firstly, we notice that the case *m* = *n* is trivial, since *σ* is of course accessible from itself. For the case of supersets, our base case is *m* = *n* + 1, and the assertion above holds because *σ* is a local fitness peak, and therefore accessible from all its supersets with *n* + 1 mutations, which are of course accessible from themselves.

Now we prove the induction step. Assume that all supersets of *σ* that have *m* or fewer mutations (where *m* ≥ *n*) are accessible from all their supersets with *m* or fewer mutations. Consider a superset Σ of *σ* with *m* mutations, and denote it by Σ = *σA*, where *A* is the set of mutations in Σ not present in *σ*. By assumption, *σ* is accessible from Σ. In the following, we use the notation *σ*_{1} > *σ*_{2} to indicate that a genotype *σ*_{1} is fitter than a genotype *σ*_{2} (we use the “<” and “=” signs in a similar way). Therefore, we have *σ* > Σ = *σA*.

Now consider any superset of Σ with *m* + 1 mutations, where the additional mutation not contained in Σ is denoted *B*. Then this superset can be denoted by Σ*B* = *σAB*. We must have *σ* > *σB* since *σ* is a local fitness peak. We now have the relation *σ* > *σA*, *σB*. Therefore we must have *σAB* < *σA*, *σB*, for otherwise we violate the ordering condition. Now since Σ*B* = *σAB* < *σA* = Σ, Σ must be accessible from Σ*B*, proving that any superset with *m* mutations is accessible from any of its supersets with *m* + 1 mutations. This completes the proof of the induction step.

The proof for the case of subsets is essentially the same, utilizing the symmetry between the wild type and the double mutant in the ordering condition.
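The accessibility property can be probed numerically on a small random landscape. The sketch below is an illustration under stated assumptions rather than the simulation used in the paper: it assumes a Hill-type curve $w(x) = 1/(1+x^a)$, multiplicative null-fitness and resistance, and arbitrary uniform ranges for the mutational effects. For every local peak it searches for a path from the wild type that adds the peak's mutations one at a time with strictly increasing fitness.

```python
import random

def hill(x, a=2.0):
    """Assumed Hill-type dose-response curve."""
    return 1.0 / (1.0 + x**a)

def make_fitness(rs, ms, x, a=2.0):
    """Fitness of every genotype (encoded as a bitmask) at concentration x."""
    L = len(rs)
    fitness = []
    for g in range(2**L):
        R, M = 1.0, 1.0
        for i in range(L):
            if (g >> i) & 1:
                R *= rs[i]  # null-fitness combines multiplicatively
                M *= ms[i]  # resistance combines multiplicatively
        fitness.append(R * hill(x / M, a))
    return fitness

def local_peaks(fitness, L):
    """Genotypes fitter than all L single-mutation neighbors."""
    return [g for g in range(2**L)
            if all(fitness[g] > fitness[g ^ (1 << i)] for i in range(L))]

def accessible_from_wild_type(fitness, L, target):
    """True if `target` is reachable from genotype 0 by adding only its own
    mutations, one at a time, with fitness increasing at every step."""
    reachable, frontier = {0}, [0]
    while frontier:
        g = frontier.pop()
        for i in range(L):
            bit = 1 << i
            if (target & bit) and not (g & bit):
                h = g | bit
                if fitness[h] > fitness[g] and h not in reachable:
                    reachable.add(h)
                    frontier.append(h)
    return target in reachable

rng = random.Random(1)
L, a = 6, 2.0
rs = [rng.uniform(0.5, 0.95) for _ in range(L)]  # hypothetical fitness costs
ms = [rng.uniform(2.0, 6.0) for _ in range(L)]   # hypothetical resistance gains

all_ok = True
for x in [0.1, 0.5, 1.0, 3.0, 10.0, 30.0, 100.0, 1000.0]:
    f = make_fitness(rs, ms, x, a)
    peaks = local_peaks(f, L)
    ok = all(accessible_from_wild_type(f, L, p) for p in peaks)
    all_ok = all_ok and ok
    print(x, len(peaks), ok)
```

Restricting the search to paths that only add mutations present in the peak mirrors the subset part of the induction argument, which moves from a genotype to one of its supersets at every step.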

The accessibility property follows entirely from the ordering condition, and hence any landscape that obeys the ordering condition will obey the theorem. The ordering condition follows from the inequality $x_{W,WA_j} < x_{WA_i,WA_iA_j}$ between the intersection points of the corresponding pairs of DR curves, as obtained in (3). However, this same inequality holds under more general conditions. To see this, let us define the null-fitness of the double mutant *WA*_{i}*A*_{j} as *r*_{ij}, and the resistance of the double mutant as *m*_{ij}. The dose-response curves of *W* and *WA*_{j} intersect at $x_{W,WA_j} = x^*(r_j, m_j)$, whereas the curves for *WA*_{i} and *WA*_{i}*A*_{j} intersect at

$$x_{WA_i,WA_iA_j} = m_i\, x^*\!\left(\frac{r_{ij}}{r_i}, \frac{m_{ij}}{m_i}\right).$$

Now it is easy to show that *x*^{*}(*r, m*) is a decreasing function of both *r* and *m*. Therefore $x_{W,WA_j} < x_{WA_i,WA_iA_j}$ holds if *r*_{ij} ≤ *r*_{i}*r*_{j} and *m*_{ij} ≤ *m*_{i}*m*_{j}.

#### Number of local fitness peaks

When dealing with complex fitness landscapes with parameters that can vary across species and environments, a useful strategy is to model the fitness effects as random variables that are chosen from a probability distribution (Kauffman and Levin, 1987; Szendro et al., 2013; Hwang et al., 2018). In the limit of large system size *L*, many properties emerge that are independent of the details of the system. In practice, even relatively small system sizes are often approximated well by results obtained in the asymptotic limit.

The mean number of peaks with *n* mutations in the tradeoff-induced landscapes is

$$K_n(x) = \binom{L}{n}\, Q_n(x),$$

where $\binom{L}{n}$ is the total number of genotypes with *n* mutations, and $Q_n(x)$ is the probability that a genotype with *n* mutations is a fitness maximum at antibiotic concentration *x*. Then the mean total number of peaks at *x* is $\sum_n K_n(x)$. Let the resistance of a genotype *σ* be $M = \prod_{i=1}^{n} m_i$, and likewise its null-fitness be $R = \prod_{i=1}^{n} r_i$. The genotype *σ* is a local fitness maximum if it is fitter than all its subsets with *n* − 1 mutations and all its supersets with *n* + 1 mutations.

To find the concentration at which the curves of *σ* and its neighboring genotypes intersect, we start with the simplest case of the dose-response curves of the wild type and a single mutant (*r, m*). These curves intersect at the solution *x*^{*}(*r, m*) of $w(x) = r\,w(x/m)$, which is a decreasing function of *r* and *m*. The wild type is fitter than the single mutant when *x* < *x*^{*}(*r, m*). Now the intersection of the DR curves of a genotype *σ* with *n* mutations and a subset with *n* − 1 mutations that lacks the mutation (*r*_{i}, *m*_{i}) occurs at the solution of

$$\frac{R}{r_i}\, w\!\left(\frac{m_i x}{M}\right) = R\, w\!\left(\frac{x}{M}\right),$$

which is read off as $\frac{M}{m_i}\,x^*(r_i, m_i)$. Likewise, the intersection of the DR curves of *σ* and a superset with *n* + 1 mutations that contains the additional mutation (*r*_{j}, *m*_{j}) occurs at $M\,x^*(r_j, m_j)$. Therefore *σ* is a fitness maximum if

$$\frac{M}{m_i}\, x^*(r_i, m_i) < x < M\, x^*(r_j, m_j)$$

for all *i* and *j* with 1 ≤ *i* ≤ *n* and *n* < *j* ≤ *L*, where the mutations present in *σ* are labeled 1 to *n*. Alternatively,

$$\log x > \log M - \log m_i + \log x^*(r_i, m_i) \quad \text{and} \quad \log x < \log M + \log x^*(r_j, m_j).$$
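The peak condition defines a window of concentrations in which a given genotype is a local maximum: *x* must exceed $(M/m_i)\,x^*(r_i, m_i)$ for every mutation the genotype carries and stay below $M\,x^*(r_j, m_j)$ for every mutation it lacks. The sketch below computes that window and compares it with a direct fitness comparison against all neighbors. It assumes a Hill-type curve with $r m^a > 1$ for every mutation; the three (*r*, *m*) pairs are made-up illustrative values.

```python
from math import inf, prod

def hill(x, a):
    """Assumed Hill-type dose-response curve w(x) = 1 / (1 + x**a)."""
    return 1.0 / (1.0 + x**a)

def x_star(r, m, a):
    """Crossing point of w(x) and r * w(x / m); assumes r * m**a > 1."""
    return m * ((1.0 - r) / (r * m**a - 1.0)) ** (1.0 / a)

def peak_window(muts, present, a):
    """Interval (lo, hi) of x in which the genotype carrying the mutations
    indexed by the set `present` is a local fitness maximum."""
    M = prod(muts[i][1] for i in present)
    absent = [j for j in range(len(muts)) if j not in present]
    lo = max((M / muts[i][1] * x_star(muts[i][0], muts[i][1], a) for i in present),
             default=0.0)
    hi = min((M * x_star(muts[j][0], muts[j][1], a) for j in absent), default=inf)
    return lo, hi

def fitness(muts, present, x, a):
    """Fitness R * w(x / M) with multiplicative R and M."""
    R = prod(muts[i][0] for i in present)
    M = prod(muts[i][1] for i in present)
    return R * hill(x / M, a)

def is_peak(muts, present, x, a):
    """Direct check: fitter than every genotype one mutation away."""
    f = fitness(muts, present, x, a)
    return all(f > fitness(muts, present ^ {k}, x, a) for k in range(len(muts)))

muts = [(0.9, 2.0), (0.8, 3.0), (0.7, 5.0)]  # hypothetical (r, m) pairs
sigma = {0, 1}                                # genotype carrying the first two
a = 2.0
lo, hi = peak_window(muts, sigma, a)
x_mid = (lo * hi) ** 0.5
print(lo, hi, is_peak(muts, sigma, x_mid, a))
```

If the lower bound exceeds the upper bound, the window is empty and the genotype is never a peak.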

Let us consider the regime where *L*, *n* ≫ 1. Then log *M* ∼ *n*⟨log *m*⟩; if log *x* is smaller than *O*(*n*), it is clear that the second inequality is almost certainly satisfied whereas the probability of the first inequality is vanishingly small. Both the probabilities are finite if log *x* ∼ *n*⟨log *m*⟩. Thus the probability of *σ* being a fitness peak is maximized when log *x* = log(*M*) + *η*, where *η* ∼ *O*(1) and depends on the details of the distribution *P*(*r, m*). Consequently, the mean number of fitness peaks with *n* mutations is maximal at *x*_{max}(*n*) where to leading order log *x*_{max}(*n*) ∼ *n*⟨log *m*⟩, independent of any further details of the system.

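The total number of genotypes with *n* mutations is $\binom{L}{n}$, and $\log \binom{L}{n} \simeq L\,H(\rho)$, where $\rho = n/L$, and

$$H(\rho) = -\rho \log \rho - (1-\rho)\log(1-\rho). \tag{7}$$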
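The quality of the leading-order estimate for the number of genotypes with *n* mutations can be checked directly. The sketch below compares $\log\binom{L}{n}/L$ with the binary entropy $-\rho\log\rho - (1-\rho)\log(1-\rho)$ at $\rho = n/L$, using natural logarithms; the choice $\rho = 0.3$ is arbitrary.

```python
from math import lgamma, log

def log_binom(L, n):
    """Natural logarithm of the binomial coefficient, via log-gamma."""
    return lgamma(L + 1) - lgamma(n + 1) - lgamma(L - n + 1)

def H(rho):
    """Binary entropy in nats."""
    return -rho * log(rho) - (1.0 - rho) * log(1.0 - rho)

errors = {}
for L in (10, 100, 1000, 10000):
    n = 3 * L // 10  # rho = 0.3
    errors[L] = abs(log_binom(L, n) / L - H(n / L))
    print(L, log_binom(L, n) / L, H(n / L))
```

The remaining gap shrinks roughly like $\log L / L$, which is why the estimate is only a leading-order statement for small systems.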

The mean number of fitness maxima can be found by multiplying this with *Q*_{n}. One may expect *Q*_{n} to be exponentially small in *L*, since a total of *L* inequalities (as indicated in (6)) need to be satisfied. However, this is complicated by the fact that the events of the individual inequalities being satisfied are not independent. The correlations between the inequalities depend on the distribution *P*(*r, m*) and the dose-response curve. If the correlations are sufficiently weak, one might still expect to find an exponential scaling for large *L*. To leading order $\binom{L}{n}$ is itself exponential in *L*, and if the probability that a genotype is a fitness peak is exponentially small in *L*, we expect the mean number of peaks *K*_{n} to be exponential in *L* as well. This is supported by the scaling shown in the inset of Figure 4A.

For the simulation results shown in the main text we chose the joint distribution *P*(*r, m*) given in Eq. (8). The conditional distribution *P*(*m*|*r*) is a shifted gamma distribution. The shift ensures that the curves of a background genotype and a mutant intersect.

#### Sign epistasis in the limit of large *L* and *n*


Sign epistasis with respect to a certain mutation occurs when the mutation is beneficial in one background but deleterious in another. To understand sign epistasis, we ask for the number of backgrounds *n*_{b} in which a mutation is beneficial at concentration *x*. If one considers only those backgrounds that have *n* mutations, then *n*_{b} would depend both on *n* and *x*.

In a statistical ensemble of landscapes, one may compute the probability *P*_{b} that a mutation is beneficial in a background with *n* mutations, and of course the mean of *n*_{b} is *P*_{b} times the number of such backgrounds. In the limit of large *L* and *n*, *P*_{b} exhibits some universal properties to leading order. When log *x* > *n*⟨log *m*⟩, we are in the regime of high concentration relative to *n*, and we expect a mutation to be beneficial. We find that to leading order *P*_{b}(*ρ, x*) = 1, with corrections that are exponentially small in *n*. When log *x* < *n*⟨log *m*⟩, we are at concentrations that are too low to favor additional mutations, and *P*_{b} is exponentially small in *n*. When log *x* = *n*⟨log *m*⟩, we are at the threshold concentration where a new mutation becomes beneficial. Here we find that $P_b \simeq 1/2$. For large *L* we therefore expect a steep transition from 0 to 1 as the concentration crosses the threshold value (see inset of Figure ??).

Consider a mutation (*r, m*) in a background with *n* mutations (*r*_{1}, *m*_{1}), (*r*_{2}, *m*_{2}), …, (*r*_{n}, *m*_{n}). The mutation is beneficial in this background if

$$x > x^*(r, m) \prod_{i=1}^{n} m_i.$$

Taking logarithms, we have

$$\log x > \log x^*(r, m) + \sum_{i=1}^{n} \log m_i.$$

Define $\mu_i = \log m_i$ and *z* = − log *x*^{*}(*r, m*). Then the above inequality becomes

$$z > \sum_{i=1}^{n} \mu_i - \log x.$$

Let the distribution of *z* be *P*(*z*), and let $C_z(y) = \int_y^{\infty} P(z)\,dz$ be the probability that *z* exceeds *y*. Define the random variable $\omega = \frac{1}{n}\left(\sum_{i=1}^{n} \mu_i - \log x\right)$, and denote its distribution *P*(*ω*). Then the probability that a mutation is beneficial in a background with *n* mutations is

$$P_b = \int d\omega\, P(\omega)\, C_z(n\omega). \tag{12}$$

The mean number of backgrounds with *n* mutations in which a mutation is beneficial is *P*_{b} times the total number of such backgrounds. Note that $\langle\omega\rangle = \langle\mu\rangle - (\log x)/n$, where *μ* = log *m*. When *n* ≫ 1, *C*_{z}(*nω*) ≃ 1 for *ω* < 0 and *C*_{z}(*nω*) ≃ 0 for *ω* > 0, with a sharp transition from 1 to 0 that happens within a region of width ∼ *O*(1/*n*) of the origin. Also for large *n*, *P*(*ω*) is sharply peaked around ⟨*ω*⟩ over a region of width $O(1/\sqrt{n})$.

When ⟨*ω*⟩ < 0, *C*_{z}(*nω*) ≃ 1 over this entire region, as observed before. Thus to leading order, *P*_{b}(*ρ, x*) = 1. The mean number of backgrounds in which a mutation is beneficial is then, to leading order, the total number of backgrounds with *n* mutations, which scales as

$$\log \binom{L}{n} \simeq L\,H(\rho), \tag{14}$$

where *H*(*ρ*) is defined in (7). Therefore

$$\log n_b \simeq L\,H(\rho) \tag{15}$$

to leading order.

When ⟨*ω*⟩ > 0, the dominant contribution to the integral in (12) comes from *ω* ≤ 0, since *C*_{z}(*nω*) quickly drops from 1 to zero for *ω* > 0. Further, since *C*_{z}(*nω*) ≃ 1 for *ω* < 0 (except for a region of width *O*(1/*n*) around *ω* = 0, as observed before), we can approximate *P*_{b}(*ρ, x*) simply by the probability that *ω* < 0. Then $\log P_b \simeq -n\,I\!\left(-(\log x)/n\right)$, where *I* is the large deviation function of −*μ*, and

$$\log n_b \simeq L\,H(\rho) - n\,I\!\left(-\frac{\log x}{n}\right).$$

This implies that *n*_{b} is reduced by a factor that is exponentially small in *L* compared to (15), and therefore the fraction of backgrounds in which a mutation is beneficial is very small.

Finally, when ⟨*ω*⟩ = 0, i.e., log *x* = *n*⟨log *m*⟩, *P*(*ω*) is centered at the origin and decays over a width $O(1/\sqrt{n})$. For *ω* > 0, *C*_{z}(*nω*) is 0 except over a much smaller width *O*(1/*n*) to the right of the origin, whereas for *ω* ≤ 0, it is 1 except for a small region of width *O*(1/*n*) left of the origin. Thus the dominant contribution to the integral in (12) comes from *ω* ≤ 0, and as before, *P*_{b} can be approximated by the probability that *ω* ≤ 0. Due to the central limit theorem, *P*(*ω*) is approximately Gaussian and therefore symmetric around *ω* = 0, and therefore $P_b \simeq 1/2$. Consequently, the mean of *n*_{b} is 1/2 times the total number of backgrounds given by (14). This proves that the concentration where the mutation is beneficial in half of the backgrounds is given by ⟨*ω*⟩ = 0, or log *x* = *n*⟨log *m*⟩, for large *L* and *n*.

### Epistasis in null-fitness and MIC for *E. coli* in the presence of ciprofloxacin

Primary data shown in Table 1 were obtained from Marcusson et al. (2009). In the third and fifth columns, the errors in log(*x*) are calculated as |Δ*x*|/*x*, where |Δ*x*| are the standard errors as calculated from the standard deviations reported in the paper. The errors in columns four and six were estimated from a sum over the mutations present in the combinatorial mutants. The detectable cases of epistasis are marked in blue. Negative epistasis is found in all these cases. Also, all the cases with epistasis correspond to two or more mutations that affect the same chemical pathways.

## Acknowledgements

We thank Douglas Huseby and Diarmaid Hughes for providing us with the *E. coli* strains of Marcusson et al. (2009), and Tobias Bollenbach, Michael Brockhurst and Kristina Crona for useful comments. The work of SGD and JK was supported by DFG within CRC 1310 *Predictability in Evolution*, and JK acknowledges the kind hospitality of the Scottish Universities Physics Alliance and the Higgs Center for Theoretical Physics during the completion of the project. SOLD and RJA acknowledge the support of the ERC Consolidator Grant 682237 EVOSTRUC.