## Abstract

Evolved resistance to one antibiotic may be associated with “collateral” sensitivity to other drugs. Here we provide an extensive quantitative characterization of collateral effects in *E. faecalis*, a gram-positive opportunistic pathogen. By combining parallel experimental evolution with high-throughput dose-response measurements, we measure phenotypic profiles of collateral sensitivity and resistance for a total of 900 mutant-drug combinations. We find that collateral effects are pervasive but difficult to predict, as independent populations selected by the same drug can exhibit qualitatively different profiles of collateral sensitivity. Despite this apparent complexity, however, the sensitivity profiles cluster into statistically similar groups characterized by selecting drugs with similar mechanisms. Using a simple mathematical framework, we leverage these phenotypic profiles to design optimal drug policies that assign a unique drug to every possible resistance profile. Stochastic simulations reveal that these optimal drug policies outperform intuitive cycling protocols by maintaining long-term sensitivity at the expense of short-term periods of high resistance. Finally, we performed whole-genome sequencing on single isolates from each population and identified candidate genes statistically associated with increased sensitivity to particular drugs.

## I. INTRODUCTION

The rapid emergence of drug resistance is an urgent threat to effective treatments for bacterial infections, cancers and many viral infections^{1–6}. Unfortunately, the development of novel drugs is a long and arduous process, underscoring the need for alternative approaches to forestall resistance evolution. Recent work has highlighted the promise of evolution-based strategies for optimizing and prolonging the efficacy of established drugs, including optimal dose scheduling^{7–9}, antimicrobial stewardship^{10,11}, drug cycling^{12–14}, consideration of spatial dynamics^{15–17}, cooperative dynamics^{18–21}, or phenotypic resistance^{22–24}, and judicious use of drug combinations^{25–32}. In a similar spirit, a number of recent studies have suggested exploiting collateral sensitivity as a means for slowing or even reversing antibiotic resistance^{33–38}. Collateral evolution occurs when a population evolves resistance to a target drug while simultaneously exhibiting increased sensitivity or resistance to a different drug. From an evolutionary perspective, collateral effects are reminiscent of the trade-offs inherent when organisms are required to simultaneously adapt to different tasks, an optimization that is often surprisingly simple because it takes place on a low-dimensional phenotypic space^{39,40}. If similarly tractable dynamics occur in the evolution of multi-drug resistance, systematic optimization of drug deployment has the promise to mitigate the effects of resistance.

Indeed, recent studies in bacteria have shown that the cyclic^{38,41–43} or simultaneous^{44,45} deployment of antibiotics with mutual collateral sensitivity can sometimes slow the emergence of resistance. Unfortunately, collateral profiles have also been shown to be highly heterogeneous^{46,47} and often not repeatable^{48}, potentially complicating the design of successful collateral sensitivity cycles. The picture that emerges is enticing, but complex; while collateral effects offer a promising new dimension for improving therapies, the design of drug cycling protocols is an extremely difficult problem that requires optimization at multiple scales, from dynamics within individual hosts to those that occur on the scale of entire hospitals or communities. Despite many promising recent advances, it is not yet clear how to optimally harness collateral evolutionary effects to design drug policies, even in simplified laboratory scenarios. The problem is challenging for many reasons, including the stochastic nature of evolutionary trajectories, the high-dimensionality of the phenotype space, and–at an empirical level–the relative paucity of data regarding the prevalence and repeatability of collateral sensitivity profiles in different species.

In this work, we take a step towards answering these questions by investigating how drug sequences might be used to slow resistance in a simplified, single-species bacterial population. We show that even in this idealized scenario, intuitive cycling protocols–for example, sequential application of two drugs exhibiting reciprocal collateral sensitivity–are expected to fail over long time periods, though mathematically optimized policies can maintain long-term drug sensitivity at the price of transient periods of high resistance. As a model system, we focus on *E. faecalis*, a gram-positive opportunistic bacterial pathogen. *E. faecalis* are found in the gastrointestinal tracts of humans and are implicated in numerous clinical infections, ranging from urinary tract infections to infective endocarditis, where they are responsible for between 5 and 15 percent of cases^{49–53}. For our purposes, *E. faecalis* is a convenient model species because it rapidly evolves resistance to antibiotics in the laboratory^{54,55}, and fully sequenced reference genomes are available^{56}.

By combining parallel experimental evolution of *E. faecalis* with high-throughput dose-response measurements, we provide collateral sensitivity and resistance profiles for 60 strains evolved to 15 different antibiotics, yielding a total of 900 mutant-drug combinations. We find that collateral resistance and collateral sensitivity are pervasive in drug-resistant mutants, though patterns of collateral effects can vary significantly, even for mutants evolved to the same drug. Notably, however, the sensitivity profiles cluster into groups characterized by selecting drugs from similar drug classes, indicating the existence of large scale statistical structure in the collateral sensitivity profiles. To exploit that structure, we develop a simple mathematical framework based on a Markov Decision Process (MDP) to identify optimal antibiotic policies that minimize resistance. These policies yield drug sequences that can be tuned to optimize either short-term or long-term evolutionary outcomes, and they codify the trade-offs between instantaneous drug efficacy and delayed evolutionary consequences. Finally, we performed whole-genome sequencing of single isolates from all 60 populations. Our results reveal mutations in numerous genes previously linked with drug resistance or collateral sensitivity and also identify new candidate genes associated with increased antibiotic sensitivity.

## II. RESULTS

### A. Collateral effects are pervasive and heterogeneous

To investigate collateral drug effects in *E. faecalis*, we exposed four independent populations of strain V583 to increasing concentrations of a single drug over 8 days (approximately 60 generations) using serial passage laboratory evolution (Figure 1A, Methods). We repeated this laboratory evolution for a total of 15 antibiotics spanning a wide range of classes and mechanisms of action (Table 1). Many, but not all, of these drugs are clinically relevant for the treatment of enterococcal infections. As a control, we also evolved 4 independent populations of the ancestral V583 strain to media (BHI) alone. After approximately 60 generations, we isolated a single colony (hereafter termed a “mutant”) from each population and measured its response to all 15 drugs using replicate dose-response experiments (Figure 1B). To quantify resistance, we estimated the half-maximal inhibitory concentration (IC_{50}) for each mutant-drug combination using nonlinear least squares fitting to a Hill-like dose response function (Methods; see Figure S1 for examples). A mutant strain was deemed collaterally sensitive (resistant) to an antibiotic if its IC_{50} decreased (increased) by more than 3*σ*_{WT}, where *σ*_{WT} is the uncertainty (standard error across replicates) of the IC_{50} measured in the ancestral strain. As a measure of collateral resistance / sensitivity, we then calculate *C* ≡ log_{2} (IC_{50,Mut}/IC_{50,WT}), the (log-scaled) fold change in IC_{50} of each mutant relative to wild-type (WT); values of *C* > 0 indicate collateral resistance, while values of *C* < 0 indicate collateral sensitivity (Figure 1C). For each mutant, we refer to the set of *C* values (one for each testing drug) as its collateral sensitivity profile.
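As a minimal illustration of the two quantities defined above, the following Python sketch computes the log-scaled fold change *C* and applies the 3*σ*_{WT} criterion. The function names and the numerical IC_{50} values are hypothetical, and the sketch assumes the threshold is applied to the difference in IC_{50} on a linear scale; it is not the analysis code used for the measurements.

```python
import math

def collateral_score(ic50_mut, ic50_wt):
    """Log2 fold change in IC50 of a mutant relative to wild type (the quantity C)."""
    return math.log2(ic50_mut / ic50_wt)

def classify(ic50_mut, ic50_wt, sigma_wt):
    """Label a mutant-drug pair using a 3*sigma_WT threshold on the IC50 difference.

    Assumption: the threshold is applied to the linear-scale difference in IC50.
    """
    delta = ic50_mut - ic50_wt
    if delta > 3 * sigma_wt:
        return "resistant"
    if delta < -3 * sigma_wt:
        return "sensitive"
    return "no change"

# Hypothetical IC50 values (arbitrary units), for illustration only.
IC50_WT, SIGMA_WT = 2.0, 0.1
EXAMPLES = {"mutant_a": 8.0, "mutant_b": 0.5, "mutant_c": 2.2}

for name, ic50 in EXAMPLES.items():
    score = collateral_score(ic50, IC50_WT)
    label = classify(ic50, IC50_WT, SIGMA_WT)
    print(f"{name}: C = {score:+.2f} ({label})")
```

Positive scores correspond to collateral resistance and negative scores to collateral sensitivity; a small shift that stays within 3*σ*_{WT} is reported as no change even though its score is nonzero.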

Our results indicate that collateral effects–including sensitivity–are pervasive, with approximately 73 percent (612/840) of all (collateral) drug-mutant combinations exhibiting a statistically significant change in IC_{50}. Importantly, none of the four V583 strains propagated in BHI alone showed any collateral effects. The mutants in our study exhibit collateral sensitivity to a median of 4 drugs, with only 3 of the 60 mutants (5 percent) exhibiting no collateral sensitivity at all; by contrast, mutants selected by ceftriaxone (CRO) and fosfomycin (FOF) exhibit particularly widespread collateral sensitivity. Collateral resistance is similarly prevalent, with only 2 strains failing to exhibit collateral resistance to at least one drug. Somewhat surprisingly, 56 of 60 mutants exhibit collateral resistance to at least one drug from a different class (e.g. all mutants evolved to ciprofloxacin (CIP), a DNA synthesis inhibitor, show increased resistance to ceftriaxone, an inhibitor of cell wall synthesis).

The measured collateral effects can be quite large. For example, we measure 8 instances of collateral sensitivity where the IC_{50} decreases by 16 fold or more. We also observe a strong, repeatable collateral sensitivity to rifampicin (RIF) when mutants were selected by inhibitors of cell wall synthesis, an effect that–to our knowledge–has not been reported elsewhere. More typically, however, collateral effects are smaller than the direct effects to the selecting drug, with 46 percent (384/840) exhibiting more than a factor 2 change in IC_{50} and only 7 percent (61/840) exhibiting more than a factor 4 change.

### B. Variation in collateral profiles is correlated with resistance to selecting drug

Our results indicate that collateral profiles can vary significantly even when mutants are evolved in parallel to the same drug (Figure 1C). For example, all 4 mutants selected by daptomycin exhibit high-level resistance to the selecting drug, but replicates 1 and 4 exhibit collateral resistance to ceftriaxone (CRO), while replicate 2 exhibits collateral sensitivity and replicate 3 shows little effect (Figure 1C, right panel).

To quantify the variation between replicates selected by the same drug, we considered the collateral profile of each mutant (i.e. a column of the collateral sensitivity matrix in Figure 1) as a vector in 15-dimensional drug resistance space. Then, for each set of replicates, we defined the variability $V \equiv \frac{1}{m}\sum_{i=1}^{m} d_i$, where *m* = 4 is the number of replicates and *d*_{i} is the Euclidean distance between mutant *i* and the centroid formed by all vectors corresponding to a given selecting drug (Figure 2A). Variability differs for different selecting drugs, with daptomycin and rifampicin showing the largest variability and nitrofurantoin the smallest (Figure 2B). We find that the observed variability is significantly correlated with average resistance to the selecting drug (Figure S2). This correlation persists even when one removes contributions to variability from the selecting drug itself (Figure 2C), indicating that collateral (rather than direct) effects underlie the correlation. We do note, however, that selection by spectinomycin represents a notable exception to this trend. These results suggest that the repeatability of collateral effects is sensitive to the drug used for selection. As a result, certain drugs may be more appropriate for establishing robust antibiotic cycling profiles.
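The variability measure can be sketched in a few lines of Python. The sketch assumes that variability is the mean Euclidean distance of the *m* replicate profiles from their centroid, consistent with the definitions of *m* and *d*_{i} above; the function name and the toy two-dimensional profiles are hypothetical and chosen only so the result is easy to check by hand.

```python
import math

def variability(profiles):
    """Mean Euclidean distance of replicate profiles from their centroid.

    profiles: list of equal-length sequences, one per replicate mutant.
    """
    m = len(profiles)
    n_drugs = len(profiles[0])
    # Centroid: component-wise mean over the replicates.
    centroid = [sum(p[k] for p in profiles) / m for k in range(n_drugs)]
    distances = [math.dist(p, centroid) for p in profiles]
    return sum(distances) / m

# Hypothetical 2-drug profiles for four replicates (real profiles have 15 entries).
toy_profiles = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(variability(toy_profiles))  # each replicate lies sqrt(2) from the centroid (1, 1)
```

Identical replicates give a variability of zero, so larger values indicate less repeatable collateral profiles for that selecting drug.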

### C. Collateral resistance to daptomycin appears frequently under selection by different drugs

Daptomycin is a lipopeptide antibiotic sometimes used as a last line of defense against gram-positive bacterial infections, including vancomycin resistant enterococci (VRE). While daptomycin resistance was initially believed to be rare^{57}, it has become increasingly documented in clinical settings^{58}. Recent work in a related enterococcal species has shown that collateral resistance to daptomycin can arise from serial exposure to chlorhexidine, a common antiseptic^{59}, but less is known about collateral daptomycin resistance following exposure to other antimicrobial agents. Surprisingly, our results indicate that daptomycin resistance is common when populations are selected by other antibiotics, with 64 percent of all evolved lineages displaying collateral daptomycin resistance and only 11 percent displaying collateral sensitivity (Figure 2D).

### D. Selection by linezolid leads to higher chloramphenicol resistance than direct selection by chloramphenicol

Surprisingly, we found that mutants selected by linezolid (LZD) developed higher resistance to chloramphenicol (CHL) than mutants selected directly by CHL (Figure 2E). To investigate this phenomenon, we isolated linezolid-selected mutants at days 2, 4, 6 and 8 of the laboratory evolution and measured the resistance of each to chloramphenicol. Interestingly, we see that early-stage (days 4-6) mutants exhibit low-level chloramphenicol sensitivity just prior to a dramatic increase in collateral resistance around day 8. These findings suggest linezolid selection drives the population across a chloramphenicol fitness valley, ultimately leading to levels of resistance that exceed those observed by direct chloramphenicol selection (Figure 2E, inset).

### E. Sensitivity profiles cluster into groups based on known classes of selecting drug

Our results indicate that there is significant heterogeneity in collateral sensitivity profiles, even when parallel populations are selected on the same antibiotic. While the genetic networks underlying these phenotypic responses are complex and, in many cases, poorly understood, one might expect that selection by chemically or mechanistically similar drugs would lead to profiles with shared statistical properties. For example, previous work showed (in a different context) that pairwise interactions between simultaneously applied antibiotics can be used to cluster drugs into groups that interact monochromatically with one another; strikingly, these groups correspond to known drug classes^{60}, highlighting statistical structure in drug interaction networks that appear, on the surface, to be extremely heterogeneous. Recent work in bacteria has also shown that phenotypic profiles of mutants selected by drugs from the same class tend to cluster together in *P. aeruginosa*^{38} and *E. coli*^{61}.

Similarly, we asked whether collateral sensitivity profiles in *E. faecalis* can be used to cluster resistant mutants into statistically similar classes. We first performed hierarchical clustering (Methods) on collateral profiles of 52 different mutants (Figure 3, x-axis; note that we excluded mutants selected by CHL and NIT, which did not achieve resistance of at least 2x to the selecting drug). Despite the heterogeneity in collateral profiles, they cluster into groups characterized–exclusively–by selecting drugs from the same drug classes before grouping mutants from any two different drug classes. For example, inhibitors of cell wall synthesis (AMP, CRO, FOF, OXA) cluster into one group (noted by A in Figure 3), while tetracycline-like drugs (TET, DOX, TGC) cluster into another (noted by B). This approach also separates spectinomycin from the tetracycline class of antibiotics (TET, DOX, TGC) even though they both target the 30S subunit of the ribosome, suggesting that it may help identify drugs with similar mechanisms but statistically distinct collateral profiles.

We then performed a similar clustering analysis of the collateral responses across the 14 different testing drugs (Figure 3, y-axis), which again leads to groupings that correspond to known drug classes. One drug, FOF, provides an interesting exception. Mutants selected for FOF resistance cluster with those of other cell-wall synthesis inhibitors (Class A, columns). However, the behavior of FOF as a testing drug (last row) is noticeably distinct from that of other cell-wall synthesis inhibitors (the 3 rows directly above FOF). Taken together, the clustering analysis reveals clear statistical patterns that connect known mechanisms of antibiotics to their behavior as both selecting and testing agents.

### F. A Markov decision process (MDP) model predicts optimal drug policies to constrain resistance

Our results indicate that collateral sensitivity is pervasive, and while collateral sensitivity profiles are highly heterogeneous, clustering suggests the existence of statistical structure in the data. Nevertheless, because of the stochastic nature of the sensitivity profiles, it is not clear whether this information can be leveraged to design drug sequences that constrain evolution. To address this problem, we develop a simple mathematical model based on a Markov decision process (MDP) to predict optimal drug policies. MDPs are widely used in applied mathematics and finance and have a well-developed theoretical basis^{62–64}. In an MDP, a system transitions stochastically between discrete states. At each time step, we must make a decision (called an “action”), and for each state-action combination there is an associated instantaneous “reward” (or cost). The action influences not only the instantaneous reward, but also which state will occur next. The goal of the MDP is to develop a policy–a set of actions corresponding to each state–that will optimize some objective function (e.g. maximize some cumulative reward) over a given time period.

For our system, the state *s*_{t} at time step *t* = 0, 1, 2,… is defined by the resistance profile of the population, a vector that describes the resistance level to each available drug. At each time step, an action *a*_{t} is chosen that determines the drug to be applied. The system–which is assumed to be Markovian–then transitions with probability *P*_{a}(*s*_{t+1}*|s*_{t}, *a*_{t}) to a new state *s*_{t+1}, and the transition probabilities are estimated from evolutionary experiments (or any available data). The instantaneous reward function *R*_{a}(*s*) is chosen to be the (negative of the) resistance to the currently applied drug; intuitively, it provides a measure of how well the current drug inhibits the current population. The optimal policy *π**(*s*) is formally a mapping from each state to the optimal action; intuitively, it tells which drug should be applied for a given resistance profile. The policy is chosen to maximize a cumulative reward function $\left\langle \sum_{t=0}^{\infty} \gamma^{t} R_{a_t}(s_t) \right\rangle$, where brackets indicate an expectation value conditioned on the initial state *s*_{0} and the choice of policy *π*. The parameter *γ* (0 ≤ *γ* < 1) is a discount factor that determines the timescale for the optimization; *γ* ≈ 1 leads to a solution that performs optimally on long timescales, while *γ* ≈ 0 leads to solutions that maximize near-term success.
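One standard way to compute an optimal policy for a finite, discounted MDP of this kind is value iteration; whether it matches the solver used here is not stated in the text, so the following Python sketch should be read as an illustration under that assumption. The two-drug toy system (drugs "A" and "B", each fully sensitizing the population to the other) and all function names are hypothetical.

```python
from itertools import product

def value_iteration(states, actions, transition, reward, gamma, tol=1e-9):
    """Return (values, policy) for a finite MDP with discount factor gamma.

    transition(s, a) -> list of (probability, next_state) pairs
    reward(s, a)     -> instantaneous reward for taking action a in state s
    """
    values = {s: 0.0 for s in states}

    def q(s, a):
        # Expected discounted return of taking action a in state s.
        return reward(s, a) + gamma * sum(p * values[s2] for p, s2 in transition(s, a))

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q(s, a)) for s in states}
    return values, policy

# Hypothetical toy system: state = (resistance to A, resistance to B), levels 0 or 1.
DRUGS = ("A", "B")
STATES = list(product((0, 1), repeat=2))

def transition(state, drug):
    # Applying a drug makes the population resistant to it and sensitive to the other.
    return [(1.0, (1, 0) if drug == "A" else (0, 1))]

def reward(state, drug):
    # Negative of the resistance to the currently applied drug.
    return -state[DRUGS.index(drug)]

values, policy = value_iteration(STATES, DRUGS, transition, reward, gamma=0.99)
print(policy)
```

In this toy case the policy simply alternates the two drugs, because each application resets resistance to the other; with stochastic, heterogeneous profiles the mapping from state to drug is generally far less regular.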

To apply the MDP framework to collateral sensitivity profiles, we must infer from our data a set of (stochastic) rules for transitioning between states (i.e. we must estimate *P*_{a}(*s*_{t+1}*|s*_{t}, *a*_{t})). While many choices are possible–and different rules may be useful for describing different evolutionary scenarios–here we consider a simple model where the resistance to each drug is increased/decreased additively according to the collateral effects measured for the selecting drug in question. Specifically, the state *s*_{t+1} following application of a drug at time *t* is given by $s_{t+1} = s_t + C_d$, where $C_d$ is one of the four collateral profiles (see Figure 1) measured following selection by that drug. Because resistance/sensitivity is measured using log-scaled ratios of IC_{50}’s, these additive changes in the resistance profile correspond to multiplicative changes in the relative IC_{50} for each drug. For instance, if one selection step increases the IC_{50} by a factor of 3, then two consecutive selection steps would increase IC_{50} by a factor of 9. This model assumes that selection by a given drug always produces changes in the resistance profile with the same statistical properties. For example, selection by daptomycin increases the resistance to daptomycin (with probability 1) while simultaneously either increasing resistance to AMP (with probability 1/4), decreasing resistance to AMP (with probability 1/4), or leaving resistance to AMP unchanged (probability 1/2). Repeated application of the same drug will steadily increase the population’s resistance to that drug but could potentially sensitize the population to other drugs. Note that in this model, each time step is therefore assumed to cover approximately 60 generations, similar to the evolution time in our experiments. In addition, this model implicitly assumes sufficiently strong selection that, at each step, the state of the system is fully described by a single resistance profile (rather than, for example, an ensemble of profiles that would be required to model clonal interference). While we focus here on this particular model, we stress that this MDP framework can be easily adapted to other scenarios by modifying *P*_{a}(*s*_{t+1}*|s*_{t}, *a*_{t}).

For numerical efficiency, we discretized both the state space (i.e. the resistance to each drug is restricted to a finite number of levels) as well as the measured collateral profiles (exposure to a drug leads to an increase/decrease of 0, 1, or 2 resistance levels; Figure 4A, Figure S5). In addition, we restrict our calculations to a representative subset of six drugs (DAP, AMP, FOF, TGC, LZD, RIF). The set includes inhibitors of cell wall, protein, or RNA synthesis, and five of the six drugs (excluding RIF) are clinically relevant for enterococcus infections. We note, however, that the results are qualitatively similar for different discretization schemes (Figure S4) and for different drug choices (Figure S5-S7).
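The additive, discretized transition rule can be sketched as follows. The sketch assumes that each of the measured profiles for a selecting drug is equally likely and that resistance levels are clipped at the boundaries of the discretized range; the clipping rule, the function name, and the toy two-drug profiles (which mirror the daptomycin/ampicillin example in the text rather than measured values) are assumptions made for illustration.

```python
from collections import Counter

def transition_distribution(state, drug, profiles, max_level):
    """Distribution over next states after applying `drug`.

    state:     tuple of discrete resistance levels, one per drug
    profiles:  dict mapping each drug to its list of discretized collateral profiles
    max_level: highest allowed resistance level (levels run from 0 to max_level)
    """
    options = profiles[drug]
    counts = Counter()
    for profile in options:
        # Additive update, clipped to the allowed range (boundary rule is an assumption).
        nxt = tuple(min(max_level, max(0, s + c)) for s, c in zip(state, profile))
        counts[nxt] += 1
    return {nxt: n / len(options) for nxt, n in counts.items()}

# Hypothetical discretized profiles, ordered as (DAP, AMP).
TOY_PROFILES = {
    "DAP": [(2, 1), (2, -1), (2, 0), (2, 0)],
    "AMP": [(0, 2), (1, 2), (-1, 1), (0, 1)],
}

print(transition_distribution((1, 1), "DAP", TOY_PROFILES, max_level=4))
```

With these toy profiles, applying DAP always raises DAP resistance, while AMP resistance rises, falls, or stays unchanged with probabilities 1/4, 1/4, and 1/2.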

### G. Drug policies can be tuned to minimize resistance on different timescales

The optimal policy *π**(*s*) is a high-dimensional mapping that is difficult to directly visualize. For intuition on the policy, we calculated the frequency with which each drug is prescribed as a function of resistance to each of the six individual drugs (Figure S8, S9; top panels). Not surprisingly, we found that when resistance to a particular drug is very low, that drug is often chosen as optimal. In addition, the specific frequency distributions vary significantly depending on *γ*, which sets the timescale of the optimization. For example, the long-term optimal policy (*γ* = 0.99) yields a frequency distribution that is approximately independent of the level of resistance to FOF (Figure S8, upper right panel). By contrast, the frequency distribution for a short-term policy (*γ* = 0.1) changes with FOF resistance; at low levels of resistance, FOF is frequently applied as the optimal drug, but it is essentially never applied once FOF resistance reaches a certain threshold (Figure S9, upper right panel). Both the short- and long-term optimal policies lead to aperiodic drug sequences, but the resulting resistance levels vary significantly (Figure S8-S9, bottom panels). These differences reflect a key distinction in the policies: short-term policies depend sensitively on the current resistance level and maximize efficacy (minimize resistance) at early times, while long-term policies may tolerate short-term performance failure in exchange for success on longer timescales.

### H. Optimal policies outperform random cycling and rely on collateral sensitivity

To compare the outcomes of different policies, we simulated the MDP and calculated the expected resistance level to the applied drug over time, 〈*R*(*t*)〉, from 1000 independent realizations (Figure 4B). All MDP policies perform significantly better than random drug cycling for the first 10-20 time steps and even lead to an initial decrease in resistance. The long-term policy (*γ* = 0.99, blue) is able to maintain low-level resistance indefinitely, while the short-term policy (*γ* = 0) eventually gives rise to high-level (almost saturating) resistance. Notably, if we repeat this calculation on an identical data set but with all collateral sensitivities set to 0, the level of resistance rapidly increases to its saturating value (Figure 4B, light red line), indicating that collateral sensitivity is critical to the success of these policies.
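The comparison between policies can be illustrated with a small stochastic simulation. The Python sketch below estimates the mean resistance to the applied drug over time for two simple policies: a greedy rule that always applies the drug with the lowest current resistance (which is what maximizing the instantaneous reward, *γ* = 0, amounts to) and random drug choice. The three-drug profiles, the clipping of resistance levels, and all names are hypothetical; the sketch does not reproduce the curves in Figure 4B.

```python
import random

# Hypothetical discretized collateral profiles for three drugs (index = drug).
TOY_PROFILES = [
    [(2, -1, 0), (1, 0, -1)],   # outcomes of selection by drug 0
    [(-1, 2, 1), (0, 1, 0)],    # outcomes of selection by drug 1
    [(0, -1, 2), (1, 0, 1)],    # outcomes of selection by drug 2
]

def step(state, drug, profiles, max_level, rng):
    """Apply one drug: add a randomly chosen profile and clip to [0, max_level]."""
    profile = rng.choice(profiles[drug])
    return tuple(min(max_level, max(0, s + c)) for s, c in zip(state, profile))

def greedy_policy(state, rng):
    """gamma = 0 policy: apply the drug with the lowest current resistance."""
    return min(range(len(state)), key=lambda i: state[i])

def random_policy(state, rng):
    """Choose a drug uniformly at random at every step."""
    return rng.randrange(len(state))

def simulate(policy, profiles, n_steps, n_runs, max_level, seed=0):
    """Mean resistance to the applied drug at each time step, averaged over runs."""
    rng = random.Random(seed)
    totals = [0.0] * n_steps
    for _ in range(n_runs):
        state = (0,) * len(profiles)  # start fully sensitive
        for t in range(n_steps):
            drug = policy(state, rng)
            totals[t] += state[drug]  # resistance to the drug being applied
            state = step(state, drug, profiles, max_level, rng)
    return [total / n_runs for total in totals]

greedy_curve = simulate(greedy_policy, TOY_PROFILES, n_steps=50, n_runs=1000, max_level=4)
random_curve = simulate(random_policy, TOY_PROFILES, n_steps=50, n_runs=1000, max_level=4)
print(greedy_curve[-1], random_curve[-1])
```

A policy obtained from the full MDP optimization can be plugged in by passing a function that looks up the prescribed drug for the current state.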

To further understand these dynamics, we calculated the time-dependent probability distributions P(Drug)–the probability of applying a particular drug–and P(Resist)–the probability of observing a given level of resistance to the applied drug–for the MDP following the long-term policy (*γ* = 0.99, Figure 4C-D). We also calculated the (steady state) joint probability distribution characterizing the prescribed drugs at consecutive time steps (Figure 4E). The distributions reveal highly non-uniform behavior; after an initial transient period, RIF is applied most often, followed by FOF, while DAP is essentially never prescribed. Certain patterns also emerge between consecutively applied drugs; for example, FOF is frequently followed by RIF.

Somewhat surprisingly, the distribution of resistance levels is highly bimodal, with the lowest possible resistance level occurring most often, followed by the highest possible level, then the second lowest level, and then the second highest level (Figure 4D). The policy achieves a low average level of resistance not by consistently maintaining some intermediate level of resistance to the applied drug, but instead by switching between highly-effective drugs and highly-ineffective drugs, with the latter occurring much less frequently. In words, rare periods of high resistance are the price of frequent periods of very low resistance. These qualitative trends occur for other drug choices (Figures S5-S7) and are relatively insensitive to the number of discretization levels chosen (Figure S4).

### I. Optimal policies maintain lower long-term resistance than collateral sensitivity cycles

The resurgent interest in collateral sensitivity was sparked, in part, by innovative recent work that demonstrated the successful application of collateral sensitivity cycles, where each drug in a sequence promotes evolved sensitivity to the next drug^{41}. To compare the performance of the MDP to that expected from collateral sensitivity cycles, we identified all collateral sensitivity cycles for the six-drug network and calculated 〈*R*(*t*)〉 for 100 time steps of each cycle. We then determined the “best” cycle of a given length–defined as the cycle with the lowest mean value of 〈*R*(*t*)〉 over the last ten time steps–and compared the performance of those cycles to the short- and long-term MDP policies (Figure 4F). The MDP long-term optimal solution (*γ* = 0.99) maintains resistance at a lower average value than any of the collateral sensitivity cycles. For MDP policies with shorter time horizons (e.g. the instant gratification policy, *γ* = 0), however, the collateral sensitivity cycles of 3 and 4 drugs (as well as the long-term MDP solution) lead to lower resistance at intermediate or longer time scales, reflecting the inherent trade-offs between instantaneous drug efficacy and long-term sustainability. One advantage of the MDP optimization is that it allows for explicit tuning of the policy (via *γ*) to achieve maximal efficacy over the desired time horizon.
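Enumerating collateral sensitivity cycles can be sketched as a search for directed cycles in a drug network. The text does not specify how replicate-to-replicate variation was handled when cycles were identified, so the sketch below assumes that an edge from one drug to the next exists when the mean collateral score is negative; the drug labels, the numerical values, and the function name are hypothetical.

```python
from itertools import permutations

def sensitivity_cycles(mean_effect, length):
    """List directed cycles in which every drug sensitizes the population to the next.

    mean_effect[a][b]: mean collateral score C for drug b after selection by drug a
    (negative values indicate collateral sensitivity).
    """
    drugs = sorted(mean_effect)
    cycles = []
    for combo in permutations(drugs, length):
        # Keep one representative per cycle: the rotation starting at the smallest label.
        if combo[0] != min(combo):
            continue
        pairs = zip(combo, combo[1:] + combo[:1])
        if all(mean_effect[a][b] < 0 for a, b in pairs):
            cycles.append(combo)
    return cycles

# Hypothetical mean collateral scores for three drugs.
TOY_EFFECTS = {
    "X": {"Y": -1.0, "Z": 0.5},
    "Y": {"X": -0.5, "Z": -1.0},
    "Z": {"X": -2.0, "Y": 1.0},
}

print(sensitivity_cycles(TOY_EFFECTS, 2))
print(sensitivity_cycles(TOY_EFFECTS, 3))
```

Each candidate cycle found this way could then be evaluated by simulating repeated application of its drugs and recording the resistance to the applied drug over time.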

### J. Whole-genome sequencing reveals previously identified resistance determinants and common targets of selection

To investigate the genetic changes in drug-selected populations, we randomly selected a single isolate from each of the 60 independently evolved populations for whole-genome sequencing. In addition, we sequenced two ancestral V583 colonies, as well as a control V583 strain that was propagated in BHI for the same 8 days as the drug mutants. To identify mutations relative to the ancestral strain we used breseq^{65}, an established computational pipeline for mutant identification (Methods). In total, we identified 96 mutational events (see SI for full table of results). The number of mutations per isolate ranged from 0 (n=7) to 6 (n=2), 25 isolates had at least two mutations, and in four cases, a single gene in a single isolate had more than one mutation. The control strain propagated in BHI contained no mutations relative to the ancestral strains.

The analysis revealed mutations in a number of genes known to confer resistance to the selecting drug. For example, isolates selected by ampicillin, oxacillin, ceftriaxone and daptomycin contained mutations in EF 3290, a sensor histidine kinase which confers resistance to antibiotics targeting the cell wall^{66–68}. In addition, we identified mutations in ribosomal proteins S10 (isolates selected in tigecycline, doxycycline) and S5 (isolates selected in spectinomycin), which have been linked with resistance to protein synthesis inhibitors^{69,70}, and mutations in parC and DNA gyrase (isolates selected in ciprofloxacin, levofloxacin), which are frequently associated with fluoroquinolone resistance^{71–73}. On the other hand, the isolates selected by linezolid and daptomycin were missing mutations in several genes commonly associated with resistance. Specifically, we did not see evidence of mutations in 23S rRNA^{74} in any isolates selected by linezolid, and daptomycin isolates lack many of the other mutations identified in previous laboratory experiments with *E. faecalis*^{54,55}, though one isolate did show a mutation in the sensor histidine kinase.

We identified 12 genes (or intergenic regions) that were mutated in isolates selected by at least two different drugs, and seven of these genes have been previously linked with collateral sensitivity (Figure 5). For example, mutations in DNA gyrase (gyrA or gyrB) and parC affect DNA supercoiling and global gene expression, leading to altered sensitivity to a wide array of antibiotics^{75}. In addition, mutations to ribosomal genes, such as rpsJ and rpsE, are known to confer resistance to structurally and mechanistically unrelated antibiotics^{76}, while two-component regulatory systems (TCS) such as EF 0926 and EF 3290 (a sensor histidine kinase) are known to induce penicillin sensitivity in gentamicin-resistant mutants^{46}. Finally, mutations in pyruvate kinases (pyk) are linked to increased metabolic flux^{77}. In our data, mutations in ribosomal genes (rpsJ, rpsE) and genes related to supercoiling (gyrA, gyrB, parC) were particularly widespread, appearing in isolates from 12 and 7 (respectively) of the 15 selecting drugs.

The remaining five genes (EF 1168, EF 0149, EF 1873, EF 2155, EF 3121) have not previously been linked with collateral effects (Figure 5). EF 1168 and EF 1049 are associated with cell wall function, and EF 3121 is a cognate phosphatase in the eSTK/P pathway linked to cell-wall resistance and highly similar to TCS^{78,79}. EF 1873 and EF 2155 are annotated as intergenic regions. We do note, however, that the genes flanking EF 1873 are EF 1872, which is similar to TipA, a temperature-sensing protein, and EF 1874, which encodes a mutator family transposase; mutator genes have previously been associated with increased antibiotic resistance^{80}.

### K. Logistic regression identifies genes statistically associated with collateral sensitivity

It is difficult to connect specific genes or mutations with particular collateral profiles because multiple mutations often occur in a single isolate, and some mutations may not be identified by the analysis (particularly those in repeated sequences, such as rRNA). In addition, despite the strong selection pressure applied during lab evolution, the final populations may not be isogenic, meaning there is not necessarily a one-to-one mapping between the isolates sequenced and those used for phenotyping. On the other hand, because of the relatively large number of phenotype measurements, our data may provide statistical clues linking particular genes with collateral effects.

To identify candidate genes associated with collateral sensitivity, we performed binomial logistic regression to relate mutations in one of 12 genes (predictors) to collateral sensitivity to a particular drug (outcome variable). We restricted our analysis to the 12 genes that are mutated in at least two isolates (Figure 5). Because our goal is to identify candidate genes for future study, not conclusively establish any causal relationships, we chose lenient significance criteria (*p* < 0.1; see Methods and SI). The results suggest that mutations in 6 genes (rpsJ, rpsE, EF 0926, parC, and EF 3290 (sensor histidine kinase)) are associated with increased sensitivity to at least one drug (Figure 6). Only EF 3290 (sensor histidine kinase) is associated with sensitivity to drugs from multiple classes (protein synthesis inhibitors TET, DOX, TGC; DNA synthesis inhibitor LVX; and the nitrofuran NIT). In addition, the analysis identifies potential associations between ribosomal proteins (rpsJ, rpsE) and sensitivity to DOX and CRO, parC and sensitivity to DOX, and EF 0926 (coding for an OmpR family DNA response regulator) and sensitivity to CRO. The two intergenic regions (EF 1873 and EF 2155) are also associated with increased sensitivity to CRO. We caution that these results only report statistical associations, and considerable follow-up work will be required to solidify any mechanistic links. Nevertheless, the analysis does identify several potential genes not previously associated with increased drug sensitivity.

## III. DISCUSSION

Our work provides an extensive quantitative study of phenotypic and genetic collateral drug effects in *E. faecalis*. We have shown that collateral resistance and collateral sensitivity are widespread but heterogeneous, with patterns of collateral effects often varying even between mutants evolved to the same drug. Our results contain a number of surprising, drug-specific observations; for example, we observed a strong, repeatable collateral sensitivity to rifampicin when mutants were selected by inhibitors of cell wall synthesis. Additionally, cross-resistance to daptomycin is particularly common when cells are selected by other frequently used antibiotics. Because the FDA/CLSI breakpoint for daptomycin resistance is not dramatically different from the MIC distributions found in clinical isolates prior to daptomycin use^{81}, one may speculate that even small collateral effects could have potentially harmful consequences for clinical treatments involving daptomycin. In addition, we found that selection by one drug (linezolid) led to higher overall resistance to chloramphenicol than direct selection by chloramphenicol. While chloramphenicol is rarely used clinically, the result illustrates that 1) collateral effects can be very dynamic, and 2) indirect selection may drive a population across a fitness valley to an otherwise inaccessible fitness peak.

Our findings also point to global trends in collateral sensitivity profiles. For example, we found that the repeatability of collateral effects is sensitive to the drug used for selection, meaning that some drugs may be better than others for establishing robust antibiotic cycling profiles. On the other hand, despite the apparent unpredictability of collateral effects at the level of individual mutants, the sensitivity profiles for mutants selected by drugs from known classes tend to cluster into statistically similar groups. As proof-of-principle, we show how these profiles can be incorporated into a rigorous but simple mathematical framework that optimizes drug protocols while accounting for effects of both stochasticity and different time horizons. Within this framework, drug policies can be tuned to optimize either short-term or long-term evolutionary outcomes. Finally, genome sequencing of isolates from each population reveals previously identified drug-resistance determinants and also identifies candidate genes associated with increased sensitivity to different drugs.

Our results complement recent studies on collateral sensitivity and also raise a number of new questions for future work. Multiple studies have shown that collateral profiles are heterogeneous^{46,47}, and optimization will therefore require incorporation of stochastic effects such as likelihood scores^{48}. These likelihood scores could potentially inform transition probabilities in our MDP approach, leading to specific predictions for optimal drug sequences based on known fitness landscapes. In addition, several previous studies have indicated that cycles involving mutually collaterally sensitive drugs may be chosen to minimize the evolution of resistance^{41,42}. In the context of our MDP model, these cycles fall somewhere between the short-time-horizon optimization and the long-term optimal strategy, and in some cases, the collateral sensitivity cycling can lead to considerable slowing of resistance. However, our results indicate that the MDP optimizations on longer time-horizons lead to systematically lower resistance, a consequence of intermixing (locally) sub-optimal steps where the drug is instantaneously less effective but shepherds the population to a more vulnerable evolutionary state.

It is important to keep in mind several limitations of our work. Designing effective drug protocols for clinical use is an extremely challenging and multi-scale problem. Our approach was not to develop a detailed, clinically accurate model, but instead to focus on a simpler question: optimizing drug cycles in single-species host-free populations. Even in this idealized scenario, which corresponds most closely to in vitro lab experiments, slowing resistance is a difficult and poorly understood problem (despite much recent progress). Our results are promising because they show systematic optimization is indeed possible given the measured collateral sensitivity profiles.

We have chosen to focus on a simple evolutionary scenario where collateral effects accumulate over time based on the history of drug exposure. The goal is to capture treatment failure that would occur as different resistance mutations accumulate on a single genetic background. But other evolutionary scenarios are certainly possible. For example, if selection is weak or the periods of drug exposure are short, one might not expect fixation of a single genotype at each time step; instead, drug cycles may lead to inverted selection for ancestral genotypes, including the wild-type strain, and these dynamics are not the focus of the current model. In addition, the model inherently assumes that the dominant collateral effects are independent of the genetic background. In fact, collateral sensitivity profiles in cancer have been previously shown to be time-dependent^{47,82} and epistasis certainly occurs^{46}, though the frequency and relative impact of these effects are not fully known.

Despite these limitations, we stress that the MDP framework can be readily extended to account for different evolutionary scenarios and to incorporate more complex clinically-inspired considerations. Our future work will focus on experimentally characterizing dynamic properties of collateral effects and expanding the MDP approach to account for time-varying sensitivity profiles and epistasis. It may also be interesting to investigate collateral effects in microbial biofilms, where antibiotics can have counterintuitive effects even on evolutionarily short timescales^{83}. On longer timescales, elegant experimental approaches to biofilm evolution have revealed that spatial structure can give rise to rich evolutionary dynamics^{84,85} and potentially, but not necessarily, divergent results for biofilm and planktonic populations^{86}.

Finally, our results raise questions about the potential molecular and genetic mechanisms underlying the observed collateral effects. The phenotypic clustering analysis presented here may point to shared mechanistic explanations for sensitivity profiles selected by similar drugs, and the full genome sequencing identifies candidate genes associated with increased sensitivity. However, fully elucidating the detailed genetic underpinnings of collateral sensitivity remains an ongoing challenge for future work. At the same time, because the MDP framework depends only on phenotypic measurements, it may allow for systematic optimization of drug cycling policies even when molecular mechanisms are not fully known.

## IV. MATERIALS AND METHODS

### A. Strains, antibiotics and media

All resistant lineages were derived from *E. faecalis* V583, a fully sequenced vancomycin-resistant clinical isolate^{87}. The 15 antibiotics used are listed in Table 1. Each antibiotic was prepared from powder stock and stored at −20°C with the exception of ampicillin, which was stored at −80°C. Evolution and IC_{50} measurements were conducted in BHI medium alone with the exception of daptomycin, which requires an addition of 50 mM calcium for antimicrobial activity.

### B. Laboratory Evolution Experiments

Evolution experiments for each antibiotic were performed in quadruplicate, using 1 mL of BHI medium per well in 96-well plates with a maximum well volume of 2 mL. Each day, populations were grown in at least three different antibiotic concentrations spanning both sub- and super-MIC doses. After 16-20 hours of incubation at 37°C, the well with the highest drug concentration that contained visible growth was propagated into 2 higher concentrations (typically a factor 2x and 4x increase in drug concentration) and 1 lower concentration to maintain a living mutant lineage (always half the concentration that most recently produced growth). A 1/200 dilution was used to inoculate the next day’s evolution plate, and the process was repeated for a total of 8 days of selection. On the final day of evolution, all strains were stocked in 30 percent glycerol. Strains were then plated on a pure BHI plate and a single colony was selected for IC_{50} determination. In the case of linezolid mutants, days 2, 4, and 6 were also stocked for further testing.

### C. Measuring Drug Resistance and Sensitivity

Experiments to estimate IC_{50} were performed in replicate in 96-well plates by exposing mutants to a drug gradient consisting of 6-14 points (one per well), typically in a linear dilution series prepared in BHI medium with a total volume of 205 μL (200 μL of BHI, 5 μL of cells at an OD of 1.5) per well. After 20 hours of growth the optical density at 600 nm (OD600) was measured using an Enspire Multimodal Plate Reader (Perkin Elmer) with an automated 20-plate stacker assembly. This process was repeated for all 60 mutants as well as the wild-type, which was measured in replicates of 8.

The optical density (OD600) measurements for each drug concentration were normalized by the OD600 in the absence of drug. To quantify drug resistance, the resulting dose response curve was fit to a Hill-like function *f*(*x*) = (1 + (*x*/*K*)^{h})^{−1} using nonlinear least squares fitting, where *K* is the half-maximal inhibitory concentration (*IC*_{50}) and *h* is a Hill coefficient describing the steepness of the dose-response relationship. A mutant strain was deemed collaterally sensitive (resistant) to an antibiotic if its IC_{50} decreased (increased) by more than 3*σ*_{WT}, where *σ*_{WT} is the uncertainty (standard error across replicates) of the IC_{50} measured in the wild-type strain.
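The fitting and classification steps can be illustrated with a minimal sketch. It assumes a simple grid-refinement search in place of a dedicated nonlinear least-squares routine, uses synthetic noise-free data, and introduces hypothetical names (`fit_hill`, `classify`, and the label "neutral" for changes within 3*σ*_{WT}); none of these come from the original analysis.

```python
import math

def hill(x, K, h):
    """Normalized growth f(x) = 1 / (1 + (x/K)^h)."""
    return 1.0 / (1.0 + (x / K) ** h)

def sse(K, h, conc, growth):
    """Sum of squared residuals between the Hill curve and the data."""
    return sum((hill(x, K, h) - g) ** 2 for x, g in zip(conc, growth))

def fit_hill(conc, growth, n_grid=41, n_rounds=8):
    """Least-squares estimate of (K, h) by grid refinement in log space.
    Stand-in for a proper nonlinear least-squares solver (assumption)."""
    pos = [x for x in conc if x > 0]
    k_lo, k_hi = math.log(min(pos)) - 1.0, math.log(max(pos)) + 1.0
    h_lo, h_hi = math.log(0.2), math.log(20.0)
    best = (float("inf"), None, None)
    for _ in range(n_rounds):
        for i in range(n_grid):
            lk = k_lo + (k_hi - k_lo) * i / (n_grid - 1)
            for j in range(n_grid):
                lh = h_lo + (h_hi - h_lo) * j / (n_grid - 1)
                err = sse(math.exp(lk), math.exp(lh), conc, growth)
                if err < best[0]:
                    best = (err, lk, lh)
        # Halve the search window and re-center it on the current best point.
        _, lk, lh = best
        k_half, h_half = (k_hi - k_lo) / 4, (h_hi - h_lo) / 4
        k_lo, k_hi = lk - k_half, lk + k_half
        h_lo, h_hi = lh - h_half, lh + h_half
    return math.exp(best[1]), math.exp(best[2])

def classify(ic50_mutant, ic50_wt, sigma_wt):
    """Apply the 3*sigma_WT threshold; 'neutral' is a label chosen here."""
    delta = ic50_mutant - ic50_wt
    if delta > 3 * sigma_wt:
        return "resistant"
    if delta < -3 * sigma_wt:
        return "sensitive"
    return "neutral"

# Synthetic dose-response data generated from K = 2, h = 3 (illustrative only).
conc = [0, 0.25, 0.5, 1, 2, 4, 8, 16]
growth = [hill(x, 2.0, 3.0) for x in conc]
K_fit, h_fit = fit_hill(conc, growth)
```

With real plate-reader data, the residuals would be noisy and a gradient-based least-squares solver would likely be preferable to the brute-force search used here.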

### D. Hierarchical clustering

Hierarchical clustering was performed in Matlab using, as input, the collateral profiles for each mutant. The distance between each pair of mutants was calculated using a correlation metric (Matlab function pdist with parameter ‘correlation’), and clusters were merged using the average linkage criterion.
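As a rough illustration of this procedure outside Matlab, the sketch below computes the correlation distance (one minus the Pearson correlation between two profiles) and performs a naive average-linkage agglomeration. The three profiles are invented values, and the function names are hypothetical.

```python
import math

def correlation_distance(u, v):
    """One minus the Pearson correlation between two profiles.
    Undefined (division by zero) for constant profiles."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return 1.0 - num / den

def average_linkage(profiles):
    """Naive agglomerative clustering with average linkage.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples."""
    n = len(profiles)
    d = {(i, j): correlation_distance(profiles[i], profiles[j])
         for i in range(n) for j in range(i + 1, n)}

    def cluster_dist(a, b):
        # Mean of all pairwise distances between members of a and b.
        total = sum(d[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        dist, a, b = min(
            ((cluster_dist(a, b), a, b)
             for idx, a in enumerate(clusters) for b in clusters[idx + 1:]),
            key=lambda t: t[0],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, dist))
    return merges

# Hypothetical collateral profiles (e.g. log-scaled IC50 changes) for 3 mutants.
profiles = [
    [1.0, 2.0, -1.0, 0.5],
    [0.8, 2.2, -0.9, 0.4],
    [-1.0, -2.0, 1.5, 0.0],
]
merges = average_linkage(profiles)
```

Because the metric depends only on the shape of each profile and not its overall magnitude, two mutants with proportional profiles have distance zero.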

### E. Markov decision process (MDP) model

The MDP model consists of a finite set of states (*S*), a finite set of actions (*A*), a conditional probability (*P*_{a}(*s*′*|s, a*)) describing (action-dependent) Markovian transitions between these states, and an instantaneous reward function (*R*_{a}(*s*)) associated with each state and action combination. The state of the system *s* ∈ *S* is an *n*_{d}-dimensional vector, with *n*_{d} the number of drugs and each component *s*^{i} ∈ {*r*_{min}, *r*_{min} + 1,…, *r*_{max}} indicating the level of resistance to drug *i*. The action *a* ∈ *A* ≡ {1, 2,…, *n*_{d}} is the choice of drug at the current step, and we take the reward function *R*_{a}(*s*) to be the (negative of the) resistance level to the currently applied drug (i.e. the *a*-th component of *s*). The goal of the MDP is to identify a policy *π*(*s*), which is a mapping from *S* to *A* that specifies an optimal action for each state. The policy is chosen to maximize a cumulative reward function 〈∑_{t=0}^{∞} *γ*^{t} *R*_{π}(*s*_{t})〉, where *t* is the time step, *s*_{t} is the state of the system at time *t*, *R*_{π}(*s*_{t}) is a random variable describing the instantaneous reward assuming that the actions are chosen according to policy *π*, and brackets indicate an expectation value. The parameter *γ* (0 ≤ *γ* < 1) is a discount factor that determines the relative importance of instantaneous vs long-term optimization. In words, we seek an optimal policy, which associates the resistance profile of a given population to an optimal drug choice, that minimizes the cumulative expected resistance to the applied drug.

The MDP problem was solved using value iteration, a standard dynamic programming algorithm for MDP models. Briefly, the optimization was performed by first computing the optimal value function *V*(*s*), which associates to each state *s* the expected cumulative reward obtained by starting in that state and following the optimal policy. Following the well-established value iteration algorithm^{62–64}, we iterate according to *V*_{i+1}(*s*) = max_{a} (*R*_{a}(*s*) + *γ* ∑_{s′} *P*_{a}(*s*′*|s, a*)*V*_{i}(*s*′)). Given the optimal value function, the optimal policy is then given by the action that attains this maximum, that is, the action that maximizes the sum of the instantaneous reward and the discounted expected value of the state at the next time step.
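A minimal sketch of value iteration on a toy two-drug problem may help make the update concrete. The transition rule below is a hypothetical stand-in, not the measured collateral profiles: applying a drug raises resistance to that drug by one level, and with probability 0.5 lowers resistance to the other drug by one level. The resistance range, discount factor, and all names are likewise assumptions.

```python
from itertools import product

R_MIN, R_MAX = 0, 2          # assumed resistance levels
N_DRUGS = 2
GAMMA = 0.9                  # assumed discount factor
STATES = list(product(range(R_MIN, R_MAX + 1), repeat=N_DRUGS))
ACTIONS = range(N_DRUGS)

def clip(r):
    return max(R_MIN, min(R_MAX, r))

def transitions(s, a):
    """Hypothetical P_a(s'|s, a), returned as a list of (probability, s')."""
    other = 1 - a
    out = {}
    for prob, d_other in ((0.5, -1), (0.5, 0)):
        nxt = list(s)
        nxt[a] = clip(s[a] + 1)              # direct resistance increases
        nxt[other] = clip(s[other] + d_other)  # collateral sensitivity
        out[tuple(nxt)] = out.get(tuple(nxt), 0.0) + prob
    return [(p, s2) for s2, p in out.items()]

def reward(s, a):
    """R_a(s): negative resistance level to the applied drug."""
    return -s[a]

def q_value(s, a, V):
    return reward(s, a) + GAMMA * sum(p * V[s2] for p, s2 in transitions(s, a))

def value_iteration(tol=1e-10):
    """Iterate V_{i+1}(s) = max_a q_value(s, a, V_i) until convergence."""
    V = {s: 0.0 for s in STATES}
    while True:
        V_new = {s: max(q_value(s, a, V) for a in ACTIONS) for s in STATES}
        if max(abs(V_new[s] - V[s]) for s in STATES) < tol:
            return V_new
        V = V_new

V = value_iteration()
# The optimal policy picks, in each state, the action attaining the maximum.
policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, V)) for s in STATES}
```

In this toy model the policy chooses the drug to which the population is currently fully sensitive when the other drug is at maximal resistance; with measured profiles the state space and transition probabilities would be far larger, but the iteration has the same form.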

Once the optimal policy *π* = *π** is found, the system is reduced to a simple Markov chain with transition matrix *T*_{π*} = *P*_{π*(s)}(*s*′*|s, π**(*s*)), where the subscript *π** means that the decision in each state is determined by the policy *π** (i.e. that *a* = *π**(*s*) for a system in state *s*). Explicitly, the Markov chain dynamics are given by *P*_{t+1}(*s*) = *T*_{π*} *P*_{t}(*s*), with *P*_{t}(*s*) the probability to be in state *s* at time step *t*. All quantities of interest–including P(Drug), P(Resist) (see Figure 4), and 〈*R*(*t*)〉–can be calculated directly from *P*_{t}(*s*). For example, 〈*R*(*t*)〉 = ∑_{s∈S} *P*_{t}(*s*)*R*_{π*}(*s*), with *R*_{π*}(*s*) the instantaneous reward for a system in state *s* under optimal policy *π**.
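The propagation of the state distribution and the calculation of the expected reward can be sketched as follows. The three-state transition matrix and reward vector are invented for illustration; the matrix is written so that each column sums to one, matching the convention *P*_{t+1} = *T* *P*_{t}.

```python
def step(T, p):
    """One step of P_{t+1}(s') = sum_s T[s'][s] * P_t(s)."""
    n = len(p)
    return [sum(T[i][j] * p[j] for j in range(n)) for i in range(n)]

# Hypothetical chain under a fixed policy: T[s_next][s] = P(s_next | s).
T = [
    [0.5, 0.2, 0.0],
    [0.5, 0.5, 0.3],
    [0.0, 0.3, 0.7],
]
R = [0.0, -1.0, -2.0]        # instantaneous reward in each state
p = [1.0, 0.0, 0.0]          # start with certainty in state 0

mean_reward = []             # <R(t)> for t = 0, 1, ...
for t in range(50):
    mean_reward.append(sum(p_s * r_s for p_s, r_s in zip(p, R)))
    p = step(T, p)
```

Since the reward is the negative resistance level, the expected resistance to the applied drug at each step is simply the negative of the corresponding entry of `mean_reward`.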

### F. Whole-genome sequencing

To identify genomic changes that contributed to the measured collateral phenotypes, we sequenced 60 independently evolved drug mutants along with two V583 ancestors as well as a control V583 strain propagated in BHI for 8 days. Populations were streaked from a frozen stock, grown up in BHI, and triple washed in PBS, and DNA was isolated using a Quick-DNA Fungal/Bacterial Kit (Zymo Research). These samples were sequenced in two batches via the University of Michigan sequencing core. Batch one, a test batch including Amp4, Cip2, Dox4, Spt4, Lzd1, Nit1 and a V583 colony, was sequenced on one lane of the MiSeq, run as paired-end 150. Batch two, comprising the rest of the mutants and an additional V583 strain, was sequenced using the Illumina HiSeq 4000, run as paired-end 150 with 250 ng DNA input.

The resulting genomic data were analyzed using the high-throughput computational pipeline breseq with default settings. Average read coverage depth was about 50 on batch 1 and 300 on batch 2. Briefly, genomes were aligned to *E. faecalis* strain V583 (Accession numbers: AE016830 - AE016833) via Bowtie 2. A sequence read was discarded if less than 90 percent of its length matched the reference genome or a predicted candidate junction. At each position, a Bayesian posterior probability is calculated, along with the log10 ratio of that probability to the probability of another base (A, T, C, G, gap). Positions with sufficiently high consensus scores (in our case, a consensus score of 10) are marked as read alignment evidence. Any mutation that occurred in either of the 2 control V583 strains was filtered from the results. A single 23,000 bp deletion in Cip2 from batch one was also filtered from the results as it was likely due to poor coverage.

### G. Logistic regression to identify candidate sensitivity genes

We performed binomial logistic regression using a generalized regression model of the form logit *p* = *β*_{0} + ∑_{j} *β*_{j}*x*_{j}, where *p* is the probability of the binary outcome, logit *p* ≡ log(*p*/(1 − *p*)), {*β*_{j}} are the regression coefficients, and {*x*_{j}} are the predictor variables. For each drug, the outcome variable indicates whether or not there is increased sensitivity to that drug, while the predictor variables *x*_{j} indicate whether or not there is a mutation in gene *j*. To avoid over-parameterizing the model, we added variables (genes) *x*_{j} stepwise (using Matlab’s stepwiseglm function) starting from a model with only *β*_{0} ≠ 0. New terms were added (*p*_{a} < 0.2) or removed (*p*_{r} > 0.25) based on p-values (*p*_{a} or *p*_{r}) from an F-test on the change in deviance due to addition/removal. Candidate genes associated with collateral sensitivity were deemed to be those corresponding to coefficients *β*_{j} with an associated p-value less than 0.1.
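To make the model form concrete, the sketch below fits logit *p* = *β*_{0} + *β*_{1}*x* for a single binary predictor by Newton-Raphson. It is only an illustration: the counts are invented, the function name is hypothetical, and the stepwise term selection and p-value calculations performed by stepwiseglm are not reproduced.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, n_iter=25):
    """Maximum-likelihood fit of logit p = b0 + b1 * x by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (X^T W X) with weights w = p (1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented data: 10 isolates carry the mutation (6 of them sensitive),
# 50 isolates do not (10 of them sensitive).
x = [1] * 10 + [0] * 50
y = [1] * 6 + [0] * 4 + [1] * 10 + [0] * 40
b0, b1 = fit_logistic(x, y)
```

For a single binary predictor the fit has a closed form, *β*_{0} = logit(10/50) and *β*_{1} = logit(6/10) − logit(10/50), which gives a simple check on the numerical result; a positive *β*_{1} indicates that the mutation is associated with higher odds of sensitivity.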

## ACKNOWLEDGMENTS

This work is supported, in part, by the National Science Foundation (NSF No. 1553028 to KW) and the National Institutes of Health (NIH No. 1R35GM124875-01 to KW).

## Footnotes

a) Electronic mail: kbwood{at}umich.edu