Modeling of collaterally sensitive drug cycles, and optimization of the drug effect in the spirit of adaptive therapy

In previous work, we focused on the optimal therapeutic strategy with a pair of drugs which are collaterally sensitive to each other, that is, a situation in which evolution of resistance to one drug induces sensitivity to the other, and vice versa. [1] Here, we have extended this exploration to the optimal strategy with a collaterally sensitive drug sequence of an arbitrary length, N(≥2). To explore this, we have developed a dynamical model of sequential drug therapies with N drugs. In this model, tumor cells are classified as one of N subpopulations represented as {Ri|i = 1, 2, …, N}. Each subpopulation, Ri, is resistant to ‘Drug i’ and each subpopulation, Ri−1 (or RN, if i = 1), is sensitive to it, so that Ri increases under ‘Drug i’ as it is resistant to it, and after drug-switching, decreases under ‘Drug i + 1’ as it is sensitive to that drug(s). Similar to our previous work examining optimal therapy with two drugs, we found that there is an initial period of time in which the tumor is ‘shaped’ into a specific makeup of each subpopulation, at which time all the drugs are equally effective . After this shaping period, all the drugs are quickly switched with duration relative to their efficacy in order to maintain each subpopulation, consistent with the ideas underlying adaptive therapy. [2] Additionally, we have developed methodologies to administer the optimal regimen under clinical or experimental situations in which no drug parameters and limited information of trackable populations data (all the subpopulations or only total population) are known. The therapy simulation based on these methodologies showed consistency with the theoretical effect of optimal therapy.

(a) A collateral sensitivity map: This map represents the results of a hypothetical experiment in which tumors are exposed to one drug (column), and after resistance to this drug develops are tested against another drug (column). The color is then the change in sensitivity from the wild type to the evolved strain. (b) A collateral sensitivity network showing every drug pair and their collateral sensitivity relationship. Nodes represent drugs, and directed edges point from drugs, which when resistance develops, end with sensitivity to another (edge terminus). The four drugs connected by the red arrows is an example of collaterally sensitive drug cycle.
Tumors consists of diverse cells in terms of cellular traits (heterogeneity in genotypes 41 and phenotypes) and/or surrounding environment (affinity to blood vessels, fibroblasts, by introducing continuously changing cellular biology [22,23] or stromal microenvironment environment [24,25]. In this paper, we chose to account for the intermediate level of mod-53 eling complexity. Using constrained ODEs, as in EGT, but also considering differential drug 54 effect, similar to the concept of fitness landscapes [26]. With that, we defined a drug efficacy 55 measure based on cell population, and explored the effects of collaterally sensitive drug 56 schedules.

57
For parsimony and analytical tractability, we assume that there are possibly three fitness 58 levels under each drug: lowest (for most sensitive cells), highest (for most resistant cells) 59 and intermediate (for other neutral cells). Mutations in this population structure result only 60 in progressively higher fitness values, specifically from sensitive to neutral and neutral to 61 resistant types. We make no fine grained biological assumptions for the purposes of this 62 work, but assume only that a fitness metric of overall proliferation ability in the process 63 of convergent evolution [27,28,29,30]. A subpopulation having a same fitness value is 64 assumed to be homogeneous in terms of how it responds to drug exposure. And, while it has 65 been shown that fitness can often change as a function of drug dose, so called seascapes [31], 66 and that this can be useful for control, we do not consider that possibility here [32].

67
In this study we ask the following questions: Given knowledge of the evolutionary pat- The remainder of this manuscript is structured as follows. The details of our model are 72 described in Section 2. Based on the model, we derive the optimal treatment strategy and 73 a practical method of its implementation, which are discussed in Section 3 and Section 4 74 respectively. Our finding of optimal treatment is consistent with the concept of 'minimum 75 effective dosage' in the adaptive therapy paradigm, which optimizes drug effectiveness with 76 the least risk of drug resistance development [2]. Section 5 includes a discussion of adaptive 77 therapy, as well as overall conclusions and discussions of this work.  Table   83 1 for the definition of i ), and neutral under any of other drugs. Therefore, we can simu-84 late the patterns of collateral sensitivity sequences in terms of the resistant cell populations 85 (for example, R i ) under a particular drug (i.e., Drug i in the example) which subsequently 86 declines under the next drug in the cycle to which it is sensitive (i.e., Drug i).

87
Under each drug (Drug i), we assigned three types of total proliferation rates ("birth 88 rate"-"death rate"), for resistant (p i r > 0), sensitive (p i s < 0) and neutral (p i 0 ∈ (p i s , p i r )) 89 cells. Assuming evolution of each cellular type occurs toward higher fitness levels (i.e.,

90
proliferation rates in our model), we accounted for two kinds of transitions: from sensitive 91 to neutral types (g i s > 0) and neutral to resistant types (g i 0 > 0). The dynamics of R 1 , R 2 , ...,

92
and R N under any chosen drug, Drug i, are described in Figure 2. where R i and R i−1 (i − 1 = i ≥ 1) are resistant and sensitive cell populations respectively, with all the other compartments being populations of neutral type. The left panel shows the schematic of transitions/turnovers among R j s, and the right panel shows the associated system of differential equations. Here, {p i r , p i s , p i 0 } are proliferation rate of resistant, sensitive, and neutral cells, {g i s (and corresponding g i s = g i s /(N − 2)), g i 0 } are transition rates from sensitive to neutral and, from neutral to resistant types.
In our study, system (1) is used to describe the dynamics of cell populations under a 94 single drug. To study the dynamics under multiple drugs switched over time, we switch the 95 drug index, i, in the system accordingly. The resulting piece-wise continuous differential 96 system will describe the effect of this drug switch strategy. An example of population 97 histories with 4 collaterally sensitive drugs switched as indicated is shown in Figure 3.

98
Rapid drug rotation with chosen intensities of the N drugs (i.e., f i ∆t-long with Drug i 99 where ∑ N i=1 f i = 1 and ∆t → 0 + ) is employed in our optimal therapeutic strategy. This will 100 be described in detail in the next section. In our modeling framework, the fast switch is 101 highly related to the therapy with a drug mixture, since its corresponding cell population 102 model is in a similar form as the single drug dynamics (Equation (1)). The only difference 103 is the transition matrix which is replaced by the linear combination of the matrices of all the 104 drugs (D(k)s) with the relative intensities ( f k s), as described below.
For the derivation of the Equation (2), see Theorem A.6 in Appendix A. 3 Optimal therapeutic schedule 107 3.1 Description of the optimal population dynamics 108 Analysis of a system comprised by multiple systems of (1) with different values of i, p i s and 109 g i s, is limited due to the complexity of the system. Instead, we studied it numerically to find 110 the optimal therapeutic strategy. To define control, we chose to minimize the area under the 111 total population curve over a chosen time interval [0, x] for some time x > 0, for varying drug administrations. Our numerical study determined the treatment regimen 113 for the best possible effects to minimize (3). It can be achieved when the best drug(s) is 114 chosen at every given moment (t = t 1 ). Here, the best drug(s) at t 1 (e.g., Drug i) means 115 the drug(s) which have the highest effectiveness in decreasing total population (such that 116 e f i (t 1 ) ≤ e f j (t 1 ) for any j ∈ {1, 2, ..., N}), where the formulation of the drug effect is defined 117 in following way 118 e f j (t 1 ) := TPD(t 1 |P j ) = P j · R(t 1 ) j ∈ {1, 2, ..., N} for Drug j at t 1 . R under the optimal strategy obeys the following system, While we are deriving a mathematically optimal therapy schedule, we realize it is imprac-120 tical to instantaneously switch drugs in realistic clinical situations. Therefore, we developed 121 a simulation algorithm to choose and accordingly switch to an effective drug at a given 122 discrete time step (∆t 0), which is described in Figure 4 (a) -where a longer ∆t could be 123 considered for specific clinical situations. An example of the optimal administration simu-124 lated with the discrete scheme, compared to a choice of non-optimal administration which 125 changes drugs at minimum time points of total population, is shown in Figure 4 (b, c). The panel (b) shows tumor size, which is smaller with the optimal therapy over the time than the 127 size with the non-optimal therapy, and the panel (c) shows how much the optimal therapy 128 is better, measured by (3), in this example. In this section, we will study several properties about the optimal therapeutic regimen based 132 on two examples of "symmetric" drug cycle ( Figure 5) and an "asymmetric" drug cycle (Fig-133 ure 6). We define symmetric drug cycle as a set of drugs which have an identical parameter 134 value for each type of dynamical event, i.e., {p i r , p i s , p i 0 , g i s , g i 0 } = {p r , p s , p 0 , g s , g 0 } for all i, 135 and an asymmetric drug cycle as having different parameter values for at least one type of We will begin 137 with the simpler case of symmetric drug cycles first and then generalize our observations to 138 asymmetric drug cycles.

139
In both cases simulated by the algorithm (Figure 4 or simply, Theorem A.9. However, our analysis will focus on other general cases that the expression 155 (6) can be utilized.

156
The last stage lasts as long as therapy is necessary, keeping the same cellular makeup 157 meanwhile. 158 All the subpopulations and total population of the last stage are presumed to change expo-159 nentially (See the comparison between the population curves and the light-gray guideline 160 curves on Figure 5, 6 (a).). For symmetric cases, it has been proved in Theorem A.10, that 161 the exponential rate is the average of proliferation rates of all subpopulations, All stages except the last one choose better drugs and gradually level the effectiveness of Figure 6: Optimal therapeutic dynamics with a cycle of four asymmetric drugs. Same types of plots with Figure 5 with four asymmetric drugs: all drugs evenly, so we will call those finite stages the "shaping period". On the other hand,  197 In each case, we use one or more 'testing periods' each lasting N∆t where ∆t is the 198 smallest possible period of single drug administration, and in reality, response measure-199 ment. In the N successive ∆t-long time windows, all of the N drugs are given in turn, and 200 the available population data is measured at the end of each window. After this procedure 201 is performed, we will have discrete population data at N + 1 time points We base our 'realistic' strategy around total tumor population because of the recent 204 explosion of robust techniques to obtain this information in a clinically relevant setting [21].

205
In this particular method plasma cell-free DNA is sampled from a patient with relatively 206 high temporal frequency and used to resolve the corresponding evolutionary dynamics. 207 We use these techniques as a benchmark for clinically leverage-able tumor population data 208 which can be implemented in an algorithm as we describe below. Implementing the subpopulation data over just one single testing period (explained above), 211 we derive several equations which represent the relationships between the data and drug 212 parameters, and then, to find the parameter values. We will approach those problems for 213 two conditions separately: two drugs N = 2, and three or more drugs N ≥ 3.

214
In the case of two drugs, there are only two subpopulations R = {R 1 , R 2 }, and three parameters for each drug i ∈ {1, 2} which are two proliferation rates {p i r , p i s } and one 216 transition rate from sensitive to resistant types g i . No subpopulation or parameters related 217 to neutral cells is included in the system of two drugs. Solving the differential system (1) 218 with the initial conditions, we have the following equations for each ∆t-long time window. (9), we can specify the equations with 221 drug parameters and known values only. 222 Similarly, in the case of a larger number of drugs N ≥ 3, we can solve (1), s 1 = f s (∆t; p s − g s ; s 0 ) n 1 = f 0 (∆t; p 0 − g 0 , g s ; n 0 , S) r 1 = f r (∆t; p r , g 0 ; r 0 , N) .
Both (10) for the two-drug cases with three unknown parameters and (12) for N-drug 227 cases with five unknown parameters (N ≥ 3) are underdetermined. One strategy to resolve 228 this issue in either case is by assuming a specific ratio between proliferation rates and tran-229 sition rates, i.e., |p s | = αg for 2 drugs and {|p s |, |p 0 |} = {α s g s , α 0 g 0 } for more drugs, with a 230 reasonably chosen α, α s , α 0 ∈ (0, 1) like α = 0.1. Then, using the conditions and equations 231 (10 or 12) along with the data from one testing period, we can infer all the drug parameters.

232
The discovered parameters will give the complete information required to run the algorithm 233 of optimal administration (Figure 4 (a)). In most cases of cancer, however, detailed information about tumor heterogeneity cannot 236 be detected over time. Clinically, total tumor size is the highest resolution data that can 237 be reasonably (and even then poorly) measured. For such cases with limited data, rather 238 than trying to solve the differential equations as shown in the previous section, we tried to 239 approximate drug effects using the levels of changes of the total population data of testing 240 period, and checked which drug(s) are most reasonable to prescribe. The chosen drug or drugs is continually prescribed as long as it is effective, and its effects are continuously 242 monitored. When the chosen drug loses efficacy, we need to select drugs again. Since we 243 will not know sub-populations and drug parameters, the equation (4) is not applicable. In-244 stead, another (N∆t)-long testing period is required to choose the drugs of the next 'round'.

245
Therefore, in our algorithm of optimal prescription only with total population monitored, 246 we suggest to pair up "testing" period and "therapy" period, and repeat them (See Figure   247 7 (b)). Even though the testing period can transiently worsen outcomes compared to stan-248 dard therapies for a finite period of time, it is the only way to ascertain the relative strength 249 of each drugs selection pressure. So, we strived to reduce the time taken for each testing 250 period. One way we implemented this in our algorithm was to choose and apply multiple 251 effective drugs, rather than one single best drug. We chose this strategy because utilizing a 252 strict criterion of drug selection would not allow for currently chosen drugs to be effective 253 for a long time and would require a change of drug more frequently.

254
Now, let us describe our algorithm for optimal therapeutic prescription ( Figure 7  , which is between the average effect of "effective" drugs, and the highest effect of "ineffec-268 tive" drugs. In our study, we simply fixed the threshold parameter, η = 0.5 (See Figure 7

270
Our algorithm of approximated optimal therapy was compared to the actual optimal 271 therapy by measuring the area between population curves, in two ways. Using total popu-  Figure 7: Algorithm for the administration of the optimal drug schedules when total population data is available, but not subpopulation nor drug parameters. (a) Diagram of the classification of "effective" and "ineffective" drugs, and the threshold of drug effectiveness (e f * ). The effectiveness levels of the most effective drug (e f m ), effective drugs and ineffective drugs are indicated by blue-filled circle, black-filled circles and empty circles, respectively. (b) Flow chart of the optimal therapy algorithm based on the "effective" drugs from (a). and using subpopulations curves, 275 N ∑ k=1 R k (t|optimal therapy) − R k (t|approx. optimal therapy) dt.

276
Using both methods, we found that both strategies which were too strict or too generous 277 do not generate tumor histories close enough to the optimal one (See Figure 8 ( proper level of drug selection window is necessary, like our example case of Figure 8 in 279 which the approximation is reasonably close with ≈ 0.03 (See Figure 8 (a)). It is an 280 expected observation, because a strict drug selection strategy will require unnecessarily 281 frequent testing periods, and a generous strategy will include barely effective drugs as well 282 in the treatment.

283
(a) (b) Figure 8: Comparison between the realistic approximation of the optimal therapy (by the algorithm on Figure 7 (b)), and the actual optimal strategy. (a) Optimal therapeutic effect generated by the practical algorithm assuming that only total population is trackable (solid curves) with = 0.03 and η = 0.5, compared to the effect of the optimal therapy in theory (dashed curves). Other used parameters and initial tumor status are the same as in the example in Figure 6. (b) Errors between the approximated and actual optimal histories, over the range of the parameter of effective drug window.

284
The phenomena of collateral sensitivity presents an opportunity to improve effectiveness 285 in drug therapy without the need for drug discovery, a process that requires enormous 286 amounts of time and money. Taking advantage of CS clinically however, requires a better 287 understanding, of the evolutionary dynamics under changing therapy. To address this, we 288 developed a mathematical model of ordinary differential equations describing the effect of 289 a collaterally sensitive drug cycles, and explored the optimal treatment regimens within the 290 confines of this simplified model. Consequently, we found that the optimal therapeutic effect 291 can be derived when we switch drug to the best-effect-drug at every moment. While this is 292 somewhat intuitive, choosing the timing, and order of switching is a difficult prospect given 293 the lack of perfect data in clinical settings, and given the heterogeneity which is hallmark of 294 cancer.

295
In our ODE model, drug switching is implemented by changing drug-dependent pa-296 rameters (P i ) and transitions between cell types. In accordance with the 'optimal' drug 297 schedule, drugs are switched instantaneously after the first stage of single-drug therapy.

298
However, such rapid switching is not feasible clinically. To address this, we developed a 299  = N) and neutral under all the other drugs R vector of subpopulations, R = ( R 1 · · · R N ) T R * population fractions (i.e., ∑ N k=1 R * k = 1) at which all the drugs are equally effective, derivative of total population, TPD(t) = TP (t)    (Figure 4 (a)). As expected, the algorithm-based simulation is smooth with a small time step 301 (Figure 4 (b)) and shows a good consistency with the ODE system ( Figure 5, 6 (a)).

302
The order of the drug sequence in each stage of our example simulations is not exactly parameters are unknown, we cannot measure the drug effect and therefore the optimal 314 drug switch timing cannot be captured. For such cases, which is the majority of cancers, 315 we developed an algorithm within which drugs are selected based on only total population 316 (Figure 7(b)). Cell population dynamics with this algorithm show a good consistency with 317 the fully informed optimal therapeutic strategy ( Figure 8 (a)). The traditional time gaps 318 required to obtain updated diagnostics and prescription is several weeks or months, [21] 319 which would be too long to expect a result close to an optimal solution as we aim for here.

320
Based on the results of our algorithm simulation (Figure 8 (a) another drug could be a temporal phenomenon happening in a specific status, which is too 328 complicated to recapitulate within this structure. Also, even if a part of the cell population 329 follows the dynamics of a rotating resistance and sensitivity pattern, there could be many 330 more types of cells not involved in any cell types within this structure, which will result 331 in treatment failure, as we discussed in Figure 9. A detailed study with antibiotics [26] applied empirical fitness measurements to assign cellular classifications with genes and 333 drug-induced transitions among the genotypes. Finding best-case therapy ordering in multi-334 drug scenarios was also studied by Maltas et al. without reliance on the underlying fitness 335 landscape, somwhat akin to our "imperfect" clinical information, and found that therapy 336 could be improved. [34]

337
Each of these previous studies assume that the dynamics measured at the beginning of 338 the study do not change through time. Given the importance of (epi-)mutation, and not 339 just changing frequencies, it is uncertain on what timescales this is a fair assumption. [8]
Lemma A.5. For any positive integer, n, and an any integer, i, in [1, N] where η is some number in [0, 1) and 0 ≤ f k ≤ 1 with ∑ N k=1 f k = 1.

500
Proof. By the definition (4), the effect of Drug i is 501 e f k = P i · R = (P i ) T R.
where k is a constant representing the level of balanced drug effect. Since {P 1 , P 2 , . . . , P N } 505 is linearly independent, we can isolate R * with an inverse matrix. Proof. The derivative of total population, TPD, is the summation of the vector, dR dt .