Modeling collaterally sensitive drug cycles: shaping heterogeneity to allow adaptive therapy

1 In previous work, we focused on the optimal therapeutic strategy with a pair of drugs 2 which are collaterally sensitive to each other, that is, a situation in which evolution of 3 resistance to one drug induces sensitivity to the other, and vice versa. [1] Here, we 4 have extended this exploration to the optimal strategy with a collaterally sensitive drug 5 sequence of an arbitrary length, N(≥ 2). To explore this, we have developed a dynamical 6 model of sequential drug therapies with N drugs. In this model, tumor cells are classified 7 as one of N subpopulations represented as {Ri|i = 1, 2, ..., N}. Each subpopulation, Ri, 8 is resistant to ‘Drug i’ and each subpopulation, Ri−1 (or RN , if i = 1), is sensitive to it, so 9 that Ri increases under ‘Drug i’ as it is resistant to it, and after drug-switching, decreases 10 under ‘Drug i + 1’ as it is sensitive to that drug(s). 11 Similar to our previous work examining optimal therapy with two drugs, we found 12 that there is an initial period of time in which the tumor is ‘shaped’ into a specific makeup 13 of each subpopulation, at which time all the drugs are equally effective (R∗). After this 14 shaping period, all the drugs are quickly switched with duration relative to their efficacy 15 in order to maintain each subpopulation, consistent with the ideas underlying adaptive 16 therapy. [2, 3] 17 Additionally, we have developed methodologies to administer the optimal regimen 18 under clinical or experimental situations in which no drug parameters and limited in19 formation of trackable populations data (all the subpopulations or only total population) 20 are known. The therapy simulation based on these methodologies showed consistency 21 with the theoretical effect of optimal therapy. 22


23
Despite the development of a large pharmacopoeia of novel anti-cancer drugs, curative treat-24 ments remain elusive after systematic, or metastatic, spread of cancer. The evolution of re-25 sistance to initially effective therapies is one of the primary forces behind this phenomenon. 26 This evolution is a complex phenomenon influenced by a variety of factors and their in-27 teractions [4,5,6], including genetic mutation and changed frequency of gene expression 28 [7,8,9], drug efflux pumps on the cell membrane [10,11], tumor microenvironement [12] 29 and so on. Despite the difficulty of elucidating these complicated mechanisms, a multitude 30 eling complexity. Using constrained ODEs, as in EGT, but also considering differential drug 54 effect, similar to the concept of fitness landscapes [27]. With that, we defined a drug efficacy 55 measure based on cell population, and explored the effects of collaterally sensitive drug 56 schedules. 57 For parsimony and analytical tractability, we assume that there are possibly three fitness 58 levels under each drug: lowest (for most sensitive cells), highest (for most resistant cells) 59 and intermediate (for other neutral cells). Mutations in this population structure result only 60 in progressively higher fitness values, specifically from sensitive to neutral and neutral to 61 resistant types. We make no fine grained biological assumptions for the purposes of this 62 work, but assume only that a fitness metric of overall proliferation ability in the process 63 of convergent evolution [28,29,30,31]. A subpopulation having a same fitness value is 64 assumed to be homogeneous in terms of how it responds to drug exposure. And, while it has 65 been shown that fitness can often change as a function of drug dose, so called seascapes [32], 66 and that this can be useful for control, we do not consider that possibility here [33]. 67 In this study we ask the following questions: Given knowledge of the evolutionary pat-68 terns of resistance, what is the optimal method of using a large panel of drugs? What can we 69 learn about the evolution of resistance from observing patient outcomes? Can each patient 70 be their own control? 71 The remainder of this manuscript is structured as follows. The details of our model are 72 described in Section 2. Based on the model, we derive the optimal treatment strategy and 73 a practical method of its implementation, which are discussed in Section 3 and Section 4 74 respectively. Our finding of optimal treatment is consistent with the concept of 'minimum 75 effective dosage' in the adaptive therapy paradigm, which optimizes drug effectiveness with 76 the least risk of drug resistance development [2]. Section 5 includes a discussion of adaptive 77 therapy, as well as overall conclusions and discussions of this work.  Table   83 1 for the definition of i ), and neutral under any of other drugs. Therefore, we can simu-84 late the patterns of collateral sensitivity sequences in terms of the resistant cell populations 85 (for example, R i ) under a particular drug (i.e., Drug i in the example) which subsequently 86 declines under the next drug in the cycle to which it is sensitive (i.e., Drug i). 87 Under each drug (Drug i), we assigned three types of total proliferation rates ("birth 88 rate"-"death rate"), for resistant (p i r > 0), sensitive (p i s < 0) and neutral (p i 0 ∈ (p i s , p i r )) 89 cells. Assuming evolution of each cellular type occurs toward higher fitness levels (i.e., 90 proliferation rates in our model), we accounted for two kinds of transitions: from sensitive 91 to neutral types (g i s > 0) and neutral to resistant types (g i 0 > 0). The dynamics of R 1 , R 2 , ...,

92
and R N under any chosen drug, Drug i, are described in Figure 2.

93
In our study, system (1) is used to describe the dynamics of cell populations under a 94 single drug. To study the dynamics under multiple drugs switched over time, we switch the where R i and R i−1 (i − 1 = i ≥ 1) are resistant and sensitive cell populations respectively, with all the other compartments being populations of neutral type. The left panel shows the schematic of transitions/turnovers among R j s, and the right panel shows the associated system of differential equations. Here, {p i r , p i s , p i 0 } are proliferation rate of resistant, sensitive, and neutral cells, {g i s (and corresponding g i s = g i s /(N − 2)), g i 0 } are transition rates from sensitive to neutral and, from neutral to resistant types. drug index, i, in the system accordingly. The resulting piece-wise continuous differential 96 system will describe the effect of this drug switch strategy. An example of population 97 histories with 4 collaterally sensitive drugs switched as indicated is shown in Figure 3.

98
Rapid drug rotation with chosen intensities of the N drugs (i.e., f i ∆t-long with Drug i 99 where ∑ N i=1 f i = 1 and ∆t → 0 + ) is employed in our optimal therapeutic strategy. This will 100 be described in detail in the next section. In our modeling framework, the fast switch is 101 highly related to the therapy with a drug mixture, since its corresponding cell population 102 model is in a similar form as the single drug dynamics (Equation (1)). The only difference 103 is the transition matrix which is replaced by the linear combination of the matrices of all the 104 drugs (D(k)s) with the relative intensities ( f k s), as described below.
For the derivation of the Equation (2), see Theorem A.6 in Appendix A.
for varying drug administrations. Our numerical study determined the treatment regimen 113 for the best possible effects to minimize (3). It can be achieved when the best drug(s) is 114 chosen at every given moment (t = t 1 ). Here, the best drug(s) at t 1 (e.g., Drug i) means 115 the drug(s) which have the highest effectiveness in decreasing total population (such that 116 e f i (t 1 ) ≤ e f j (t 1 ) for any j ∈ {1, 2, ..., N}), where the formulation of the drug effect is defined 117 in following way for Drug j at t 1 . R under the optimal strategy obeys the following system, While we are deriving a mathematically optimal therapy schedule, we realize it is imprac-120 tical to instantaneously switch drugs in realistic clinical situations. Therefore, we developed 121 a simulation algorithm to choose and accordingly switch to an effective drug at a given discrete time step (∆t 0), which is described in Figure 4 (a) -where a longer ∆t could be 123 considered for specific clinical situations. An example of the optimal administration simu-124 lated with the discrete scheme, compared to a choice of non-optimal administration which 125 changes drugs at minimum time points of total population, is shown in Figure 4 (b, c). The 126 panel (b) shows tumor size, which is smaller with the optimal therapy over the time than the 127 size with the non-optimal therapy, and the panel (c) shows how much the optimal therapy 128 is better, measured by (3), in this example.

Properties of the optimal therapy
In this section, we will study several properties about the optimal therapeutic regimen based 132 on two examples of "symmetric" drug cycle ( Figure 5) and an "asymmetric" drug cycle (Fig-133 ure 6). We define symmetric drug cycle as a set of drugs which have an identical parameter 134 value for each type of dynamical event, i.e., {p i r , p i s , p i 0 , g i s , g i 0 } = {p r , p s , p 0 , g s , g 0 } for all i, 135 and an asymmetric drug cycle as having different parameter values for at least one type of We will begin 137 with the simpler case of symmetric drug cycles first and then generalize our observations to 138 asymmetric drug cycles.

139
In both cases simulated by the algorithm (Figure 4   Theorem A.9. However, our analysis will focus on other general cases that the expression 155 (6) can be utilized.

156
The last stage lasts as long as therapy is necessary, keeping the same cellular makeup 157 meanwhile. 158 All the subpopulations and total population of the last stage are presumed to change expo-159 nentially (See the comparison between the population curves and the light-gray guideline 160 curves on Figure 5, 6 (a).). For symmetric cases, it has been proved in Theorem A.10, that 161 the exponential rate is the average of proliferation rates of all subpopulations, All stages except the last one choose better drugs and gradually level the effectiveness of 163 all drugs evenly, so we will call those finite stages the "shaping period". On the other hand, In a symmetric drug regimen, after the shaping period, all the subpopulations maintain  197 In each case, we use one or more 'testing periods' each lasting N∆t where ∆t is the

205
In this particular method plasma cell-free DNA is sampled from a patient with relatively 206 high temporal frequency and used to resolve the corresponding evolutionary dynamics. 207 We use these techniques as a benchmark for clinically leverage-able tumor population data 208 which can be implemented in an algorithm as we describe below. Implementing the subpopulation data over just one single testing period (explained above),

211
we derive several equations which represent the relationships between the data and drug 212 parameters, and then, to find the parameter values. We will approach those problems for 213 two conditions separately: two drugs N = 2, and three or more drugs N ≥ 3.

214
In the case of two drugs, there are only two subpopulations R = {R 1 , R 2 }, and three 215 parameters for each drug i ∈ {1, 2} which are two proliferation rates {p i r , p i s } and one 216 transition rate from sensitive to resistant types g i . No subpopulation or parameters related 217 to neutral cells is included in the system of two drugs. Solving the differential system (1) 218 with the initial conditions, we have the following equations for each ∆t-long time window. (9), we can specify the equations with 221 drug parameters and known values only. 222 Similarly, in the case of a larger number of drugs N ≥ 3, we can solve (1), {s 0 , n 0 , r 0 } and {s 1 , n 1 , r 1 } are elements or the summations of elements in Pop i−1 and Pop i 225 respectively. Applying {s 1 , n 1 , r 1 } to (11) yields, Both (10) for the two-drug cases with three unknown parameters and (12) for N-drug 227 cases with five unknown parameters (N ≥ 3) are underdetermined. One strategy to resolve 228 this issue in either case is by assuming a specific ratio between proliferation rates and tran-229 sition rates, i.e., |p s | = αg for 2 drugs and {|p s |, |p 0 |} = {α s g s , α 0 g 0 } for more drugs, with a 230 reasonably chosen α, α s , α 0 ∈ (0, 1) like α = 0.1. Then, using the conditions and equations 231 (10 or 12) along with the data from one testing period, we can infer all the drug parameters.

232
The discovered parameters will give the complete information required to run the algorithm 233 of optimal administration (Figure 4 (a)). In most cases of cancer, however, detailed information about tumor heterogeneity cannot 236 be detected over time. Clinically, total tumor size is the highest resolution data that can 237 be reasonably (and even then poorly) measured. For such cases with limited data, rather 238 than trying to solve the differential equations as shown in the previous section, we tried to 239 approximate drug effects using the levels of changes of the total population data of testing 240 period, and checked which drug(s) are most reasonable to prescribe. The chosen drug or 241 drugs is continually prescribed as long as it is effective, and its effects are continuously 242 monitored. When the chosen drug loses efficacy, we need to select drugs again. Since we 243 will not know sub-populations and drug parameters, the equation (4) is not applicable. Instead, another (N∆t)-long testing period is required to choose the drugs of the next 'round'.

245
Therefore, in our algorithm of optimal prescription only with total population monitored, 246 we suggest to pair up "testing" period and "therapy" period, and repeat them (See Figure   247 7 (b)). Even though the testing period can transiently worsen outcomes compared to stan-248 dard therapies for a finite period of time, it is the only way to ascertain the relative strength 249 of each drugs selection pressure. So, we strived to reduce the time taken for each testing 250 period. One way we implemented this in our algorithm was to choose and apply multiple 251 effective drugs, rather than one single best drug. We chose this strategy because utilizing a 252 strict criterion of drug selection would not allow for currently chosen drugs to be effective 253 for a long time and would require a change of drug more frequently.

254
Now, let us describe our algorithm for optimal therapeutic prescription (Figure 7  using the current drug(s) or need to re-select drug (s), , 267 which is between the average effect of "effective" drugs, and the highest effect of "ineffec-268 tive" drugs. In our study, we simply fixed the threshold parameter, η = 0.5 (See Figure 7 Figure 7: Algorithm for the administration of the optimal drug schedules when total population data is available, but not subpopulation nor drug parameters. (a) Diagram of the classification of "effective" and "ineffective" drugs, and the threshold of drug effectiveness (e f * ). The effectiveness levels of the most effective drug (e f m ), effective drugs and ineffective drugs are indicated by blue-filled circle, black-filled circles and empty circles, respectively. (b) Flow chart of the optimal therapy algorithm based on the "effective" drugs from (a).
Using both methods, we found that both strategies which were too strict or too generous 277 do not generate tumor histories close enough to the optimal one (See Figure 8 Figure 8: Comparison between the realistic approximation of the optimal therapy (by the algorithm on Figure 7 (b)), and the actual optimal strategy. (a) Optimal therapeutic effect generated by the practical algorithm assuming that only total population is trackable (solid curves) with = 0.03 and η = 0.5, compared to the effect of the optimal therapy in theory (dashed curves). Other used parameters and initial tumor status are the same as in the example in Figure 6. (b) Errors between the approximated and actual optimal histories, over the range of the parameter of effective drug window.

284
The phenomena of collateral sensitivity presents an opportunity to improve effectiveness 285 in drug therapy without the need for drug discovery, a process that requires enormous 286 amounts of time and money. Taking advantage of CS clinically however, requires a better 287 understanding, of the evolutionary dynamics under changing therapy. To address this, we 288 developed a mathematical model of ordinary differential equations describing the effect of 289 a collaterally sensitive drug cycles, and explored the optimal treatment regimens within the 290 confines of this simplified model. Consequently, we found that the optimal therapeutic effect 291 can be derived when we switch drug to the best-effect-drug at every moment. While this is 292 somewhat intuitive, choosing the timing, and order of switching is a difficult prospect given 293 the lack of perfect data in clinical settings, and given the heterogeneity which is hallmark of 294 cancer.

295
In our ODE model, drug switching is implemented by changing drug-dependent pa-296 rameters (P i ) and transitions between cell types. In accordance with the 'optimal' drug 297 schedule, drugs are switched instantaneously after the first stage of single-drug therapy.

298
However, such rapid switching is not feasible clinically. To address this, we developed a 299 time-series algorithm mimicking the ODE system and the nearly instantaneous switches 300 (Figure 4 (a)). As expected, the algorithm-based simulation is smooth with a small time step 301 (Figure 4 (b)) and shows a good consistency with the ODE system ( Figure 5, 6 (a)).

302
The order of the drug sequence in each stage of our example simulations is not exactly   = N) and neutral under all the other drugs R vector of subpopulations, R = ( R 1 · · · R N ) T R * population fractions (i.e., ∑ N k=1 R * k = 1) at which all the drugs are equally effective, derivative of total population, TPD(t) = TP (t)    (Figure 7(b)). Cell population dynamics with this algorithm show a good consistency with 317 the fully informed optimal therapeutic strategy (Figure 8 (a)). The traditional time gaps 318 required to obtain updated diagnostics and prescription is several weeks or months, [22] 319 which would be too long to expect a result close to an optimal solution as we aim for here.

320
Based on the results of our algorithm simulation (Figure 8 (a) follows the dynamics of a rotating resistance and sensitivity pattern, there could be many 330 more types of cells not involved in any cell types within this structure, which will result 331 in treatment failure, as we discussed in Figure 9. In addition, our system and corresponding a analysis lend well to the theoretical and  , vol. 287, no. 1925, p. 20192454, 2020. 409 [20] J. Goldie and A. Coldman, "Quantitative model for multiple levels of drug resistance 410 in clinical tumors.," Cancer treatment reports, vol. 67, no. 10, pp. 923-931, 1983.
Lemma A.5. For any positive integer, n, and an any integer, i, in [1, N] where η is some number in [0, 1) and 0 ≤ f k ≤ 1 with ∑ N k=1 f k = 1.

500
Proof. By the definition (4), the effect of Drug i is 501 e f k = P i · R = (P i ) T R.
where k is a constant representing the level of balanced drug effect. Since {P 1 , P 2 , . . . , P N } 505 is linearly independent, we can isolate R * with an inverse matrix. Proof. The derivative of total population, TPD, is the summation of the vector, dR dt .