## Abstract

Recent advances in Cybergenetics have led to the development of new computational and experimental platforms that enable cellular dynamics to be robustly steered by applying external feedback control. Such technologies have never been applied to regulate the intracellular dynamics of cancer cells. Here, we show *in silico* that Adaptive Model Predictive Control (MPC) can effectively be used to steer signalling dynamics in Non-Small Cell Lung Cancer (NSCLC) cells to resemble those of wild-type cells, and to support the design of combination therapies. Our optimisation-based control algorithm enables tailoring the cost function to force the controller to alternate different drugs and/or reduce drug exposure, minimising both drug-induced toxicity and resistance to treatment. Our results pave the way for new cybergenetics experiments in cancer cells for drug combination therapy design.

## 1 Introduction

Cybergenetics is a recent field of synthetic biology, which refers to the forward engineering of complex phenotypes in living cells applying principles and techniques from control engineering [1].

Three main approaches have proven effective for the control of different processes (such as gene expression and cell proliferation), namely: i) open- or closed-loop controllers embedded into cells by means of synthetic gene networks [2–6]; ii) external controllers, where the controlled processes are within cells, while the controller (either at single-cell or cell-population level) and the actuation functions are implemented externally via microfluidics-optogenetics/microscopy-flow cytometry platforms and adequate algorithms for online cell output quantification and control [7–16]; iii) multicellular control, where both the control and actuation functions are embedded into cellular consortia [17–20]. Plenty of examples of embedded controllers have been engineered across different cellular chassis; in contrast, applications of external and multicellular controllers in mammalian cells are scarce and either purely theoretical or limited to proofs of concept.

Here, we propose to apply cybergenetics, in particular external feedback control, to predict combinations of drugs (i.e. control inputs) which can bring dysregulated cellular variables (i.e. gene expression, control output of the system) within tightly controlled ranges in cancer cells. We take Non-Small Cell Lung Cancer (NSCLC) as an example; using a previously proposed differential equations mathematical model describing the dynamics of the EGFR and IGF1R pathways, we show *in silico* that external feedback controllers can effectively steer intracellular gene expression dynamics in cancer cells to resemble those of wild-type cells.

The use of feedback control is advantageous as it enables coping with changes in both steady-state levels and in the temporal dynamics of genes involved in dysregulated signalling cascades. The control action is implemented by means of an Adaptive Model Predictive Controller (MPC), thus not requiring an exact model of the system; this is particularly advantageous in biological applications, where the derivation of detailed models can be time-consuming and troublesome [21–23].

The possibility to control single/multiple outputs with one or more inputs can support the design of combination therapies which target different nodes in signalling cascades; this approach can be advantageous to maximise the efficacy of cancer therapies [24]. In this regard, our optimisation-based control algorithm enables tailoring the cost function to force the controller to alternate different drugs and/or reduce drug exposure. The controller should also be able to cope with the crosstalk among different signalling pathways and the presence of endogenous feedback loops within signalling pathways, which might be a further mechanism causing drug resistance by adaptive cellular responses [25].

In what follows, we demonstrate that adaptive MPC can be used to effectively steer the concentration of several proteins within different signalling pathways of an NSCLC cell in order to tune gene expression, whilst reducing the dose of each input.

## 2 Methods

### 2.1 Control Scheme Used in External Feedback

We applied a feedback controller to regulate the concentrations of two downstream genes (ERK and Akt) of the mTOR and MAPK pathways as modelled in [26], and as shown in Figure 1.

Figure 2 shows the response of y1(ERK) and y2(Akt) in a wild-type cell and in an NSCLC cell to a phosphorylation of EGFR and IGF1R, as modelled by the initial conditions [26] and referred to as an activation. A wild-type activation is caused by an active concentration of 8000 *μM* for both EGFR and IGF1R, whereas an NSCLC activation is triggered by active concentrations of 800’000 *μM* and 400’000 *μM* for EGFR and IGF1R, respectively. The term ‘Free’ refers to the free response of the NSCLC system (i.e. when no feedback control is applied). It can be seen that this overexpression in cancer cells causes different y1(ERK) and y2(Akt) activation dynamics; of note, the activation of y1(ERK) occurs over a timescale an order of magnitude faster than the activation of y2(Akt).

The two pathways are both kinase-activated cascades, meaning that an activation at the receptors in the cell membrane causes a cascade of phosphorylations in downstream genes. The system is therefore difficult to control robustly: once an error is measured in the outputs, it can be too late to have a significant effect by acting on the internal states higher up the cascade.

A model based controller is needed to interpret a small change in the outputs, rather than using a model free controller with a large gain that will react strongly to all errors in the outputs. For example, a PID controller can be tuned such that the reaction to the initial error of the outputs is enough to mimic the reference; however, the response might be so finely tuned that robust control is not achieved (Section S8). A controller that can preempt the activation curve in the outputs, by estimating the changes in internal states higher up the cascade, can robustly control the NSCLC system.

Adaptive MPC is used as the model based control scheme here (Regulator block in Figure 1). The success of MPC relies on the quality of the model used to predict the future behaviour of the system, and on the cost function the controller uses to calculate the optimal inputs to be fed to the process. The novelty of the controller used here lies in the choice of adaptive model and cost function. The MPC implementation is presented in Section S4.

### 2.2 MPC’s Linearised Model

The NSCLC model [26] contains significant non-linear terms and a large number of internal states. An adaptive MPC controller computes a linear approximation of the NSCLC model at each time step of the controller, predicts the future of the internal states and calculates the optimal input profile. The controller then applies the first input of its calculated optimal input profile to the actual system. At the next time step, the controller recalculates a new linear model by linearising the NSCLC model. The use of a linear system results in a convex optimisation problem which can be solved quickly. Adaptive MPC is used in all simulations unless stated otherwise.
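The receding-horizon loop described above can be sketched on a toy problem. The following is a minimal illustration only, not the implementation used for the simulations: it assumes a hypothetical one-state system with an inhibitor-like input, a one-step prediction horizon and illustrative parameter values (`K_ACT`, `K_DEG`, `KM`, `TS`), none of which come from the NSCLC model [26]. At each step the toy model is re-linearised by finite differences about the current state and the last applied input, the resulting quadratic cost is minimised in closed form, and the input is clipped to its bounds.

```python
# Toy adaptive (successively linearised) MPC with a one-step horizon.
# All names and values are hypothetical and for illustration only.

K_ACT, K_DEG, KM = 1.0, 0.5, 0.2   # assumed toy kinetic parameters
TS = 0.1                            # assumed controller time step
Q_W, G_W = 1.0, 1e-5                # weights on the error and on input use
U_MIN, U_MAX = 0.0, 1.0             # input bounds


def f(x, u):
    """Toy nonlinear dynamics: activation inhibited by u, linear decay."""
    return K_ACT * (1.0 - x) / (1.0 + u / KM) - K_DEG * x


def linearise(x0, u0, eps=1e-6):
    """Return f and its forward-difference input sensitivity at (x0, u0).

    With a one-step horizon the state sensitivity is not needed, because
    the linearisation point is the current state itself.
    """
    f0 = f(x0, u0)
    b = (f(x0, u0 + eps) - f0) / eps
    return f0, b


def mpc_step(x, u_prev, r):
    """Choose the next input from the model linearised at (x, u_prev)."""
    f0, b = linearise(x, u_prev)
    tb = TS * b
    # Linear one-step prediction: x_next = c0 + tb * u
    c0 = x + TS * f0 - tb * u_prev
    # Minimise Q_W * (c0 + tb*u - r)**2 + G_W * u**2 over u, then clip.
    u = Q_W * tb * (r - c0) / (Q_W * tb * tb + G_W)
    return min(U_MAX, max(U_MIN, u))


def simulate(r, steps=300, controlled=True):
    """Forward-Euler simulation of the toy plant, with or without control."""
    x, u = 0.0, 0.0
    for _ in range(steps):
        u = mpc_step(x, u, r) if controlled else 0.0
        x = x + TS * f(x, u)   # the nonlinear plant, not the linear model
    return x


x_free = simulate(r=0.4, controlled=False)
x_ctrl = simulate(r=0.4, controlled=True)
```

In this toy setting the uncontrolled state settles well above the reference, whereas the re-linearised controller holds it close to the reference. A longer horizon would require the state sensitivity as well and a quadratic programming solver instead of the closed-form step.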

Alternatively, non-linear MPC could have been used, but it is computationally expensive and can take longer to compute the next input than the time step between inputs (see Section S7).

### 2.3 Improving the Traditional MPC Cost Function

The cost function used in the adaptive MPC algorithm to find the optimal input depends on the current internal state error, **e**_{0}, and the inputs, **u**(*t*). The error, **e**(*t*), is the difference between the reference, **r**(*t*), and the internal states of the NSCLC system, **x**(*t*), shown in Figure 1. The standard cost function used for linear MPC controllers [27] focuses on how readily the inputs, **u**(*t*), are used and on reducing the proportional error in the states, **e**(*t*).

In order to include both the magnitude and duration of the error, the integral of the state error, ∫ **e**(*t*) *dt*, is added to the standard cost function. It has been shown that it is beneficial to integrate the state error [28]: the controller then acts on the longer, smaller errors caused by states higher in the cascade, which will later have a significant effect on the error in the output.

Moreover, to avoid rapid fluctuations in the control input, **u**(*t*), which would be impractical *in vitro*, a differential cost of the inputs is also added to the cost function.

The focus of the cost function is decided by varying the weights of term coefficients (*α, β, γ, η, θ*) to inform the controller what an optimum solution favours. The derivation of the cost function can be found in Section S4.
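The exact cost function is derived in Section S4; the sketch below only illustrates how proportional, integral, input and differential terms could be combined for a scalar error and a scalar input. The function name, the argument names and the rectangle-rule approximation of the integral are assumptions made for this example; only the roles of the weights (*β* for the output error, *γ* for input use, *η* for the integral term and *θ* for the differential term) follow the text.

```python
def mpc_cost(errors, inputs, u_last, ts,
             beta=1.0, gamma=0.0, eta=0.0, theta=0.0):
    """Evaluate a weighted cost for one predicted trajectory (scalar case)."""
    # Proportional term: squared error at each predicted step.
    proportional = sum(e * e for e in errors)

    # Integral term: squared running integral of the error (rectangle rule).
    integral, running = 0.0, 0.0
    for e in errors:
        running += e * ts
        integral += running * running

    # Input term: squared magnitude of each planned input.
    effort = sum(u * u for u in inputs)

    # Differential term: squared change between consecutive inputs,
    # starting from the last input actually applied.
    differential, previous = 0.0, u_last
    for u in inputs:
        differential += (u - previous) ** 2
        previous = u

    return (beta * proportional + eta * integral
            + gamma * effort + theta * differential)


# A short, large error and a long, small error with equal proportional cost.
short_large = [1.0, 0.0, 0.0, 0.0]
long_small = [0.5, 0.5, 0.5, 0.5]
no_input = [0.0, 0.0, 0.0, 0.0]

p_short = mpc_cost(short_large, no_input, 0.0, ts=1.0)
p_long = mpc_cost(long_small, no_input, 0.0, ts=1.0)
i_short = mpc_cost(short_large, no_input, 0.0, ts=1.0, beta=0.0, eta=1.0)
i_long = mpc_cost(long_small, no_input, 0.0, ts=1.0, beta=0.0, eta=1.0)
```

In this example the proportional term alone cannot distinguish the two error profiles, whereas the integral term penalises the persistent error more heavily, which is the behaviour motivated in Section 2.3.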

### 2.4 MPC Simulation Parameters

The MPC simulations are reproducible due to the deterministic nature of the model and controller, as long as the cost function coefficient weights and other MPC related parameters are consistent. Table 1 gives a summary of these parameters. Several key parameters are added to the caption of each MPC simulation.

### 2.5 Indexes Used to Quantify Control Performance

To assess quantitatively the performance of our controller, we define an Error Index, *EI*. It is the sum of the squared error between the output and the reference for the outputs, as used in [29].

**C** is the output matrix of the linearised NSCLC model. This index is displayed in the top left corner of output plots. A small *EI* indicates a good performance of the controller.

To get a sense of the controller effort needed to achieve a certain output, we assess the dose of input drug(s) received by the cell using a Dose Index, *DI*_{i}. It is the integral of the input signal, where **u**(*t*) = [*I*_{1}(*t*), *I*_{2}(*t*), *I*_{3}(*t*)].

It is indicated in each input plot. The inputs can never be negative as they are physical concentrations, therefore there is no need to square the input signal.
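As an illustration of the two indexes, the sketch below computes them from sampled data using the trapezium approximation mentioned in Section S5. The function names and the sample data are hypothetical; the outputs and references are assumed to be given as lists of equally indexed samples, one list per output.

```python
def trapezium(t, y):
    """Integrate samples y over time points t with the trapezium rule."""
    return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k])
               for k in range(len(t) - 1))


def error_index(t, outputs, references):
    """Sum over outputs of the time integral of the squared tracking error."""
    total = 0.0
    for y, r in zip(outputs, references):
        squared = [(yk - rk) ** 2 for yk, rk in zip(y, r)]
        total += trapezium(t, squared)
    return total


def dose_index(t, u):
    """Time integral of one (non-negative) input signal."""
    return trapezium(t, u)


# Hypothetical samples: one output with a brief error, one constant input.
t = [0.0, 1.0, 2.0]
ei = error_index(t, outputs=[[0.0, 1.0, 0.0]], references=[[0.0, 0.0, 0.0]])
di = dose_index(t, [1.0, 1.0, 1.0])
```

Because the inputs are non-negative concentrations, `dose_index` integrates the signal directly, whereas `error_index` squares the error so that positive and negative deviations do not cancel.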

### 2.6 Crosstalk Within the NSCLC System

Within NSCLC cells, different signalling pathways interact with each other, having evolved to cumulatively work against an external disturbance in one pathway by changing the activation of another. Figure 3 shows the effect of this crosstalk on y1(ERK) for a step input of both *I*_{1} and *I*_{2}. In this simulation *EI* = 2.4237, whereas the *EI* of a free response is 2.4002, showing that, although hardly visible in Figure 3, the inputs used on the mTOR pathway, which would work to reduce the activation of y2(Akt), also increase the activation of y1(ERK) in the MAPK pathway (see Figure 1 for the pathways). This is expected, as the network of pathways perceives the input as an external disturbance.

## 3 Results

Non-Small Cell Lung Cancer (NSCLC) accounts for 80% of lung cancer cases and is characterised by various mutations which usually lead to an overexpression of EGFR and IGF1R. These receptors trigger several cascades including the mTOR and MAPK pathways; their downstream genes FOXO1 and C-FOS regulate cell apoptosis and proliferation, and their dysregulation can lead to tumours. The differential equation mathematical model for NSCLC signalling developed in [26] includes the mTOR and MAPK pathways along with some of the reactions between the two pathways, as shown in Figure 1; it enables comparing gene expression dynamics in wild-type vs cancer cells. We chose the downstream genes ERK and Akt (noted as y1 and y2, respectively) as the control outputs for the external feedback loop (Figure 1). The two outputs can be tuned by varying three inputs (*I*_{1}, *I*_{2}, *I*_{3}), which inhibit three specific proteins within the mTOR and MAPK pathways. The pathways can influence each other’s reactions, creating internal feedback loops (crosstalk).

The code implementing adaptive MPC on this NSCLC model [26], as used in these simulations, is available on GitHub: INSERT LINK HERE.

### 3.1 Assessing the Importance of Each Term in the Cost Function

Firstly, Single-Input Single-Output (SISO) simulations were performed. The controller tries to steer the dynamics of either y1(ERK) or y2(Akt) by varying the concentrations of a drug that acts directly on one of the two signalling cascades (*I*_{3} for y1(ERK) and either *I*_{1} or *I*_{2} for y2(Akt)). Figure 4 uses *I*_{2} to regulate y2(Akt), and shows the effect of different cost function terms on the performance of the controller.

Figure 4A) shows that using integral terms within the cost function reduces the error in y2(Akt) compared to using the proportional terms (*EI* → 1.945 < 10.923). However, it can be seen in plot C) that the controller using integral terms can cause fluctuations in the input. Such fluctuations are reduced when using a differential cost, which also gives a lower Error Index but a higher Dose Index (*EI* → 1.691 < 1.945 < 10.923, *DI*_{2} → 603 > 546 > 129). This is because the cost function is relative to every non-zero term; adding another set of terms therefore makes the other terms relatively less important to the controller. Adding differential terms thus restricts the fluctuation of the inputs at the cost of a higher dose.

The derivative term is not included in further simulations to simplify the tuning of therapies, clearly showing the balance between the Error and Dose Indexes by only costing the integral error and inputs. All subsequent cost functions are consistent with plot C) apart from the weight associated with each input, γ, that is varied.

#### 3.1.1 Single-Input Single-Output Control with the Chosen Cost Function

Figure 5 shows Single-Input Single-Output (SISO) adaptive MPC simulations for *I*_{1}, *I*_{2} and *I*_{3}; each drug is used to control the downstream molecule in the cascade it acts on. Plots C) and D) of Figure 5 are identical to the corresponding curves in plots A) and C) of Figure 4, as both are SISO responses of *I*_{2} using the chosen cost function (costing the integral and input terms). All of the SISO controllers move the NSCLC response towards the wild-type response using a lower dose than a step of each input at the maximum allowed dose (1 *μM*), decreasing the Dose Indexes. The step input response can be found in Section S3. This demonstrates the benefits of using an external feedback loop compared to an open-loop response with a static step input.

### 3.2 Multi-Input Multi-Output Control

Adaptive MPC can also be used to steer both outputs using all three inputs, in a Multi-Input Multi-Output (MIMO) simulation, as shown in Figure 6.

The error of y2(Akt) (Figure 6B)) is significantly smaller in comparison to Figure 5A) and C) (*EI* → 0.791 < 1.954 < 2.75), whilst using significantly less of *I*_{1} (*DI*_{1} → 570 < 1068) and *I*_{2} (*DI*_{2} → 418 < 546), suggesting that it might be advantageous to use adaptive MPC to predict and apply combination drug profiles.

However, due to the fast dynamics of the MAPK pathway, the output y1(ERK) fails to adequately follow the reference activation curve. Figure 7, compared to Figure 6, shows that if the time step is adequately reduced (for instance, to *T*_{s} = 0.02 minutes), the controller can handle the faster dynamics of the pathway and effectively control both outputs (*EI* → 0.001 < 0.791) whilst using a lower dosage of all the inputs (*DI*_{1} → 545 < 570, *DI*_{2} → 297 < 418, *DI*_{3} → 4.35 < 6.87).

The large error within y1(ERK) when *T*_{s} = 1 *min* means that the controller’s optimal input profile will refrain from using *I*_{1} and *I*_{2}, due to the negative effects on y1(ERK) caused by crosstalk. In order to investigate the effect of varying the weights of each input within the cost function to design combination therapies, it was decided to run Multi-Input Single-Output (MISO) simulations with only y2(Akt). This allows a more straightforward comparison of different combination treatments without having to factor in the added effect of the crosstalk from y1(ERK) suppressing the use of either input.

### 3.3 Combination Therapies Using *I*_{1} and *I*_{2}

If cells are exposed to drugs for an extended period of time, side effects and resistance might become an issue [24]. Therefore, the controller should be used to find potential drug profiles that can achieve a similar Error Index (*EI*) whilst reducing the dose of the inputs (*DI*_{i}). The weight associated with using each input, *γ*, within the cost function can be varied for this aim, as shown in Figure 8. The Bliss Independence (BI) formula [30, 31] has been used as a combined normalised Dose Index to summarise the combined effect of multiple drugs.

Figure 8 focuses on the control of y2(Akt) using *I*_{1} and *I*_{2} as control inputs. The weights in the cost function associated with each input can be varied as a ratio, *R* = *γ*_{2}/*γ*_{1}, ranging from low *R* values (where a high weight, *γ*_{1}, is associated with *I*_{1}, producing a SISO-like plot only using *I*_{2}) all the way through to high *R* values (where *γ*_{2} is relatively large and the controller will only use *I*_{1}). Figure 8 compares the normalised Error Index and the Bliss Independence, *BI*, to the weight ratio, *R*. It shows that there is a range of *R* over which both the normalised *EI* and *BI* are significantly reduced. Therefore, the control performance of the MISO controller is better than any SISO simulation while keeping drug concentrations low. For the purpose of designing combination therapies, the optimal input is here associated with the minimum value of the normalised *EI*.

It can be seen from Figure 8 that the minimum occurs when *R* = 10^{5}, corresponding to *γ*_{1} = 1 and *γ*_{2} = 10^{5}. Figure 9 compares the responses obtained using a very low or high *R* value to the MISO simulation at the optimum of *EI*. This optimum achieves a significantly lower Error Index (*EI* → 0.25 < 1.95 < 2.75), and a lower Dose Index (*DI*_{1} → 568 < 1068 and *DI*_{2} → 423 < 546).

The definitions of the normalised Error Index, Dose Index and Bliss Independence are given in Section S5.

### 3.4 Drug Holidays

If using an adaptive MPC, the user can set specific time intervals in which the controller does not give specific drugs (for example, to interrupt related toxicity). These Drug Holidays can be achieved by varying the weights associated with each input, online, during a single simulation. As an example, Figure 10 shows that the controller can retain a low Error Index whilst swapping inputs after 600 minutes (*EI* → 1.989 ≈ 1.950 < 2.75, the SISO *EI* of Figure 5).

Therefore it is shown that a programmed change of cost function weights during the simulation can decide which input to stop using.
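A programmed change of weights of this kind can be sketched as a simple schedule that the controller queries at every time step. The function name, the order in which the inputs are used and the low/high weight values are assumptions for illustration; only the switching time of 600 minutes is taken from the text.

```python
def input_weights(t_min, switch_time=600.0, low=1.0, high=1e5):
    """Return (gamma_1, gamma_2) for the current time in minutes."""
    if t_min < switch_time:
        # Before the switch: I_2 is heavily penalised, so I_1 is favoured.
        return low, high
    # After the switch: I_1 is heavily penalised (its drug holiday begins).
    return high, low


gamma_before = input_weights(0.0)
gamma_after = input_weights(700.0)
```

The cost function would then be rebuilt with the returned weights before each optimisation, so that the heavily penalised input is effectively switched off.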

Alternatively, the controller can be set to only choose one input at each time step. The inputs shown in Figure 11 have an ON or OFF state, 1 *μM* or 0 *μM* (discrete inputs). The step size, *T*_{s}, used makes a significant difference to the response. When *T*_{s} = 1 *min* the drugs can switch ON or OFF every minute, leading to rapidly fluctuating inputs (shown in Section S6). Figure 11 shows the discrete simulation with a larger time step (*T*_{s} = 30 *min*). It can be seen that there is a better performance when compared to the SISO simulations (Figure 5), as *EI* → 1.6878 < 1.95 < 2.75, whilst still using less of each input (*DI*_{1} → 630 < 1068, *DI*_{2} → 450 < 546). This controller benefits from producing combination therapies at a much larger time step of *T*_{s} = 30 *min* whilst including drug holidays. However, when compared to the optimal MISO response (Figure 9), these added constraints result in a higher Error Index (*EI* → 1.6878 > 0.2534) for a higher dose (*DI*_{1} → 630 > 568, *DI*_{2} → 450 > 423).
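One way to picture the discrete-input selection is an exhaustive search over the admissible ON/OFF combinations at each step. The sketch below is a hypothetical scalar example with a one-step linear prediction; the function name, the model coefficients and the choice to also allow both inputs to be OFF are assumptions and do not describe how the simulations in Figure 11 were implemented.

```python
def choose_discrete_input(x, r, a, b, gamma, dose=1.0):
    """Pick the ON/OFF pair (u1, u2), at most one ON, with the lowest cost."""
    candidates = [(0.0, 0.0), (dose, 0.0), (0.0, dose)]
    best, best_cost = None, float("inf")
    for u1, u2 in candidates:
        # One-step linear prediction of the controlled state.
        x_next = a * x + b[0] * u1 + b[1] * u2
        # Squared tracking error plus the weighted use of each input.
        cost = (x_next - r) ** 2 + gamma[0] * u1 ** 2 + gamma[1] * u2 ** 2
        if cost < best_cost:
            best, best_cost = (u1, u2), cost
    return best


# Equal weights: the more effective input is chosen.
equal = choose_discrete_input(1.0, 0.5, 1.0, (-0.3, -0.6), (0.01, 0.01))
# Heavily penalising the second input forces the controller onto the first.
holiday = choose_discrete_input(1.0, 0.5, 1.0, (-0.3, -0.6), (0.01, 1.0))
```

With only three candidates per step the search is trivial; over a longer prediction horizon the number of combinations grows exponentially, which is one reason a larger time step is attractive for discrete inputs.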

## 4 Discussion

Computational methods have been extensively used in the search for effective cancer treatments, with approaches including optimal control to regulate the dynamics of different cell populations [32–41], and feedback action to account for changes in the cancer system, either *in silico* or *in vitro* [42–49]. This is, to the best of our knowledge, the first attempt to use feedback control to regulate intracellular dynamics in cancer cells.

We showed that an adaptive MPC program can be used to inform treatments for NSCLC cells, steering the dynamics of several key signalling pathways, whilst offering a tunable cost function that allows the characteristics of an optimal input to be adjusted. Indeed, the controller can be tuned to choose different drug profiles that will achieve a similar control performance whilst reducing exposure to one or more drugs.

Other control strategies, like PID controllers, are less straightforward to tune, as the gains are not so directly related to the observed output and desired input. The use of a linear model within the MPC algorithm makes the running time of the control algorithm short enough for it to be used, in the future, in external feedback control experiments. Implementing these would require some practical aspects to be considered, which we did not account for. Firstly, there might be delays in cell responses to drugs/actuation, which the model used by the controller should account for. Also, the sampling/actuation time might need to be fast enough if aiming at controlling genes with fast dynamics. This issue might be overcome using experimental optogenetics-based platforms instead of microfluidics-based ones, as they can reduce delays in the actuation. Finally, this method assumes the knowledge of a model. In the future we hope to work on a controller that adapts the model based on online system measurements.

We foresee a growing interest in applying cybergenetics approaches, and in particular feedback controllers, to steer mammalian cell dynamics. If we realise our ambition to implement the experiments proposed here on living cells and, longer term, on patient-derived organoids, feedback control might be a valuable tool for the design of personalised optimal treatments for a range of conditions.

## 5 Conclusion

It has been demonstrated that adaptive MPC can be used to better inform treatments for NSCLC cells, guiding the behaviour of several key signalling pathways and offering a tunable cost function which the user can adjust, modifying the characteristics of the optimal response. The controller can be tuned to focus on: the duration of the output state errors; the rapidity of the inputs; and the use of the inputs themselves. The weights can also be changed during its operation to choose different drug profiles that will achieve a similar performance whilst giving the cell a break from individual drugs. In the future we hope to test the controller on a microfluidics device as currently the controller has only been tested *in silico*.

## S1 NSCLC Model

The model of the NSCLC is [26] d**x**/d*t* = *f*(**x**, **u**), where the vector field *f*(.,.) is detailed in (S4)-(S23), **x** is the state vector containing 21 molecule concentrations (Table S1), **u** is the input vector **u** = [*I*_{1}, *I*_{2}, *I*_{3}]^{T} (orange in (S4)-(S23); each input acts on both the active and inactive target molecule and therefore appears twice in (S4)-(S23)) and **y** is the vector of outputs *y*_{1} = pERK and *y*_{2} = pAkt (blue in (S4)-(S23)).

### S1.1 Initial Conditions

Mutations present in NSCLC cells can lead to an overexpression of *EGFR* and *IGF1R*; this is represented by using different initial conditions for *pEGFR* and *pIGF1R*, which are taken from [26] and lead to the response of the system shown in Figure 2. These initial conditions are used for all the simulations where NSCLC cells are being controlled.

### S1.2 Conservation Equations

Each molecule can either be active, *p*(.), or inactive, but there is a constant concentration of each molecule within the model. Therefore this total, (.)_{T}, is used to define the conservation equations.
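For a generic molecule *X* (a placeholder symbol, not one of the model’s species names), the statement above can be sketched as follows, assuming that the total is simply the sum of the inactive and active forms:

```latex
X_T = X(t) + pX(t) \quad\Longrightarrow\quad X(t) = X_T - pX(t)
```

Under this assumption only one of the two forms needs its own differential equation; the other follows algebraically from the constant total.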

### S1.3 Parameters in the NSCLC Model

## S2 Parameter Choice For Input Interactions

The equation governing the inputs’ reaction with the target molecule is given by

where the Michaelis-Menten parameter values associated with each input are listed in Table S3. *Km* is equivalent to the *IC*_{50} value of the inhibitor being used on each target and is a property of the drug. To make the response of the controller inputs less ‘switch like’, inhibitors with relatively large *IC*_{50} values have been chosen. *Kon* has then been chosen by comparing the model of the drug to western blots for *I*_{1}-3MA [53], *I*_{2}-Oridonin [54] and *I*_{3}-Pimasertib [55].

## S3 Step Input Response

Step input simulations for each input, as described in Section 3.1.1, can be seen in Figure S1. In all cases, the control performance is worse than that obtained with SISO control (Figure 5), as the *EI* is either comparable to (S1A) or higher than (S1C and E) those obtained applying feedback control; also, in all step input simulations, the Dose Index is higher.

## S4 Model Predictive Control (MPC)

MPC uses a model of the Plant (the system to be controlled) in the feedback loop that captures the dominant dynamics of the Plant, in order to estimate the effect of the inputs the controller will optimally choose at each time step. The inputs are chosen to minimise a user-defined cost function that typically includes terms penalising the magnitude of the inputs, **u**(*t*), and the magnitude of errors, **e**(*t*), between the response of the system and reference signals [27], as shown in Figure S2. The feedback loop then measures the outputs of the actual Plant, **y**(*t*), to estimate the actual states, **x**(*t*), and reiterates the MPC scheme to choose each subsequent input, **u**(*t*).

Some additions to the cost function will be discussed in Sections S4.2 and S4.3. MPC controllers are not limited to linear systems; however, non-linear systems result in a larger computational effort and require more complex optimisation solvers, as discussed in Section S7.

For the present numerical study, the Plant is the nonlinear NSCLC model presented in Equations (S4)–(S23) and all the system states, **x**(*t*), are measured directly. In practice, only a few outputs/states can be measured and an Estimator is required to estimate the remaining states from input-output data as shown in Figure S2.

The control reference signal, **r**(*t*), is chosen as the response of a wild type cell (i.e. without cancer). **e**(*t*) is the error signal between the reference, **r**(*t*), and internal states of the Plant, **x**(*t*). **e**(*t*) is fed into the Regulation block of the control scheme; this is where the optimisation problem is solved.

### S4.1 Cost Function Derivation

The internal state errors **e**(*t*) are calculated and fed into the MPC block at each time step. The MPC controller uses the model of the Plant system to predict the future state error of the system for possible combinations of inputs, within the problem constraints, over the prediction horizon, *N*. The controller then optimally chooses the input profile that results in the minimum of a predetermined cost function, *J*(**U**), in the Regulator. The inputs of the first time step of this optimal sequence are then applied to the Plant system. At the next time step, the error in the states is estimated and this process repeats.

Usually, the optimisation problem contains the cost function to be minimised, and the state and input constraints.

The model that the regulator sees is a discrete approximation of the NSCLC model for 1 ≤ *k* ≤ *N* steps. **E** = [**e**(0), **e**(1),…, **e**(*N*)]^{T} describes the current and future predicted state errors. **U** = [**u**(1), **u**(2),…, **u**(*N*)]^{T} describes the future inputs. The weight of the cost function related to each term can vary what is considered the optimal input. **Q** weights the error of the states and **R** weights the use of the inputs [27]. For example, if the drugs used as an input need to be used sparingly, then the weight of the cost function associated with the inputs, **R**, should be relatively large compared to the weight of the state error, **Q**. Constraints on the input bounds are also included here (**U**_{L} ≤ **U** ≤ **U**_{U}; in all simulations **U**_{L} = 0 *μM* and **U**_{U} = 1 *μM*).
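Using the symbols defined above, a standard quadratic MPC problem of this kind can be sketched as follows. This is the generic textbook form with input bounds only, given here as an assumption for orientation; it is not necessarily identical to the numbered equation of the original derivation, and **Q** and **R** are taken to be the weights stacked over the whole prediction horizon.

```latex
\min_{\mathbf{U}} \; J(\mathbf{U}) = \mathbf{E}^{T}\mathbf{Q}\,\mathbf{E} + \mathbf{U}^{T}\mathbf{R}\,\mathbf{U}
\quad \text{subject to} \quad \mathbf{U}_{L} \le \mathbf{U} \le \mathbf{U}_{U}
```

Increasing **R** relative to **Q** shifts the minimiser towards smaller inputs at the expense of a larger predicted error.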

#### S4.1.1 Linear MPC

If a linear approximation of the model can be produced and represented by a state space, the future behaviour of the model can be calculated offline and the optimisation problem is convex (for affine constraints). The minimum can then be found using a quadratic programming solver, which is relatively computationally light (MATLAB R2021b’s ‘quadprog’ solver was used here). Here, a state space was used to represent the Plant’s model.

As long as the current state is known, each future state can be estimated for a given input profile, i.e.

Defining a new notation containing all of the states in the prediction horizon.

Therefore the cost function can be rearranged as

The final term in the cost function can be removed as it is constant with respect to the inputs and therefore will not affect the position of the minimum point; thus,

Only the optimal inputs for the first time step, **u**(1), are then applied to the Plant; the whole optimisation process is repeated at the next time step.
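The steps of this derivation can be sketched in condensed form. The block below assumes, purely for illustration, linear error dynamics with state-space matrices **A** and **B**, a symmetric stacked weight **Q**, and the hypothetical stacking matrices **Φ** and **Γ**; the notation and indexing of the original equations may differ.

```latex
\begin{aligned}
\mathbf{e}(k+1) &= \mathbf{A}\,\mathbf{e}(k) + \mathbf{B}\,\mathbf{u}(k+1), \qquad k = 0,\dots,N-1 \\
\mathbf{E} &= \boldsymbol{\Phi}\,\mathbf{e}(0) + \boldsymbol{\Gamma}\,\mathbf{U}, \qquad
\boldsymbol{\Phi} = \begin{bmatrix} \mathbf{I} \\ \mathbf{A} \\ \mathbf{A}^{2} \\ \vdots \\ \mathbf{A}^{N} \end{bmatrix}, \quad
\boldsymbol{\Gamma} = \begin{bmatrix}
\mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{B} & \mathbf{0} & \cdots & \mathbf{0} \\
\mathbf{A}\mathbf{B} & \mathbf{B} & \cdots & \mathbf{0} \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{A}^{N-1}\mathbf{B} & \mathbf{A}^{N-2}\mathbf{B} & \cdots & \mathbf{B}
\end{bmatrix} \\
J(\mathbf{U}) &= \mathbf{E}^{T}\mathbf{Q}\,\mathbf{E} + \mathbf{U}^{T}\mathbf{R}\,\mathbf{U} \\
&= \mathbf{U}^{T}\left(\boldsymbol{\Gamma}^{T}\mathbf{Q}\,\boldsymbol{\Gamma} + \mathbf{R}\right)\mathbf{U}
 + 2\,\mathbf{e}(0)^{T}\boldsymbol{\Phi}^{T}\mathbf{Q}\,\boldsymbol{\Gamma}\,\mathbf{U}
 + \mathbf{e}(0)^{T}\boldsymbol{\Phi}^{T}\mathbf{Q}\,\boldsymbol{\Phi}\,\mathbf{e}(0)
\end{aligned}
```

The last term does not depend on **U**, which is why it can be dropped without moving the minimum; the remaining quadratic and linear terms are in the form accepted by a quadratic programming solver.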

#### S4.1.2 Weighting of the Traditional Cost Function

The cost function shown in (S30) can be weighted as a balance between the use of the inputs, *γ*; the error in all of the estimated states, *α*; and the error in the outputs, *β*.
where **I** is an identity matrix. The cost function can be tuned by *α, β* and *γ*.

### S4.2 Differential Terms in the Cost Function

The controller may favour rapidly changing input concentrations, which is not ideal for *in vitro* experiments, where there might be delays in the actuation and frequent media changes might cause stress to cells. Therefore, a term related to the gradient of the inputs was added to the cost function to reduce fast variations of the inputs. The linear approximation of the model used within the MPC simulations is discrete, and therefore the derivative is approximated by the scaled difference between inputs at adjacent time steps.

Using the squared sum of the derivative of the inputs, the gradient of the steps between the last actual input and the last step in the prediction horizon can be added to the cost function. Any constant scaling can be dropped as this would just change the weighting added to the term.

The gradient can be added to the cost function, *J*(**U**), through **D** and **d**, pre-multiplying by a scaling factor, *θ*.
where **I** is a square identity matrix (size equal to the number of inputs).
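For a single input, the sum of squared input differences can be written as a matrix expression, as sketched below. The difference matrix and offset vector used here (named `big_d` and `small_d`) are one possible construction chosen for illustration; the **D** and **d** of the original equations may be defined differently, for example as the resulting quadratic and linear coefficients.

```python
def difference_matrix(n):
    """n x n matrix with 1 on the diagonal and -1 on the sub-diagonal."""
    return [[1.0 if j == i else -1.0 if j == i - 1 else 0.0
             for j in range(n)] for i in range(n)]


def differential_cost(inputs, u_last):
    """Squared input differences written as ||D U - d||^2."""
    n = len(inputs)
    big_d = difference_matrix(n)
    # The offset carries the last input actually applied to the Plant.
    small_d = [u_last] + [0.0] * (n - 1)
    residual = [sum(big_d[i][j] * inputs[j] for j in range(n)) - small_d[i]
                for i in range(n)]
    return sum(v * v for v in residual)


def direct_cost(inputs, u_last):
    """The same quantity computed step by step, for comparison."""
    total, previous = 0.0, u_last
    for u in inputs:
        total += (u - previous) ** 2
        previous = u
    return total


fluctuating = [0.0, 1.0, 0.0, 1.0]
smooth = [0.5, 0.5, 0.5, 0.5]
cost_fluctuating = differential_cost(fluctuating, u_last=0.5)
cost_smooth = differential_cost(smooth, u_last=0.5)
```

Expanding the squared norm gives a quadratic term, a linear term and a constant in the inputs, so the differential cost fits into the same quadratic programme as the other terms.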

Comparing plots C) and D) of Figure 4 demonstrates the effect of the differential cost.

An alternative approach would have been to add the gradient as a constraint into the optimisation problem preventing the inputs from varying faster than a limiting value. However, this would have merely limited the maximum rate of inputs’ variation.

### S4.3 Integral Terms in the Cost Function

When controlling a signalling phosphorylation cascade, it is desirable to decrease both the peak and the duration of an error [28]. Both the duration and the peak are included in the integral of the outputs. Therefore, the integral of the output errors should be added as a term in the cost function. The square of the integral errors has been approximated.

The first term is constant and is therefore dropped. **P** and **p** are weighted by *η* in the cost function.

The integral cost is defined in terms of the future state errors **E** rather than the inputs **U**. The matrices defined in Equation (S29), can be used.

The constant terms with respect to **U** can be dropped and the quadratic and proportional terms reorganised.

The cost function, *J*(**U**), can be formed including the integral of the state error.

This is the cost function that has been used in all the simulations with different weights. Figure 4A) demonstrates the effect of the integral cost on reducing both the amplitude and the duration of the outputs.

### S4.4 Adaptive MPC

Adaptive MPC describes an MPC scheme in which the model of the Plant changes as the simulation progresses. Our non-linear model of the NSCLC system, Equations (S4)-(S23), can be linearised about the current estimate of the states of the actual Plant, **x**(*k*), at each time step. This linear model forms the state space in Equation (S27), and the cost function is re-formed at each iteration to better represent the local future dynamics of the Plant. The adaptive linear MPC performs better than a single linear model and is less computationally expensive than a full non-linear model.
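The re-linearisation loop can be illustrated with a deliberately small sketch. Everything below is an assumption made for illustration: the one-state toy plant `f`, the finite-difference Jacobians, the one-step horizon and the grid search over candidate inputs stand in for the NSCLC model, its analytical linearisation and the quadratic programme solved in the paper.

```python
def f(x, u, dt=0.1):
    """Hypothetical one-state plant: production is inhibited by the drug u."""
    return x + dt * (1.0 / (1.0 + u) - 0.5 * x)


def linearise(model, x0, u0, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at (x0, u0)."""
    A = (model(x0 + eps, u0) - model(x0 - eps, u0)) / (2 * eps)
    B = (model(x0, u0 + eps) - model(x0, u0 - eps)) / (2 * eps)
    return A, B


def adaptive_step(x, u_prev, ref, candidates):
    """Re-linearise about the current state, then pick the best input."""
    # A would enter the prediction for horizons longer than one step.
    _A, B = linearise(f, x, u_prev)
    nominal = f(x, u_prev)

    def predicted_error(u):
        # Affine one-step prediction, valid near (x, u_prev).
        return (nominal + B * (u - u_prev) - ref) ** 2

    return min(candidates, key=predicted_error)


x, u = 2.0, 0.0
candidates = [i / 10 for i in range(11)]  # inputs restricted to [0, 1]
trajectory = [x]
for _ in range(50):
    u = adaptive_step(x, u, ref=1.0, candidates=candidates)
    x = f(x, u)  # the "actual" plant is the non-linear model
    trajectory.append(x)
```

The point of the sketch is the structure of the loop: the linear model is rebuilt at every iteration from the current state, so the controller only ever relies on a local approximation of the non-linear dynamics.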

## S5 Normalisation of Indexes And Bliss Independence

To compare simulations easily, it is useful to have indexes that summarise the performance of the controller and the inputs it uses to achieve this performance: the Error Index, *EI*, and the Dose Index, $DI_i$, respectively (normalised here so that multiple plots can be compared within Figure 8, see below).

The Error Index, *EI*, is the sum of the squared errors of the outputs, calculated by integrating the square of each output error signal using a trapezium approximation of the discrete data.

The largest *EI* (worst performance) found in any MPC simulation using the chosen cost function to control y2(Akt) is that of the SISO simulation using *I* (*EI* = 2.75, Figure 5). This value has been used to normalise *EI*.
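The calculation described above can be sketched as follows. The helper names and the sample data are hypothetical; the trapezium rule and the normalising value of 2.75 are taken from the text.

```python
def trapezium(t, y):
    """Trapezium-rule integral of samples y taken at times t."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (t[i + 1] - t[i])
        for i in range(len(t) - 1)
    )


def error_index(t, outputs, references):
    """Sum over all outputs of the integral of the squared output error."""
    return sum(
        trapezium(t, [(y - r) ** 2 for y, r in zip(ys, rs)])
        for ys, rs in zip(outputs, references)
    )


EI_WORST = 2.75  # largest EI reported for the SISO simulations (Figure 5)

# Hypothetical single-output example with a zero reference.
t = [0.0, 1.0, 2.0, 3.0]
outputs = [[2.0, 1.0, 0.0, 0.0]]
references = [[0.0, 0.0, 0.0, 0.0]]

ei = error_index(t, outputs, references)
ei_normalised = ei / EI_WORST
```

Because the error is squared before integration, both a large peak and a long-lasting deviation increase the index.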

$DI_i$ is equivalent to the integration of the input profile for each input, $I_i$. $DI_i$ is normalised by the $DI_i$ of $I_i$'s SISO simulation ($I_1$ and $I_2$ acting on y2(Akt) and $I_3$ acting on y1(ERK) in Figure 5; $DI_{1,\mathrm{SISO}} = 1068$, $DI_{2,\mathrm{SISO}} = 546$, $DI_{3,\mathrm{SISO}} = 4$), producing the normalised Dose Index.

On its own, the normalised Dose Index of each input does not give a quantitative measure of the dose of the combined input profile. Within the current literature, there are many methods that try to summarise the joint effect and toxicity of combination therapies, where multiple drugs are given together at a determined time point [30]; however, these do not consider dynamic dosages over a given time period. A combined effect of the drug profiles can therefore be estimated by replacing these static drug dosages with the normalised Dose Index.

An Isobole can be defined for these therapies in terms of the normalised Dose Indexes. From this definition our combinations are all antagonistic. The Bliss Independence formula, *BI*, assumes that there is no correlation between the two agents.

Our model is deterministic, with each input having a different target molecule; therefore, within these *in silico* simulations there is no correlation between the inputs, and the Bliss Independence formula can be used to gauge the combined effect of two drugs [31]. All three indexes are used in Figure 8 to compare multiple simulations.
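As an illustrative sketch, the standard two-agent Bliss Independence expression, $a + b - ab$, can be evaluated on normalised Dose Indexes. Applying the formula directly to these normalised values is an assumption made here; the raw Dose Index values are those quoted in this supplement for the discrete MISO simulation and for the SISO simulations, and the function names are hypothetical.

```python
def normalised_dose(di, di_siso):
    """Dose Index of an input divided by the Dose Index of its SISO run."""
    return di / di_siso


def bliss_independence(a, b):
    """Standard Bliss formula for two uncorrelated agents: a + b - a*b."""
    return a + b - a * b


# Values quoted in the text: MISO Dose Indexes over their SISO counterparts.
di_1 = normalised_dose(645.0, 1068.0)
di_2 = normalised_dose(438.0, 546.0)

combined = bliss_independence(di_1, di_2)
```

The formula is symmetric in its two arguments and reduces to the single-agent value when the other agent is absent, which makes it a convenient scalar summary for comparing simulations that use different input combinations.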

## S6 Discrete Drug Holiday

A discrete MISO simulation, as described in Section 3.4, where $T_s$ = 1 min, can be seen in Figure S3. This simulation achieves an *EI* = 1.3533, significantly better than the SISO simulations in Figure 5 (*EI* → 1.35 < 1.95 < 2.75), whilst using a lower Dose Index ($DI_1$ → 645 < 1068 and $DI_2$ → 438 < 546). However, in this simulation both inputs rapidly fluctuate between 0 μM and 1 μM; therefore, a longer time step can be used to reduce the fluctuations.

## S7 Linear vs Non-Linear MPC

All MPC simulations use an adaptive linear MPC controller, where the linear model is based on a linearisation of the non-linear model of the NSCLC system, Equations (S4)-(S23). Using non-linear MPC creates a non-convex optimisation problem requiring a more complex (and computationally heavy) solver. The non-linear simulations used MATLAB's `fmincon`, a gradient-based non-linear solver. It is the fastest appropriate solver in MATLAB R2021b.

Figure S4 compares a MISO response using adaptive linear MPC to one using non-linear MPC. The non-linear MPC has a lower Error Index (*EI* = 0.0148) than the adaptive linear MPC (*EI* = 0.2520). However, the non-linear MPC had a significantly higher run-time, as expected.

Non-linear MPC would limit the controller's use *in vitro*, as the time to process the measurements might be longer than the data acquisition sampling time. This would then suggest using a larger sampling time, possibly causing issues with the controller performance (see Figures 6 and 7). When using the adaptive linear MPC controller, each iteration of the algorithm is well within the sampling time and captures the key dynamics of the system, even though some of the non-linear couplings between the states are lost.

## S8 MPC vs Proportional (P) Control Schemes

All feedback simulations have used an MPC controller. Figure S5 compares the performance of a linear adaptive MPC controller to that of a Proportional controller. Because the outputs change relatively slowly, a derivative gain was not used. An integral gain was not used either: the output concentration is almost always greater than the reference, so the integral error never resets to zero, leaving the inputs at a non-zero steady state and causing a high $DI_i$. Therefore, a Proportional (P) controller is used. The two gains (one for each input) can be tuned such that the P controller's initial reaction to the state error results in a relatively low *EI*, as shown in Figure S5; however, the response is sensitive to the choice of gains. It can be seen in Figure S5 B) and C) that the inputs are identical, as the controller does not know the dynamics of the Plant.

In Figure S5 the two controllers obtain a similar performance, but the P controller offers no control over which inputs are used. The P controller does not achieve robust control, as its low *EI* is an effect of the finely tuned gains reacting to the initial error in the output, whereas the adaptive MPC controller can be tailored for specific input choices.
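A minimal sketch of a saturated proportional law makes the limitation concrete. The function name, the gain values and the input bounds of 0 and 1 are assumptions made for illustration; the text only states that one gain per input is tuned.

```python
def p_control(error, gains, u_min=0.0, u_max=1.0):
    """Saturated proportional law: one input per gain, all driven by the
    same output error and clipped to the admissible input range."""
    return [min(u_max, max(u_min, k * error)) for k in gains]


# Hypothetical gains: with equal gains the two inputs are always identical.
equal_gain_inputs = p_control(0.5, gains=[1.0, 1.0])

# Output below the reference (negative error): both inputs are switched off.
inputs_below_reference = p_control(-0.2, gains=[1.0, 1.0])

# Large error: both inputs saturate regardless of the individual gains.
saturated_inputs = p_control(3.0, gains=[1.0, 0.5])
```

Each input depends only on the instantaneous error and its own gain, so the law has no means of preferring one drug over another or of trading off dose against error, in contrast to the weights available in the MPC cost function.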