Mechanistic modeling of cell viability assays with in silico lineage tracing

Abstract Data from cell viability assays, which measure cumulative division and death events in a population and reflect substantial cellular heterogeneity, are widely available. However, interpreting such data with mechanistic computational models is hindered because direct model/data comparison is often muddled. We developed an algorithm that tracks simulated division and death events in mechanistically detailed single-cell lineages to enable such a model/data comparison and suggest causes of cell-cell drug response variability. Using our previously developed model of mammalian single-cell proliferation and death signaling, we simulated drug dose response experiments for four targeted anti-cancer drugs (alpelisib, neratinib, trametinib and palbociclib) and compared them to experimental data. Simulations are consistent with data for strong growth inhibition by trametinib (MEK inhibitor) and overall lack of efficacy for alpelisib (PI-3K inhibitor), but are inconsistent with data for palbociclib (CDK4/6 inhibitor) and neratinib (EGFR inhibitor). Model/data inconsistencies suggest (i) the importance of CDK4/6 for driving the cell cycle may be overestimated, and (ii) that the cellular balance between basal (tonic) and ligand-induced signaling is a critical determinant of receptor inhibitor response. Simulations show subpopulations of rapidly and slowly dividing cells in both control and drug-treated conditions. Variations in mother cells prior to drug treatment all impinging on ERK pathway activity are associated with the rapidly dividing phenotype and trametinib resistance. This work lays a foundation for the application of mechanistic modeling to large-scale cell viability assay datasets and better understanding determinants of cellular heterogeneity in drug response.

Data availability is a primary challenge for human whole cell modeling 29,41,42 .
Large-scale databases [43][44][45] containing cell viability assay data exploring sensitivity of multiple cell lines to multiple drugs are an attractive resource in this regard.Assessing how well computational models based on current knowledge explain this data can reveal existing knowledge gaps and inform next stages of research.
An obvious pre-requisite to leveraging such data sets for modeling purposes is the existence of robust methods for comparing simulations to experimental readouts appropriately.Most viability assays measure cell population size or a proxy.Typical mechanistic models do not explicitly describe cell division and death events, whose balance dictates cell population size, but usually prescribe empirical relations 22,46 .
Course-grained agent based modeling [47][48][49] can account for division and death events and were described in, for example, process control and optimization of production of therapeutic proteins using mammalian cell culture 50 , analyzing the impact of cell 4 population heterogeneity in colony and tissue context 51 , and elucidating the role of heterogeneity in IFNβ signaling 52 .However, there remains a need for algorithms that can simultaneously track large-scale mechanistic detail of drug response with cell division/death events.Such algorithms would help interpret the large body of cell viability response data for building better human whole-cell models.
In this work, we present an algorithm that combines detailed mechanistic descriptions of drug action with single cell lineage-resolved division and death events to construct simulation outputs that are directly comparable to cell viability assay data.As a test case, we use a previously developed large-scale model of single mammalian breast epithelial cell (MCF10A) proliferation and death, SPARCED 53 , and add to it mechanistic pharmacodynamic models based on known binding interactions between drugs and modeled targets.First, we describe the algorithm and the types of novel analytics that can be derived, such as cell population size dynamics, cross-generational biomarker tracking for cell lineages, and cell population dendrograms.Then, we simulate dose responses to multiple drugs for which experimental data exist, namely, alpelisib (PI-3K inhibitor), trametinib (MEK inhibitor), palbociclib (CDK4/6 inhibitor) and neratinib (EGFR inhibitor).Simulations agree with experiments for strong growth inhibition by trametinib and overall lack of efficacy of alpelisib; however, there is substantial model-experiment discrepancy for palbociclib and neratinib.Analysis of model discrepancies suggests that (i) the importance of CDK4/6 for driving the cell cycle is likely overestimated, and (ii) the cellular balance between basal (tonic) and ligand-induced ERK signaling is a critical determinant of EGFR inhibitor response.Simulations show subpopulations of rapidly and slowly dividing cells in both control and drug-treated conditions.We find that variations in 5 mother cells prior to drug treatment all impinging on ERK pathway activity are associated with the rapidly dividing phenotype and trametinib responsiveness.This work lays a foundation for the application of mechanistic modeling to large-scale cell viability datasets, which are critical to the development of whole human cell models, and also provide a unique analytical framework to generate hypotheses on molecular drivers of cellular variability across generations.

Lineage-resolved single-cell simulation framework
We first set out to construct a simulation algorithm that mirrors drug dose response viability assays (Fig. 1).These assays typically start with a population of asynchronously cycling cells that are treated with drug for ~3 days and then assayed for final cell number (or a metric proportional to it).The final cell number is related to the number of division and death events each initial single cell ultimately experienced.Thus, the simulation algorithm starts with a population of asynchronously cycling cells and counts the individual division and death events from each initial cell.Throughout this manuscript, we use our previously published mechanistic model of single cell proliferation and death signaling 53 representing MCF10A breast epithelial cells (SPARCED).In principle, however, any model with single-cell resolution and division/death readouts is potentially compatible with the approach.
The algorithm begins with the creation of a simulated population of asynchronously-cycling, single cells (Fig. 1A).The initial model state is an average, serum-starved cell (non-cycling).The first step is to generate a population of cells with heterogeneous gene expression profiles, enabled by descriptions of intrinsic noise in gene expression as previously described 21 .We refer to this process as "heterogenization".After 48 simulated hours, when the distribution of most protein levels across the cell population stabilizes, the addition of growth media is simulated (in the case of MCF10A cells and this model-EGF and Insulin).Subsequently, synchronized cell cycle progression is observed in simulations for an additional 48 hours, creating so-called "Generation 0" (Gen 0).To convert these Gen 0 synchronized cells into Gen 1 7 asynchronously cycling cells, we sample random times during the 48 hour growth media treatment window for each single cell.These selections become initial conditions for Gen 1, which is then subjected to simulated drug treatment for 72 hours (Fig. 1B).
Once these simulations are completed, the outputs are analyzed to determine cell division events (based on CyclinB-CDK1 peaks for this model) and their time points (Fig. 1B).Based on the cell division time points, the remaining simulation time (difference between division time and 72 hours), and initial conditions for each daughter cell are determined for the next generation, and lineage information is recorded.Subsequently, simulations for the next generation are run and this cycle continues until no division events occur in a given generation.Detected cell death events (based on cleaved PARP dynamics for this model), halt a lineage.These simulations not only mirror typical drug dose response experiments but also enable lineage-resolved analyses (Fig. 1C).Since individual division and death events from a parental cell are tracked, it also allows dynamic tracking of observables (such as ERK or Akt activity) across multiple generations of any single cell lineage.Lineage dendrograms can be also constructed, as is typical in such analyses.Such capability may generate hypotheses linking drug sensitivity or resistance with cell fates and lineage, or variations in biochemistry that predispose cells to response or resistance.

Comparing Simulated Drug Dose Responses to Experimental Measurements
With the above algorithm, we could now compare model predictions to experimental data for cell viability drug responses.Specifically, we focused on four previously-studied, targeted anti-cancer drugs for which our model includes primary targets and significant off-targets: trametinib (MEK inhibitor), alpelisib (PI-3K inhibitor), neratinib (EGFR inhibitor), and palbociclib (CDK4/6 inhibitor) 54 .We extended SPARCED by including known drug interactions with target species [55][56][57][58] and leveraging previously described capabilities to robustly and easily increase the model scope 53 (see Methods).
We then performed lineage-resolved simulations for various doses of the modeled drugs for which experimental data are available 54 , with no adjustment to the SPARCED model (besides addition of the drug pharmacodynamic modules-see Methods), using a starting population of 100 cells.This framework allows direct simulation of the dynamic cell population size in response to drug doses (Fig. 2A).The effect of drug action can be visualized by cell lineage dendrograms, showing in this example the clear effect of moderate trametinib doses to reduce the number of cell division events (Fig. 2B-C).The simulation outputs were used to calculate growth-rate inhibition 59 metrics which is the same method applied to the experimental dataset, allowing direct comparison of experimental and simulation results (Fig. 2D-G).The simulation results for trametinib (Fig. 2F) demonstrate surprising agreement with experimental data considered no parameter fitting was done, and also expectedly indicate substantial cell-to-cell variability at low doses.The simulations also captured the overall lack of efficacy for alpelisib, albeit with some slight differences in dose response slope (Fig. 2D).On the other hand, predicted palbociclib (Fig. 2G) and neratinib (Fig. 2E) responses were substantially different from experiments.Subsequent analyses explore the nature of these differences, as well as potential reasons for the cell-to-cell variability in the trametinib response.

Redundant for Cell Cycle Progression
What could explain the experiment/simulation discrepancy for palbociclib, a potent inhibitor of CDK4/6, canonically understood to be a central mediator of cell cycle progression from G0 and G1 to S-phase [60][61][62] (Fig. 2G)?Simulated palbociclib dose response starts to deviate from the experimental results at doses as low as 0.01 μM.Above 0.1 μM, the simulated dose response shows complete cytostasis.On the other hand, experimental results show minimal growth inhibition at 0.1 μM, and even high doses indicate only partial growth inhibition.
One consideration was doubling time.We reasoned that if the experimental doubling time was slower (greater) than the simulated doubling time, then in simulations many more cell divisions would be inhibited by the drug.That may explain why the simulated effects of palbociclib were much greater than that observed in experiments.
The used GR metric in principle should help to account for such doubling time-related phenomena 59 , but it also relies on assumptions such as exponential growth and constant drug effect on growth, which may not be satisfied.The model predicted a slower doubling time (~48 hours) than was reported in experiments for these MCF10A cells (~18-25 hours) (Fig. 3A and 54 ), although a wide range is reported for this cell line (~48 63 or even ~96 hrs 64 ).This is opposite of the difference we expected may explain the dose response curve discrepancy.Therefore, we conclude doubling time differences are unlikely to explain the observed differences.
Another consideration was restriction point behavior with respect to CDK4/6 activity, where a cell continues to divide even after complete inhibition of CDK4/6 at some 10 point in the cell cycle [65][66][67] .We reasoned that if the model does not capture such behavior, simulated palbociclib treatment could immediately stop cell divisions instead of letting already committed cells continue to divide once, leading to more predicted potency than observed.To explore this in simulations, we analyzed the probability of cell division in Gen 1 versus Gen 2 cells, as a function of dynamic progress in the cell cycle at the time of simulated drug treatment (Fig. 3B).Prior studies place the restriction point early in the cell cycle 65 .Simulations with saturating palbociclib dose (0.1 M) reflect such behavior, where most cells divide once if the cell cycle is at least ~10% completed, but subsequent cell divisions are nearly non-existent.This effect is also clear from the simulated lineage dendrogram which shows most cells divide once but not subsequently with this dose of palbociclib (Fig. 3C).Thus, we conclude that modeled restriction point behavior is also unlikely to explain discrepancies.
Finally, we considered that the canonically understood role of CDK4/6 as modeled in SPARCED is simply inadequate.That is, the assertion that CDK4/6 activity is a necessary and sufficient step to drive the early cell cycle 65,68,69 .A clinical line of evidence is the fact that CDK4/6 inhibitors have limited efficacy outside of hormone-positive breast cancers 60 .It has also been reported that proliferation can occur in CDK4/6 knockout cells 70 .More recent data have suggested that CDK4/6 activity has more of a probabilistic effect on cell cycle progression 71 , and the restriction point may be more reversible than previously thought in response to CDK4/6 inhibition 72 .CDK activities may also be overlapping; for example CDK2 and CDK4/6 may be compensatory 73 , and a sensor integrating multiple CDK activities 74 was shown to be highly predictive of restriction point behavior 75 .Therefore, we conclude that most likely, fundamental model reformulation is 11 needed to capture the effects of palbociclib, and that the canonical view of CDK4/6 as necessary and sufficient for cell cycle progression is inadequate.

The Balance of Tonic Versus Ligand-Induced Growth Factor Signaling is Critical for Capturing Drug Effects
Neratinib is an irreversible inhibitor of the EGFR (with some off-target activity for the closely related ErbB2/HER2), a receptor tyrosine kinase that, upon ligand binding, activates the pro-proliferative and survival ERK and AKT pathways [76][77][78] .Hence, drug action is expected to block ERK and AKT signaling when a ligand, such as EGF, binds to EGFR.The experimental dose response (Fig. 2E) shows strong growth inhibition at doses above 0.1 μM and complete cytostasis at ~3 μM.However, simulation-predicted growth inhibition within this range is significantly weaker.
To explain this discrepancy, we considered that the current modeled balance of ligand-induced versus basal (also called tonic) ERK signaling could be incorrect.Specifically, that basal ERK signaling was too strong and causes non-negligible proliferation in the absence of EGF.If cell cycling is initiated by basal signaling too strongly, coupled with the fact that neratinib cannot inhibit basal signaling, this could explain some of the model-experiment discrepancy.
MCF10A cells are dependent upon EGF for cell cycle progression 79,80 .Thus, in simulations, cells dividing without EGF would support the above explanation.In simulations where the growth media contained only insulin, some cell division events were observed (Fig. 4A).Since the proliferative signaling activity that caused these divisions did not originate as a result of simulated EGF-EGFR activity, simulated neratinib treatment cannot inhibit these.This is inconsistent with the experimentally observed cell 12 behavior and hence may be a major cause of mismatch between simulation and experiment.
How could the model be changed to account for these mismatches?First, we ensured that basal ERK signaling in the presence of insulin minimally induces cell cycle progression.Basal Ras-GDP to Ras-GTP exchange is the main reaction controlling basal ERK activity in the model.We reduced the value of the associated rate constant until the probability of cell division in the absence of EGF and presence of insulin was near zero (Fig. 4B-last point on left, 2x10 -4 s -1 ), and then simulated the dose responses again (Fig. 4C-F Further model refinement in this regard, therefore, will be the scope of future work.

Explaining Single-Cell Heterogeneity in Division Rate and Trametinib Response
A commonly observed phenotypic variation among cells within a population is the division rate 81,82 .Under both control (no drug) and trametinib (~half-maximal response dose = 0.03 nM) treatment conditions, simulations show large variability in the number of divisions arising from a particular Gen 1 mother cell (Figs.5A-B).Since rapidly dividing simulated cells (indicated by red) are present prior to drug treatment and persist after drug treatment, in simulations they are largely responsible for the partial response to trametinib.Could properties in the initial state of Gen 1 mother cells at the time of 13 trametinib treatment be a predictor of resistance, in this case marked by persistent rapid division in the presence of drug?Do the same mechanisms that drive rapid division in control conditions apply to this drug resistance?
To answer these questions, we first focused on simulated control cells.The initial conditions of Gen 1 cells under control conditions were extracted into a cell-by-species matrix (5,000 simulated cells by 934 initial conditions) (Fig. 5C).To identify species that may be associated with rapid division, we performed principal components analysis of this matrix, and colored cells by their division phenotype (Fig. 5D).The second principal component (PC2) stratified Gen 1 mother cells based on the number of divisions.To identify the most important model species contributing to PC2, we analyzed the loadings (Fig. 5E).Species with large loadings were associated with ERK signaling pathway components or downstream early cell cycle components.This suggests a simple hypothesis-that any fluctuation giving rise to higher ERK signaling capacity is associated with the rapid division phenotype under control conditions.Does this finding hold true under trametinib treatment conditions?That is, are cells with higher ERK signaling capacity more likely to retain rapid division phenotypes in the presence of sub-saturating doses of trametinib?Trametinib is a MEK inhibitor, and since MEK is a key component of the ERK signaling pathway, there is a clear connection.
Trametinib treatment shifts the number of divisions distribution to the left, reducing the number of rapidly dividing cells (Fig. 5B).To answer the question, we generated a new initial Gen 1 mother cell state matrix from trametinib-treated simulated cells, and then applied the projection learned from PCA of the control data onto this matrix.The results of this projection indicate again that rapidly-dividing cells cluster towards higher values 14 on PC2 (Fig. 5F).We conclude that in simulations, the rapidly dividing phenotype is driven by a multitude of factors impinging on higher ERK signaling pathway capacity, and these cells are likely to remain rapidly dividing in the presence of trametinib, contributing to acute resistance.

Discussion
Data availability is a major bottleneck for systems biology model development.
While there is a wide range of drug dose response viability assay data available, it is difficult to use for large-scale model development because simulation outputs often do not recapitulate the experiment outputs-cell number from cumulative division and death events in single cells.To address this gap, we developed a lineage-resolved simulation framework that tracks individual cell division and death events along with mechanistic detail that enables inference for why single cells have different outcomes.We demonstrate application of this framework using our previously developed model of proliferation and death signaling in single mammalian cells 53 , but in principle any model that simulates division and death events should be compatible.For palbociclib, the simulations overpredicted its efficacy, showing very high growth inhibition at moderate doses to complete cytostasis at high doses.This reflects the indispensability of the drug target, CDK4/6, as per the model of the cell cycle pathway.However, in experiments, even the highest doses resulted in only partial growth inhibition.CDK4/6 is associated with traversing the cell cycle restriction point 83 .In pre-S-phase cells, one of the regulators of restriction point, Rb, is bound to a key transcription factor of the cell cycle process, E2F.In presence of a growth stimulus, CDK4/6 is activated when bound to cyclin D. This activated cyclin D-CDK4/6 complex can phosphorylate and inactivate Rb, which then releases E2F.Subsequently, there is an upregulation of E2F which then mediates S-phase entry and progression by activating cyclin E and cyclin A.
The mechanism of CDK4/6 inhibitors such as palbociclib attempt to induce cytostasis by preventing the inactivation of Rb by CDK4/6 84 .This canonical understanding places CDK4/6 as indispensable, similar to how it is modeled, but experiments did not agree with this.One of the known resistance mechanisms of CDK4/6 inhibition is the loss of Rb function 85,86 .However, since MCF10A cells do not harbor such mutations, it is an unlikely explanation in this case.Another reported resistance mechanism in cancer cells is the overexpression of Cyclin E 87,88 , which is a regulator of the later stages of cell cycle, but is also not the case in MCF10A cells.In the results section, we investigated mismatch between model and experimental doubling time and restriction point behavior, finding neither likely to explain the discrepancies.Therefore, we think the most likely explanation is that CDK4/6 is simply not as indispensable for the cell cycle as contemporary views may portray.It has been increasingly reported that CDKs can compensate for one another 73 , so the activity of other CDKs could compensate for CDK4/6 activity in actively cycling cells.Such mechanisms were not included in the original cell cycle submodel 89 , so these additions are likely important for capturing effects of cell cycle-targeted therapies.
For neratinib, the simulations underpredicted its efficacy, showing weak inhibition for moderate to high doses whereas the experiments showed significant growth inhibition to complete cytostasis within this range.To investigate this discrepancy, we considered the progression of ERK proliferative signaling within the single cell model and how the neratinib drug action might affect it.Neratinib is an irreversible inhibitor of the EGFR, which attempts to block ERK and Akt proliferative signaling by inhibiting ligand-receptor interactions.The model incorporates both ligand-induced and basal signaling along the ERK pathway.In simulations, if cells enter the cell cycle in the absence of ligand, it would result in proliferation that the drug action would be unable to inhibit.To test this, we performed subsequent simulations where EGF was absent from the growth media, but several simulated cells were still cycling.This is contrary to the experimental observations that MCF10A cells do not proliferate without EGF 79,80 , and explains the discrepancy observed between simulation and experimental results of neratinib dose response.Furthermore, we sought to account for this mismatch by altering a key model reaction modulating basal ERK activity, basal Ras-GDP to Ras-GTP exchange rate.We reduced this rate constant to minimize the probability of cell division in absence of EGF and ran all dose response simulations again.This time, the neratinib dose response showed closer alignment with the experimental result, but we observed overprediction of growth inhibition for all other drugs, presumably due to the altered balance between basal and ligand induced ERK signaling.Hence, ideally, the model should incorporate a more improved balance between basal and ligand induced signaling for describing cell proliferation events.In our previous work, stochastic single cell simulations initiated from a representation of a serum-starved MCF10A cell minimally entered S-phase without EGF 21 .However, for cell population simulations done here, single cells are subject to randomized sampling for induction of an asynchronously cycling population which more closely resembles the experimental conditions whereby drug treatment is applied after growth media is introduced to the cells.Also, cells were followed for much longer, which amplifies small percentages of cells still cycling with Insulin treatment alone.Thus, the model's limitations become more apparent here at the population level.
The neratinib case study highlighted an important future direction focused on parameter estimation for such models with stochastic components.This is a challenging area due to the computational cost of model evaluation, and the wide range of datasets that are needed to constrain large stochastic models.One part of our previous work was to do this for a subset of rate constants by "initialization" 21,90 .In initialization, certain model parameters and initial conditions are determined for a specific cell-line context using a set of focused parameter estimation operations which aim to tune parameters based on constraints placed on model observables.It is a computationally intensive process whereby each parameter estimation step performs iterative execution of deterministic model simulations.The SPARCED model is composed of a stochastic gene expression and a protein biochemistry module which are executed simultaneously.However, communication bottlenecks between the modules caused the computation time to be impractical for the purpose of initialization 90 .Recently, we solved the communication bottleneck problem which speeded up the deterministic execution by over 200-fold 90 .
Fast deterministic parameter estimation solvers have been reported for large-scale models as well 22 .This drastic increase in computation speed for deterministic simulations will allow a more exhaustive exploration of the model parameters essential for defining a more robust initialization protocol, but extending this to stochastic evaluation remains an important unsolved problem.
For trametinib, model predictions closely resemble experimental observations.Trametinib has high specificity for MEK1/2; once MEK1/2 is inhibited, it is no longer able to phosphorylate ERK1/2 91 .ERK signaling controls the G1/S-phase transition of the cell cycle 91 through activation of RSK, which in turn upregulates the production of Cyclin D and CDK4/6.Cyclin D expression through the G1 phase can drive the cell through the G1/S-phase checkpoint, as described above.When these events are inhibited by trametinib, the cell is unable to progress through the G1/S-phase checkpoint.However, results indicate that low-medium doses of trametinib are unable to reduce cycling in every cell in the population.Spencer et al. hypothesize that the signaling response patterns of a mother cell pass on to daughter cells, enabling an expression pattern to continue through multiple generations 92 .Therefore, we hypothesized that the initial values of the Generation 1 mother cells could predict the number of divisions that occur from said mother cell.The number of division events was able to be explained by principal component analysis, with higher ERK signaling capacity in multiple regards was associated with an increased number of division events.Thus, in this case acute resistance to trametinib is simply related to a multitude of biochemical factors all impinging on increasing the activity of the target ERK pathway.
In conclusion, we have developed an algorithm that takes a mechanisticallydetailed model of stochastic proliferation and death, and generates lineage-resolved simulations that can be used to interpret dose response viability data and better understand cellular response heterogeneity.Specific demonstrations revealed new insights into drug response, cell cycle biology, rapidly dividing phenotypes, and acute drug resistance.Given the abundance of such drug dose response viability data that is available, we expect this work to help address the modeling bottleneck in data availability, for building mechanistic models of single cell behavior.
Palbociclib.Palbociclib enters and leaves the cell and the nucleus with first-order kinetics and the same rate constant (0.01 nM -1 s -1 ).Nuclear palbociclib reversibly binds to its target, nuclear CDK4/6, with a dissociation constant (Kd) of 1.9 nM and mass action kinetics (kon = 0.001 nM -1 s -1 ; koff = 0.0019 s -1 ).Any drug-bound species loses its kinase activity.Any drug-bound species undergoes first-order decay with a rate constant equal to that of the non-drug-bound species.
Trametinib.Trametinib enters and leaves the cell with first-order kinetics and the same rate constant (0.01 nM -1 s -1 ).Cytoplasmic trametinib reversibly binds to its target, unphosphorylated free MEK, with a dissociation constant (Kd) of 0.35 nM and mass action kinetics (kon = 0.001 nM -1 s -1 ; koff = 0.00035 s -1 ).Any drug-bound species loses its kinase activity and ability to bind substrates.This is a simplification of trametinib action but is effective for capturing the broad effects of downregulating the ERK pathway.Any drugbound species undergoes first-order decay with a rate constant equal to that of the nondrug-bound species.
Neratinib.Neratinib enters and leaves the cell with first-order kinetics and the same rate constant (0.01 nM -1 s -1 ).Cytoplasmic neratinib binds irreversibly to free EGFR, ErbB2, and ErbB4 with first-order kinetics (kon = 10 -4 nM -1 s -1 ).While this is a kinase inhibitor, and drug bound complex loses kinase activity, for simplicity we disallow subsequent interaction with other receptors and ligands.Any drug-bound species undergoes first-order decay with a rate constant equal to that of the non-drug-bound species.

Lineage-Resolved Simulations
Asynchronous population.Cell population simulations are initiated by creating a representation of an asynchronously cycling cell population.The starting size of the cell population is specified by the user.For each starting cell, initial conditions representing an average serum-starved MCF10A cell are used to create a heterogenized cell population 21 .For heterogenization, we run stochastic single cell simulations for 48 simulated hours under serum-starved conditions, using the initial conditions of the average serum-starved MCF10A cell.Thus, the intrinsic gene expression noise incorporated within the single cell model leads to heterogeneity in protein levels across the starting cell population over the duration of simulation time.Then, simulated growth media with EGF (3.3 nM) and insulin (1721 nM) is introduced and another series of stochastic simulations are run for each individual cell for 48 hours.From the generated trajectories, for each cell a timepoint is randomly selected from a uniform distribution using the NumPy randint function.The conditions at this time point for each cell are used as the initial conditions for the first generation.Single-cell simulations are executed for all first generation cells for the user-specified duration (typically 72 hours).
Identifying cell division events.Once the single-cell simulations are completed, the generated outputs are analyzed to determine cell division events.The cell division events are detected by identifying troughs in Cyclin B-CDK1 trajectories.For this, we defined a python function combining the find_peaks methods in the SciPy signal processing library and the n-th discrete difference calculation method (along any given axis) in the NumPy library.For any individual cell, if a division event is detected, timepoints after the occurrence of cell division events are discarded and the state vector at the time of cell division is selected as the initial condition for two new second generation cells.
Identifying cell death events.Cleaved PARP is the readout for cell death 21 .For any single cell, if more than half of PARP has been cleaved at any time point, the cell is labeled dead at that time point.Subsequent generations.For each generation, state matrices for individual cells are obtained and saved as part of output dataset.In the event of a cell division, we retain the state matrix of the mother cell until the time point of division and the remaining portion is truncated and discarded.For every cell, we scan the output for the duration of its lifetime to find division events.To determine the required simulation time for next generation of daughter cells, the division time point is subtracted from the total simulation time.The single cell outputs at the time point of division of each mother cell is recorded as initial conditions for the next generation of daughter cells.Thus, we define the required simulation time, population, and initial conditions for the next generation of cells.This process is repeated for the subsequent generations of cell populations.In a given generation, if there is no cell division event observed within the simulation time, the population simulation is terminated.
Implementation.Computation is performed using HPC-compatible parallel processing in Python whereby single cell simulations are run in individual CPU threads.
To run the cell population simulation, a computational environment with an implementation of MPI (Message Passing Interface), such as OpenMPI 93 on Linux and MSMPI on Windows systems needs to be set up in addition to the dependencies of the SPARCED model pipeline.Before the simulation can be performed, the SPARCED model is built using the python script under scripts/createModel.py, which creates an executable single cell model based on the specifications in the input files.Once the model build process is complete, MPI can be used to run cell population simulations using the below command.Upon completion of simulations, the results are saved to disk as Python pickle objects for analysis and visualization.

mpirun -np [n_cpu] python cellpop_drs.py --arguments [argument_value]
Here, n_cpu is the number of MPI processes that the user decides to use for parallelization.The arguments are: sim_name: User-defined string to create a directory under sparced/output where simulation outputs will be saved.Dose responses from cell population simulations were calculated using the growth rate inhibition metric (GR) 59 .Dose response simulations were run for 10 dose-levels matching experimental data for each drug and 10 replicates of each dose.Outputs from the cell population simulations were read and analyzed to determine the total number of living cells over time for the duration of experiment time.Using the number of living cells at 72 hours, the GR scores were computed for each replicate was computed with the Python script provided as part of GR-metrics git repository.

Calculation of Fractional Cell Cycle Progression
Cell cycle progress estimation.For the palbociclib dose response, the extent of cell cycle progress at the time of drug addition was estimated using a function of average cyclin concentration levels.CyclinE-CDK2, CyclinA-CDK2 and CyclinB-CDK1 species concentrations were converted to a relative measure based on their observed peaks.An average of these three variables over time generated an oscillating function, with troughto-trough distance representing total cell cycle time.We determined an average trajectory of this function using a deterministic simulation.Then we calculated the function trajectory from individual Gen 1 cells We calculated the relative cell cycle progression as the ratio of temporal distance between timepoint of alignment and its previous trough to distance between two neighboring troughs.

27
Estimation of cell division probability given cell cycle progression.For any given drug dose, the cell cycle progression of all cells at the time of dose administration was calculated.Then all living cells were grouped into those that divided and did not divide in both Gen 1 and Gen 2. Gaussian kernel density estimation was used to estimate the probability density function for each group.Using the probability density function, the number of cells for dividing and non-dividing groups within cell cycle progress time intervals with increments of 0.01 were estimated.For each interval, the probability of division at Gen 1 and 2 were calculated using the ratio of number of dividing cells and total number of cells.

Principal Components Analysis
The number of progeny arising from each Gen 1 cell ('mother cell') was determined from control condition simulations as above.Z-score normalization was applied to the initial condition matrix (cells-by-species) using standardScaler.transform in scikit-learn.
Principal component analysis was completed using decompostion.PCA in scikit-learn.
To apply this projection to the drug-treated simulated cells, we generated a new initial state matrix from simulated mother cells treated with 0.032 nM trametinib and normalized as above.
).The new simulated neratinib dose responses show closer alignment with experiments.However, for all other drugs, experiment-model agreement became significantly worse, most likely now because the absolute levels of EGF-induced ERK signaling are altered.This result reinforces the close interacting nature of signaling mechanisms in the model for influencing broad features of drug response, and cautions against developing models without considering comparison to a compendium of data.
We compare model simulations to experimental data for viability response to four different targeted anticancer drugs.Discrepancies between model and data for palbociclib and neratinib, elaborated on further below, lead to new biological understanding.Deeper analysis of trametinib cases suggest mechanisms of resistance and what drives rapidly cycling cells in general.

cellpop:
An integer specifying the number of starting cells for simulation exp_time: Duration of experiment in hours drug: String specifying species name for the drug of interest, available options for our work include: alpel_EC, nerat_EC, trame_EC and palbo_EC dose: Applied concentration of the drug in μM egf: EGF concentration in nM ins: insulin concentration in nM hgf: HGF concentration in nM nrg: heregulin concentration in nM pdgf: PDGF concentration in nM igf: IGF concentration in nM fgf: FGF concentration in nM GR Score Calculation

1 Figure 1 .
Figure 1.Workflow of the developed simulation algorithm.A. An asynchronously cycling cell population (Gen 1) is initiated by sampling the initial conditions at random time points from a pool of single cell simulations run with growth factor stimulation (Gen 0).B. Upon the execution of each generation, detection of new cell division events (or lack thereof) within simulation time determines the creation or not of a next generation.C. (left and center) Cross-generational trajectory of observed ERK and AKT activity from a randomly-chosen single-cell lineage.Varying colors represent subsequent generations, starting from Gen 1. (right) In silico lineage tracing capability is demonstrated with a lineage dendrogram.Lines representing individual cells are labeled with generation and index.

Figure 2 .Figure 3 .
Figure 2. Lineage Resolved Simulations for Comparing Simulated and Experimental Cell Viability Assays.A-C.Dose response simulations for an example drug (trametinib).Median (across simulation replicates) cell population dynamics for several doses (A) and cell population lineage dendrograms for specific doses: 0 µM (B) and 0.1 µM (C) are shown.D-G.Simulated dose responses measured in GR-value for four drugs compared experimental data.Error bars are standard error taken from original experimental data, or as calculated across simulation replicates (n=10).

Figure 4 .Figure 5 .
Figure 4. Simulation Analysis to Investigate Neratinib Dose Response Discrepancy.A. Lineage dendrograms under control (no drug) conditions without EGF but with insulin.There are multiple rapidly dividing cells.B. Dependence of the probability of cell division as a function of the rate constant controlling basal Ras activation.Simulations were done as in A, without EGF and with insulin.The baseline value for the rate constant in the current published version of the model is designated by the red diamond.C-F.Dose response curves as in Figure 2, except with the altered basal RasGTP activation rate constant from panel B (2x10 -4 s -1 ).