Gene expression noise randomizes the adaptive response to DNA alkylation damage in E. coli

DNA damage caused by alkylating chemicals induces an adaptive response in Escherichia coli cells that increases their tolerance to further damage. Signalling of the response occurs through methylation of the Ada protein which acts as a damage sensor and induces its own gene expression through a positive feedback loop. However, random fluctuations in the abundance of Ada jeopardize the reliability of the induction signal. I developed a quantitative model to test how gene expression noise and feedback amplification affect the fidelity of the adaptive response. A remarkably simple model accurately reproduced experimental observations from single-cell measurements of gene expression dynamics in a microfluidic device. Stochastic simulations showed that delays in the adaptive response are a direct consequence of the very low number of Ada molecules present to signal DNA damage. For cells that have zero copies of Ada, response activation becomes a memoryless process that is dictated by an exponential waiting time distribution between basal Ada expression events. Experiments also confirmed the model prediction that the strength of the adaptive response drops with increasing growth rate of cells.


INTRODUCTION
The accurate detection and repair of DNA damage is crucial for genome stability and cell survival. In addition to constitutively expressed repair pathways, cells employ DNA damage responses that activate DNA repair factors in the presence of DNA damage. The fidelity of the DNA repair system relies on a series of processes: sensing the presence of DNA damage or DNA damaging agents, inducing a DNA damage response, and correctly repairing lesions. Cell strains with genetic defects that impair the function of any of these processes show sensitivity to DNA damage, elevated mutation rates, and genome instability. However, even in fully repair-proficient cell strains the accuracy of the DNA repair system is fundamentally limited by the stochastic nature of the molecular interactions involved (1, 2). For example, proteins that signal or repair DNA damage perform a random target search and therefore have a finite chance of overlooking lesions (3-7). The repair process itself can also be error-prone and cause mutations, loss, or rearrangements of genetic material (8-12). Traditionally, research has focused on genetic defects and such "intrinsic errors" in DNA repair, i.e. errors that are inherent to the repair mechanism and thus occur with the same probability in all cells of a population.
By comparison, less attention has been given to "extrinsic variation" in the DNA repair system, i.e. fluctuations in protein abundances that may affect the repair capacity of individual cells. Gene expression noise is ubiquitous (13, 14) and difficult for cells to suppress (15). Feedback gene regulation can establish bimodal distributions so that subpopulations of cells maintain distinct states of gene expression for long times. Whereas many biological processes are robust to a certain level of noise, even transient variation in the capacity of a cell to repair DNA damage can have severe and potentially irreversible consequences (16-18).
For instance, cells that transiently express too little of a damage sensor protein may be unable to signal DNA damage efficiently, leading to mutations or cell death. But there may also be evolutionary benefits to heterogeneity and occasional errors in the DNA repair system when cells are facing selective pressure (19-21).
The adaptive response to DNA alkylation damage in Escherichia coli is a case where gene expression noise appears to cause significant cell-to-cell heterogeneity in DNA repair capacity (18, 17 [...]
[...] construction of a quantitative model of the core Ada regulation. The proposed model is remarkably simple, yet accurately reproduces experimental observations, both the cell average and the stochastic behaviour of single cells. The model also predicts cell responses after different experimental perturbations. No additional post-hoc noise term was required in the model; propagation of basic Poisson fluctuations alone was sufficient to explain the observed cell-to-cell variation in response activation. These results establish that intrinsic noise in the basal expression of the ada gene is solely responsible for the stochastic nature of the adaptive response. The model also predicts that the strength of the response should be inversely related to the growth rate of cells, which was confirmed in experiments.

Experimental data
The construction of the model was based on experimental data described in reference (18). Briefly, the adaptive response was monitored in live E. coli AB1157 cells carrying a functional fusion of Ada to the fast-maturing fluorescent protein mYPet (35) that is expressed from the endogenous chromosomal locus, thus maintaining native expression levels. Single cells growing continuously inside the "mother machine" microfluidic device (36) were treated with the DNA methylating agent methyl methanesulfonate (MMS) and Ada-mYPet fluorescence was measured using time-lapse microscopy in multiple fields of view at 3-minute intervals. Fluorescence intensities were calculated from the average pixel intensities within the segmented cell areas. To correct for the background fluorescence, the intensity before MMS treatment was subtracted on a per-cell basis.
Additional experiments (data in Fig. 5) used the same microfluidic imaging setup and acquisition parameters as described in our previous work (17). The only difference was that cell growth rates were varied using minimal medium supplemented with either glucose or glycerol as the carbon source.

Ada response model
The structure of the model is based on previous genetic and biochemical characterization of the adaptive response (23-25). Key to the model is a positive feedback loop in which DNA damage-induced methylation of Ada creates meAda, which acts as a transcriptional activator for the ada gene. The chemical kinetics of the model can be described as a system of ODEs according to the diagram in Figure 1A:
[...]
In the absence of DNA methylation damage, the ada gene is expressed at a constant basal rate kbasal from the PAda promoter. Transcription of the ada gene is induced to rate kind when meAda binds to the PAda promoter with an association rate kon and dissociation rate koff in a non-cooperative manner (37):
[...]
In the deterministic model, Ada is produced according to the fraction of time that the PAda promoter is bound by meAda:
[...]
where kind is the fully induced production rate at saturating amounts of meAda.
Production of Ada and meAda molecules is counteracted by dilution due to exponential cell growth. When time is expressed in units of generation times, the dilution rate is equal to ln(2). In addition to dilution, our model also includes loss of meAda at a constant rate ρ. This feature was required to match the rapid deactivation of Ada expression upon MMS removal that we observed in experiments and that was previously suggested (38, 39). The equation governing the concentration of the inactivated Ada species is:
[...]
This formulation of the model approximates protein expression as one reaction where transcription and translation are described with a single production rate constant. This reduces the number of free parameters of the model and allows direct comparison of the experimental observables (i.e. Ada-mYPet proteins) with the variables in the model. The simplification is valid when protein expression follows first-order kinetics with a single rate-limiting step. This is consistent with the complete lack of ada expression bursting in our experiments (18), and a short half-life and low translation efficiency of ada mRNAs (40, 41).
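As a worked check of the dilution term, the following sketch shows why the dilution rate equals ln(2) when time is measured in generations, and what basal copy number follows in the absence of MMS. The symbols n, n_0, T and τ are introduced here for illustration only, and the derivation assumes that loss by cell division can be approximated as continuous first-order dilution.

```latex
\begin{align}
  n(t) &= n_0 \, 2^{-t/T} = n_0 \, e^{-(\ln 2 / T)\, t}
    && \text{(dilution with generation time } T\text{)} \\
  \frac{dn}{d\tau} &= -\ln(2)\, n
    && \text{(with } \tau = t/T \text{ in units of generations)} \\
  0 &= k_\mathrm{basal} - \ln(2)\, [\mathrm{Ada}]_\mathrm{ss}
    \;\;\Rightarrow\;\;
    [\mathrm{Ada}]_\mathrm{ss} = \frac{k_\mathrm{basal}}{\ln 2}
    && \text{(no MMS, basal expression only)}
\end{align}
```

For a basal production rate of 1 molecule per generation, this gives 1/ln(2) ≈ 1.44 molecules per cell on average.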

Steady-state solution
Setting the time derivatives in equations (1-3) to zero gives the abundances of Ada and meAda at steady state. These can be expressed as the solution of a quadratic equation:
[...]
The total Ada level corresponding to the measured Ada-mYPet fluorescence is given by the sum [Ada] [...].

Numerical solution
The time-dependent solution of the model equations was obtained numerically using the ode45 solver in MATLAB.

Stochastic simulation
We simulated time-traces of Ada expression in single cells using a custom implementation of Gillespie's algorithm in MATLAB (42). To this end, equations (1-3) of the deterministic model were expressed as elementary unimolecular or bimolecular reactions. Gillespie's algorithm assumes memoryless kinetics, which is appropriate for transitions between discrete chemical states where the system is defined entirely by its present state (Markov process). Stochasticity arises due to the discreteness of the states of the system (i.e. the integer number of molecules in the cell) with spontaneous random transitions given by the elementary reactions of the system. At a given time point, the waiting time until the next transition is drawn from an exponential distribution with an expectation value given by the inverse of the sum of all rates exiting that state (i.e. the rates of molecule production, conversion, and loss). Which of the possible transitions occurs is then chosen randomly with probabilities according to the relative rates of the reactions. Initial molecule numbers were drawn from a Poisson distribution defined by the basal expression rate (see Fig. 2A).
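The two steps described above (exponential waiting time, then a rate-weighted choice of reaction) can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and is not the implementation used in this work. It covers only basal production and first-order loss of a single species, and the function names and parameter values are hypothetical choices for illustration.

```python
import math
import random


def gillespie_birth_death(k_prod, k_loss, n0, t_end, rng):
    """Simulate zero-order production and first-order loss of one species.

    Returns the event times and the molecule count after each event.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        rates = [k_prod, k_loss * n]   # production, loss
        total = sum(rates)             # sum of all rates exiting this state
        # Waiting time to the next reaction: exponential with mean 1/total.
        t += rng.expovariate(total)
        if t >= t_end:
            break
        # Choose the reaction with probability proportional to its rate.
        if rng.random() * total < rates[0]:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted mean of a piecewise-constant trajectory."""
    acc = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        acc += n * (t_next - times[i])
    return acc / t_end


if __name__ == "__main__":
    rng = random.Random(1)
    # Assumed values: 1 molecule per generation, dilution rate ln(2).
    times, counts = gillespie_birth_death(1.0, math.log(2), 1, 5000.0, rng)
    print(time_average(times, counts, 5000.0))  # expected near 1/ln(2)
```

Treating loss as a continuous first-order reaction is a simplification of cell division. Under that assumption the long-run average of the simulated copy number should approach the deterministic value k_prod / k_loss.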

Model parameters
Parameters were either obtained by direct experimental measurement (18) [...]
[...] caused rapid activation of Ada expression within 2 cell generations and steady-state expression was reached within ~10 generations (Fig. 1C). For lower MMS concentrations (<350 µM), initial response activation was delayed by more than 5 generations and expression reached steady state only after ~20 generations of treatment. The numerical solution of the model closely matched the measured dynamics (Fig. 1C), using the same set of parameters as for the steady-state analysis. Furthermore, the model confirms that removal of MMS leads to deactivation of the adaptive response (Fig. 1D). [...]
[...] level. Single-molecule counting of Ada-mYPet showed that the average production rate in the absence of DNA methylation damage is as low as 1 Ada molecule per cell generation (18). This is equivalent to a population average of 1.4 Ada molecules per cell, given that the loss rate by cell division is ln(2) per cell generation. The distribution of Ada numbers ranged from 0 to ~6 molecules per cell. Spontaneous induction of higher Ada expression in the absence of MMS treatment was never observed in experiments (Fig. 2A inset). Ada copy numbers were well described by a simulated Poisson distribution when the mean was fixed by the average expression from experiments (Fig. 2A). The integer numbers of Ada molecules can be viewed as discrete cell states and transitions between these states occur with a constant (memoryless) probability given by the average production and loss rates. [...]
[...] but activation times were extremely broadly distributed across cells (Fig. 2C). Delays of more than 20 generations were frequently observed, a time in which a single cell can grow into a colony of millions. Even at high MMS concentrations (500 µM - 2 mM), activation times differed by multiple generations between cells (Fig. 2D).
In contrast to response activation, removal of MMS caused all cells to switch off the adaptive response uniformly (Fig. 2C). [...]
[...] (Fig. 2). In particular, simulations reproduced the random Ada expression bursts at low MMS as well as the stochastic activation followed by sustained Ada expression at high MMS concentrations. Simulated cell traces also showed uniform deactivation of Ada expression after MMS removal. Importantly, no additional features or noise terms had to be added to the model to reproduce these features. [...]
[...] delay times between addition of MMS and first activation of the adaptive response in single cells (Fig. 3A). The delay time distributions from stochastic simulations of the model closely resembled those from experiments. However, it is evident that the fluctuations in Ada expression after response activation are larger in experiments than in the simulated trajectories (Fig. 2). The additional variation likely reflects "extrinsic noise" (50) due to fluctuations in factors that influence Ada expression but were not included in the model, such as RNA polymerase and ribosome concentrations and variation in the length of the cell cycle. Nevertheless, the close match of the simulated and experimental delay time distributions shows that stochasticity in the initial activation time of the response is not influenced by such external noise sources but can be solely attributed to basic Poisson fluctuations in ada gene expression.
Fig. 3 Activation of the adaptive response is delayed by gene expression noise. (A) [...]
In particular, noise in the low basal expression of Ada is responsible for a subpopulation of 20-30% of cells that do not contain any Ada molecules (18). These cells are thus unable to activate the auto-regulatory adaptive response until they generate at least one Ada molecule. For simulated data, it is possible to calculate response delay times conditional on the initial number of Ada molecules at the time of MMS exposure. This analysis confirmed that the average delay time between MMS addition and generation of the first meAda molecule converges to zero with increasing MMS concentration only for cells that initially contain one or more Ada molecules (Fig. 3B-C). But for cells lacking any Ada molecules, the average delay time approaches a limit defined by the average waiting time between stochastic basal expression events (Fig. 3B-C). In the model, the basal ada production is a zero-order reaction with an MMS-independent rate constant. Thus, response activation for cells without Ada molecules follows a memoryless process with an exponential distribution of delay times (Fig. 3C) [...]
The predictive power of the model was tested by comparison to experiments in which cells were subjected to perturbations that alter the regulation of the adaptive response in a defined manner (18) (Fig. 4).
When cell division was inhibited for 45 minutes using the antibiotic cephalexin prior to MMS treatment, Ada molecules accumulated in cells and activation of the adaptive response became uniform in the population (18). In agreement with this, preventing loss of molecules for 45 minutes was sufficient to generate a uniform response in the simulations (Fig. 4B). The response was also perturbed genetically by supplementing endogenous Ada-mYPet expression with a plasmid that is present at 1-2 copies per cell and expresses ada from the PAda promoter. The slight overexpression of Ada strongly reduced cell-to-cell variation upon MMS treatment and eliminated the population of cells with a delayed response (18). I modelled this perturbation by doubling the ada gene copy number in the simulations (Fig. 4C). This alteration resulted in uniform response activation as seen in experiments (Fig. 4C). However, the simulations generated higher Ada expression levels than measured experimentally, likely because Ada overexpression is toxic in experiments (18).
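The copy-number arithmetic behind the zero-Ada subpopulation and its exponential waiting time, as discussed in the stochastic analysis above, can be checked with a short calculation. This is a sketch under stated assumptions: basal production of 1 molecule per generation and a dilution rate of ln(2) per generation are the values quoted in the text. The function and variable names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k molecules for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_still_waiting(t, rate):
    """Probability that no event of a zero-order process occurred by time t."""
    return math.exp(-rate * t)


k_basal = 1.0            # basal production, molecules per generation (from text)
dilution = math.log(2)   # loss rate per generation (from text)

mean_copies = k_basal / dilution        # population average copy number
p_zero = poisson_pmf(0, mean_copies)    # fraction of cells with no Ada
mean_wait = 1.0 / k_basal               # mean wait for first basal event

if __name__ == "__main__":
    print(f"mean copies per cell: {mean_copies:.2f}")
    print(f"fraction with zero Ada: {p_zero:.2f}")
    print(f"mean waiting time (generations): {mean_wait:.1f}")
    print(f"P(no event within 3 generations): {prob_still_waiting(3, k_basal):.3f}")
```

Under these assumptions the Poisson zero class comes out at roughly a quarter of cells, consistent with the 20-30% reported, and the mean wait for the first basal expression event is one generation.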

Effect of cell growth rate on the response strength
The doubling time of E. coli in rich growth medium is shorter than the time required to replicate the chromosome. This is achieved by initiating new rounds of replication before completion of the previous round (51). The early duplication of genes close to the replication origin increases their expression in proportion to the replication initiation frequency and thus maintains protein abundances at faster growth. The ada gene, however, is located at 49.7 min on the chromosome map in the vicinity of the terminus region, so its expression is expected to drop with increasing growth rates (Fig. 5A). This expectation can be examined with the model of the Ada response. I fixed the Ada expression rate while modifying the growth rate, and hence the dilution rate. A strong Ada response occurred during slow growth whereas faster growth did not sustain a response at the same MMS concentration (Fig. 5B). I tested this prediction experimentally by growing cells in minimal medium supplemented with glucose or glycerol carbon sources, which led to 42 min or 75 min generation times, respectively. These measurements confirmed the inverse relation between the growth rate and the strength of the adaptive response (Fig. 5C [...]
The role of noise in the fidelity of DNA repair has been investigated in eukaryotes, where nucleotide excision repair involves stochastic and reversible assembly of repair factors into large complexes (52, 53). Collective rate control renders the overall repair pathway robust to variation in the abundances of the individual components (34). The situation is the opposite for damage signalling by Ada, which acts alone in the regulation of the adaptive response, and where feedback amplification results in extreme sensitivity to gene expression noise.
Remarkably, random variation in the abundance of Ada by just a single molecule was responsible for separating isogenic cells into distinct populations that either induced or failed to induce the DNA damage response. This had important consequences because the lack of a damage response decreased survival and increased mutation in those cells (17).
The adaptive response has been described as "a simple regulon with complex features" (24). Instead of attempting to incorporate all mechanistic details, the model described in this article reduces the ada regulation to its central features. For example, methylation of both Cys38 and Cys321 residues in the N- and C-terminal Ada domains is required for optimal activation of the PAda promoter (54). The model uses only one effective methylation rate and does not distinguish between single or double methylation of Ada. Furthermore, unmethylated Ada has been reported to inhibit meAda-dependent transcription activation (39), a feature that was not explicitly included in the model. The adaptive response also interacts with other cellular responses and processes. For instance, the alternative sigma factor RpoS induces Ada expression upon entry into stationary phase (24), while the SOS response is crucial for initial survival of alkylation damage and contributes to alkylation-induced mutagenesis (17). Considering these simplifications, it is remarkable that the most parsimonious model of the adaptive response not only succeeds in quantitatively reproducing a large spectrum of stochastic single cell dynamics but also in predicting the system's behaviour after different experimental perturbations.