Bayesian Modeling Reveals Ultrasensitivity Underlying Metabolic Compensation in the Cyanobacterial Circadian Clock

Mathematical models can enable a predictive understanding of mechanism in cell biology by quantitatively describing complex networks of interactions, but such models are often poorly constrained by available data. Owing to its relative biochemical simplicity, the core circadian oscillator in Synechococcus elongatus has become a prototypical system for studying how collective dynamics emerge from molecular interactions. The oscillator consists of only three proteins, KaiA, KaiB, and KaiC, and near-24-h cycles of KaiC phosphorylation can be reconstituted in vitro. Here, we formulate a molecularly-detailed but mechanistically agnostic model of the KaiA-KaiC subsystem and fit it directly to experimental data within a Bayesian parameter estimation framework. Analysis of the fits consistently reveals an ultrasensitive response for KaiC phosphorylation as a function of KaiA concentration, which we confirm experimentally. This ultrasensitivity primarily results from the differential affinity of KaiA for competing nucleotide-bound states of KaiC. We argue that the ultrasensitive stimulus-response relation is critical to metabolic compensation by suppressing premature phosphorylation at nighttime. Synopsis This study takes a data-driven kinetic modeling approach to characterizing the interaction between KaiA and KaiC in the cyanobacterial circadian oscillator and understanding how the oscillator responds to changes in cellular metabolic conditions. An extensive dataset of KaiC autophosphorylation measurements was gathered and fit to a detailed yet mechanistically agnostic kinetic model within a Bayesian parameter estimation framework. KaiA concentration tunes the sensitivity of KaiC autophosphorylation and the period of the full oscillator to %ATP. The model reveals an ultrasensitive dependence of KaiC phosphorylation on KaiA concentration as a result of differential KaiA binding affinity to ADP- vs. ATP-bound KaiC. Ultrasensitivity in KaiC phosphorylation contributes to metabolic compensation by suppressing premature phosphorylation at nighttime.

mature phosphorylation at nighttime. 48 Introduction which means that it can transfer a γ-phosphate group from a bound ATP to a phosphorylation site, but unlike 152 a typical phosphatase, it regenerates ATP from ADP during dephosphorylation, i.e., der the assumptions that i) the apo state is in a quasi-steady state, ii) the ADP and ATP on-rates are identical, 167 and iii) ATP release is slow, nucleotide exchange can be modeled as a one-step reaction 168 ATP + A C DP ) and ATP-bound T phosphoforms (k T,TP b ) are two Figure 2A and B, solid curves); we examine the effect of varying [KaiA] and %ATP in the following sections. 228 At the beginning of the phosphorylation reaction, KaiC molecules are predominantly in the ADP-bound U 229 state (C U DP ), the end product of the dephosphorylation pathway in the absence of KaiA (Figure 2A). With the 230 addition of KaiA, the C U DP state becomes rapidly depleted within the first 10 minutes of the reaction and 231 enters the C U TP state. Consistent with the kinetic ordering observed in the full oscillator, the C U TP population 232 is primarily converted into the T phosphoform over the S phosphoform. The exact pathway underlying the 233 preference for the T phosphoform is not well constrained by the data, but it appears to be the result of 234 more than just a difference in the relative U→T and U→S phosphorylation rates; there is an interplay be- 235 tween KaiC phosphorylation and KaiA (un)binding kinetics (see SI and Figure S6). The ADP-and KaiA-bound 236 T phosphoform species are unstable kinetic intermediates, and the population accumulates at the C T TP bot-237 tleneck for the first 4 hours. As phosphorylation reaches completion, the T phosphoform is converted first 238 into A C D TP through the unstable ADP-bound intermediates, and then to the C D TP state; the populations of the 239 A C D TP and C D TP states are comparable at steady state. We note here, however, that previous measurements We were surprised to see that the model fit predicts that the KaiA binding affinity for the ATP-bound T 250 phosphoform is lower than those for the S and D phosphoforms. This is apparently in contradiction with  Interestingly, the %ATP sensitivity of the system cannot be fully abolished even at saturating KaiA con-281 centrations ( Figure 2C). This effect can be interpreted qualitatively in terms of the structure of the model.    To assess the robustness of the model prediction across the ensemble, we use two metrics proposed 320 by Gunawardena (2005) to quantify the shape of the predicted stimulus-response curves for %C P at any 321 fixed %ATP: we use EC10 to measure the extent to which the curve acts as a threshold and EC90 -EC10 322 to measure the extent to which the curve acts as a switch. Here, ECx is the KaiA concentration required to 323 reach x% of the steady-state phosphorylation level at saturation. Figure 3E shows the distribution of these 324 quantities in the ensemble at 25% ATP. Overall these statistics are tightly constrained by the training data 325 set, and are clearly distinct from those from hyperbolic stimulus-response relations ( Figure 3E, dashed gray 326 line).

327
Given the robustness of the prediction, we sought to experimentally verify the shape of the stimulus-328 response function. We measured KaiC phosphorylation at t = 24 h at various concentrations of [KaiA] at 329 three %ATP conditions ( Figure 3B and Figure S8B). Consistent with model prediction, the experimentally-330 derived stimulus-response relations are ultrasensitive with an %ATP-dependent phosphorylation threshold, 331 and the stimulus-response relation of the S phosphoform at 100% ATP is non-monotonic. We then quan-332 tified the shape of the stimulus-response curve for %C P at 25% ATP using the same two metrics defined 333 above ( Figure 3E, yellow star). At 25% ATP, the shape of the experimentally-derived stimulus-response 334 curve is close to that of the model prediction, but the model fit is systematically less threshold-like (i.e., 335 smaller EC10) and less switch-like (i.e., larger EC90 -EC10). This inconsistency is likely due to a combination 336 of training data under-determining the shape of the curve at the sub-micromolar range (compare Figure   337 2C with Figure 3B) and the fitting method under-estimating uncertainties (see SI). 338 Lastly, the saturation phosphorylation levels in the steady-state measurements appear systematically 339 lower than those implied by the training data set (compare Figure 3B with D left). This may be a result 340 of batch-to-batch variations in protein and nucleotide quantification. This difference can be corrected by 341 refitting the full model to the steady-state measurement ( Figure 3B and Figure S8C). The refit results sug-342 gest that errors in protein and nucleotide concentrations primarily affect the kinetic properties of the S 343 phosphoform in the model (Figure S8D), but the refit does not change the qualitative conclusions.  In the ensemble of parameter sets, the KaiA dissociation constant of the ADP-bound (but not ATP-bound) 357 U phosphoform (C U DP ) is in or below the nanomolar range, much smaller than that of any other species of 358 KaiC ( Figure 1D). This is consistent with recent single molecule observations suggesting that the unphos-  In the model, the differential KaiA binding affinity leads to the following dynamics: KaiA promotes phos-366 phorylation by catalyzing the exchange of the bound ADP for ATP, but this process is in a kinetic competition 367 with ATP hydrolysis, which returns KaiC to the ADP-bound state. At the beginning of the phosphorylation 368 reaction, almost all the KaiA is bound to C U DP (Figure 2A  where the kinetic competition of multiple substrates for enzyme binding leads to ultrasensitivity. Here,

385
KaiA acts as the enzyme, while the ADP-bound U phosphoform and the T (as well as S and D to a lesser 386 extent due to phosphorylation ordering) phosphoform are the substrates that compete for KaiA binding.  The model suggests that the U phosphoform plays a special role in generating ultrasensitivity due to 391 the significant difference in the affinity of KaiA for the ATP-vs. ADP-bound states of KaiC ( Figure 1D). This 392 observation leads to two testable predictions. First, the amount of KaiA required to activate phosphoryla-393 tion should be higher when more U phosphoform is present. We tested this prediction experimentally by  (Table 1). Interestingly, a model where the KaiA on/off rates are com-413 pletely independent of the state of KaiC (model -n,-p; Figure S9D and E) is much worse than either model 414 -n or model -p (Table 1). We conclude that the nucleotide-bound state of KaiC plays a key role in regulating 415 its interaction with KaiA and thus in determining phosphorylation kinetics. KaiA to KaiBC is more labile or has lower affinity than previously assumed, or that the sequestration kinetics 446 are slow compared to the length of the dephosphorylation stage. In either case, substantial amounts of 447 KaiA appear to remain free of KaiABC complexes during oscillation.

448
The fact that the phosphorylation threshold scales with %ATP suggests that ultrasensitivity may also phosphorylation of some molecules, which can lead to phase decoherence, manifest as decaying oscilla-480 tion ( Figure S7E). The ultrasensitive stimulus-response that we report here implies that a finite amount of 481 KaiA must be liberated from KaiB before there is a noticeable impact on KaiC phosphorylation. In other 482 words, the inhibitory effect of ultrasensitivity is a synchronization mechanism. Importantly, the EC50 of the 483 stimulus-response function scales with %ATP, such that the capacity of phosphorylation suppression by C U 484 is enhanced at low %ATP, which compensates for weaker KaiA sequestration ( Figure 4F). This relation likely 485 also contributes to the scaling of the phosphorylation limit cycle size with %ATP ( Figure 4G). At higher %ATP, 486 the EC50 is smaller and thus more KaiA needs to be sequestered to trigger dephosphorylation, which im-487 plies that higher concentrations of the S and D phosphoforms need to accumulate to enable KaiB binding.

488
Since KaiC phosphorylation is ordered, this means that the T phosphoform concentration scales with %ATP 489 as well. In this work we undertook a data-driven kinetic modeling approach to understand the metabolic sensi- . 526 We hypothesized that ultrasensitivity in KaiC phosphorylation plays a role in stabilizing the oscillator at and K on = k DP on =k TP on is a ratio of the two nucleotide binding rate constants. 569 We make two further simplifying assumptions. First, we assume the on rates are completely diffusion 570 controlled and are thus the same for ATP and ADP, which allows us to set K on = 1. Second, based on 571 fit results ( Figure S4F) showing that the posterior for k TP r has a long tail to negative infinity in log space, 572 we follow the approach proposed by Transtrum and Qiu (2014) and set k TP r = 0; i.e., the dwell time of 573 ATP-bound states are sufficiently long such that a bound ATP cannot be released without first giving up its 574 γ-phosphate group. This assumption implies that the only ways for KaiC to enter an ADP-bound state are 575 through hydrolysis and phosphorylation, and solution ADP has no effect on the system except to slow down 576 the ADP to ATP exchange process. With these two assumptions, we eliminate (9) and (8) reduces to (4).

577
Model parameterization In Results, we introduced a model parameterization scheme in which rate con-  Figure S1 and Figure S10). The multiplicative-factor scheme introduces 38 584 ∆k parameters. Because of the requirement for detailed balance (see below), only 34 of these parameters 585 are free; these free parameters are listed on Table 2. The advantage of the multiplicative parameterization 586 scheme is that it facilitates ' 1 regularization, discussed below.

587
Detailed balance All elementary reactions, except ATP hydrolysis and nucleotide exchange, are assumed 588 to occur in equilibrium, and thus the free energy change over each reversible cycle must be equal to zero.

Initial conditions
Dirichlet(a) ‡ μM * phos., phosphoform; nuc., nucleotide-bound state † The mean of the priors, -, is zero unless specified by Table 5. § or s −1 ·μM −1 for the second-order rate constant k a . ‡ a = (20; 100; 1; 1; 1; 1; 1; 1); points drawn from the distribution are scaled by the total KaiC concentration. The support of the Dirichlet distribution implies that only seven of the eight initial conditions are free fitting parameters.
In practice, this means that the product of all rate constants in the forward direction of each cycle listed on 590 Table 3 must be equal to that in the reverse direction (see also Figure S10A). This introduces an additional 591 algebraic constraint for each such cycle, which is used to eliminate one free ∆k parameter. In total, one can 592 eliminate four such arbitrarily chosen parameters.

593
Fitting data set To constrain the model parameters, we collected experimental measurements that char-594 acterized different aspects of the KaiA-KaiC subsystem, which are summarized in Table 4. 595 The dephosphorylation reaction taken from Rust et al.  total KaiC concentration is conserved, this introduces seven additional free parameters (see Table 2).

621
For the dephosphorylation reaction, we do not estimate the initial conditions. To mimic the way the ex-622 periment was done, the dephosphorylation reaction is simulated in two stages. In the first stage, we assume Here, p(") is the prior distribution, which represents subjective belief in the model parameters " prior to In other words, the likelihood function gives the probability for observing a given data set provided that surements, which is then estimated during fitting as a hyperparameter (see Table 2).
The choice of the Gaussian likelihood function applies to all (de)phosphorylation data sets, but not the That is, the proportion of the initial walkers generated from each optimized walker is weighted by its  In general, the number of walkers in ensemble MCMC needs to be larger than the number of free pa-716 rameters; here the number 224 is chosen to optimize parallel performance on a local computer cluster (8 717 nodes × 28 CPU cores/node). 718 We found that this procedure outperformed either ensemble MCMC or numerical optimization by itself 719 (compare Figure S3A and B). For the full model we ran this procedure three times to assess the reproducibil-720 ity of the fit (see SI for further discussion). proposed by a "stretch move" where z is a random number drawn from the distribution The "stretch factor"¸is a tunable parameter that controls the step size. In an N-dimensional parameter that there is a model M with a well-defined functional form, whose parameters " need to be determined.

771
For the sake of model comparison, we make this assumption explicit and rewrite (11) as Assuming that we have no prior preference for any model, the ratio becomes the Bayes factor which we adopt as the metric for model comparison.

776
The primary difficulty in computing the Bayes factor is thus estimating the evidence, or the marginal 777 likelihood function, for each M i . There are several methods for computing the evidence (Vyshemirsky and 778 Girolami, 2008). Here we derive a formula compatible with the ensemble MCMC scheme that is directly 779 analogous to free energy perturbation (Zwanzig, 1954). For the sake of simplicity, we drop the model index where the first equality follows from the law of total probability and the second equality assumes that the for 0 ≤ -≤ 1 and then note that (24) can be recast as for 0 = -0 < -1 < · · · < -N = 1, and the -s are chosen to allow for sufficient overlap between successive 788 q -(")s. Each fraction in (25) is given by where the averages · qn can be approximated with MCMC. Equation (27)   For each simplified model in Table 1 and Table S1, the ensemble of walkers from the last time step of  Table 2) are centered on the best fit values, so that the refit model can be interpreted 803 as the "minimal" perturbation to the best fit that enables agreement with the steady-state measurement.
Here, P max , [I], K 1 , K 2 , and b are free model parameters. Unlike the Hill function, EC50 is not an explicit 809 parameter of (28) and thus needs to be determined numerically. Note that (28) can be reduced to a right- which is equivalent to a threshold-hyperbolic stimulus-response function, using 98% deuterated water (D 2 O). The proteins were purified by Ni-NTA affinity chromatography followed the gels were stained in SimplyBlue SafeStain (Invitrogen) and then imaged using Bio-Rad ChemiDoc Im-849 ager. The oscillatory reactions ( Figure 2D) were also monitored using the plate reader assay described in

951
Convergence of the model fit 952 We assessed the quality of the fit in three ways. First, we repeated the full procedure three times to assess 953 reproducibility of the fitting procedure ( Figure S1B). The three independent runs yielded marginalized pos-  Second, we compared the model predictions with a test data set that probed the phosphorylation reac-964 tion at two non-standard KaiC concentrations ( Figure S4E). The fit quality on the test data set is somewhat 965 worse compared to the training set (compare with Figure 1B). In particular, the model overestimates the To test whether KaiA binding has a direct effect on KaiC catalytic activities, we construct simplified mod-986 els where hydrolysis and/or phosphotransfer is independent of KaiA binding and compare the resulting 987 models to the full model using the Bayes factor (Table S1). We find that decoupling either phosphotransfer by nucleotide exchange, but we cannot conclusively distinguish between models -P and -H. 992 We analyzed a random selection of 500 walkers to understand the implications of their variations for the 994 mechanisms of ordered phosphorylation. To simplify the analysis, we converted each walker to two single-995 site models in which either T431 or S432 was available for phosphorylation but not both. We asked how 996 important each reaction rate constant is to the overall T and S phosphoform concentrations in the single-site 997 models using the relative first-order sensitivities computed at the standard reaction condition (i.e., 100% 998 ATP with 1.5 μM KaiA). We focus on the initial phosphorylation rates because the steady-state rates are 999 determined by balances of many contributing processes, making them harder to interpret. The parameter 1000 sensitivities at t = 1 h are used as proxies for the sensitivities of the initial phosphorylation rates.

1001
Because each single-site model has 18 parameters, there are 36 sensitivities for the two phosphoforms.

1002
To characterize this high-dimensional space, we used spectral clustering ( Figure S6A). Overall, the param-1003 eter sensitivities are much more constrained by the data for the T phosphoform than the S phosphoform,  (Figure S6A and B). In both clusters, the U → T transition is most sensitive to k UT p (i.e., the 1008 phosphorylation rate in the absence of KaiA).

1009
In the first cluster (319 parameter sets), the U → S transition is most sensitive to k A,US p in the presence of 1010 KaiA. This is primarily because k US p is very small in this cluster relative to k A,US p ( Figure S6C) to the U phosphoform is important in determining the initial phosphorylation rate of S. The best fit belongs 1014 to this first cluster.

1015
The phosphorylation pathway suggested by the second cluster (181 parameter sets) is more complex. In 1016 this cluster, the U→S transition is mostly independent of KaiA, similar to the U→T reaction. However, the S 1017 phosphoform is limited by the dephosphorylation reaction k SU d , which is much faster than the corresponding 1018 phosphorylation rate ( Figure S6C). In addition, the S phosphoform is sensitive to the rate constant for KaiA 1019 binding, k S,DP a , which is important for facilitating nucleotide exchange for the ADP-bound S phosphoform, 1020 but tends to be slower in the second cluster ( Figure S6C). Therefore, faster dephosphorylation and slower 1021 KaiA binding is important for determining the initial S phosphorylation rate in the second cluster. The difference in the treatment of KaiA binding affinity implies that some detailed balance conditions are incompatible between the two models. In the Paijmans model, the binding affinity of KaiA to KaiC hexamers during the phosphorylation phase depends on the phosphoform composition of the subunits, and each subunit i in the phosphoform X i other than U contributes an additive factor of ‹g CII·KaiA bind (X i ) to the changes in KaiA binding free energy, ∆G CII·KaiA bind . Due to detailed balance, the fact that KaiA binds to different KaiC phosphoforms with differential affinities implies that KaiA binding changes the free energy of phosphotransfer [see eq. 8 in Paijmans et al. (2017b)]. This condition is also present in our model, but is complicated by the nucleotide-bound states of KaiC. Using the multiplicative-factor parametrization scheme (see Materials and Methods), the detailed balance conditions in this work can be related to those in the Paijmans model by where k A phos varies with the specific phosphorylation reaction. To introduce ultrasensitivity, we add a thresh-1081 old term T (Figure 4D),  Figure S1: Overview of the model. A) A schematic of the full mass action kinetic model. Here, each arrow represents a reaction, and the associated rate constant is represented using the notation introduced in the main text. The thickness of the arrows is proportional to the best fit rate on a log scale (base 10) at 100% ATP and 1.5 μM KaiA. B) The posterior distributions for all rate constants, initial conditions, and the global error hyperparameter. The rate constants have the unit s -1 (·μM -1 ) and the horizontal axis has a log scale (base 10). The three distributions represent the results from three independent runs; the log posterior values for the best fits from the three runs are listed. The red lines represent the best fit from the best run (i.e., the blue distributions). See Materials and Methods and Figure S10 for further details on the model parameterization method.    Figure S1A for the KaiC state names. E) The predicted phosphorylation kinetics at 7 and 1.75 μM KaiC, both at 100% ATP and 1.5 μM KaiA, compared to experimental measurements. Note that these two time series are not part of the training set. F) The posterior distributions for k TP r and k DP r , the dissociation rates for ATP and ADP, respectively, in an early iteration of the model. The rate constants have a unit of s -1 and the horizontal axis has a log scale (base 10). The long tail to the left of the posterior distribution for k TP r suggests that the model can be simplified by setting the rate to zero. Figure S5: Correlation structure in the MCMC ensemble. A), B), and C) show parameters with a correlation coefficients larger than 0.9. In B), "X" represents the KaiC phosphoforms. In C), the projections of the 3D scatter plot onto pairwise correlations are shown in gray. D) The principal component/eigenvalue spectrum of the covariance matrix (left), and the alignment of the principal components with the coordinates (right). Here, I denotes the intersection of the principal component ellipoid with the coordinates and P denotes the projection of the principal components onto the corresponding coordinates (Gutenkunst et al., 2007). E) The integrated autocorrelation time for the 48 principal components (PC); the principal components are indexed from the largest to the smallest. The integrated autocorrelation time is calculated using an automated windowing procedure (Madras and Sokal, 1988) from the autocorrelation function averaged over the ensemble. F) The integrated autocorrelation time for the KaiA dissociation constants as a function of KaiC phosphoform and nucleotide-bound states. G) The ten largest vector components, ordered by absolute value, for the first and last three principal components. Figure S6: The mechanism of kinetic ordering is not well-constrained. A) Spectral clustering on the relative sensitivity of the T and S phosphoform concentrations at t = 1 h to rate constants in the T-and S-site models, respectively. Only the parameters with significant (> 0:2) relative sensitivities in either cluster are shown in the plot. "X" stands for either the T (horizontal axis) or S (vertical axis) phosphoform. The sensitivities are calculated using 500 sampled parameter sets chosen randomly from the ensemble. The clustering analysis was done using the FindClusters function in Mathematica 12.0. B) Model diagrams that highlight the reactions that have the highest relative sensitivities in the first (left) and second (right) clusters; the D phosphoform is not shown. C) Selected model parameter values in the two clusters. A comparison with blue distributions in Figure S1B indicates that the clustering based on sensitivity can be mapped onto the modes of the posterior distribution.  A) The steady-state stimulus-response relations for T, S, and D phosphoforms predicted by the model. B) The experimentally determined stimulus-response functions of the T, S, and D phosphoforms at three %ATP conditions; the curves are based on refitting the best fit to the steady-state measurements. C) The model-predicted stimulus-response relation of the total steady-state KaiC phosphorylation level as a function of %ATP and [KaiA] after refitting to the steady-state measurements. D) The differences in the log parameter values (base 10) of the best fit before and after refit. The differences are ordered by magnitude and only the 10 parameters (in the multiplicative-factor scheme) with the largest changes are shown. the dashed lines represent the best fit. C) Cross sections of the stimulus-response relation at three %ATP, computed using model -p. The inset represents posterior distribution for the shapes of the stimulus-response function at 25% ATP. The contours represent the 68% and 95% HDRs, and the gray star represents the model best fit. The shape of the stimulus-response function is quantified using two metrics: EC10, which quantifies threshold-like behavior, and EC90 -EC10, which quantifies switch-like behavior. The shape of the experimentally-determined stimulus-response function at 25% ATP is shown as the yellow star. The dashed line represents (EC10; EC90 − EC10) = (K=9; 80K=9), which characterizes the shape of a hyperbolic stimulus-response function [A]=(K + [A]) that has no switching or thresholding. D) Similar to B), but for model -n,-p, where there is a single KaiA on/off rate in the model. E) Similar to C), but for model -n,-p. Figure S10: Overview of the model with the multiplicative-factor parameterization scheme. Panels A) and B) are analogous to those in Figure S1, but the rate constants are represented as products of the factors that are actually optimized in the MCMC simulations. In B), the ‹k parameters are fixed parameters determined by detailed balance conditions. The parentheses denote species-dependent effects; A: KaiA-bound state, P: phosphoform, N: nucleotide-bound state. See Materials and Methods for further description of the detailed balance conditions and the model parameterization method.