Normative models of enhancer function

In prokaryotes, thermodynamic models of gene regulation provide a highly quantitative mapping from promoter sequences to gene expression levels that is compatible with in vivo and in vitro biophysical measurements. Such concordance has not been achieved for models of enhancer function in eukaryotes. In equilibrium models, it is difficult to reconcile the reported short transcription factor (TF) residence times on the DNA with the high specificity of regulation. In non-equilibrium models, progress is difficult due to an explosion in the number of parameters. Here, we navigate this complexity by looking for minimal non-equilibrium enhancer models that yield desired regulatory phenotypes: low TF residence time, high specificity and tunable cooperativity. We find that a single extra parameter, interpretable as the “linking rate” by which bound TFs interact with Mediator components, enables our models to escape equilibrium bounds and access optimal regulatory phenotypes, while remaining consistent with the reported phenomenology and simple enough to be inferred from upcoming experiments. We further find that high specificity in non-equilibrium models trades off against gene expression noise, predicting bursty dynamics, an experimentally observed hallmark of eukaryotic transcription. By drastically reducing the vast parameter space to a much smaller subspace that optimally realizes biological function prior to inference from data, our normative approach holds promise for mathematical models in systems biology.

1 Our nomenclature is simply a shorthand for all co-factors necessary for eukaryotic transcriptional activation at an enhancer, which can include proteins not strictly a part of the Mediator family.
proximity or interaction. Crucially, the links can be established and removed in processes that can break detailed balance and are thus out of equilibrium. Here, we consider that a link is established at a rate $k_{\rm link}$ between a bound TF and the Mediator complex; for simplicity, we assume that the links break when the TFs dissociate or upon the switch into the OFF state (this assumption can be relaxed, see Fig S2).

An important thrust of our investigations will con[...]

where $K = k_-/k_+^0$, $k_+ = k_+^0 c$ (see also Fig 1 caption), and $L = \log(\kappa_+/\kappa_-)$. The $k_{\rm link}$ parameter thus interpolates between the equilibrium limit in Eq (1), corresponding to a textbook MWC model, and various non-equilibrium (kinetic) schemes which we will explore next.
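
For orientation, the equilibrium limit can be sketched numerically. Eq (1) itself is not reproduced in this excerpt, so the function below uses the generic textbook MWC form, in which each bound TF favors the ON state by a factor α; the paper's exact parameterization may differ. The function name `mwc_expression` and the example parameter values are illustrative assumptions, not taken from the text.

```python
import math


def mwc_expression(c, K, L, alpha, n):
    """Probability that the Mediator is ON in a textbook MWC model.

    c     : TF concentration
    K     : dissociation constant, K = k_- / k_+^0
    L     : log of the basal ON/OFF bias, L = log(kappa_+ / kappa_-)
    alpha : factor by which each bound TF favors the ON state (assumed form)
    n     : number of identical, independent TF binding sites
    """
    on_weight = math.exp(L) * (1.0 + alpha * c / K) ** n
    off_weight = (1.0 + c / K) ** n
    return on_weight / (on_weight + off_weight)


if __name__ == "__main__":
    # Illustrative (hypothetical) parameters: induction curve for n = 3 sites.
    for c in (0.001, 0.01, 0.1, 1.0):
        print(c, mwc_expression(c, K=1.0, L=-6.0, alpha=20.0, n=3))
```

With α = 1 the bound TFs carry no information about the Mediator state and the expression reduces to the basal value $e^L/(1+e^L)$ at every concentration, which is a quick sanity check on the form used here.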

A similar generalization with an equilibrium limit exists for thermodynamic Hill-type models, where, furthermore, α can be directly identified with cooperativity between DNA-bound TFs (see SI Section 1.3); we will see that this qualitative role of α will hold also for the MWC case.

How does the regulatory performance depend on the enhancer parameters and, in particular, on moving away from the equilibrium limit? To assess this question systematically, we define a number of "regulatory phenotypes", enumerated in Table I [...]

Figure 1: Normative non-equilibrium model of (enhancer) regulation. (B) Key reactions and rates of the non-equilibrium model. TFs can bind with a concentration-dependent on-rate ($k_+ = k_+^0 c$) and unbind with basal rate $k_-$ that is in principle sequence dependent (i). The Mediator state switches between the conformational states with basal rates $\kappa_+$ and $\kappa_-$ (ii). Linking and unlinking of TFs to Mediator (iii) can move the system out of equilibrium: links are established with rate $k_{\rm link}$, and the link stabilizes both TF residence and the ON state of the Mediator by a factor α per established link. (C) Regulatory phenotypes. Mean TF residence time, $T_{\rm TF}$, on specific sites in functional enhancers (black) vs a random site on the DNA (gray) increases with concentration (top), as does mean expression, E (the fraction of time the Mediator is ON; induction curve, middle, with sensitivity, H, defined at mid-point expression). Specificity, S, is defined as the ratio of expression from the specific sites in the enhancer relative to the expression from a random piece of DNA.

[...], and average expression, E (color), for MWC-like models with n = 3 TF binding sites, obtained by varying α and $k_{\rm link}$ at fixed TF concentration, $c_0$. Equilibrium models fall onto the red line; two models with equal TF residence times, I (EQ) and II (NEQ), are marked for comparison. Dashed gray lines show analytically-derived bounds. (B) Phase space of regulatory phenotypes is accessed by varying α at fixed values of $k_{\rm link}$ (grayscale; top) or varying $k_{\rm link}$ at fixed values of α (grayscale; bottom). (C) As in (A), but the TF concentration at each point in the phase space is adjusted to hold average expression fixed at E = 0.5 (green color). Plotted is a smaller region of phase space of interest; nearly vertical thin lines are equi-concentration contours (Fig S6). (D) All models in the phase diagrams in (A) and (C) approximately collapse onto nearly one-dimensional manifolds ("fixed c", left axis, for (A); "fixed E", right axis, for (C)) when plotted as a function of mean TF residence time, $T_{\rm TF}$, supporting the choice of this variable as a biologically relevant observable. Color on the manifold corresponds to mean expression E using the colormap of (A). Vertical scales are chosen so that models I and II coincide. (E) Induction curves of equilibrium model I and non-equilibrium model II for expression from a functional enhancer that contains specific sites (basal TF off-rate $k_-^{S}$; black curves) versus expression from random DNA containing non-specific sites (basal TF off-rate $k_-^{NS} = 10^2 k_-^{S}$ here; gray curves).
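
The reactions (i)-(iii) listed in the Figure 1 caption can be turned into a small continuous-time Markov chain whose stationary distribution gives the mean expression E. The following is a minimal sketch under assumptions that the excerpt does not confirm: all n sites are identical, links form only while the Mediator is ON, TFs stay bound (but unlinked) when the Mediator switches OFF, and the ON-to-OFF rate is $\kappa_-/\alpha^{l}$ with $l$ established links. Function and argument names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            f = M[r][col] / M[col][col]
            for k in range(col, m + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - tail) / M[r][r]
    return x


def mean_expression(c, k_on0, k_off, kappa_on, kappa_off, alpha, k_link, n=3):
    """Steady-state fraction of time the Mediator is ON (assumed model)."""
    # State = (Mediator state, number of unlinked bound TFs, number of linked TFs).
    states = [("OFF", u, 0) for u in range(n + 1)]
    states += [("ON", u, l) for l in range(n + 1) for u in range(n + 1 - l)]
    index = {s: i for i, s in enumerate(states)}
    m = len(states)
    Q = [[0.0] * m for _ in range(m)]  # generator matrix, Q[i][j] = rate i -> j

    def add(src, dst, rate):
        # Zero-rate moves are skipped, so out-of-range targets are never looked up.
        if rate > 0.0:
            i, j = index[src], index[dst]
            Q[i][j] += rate
            Q[i][i] -= rate

    for s in states:
        med, u, l = s
        add(s, (med, u + 1, l), (n - u - l) * k_on0 * c)  # (i) TF binding
        add(s, (med, u - 1, l), u * k_off)                # (i) unlinked TF unbinds
        if med == "OFF":
            add(s, ("ON", u, 0), kappa_on)                # (ii) OFF -> ON
        else:
            add(s, ("ON", u, l - 1), l * k_off / alpha)   # linked TF unbinds, slowed by alpha
            add(s, ("ON", u - 1, l + 1), u * k_link)      # (iii) link formation
            add(s, ("OFF", u + l, 0), kappa_off / alpha ** l)  # (ii) ON -> OFF, links break

    # Stationary distribution: pi Q = 0 with sum(pi) = 1.
    A = [[Q[j][i] for j in range(m)] for i in range(m)]
    A[-1] = [1.0] * m
    b = [0.0] * (m - 1) + [1.0]
    pi = solve_linear(A, b)
    return sum(p for p, s in zip(pi, states) if s[0] == "ON")


if __name__ == "__main__":
    # Hypothetical parameters: finite linking rate vs. near-equilibrium limit.
    common = dict(c=0.05, k_on0=1.0, k_off=1.0, kappa_on=0.01, kappa_off=1.0, alpha=20.0)
    print("k_link = 2    :", mean_expression(k_link=2.0, **common))
    print("k_link = 1e7  :", mean_expression(k_link=1e7, **common))
```

Under these assumptions, taking `k_link` very large recovers a textbook MWC expression, while specificity in the sense of the caption could be estimated as the ratio of `mean_expression` evaluated at the specific off-rate to its value at the non-specific off-rate.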
for enhancers with a larger number of binding sites (see [...]

[...] at low TF concentration (that is, high affinity) achieved through "cooperative interactions" at high α either has a detrimental, or, at best, a marginally beneficial effect for the ability to discriminate between cognate and random DNA sites (that is, high specificity) in equilibrium [24].

[...] Average TF residence times are matched between EQ and NEQ models at $2.1\,T_0$, with $T_0 = 1/k_-^{S} = 1$ s, and both induction curves (scaled for half-maximal concentration) are identical, with sensitivity H ≈ 2.7. When TF concentration is high, expression is fixed at E = 0.5. Parameters for NEQ model: α = 127, $k_{\rm link} = 2$, $c_{\max} = 0.065$; for EQ model: $k_{\rm link} \to \infty$, α = 19.8, $c_{\max} = 0.037$. Rasters show the occupancy of TF binding sites; the orange line above shows the enhancer ON/OFF state; zoom-in for the EQ model is necessary due to its fast dynamics. (B) Regulatory phenotypes for EQ and NEQ models during the steady-state epoch (gray in A). Specificity (S) and enhancer state correlation time ($T_E$) are higher for the NEQ model; the Mediator mean ON residence time, $T_M$, is the same between the models, but the probability density function reveals a long tail in the NEQ scheme, and a nearly exponential distribution for the EQ scheme. The last two panels show the TF occupancy histogram during the high TF concentration interval, conditional on the enhancer being OFF or ON. (C) Transient behavior of the mean enhancer state (E), mean protein number (P; assuming deterministic production/degradation protein dynamics given enhancer state), and gene expression noise, $N = \sigma_P/P$, for the NEQ and EQ models, upon a TF concentration low-to-high switch (left column) and high-to-low switch (right column). Traces shown are computed as averages over 1000 stochastic simulation replicates.
limits and trade-offs, and to identify the optimal operating regime of the proposed enhancer model that is consistent with current observations, as we summarize next.

[...] interactions might be, however, they are unlikely to be able to remove excess enhancer switching noise, due to its slow timescale, suggesting that the tradeoffs that we identify should hold generically.
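
The role of the switching timescale can be illustrated with a deliberately simplified stand-in for the enhancer: a two-state (telegraph) ON/OFF process driving deterministic protein production and degradation, $dP/dt = r\,E(t) - \gamma P$. This is not the multi-state model of the paper; the rates `k_on`, `k_off` and `gamma` are hypothetical names. For this reduced system the stationary noise has the closed form $N^2 = \frac{k_{\rm off}}{k_{\rm on}} \cdot \frac{\gamma}{\gamma + k_{\rm on} + k_{\rm off}}$, which the sketch below evaluates.

```python
import math


def telegraph_protein_noise(k_on, k_off, gamma):
    """Noise N = sigma_P / mean(P) for a two-state enhancer (simplified assumption).

    k_on, k_off : OFF->ON and ON->OFF switching rates of the enhancer
    gamma       : protein degradation rate; production is deterministic
                  given the enhancer state, dP/dt = r * E(t) - gamma * P
    """
    p_on = k_on / (k_on + k_off)   # mean expression E
    switching = k_on + k_off       # inverse correlation time of the enhancer state
    cv_squared = (1.0 - p_on) / p_on * gamma / (gamma + switching)
    return math.sqrt(cv_squared)


if __name__ == "__main__":
    # Same mean expression (E = 0.5), different switching speeds relative to gamma = 1.
    for rate in (100.0, 1.0, 0.01):
        print(rate, telegraph_protein_noise(rate, rate, gamma=1.0))
```

In this reduced picture, time-averaging by the protein lifetime suppresses switching noise only when switching is fast compared with $\gamma$; when switching is slow, $N$ approaches $\sqrt{(1-E)/E}$ regardless of the protein lifetime.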

One could also question whether the importance we ascribed to high specificity is really warranted. Evolutionarily, regulatory crosstalk due to lower specificity helps