bioRxiv
New Results

The price of a bit: energetic costs and the evolution of cellular signaling

Teng-Long Wang, Benjamin Kuznets-Speck, Joseph Broderick, Michael Hinczewski
doi: https://doi.org/10.1101/2020.10.06.327700
1 Department of Physics, Case Western Reserve University, Cleveland, OH 44106 (Teng-Long Wang, Benjamin Kuznets-Speck, Joseph Broderick, Michael Hinczewski)
2 Biophysics Graduate Group, University of California, Berkeley, CA 94720, USA (Benjamin Kuznets-Speck)
For correspondence (Michael Hinczewski): mxh605@case.edu

Abstract

Recent experiments have uncovered a fundamental information scale for cellular signaling networks: the correlation between input and output concentrations of molecules in a signaling pathway corresponds to at most 1-3 bits of mutual information. Our understanding of the physical constraints and evolutionary pressures that determine this scale remains incomplete. By focusing on a basic element of signaling pathways, the kinase-phosphatase enzymatic push-pull loop, we highlight the pivotal role played by energy resources available for signaling and their expenditure: the chemical potential energy of ATP hydrolysis, and the rate of ATP consumption. Scanning a broad range of reaction parameters based on enzymatic databases, we find that ATP chemical potentials in modern organisms are just above the threshold necessary to achieve empirical mutual information values. We also derive an analytical relation for the minimum ATP consumption required to maintain a certain signal fidelity across a range of input frequencies. Attempting to increase signal fidelity beyond a few bits lowers the bandwidth, the maximum characteristic signal frequency that the network can handle at a given energy cost. The observed information scale thus represents a balancing act between fidelity and the ability to process fast-changing environmental signals. Our analytical relation defines a performance limit for kinase-phosphatase networks, and we find evidence that a component of the yeast osmotic shock pathway may be close to the optimality line. By quantifying the evolutionary pressures that operate on these networks, we argue that this is not a coincidence: natural selection on energy expenditures is capable of pushing signaling systems toward optimality, particularly in unicellular organisms. Our theoretical framework is directly verifiable using existing experimental techniques, and predicts that more examples of such optimality should exist in nature.

I. INTRODUCTION

Survival for living cells depends in part on accurate and responsive signaling: the ability to collect enough information about the micro-environment to make decisions in response to external stimuli such as nutrients, hormones, and toxic agents [1]. This capacity to react to extracellular cues developed early in evolutionary history, and is now seen at all levels of biological organization, from chemotaxis in unicellular organisms [2–4] to the pathways that regulate cell differentiation and disease in multicellular life [5–8]. Despite the resulting diversity of biochemical networks that implement this signaling, information theory provides a powerful universal framework to quantify the amount of information transferred through a network, allowing comparisons between different systems [9].

Over the last decade a remarkable experimental consensus has emerged from such comparisons: studies of both prokaryotic and eukaryotic signaling pathways have found they can transmit at most ∼ 1 to 3 bits of information [10–17]. These values refer to mutual information (MI) between pathway input (concentrations of a molecule representing the signal) and output (concentrations of a downstream molecule produced by the network, sampled either at a single or multiple time points). MI is a measure of signal fidelity, representing the degree of correlation between input and output. Experiments have typically focused on a closely related quantity known as the channel capacity [18, 19]: the maximum MI achievable among all input distributions.

The consistently small channel capacities observed in cellular signaling pathways seem to indicate that cells operate with a fairly coarse representation of their surroundings: n bits of MI corresponds to being able to reliably distinguish between 2^n levels of the input, so a 1 bit pathway can only discriminate between “high” versus “low” signal concentrations. Though 1 bit is typical for MI measured at single time points, one can achieve higher MIs by focusing on output responses collected over several time points [14, 15], or by designing the experiment to isolate single-cell responses (as opposed to estimating MI from the responses of a population of cells) [17]. But these enhancements, which can push values to the 2-3 bit range, do not change the fundamental order of magnitude of the MI.

The central question we explore in this work is to what extent this fundamental information scale is shaped by the energy requirements of the underlying biochemical signaling networks. In order to transmit information, these networks necessarily need to operate out of equilibrium, fueled by processes like ATP hydrolysis that consume energetic resources. Recent research highlights these costs as an essential factor in understanding constraints on signaling [2–4, 20–23], often focusing on the ATP hydrolysis chemical potential difference Δμ between the reactant (ATP) and products (ADP and inorganic phosphate, Pi), quantifying the free energy available to drive the system per ATP. Crossing a certain minimum threshold of Δμ is a prerequisite for a variety of signaling functions: accurate read-out of ligand-bound receptors [2, 3, 23], maintaining the phase coherence of oscillations in circadian clocks [20], or preserving the integrity of methylation-based “memory” to facilitate adaptation in chemotaxis [4]. This threshold is typically a few times larger (i.e. by a factor of ∼ 3 − 4 [2, 23]) than the energy scale of thermal fluctuations, kBT, where kB is the Boltzmann constant and T the temperature. And indeed cells across the various domains of life maintain a sufficiently high Δμ ≈ 21 − 29 kBT [24] to enable such functions.

The large value and remarkably narrow range of Δμ observed in modern organisms opens up additional questions. The metabolic cycles that sustain Δμ, constantly replenishing ATP as it is hydrolyzed, must almost necessarily have been far more inefficient and wasteful in the earliest stages of evolutionary history [25]. To what degree could organisms operating with smaller Δμ still process information about their environment? What kinds of evolutionary pressures might have driven Δμ to its modern range? And if the costs of individual signaling systems are non-trivial [2, 4], could natural selection have driven these networks toward optimized, energy-efficient solutions?

To investigate these issues, we focus on one of the canonical signaling circuits in biology, the kinase-phosphatase “push-pull loop”, which often forms a basic unit of more complicated signaling cascades [26–29]. An active kinase enzyme instigates the “push”, chemically modifying a substrate protein via phosphorylation (consuming ATP in the process), while a phosphatase enzyme provides the “pull”, dephosphorylating the modified substrate, reverting it to its original state. We derive the relationships between three facets of the system: i) the MI between the input (active kinase) and output (phosphorylated substrate) molecular populations; ii) the timescales over which the input signal varies; and iii) the energy requirements, expressed in terms of Δμ and the rate of ATP consumption. Exploring the entire spectrum of kinase/phosphatase enzymatic parameters from bioinformatic databases, we find that physiological Δμ values are just large enough to enable an MI of 1-2 bits for the widest possible parameter range. However, achieving this MI for signals that vary rapidly in time is more challenging, requiring both precise fine-tuning of parameters and a certain minimum rate of ATP consumption. In fact, taking advantage of results from optimal noise filter theory [30, 31], we derive a remarkably simple analytical relationship that describes the tradeoffs between minimum ATP rate, the MI, and the maximum characteristic signal frequency (the so-called bandwidth) which the push-pull network can handle. Verified via extensive numerical simulations across the whole gamut of enzymatic parameters, this relation is a novel theoretical prediction that can be directly tested in future experiments. The relation rationalizes the observed range of MI by showing that values much higher than 1-2 bits would require sacrificing the ability to process fast-changing signals.
Finally we explore the question of whether there exist evolutionary pressures that would push such a system to be energy efficient, optimizing the ATP consumption for a given target MI and bandwidth. Using a recently developed formalism relating metabolic costs to the strength of natural selection [32, 33], we show that these pressures can indeed be significant, particularly for single-celled organisms. We highlight a kinase-phosphatase loop in the yeast Hog1 signaling pathway as a system that may have been optimized by such pressures.

II. THEORY

A. Modeling an enzymatic push-pull loop

This push-pull network consists of two opposing reactions: a kinase enzyme instigates the “push”, chemically modifying a substrate protein via phosphorylation, while a phosphatase enzyme provides the “pull”, dephosphorylating the modified substrate, reverting it to its original state [26–29]. Since a single kinase can catalyze the phosphorylation of many substrate proteins, this loop can effectively act like an amplifier [28], translating a weaker signal (a small cellular population of an active kinase) into a stronger one (a large population of a phosphorylated substrate). Often the substrate itself is a kinase that can exist in catalytically inactive and active states, with activation triggered by phosphorylation. In this case one can have multi-tiered signaling cascades enhancing the amplification (as shown schematically in Fig. 1A) with the active substrate produced by one loop serving as the kinase for a downstream loop [34]. More complex signaling networks are also possible, with multiple cascades connected by crosstalk through shared components [35], feedback from downstream to upstream populations [34], or activation requiring multisite phosphorylation [36]. However, the starting point for understanding any of these more complex signaling topologies is the behavior of a single loop, with a substrate activated / deactivated through a single phosphorylation site.

FIG 1.

(A) A schematic signaling pathway involving cascades of kinase phosphorylation, initiated by a receptor embedded in the cell membrane that responds to extracellular ligands. The system we focus on will be one stage of the pathway, a kinase-phosphatase push-pull loop, highlighted in the dashed box. (B) The molecular species and reaction parameters of the push-pull loop. The kinase (K) binds to the substrate (S), forming the complex (SK) that catalyzes the production of phosphorylated substrate (S∗). Phosphatase (P) binds to S∗, forming a complex (S∗P) that catalyzes the dephosphorylation of the substrate. Forward reaction / binding rates are labeled in black, while reverse reaction / unbinding rates are in red. (C) The loop serves to transduce an input signal, defined as the total population of kinase (bound or unbound), X(t) = K(t) + SK(t), into an output, defined as the total population of phosphorylated substrate, Y(t) = S∗(t) + S∗P(t). The input signal has a characteristic autocorrelation time (the inverse of the effective input frequency γx).

The reaction scheme of a single push-pull loop is shown in Fig. 1B. Binding of free kinase (population K(t) at time t) to substrate (population S(t)) occurs with rate constant κb, forming a kinase-substrate complex (population SK(t)). Phosphorylation of the substrate and its subsequent release constitutes the catalytic step, with rate κr, yielding free phosphorylated substrates (population S∗(t)). A phosphatase can subsequently bind, with rate ρb, forming a phosphatase-substrate complex (population S∗P(t)), and catalyzing the dephosphorylation / release of the substrate with rate ρr. These reactions also can occur in reverse: kinase-substrate unbinding (rate κu), reverse kinase catalysis (rate κ−r), phosphatase-substrate unbinding (rate ρu) and reverse phosphatase catalysis (rate ρ−r). Under physiological conditions some of these reverse rates may be negligible compared to their forward counterparts, but accounting for them is crucial to enforce thermodynamic consistency. In fact the product of the ratios of the reverse rates relative to the forward ones must satisfy a key thermodynamic relation arising from the principle of local detailed balance (closely related to the Haldane relation for enzymes) [37, 38],

$$\frac{\kappa_u \kappa_{-r} \rho_u \rho_{-r}}{\kappa_b \kappa_r \rho_b \rho_r} = e^{-\Delta\mu / k_B T}. \qquad (1)$$

This relation is derived in the Supplementary Information (SI), and reflects the fact that for every complete traversal of the loop along the forward direction (clockwise along the black arrows in Fig. 1B) a single ATP molecule is removed from the environment, hydrolyzed, and the products ADP and inorganic phosphate Pi released back into the surroundings. Δμ depends on the concentrations [ATP], [ADP], and [Pi] through Δμ = Δμ0 + kBT ln([ATP](1 M)/([ADP][Pi])), where Δμ0 is the standard free energy of ATP hydrolysis (Δμ0 ≈ 12 kBT at room temperature [24]). Living systems expend energetic resources to maintain an imbalance of [ATP] relative to [ADP] and [Pi], making Δμ in physiological conditions larger than Δμ0. Despite the wide variety of metabolic pathways used to achieve this, measured Δμ values in organisms from E. coli to humans lie within a relatively narrow range, Δμ ≈ 21 − 29 kBT [24]. This means reverse rates are sufficiently slow that the numerator in Eq. (1) is 9-12 orders of magnitude smaller than the denominator. One of the questions we tackle below is the significance of this disparity for transmitting information through the loop.
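As a rough numerical check on the scales quoted above, the following sketch evaluates Δμ from the concentration formula in the text and converts it into the number of orders of magnitude separating reverse from forward rate products. It assumes the local detailed balance ratio equals exp(−Δμ/kBT), which is the form consistent with the "9-12 orders of magnitude" statement. The function names and the example concentrations are hypothetical illustrations, not measured values.

```python
import math

# Standard free energy of ATP hydrolysis in units of kBT (value quoted in the text).
DELTA_MU_0 = 12.0

def delta_mu(atp_molar, adp_molar, pi_molar, delta_mu_0=DELTA_MU_0):
    """Delta mu in units of kBT: delta_mu_0 + ln([ATP](1 M) / ([ADP][Pi]))."""
    return delta_mu_0 + math.log(atp_molar * 1.0 / (adp_molar * pi_molar))

def orders_of_magnitude(dmu):
    """log10 of exp(dmu): how many decades the forward rate product exceeds
    the reverse rate product, ASSUMING the ratio is exp(-dmu)."""
    return dmu / math.log(10.0)

# Illustrative (assumed) concentrations: 5 mM ATP, 0.05 mM ADP, 10 mM Pi.
example_dmu = delta_mu(5e-3, 5e-5, 1e-2)
print(f"example delta mu = {example_dmu:.2f} kBT")

# The physiological range quoted in the text.
for dmu in (21.0, 29.0):
    print(f"delta mu = {dmu:.0f} kBT -> {orders_of_magnitude(dmu):.1f} decades")
```

Working in units of kBT keeps the temperature out of the calculation; converting to kJ/mol would require an explicit temperature.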

To quantitatively measure this information transfer, it is useful to explicitly describe the network behavior in terms of transducing an input signal into an amplified output, with degradation of the signal due to the stochastic nature of the reactions that mediate this process. We take the time-dependent input X(t) = K(t) + SK(t) to be the population of active kinases (both free and substrate-bound), and the corresponding output signal Y(t) = S∗(t) + S∗P(t) as the population of phosphorylated substrates (free and phosphatase-bound). For any specific system, the input kinases would be activated through a particular upstream signaling network. Here, however, we are interested in a more general problem: how effective is this loop in processing a variety of possible input signals, spanning different amplitudes and timescales? The simplest mechanism that allows us to tune the dynamical characteristics of the input is to imagine the kinases activated at a constant rate F and deactivated at a constant rate γK. We focus on the long-time limit where a stationary state has been achieved, and so F allows us to regulate the amplitude of the input signal while γK controls the autocorrelation time of the input fluctuations. While the analysis below could be done for other, system-specific models of the input, our choice allows us to explore a broad range of possible inputs to establish general bounds on information processing through the loop. With this input model, the reaction network model is fully specified. For a given set of parameters (drawn from distributions based on kinase/phosphatase biochemical information collected in enzymatic databases, as described below) we can derive analytical results for dynamical quantities using the linearized chemical Langevin approximation [39]. As shown in the SI, this provides excellent agreement with the exact kinetic Monte Carlo [40] simulation results in the parameter ranges of interest.
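The input model described above (activation at constant rate F, deactivation at rate γK per kinase) can be illustrated with a minimal kinetic Monte Carlo sketch. This is a simplification and not the full push-pull network: it treats the active kinases as a single population with no substrate binding, so it only shows how F sets the stationary amplitude F/γK. The function name and parameter values are hypothetical.

```python
import random

def time_averaged_input(F, gamma_K, t_end, seed=0):
    """Gillespie simulation of X -> X+1 at rate F and X -> X-1 at rate gamma_K * X.

    Returns the time average of X over [0, t_end], starting from X = 0.
    Simplifying assumption: kinases are one population (no binding to substrate).
    """
    rng = random.Random(seed)
    t, x, area = 0.0, 0, 0.0
    while True:
        total_rate = F + gamma_K * x          # always positive because F > 0
        dt = rng.expovariate(total_rate)      # waiting time to the next reaction
        if t + dt >= t_end:
            area += x * (t_end - t)           # last partial interval
            return area / t_end
        area += x * dt
        t += dt
        # Choose which reaction fires, in proportion to its rate.
        if rng.random() * total_rate < F:
            x += 1
        else:
            x -= 1

# Hypothetical values: F = 10 activations per unit time, gamma_K = 1 per unit time.
mean_x = time_averaged_input(10.0, 1.0, 2000.0, seed=1)
print(f"time-averaged X = {mean_x:.2f}, expected F / gamma_K = {10.0 / 1.0:.2f}")
```

For this isolated birth-death process the fluctuations decorrelate on a timescale 1/γK; in the full loop, binding to substrate modifies the effective input frequency.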

In focusing on how X(t) is transduced to Y (t), we frame our analysis in terms of three properties of the system. The first is the autocorrelation time of the input, Embedded Image, defined through Embedded Image, where the bar denotes an average over an ensemble of trajectories in the stationary state and Embedded Image. Note that instantaneous averages like Embedded Image and Embedded Image are independent of t in the stationary state. Embedded Image is the characteristic timescale of the input fluctuations, and we will denote its inverse, γx, as the effective “frequency” of the input. The second property is related to the mean rate at which phosphorylated substrates are produced through the catalytic reaction step, Embedded Image, relative to the mean total number of activated kinases Embedded Image. We define the gain parameter Embedded Image as a measure of the production of output for a given input level. Both γx and R0 can be expressed, to a good approximation, in terms of the reaction rates as follows (see SI for derivation): Embedded Image where Embedded Image. Here κ− ≡ κu + κr, ρ− ≡ ρu + ρr. Note the dependence on mean unmodified substrate Embedded Image and free phosphatase Embedded Image populations: these two numbers are free parameters that (along with the reaction rates) determine the network dynamics.

The final property of interest is the instantaneous stationary MI, denoted I, between X(t) and Y(t). This is defined in terms of the joint probability P(X, Y) of observing input value X and output value Y at the same moment of time, and the corresponding marginal probabilities P(X) and P(Y),

$$I = \sum_{X,Y} P(X,Y) \log_2 \frac{P(X,Y)}{P(X)\,P(Y)}.$$

The value of the MI I is non-negative in all cases, and is measured in bits, with larger values translating to a greater degree of correlation between input and output. For our parameter ranges, P(X, Y) can be approximated as a bivariate Gaussian, and so we use an expression for I valid in this limit that is more convenient to evaluate [19]:

$$I = -\frac{1}{2} \log_2 E.$$

Here E = 1 − ρ^2, where ρ is the Pearson correlation coefficient, and hence lies in the range 0 ≤ E ≤ 1. For E = 0 (or equivalently infinite MI, I = ∞) we have perfect correlation between the input and output signal, while E = 1 (I = 0) corresponds to an output that is completely independent of the input.
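To make the link between correlation and bits concrete, here is a small sketch that assumes the standard bivariate Gaussian result I = −(1/2) log2(1 − ρ^2), with E = 1 − ρ^2 as defined above. The function names are hypothetical; the point is only to show how strongly correlated input and output must be to reach a given number of bits.

```python
import math

def gaussian_mi_bits(rho):
    """MI in bits of a bivariate Gaussian with Pearson correlation rho."""
    error = 1.0 - rho ** 2              # the quantity called E in the text
    if error <= 0.0:
        return math.inf                 # perfect correlation
    return -0.5 * math.log2(error)

def correlation_for_mi(bits):
    """Magnitude of the correlation coefficient needed to reach a target MI."""
    return math.sqrt(1.0 - 2.0 ** (-2.0 * bits))

for target in (1.0, 1.5, 2.0, 3.0):
    rho = correlation_for_mi(target)
    print(f"I = {target} bits needs |rho| = {rho:.4f}, E = {1.0 - rho ** 2:.4f}")
```

Each additional bit divides E by four, which is one way to see why pushing fidelity beyond a few bits becomes demanding.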

B. Determining the enzymatic parameter range

Once the input signal is specified through F and γK, there are ten parameters related to the kinase, phosphatase, and substrate that determine the observables of interest γx, R0, and the MI I discussed above. These parameters are the eight reaction rates κb, κu, κr, κ−r, ρb, ρu, ρr, ρ−r, together with the mean unmodified substrate population S̄ and the mean free phosphatase population P̄. We know from surveys of enzymatic parameters that each of these quantities can span several orders of magnitude among different systems, often with an approximately log-normal distribution [41, 42]. To understand the performance limits of enzymatic loops in general, it makes sense to explore the entire range of biologically realistic parameters, rather than focus on a single choice of parameters. Existing online databases are excellent resources for this purpose, and Fig. 2 shows the resulting histograms of kinase / phosphatase parameters (full extraction details are available in the SI). For the substrate protein (which we take as a kinase) and the phosphatase, the concentrations [S] and [P] in Fig. 2A are derived from the PaxDb protein abundance database [43], using UniProt gene ontology associations to identify kinases and phosphatases [44]. Enzymatic reaction parameters are available in the Sabio-RK database [45]. The reaction rates κr and ρr (Fig. 2D) are typically listed directly, but the others are most often in specific combinations: the Michaelis constants (κu + κr)/κb and (ρu + ρr)/ρb for kinase/phosphatase respectively (Fig. 2B) and the specificity ratios κr κb/(κu + κr) and ρr ρb/(ρu + ρr) (Fig. 2C). For all of these parameters there is a paucity of data on phosphatases relative to kinases, but the phosphatase ranges seem to largely overlap with those of kinases. Thus for simplicity we take kinase and phosphatase parameters to have the same distributions (log-normal) and use a numerical fitting procedure to find an overall log-normal joint probability distribution for the eight underlying model parameters represented in the data: κb, κu, κr, ρb, ρu, ρr, S̄, P̄ (see SI).
Note that data in concentration units (like [S] and [P] in molar) are converted to mean abundances (S̄ and P̄) by assuming a volume of 30 fL (comparable to the cytoplasmic volume of yeast [24, 46]). This procedure is designed so that the resulting joint distribution yields marginal probability densities (solid curves in Fig. 2) that exhibit good agreement with the histogram data for any of the measured parameter combinations. Despite this agreement, we note that the joint distribution likely spans a portion of the parameter space larger than the true distribution of biological values: this is because it cannot fully capture correlations between different parameters. (Such correlations are difficult to reconstruct since many database entries are incomplete, containing some but not all of the enzymatic parameters.) For our purposes, having a distribution that effectively acts like a superset of the biological distribution is fine: whatever performance bounds we infer from the whole distribution will then also apply to the subset of the distribution that corresponds to current real-world systems. Moreover this also allows us to explore a larger enzymatic design space, which may have been accessible at earlier points in evolutionary history.
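The concentration-to-abundance conversion mentioned above is simple arithmetic: copy number = concentration × Avogadro's number × volume. The sketch below uses the 30 fL volume from the text; the function names are hypothetical, and the example concentrations are chosen only for illustration.

```python
AVOGADRO = 6.02214076e23       # molecules per mole (exact SI value)
CELL_VOLUME_LITERS = 30e-15    # 30 fL, the volume assumed in the text

def molar_to_copies(concentration_molar, volume_liters=CELL_VOLUME_LITERS):
    """Mean number of molecules corresponding to a molar concentration."""
    return concentration_molar * AVOGADRO * volume_liters

def copies_to_molar(copies, volume_liters=CELL_VOLUME_LITERS):
    """Inverse conversion: molar concentration for a given copy number."""
    return copies / (AVOGADRO * volume_liters)

for label, conc in (("0.5 nM", 0.5e-9), ("5 nM", 5e-9), ("50 nM", 50e-9), ("1 uM", 1e-6)):
    print(f"{label:>6} in 30 fL ~ {molar_to_copies(conc):.1f} molecules")
```

At this volume 1 nM corresponds to roughly 18 molecules, so nanomolar kinase concentrations translate into populations of tens to hundreds of molecules, where stochastic fluctuations matter.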

FIG 2.

Enzymatic parameter ranges for kinases/phosphatases based on the PaxDb [43] and Sabio-RK [45] databases. Because of the relative lack of phosphatase data (orange histograms) relative to kinases (blue histograms), we fit an overall log-normal joint probability to the total data set including both kinases and phosphatases. The marginal distributions from that global fit are plotted as purple curves. The parameters are as follows: (A) kinase substrate [S] and phosphatase [P] concentrations; (B) kinase/phosphatase Michaelis constants (κu + κr)/κb and (ρu + ρr)/ρb; (C) the corresponding specificity ratios κr κb/(κu + κr) and ρr ρb/(ρu + ρr); (D) kinase/phosphatase catalytic rates κr and ρr.

Two of the model parameters are still unaccounted for: the reverse reaction rates κ−r and ρ−r. Though usually small in magnitude and typically not measured in enzyme kinetic assays, we also know that they are crucially related to Δμ through the local detailed balance relation of Eq. (1). Thus, as explained in the next section, these become important free parameters that we can vary to explore signaling efficiency and its dependence on Δμ.
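One way to picture how the two reverse rates are tied to Δμ is the sketch below. It assumes that Eq. (1) takes the form (κu κ−r ρu ρ−r)/(κb κr ρb ρr) = exp(−Δμ/kBT), so that fixing Δμ and the six measured rates fixes the product κ−r ρ−r. All function names and numerical rate values are hypothetical and chosen only to exercise the formula.

```python
import math

def implied_delta_mu(kb, ku, kr, k_rev, rb, ru, rr, r_rev):
    """Delta mu (in kBT) implied by the eight rates, under the assumed form of Eq. (1)."""
    forward = kb * kr * rb * rr
    reverse = ku * k_rev * ru * r_rev
    return math.log(forward / reverse)

def rho_reverse_for(dmu, k_rev, kb, ku, kr, rb, ru, rr):
    """Reverse phosphatase rate that gives a chosen delta mu once k_rev is fixed.

    At constant delta mu the product k_rev * r_rev is constant, so contours of
    constant delta mu are straight lines in a log-log plot of the two reverse rates.
    """
    product = math.exp(-dmu) * (kb * kr * rb * rr) / (ku * ru)
    return product / k_rev

# Hypothetical forward and unbinding rates (arbitrary consistent units).
rates = dict(kb=1e-3, ku=1.0, kr=10.0, rb=1e-3, ru=1.0, rr=10.0)
k_rev = 1e-6
r_rev = rho_reverse_for(20.0, k_rev, **rates)
print(f"r_rev = {r_rev:.3e} gives delta mu = "
      f"{implied_delta_mu(k_rev=k_rev, r_rev=r_rev, **rates):.2f} kBT")
```

Scanning `k_rev` at fixed `dmu` traces out one constant-Δμ line in the plane of the two reverse rates.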

III. RESULTS

A. Minimum cost of transmitting information

Given the model described above, with a parameter set drawn at random from the empirical joint distribution, we can ask a basic first question: what is the minimum chemical potential difference Δμ required to achieve a certain mutual information I? The answer will depend on the nature of the input signal X(t), and thus we would like to test different effective input frequencies γx. To do this we will fix the mean free kinase concentration at the level of a low amplitude input, [K] = 5 nM, and vary γK, which varies γx according to Eq. (2) with Embedded Image for fixed Embedded Image. In the SI we also show the same analysis for [K] = 0.5 and 50 nM, with results qualitatively similar to those described below. After drawing enzyme parameters from the joint distribution and specifying γx at a given [K], the only two free parameters are the reverse reaction rates κ−r and ρ−r.

Fig. 3A shows a contour diagram of the MI, I, as a function of κ−r and ρ−r for a sample enzyme parameter set and value of γx. Superimposed are dotted lines of constant Δμ from Eq. (1). If one were interested in achieving a particular I value, for example I = 1 bit, one can then numerically determine the κ−r and ρ−r point along the I = 1 bit contour where Δμ is smallest. For this specific enzyme parameter set and γx, the value turns out to be Δμ = 6.72 kBT, which would then be recorded as the minimum necessary Δμ to achieve 1 bit of MI. Note that it is not guaranteed that a minimum Δμ solution exists for every parameter set sampled from the joint distribution. If the I contours plateau at a maximum less than 1 bit, no possible Δμ will allow that particular system to achieve the desired MI target. We will return to this important point below.

FIG 3.

(A) A representative contour diagram of the MI I (solid curves) as a function of κ−r and ρ−r for a parameter set drawn randomly from the joint distribution. Dotted lines denote contours of constant Δμ. In this case Δμ = 6.72 kBT is the smallest value at which the system can achieve I = 1 bit. (B) For a sample parameter set, the minimum Δμ needed to achieve I = 1, 2 bits as a function of input frequency γx. For the 1 bit case, the dashed line represents Embedded Image, the maximum γx compatible with I = 1 bit for this parameter set. As described in the text, we highlight two points along the curve: one at a frequency Embedded Image at roughly 95% of the bandwidth, and the other at frequency Embedded Image at roughly 1% of the bandwidth. The points will be plotted for many random draws of the enzyme parameters from the joint distribution in the lower panels of the figure. (C) For each target value of I = 1, 1.5, 2 bits, the percentage probability of randomly drawing a parameter set that has a Embedded Image higher than a given frequency. (D-F) The distribution of Embedded Image and Embedded Image (green) for many random parameter draws, keeping only those that can achieve I = 1 bit (D), 1.5 bits (E), or 2 bits (F). The probabilities of successfully drawing such a set are shown in red in each panel. The blue and green circles denote the median of each distribution respectively. (G-I) The same Embedded Image distributions as in panels (D-F), except plotted in terms of gain R0 on the vertical axis. The solid line is the analytical maximum bandwidth bound Embedded Image of Eq. (5). The purple circle in panel G shows the estimated result for the near-optimal yeast Pbs2/Hog1 system.

If one keeps the enzyme parameters (other than κ−r and ρ−r) fixed, and just varies γx, an interesting trend appears in the minimum Δμ results. Fig. 3B shows two examples of minimum Δμ curves, for target values of the MI I of 1 and 2 bits respectively. For a given I target, the minimum Δμ is nearly constant at low input frequencies, but then increases rapidly and diverges at a maximum frequency which we will dub the “bandwidth” of the system. This intuitively makes sense: the higher the input frequency, the more rapid the catalytic reaction rates needed to accurately transmit the signal through the system, increasing the required Δμ threshold. However, there is an inherent limit, given finite enzyme catalysis rates. Above the bandwidth, whose value depends on the enzyme parameters, the system can no longer achieve the target I. The higher the informational burden (i.e. increasing the target I from 1 to 2 bits) the lower the bandwidth: if one desires higher fidelity transmission, the range of transmissible signal frequencies will suffer.

To make more sense of these results, it is useful to look at a broad sample of enzyme parameters rather than a single set. To visualize global behaviors, we will calculate two numerical results for each set drawn from our joint distribution. The procedure is as follows: i) Sample an enzyme parameter set from the distribution; ii) Determine if it can achieve our target MI I for any input frequency; iii) If the answer is yes, vary γK until one finds the maximum possible value Embedded Image where one can still achieve the I target; iv) Calculate the minimum Δμ for an input signal very near the bandwidth frequency, where Embedded Image. We will call this result Δμhigh. The corresponding input frequency is Embedded Image. v) Analogously, calculate the minimum Δμ for an input signal with a frequency much lower than the bandwidth, where Embedded Image. This set of results we denote as Δμlow and Embedded Image. Fig. 3B shows the two points Embedded Image and Embedded Image as blue and green dots respectively for that particular parameter set at I = 1 bit. These two points encapsulate several key features of the minimum Δμ versus γx curve: Δμlow roughly corresponds to an “entry level” price, the minimum ATP hydrolysis chemical potential necessary to transmit the signal at any frequency, while the difference Δμhigh − Δμlow is the premium one has to pay to transmit signals near the highest possible frequencies. The value Embedded Image approximately corresponds to the bandwidth.
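Step iii) of the procedure above amounts to a one-dimensional search for the largest frequency at which the MI target is still reachable. A minimal sketch of such a search is given below, using bisection in log-frequency. It is not the authors' implementation: `can_reach_target` is a hypothetical stand-in for the full model calculation, and the toy threshold simply reuses the 1.22 × 10^−3 s^−1 figure quoted in the text for illustration. The 95% and 1% factors follow the Fig. 3 caption.

```python
import math

def find_bandwidth(can_reach_target, gamma_lo, gamma_hi, log_tol=1e-3):
    """Largest frequency (within a factor exp(log_tol)) where the target is reachable.

    Assumes can_reach_target(gamma) is True below a single threshold frequency
    and False above it. Returns None if even gamma_lo fails.
    """
    if not can_reach_target(gamma_lo):
        return None                      # this parameter set never reaches the target
    if can_reach_target(gamma_hi):
        return gamma_hi                  # bandwidth lies above the scanned range
    lo, hi = math.log(gamma_lo), math.log(gamma_hi)
    while hi - lo > log_tol:
        mid = 0.5 * (lo + hi)
        if can_reach_target(math.exp(mid)):
            lo = mid
        else:
            hi = mid
    return math.exp(lo)

# Toy stand-in for the model: reachable only below a hypothetical threshold (in 1/s).
def toy_can_reach_target(gamma_x, threshold=1.22e-3):
    return gamma_x <= threshold

bandwidth = find_bandwidth(toy_can_reach_target, 1e-8, 1.0)
gamma_high = 0.95 * bandwidth            # near-bandwidth frequency (Fig. 3 caption)
gamma_low = 0.01 * bandwidth             # low-frequency reference point
print(f"bandwidth ~ {bandwidth:.3e} 1/s, high = {gamma_high:.3e}, low = {gamma_low:.3e}")
```

Bisecting in log-frequency is a natural choice here because the relevant frequencies span many orders of magnitude.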

If one were to make numerous draws from the parameter distribution, and plot Embedded Image, Embedded Image and Embedded Image for each draw, one would get a cloud of blue and green dots. These are shown in Fig. 3D-F for target MI I of 1, 1.5, and 2 bits respectively. As mentioned above, not every draw will lead to a parameter set that can achieve the target, and the plots are labeled by the fraction of draws that are capable of reaching that particular value of I. That fraction decreases with I, from 13% for I = 1 bit down to only 2% for I = 2 bits. As I increases not only does it become progressively more difficult to find enzymatic parameters compatible with higher fidelity, but the accessible frequency range becomes more restricted. Fig. 3C shows the percentage of the parameter space that can achieve bandwidths higher than a given frequency for different I. For example let us consider the frequency 1.22 × 10^−3 s^−1, which is the I = 1 bit bandwidth for the yeast Pbs2/Hog1 system described in detail in Sec. IIIC. (This system is part of the Hog1 osmotic stress response pathway, whose overall bandwidth has been experimentally estimated to be of a similar scale [47]). From Fig. 3C it is evident that only about 0.41% of the draws from the parameter distribution have Embedded Image for a target I = 1 bit. If one were to attempt to transmit signals at such high frequencies for Embedded Image bits, the fraction of compatible parameter space shrinks to a minuscule 9 × 10^−3%. This reflects the exquisite fine-tuning required to put together a set of enzymatic loops capable of responding to quick, life-or-death variations of the external environment on time scales of a couple of minutes.
Going much beyond I = 1 bit and maintaining fast response times for a single push-pull loop is extremely difficult, and hence it makes sense that biology settles for I in the vicinity of 1 bit in many circumstances. Going much below 1 bit poses another set of difficulties, since such systems would not even be able to reliably transmit the difference between high and low values of input signal. For signaling that can occur over longer timescales (hours instead of minutes) it becomes much easier to find compatible parameter sets, with the median of the distribution of Embedded Image for I = 1 bit around ∼ 6 × 10^−5 s^−1.

From the perspective of costs, the bulk of the distribution of entry level prices Δµlow for Embedded Image bit is ≳ 1 kBT. Any system much below this would be too close to equilibrium (reverse rates comparable to forward rates) for effective information transfer to occur. The median of the Δμlow distribution in Fig. 3D is 4 kBT, increasing to about 6 kBT for Embedded Image bits in Fig. 3E. These values are on the same scale as estimates of minimum Δμ ∼ 4 kBT ln 2 required for 99% accurate readout of a ligand-bound receptor via the activation of a downstream molecule, assuming an arbitrarily slow readout process [23]. In that system (as in ours), processing information at faster time scales requires larger Δμ. Indeed we find that the median values for Δμhigh range between 8 − 10 kBT for Embedded Image bits. The minimum Δμ near the bandwidth is typically shifted up by about 4 kBT, reflecting the premium necessary to transmit near the frequency limit. Paying this premium is worthwhile: frequencies Embedded Image accessible at Δμlow prices are likely far too low to have biological relevance, with the distributions of Embedded Image largely below 10⁻⁵ s⁻¹. Gaining the ability to respond to signals at more biologically reasonable time scales thus means being capable of transmitting closer to the bandwidth, making Δμhigh a more useful measure of minimum biological costs.

The Δμhigh distributions show that it is possible to have signaling systems that transmit at least 1 bit of MI and operate at Δμ lower than the current physiological range (Δμ ≈ 21−29 kBT [24], indicated in pink in Fig. 3D-F). This is true even for systems with the fastest responses (large Embedded Image near the right edges of the distribution). This means that one can imagine enzymatic signaling systems in the earliest stages of evolutionary history that could reliably distinguish high and low inputs even before ATP metabolism (maintaining high ATP concentrations relative to ADP and Pi) reached its modern levels of efficiency.

In fact, a fascinating universal feature of the distributions is that the physiological Δμ range lies just above the top edge of the distributions. Naively it would seem as if the physiological values are just high enough to allow these signaling loops to transmit Embedded Image bits across the broadest possible parameter subset. This gives evolution the largest possible space in which to tweak tradeoffs between fidelity and response times without running into chemical potential limitations. Of course Δμ influences not just signaling networks but the entire range of cellular functions, so it is impossible to say with certainty what factors played the largest role in determining the values of Δμ we see in present-day organisms. But at least from the perspective of signaling at the level of a push-pull loop, it is clear that Δμ ≈ 21 − 29 kBT is more than good enough for basic information transfer needs, and there would be no benefit in having a system with substantially higher Δμ. Maintaining Δμ = 40 or 50 kBT, for example, would require significant additional metabolic resources, with little payoff in terms of either Embedded Image or bandwidth.

B. Analytical bound describes tradeoff between bandwidth and information

The results above already illustrated the tradeoff between bandwidth and MI, with parameter sets that achieve very large Embedded Image becoming progressively harder to find as the target Embedded Image increases. Can we understand this relationship in more detail? For this purpose we take advantage of optimal noise filter theories, originally developed in the context of signal processing [48–50], and in recent years applied to a variety of biological signaling networks [30, 31, 51–54]. The original motivation involved designing a filter for a signal corrupted by noise, such that the output matched the uncorrupted input signal as closely as possible. In the biological context, this same framework allows us to put bounds on the maximum MI achievable between input and output signals for given input and enzymatic parameters. As shown in the SI, our enzymatic push-pull loop can be approximately mapped onto an effective two-species input-output system, which is then amenable to analytical treatment using the Wiener-Kolmogorov optimal filter theory [30, 48–50].

The end result is a remarkably simple analytical relation between the maximum possible bandwidth Embedded Image achievable given a target value of Embedded Image, Embedded Image

The only other enzymatic parameter that appears in the relation is the gain R0, a measure of output production relative to the input. Fig. 3G-I shows the same parameter set distribution as the Embedded Image points in Fig. 3D-F, except replotted in terms of (Embedded Image, R0), where R0 is the gain for each parameter set. The solid line is the bound of Eq. (5). Even though this bound is based on an approximation of the full enzymatic system, and hence is not guaranteed to be exact, it still provides an excellent cutoff for the distribution of (Embedded Image, R0) points. For systems at a certain R0, we see that as Embedded Image is increased and the denominator in Eq. (5) gets larger, the maximum bandwidth Embedded Image shifts to lower values. If we are interested in a fast response time, increasing Embedded Image systematically reduces the compatible parameter space, since we are forced to rely on cases with larger and larger R0. Thus Eq. (5) rationalizes the earlier observation of limited options for networks that can simultaneously respond to signals fluctuating on minute time scales and achieve Embedded Image significantly larger than 1 bit.
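As a minimal numerical sketch of how steeply this bound tightens with fidelity, the snippet below assumes that Eq. (5) has the form γx,max = R0 / [4^(I+1) (4^I − 1)]. This form is not copied from the equation itself; it is what the Wiener-Kolmogorov bound E ≥ 2/(1 + √(1 + R0/γx)) described in the SI implies when the error is written as E = 2^(−2I). The function names and the example gain are illustrative assumptions.

```python
def info_factor(bits: float) -> float:
    """Assumed denominator of Eq. (5): 4^(I+1) * (4^I - 1) for a target MI of I bits."""
    if bits <= 0:
        raise ValueError("target mutual information must be positive")
    return 4.0 ** (bits + 1.0) * (4.0 ** bits - 1.0)


def max_bandwidth(gain_r0: float, bits: float) -> float:
    """Largest characteristic input frequency (s^-1) compatible with gain R0 (s^-1)."""
    return gain_r0 / info_factor(bits)


if __name__ == "__main__":
    r0 = 0.1  # hypothetical gain in s^-1, chosen only for illustration
    for bits in (1.0, 1.5, 2.0):
        print(f"I = {bits} bit: factor = {info_factor(bits):.0f}, "
              f"max bandwidth = {max_bandwidth(r0, bits):.3e} s^-1")
```

Under this assumed form, going from 1 to 2 bits at fixed R0 lowers the maximum bandwidth by a factor of 20, which is one way to read the shrinking parameter space in Fig. 3G-I.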

C. Optimality and the yeast Pbs2/Hog1 push-pull loop

There is an alternative way of thinking about the R0 versus Embedded Image results in Fig. 3G-I. Imagine a system working at Embedded Image with a certain gain parameter R0 and achieving a target value Embedded Image. Comparing other parameter sets with the same bandwidth Embedded Image and target Embedded Image (taking a vertical slice of one of the panels in Fig. 3G-I), they will have a variety of different R0 values, but all of these will be bounded from below by the minimum value Embedded Image

When Embedded Image, the system sits on the optimality line of Eq. (5), with Embedded Image.

The discrepancy between R0 and Embedded Image for a given system allows us to see how close the signaling behavior is to optimality. Let us take a concrete biological example: the Pbs2/Hog1 enzymatic push-pull loop from yeast, part of the Hog1 signaling pathway that allows the organism to respond to osmotic stress. As described in the SI, key parameters for this system can be estimated based on an earlier model [46] fit to microfluidic experimental data where yeast was exposed to periodic salt shocks [55]. The results for the bandwidth and gain for Embedded Image bit are: Embedded Image and R0 = 0.0621 ± 0.0001 s⁻¹, with the error bars reflecting uncertainties due to unknown parameters (where we used priors based on the log-normal distributions in Fig. 2). The scale of the predicted bandwidth Embedded Image is consistent with microfluidic estimates. Ref. [47] found a steep dropoff in the mean amplitude of the Hog1 response to periodic step-like changes in external osmolyte concentrations when the frequencies of the changes increased from 10⁻³ s⁻¹ to 10⁻² s⁻¹. At frequencies beyond the dropoff the Hog1 output can no longer reproduce the osmolyte input at high fidelity. Though the form of the input in this case is different from that in our model, and the experiment probes the entire pathway rather than just the Pbs2/Hog1 component, the similarity in scales to our Embedded Image value suggests that the Pbs2/Hog1 system may play a major role in determining the bandwidth of the whole pathway (since the bandwidth of the whole is constrained by the bandwidths of the components).

Intriguingly, the estimated gain R0 is very close to the minimum possible value Embedded Image for signaling at the bandwidth Embedded Image with Embedded Image bit, as seen in Fig. 3G. Using Eq. (6), we find Embedded Image. This naturally leads to the question: is the fact that this system lies so close to optimality a coincidence, or are there reasons why natural selection might favor minimizing R0 in this case? To answer this question, we first have to consider the relationship between gain and ATP consumption.
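The closeness to optimality can be checked with a few lines of arithmetic. The sketch below assumes that Eq. (6) is the rearrangement R0,min = γx · 4^(I+1) (4^I − 1) of the bandwidth bound, and uses the 1 bit Pbs2/Hog1 bandwidth of 1.22 × 10⁻³ s⁻¹ and the gain R0 = 0.0621 s⁻¹ quoted in the text. The function name is hypothetical, and the computed minimum is our reconstruction rather than the value reported with Eq. (6).

```python
def min_gain(bandwidth: float, bits: float) -> float:
    """Assumed form of Eq. (6): smallest gain R0 (s^-1) that supports the
    given bandwidth (s^-1) at a target mutual information of `bits`."""
    return bandwidth * 4.0 ** (bits + 1.0) * (4.0 ** bits - 1.0)


if __name__ == "__main__":
    gamma_x = 1.22e-3   # s^-1, 1 bit bandwidth quoted for yeast Pbs2/Hog1
    r0 = 0.0621         # s^-1, estimated gain quoted in the text
    r0_min = min_gain(gamma_x, 1.0)
    print(f"R0_min = {r0_min:.4f} s^-1, R0 / R0_min = {r0 / r0_min:.3f}")
```

With these inputs the estimated gain sits roughly 6% above the assumed minimum, consistent with the point lying near the solid line in Fig. 3G.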

D. Minimum ATP consumption to achieve a certain signaling fidelity and bandwidth

This bound on the gain parameter in Eq. (6) is directly related to the metabolic cost of signaling, since higher production of the output per given input level will generally require a higher rate of phosphorylation events. We can roughly quantify the average rate of phosphorylation: in the stationary state this is just the mean rate of the kinase-catalyzed reaction step, Embedded Image. Assuming one ATP hydrolyzed per reaction, A is the mean rate at which ATP is consumed by the system, and is related to R0 through Embedded Image as shown in the SI. In the enzymatic parameter ranges we consider, κr is typically much larger than R0, so we can approximate this relation as Embedded Image. Using Eq. (6) we can then estimate the minimum possible ATP consumption rate given a target Embedded Image and bandwidth Embedded Image: Embedded Image

Fig. 4A shows the same parameter set values as the Embedded Image points in Fig. 3D for Embedded Image bit, except plotted in terms of Embedded Image. The A values are exact, but the approximate relation of Eq. (7) provides an excellent lower bound on the distribution. Qualitatively, the individual elements of Eq. (7) all make intuitive sense. An increase in any of the constituent factors (the mean free input kinase population Embedded Image, the target information Embedded Image, the bandwidth Embedded Image) puts greater demands on the signaling system, requiring more catalytic activity and hence faster ATP consumption. Note that the above results are easily generalized if the reaction step consumes more than one ATP: for example the effective model for yeast Pbs2/Hog1 discussed above involves phosphorylation at two sites, which would lead to expressions for A and Amin getting a prefactor of two.
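To make the scaling concrete, here is a rough sketch of the minimum ATP consumption rate. It assumes that the approximate relation between consumption and gain is A ≈ R0 × (mean free kinase population), so that Eq. (7) reads Amin ≈ n · K̄ · γx · 4^(I+1) (4^I − 1), with n the number of ATP molecules consumed per reaction step. That form is inferred from the factors listed above rather than copied from the equation, and the kinase population in the example is a made-up value.

```python
def min_atp_rate(mean_free_kinase: float, bandwidth: float, bits: float,
                 atp_per_step: int = 1) -> float:
    """Assumed form of Eq. (7): minimum ATP consumption rate (P per second).

    mean_free_kinase : mean free input kinase population (molecules)
    bandwidth        : characteristic input frequency gamma_x (s^-1)
    bits             : target mutual information
    atp_per_step     : ATP hydrolyzed per catalytic step (2 for two-site phosphorylation)
    """
    factor = 4.0 ** (bits + 1.0) * (4.0 ** bits - 1.0)
    return atp_per_step * mean_free_kinase * bandwidth * factor


if __name__ == "__main__":
    k_mean = 1000.0  # hypothetical mean free kinase population
    for n_atp in (1, 2):
        rate = min_atp_rate(k_mean, 1.22e-3, 1.0, atp_per_step=n_atp)
        print(f"{n_atp} ATP per step: A_min = {rate:.2f} P/s")
```

The `atp_per_step` argument mirrors the prefactor of two mentioned above for the two-site Pbs2/Hog1 model.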

FIG 4.

(A) The same Embedded Image point distribution as in Fig. 3D for Embedded Image bit, except plotted in terms of ATP consumption rate A on the vertical axis. The solid line is the approximate lower bound Amin on ATP consumption given by Eq. (7). (B) This distribution replotted with selection coefficient |s| on the vertical axis. |s| quantifies the fitness cost associated with a system that achieves the target Embedded Image bit but is sub-optimal in ATP consumption, relative to an optimal variant where A = Amin. The value of |s| becomes evolutionarily significant when it is higher than a “drift threshold” Embedded Image, where Ne is the effective population of the organism (a measure of genetic diversity). The ranges of Embedded Image for different classes of organisms are shown on the right [32, 56]. The vertical dotted line corresponds to the estimated Embedded Image for the yeast Pbs2/Hog1 system.

E. Evolutionary pressure on the metabolic costs of signaling

It is clear from Fig. 4A that for many parameter set choices the ATP consumption rate A is significantly larger than for a system near optimality (A ≈ Amin) given the same Embedded Image and Embedded Image. Let us consider a specific scenario where the bandwidth Embedded Image and the target Embedded Image are sufficient for the biological function of the signaling, i.e., there are rapidly diminishing fitness returns in going to higher bandwidth and signal fidelity. In this scenario a system with A > Amin has no significant adaptive advantage over one with A ≈ Amin, but instead incurs a fitness penalty because of the superfluous ATP consumption. Would there be evolutionary pressure on this sub-optimal system to move toward optimality?

The answer to this question has practical ramifications, because it will allow us to predict whether we should expect to see natural enzymatic push-pull loops cluster around the optimality line (as we saw in the yeast Pbs2/Hog1 example). The alternative, in the absence of strong evolutionary pressure to optimize, is a wider dispersion, more similar to Fig. 4A where the points are drawn at random from the enzymatic parameter distribution. Note that this is a question that is directly amenable to future kinetic experiments: for systems where we can fully characterize the enzymatic parameters of the push-pull loop (for both the kinase and phosphatase), all the relevant quantities like Embedded Image, A, and Embedded Image can be calculated.

Naively one might expect evolution to always drive systems to optimality due to natural selection, but genetic drift can play a significant competing role, allowing sub-optimal variants to flourish and even fix in a population [57]. To be specific, let us consider a unicellular organism that reproduces via binary fission, and two genetic variants of that organism that differ in the enzymatic parameters of a push-pull signaling loop: both variants achieve the same Embedded Image and Embedded Image, but one has A > Amin and one has A = Amin. Let us denote the relative fitness of the sub-optimal versus the optimal type as 1 + s, defining a selection coefficient s. In other words the sub-optimal variant will have on average 1 + s offspring relative to the optimal one during the generation time of the optimal type. In the scenario described above, where the extra production does not confer any adaptive advantage and only imposes a metabolic cost, we will have s < 0, because the superfluous ATP consumption will lead to slower growth.

The magnitude of s determines the degree of selective pressure on the sub-optimal variant. The key quantity that sets the relevant scale for s is the effective population Ne of the organism, the size of an idealized population that exhibits the same changes in genetic diversity per generation due to drift as the actual population [56]. When s < 0 and Embedded Image Embedded Image, natural selection dominates drift, exponentially suppressing the probability of a sub-optimal mutant fixing in a population of optimal organisms. On the other hand if Embedded Image, drift dominates, and the fixation probability of sub-optimal mutants is roughly the same as for a neutral (s = 0) mutation [58]. In this case it would be difficult to maintain optimality in a population over the long term. Ne for organisms is typically smaller than their actual population in the wild, and varies by several orders of magnitude among different classes: for unicellular species it can be as high as ∼ 10⁹ − 10¹⁰ in bacteria down to ∼ 10⁶ − 10⁸ in single-celled eukaryotes [32, 56]. (It becomes even smaller among higher eukaryotes, going down to ∼ 10⁴ in vertebrates.) The corresponding ranges for the “drift threshold” Embedded Image [32] are shown on the right in Fig. 4B.

The question then becomes: how do we estimate s and how does it compare to the relevant Embedded Image for the class of interest? For the case where a variant imposes metabolic costs but no adaptive advantage, there is a very useful relation that posits s ∼ −δCT /CT [32, 59, 60]. Here CT is the total resting metabolic expenditure of an organism during a generation time, measured for example in units of P, where 1 P = one phosphate bond hydrolyzed (ATP or ATP equivalent consumed). δCT is the extra expenditure incurred by the more costly mutant. This relation has already been used to explore selective pressures in yeast [60], unicellular prokaryotes and eukaryotes [32], and viral infections [61]. It was recently derived from first principles through a general bioenergetic growth model [33], where the relation was refined with a more accurate prefactor: s ≈ − ln(Rb)δCT /CT. Here Rb is the mean number of offspring per individual (i.e. Rb = 2 for binary fission).

The value of CT can be readily estimated for single-celled organisms, where it scales roughly with cell volume [32, 33]. Given the 30 fL cell volume used in our calculations, and assuming a generation time (cell division time) tr = 1 hr, we find CT ≈ 7 × 10¹¹ P (see details in the SI), comparable in magnitude to experimental estimates for yeast [32]. Since δCT reflects the extra ATP consumed by the costly mutant (with consumption rate A) versus the optimal variant (rate Amin) over one generation time, we can write δCT = (A − Amin)tr. We can thus calculate s for all the near-bandwidth Embedded Image = 1 bit parameter sets represented in Fig. 4A. The results for |s| versus Embedded Image are plotted in Fig. 4B. Because increased ATP consumption is required to achieve larger bandwidths (as seen in Eq. (7)), the distribution of selective penalties |s| for being sub-optimal is pushed to larger values with greater Embedded Image. In other words, higher bandwidths make the energetic stakes more significant.
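The estimate of s can be sketched directly from the relations stated above, s ≈ −ln(Rb) δCT/CT with δCT = (A − Amin) tr. The function name and the excess consumption of 1000 P/s in the example are hypothetical; CT = 7 × 10¹¹ P and tr = 1 hr are the values quoted in the text.

```python
import math


def selection_coefficient(atp_rate: float, atp_rate_min: float,
                          generation_time_s: float, total_cost_p: float,
                          offspring: float = 2.0) -> float:
    """s ~ -ln(R_b) * (A - A_min) * t_r / C_T for a costly but non-adaptive variant.

    atp_rate, atp_rate_min : ATP consumption rates A and A_min (P per second)
    generation_time_s      : generation time t_r in seconds
    total_cost_p           : resting metabolic expenditure C_T per generation (P)
    offspring              : mean offspring number R_b (2 for binary fission)
    """
    delta_ct = (atp_rate - atp_rate_min) * generation_time_s
    return -math.log(offspring) * delta_ct / total_cost_p


if __name__ == "__main__":
    # Hypothetical variant burning 1000 P/s more than the optimal one.
    s = selection_coefficient(1058.56, 58.56, 3600.0, 7e11)
    print(f"s = {s:.3e}")
```

Whether a value of this size matters depends on how |s| compares with the drift threshold for the organism in question, as discussed in the text.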

We can now rationalize why the yeast Pbs2/Hog1 loop might be close to optimality. The bandwidth for that system (indicated by a vertical dashed line in Fig. 4B) is near the higher end of the spectrum. Suboptimal parameter values that achieve approximately the same bandwidth at Embedded Image bit span a range of |s| values between 10⁻⁸ and 10⁻⁴. Given Ne = 10⁶ − 10⁸ for single-celled eukaryotes [32, 56], and estimates of Ne ≈ 10⁷ for wild yeast populations [62], these suboptimal systems likely have |s| near or above the drift threshold Embedded Image. Thus we would expect yeast to be under evolutionary pressure to optimize the energy expenditures associated with the enzymatic loop.

IV. DISCUSSION AND CONCLUSIONS

The kinase-phosphatase push-pull signaling network, which maintains a certain value of mutual information Embedded Image between input and output, incurs energetic costs in the form of ATP consumption. These costs have two related facets: (i) the free energy expenditure Δμ for each hydrolysis reaction, and (ii) the number of such reactions A per unit time. Achieving empirical values like Embedded Image bits requires satisfying both aspects of the cost. There is a minimal price in terms of Δμ to achieve any given Embedded Image, and this price increases if one demands either greater fidelity (larger Embedded Image) or the ability to process faster signals (larger γx). Modern cells are more than willing to pay this part of the price, with Δμ sufficiently high to meet the minimal requirements for any enzymatic parameter set that hits a target Embedded Image on the order of 1 bit. However, as the distributions in Fig. 3D-F illustrate, there are certainly options for signaling systems that work at similar fidelities under conditions of smaller Δμ, the presumptive scenario earlier in evolutionary history. In all cases we require some degree of fine-tuning of enzymatic parameters: the higher the fidelity or frequency demands, the smaller the fraction of parameter space that satisfies them. This leaves vanishingly small room to achieve networks that operate at Embedded Image significantly larger than the known empirical range.

For particular parameter combinations the system is optimal, exhibiting the maximum possible bandwidth (Embedded Image of Eq. (5)) with the minimal ATP consumption (Amin of Eq. (7)). Is such optimality widely realized in nature? Analyzing the selective pressures due to superfluous ATP expenditures indicates that this is a worthwhile question to pursue. We have already identified one near-optimal candidate in the yeast Hog1 signaling pathway. Based on the results of the previous section, we predict that the best place to look for others is among signaling pathways with high bandwidths, for example ∼ 10⁻³ − 10⁻² s⁻¹ at the extremes of the current biological distribution. Here the metabolic costs of being suboptimal would be significant for single-celled organisms.

More broadly, strong selective pressure on the costs of running signaling networks in single-celled organisms is likely to be a widespread phenomenon. To give another example, the expenditure of running the chemotaxis machinery in E. coli has been estimated to be about 10⁷ P per ∼ 1 hr cell cycle [2, 4]. Compared to a value of CT ≈ 2 × 10¹⁰ P for E. coli [32, 33], we get an |s| ∼ 10⁻⁴, which is definitely significant for a bacterial population. We have barely begun to understand the kinds of optimization that such selective pressure has induced. Our approach readily generalizes beyond the kinase-phosphatase system, setting the stage for exploring these issues in a much wider array of biochemical networks.

Data and code availability

The code for our analysis, along with the data used to generate the figures, is available at: https://github.com/hincz-lab/cell-signaling.

Supplementary Information

I. DERIVATION OF THE LOCAL DETAILED BALANCE RELATION

FIG S1.

The enzymatic push-pull loop from the perspective of an individual substrate molecule. The protein can exist in one of four states: unmodified substrate (σ), bound to kinase (σK), phosphorylated (σ∗), and bound to phosphatase while phosphorylated Embedded Image. The forward (clockwise) transition rates between these states are indicated in black, while the reverse (counterclockwise) rates are in red.

To derive the local detailed balance relation of main text Eq. (1), it is convenient to focus on the reactions from the perspective of an individual substrate molecule [1]. A given molecule in our model can be in one of four states, indicated in Fig. S1 with corresponding forward and reverse transition rates. For example, if the molecule is an unmodified substrate (state σ) it can transition to a kinase-bound substrate (state σK) with rate κb[K], proportional to the surrounding concentration [K] of kinase molecules. It can revert from σK to σ with rate κu. The other transitions in Fig. S1 are defined analogously, with forward rates colored black and reverse rates in red. Local detailed balance entails that the product of reverse rates divided by the product of forward rates is equal to exp(βΔG), where ΔG is the free energy change of the system associated with a single forward traversal of the loop and β = (kBT)⁻¹ [1]. Since after one loop from σ to σ the substrate is back in the same state (as well as the kinase and phosphatase), there is no contribution to ΔG from these molecules. However a single loop leads to the hydrolysis of a single molecule of ATP, so ΔG = Δμ, as defined in the main text. Putting everything together, the local detailed balance relation reads Embedded Image yielding main text Eq. (1).
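As a rough illustration of how this relation ties the rate constants to the chemical potential, the sketch below computes the magnitude of Δμ (in units of kBT) as the logarithm of the ratio of forward to reverse rate products. Several things here are assumptions rather than statements from the text: that the forward set is (κb, κr, ρb, ρr) and the reverse set is (κu, κ−r, ρu, ρ−r), that the kinase and phosphatase concentrations cancel between forward and reverse products, and that only the magnitude is reported so as to sidestep the sign convention. The rate values are the top parameter set listed in the Fig. S2 caption; the target of 25 kBT is an arbitrary choice inside the physiological range quoted in the main text.

```python
import math


def delta_mu_over_kt(kb, kr, rho_b, rho_r, ku, k_minus_r, rho_u, rho_minus_r):
    """|Delta mu| / kBT from the ratio of forward to reverse rate products."""
    forward = kb * kr * rho_b * rho_r
    reverse = ku * k_minus_r * rho_u * rho_minus_r
    return math.log(forward / reverse)


def reverse_catalysis_rate(target, kb, kr, rho_b, rho_r, ku, rho_u, rho_minus_r):
    """Value of kappa_{-r} that gives |Delta mu| = target (in kBT), other rates fixed."""
    forward = kb * kr * rho_b * rho_r
    return forward * math.exp(-target) / (ku * rho_u * rho_minus_r)


if __name__ == "__main__":
    rates = dict(kb=2.94e-6, kr=12.8, rho_b=3.68e-7, rho_r=1.34,
                 ku=1.58e-2, rho_u=4.42e-4, rho_minus_r=2.50e-5)
    k_minus_r = reverse_catalysis_rate(25.0, **rates)
    print(f"kappa_-r = {k_minus_r:.3e}")
    print(f"check: |dmu| = {delta_mu_over_kt(k_minus_r=k_minus_r, **rates):.2f} kBT")
```

Scanning κ−r in this way, with all other rates held fixed, is the same kind of sweep described for Fig. S2, where a larger κ−r/κr corresponds to a smaller magnitude of Δμ.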

II. CHEMICAL LANGEVIN APPROACH FOR THE KINASE-PHOSPHATASE PUSH-PULL LOOP

In this section we derive the stationary state properties of the kinase-phosphatase push-pull loop via the chemical Langevin approximation. The derivation will follow analogously to Ref. [2], except here the system is more complicated due to the inclusion of reverse enzymatic reactions. The end goal will be a method to estimate the mutual information Embedded Image, given by main text Eq. (4), Embedded Image which requires evaluating the variances of the input and output, Embedded Image Embedded Image, as well as the covariance Embedded Image. The quantity E here will be referred to as the “error” in signal propagation between input and output, and can be equivalently expressed as E = 1 − ρ², where ρ is the Pearson correlation coefficient between X and Y.
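For orientation, the relation between the second moments and the mutual information can be sketched as follows. We assume the standard Gaussian-channel form, I = −(1/2) log2 E with E = 1 − ρ², which matches the description of E above but is not copied from Eq. (4) itself. The function names are illustrative.

```python
import math


def signal_error(var_x: float, var_y: float, cov_xy: float) -> float:
    """Error E = 1 - rho^2, with rho the Pearson correlation of input and output."""
    rho_sq = cov_xy ** 2 / (var_x * var_y)
    if rho_sq >= 1.0:
        raise ValueError("correlation must satisfy rho^2 < 1 for a finite result")
    return 1.0 - rho_sq


def mutual_information_bits(var_x: float, var_y: float, cov_xy: float) -> float:
    """Gaussian estimate of the mutual information, I = -(1/2) log2(E), in bits."""
    return -0.5 * math.log2(signal_error(var_x, var_y, cov_xy))


if __name__ == "__main__":
    # rho^2 = 0.75 gives E = 0.25, i.e. one bit under the assumed relation.
    print(mutual_information_bits(1.0, 1.0, math.sqrt(0.75)))
```

Under this form a target of 1 bit corresponds to E = 1/4 and a target of 2 bits to E = 1/16, so each additional bit requires a fourfold reduction in the error.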

A. Dynamical equations

Our starting point is the full system of reactions for the enzymatic push-pull loop, Embedded Image where ∅ represents the void (upstream deactivated kinase which does not enter into our model). The corresponding chemical Langevin equations [3] are given by: Embedded Image where the last line ensures that the total populations of free or bound phosphatase Embedded Image and free or bound substrate in all its forms Embedded Image remain constant. The noise terms Embedded Image, where ηi(t) are Gaussian noise functions with zero mean and correlations Embedded Image. The five noise terms are associated with reactions in the system, and the corresponding prefactors represent the sum of the mean production (forward) and deactivation/unbinding (backward) contributions to each reaction: Embedded Image

As described in the next section, we will be linearizing Eq. (S4), keeping terms up to first order in deviations from the stationary state values. In this linearized approach, the stationary state populations are given by: Embedded Image with the following definitions: Embedded Image

The input (total kinase) is X = K + SK and the output (total activated substrate) is Embedded Image, and hence Eq. (S6) can be used to calculate the stationary values Embedded Image and Embedded Image.

B. Second moments

In order to calculate the variance and covariance of the input and output, we also need to know Embedded Image. To estimate these quantities, the first step is to switch variables in Eq. (S4) to focus on deviations from the stationary state values: Embedded Image Embedded Image. We can in turn rewrite these four variables in terms of the input and output deviations Embedded Image and Embedded Image : Embedded Image where we have introduced two additional auxiliary variables δXq and δYq. Plugging Eq. (S8) into Eq. (S4), we simplify the system through linearization, ignoring any terms of second order or higher in the deviations. As demonstrated below in comparisons with kinetic Monte Carlo (KMC) simulations [4] of the original system, this linearized chemical Langevin approximation works well for our parameter ranges. Finally, we Fourier transform the linearized Eq. (S4), and the resulting system of equations takes the form Embedded Image where Embedded Image denotes the Fourier transform of quantity Q(t). The matrix M is given by: Embedded Image

The Fourier-space system of equations Eq. (S9)-(S10) can be solved for Embedded Image and Embedded Image. The expressions are complicated, but take the form of a linear combination of Fourier-space noise terms: Embedded Image where Embedded Image and Embedded Image are some prefactors which can be expressed as rational functions of ω. The prefactors have the property Embedded Image. In Fourier space the correlations among the noise terms take the form Embedded Image. Hence we can calculate the input power spectral density (PSD) PX(ω), the output PSD PY (ω) and the cross PSD PXY (ω), defined via Embedded Image

Plugging Eq. (S11) into Eq. (S12), we find expressions for the PSDs in terms of the prefactor functions: Embedded Image

The final step is to calculate the second moments from integrals of the PSDs, using the inverse Fourier transform of Eq. (S13) evaluated at t = ti:

Embedded Image

Given the explicit expressions for the prefactor functions in Eq. (S13) (which are available as part of the Mathematica notebooks in the Github repository associated with the manuscript), one can numerically evaluate the integrals in Eq. (S14) to get the moments.
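The numerical step can be illustrated on a case with a known answer. The sketch below is not the authors' Mathematica code: it assumes the convention that a stationary second moment equals (1/2π) ∫ P(ω) dω over the real line, and it integrates with the substitution ω = tan θ and a midpoint rule. The test spectrum is the Lorentzian 2F/(ω² + γ²) of a simple birth-death process, whose variance should come out as F/γ.

```python
import math


def second_moment_from_psd(psd, n_points: int = 20000) -> float:
    """Approximate (1 / 2 pi) * integral of psd(omega) over the whole real line.

    Uses omega = tan(theta), theta in (-pi/2, pi/2), with a midpoint rule so
    that the endpoints (where tan diverges) are never evaluated.
    """
    h = math.pi / n_points
    total = 0.0
    for i in range(n_points):
        theta = -0.5 * math.pi + (i + 0.5) * h
        omega = math.tan(theta)
        total += psd(omega) / math.cos(theta) ** 2  # d(omega) = d(theta) / cos^2
    return total * h / (2.0 * math.pi)


if __name__ == "__main__":
    F, gamma = 10.0, 2.0  # hypothetical production and deactivation rates
    variance = second_moment_from_psd(lambda w: 2.0 * F / (w * w + gamma * gamma))
    print(f"numerical variance = {variance:.6f}, expected F/gamma = {F / gamma}")
```

The same routine can be pointed at any rational PSD that decays at least as fast as 1/ω², which is the behavior one would expect for the spectra of Eq. (S13).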

C. Comparison to kinetic Monte Carlo simulations for mutual information

The chemical Langevin calculation of the second moments allows us to use Eq. (S2) to estimate the mutual information Embedded Image. We can then check whether this estimate is consistent with the results we would get from KMC simulations of the full system. Fig. S2 shows this comparison for two sample parameter sets drawn from the enzymatic parameter distribution described in Sec. IV. Since we are interested in exploring the full range of chemical potentials Δμ, in each case we calculate Embedded Image varying the reverse-to-forward rate ratio κ−r/κr, keeping all other parameters constant. Through main text Eq. (1), increasing κ−r/κr corresponds to decreasing the magnitude of Δμ. At very large Δμ (small κ−r/κr) the Embedded Image curves saturate at the maximum possible mutual information for that parameter set, while at small Δμ (large κ−r/κr) the mutual information approaches zero, the equilibrium limit. Across the whole range we see that the chemical Langevin theoretical prediction is in close agreement with the KMC results.

III. CHARACTERISTIC FREQUENCY γx, GAIN R0, AND THE CONDITIONS FOR WIENER-KOLMOGOROV NOISE FILTER OPTIMALITY

A. Deriving the γx and R0 expressions in main text Eq. (2)

Since the effective frequency γx of the input and the gain R0 play central roles in the analysis, having simple closed form approximations for them [main text Eq. (2)] is useful. The original definitions of these two variables, as described in the main text, are as follows: (i) γx is related to the autocorrelation of input fluctuations, Embedded Image; (ii) Embedded Image measures output production for a given level of input. As demonstrated in the next section, both of these can be calculated from KMC simulations (at significant computational expense for each different set of parameters). Alternatively, the chemical Langevin approximation of SI Sec. II can be used to derive somewhat cumbersome analytical expressions.

However the most convenient option is to take advantage of the meaning of γx and R0 in an effective, two-species description of the kinase-phosphatase reaction network. Imagine a system with an input species population X(t), output Y (t), and a simplified chemistry with only four reactions: production of input at rate F, deactivation of input at rate γxX(t), production of output at rate R0X(t), and deactivation of output at rate γyY (t). In this two-species system the inverse input autocorrelation time is given by the deactivation rate parameter γx, and the coefficient R0 in the output production rate is also the gain parameter. To relate this simplified model to the full reaction network of SI Sec. II, we compare analogous quantities in the simplified and full schemes. For example, let us take the mean input population Embedded Image. In the simplified scheme this is given by Embedded Image

In the full network Embedded Image can be calculated from Eq. (S6) as Embedded Image where the Ci are expressed in terms of full network parameters in Eq. (S7). Comparing Eqs. (S15) and (S16) we see that γx should be given by Embedded Image which is the first expression in main text Eq. (2). Similarly the mean production rate of the output in the simplified scheme is Embedded Image. In the full system the mean output production is the average rate at which new phosphorylated substrate is produced via catalysis by the kinase-substrate complex, Embedded Image where we have again used Eqs. (S6)-(S7). Comparing Eq. (S18) to Embedded Image, we see that R0 should correspond to Embedded Image which is the second expression in main text Eq. (2).
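The effective two-species scheme is simple enough to simulate directly, which gives a feel for what γx and R0 mean. The following is a minimal Gillespie-type sketch of the four reactions listed above (production of X at rate F, deactivation at γx X, production of Y at R0 X, deactivation at γy Y). It is an illustration under assumed parameter values, not the KMC code used for the full network, and the function name is hypothetical.

```python
import random


def simulate_two_species(F, gamma_x, R0, gamma_y, n_steps, seed=0):
    """Gillespie simulation of the two-species model; returns time-averaged (X, Y)."""
    rng = random.Random(seed)
    x = y = 0
    t = x_time = y_time = 0.0
    for _ in range(n_steps):
        rates = (F, gamma_x * x, R0 * x, gamma_y * y)
        total = sum(rates)
        dt = rng.expovariate(total)      # waiting time to the next reaction
        x_time += x * dt                 # accumulate time integrals of X and Y
        y_time += y * dt
        t += dt
        pick = rng.random() * total      # choose which reaction fires
        if pick < rates[0]:
            x += 1                       # input production
        elif pick < rates[0] + rates[1]:
            x -= 1                       # input deactivation
        elif pick < rates[0] + rates[1] + rates[2]:
            y += 1                       # output production, input unchanged
        elif y > 0:
            y -= 1                       # output deactivation
    return x_time / t, y_time / t


if __name__ == "__main__":
    # Hypothetical parameters: expected means are F/gamma_x and R0*F/(gamma_x*gamma_y).
    mean_x, mean_y = simulate_two_species(10.0, 1.0, 2.0, 1.0, 200000, seed=1)
    print(f"mean X = {mean_x:.2f}, mean Y = {mean_y:.2f}")
```

Comparing the simulated mean input with F/γx is the two-species analogue of the matching procedure used above to identify γx in the full network.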

FIG S2.

The mutual information Embedded Image for the enzymatic push-pull loop as a function of the reverse-forward rate ratio Embedded Image. The predictions from the chemical Langevin approach (dashed line) are compared against the corresponding KMC simulation results (circles). The parameter sets are as follows (all units are s⁻¹ except for the mean populations; molar units have been converted to populations by assuming a cell volume of 30 fL): (top) κb = 2.94 × 10⁻⁶, ρb = 3.68 × 10⁻⁷, κu = 1.58 × 10⁻², ρu = 4.42 × 10⁻⁴, κr = 12.8, ρr = 1.34, ρ−r = 2.50 × 10⁻⁵, F = 2.49 × 10⁻³, γk = 2.68 × 10⁻⁵, Embedded Image, and Embedded Image; (bottom) κb = 2.32 × 10⁻⁵, ρb = 1.46 × 10⁻⁴, κu = 6.94 × 10⁻², ρu = 5.48, κr = 0.994, ρr = 5.05 × 10⁻², ρ−r = 2.06 × 10⁻⁸, F = 2.46 × 10⁻², γk = 2.65 × 10⁻⁴, Embedded Image and Embedded Image.

B. Validation through kinetic Monte Carlo simulations

To verify that the expressions for γx and R0 derived above are good approximations, we ran KMC simulations for various parameter sets drawn at random from the enzymatic parameter distribution detailed in the SI Sec. IV. For each parameter set the simulation was run long enough after reaching the stationary state to collect sufficient statistics for both the mean population values and the input autocorrelation function. As described above, these allow us to calculate γx and R0. The simulation results are compared against the approximation from Eqs. (S17) and (S19) in Fig. S3. The agreement is excellent for both quantities, across the entire range of γx and R0 values. Thus we can confidently use the simple analytical expressions of Eqs. (S17) and (S19) to predict γx and R0 for any given parameter set.

C. Relating maximum bandwidth, minimum ATP consumption rate, and mutual information via Wiener-Kolmogorov optimal noise filter theory

One of the benefits of the approximate relation between the full system and the two-species model described in Sec. IIIA is that it allows us to use results from the two-species case to make predictions for the behavior of the kinase-phosphatase push-pull loop. The two-species model has been analyzed in detail in Refs. [2, 5], where it was shown to be able to map onto a Wiener-Kolmogorov optimal noise filter. The error E from Eq. (S2) for the two-species case can be evaluated in closed form as [2]: Embedded Image

It achieves its minimum value (hence maximizing the mutual information Embedded Image) when the following condition is fulfilled: Embedded Image where Λ = R0/γx. The corresponding minimum E, where the system behaves like an optimal Wiener-Kolmogorov (WK) noise filter is given by: Embedded Image

Interestingly, this remains the bound even if we generalize the output production term R0X(t) to be nonlinear in X(t) [2]. Using the relation between E and Embedded Image in Eq. (S2), we can translate the bound E ≥ EWK into an equivalent statement that Embedded Image at a given value of mutual information Embedded Image. The value of Embedded Image is shown in main text Eq. (5): Embedded Image

FIG S3.

Comparison of the simple analytical approximations for R0 from Eq. (S19) (top) and γx from Eq. (S17) (bottom) versus KMC simulation results. Each point corresponds to a parameter set drawn randomly from the enzymatic parameter distribution described in SI Sec. IV. The red dashed line corresponds to perfect agreement. Error bars for R0 are smaller than the symbol size, and hence not indicated in the figure.

As shown in main text Figs. 3G-I, the above Embedded Image expression provides an excellent approximate upper bound on the Embedded Image values calculated for the full enzymatic system. Even though the effective two-species model lacks reverse rates, it provides a useful tool for deriving this bound, since the maximum bandwidth is achieved when the reverse rates are negligible (large Δμ).

As mentioned in the discussion around main text Eq. (6), the expression for Embedded Image in Eq. (S23) also has an alternative interpretation. This gives the minimum production rate Embedded Image necessary to achieve mutual information Embedded Image at a certain bandwidth Embedded Image : Embedded Image
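As a numerical illustration of how this bound can be evaluated, the following Python sketch assumes two relations that are not reproduced in the text above: that Eq. (S2) takes the Gaussian form I = −(1/2) log2 E, and that the WK minimum error of the two-species model has the form E_WK = 2/(1 + √(1 + Λ)) reported in Ref. [2]. Under those assumptions, reaching a mutual information I requires Λ = R0/γx ≥ 4^(I+1) (4^I − 1). The function names are illustrative and are not taken from the repository associated with the manuscript.

```python
import math


def error_from_information(info_bits: float) -> float:
    """Assumed form of Eq. (S2): I = -(1/2) log2 E, hence E = 2^(-2 I)."""
    return 2.0 ** (-2.0 * info_bits)


def wk_error(lam: float) -> float:
    """Assumed WK minimum error: E_WK = 2 / (1 + sqrt(1 + Lambda))."""
    return 2.0 / (1.0 + math.sqrt(1.0 + lam))


def lambda_required(info_bits: float) -> float:
    """Smallest Lambda = R0 / gamma_x for which E_WK equals E(I)."""
    if info_bits <= 0.0:
        raise ValueError("mutual information must be positive")
    e = error_from_information(info_bits)
    # Solving 2 / (1 + sqrt(1 + Lambda)) = e for Lambda.
    return 4.0 * (1.0 - e) / e**2


def max_bandwidth(r0: float, info_bits: float) -> float:
    """Largest gamma_x compatible with info_bits at production rate r0."""
    return r0 / lambda_required(info_bits)


def min_production_rate(gamma_x: float, info_bits: float) -> float:
    """Smallest R0 compatible with info_bits at bandwidth gamma_x."""
    return gamma_x * lambda_required(info_bits)


if __name__ == "__main__":
    for bits in (0.5, 1.0, 2.0, 3.0):
        print(bits, lambda_required(bits))
```

Under these assumptions, 1 bit requires Λ ≥ 48 and 2 bits requires Λ ≥ 960, which illustrates how quickly bandwidth must be given up for fidelity at a fixed production rate.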

TABLE S1.

Results of log-normal fits to various kinase/phosphatase enzymatic parameters. For each fit the mean log10 Embedded Image and standard deviation σx are listed. The top rows of the table correspond to individual fits to parameters collected from the PaxDb and Sabio-RK databases. The bottom rows show the results of a joint fit, described in the text of SI Sec. IV.

By relating R0 in turn to the ATP consumption rate Embedded Image, we can convert Eq. (S24) into an expression for the minimum necessary ATP consumption rate Amin. To accomplish this, note that A can be rewritten as: Embedded Image where we have used Eqs. (S6) and (S19). Finally, taking advantage of the fact that typically κr ≫ R0 for the parameter distributions of interest, we make the approximation Embedded Image. This allows us to derive main text Eq. (7): Embedded Image

IV. ENZYMATIC PARAMETER DISTRIBUTION

Earlier surveys of enzymatic kinetic parameters in Refs. [6, 7], over broader classes than just kinases and phosphatases, showed that their distributions could be approximately described by log-normal distributions. For a given parameter x, we will denote this as Embedded Image, or in other words that the base-10 logarithm of x is distributed according to a normal distribution with mean log10 Embedded Image and standard deviation σx. The value Embedded Image is the median of the resulting log-normal distribution for x.
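The log-normal convention above can be checked numerically. The short Python sketch below draws samples whose base-10 logarithm is normally distributed and confirms that the fitted mean of log10 x recovers the log of the median. The numerical values (median 10^−7.93 M, σ = 0.84) are the human-cell concentration values quoted later in this section; the function name and sample size are illustrative choices.

```python
import math
import random
import statistics


def sample_log10_normal(median, sigma, n, rng):
    """Draw n values x with log10(x) ~ Normal(log10(median), sigma)."""
    mu = math.log10(median)
    return [10.0 ** rng.gauss(mu, sigma) for _ in range(n)]


rng = random.Random(0)
samples = sample_log10_normal(10 ** -7.93, 0.84, 50_000, rng)

# Fit: mean and standard deviation of the base-10 logarithms.
logs = [math.log10(x) for x in samples]
fit_mean = sum(logs) / len(logs)
fit_sigma = statistics.stdev(logs)

# The median of x should sit at 10**fit_mean, up to sampling noise.
sample_median = statistics.median(samples)

print(fit_mean, fit_sigma, math.log10(sample_median))
```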

For our work the focus is on kinases and phosphatases, and we are interested in looking at the push-pull loop signaling behavior over the entire distribution of biologically plausible parameters. The parameter data we collected, summarized in the histograms of main text Fig. 2, had far more representation of kinases than phosphatases, which is a well known limitation of the existing experimental literature. Despite this sampling issue, the orders of magnitude spanned by phosphatase parameters were comparable to those of the kinases. For each parameter type, we thus decided to fit both types of enzyme with a single overall distribution, based on pooling of all the available kinase and phosphatase data together. The data available from the databases took the forms listed below (all raw data and the files used to process it are included in the Github repository associated with the manuscript). The mean log10 Embedded Image and standard deviation σx values from the log-normal fits for the different parameter classes are listed in the first four rows of Table S1.

Enzymatic data:

  • Mean substrate [S] and phosphatase [P] concentrations, where the substrate is taken to be a kinase [main text Fig. 2A]. These numbers were derived from the PaxDb protein abundance database [8], taking advantage of UniProt gene ontology associations to focus on just kinases and phosphatases in signal transduction pathways [10]. Each PaxDb data entry is in terms of ppm (parts per million) of abundance, relative to the total number of proteins in the cell. To convert from ppm to molar concentrations, we looked at data from human cells (which had the best representation in the database), and used the estimated total concentration of 2.7 × 10^6 proteins per μm^3 for human cells [11]. The latter concentration corresponds to 4.48 × 10^−3 M. If y is the abundance in ppm units, then 4.48 (y/10^6) × 10^−3 M is the corresponding molar concentration. Note that total concentrations are very similar across many different types of species [11], so there should not be a strong species-dependence in the analysis. For example the same analysis in mouse cells rather than human ones yields quantitatively similar results: a mean kinase/phosphatase concentration of 10^−8.31 M (versus 10^−7.93 M in human cells), and a log-normal standard deviation of 1.03 (versus 0.84 in human cells).

  • Reaction parameters [main text Fig. 2B-D]. These values were taken from the Sabio-RK database [9], where they were most often available in the following forms: Michaelis constants Embedded Image for the kinase/phosphatase (main text Fig. 2B), the corresponding specificity ratios Embedded Image (main text Fig. 2C), and the reaction rates κr and ρr (main text Fig. 2D). The resulting distributions were entirely consistent (though slightly narrower) with the distributions for the same parameter types analyzed in Ref. [6], which considered all enzymes (not just kinases and phosphatases).
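The ppm-to-molar conversion used for the concentration data can be reproduced with a few lines of Python. This is a minimal sketch using only the total protein density quoted above and Avogadro's number; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
PROTEINS_PER_UM3 = 2.7e6      # total protein density for human cells (from the text)
LITERS_PER_UM3 = 1e-15        # 1 cubic micrometer = 1 fL = 1e-15 L


def total_protein_molar():
    """Total protein concentration in mol/L."""
    return PROTEINS_PER_UM3 / (AVOGADRO * LITERS_PER_UM3)


def ppm_to_molar(y_ppm):
    """Convert a PaxDb abundance in ppm to a molar concentration."""
    return total_protein_molar() * y_ppm / 1e6


if __name__ == "__main__":
    print(total_protein_molar())   # close to 4.48e-3 M
    print(ppm_to_molar(1.0))       # concentration of a 1 ppm protein
```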

Note that the six reaction parameter types that were collected from the Sabio-RK database (Embedded Image, κr, ρr) are not directly in the form that we need to calculate push-pull loop signaling properties. For the latter we would like to know (κb, ρb, κu, ρu, κr, ρr), or equivalently (κb, ρb, Embedded Image, κr, ρr). Here the dissociation constants are defined as Embedded Image and Embedded Image. Let us denote the parameter vector (κb, ρb, Embedded Image, κr, ρr) as v, with components vα, α = 1, …, 6. We would like to find a joint distribution for v that is self-consistent with the individual log-normal distributions for the alternative parameter types fitted directly from the database values (first four rows of Table S1). We will assume the simplest form for the joint distribution Ф: a product of individual log-normal distributions for each parameter vα, with median values Embedded Image and standard deviations σα: Embedded Image

Note that the vα ln(10) term in the denominator of the prefactor comes from the Jacobian due to the variable change between log10 vα and vα. This ensures that the probability is properly normalized: Embedded Image. As explained above, kinase and phosphatase parameters are assumed to be drawn from the same distributions, so we enforce that Embedded Image Embedded Image, and analogously for the standard deviations σα. This leaves six distinct values that determine the distribution: Embedded Image, σ1, σ3, σ5.

To estimate these six distribution parameters, we use the following iterative numerical fitting procedure. We start with a guess for (Embedded Image, σ1, σ3, σ5) and then draw 10^4 parameter sets v from the resulting distribution Ф(v). For each parameter set we can calculate the alternative parameter types (Embedded Image). We then fit the resulting 10^4 values for these alternative types to individual log-normal distributions, and compare the means and standard deviations to the empirical results in the top half of Table S1. The sum of the relative absolute errors between the new joint fit values and the empirical results for the means / standard deviations is our overall goodness-of-fit measure. We perturb our guess for (Embedded Image, σ1, σ3, σ5) and accept the perturbation if it improves the goodness-of-fit. This procedure is iterated until convergence. The results of this joint fit are shown in the bottom half of Table S1. The joint fit predictions for the binding rate (κb, ρb) and dissociation constant Embedded Image distributions are consistent with earlier estimates of these parameters in specific kinase/phosphatase systems [12]. As another consistency check, the joint fit distribution for the reaction rates (κr, ρr) is nearly identical to the individual empirical fit based on the Sabio-RK database values.
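The accept-if-better fitting loop described above can be sketched in Python as follows. This is not the code from the associated repository, and several choices are assumptions made for illustration: the derived quantities are taken to be the Michaelis constant K_M = K_D + κr/κb (with K_D = κu/κb), the specificity ratio κr/K_M, and κr; the target statistics are synthetic values generated from a hypothetical "true" parameter vector rather than the entries of Table S1; and a fixed set of standard normal draws is reused at every step (2000 sets instead of 10^4) so that the goodness-of-fit is a deterministic function of the guess.

```python
import math
import random


def summarize(values):
    """Mean and standard deviation of log10(values)."""
    logs = [math.log10(v) for v in values]
    m = sum(logs) / len(logs)
    s = math.sqrt(sum((x - m) ** 2 for x in logs) / (len(logs) - 1))
    return m, s


def derived_stats(theta, z):
    """theta = (log10 medians of kb, KD, kr, then their three sigmas)."""
    m_b, m_d, m_r, s_b, s_d, s_r = theta
    km, spec, kr = [], [], []
    for zb, zd, zr in z:
        k_b = 10.0 ** (m_b + s_b * zb)
        k_d = 10.0 ** (m_d + s_d * zd)
        k_r = 10.0 ** (m_r + s_r * zr)
        k_m = k_d + k_r / k_b          # assumed Michaelis constant
        km.append(k_m)
        spec.append(k_r / k_m)         # assumed specificity ratio
        kr.append(k_r)
    return [v for vals in (km, spec, kr) for v in summarize(vals)]


def goodness_of_fit(theta, z, targets):
    """Sum of relative absolute errors over all means and sigmas."""
    stats = derived_stats(theta, z)
    return sum(abs(s - t) / abs(t) for s, t in zip(stats, targets))


def fit(theta0, targets, z, rng, n_iter=150, step=0.1):
    """Random perturbation search: keep a trial only if it improves the fit."""
    theta, err = list(theta0), goodness_of_fit(theta0, z, targets)
    for _ in range(n_iter):
        trial = [p + rng.gauss(0.0, step) for p in theta]
        trial[3:] = [max(abs(s), 1e-3) for s in trial[3:]]  # keep sigmas positive
        trial_err = goodness_of_fit(trial, z, targets)
        if trial_err < err:
            theta, err = trial, trial_err
    return theta, err


rng = random.Random(1)
z = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
true_theta = (6.0, -5.0, 0.5, 1.0, 1.0, 1.0)   # hypothetical, for illustration only
targets = derived_stats(true_theta, z)          # synthetic stand-in for Table S1
guess = (5.5, -4.5, 1.0, 0.7, 0.7, 0.7)
initial_err = goodness_of_fit(guess, z, targets)
fitted, final_err = fit(guess, targets, z, rng)
print(initial_err, final_err)
```

Reusing the same random draws is a convenience of this sketch; redrawing the parameter sets at every iteration, as in the procedure described above, makes the goodness-of-fit itself noisy and would call for a larger sample or a tolerance in the acceptance step.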

Finally we note that the simple joint distribution Ф(v) in Eq. (S27) is by construction too broad: it may produce the correct marginal distributions for quantities collected from the Sabio-RK database, but it ignores any correlations between those individual parameters that may be present in natural systems. Estimating these correlations from the existing database entries is quite challenging, because relatively few entries have a complete list of all the parameters of interest. Hence, as explained in the main text, we take Ф(v) to be effectively a superset: it should contain the true, presumably narrower, biological distribution plus parameter sets that are less likely to be observed in nature. A convenient aspect of this interpretation is that any collective conclusion we draw from the entire distribution Ф(v) should also be true for the subset of biological parameters. Moreover we can thus explore a larger design space (potentially available for evolution) than what we currently observe in modern biological systems.

V. RESULTS FOR ALTERNATIVE INPUT KINASE CONCENTRATIONS

The results in main text Fig. 3D-F were for a mean input kinase concentration [K] = 5 nM. In Fig. S4 we show the analogous results for two different choices: [K] = 0.5 nM (left column) and [K] = 50 nM (right column). The main conclusions remain unchanged: the physiological Δμ range (highlighted in pink) is always just above the upper edge of the Embedded Image cloud, and the number of available parameter sets decreases rapidly as the mutual information Embedded Image is increased.

FIG S4.

Analogous to main text Fig. 3D-F, except for input kinase concentration [K] = 0.5 nM (left column) and 50 nM (right column). The rows correspond to mutual information Embedded Image, and 2 bits respectively. The probabilities of successfully drawing a parameter set that achieves the specified Embedded Image value are shown in red in each panel.

VI. ANALYSIS OF THE PBS2-HOG1 PUSH-PULL LOOP IN YEAST

To illustrate our theoretical framework in a concrete biological example, let us consider a kinase-phosphatase loop from one of the most extensively studied signaling pathways: the Hog1 mitogen-activated protein kinase (MAPK) pathway that allows yeast to adapt to extracellular osmotic changes [13–15]. We will focus in particular on the final portion of the pathway, where the active (phosphorylated) kinase Pbs2pp catalyzes the conversion of inactive Hog1 into phosphorylated Hog1pp. The latter protein is interchanged quickly between cytoplasm and nucleus, where it regulates a variety of responses to osmotic stress. Hog1pp is dephosphorylated by a combination of phosphatases Ptp2 (mainly in the nucleus) and Ptp3 in the cytoplasm [16]. Thus Pbs2pp will play the role of K in our model, Hog1 will be S, Hog1pp will be S∗, and Ptp2/Ptp3 will be P. To parameterize our model, we start with a more detailed theoretical description of the entire pathway developed by Zi et al. [13]. A key appeal of this work is that its parameters were carefully fit to extensive experimental data from yeast cells exposed to different time series of external salt shocks in microfluidic experiments [14]. However since the parameters of Zi et al. are not expressed in the same form as the enzymatic reaction rates of our model, we do have to convert from their framework to ours, as described below.

A. Parameter estimation based on earlier literature

Ref. [13] explicitly distinguishes between the concentration of Hog1 and Hog1pp in the cytoplasm and nucleus, denoted with c and n superscripts respectively: [Hog1c], [Hog1n], [Hog1ppc], [Hog1ppn]. If we are interested in the average concentrations overall, we can denote these as: Embedded Image where Vc and Vn are the volumes of the cytoplasm and nucleus respectively, taken to have a ratio of Vn/Vc = 0.14 [13]. Eq. (S28) also implies: Embedded Image where f = Vc/(Vc + Vn) = 0.88. As a simplification of Eq. (S28), we note that in Ref. [13] the import and export of the Hog1 proteins are fast relative to other reactions, and for a given input level the system rapidly reaches a stationary state with [Hog1n] ≈ [Hog1c] ≈ [S] and [Hog1ppn] ≈ [Hog1ppc] ≈ [S∗].
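The value of f follows directly from the quoted volume ratio, as the short Python sketch below shows. It also includes a volume-weighted average over the two compartments; since Eq. (S28) is not reproduced in the text here, that form is an assumption, and the function names are illustrative.

```python
VN_OVER_VC = 0.14   # nuclear to cytoplasmic volume ratio (from the text)


def cytoplasm_fraction(vn_over_vc):
    """f = Vc / (Vc + Vn), written in terms of the ratio Vn / Vc."""
    return 1.0 / (1.0 + vn_over_vc)


def cell_average(conc_cyto, conc_nuc, f):
    """Volume-weighted mean concentration (assumed form of Eq. (S28))."""
    return f * conc_cyto + (1.0 - f) * conc_nuc


f = cytoplasm_fraction(VN_OVER_VC)
print(round(f, 2))   # 0.88
```

When the two compartments have equal concentrations, as in the fast import/export limit described above, the weighted average reduces to that common value regardless of f.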

We can now look at individual reactions that contribute to the time derivatives on the right-hand sides of Eq. (S29) and find their analogues in our model. For example the phosphorylation step that converts Hog1c to Hog1ppc is expressed in Ref. [13] as an effective second order reaction of the form Embedded Image, with rate constant Embedded Image Embedded Image. This contributes positively to d[Hog1ppc]/dt and with a minus sign to d[Hog1c]/dt, and so leads to contributions of magnitude Embedded Image to the right-hand sides of Eq. (S29). Note that even though activation of Hog1 is actually a double phosphorylation (of a threonine and tyrosine residue), the entire process in this case can be well approximated through a single rate constant.

In our model the conversion of S to S∗ occurs through the intermediate state SK. However if we want to compare to the phosphorylation step of Ref. [13] in order to match parameters, we can look at the deterministic contribution to the dynamics (ignoring fluctuations) in the Michaelis-Menten approximation for enzyme kinetics [17]. In this picture the phosphorylation reaction contributes to d[S]/dt and d[S∗]/dt through a term of magnitude Embedded Image, where the last simplification is valid when Embedded Image. If we compare Embedded Image to Embedded Image, noting that [K] = [Pbs2pp] and [S] ≈ [Hog1c], we can make the following identification:

Embedded Image

The dephosphorylation steps in Ref. [13] are modeled as two pseudo-first-order reactions: conversion of Hog1ppc to Hog1c with rate Embedded Image, and the conversion of Hog1ppn to Hog1n with rate Embedded Image. The pseudo-first-order rate constants are given by: Embedded Image and Embedded Image. These reactions will lead to contributions of magnitude Embedded Image to the right-hand sides of Eq. (S29). In our model (using a similar Michaelis-Menten approximation to the one described above, with Embedded Image), the analogous expression for dephosphorylation is effectively a second-order reaction with rate Embedded Image. Comparison of the two expressions, using the approximation [Hog1ppn] ≈ [Hog1ppc] ≈ [S∗], leads to the identification: Embedded Image

Here we set [P] = 0.058 μM as an average measure of phosphatase concentrations, to facilitate the conversion from pseudo-first-order to second-order rate constants. The value of [P] is based on estimates of the concentrations of the two phosphatases in yeast from Ref. [18]: 0.049 μM for Ptp3 in the cytoplasm, and 0.067 μM for Ptp2 in the nucleus, where we have used Vc = f (Vc + Vn), Vn = (1 − f)(Vc + Vn) and Vc + Vn ≈ 30 fL [13, 19] to convert from populations to concentrations. Since the concentrations were of similar scale, we let [P] be the mean of the two values.
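The population-to-concentration conversion and the averaging step can be written out as a minimal Python sketch. It uses only the volumes and concentrations quoted above; the molecule counts it produces are derived from those numbers rather than taken from Ref. [18], and the function names are illustrative.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
TOTAL_VOLUME_FL = 30.0     # Vc + Vn in fL (from the text)
F_CYTO = 0.88              # f = Vc / (Vc + Vn) (from the text)


def compartment_volumes_fl(total_fl=TOTAL_VOLUME_FL, f=F_CYTO):
    """Return (Vc, Vn) in fL."""
    return f * total_fl, (1.0 - f) * total_fl


def molecules_to_micromolar(n_molecules, volume_fl):
    """Convert a molecule count in a volume (fL) to a concentration in uM."""
    liters = volume_fl * 1e-15
    return n_molecules / (AVOGADRO * liters) * 1e6


def micromolar_to_molecules(conc_um, volume_fl):
    """Inverse conversion: concentration in uM to molecule count."""
    return conc_um * 1e-6 * AVOGADRO * volume_fl * 1e-15


v_cyto, v_nuc = compartment_volumes_fl()
ptp3_cyto_um = 0.049   # Ptp3 in the cytoplasm (from the text)
ptp2_nuc_um = 0.067    # Ptp2 in the nucleus (from the text)

# [P] is taken as the plain mean of the two compartment concentrations.
p_mean_um = 0.5 * (ptp3_cyto_um + ptp2_nuc_um)
print(p_mean_um)
print(micromolar_to_molecules(ptp3_cyto_um, v_cyto),
      micromolar_to_molecules(ptp2_nuc_um, v_nuc))
```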

As a consistency check to make sure the final estimates of the specificity ratios Embedded Image and Embedded Image in Eqs. (S30)-(S31) are biologically plausible, we can compare them with the distribution of these ratios among kinases/phosphatases from the Sabio-RK database in main text Fig. 2C. The values for the Hog1/Pbs2 system are not unusual, and lie near the higher end of the range, at about the 0.87 quantile. The final parameter value we can estimate from the literature is the mean Hog1 concentration [S] = 0.38 μM, based on the abundance reported in Ref. [18].

B. Estimation of remaining parameters

Based on the above analysis, we have estimates for four quantities in the Pbs2/Hog1 system drawn from the earlier literature: Embedded Image, [S], [P]. These are summarized in Table S2. The relationship of the enzymatic reaction/binding/unbinding rate parameters to the estimated values then takes the form: Embedded Image

View this table:
TABLE S2.

Summary of parameters for the yeast Pbs2/Hog1 system estimated from earlier literature.

The above parameters depend on the values of Embedded Image. While we do not know what these are for the Pbs2/Hog1 system, we can draw their values from the corresponding empirical log-normal distributions described in Table S1. By repeating the draw many times, we can check how our final optimality analysis (see below) depends on the precise values of the unknown parameters. As it turns out the dependence of R0, Embedded Image and Embedded Image on the unknown values is quite weak, and we will be able to make robust estimates for these quantities. In the cases of Embedded Image and Embedded Image, we constrain the random draw from their log-normal distributions to enforce Embedded Image and Embedded Image. This ensures self-consistency with the assumptions Embedded Image and Embedded Image, which were used in the previous subsection to match the form of the phosphorylation / dephosphorylation reactions between Ref. [13] and our model. The final two parameters are the reverse reaction rates κ−r and ρ−r. Since we do not have any experimental estimates of these for the Pbs2/Hog1 system, we assume that the physiological value of Δμ in yeast (around 21 kBT [19]) is sufficiently high that κ−r and ρ−r are negligible under normal conditions.

C. Bandwidth and gain

Given the parameter estimation procedure described above, we can calculate Embedded Image, R0, Embedded Image for each draw of the unknown parameters. The results remain within a narrow distribution, relatively insensitive to the values of the unknown parameters. The means and standard deviations over 50 draws are: Embedded Image, R0 = 0.0621 ± 0.0001 s^−1, Embedded Image.

VII. ESTIMATION OF TOTAL RESTING METABOLIC EXPENDITURE

For single-celled organisms, the total resting metabolic expenditure CT can be estimated by the approach outlined in Ref. [20]. CT has two contributions: CT = CG + tr CM. Here CG is the expenditure involved in growth during one generation time tr, and CM is the maintenance cost per unit time. Using a large collection of metabolic data from Ref. [21], covering both prokaryotes and single-celled eukaryotes, one can observe that both CM and CT scale approximately linearly with cell volume V, agreeing with the prediction of the bioenergetic growth model of Ref. [20]. The expression for CT based on the results of these linear fits is [20]: Embedded Image where the unit P corresponds to the hydrolysis of a phosphate bond (i.e. the consumption of one ATP or ATP equivalent). Using the main text values of V = 30 fL and tr = 3600 s, we get CT = 7.0 × 10^11 P.

Acknowledgments

We thank Shishir Adhikari for useful discussions. Parts of the numerical analysis were carried out using the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University.

Footnotes

  • Added some minor clarifications.

  • https://github.com/hincz-lab/cell-signaling

References

  1. Balázsi, G., van Oudenaarden, A. & Collins, J. J. Cellular decision making and biological noise: from microbes to mammals. Cell 144, 910–925 (2011).
  2. Govern, C. C. & ten Wolde, P. R. Optimal resource allocation in cellular sensing systems. Proc. Natl. Acad. Sci. 111, 17486–17491 (2014).
  3. ten Wolde, P. R., Becker, N. B., Ouldridge, T. E. & Mugler, A. Fundamental limits to cellular sensing. J. Stat. Phys. 162, 1395–1424 (2016).
  4. Lan, G. & Tu, Y. Information processing in bacteria: memory, computation, and statistical physics: a key issues review. Rep. Prog. Phys. 79, 052601 (2016).
  5. Logan, C. Y. & Nusse, R. The Wnt signaling pathway in development and disease. Annu. Rev. Cell Dev. Biol. 20, 781–810 (2004).
  6. Parsons, D. W. et al. Colorectal cancer: mutations in a signalling pathway. Nature 436, 792 (2005).
  7. Sarkar, F. H., Li, Y., Wang, Z. & Kong, D. Cellular signaling perturbation by natural products. Cell. Sign. 21, 1541–1547 (2009).
  8. Riera, C. E., Merkwirth, C., De Magalhaes Filho, C. D. & Dillin, A. Signaling networks determining life span. Annu. Rev. Biochem. 85, 35–64 (2016).
  9. Uda, S. Application of information theory in systems biology. Biophys. Rev. 1–8 (2020).
  10. Tkačik, G., Callan, C. G. & Bialek, W. Information flow and optimization in transcriptional regulation. Proc. Natl. Acad. Sci. 105, 12265–12270 (2008).
  11. Cheong, R., Rhee, A., Wang, C. J., Nemenman, I. & Levchenko, A. Information transduction capacity of noisy biochemical signaling networks. Science 334, 354–358 (2011).
  12. Uda, S. et al. Robustness and compensation of information transmission of signaling pathways. Science 341, 558–561 (2013).
  13. Voliotis, M., Perrett, R. M., McWilliams, C., McArdle, C. A. & Bowsher, C. G. Information transfer by leaky, heterogeneous, protein kinase signaling systems. Proc. Natl. Acad. Sci. 111, E326–E333 (2014).
  14. Selimkhanov, J. et al. Accurate information transmission through dynamic biochemical signaling networks. Science 346, 1370–1373 (2014).
  15. Potter, G. D., Byrd, T. A., Mugler, A. & Sun, B. Dynamic sampling and information encoding in biochemical networks. Biophys. J. 112, 795–804 (2017).
  16. Suderman, R., Bachman, J. A., Smith, A., Sorger, P. K. & Deeds, E. J. Fundamental trade-offs between information flow in single cells and cellular populations. Proc. Natl. Acad. Sci. 114, 5755–5760 (2017).
  17. Keshelava, A. et al. High capacity in g protein-coupled receptor signaling. Nat. Commun. 9, 1–8 (2018).
  18. Shannon, C. E. A mathematical theory of communication. Bell Systems Tech. J. 27, 379–423 (1948).
  19. Cover, T. M. & Thomas, J. A. Elements of information theory (John Wiley & Sons, 2012).
  20. Cao, Y., Wang, H., Ouyang, Q. & Tu, Y. The free-energy cost of accurate biochemical oscillations. Nat. Phys. 11, 772–778 (2015).
  21. Hasegawa, Y. Optimal temporal patterns for dynamical cellular signaling. New J. Phys. 18, 113031 (2016).
  22. Mehta, P., Lang, A. H. & Schwab, D. J. Landauer in the age of synthetic biology: energy consumption and information processing in biochemical networks. J. Stat. Phys. 162, 1153–1166 (2016).
  23. Ouldridge, T. E., Govern, C. C. & ten Wolde, P. R. Thermodynamics of computational copying in biochemical systems. Phys. Rev. X 7, 021004 (2017).
  24. Milo, R. & Phillips, R. Cell biology by the numbers (Garland Science, 2015).
  25. Lane, N. & Martin, W. F. The origin of membrane bioenergetics. Cell 151, 1406–1416 (2012).
  26. Stadtman, E. & Chock, P. Superiority of interconvertible enzyme cascades in metabolic regulation: analysis of monocyclic systems. Proc. Natl. Acad. Sci. 74, 2761–2765 (1977).
  27. Goldbeter, A. & Koshland, D. E. An amplified sensitivity arising from covalent modification in biological systems. Proc. Natl. Acad. Sci. 78, 6840–6844 (1981).
  28. Detwiler, P. B., Ramanathan, S., Sengupta, A. & Shraiman, B. I. Engineering aspects of enzymatic signal transduction: photoreceptors in the retina. Biophys. J. 79, 2801–2817 (2000).
  29. Heinrich, R., Neel, B. G. & Rapoport, T. A. Mathematical models of protein kinase signal transduction. Molecular cell 9, 957–970 (2002).
  30. Hinczewski, M. & Thirumalai, D. Cellular signaling networks function as generalized wiener-kolmogorov filters to suppress noise. Phys. Rev. X 4, 041017 (2014).
  31. Hinczewski, M. & Thirumalai, D. Noise control in gene regulatory networks with negative feedback. J. Phys. Chem. B 120, 6166–6177 (2016).
  32. Lynch, M. & Marinov, G. K. The bioenergetic costs of a gene. Proc. Natl. Acad. Sci. 112, 15690–15695 (2015).
  33. Ilker, E. & Hinczewski, M. Modeling the growth of organisms validates a general relation between metabolic costs and natural selection. Phys. Rev. Lett. 122, 238101 (2019).
  34. Sturm, O. E. et al. The mammalian MAPK/ERK pathway exhibits properties of a negative feedback amplifier. Sci. Signal. 3, ra90–ra90 (2010).
  35. Saxena, M., Williams, S., Taskén, K. & Mustelin, T. Crosstalk between camp-dependent kinase and map kinase through a protein tyrosine phosphatase. Nat. Cell Biol. 1, 305 (1999).
  36. Salazar, C. & Höfer, T. Multisite protein phosphorylation–from molecular mechanisms to kinetic models. FEBS J. 276, 3177–3198 (2009).
  37. Fersht, A. Structure and mechanism in protein science: a guide to enzyme catalysis and protein folding (Macmillan, 1999).
  38. Qian, H. Cooperativity and specificity in enzyme kinetics: a single-molecule time-based perspective. Biophys. J. 95, 10–17 (2008).
  39. Gillespie, D. T. The chemical Langevin equation. J. Chem. Phys. 113, 297–306 (2000).
  40. Gillespie, D. T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361 (1977).
  41. Bar-Even, A. et al. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50, 4402–4410 (2011).
  42. Liebermeister, W. & Klipp, E. Biochemical networks with uncertain parameters. IEE Proc.- Syst. Biol. 152, 97–107 (2005).
  43. Wang, M., Herrmann, C. J., Simonovic, M., Szklarczyk, D. & von Mering, C. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics 15, 3163–3168 (2015).
  44. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2018).
  45. Wittig, U. et al. SABIO-RK—database for biochemical reaction kinetics. Nucleic Acids Res. 40, D790–D796 (2011).
  46. Zi, Z., Liebermeister, W. & Klipp, E. A quantitative study of the Hog1 MAPK response to fluctuating osmotic stress in Saccharomyces cerevisiae. PLOS ONE 5, e9522 (2010).
  47. Hersen, P., McClean, M. N., Mahadevan, L. & Ramanathan, S. Signal processing by the hog map kinase pathway. Proc. Natl. Acad. Sci. 105, 7165–7170 (2008).
  48. Wiener, N. Extrapolation, Interpolation and Smoothing of Stationary Times Series (Wiley, New York, 1949).
  49. Kolmogorov, A. N. Interpolation and extrapolation of stationary random sequences. Izv. Akad. Nauk SSSR., Ser. Mat. 5, 3–14 (1941).
  50. Bode, H. W. & Shannon, C. E. A simplified derivation of linear least square smoothing and prediction theory. Proc. Inst. Radio. Engin. 38, 417–425 (1950).
  51. Becker, N. B., Mugler, A. & ten Wolde, P. R. Optimal prediction by cellular signaling networks. Phys. Rev. Lett. 115, 258103 (2015).
  52. Zechner, C., Seelig, G., Rullan, M. & Khammash, M. Molecular circuits for dynamic noise filtering. Proc. Natl. Acad. Sci. 113, 4729–4734 (2016).
  53. Samanta, H. S., Hinczewski, M. & Thirumalai, D. Optimal information transfer in enzymatic networks: A field theoretic formulation. Phys. Rev. E 96, 012406 (2017).
  54. Hathcock, D., Sheehy, J., Weisenberger, C., Ilker, E. & Hinczewski, M. Noise filtering and prediction in biological signaling networks. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2, 16–30 (2016).
  55. Mettetal, J. T., Muzzey, D., Gómez-Uribe, C. & van Oudenaarden, A. The frequency dependence of osmo-adaptation in Saccharomyces cerevisiae. Science 319, 482–484 (2008).
  56. Charlesworth, B. Effective population size and patterns of molecular evolution and variation. Nature Rev. Genet. 10, 195 (2009).
  57. Gillespie, J. H. Population genetics: a concise guide (JHU Press, 2010).
  58. Kimura, M. On the probability of fixation of mutant genes in a population. Genetics 47, 713–719 (1962).
  59. Orgel, L. E. & Crick, F. H. Selfish DNA: the ultimate parasite. Nature 284, 604 (1980).
  60. Wagner, A. Energy constraints on the evolution of gene expression. Mol. Biol. Evol. 22, 1365–1374 (2005).
  61. Mahmoudabadi, G., Milo, R. & Phillips, R. Energetic cost of building a virus. Proc. Natl. Acad. Sci. 114, E4324–E4333 (2017).
  62. Tsai, I. J., Bensasson, D., Burt, A. & Koufopanou, V. Population genomics of the wild yeast Saccharomyces paradoxus: quantifying the life cycle. Proc. Natl. Acad. Sci. 105, 4957–4962 (2008).

References

  1. Qian, H. Cooperativity and specificity in enzyme kinetics: a single-molecule time-based perspective. Biophys. J. 95, 10–17 (2008).
  2. Hinczewski, M. & Thirumalai, D. Cellular signaling networks function as generalized Wiener-Kolmogorov filters to suppress noise. Phys. Rev. X 4, 041017 (2014).
  3. Gillespie, D. T. The chemical Langevin equation. J. Chem. Phys. 113, 297–306 (2000).
  4. Gillespie, D. T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361 (1977).
  5. Hathcock, D., Sheehy, J., Weisenberger, C., Ilker, E. & Hinczewski, M. Noise filtering and prediction in biological signaling networks. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2, 16–30 (2016).
  6. Bar-Even, A. et al. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50, 4402–4410 (2011).
  7. Liebermeister, W. & Klipp, E. Biochemical networks with uncertain parameters. IEE Proc.-Syst. Biol. 152, 97–107 (2005).
  8. Wang, M., Herrmann, C. J., Simonovic, M., Szklarczyk, D. & von Mering, C. Version 4.0 of PaxDb: protein abundance data, integrated across model organisms, tissues, and cell-lines. Proteomics 15, 3163–3168 (2015).
  9. Wittig, U. et al. SABIO-RK—database for biochemical reaction kinetics. Nucleic Acids Res. 40, D790–D796 (2011).
  10. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 47, D506–D515 (2018).
  11. Milo, R. What is the total number of protein molecules per cell volume? A call to rethink some published values. Bioessays 35, 1050–1055 (2013).
  12. Schoeberl, B., Eichler-Jonsson, C., Gilles, E. D. & Müller, G. Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat. Biotech. 20, 370–375 (2002).
  13. Zi, Z., Liebermeister, W. & Klipp, E. A quantitative study of the Hog1 MAPK response to fluctuating osmotic stress in Saccharomyces cerevisiae. PLOS ONE 5, e9522 (2010).
  14. Mettetal, J. T., Muzzey, D., Gómez-Uribe, C. & van Oudenaarden, A. The frequency dependence of osmo-adaptation in Saccharomyces cerevisiae. Science 319, 482–484 (2008).
  15. Hersen, P., McClean, M. N., Mahadevan, L. & Ramanathan, S. Signal processing by the HOG MAP kinase pathway. Proc. Natl. Acad. Sci. 105, 7165–7170 (2008).
  16. Mattison, C. P. & Ota, I. M. Two protein tyrosine phosphatases, Ptp2 and Ptp3, modulate the subcellular localization of the Hog1 MAP kinase in yeast. Genes Dev. 14, 1229–1235 (2000).
  17. Fersht, A. Structure and mechanism in protein science: a guide to enzyme catalysis and protein folding (Macmillan, 1999).
  18. Ghaemmaghami, S. et al. Global analysis of protein expression in yeast. Nature 425, 737–741 (2003).
  19. Milo, R. & Phillips, R. Cell biology by the numbers (Garland Science, 2015).
  20. Ilker, E. & Hinczewski, M. Modeling the growth of organisms validates a general relation between metabolic costs and natural selection. Phys. Rev. Lett. 122, 238101 (2019).
  21. Lynch, M. & Marinov, G. K. The bioenergetic costs of a gene. Proc. Natl. Acad. Sci. 112, 15690–15695 (2015).
Posted October 08, 2020.

Subject Area

  • Systems Biology