Abstract
Microbes in fragmented environments profit from yield-efficient metabolic strategies, which allow for a maximal number of cells. In contrast, cells in well-mixed, nutrient-rich environments need to grow and divide fast to out-compete others. Paradoxically, fast growth can entail wasteful, yield-inefficient modes of metabolism and smaller cell numbers. Therefore, general trade-offs between biomass yield and growth rate have been hypothesized. To study the conditions for such rate/yield trade-offs, we considered a kinetic model of E. coli central metabolism and determined flux distributions that provide maximal growth rates or maximal biomass yields. Maximal growth rates or yields are achieved by sparse flux distributions called elementary flux modes (EFMs). By implementing a framework we call Flux-analysis Enzyme Cost Minimization (fECM), we screened all EFMs in the network model and computed their biomass yields and minimal protein requirements, which we then used to estimate the growth rates. In a scatter plot of the growth rates and yields of all EFMs, a trade-off shows up as a Pareto front. At reference glucose and oxygen levels, we find that the rate/yield trade-off is small. However, in low-oxygen environments, a much clearer trade-off emerges: low-yield fermentation EFMs allow for growth 2-3 times faster than the maximal-yield EFM. The trade-off is therefore strongly condition-dependent and should be almost unnoticeable at high oxygen and glucose levels, the typical conditions in laboratory experiments. Our public web service www.neos-guide.org/content/enzyme-cost-minimization allows users to run fECM to compute enzyme costs for metabolic models of their choice.
Introduction
Metabolic networks, their dynamics, and their regulation are shaped by evolutionary selection. When nutrients are in excess and the environment is well mixed, fast-growing bacterial cells will outcompete others. Under this selection pressure, organisms should evolve to maximize their growth rate. Indeed, microbiologists use the terms growth rate and fitness almost synonymously. As much as such well-mixed rich environments are common in laboratory settings, natural environments are much more diverse. In fragmented ecological niches with limited resources, there is no direct competition during the growth phase, and fecundity determines the evolutionary success regardless of growth speed. This puts a selection pressure on biomass yield rather than growth because the number of offspring for bacteria in a limited-nutrient environment is directly proportional to the biomass yield of their catabolism (for a given cell size).
At first glance, high growth rate and yield could be expected to go hand in hand: imagine a cell that can produce more offspring for the same amount of nutrient (i.e., higher yield); then it seems logical that it would reproduce quickly (i.e., produce more offspring per hour). However, this is not what we see in experiments. Many fast-growing cells employ low-yield metabolic pathways, e.g., bacteria that, when grown on glucose, display respiro-fermentative metabolism at high growth rates even though fully respiratory growth would have a higher yield per mole of glucose. Similarly, yeast cells that produce ethanol (Crabtree effect) and cancer cells that produce lactate (Warburg effect) in the presence of oxygen seem to waste much of the carbon that they take up (see [1] for a review of these strategies and hypotheses). These yield-inefficient strategies observed in completely unrelated organisms have led to the suggestion that fast growth and high cell yield may even exclude each other for physico-chemical reasons (e.g., following thermodynamic principles [2, 3]). This hypothesis has been supported by simple cell models, in which lower thermodynamic forces or higher enzyme costs in the high-yield pathways caused a rate/yield trade-off.
In [4], two versions of glycolysis, both common among bacteria, were compared in terms of ATP yield on glucose and of enzyme demand (or, equivalently, their ATP production rate at a given enzyme investment). At a given glucose influx, the Embden-Meyerhof-Parnas (EMP) pathway yields twice as much ATP, but requires about 4.5 times as much enzyme as the Entner-Doudoroff (ED) pathway. Thus, it was hypothesized that cells under yield selection will use the EMP pathway while those under rate selection will use the ED pathway. The economics of other metabolic choices, e.g. respiration versus fermentation, and the resulting trade-offs, remain to be better quantified (some recent approximations have been performed for yeast [5] and E. coli [6]).
Several lab-evolution experiments with fast-growing microorganisms have been conducted to put the rate/yield hypothesis to the test, with varying levels of success. Growth rate and yield have been compared between different wild-type and evolved microbial strains [8–11]. Most of these studies found poor correlations between growth rate and yield. Novak et al. [9] found a negative correlation within evolved E. coli populations, indicating a rate/yield trade-off. One of the few examples of bacteria evolving for high yield in the laboratory was the work of Bachmann et al. [12]. In their protocol, each cell is kept in a separate droplet in a medium-in-oil suspension, simulating a fragmented environment, and the offspring are mixed only after the nutrients in each droplet are depleted. This creates a strong selection pressure for maximizing biomass yield. Indeed, the strains gradually evolved towards higher yields at the expense of their growth rate, again indicating a trade-off between the two objectives. However, most of the evidence relies on laboratory experiments in which microorganisms may behave sub-optimally, and the existence of rate/yield trade-offs remains debatable.
How can we study rate/yield trade-offs by models? A major difficulty is the prediction of cell growth rates and their dependence on the metabolic state. For exponentially growing cells, the rate of biomass synthesis per cell dry weight – typically measured in grams of biomass per gram cell dry weight per hour – is equal to the specific growth rate µ. At balanced growth, the relative amounts of all cellular components are preserved over time, including the protein fraction associated with enzymes that catalyze central carbon metabolism. If a metabolic strategy achieved the same biomass synthesis rate, but with a lower cost in terms of total enzyme mass, evolution would have the chance to reallocate the freed protein resources to other cellular processes that contribute to growth, and thus increase the cell’s growth rate. Thus, a growth-optimal strategy will be one that minimizes enzyme cost (at a given rate of glucose-to-biomass conversion) [13]. This drive for low enzyme or nutrient investments should be reflected in the choice of metabolic strategies: if low-yield pathways provide higher metabolic fluxes per enzyme investment, this leads to a growth advantage [13–17]. Thus, the rate-yield trade-off in cells reflects a trade-off between enzyme efficiency and substrate efficiency in metabolic pathways. Conversely, if cells can choose between a substrate-efficient and an enzyme-efficient pathway, they face a growth-yield trade-off. Since the enzyme cost of a pathway depends on substrate concentrations, this trade-off is likely to be condition-dependent.
The relation between protein investments and biomass yield YX/S, which we measure in grams of biomass per mole of carbon source carbons (i.e. per 1/6 mole of glucose), is not immediately clear. If the rate of carbon uptake qS were known, we could directly relate the yield and growth rate, using the formula $\mu = Y_{X/S} \cdot q_S$. However, since changing metabolic fluxes affect carbon uptake, yield, and growth rate at the same time, it is difficult to gain insight about the relation between µ and YX/S without understanding the changes in qS. Some metabolic models (in particular, classical Flux Balance Analysis (FBA)) place a fixed upper bound on qS, effectively assuming a fixed relation between growth rate and biomass yield. Of course, rate-yield trade-offs cannot be explored in this way. Here, we take a different approach, which combines kinetic modeling with elementary flux mode analysis.
Elementary flux modes (EFMs) describe the fundamental ways in which a metabolic network can operate [18–21]. Among the possible steady-state flux modes, EFMs are minimal in the sense that they do not contain any smaller subnetworks that can support steady-state flux modes [18, 19, 21]. The EFMs of a metabolic network can be enumerated (even though a large number of EFMs may preclude this in practice). Each EFM has a fixed yield, defined as the output flux divided by the input flux, which is easy to compute. The maximal possible yield is always achieved by an EFM. EFMs might be expected to have very simple shapes, but since biomass production requires many different precursors, biomass-producing EFMs can be highly branched (e.g. Figure 2A). All biomass-producing EFMs are thermodynamically feasible, and in a model with predefined flux directions, the set of steady-state flux distributions is a convex polytope spanned by the EFMs.
EFMs have a remarkable property that makes them well-suited for studying rate/yield trade-offs: in a given metabolic network, the maximal rate of biomass production at a given enzyme investment is always achieved by an EFM [7, 22]. Therefore, to find flux modes that maximize growth, we only need to enumerate the EFMs and assess them one by one. Since biomass yield is a fixed property of EFMs, we can observe rate/yield trade-offs simply by plotting yields versus growth rates of all EFMs (Figure 1(a)).
To study metabolic strategies and trade-offs in microbes, we developed a new method for predicting cellular growth rates achievable by each EFM. To do so, we consider a kinetic model of metabolism and determine an optimal enzyme allocation pattern in the network, realizing the required EFM and a predefined rate of biomass production at a minimal total enzyme investment. This optimization can be efficiently performed using the newly developed method of Enzyme Cost Minimization (ECM) [23]. For each EFM, the optimized enzyme amount at unit biomass flux is then translated into a mass doubling time of the proteins (i.e. the amount of time that metabolism would have to be running just to duplicate all the enzymes). Assuming that the protein fraction of the cell dry weight is a constant, this can further be translated into a cell growth rate by a semi-empirical formula described below. Building on recent developments in the field [7, 22, 23], we can then effectively scan the space of feasible flux modes and define a region of Pareto-optimal strategies – i.e., flux modes that maximize growth at a given yield or maximize yield at a given growth rate. The shape of this Pareto front tells us whether a rate/yield trade-off exists. We focus our efforts on the central metabolism of E. coli, because these fast-growing bacteria have often been used for experiments on the rate/yield trade-off and because of their well-studied enzyme kinetics.
Results
Computing the cell growth rate achievable by an elementary flux mode (EFM)
Flux-analysis Enzyme Cost Minimization (fECM) is a method for finding optimal metabolic states in kinetic models, i.e., flux distributions that realize a given flux objective (e.g., a given biomass production rate) at a minimal enzyme investment. These are the flux distributions that can be expected to allow for maximal growth rates. The fundamental difference between fECM and constraint-based methods is the underlying kinetic model. For scoring a flux profile, rather than using approximations such as the sum of fluxes [24] (or other linear/quadratic functions of the flux vector [25]), we directly use the total amount of required enzyme predicted by a kinetic model.
To compute the optimal enzyme investments for a given flux mode, we use Enzyme Cost Minimization (ECM), a method that uses metabolite log-concentrations as the variables and can be quickly solved using convex optimization [23]. ECM finds the most favorable enzyme and metabolite profiles that support the provided fluxes – i.e. the profiles that minimize the total enzyme cost at a given biomass formation flux and therefore maximize the specific biomass formation flux. The minimal enzyme cost of all EFMs can be computed in reasonable time (a few minutes on a shared server).
Given the enzyme demands, we next ask: how fast can a cell grow at the given metabolic fluxes and enzyme abundances? The steady-state cell growth rate is given by the biomass production rate divided by the biomass amount. Focusing on enzymes in central metabolism of E. coli, we first ask what total amount of enzyme (Emet) is required to produce biomass at a given rate (vBM) in a kinetic model. Then, we define the enzyme doubling time in hours as $\tau_{met} = \ln(2) \cdot E_{met} / v_{BM}$. This is the time a cell would need to reproduce all its metabolic enzymes if it didn’t have to produce any other biomass. Since E. coli cells also contain other proteins and biomass constituents, the real doubling time is longer and depends on the fraction of metabolic enzymes within the total biomass. This fraction, however, decreases with the growth rate as seen in experiments [26] and as expected from trade-offs between metabolic enzymes and ribosome investment [27]. Here, we use the approximation T = 7.4 · τmet + 0.51 [h] derived in the Methods section. The resulting growth rate (in h−1), $\mu = \ln(2)/T$, is a decreasing function of the enzyme cost (Emet).
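The conversion from enzyme cost to growth rate can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the enzyme doubling time takes the usual form τmet = ln(2) · Emet / vBM and that the growth rate follows from the doubling time as µ = ln(2) / T. Only the coefficients 7.4 and 0.51 h come from the text; the function names and the example inputs are hypothetical.

```python
import math

# Coefficients of the approximation T = 7.4 * tau_met + 0.51 [h] quoted in the text.
SLOPE = 7.4
OFFSET_H = 0.51


def enzyme_doubling_time(e_met, v_bm):
    """Enzyme doubling time in hours, assuming tau_met = ln(2) * E_met / v_BM.

    e_met and v_bm must be given in mutually consistent units so that
    their ratio is a time in hours.
    """
    return math.log(2) * e_met / v_bm


def growth_rate(e_met, v_bm):
    """Growth rate in 1/h, assuming mu = ln(2) / T with T from the text."""
    tau_met = enzyme_doubling_time(e_met, v_bm)
    doubling_time = SLOPE * tau_met + OFFSET_H
    return math.log(2) / doubling_time


# Made-up enzyme costs at unit biomass flux: a higher cost gives slower growth.
for e_met in (0.05, 0.10, 0.20):
    print(e_met, round(growth_rate(e_met, 1.0), 3))
```

Under these assumptions the growth rate cannot exceed ln(2) / 0.51 h, the value reached when the metabolic enzyme cost goes to zero.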
We can now compute the minimal total enzyme cost associated with a given flux mode, and we know that flux modes that minimize this cost will also maximize growth. So how can we find an optimal flux mode? Under some reasonable model assumptions, and no matter how the kinetic model parameters are chosen, the enzyme-specific biomass production rate, and therefore growth, is maximized by elementary flux modes [7, 22]. What is the reason? If we consider all feasible flux modes, constrained to predefined flux directions and to a fixed biomass production rate, these flux modes together form a convex polytope in flux space, called benefit-constrained flux polytope (see Figure 1(b)). The vertices of this polytope are EFMs. Since the flux cost function is concave [7], any cost-optimal flux mode must be a vertex of the flux polytope, and therefore be an EFM. Thus, in fECM, we can screen all EFMs, score each of them by the minimal necessary enzyme cost per unit of biomass production flux, and determine the growth-maximizing mode.
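The vertex argument can be checked numerically on a toy example. The sketch below does not use the ECM cost function: it substitutes a hypothetical concave cost (a sum of square roots of the fluxes) and two made-up flux modes scaled to the same biomass flux, and confirms that no mixture of the two is cheaper than the cheaper of the two pure modes.

```python
import math


def toy_cost(fluxes):
    """Hypothetical concave cost: sum of square roots of non-negative fluxes."""
    return sum(math.sqrt(v) for v in fluxes)


def mix(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two flux vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


# Two made-up flux modes; the first entry is the biomass flux, fixed to 1,
# so every mixture stays inside the benefit-constrained polytope.
efm_a = [1.0, 4.0, 0.0]
efm_b = [1.0, 0.0, 9.0]

best_vertex_cost = min(toy_cost(efm_a), toy_cost(efm_b))
mixed_costs = [toy_cost(mix(efm_a, efm_b, k / 10)) for k in range(11)]

# For a concave cost, every mixture costs at least as much as the best vertex.
print(all(c >= best_vertex_cost - 1e-12 for c in mixed_costs))
```

The same reasoning extends from a line segment to the whole polytope, which is why screening the vertices (the EFMs) is sufficient when the cost is concave.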
To summarize, rate-yield trade-offs in metabolism can be studied by considering a kinetic metabolic model (with rate laws and rate constants, external conditions such as glucose and oxygen levels, and constraints on possible metabolite and enzyme levels), screening all of its metabolic steady states, and computing growth rate and yield for each of them. Finally, the EFM with the maximal growth rate is chosen. However, our method provides not only the growth-optimal flux mode, but also the full spectrum of growth rates and yields of all EFMs. The growth-yield diagram, a scatter plot between the two quantities, shows the possible trade-offs between the two objectives. An EFM that is not beaten by any other EFM in terms of both growth rate and yield is called Pareto-optimal. By connecting these Pareto-optimal points by straight lines, we obtain their convex hull (see Figure 1(a)). If we could evaluate the growth rates and yields for all metabolic states in the model (including non-elementary flux modes), the resulting rate/yield points would form a compact set, and this entire set would be enclosed in the convex hull of the EFM points. The Pareto-optimal EFMs therefore mark the best compromises between growth rate and yield that are achievable in the model. By inspecting the rate/yield diagram, we can tell, for the set of all metabolic states, whether there is an extended Pareto front or a single solution that optimizes both rate and yield. While every EFM has a constant yield that never changes, its growth rate is condition dependent. Therefore, the entire picture and the emergence of rate-yield trade-offs can vary between conditions. We demonstrate this for one case study: the central metabolic network of E. coli.
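Selecting the Pareto-optimal EFMs from a list of (yield, growth rate) pairs can be sketched as follows. The function name and the example values are hypothetical and are not taken from the model results; the dominance test follows the definition above, where an EFM is discarded if another EFM is at least as good in both quantities and strictly better in one.

```python
def pareto_front(points):
    """Return names of EFMs that no other EFM beats in both yield and growth rate.

    points: dict mapping an EFM name to a (yield, growth_rate) tuple.
    The result is sorted by increasing yield.
    """
    front = []
    for name, (y, mu) in points.items():
        dominated = any(
            (y2 >= y and mu2 >= mu) and (y2 > y or mu2 > mu)
            for other, (y2, mu2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front, key=lambda n: points[n][0])


# Made-up (yield, growth rate) pairs for five hypothetical EFMs.
example = {
    "efm_a": (22.0, 0.60),
    "efm_b": (20.0, 0.70),
    "efm_c": (19.0, 0.74),
    "efm_d": (15.0, 0.65),
    "efm_e": (2.0, 0.25),
}

print(pareto_front(example))
```

In this made-up example, efm_d and efm_e are dominated by efm_b, so the front consists of the remaining three modes. A front with a single member would correspond to the case where one EFM maximizes both rate and yield.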
Metabolic strategies in E. coli central metabolism
To study optimal metabolic strategies in E. coli, we applied our method to a model of central carbon metabolism, which provides precursors for biomass production. Our model is a modified version of the model presented in [28] and comprises glycolysis, the Entner-Doudoroff pathway, the TCA cycle, the pentose phosphate pathway and by-product formation (SI Figure S4). Precursors and cofactors, provided by central metabolism, are converted into macromolecules (“biomass”) by a variety of processes. These processes are not explicitly covered by our network model, but summarized in a biomass reaction. Reaction kinetics are described by modular rate laws [29], with enzyme parameters obtained by balancing [30] a large collection of literature values. Parameter balancing yields consistent parameters in realistic ranges, satisfying thermodynamic constraints, and in optimal agreement with measured parameters.
All EFMs were scaled to a standard biomass production of 1, and their yields are defined as grams of biomass produced per mole of carbon atoms taken up in the form of glucose. EFMs that contain both oxygen-sensitive enzymes and oxygen-dependent reactions cannot be used by the cell. After removing such EFMs, we obtained 567 EFMs that produce biomass under aerobic conditions and 336 under anaerobic conditions, of which 97 can operate under both conditions (Figure 2(b)). Statistical properties of the EFMs (size distribution, usage of individual reactions, and similarity between EFMs and a measured flux distribution) are shown in Supplementary Figure S7. Some EFMs have very similar shapes, and we used the t-SNE layout algorithm to arrange all EFMs by their similarities (see Supplementary Figure S6). To avoid any biases in our growth predictions, we considered all EFMs that have a non-zero biomass yield, even those that contain physiologically unreasonable fluxes. For example, the futile cycling between PEP and pyruvate (by the combined activity of pyruvate kinase (pyk) and pyruvate water dikinase (pps)) wastes ATP and is generally expected to be suppressed by strict enzyme regulation [31]. Such EFMs were consistently predicted to show low growth rates and had no effect on the outcomes of our study.
The spectrum of possible growth rates and yields is shown in a rate/yield diagram (Figure 2(c)). The growth rates refer to our reference conditions – [glucose] = 100 mM, [O2] = 0.21 mM. While the yields (as immediate properties of the EFMs) are condition-independent, growth rate depends on enzyme requirements and therefore on kinetics and external conditions such as nutrient levels. If we assume that all EFMs require identical total enzyme amounts for the same glucose uptake rate, growth rates and yields would be proportional. Alternatively, if we assume that all EFMs require the same total enzyme amounts for the same rate of biomass production, all EFMs would have exactly the same growth rate, regardless of the yield. Both these naïve assumptions are replaced by our kinetic model and the fECM method, as described above.
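The two naïve assumptions can be written out explicitly. The short derivation below is a sketch under stated assumptions: the symbols $v_{glc}$ (carbon uptake flux), $r$ (enzyme-specific biomass production rate) and the constants $c_1$, $c_2$ are introduced here for illustration and do not appear in the original text. Note that $r$ enters the growth rate through the nonlinear conversion described earlier, so the proportionality holds exactly for $r$.

```latex
% Yield and enzyme-specific biomass production rate of an EFM
Y_{X/S} = \frac{v_{BM}}{v_{glc}}, \qquad r = \frac{v_{BM}}{E_{met}}

% Assumption 1: same enzyme amount per unit of glucose uptake, E_{met} = c_1 \, v_{glc}
r = \frac{v_{BM}}{c_1 \, v_{glc}} = \frac{Y_{X/S}}{c_1}
\quad\Rightarrow\quad r \propto Y_{X/S}

% Assumption 2: same enzyme amount per unit of biomass production, E_{met} = c_2 \, v_{BM}
r = \frac{v_{BM}}{c_2 \, v_{BM}} = \frac{1}{c_2}
\quad\Rightarrow\quad r \text{ is independent of } Y_{X/S}
```

In the kinetic model, neither constant exists: the enzyme amount per flux differs between EFMs and between conditions, which is what allows a trade-off to appear or vanish.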
To compare some typical metabolic strategies, we focused on 5 particular flux modes with different characteristics and followed them across varying external conditions and kinetic parameter values. These five EFMs and an experimentally determined flux distribution (denoted exp [32]; for calculations see Supplementary Text S4.1) are marked by colors in Figure 2(b) and listed in Table 1 and we refer to them as focal EFMs. Their full flux maps (produced using software from [33]) can be found in the SI section S5.2. As an example, the map for max-gr is also included in Figure 2(a). max-yield, as mentioned earlier, has the highest yield. This EFM does not produce any by-products nor does it use the pentose-phosphate pathway. max-gr has a slightly lower yield, but reaches the highest growth rate (0.739 h−1) in our reference conditions and uses the pentose-phosphate pathway with a relatively high flux. In addition, we chose another EFM from the Pareto front whose growth rate and yield are somewhere between the two extreme EFMs (denoted pareto). Curiously, the EFMs comprising the Pareto front span only a relatively narrow range of biomass yields (18.6–22.1). This is not a trivial finding, and other choices of parameters or extracellular conditions can lead to much broader Pareto fronts, as we observe in low oxygen conditions (Figure 4(c)). In such cases, the trade-off between growth rate and yield becomes much more pronounced.
We have also added two by-product forming modes to our set of focal EFMs: an anaerobic lactate fermenting mode (ana-lac) with a very low yield (2.1 g/C-mol) and an aerobic acetate fermenting mode (aero-ace) with a medium-high yield (15.2 g/C-mol). Interestingly, the growth rate of the anaerobic lactate fermenting mode (ana-lac) is still about one third of the maximal growth rate, even though its yield is ∼10 times lower, thanks to the lower enzyme cost of the PPP and lower glycolysis compared to the TCA cycle and oxidative phosphorylation (per mol of ATP generated). This recapitulates a classic rate-versus-yield problem, associated with overflow metabolism. Some acetate producing EFMs have the highest growth rate of all by-product producing EFMs, which might explain why E. coli in fact excretes acetate in aerobic conditions, rather than lactate or succinate. Nevertheless, none of these by-product forming EFMs has a higher growth rate than max-gr and therefore they are not Pareto optimal. As we will see later, this fact is also subject to change when conditions are different, specifically at lower oxygen levels.
To associate high yields or high growth rates with specific reaction fluxes or chemical products, we selected four uptake or secretion reactions, computed their fluxes in the different EFMs, and visualized them by colors in the rate/yield diagram (Figure 3). The best-performing EFMs (in the top-right corner) consume intermediate amounts of oxygen and do not secrete any acetate, lactate or succinate. Another group of EFMs (visible in red in Figure 3(b)) consume slightly less oxygen, but secrete large amounts of acetate. This aerobic fermentation exhibits lower biomass yields compared to pure respiration, but it maintains comparable growth rates, suggesting that a lower demand for enzyme compensates for the lower yield. Other important fluxes are shown in Supplementary Figure S8.
The growth rates associated with metabolic strategies depend on environmental conditions and enzyme parameters
To study how optimal EFMs and the resulting growth rates vary across external conditions, we varied a single model parameter and traced its effects on the growth rate. Figure 4(a) shows how a decreasing external oxygen concentration affects growth: lower oxygen levels need to be compensated by higher enzyme levels in oxidative phosphorylation, which in turn lowers the growth rate (Figure 4(b)). However, EFMs that function anaerobically, such as ana-lac, are not affected (see SI Figure S16 for the enzyme allocations). Therefore, the rate/yield trade-off becomes much more prominent at low oxygen levels, with a Pareto front spanning a wide range of growth rates and yields (Figure 4(a)).
The effects of varying glucose levels can be studied in a similar way (SI Figure S11): at a lower glucose concentration, the PTS transporter becomes less efficient and in order to maintain the flux, cells must compensate for this by expressing more of the associated PTS genes. This increases the total enzyme cost, and therefore slows down growth. At very low glucose concentrations, as low as $10^{-5}$ mM, the cost of the transporter completely dominates the enzyme cost (see Figure 5(b) and SI Figure S16 for a breakdown of the enzyme allocations). Note that in our model, the PTS transporter is the only glucose transporter available, therefore it is used by all of the EFMs, and the monotonic relationship between glucose concentration and growth rate is universal. Nevertheless, the exact shape of this glucose/growth plot, known as the Monod curve [34, 35], depends on the PTS flux and on many other parameters that differ between EFMs (see SI Figure S17).
By varying both glucose and oxygen levels at the same time, we can screen environmental conditions and see which EFM reaches the highest growth rate. This overview of winning strategies across the glucose/oxygen phase diagram is summarized in Supplementary Figure S12(a). We find that there are more than 20 different EFMs that achieve a maximal growth rate in at least one of the scanned conditions. To simplify this picture, we chose to present a single feature at a time and overlay it on the phase plot to obtain a bird's-eye view of the winning strategies in different regions (Figure 4(c)-(f)). As expected, the oxygen uptake rate (Figure 4(d)) decreases when oxygen levels are low. This pattern occurs across the entire range of glucose levels, but the transition – from full respiration to acetate overflow (Figure 4(e)) and then to anaerobic lactate fermentation EFMs (Figure 4(f)) – shifts slightly as glucose levels decrease. Interestingly, at extremely low glucose concentrations (0.1 µM), this transition cannot be seen in our plot as the fully respiring EFM pareto exhibits the highest growth rate even at the lowest oxygen levels tested (SI Figure S12(a)).
While glucose concentrations are relatively easy to adjust experimentally, the steady-state oxygen concentration in the local environment of cells growing exponentially is quite difficult to measure. Therefore, there is a long standing debate regarding the exact conditions the E. coli cells experience in batch cultures. This makes our prediction for the transition point from acetate fermentation to full respiration hard to validate. Nevertheless, our model predicts that at a constant level of [O2], E. coli will tend to fully respire at lower glucose levels, and secrete acetate at high glucose levels. Qualitatively, this prediction is in agreement with experimental evidence from chemostats [36], where cells start to secrete acetate only at high dilution rates (i.e. when the glucose level is high as well).
Cell growth rates and choices of metabolic strategies do not only depend on external conditions, but also on enzyme parameters. As an example case, we varied the kcat value of triose-phosphate isomerase (tpi) and studied its effects on the rate/yield diagram. Not surprisingly, slowing down the enzyme decreases the growth rate (Supplementary Figure S18). But to what extent? Two of our focal EFMs (max-gr and pareto) are completely unaffected by this kcat value, since they do not use the tpi reaction at all. The growth rates of all other focal EFMs, however, are strongly decreased. To study the effect of parameter changes more generally, we predicted the growth effects of all enzyme parameters in the model by computing their growth sensitivities, i.e., the first derivatives of the growth rate (or biomass-specific enzyme cost) with respect to the enzyme parameter in question (see Supplementary Files). Growth sensitivities are informative for more than one reason. On the one hand, parameters with large sensitivities are likely to be under strong selection pressures (where positive or negative sensitivities indicate a selection for larger or smaller parameter values, respectively). On the other hand, these parameters have a big effect on growth predictions, and precise estimates of these parameters are critical to obtain reliable models. Even for the same reaction, some parameters can have a much higher growth sensitivity than others. For example the sensitivities of the kcat and KM values of pgi are low, but the growth rate is very sensitive to the Keq value.
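A growth sensitivity of this kind can be approximated by a finite difference. The sketch below is a toy stand-in, not the fECM calculation: instead of the optimized ECM cost, it assumes every enzyme is fully saturated, so that the enzyme amount for a reaction is its flux divided by its kcat, and it assumes the enzyme doubling time is ln(2) times the enzyme cost per unit biomass flux. The reaction names are used only as labels, and all fluxes and kcat values are made up.

```python
import math


def growth_rate(enzyme_cost_per_biomass_flux):
    """Growth rate in 1/h from T = 7.4 * tau_met + 0.51 [h], assuming mu = ln(2) / T."""
    tau_met = math.log(2) * enzyme_cost_per_biomass_flux
    return math.log(2) / (7.4 * tau_met + 0.51)


def toy_enzyme_cost(fluxes, kcats):
    """Toy cost with saturated enzymes: E_i = v_i / kcat_i (hypothetical simplification)."""
    return sum(v / kcats[reaction] for reaction, v in fluxes.items())


def growth_sensitivity(fluxes, kcats, reaction, rel_step=1e-4):
    """Central finite-difference estimate of d(mu)/d(kcat) for one reaction."""
    h = kcats[reaction] * rel_step
    up = dict(kcats)
    up[reaction] += h
    down = dict(kcats)
    down[reaction] -= h
    mu_up = growth_rate(toy_enzyme_cost(fluxes, up))
    mu_down = growth_rate(toy_enzyme_cost(fluxes, down))
    return (mu_up - mu_down) / (2 * h)


# Made-up kcat values and two made-up flux modes at unit biomass flux.
kcats = {"tpi": 100.0, "pgi": 50.0}
efm_with_tpi = {"tpi": 2.0, "pgi": 1.0}
efm_without_tpi = {"pgi": 1.0}

print(growth_sensitivity(efm_with_tpi, kcats, "tpi"))
print(growth_sensitivity(efm_without_tpi, kcats, "tpi"))
```

As in the tpi example in the text, a flux mode that does not use the reaction has zero sensitivity to its kcat, while a mode that uses it gains growth when the enzyme becomes faster.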
Finally, to systematically study the choice between metabolic pathways we can “switch off” pathways by discarding all EFMs that use a certain pathway. Based on this restricted set of EFMs, and on our previous analysis of the full network, we can easily run an analysis of the restricted network without any need for new optimization runs. Unlike Flamholz et al. [4], we can now study the choice between the (high ATP yield, high enzyme demand) EMP and (low ATP yield, low enzyme demand) ED versions of glycolysis as part of a whole-network metabolic strategy, which also involves the choice between respiration and fermentation, or combinations of them. Moreover, we can compare the effect of constraining the model to use only one of these pathways across the different environmental conditions, specifically the external concentrations of glucose and oxygen. By calculating the (condition-dependent) growth defects of the two pathway knockouts compared to the wild-type, we can assess the importance of each of the pathways to the fitness in that condition (see SI section S3.6). As shown in Figure 6, at relatively low oxygen levels and medium-high glucose levels (10 µM – 100 mM) cells profit considerably from employing the ED pathway, so knocking it out would decrease the growth rate by up to 25%. The EMP pathway has a much more limited advantage (up to 10%), and only in a narrow range of low-oxygen conditions.
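Because the growth rate of every EFM is already known, a pathway knockout reduces to filtering the EFM list. The following sketch illustrates this under stated assumptions: the data layout, the function names, the reaction identifiers used to mark each pathway, and all growth rates are hypothetical and are not taken from the model.

```python
def best_growth(efms, banned_reactions=()):
    """Maximal growth rate among EFMs that use none of the banned reactions.

    efms: list of dicts with keys "reactions" (a set of reaction names)
    and "growth_rate" (in 1/h). Returns 0.0 if no EFM remains.
    """
    allowed = [
        e for e in efms
        if not any(r in e["reactions"] for r in banned_reactions)
    ]
    if not allowed:
        return 0.0
    return max(e["growth_rate"] for e in allowed)


def relative_growth_defect(efms, banned_reactions):
    """Fractional loss in maximal growth rate caused by removing a pathway."""
    wild_type = best_growth(efms)
    knockout = best_growth(efms, banned_reactions)
    return (wild_type - knockout) / wild_type


# Made-up EFMs for one condition; "edd"/"eda" mark ED usage, "pfk" marks EMP usage.
efms = [
    {"reactions": {"pfk", "fba"}, "growth_rate": 0.40},
    {"reactions": {"edd", "eda"}, "growth_rate": 0.50},
    {"reactions": {"pfk", "pta"}, "growth_rate": 0.35},
]

print(relative_growth_defect(efms, banned_reactions=("edd", "eda")))
print(relative_growth_defect(efms, banned_reactions=("pfk",)))
```

Repeating this calculation for each glucose and oxygen level, with the condition-specific growth rates, would give a map of growth defects of the kind described above, without rerunning any optimization.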
Discussion
Our case study in E. coli shows that there is no strict coupling between growth rate and biomass yield, and that the existence of a rate/yield trade-off strongly depends on the circumstances. This might come as no surprise, since yield can kinetically influence growth in two contrary ways. On the one hand, high-yield strategies produce biomass at a lower glucose influx, and this lower influx allows for lower enzyme investments and therefore for higher growth rates. On the other hand, high-yield pathways leave a smaller amount of Gibbs free energy to be dissipated in reactions [38] and, therefore, must be compensated by higher enzyme levels that lead to lower growth rates [39–41]. The second relation may be obscured by a second substrate such as oxygen, which provides additional driving force. If the first relation dominates, there may be a single EFM that maximizes both growth and yield; if the second relation dominates, there will be a trade-off, i.e., a Pareto front formed by several EFMs. In our simulations, the extent of this trade-off strongly depended on conditions and kinetic parameters. At high oxygen levels, our growth-maximizing solution showed almost the maximal yield and the Pareto front was very narrow. Under low-oxygen conditions, low-yield strategies showed the highest growth rates and a broad Pareto front emerged.
Experimental results indicating rate-yield trade-offs are difficult to interpret; as shown in [9], the original cell populations might be far from the trade-off line, and a selection for growth may push the populations and individuals closer to it. Selection for growth rate and selection for yield would be needed to demonstrate trade-offs experimentally. From our simulation results, we expect the experimental results to be as scattered as they are. It would be interesting to see whether the experimental results are in fact condition-dependent (e.g. dependent on oxygen availability).
Our standard conditions used in this paper describe almost saturating glucose and oxygen concentrations, which are comparable to typical laboratory conditions. However, different conditions are used in different experiments, and actual oxygen concentrations are very hard to estimate (the question of oxygen availability may be as complex as in yeast, where it has been suggested that oxygen may diffuse too slowly to supply the mitochondria with enough oxygen [42]). Furthermore, it is very difficult to obtain realistic values for the affinity of the reactions to oxygen, so even very precise knowledge of the oxygen concentration would not suffice. Under these standard conditions we do not predict a pronounced trade-off, although we know E. coli uses a lower-yield acetate producing strategy. We do find, however, that the acetate producing mode is optimal in lower oxygen levels (Figure 4(e)), suggesting that the cells might be perceiving a lower concentration of oxygen than the ambient conditions (0.21 mM). Moreover, our strain might not be optimally adapted; recently it has been shown that different strains of E. coli show different phenotypes and the strain we used for the data in this paper is not the fastest growing one [43].
To predict metabolic fluxes and growth rates, we developed a new method in which enzyme cost minimization, a numerically efficient method to predict metabolic fluxes and enzyme profiles ab initio, is combined with an exhaustive screening of EFMs, i.e., potentially optimal flux modes. Unlike other numerical optimization or screening methods (as, e.g., in [44] and [17]), it allows us to optimize metabolic states directly. To translate enzyme-specific biomass production rates into growth rates, we used a nonlinear formula that accounts for the growth-rate dependent composition of the proteome. The enzyme cost calculations made by fECM are only based on a network model, on kinetic enzyme properties, and on a few transparent model assumptions. No flux or proteome measurements are used. We use common modular rate laws because they yield realistic results and guarantee strict convexity ([23]; Joost Hulshof, personal communication). Furthermore, the optimized enzyme cost is a concave function in flux space [7, 22]. The combination of convexity and concavity facilitates fast optimization of fluxes and enzyme levels for each condition and set of parameters. Moreover, we chose to implement the optimization on the NEOS platform, which makes it easy to scale up and run multiple screens in parallel (e.g. of external conditions or knock-out libraries). Our implementation of fECM can also handle other rate laws and is freely available to the community via our website (see Methods).
Our model predicts a much higher maximal biomass yield than the yield measured in batch cultures (22.1 vs 11.8 [g dry weight per mole carbon] [45]), while the predicted growth rate is slightly lower (0.74 vs 0.89 h−1). For the experimentally determined flux mode, we likewise overestimate the yield (17.7 vs 11.8 [32]) and underestimate the growth rate (0.41 vs 0.61 h−1). The overestimation of the yield (which solely depends on the stoichiometric model structure) might be caused by the fact that some waste products or processes that dissipate energy are missing in our model. The low predicted growth rates might result from our simplistic conversion of enzyme costs into growth rates, a part of the method that could be improved. However, we expect that these over- and underestimations occur consistently across EFMs and will not affect the qualitative results of this study.
Our calculations of growth rates rely on a large number of kinetic constants. Uncertainties in these parameters will introduce uncertainties into all our predictions, but methods with fewer unknown parameters all have their own drawbacks. Stoichiometry-based methods (e.g. FBA without any additional flux constraints) do not require such parameters, but would not even be able to address rate/yield trade-offs because their model assumptions force growth rates and yields to be proportional (see Figure S13). More recent FBA methods that bound or minimize the presumed enzyme demand (such as FBA with flux minimization [24], molecular crowding [25], or constrained allocations [46]) can address such trade-offs and predict low-yield flux modes, but these predictions follow quite straightforwardly from the inputs, as these methods are unable to capture the complex interactions between reactions due to shared metabolites. The assumption that some (or all) enzymes always operate at their maximal capacity leads to an overestimation of growth rates because it ignores the “unused enzyme fraction” [47], e.g. enzymes that are not bound to their ligand due to low saturation levels. The term itself may be misleading: as suggested by our fECM results, enzymes may work below their maximal rates not because they are deliberately left unused, but because reactions, to be thermodynamically and kinetically efficient, require high substrate and low product levels. This causes conflicting requirements in different reactions, and even in the best possible compromise, many enzymes will be used inefficiently. One could also employ a simplified version of fECM that resembles methods assuming enzymes work at their maximal capacity, relating fluxes and enzyme levels not by rate laws, but by a simple proportionality.
For example, assuming that all enzymes work at their maximal speed (as given by their kcat values), the fECM optimization would become obsolete: using the enzyme weights and kcat values, we could directly translate any flux mode into a total required amount of enzyme by a simple linear formula which does not depend on metabolite concentrations (external or internal). This simplified version is used by satFBA [48], with the addition that the weights of the exchange reactions can be varied to reflect differential saturation of the transporter enzymes, which allows for the investigation of changing external conditions (similar to our Figure 4(c)). In previous work [23], we showed that this simplified rate law provides inferior predictions for enzyme concentrations, and as expected, the growth rate prediction is harmed as well. The growth rate would be overestimated by a factor of about 2.4 (see Supplementary Figure S3) and, more severely, the growth differences between EFMs would be distorted. This overestimation is purely an artifact and has no biological interpretation; therefore, results from these “linear” methods that agree with measurements could actually rest on wrong assumptions and have to be interpreted carefully. Given the overestimation of the growth rate, it seems quite surprising that these methods can actually be quite predictive (e.g. [6]). In the case of Resource Balance Analysis (RBA) [49], the overestimation of growth rates is avoided by using experimentally measured apparent kcat values, which are lower than the actual kcat values and capture the fact that enzymes work below their full efficiency. In RBA simulations, where growth rate is a simulation parameter, different apparent kcat values are chosen for different growth rates, reflecting the fact that enzyme efficiencies depend on metabolite levels [23], which vary between growth rates.
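The linear translation of a flux mode into an enzyme demand described above can be sketched in a few lines of Python. This is a minimal illustration under the assumption that every enzyme runs at its kcat, so that the demand is the sum of (molar mass × |flux| / kcat) over reactions; the function name, the use of the absolute flux, and all numbers are hypothetical and are not part of satFBA or of our implementation.

```python
def saturated_enzyme_cost(fluxes, kcats, masses):
    """Total enzyme mass (mg/l) if every enzyme works at its maximal rate.

    fluxes: reaction -> flux in mM/s
    kcats:  reaction -> catalytic constant in 1/s
    masses: reaction -> enzyme molar mass in mg/mmol
    The result does not depend on any metabolite concentration.
    """
    total = 0.0
    for rxn, v in fluxes.items():
        # Enzyme level (mM) needed for flux v at full saturation is |v| / kcat.
        total += masses[rxn] * abs(v) / kcats[rxn]
    return total


# Hypothetical two-reaction flux mode.
fluxes = {"r1": 1.0, "r2": 2.0}
kcats = {"r1": 100.0, "r2": 50.0}
masses = {"r1": 40000.0, "r2": 30000.0}
print(saturated_enzyme_cost(fluxes, kcats, masses))
```

Because the formula is linear in the fluxes, scaling the flux mode scales the enzyme demand by the same factor, which is why this simplification cannot capture saturation or thermodynamic effects.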
Beyond the high uncertainty in kinetic constants, it seems that the fluxes employed by “true wild-type” E. coli are vaguely defined. Recent data shows that even closely-related strains of E. coli sometimes use drastically different metabolic strategies, even though we expect their metabolism to be mostly identical [43]. Interestingly, a few of these strains do not display any respiro-fermentative metabolism in aerobic environments, but rather use a fully respiratory strategy without secreting any byproducts. Furthermore, the growth rate of these strains is among the highest. This finding raises questions regarding the universality of the rate/yield trade-off principle and supports our conclusion that it is almost non-existent in highly oxidative conditions.
Being based on enzyme kinetics, fECM is fully quantitative and allows modelers to address a great variety of questions. Unlike other flux prediction methods, our method can account for allosteric regulation and for the quantitative effects of external conditions such as oxygen concentration, kinetic parameters, and enzyme costs (see Supplementary Figure S2). The fact that our model can account for low glucose concentrations also implies that our method can be used to describe chemostat settings, while in flux-only methods everything would just scale linearly with a lower glucose uptake flux. In a chemostat the steady-state growth rate is externally controlled by setting the dilution rate, and the steady-state glucose level reaches a value that supports exactly this growth rate. There are likely trade-offs between growth at low and high oxygen concentrations and our model can be used to estimate these (see SI Figure S17). In standard conditions we do not see a trade-off between growth rates at high and low glucose, perhaps explaining why there is no significant negative selection on the Monod constant in the long term experimental evolution of E. coli, where rate selection could have been expected [51].
Once an fECM analysis has been run, additional analyses require only very little additional effort. For example, parameter sensitivities or uncertainties caused by small parameter variations can be easily computed without re-running any optimizations: all necessary sensitivities can be obtained from the existing results (see Supplementary Text S4.2 and S4.3). Moreover, the decomposition into EFMs already provides all information that is needed to study gene knock-outs. To simulate a single or multiple knock-out, we simply need to exclude all affected EFMs from our analysis (Supplementary Figure S2f). The yield of knock-out mutants and the yield-related epistatic interactions between knock-outs have been computed before (see SI Figure S21), but the growth rates of the knock-out mutants and their epistatic effects under different conditions have not been computed so far (see Supplementary Figure S20).
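As a minimal sketch of the knock-out procedure, the following Python snippet removes every EFM that uses a knocked-out reaction and then picks the fastest remaining EFM. The data layout (dictionaries keyed by EFM name), the reaction names, and the growth rates are illustrative assumptions.

```python
def surviving_efms(efms, knocked_out, tol=1e-12):
    """Keep only the EFMs that carry no flux through any knocked-out reaction.

    efms: EFM name -> {reaction: flux}
    knocked_out: iterable of reaction names
    """
    knocked_out = list(knocked_out)
    return {
        name: mode
        for name, mode in efms.items()
        if all(abs(mode.get(rxn, 0.0)) <= tol for rxn in knocked_out)
    }


def best_after_knockout(efms, growth_rates, knocked_out):
    """Return (name, growth rate) of the fastest surviving EFM, or None."""
    survivors = surviving_efms(efms, knocked_out)
    if not survivors:
        return None  # no feasible flux mode left: the knock-out is lethal
    best = max(survivors, key=lambda name: growth_rates[name])
    return best, growth_rates[best]


# Hypothetical EFMs and precomputed growth rates (1/h).
efms = {
    "efm_a": {"pgi": 1.0, "pta": 0.5},
    "efm_b": {"pgi": 1.0, "sdh": 0.8},
    "efm_c": {"zwf": 1.0, "sdh": 0.6},
}
growth_rates = {"efm_a": 0.7, "efm_b": 0.6, "efm_c": 0.4}
print(best_after_knockout(efms, growth_rates, ["pta"]))
print(best_after_knockout(efms, growth_rates, ["pta", "pgi"]))
```

No optimization is re-run here: the growth rates of the surviving EFMs are reused as they were computed for the wild type.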
fECM can be extended, both to larger network models and by incorporating more detailed kinetic information than we did in this paper. Bigger networks will bring two main challenges: data availability and calculation issues. The EFMs of a large network would greatly increase in number. A subsampling of EFMs can be problematic because depending on model conditions, the high-growth EFMs may easily be missed (see SI section S3.1). However, a promising avenue is to subdivide large networks by setting all strongly connected metabolites external [52] and predefining their concentrations. These concentrations could also be varied to assess their effects on predicted metabolic strategies. The resulting subnetworks can then be analyzed independently, and their EFMs can be combined to yield favorable, global, elementary flux distributions. As stated earlier, the convexity of the enzyme cost minimization problem for individual EFMs allows for large networks to be solved with fECM (as noted by e.g. [53]). Predictions with fECM will be strengthened by more accurate knowledge of enzyme properties. Although kinetic information about enzymes in the central carbon metabolism of E. coli is relatively complete, we still had to fill some missing gaps by parameter balancing [30]. To make the model parameters more precise, one could take temperature and pH into account [30]. The biomass composition of E. coli depends on the growth rate. Considering such a growth-dependent biomass composition is likely to improve the predictions, but requires changes to the algorithm. Ideally, model results should be self-consistent, i.e., the predicted growth rate should match the growth rate for which the biomass composition has been assumed. This challenge can perhaps be solved by an iterative procedure in which we assume a certain growth rate, choose the corresponding biomass composition, predict the resulting optimal growth rate, update the biomass composition, and so on. 
However, along with the different biomass composition, the set of EFMs will change in every iteration step, and it is unclear whether this procedure is bound to converge. Another open question concerns models with non-enzymatic reactions, which can effectively render the flux-polytope non-convex, possibly leading to non-elementary growth-optimal flux modes. Finally, we could study optimal EFMs and growth rates at different predefined rates of ATP consumption, implemented by a flux constraint. Since non-EFMs can be optimal with additional flux constraints, the algorithm for finding the possible optimal flux modes will have to be adjusted. Since these new points can be found by interpolating between EFMs we expect that an efficient algorithm can be developed, perhaps building onto the concept of elementary flux vectors [54, 55].
Although the importance of efficient protein allocation for reaching high growth rates has often been stressed (reviewed in [56]), real cells may not always minimize enzyme cost. Lactococcus lactis, for example, can display a metabolic switch associated with large changes in growth rate, but without any changes in protein investments [57]: these cells could save enzyme resources, but do not do so – possibly because unused enzyme provides other benefits, e.g., being prepared for metabolic changes to come. In order to capture such behavior in optimality approaches, other optimization criteria could be considered, such as additional flux constraints, extra ATP production for maintenance or stress, or a need for robustness. If different EFMs provide growth rates close to the optimal one, these EFMs, or maybe mixtures of them, may coexist in a single cell population. Mixtures of EFMs could also provide resistance, as single EFMs are inherently not robust against a repression of one of the active enzymes. To model such flux patterns in a population, instead of considering only one optimal EFM, we could consider a set of EFMs close to the optimal growth rate (e.g. between 99% and 100% of the optimal growth rate), because selection between these EFMs may be rather weak. Averaging over these EFMs could yield smoother transitions when parameters are varied, e.g., in the condition-dependent enzyme expression or the usage of alternative pathways. Specific fluxes over a range of conditions, such as shown in Supplementary Figure S12, could be averaged over a set of suboptimal EFMs as well, to give a more robust prediction of population behavior. As mentioned before, a sensitivity analysis can indicate selection pressures on kinetic parameters. However, larger changes such as how to evolve from using one EFM to another are still a challenge. Our method can be used to sketch a fitness landscape and predict what mutations would be necessary for such larger transitions.
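The idea of averaging over near-optimal EFMs can be illustrated as follows. This Python sketch selects all EFMs whose growth rate is within 1% of the maximum and averages their fluxes with equal weights; the equal weighting, the function name, and the toy numbers are assumptions made for illustration only.

```python
def near_optimal_average(growth_rates, fluxes, tolerance=0.01):
    """Average the fluxes of all EFMs within `tolerance` of the best growth rate.

    growth_rates: EFM name -> growth rate
    fluxes:       EFM name -> {reaction: flux}
    Returns (list of selected EFM names, {reaction: mean flux}).
    """
    mu_max = max(growth_rates.values())
    chosen = [n for n, mu in growth_rates.items() if mu >= (1.0 - tolerance) * mu_max]
    reactions = sorted(set().union(*(fluxes[n].keys() for n in chosen)))
    # Unweighted mean; a reaction absent from an EFM counts as zero flux.
    mean_flux = {
        rxn: sum(fluxes[n].get(rxn, 0.0) for n in chosen) / len(chosen)
        for rxn in reactions
    }
    return chosen, mean_flux


# Hypothetical growth rates (1/h) and fluxes.
growth_rates = {"efm_a": 0.740, "efm_b": 0.735, "efm_c": 0.600}
fluxes = {
    "efm_a": {"acetate_secretion": 0.4, "tca": 0.2},
    "efm_b": {"tca": 0.6},
    "efm_c": {"tca": 1.0},
}
chosen, mean_flux = near_optimal_average(growth_rates, fluxes)
print(chosen, mean_flux)
```

Other weightings, for example by the selective advantage of each EFM, would be equally conceivable.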
Although kinetic approaches to flux optimization pose some challenges, they are a necessary and useful addition to existing constraint-based flux analysis methods. Only with relatively assumption-free methods can we address fundamental issues of unicellular growth and cell metabolism, such as the trade-off between growth rate and biomass yield.
Methods
Flux and enzyme profiles for maximal enzyme-specific biomass production
A metabolic state is characterized by its enzyme levels, metabolite levels, and fluxes. The relationships between all these variables are defined by rate laws and are condition- and kinetics-dependent. Our algorithm finds optimal metabolic states in the following way. First, the elementary flux modes of a network, which constitute the set of potentially growth-optimal flux modes, are enumerated. We then consider a specific model condition, defined by a choice of kinetic constants and external metabolite levels in the kinetic model. For this condition, we compute the growth rates for all EFMs. To determine the optimal metabolic strategy (the one expected to evolve in a selection for fast growth), we then choose the EFM with the highest growth rate.
To determine the growth rate of an EFM, we predefine a biomass production rate vBM, scale the EFM to realise this production rate, and compute the enzyme demand by applying ECM. Computing the enzyme demand involves an optimisation of metabolite levels c and enzyme levels E by ECM. Thus, in summary, the optimal state (v, c, E) can be found efficiently by a nested screening procedure. First, we consider all feasible flux modes v, requiring stationarity and a predefined biomass production rate vBM. For each flux mode v, we consider all possible logarithmic metabolite concentration profiles ln c, where an upper and a lower bound is set for each metabolite. For each such profile, we compute the necessary enzyme levels El and obtain the total enzyme cost Emet. Since the cost function with respect to logarithmic metabolite concentrations is convex, it can be easily minimized; and since the optimized cost, as a function of fluxes, is concave, we need not screen the flux space exhaustively, but can restrict our search to elementary flux modes. This yields an optimization over all the possible states of our kinetic model.
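The nested procedure can be illustrated with a deliberately small toy example in Python. It is a sketch under strong assumptions and not our implementation: each "EFM" is a two-step chain S → X → P with a single internal metabolite X, the rate law is a simplified thermodynamic one, v = E · kcat · (1 − product / (Keq · substrate)), and the convex cost in ln x is minimized by ternary search. All names and parameter values are hypothetical.

```python
import math


def chain_enzyme_cost(ln_x, v, s, p, steps):
    """Enzyme mass needed to carry flux v through S -> X -> P at x = exp(ln_x)."""
    x = math.exp(ln_x)
    (kcat1, keq1, beta1), (kcat2, keq2, beta2) = steps
    eta1 = 1.0 - x / (keq1 * s)  # thermodynamic efficiency of S -> X
    eta2 = 1.0 - p / (keq2 * x)  # thermodynamic efficiency of X -> P
    if eta1 <= 0.0 or eta2 <= 0.0:
        return math.inf  # flux direction is thermodynamically infeasible
    return beta1 * v / (kcat1 * eta1) + beta2 * v / (kcat2 * eta2)


def minimal_cost(v, s, p, steps, x_bounds=(1e-3, 10.0), iters=200):
    """Inner loop: minimize the convex cost over ln x within the given bounds."""
    (_, keq1, _), (_, keq2, _) = steps
    lo = max(math.log(x_bounds[0]), math.log(p / keq2))
    hi = min(math.log(x_bounds[1]), math.log(keq1 * s))
    if lo >= hi:
        return math.inf  # no feasible concentration for X
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if chain_enzyme_cost(m1, v, s, p, steps) < chain_enzyme_cost(m2, v, s, p, steps):
            hi = m2
        else:
            lo = m1
    return chain_enzyme_cost(0.5 * (lo + hi), v, s, p, steps)


def screen_efms(efms, v_bm, s, p):
    """Outer loop: scale each EFM to the biomass rate v_bm and compare costs."""
    costs = {}
    for name, (steps, biomass_yield) in efms.items():
        v = v_bm / biomass_yield  # pathway flux needed for the biomass rate
        costs[name] = minimal_cost(v, s, p, steps)
    best = min(costs, key=costs.get)  # lowest cost = highest specific rate
    return best, costs


# Hypothetical EFMs: ([(kcat, Keq, molar mass) per step], biomass yield).
toy_efms = {
    "high_yield": ([(100.0, 10.0, 4.0e4), (100.0, 2.0, 4.0e4)], 2.0),
    "low_yield": ([(100.0, 1.0e3, 4.0e4), (100.0, 1.0e3, 4.0e4)], 1.0),
}
best, costs = screen_efms(toy_efms, v_bm=1.0, s=10.0, p=1.0)
print(best, costs)
```

In a full model, the inner minimization runs over many metabolites at once and requires a general convex solver, while the outer loop runs over all enumerated EFMs.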
ECM and NEOS online tool
Enzyme Cost Minimization has recently been applied to a similar kinetic model of E. coli’s central carbon metabolism network [23]. It uses a given flux distribution (in our case, given by an EFM) to formulate the enzyme concentrations as explicit functions of their substrate and product levels. We then score each possible enzyme concentration profile by the total enzyme mass concentration Emet = Σl βl El (in mg l−1), where βl denotes the molar mass of enzyme l in Daltons (mg mmol−1) and enzyme concentrations are measured in mM (i.e., mmol l−1). Written as a function of the logarithmic metabolite levels, Emet is a convex function; this greatly facilitates optimization and allows us to find the global minimum efficiently. Our online service for enzyme cost minimization is freely available to the community. Users can run ECM for their own models. For the model in this paper, the optimization for one flux distribution takes several seconds, and for the complete set of all EFMs several minutes on a shared Dell PowerEdge R430 server with 32 Intel Xeon cores. Details can be found on the web page describing this case study (http://www.neos-guide.org/content/enzyme-cost-minimization).
Computing growth rates from enzyme-specific biomass production rates
The cell growth rate can be approximately computed from the enzyme cost of biomass production. In the spirit of Scott et al. [27], the growth rate of a cell is given by µ = vBM/cBM, where cBM is the biomass concentration, i.e. the amount of biomass per cell volume, and vBM is the rate of biomass production (amount of biomass produced per cell volume and time). We further define the enzyme-specific biomass production rate rBM = vBM/Emet, which would be exactly equal to the growth rate if the entire cell biomass were composed of central metabolism enzymes (i.e. the ones directly accounted for in Emet). Since that is not the case, we account for the conversion between Emet and cBM using the following approximation: Emet/cBM = αprot(a − b µ), where αprot = 0.5 is the fraction of protein mass out of the cell dry weight, and a = 0.27 and b = 0.2 [h] are fitted parameters that approximate the proteomic fraction dedicated to central metabolism (as a linear function of the growth rate [27]). As shown in Supplementary Text S1.3, we obtain the following formula
$$\mu = \frac{a\,\alpha_{prot}\,r_{BM}}{1 + b\,\alpha_{prot}\,r_{BM}}.$$
Note that in our model, the biomass flux, vR70, is always set to 1 [mM s−1] and by pure unit conversion we obtain vBM = 7.45 × 10^7 [mg l−1 h−1]. As shown in the previous subsection, the total enzyme mass concentration is given by the formula Emet = Σl βl El in units of [mg l−1], so it requires no further conversion. The final formula for growth rate is thus
$$\mu = \frac{a\,\alpha_{prot}\,v_{BM}}{\sum_l \beta_l E_l + b\,\alpha_{prot}\,v_{BM}}.$$
Therefore, maximizing the growth rate µ is equivalent to minimizing Σl βl El. See Supplementary Text S2.4 for more details.
The connection between biomass rate, the total enzyme mass concentration, and the growth rate can also be understood through the cell doubling time. We first define the enzyme doubling time $\tau_{met} = \ln(2)/r_{BM}$, which represents the doubling time of a hypothetical cell comprised only of central metabolism enzymes. The doubling time of a whole cell would thus be
$$T = \frac{\ln 2}{\mu} = \frac{\tau_{met}}{a\,\alpha_{prot}} + \frac{b}{a}\,\ln 2 \approx 7.4\,\tau_{met} + 0.51\ \mathrm{h}.$$
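The conversion from enzyme cost to growth rate and doubling time can be written as a short Python sketch. Combining µ = vBM/cBM, rBM = vBM/Emet and Emet/cBM = αprot(a − b µ) and solving for µ gives µ = a αprot rBM / (1 + b αprot rBM), which is what the code evaluates. The parameter values are those quoted in the text; the function names and the example value of Emet are hypothetical.

```python
import math

ALPHA_PROT = 0.5  # protein fraction of cell dry weight
A = 0.27          # fitted parameter (dimensionless)
B = 0.2           # fitted parameter (h)
V_BM = 7.45e7     # biomass production rate (mg l^-1 h^-1)


def growth_rate(e_met):
    """Growth rate (1/h) from the total enzyme mass concentration e_met (mg/l)."""
    r_bm = V_BM / e_met  # enzyme-specific biomass production rate (1/h)
    return A * ALPHA_PROT * r_bm / (1.0 + B * ALPHA_PROT * r_bm)


def doubling_time(e_met):
    """Cell doubling time (h) corresponding to growth_rate(e_met)."""
    return math.log(2.0) / growth_rate(e_met)


# Hypothetical enzyme cost of one EFM, in mg/l.
e_met_example = 6.0e6
print(growth_rate(e_met_example), doubling_time(e_met_example))
```

Under these assumptions the growth rate increases monotonically as Emet decreases and saturates at a/b for very low enzyme costs, so ranking EFMs by growth rate is the same as ranking them by enzyme cost.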
Growth sensitivities
The sensitivities between enzyme parameters and growth rate can be approximated in the following way. A parameter change that slows down a specific reaction could be compensated by increasing the enzyme level in the same reaction, thus keeping all metabolite levels and fluxes unchanged. For example, if a catalytic constant is halved, the enzyme level needs to double. More generally, the cost increase for an enzyme follows from the simple formula [cost increase] = [old enzyme cost] × ([old kcat] / [new kcat] − 1). For other parameters, this local enzyme increase could be simply computed from the reaction’s rate law. However, instead of adapting only one enzyme level, the cell may also adjust other enzyme levels, accept a change in metabolite levels, and therefore decrease its total cost even further. The additional cost decrease is only a second-order effect: for small parameter variations, it can be neglected, and the first-order local and global cost sensitivities are therefore identical (proof in SI section S4.2). Sensitivities to external parameters (e.g., extracellular glucose concentration) can be computed similarly. The growth sensitivities for a given EFM can be easily computed by multiplying the enzyme cost sensitivities by the derivative of the growth rate with respect to the enzyme cost in a reference state.
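A first-order version of this calculation for catalytic constants can be sketched in Python. The sketch assumes local compensation, so that the cost of enzyme l scales as 1/kcat and ∂Emet/∂ln kcat = −βl El, and it assumes the growth formula µ = a αprot vBM / (Emet + b αprot vBM) with the parameter values given in the text. The function names and the per-enzyme costs are hypothetical.

```python
ALPHA_PROT = 0.5  # protein fraction of cell dry weight
A = 0.27          # fitted parameter (dimensionless)
B = 0.2           # fitted parameter (h)
V_BM = 7.45e7     # biomass production rate (mg l^-1 h^-1)


def growth_rate(e_met):
    """Growth rate (1/h) as a function of the total enzyme cost e_met (mg/l)."""
    return A * ALPHA_PROT * V_BM / (e_met + B * ALPHA_PROT * V_BM)


def kcat_growth_sensitivities(enzyme_costs):
    """Return d(mu)/d(ln kcat) for each enzyme, to first order.

    enzyme_costs: enzyme name -> cost beta_l * E_l in mg/l (reference state).
    """
    e_met = sum(enzyme_costs.values())
    mu = growth_rate(e_met)
    # Derivative of the growth rate with respect to the total enzyme cost.
    dmu_de = -mu / (e_met + B * ALPHA_PROT * V_BM)
    # Local compensation: d(e_met)/d(ln kcat_l) = -(cost of enzyme l).
    return {name: dmu_de * (-cost) for name, cost in enzyme_costs.items()}


# Hypothetical per-enzyme costs in mg/l.
enzyme_costs = {"pgi": 1.0e6, "pfk": 2.0e6, "pyk": 3.0e6}
print(kcat_growth_sensitivities(enzyme_costs))
```

In this approximation the enzymes with the largest cost in the reference state are also the ones whose catalytic constants affect the growth rate most strongly.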
Acknowledgements
We thank H.-G. Holzhütter, Joost Hulshof, Avi Flamholz, Philip van Kuiken, Timo Maarleveld and Bas Teusink for fruitful discussions, and the Lorenz Center of Leiden University for providing a space for developing ideas. This work was funded by the Swiss Initiative in Systems Biology (SystemsX.ch) TPdF fellowship (2014-230) (to EN) and by the German Research Foundation (Ll1676/2-1) (to WL).
References
- [1].
- [2].
- [3].
- [4].
- [5].
- [6].
- [7].
- [8].
- [9].
- [10].
- [11].
- [12].
- [13].
- [14].
- [15].
- [16].
- [17].
- [18].
- [19].
- [20].
- [21].
- [22].
- [23].
- [24].
- [25].
- [26].
- [27].
- [28].
- [29].
- [30].
- [31].
- [32].
- [33].
- [34].
- [35].
- [36].
- [37].
- [38].
- [39].
- [40].
- [41].
- [42].
- [43].
- [44].
- [45].
- [46].
- [47].
- [48].
- [49].
- [50].
- [51].
- [52].
- [53].
- [54].
- [55].
- [56].
- [57].