## ABSTRACT

Design principles to improve enzymatic activity are essential to promote energy-material conversion using biological systems. For more than a century, the Michaelis-Menten equation has provided a fundamental framework of enzymatic activity. However, there is still no concrete guideline on how the parameters should be optimized to enhance enzymatic activity. Here, we demonstrate that tuning the Michaelis-Menten constant (*K*_{m}) to the substrate concentration (*S*) maximizes enzymatic activity. This guideline (*K*_{m} = *S*) was obtained by applying the Brønsted (Bell)-Evans-Polanyi (BEP) principle of heterogeneous catalysis to the Michaelis-Menten equation, and is robust even with mechanistic deviations such as reverse reactions and inhibition. Furthermore, *K*_{m} and *S* are consistent to within an order of magnitude over an experimental dataset of approximately 1000 wild-type enzymes, suggesting that even natural selection follows this principle. The concept of an optimum *K*_{m} offers the first quantitative guideline towards improving enzymatic activity, which can be used for high-throughput enzyme screening.

## MAIN TEXT

### Introduction

Enzymes are responsible for catalysis in virtually all biological systems,^{[1,2]} and a rational framework to improve their activity is critical to promote biotechnological applications. Since the early 20^{th} century, a reaction mechanism where the enzyme first binds to the substrate (E+S → ES) before releasing the product (ES → P) has been used as the conceptual basis to understand enzyme catalysis (Scheme 1).^{[3-6]} The reaction rate of this mechanism is given by the Michaelis-Menten equation:

$$v = \frac{k_{2} E_{T} S}{K_{m} + S} \tag{1}$$

Here, the reaction rate (*v*) is expressed as a function of a rate constant (*k*_{2}), the Michaelis-Menten constant (*K*_{m}), and the substrate (*S*) and enzyme (*E*_{T}) concentrations. *K*_{m} can be interpreted as a quasi-equilibrium constant for the formation of the enzyme-substrate complex, defined as:

$$K_{m} \equiv \frac{k_{1r} + k_{2}}{k_{1}} \tag{2}$$

with rate constants defined based on the mechanism shown in Scheme 1. *k*_{2} is the rate constant for releasing the product from the enzyme-substrate complex (ES → P), routinely expressed as *k*_{cat} in the enzymology literature. These parameters are experimentally accessible by fitting the theoretical rate law (Eq. (1)) to experimental data^{[7-10]} and are subsequently registered in databases such as BRENDA^{[11]} and SABIO-RK.^{[12]} In principle, the accumulated data may help rationalize and improve the activity of existing enzymes.
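As a minimal sketch of the rate law described above, the following snippet evaluates the Michaelis-Menten rate, *v* = *k*_{2}*E*_{T}*S*/(*K*_{m} + *S*), with *K*_{m} = (*k*_{1r} + *k*_{2})/*k*_{1}. The numerical values of the rate constants and the enzyme concentration are hypothetical and were chosen only for illustration; they are not taken from any dataset.

```python
def michaelis_constant(k1, k1r, k2):
    """K_m = (k_1r + k_2) / k_1 for the mechanism E + S <-> ES -> E + P."""
    return (k1r + k2) / k1


def mm_rate(k2, e_total, s, km):
    """Michaelis-Menten rate v = k_2 * E_T * S / (K_m + S)."""
    return k2 * e_total * s / (km + s)


# Hypothetical parameters (illustration only, not fitted to any enzyme).
k1, k1r, k2 = 1.0e6, 5.0, 5.0  # M^-1 s^-1, s^-1, s^-1
e_total = 1.0e-9               # total enzyme concentration in M

km = michaelis_constant(k1, k1r, k2)
for s in (0.1 * km, km, 10.0 * km):
    v = mm_rate(k2, e_total, s, km)
    print(f"S = {s:.1e} M -> v = {v:.3e} M/s")
```

At *S* = *K*_{m} the rate is half of its saturation value *k*_{2}*E*_{T}, which is the usual operational meaning of *K*_{m}.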

However, there is no concrete understanding of how these parameters influence enzymatic activity. For example, increasing *k*_{2} may enhance activity according to Eq. (1), or diminish it due to a larger *K*_{m} (Eq. (2)).^{[13]} Thus, the mutual dependence between *k*_{2} and *K*_{m} complicates their influence on the enzymatic activity (*v*), hindering the rational design of enzymes towards biotechnological applications such as the synthesis of commodity chemicals,^{[14]} antibiotics,^{[15]} or pharmaceuticals,^{[16]} increasing the nutritional content of crops,^{[17]} and restoring the environment.^{[18]}

In this study, we analyzed the Michaelis-Menten equation to clarify the relationship between the enzyme-substrate affinity (*K*_{m}) and the activity (*v*). The key ingredient of our mathematical analysis is the Brønsted (Bell)-Evans-Polanyi (BEP) relationship,^{[19-23]} which models the activation barrier as a function of the driving force. This is a well-known concept in heterogeneous catalysis, and in conjunction with the Arrhenius equation,^{[24]} can be used to quantitatively evaluate the mutual dependence between *k*_{2} and *K*_{m}. This allowed us to calculate the optimum value of *K*_{m} required to maximize enzymatic activity (*v*), a finding which is supported by our bioinformatic analysis of approximately 1000 wild-type enzymes.

## Results

### Construction of the Thermodynamic Model

In principle, an ideal enzyme with low *K*_{m} and large *k*_{2} can be realized if both *k*_{1} and *k*_{2} are increased simultaneously. However, this is physically unrealistic, because the driving force which can be allocated to *k*_{1} and *k*_{2} is limited by the free energy change of the entire reaction. Within this thermodynamic context, maximum activity is realized by optimizing the distribution of the total driving force between the first (E+S → ES) and second (ES → P) steps shown in Scheme 1. To quantitatively evaluate the relationship between the driving force and the activity, we have used the BEP relationship^{[19-23]} to convert driving forces (Δ*G*) into activation barriers (*E*_{a}), and the Arrhenius^{[24]} equation to convert activation barriers to rate constants.

The thermodynamic model which served as the basis of our calculations is shown in Fig. 1. In a classical Michaelis-Menten reaction, the enzyme and substrate first form an enzyme-substrate complex (E+S → ES) before producing the product in the second step (ES → P). This mechanism is conceptually similar to reactions that occur on a heterogeneous catalyst surface, where the substrate molecule first binds to the catalyst surface before being converted into the product.^{[19-23]} The Gibbs free energies for the formation of the enzyme-substrate complex and the product are denoted as Δ*G*_{1} and Δ*G*_{2}, respectively. By definition, their sum must equal the total free energy change of the reaction Δ*G*_{T}:

$$\Delta G_{1} + \Delta G_{2} = \Delta G_{T} \tag{3}$$

From these thermodynamic constraints, we will use the BEP relationship^{[19-23]} to obtain activation barriers (*E*_{a}), and then the Arrhenius^{[24]} equation to obtain rate constants, which ultimately yields quantitative insight on the relationship between *k*_{1}, *k*_{2}, and *K*_{m}. Based on the BEP relationship, the activation barrier corresponding to *k*_{1} can be written as:

$$E_{a1} = E_{a1}^{0} + \alpha_{1} \Delta G_{1} \tag{4}$$

where *E*_{a1}^{0} represents the activation barrier when the elementary reaction is in equilibrium (Δ*G*_{1} = 0). It is a positive constant which reflects the inherent favorability of this elementary step. α_{1} is a positive constant coefficient which indicates the sensitivity of the activation barrier with respect to the driving force. Recently, Kari et al. have shown that fungal cellulases indeed satisfy such linear free energy relationships between the activation barrier and the driving force.^{[25]} Next, activation barriers can be converted to rate constants based on the Arrhenius equation^{[24]} as follows:

$$k_{1} = A_{1} \exp\left(-\frac{E_{a1}}{RT}\right) \tag{5}$$

Here, *A*_{1} is a pre-exponential factor, and *R* and *T* are the gas constant and absolute temperature, respectively. Using Eqs. (4) and (5), *k*_{1} can be expressed as:

$$k_{1} = A_{1} \exp\left(-\frac{E_{a1}^{0} + \alpha_{1} \Delta G_{1}}{RT}\right) = k_{1}^{0}\, g_{1}^{-\alpha_{1}} \tag{6}$$

where *k*_{1}^{0} ≡ *A*_{1} exp(−*E*_{a1}^{0}/*RT*) and *g*_{1} ≡ exp(Δ*G*_{1}/*RT*) were used to aggregate factors independent and dependent on the driving force, respectively (see Supporting Information, Appendix 1 for details). *k*_{1r} and *k*_{2} can also be written similarly as:

$$k_{1r} = k_{1}^{0}\, g_{1}^{\alpha_{1r}} \tag{7}$$

$$k_{2} = k_{2}^{0}\, g_{2}^{-\alpha_{2}} \tag{8}$$

using notations similar to those defined for *k*_{1}, namely *g*_{2} ≡ exp(Δ*G*_{2}/*RT*) and α_{1r} = 1 − α_{1} (see Appendices 2 and 3 for details). Substituting these rate constants into Eq. (2) yields the following expression for *K*_{m}:

$$K_{m} = g_{1} (1 + K) \tag{9}$$

where *K* was defined as *K* ≡ *k*_{2}/*k*_{1r}. Finally, based on Eqs. (8) and (9), the enzymatic activity (*v*) can be expressed as:

$$v = \frac{k_{2}^{0}\, g_{2}^{-\alpha_{2}} E_{T} S}{g_{1} (1 + K) + S} \tag{10}$$

To illustrate how Eq. (10) captures the tradeoff relationship between *k*_{2} and *K*_{m}, numerical simulations were performed (Fig. 2A). Hereafter, all simulations will be performed at α_{1} = α_{1r} = α_{2} = 0.5, which is a common assumption used to make baseline models in heterogeneous catalysis.^{[22,26-28]} Physically, this means that when the driving force of an elementary reaction is increased by 1 kJ/mol, its activation barrier decreases by 0.5 kJ/mol. In reality, typical experimental values of α range between 0.3 and 0.7 for artificial catalysts,^{[29-31]} and the experimental value reported for cellulases is 0.74.^{[25]} Therefore, the influence of α deviating from 0.5 will be discussed in detail in Fig. 5D.
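The chain from driving forces to rate constants to activity can be sketched as follows. This is a minimal illustration and not the authors' simulation code: it assumes the BEP-Arrhenius forms *k*_{1} = *k*_{1}^{0}*g*_{1}^{−α}, *k*_{1r} = *k*_{1}^{0}*g*_{1}^{1−α}, and *k*_{2} = *k*_{2}^{0}*g*_{2}^{−α} with *g* = exp(Δ*G*/*RT*). The prefactors `k1_0` and `k2_0`, the temperature, the three Δ*G*_{1} values, and the arbitrary concentration units are hypothetical choices.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1
T = 298.15    # assumed temperature in K


def bep_rate_constants(dg1, dgt, alpha=0.5, k1_0=1.0, k2_0=1.0):
    """Return (k1, k1r, k2) from the BEP relationship and the Arrhenius equation.

    dg1, dgt: driving forces in kJ/mol (step 1 and total reaction).
    alpha is used for alpha_1 and alpha_2; alpha_1r = 1 - alpha_1.
    k1_0, k2_0: hypothetical rate constants at zero driving force.
    """
    g1 = math.exp(dg1 / (R * T))          # driving-force factor of step 1
    g2 = math.exp((dgt - dg1) / (R * T))  # step 2, from dG1 + dG2 = dGT
    k1 = k1_0 * g1 ** (-alpha)
    k1r = k1_0 * g1 ** (1.0 - alpha)
    k2 = k2_0 * g2 ** (-alpha)
    return k1, k1r, k2


def km_k2_rate(dg1, dgt, s, e_total=1.0):
    """Return (K_m, k_2, v) for one thermodynamic landscape."""
    k1, k1r, k2 = bep_rate_constants(dg1, dgt)
    km = (k1r + k2) / k1
    v = k2 * e_total * s / (km + s)
    return km, k2, v


DGT = -40.0  # total driving force in kJ/mol
for dg1 in (-30.0, -20.0, -10.0):  # hypothetical landscapes
    km, k2, v_low = km_k2_rate(dg1, DGT, s=1e-4)
    _, _, v_high = km_k2_rate(dg1, DGT, s=1e3)
    print(f"dG1 = {dg1:6.1f}: K_m = {km:.3g}, k2 = {k2:.3g}, "
          f"v(low S) = {v_low:.3g}, v(high S) = {v_high:.3g}")
```

In this sketch, shifting driving force from the first to the second step raises both *k*_{2} and *K*_{m}, so the landscape that is fastest at low *S* is the slowest at high *S*.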

Fig. 2A shows three possible thermodynamic landscapes for a reaction with a total driving force of Δ*G*_{T} = −40 kJ/mol. This parameter was chosen as a representative value based on the fact that the Δ*G*_{T} of typical biochemical reactions is between −80 and +40 kJ/mol.^{[32,33]} Similar calculations with different values of Δ*G*_{T} can be found in Figs. S1-S3. When the first reaction is thermodynamically favorable compared to the second (Δ*G*_{1} < Δ*G*_{2}; Fig. 2A, black lines), the activity increases rapidly from low substrate concentrations (Fig. 2B, solid black line), consistent with the small *K*_{m} value. However, an enzyme with a small *K*_{m} suffers from a small *k*_{2} value, which is evident from the saturating behavior at *S* > 1 µM. Increasing the driving force of the second step (blue and red lines) leads to a larger *k*_{2} and thus higher activity at large *S* values (*S* > 1 µM) compared to the enzyme shown in black. At the same time, however, *K*_{m} increases, which decreases the enzymatic activity at low *S* (*S* < 1 µM). The difference in activity at low and high substrate concentrations occurs because the substrate participates in only the first elementary step. For example, even if *k*_{1} < *k*_{2} (Δ*G*_{1} > Δ*G*_{2}), the rates of the two forward reactions (*k*_{1}*E* ∙ *S* and *k*_{2}(*ES*)) can be matched if the substrate concentration (*S*) is sufficiently large. However, at low substrate concentrations, a small *k*_{1} can no longer be compensated, resulting in the first step being rate-limiting. For this reason, a large *k*_{1} is necessary to increase the enzymatic activity at low substrate concentrations, whereas a large *k*_{2} is more desirable when the substrate concentration is sufficient. The balance in tradeoff changes when the rates of the two forward reactions are equal (*k*_{1}*E* ∙ *S* = *k*_{2}(*ES*)). As the optimum values of *k*_{1} and *k*_{2} are dependent on the substrate concentration (*S*), the *K*_{m} value necessary to maximize the activity must also be dependent on *S*.

### Analysis of the Activity – Driving Force Relationship

To directly illustrate the influence of driving force (Δ*G*_{1} and Δ*G*_{T}) on enzymatic activity, we performed numerical simulations using Eq. (10) at various substrate concentrations (Fig. 3). At a substrate concentration of 0.1 μM (Fig. 3A), the region of highest enzymatic activity (orange) was observed in the bottom left region. It is reasonable for activity to be higher in the lower half of the panel, due to the more negative Δ*G*_{T}. A negative Δ*G*_{1} is also beneficial for activity at a low substrate concentration (*S* = 0.1 μM), leading to enzymatic activity being higher in the left half of the panel. At higher substrate concentrations, the overall color within each panel changed from blue to red, because a higher substrate concentration always increases activity (Figs. 3B-3D). At the same time, the Δ*G*_{1} corresponding to maximum activity gradually shifted positively (black dashed lines). This finding is consistent with Fig. 2, which shows that a more positive Δ*G*_{1} is desirable when the substrate concentration is increased. In each panel, the location with the highest activity at a given Δ*G*_{T} value is shown as a dashed black line. Notably, when the *K*_{m} value was calculated at the (Δ*G*_{1}, Δ*G*_{T}) values under the dashed line using Eq. (9), the obtained value was always equal to the substrate concentration *S* in each panel. In other words, the dashed line is not only the ridge of the volcano plot, but also the contour line showing *K*_{m} = *S*. This suggests that the condition for maximizing enzymatic activity can be represented by *K*_{m} = *S*.
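The observation that the ridge of maximum activity coincides with *K*_{m} = *S* can be checked with a simple grid search over Δ*G*_{1} at fixed Δ*G*_{T}. The sketch below is an illustration under stated assumptions rather than the code behind Fig. 3: it uses α = 0.5 for all steps, hypothetical prefactors `k1_0 = k2_0 = 1`, an assumed temperature of 298.15 K, and arbitrary concentration units.

```python
import math

R, T = 8.314e-3, 298.15  # kJ mol^-1 K^-1 and K (temperature is assumed)


def km_and_v(dg1, dgt, s, alpha=0.5, k1_0=1.0, k2_0=1.0, e_total=1.0):
    """Return (K_m, v) for given driving forces (kJ/mol) and substrate level s."""
    g1 = math.exp(dg1 / (R * T))
    g2 = math.exp((dgt - dg1) / (R * T))
    k1 = k1_0 * g1 ** (-alpha)
    k1r = k1_0 * g1 ** (1.0 - alpha)
    k2 = k2_0 * g2 ** (-alpha)
    km = (k1r + k2) / k1
    return km, k2 * e_total * s / (km + s)


def best_dg1(dgt, s, lo=-100.0, hi=40.0, step=0.01):
    """Grid search for the dG1 that maximizes v at fixed dGT and s."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda dg1: km_and_v(dg1, dgt, s)[1])


for s in (1e-7, 1e-6, 1e-5, 1e-4):
    dg1 = best_dg1(-40.0, s)
    km, _ = km_and_v(dg1, -40.0, s)
    print(f"S = {s:.0e}: optimal dG1 = {dg1:.2f} kJ/mol, K_m/S = {km / s:.3f}")
```

Within the resolution of the grid, the ratio *K*_{m}/*S* at the optimum stays close to 1 for every substrate concentration, and the optimal Δ*G*_{1} moves to more positive values as *S* increases.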

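To examine why *K*_{m} = *S* leads to maximum activity, Eq. (10) was rearranged to give the following expression for the activity (*v*):

$$v = \frac{k_{2}^{0}\, g_{T}^{-1/2} E_{T} S}{(1 + K)\, g_{1}^{1/2} + S\, g_{1}^{-1/2}} \tag{11}$$

in which *g*_{1} is only in the denominator. Here, *g*_{T} ≡ exp(Δ*G*_{T}/*RT*) = *g*_{1}*g*_{2}, and α_{1} = α_{1r} = α_{2} = 0.5 was used, in which case *K* ≡ *k*_{2}/*k*_{1r} is independent of *g*_{1}. The derivative of the denominator, denoted as *f*, with respect to *g*_{1} is:

$$\frac{df}{dg_{1}} = \frac{1}{2}\, g_{1}^{-3/2} \left[ (1 + K)\, g_{1} - S \right] \tag{12}$$

To maximize the activity (*v*), *f* must be minimized, which is realized at:

$$g_{1} = \frac{S}{1 + K} \tag{13}$$

Considering that *K*_{m} is defined as *K*_{m} ≡ *g*_{1}(1 + *K*) (Eq. (9)), Eq. (13) yields a surprisingly simple formula for the condition of maximum activity when α_{1} = α_{1r} = α_{2} = 0.5:

$$K_{m} = S \tag{14}$$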
Eq. (14) provides the theoretical basis for why maximum activity was consistently observed along the contour line *K*_{m} = *S* in Fig. 3: The combination of (Δ*G*_{1}, Δ*GT*) necessary to maximize activity guarantees *K*_{m} = *S*. This finding is further illustrated in Fig. 4, where the activity (*v*) is plotted as a function of *K*_{m} at different substrate concentrations. In all cases, maximum activity (*v*) is observed when the binding affinity (*K*_{m}) is equal to the substrate concentration (*S*). Thus, the derivations and simulations so far provide mathematical evidence that having a *K*_{m} value equal to the substrate concentration *S* guarantees maximal enzymatic activity as long as the enzyme follows the Michaelis-Menten mechanism (Scheme 1), and the rate constants follow the BEP relationship with *α*_{1} = *α*_{1r} = *α*_{2} = 0.5.
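The shape of the activity curve as a function of *K*_{m} can be written in closed form under the baseline assumption α_{1} = α_{1r} = α_{2} = 0.5. In that case *k*_{2} scales as *K*_{m}^{1/2} at fixed Δ*G*_{T}, so that *v* is proportional to *K*_{m}^{1/2}/(*K*_{m} + *S*); normalizing by the value at *K*_{m} = *S* gives *v*/*v*_{max} = 2√*x*/(1 + *x*) with *x* = *K*_{m}/*S*. The short sketch below evaluates this expression; the function name and the chosen values of *x* are illustrative.

```python
import math


def relative_activity(km_over_s):
    """v / v_max as a function of x = K_m / S, assuming all alpha = 0.5.

    Follows from v being proportional to sqrt(K_m) / (K_m + S),
    whose maximum over K_m lies at K_m = S.
    """
    x = km_over_s
    return 2.0 * math.sqrt(x) / (1.0 + x)


for log_x in (-2, -1, 0, 1, 2):
    x = 10.0 ** log_x
    print(f"K_m/S = 10^{log_x:+d}: v/v_max = {relative_activity(x):.3f}")
```

The expression is symmetric on a logarithmic *K*_{m} axis: mistuning *K*_{m} by a factor of 10 in either direction gives the same relative activity.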

### Robustness of the Theoretical Model

To confirm the robustness of our finding, we have performed numerical simulations by loosening each of the theoretical requirements. Deviations from the Michaelis-Menten mechanism (Scheme 1) are shown in Fig. 5A-C, and deviations of α values from 0.5 are shown in Fig. 5D. Reverse reactions (P → S) and inhibition (E + I → EI or ES + I → ESI) are common deviations from Michaelis-Menten kinetics.^{[34]} The net rate in the presence of a reverse reaction when the substrate and product are in equal concentrations (*S* = *P* = 10 µM) is shown in Fig. 5A. In terms of maximizing the activity in the forward direction (S → P), the physically meaningful region is Δ*G*_{T} < 0, where the net reaction proceeds in the forward direction. Under this condition, the dashed line corresponding to *K*_{m} = *S* and the solid line corresponding to the true maximum activity (forward minus reverse reaction rates) overlap almost completely, indicating that *K*_{m} = *S* is a good guideline to enhance activity even in the presence of reverse reactions (P → S).

Similar calculations for competitive and uncompetitive inhibition, where the inhibitor binds to either the free enzyme or the enzyme-substrate complex, are shown in Fig. 5B,C. The degree of inhibition, γ ≡ *I*/*K*_{i}, is determined by the inhibitor concentration (*I*) and the equilibrium constant of inhibition (*K*_{i}).^{[34]} Based on the experimental data of Park et al.,^{[35]} γ can range from 10^{−4} to 10^{4}. As γ was less than 10 in approximately 80% of their data, γ = 10 was used here for the numerical simulations. Again, the optimal *K*_{m} (solid line) deviates only slightly from the dashed line (*K*_{m} = *S*), and both lines pass through the region of high activity (orange). The *K*_{m} values are approximately 1 order of magnitude apart between dashed and solid lines, yet there is only a 57% difference in activity at a specific Δ*G*_{T}. This is much smaller than the scale of the entire diagram (10 orders of magnitude), suggesting that adjusting *K*_{m} to the substrate concentration *S* is a robust strategy to enhance the activity, even in the presence of inhibition. A detailed discussion on the parameter dependence (γ, *S*), as well as for other mechanisms such as substrate inhibition or allostericity, can be found in Section 3 of the supporting information. The derivations for the equations of the true optimal *K*_{m} can also be found in the same section.
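How far inhibition moves the optimum can be estimated with the textbook rate laws for competitive inhibition, *v* = *k*_{2}*E*_{T}*S*/(*K*_{m}(1 + γ) + *S*), and uncompetitive inhibition, *v* = *k*_{2}*E*_{T}*S*/(*K*_{m} + *S*(1 + γ)). The sketch below is not the derivation given in the supporting information; it assumes γ = *I*/*K*_{i}, α = 0.5 for all steps (so that *k*_{2} scales as *K*_{m}^{1/2}), and a hypothetical search grid.

```python
import math


def rel_rate(km_over_s, gamma, mode):
    """Rate in arbitrary units versus x = K_m / S (all alpha = 0.5 assumed)."""
    x = km_over_s
    if mode == "competitive":    # inhibitor binds E: K_m scaled by (1 + gamma)
        return math.sqrt(x) / (x * (1.0 + gamma) + 1.0)
    if mode == "uncompetitive":  # inhibitor binds ES: S term scaled by (1 + gamma)
        return math.sqrt(x) / (x + 1.0 + gamma)
    raise ValueError(f"unknown mode: {mode}")


def optimal_log10_x(gamma, mode, lo=-4.0, hi=4.0, step=0.001):
    """Grid search for log10(K_m / S) that maximizes the rate."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda lx: rel_rate(10.0 ** lx, gamma, mode))


GAMMA = 10.0
for mode in ("competitive", "uncompetitive"):
    lx = optimal_log10_x(GAMMA, mode)
    retained = rel_rate(1.0, GAMMA, mode) / rel_rate(10.0 ** lx, GAMMA, mode)
    print(f"{mode}: optimal log10(K_m/S) = {lx:+.3f}, "
          f"activity at K_m = S relative to optimum = {retained:.3f}")
```

Under these assumptions the optimum shifts to *K*_{m} = *S*/(1 + γ) for competitive and *K*_{m} = *S*(1 + γ) for uncompetitive inhibition, that is, by about one order of magnitude at γ = 10, while an enzyme tuned to *K*_{m} = *S* keeps roughly half of the optimal activity.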

The influence of the second assumption (α_{1} = α_{1r} = α_{2} = 0.5) is shown in Fig. 5D. As physical constraints require α_{1r} = 1 − α_{1} (Appendix 2), only α_{1} and α_{2} are independent. In an extreme case where α_{1} = α_{2} = 0.2, the activity is markedly diminished because rate constants hardly change even if their driving force is increased. However, the dashed line still passes through the region of high activity, and the activity is still less than an order of magnitude away from the true optimum (solid lines). Taken together, these simulations confirm that *K*_{m} = *S* is a robust theoretical guideline to enhance enzymatic activity.

### Validation based on Experimental Data

Finally, to evaluate whether *K*_{m} = *S* can rationalize enzymatic properties in nature, we have analyzed their relationship based on the experimental data from Park et al.^{[35]} The original data consisted of *K*_{m} values of wild-type enzymes obtained from BRENDA, and intracellular *S* values obtained from *Escherichia coli*, *Mus musculus*, and *Saccharomyces cerevisiae* cells, yielding a total of 1703 *K*_{m}–*S* combinations. This dataset was then classified based on the number of entries for each substrate, based on the expectation that a substrate which participates in many reactions is more likely to deviate from Michaelis-Menten kinetics. ATP is the most frequent substrate with 313 entries and is shown in black. Both the raw *K*_{m} and *S* values (Fig. 6A) and their relative distribution (Fig. 6B) show that *S* > *K*_{m} for ATP. The deviation from *K*_{m} = *S* may be because the Michaelis-Menten mechanism, which is the basis of our mathematical analysis, does not consider scenarios where multiple reactions compete for the same substrate. The next subset shown in blue covers 410 entries and consists of 5 substrates which each appear more than 50 times: NAD^{+}, NADH, NADP^{+}, NADPH, and acetyl-CoA. These cofactors are less universal than ATP, and *S* is only slightly larger than *K*_{m}. The remaining 980 entries are shown in red. This subset contains 115 substrates such as carbon metabolites and amino acids, which appear within the dataset 8 times on average. As the substrates become less universal, their *K*_{m} and *S* values become roughly consistent. In particular, the Gaussian distribution fitted to the red histogram (Fig. 6B) has a center at log_{10} *S*/*K*_{m} = 0.18 and a standard deviation of 1.3, which is reasonable considering that influences from inhibitors or the BEP coefficient can change the optimum *K*_{m} by roughly an order of magnitude (Fig. 5). Thus, the dataset from wild-type enzymes supports the theoretical prediction that a Michaelis-Menten constant equivalent to the substrate concentration is favorable for the activity, especially when the substrate participates in fewer reactions and Michaelis-Menten kinetics becomes more accurate.

## Discussion

So far, various criteria^{[13,34,36]} such as large *k*_{2} (*k*_{cat}), small *K*_{m}, or large *k*_{2}/*K*_{m} have been proposed to characterize enzymes with high activity, making it difficult to rationally evaluate or engineer the activity of an enzyme. The lack of a universal consensus is largely due to the mutual dependence between *k*_{2} and *K*_{m}. As our theoretical model addresses this challenge directly and maximizes the activity within the thermodynamic constraints imposed by *k*_{2} and *K*_{m}, we believe that *K*_{m} = *S* is a criterion for high activity which is viable in a wider range of scenarios.

The idea that the Michaelis-Menten constant should be increased at higher substrate concentrations to maximize activity is consistent with the experimental work by Kari et al.,^{[37]} who measured the activity of cellulases with different *K*_{m}. When the substrate concentration was increased 6-fold, the *K*_{m} value of the most active enzyme increased approximately 2.4-fold. Considering that *K*_{m} can change by roughly 6 orders of magnitude, the experimental trend supports our hypothesis *K*_{m} = *S*, especially when their experimental BEP coefficient of 0.74 is considered. The idea of the optimum binding affinity being dependent on the reaction condition and driving force is also consistent with recent theoretical models of heterogeneous catalysis.^{[22,38-40]}

As a corollary, our model which quantifies the relationship between *K*_{m} and *k*_{2} immediately rationalizes the recently reported free energy relationship between them in cellulases.^{[25]} Namely, the relationship between *K*_{m} and *k*_{2} can be written as:

$$\log k_{2} = \alpha_{2} \log K_{m} + \log \frac{k_{2}^{0}}{\left[ g_{T} (1 + K) \right]^{\alpha_{2}}} \tag{15}$$

This equation shows that log *k*_{2} and log *K*_{m} are linearly correlated by a factor of α_{2}, and provides a physical basis to the high linearity (*R*^{2} = 0.95) observed for cellulases.^{[25]} The consistency between our theoretical model and previously accumulated experimental insight suggests that it may be possible to quantitatively rationalize enzymatic properties based on fundamental principles of physical chemistry.
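The linear relation between log *k*_{2} and log *K*_{m} can be illustrated numerically by sweeping Δ*G*_{1} at fixed Δ*G*_{T} and fitting a straight line. The sketch below is an illustration under the baseline assumption α_{1} = α_{1r} = α_{2} = 0.5, for which *K* ≡ *k*_{2}/*k*_{1r} does not depend on Δ*G*_{1}; the prefactors, the temperature, and the Δ*G*_{1} range are hypothetical and do not represent cellulase data.

```python
import math

R, T = 8.314e-3, 298.15  # kJ mol^-1 K^-1 and K (temperature is assumed)


def ln_km_ln_k2(dg1, dgt, alpha=0.5, k1_0=1.0, k2_0=1.0):
    """Return (ln K_m, ln k_2) for given driving forces in kJ/mol."""
    g1 = math.exp(dg1 / (R * T))
    g2 = math.exp((dgt - dg1) / (R * T))
    k1 = k1_0 * g1 ** (-alpha)
    k1r = k1_0 * g1 ** (1.0 - alpha)
    k2 = k2_0 * g2 ** (-alpha)
    return math.log((k1r + k2) / k1), math.log(k2)


def least_squares_slope(points):
    """Ordinary least-squares slope of y against x for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Sweep dG1 from -60 to 0 kJ/mol at a fixed total driving force of -40 kJ/mol.
points = [ln_km_ln_k2(float(dg1), -40.0) for dg1 in range(-60, 1, 5)]
slope = least_squares_slope(points)
print(f"slope of ln k2 versus ln K_m: {slope:.3f}")
```

In this baseline case the fitted slope equals α_{2} = 0.5. When α_{2} differs from α_{1r}, *K* varies with Δ*G*_{1} and the slope need not match α_{2} exactly.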

### Online Methods

The mathematical formulas were derived by hand, and the step-by-step derivations for the standard Michaelis-Menten mechanism are explained in the main text. The derivations in the presence of inhibition and allostericity are provided in the supporting information. Numerical simulations and bioinformatic analysis were performed using Python 3.9.12. The code used for the analysis can be found in the extended data or accessed directly on GitHub: https://github.com/HideshiOoka/SI_for_Publications.

## Acknowledgments

H.O. gratefully acknowledges the support from the JST FOREST program (Grant Number JPMJFR213E, Japan). Y.C. is grateful for the support from the JST ACT-X program (Grant Number JPMJAX20BB, Japan).

## Footnotes

**Competing Interest Statement:** The authors declare no competing interests.