Generative machine learning produces kinetic models that accurately characterize intracellular metabolic states

Generating large omics datasets has become routine practice to gain insights into cellular processes, yet deciphering such massive datasets and determining intracellular metabolic states remains challenging. Kinetic models of metabolism play a critical role in integrating omics data, as they provide explicit connections between metabolite concentrations, metabolic fluxes, and enzyme levels. Nevertheless, the challenges associated with determining the kinetic parameters that underlie cellular physiology pose significant obstacles to the broader acceptance and adoption of these models within the research community. Here, we present RENAISSANCE, a generative machine learning framework for efficiently parameterizing large-scale kinetic models with dynamic properties matching experimental observations. Through seamless integration and consolidation of diverse omics data and other relevant information, like extracellular medium composition, physicochemical data, and expertise of domain specialists, we show that the proposed framework accurately characterizes unknown intracellular metabolic states, including metabolic fluxes and metabolite concentrations in E. coli’s metabolic network. Moreover, we show that RENAISSANCE successfully estimates missing kinetic parameters and reconciles them with sparse and noisy experimental data, resulting in a substantial reduction in parameter uncertainty and a notable improvement in the accuracy and reliability of the parameter estimates. The proposed framework will be invaluable for researchers who seek to analyze metabolic variations involving changes in metabolite and enzyme levels and enzyme activity in health and biotechnological studies.

traditional kinetic modeling methods, thus allowing its broad utilization for high-throughput dynamical studies of metabolism. We showcase this framework through three studies: (i) generating a population of large-scale dynamic models of E. coli metabolism, (ii) characterizing intracellular metabolic states in the E. coli metabolic network accurately, and (iii) integrating and reconciling available experimental kinetic data.

RENAISSANCE for parameterization of biologically relevant kinetic models
In its conception, RENAISSANCE can parameterize kinetic models to satisfy a broad range of biochemical properties or physiological conditions. For example, it can parameterize models reproducing experimentally observed fermentation curves or drug adsorption patterns. Herein, we use RENAISSANCE to parameterize kinetic models to be consistent with an experimentally observed steady state. This approach to model construction was introduced within the ORACLE conceptual framework 15,28,34,37-40, which parameterizes kinetic models by unbiased sampling. In contrast, in RENAISSANCE, we leverage machine learning to perform stratified sampling biased toward kinetic models producing metabolic responses over time with timescales 41 matching experimental observations in studied organisms. Due to its capability to bias parameter sampling toward desired model properties, the proposed framework substantially improves model construction efficiency, enabling comprehensive studies of multiple physiological conditions.
In this context, prior to using RENAISSANCE, we compute a steady-state profile of metabolite concentrations and metabolic fluxes that will be used for parameterization (Methods). To accomplish this, we integrate information about the structural properties of the metabolic network (stoichiometry, regulatory structure, rate laws) as well as available data (metabolomics, fluxomics, thermodynamics, proteomics, and transcriptomics) into the model (Figure 1b, c, Methods). A parameterized kinetic model exhibits highly nonlinear but deterministic responses that depend on the intracellular state determined by the network topology and the integrated data. We require function approximators with similar complexity, such as neural networks 32, to capture this nonlinear behavior and determine kinetic parameters. In RENAISSANCE, we iteratively optimize the weights of feed-forward neural networks (generators) using NES (Figure 1a) to obtain kinetic parameters leading to biologically relevant kinetic models, meaning that the metabolic responses obtained from these models have experimentally observed dynamics (Methods).
The NES algorithm produces a population of candidate solutions to an optimization problem and assigns a fitness score to each candidate solution (Figure 1a). The algorithm uses the fitness scores of the current solutions to generate the next generation of candidate solutions, which are likely to have better fitness scores than the current generation. The iterative procedure stops as soon as the obtained solutions are satisfactory. Unlike traditional gradient-based deep learning methods that require data to train a neural network, NES requires only a scoring function.
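The idea that NES needs only a scoring function can be illustrated with a minimal sketch. The code below is a toy, pure-Python evolution-strategy loop and not the RENAISSANCE implementation; the function name `nes_maximize`, its parameters (`sigma`, `lr`, `pop_size`), and the quadratic test objective are illustrative assumptions.

```python
import random

def nes_maximize(fitness, dim, pop_size=20, sigma=0.1, lr=0.02,
                 generations=300, seed=0):
    """Toy evolution-strategy loop: it only ever calls `fitness`, never a gradient."""
    rng = random.Random(seed)
    mean = [0.0] * dim  # centre of the current search distribution
    for _ in range(generations):
        # Sample a population of Gaussian perturbations around the mean.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(pop_size)]
        scores = [fitness([m + sigma * e for m, e in zip(mean, eps)])
                  for eps in noise]
        # Standardize the scores so the update does not depend on their scale.
        mu = sum(scores) / pop_size
        sd = (sum((s - mu) ** 2 for s in scores) / pop_size) ** 0.5 or 1.0
        weights = [(s - mu) / sd for s in scores]
        # The score-weighted average of the perturbations estimates the
        # direction in which the fitness increases.
        for j in range(dim):
            grad_j = sum(w * eps[j] for w, eps in zip(weights, noise))
            mean[j] += lr * grad_j / (pop_size * sigma)
    return mean

# Hypothetical objective with its maximum at (3, -1).
best = nes_maximize(lambda p: -((p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2), dim=2)
print(best)
```

In RENAISSANCE the optimized vector would be the generator weights and the score would be a reward such as the incidence of valid models; here a simple quadratic stands in for both.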
The iterative process in RENAISSANCE consists of four steps (Figure 1d). We start by initializing a population of generators with random weights (step I). We select one generator at a time and generate a batch of kinetic parameters consistent with the network structure and integrated data using multivariate Gaussian noise as input. We then parameterize the kinetic structure of the metabolic network (step II) with the generated parameter set. Next, we evaluate the dynamics of each parameterized model by computing the eigenvalues of its Jacobian and the corresponding dominant time constants (Methods). These quantities allow us to assess if the generated kinetic models have dynamic responses corresponding to experimental observations (valid models) or not (invalid models). Based on this evaluation, we assign a reward to the generator (step III). NES repeats steps II and III for every generator in the population and uses the rewards of the entire population to estimate the local gradient landscape and find the weights of the generator that improve the design objective (step IV). Then, we mutate the obtained generator by injecting random noise in its weights and recreate the new population of generators (step I). We iterate steps I-IV until we obtain a generator that meets the design objective, i.e., it can generate biologically relevant kinetic models (Methods).
The generated kinetic models are applicable to a broad range of metabolism studies. Here, we present a few of these (Figure 1e).

Generating large-scale kinetic models of E. coli metabolism
To test and validate RENAISSANCE, we generated biologically relevant kinetic parameter sets for central carbon pathways of E. coli metabolism (Methods, Supplementary Note 2). The objective was to find kinetic parameters resulting in dynamic models consistent with an experimentally observed doubling time of 134 minutes for the studied E. coli strain 42. A valid kinetic model satisfying this requirement should produce metabolic responses with the dominant time constant of 24 min, which corresponds to having the largest eigenvalue $\lambda_{max} < -2.5$ (Methods). The model structure consisted of 113 nonlinear ordinary differential equations (ODEs) parameterized by 502 kinetic parameters, including 384 Michaelis constants, $K_M$ (Methods, Supplementary Figure 4). To integrate the experimental data 42 and compute a steady-state profile of metabolite concentrations and fluxes, we used thermodynamics-based flux balance analysis 13 (Methods).
We ran RENAISSANCE for 50 evolution generations. We repeated the optimization process 10 times with a randomly initialized generator population to obtain statistical replicates. At every generation, we generated 100 kinetic parameter sets for every generator in the population and computed the maximum eigenvalue, $\lambda_{max}$, for each parameter set. To evaluate the generators, we used the incidence of valid models, defined as the proportion of the generated models that are valid (with $\lambda_{max} < -2.5$, Methods). We observed that the incidence of valid models steadily increases with the number of generations, with the mean incidence converging around 92% after 50 generations (Figure 2a, thick black line). For some repeats, we could achieve incidence up to 100% (Figure 2a, green-shaded region).
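The incidence used to score a generator is simply a fraction, which the following minimal sketch makes concrete. The function name `incidence` and the example eigenvalues are hypothetical; only the threshold of −2.5 comes from the text.

```python
def incidence(max_eigenvalues, threshold=-2.5):
    """Fraction of models whose largest eigenvalue lies strictly below the threshold."""
    if not max_eigenvalues:
        raise ValueError("need at least one generated model")
    n_valid = sum(1 for lam in max_eigenvalues if lam < threshold)
    return n_valid / len(max_eigenvalues)

# Made-up batch of maximum eigenvalues, one per generated parameter set.
batch = [-3.1, -2.6, -2.5, -0.4, 1.2]
print(incidence(batch))  # two of the five values are strictly below -2.5
```

In the study described above, each batch would contain 100 eigenvalues per generator rather than five.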
For further analysis of the generated models, we selected a statistical repeat with fast convergence (Figure 2a, dashed line) and chose 10 generators from that repeat with monotonically increasing incidence over generations (Figure 2a, black diamonds). For each of the 10 chosen generators, we generated 500 kinetic parameter sets and examined the distribution of the resulting maximum eigenvalues (Figure 2b). Remarkably, the generated models gradually shifted over the optimization process from having slow dynamics ($\lambda_{max} > -2.5$) to having fast dynamics, with the metabolic processes settling before the subsequent cell division, indicating that RENAISSANCE-generated models could capture the experimentally observed dynamics.
Since cellular organisms maintain phenotypic stability when faced with perturbations 43, the generated models that describe cellular metabolism should possess the same property. To test the robustness of the models, we perturbed the steady-state metabolite concentrations up to ±50% and verified if the perturbed system returned to the steady state. For this purpose, we generated 1000 relevant kinetic models using the last of 10 selected generators (Figure 2a, generation 45). Inspection of the time evolution of the normalized biomass showed that the biomass returned to the reference steady state ($x(t)/x_{ref} = 1$) within 24 minutes for 100% of the perturbed models (Figure 2c). Similarly, the perturbed time responses of a few critical metabolites, namely, NADH, ATP, and NADPH, returned to their steady-state values within 24 minutes for 99.9%, 99.9%, and 100% of the 1000 generated kinetic models, respectively (Figure 2c). Examining every cytosolic metabolite collectively revealed that 75.4% of the models returned to the steady state within 24 minutes and 93.1% returned within 34 minutes, demonstrating that the generated kinetic models are robust and obey imposed context-specific observable biophysical timescale constraints.
Next, we tested the generated models in nonlinear dynamic bioreactor simulations closely mimicking real-world experimental conditions 42,44. The temporal evolution of biomass production showed similar trends as typical experimental observations with clear exponential and stationary phases of E. coli growth (Figure 2d, Supplementary Figure 5). Similarly, glucose uptake and anthranilate production also reproduced trends observed in experiments, with glucose being completely consumed and anthranilate production saturating around 20 hours 29,30. This study indicates that the RENAISSANCE models can accurately reproduce the physiologically observable and emergent properties of cellular metabolism, even without explicit training to reproduce fermentation experiments.

Characterizing the intracellular states of E. coli metabolism
Accurately determining the intracellular levels of metabolite profiles and metabolic reaction rates is crucial for associating metabolic signatures with phenotype. Yet, our capabilities to establish the intracellular metabolic state are limited. Even with the ever-increasing availability of physiological and omics data, a significant amount of uncertainty in the intracellular states remains. We propose using kinetic models to reduce this uncertainty because of their explicit coupling of enzyme levels, metabolite concentrations, and metabolic fluxes. Moreover, kinetic models allow us to consider dynamic constraints in addition to steady-state data, thus allowing further uncertainty reduction.
After integrating available physiology and omics data 42,45-47 using the constraint-based method thermodynamics-based flux balance analysis 13, significant uncertainty was present in the intracellular metabolic state as indicated by the wide ranges of metabolite concentrations and metabolic fluxes.
We sampled 5000 steady-state profiles of metabolite concentrations and metabolic fluxes from this uncertain space and deployed RENAISSANCE to find the fastest possible dynamics (most negative maximum eigenvalue, $\lambda_{max}$) for each steady state (Methods, Supplementary Figure 6). We visualized the steady-state profiles by performing dimension reduction with Principal Component Analysis (PCA) 48 and t-Distributed Stochastic Neighbor Embedding (t-SNE) 49 (Methods) and colored each steady-state profile according to the obtained $\lambda_{max}$ (Figure 3a). We observed a high variation in the dynamics ($\lambda_{max}$) of the studied steady-state profiles (Figure 3c, blue distribution). Of 5000 steady-state profiles, 918 (18.4%) had $\lambda_{max}$ larger than −2.5, meaning these intracellular metabolic states could not correspond to the experimental observations. Indeed, the dynamic responses corresponding to these states have a time constant greater than 24 min, i.e., slower than the experimental observations. Inspection of the intracellular steady-state space suggested that the steady-state profiles corresponding to slow (Figure 3a, yellow dots) and fast (Figure 3a, blue dots) dynamics are locally clustered. From this observation, we hypothesized that distinct subregions corresponding to the experimental observations exist and that steady-state profiles sampled near the chosen local cluster would likely satisfy dynamic requirements.
To test this hypothesis, we selected one of these local clusters (Figure 3b), which contained 22 steady states with fast dynamics with $-8.5 \le \lambda_{max} \le -3.8$ (Figure 3c, green distribution), and analyzed its neighborhood (Figure 3d, left). We sampled 90 additional steady states within this neighborhood from a Gaussian distribution with a mean and standard deviation estimated on the initial 22 steady states. The sampled steady states allowed us to improve the resolution of the initial dynamic landscape (Figure 3d, right, circles). Crucially, the sampled steady states had linearized dynamics in the same range as the initial 22 states (Figure 3d, e), confirming our hypothesis. Indeed, RENAISSANCE allows us to select subsets of intracellular states consistent with experimentally observed dynamics and generate additional ones with the same characteristics. Moreover, it allows us to discard subregions with experimentally inconsistent states, thus reducing uncertainty.
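The neighborhood sampling described above can be sketched in a few lines. This is a simplified illustration that assumes an independent Gaussian per feature (a mean and a standard deviation for each concentration or flux); the function name `sample_neighbors` and the tiny example matrix are hypothetical, and the sketch does not re-check the thermodynamic or mass-balance feasibility that real steady-state profiles must satisfy.

```python
import random
import statistics

def sample_neighbors(states, n_samples, seed=0):
    """Draw new profiles from per-feature Gaussians fitted to `states`.

    `states` is a list of equally long profiles (rows); each column is
    one feature, e.g. a metabolite concentration or a flux.
    """
    rng = random.Random(seed)
    columns = list(zip(*states))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # needs >= 2 rows
    return [[rng.gauss(m, s) for m, s in zip(means, stds)]
            for _ in range(n_samples)]

# Hypothetical cluster of three profiles with two features each.
cluster = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
new_states = sample_neighbors(cluster, n_samples=90)
print(len(new_states), len(new_states[0]))
```

In the setting of the text, `cluster` would hold the 22 fast steady states and each sampled profile would then be scored by its $\lambda_{max}$.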
We next examined individual metabolite concentrations of the 5000 steady-state profiles to identify patterns corresponding to the experimentally observed phenotype. We observed a clear bias in the dynamics depending on the concentrations for some of the metabolites (Figure 3f, Supplementary Figure 7). For example, in the case of 3-phosphoglyceric acid (3pg), we obtain models with relevant dynamics only when the concentration of this metabolite is less than ∼0.002. In contrast, steady-state profiles with 3pg concentrations between 0.002 and 0.003 do not have relevant dynamics (Figure 3f). To investigate this further, we identified 30 cytosolic metabolites that showed such concentration biases by visual inspection (Supplementary Figure 7) and sampled 40 new steady states from the same Gaussian distribution as before (Figure 3d, left) but constrained the selected 30 metabolites to concentration ranges that do not support relevant dynamics (e.g., peach-shaded region in Figure 3f). As expected, almost all of these new intracellular states did not yield models with relevant dynamics (Figure 3f, right and 3g). This result demonstrates that information stemming from the dynamic responses can be used to constrain values of intracellular metabolites to specific ranges.
Overall, dynamic characterization of a broad range of intracellular states allows us to reduce uncertainty at the level of steady-state profiles and individual metabolite concentrations and metabolic fluxes.

Integration and reconciliation of experimental information
Experimentally measured Michaelis constants, $K_M$s, are curated in comprehensive databases like BRENDA 50. However, as we transition to large genome-scale kinetic models, a vast majority of the associated kinetic parameters remain unknown. Integrating experimental results from in vivo and in vitro studies, despite the disparities in their parameter values, can help further constrain uncertainty and lead to a more accurate description of intracellular metabolic states. To this end, we retrieved experimentally measured values for 108 out of 384 $K_M$s in our model from BRENDA (Methods). To investigate how the integrated kinetic data constrain unknown kinetic parameters, we started by integrating 4 $K_M$ values of aconitase (ACONTa, b) from the citric acid cycle (Figure 4a, Methods), obtained generators with a high incidence of valid models (>99%), and generated 500 valid kinetic models (Supplementary Figure 8). To quantify the effect of integrating one experimental $K_M$ value on the generated values of the other kinetic parameters, we compared the estimates of the other $K_M$s and maximum velocities, $v_{max}$, with ones obtained when no kinetic parameters were integrated.
We next enquired if RENAISSANCE improves its $K_M$ estimates as the number of integrated

9). These findings indicate that the integration of experimental information may improve prediction accuracy beyond the subsystem level.
Inspecting the distributions of the generated $K_M$s that were not part of the TCA subsystem revealed that the predictions for a vast majority of these $K_M$s (85 out of 91) improved upon the integration of TCA

We further examined the impact of integrating experimental kinetic data on parameters that lack verifiable experimental measurements, which accounted for 276 out of 384 $K_M$s. To obtain a qualitative assessment of the effects of integration, we employed PCA 48 to visualize the RENAISSANCE predictions for these unknown $K_M$s (Figure 5c). The analysis revealed notable shifts in the estimates of these $K_M$s when experimental data were integrated compared to the case where no data were integrated (Fig 5c, blue cluster). Additionally, the estimates for the cases with integrated experimental data exhibited greater similarity than those without integration.
These results suggest that integrating experimental kinetic information reduces quantitative uncertainties in the intracellular metabolic state of the cell, allowing RENAISSANCE to make more informed predictions on the dynamic properties of the entire metabolic network. We anticipate that the inclusion of new experimental data and its subsequent integration will enhance the predictive capabilities of RENAISSANCE even further.

Discussion
Metabolism plays a defining role in shaping the overall health of living organisms. A reprogrammed or altered metabolism is not only associated with the most common causes of death in humans (cancer, stroke, diabetes, heart disease, and others) but is also related to many congenital diseases 51. Thus, a better understanding of metabolic processes is crucial to accelerate the development of new drugs, personalized therapies, and nutrition. Biotechnological advances like the bioproduction of industrially essential compounds and environmental bioremediation also hinge on our ability to describe cellular metabolism accurately.
Kinetic models provide the most thorough mathematical representation of metabolism. The efficient construction of these models will open new possibilities for various biomedical and biotechnological applications. However, acquiring the parameters of these models with traditional kinetic modeling approaches is computationally expensive and arduous 15,32. Several machine learning methods were recently proposed for more efficient kinetic model generation, including iSCHRUNK

As RENAISSANCE is agnostic to the nature, range, and number of the parameters it needs to generate, it is straightforward to make adjustments in the framework to meet the specific demands the models need to satisfy. The parameters this framework can handle are not restricted to Michaelis constants only and can include other kinetic parameters, such as enzyme saturations 37 and enzyme states 54, and other unknown quantities in the studied system, such as metabolite concentrations.
Crucially, given proteomic data, RENAISSANCE can predict unknown enzyme turnover number, $k_{cat}$, values and consolidate them with the experimentally measured $k_{cat}$ values from databases such as BRENDA and SABIO-RK 55. As such, it represents a valuable complement to current machine learning methods that estimate $k_{cat}$ values directly 56-58.
In summary, we provide a fast and efficient framework that leverages machine learning to generate biologically relevant kinetic models. The open-access code of RENAISSANCE will enable experimentalists and modelers to apply this framework to their metabolic system of choice and integrate a broad range of available data.

E. coli model structure and data integration
The studied metabolic network included central carbon pathways of E. coli such as glycolysis, pentose phosphate pathway (PPP), tricarboxylic acid cycle (TCA), anaplerotic reactions, the shikimate pathway, and glutamine synthesis, and had a lumped reaction for growth generated using lumpGEM 59. The resulting model structure had 113 mass balances, including one for biomass accumulation, involving 123 reactions. The kinetic mechanism for each reaction was assigned based on the reaction stoichiometry. The overall model was characterized by 507 kinetic parameters consisting of 384 $K_M$s and 123 $v_{max}$ values (Supplementary Figure 4).
We integrated different data types into the model to create a context-specific model. We integrated exo-fluxomics and exo-metabolomics data, such as the growth rate, uptake rates, and extracellular concentrations of different medium components, from an earlier experimental study 42. We used data from other experimental works to impose constraints on the ranges of intracellular concentrations for different metabolites in E. coli 47. Additionally, we imposed constraints on thermodynamic variables calculated using the Group Contribution Method 45,46 to ensure that any sampled flux directionalities and metabolite concentrations were consistent with the second law of thermodynamics.
After defining the model structure and integrating the available data, we sampled 5000 sets of steady-state profiles consistent with the integrated data using thermodynamics-based flux balance analysis implemented in the pyTFA tool 60. Each steady-state profile comprises metabolite concentrations, metabolic fluxes, and thermodynamic variables. Once these profiles are available, we can generate kinetic models around these steady states 15,28,34,37-40 using the RENAISSANCE framework.

Determining the validity of kinetic models
Herein, we consider a kinetic model valid (biologically relevant) if all time constants of the model response are consistent with the experimental observations of the studied organism. The time constant defines the time required for the system response to decay to $e^{-1} \approx 36.8\%$ of its initial value. To test the model's time constants, we compute the Jacobian of the dynamic system formed by the model 37. The dominant time constant of the linearized system is defined as the inverse of the magnitude of the real part of the largest eigenvalue of the Jacobian. The dominant time constants allow us to characterize the model dynamics: small time constants characterize fast metabolic processes such as glycolysis and the electron transport chain. In contrast, polymerization processes involving the synthesis of DNA, RNA, and proteins typically occur at slower time scales. Additionally, the sign of the Jacobian eigenvalues provides information on the local stability of the generated models, where a model is locally stable if the real parts of all eigenvalues are negative. To ensure that a perturbation of the metabolic processes settles within 1% of the steady state before cell division, the dominant time constants of the model response should be five times faster than the cell's doubling time 44. The biochemical response should also have a characteristic time slower than the timescale of proton diffusion within the cell 32. With these properties, models can reliably describe the experimentally measured metabolic responses. The doubling time of the E. coli strain used in this study is $t_{doubling} = 134$ min, which corresponds to a growth rate of $\ln 2 / t_{doubling} \approx 0.31\ \mathrm{h}^{-1}$.
Therefore, the dominant time constant of the model's responses should be smaller than one-fifth of the doubling time (26.8 min). Here, we imposed a stricter dominant time constant of 24 minutes, corresponding to an upper limit of $\mathrm{Re}(\lambda_i) < -2.5$ (or $-60/24$), on the real parts of the eigenvalues, $\lambda_i$, of the Jacobian. All kinetic parameter sets resulting in the model obeying this constraint are labeled valid and the rest invalid.
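The arithmetic behind the validity criterion can be checked with a short sketch. It assumes that the Jacobian eigenvalues are expressed in 1/h (implied by the $-60/24$ conversion) and uses a toy 2×2 Jacobian solved in closed form; the function names `eigenvalues_2x2` and `is_valid` are hypothetical, and a real model with 113 states would need a numerical eigenvalue solver instead.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def is_valid(eigs, doubling_time_min=134.0, tau_max_min=24.0):
    """Valid if the slowest mode has a time constant below tau_max_min."""
    # The imposed limit must itself respect the one-fifth rule (24 < 26.8 min).
    assert tau_max_min < doubling_time_min / 5.0
    lam_max = max(e.real for e in eigs)      # largest real part, in 1/h
    return lam_max < -60.0 / tau_max_min     # threshold of -2.5 1/h

print(134.0 / 5.0)     # one-fifth of the doubling time, in minutes
print(-60.0 / 24.0)    # eigenvalue threshold for a 24 min time constant, in 1/h
print(math.exp(-5.0))  # residual after five time constants, below 1%

fast = eigenvalues_2x2(-3.0, 0.0, 0.0, -4.0)  # modes at -3 and -4 1/h
slow = eigenvalues_2x2(-1.0, 0.0, 0.0, -4.0)  # slowest mode at -1 1/h
print(is_valid(fast), is_valid(slow))
```

The third printed value illustrates why a factor of five is used: after five time constants an exponentially decaying perturbation has fallen to less than 1% of its initial size.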

Assigning rewards to determine fitness in RENAISSANCE
RENAISSANCE uses Natural Evolution Strategy, NES (Supplementary Note 1), to optimize the weights of the generator network. However, to calculate the local gradient estimate, NES also requires an objective function, $F$, to evaluate the fitness of each generator network, $G$. In our study, we use the incidence of the generator, $I(G)$, as the objective function, which is defined as the fraction of the generated models that are relevant ($0 \le I(G) \le 1$). Thus, generator networks with a higher incidence of relevant models are 'fitter' than those with low incidence and have a higher weight in determining the parameters of the seed generator network for the next generation. In many cases, we observed that initially the generator neural networks do not generate any relevant models, and thus the optimization does not proceed as the fitness is always 0. To mitigate this, we added a sigmoidal term defined as follows,

$$S = \frac{0.01}{1 + e^{(\lambda_{fastest} - \lambda_{partition})}}$$

where $\lambda_{fastest}$ corresponds to the smallest maximal eigenvalue of the generated models and $\lambda_{partition}$ is the maximal eigenvalue partition that determines the relevancy of the kinetic model. In this study, $\lambda_{partition} = -2.5$ (see previous section). This term rewards generators that generate models with dynamics closer to the relevant range more than those which generate models with slower, irrelevant, or unstable dynamics. This effectively pushes the optimization process toward finding generators that generate relevant models. So, the overall reward, $R$, for a generator, $G$, can be summarized as

$$R(G) = \begin{cases} S, & I(G) = 0 \\ I(G), & I(G) > 0 \end{cases}$$

For the large-scale analysis of intracellular states (Fig. 3), the fitness for NES was no longer the incidence of the generators but the fastest possible dynamic for the models generated by a given generator. Thus, the reward was changed suitably as follows,

$$R = 0.5\, e^{-0.1 \lambda_{mean}}$$

where $\lambda_{mean}$ is the mean of the 10 fastest maximum eigenvalues (Supplementary Figure 6) generated by a generator (out of 100 for this case study). This reward function ensured that the generators which generated models with more negative maximum eigenvalues (faster linearized dynamics, $\lambda_{max}$) were rewarded more than the others.
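The incidence-based reward with its sigmoidal fallback can be sketched as below. The sketch follows one reading of the definitions above: the reward equals the incidence when at least one generated model is relevant, and otherwise equals $0.01 / (1 + e^{\lambda_{fastest} - \lambda_{partition}})$. The function name `reward` and the overflow guard are assumptions added for illustration, not part of the published code.

```python
import math

def reward(max_eigenvalues, partition=-2.5):
    """Incidence if any model is relevant, else a small sigmoidal shaping term."""
    n_valid = sum(1 for lam in max_eigenvalues if lam < partition)
    if n_valid > 0:
        return n_valid / len(max_eigenvalues)
    # No relevant model: score how close the fastest model is to the partition.
    lam_fastest = min(max_eigenvalues)  # smallest (most negative) maximal eigenvalue
    exponent = min(lam_fastest - partition, 700.0)  # guard against overflow
    return 0.01 / (1.0 + math.exp(exponent))

print(reward([-3.0, -1.0]))  # one of two models is relevant
print(reward([-1.0, 4.0]))   # none relevant, small shaping reward
print(reward([-2.0, 4.0]))   # still none relevant, but closer to the partition
```

Because the shaping term never exceeds 0.005 when no model is relevant, it stays well below the smallest nonzero incidence of a 100-model batch (0.01), so it can guide early generations without competing with the incidence itself.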

Hyperparameter tuning of RENAISSANCE
RENAISSANCE has several hyperparameters that can be tuned to achieve the desired objective (Supplementary Notes 3 and 4). In this study, the hyperparameters used are as follows: the population size of the generator networks, 20; the noise level in generating the agent population from the mean optimal weights in each generation, $10^{-2}$; the learning rate of the gradient step, $10^{-3}$; and the decay rate of learning, 5%. In addition, the generated $K_M$s were constrained strictly between $1.3 \times 10^{-11}$ and 20 to accurately represent experimentally measured $K_M$ values as curated in the BRENDA database 50. The hyperparameters of the neural networks are listed below.

Dimension reduction and visualization of steady states
To generate Fig. 3a, d, f (left), the following steps were followed: (I) The steady-state matrix (consisting of 1127 features) was subjected to principal component analysis (PCA) 48. (II) The PCA components that contributed to over 99% of the total explained variance were reduced to 2 dimensions using t-SNE 49. (III) The two t-SNE components were then subjected to a polar coordinate transformation, and the two transformed coordinates were plotted to generate the figures.

Figure 1 | Overview and applications of the RENAISSANCE framework. a, A Natural Evolution Strategy (NES) algorithm iteratively generates candidate solutions for an optimization problem based on their assigned fitness scores until satisfactory solutions are obtained. b, Context-specific structural properties of the metabolic networks are established and incorporated into the model. c, Once the model structure is fixed, available omics data are integrated into the model. d, Generators for parameterizing biologically relevant (valid) kinetic models are optimized iteratively in four steps to meet the design objective: a population of generators is randomly initialized (step I); generators produce parameters needed to parameterize kinetic models (step II); the fitness of

Fig. 2 | Generation, validation, and application of RENAISSANCE-parameterized kinetic models. a, The incidence of models exhibiting the desired dynamic properties increases with the number of generations, as indicated by the mean incidence (black line) and the maximum and minimum incidence (green-shaded region) observed over 10 statistical repeats for every generation. The dashed line indicates the incidence of a repeat with fast convergence. The black diamonds indicate the generators selected for subsequent analysis from that repeat. b, The distribution of the maximum eigenvalues ($\lambda_{max}$) for the generated models over generations. The vertical dashed lines indicate $\lambda_{max} = -2.5$ (left) and $\lambda_{max} = 0$ (right). Robustness analysis: c, The time evolution of the normalized perturbed biomass, $x(t)/x_{ref}$, (upper left) and concentrations, $(x(t) - x_{ref})/x_{ref}$, Nicotinamide

Fig. 3 | Dynamic characterization reduces uncertainty in intracellular metabolic states. a, The two-dimensional representation of the fastest linearized dynamic modes (corresponding to the maximum eigenvalue $\lambda_{max}$) of 5000 intracellular steady states (reaction fluxes and metabolite concentrations) obtained with Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) 49 (Methods). Each point represents a steady state, with its color indicating the corresponding $\lambda_{max}$ value computed by RENAISSANCE. b, Magnified view of 22 neighboring steady states with fast dynamics ($-8.5 \le \lambda_{max} \le -3.8$). The color scheme is
experimental $K_M$ values increases. We also examined how the localization of integrated $K_M$ values, such as the integration of $K_M$ values from the citric acid cycle, affects the estimation of $K_M$ values in other subsystems of the metabolic network. Specifically, we integrated 10 random combinations of half (9) of the 17 available experimentally measured $K_M$ values associated with the citric acid cycle (TCA) of E. coli, one combination at a time. For each of the 10 combinations, we obtained generators with a high incidence of valid models (>90%) and generated 2000 of these models. In total, we generated 20,000 models containing 10 distinct combinations of the remaining 8 Michaelis constants to be estimated. This process ensured that each of the 17 Michaelis constants was integrated at least once and estimated at least once within the 10 combinations. The comparison between the experimentally observed and the RENAISSANCE-estimated range of TCA $K_M$ values, quantified through the overlap score (OS) between these two ranges (Fig 5d), showed that integrating $K_M$s improves the estimates of the non-integrated individual $K_M$s within the same subsystem (Fig 5a left, red bars, Supplementary Figure 9), compared to when no $K_M$ values are integrated (black diamonds). Indeed, significant improvement was observed in the predictions of 16 out of the 17 $K_M$ values in TCA when experimental values of $K_M$ were integrated. The average prediction accuracy for the entire subsystem also increased (Fig 5a right, red bars) compared to the case with no integration of experimental $K_M$s (blue bars). A similar analysis was conducted for other subsystems, Pentose Phosphate Pathway (PPP), Glycolysis (gly), Anaplerotic reactions (anpl), Shikimate pathway (shkk), and Pyruvate metabolism (pyr), and consistently, estimates of $K_M$ values within the same subsystem improved upon the integration of experimental information for all the cases (Fig 5a, right, Supplementary Figure

$K_M$s (Fig 5b, upper left, colored bars) compared to the case where no $K_M$s were integrated (black diamonds). Similarly, the mean overlap score (OS) of the entire set increased (Fig 5b, upper right). We then examined the top 15 $K_M$s that exhibited the most significant improvement in their estimates and determined the metabolic subsystem in which they are located. The integration of experimental $K_M$ values from TCA yielded the most significant improvement in the estimates of the shikimate pathway (6 in the top 15), followed by glycolysis (3 out of 15) and anaplerotic reactions (2 out of 15) (Fig 5b, leftmost donut plot). Interestingly, a similar analysis conducted by integrating $K_M$s from other subsystems showed that the estimates from these three subsystems (shkk, gly, and anpl) consistently yielded the most significant improvement (Fig 5b, Supplementary Figure 10). These results provide evidence that RENAISSANCE effectively incorporates experimental kinetic data from a specific subsystem of the metabolic network, resulting in improved parameter estimates across the entire network.

Fig. 5 | Integrating experimental kinetic information improves other parameter estimates. a, Left: The overlap scores (OS, defined in d) of RENAISSANCE estimates for 17 experimentally available Michaelis

52,53 and REKINDLE 32. REKINDLE, in particular, has demonstrated remarkable gains in model generation efficiency by using generative adversarial networks (GANs) 53. Nevertheless, it required existing kinetic modeling approaches to create the data needed for the GAN training. The RENAISSANCE framework proposed herein retains the model generation efficiency of REKINDLE without needing training data. RENAISSANCE can achieve more than 90% incidence of valid models within 10 to 20 minutes of computational time on a standard workstation. Once trained, the generators generate valid models at a rate of ~1 million valid models in 18 seconds, making it 150-200 times faster than traditional sampling-based kinetic frameworks. RENAISSANCE also does not require specialized hardware to execute. The proof-of-concept applications shown here demonstrate RENAISSANCE's applicability to a broad range of studies. In this work, we deployed RENAISSANCE to parameterize valid models of metabolism consistent with an experimentally observed steady state, with validity being characterized by the biological relevance of their timescales. However, conceptually, any other requirement can be imposed or data used, such as consistency with knockout studies or time series from drug absorption trials.