Advances in the integration of transcriptional regulatory information into genome-scale metabolic models

A major goal of systems biology is to build predictive computational models of cellular metabolism. The availability of complete genome sequences and a wealth of legacy biochemical information has led, over the last 15 years, to the reconstruction of genome-scale metabolic networks for several organisms across the three domains of life. Owing to the paucity of information on kinetic parameters associated with metabolic reactions, the constraint-based modelling approach, flux balance analysis (FBA), has proved to be a vital alternative to investigate the capabilities of reconstructed metabolic networks. In parallel, the advent of high-throughput technologies has led to the generation of massive amounts of omics data on transcriptional regulation comprising mRNA transcript levels and genome-wide binding profiles of transcriptional regulators. A frontier area in metabolic systems biology has been the development of methods to integrate the available transcriptional regulatory information into constraint-based models of reconstructed metabolic networks in order to increase the predictive capabilities of computational models and understand the regulation of cellular metabolism. Here, we review the existing methods to integrate transcriptional regulatory information into constraint-based models of metabolic networks.


Introduction
Extensive research by biochemists in the last century has resulted in the chemical characterization of thousands of biochemical reactions (Kanehisa and Goto, 2000; Chang et al., 2009). Towards the end of the 20th century, complete genome sequences became [...]

bioRxiv preprint, doi: https://doi.org/10.1101/053520; this version posted May 15, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

[...] constraint-based FBA models of genome-scale metabolic reconstructions can lead to more accurate models. However, there are several challenges in integration of omics data stemming from the inherent experimental and biological noise in such datasets (Quackenbush, 2004). Nonetheless, several constraint-based methods have been developed to integrate experimental data, especially on transcriptional regulation and gene expression, within the FBA framework to build improved models (Åkesson et al., 2004; Covert et al., 2004; Becker and Palsson, 2008; Blazier and Papin, 2012; Hyduke et al., 2013). In this review, we discuss the existing methods to integrate regulatory information into constraint-based FBA models by broadly classifying them into three different approaches (Covert et al., 2001; Åkesson et al., 2004; Covert et al., 2004; Becker and Palsson, 2008; Chandrasekaran and Price, 2010; Blazier and Papin, 2012; Hyduke et al., 2013; Kim and Reed, 2014). Such methods have already proven successful in building context-specific metabolic models for human tissues and predicting novel drug targets in pathogens (Becker and Palsson, 2008; Folger et al., 2011; Bordbar et al., 2012; Collins et al., 2012).
The review is organized as follows. In the second section, we describe the constraint-based FBA framework. In the third section, we discuss existing methods to integrate omics data within the FBA framework as additional flux constraints to build context-specific metabolic models. In the fourth section, we describe the reconstruction and analysis of integrated regulatory-metabolic models where Boolean transcriptional regulatory networks (TRNs) are incorporated within the FBA framework. In the fifth section, we discuss the need for automated methods to integrate information on regulatory network architecture and expression measurements within metabolic networks to reconstruct integrated regulatory-metabolic models. Note that previous reviews in this area emphasize only methods that are descriptive in nature, which are presented in section 3 of this review. In comparison to previous reviews, we provide a more comprehensive overview of the area by also describing in detail, in sections 4 and 5, the methods that are predictive rather than just descriptive in nature.

Flux balance analysis
Flux balance analysis (FBA) is a constraint-based modelling approach that is widely used to investigate the capabilities of available genome-scale metabolic networks (Varma and Palsson, 1994; Kauffman et al., 2003; Price et al., 2004; Orth et al., 2010; Lewis et al., 2012). FBA primarily uses the information on the list of biochemical reactions in an organism along with the stoichiometric coefficients of involved metabolites to predict the fluxes of all reactions in the metabolic network. Such biochemical information is contained within available organism-specific genome-scale metabolic reconstructions. For any organism, the genome-scale metabolic reconstruction contains information on all known metabolic reactions and genes encoding enzymes catalysing different reactions in the network (Palsson, 2006) (Fig. 1A). Notably, genome-scale metabolic reconstructions for most organisms also include reactions for transport of metabolites across the cell boundary, and a pseudo-reaction capturing the production of biomass in terms of its precursor metabolites (Fig. 1A).
In the FBA framework, the list of reactions along with the stoichiometric coefficients of involved metabolites in a network reconstruction is mathematically represented in the form of a stoichiometric matrix $\mathbf{S}$ of dimensions $m \times n$, where $m$ denotes the number of metabolites and $n$ denotes the number of reactions in the network (Fig. 1B). Entries in each column of the matrix $\mathbf{S}$ give the stoichiometric coefficients of metabolites participating in a particular reaction, where negative coefficients signify consumption of a metabolite, positive coefficients signify production of a metabolite, and zero coefficients signify no participation of a metabolite in the reaction (Fig. 1B). These stoichiometric coefficients of metabolites in various reactions impose constraints on the flow of metabolites in the network (Heinrich and Schuster, 1996; Schilling et al., 1999; Palsson, 2006).
Subsequently, the method capitalizes on these stoichiometric constraints and assumes steady state to predict the fluxes of all reactions in the network.
In any metabolic steady state, different metabolites attain a mass balance wherein the rate of production of each metabolite is equal to its rate of consumption, and this leads to the system of mass balance equations given by:

\[ \mathbf{S}\,\mathbf{v} = \mathbf{0} \tag{1} \]

where $\mathbf{v}$ is the vector of fluxes through all reactions in the network (Fig. 1B). For each metabolite in the network, Eq. 1 gives a linear equation relating fluxes of various reactions in which the metabolite participates (Fig. 1B). Since the number of metabolites is much less than the number of reactions in genome-scale metabolic networks of most organisms, the number of linear equations is much less than the number of reaction fluxes (unknowns) to be determined. Thus, Eq. 1 typically leads to an under-determined system of linear equations, and a large solution space of allowable fluxes for genome-scale metabolic networks (Fig. 1B,C).
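As a minimal illustration of the mass balance in Eq. 1, the following Python sketch checks whether candidate flux vectors satisfy the steady-state condition for a toy network. The network (two metabolites A and B, four reactions), the function name `mass_balance` and the flux values are hypothetical and chosen only for illustration; they are not taken from any published reconstruction.

```python
# Hypothetical toy network with m = 2 metabolites and n = 4 reactions:
#   R1: -> A,  R2: A -> B,  R3: A -> B (alternate route),  R4: B ->
# Rows are metabolites, columns are reactions; negative entries mean consumption.
S = [
    [1, -1, -1, 0],   # metabolite A
    [0, 1, 1, -1],    # metabolite B
]

def mass_balance(S, v):
    """Return S*v, the net production rate of each metabolite for flux vector v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

# Two different flux vectors that both satisfy the steady-state condition,
# showing that the system is under-determined (2 equations, 4 unknowns).
v1 = [10, 10, 0, 10]
v2 = [10, 4, 6, 10]

# A flux vector that violates mass balance: metabolite A accumulates.
v3 = [10, 5, 0, 5]

print(mass_balance(S, v1))
print(mass_balance(S, v2))
print(mass_balance(S, v3))
```

Both `v1` and `v2` are balanced while routing flux differently through R2 and R3, which is the kind of degeneracy that additional bounds and an objective function are meant to resolve.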
The size of the allowable space can also be reduced by incorporating additional constraints on reaction fluxes. Firstly, certain reactions in the metabolic network are irreversible under physiological conditions, and such thermodynamic constraints (Beard et al., 2002; Orth et al., 2010) can be used to constrain the flux of irreversible reactions. Secondly, the activity of specific enzymes may limit the flux through certain reactions. Thirdly, the availability of nutrients in the growth medium can be used to constrain the fluxes of transport reactions. Note that unlike stoichiometric or mass-balance constraints, these additional constraints represent bounds on reaction fluxes in the metabolic network (Fig. 1B).
Since stoichiometric and additional constraints lead to an under-determined system with a large space of possible solutions, FBA uses linear programming (LP) to find a particular solution within the allowable solution space that either maximizes or minimizes a certain biologically relevant linear objective function $Z$ (Watson, 1984; Fell and Small, 1986; Varma and Palsson, 1994) (Fig. 1C). Some examples of biologically relevant objective functions that have been explored using FBA include maximization of biomass production, maximization of ATP production and minimization of redox potential (Schuetz et al., 2007). However, maximization of biomass production (Feist and Palsson, 2010) is usually chosen as the objective function in FBA. The LP formulation of FBA can be written as:

\[ \begin{aligned} \text{maximize} \quad & Z = v_{\mathrm{BIOMASS}} \\ \text{subject to} \quad & \mathbf{S}\,\mathbf{v} = \mathbf{0}, \\ & \mathbf{lb} \le \mathbf{v} \le \mathbf{ub} \end{aligned} \tag{2} \]

where $v_{\mathrm{BIOMASS}}$ is the objective function corresponding to the biomass growth flux, and vectors $\mathbf{lb}$ and $\mathbf{ub}$ contain the lower and upper bounds on different reaction fluxes contained in $\mathbf{v}$. The solution of the FBA problem in Eq. 2 is a flux distribution $\mathbf{v}$ that maximizes the biomass objective function subject to the stoichiometric and additional constraints. FBA has found wide applications which include microbial strain improvement for producing industrially important metabolites, understanding molecular pathways involved in microbial pathogenesis, identification of drug targets, understanding reductive evolution in microbes and generating tissue-specific metabolic models for normal and cancerous tissues (Burgard et al., 2003; Pal et al., 2006; Becker and Palsson, 2008; Gianchandani et al., 2010; Folger et al., 2011; Bordbar et al., 2012; Collins et al., 2012).
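The LP formulation of FBA can be made concrete with a small numerical sketch. The example below assumes a hypothetical four-reaction toy network and uses `scipy.optimize.linprog` as a generic LP solver; the network, bounds and variable names are illustrative assumptions rather than part of any genome-scale model. Because `linprog` minimizes its objective, the biomass flux is maximized by minimizing its negative.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:
#   R1: -> A (uptake),  R2: A -> B,  R3: A -> C,  R4 (biomass): B + C ->
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  0.0, -1.0],   # metabolite B
    [0.0,  0.0,  1.0, -1.0],   # metabolite C
])

# Bounds on fluxes: all reactions irreversible, uptake limited to 10 units.
lb = np.zeros(4)
ub = np.array([10.0, 1000.0, 1000.0, 1000.0])

# Objective: maximize the biomass flux (last reaction), written as a minimization.
c = np.array([0.0, 0.0, 0.0, -1.0])

result = linprog(
    c,
    A_eq=S,                       # steady-state mass balance S v = 0
    b_eq=np.zeros(S.shape[0]),
    bounds=list(zip(lb, ub)),     # lb <= v <= ub
    method="highs",
)

v_opt = result.x       # optimal flux distribution
growth = v_opt[3]      # biomass flux at the optimum
print(v_opt, growth)
```

In this toy case the uptake bound is the only limiting constraint, so the optimal biomass flux is fixed by the stoichiometry of the path from uptake to biomass.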

Integration of transcriptomic profiles to build context-specific metabolic models
FBA predicts the fluxes of reactions and the biomass production rate without accounting for the regulatory constraints that are crucial in determining the presence and activity of enzymes in an environmental condition (Orth et al., 2010). This omission of regulatory constraints is an important limitation of FBA (Covert and Palsson, 2002, 2003; Åkesson et al., 2004) and can partly explain incorrect predictions by this method on gene essentiality (Covert et al., 2004) and gene interactions (Szappanos et al., 2011). The metabolic state of a cell in a given condition is governed by the expression of metabolic genes encoding enzymes. Thus, gene expression measurements obtained from microarrays or RNA-sequencing (RNA-seq) provide vital information on the state of the regulatory network and its influence on the metabolic network. With the drastic reduction in the cost of microarrays and RNA-seq, these high-throughput technologies are being increasingly used to generate abundant information on the expression of genes for several organisms under varied environmental conditions. Such data can be readily exploited to understand the condition-specific activity and regulation of metabolism.
Towards this goal, several constraint-based methods have been developed in the last few years to integrate gene expression data within the FBA framework to generate context-specific metabolic models (Blazier and Papin, 2012; Lewis et al., 2012; Hyduke et al., 2013; Kim and Reed, 2014; Machado and Herrgård, 2014; Saha et al., 2014; Imam et al., 2015). The available methods can be broadly classified into three categories (Estévez and Nikoloski, 2014), namely, switch- and valve-based methods that use omics data to determine the active and inactive sets of genes in a given condition, methods that generate context-specific metabolic models without the need for a predefined biological objective function, and methods based on iterative removal of inferred non-functional reactions to build tissue-specific metabolic models.
In the first category of methods, transcriptomic data is used to determine the set of active and inactive (absent) genes in a given condition (Fig. 2). Next, the information on active and inactive genes is used to set bounds on reaction fluxes catalysed by associated gene products (enzymes) within the FBA framework before computing the flux distribution and biomass rate. The treatment of bounds on reaction fluxes catalysed by active and inactive enzymes varies between different methods in this category, and this leads to a further classification into two sub-categories (Hyduke et al., 2013) (Fig. 2). In the sub-category of switch-based methods, the upper bound for the maximum flux through reactions catalysed by active enzymes is left unconstrained while that for reactions catalysed by inactive enzymes is set to zero before computing the flux distribution using FBA (Fig. 2). In the sub-category of valve-based methods, the upper bound for the maximum flux through reactions catalysed by enzymes is set proportional to the normalized expression of the associated genes before computing the flux distribution using FBA (Fig. 2). All these methods rest on the assumption that gene expression is correlated with reaction fluxes, which may not necessarily hold (Gygi et al., 1999; ter Kuile and Westerhoff, 2001; Yang et al., 2002). Moreover, omics measurements suffer from a lack of sensitivity leading to false negatives in the identification of active and [...] and Palsson, 2008; Schmidt et al., 2013) have also been utilized to incorporate other types of omics data, such as proteomics and metabolomics, within the FBA framework.
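The difference between the switch-based and valve-based treatment of flux bounds can be sketched in a few lines of Python. This is a simplified illustration under several assumptions that are not part of any specific published method: each reaction is linked to a single gene, a fixed expression threshold separates active from inactive genes, and expression is normalized by the maximum observed value. The function names and the example data are hypothetical.

```python
def switch_bounds(expression, ub, threshold):
    """Switch-based: keep the original upper bound for reactions whose gene is
    called active (expression >= threshold); set it to zero otherwise."""
    return {r: (ub[r] if expression[r] >= threshold else 0.0) for r in ub}

def valve_bounds(expression, ub):
    """Valve-based: scale each upper bound in proportion to the expression of
    the associated gene, normalized by the maximum expression (an assumption)."""
    max_expr = max(expression.values())
    return {r: ub[r] * expression[r] / max_expr for r in ub}

# Hypothetical expression values for the genes associated with three reactions.
expression = {"R1": 8.0, "R2": 0.5, "R3": 4.0}
ub = {"R1": 10.0, "R2": 10.0, "R3": 10.0}

switched = switch_bounds(expression, ub, threshold=1.0)
valved = valve_bounds(expression, ub)
print(switched)
print(valved)
```

The resulting bounds would then replace the upper bounds in the FBA problem before the flux distribution is computed. Note that the switch-based variant discards all quantitative information above the threshold, whereas the valve-based variant depends directly on the assumed proportionality between expression and maximal flux.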
In [...] recently proposed [...] which, unlike INIT and iMAT, enables the user to also specify a set of required metabolic functions to be satisfied by the final context-specific metabolic model (Agren et al., 2014). Thus, tINIT is a MILP based method which also shares principles with the first category of LP based methods.
The third category includes methods that were envisaged to build mammalian tissue-specific metabolic models. These methods integrate diverse omics data and literature based annotations to define two sets of reactions, core and non-core, for a specific tissue type. Reactions in the core set are highly likely to be functional while those in the non-core set are unlikely to be functional in the tissue under consideration. After defining the core and non-core set of reactions for a given tissue, these methods attempt to iteratively remove each non-core reaction subject to preserving the functionality of the core reaction set, and this leads to a tissue-specific metabolic model from a generic metabolic model. Some of the methods which fall into this third category include the model [...]

We emphasize that some of the proposed methods to integrate omics data within the FBA framework, such as metabolic adjustment by differential expression (MADE) (Jensen and Papin, 2011) and temporal expression-based analysis of metabolites (TEAM) (Collins et al., 2012), cannot be clearly assigned to any of the above mentioned three categories. MADE requires expression data from two or more successive conditions, such as a temporal expression profile, to determine differential expression of genes. Based on statistically significant changes in expression of genes between successive conditions, MADE infers highly and lowly expressed reactions for a sequence of conditions without the need for a predefined threshold. Finally, MADE solves a single MILP problem to generate a context-specific metabolic model that best recapitulates the expression dynamics across successive conditions. On the other hand, TEAM is a fusion method based on dynamic FBA (Mahadevan et al., 2002) and GIMME to integrate temporal gene expression data within the FBA framework to predict time course flux distributions. In Supplementary Table 1, we list all the different methods developed so far to integrate omics data within the FBA framework.

Incorporation of Boolean transcriptional regulatory networks to build integrated regulatory-metabolic models
In the previous section, we discussed several existing methods to integrate multi-omics data as additional flux constraints within the FBA framework to understand the condition-specific regulation of metabolism. Note that such methods are inherently descriptive rather than predictive in nature (Covert et al., 2004; Chandrasekaran and Price, 2010). That is, such methods are only able to describe the regulation of metabolism in conditions with available omics data but cannot make predictions for conditions lacking omics data.
Large-scale transcriptional regulatory networks (TRNs) have been reconstructed for several organisms including Escherichia coli (Gama-Castro et al., 2016), Bacillus subtilis (Sierro et al., 2008; Nicolas et al., 2012; Kumar et al., 2015), Saccharomyces cerevisiae (Teixeira et al., 2014) and humans (Gerstein et al., 2012) based on diverse biological datasets. Our current knowledge of these TRNs is mostly limited to the set of interactions and information on the nature of interactions (activating or repressing) between transcription factors (TFs) and their regulated genes. But inadequate information on parameters characterizing most interactions renders detailed modelling of large-scale TRNs based on differential equations infeasible (Bornholdt, 2005). Boolean networks (Kauffman, 1969a; Kauffman, 1969b; Thomas, 1973; Kauffman, 1993), an alternative approach first proposed by Stuart Kauffman, have been widely used to study the qualitative dynamics of large-scale TRNs (de Jong, 2002). In the Boolean framework, each gene in the network is in one of two states, active or inactive. The state of each gene at a given time is determined by the state of its regulating gene(s) at the previous time based on a Boolean input function. The states of genes in a Boolean network are updated either synchronously or asynchronously in a discrete time setting. Thus, Boolean networks provide a qualitative description of the dynamics of large-scale TRNs (Bornholdt, 2005).
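The synchronous update scheme of a Boolean network can be illustrated with a short Python sketch. The three-node network below (`tf_a`, `tf_b`, `gene_c`) and its Boolean input functions are hypothetical and serve only to show how the state at one time step determines the state at the next; they do not correspond to any reconstructed TRN.

```python
# Hypothetical Boolean input functions: each maps the current network state
# (a dict of gene -> bool) to the next state of one gene.
rules = {
    "tf_a": lambda s: s["tf_a"],                      # constant input signal
    "tf_b": lambda s: s["gene_c"],                    # activated by gene_c
    "gene_c": lambda s: s["tf_a"] and not s["tf_b"],  # activated by tf_a, repressed by tf_b
}

def synchronous_step(state, rules):
    """Update all genes at once, each using the state at the previous time step."""
    return {gene: bool(rule(state)) for gene, rule in rules.items()}

def simulate(state, rules, n_steps):
    """Return the list of states visited, starting from the initial state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = synchronous_step(state, rules)
        trajectory.append(state)
    return trajectory

initial = {"tf_a": True, "tf_b": False, "gene_c": False}
trajectory = simulate(initial, rules, n_steps=4)
for t, state in enumerate(trajectory):
    print(t, state)
```

With this choice of rules the negative feedback from `gene_c` through `tf_b` drives the network around a cycle, so the initial state recurs after four updates. An asynchronous scheme would instead update one gene at a time and can visit a different sequence of states.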
Integrated regulatory-metabolic models (Covert et al., 2004; Herrgard et al., 2006; Goelzer et al., 2008) have been manually reconstructed for some model organisms by unifying Boolean models of TRNs with FBA models of metabolic networks (Fig. 3). Covert et al. (Covert et al., 2004) were the first to reconstruct such a genome-scale integrated regulatory-metabolic model, iMC1010, for E. coli, which was obtained by fusing a Boolean model of the TRN with the genome-scale metabolic network iJR904. iMC1010 accounts for 1010 genes in E. coli of which 104 genes code for 103 TFs and 906 genes code for different enzymes in the metabolic network iJR904. Moreover, the activity of each gene in iMC1010 is determined by a Boolean logical function that depends on the state of its regulating TFs in the TRN. In addition, the Boolean logical functions that determine the activity of genes in iMC1010 also depend on the presence or absence of certain metabolites in the environment, the flux of metabolic reactions and certain stimuli such as heat shock, stress, etc. Since the constraint-based FBA framework assumes steady state to predict the flux distribution in a metabolic network, the method cannot predict internal metabolite concentrations. Thus, in iMC1010, the authors use the flux of internal reactions as a surrogate to model the allosteric regulation of proteins, which is dependent on internal metabolite concentrations. Similar integrated regulatory-metabolic models have also been manually reconstructed for B. subtilis (Goelzer et al., 2008) and S. cerevisiae (Herrgard et al., 2006). Note that these integrated regulatory-metabolic models can also predict the metabolic phenotype of TF perturbations.
Regulatory flux balance analysis (rFBA) (Covert et al., 2001; Covert et al., 2004) is a method that has been developed to simulate these integrated regulatory-metabolic models incorporating TRNs as Boolean networks. rFBA simulates the growth of an organism in batch cultures to predict a series of steady state flux distributions corresponding to changes in the growth environment in successive time intervals. For each time interval, rFBA determines the state of the regulatory network or activity of genes in the integrated model based on logical functions in the Boolean regulatory network where the present regulatory state is also influenced by the metabolic state in the previous time interval. Next, rFBA uses the predicted activity of genes in the current time interval to constrain the flux of associated reactions within the FBA framework and compute the steady state flux distribution or metabolic state for the current interval (Fig. 3). Note that rFBA also predicts the temporal dynamics of concentration of external substrates available for uptake, concentration of secreted by-products and cell growth. It has been shown that the application of rFBA to the integrated regulatory-metabolic model iMC1010 for E. coli can significantly increase the ability to predict knockout phenotypes across diverse environmental conditions (Covert et al., 2004).
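The iteration described above (regulatory state, then constrained FBA, then update of the environment) can be sketched as follows. This is a deliberately small illustration and not the published rFBA implementation: the one-metabolite network, the Boolean rule repressing uptake of an alternative substrate while glucose is present, the parameter values, the simple Euler update and all names (`regulatory_state`, `fba`, `rfba`, `V_MAX`, `DT`) are assumptions made for this example. `scipy.optimize.linprog` is used as a generic LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical network with one internal precursor P.
# Columns: glucose uptake (-> P), alternative uptake (-> P), biomass (10 P ->).
S = np.array([[1.0, 1.0, -10.0]])
V_MAX = 10.0   # assumed maximal uptake rate
DT = 0.1       # assumed length of each time interval

def regulatory_state(conc):
    """Hypothetical Boolean rule: the alternative-uptake gene is repressed
    for as long as glucose is present in the environment."""
    return {"glc_gene": True, "alt_gene": conc["glc"] <= 1e-9}

def fba(upper_bounds):
    """Maximize the biomass flux subject to S v = 0 and 0 <= v <= upper_bounds."""
    c = np.array([0.0, 0.0, -1.0])   # linprog minimizes, so negate biomass
    bounds = [(0.0, ub) for ub in upper_bounds]
    res = linprog(c, A_eq=S, b_eq=np.zeros(1), bounds=bounds, method="highs")
    return res.x

def rfba(conc, biomass, n_steps):
    history = []
    for _ in range(n_steps):
        # 1. Regulatory state from the environment left by the previous interval.
        genes = regulatory_state(conc)
        # 2. Flux bounds: zero if the gene is off, else limited by availability.
        ub_glc = min(V_MAX, conc["glc"] / (biomass * DT)) if genes["glc_gene"] else 0.0
        ub_alt = min(V_MAX, conc["alt"] / (biomass * DT)) if genes["alt_gene"] else 0.0
        # 3. Steady-state flux distribution for this interval.
        v = fba([ub_glc, ub_alt, 1000.0])
        # 4. Update external concentrations and biomass (simple Euler step).
        conc = {
            "glc": max(conc["glc"] - v[0] * biomass * DT, 0.0),
            "alt": max(conc["alt"] - v[1] * biomass * DT, 0.0),
        }
        biomass = biomass * (1.0 + v[2] * DT)
        history.append((genes["alt_gene"], v[0], v[1], biomass))
    return history

history = rfba({"glc": 1.0, "alt": 1.0}, biomass=0.5, n_steps=6)
for alt_on, v_glc, v_alt, x in history:
    print(bool(alt_on), round(v_glc, 3), round(v_alt, 3), round(x, 3))
```

In this toy setting the alternative substrate is taken up only after glucose is exhausted, because the Boolean rule keeps its uptake reaction switched off until then. A plain FBA model without the regulatory rule would be free to use both substrates from the first interval.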
One limitation of rFBA is that it arbitrarily chooses one optimal flux distribution as the metabolic state of the integrated model from a space of alternate optimal flux distributions at each time interval. Steady-state regulatory flux balance analysis (SR-FBA) (Shlomi et al., 2007) is a MILP based method which tries to overcome this limitation. Another limitation of using rFBA is that constraint-based methods do not predict the kinetics of internal metabolite concentrations, which is important to understand the allosteric regulation of metabolism. On the other hand, detailed kinetic models (Chassagnole et al., 2002; Bettenbrock et al., 2007) can properly account for allosteric regulation but such models are limited to specific metabolic pathways due to paucity of kinetic data. For example, such a detailed kinetic model (Bettenbrock et al., 2007) with less than 50 reactions has been built to study carbon catabolite repression in E. coli. Covert et al. (Covert et al., 2008) have developed a hybrid framework, integrated FBA (iFBA), which combines rFBA for genome-scale networks and kinetic models for smaller well characterized pathways to improve the predictive ability of integrated models.

Automated reconstruction of integrated regulatory-metabolic models
In the previous section, we reviewed manually reconstructed integrated regulatory-metabolic models where transcriptional regulatory interactions are represented as Boolean rules. However, manual reconstruction of Boolean models of TRNs from available information on interactions between TFs and their regulated genes can be a slow and tedious process. Firstly, it is extremely difficult to infer a single well-defined set of Boolean logical functions for genes in the TRN based on experimentally determined expression states in different conditions (Henry et al., 2013). Secondly, the Boolean approximation where each gene can have only two states, active or inactive, is an over-simplified view of the system that is inadequate to capture the complex regulation of enzymes in many situations. Thus, Boolean network based integrated regulatory-metabolic models have been manually reconstructed for only three organisms (Covert et al., 2004; Herrgard et al., 2006; Goelzer et al., 2008) to date.
On the other hand, large-scale sequencing projects such as ENCODE and modENCODE (Gerstein et al., 2010; modENCODE Consortium et al., 2010; Gerstein et al., 2012) have generated vast amounts of regulatory information for humans and model organisms, mainly in the form of RNA-seq and ChIP-seq data in diverse biological contexts. Moreover, with the rapid fall in the cost of sequencing, experimental biologists are routinely generating RNA-seq and ChIP-seq data for their organisms of interest. RNA-seq data gives information on mRNA abundances or gene expression in a given condition, and ChIP-seq data gives information on the genome-wide binding profile of TFs or transcriptional regulatory interactions. A combination of ChIP-seq and RNA-seq data can be used to reconstruct the TRN of an organism, and thus, such information can be incorporated within genome-scale metabolic reconstructions to build integrated regulatory-metabolic models (Fig. 4). Given the slow pace of the manual reconstruction process, automated approaches combining information on regulatory network architecture and gene expression data are best suited to address the challenge of turning the deluge of high-throughput data into predictive systems biology models.
In this direction, probabilistic regulation of metabolism (PROM) (Chandrasekaran and Price, 2010) is the only automated method to date. PROM combines information on TRN architecture and gene expression data within the FBA framework to build an integrated model. In PROM, the set of interactions between TFs and target genes in the TRN is used along with gene expression data across multiple conditions to predict the likelihood of the expression of a target gene given the expression state of the controlling TF. Such probabilities are then used to constrain the maximum allowable flux through reactions associated with the target genes before computing the flux distribution using FBA. Thus, PROM can predict the consequences of TF perturbations, unlike methods in the first approach. Also, the prediction accuracy achieved by PROM was similar to that of the manually reconstructed integrated regulatory-metabolic model of E. coli.
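The core idea of using conditional probabilities to constrain fluxes can be illustrated with a simplified Python sketch. It assumes that expression data have already been binarized into ON/OFF states per condition, estimates the probability that a target gene is ON given that its TF is OFF, and scales the upper flux bound of the associated reaction by that probability to mimic a TF knockout. The binarization, the handling of missing evidence, the hard bound and all names and data are assumptions made for this illustration; the published method has its own thresholding and optimization formulation, which are not reproduced here.

```python
def prob_target_on_given_tf_off(tf_states, target_states):
    """Estimate P(target = ON | TF = OFF) from binarized expression states
    observed across conditions (one entry per condition)."""
    targets_when_tf_off = [t for tf, t in zip(tf_states, target_states) if not tf]
    if not targets_when_tf_off:
        # Assumption: with no conditions where the TF is OFF, leave the
        # reaction unconstrained.
        return 1.0
    return sum(targets_when_tf_off) / len(targets_when_tf_off)

def knockout_upper_bound(v_max, probability):
    """Scale the maximal flux of the target's reaction by the probability."""
    return probability * v_max

# Hypothetical binarized expression of a TF and one target gene in 8 conditions.
tf_states     = [1, 1, 0, 0, 0, 1, 0, 1]
target_states = [1, 1, 0, 1, 0, 1, 0, 0]

p = prob_target_on_given_tf_off(tf_states, target_states)
ub_knockout = knockout_upper_bound(v_max=10.0, probability=p)
print(p, ub_knockout)
```

The scaled bound would then be used in place of the original upper bound when solving the FBA problem for the TF knockout. Unlike a Boolean rule, this treatment allows a target gene to remain partially active when its regulator is removed.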
Nevertheless, the existing methods including PROM have limitations that provide scope for future improvements. Firstly, the existing automated method does not make use of information on environment-dependent or condition-specific regulation. Secondly, the existing methods are designed to integrate high-confidence regulatory interactions from manually reconstructed TRNs. In contrast, regulatory interactions inferred from ChIP-seq data are likely to be very noisy, and future methods should address this challenge. Thirdly, existing methods do not account for available information on feedback regulation of enzymes by intracellular metabolites. Addressing this challenge may require a transition to hybrid models where a combination of constraint-based and kinetic approaches will be necessary to simulate metabolic networks (Covert et al., 2008). Finally, the existing automated method to build integrated regulatory-metabolic models is static in time and not designed to incorporate temporal expression data. We expect that future automated methods will overcome some of the above-mentioned limitations.

Conclusion
In this review, we have presented an overview of three different approaches for integrating available regulatory information within constraint-based FBA models of genome-scale metabolic reconstructions. Previous reviews (Blazier and Papin, 2012; Hyduke et al., 2013; Estévez and Nikoloski, 2014) in this area have largely focused on the first approach, where considerable research has led to several methods in which omics data is directly integrated within the FBA framework as additional flux constraints (Supplementary Table 1). Such methods including GIMME (Becker and Palsson, 2008), iMAT (Shlomi et al., 2008), INIT (Agren et al., 2012) and MBA (Jerby et al., 2010) have been successful in building context- and tissue-specific metabolic models. Notably, the methods in the first approach are descriptive rather than predictive as they do not explicitly incorporate the available information on regulatory interactions in the TRN (Chandrasekaran and Price, 2010). In this review, we also focus on two other approaches to build integrated regulatory-metabolic models. The first such approach involves manual reconstruction of integrated regulatory-metabolic models where transcriptional regulatory information is represented as Boolean networks (Covert et al., 2004; Herrgard et al., 2006; Goelzer et al., 2008). Such Boolean based integrated regulatory-metabolic models can be studied using rFBA and allied approaches (Shlomi et al., 2007; Samal and Jain, 2008).
Since manual reconstruction of integrated models is a painstaking and time-consuming process, the need of the hour is to develop automated methods to incorporate the flood of next generation sequencing data such as RNA-seq and ChIP-seq within genome-scale metabolic reconstructions to build integrated regulatory-metabolic models. Such automated methods will eventually enable the systems biology community to reconstruct whole-cell models (Karr et al., 2012) for several organisms.
We would like to apologize to the authors of other relevant articles whose work was not cited in this review due to limited space. AS acknowledges support from Department of Science and Technology (DST) India start-up grant (YSS/2015/000060), Ramanujan fellowship (SB/S2/RJN-006/2014), Max Planck India mobility grant and IMSc PRISM project (XII Plan).

[...] framework to build integrated regulatory-metabolic models. In this approach, the known TRN of an organism is captured in a Boolean model where the activity of each gene is determined by the state of its regulating genes coding for TFs. For example, gene $G_c$ coding for enzyme $P_c$ associated with reaction R3 is controlled by transcription factors $\mathrm{TF}_2$ and $\mathrm{TF}_3$, where $\mathrm{TF}_3$ represses the transcription of $G_c$ while $\mathrm{TF}_2$ activates it. Thus, the Boolean rule for gene $G_c$ captures this combinatorial regulation by $\mathrm{TF}_2$ and $\mathrm{TF}_3$. The figure also shows a representative metabolic network where constituent reactions are directly controlled by the state of genes coding for enzymes and indirectly controlled by genes coding for TFs. Such an integrated regulatory-metabolic model can be investigated using the rFBA (Covert et al., 2001; Covert et al., 2004) approach.

inactive genes. Notably, switching off flux through reactions catalysed by inactive genes that are false negatives due to experimental noise may render the FBA predictions inconsistent with the biological objective or required metabolic function. Thus, methods in the first category also have the option to selectively re-enable flux through some reactions catalysed by inactive enzymes, to resolve any inconsistency between the FBA predictions and the required metabolic function (Becker and Palsson, 2008). Åkesson et al. (Åkesson et al., 2004) developed the first switch-based method to integrate gene expression data within the FBA framework. In this method, inactive genes were determined based on their non-detection across the microarray replicates for a given condition, and the maximum flux through reactions associated with such inactive genes was set to zero. To overcome pitfalls due to experimental noise, Åkesson et al. performed a manual assessment of the inactive gene set for false negatives, and re-enabled any reaction flux associated with such manually identified false negatives. Subsequent to the work of Åkesson et al., Becker et al. (Becker and Palsson, 2008) developed a switch-based method, Gene inactivity moderated by metabolism and expression (GIMME), which is widely used to build context-specific metabolic models. Unlike the method of Åkesson et al., GIMME semi-automatically assesses the inactive gene set for potential false negatives, and re-enables any reaction flux associated with such false negatives. In addition, GIMME has the advantage of reporting an inconsistency score between the gene expression data and the predicted flux distribution. Although the methods in this category were initially developed to integrate transcriptomic data, over time these methods (Becker [...] building algorithm (MBA) (Jerby et al., 2010), the metabolic context-specificity assessed by deterministic reaction evaluation (mCADRE) (Wang et al., 2012), fast reconstruction of compact context-specific metabolic network model (FASTCORE) (Vlassis et al., 2014) and cost optimization reaction dependency assessment (CORDA) (Schultz and Qutub, 2016).
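
The bound-setting step of a switch-based method, as described above for the earliest approach, might be sketched as follows. This is a simplified illustration under stated assumptions: each reaction is mapped to a single gene, a gene counts as inactive only if it is undetected in every replicate, and the function names, the default bound of 1000 and the toy data are all hypothetical. It is not the published implementation, and the FBA solve itself is left to an LP solver.

```python
# Switch-based constraint setting from replicate detection calls (toy data).

def inactive_genes(detection_calls):
    """Return genes that were not detected in any replicate."""
    return {gene for gene, calls in detection_calls.items() if not any(calls)}


def switch_upper_bounds(gpr, inactive, default_ub=1000.0, reenabled=frozenset()):
    """Zero the upper bound of reactions tied to inactive genes.

    Reactions listed in `reenabled` keep their default bound, mimicking the
    re-enabling of suspected false negatives.
    """
    bounds = {}
    for rxn, gene in gpr.items():
        switched_off = gene in inactive and rxn not in reenabled
        bounds[rxn] = 0.0 if switched_off else default_ub
    return bounds


# Hypothetical detection calls (True = detected) across three replicates.
calls = {
    "G_a": [True, True, True],
    "G_b": [True, False, True],
    "G_c": [False, False, False],
    "G_d": [False, False, False],
}
gpr = {"R1": "G_a", "R2": "G_b", "R3": "G_c", "R4": "G_d"}

off = inactive_genes(calls)
print(switch_upper_bounds(gpr, off))
# Suppose R4 is judged a false negative and is re-enabled.
print(switch_upper_bounds(gpr, off, reenabled={"R4"}))
```

The resulting bounds would then replace the default upper bounds in the flux balance problem before the objective is optimized.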

Figure 1: Overview of the FBA framework. (a) Genome-scale metabolic network reconstruction contains the list of metabolic reactions in an organism including the transport reactions and the biomass reaction. Gene-Protein-Reaction (GPR) associations link genes to encoded enzymes catalysing various reactions in the network. (b) The list of reactions along with the stoichiometric coefficients is mathematically represented in the form of a matrix S where rows correspond to metabolites and columns to reactions in the network. In any metabolic steady state, the stoichiometric constraints lead to a system of mass-balance equations relating various reaction fluxes in the network. Additional constraints based on thermodynamic and other considerations bound the possible range of fluxes through specific reactions. (c) The stoichiometric and additional constraints lead to an under-determined system of equations with a large space of allowable solutions. FBA uses LP to find a particular solution within the allowable solution space that maximizes the biomass production. Here, the schematic figure shown in (c) is inspired from various sources including Orth et al. (Orth et al., 2010).
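
The optimization summarized in this caption can be written compactly as a linear program. The notation below is assumed for illustration and does not appear in the caption: v is the vector of reaction fluxes, c is a vector that selects the biomass reaction, and v^lb and v^ub are the lower and upper flux bounds.

```latex
\begin{aligned}
\max_{\mathbf{v}} \quad & \mathbf{c}^{\top}\mathbf{v} \\
\text{subject to} \quad & \mathbf{S}\,\mathbf{v} = \mathbf{0}, \\
& v_j^{\mathrm{lb}} \le v_j \le v_j^{\mathrm{ub}} \quad \text{for every reaction } j .
\end{aligned}
```

The equality encodes the steady-state mass balance for every metabolite (one row of S each), and the inequalities carry the thermodynamic and other additional constraints on individual fluxes.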

Figure 2: Integration of gene expression data within FBA framework using switch- and valve-based methods. Switch-based methods classify genes as active or inactive based on a threshold on expression value, and this leads to a binary classification of upper bounds on associated reaction fluxes. In the example network on the left, genes G_c and G_d are inactive, and thus, the upper bounds on flux through their associated reactions R3 and R4 are set to zero (or, in other words, reactions R3 and R4 are switched off). In contrast, valve-based methods use normalized gene expression to set upper bounds on maximum allowable reaction fluxes. In the example network on the right, the magnitude of upper bounds on different reaction fluxes is represented via blue boxes with varying widths. Note that, unlike in the switch case, the upper bounds on flux through reactions R3 and R4 are not set to zero in the valve case.
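
The contrast drawn in this caption can be made concrete with a small sketch. The expression values, the threshold of 2.0, the maximum flux of 1000, the one-gene-per-reaction mapping and the linear scaling by the highest expression value are all assumptions chosen for illustration; the caption says only that valve-based bounds depend on normalized expression, not which function is used.

```python
# Switch-style versus valve-style upper bounds from the same expression data.

def switch_bounds(expression, gpr, threshold, max_flux=1000.0):
    """Binary bounds: a reaction is fully open or fully closed."""
    return {
        rxn: (max_flux if expression[gene] >= threshold else 0.0)
        for rxn, gene in gpr.items()
    }


def valve_bounds(expression, gpr, max_flux=1000.0):
    """Graded bounds: scale each bound by expression relative to the maximum."""
    top = max(expression.values())
    return {rxn: max_flux * expression[gene] / top for rxn, gene in gpr.items()}


# Hypothetical expression values; G_c and G_d fall below the threshold.
expression = {"G_a": 8.0, "G_b": 4.0, "G_c": 1.0, "G_d": 0.5}
gpr = {"R1": "G_a", "R2": "G_b", "R3": "G_c", "R4": "G_d"}

print("switch:", switch_bounds(expression, gpr, threshold=2.0))
print("valve: ", valve_bounds(expression, gpr))
```

With these toy values, R3 and R4 are closed under the switch rule but remain open with small, nonzero bounds under the valve rule, which mirrors the left and right panels described above.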

Figure 3. Incorporation of Boolean transcriptional regulatory networks within FBA

Figure 4. Automated reconstruction of integrated regulatory-metabolic models. Integrated regulatory-metabolic models are holistic and predictive models that capture the complex interactions between the transcriptional regulatory network (TRN) and the metabolic network of an organism. Such integrated regulatory-metabolic models can be reconstructed based on information on the TRN derived from genome-wide ChIP-seq experiments and gene expression profiles from microarray or RNA-seq experiments. Efficient automated algorithms are then employed to integrate diverse regulatory information into the genome-scale metabolic network to generate high-quality integrated regulatory-metabolic models.
In contrast to the switch-based methods, Colijn et al. (Colijn et al., 2009) developed a valve-based method, E-flux, to integrate gene expression data within the FBA framework. In E-flux, bounds are set such that the maximum allowable flux through each reaction is a function of the normalized expression of the associated genes (Fig. 2). [...] (Fang et al., 2012) impose a continuum of flux constraints (Fig. 2). Note that all methods in the first category make use of LP and a biologically relevant objective function to predict flux distribution and generate context-specific metabolic models.

The second category includes methods that integrate gene expression data to build context-specific metabolic models without imposing a biologically relevant objective function. Shlomi et al. (Shlomi et al., 2008; Zur et al., 2010) developed the first method in this category, integrated metabolic analysis tool (iMAT), which can integrate transcriptomics and proteomics data to build tissue-specific metabolic models in multicellular organisms. iMAT uses omics data to classify genes as highly, moderately or lowly expressed in a given condition, and subsequently resorts to mixed integer linear programming (MILP) to generate a context-specific metabolic model where the presence of reactions associated with highly expressed genes is maximized while the presence of reactions associated with lowly expressed genes is minimized. Agren et al. (Agren et al., 2012) developed a similar MILP-based method, integrative network inference for tissues (INIT), which can also integrate diverse omics data to generate context-specific metabolic models. In contrast to iMAT, INIT enforces the additional constraint of positive production of experimentally determined metabolites based on metabolomics data while building the tissue-specific metabolic models. For normal tissues in mammalian systems (unlike cancerous tumours), it is inappropriate to use growth or proliferation as the relevant objective function. Thus, iMAT and INIT are better suited to generate context-specific metabolic models for normal tissues in comparison to methods in the first category. An extension of INIT, task-driven integrative network inference for tissues (tINIT), has been .