In silico identification of switching nodes in metabolic networks

Cells modulate their metabolism according to environmental conditions. A major challenge to better understand metabolic regulation is to identify, from the hundreds or thousands of molecules, the key metabolites where the re-orientation of fluxes occurs. Here, a method called ISIS (for In Silico Identification of Switches) is proposed to locate these nodes in a metabolic network, based on the analysis of a set of flux vectors (obtained e.g. by parsimonious flux balance analysis with different inputs). A metabolite is considered as a switch if the fluxes at this point are redirected in a different way when conditions change. The soundness of ISIS is shown with four case studies, using both core and genome-scale metabolic networks of Escherichia coli, Saccharomyces cerevisiae and the diatom Phaeodactylum tricornutum. Through these examples, we show that ISIS can identify hot-spots where fluxes are reoriented. Additionally, switch metabolites are deeply involved in post-translational modification of proteins, showing their importance in cellular regulation. In P. tricornutum, we show that Erythrose 4-phosphate is an important switch metabolite for mixotrophy suggesting the importance of this metabolite in the non-oxidative pentose phosphate pathway to orchestrate the flux variations between glycolysis, the Calvin cycle and the oxidative pentose phosphate pathway when the trophic mode changes. Finally, a comparison between ISIS and reporter metabolites identified with transcriptomic data confirms the key role of metabolites such as L-glutamate or L-aspartate in the yeast response to nitrogen input variation. Overall, ISIS opens up new possibilities for studying cellular metabolism and regulation, as well as potentially for developing metabolic engineering.


Introduction
Despite huge developments in the last decades, deciphering cellular metabolism remains a great challenge, essential for a wide range of fields (including biotechnology, health, and ecology). The analysis of metabolic networks plays an important role in addressing this issue. More and more networks are now available, with an increasingly complete coverage of metabolism [6]. A growing number of methods have also been proposed to analyze and capitalize on these networks [15,23]. Among them, one of the most common approaches is Flux Balance Analysis (FBA), which aims to predict metabolic fluxes [17] given a set of constraints (e.g. on substrate uptake).
Although FBA and its derivatives have been used successfully, the size of the networks (now often with several thousand metabolites and reactions) makes their analysis, and even the exploitation and visualization of the results, increasingly complex. To help explore these networks, the identification of key metabolites is a major concern. Here, we focus on the nodes in the networks where the fluxes are redirected when culture conditions (such as nutrient inputs) vary. Highlighting where the main changes in metabolism occur gives a better picture of the cellular response to environmental variations.
The identification of key metabolites has been the subject of many studies (e.g. [14,10,21,12]), although there is no general agreement on what defines a key metabolite. In line with our vision of switch points, [18] provide a useful method to identify the so-called reporter metabolites. They correspond to nodes around which the enzymes are subject to the most significant transcriptional changes. Nonetheless, reporter metabolites are identified from experimental data, and a purely theoretical method (based only on the metabolic network) is still lacking.
Here, we propose to identify switch nodes based on the analysis of a set of flux solutions under different conditions, obtained e.g. by parsimonious flux balance analysis (pFBA) [13]. To do so, for each metabolite, we consider the flux vectors (including stoichiometry) of the reactions involving this metabolite (as a substrate or a product) for all the conditions. If the dimension of the vector space generated by this set is greater than one (i.e. the flux vectors involving this metabolite for different conditions are not collinear), the metabolite is considered as a switch node. After explaining in detail how the switches are identified, we use the core metabolic network of Escherichia coli to illustrate the method. Then, we show that ISIS identifies reporter metabolites for Saccharomyces cerevisiae under nitrogen limitation and that it brings out metabolites involved in post-translational modification (PTM) of proteins in E. coli. Finally, ISIS is used to decipher how metabolism is impacted by trophic modes in the diatom Phaeodactylum tricornutum.

ISIS principle
Our objective is to identify switch nodes in a metabolic network, corresponding to key metabolites where flux reorientations occur when environmental conditions change. ISIS is based on the analysis of a set of flux vectors for a range of environmental conditions (e.g. different inputs, different objective functions reflecting different metabolic stages, etc.). A metabolite is considered as a switch if the fluxes at this point are redirected in a different way when conditions change. This is illustrated in Figure 1. In the top example (case A), the fluxes around metabolite $x_1$ are always distributed in the same way (i.e. one third of the incoming flux $v_1$ goes to $v_2$, the remainder goes to $v_3$), so $x_1$ is not a switch node. By contrast, in the bottom example (case B), the incoming flux is rerouted according to conditions, so $x_1$ is considered in this case as a switch point. This simple principle can be evaluated numerically using linear algebra (see Section Method), by evaluating the dimension of the vector space generated by the set of reaction flux vectors. More precisely, a singular value decomposition is carried out and a score (between 0 and 1) is computed from the singular values. A score of zero means that the vectors are collinear (Fig. 1, on top). The higher the score, the more the metabolite can be considered a switch.
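To make the distinction between the two cases of Figure 1 concrete, here is a minimal Python sketch. The flux values and the helper names (`weighted_fluxes`, `are_collinear`) are illustrative assumptions, not taken from the article. The check relies on the equality case of the Cauchy-Schwarz inequality rather than on the singular value decomposition used by ISIS, so it only answers the yes/no question for two conditions.

```python
import math

# Stoichiometric coefficients of metabolite x1 in reactions v1, v2, v3:
# v1 produces x1 (+1), v2 and v3 consume it (-1). Hypothetical toy network.
STOICH_X1 = [1.0, -1.0, -1.0]


def weighted_fluxes(stoich_row, fluxes):
    """Element-wise product of a stoichiometric row and a flux vector."""
    return [s * v for s, v in zip(stoich_row, fluxes)]


def are_collinear(a, b, rel_tol=1e-9):
    """True if vectors a and b span a space of dimension at most one.

    Uses the equality case of the Cauchy-Schwarz inequality:
    (a.b)^2 == (a.a)(b.b) exactly when a and b are collinear.
    """
    ab = sum(x * y for x, y in zip(a, b))
    aa = sum(x * x for x in a)
    bb = sum(y * y for y in b)
    return math.isclose(ab * ab, aa * bb, rel_tol=rel_tol, abs_tol=1e-12)


# Case A: one third of v1 always goes to v2, two thirds to v3 (invented values).
case_a = [weighted_fluxes(STOICH_X1, v) for v in ([3.0, 1.0, 2.0], [6.0, 2.0, 4.0])]
# Case B: the split between v2 and v3 depends on the condition (invented values).
case_b = [weighted_fluxes(STOICH_X1, v) for v in ([3.0, 1.0, 2.0], [3.0, 2.0, 1.0])]

x1_is_switch_a = not are_collinear(*case_a)
x1_is_switch_b = not are_collinear(*case_b)
```

In this sketch the metabolite is flagged as a switch only in case B, where the proportions of the outgoing fluxes differ between the two conditions.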

A toy example: E. coli under aerobic vs anaerobic conditions
The principle of ISIS is first illustrated by studying the transition from aerobic to anaerobic conditions in E. coli. We use its core metabolic model, composed of 72 metabolites and 95 reactions [16], and estimate flux vectors with and without oxygen using pFBA (see Fig. 2). ISIS identifies as switch nodes the junctions of glycolysis with the TCA cycle (pyruvate and acetyl-CoA) and with the oxidative pentose phosphate pathway (glucose 6-phosphate). This is in line with what we could expect, given that these last two pathways are shut down in the absence of oxygen. We also observe that several currency metabolites (e.g. ATP, NADPH), involved in many reactions, are identified as switch points. This simple example shows the soundness of ISIS and its capacity to identify nodes around which the metabolism is rerouted.

ISIS identifies reporter metabolites for Saccharomyces cerevisiae under nitrogen limitation
Given that our definition of switch nodes is close to that of reporter metabolites [18], we compare both methods, using [22] as a case study. To do so, we consider the growth of S. cerevisiae under three different nitrogen limitations: ammonium, alanine, and glutamine. For each input, the flux vectors are computed with pFBA using the metabolic network YeastGEM v8.1.1, composed of 2241 metabolites and 3520 reactions [8]. ISIS is then applied by taking the inputs two by two, mimicking what has been done in [22]. Switch metabolites with a score above 0.1 are shown in Fig. 3. Many of these metabolites, common between the different cases, are involved in amino acid synthesis. We also observe that reporter metabolites are enriched among the top switch metabolites (hypergeometric test, $p = 0.048$, $0.061$, and $3 \times 10^{-5}$ for glutamine vs ammonium, alanine vs ammonium, and alanine vs glutamine, respectively). When comparing alanine versus glutamine limitations, glutamate, aspartate and 2-oxoglutarate are identified by both methods. The first two metabolites are closely related to the sources of nitrogen. The last one points out the importance of the TCA cycle in the synthesis of amino acids, as reported in [22]. Finally, several reporter metabolites appear near the top of the switch metabolite list (see SI), even if they do not appear in Fig. 3. For example, in the comparison between alanine and ammonium, the reporter metabolites pyruvate and acetyl-CoA are ranked 19 and 46, respectively, out of 2241 metabolites by ISIS. This case study shows that key metabolites can be identified in silico, without requiring experiments (unlike the reporter metabolite method, which requires transcriptomic data).
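The enrichment test mentioned above can be sketched in a few lines of Python. The function name and the counts in the usage line are made up for illustration and do not reproduce the p-values reported here; the article does not state how many top-ranked switch metabolites were considered, so that number is an assumption as well.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of metabolites in the network
    successes:  number of reporter metabolites in the network
    draws:      number of top-ranked switch metabolites inspected
    observed:   number of reporter metabolites found among those draws
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2241 metabolites, 20 reporters, top 50 switches, 3 overlaps.
p_value = hypergeom_enrichment_pvalue(2241, 20, 50, 3)
```

A small p-value indicates that the overlap between the two lists is larger than expected if the top switch metabolites were drawn at random from the network.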

ISIS brings out metabolites involved in protein PTM in E. coli
Given that the switch metabolites are by definition nodes where fluxes are reoriented, these metabolites can potentially be involved in cellular regulation, such as PTM. To investigate this aspect, we follow [3] and consider the growth of E. coli under 174 different nutrient inputs. Using a network composed of 2382 reactions and 1668 metabolites [5], we first define the flux vectors for each input using pFBA. We then analyze all these fluxes with ISIS. The list of the switching points is given in SI. Among the top-scoring metabolites, we find key branching points between the main pathways, such as fructose 6-phosphate, pyruvate, acetyl-CoA, etc. In Fig. 4, we plot the proportion of metabolites known to be ligands of proteins (i.e. involved in PTM), according to their switching score. Ligands are clearly enriched among the top switching metabolites (hypergeometric test, $p = 6 \times 10^{-8}$). For the class of metabolites with the highest scores (>0.3), 74% of the metabolites are ligands, compared to 25% for the class with the lowest scores (<0.1).
Based on a comparison of FBA fluxes, [3] have identified highly regulated reactions, which appear to account for a significant proportion of the known proteins with PTM. Here, by contrast, we show the key role in protein PTM of the switching metabolites identified by ISIS. Overall, the identification of switch nodes could help in deciphering the complex roles of metabolites in the regulation of protein activity [24].
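The comparison of ligand proportions across score classes can be sketched as follows. This is only an illustration: the function name, the metabolite identifiers and the scores are invented, and the intermediate class (0.1 to 0.3) is an assumption, since only the classes above 0.3 and below 0.1 are described in the text.

```python
def ligand_fraction_by_class(scores, ligands, edges=(0.1, 0.3)):
    """Fraction of ligand metabolites in each score class.

    scores:  dict mapping metabolite -> switch score in [0, 1]
    ligands: set of metabolites known to bind proteins
    edges:   class boundaries; (0.1, 0.3) gives low / mid / high classes
    """
    labels = ["low", "mid", "high"]
    counts = {label: [0, 0] for label in labels}  # [n_ligands, n_total]
    for met, score in scores.items():
        label = labels[sum(score >= e for e in edges)]
        counts[label][1] += 1
        counts[label][0] += met in ligands
    return {lab: (n_lig / n if n else None) for lab, (n_lig, n) in counts.items()}


# Invented scores and ligand annotations, for illustration only.
scores = {"pyr": 0.45, "accoa": 0.35, "gly": 0.40, "f6p": 0.20, "h2o": 0.02, "xyl": 0.01}
ligands = {"pyr", "accoa", "f6p"}
fractions = ligand_fraction_by_class(scores, ligands)
```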

Shedding light on mixotrophy in the diatom P. tricornutum
Finally, we study the effect of trophic mode in P. tricornutum. The fluxes for autotrophic, mixotrophic and heterotrophic growth have been estimated in [9], considering a genome-scale metabolic network composed of 587 metabolites and 849 reactions. This study highlighted the importance of flux rerouting between chloroplasts and mitochondria, depending on the trophic mode. Using this set of fluxes, ISIS identifies as switch nodes glycerone-P (dihydroxyacetone phosphate), fructose-6-phosphate, fructose 1,6-bisphosphate, fumarate, pyruvate, etc. (see SI). These metabolites are key branching points between chloroplast and mitochondria (as shown in Fig. 1 of [9]). Additionally, we identify erythrose 4-phosphate, which is actually a major hub of the metabolism to balance the fluxes between the main pathways (in particular glycolysis, the Calvin cycle and the oxidative pentose phosphate pathway). Thus, ISIS contributes to our understanding of trophic modes by highlighting the overlooked role of a key metabolite in the non-oxidative pentose phosphate pathway in orchestrating changes between the energetic pathways.

Discussion
A method, called ISIS, has been proposed to identify switch nodes, i.e. metabolites around which fluxes are rerouted when environmental conditions change. These points are determined in silico, based on the analysis of a set of reaction fluxes corresponding to the different conditions. The method is fast and scalable, e.g. it takes just a few seconds on a standard computer for a metabolic network of a few thousand reactions. ISIS gives sound results on the different case studies developed in this article, from the comparison between aerobic and anaerobic conditions or different substrates in E. coli to the analysis of trophic modes in the diatom P. tricornutum.
Other methods dealing with the identification of key metabolites have been proposed (e.g. [14,10,21,12]). A comparison between them is not straightforward, given that the definition of key metabolites is not necessarily the same between the different studies. One of the closest definitions is that of reporter metabolites [18], and the results on a case study have shown some similarities (see the case study with S. cerevisiae). The main advantage of ISIS is that it does not require experimental data. The downside is that it relies entirely on flux estimations. These estimations can benefit from all the recent progress in metabolic network reconstruction and constraint-based modeling [4]. Additionally, the soundness of flux estimations can potentially be increased by integrating experimental data (e.g. transcriptomic, proteomic, fluxomic) [19], although at the cost of losing the ease of a purely in silico approach.
Given the crucial role of switch metabolites in cellular metabolism, several applications can be considered. First, ISIS could be used to select key metabolites to monitor when studying the response of an organism to a change of environment. In the same vein, these switch nodes can also be useful to study cellular regulation (as already illustrated by the high proportion of switch metabolites involved in PTM). The identification of switch points would also be of great interest in biotechnology: they correspond to potential targets to reorient the metabolism of the cell for a given purpose (such as the production of a metabolite of interest), either by controlling environmental conditions or by genetic manipulation. Finally, ISIS could also be used to decompose the whole network into different modules connecting the switch nodes (to analyze the metabolism or to develop dynamical models as proposed in [2,1]). This would give a new approach for network splitting, complementing a set of methods (reviewed in [20]) based on network topology, flux coupling, or elementary flux modes. A specificity of our approach in that case is that the set of selected conditions defines the node identification. Thus, the same metabolic network can be decomposed in different ways depending on which conditions are considered.
To conclude, the metabolites around which fluxes are switched in response to environmental changes are key points in the metabolic network. By identifying them in silico, ISIS allows a better comprehension of cellular metabolism and regulation, as highlighted in this article with the studies of E. coli, S. cerevisiae, and P. tricornutum under different substrate inputs or trophic modes. This method can be easily applied to many organisms, as it only requires their metabolic network. It offers several perspectives, from fundamental studies on metabolism to biotechnology applications through metabolic engineering.

Method

Framework
The metabolism of a cell can be represented by its metabolic network, composed of $n_m$ metabolites and $n_v$ reactions. It is generally described by a stoichiometric matrix $S$ ($n_m \times n_v$), where each row corresponds to a metabolite and each column to a reaction. The metabolic fluxes through this network are given by the reaction rate vector $v \in \mathbb{R}^{n_v}$. To estimate these fluxes, FBA makes two main assumptions [17]. First, the metabolism is considered at steady state (corresponding to balanced growth conditions):
$$S\,v = 0.$$
Second, an objective function to be maximized is considered (e.g. the specific growth rate for microorganisms), defined by an objective vector $c \in \mathbb{R}^{n_v}$. The metabolic fluxes are then the solution of a linear optimization problem (also called LP problem, for Linear Programming), which can easily be solved numerically:
$$\max_{v}\; c^\top v \quad \text{s.t.} \quad S\,v = 0, \quad \underline{v} \le v \le \overline{v},$$
where $\underline{v}$ and $\overline{v}$ are lower and upper bounds on the metabolic fluxes. These bounds are used in particular to define nutrient inputs and to specify the reversibility of each reaction.
One limitation of FBA is that it can have several solutions (with the same objective value). To tackle this problem, [13] have proposed pFBA, which consists of two steps. First, FBA is used to find the optimal value of the objective function. Then, we determine the solution with the minimum overall flux that has (almost) the same objective value. This gives a unique solution, and it has been shown to be consistent with gene expression measurements [13].
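As a small illustration of the two kinds of FBA constraints, the following Python sketch checks whether a candidate flux vector is feasible, i.e. balanced at every metabolite and within its bounds. The three-reaction network, the bound values and the function name are assumptions made for the example; solving the LP itself requires a solver and is not shown.

```python
def is_feasible(S, v, lower, upper, tol=1e-9):
    """Check S v = 0 (steady state) and lower <= v <= upper (flux bounds)."""
    balanced = all(
        abs(sum(s_ik * v_k for s_ik, v_k in zip(row, v))) <= tol for row in S
    )
    bounded = all(lo - tol <= v_k <= up + tol for v_k, lo, up in zip(v, lower, upper))
    return balanced and bounded


# Toy network with one internal metabolite x1: v1 produces x1, v2 and v3 consume it.
S = [[1.0, -1.0, -1.0]]        # one row per metabolite, one column per reaction
lower = [0.0, 0.0, 0.0]        # all reactions irreversible
upper = [10.0, 10.0, 10.0]     # e.g. uptake through v1 limited to 10

ok = is_feasible(S, [3.0, 1.0, 2.0], lower, upper)          # balanced and bounded
unbalanced = is_feasible(S, [3.0, 1.0, 1.0], lower, upper)  # x1 would accumulate
```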
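Written as an optimization problem, the second step of pFBA can be sketched as below. Here $f^*$ denotes the optimal objective value found by FBA in the first step, $\alpha$ is a tolerance factor equal to or slightly below one, and $\underline{v}$, $\overline{v}$ are the lower and upper flux bounds. The symbols $f^*$ and $\alpha$ are notation introduced for this illustration, and measuring the overall flux by the sum of absolute fluxes follows the usual description of pFBA.

```latex
\begin{aligned}
\min_{v}\quad & \sum_{k=1}^{n_v} |v_k| \\
\text{s.t.}\quad & S\,v = 0, \\
& c^\top v \ge \alpha f^*, \\
& \underline{v} \le v \le \overline{v}.
\end{aligned}
```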

ISIS principle
Switch nodes are identified based on the analysis of a set of reaction fluxes under $n_c$ different conditions, e.g. different nutrient inputs. For each condition $j \in \{1, \ldots, n_c\}$, the cellular metabolism is characterized by the flux vector $v^j$ obtained by pFBA (or another method), and normalized (to give each condition the same weight in the analysis).
Then, for each metabolite $i$ and each condition $j$, we consider the flux vector (including stoichiometry) of the reactions involving this metabolite (as a substrate or a product) in this condition, i.e.:
$$S_{i,:} \circ v^j,$$
where $\circ$ stands for the Hadamard (i.e. element-wise) product. Now given all the conditions, we consider the vector space generated by all the flux vectors for metabolite $i$, gathered as the columns of the matrix $M^i$ ($n_v \times n_c$):
$$M^i = \left( S_{i,:} \circ v^1, \; \ldots, \; S_{i,:} \circ v^{n_c} \right).$$
If the dimension of this vector space is greater than one (i.e. the flux vectors involving this metabolite for different conditions are not collinear), the metabolite is considered as a switch node (see Fig. 1).
From a practical point of view, the dimension of $M^i$ is evaluated by singular value decomposition (SVD), so we get:
$$M^i = U^i \Lambda^i (V^i)^\top,$$
where $U^i$ ($n_v \times n_v$) and $V^i$ ($n_c \times n_c$) are unitary matrices and $\Lambda^i$ ($n_v \times n_c$) is a diagonal matrix whose diagonal entries $\Lambda^i_{k,k}$ correspond to the singular values of $M^i$ (in descending order). From these singular values, we compute a score $r_i$ between 0 and 1, which represents the significance of each switch node.
If all the vectors are collinear, then all the $\Lambda^i_{k,k}$ for $k > 1$ are almost null, so $r_i \simeq 0$. The closer the score is to one, the more the metabolite corresponds to a switch.
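As a worked illustration, take the metabolite $x_1$ of Figure 1 with stoichiometric row $(1, -1, -1)$ for the reactions $(v_1, v_2, v_3)$, and two conditions with fluxes $(3, 1, 2)$ and $(3, 2, 1)$. These numbers are invented for the example, and normalization is skipped because both flux vectors have the same norm. The score formula used in the last step (one minus the share of the squared singular values carried by the largest one) is an assumption that matches the behaviour described above, not necessarily the exact definition used by ISIS.

```latex
M^{1} =
\begin{pmatrix} 3 & 3 \\ -1 & -2 \\ -2 & -1 \end{pmatrix},
\qquad
(M^{1})^\top M^{1} =
\begin{pmatrix} 14 & 13 \\ 13 & 14 \end{pmatrix}
\;\Rightarrow\;
\Lambda^{1}_{1,1} = \sqrt{27},\quad \Lambda^{1}_{2,2} = 1,
\qquad
r_1 = 1 - \frac{27}{27 + 1} = \frac{1}{28} \approx 0.036 .
```

With fluxes $(3, 1, 2)$ and $(6, 2, 4)$ instead, the two columns are collinear, the second singular value vanishes, and the score is zero under the same assumed formula.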

ISIS implementation
ISIS has been implemented in Python 2.7 within the COBRApy framework [7]. After computing the flux vectors for all the conditions with pFBA, a loop over all the metabolites is carried out, running the SVD for each metabolite and then computing the score. To speed up the program, all the zero rows in $M^i$ are removed before running the SVD. Finally, the metabolites are ranked by score.
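The loop described above can be sketched in plain Python for the special case of two conditions. This is not the authors' COBRApy code: the function names and the toy network are hypothetical, the Euclidean norm is an assumed choice for the normalization, and the score (one minus the share of the squared singular values carried by the largest one) is an assumed formula consistent with the description. With two conditions, the squared singular values are the eigenvalues of a 2x2 Gram matrix and have a closed form; more conditions would call for a numerical SVD routine instead.

```python
import math


def normalize(v):
    """Scale a flux vector to unit Euclidean norm (assumed choice of norm)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def switch_score_two_conditions(stoich_row, v_a, v_b):
    """Score in [0, 1] for one metabolite and two conditions.

    Builds the two stoichiometry-weighted flux vectors, drops reactions
    where both entries are zero, and returns 1 - s1^2 / (s1^2 + s2^2),
    where s1 >= s2 are the singular values of the resulting matrix.
    The squared singular values are the eigenvalues of the 2x2 Gram matrix.
    """
    rows = [(s * a, s * b) for s, a, b in zip(stoich_row, v_a, v_b)]
    rows = [r for r in rows if r != (0.0, 0.0)]  # remove zero rows
    g_aa = sum(a * a for a, _ in rows)
    g_bb = sum(b * b for _, b in rows)
    g_ab = sum(a * b for a, b in rows)
    trace = g_aa + g_bb
    if trace == 0.0:
        return 0.0  # metabolite carries no flux in either condition
    half_gap = math.hypot((g_aa - g_bb) / 2.0, g_ab)
    smallest = max(trace / 2.0 - half_gap, 0.0)  # clip rounding noise
    return smallest / trace


def rank_metabolites(S, names, v_a, v_b):
    """Return (name, score) pairs sorted by decreasing score."""
    v_a, v_b = normalize(v_a), normalize(v_b)
    scores = [
        (name, switch_score_two_conditions(row, v_a, v_b))
        for name, row in zip(names, S)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical network: R1 makes A; R2: A -> B; R3: A -> C; R4 and R5 export B and C.
S = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, -1, 0],    # B
    [0, 0, 1, 0, -1],    # C
]
ranking = rank_metabolites(S, ["A", "B", "C"], [3, 1, 2, 1, 2], [3, 2, 1, 2, 1])
```

In this toy case only A, where the incoming flux is split differently between the two conditions, receives a non-zero score; B and C simply pass their flux along and score zero.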

Case studies
ISIS has been applied to four examples. Unless otherwise stated, the procedure described above was followed. Some specificities of each case study are given below.

E. coli (core metabolic model)
As a first case study, we consider the core metabolic network of E. coli [16]. We simulate growth on glucose in aerobic and anaerobic conditions (i.e. with or without oxygen). Figure 2, which compares the fluxes between the two conditions and highlights the switch metabolites, was drawn with Escher [11].

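Figure 1. Principle for switch node identification in metabolic networks. The circles $x_i$ and the arrows $v_j$ represent respectively metabolites and reactions. The colored bars show reaction fluxes (computed e.g. by pFBA) for different conditions (each color represents a condition). Case A: the redistribution of fluxes around metabolite $x_1$ always occurs in the same way (the flux vectors are collinear), so $x_1$ is not a switch point. Case B: the incoming flux is rerouted according to conditions, so $x_1$ is a switch point.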

Figure 2. Switch nodes for E. coli under aerobic vs anaerobic conditions. Overview of metabolic fluxes under aerobic (A) and anaerobic (B) conditions (see C for the names of metabolites and reactions). C: Comparison between these two conditions. Switch nodes (dark dots) have been identified at the junctions of glycolysis with the TCA cycle (pyruvate and acetyl-CoA) and with the pentose phosphate pathway (glucose 6-phosphate).

Figure 3. The main switch nodes (identified by ISIS) for Saccharomyces cerevisiae under nitrogen limitation, in comparison with reporter metabolites computed from transcriptomic data [22].

Figure 4. Percentage of metabolites identified as ligands in protein PTM as a function of their scores in ISIS for E. coli, based on the fluxes for 174 different inputs with the genome-scale metabolic network iAF1260 [5]. Metabolites identified as switch points are much more involved in PTM, highlighting their role in cellular regulation.

Figure 5. Metabolic fluxes in the non-oxidative pentose phosphate pathway for different trophic modes in the diatom P. tricornutum. Erythrose 4-phosphate (e4p) appears as a key hub to balance the fluxes between glycolysis, the Calvin cycle and the oxidative pentose phosphate pathway, showing the role of the non-oxidative pentose phosphate pathway in orchestrating the energy balance of the cell.