Glycosylation flux analysis reveals dynamic changes of intracellular glycosylation flux distribution in Chinese hamster ovary fed-batch cultures

N-linked glycosylation of proteins has both functional and structural significance. Importantly, the glycan structure of a therapeutic protein influences its efficacy, pharmacokinetics, pharmacodynamics and immunogenicity. In this work, we developed glycosylation flux analysis (GFA) for predicting intracellular production and consumption rates (fluxes) of glycoforms, and applied this method to CHO fed-batch monoclonal antibody (mAb) production using two different media compositions, with and without additional manganese feeding. The GFA is based on a constraint-based modelling of the glycosylation network, employing a pseudo steady state assumption. While the glycosylation fluxes in the network are balanced at each time point, the GFA allows the fluxes to vary with time by way of two scaling factors: (1) an enzyme-specific factor that captures the temporal changes among glycosylation reactions catalyzed by the same enzyme, and (2) the cell specific productivity factor that accounts for the dynamic changes in the mAb production rate. The GFA of the CHO fed-batch cultivations showed that regardless of the media composition, the fluxes of galactosylation decreased with the cultivation time in comparison to the other glycosylation reactions. Furthermore, the GFA showed that the addition of Mn, a cofactor of galactosyltransferase, has the effect of increasing the galactosylation fluxes but only during the beginning of the cultivation period. The results thus demonstrated the power of the GFA in delineating the dynamic alterations of the glycosylation fluxes by local (enzyme-specific) and global (cell specific productivity) factors.


Introduction
The introduction of Quality by Design (QbD) and Process Analytical Technology (PAT) initiatives by the US Food and Drug Administration has driven a resurgence in research efforts and innovation in biopharmaceutical process development and manufacturing (FDA, 2004;Rathore, 2009).QbD and PAT prescribe continuous process improvements through well-defined process objectives, science-based product and process understanding, timely (online) process measurements, and process system design, analysis and control.In the current competitive and burgeoning market for therapeutic proteins and biosimilars, the biopharmaceutical industry faces numerous challenges in ensuring product quality, shortening time to market, improving cost effectiveness and creating manufacturing flexibility (Li et al., 2010).One of the most important quality attributes of recombinant therapeutic proteins is the structure of the asparagine-linked oligosaccharide sugar or N-linked glycan attached to the proteins.This glycan structure can influence protein folding (Aebi, 2013), secretion (Zhang and Kaufman, 2006) and stability (Arosio et al., 2013), and has an impact on bio-activity (Umaña et al., 1999), efficacy (Goh et al., 2014) and immunogenicity (Harding et al., 2010).Consequently, the regulation and control of Nlinked glycosylation has received much attention from the biopharmaceutical industry.
The N-linked glycan of a protein is processed post-translationally in the rough endoplasmic reticulum (ER) and Golgi cisternae.There exists a natural heterogeneity in the outcome of the 4 cell culture parameters (e.g., dissolved oxygen, pH, temperature, osmolality), culture media composition, and expression hosts (Hossler et al., 2009;Ivarsson et al., 2014).Yet, the mechanisms of these dependencies still remain unclear.Transcriptomics, proteomics and metabolomics analysis have been used to get a better picture of the intracellular changes in protein glycosylation, caused by modifications in the cell culture conditions (Sha et al., 2016).
Furthermore, mathematical models of the glycosylation network have been created to explain the observed glycoform profiles and to predict the glycosylation outcome (Jimenez del Val et al., 2011;Liu et al., 2013;St Amand et al., 2014;Umaña and Bailey, 1997) Recent modelling efforts have focused on linking extracellular metabolite measurements to the protein glycosylation process by coupling the dynamic model of the bioreactor with the intracellular model of glycan metabolism and glycosylation network (Jedrzejewski et al., 2014).However, these detailed models comprised partial differential equations and possessed a large number of unknown kinetic parameters, posing a challenge in model simulations and parameter estimation.
On the opposite spectrum of modelling framework, the constraint-based modelling (CBM) uses a simple and parameter-free approach, requiring only the stoichiometry of the reactions to make predictions on the reaction fluxes.The most successful application of CBM has been in the modelling and analysis of intracellular metabolism.By assuming that the metabolic reaction network operates under steady state, the metabolic fluxes can be estimated from the measurements of extracellular metabolites using metabolic flux analysis (MFA) (when the estimation problem is over-determined) (Goudar et al., 2007) or using flux balance analysis (FBA) (when the estimation problem is under-determined) (Orth et al., 2010).In the FBA, one makes a further assumption that the cell organizes its metabolic flux distribution to optimize a particular objective (e.g.biomass production).While MFA and FBA have been routinely used to analyze metabolic network models, they are not yet commonly used to study protein glycosylation.A recent constraint-based modelling of protein glycosylation adopted a probabilistic approach using the Markov chain theory to explore feasible fluxes that could explain the observed glycan distribution, thereby avoiding the need to assign a cellular objective as in the FBA (Spahn et al., 2015).The method was developed specifically for predicting the protein glycosylation outcome from enzyme knock-out(s), and has been extended to glycoengineering by strain optimization (Spahn et al., 2017).While these works demonstrated the promise of CBM for glycoengineering application, the strategy was not designed for predicting the (dynamical) changes in the regulation of glycosylation fluxes due to alterations in the cell culture conditions and parameters (e.g., media compositions, pH, etc.).
In this work, we present a novel constraint-based modelling approach for analyzing protein glycosylation fluxes.The method, called Glycosylation Flux Analysis (GFA), generates predictions for the (intracellular) glycosylation reaction fluxes from the secretion rates of glycoforms.The GFA is based on the premise that the temporal variations in the intracellular glycosylation fluxes are the results of time-varying changes in the enzyme-specific factors (such as gene expression, enzyme activity and nucleotide sugar availability) and in the cell metabolism (especially the specific productivity).Finally, we applied the GFA to elucidate the dynamical changes of immunoglobulin G (IgG) glycosylation in fed-batch CHO cell cultures using two different media compositions (without and with manganese feed).

Cell culture conditions
CHO-S cells producing an IgG monoclonal antibody were cultivated in a 3 L bench scale bioreactor system (Sartorius Stedim, Germany).The resulting antibody bore two N-linked glycosylation sites at the Fc region.Cells were first expanded in shake bottles and the cell seeding density was set to cells/mL.A temperature shift from 36.5 °C to 33.0 °C and a pH shift from 7.1 to 6.9 were conducted on the fifth culture day.CO2 was used to control the pH and the dissolved oxygen set point was fixed to 50 % air saturation.Feeds were added on day 3, 5, 7 and 10 and consisted of a proprietary concentrated main feed with over 30 components, a highly alkaline amino acid solution and a glucose solution of 400 g/L.The main feed in the first experiment (process A) represented a control experiment, whereas the main feed in the second experiment (process B) had a different amino acid composition with manganese chloride addition.The final concentration of MnCl in process B was set to 2.4 μmol per liter culture volume.

Analytical methods for cell cultures and product quality analysis
Cell

Data Pre-processing
The glycosylation flux analysis uses the secretion fluxes (rates) of protein glycoforms as inputs to make predictions on the intracellular glycosylation fluxes.For this purpose, the cell culture data of viable cell density (  (t) [10 6 cells/mL]), mAb concentration (T(t) [g/L]) and glycan fractions (fi(t) [%]) were pre-processed to produce the required secretion fluxes.First, the concentration of each glycoform was computed from the temporal data of mAb titer and the fraction of the glycoforms as follows: where  , (  ) denotes the extracellular concentration of the i-th glycoform at the k-th time point (k = 1, 2, …, K) and   denotes the molecular weight of the i-th glycoform.Since the contribution of glycans to the molecular weight of the protein was insignificant, the protein's molecular weight was used for the calculations above (MW IgG = 150 kDa).The glycoform concentrations were smoothened using a sigmoidal Hill-type function: where the parameters a, b, d and n were determined by minimizing the sum of squares of the difference between the smoothing function and the concentration data.Finally, the time slopes of the glycoform concentration  , ()  ⁄ were evaluated using the first-order derivative of the aforementioned smoothing function.Meanwhile, the viable cell density data were smoothened using a logistic function, given by: (Goudar et al., 2005): where the parameters A, B, C and D were obtained by minimizing the sum of squares of the difference between the logistic function above and   data.Given  , ()  ⁄ and   () from the above data smoothening steps, the secretion fluxes of each glycoform  , () were computed according to: Note that in the above derivations, we assumed a constant volume of the cell culture.For the fedbatch experiments that we considered, such an assumption was reasonable since the cell culture volume varied only slightly with time (see SI Figure 1).When the volume of the cell culture varies strongly with time, the secretion flux calculation should take into account the volumetric changes, as follow: where V(t) is the cell culture volume at time t.

Glycosylation Flux Analysis
The glycosylation flux analysis (GFA) gives predictions for the intracellular fluxes of the glycosylation network (see Figure 1) under a pseudo steady state assumption based on a constraint-based modelling approach, as follow: where cI,i denotes the (internal) concentration of the i-th glycoform,  , denotes the (internal) flux of the j-th glycosylation reaction,  , denotes the (external) secretion flux of the i-th glycoform, and S(i,j) is the number of molecules of the i-th glycoform that is consumed (when S(i,j) < 0) or produced (when S(i,j) > 0) by the j-th reaction.The pseudo steady state assumption is reasonable as the residence time of Golgi apparatuses between 20-40 minutes (Bibila and Flickinger, 1991;Hirschberg and Lippincott-Schwartz, 1999), is much shorter than the time scale of the cell culture dynamics (on the order of days).The relationship above can be written in a matrix-vector format as follows: where S denotes the stoichiometric matrix of the network, and vI(t) and vE(t) denote the vectors of glycosylation and secretion fluxes of the protein glycoforms, respectively.The stoichiometric matrix is an m by n matrix where m is the number of protein glycoforms and n is the number of reactions in the network.Based on Eq. ( 7), the values of the glycosylation fluxes at each time point   can be computed directly from the secretion fluxes of the glycoforms vE(tk) when the stoichiometric matrix S is invertible or has a full column rank.Given that the glycosylation enzymes can act on multiple substrates, the number of reaction exceeds the number of structures in the glycosylation network (i.e.m is smaller than n).In other words, the stoichiometric matrix is not invertible and the calculation of the internal glycosylation fluxes is an underdetermined problem.
In order to resolve the underdetermined problem above, we assume that the temporal changes in the glycosylation fluxes vI are the result of (i) local enzyme-specific alterations and (ii) global Without loss of generality, in this work we used the first measurement time point as the reference time, and thus assigned all   ( 1 ) and ( 1 ) to 1.
The factors   () capture the dynamical alterations of glycosylation fluxes through local (enzyme-specific) factors such as enzyme expression, activity, inhibition, and nucleotide sugar availability.On the other hand, the variable () describes the influence of the global cell metabolism on the glycosylation network, in particular the dynamical changes in the specific productivity of the cells.Therefore, we computed () as the ratio between the specific productivity (qmAb) at time t and that at time tref, as follows: Given the relationships in Eqs. ( 8) and ( 9), the calculation of the internal glycosylation fluxes reduces to fitting   () and  ,  to the secretion flux values, according to: The schematic of the GFA workflow is depicted in Figure 2. The number of optimization variables above is   ( − 1) + , where   is the number of enzymes in the glycosylation network.As long as the total number of secretion flux values is larger than or equal to the number of unknowns above (i.e.,  ≥   ( − 1) + ), the GFA problem is over-or fullydetermined, respectively.In this work, the optimization in the GFA was solved in MATLAB by using a global optimization method called the enhanced scatter search method from the MEIGO toolbox (Egea et al., 2014).
For quantifying the uncertainty in the estimated   (  ) and  ,  , we generated 10 synthetic datasets for each process, consisting of glycoform concentrations and   values at the same measurement time points, by contaminating the actual data with independent and identically distributed noise from a Gaussian distribution.We used the residuals from the data smoothening step to compute the variance of the noise for each variable.Finally, for each synthetic dataset, we carried out the GFA in the same manner as described above.In the application of the GFA to CHO fed-batch cultivation below, we reported the optimized   (  ) and  ,  along with the standard deviation evaluated using the GFA results of the synthetic datasets.

Cell culture data
Figure 3 shows the cell culture measurements of the viable cell density, IgG concentration and glycoform distribution (fractions) from process A and B. The logistic function smoothing for Xv(t) appears as solid lines in the figure above.The cell density in process A reached a peak of ~20×10 6 cells/mL on day 7, while the peak cell density of process B was higher at ~25×10 6 cells/mL and occurred on day 8.While the viable cell density in process A was lower than that in process B, both processes had a similar final IgG concentration of ~3 g/L.The logistic function could describe the overall temporal variation in the viable cell density data reasonably well (SI Table 1).A more detailed comparison between the two process datasets is provided in SI Figure 2.
The time evolution of the IgG glycan distribution in both processes showed an increase of less processed glycan fractions toward the end of the cell culture duration, in agreement with previous studies using the same cell line (Aghamohseni et al., 2014;Pacis et al., 2011).While such a trend was observed for the entire cell culture duration in process A, this trend was reversed in the first half of process B.More precisely, between day 3 and 6, we observed an increase in the fraction of galactosylated glycan structures.This observation agreed with a previous study investigating the effect of manganese addition using real-time glycosylation monitoring (Tharmalingam et al., 2015).Manganese is a cofactor of many Golgi resident enzymes, including N-acetylglucosaminyltransferases I/II and galactosyltransferases (Hendrickson and Imperiali, 1995).

GFA of CHO Fed-batch Cultivation
Figure 1 shows the glycosylation network used in the GFA of CHO-S cell culture.The network was derived from a previous publication (Villiger et al., 2016), in which undetectable and nonintermediate glycoforms were removed (Shah et al., 2008).In the GFA, we considered the was larger than the number of unknowns (95), and hence the GFA was over-determined.A few of the glycoforms in the network (M8, AM5, FAM5, FA2G1-1S1, FA2G1-2S1, FA2G2S1-2) were undetected, and thus we set the corresponding secretion rates to 0. Figure 4 shows the concentration time profiles of the glycoforms and the corresponding Hill-type smoothing curves for the two process conditions.[pg/cell/d]  at an approximately constant level.A decrease in GalT activity is a frequently observed phenomena in industrial fed-batch processes of CHO-S culture (Gawlitzek et al., 2000), because of the adverse environmental conditions to which the cells are exposed (e.g., by-products, nutrient limitation and overfeeding, osmotic stress).As shown previously, the activity of GalT (and SiaT) for the cell line under investigation is much more sensitive to extracellular environment compared to ManI/II, GnTI/II and FucT (Villiger et al., 2016).In this regard, the accumulation of ammonia during the course of the cell culture (see SI Figure 2) could cause a pH increase in the Golgi, leading to a decrease in the GalT activity.The GFA of process A showed a slight increase in   () between day 3 and 6, but such an increase was not statistically significant (H0:   () = 1 has a p value larger than 0.05).Also, the drop in the GalT-specific factor explained the increase of FA2 and the decrease of galactosylated glycan structures towards the end of the two fed-batch cultivations.The flux values of reactions catalyzed by SiaT were not shown since these fluxes were predicted to be near zero (see SI Figure 3).

Effects of manganese
While processes A and B produced comparable glycoform distributions after day 10, the glycan fractions prior to this day, especially those of FA2 and the galactosylated structures, showed a completely different dynamic behavior.In order to elucidate the dynamic alterations in the IgG glycosylation caused by the change in the culture media composition and the addition of Mn, we computed the relative deviation in the viable cell density and in the (global) cell specific productivities (see Figure 7).The relative deviation between the two processes is computed as follows: where zA and zB denote the variable of interest (i.e. the viable cell density and specific productivities) of process A and B, respectively.Figure 8 further shows the relative deviation of the normalized glycosylation throughput for each enzyme J, defined as Note that the summation in the numerator above is done over all fluxes catalysed by the enzyme J.The normalized flux throughput of an enzyme reflects the activity of the glycosylation reactions carried out by a particular enzyme relative to the total activity of all glycosylation reactions in the network.
Figure 7 Relative deviations in (A) the viable cell density and (B) the cell specific productivity between process A and B. A positive (negative) value indicates that the variable is larger (smaller) in process B than in process A.
Figure 7 shows that the alternative culture media in process B led to a higher viable cell density but resulted in a lower specific productivity than process A. These observations suggested a trade-off between the cell growth and the antibody production upon changing the media composition.Except for fluxes catalysed by GalT, the normalized glycosylation throughputs in process B were roughly the same as those in process A for the entire cultivation period (see Figure 8).The normalized throughput of GalT showed a markedly more dynamic difference between the two processes.In particular, we observed a higher GalT normalized throughput in process B than in process A during the first half of the cell cultivation.This increase diminished with time and the relative deviation eventually dropped to roughly zero as with the other

Discussion
The combination of experimental and computational approaches, particularly using CBM, has brought a tremendous progress in metabolic engineering, for elucidating the regulation of cellular metabolism, for predicting the outcome of perturbations to the metabolic network (e.g.knock-outs), and for strain optimization (Bordbar et al., 2014).However, the CBM is not yet routinely applied to study the glycosylation network.Driven by the importance of understanding and controlling glycosylation of proteins, we created the glycosylation flux analysis by adapting the CBM approach for the glycosylation network.Based on a pseudo steady state assumption of the fluxes in the glycosylation network, the GFA generates dynamic glycosylation flux predictions using time-resolved measurements of cell culture data, particularly viable cell density, protein titer and glycan fractions.More specifically, the GFA captures the dynamic changes in the glycosylation fluxes using two pre-multiplicative factors;   () that accounts for the (local) changes that occur in an enzyme-specific manner, and () that accounts for the (global) alterations in cell metabolism that affects the cell specific productivity.Besides reducing the degrees of freedom in the flux prediction problem (due to the small number of enzymes in the glycosylation network), the use of pre-multiplicative factors above leads to the delineation of local and global effects of the process conditions on the protein glycosylation.In contrast to the more detailed differential equation based analyses of the glycosylation networks (Jedrzejewski et al., 2014;Jimenez del Val et al., 2011;Krambeck and Betenbaugh, 2005), the GFA, like the MFA, represents a parameter-free approach and requires only the stoichiometric matrix information.However, since the GFA involves solving the inverse problemestimating intracellular fluxes given dynamic cell culture data, this method cannot directly be used for  Because all intracellular glycosylation fluxes associated with a particular enzyme J scale equally by   (), the ratios among these fluxes thus remain constant with variations in   ().Such ratios are given by the relative magnitudes of the corresponding  ,  (see Figure 6).Finally, since   () and () are set to 1 at the reference time point tref,  ,  are the predictions for the intracellular glycosylation fluxes at this time point, i.e.  , (  ) =  ,  .
We applied the GFA to elucidate the glycosylation fluxes in two fed-batch cultivations of CHO cells producing IgG using different cell culture media.Our analyses showed that the temporal variation in the glycan distribution mainly resulted from the decrease in the galactosylation activity with time, relative to the other glycosylation reactions.We attributed this drop to the sensitivity of GalT activity to environmental changes, particularly the accumulation of ammonia in the cell culture.While the time profiles of the viable cell density and the specific productivity of IgG were similar between the two fed-batch processes, the changes in the glyoprofiles during the first half of the cultivations followed markedly different dynamics.The GFA showed that between day 3 and 6, the addition of Mn, a co-factor of GalT (Ramakrishnan et al., 2004). in process B increased the normalized galactosylation throughput.However, this increase largely disappeared by day 10.While Mn has also been reported to be a co-factor for GnTI/II (Nishikawast et al., 1988), under these cultivation conditions, the normalized glucosamination throughput in process B was roughly equal in that in process A. This observation suggested that the enzyme activity of GnTI/II might not be the rate-limiting factor of the IgG glucosamination in the two fed-batch cultivations.

Conclusion
In this work, we adapted a constraint-based modeling approach to the analysis of protein IgG production demonstrated the dynamic information that one could obtain from this analysis, specifically on the key glycosylation enzyme(s) that governed the resulting glycoform distribution.As more research efforts are invested in understanding protein glycosylation, .CC-BY 4.0 International license a certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.It is made available under The copyright holder for this preprint (which was not this version posted March 28, 2017.; https://doi.org/10.1101/121517doi: bioRxiv preprint 28 producing a treasure trove of cell culture and glycoprofiling data, a network-based analysis method, like the GFA, will play an enabling role in generating insights into the complex intracellular regulation and control of the protein glycosylation process.

Figure 1 N
Figure 1 dynamical changes in the cell metabolism.Based on this assumption, we write the time evolution of internal glycosylation fluxes as follow:  , () =   ()() ,  (8) where   () denotes the enzyme-specific (local) factor, () denotes the cell specific (global) factor, and  ,  denotes the reference flux value for the j-th internal flux  , .The enzymespecific factor   () is shared among all reactions catalysed by the same enzyme J, while the cell factor () applies to all reactions in the glycosylation network.The factors   () and () represent the time-dependent fold-amplification or -attenuation of the glycosylation fluxes, and hence their values can be normalized with respect to those at an arbitrary reference time point tref.

Figure 2 The
Figure 2

Figure 3
Figure3 glycosylation flux balances around all protein glycoforms downstream of M8 (including M8), which corresponded to m = 19 glycoforms and n = 25 reactions (see SI Text A for the glycosylation network model).For K = 11 time points, the number of secretion flux values (209)

Figure 5
Figure 5 depicts the secretion fluxes of the glycoforms (in solid lines) after the data pre-

Figure 6
Figure 6 describes the time profiles of the enzyme-specific factors   (), the cell specific productivity (), and the reference fluxes  ,  for process A and B. The cell specific

Figure 6 The
Figure 6 , we did not compare the glycosylation fluxes catalysed by SiaT as the flux values were nearly zero (~10 -3 pg/cell/day).

Figure 8
Figure 8 predicting the effects of changing process parameters or genetic alterations on   (), () and  ,  a priori.There exist several implications from the formulation of intracellular glycosylation flux  , that depends on three variables:   (), () and  ,  .Starting with (), for fixed values of   (), a fold-increase (-decrease) in () means an equal fold-increase (-decrease) among all of the glycosylation and secretion fluxes in the network (see SI Text B).Therefore, () captures the global change in the glycosylation network, but such a change will not affect the glycan distribution in the monoclonal antibody.Meanwhile, for fixed values of () and  ,  , a variation in the enzyme-specific factors   () will lead to a reorganization of the glycosylation fluxes and thus changes in the glycan distribution.The reorganization of the internal fluxes is restricted by constraints enforced by the flux balance equation in Eq. (7) (see SI Text B).
glycosylation and developed the glycosylation flux analysis method.The GFA uses extracellular data of glycoform secretion fluxes and a (pseudo) steady state model of the glycosylation network to determine the intracellular glycosylation fluxes.In this regard, the GFA separates the influence of local enzyme-specific factors and global cell metabolism on each glycosylation flux.The ability to decouple the influences of local and global factors is important in elucidating the regulation of protein glycosylation and especially in efforts to control the outcome of the glycosylation process.The application of the GFA to two fed-batch cultivations of CHO cells for review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.It is made available underThe copyright holder for this preprint (which was not this version posted March 28, 2017.; https://doi.org/10.1101 review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.It is made available under review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.It is made available under