A genome-scale metabolic reconstruction provides insight into the metabolism of the thermophilic bacterium Rhodothermus marinus

The thermophilic bacterium Rhodothermus marinus has mainly been studied for its thermostable enzymes. More recently, the potential of using the species as a cell factory and in biorefinery platforms has been explored, due to its elevated growth temperature, native production of compounds such as carotenoids and EPSs, ability to grow on a wide range of carbon sources including polysaccharides, and available genetic tools. A comprehensive understanding of the metabolism of production organisms is crucial. Here, we report a genome-scale metabolic model of R. marinus DSM 4252T. Moreover, the genome of the genetically amenable R. marinus ISCaR-493 was sequenced, and analysis of the core genome indicated that the model could be used for both strains. Bioreactor growth data were obtained and used to constrain the model, and the predicted and experimental growth rates were compared. The model correctly predicted the growth rates of both strains. During the reconstruction process, different aspects of R. marinus metabolism were reviewed, and subsequently both cell densities and carotenoid production were investigated for strain ISCaR-493 under different growth conditions. Additionally, the dxs gene from Thermus thermophilus, which was not found in the R. marinus genomes, was cloned on a shuttle vector into strain ISCaR-493, resulting in a higher yield of carotenoids.

Importance
A biorefinery converting biomass into fuels and value-added chemicals is a sustainable alternative to fossil fuel-based chemical synthesis. Rhodothermus marinus is a bacterium that is potentially well suited for biorefineries. It possesses various enzymes that degrade biomass, such as macroalgae and parts of plants (e.g. starch and xylan), and grows at high temperatures (55-77°C), which is beneficial in biorefinery processes. In this study, we reviewed the metabolism of R. marinus and constructed a metabolic model. Such a model can be used to predict phenotypes, e.g. growth under different environmental and genetic conditions. We focused specifically on metabolic features that are of interest in biotechnology, including carotenoid pigments, which are used in many different industries. We described cultivations of R. marinus and the resulting carotenoid production in different growth conditions, which aids in understanding how carotenoid yields can be increased in the bacterium.

consensus sequence using the highest quality bases, using Geneious (v9. 1.4 Cobrapy [41] was used in all model simulations, along with the GLPK solver. The corresponding code can be found as a Jupyter notebook on GitHub (https://github.com/steinng/rmarinus). For all simulations, flux balance analysis (FBA) was used [42], [43]. Exchange reactions corresponding to metabolites taken up from the medium (glucose and pyruvate) and secreted (lactate and acetate) during growth were constrained with experimentally obtained rates (Supplementary file 1). FBA was subsequently used to optimize for growth by maximizing flux through the biomass reaction.

For accurate growth rate predictions, the biomass reaction should ideally be based on data obtained for the target organism. Here, the biomass reaction was formulated based mostly on available data on R. marinus (Supplementary file 2). Separate biosynthetic reactions for each group of macromolecules (protein, lipid, DNA, etc.) were formulated, describing the ratio of the building blocks (amino acids, fatty acids, nucleotides, etc.) and the energy required. Sensitivity analysis, which shows how much variation in each macromolecule affects the predicted growth rate, was performed. This analysis helps to identify which biomass components most urgently need to be accurately measured.

Table 1.
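The step from a measured macromolecule composition to the coefficients of a biosynthetic reaction can be illustrated with a short calculation. The sketch below is not taken from the published notebook. It assumes a hypothetical three-monomer polymer, placeholder molar fractions, a placeholder dry-weight share, and approximate residue masses (free amino acid minus water). It converts molar fractions into mmol of each building block per gram of polymer, then scales by the polymer's share of the cell dry weight.

```python
def monomer_coefficients(mole_fractions, residue_masses):
    """Return mmol of each monomer per gram of polymer.

    mole_fractions: monomer -> molar fraction (must sum to 1)
    residue_masses: monomer -> mass of the polymerised residue in g/mol
    """
    total = sum(mole_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    # Average residue mass of the polymer in g/mol.
    mean_mass = sum(x * residue_masses[m] for m, x in mole_fractions.items())
    # 1 g of polymer contains 1000 / mean_mass mmol of residues in total.
    return {m: 1000.0 * x / mean_mass for m, x in mole_fractions.items()}


# Hypothetical composition: the fractions are placeholders, not R. marinus data.
fractions = {"ala": 0.5, "gly": 0.3, "leu": 0.2}
# Approximate residue masses in g/mol (free amino acid minus one water).
masses = {"ala": 71.08, "gly": 57.05, "leu": 113.16}

coefficients = monomer_coefficients(fractions, masses)

# Scale by the macromolecule's share of cell dry weight (placeholder value)
# to obtain mmol per gram dry weight for the overall biomass reaction.
protein_fraction = 0.5
biomass_coefficients = {m: c * protein_fraction for m, c in coefficients.items()}
```

A useful sanity check on such coefficients is mass conservation: multiplying each coefficient by its residue mass and summing should return 1 g of polymer, or the chosen dry-weight share after scaling.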

Results and discussion
The Memote tool [44] was used to help guide the reconstruction process, by verifying stoichiometric consistency, mass and charge balance and annotation quality (Supplementary file 3). Reactions and metabolites were usually abbreviated in accordance with the BiGG database, and annotations with links to external databases are included. The genome sequence for strain DSM 4252T was obtained from GenBank (accession no. NC_013501). The genes in the reconstruction were identified with the locus tags from the GenBank file. They were annotated with the old gene locus tag from the GenBank file, the protein ID, protein annotation and protein sequence. Experimental data on R. marinus obtained in this study and the available literature were used to curate reactions, genes and gene-protein-reaction (GPR) rules. Several metabolic features were reviewed during the reconstruction process. In the following, we highlight a few that are of interest for biotechnological application of R. marinus.

Sugar metabolism
R. marinus produces pyruvate from glucose through the Embden-Meyerhof-Parnas (EMP) pathway. A 13C metabolic flux study of the central metabolism in R. marinus [45] showed that the EMP pathway and the TCA cycle are both highly active while metabolizing glucose. The oxidative pentose phosphate pathway and the glyoxylate shunt had very low activity, and the Entner-Doudoroff (ED) pathway, malic enzyme and phosphoenolpyruvate carboxykinase were inactive.

Growth of R. marinus strain DSM 4252T was tested on many different carbon sources, both in vivo and in silico (Table 2). Growth has been shown on several mono-, di- and polysaccharides, which was also observed in silico. However, growth on cellulose was predicted in silico while not observed in vivo. R Table 2). The gene encoding a β-galactosidase (EC 3.2.1.23), which hydrolyzes lactose into glucose and galactose, was found in the genome. 
Three steps are needed to convert galactose to glucose-6-phosphate, which then enters glycolysis (the EMP pathway). The genes encoding galactokinase and galactose-1-phosphate uridylyltransferase, catalyzing the first two steps (galactose -> galactose-1-phosphate -> glucose-1-phosphate), were found in the genome. The third step, where glucose-1-phosphate is turned into glucose-6-phosphate, is usually performed by the enzyme phosphoglucomutase (EC 5.4.2.2). The gene for this enzyme was not found in the genome. However, a homology search showed similarity between known phosphoglucomutase genes from other bacteria and genes RMAR_RS01880 (E value 2e-37) and RMAR_RS08875 (E value 1e-25), which are annotated as phosphomannomutase (EC 5.4.2.8) and phosphoglucosamine mutase (EC 5.4.2.10), respectively. The enzyme phosphoglucosamine mutase in E. coli, which usually catalyzes the interconversion of glucosamine-6-phosphate and glucosamine-1-phosphate, was also shown to be able to catalyze the interconversion of glucose-6-phosphate and glucose-1-phosphate, at a lower rate [49]. The phosphorylation site of this enzyme in E. coli is Ser102, and a mutational change of Ser100 to a threonine residue increased the phosphoglucomutase activity significantly. Gene RMAR_RS08875 from R. marinus was investigated, and the serine residue responsible for the phosphorylation was found to be residue number 103. The residue corresponding to Ser100 in the E. coli enzyme was found to be a threonine, which indicates that this R. marinus enzyme may be responsible for the interconversion of glucose-6-phosphate and glucose-1-phosphate in R. marinus.

The unsaturated monouronate is converted to 4-deoxy-L-erythro-5-hexoseulose uronic acid (DEH) by a spontaneous reaction and further converted to 2-keto-3-deoxygluconate (KDG) by an aldose reductase [54]. KDG enters the partial ED pathway in R. 
marinus, where it is catalyzed to 2-keto-3-deoxygluconate [62]. Here, the dead-end metabolite of polyamine biosynthesis, 5-methylthioadenosine (MTA), is metabolized by an alternative methionine salvage pathway, which produces DXP as a side-product. Phylogenetic analysis showed that the genes in this pathway are partially present in the R. marinus genome. At present, it is not known how DXP is produced in R. marinus and, without further evidence of alternative pathways, the DXP synthase reaction is present in the reconstruction without any gene candidates assigned. The absence of DXS in R. marinus directly suggests heterologous expression of a thermostable DXS as a means to increase flux through the terpenoid pathway.

Light-inducible carotenoid production has been observed in many organisms, including non-photosynthetic bacteria, and the regulatory mechanisms have been studied in some of them, including

in the metabolic reconstruction, but as they are produced in response to stress, they are not included in the biomass reaction and the pathways are therefore not active during growth simulations where the biomass is maximized.

file 4). The total number of genes was 2890 and 2937 for ISCaR-493 and DSM 4252T, respectively, and protein-coding genes were about 98.3% thereof for each strain. A total of 2609 protein-coding genes belonged to the common core, having 50% identity across at least 50% of the protein. Of the core protein-coding genes, 74% could be assigned to COG functional categories [82]. The remaining protein-coding core genes (26%) were not assigned COG IDs and had unknown (S) or poorly characterized functions (R). There were 230 protein-coding genes unique to ISCaR-493 and 275 genes unique to DSM 4252T, a total of 505 genes that comprised the peripheric gene fraction. 
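The core criterion stated above (50% identity across at least 50% of the protein) can be expressed as a simple filter. The sketch below is only an illustration under assumptions. The protein identifiers and alignment statistics are hypothetical, and the comparison is one-directional, whereas dedicated pan-genome tools typically cluster proteins reciprocally.

```python
def classify_genes(best_hits, min_identity=50.0, min_coverage=50.0):
    """Split one strain's proteins into core and strain-specific sets.

    best_hits: protein id -> (percent identity, percent of protein aligned)
               for its best match in the other strain, or None if no match.
    """
    core, unique = set(), set()
    for protein, hit in best_hits.items():
        if hit is not None and hit[0] >= min_identity and hit[1] >= min_coverage:
            core.add(protein)
        else:
            unique.add(protein)
    return core, unique


# Hypothetical identifiers and alignment statistics, for illustration only.
hits = {
    "prot_A": (98.2, 100.0),  # near-identical orthologue
    "prot_B": (51.0, 75.0),   # just above both thresholds
    "prot_C": (35.0, 90.0),   # identity too low
    "prot_D": (80.0, 30.0),   # alignment covers too little of the protein
    "prot_E": None,           # no match in the other strain
}
core, unique = classify_genes(hits)

# Consistency check on the counts reported in the text.
peripheric_total = 230 + 275
```

The last line simply confirms that the two strain-specific counts reported in the text add up to the stated peripheric fraction of 505 genes.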
About 50% of the protein-coding genes in the peripheric fractions of both strains could not be assigned a function, compared with approximately 26% in the common core (Figure 2).

Accessory genome
ISCaR-493 and only three of them did not have isozymes in the genome that showed high similarity to DSM 4252T genes (Table 3).

Strain DSM 4252T contains two genes encoding xylanases, while strain ISCaR-493 contains only one homologue. Both strains grow on xylan as the sole carbon source (Table 2 for DSM 4252T, data not shown for ISCaR-493). Strain DSM 4252T contains four genes encoding alginate lyases, and one of them is missing in ISCaR-493. The latter strain grows well in a medium with alginate as the sole carbon source [50], suggesting that the three alginate lyases are sufficient to degrade alginate for utilization.

Many enzymes take part in EPS (the envelope polysaccharides) biosynthesis and assembly, and their corresponding genes were all found in the core genome. A putative O-antigen polymerase (wzy) found in the accessory genome of strain ISCaR-493 showed low similarity to a functionally corresponding protein in DSM 4252T (E-value 0.004). However, the gene showed high similarity to genes in more distantly related bacteria annotated as O-antigen ligase and O-antigen polymerase. This enzyme activity is essential for EPS synthesis and must be present in ISCaR-493, as EPS is produced [10].

This analysis showed that the model accurately predicts growth for both strains (Figure 3a). For both strains, but more so for DSM 4252T, secretion rates of lactate and acetate increased during the growth phase (Supplementary file 1). A decrease in growth rate during batch cultivations has been observed in other bacteria, such as E. coli [83], where the main reason was oxygen limitation that could also lead to an increase in organic acid secretion. 
The cultivations here were carried out with high aeration, as oxygen levels were kept fixed at 40% pO2. A plausible explanation for why the cells would experience oxygen limitation in a medium with excess oxygen is local limitation due to cell aggregation [84]. Aggregation of several R. marinus strains has been reported previously [22], especially in DSM 4252T, and R. marinus has also been shown to produce exopolysaccharides [10], which can cause cells to aggregate [85].

When the model was optimized for growth, without oxygen limitation and with free secretion of acids, it did not predict any acid production, and the predicted growth rate was slightly higher than observed in vivo. When oxygen was limited in the model, the predicted growth rate decreased, and the model predicted lactate secretion (data not shown). Experimental data showed that lactate was secreted first, followed by acetate (Supplementary file 1). The model predicted a slightly higher growth rate when lactate was the sole acid produced, as opposed to when it was forced to also produce acetate.
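The qualitative behaviour described above (no acid secretion when oxygen is unrestricted, lactate secretion and a lower growth rate once oxygen is capped) follows from the linear-programming structure of FBA and can be reproduced in a toy network. The sketch below is not the R. marinus model and does not use Cobrapy. It assumes a hypothetical two-variable network with round-number ATP and oxygen stoichiometries, uses ATP production as a stand-in for growth, and solves the small linear program by checking every vertex of the feasible region.

```python
from itertools import combinations


def maximise_2d(objective, constraints, tol=1e-9):
    """Maximise objective . (x, y) subject to a . (x, y) <= b for every (a, b).

    Checks every vertex of a bounded two-variable feasible region.
    """
    best_value, best_point = None, None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundaries never meet in a vertex
        # Cramer's rule for the intersection of the two boundary lines.
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - b1 * a2[0]) / det
        if all(a[0] * x + a[1] * y <= b + tol for a, b in constraints):
            value = objective[0] * x + objective[1] * y
            if best_value is None or value > best_value:
                best_value, best_point = value, (x, y)
    return best_value, best_point


def toy_fba(glucose_max, oxygen_max):
    """Return (ATP proxy for growth, lactate secretion) for a toy network.

    Variables: g = glucose uptake, r = pyruvate respired.
    Steady state on pyruvate gives lactate secretion = 2*g - r.
    """
    # Assumed round-number stoichiometries, for illustration only.
    atp_glycolysis, atp_respiration, o2_per_pyruvate = 2.0, 12.5, 3.0
    constraints = [
        ((1.0, 0.0), glucose_max),             # g <= allowed glucose uptake
        ((-1.0, 0.0), 0.0),                    # g >= 0
        ((0.0, o2_per_pyruvate), oxygen_max),  # oxygen use <= oxygen supply
        ((0.0, -1.0), 0.0),                    # r >= 0
        ((-2.0, 1.0), 0.0),                    # lactate = 2*g - r >= 0
    ]
    atp, (g, r) = maximise_2d((atp_glycolysis, atp_respiration), constraints)
    return atp, 2.0 * g - r


atp_aerobic, lactate_aerobic = toy_fba(glucose_max=10.0, oxygen_max=100.0)
atp_limited, lactate_limited = toy_fba(glucose_max=10.0, oxygen_max=30.0)
```

In this toy setting all pyruvate is respired when oxygen is plentiful, whereas with the tighter oxygen bound the surplus pyruvate leaves as lactate and the ATP proxy drops. This matches the direction of the model predictions reported above, not their magnitude.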

To better understand the carotenoid production in R. marinus, a cultivation experiment comparing different conditions was performed. Besides obtaining high yields of carotenoids per cell, high cell density is important for achieving high yields of carotenoids. Therefore, both extracted carotenoids (from 1 mL of cells diluted to OD620 nm = 1) and cell densities were measured from cultivations after 24 hours (Figure 4). The ISCaR-493 strain was used in this experiment, as it can be genetically modified and is thus likely to be used for future cell factory designs.

Glucose and pyruvate
R. marinus can grow on several monosaccharides, as predicted by the model. However, we have often observed better growth on oligo- and polysaccharides (data not shown). Growth of strain ISCaR-493 in defined medium with glucose (1%) as the sole carbon source resulted neither in high cell density nor high carotenoid production (Figure 4). R. marinus can utilize pyruvate as the sole carbon source (Table 2). Pyruvate is used in several pathways essential for growth and is the substrate, together with glyceraldehyde 3-phosphate, in the first step of the MEP terpenoid pathway (Figure 1). To increase both cell density and carotenoid production, pyruvate (0.09%) was added to the glucose-based medium. This resulted in increased carotenoid production and a much higher cell density (Figure 4). Visually, these cultures exhibited a much stronger red color than the glucose cultures, which can be explained by both increased carotenoid yields and higher cell densities.

cultures were colorless and the lack of carotenoids was confirmed by measurements (Figure 4).

Impact of light
Alginate
R. marinus can grow on many different polysaccharides (Table 2), making it an interesting candidate for processing 2nd or 3rd generation biomass, such as seaweed. Alginate is one of the major polysaccharides of brown algae. The products from alginate degradation are pyruvate and glyceraldehyde 3-phosphate (Figure 1), which are the same metabolites as used in the first step of the MEP terpenoid pathway. This raised the question of whether R. marinus produces more carotenoids when grown on alginate, since alginate degradation produces the two metabolites needed for the biosynthesis concurrently and in equal amounts. Cultivations in defined medium with alginate (1%) as the sole carbon source showed lower cell density compared to glucose and pyruvate, but much higher carotenoid production (Figure 4).

Glucose and pyruvate in equal quantities
To further examine whether the availability of glyceraldehyde 3-phosphate and pyruvate in equal amounts results in higher carotenoid production, cultivation in defined medium with glucose (0.5%) and pyruvate (0.25%) was investigated. These cultivations showed lower cell density and higher carotenoid production compared to growth on glucose (1%) and pyruvate (0.09%) (Figure 4). The increased carotenoid production could be due to the equal availability of the two metabolites. Another possibility is that the increased concentration of pyruvate alone in the medium caused higher carotenoid production.

Pyruvate
To examine whether pyruvate alone affects the carotenoid production, two additional cultivations were set up, with pyruvate (0.09% and 0.18%) as the sole carbon source. The cell density in these cultures was low and increased only slightly after inoculation. This indicated that ISCaR-493 struggles to grow in liquid defined medium with pyruvate as the sole carbon source, which was surprising as growth was observed on agar medium (Table 2). 
The carotenoids per fixed cell density in the pyruvate cultures were much higher compared to cultures on glucose (1%) and pyruvate (0.09%). Additionally, increased pyruvate concentration resulted in increased carotenoid production (Figure 3). This suggests that the pyruvate is used for carotenoid production. Producing glyceraldehyde 3-phosphate from pyruvate costs energy (gluconeogenesis), and it cannot be determined from these data whether this is the case for the observed growth. However, glycogen is an alternative source of glyceraldehyde 3-phosphate. The amount of glycogen in the biomass of R. marinus has been estimated as 14% [45] and is relatively high compared to other bacteria. Inclusions that could possibly contain glycogen can be discerned on electron micrographs of R. marinus [7]. Considering the natural habitat of R. marinus in coastal hot springs, it is not unreasonable to assume that it accumulates high levels of glycogen. Due to tides, the availability of nutrients in the surroundings of R. marinus varies widely, and it is likely that the bacterium stores carbohydrates when they are in abundance in the environment. Since little or no growth was observed on pyruvate in liquid cultures, it is likely that the cells experienced starvation and therefore started the breakdown of glycogen and carotenoid production. This was also seen for the negative control cultures without a carbon source (Figure 4). The cell density did not increase from inoculation, while the carotenoid production did.

Addition of the dxs gene from T. thermophilus
In an effort to increase carotenoid yields, the dxs gene from T. thermophilus was cloned on a shuttle vector into R. marinus strain SB-62 (ISCaR-493 derivative, trpBpurA), resulting in the mutant strain TK-4 (trpBpurA::trpBdxsT.thermophilus) (Supplementary file 5). 
The dxs gene encodes 1-deoxy-D-xylulose-5-phosphate synthase (DXS), which catalyzes the first step in the MEP terpenoid pathway (section 3.1) and could not be identified in the genomes of R. marinus. Compared to ISCaR-493, cultivation of TK-4 resulted in lower cell density but much higher carotenoid production. Presumably the added dxs gene resulted in a higher flux of carbon through the terpenoid and carotenoid pathways. However, it is also possible that this strain struggles to grow and responds by producing carotenoids. The dramatically lower cell density compared to ISCaR-493 can most likely be explained by the metabolic burden caused by the replication of the shuttle vector and the expression of its genes. Inserting the dxs gene into the chromosome could reduce such effects.

In summary, these experiments showed that the highest cell density was obtained in glucose medium supplemented with pyruvate, while higher carotenoid production was observed during growth on alginate, with pyruvate added to a glucose-based medium and in the presence of light. They also showed that the carotenoid production per cell increased during starvation, indicating that yields can potentially be increased by either allowing the culture to reach and stay in stationary phase or transferring the cells after growth to new medium with a limited or no carbon source. The motivation for the latter is that after the growth phase, the medium might not be optimal, e.g. due to accumulation of by-products that alter the pH, and the cells might stay alive and produce carotenoids longer in fresh medium. Finally, cloning the dxs gene from T. thermophilus in R. marinus resulted in the highest yields of carotenoids, but much lower cell density than the wild type strain ISCaR-493.
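Because carotenoids were measured per fixed cell density (1 mL of cells diluted to OD620 nm = 1) and cell density was measured separately, the two readouts have to be combined to compare conditions by the amount of carotenoid per volume of culture. The sketch below shows that arithmetic under two assumptions: the content per OD unit scales linearly with culture OD, and the numbers are placeholders in arbitrary units, not measurements from this study.

```python
def volumetric_yield(carotenoid_per_od, cell_density_od):
    """Carotenoid per mL of culture = content per OD unit x culture OD."""
    return carotenoid_per_od * cell_density_od


# Placeholder values in arbitrary units; NOT data from this study.
# Each entry is (carotenoid per mL at OD 1, culture OD).
conditions = {
    "high density, low content": (1.0, 4.0),
    "low density, high content": (3.0, 1.0),
    "balanced": (2.0, 2.5),
}

# Rank the hypothetical conditions by carotenoid per mL of culture.
ranking = sorted(
    conditions,
    key=lambda name: volumetric_yield(*conditions[name]),
    reverse=True,
)
```

With these placeholder values, the condition that is best on neither single readout ranks first. This is why both the per-cell content and the cell density are reported for each cultivation.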

A manually curated genome-scale metabolic model of R. marinus DSM 4252T was reconstructed and made publicly available (https://github.com/steinng/rmarinus). Experimental data from the literature and from this study were used to curate and validate the model. These include growth data on various carbon sources, bioreactor cultivations and HPLC measurements of main metabolites, used for model validation, multiple studies on different metabolic pathways, components, genes and enzymes, and data on biomass components, which were used to formulate a species-specific biomass objective function.

The model was also evaluated for use with R. marinus ISCaR-493, from which the genetically modified SB-62 (trpBpurA) was derived. The genome of strain ISCaR-493 was sequenced and the resulting draft genome was compared to that of strain DSM 4252T. This analysis showed that only seven model genes were absent in strain ISCaR-493, and four of them were replaced by genes encoding isozymes that exhibited high similarity to the DSM 4252T enzymes. The remaining three genes are involved in EPS formation and in xylan and alginate degradation. EPSs of both strains have been previously studied [10] and shown to be of similar structures. It was also observed that strain ISCaR-493 grows well in defined medium with xylan and alginate as the sole carbon sources. In conclusion, this analysis suggests that the model is applicable for both strains DSM 4252T and ISCaR-493. Both strains should be considered when any future changes or additions to the model reconstruction are made. Data on growth and metabolites were used to constrain the model and compare the experimental and simulated growth rates. This revealed that the model predicts correct growth rates for both strains.

Different aspects of the metabolism of R. marinus were reviewed during the reconstruction process. 
Here, the emphasis was on those with potential biotechnological applications, carotenoids in particular. Cell density and carotenoid production of strain ISCaR-493 grown under different conditions were investigated. When pyruvate was added to a glucose-based medium, a much higher cell density was obtained. Carotenoid production varied considerably under different growth conditions. Higher carotenoid yields were observed when pyruvate was present in the growth medium, when alginate was used as the sole carbon source, when the cells were cultivated in light, and when the cells experienced starvation. Additionally, we cloned the dxs gene from T. thermophilus on a shuttle vector into R. marinus, and cultivation of the resulting mutant showed low cell density compared to ISCaR-493, but higher carotenoid production.

With its thermostable enzymes, wide range of potential carbon sources for growth and marketable products, R. marinus has high application potential in biotechnology and biorefineries. A genome-scale metabolic model helps us to understand its metabolism and should be useful in future strain designs.