Quantifying biosynthetic network robustness across the human oral microbiome

Metabolic interactions, such as cross-feeding, play a prominent role in microbial communitystructure. For example, they may underlie the ubiquity of uncultivated microorganisms. We investigated this phenomenon in the human oral microbiome, by analyzing microbial metabolic networks derived from sequenced genomes. Specifically, we devised a probabilistic biosynthetic network robustness metric that describes the chance that an organism could produce a given metabolite, and used it to assemble a comprehensive atlas of biosynthetic capabilities for 88 metabolites across 456 human oral microbiome strains. A cluster of organisms characterized by reduced biosynthetic capabilities stood out within this atlas. This cluster included several uncultivated taxa and three recently co-cultured Saccharibacteria (TM7) phylum species. Comparison across strains also allowed us to systematically identify specific putative metabolic interdependences between organisms. Our method, which provides a new way of converting annotated genomes into metabolic predictions, is easily extendible to other microbial communities and metabolic products.


Introduction
Metabolism, in addition to enabling growth and homeostasis for individual microbes, is a 27 powerful "currency", that contributes to the organization of microbes into complex, dynamic 28 societies. Metabolic interactions are believed to influence microbial community structure and 29 dynamics at multiple spatial and temporal scales 1-5 . For example, through cross-feeding, a 30 compound produced by one species might benefit another, leading to a network of metabolic 31 interdependences 5-10 . An extreme case of interdependence between microbes is believed to 32 underlie what is usually known as "microbial uncultivability" 11 , i.e. the fact that many microbes 33 isolated from a given environment do not grow in pure culture on standard laboratory conditions. 34 This observation, originally proposed as "the great plate count anomaly" 12 , has motivated interest 35 in understanding the possible mechanisms underlying unculturability 11,13,14 . One class of 36 mechanisms is based on the concept that the growth of uncultivable microbes depends on their 37 community context via diffusible metabolites produced by their neighbors 14 . These dependent 38 microbes are often referred to as fastidious, due to their limited biosynthetic capabilities and 39 reliance on externally supplied metabolites for growth. The prominence of fastidious microbial 40 organisms across the tree of life and their potential importance in microbial community structure 41 is highlighted by the recent identification of the candidate phyla radiationa large branch of the 42 tree of life consisting mainly of uncultivated organisms with small genomes and unique 43 metabolic properties [15][16][17] . 44 Some of the most promising strides in understanding metabolic interdependences between 45 microbes have been taken in the study of the human oral microbiome. The human oral 46 microbiome serves as an excellent model system for microbial communities research, due to its 47 epidemiology, as well as to the study of cascading metabolic failure upon gene deletions in 114 metabolism 60 . In percolation theory the robustness of a network can be characterized by 115 randomly adding or removing components (nodes or edges) of a network and assessing network 116 connectivity 61 . We utilized this concept to characterize the network robustness of a particular 117 metabolic network towards a specified target metabolite by randomly adding input metabolites to 118 the network and assessing the network's ability to produce the specified target metabolite. 119 To implement our method, we first introduced a probabilistic framework for analyzing metabolic 120 networks (Figure 1 A). In this framework, every metabolite can be considered to be drawn from 121 a Bernoulli distribution, i.e. present in the network with a given input probability (Pin). These 122 probabilities could represent beliefs about the environment, chances of metabolites being 123 available from a host organism, or any arbitrary prior on metabolite inputs. Throughout the 124 implementation of our method, we have assigned Pin to be an identical value for all input 125 metabolites. However, future implementations of this probabilistic framework could easily 126 utilize Pin values that vary across metabolites, e.g. matching experimentally measured 127 abundances. Following the assignment of Pin, the network structure can be used to calculate the 128 output probability (Pout) of some specified target metabolite. In practice, random sampling of 129 probabilistically drawn input metabolite sets is used to calculate the probability of producing the 130 target metabolite. For each random sample, flux balance analysis 46 with inequality mass balance 131 constraints is used to assess the networks ability to produce the target metabolite (for a complete 132 explanation of how flux balance analysis is implemented in this context, see methods section: 133 Algorithm functions, feas).  The probabilistic method that we introduced allows the definition of two novel concepts, the 153 "producibility curve" and "producibility metric" (PM) (Figure 1 B) (Figure 2 B). This reaction also consumes NADH 195 and produces NAD + , but because these cofactors can be easily recycled from each other by a  222 We next applied our method to the human oral microbiome, aiming at a mechanistic represent an atlas of biosynthetic capabilities across these human oral microbiome organisms.

252
The ensuing atlas is represented as hierarchically bi-clustered PM values for all 456 organisms 253 and 88 metabolites in Figure 3. The same data is available in Supplementary Figure 3 (clustered 254 by taxonomy), and in Supplementary Table 3.

13
Before analyzing in detail the patterns identifiable in the PM atlas of Figure 3, we showed that 279 such patterns cannot be trivially attributed to simple broad properties, such as genome size, even 280 if genome size is known to be an important predictor of the overall biosynthetic capabilities of an 281 organism 73 . Fastidious or parasitic organisms tend to have reduced genomes and consequently 282 reduced metabolic capabilities. In our data, the overall average PM value for each organism can   namely Tannerella HMT-286. Interestingly, this bacterium is hypothesized to rely on externally 373 supplied siderophores to support its growth 66 . This type of protein dependency is not captured by 374 our metabolic analysis and highlights the fact that, while uncultivability can be driven by many 375 different mechanisms, our method captures the prominent effect of reduced metabolic capacity.

376
The other uncultivated organisms that we identified in this cluster have been hypothesized to 377 have reduced genomes and limited metabolic capabilities underlying their fastidious nature, 378 much like Mycoplasma. 379 We sought to gain clearer insight into the metabolic properties of these co-clustered fastidious

488
(C) Several vitamins/cofactors/other essential factors had significantly decreased PM in TM7 compared to the hosts.

489
The cofactors acyl carrier protein and flavin adenine dinucleotide had decreased PM in TM7, and were also not 490 found to be utilized in the TM7 draft metabolic network reconstructions. Further analysis of these organisms with our method could continue to provide insight into their 557 unique metabolic properties.

558
By quickly translating genotype into phenotype with minimal assumptions, our approach has the 559 potential to serve as a baseline estimate of metabolic mechanisms in different microbial 560 communities and allows us to more easily decipher microbial community structure and function.

561
Our method can be easily applied other human-associated or environmentally relevant microbial 562 communities, providing valuable putative insight into inter-microbial metabolic dependencies, 563 that could be used to interpret existing data or design future experiments. In particular, we 564 envisage that this type of metabolic insight could help bridge the gap between correlation studies 565 and a mechanistic understanding of microbial community metabolism and dynamics.

567
Method implementation 568 The framework for implementing our method was developed as several different modular 569 functions that interact in a nested manner to run our analysis. The structure of these functions 570 and their associated variables is described in Supplementary Figure 7   Large-scale analysis of biosynthetic capabilities across the human oral microbiome 719 We investigated the large-scale biosynthetic properties of the human oral microbiome by  included as a positive control. The calculations were parallelized across metabolic networks and 726 metabolites using the Boston University shared computing cluster to improve computation time.

727
The PM values were stored as a matrix of organisms by metabolites PM values. This matrix was 728 analyzed using hierarchical bi-clustering based on average differences between groups. The 729 matrix was clustered and visualized using the R package pheatmap. Capturing specific biosynthetic patterns across human oral microbiome organisms 747 We investigated specific trends in metabolite PM values related to taxonomy by analyzing the