Introduction
Microbes typically form diverse communities of interacting species, whose activities have tremendous impact on the plants, animals, and humans they associate with1–3, as well as on the biogeochemistry of the entire planet4. The ability to predict the structure of these complex communities is crucial to understanding, managing, and utilizing them5. Here, we propose a simple, qualitative assembly rule that predicts community structure from the outcomes of competitions between small sets of species, and experimentally assess its predictive power using synthetic microbial communities. The rule’s accuracy was evaluated by competing combinations of up to eight soil bacterial species, and comparing the experimentally observed outcomes to the predicted ones. Nearly all competitions resulted in a unique, stable community, whose composition was independent of the initial species fractions. Survival in three-species competitions was predicted by the pairwise outcomes with an accuracy of ~90%. Obtaining a similar level of accuracy in competitions between sets of seven or all eight species required incorporating additional information regarding the outcomes of the three-species competitions. Our results demonstrate experimentally the ability of a simple bottom-up approach to predict community structure. Such an approach is key for anticipating the response of communities to changing environments, designing interventions to steer existing communities to more desirable states, and, ultimately, rationally designing communities de novo6, 7.
Main text
Modeling and predicting microbial community structure is often pursued using bottom-up approaches that assume that species interact in a pairwise manner8–11. However, pair interactions may be modulated by the presence of additional species12,13, an effect that can significantly alter community structure14 and may be common in microbial communities15. While it has been shown that such models can provide a reasonable fit to sequencing data of intestinal microbiomes16,17, their predictive power remains uncertain, as it has rarely been directly tested experimentally (18–19 are notable exceptions).
Current approaches to modeling microbial communities commonly employ a specific parametric model, such as the generalized Lotka-Volterra (gLV) model20–22. Generating predictions from such models requires fitting a large number of parameter values from empirical data, which is often challenging and prone to over-fitting. In addition, the exact form of the interactions needs to be assumed, and a failure of the model can reflect a misspecification of the type of pairwise interaction, rather than the presence of higher-order interactions23.
Here we take an alternative approach in which qualitative information regarding the survival of species in competitions between small sets of species (e.g., pairwise competitions) is used to predict survival in more diverse multispecies competitions (Fig. 1). While this approach forgoes the ability to predict exact species abundances, it does not require specifying and parameterizing the exact form of interactions. Therefore, it is robust to model misspecification, and requires only survival data, which can be more readily obtained than exact parameter values.
Competitions typically result in the survival of a set of coexisting species, which cannot be invaded by any of the species that went extinct during the competition. To identify sets of species that are expected to coexist and exclude additional species, we first use the outcomes of pairwise competitions. We propose the following assembly rule: in a multispecies competition, species that all coexist with each other in pairs will survive, whereas species which are excluded by any of the surviving species will go extinct.
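As an illustration, the assembly rule can be sketched as a search over candidate survivor sets. This is a minimal sketch under one reading of the rule, not the implementation used in the study: the function name `predict_survivors` and the `excludes` encoding (a set of hypothetical `(winner, loser)` tuples, with every unlisted pair assumed to coexist) are assumptions introduced here, and bistable pairs are not handled.

```python
from itertools import combinations

def predict_survivors(species, excludes):
    """Return every species set consistent with the pairwise assembly rule.

    `excludes` holds (winner, loser) tuples (hypothetical encoding); any pair
    not listed in either direction is assumed to coexist.
    """
    def coexist(a, b):
        return (a, b) not in excludes and (b, a) not in excludes

    feasible = []
    for size in range(1, len(species) + 1):
        for candidate in combinations(species, size):
            # Every pair inside the candidate set must coexist.
            if not all(coexist(a, b) for a, b in combinations(candidate, 2)):
                continue
            # Every species outside the set must be excluded by some member.
            outsiders = [s for s in species if s not in candidate]
            if all(any((m, o) in excludes for m in candidate) for o in outsiders):
                feasible.append(set(candidate))
    return feasible

# Example: A and C each coexist with B, and A excludes C.
print(predict_survivors(["A", "B", "C"], {("A", "C")}))
```

For this example the only feasible set contains A and B. The function returns a list because, in general, more than one set may satisfy the rule.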
To directly assess the predictive power of this approach, we used a set of eight heterotrophic soil-dwelling bacterial species as a model system (Fig. 2a, Methods). Competition experiments were performed by co-inoculating species at varying initial fractions, and propagating them through five growth-dilution cycles (Fig. S1). During each cycle, cells were cultured for 48 hours and then diluted by a factor of 1500 into fresh media, which corresponds to ~10.6 cellular divisions per growth cycle, and ~53 cellular divisions over the entire competition period. The overall competition time was chosen such that species extinctions would have sufficient time to occur, while new mutants would typically not have time to arise and spread. Community compositions were assessed by measuring the culture optical density (OD), as well as by plating on solid agar media and counting colonies, which are distinct for each species25. These two measurements quantify the overall abundance of microbes in the community, and the relative abundances of individual species, respectively. All experiments were done in duplicate.
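The number of divisions quoted above follows from the dilution factor, assuming that each culture regrows to roughly the same density in every cycle, so that growth exactly compensates for the 1500-fold dilution:

```latex
\[
  \log_2(1500) \approx 10.55 \ \text{divisions per cycle},
  \qquad
  5 \times \log_2(1500) \approx 52.8 \ \text{divisions in total}.
\]
```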
Pairwise competitions resulted in stable coexistence or competitive exclusion of one of the species. We performed competitions between all species pairs and found that in the majority of the pairs (19/28 = 68%, Fig. 2b) both species could invade each other, and thus stably coexisted. In the remaining pairs (9/28 = 32%) competitive exclusion occurred, where only one species could invade the other. Time trajectories from one coexisting pair and one pair where exclusion occurs are shown in Fig. 2c, and outcomes for all pairs are shown in Fig. 2d. Species’ growth rate in monoculture was correlated with their average competitive ability, but, in line with previous reports26, it did not reliably predict the outcome of specific pair competitions (Fig. S2).
Next, we measured the outcome of competition between all 56 three-species combinations. These competitions typically resulted in a stable community whose composition was independent of the starting fractions (Table S1). However, 2 of the 56 trios displayed inconsistent results with high variability between replicates. This variability likely resulted from rapid evolutionary changes that occurred during the competition (Fig. S3). All but one of the other trio competitions resulted in stable communities with a single outcome, independent of starting conditions. This raises the question of whether this unique outcome could be predicted based upon the experimentally observed outcomes of the pairwise competitions.
Trios were grouped by the topology of their pairwise outcome network, which was used to predict their competitive outcomes. The most common topology involved two coexisting pairs, and a pair where competitive exclusion occurs (30/56 = 54%). To illustrate this scenario, consider a set of three species, labeled A, B, and C, where species A and C coexist with B in pairwise competitions, whereas C is excluded when competing with A. In this case, our proposed assembly rule predicts that the trio competition will result in the survival of species A and B, and exclusion of C (Fig. 3a).
This predicted outcome occurred for a majority of the experimentally observed trios (Fig. 3b), but some trio competitions resulted in less intuitive outcomes (Fig. 3c). For example, 1 of the 30 trios with this topology led to the extinction of A and the coexistence of B and C (Fig. 3c). The experimentally observed outcomes of competition in this trio topology highlight that our simple assembly rule typically works, and the failures provide a sense of alternative outcomes that are possible given the same underlying topology of pairwise outcomes.
Another frequent topology was coexistence between all three species pairs (15/56 = 27%), in which case none of the species is predicted to be excluded in the trio competition (Fig. 3d). Such trio competitions resulted either in coexistence of all three species, as predicted by our assembly rule (Fig. 3e), or in the exclusion of one of the species (Fig. 3f). Overall, 5 different trio layouts, and 11 competitive outcomes have been observed (Fig. 3g-k). Notably, all observed trio outcomes across all topologies can be generated from simple pairwise interactions, including the outcomes which were not correctly predicted by our assembly rule24. An incorrect prediction of our simple assembly rule is therefore not necessarily caused by higher-order interactions.
Overall, survival in three-species competitions was well predicted by pairwise outcomes. The assembly rule predicted species survival across all the three-way competitions with an 89.5% accuracy (Fig. 4a), where accuracy is defined as the fraction of species whose survival was correctly predicted. To get a sense of how the observed accuracy compares to the accuracy attainable when pairwise outcomes are not known, as a null model, we considered the case where the only information available is the average probability that a species will survive in a trio competition (note that this probability is not assumed to be available in our simple assembly rule). Using this information, trio outcomes could only be predicted with a 72% accuracy (Fig. 4a, Methods). We further compared the observed accuracy to the accuracy expected when species interact solely in a pairwise manner, according to the gLV equations with a random interaction matrix (Methods). We found that the observed accuracy is consistent with the accuracy obtained in simulations of competitions that parallel our experimental setup (p=0.29, Fig. 4b). Survival of species in pairwise competition is therefore surprisingly effective in predicting survival when species undergo trio competition.
Nonetheless, there are exceptional cases where qualitative pairwise outcomes are not sufficient to predict competitive outcomes of trio competitions. Accounting for such unexpected trio outcomes may improve prediction accuracy for competitions involving a larger set of species. We encode unexpected trio outcomes by creating effective modified pairwise outcomes, which replace the original outcomes in the presence of an additional species. For example, competitive exclusion will be modified to an effective coexistence when two species coexist in the presence of a third species despite one of them being excluded from the pair competition. The effective, modified outcomes can be used to make predictions using the assembly rule as before (Methods).
The ability of the assembly rule to predict the outcomes of more diverse competitions was assessed by measuring survival in competitions between all seven-species combinations, as well as the full set of eight species (Fig. 5a). Using only the pairwise outcomes, survival in these competitions could only be predicted with an accuracy of 62.5%, which is barely higher than the 61% accuracy obtained when using only the average probability that a species will survive these competitions (Fig. 5b). A considerably improved prediction accuracy of 86% was achieved by incorporating information regarding the trio outcomes (Fig. 5b). As in the trio competitions, the observed accuracies are consistent with those obtained in gLV simulations that parallel the experimental setup, both when predicting using pairwise outcomes alone (p=0.53), or in combination with trio outcomes (p=0.21, Fig. 5c).
Our assembly rule makes predictions that match our intuition, but there are several conditions under which these predictions may be inaccurate. First, community structure can be influenced by initial species abundances27, as has recently been demonstrated in pairwise competitions between bacteria of the genus Streptomyces28. Our assembly rule may be able to correctly predict the existence of multiple stable states, as it identifies all putative sets of coexisting, non-invasible species in a given species combination. However, we did not have sufficient data to evaluate the rule’s accuracy in such cases, as multistability was observed in only one of all our competition experiments.
Complex ecological dynamics, such as oscillations and chaos, can also have a significant impact on species survival29,30, making it difficult to predict the community structure. These dynamics can occur even in simple communities containing only a few interacting species. For example, oscillatory dynamics occur in gLV models of competition between as few as three species24, and have been experimentally observed in a cross-protection mutualism between a pair of bacterial strains31. In contrast, our competitions predominantly resulted in a unique and stable final community. This occurred despite the fact that we observed complex inter-species interactions involving interference competition and facilitation (Fig. S4). These results indicate that complex ecological dynamics may in fact be rare, though it remains to be seen whether they become more prevalent in more diverse assemblages. Relatedly, prediction is challenging in the presence of competitive cycles (e.g. “Rock-Paper-Scissors” interactions), which often lead to oscillatory dynamics, and are thought to increase species survival and community diversity32,33. Such non-hierarchical relationships are absent from our competitive network, and thus their effect cannot be evaluated here.
In the absence of multistability or complex dynamics, our approach may still fail when competitive outcomes do not provide sufficient information regarding the interspecies interactions. This could be due to higher-order interactions, which only manifest in the presence of additional species, or because only qualitative information regarding survival is utilized. The observed accuracy of the assembly rule was consistent with the one found in gLV simulations, but this does not necessarily indicate that our species interact in a linear, pairwise fashion. In fact, fitting the gLV model directly to our pairwise data does not improve predictability (Fig. S5). Determining whether, in any particular competition, predictions fail due to insufficient information regarding the strength of linear interactions, non-linear interactions, or higher-order interactions will require more detailed measurements.
Competitive outcomes may vary across environments, making it challenging to use outcomes measured in one environment to predict community structure in a different environment. Nonetheless, our results suggest that, when measured in the same environment, community structure can be predicted from the outcomes of competitions between small sets of species, demonstrating the feasibility of a bottom-up approach to understanding and predicting community structure. It remains to be seen to what extent these results hold in more diverse assemblages containing additional trophic levels, in the presence of spatial structure, and over evolutionary time scales.
Methods
Species and media
The eight soil bacterial species used in this study are Enterobacter aerogenes (Ea, ATCC#13048), Pseudomonas aurantiaca (Pa, ATCC#33663), Pseudomonas chlororaphis (Pch, ATCC#9446), Pseudomonas citronellolis (Pci, ATCC#13674), Pseudomonas fluorescens (ATCC#13525), Pseudomonas putida (ATCC#12633), Pseudomonas veronii (ATCC#700474), and Serratia marcescens (Sm, ATCC#13880). All species were obtained from ATCC. The base growth media was M9 minimal media25, which contained 1X M9 salts (Sigma Aldrich, M6030), 2mM MgSO4, 0.1mM CaCl2, and 1X trace metals (Teknova, T1001). For the final growth media, the base media was supplemented with 1.6mM galacturonic acid and 3.3mM serine as carbon sources. These concentrations were chosen such that each contributed carbon at a concentration of 10mM. Nutrient broth (0.3% yeast extract, 0.5% peptone) was used for initial inoculation and growth prior to experiment. Plating was done on 10cm Petri dishes containing 25mL nutrient agar (nutrient broth with 1.5% agar added).
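The carbon concentrations can be checked from the number of carbon atoms per molecule (six in galacturonic acid, three in serine). With the rounded concentrations given above, each source contributes approximately 10mM carbon:

```latex
\[
  1.6\,\mathrm{mM} \times 6 = 9.6\,\mathrm{mM} \approx 10\,\mathrm{mM},
  \qquad
  3.3\,\mathrm{mM} \times 3 = 9.9\,\mathrm{mM} \approx 10\,\mathrm{mM}.
\]
```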
Competition experiments
Frozen stocks of individual species were streaked out on nutrient agar Petri plates, grown at room temperature for 48hr, and then stored at 4°C for up to 2 weeks. Prior to competition experiments, single colonies were picked and each species was grown separately in 50mL Falcon tubes, first in 5ml nutrient broth for 24hr and next in 5ml of the experimental M9 media for 48hr. During the competition experiments, cultures were grown in Falcon flat-bottom 96-well plates (BD Biosciences), with each well containing a 150μl culture. Plates were incubated at 25°C without shaking, and were covered with a lid and wrapped in Parafilm. For each growth-dilution cycle, the cultures were incubated for 48hr and then serially diluted into fresh growth media by a factor of 1500.
Initial species mixtures were prepared by diluting each species separately to an optical density (OD) of 3×10⁻⁴. Different species were then mixed by volume to the desired composition. This mixture was further diluted to an OD of 10⁻⁴, from which all competitions were initialized. For each set of competing species, competitions were conducted from all the initial conditions in which each species was present at 5%, except for one more abundant species. For example, for each species pair there were 2 initial conditions with one species at 95% and the other at 5%, whereas for the 8-species competition there were 8 initial conditions, each with a different species at 65% and the rest at 5%. For a few species pairs (Fig. 2a-b), we conducted additional competitions starting at more initial conditions. All experiments were done in duplicate.
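The set of initial conditions described above can be enumerated with a short sketch. The function name `initial_conditions` and its `minority` parameter are hypothetical names introduced for illustration; the only input taken from the text is the 5% minority fraction.

```python
def initial_conditions(species, minority=0.05):
    """List one starting composition per species, with that species dominant.

    Every other species starts at the `minority` fraction, and the dominant
    species takes up the remainder.
    """
    conditions = []
    for dominant in species:
        fractions = {s: minority for s in species}
        fractions[dominant] = 1.0 - minority * (len(species) - 1)
        conditions.append(fractions)
    return conditions

# A pair gives 95%/5% and 5%/95%; eight species give 65% for the dominant one.
for condition in initial_conditions(["A", "B"]):
    print(condition)
```

Fractions are floating-point values, so the dominant fraction for eight species is 0.65 only up to rounding error.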
Measurement of cell density and species fractions
Cell densities were assessed by measuring optical density at 600nm using a Varioskan Flash plate reader. Relative abundances were measured by plating on nutrient agar plates. Each culture was diluted by a factor between 10⁵ and 10⁶ in phosphate-buffered saline, depending on the culture’s OD. For each diluted culture, 75μl were plated onto an agar plate. Colonies were counted after 48h incubation at room temperature. A median of 85 colonies per plate was counted. To determine species extinction in competition between a given set of species, we combined all replicates and initial conditions from that competition, and classified as extinct any species whose median abundance was less than 1%, which is just above our limit of detection.
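The extinction criterion can be sketched as follows, assuming that the final relative abundances from all replicates and initial conditions of one competition have already been pooled into a list per species. The function name `extinct_species` and the example values are hypothetical.

```python
from statistics import median

def extinct_species(final_fractions, threshold=0.01):
    """Return the species whose median final fraction is below `threshold`.

    `final_fractions` maps each species to its pooled final relative
    abundances across replicates and initial conditions.
    """
    return {sp for sp, values in final_fractions.items() if median(values) < threshold}

# Hypothetical pooled measurements for one trio competition.
trio = {
    "A": [0.62, 0.58, 0.60, 0.65],
    "B": [0.38, 0.42, 0.395, 0.35],
    "C": [0.0, 0.0, 0.005, 0.0],
}
print(extinct_species(trio))
```

Using the median rather than the mean keeps a single aberrant replicate from changing the classification.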
Assembly rule predictions and accuracy
For any group of competing species, predictions were made by considering all possible competitive outcomes (e.g. survival of any single species, any species pair, etc.). Outcomes that were consistent with our assembly rule were those that were predicted to be a possible outcome of the competition. For any given competition, there may be several such feasible outcomes; however, a unique outcome was predicted for all our competition experiments.
Pairwise outcomes were modified using trio outcomes as follows: Exclusion was replaced with coexistence for pairs that coexisted in the presence of any additional species. Coexistence was replaced with exclusion whenever a species went extinct in a trio competition with two species with which it coexisted when competed in isolation. Only modifications caused by the surviving species, or an invading species, were considered. Therefore, a new set of modified pairwise outcomes was generated for each putative set of surviving species being evaluated.
The prediction accuracy was defined as the fraction of species whose survival was correctly predicted. When the assembly rule identified multiple possible outcomes, which occurred only in the gLV simulations, accuracy was averaged over all such feasible outcomes. Additionally, when the competitive outcome depended on the initial condition, accuracy was averaged across all initial conditions.
For reference, we computed the accuracy of predictions made based on the probability that a species will survive a competition between the same number of species. For example, for predicting trio outcomes, we used the proportion of species that survived, averaged across all trio competitions. Using this information, the highest accuracy would be achieved by predicting that all species survive in all competitions, if the average survival probability is > 0.5, and predicting that all species go extinct otherwise.
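The accuracy measure and the reference (null) accuracy can be written compactly. This is a sketch under the definitions given above; the function names `accuracy` and `null_accuracy` are hypothetical, and the example sets are illustrative rather than measured.

```python
def accuracy(predicted, observed, species):
    """Fraction of species whose survival (or extinction) was predicted correctly."""
    correct = sum((s in predicted) == (s in observed) for s in species)
    return correct / len(species)

def null_accuracy(mean_survival):
    """Best accuracy attainable from the average survival probability alone.

    Predicting "all survive" is correct with probability `mean_survival`, and
    predicting "all go extinct" with probability 1 - `mean_survival`; the
    better of the two is chosen.
    """
    return max(mean_survival, 1.0 - mean_survival)

# Illustrative trio: A and B predicted to survive, but B and C observed to survive.
print(accuracy({"A", "B"}, {"B", "C"}, ["A", "B", "C"]))
```

In this illustrative case only the survival of B is predicted correctly, so the accuracy is one third. To score a full set of competitions, the same count would be pooled over all species in all competitions.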
Simulated competitions
To assess the assembly rule’s expected accuracy in a simple case in which species interact in a purely pairwise manner, we simulated competitions using the generalized Lotka-Volterra (gLV) dynamics:
$$\frac{dx_i}{dt} = r_i x_i \left(1 - x_i - \sum_{j \neq i} \alpha_{ij} x_j\right),$$
where $x_i$ is the density of species i (normalized to its carrying capacity), $r_i$ is the species’ intrinsic growth rate, and $\alpha_{ij}$ is the interaction strength between species i and j. For each simulation, we created a set of species with random interactions where the $\alpha_{ij}$ parameters were independently drawn from a normal distribution with a mean of 0.6 and a standard deviation of 0.46. Results were insensitive to variations in growth rates, thus they were all set to 1 for simplicity. These parameters recapitulate the proportions of coexistence and competitive exclusion observed in our experiments, and yield a distribution of trio layouts similar to the one observed experimentally (Fig. S6). The probability of generating bistable pairs in these simulations is low (~3.7%, corresponding to one bistable pair in a set of eight species), and we further excluded the bistable pairs that were occasionally generated by chance, since we had not observed any such pairs in the experiments.
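A minimal simulation sketch of these dynamics is given below. It assumes the competitive gLV form dx_i/dt = r_i x_i (1 - x_i - sum over j≠i of α_ij x_j), with densities normalized to carrying capacity, and integrates it with a simple forward Euler scheme. The function names, step size, number of steps, and starting densities are all hypothetical choices made for illustration and are not taken from the study.

```python
import random

def simulate_glv(alpha, x0, r=1.0, dt=0.01, steps=20000):
    """Forward-Euler integration of dx_i/dt = r x_i (1 - x_i - sum_{j!=i} alpha[i][j] x_j)."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        growth = [
            r * x[i] * (1.0 - x[i] - sum(alpha[i][j] * x[j] for j in range(n) if j != i))
            for i in range(n)
        ]
        # Clamp at zero so densities never become negative.
        x = [max(xi + dt * gi, 0.0) for xi, gi in zip(x, growth)]
    return x

def random_interactions(n, mean=0.6, sd=0.46, rng=random):
    """Draw off-diagonal alpha[i][j] from a normal distribution; the diagonal is 1."""
    return [[1.0 if i == j else rng.gauss(mean, sd) for j in range(n)] for i in range(n)]

# Coexisting pair: both interaction strengths are below 1.
print(simulate_glv([[1.0, 0.5], [0.5, 1.0]], [0.05, 0.95]))
# Exclusion: species 0 strongly inhibits species 1 (alpha[1][0] > 1).
print(simulate_glv([[1.0, 0.5], [1.5, 1.0]], [0.05, 0.95]))
```

In the first example both densities approach 1/(1 + 0.5) ≈ 0.67, and in the second species 1 declines toward zero while species 0 approaches its carrying capacity. A production analysis would more likely use an adaptive ODE solver than fixed-step Euler.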
The accuracy of the assembly rule in gLV systems was estimated by running simulations that parallel our experimental setup: A set of 8 species with random interaction coefficients was generated, and the pairwise outcomes were determined according to their interaction strengths. These outcomes were used to generate predictions for the trio competitions using our assembly rule. Next, all 3-species competitions were simulated with the same set of initial conditions used in the experiments. Finally, the predicted trio outcomes were compared to the simulation outcomes across all trios to determine the prediction accuracy. Thus, a single accuracy value was recorded for each set of 8 simulated species. Similarly, for each simulated 8-species set, the pair and trio outcomes were used to generate predictions for the 7-species and 8-species competitions, and their accuracy was assessed by comparing them to the outcomes of simulated competitions. Prediction accuracy distributions were estimated using Gaussian kernel density estimation from the accuracy values of 100 simulated sets of 8 species.
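Determining pairwise outcomes from interaction strengths can be sketched with the standard invasion criterion for two-species competitive Lotka-Volterra dynamics. Assuming densities normalized to carrying capacity, so that a resident monoculture sits at density 1, species i can invade species j only if α_ij < 1. The function name `pair_outcome` is hypothetical, and the treatment of the marginal case α = 1 as non-invading is an arbitrary choice.

```python
from statistics import NormalDist

def pair_outcome(a_ij, a_ji):
    """Classify a pair from a_ij (effect of j on i) and a_ji (effect of i on j)."""
    i_invades = a_ij < 1.0  # i grows when rare in a j monoculture (x_j = 1)
    j_invades = a_ji < 1.0  # j grows when rare in an i monoculture (x_i = 1)
    if i_invades and j_invades:
        return "coexistence"
    if i_invades:
        return "i excludes j"
    if j_invades:
        return "j excludes i"
    return "bistability"

# Outcome probabilities when both strengths are drawn independently
# from a normal distribution with mean 0.6 and standard deviation 0.46.
p_above = 1.0 - NormalDist(mu=0.6, sigma=0.46).cdf(1.0)
p_coexist = (1.0 - p_above) ** 2
p_exclusion = 2.0 * p_above * (1.0 - p_above)
p_bistable = p_above ** 2
print(round(p_coexist, 3), round(p_exclusion, 3), round(p_bistable, 3))
```

Under these assumptions the bistability probability comes out close to the ~3.7% quoted above, and the coexistence and exclusion probabilities are close to the proportions observed in the pairwise experiments.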
One-sided P-values evaluating the consistency of the experimentally observed accuracies with the simulation results were defined as the probability that a simulation would yield an accuracy which is at least as high as the experimentally observed one.
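The one-sided P-value defined above can be estimated directly from the simulated accuracies as an empirical fraction. This sketch is a simplification: whether the study computed the probability from the raw simulation values or from the kernel density estimate is not stated here, and the function name and example values are hypothetical.

```python
def one_sided_p_value(simulated_accuracies, observed_accuracy):
    """Fraction of simulated accuracies that are at least as high as the observed one."""
    at_least = sum(a >= observed_accuracy for a in simulated_accuracies)
    return at_least / len(simulated_accuracies)

# Hypothetical simulated accuracies compared with an observed accuracy of 0.895.
simulated = [0.80, 0.85, 0.90, 0.95, 1.00]
print(one_sided_p_value(simulated, 0.895))
```

A large P-value indicates that the observed accuracy is no higher than is typical for purely pairwise gLV systems.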
Author Contributions
J.F. and J.G. designed the study. J.F. and L.H. performed the experiments and analysis. J.F., L.H. and J.G. wrote the manuscript.
Author Information
The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to J.F. (yonatanf{at}mit.edu) or J.G. (gore{at}mit.edu).
Acknowledgements
We would like to thank A. Perez-Escudero, N. Vega, E. Yurtsev and members of the Gore laboratory for critical discussions and comments on the manuscript. This work was supported by the DARPA BRICS program, an NIH New Innovator Award (NIH DP2), an NSF CAREER Award, a Sloan Research Fellowship, the Pew Scholars Program and the Allen Investigator Program.