Increasing evenness and stability in synthetic microbial consortia

Construction of successful synthetic microbial consortia will harbour a new era in the field of agriculture, bioremediation, and human health. Engineering communities is a complex, multi-dimensional problem with several considerations ranging from the choice of consortia members and spatial factors to genetic circuit performances. There has been a growing number of computational strategies to aid in synthetic microbial consortia design, but a framework to optimize communities for two essential properties, evenness and stability, is missing. We investigated how the structure of different social interactions (cooperation, competition, and predation) in quorum-sensing based circuits impacts robustness of synthetic microbial communities and specifically affected evenness and stability. Our proposed work predicts engineering targets and computes their operating ranges to maximize the probability of synthetic microbial consortia to have high evenness and high stability. Our exhaustive pipeline for rapid and thorough analysis of large and complex parametric spaces further allowed us to dissect the relationship between evenness and stability for different social interactions. Our results showed that in cooperation, the speed at which species stabilizes is unrelated to evenness, however the region of stability increases with evenness. The opposite effect was noted for competition, where evenness and stable regions are negatively correlated. In both competition and predation, the system takes significantly longer to stabilize following a perturbation in uneven microbial conditions. We believe our study takes us one step closer to resolving the pivotal debate of evenness-stability relationship in ecology and has contributed to computational design of synthetic microbial communities by optimizing for previously unaddressed properties allowing for more accurate and streamlined ecological engineering.

Traditionally, single microbes have been engineered to incorporate an array of desired behaviours. However in to existing configurations to modulate evenness and stability. Intelligent manipulation of synthetic ecological 55 models requires a holistic understanding of the system and the circuit parameters involved. Our study proposes 56 a computational framework to identify the circuit parameters and the corresponding values to balance evenness 57 and stability. Our workflow combines machine learning and mathematical modeling to explore and characterize 58 the parametric space of circuit parameters to recalibrate their values to achieve an even and stable model. With 59 this framework, we aim to gain crucial insights into social interactions and the circuit parameters contributing 60 to evenness and stability. Using this method, we derive more streamlined experimental decisions to optimize 61 synthetic microbial in a swift and accurate manner.   Figure 1: A) Quorum-sensing based circuits of cooperation, predation, and competition. Two E.coli species are auxotrophic for tryptophan and histidine respectively and depend on each other for survival. In short, the growth of one species triggers or suppresses the essential amino acid production in the other species depending on the type of circuit interaction. B) A list of tunable circuit variables that are involved in the synthetic models described above. The design space of these parameters were explored and characterized by our methodology. C) The two outputs evaluated are evenness and stability. The evenness index used is the Pielou's evenness measure. Stability was calculated using linear stability analysis and eigenvalues with real negative eigenvalues denoting stable system.

IV: Extraction of optimal ranges
Create hypervolumes for all clusters

V: Analytics
Perform feature selection Perform density-based clustering ε Figure 2: Machine learning based workflow can be divided into two parts. In the first section we use adaptive sampling to build surrogate models of our test cases to ensure that parametric space has been covered efficiently. The first step is to initially populate the space with Sobol points, following which adaptive sampling covers the design space by sampling in area of high variance and using previous points to select new points. To evaluate the efficient coverage of design space, surrogate models were constructed and their accuracy was determined using support vector machine algorithm. In the second section, we analyzed the parametric space using the principles of clustering and hypervolume to derive specific parameter ranges corresponding to our desired output. Once there are enough points in the parametric space, density-based clustering is applied to partition the space. Hypervolumes are created around each clusters, which provide their corresponding parametric ranges. To obtain optimal ranges, decision trees are used. Finally, further analysis such as feature importance can be performed.
we would need to conduct at least 10 17 simulations. This computationally expensive process poses a significant 106 barrier in our ability to investigate the parametric space and draw useful information from it. We have developed 107 methodology that enables swifter exploration and analysis of parametric space conceptualized in Figure 2. Since the 108 biggest obstacle in mapping parametric space is its complexity and vastness, and hence the number of simulations 109 required to accomplish it, the core principle of our methodology is to first reduce the parametric space without 110 losing any critical information.

111
Complex simulation models are often treated as "black-box" functions with unclear input-output mapping.

112
They also incur high computational costs, which has led to the emergence of surrogate models, also known as concluded that the design space has been covered and analysis can begin.

128
The design space has been populated with points from both sobol and adaptive sampling. For each point we 129 calculated the evenness, based on the equation in Figure 1C.

130
As several clusters are generated, we had to select the cluster that is more suitable for experimental implemen-131 tation. A combination of two features guide cluster selection: "probability" and "hypervolume". the probability of a variable being wrongly classified. Hence, a Gini index = 0 is ideal. We selected for finer 138 parametric ranges with Gini index = 0 to further maximize the probability of achieving high evenness. The same 139 process was repeated on the same cluster to discover clusters of high stability within. This process significantly 140 reduced the vast design space into smaller regions that can be comprehensively explored and analyzed.

141
Cooperation offers highest flexibility in ecological engineering 142 We aim to explore how the parametric space varies in terms of evenness as measured by Pielou's evenness 143 index 49, 50 calculated when the system reaches steady state. Pielou's index measures diversity and species richness.

144
It can vary between 0 and 1, with the former implying that only one species is growing at steady state (no evenness) 145 and the latter meaning that both species have the same biomass in steady state (complete evenness).
146 Figure 3A illustrates the distribution of evenness in the three models. Cooperation shows the highest evenness Cooperation has the highest density of points with evenness index = 1, leading the system to mostly exhibit high evenness. In predation and competition, the highest density is observed for low evenness. Below the violin plots, all high evenness clusters were plotted with the hypervolumes shown on the axis below. Cooperation has clusters of the largest hypervolumes suggesting that greater region of operation. High evenness clusters in competition are the smallest size, implying least amount of flexibility.
probability of one species overtaking the other at steady state. These results are significant as they denote that 150 using a cooperation-based co-culture has highest chance of obtaining a highly even system. We observed that 151 predation system had a very binary distribution of evenness. This means that either the system tends to have 152 highest evenness (evenness index = 1) or lowest (evenness index = 0), with very few points lying in the middle, 153 and majority of the points concentrated in evenness = 0 region. Due to the oscillatory behaviour of predator-prey 154 interaction, we expected low evenness due to differences in the biomass of two species. Interestingly, there are 155 regions where the system displays high evenness. This was attributed to specific parameter value combinations 156 that weaken the behaviour of the predator, balancing the interaction to mimic cooperative behaviour (see figure   157 in Supplementary). Therefore, there is scope for predation to have high evenness by changing into cooperation.

158
The concept of "hypervolume" is typically used to describe ecological niche 51 . We have introduced it here as  In cooperation model ( Figure 4A) exhibiting low evenness, majority of the production-related parameters have 175 lower values, such as ν , α, and LasR. This pattern amplifies the difference between the biomasses of the two 176 species particularly by impacting the growth of histidine user. When the model shifts to a high evenness mode,

177
AHL production by tryptophan utilizing species, ν 1 and histidine production rate, α 2 increase. These changes in 178 particular boost the production of histidine and eventually of histidine user, bridging the difference between the 179 two species.

180
As described before, high evenness cases in predation is attributed to the system mimicking cooperative be-181 haviour. We hypothesized that this becomes possible due to the predator weakening. In our model, the predator 182 is the tryptophan user and prey is histidine user. In Figure 4B, we see predator-associated parameters change 183 to weaken its action on the prey. In high evenness, ν 1 has lower values. The degradation rates also increase, 184 except for γ 6 , which is the degradation rate of histidine production in the prey. The prey is further strengthened strengthening the prey, a cooperative behaviour is achieved and hence high evenness.

189
Our results suggest that in competition, transition into high evenness is associated with reducing tryptophan 190 and histidine production in respective species. In high evenness, by lowering histidine production rate, α 2 , and dissociation constant of histidine, β, the histidine-user decreases histidine production, while an increment in AHL 192 production by histidine-user species, ν 2 , strengthens the inhibition of tryptophan production. In competition, . It was observed that values of ν 1 , α 2 , n, β parameters are the only ones that change in all three models. In cooperation, ν 2 and α 2 increase to higher values in high evenness, increasing histidine production in histidine-user. In predation, the increase in evenness is marked by the weakening of predator's actions. For example, ν 1 decreases, γ 3 and γ 5 increase. Meanwhile, the production of histidine in the prey increases with increase in α 2 , k 2 , and β. In competition model, tryptophan and histidine production is downregulated by decrease in α 2 , β, and increase in ν 2 .
coexistence is achieved when each species inhibits its own population before that of the competitor. This is 194 visible when lower but similar levels of biomasses are observed in competition case.Cooperation and competition 195 are symmetrical interactions. However due to the species being auxotrophic for different amino acids, there is 196 a metabolic imbalance as the contribution of tryptophan and histidine to biomass production is different. Since 197 histidine is required in higher amounts for cellular growth, the histidine-utilizing species has a disadvantage in the 198 consortia, having more work to do. In cooperation, it appears that the circuit parameters need to be modulated 199 to compensate for this discrepancy and inject more resources to the histidine user. The patterns that emerge from 200 our results hint that evenness can be manipulated by the concentrating on the species in the consortia that has the 201 greatest resource requirement making it a "limiting" species. It is important to note that the distribution shows 202 where most of the points are and not the entire distribution.

203
Next we wanted to explore how much the value of each parameter varied between the models. Figure 10  thermore, in our case study we are fixing the species "richness" (number of species in community) to two and only 217 varying relative abundances, therefore the biodiversity of the community is solely dictated by evenness.

218
In order to characterize the stability of our models, we used linear stability analysis. In summary, the variables 219 are perturbed and Jacobian matrix is computed. The eigenvalues in the matrix are the eigenvalues of the equilib-220 rium and have been used to measure the stability of equilibrium 53 , 54 , 55 . We define stability as the tendency of 221 the system to go back to its initial state after perturbation. Figure 5 illustrates the stability of the three models 222 and their relationship with evenness.

223
The three synthetic consortia have non-linear dynamics over vast parametric space. Therefore, it is not feasible 224 to assume that the system would have a "single" stable state. Different regions can have different stabilities 225 depending on the parameter values. We treat our system as two-dimensional, as we are interested in the stability 226 of the biomass of both species. For a deeper analysis, we have decoupled stability into five categories: stable node, 227 stable focus, saddle, unstable focus and unstable node. A system is deemed as stable if it has stable node and 228 stable focus. Unstable system is defined by saddle and unstable node and focus points as depicted in Figure 5A.

229
In order to decipher the relationship between evenness and stability, we divided the points into high evenness   Figure 5: Stability Analysis: Stability was divided into stable node, stable focus, saddle, unstable focus, and unstable node. Using linear stability analysis, eigenvalues were obtained to test the stability of biomass of tryptophan user and histidine user. B) The proportion of each type of stability was evaluated for all models to check how they vary in high evenness (Pielou's Index > 0.8) and low evenness (Pielou's Index < 0.3) C) The eigenvalue distribution of tryptophan user and histidine user are shown for stable nodes and stable focus to assess the link between the frequency of return to steady state versus evenness.
evenly split with the unstable region being slightly larger. However in low evenness, though the saddle points 238 do not change significantly, there is a remarkable increase in unstable node accompanied by a sharp decrease in 239 stable nodes. In contrast, competition displayed the opposite behaviour. Stability and evenness have a negative 240 relationship. When evenness decreases, the region occupied by stable node significantly increases and the saddle 241 points decrease. Predation did not present with a strong link. Much like cooperation, it appears that stability 242 decreases in low evenness, however the change is not by a significant margin. This implies in predation the link 243 between stability and evenness might be harder to substantiate or be non-existent. 244 We noted that stable regions were not distinctly isolated from unstable regions. For all models, both stable and 245 unstable regions co-exist together, even though the ratios vary with evenness. This suggests that though a pattern 246 can be concluded between evenness and stability for different social ecological interactions, the relationship does 247 not universally hold true. For example, the likelihood of encountering a stable region in high evenness competition 248 is higher, but instability could still be encountered, further complicating the debate between evenness and stability 249 and justifying the need for more intelligent engineering strategies.

250
The eigenvalue distribution for stable nodes and stable focus are captured in Figure 5C. do not occur frequently. In cooperation, as evenness decreases so does the eigenvalue distribution of stable focus.

259
Whereas the opposite trend is observed for competition.

260
The eigenvalues of stable nodes of tryptophan-user biomass is always consistently more negative than that of 261 histidine user despite different ecological dynamics and evenness. It appears that stable node is driven significantly 262 by only one species in the consortia. This difference in eigenvalue was not noted for stable focus. Figure 12 (in 263 supplementary section) shows the eigenvalues of the two species for saddle points and unstable focus corroborating 264 the pattern noticed for stable node. This was a curious finding we attributed to the difference in the growth rates 265 of the species, which is being controlled by the system to bring high evenness conditions. It appears that the 266 faster growing species in a community is dampened to achieve stability. We believe this is an important insight 267 into stability. When designing synthetic consortia, the species should be chosen and/or engineered to account for 268 this.

269
Our analysis proves that the evenness-stability relationship is complex and contextual. Additional challenges 270 are imposed with cases like predation where the association between the two is not as clearly defined. Further-271 more in synthetic microbial community comprised of multiple species and hence multiple social interactions, the 272 relationships are expected to change. Therefore, there is a need to regulate stability to prevent dependence on a 273 variable qualitative relationship.

274
Predicting optimal ranges for high evenness and high stability 275 An ideal synthetic microbial consortia would be highly even and stable to ensure long term function and 276 survival. Using our workflow, we aimed to engineer the synthetic consortia to operate in both high evenness 277 and high stability regions. As per our process, we first partitioned the design space into clusters of high evenness, 278 following which we selected a suitable cluster with highest probability and appropriate hypervolume. More samples 279 were generated within the chosen cluster using sobol sequence for analysis. Next, we evaluated the stability at each 280 point and obtained ranges by selecting for stable points and eliminating unstable focus, nodes and saddle points.

281
This method yielded multiple clusters from which we chose a stable region. The even and stable operating ranges 282 have been shown in Figure 6. The parameter values should be set within the darker shaded region to maximize 283 the probability of the system exhibiting high evenness and high stability. The operating ranges for cooperation 284 highlighted are very wide. This is a positive trait as it gives scientists more flexibility when engineering the circuit 285 parameters. The optimal ranges became narrower for predation and competition, making circuit tuning more 286 challenging. Figure 13 in supplementary information illustrates the average hypervolume of the high stability 287 regions within high evenness. Predation is shown to have the smallest hypervolume, suggesting that the operating 288 ranges to achieve high stability are the narrowest. However as seen in Figure 3, competition has the smallest 289 hypervolume for high evenness, not predation. This shows that the hypervolumes for high evenness and high 290 stability are not correlated. Keeping in mind that engineering seventeen parameters can be tedious, we performed feature selection to 292 determine the importance of each parameter on evenness and stability metric. Figure 7A showed that for both LasR α 1 θ n k 1 α 2 β k 2 LuxR Figure 6: (A) Cooperation: The lighter green maps the operating ranges for high evenness. The darker green corresponds to the parametric ranges to get high stability within high evenness. It can be seen that operating ranges in this model are wider, offering a larger design space to function. For parameters γ 5 , γ 6 , ρ 1 , ρ 2 , LuxR, LasR, and α 1 , the high stability range spans almost the entire range of high evenness. Meanwhile parameters such as n, θ, and β would offer highest flexibility when seeking high evenness. (B) Predation: The lighter yellow maps the operating ranges for high evenness. The darker yellow corresponds to the parametric ranges to get high stability within high evenness. Compared to cooperation, the operating ranges in this model are smaller, making their tuning more challenging. The high evenness ranges are still relatively large (ν 1 , ν 2 ,γ 6 , θ). However, the high stability region within is significantly narrower, owing to the nature of the predation interaction. To achieve high evenness and high stability, all parameters require precise tuning, other than θ, which has the widest working region. (C) Competition: The lighter red maps the operating ranges for high evenness. The darker red corresponds to the parametric ranges to get high stability within high evenness. Contrary to cooperation, competition has narrower operating parametric ranges. Much like predation, it follows the trend of exhibiting wider high evenness regions like for ν 2 , γ 3 , LuxR, α 1 , and α 2 . However, the high stability regions within them are notably confined.

Cooperation
Predation Competition

B) C)
Figure 7: Feature Importance: Circuit parameters were ranked on their contribution toward evenness and stability for all interaction circuits using the mean accuracy decrease method shown on the y-axis. Differences were noted for each. A) Cooperation: ν 2 was the most important parameter for both evenness and stability. For stability, there were more parameters of significance especially degradation rates and protein production rates. There are more parameters important in determining stability than evenness. B) Predation: Both evenness and stability are prominently driven by ν 1 alone. C) Competition: This model witnesses highest number of significant parameters compared to the other two models. For evenness ν 2 is most important and for stability it is n. Competition generally witnesses several parameters contributing to evenness and stability compared to the other two models. evenness and stability, ν 2 was the highest contributing factor in cooperation. Though in evenness, cooperativity 294 and dissociation constant appears to be important as well, it is in stability we notice that several parameters are 295 significant contributors, in particular degradation rates and protein production rates. Manipulating them in a 296 wet-lab setting is straightforward by changing the RBS strengths 56 . In predation, the operating ranges become 297 noticeably narrower, particularly for high stability within high evenness. However feature selection in Figure 7B 298 shows that ν 1 dictates evenness and stability in predation model. The rest of the parameters seem to have a 299 markedly lower to negligible impact. We believe that lowering ν 1 has a cascade effect of weakening the effect of 300 the predator on the prey, thereby pushing the system toward high evenness and stability. In competition, we note 301 narrow engineering space in Figure 6 and feature importance shows multiple parameters being of significance.

302
Most important parameters shared between evenness and stability are n, LuxR, α 2 , out of which the first two have knowledge there is no study that aims to engineer this link. As opposed to viewing evenness-stability as intrinsic 330 properties of a synthetic consortia, we propose that they are mutable targets. By engineering parameters, we 331 can shift a synthetic consortia to a region of high evenness and high stability. Arming synthetic biologists and 332 ecologists with this ability, increases their freedom of choosing synthetic models, as evenness and stability are no 333 longer fixed restrictions. This is in particular useful, as the mechanisms behind stability remain elusive.

334
Diversity is considered to be one of the factors driving stability, but other factors such as substrate storage, 335 habitat characteristics, biotic and abiotic interactions, and environmental conditions are also contributors. Vari-336 ations between different synthetic consortia can hinder the establishment of a universal correlation with different 337 mechanism yielding different relationships. Though a comprehensive understanding of stability will be an incred-338 ibly useful insight, our methodology proves that high stability regions can be identified without that knowledge. 339 We can directly engineer the system to have high stability without changing other factors. Those factors can 340 be accounted for but our method can easily be modified to characterize any parametric space and find a solu-341 tion. We have demonstrated the concept for co-cultures, but it can easily be expanded to multiple species with a 342 combination of interactions with parameter values that narrow the function in a high evenness and high stability 343 region. 344 We emphasize that while some relations emerge, the parametric space is too vast and multi-dimensional to find 345 design rules that hold universally true. In fact, our analysis hints that even generally held true relationships can 346 shift and modify depending on the conditions. Therefore when designing optimal functioning space for synthetic 347 consortia, it is important to not rely on general relationships as they are several fail points. Our strategy of 348 narrowing down parametric spaces and selecting suitable clusters to operate within will alleviate this issue and 349 maximize the probability of seeing desired effects. in metabolic engineering using co-cultures, this method can be used to optimize the system to maximize chemical 359 production. The method can be broadly applied to explore and evaluate any genetic circuit. Additionally, the 360 ability deconstruct solution space based on multiple metric will prove particularly profitable to synthetic biologists.

361
Designing synthetic consortia is a multi-faceted challenge. One of the approaches involves optimal tuning and 362 balancing the circuit parameters for desired output. Most of the methods to design synthetic microbial consortia 363 center around model selection and configuration design 33 . We propose an alternative perspective: optimizing 364 synthetic ecological circuits. The main advantage with this approach is the freedom to design models without 365 any structural constraints. Our methodology can easily be adapted to fit any synthetic ecological configuration 366 and enhance its performance by determining regions of desired output. One of the most unique and powerful 367 aspects of our developed method is the comprehensive exploration and analysis of the design space. As opposed 368 to designing the structure of synthetic consortia, we analyze the architecture of the design space of synthetic 369 consortia to extract valuable insights into behaviour and instructions on how to optimally regulate it. We believe 370 our framework will aid not only in ecological engineering but also the diverse field of synthetic biology.

372
Models and Packages Genome-scale metabolic model of E.coli iML1515 was obtained from BiGG Models 373 database. All coding was done in Python language.

374
System Equations The following ordinary differential equations are used to describe the circuit dynamics in 375 a chemostat setting. It is assumed that the chemostat is perfect, which means flow rate of media in and out of 376 the reactor is the same, keeping volume constant.  For implementation, a python package, pymoo -multi-objective optimization in python -was used.

400
For a target function f in D ∈ R n , initial set of samples are generated X = (x 1 , ... , x m ). New points, X new 401 are added iteratively by solving an bi-optimization score function. 402 We performed density-based clustering, which is recommended for high-dimensional data, to divide our points 421 into high evenness and low evenness clusters. Points with an evenness index above 0.8 was labelled as high 422 evenness. We calculated the hypervolume for each cluster, a product of the parameter ranges that contain points 423 of a specific metric. We identified regions of high evenness within the parametric space and suggest that within The general solution (close to fixed point x * ):    Figure 1: Hypervolume analysis: The entire space was split into low, medium, and high evenness clusters and plotted against the number of points falling inside each cluster. The hypervolume and number of points inside a cluster are not always correlated. As can be seen in predation, the hypervolume of high and medium evenness is larger than low evenness, however the number of points falling within low evenness is the highest. In cooperation, the cluster populated by most points belong to high evenness and in competition the opposite is observed. However, it is in predation that we note that low evenness cluster, which has the highest number of points, has a lower hypervolume compared to high evenness and medium evenness clusters. These results show that the while majority of the points in predation code low evenness, the ranges of the parameters that would yield low evenness are narrowest, reducing the hypervolume.  : The hypervolumes of the high stability regions within high evenness were averaged. As supported by Figure 6, cooperation has the highest hypervolume, implying that the optimal ranges for high stability are the widest. Predation was shown to have the smallest hypervolume, meaning that the optimal ranges to achieve high stability within high evenness in predation were the most restrictive. Meanwhile, competition has a bigger hypervolume than predation but smaller than cooperation. Even though as seen as in Figure 3, competition has the smallest hypervolume for high evenness but that is not the case for high stability.