Artificial selection improves pollutant degradation by bacterial communities

Artificial selection is a promising way to improve microbial community functions, but previous experiments have only shown moderate success. Here, we experimentally evaluate a new method that was inspired by genetic algorithms to artificially select small bacterial communities of known species composition based on their degradation of an industrial pollutant. Starting from 29 randomly generated four-species communities, we repeatedly grew communities for four days, selected the 10 best-degrading communities, and rearranged them into 29 new communities composed of four species of equal ratios whose species compositions resembled those of the most successful communities from the previous round. The best community after 18 such rounds of selection degraded the pollutant better than the best community in the first round. It featured member species that degrade well, species that degrade badly alone but improve community degradation, and free-rider species that did not contribute to community degradation. Most species in the evolved communities did not differ significantly from their ancestors in their phenotype, suggesting that genetic evolution plays a small role at this time scale. These experiments show that artificial selection on microbial communities can work in principle, and inform on how to improve future experiments.


Introduction
Microbial communities naturally provide us with many ecosystem functions like digesting inaccessible nutrients or cleaning wastewater.Being able to design such multi-species communities from scratch to optimize ecosystem functions would be a major biotechnological breakthrough, but knowing which species to combine and how such a choice will affect ecological and evolutionary dynamics and thereby functional dynamics is a very challenging problem.
A first intuitive approach is to collect candidate species, study their capacities through genomic and phenotypic analyses and then combine them in clever ways that are likely to result in high function (1)(2)(3)(4)).An alternative is to automate the optimization process while remaining blind to the properties of each species.This blind approach can be taken using artificial selection (5, 6).
Artificial selection -also known as "directed evolution" or

D R A F T
A second problem with the existing approaches lies in how "offspring" communities are generated from their "parents" at every round: parent communities are either simply diluted to make offspring (low abundant species may go extinct) or pooled together and then distributed over the offspring communities.Both approaches result in offspring communities that are very similar to one another, and do not deviate much from the communities at the start of the experiment (20,29).The resulting lack of variability between communities gives little material for artificial selection to work on.The challenge then is to develop a selection method that favors cooperation within and between species, while maintaining between-community variability and selecting for increased function at the community level.
Here we address these fundamental problems by experimentally testing whether a novel selection approach called "disassembly selection" that was inspired by optimization algorithms from the computational sciences called genetic algorithms (30,31) could improve community performance over the length of a selection experiment.We have previously compared our approach to existing methods theoretically (29).Here, we implemented disassembly selection experimentally to automatically explore the species composition search space: we randomly generated communities of known species composition, and then repeatedly selected the bestscoring communities, disassembled their member species and re-assembled new communities that differed slightly in their composition for the next round (Fig. 1A).This approach was expected to improve performance while maintaining between-community variability (29).A second goal of our disassembly approach was to limit competition within communities and instead select for increased cooperativity.
To achieve this and avoid aggressive species that exclude all others, we penalized communities where species extinctions occur.
We used our approach to find a community that can efficiently degrade industrial pollutants called Metal Working Fluids (MWFs), a challenge we have previously studied using a single four-species bacterial community (32).As this original community could only degrade 44.4% of the MWF on average, we hypothesized that there would be room for improvement.
After 18 rounds of selection using a pool of 11 species isolated from MWF in which 167 combinations of four species were tested, we found a community that degraded 75.1% of the MWF on average.While this is significantly better than our original community (32), the best community in the first round, and a random control, the magnitude of the improve-ment was still marginal.We also separately found a species 125 pair that performed at least as well as the top community.The 126 lessons learned with this approach suggest that it can still be Assemble the new communities in the lab using the frozen species and repeat from step 2. B) Degradation scores of the 5 best communities in each round for the selection (green triangles) and random (orange circles) treatments, with lines through the average of the 5. C) Community composition (x-axis) vs. degradation score (hue, color bar) for the best 56 communities (one third of all 167 tested communities, the full set is shown in Fig. S2) over the 18 rounds of selection (y-axis) in the selection (top panel) and random (bottom panel) treatments.The x-axis is ordered by increasing degradation scores (averaged over all instances of the same species composition).Note that these are degradation percentages, not final community scores (extinctions not considered).showing selection for improved degradation and the mainte-176 nance of high-performing communities.Our approach also 177 continued to explore the search space by testing many new 178 communities at each round: in round 18 there were still com-179 munities with low degradation scores (Fig. 1C).S1).As a control, we also counted the number of 210 contamination events (any species that was present at day 4, 211 despite not being inoculated at day 0), which we did not ex-212 pect to vary significantly between the treatments.Indeed, we

D R A F T
ties lacking strong growers in Table S1).Eq. ( 1), ( 33)), the evenness of the 10 communities whose Population size, growth or evenness could only be calculated for the 10 communities per treatment that we plated (see Methods).
Finally, we might expect the total biomass in communities to   Not all good degraders were over-represented in the selection treatment, though.Pf and At1 each featured in only 2 of the 10 best communities (Fig. 1D), despite Pf being the bestperforming species alone and featuring in 8 of the 10 best pairs (Fig. 3B, D).In contrast, Ct was present in 7 out of the 10 winning communities (Fig. 1D).
We also analyzed which species were most present when extinctions occurred, as we selected against extinctions.In the 20 communities of the selection treatment where extinctions occurred, the species most often found were At1, Pr and Pf (13, 12 and 12, respectively, Fig. S5).In 9 of the 20 communities, At1 and Pr were both present, which may explain why they do not feature together in the best 10 communities (Fig.

311
1D), despite being one of the best degrading pairs (Fig. 3B

373
We first used the population size data of mono-and co-374 cultures (Fig. 3E) to estimate interactions in the ancestral 375 species (Fig. 4C).We then selected isolates of a few pairs (we 376 favored species whose ancestral interactions were significant  S2).We therefore conclude that evolution at     and pairwise co-cultures are shown (with partner).Significant differences are calculated using a generalized linear model with biological replicate as random variable and number of species in culture as an explanatory variable, significant p-values with a Bonferroni correction for multiple comparisons are shown.C)-E) Interactions between ancestral species (C), the species evolved in the random treatment (D) and the selection treatment (E) defined as the log2 fold-change of the focal species in CFU/ml of day 3 in co-culture with the companion species vs mono-culture.Interactions that were not significant (no significant difference between growing alone or with companion species) are shaded, p-values were adjusted using the Benjamini-Hochberg method.Positive (facilitative) interactions are in blue, while negative interactions are shown in red.White squares are ones that we did not measure.Overall, we saw very few changes between ancestral and evolved species (black-bordered squares, quantified in Table S2).
that might not be able to survive alone, but enhanced the degradation score when paired with the strong degrader.This simple heuristic revealed that the best overall performance was achieved by two species in co-culture: At1 and Af.
Adding two more species to this pair (Ct and particularly the free-rider Ac) increased the variance in community performance, and combining all 11 species performed poorly (Fig. 3).It appears then that our optimal community of four species is in fact too rich.
Another key observation is that despite our efforts to favor within-species evolution -we sampled many colonies when disassembling communities through plating to include sufficient within-species diversity, used a competition-promoting medium, and penalized extinctions to give room for interactions to evolve to become less negative or more positive -it did not have a large effect on final population sizes, degrada-tion abilities or inter-species interactions.One explanation 439 may be that species are changing their biotic environment pool testing all possible combinations would be much more challenging.Furthermore, the improvement achieved by this proof-of-concept study was significant yet marginal, and before we can apply our method more widely, it would need to improve performance by several folds.In addition, the approach was quite cumbersome and would not be easy to set up for a new problem.
A first question is whether the artificial selection approach is useful at all, or whether we could have predicted the composition of the best community using fewer culture experiments.
To explore this question, we performed an additional analysis using a simple linear model that predicts the degradation score based on species presence/absence (4).Including the data from all our experiments, the linear model had a reasonable fit (R 2 = 0.8) and would have chosen a community that performed relatively well (degradation score: 69.2%, compared to 75.1% with our method).However, if we only used mono-and co-culture data to train the model, performance dropped when we tested the model on the remaining data (R 2 = 0.26) and the best predicted communities ranged in performance from 40.3% to 83.1% (Fig. 5A, B).Finally, if we trained the model on measurements from the communities and then tested it on the mono-and co-culture data, its predictive ability still remained low (Fig. 5C, R 2 = 0.32).
It would be interesting to determine the minimal amount of data needed to achieve a good prediction and explore whether 480 other prediction methods would perform better (e.g. ( 34)).

481
Overall, though, this analysis suggests that this approach 482 would not have identified the best-performing community.

483
A second lesson could be to focus on the strength of our 484 method as a search algorithm to efficiently explore the space 485 of possible species combinations.Our approach is analogous 486 to a genetic algorithm (30,31) in that solutions are encoded 487 in a "genome" (here the species composition of each commu- step of disassembling communities (as illustrated in Fig. 5D).

497
We would then no longer need selective media and it would 498 suffice to know whether species went extinct, which could be 499 achieved through amplicon sequencing.Removing the con-500 straint of designing selective media would allow us to greatly 501 expand the species pool, and we would no longer need to 502 avoid any species pairs, which limited the species combina-

D R A F T
generating more combinations thereof, we expect the overall performance of the method to improve (29).Another important modification would be to allow community size to change, as opposed to restricting it to four species as we have done here.This would involve removing or adding species independently, allowing communities to grow or shrink in size.This decoupling would increase the search space of possible combinations, but might find better solutions, for example by avoiding free-rider species like Ac to establish in so many communities.
Allowing community size to change automatically would also answer an important question for community function: how many species are actually needed to solve the problem of interest?In our previous work (32), a mathematical model predicted that in harsher environments, more species are needed to achieve maximal community function compared to permissive environments.Experimentally, degradation saturated at two species in the more permissive MWF with casamino acids, compared to three species in MWF alone (32), which is consistent with our best solution here having only two species.In hindsight, a more challenging environment might have shown a stronger improvement over the experiment and required a larger optimal community.
A final important limitation of our approach is that we fixed the initial population size of all species at each round, in order to select against cheater strains that grow quickly without contributing, and to improve heritability of the community function (27).It would be interesting to see how our best-performing communities would equilibrate over a few rounds of growth and dilution, when we do not adjust the initial population sizes.It is conceivable that the performance of the community at equilibrium would be different.Ideally, community stability should be part of the community score, although this would require substantial revision to the selection algorithm.
In summary, we have tested an approach to artificially select amongst communities composed of different combinations of culturable species.Our approach found a four-species community that is efficient at degrading MWF pollutants and is superior to the performance of all species in our pool grown together.However, the selection experiment was relatively complex and a smaller community was also found by testing species pairs and comparing them to the winning community.Going forward, we propose a simpler, more effective approach (Fig. 5D).Even though the challenges of ensuring  S3 and S4).The disassembly plates 585 consist of two 24-well plates where we poured 1.5ml of each

D R A F T
Hill number of order 1 (39): divided by its maximum value (similar to Pielou's evenness

Fig. 1 .
Fig. 1B).165 D) Community composition corresponding to panel C, showing the presence (dark blue) or absence (white/grey) of each species, and illustrating the difference in composition by the Hamming distance (i.e. the number of substitutions needed to transform a given community to another) to the community with the highest degradation score at the bottom.ment tested high-performing communities more often than 172 the random treatment, and this occurred preferentially in the 173 later rounds of the experiment (high density of purple dots in 174 top right corner of the selection treatment in Fig. 1C, S2, S3), 175

180
In an effort to estimate the ruggedness of the community "fit-181 ness landscape" (4), we next asked whether the communities 182 with the highest degradation scores resembled each other in 183 their species composition, by calculating the Hamming dis-184 tance to the best community (number of species that one must 185 exchange to get the same composition as the best commu-186 nity, Fig. 1D).At first glance, there was no obvious pattern 187 between the similarity in community composition to the top 188 community and degradation score.However, the best 5 com-189 munities had a distribution of Hamming distances that was 190 significantly different from the distribution of distances be-191 tween all pairs of communities in our study (Kruskal-Wallis 192 H-test, p = 3.9 ◊ 10 ≠4 , Fig. S4).

193
Selection reduced extinctions, but did not increase194 evenness or total biomass.We first explore whether se-195 lection has favored certain community properties: population 196 growth, community evenness or species survival.When de-197 signing our selection method, we decided to penalize extinc-198 tions by scaling the degradation scores by the fraction of sur-199 viving species.We did this firstly to ensure that communities 200 did not reduce to single species, and secondly to test whether 201 we could select against competitive species, or even for more 202 cooperative mutants.Extinctions were quantified in each 203 round by plating 10 communities per treatment (see Meth-204 ods) on selective media on day 4 and comparing the pres-205 ence of each species to how we composed the community on 206 day 0. The distribution of extinctions per round was signifi-207 cantly lower in the selection compared to the random control 208 treatment (Kolmogorov-Smirnov test, p = 0.013, Fig. 2A, 209 S5, Table

221
Next, we ask if communities in the selection treatment were 222 more even than in the random control.We might expect selection to favor evenness, since species in diverse communi-224 ties may complement one another while communities dom-225 inated by a single species risk excluding others that could 226 contribute to degradation.Calculating evenness as the effec-227 tive species number relative to its maximum value (Methods, 228

229Fig. 2 .
Fig. 2. Number of extinctions, evenness and total population size over time.A) Number of extinctions per round (solid lines) and cumulative (dashed lines) in the 10 plated communities of the selection and random treatments.B-C) Mean (lines) ± SD (shaded areas) values of the 10 plated communities at each round where B) shows evenness (the effective species number divided by its theoretical maximum value) and C) total population size in CFU/ml.D-E) Degradation percent plotted against D) evenness and E) total population size with the selection treatment in triangles and the random treatment in circles, and color representing selection rounds.

236
influence degradation for two reasons: (i) degradation could 237 be the aggregated effect of individual cells assuming that all 238 species contribute to degradation, and (ii) as species adapt to 239 the medium, they might increase their growth rates, which 240 should increase degradation.We calculated the total popula-241 tion size on day 4 per community at each round of selection, 242 but found no significant effect of total biomass or selection 243 treatment on degradation: Total biomass did not correlate 244 strongly with time (Spearman's fl = 0.04, 0.07, for selection 245 and random, respectively, Fig. 2C) and was not significantly 246 different between the two treatments.Indeed, degradation 247 score did not even correlate with total biomass (Fig. 2E, 248 Spearman's fl = ≠0.0062,p = 0.904).249 In sum, selection seems to have favored communities whose 250 members are less likely to drive each other extinct, but no 251 other community features could explain the increase in degra-252 dation scores or the difference between treatments.253 Successful communities were composed of good de-254 graders, their facilitators and freeriders.Noticing that 255 certain species were often found in the winning communi-256 ties, we next explored which species features were selected 257 and whether community degradation scores depended on the 258 presence of specific species or species combinations.259 First, we analyzed which species were over-or under-260 represented in the meta-community compared to what one 261 would expect by chance.For each treatment, we quantified 262 how often each species appeared among plated communities 263 in the last 5 rounds of the experiment.If a species' frequency 264 was more than one standard deviation above or below the fre-265 quency one would expect by chance (18.18), we designate 266 it as over-or under-represented, respectively (mean±SD= 267 18.18 ± 11.8; Fig. 3A dashed line, shaded area).Over-268 represented species were: Ct, Af and Ac in the selection treat-269 ment, and Pf and Pr in the random treatment, while Ml and 270 At2 were under-represented in the selection treatment and Ac 271 in the random treatment.The communities that contained the 272 over-represented species tended to be associated with high 273 degradation scores (Fig. 3A).274 The best-scoring community in the selection treatment 275 (At1+Ct+Af+Ac) contained all 3 over-represented species, 276 which partially explains their over-representation.However, 277it does not answer how its member species were contributing 278 to the score.High degradation in these communities could 279 either be due to single species degrading well, or to synergis-280 tic effects between the species.To find the answer, we grew 281 all 11 species alone and in most pair-wise co-cultures and 282 ranked them from best to worst degradation.We included 283 four of the best 4-species communities and all eleven species 284 grown together, as a reference (Fig.3B).We observed a wide 285 variation in degradation abilities, and to our surprise, all 11 286 species together ranked 35th (dashed line in Fig.3B), which 287 is well below what even single species could achieve.288Thebest individual degraders were Pf, Ct and At1 (mean 289 degradation: 66%, 58% and 50% respectively), while Af, Ea 290 and At2 were the worst (mean degradation: 2%, 2% and 4% 291 respectively, Fig.3B, D).Interestingly, Af which is one of the

Fig. 3 .
Fig.3.A) Species representation and corresponding percentage of degradation in the last 5 rounds of the evolution experiment.As both measures can be quantified in percent, we display them on the same y-axis.The dashed line represents the average frequency at which we expect to see a given species in the last 5 rounds by chance, and the shaded area one standard deviation away from that average.Points that are outside the shaded area are more or less represented than expected by chance.The violin plots show the degradation scores of communities containing that species.B) Degradation percent on day 3 in monocultures, co-cultures, top communities and 11 species together, using species taken from ancestral strains or strains isolated at the end of the random or selection treatment.Data-points are ordered according to the average degradation % and interesting cases are highlighted with a colored background and arrows corresponding to data shown in panels D and E. C) Experiment to determine whether Ac might be a "free-rider".Data points in the top panel show population sizes (log10CFU/ml) of different species and boxplots in the bottom panel show the distribution of degradation scores at day 3 of co-cultures as indicated on the x-axis.Ac reduces the degradation score of the communities it is in, or increases their variance.D) Matrix of degradation percentage in mono-(diagonal elements) and co-cultures of ancestral strains only (average of dots in panel B).E) Matrix of population sizes (log10CFU/ml) in mono-(diagonal elements highlighted with white squares) and co-cultures of ancestral strains only.In panels B, D and E, we highlight interesting cases in blue and light green that are further discussed in the text.worstdegraders, was present in many winning communities.This may be because when combined with At1, it achieves one of the highest degradation scores (Fig.3B, D blue highlight).Compared to their growth in monoculture, At1 promoted the growth of Af by more than 3 logs, although Af reduced the growth of At1 (Fig.3Eblue highlight).

377
and members of the winning communities) from the random 378 and selection treatments for which we conducted mono-and 379 co-cultures to estimate the interactions between the evolved 380 species.While some interactions differed after evolution 381 (Fig.4D, E), we found little evidence of reduced competi-382 tion in the selection treatment, and more generally, no overall 383 pattern (Table Ea Sw Ml Pr Kp At1 Ct Pf -

Fig. 4 .
Fig.4.Effect of within-species evolution.A) Degradation percent or B) population size (log10CFU/ml) of each species at day 3 in three conditions: the ancestral strains before the experiment, strains harvested after 18 rounds of selection treatment (S) and strains after 18 rounds of the random control treatment (R).Data from mono-(alone)

440Fig. 5 .
Fig. 5. A-C) Linear model analysis.We use a linear model (code taken from (4)) that uses species presence/absence to predict degradation percent by using A) all the data we generated as training and test sets, or B) the mono-and co-culture data as training set and the rest of the data as test set, or C)the larger communities as training set and the mono-and co-culture data as test set.Community richness is shown in color.Each dot is one degradation score measurement, such that biological replicates and technical replicates, if available, are all represented.D) Proposing a new artificial selection method.Rather than disassembling communities, we propose to use the winning communities as templates to generate the offspring communities in the next round.These communities would then be seeded by taking the clonal ancestral species from the freezer, such that there would be no within-species evolution over rounds.Step 3 would be to select the top 10 communities, 4a to generate communities in proportion to their community scores and 4b to randomly choose 21/29 of the new communities (illustrated with 4) for either species removal or introduction (see white asterisks in step 4, 4a and 4b are shown in one step).Freezer icon created by SAM Designs from Noun Project.
488nity) whose phenotype is measured (here degradation score 489 plus extinctions) that is subject to "mutations" (here the ex-490 change of a species for the next round).In addition, disas-491 sembling and reassembling allowed each species to evolve 492 over the course of the experiment.But since species' pheno-493 types changed little over the course of the experiment (Fig.4944), it would be simpler to start communities at every round 495 from frozen stocks and avoid the challenging experimental 496

Table 1 .
(32,35,36) 555 bacterial species listed in Table1, which were selected from 556 a pool of 20 natural isolates from MWF (provided by col-557 laborators) based on our ability to distinguish them on se-558 lective media.At1, Ct and Ml were previously isolated from 559 MWF as previously described(32,35,36).Note that Ml (Mi-560 crobacterium liquefaciens) was previously referred to as Mi-561 crobacterium saperdae but a more recent classification has 562 led us to refer to it differently.At2 was kindly donated by 563 Justine Collier (plant associated) and the remaining species 564 were isolated from MWF and kindly donated to us by Pe-Bacterial species used in the experiment and the acronyms we use to we refer to them throughout the manuscript.
ecological and evolutionary stability remain open, we argue that this first proof-of-concept supports the blind approach to automate the breeding of bacterial communities with optimal functions.553 Methods and Materials 554 Bacterial species and culture conditions.565 ter Küenzi from Blaser Swisslube AG, a company that pro-566 duces MWFs.The species were identified at Blaser Swiss-567 lube AG by MALDI-TOF, and confirmed by PCR amplifi-568 cation and 16S gene sequencing.All experiments were per-569 formed in 6ml batch cultures containing 0.5% (v/v) Castrol 570 Hysol TM XF MWF (acquired in 2016) diluted in water with 571 added salts, metal traces (Tables 2, 3), and supplemented with 572 1% Casamino Acids (Difco, UK).Cultures were incubated at 573 28 ¶ C, shaken at 200 rpm.

Table 2 .
Phosphate solution (1%) for the MWF+AA medium as described in Table 3.
Selective media.We designed 10 selective media that al-575 low the growth of only one or two of the 11 species at a time.576DR A F T

Table 3 .
For 600 ml of MWF+AA medium, mix in the above order, top to bottom.The phosphate solution is found in Table2.The MWF needs to be added carefully, one drop at a time to allow mixing.Some species combinations (Ct& At2, Af & Ct, Ml & Sw, Ml 577 & Ea, Ac & Pf, Ct & Kp) cannot be easily distinguished on 578 these media, and we avoided combining these species in the 579 communities of either treatment (Fig. S8A).This means that 580 instead of the 330 combinations of 4 species out of 11, we 581 have 174 possible communities, meaning that we are explor-582 ing a subset of the possible search space.Our selective media 583 are generally composed of a rich base and at least one antibi- 584 otic (details in Tables