Ecological drift during colonization drives within- and between-host heterogeneity in animal symbiont populations

Host-associated microbiomes vary greatly in composition both within and between host individuals, providing the raw material for natural selection to act on host-microbe associations. Nonetheless, the drivers of compositional heterogeneity in host-associated microbiomes have only rarely been examined. To understand how this heterogeneity arises, we utilize the squash bug, Anasa tristis, and its bacterial symbionts in the genus Caballeronia. We artificially modulate symbiont bottleneck size and strain diversity during colonization to demonstrate the significance of ecological drift, which causes stochastic fluctuations in community composition. Consistent with predictions from the neutral theory of biodiversity, ecological drift alone can account for heterogeneity in symbiont community composition between hosts, even when two strains are nearly genetically identical. When acting on competing, unrelated strains, ecological drift can maintain symbiont genetic diversity among different hosts by stochastically determining the dominant strain within each host. Finally, ecological drift mediates heterogeneity in isogenic symbiont populations even within a single host, along a consistent gradient running the anterior-posterior axis of the symbiotic organ. Our results demonstrate that symbiont population structure across scales does not necessarily require host-mediated selection, but emerges as a result of ecological drift acting on both isogenic and unrelated competitors. Our findings illuminate the processes that might affect symbiont transmission, coinfection, and population structure in nature, which can drive the evolution of host-microbe symbiosis and microbe-microbe interactions within host-associated microbiomes.


20
All multicellular life forms on Earth play host to teeming communities of host-associated 21 microbes. The most specialized host-microbe associations, or symbioses, have evolved multiple times in 22 the tree of life. Host-microbe symbioses feature diverse mechanisms for host benefit, control of 23 community composition, and limited opportunities for microbial transmission [1]. These mechanisms 24 play essential roles in initiating, stabilizing, and undermining symbiosis during a host's lifetime, over 25 generations, and across macroevolutionary timescales [1,2]. Nevertheless, a persistent paradox in the 26 study of host-microbe symbioses is that, like microbes in natural environments, microbial symbionts 27 exhibit enormous strain diversity [3-10], even when natural selection, imposed by specialized 28 interactions with their hosts, is expected to erode genetic variation. 29 Different mechanisms, based on environmental or host-level selection, are typically invoked to 30 explain the maintenance of symbiont genetic variation, often in terms of host benefit [11,12]. However, 31 these hypotheses do not account for how host-associated consortia assemble as ecological 32 communities, which embeds this genetic variation within patches in physical space [13,14]. This is an 33 inherently stochastic process that generates heterogeneity [15][16][17]. Heterogeneity in host-associated 34 microbial communities manifests at two scales: as heterogeneity in colonization between hosts, and as 35 spatial heterogeneity across tissues and organs within each host. At both scales, it is critical to 36 understand how heterogeneity emerges during establishment of symbiosis, which drives the evolution, 37 ecology, and physiology of both host and microbe. 38 While the ecological processes that create heterogeneity during community assembly have been 39 studied with mathematical models (e.g. [18]), validation of these models in empirical studies using 40 natural, ecologically realistic communities, including host-associated microbial communities [14][15][16]19], 41 is scarce. Some of these processes are deterministic, acting on specific traits that allow or hinder 42 establishment of a taxon in a predictable, niche-based fashion. However, community assembly is also 43 governed by dispersal between habitats. Dispersal imparts a stochastic element on community assembly 44 [16]: Taxa immigrate and establish in new patches in a probabilistic manner, in part because they 45 experience transient reductions, called bottlenecks, in population size [20]. These bottlenecks intensify 46 ecological drift, or stochastic variation in community composition. Since the proposal of Hubbell's 47 unified neutral theory of biodiversity, the relative role of stochastic processes such as ecological drift in 48 community assembly, compared with deterministic niche-based processes such as between-species 49 interactions, has been a matter of continuous study [21][22][23]. 50 To illustrate the importance of ecological drift during the establishment of even highly specific 51 symbioses, we employ the squash bug, Anasa tristis (Fig 1A), as a model. A. tristis is host to specific 52 symbionts in the β-proteobacterial genus Caballeronia [24] (previously referred to in the literature as 53 the Burkholderia "SBE" clade [25] or the B. glathei-like clade [26]), which it requires for survival and 54 normal development to adulthood. Acquisition of Caballeronia occurs through the environment after 55 nymphs (immature insects) molt into the second instar. Once they successfully colonize the host, 56 symbionts are housed in hundreds of sacs called crypts, which form two rows running along a specific 57 section of the midgut, called the M4 (Fig 1B). Unlike many other insect symbionts, Caballeronia can be 58 isolated from bugs and established in pure culture in the laboratory. Because A. tristis nymphs hatch 59 from the egg symbiont-free, the symbiosis can be reconstituted anew every generation by feeding 60 cultured symbionts to these nymphs in the laboratory. measures, to quantify heterogeneity in community composition. We inoculated second instar squash 110 bug nymphs with approximately 1:1 mixtures of GA-OX1 sfGFP with GA-OX1 RFP, diluted to produce 111 inocula ranging from approximately 10 6 to 10 1 CFU/µL ( Fig S1A). (Figs 2B and S1B). The slight bias towards sfGFP colonization could be due to toxic aggregation of the 141 dTomato fluorescent protein, which has been observed in eukaryotic cells [34]. As inoculum density 142 decreases, and thus as transmission bottlenecks tighten, individual infections become increasingly 143 dominated by one or the other strain, causing the bimodality coefficient to increase (Figs 2C and S1C, 144 First, we demonstrated that GA-OX1 and SQ4a compete under an in vitro approximation of 162 natural conditions within the host midgut ( Fig 3A). In trials where SQ4a sfGFP and RFP were grown 163 together as liquid cultures in filter-sterilized zucchini squash extract, both strains were recovered at high 164 densities after 24 hours. On the other hand, when either SQ4a strain was grown with a counter-labelled 165 GA-OX1, SQ4a almost always went extinct (T-test, p<0.001, n=10). Labelled GA-OX1 strains grew to high 166 densities regardless of whether they were growing alongside SQ4a or the counter-labelled GA-OX1. 167 These data suggest that GA-OX1 is the superior competitor to SQ4a in vitro.   3D, S4B, and S4D, Tables S2-S4). 208 Ecological drift during colonization generates within-host spatial 209 heterogeneity.

210
The squash bug symbiotic organ, called the M4, contains hundreds of crypts (Figs 1A and 1B). 211 Because each crypt is filled with its own population of symbionts, we asked whether these crypts might 212 exhibit heterogeneity in symbiont composition within the host, consistent with previous unquantified 213 observations from related insect-Caballeronia models [27,38]. If crypts indeed contain heterogeneous 214 populations, we would expect crypts to contain mostly RFP-and mostly GFP-expressing symbionts, as 215 opposed to highly similar populations composed of one or both types. We also examined if within-host 216 heterogeneity might be sensitive to inoculum density in the same manner as between-host 217 heterogeneity. If so, we would expect individual crypts to be more heterogeneous when symbionts are 218 subjected to tighter transmission bottlenecks during host colonization. 219 We systematically characterized within-host spatial heterogeneity by co-inoculating nymphs 220 with 1:1 mixtures of counter-labeled GA-OX1 at approximately 10 6 and 10 2 CFU/µL, as above. By imaging 221 freshly dissected whole guts from coinfected nymphs, we observed that the M4 does impose spatial 222 heterogeneity on symbiont populations, with individual crypts varying in GFP and RFP intensity even at 223 colonization with 10 6 symbiont CFU/µL ( Fig 4A). However, there is a clear gradient in the degree of 224 heterogeneity among crypts along the length of the M4, with anterior crypts being co-colonized and 225 posterior crypts being singly infected (Fig 4B). We quantified this gradient by measuring the variance in 226 RFP intensity relative to GFP along the length of the M4 (Fig 4C). Contrary to our expectations, we saw 227 that nymphs colonized with just 10 2 symbiont CFU/µL also exhibited this gradient, with anterior crypts 228 being co-colonized despite a 10,000-fold reduction in inoculum density (Figs 4B and 4C). Thus, patterns 229 of heterogeneity within the host are consistent over four orders of magnitude in inoculum density. Even 230 when microbe-microbe competition is nearly neutral, host anatomy appears to impose spatial structure 231 on symbiont populations.  (Table 1) [110]. The conjugative Escherichia coli K12 strain SM10(λpir) 400 harboring pTn7xKS-sfGFP or pTn7xKS-dTomato (Table 1), which were a generous gift from Travis Wiles 401 [37], as well as an E. coli parent of the same strain harboring helper plasmid pTNS2 [110], were plated 402 with SQ4a and GA-OX1 at high density on LB plates with salt (10 g/L tryptone, 5 g/L yeast extract, 10 g/L 403 NaCl, 15 g/L agar). After 24-48 hours of incubation at 30°C, matings were harvested into LB Lennox low-404 salt broth with a lytic coliphage, T7, to eradicate E. coli. After further incubation for four hours at 30°C 405 shaking at 200-225 rpm, cultures were plated on NA amended with 1 mM isopropyl-β-D-1-406 thiogalactoside (IPTG) and 10 µg/mL gentamicin to select for successful integrants. Colonies on selective 407 plates were screened for fluorescence and frozen at -80°C as 20% v/v glycerol stocks. 408 To confirm stability of fluorophore expression, newly constructed strains were streaked on NA 411 plates and visually assessed for fluorescence after two days. To confirm site-and orientation-specificity 412 of mini-Tn7-GmR integration, we ran PCR to amplify the fragment between the endogenous glmS gene 413 (GA-OX1: 5' AGGCGCGTTGAAGCTCAAGG 3'; SQ4a: 5' CGCTGGAGCCGCAAATCATC 3') and the inserted 414 aacC gentamicin resistance marker (aacC-83F: 5' GTATGCGCTCACGCAACTGG 3'). We did not screen for 415 insertion at additional sites, as to our knowledge the strains of Caballeronia we used have only one 416 attTn7 site. 417 418 During the summer growing season, squash bugs feed on macerated cell contents, xylem, and 419 phloem in tissues from squash plants and fruits [120,121]. To replicate microbial competition in this 420 environment, competition assays in liquid culture were conducted in filter-sterilized zucchini squash 421 extract. In short, juice from organic zucchini fruits was extracted in a juicer, combined, and filtered to 422 remove large suspended particles. This filtrate was then centrifuged at 10,000 xg for 3 hours to pellet 423 suspended particles, then filter-sterilized through a 0.2 µm filter and stored at -20°C. 424

Competition assays in vitro
To initiate competition assays, GA-OX1 and SQ4a, labelled with sfGFP or dTomato as described 425 above, were initially streaked from frozen 20% glycerol stocks onto NA plates and incubated at 30˚C for 426 48 hours. Individual colonies from each plate were inoculated into 3 mL of LB media and incubated in a 427 shaking incubator (New Brunswick Scientific Excella E25) at 30˚C for 12 hours with shaking at 225 rpm. All co-cultures were subsequently incubated for 24 hours at 30˚C with shaking at 225 rpm. As described 437 above, co-cultures were dilution plated on NA and incubated a further 20-24 hours, and single colonies 438 containing each fluorophore were distinguished and counted under a dissecting scope. We confirmed 439 that SQ4a and GA-OX1 do not appear to inhibit each other intensely on NA plates, suggesting that 440 plating co-cultures on NA provides an unbiased count for both competitors (Fig S5). 441 We found that the defined feeding solution improved nymphal feeding response and better prevented 458 bacterial population growth during the inoculation time window, with minimal impact on our 459 experimental results (Fig S4). Nymphs previously starved overnight for 15-25 hours in clean plastic 460 rearing boxes (7 cm X 7 cm X 3 cm) were supplied with 120 µL of a single inoculum treatment blotted on 461 quartered sectors of 55 mm-diameter qualitative filter paper (Advantec MFS N015.5CM). Nymphs were 462 then allowed to wander and feed ad libitum for 2-3 hours. After this brief inoculation period, nymphs 463 were housed singly in 24 well plates with small pieces of organic zucchini to develop for three days. Just 464 before and after the inoculation period, inocula were serially diluted as above to quantify the 465 concentration of each strain and ensure that no substantial growth or death of either strain occurred 466 during the inoculation period (Fig S6). 467

Competition assays in vivo
On the fourth day after inoculation, nymphs were killed in 70% denatured ethanol, surface 468 sterilized in 10% bleach for 5-10 minutes, washed off again in 70% ethanol, and immersed in ~20 µL 469 droplets of 1xPBS. Whole nymphs, when applicable, were imaged on a Olympus SZX16 470 stereomicroscope with an Olympus XM10 monochrome camera and Olympus cellSens Standard 471 software ver. 1.13. Nymphs were immersed in a shallow volume of PBS in 6 cm plastic petri dishes, and 472 images were taken in darkfield (30 ms exposure 11.4 dB gain), brightfield (autoexposure, 11.4 dB gain), a 473 GFP channel (autoexposure, 18 dB gain), and a RFP channel (autoexposure, 11.4 dB gain). Darkfield and 474 brightfield images were merged in FIJI version 1.54f using the Image Calculator plugin, and the result 475 Finally, the whole preparation was re-immersed in 2550 µL of PBS, to which 1 µL of M9 buffer containing 494 1% Triton-X100 was added to aid the spreading of the droplet. 495 Gut preparations, which degrade or dry rapidly, were imaged as soon as possible. Tilescan 496 images were taken using a Leica DMi8 inverted widefield light microscope with a Leica DFC9000 GT 497 fluorescence camera and Leica Application Suite X ver. 3.4.2.18368 software. Automated tilescans were 498 taken with a 10X objective lens with brightfield, DIC, GFP, and RFP channels. Fluorescent channels were 499 established by filter sets. The GFP channel was set to: bandpass filter 470/40 nm emission, dichroic 500 mirror 495 nm, emission 525/50 nm. The dsRed channel was set to 546/11 nm excitation, dichroic 501 mirror 560 nm, 630/75 nm emission. As each sample is unique, care was taken to set GFP and RFP 502 channel exposure times manually according to the most intense pixels in the entire M4 (usually in the 503 posterior crypts), to minimize signal saturation in any part of the preparation. Due to the convoluted 504 shape of the M4, images were taken with and without autofocus, and stitched images were visually 505 assessed to determine which images were more useful. The repetitive structure of the M4, composed of 506 nearly identically sized, regularly spaced crypts, also necessitated a lower overlap value between tiles for 507 tilescans, as low as 2%. LAS X software was used to merge tiles from tilescans without smoothing for 508 quantitative analysis. 509

510
All statistical analyses were conducted in R version 4.1.1, and the R package ggplot2 (version 511 3.4.2) was used for all data visualization. Because multiple trials were run for inoculation experiments, 512 and some trials recovered very low numbers of infected nymphs, we binned nymphs from multiple trials 513 into discrete treatment groups, based on inoculum size, for analysis. For neutral competition 514 experiments, which utilized isogenic GA-OX1 GFP and RFP, we measured the proportion of GA-OX1 RFP 515 extracted from each host. For interspecies competition experiments, utilizing different combinations of 516 SQ4a and GA-OX1, we measured the proportion of GA-OX1, the superior competitor, compared to the 517 sum of all green and red fluorescent colonies extracted from each host. 518 Raw colony counts of each fluorescent strain recovered from each individual insect were 519 converted into proportions for analysis and visualization. To quantify between-host heterogeneity in 520 symbiont colonization for each inoculation treatment, we calculated a bimodality coefficient [33] using 521 the R package mousetrap (version 3.2.0), as well as the population variance, for each inoculation 522 treatment. We also independently validated this approach using a species-level Fst measure as 523 described previously for quantifying compositional variation among plant communities [21], which is 524 based on the classical Fst measure for quantifying genetic differentiation between populations [122]. 525 Using the R package diptest (version 0.76-0), we also implemented Hartigan's dip test [123], which 526 calculates a dip statistic (Tables S1-S4) based on the shape of the cumulative distribution function of a 527 dataset. In addition to the dip statistic, we also calculated three p-values for each treatment to assess 528 deviation from normality of the distribution of GA-OX1 compositional variation. We considered a 529 distribution to deviate from unimodality if p-values from the dip test repeatedly fell below the threshold 530 of 0.05. 531 To quantify within-host spatial structure, a linear region of interest (ROI) was sampled from one 532 complete row of crypts from each sample to obtain GFP and RFP intensities. Crossover of the ROI from 533 one side of the M4 to the next was occasionally necessary to follow that row through each twist of the 534 M4. The identical ROI was translated to obtain GFP and RFP intensities from the empty background 535 immediately adjacent to the crypts. GFP and RFP intensities at each point along the M4 were normalized 536 by subtracting the background signal from the same point outside the M4. For the RFP channel, the 537 difference between crypt and background signal was occasionally less than 0; in these rare cases the 538 normalized RFP intensity at that point was assigned a value of 1. The log-transformed ratio of RFP to GFP 539 intensity was measured from each pixel along the ROI. In addition, the variance in this value was 540 calculated by iteratively sampling pixels from within a sliding interval 10% of the length of the ROI. 541