Human commensal Candida albicans strains demonstrate substantial within-host diversity and retained pathogenic potential

Candida albicans is a frequent colonizer of human mucosal surfaces as well as an opportunistic pathogen. C. albicans is remarkably versatile in its ability to colonize diverse host sites with differences in oxygen and nutrient availability, pH, immune responses, and resident microbes, among other cues. It is unclear how the genetic background of a commensal colonizing population can influence the shift to pathogenicity. Therefore, we undertook an examination of commensal isolates from healthy donors with a goal of identifying site-specific phenotypic adaptation and genetic variation associated with these phenotypes. We demonstrate that healthy people are reservoirs for genotypically and phenotypically diverse C. albicans strains, and that this genetic diversity includes both SNVs and structural rearrangements. Using limited diversity exploitation, we identified a single nucleotide change in the uncharacterized ZMS1 transcription factor that was sufficient to drive hyper invasion into agar. However, our commensal strains retained the capacity to cause disease in systemic models of infection, including outcompeting the SC5314 reference strain during systemic competition assays. This study provides a global view of commensal strain variation and within-host strain diversity of C. albicans and suggests that selection for commensalism in humans does not result in a fitness cost for invasive disease.

isolate under three carbon sources at 30°C for 24 hours in biological duplicate. Rate and carrying capacity were determined using the GrowthcurveR analysis [...] number.

We noted that the isolates globally exhibited copy number expansion in the telomeric regions, and that copy number increases were more common on smaller chromosomes. Overall, the considerable copy number variation among our commensal isolates is in line with past work suggesting that host environmental pressures induce [...]

Notably, we were able to identify structural rearrangements and putative chromosome fusions, and used PCR to test for the presence of each fusion event. In strain 880-2, we observed a fusion event between chromosomes 3 and 4, connected by a 1.3 kb intervening sequence (Fig 3C). This intervening sequence had 94% sequence identity to an intergenic sub-telomeric region of chromosome 6. To determine whether this was a true event or a sequencing artifact, we designed primers to span the junction and performed PCR to amplify the fusion (Fig 3D). Using this approach, we confirmed that strain 880-2 carries a bona fide structural rearrangement that links chromosomes 3 and 4. In strain 859-2, we observed a fusion event between chromosomes 1 and 3, connected by an approximately 7 kb intervening sequence that had no obvious sequence identity to the SC5314 reference strain, but instead had 99.76% [...] 3C). We were again able to use PCR to span both junctions and observed the presence of the fusion between chromosomes 1, 2, and 3 (Fig 3D). This fusion event may correspond to the additional chromosomal band between chromosomes 2 and 3 that we [...] were present in the SC5314 reference strain, indicating that the fusions were unique to the specific isolate (Fig 3D).
These structural variations were not captured in the SNV [...] isolates, which suggests that even within similar strains, there is potential for structural [...]

The set of isolates for sequencing was initially chosen based on variation in growth rate in rich medium and alterations in invasion into agar. However, we hypothesized that we might identify specific adaptations in C. albicans strains isolated from different sites that allow them to colonize different host microenvironments. The host sites commonly colonized by C. albicans vary dramatically in environmental cues, such as nutrient availability, pH, immune responses, and resident microbiomes. Additionally, we hypothesized that we might identify phenotypes associated with specific C. albicans clades, as we were able to identify structural variants shared between closely related isolates. To test this, we performed a set of growth analyses under multiple environmental conditions, including pH stresses, nutrient limitation, cell wall stressors, and antifungal drugs (Fig 4A). These analyses produced a dense array of quantitative phenotypic information for each strain.

A) Growth curve analysis under multiple environmental conditions. Carrying capacity (K) was normalized to SC5314, and the fold-change plotted by heatmap.
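The normalization named in the Fig 4A legend (carrying capacity K expressed as fold-change relative to SC5314) can be illustrated with a minimal sketch. The function name, the dictionary layout, the condition labels, and the K values below are all hypothetical and chosen only for illustration; they are not data or code from this study.

```python
def fold_change_vs_reference(k_values, reference="SC5314"):
    """Divide each strain's carrying capacity (K) by the reference strain's K.

    k_values maps strain -> {condition: K}. Conditions missing from the
    reference, or with a non-positive reference K, are skipped.
    """
    ref = k_values[reference]
    result = {}
    for strain, per_condition in k_values.items():
        result[strain] = {
            cond: k / ref[cond]
            for cond, k in per_condition.items()
            if cond in ref and ref[cond] > 0
        }
    return result


# Hypothetical K values (arbitrary OD units), for illustration only.
k_values = {
    "SC5314": {"YPD": 1.20, "pH4": 1.00},
    "882-60": {"YPD": 0.60, "pH4": 0.75},
}
fold = fold_change_vs_reference(k_values)
```

A fold-change below 1 marks a strain that reaches a lower carrying capacity than the reference in that condition, which is the pattern the text reports for the aggregating strains.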

Aggregating strains (882-60, 882-46, and 811-7) demonstrate a consistently [...] Representative filamentation score of 0 (left). Representative filamentation score [...] isolates for macrophage filamentation scoring, phagocytosis rates, and cell death rates. The SC5314 reference strain is indicated in red. For phagocytosis rates, significant differences from the SC5314 reference strain were determined by one-way ANOVA with Dunnett's correction for multiple comparisons. For cell death, significant differences from the mock condition were determined by one-way ANOVA with Dunnett's correction for multiple comparisons. Asterisks indicate P < 0.05 (*), P < 0.005 (**), P < 0.001 (***) and P < 0.0001 (****).

From these data, we identified 3 strains, 882-60, 882-46, and 811-7, that consistently grew more poorly than the wildtype under multiple conditions; these were the strains that exhibited aggregation at 30°C and slow growth in rich media conditions. These strains all belonged to Clade C and were closely related, despite arising from two donors (Fig 4A). Growth rates in the nutrient limitation conditions were generally correlated with each other. However, we did not observe a correlation between body site and growth rate, even in response to cues that would appear to be specific for a particular body site, such as anaerobic growth. Across the commensal isolates, we noted the most variation in growth in response to caffeine and the antifungals fluconazole and rapamycin. In addition to growth, we measured each of the strains for [...] as we did not observe growth enrichment in cues specific to isolation sites.

A major stress condition and environmental factor impacting C. albicans in the host is the immune response. Therefore, we moved from pure growth assays to measuring host-microbe interactions, using macrophages as representative phagocytes. We first hypothesized that the oral strains may show decreased recognition [...] bone-marrow derived macrophages and determining the ratio of internalized to external cells by differential staining and microscopy (Fig 4B) [58]. Although most isolates were not significantly different from the SC5314 reference, the isolates generally had a lower phagocytic rate than SC5314. Additionally, phagocytosis rate did not correlate with either sample origin site or clade.

As phagocytosis was not a major differentiating factor between strains, we next examined whether the strains would induce different levels of macrophage cell death. We primed bone-marrow derived macrophages for 2 hours with LPS before infecting with each of our isolates for 4 hours. Following infection, we stained the cells with propidium iodide (PI) as a measure of cell death (Fig 4C). On average, the commensal isolates induced between 5% and 20% cell death, which was significantly lower than the reference strain, SC5314, which induced an average of 40% cell death. Several strains, including the 3 aggregating strains, 882-60, 882-46, and 811-7, were not significantly different from the mock condition. Other than SC5314, only one isolate, 831-1, was significantly different (p = 0.0411) from the mock condition.

Recently, we showed that C. albicans mutants that filament in serum are not always filamentous within macrophages [59]. As filamentation is linked, but not required, [...] those that filamented more than the SC5314 reference strain. Notably, the extent of filamentation did not correlate with colony morphology or invasion on agar, with many [...]

Using individual phenotypic measures, we were unable to identify associations [...] based on the cosine metric. We observed three major clusters, but they did not [...] isolates showed extensive phenotypic variation, but this was not dependent on the body site or participant from which they were collected. [...] isolates from each individual, we were able to obtain phenotypically diverse strains with a limited set of unique SNPs between isolates, allowing us to identify causative variants associated with a particular phenotype.

We focused on the strains from donor 814, as this donor's matched oral and [...] these 163 colonies were included in the condensed set, and these 6 strains clustered tightly in Clade 1, which we hypothesized would allow us to identify causative variants associated with particular phenotypes that were divergent between strains. [...] were less invasive (Fig 5A). Moreover, from this donor's 163 isolates, only this single isolate exhibited the hyper-invasive phenotype into Spider agar at 37°C (Fig 5B); this phenotype was the motivation for initially including this strain in the sequenced set. [...] that it is highly correlated with genes involved in regulating the yeast-to-hyphal morphogenic transition (Fig 5C). To test whether this SNV can drive an invasive phenotype, we generated complementation plasmids encoding the ZMS1T681S allele [...] changing amino acid 681 to a serine is a dominant active allele that is sufficient to drive a hyphal invasion program into Spider agar. We also identified natural variation that was [...] analysis of a limited set of natural isolates from a single host can be exploited to identify [...]

We next examined the fitness and virulence of the commensal isolates relative to the SC5314 reference strain; we hypothesized that the commensal isolates would have decreased virulence compared to SC5314, a clinical isolate. To test this, we turned to [...] that both strains are equally fit and that the fluorescence does not impose a fitness cost.

In contrast, most commensal strains had a competitive index > 2, suggesting that these [...] growth exhibited by these strains in many growth conditions (Fig 4A).

[...] commensal isolates compared to SC5314 reference. Isolates were competed against a fluorescent SC5314 isolate, starting at a 1:1 initial inoculum. Competitive fitness was calculated as the ratio between fluorescent and non- [...] Significant differences from the SC5314 reference strain were determined by [...] G. mellonella, comparing the SC5314 reference to 6 isolates from donor 814.
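Because the sentence defining competitive fitness is truncated, the exact formula used is not recoverable here. As a rough illustration only, the sketch below uses a common convention for a competitive index: the output ratio of test isolate to fluorescent reference, divided by the input ratio. The function name, the argument names, and the cell counts are assumptions, not the authors' stated method.

```python
def competitive_index(isolate_out, reference_out, isolate_in=1.0, reference_in=1.0):
    """Competitive index under an ASSUMED convention:
    (isolate / reference at recovery) / (isolate / reference at inoculation).

    With the default 1:1 inoculum, the index reduces to the output ratio.
    """
    if min(isolate_out, reference_out, isolate_in, reference_in) <= 0:
        raise ValueError("all counts must be positive")
    return (isolate_out / reference_out) / (isolate_in / reference_in)


# Hypothetical recovered counts: 300 non-fluorescent isolate cells and
# 100 fluorescent SC5314 cells, starting from a 1:1 inoculum.
ci = competitive_index(isolate_out=300, reference_out=100)
```

Under this convention, an index above 1 means the test isolate outcompeted the fluorescent reference, and an index near 1 is what a control competition between fluorescent and non-fluorescent SC5314 would be expected to give.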

Each strain was standardized to 2x10^6 cells/mL before inoculating 20 G. mellonella [...] Mantel-Cox log-rank test. Asterisks indicate P < 0.05 (*), P < 0.005 (**), P < 0.001 (***) and P < 0.0001 (****) compared with SC5314.

The strikingly increased competitive fitness of the other isolates motivated us to test whether this increased ratio was correlated with increased disease. Here, we [...] compared with SC5314, to test the hypothesis that increased invasion is associated [...] including the hyper-invasive 814-168 isolate (Fig 6B). Two strains had a slight defect in virulence (p < 0.05, log-rank test). We additionally tested two other clusters of strains for [...] cause host immune cell death (Fig 4C). Together, this suggests that there is not a [...] we observed wide variation in in vitro host response phenotypes, the isolates generally retained their pathogenic potential, indicating that our in vitro assays may not capture the complex stresses experienced during whole-organism infections.

Intra-species analyses of microbial strains can allow for the identification of [...] variants that was largely absent from previous descriptions of natural diversity. We were able to leverage clonal variation within a single donor to identify meaningful variation that arose during microevolution. Overall, we performed a systematic phenotypic analysis of commensal isolates from healthy donors, thus allowing us to examine C. albicans genotypic and phenotypic diversity before the transition to virulence.

Strikingly, our commensal strains generally maintained their capacity to cause disease, and all strains were able to filament in response to the inducing cues of 10% [...] invade into agar, recent intravital imaging approaches suggest that filamentation in response to serum matched that seen in vivo more than filamentation in response to [...] all of the commensal isolates suggests that the selective pressures that occur during mouse models of colonization may not recapitulate the selection that occurs during human colonization.

Although we were not able to associate a particular phenotype with increased systemic disease, we observed extensive phenotypic and genotypic variation between the commensal isolates. This variability is also consistent with recent work on clinical, disease-associated strains [18]. Moreover, our commensal strains were able to proliferate in a range of different environmental conditions, and we did not observe significant differences in phenotypes between isolates obtained from oral or fecal sites, [...]

Previous work has suggested that the C. albicans population within a given individual is clonal [68,75-78], and that the fungus is acquired during birth as a part of the normal microbiota [42]. These studies often sampled from patients with active disease; this may suggest that there is selection for the ability to cause disease, [...] contrast, we identified disparate individuals that appeared to be colonized by strains that were nearly identical, suggesting that there was some transmission between individuals.

Whether these transmission events allow for long-term colonization, and how they affect the initial C. albicans colonizing strains, is still not fully understood.

We also observed diversification within hosts, with multiple instances of closely- [...] with co-expression, which we term "limited diversity exploitation", to identify a candidate transcription factor that regulates invasion. Previous work on a ZMS1 knockout strain of C. albicans did not show any differences in phenotype compared with the parent strain [65]. However, we observed that a single amino acid substitution in the predicted fungal transcription factor regulatory middle homology region was sufficient to drive hyper-invasive growth. Moreover, this phenotype was distinct from the deletion phenotypes in these two genetic backgrounds, again contrasting with the SC5314 reference strain. It is [...] resistance, and our results suggest that studying more than a single representative isolate gives opportunities to discover new biology.

[...] and HUM00118951) and was conducted in compliance with the Helsinki Declaration.

[...] albicans colonies were individually inoculated into a fresh 96-well plate in YPD and after 24 hours of incubation at 30°C, 50% glycerol was added to generate the stock plates. For each oral sample, individual Candida albicans colonies were picked from the CHROMagar plates and incubated overnight in 100 µL of YPD in a 96-well plate. After 24 hours of incubation at 30°C, 50% glycerol was added to the 96 wells to generate the stock plates. All strains were maintained in -80°C cryoculture.

Media for growth curves is described in Supplemental Table 1. [...] YPD in 96-well plates at 30°C, then subcultured into fresh media using a sterile pinner. Growth curves were performed on a BioTek 800 TS Absorbance Reader in the [...] Table S1. Summary statistics from GrowthcurveR for the entire collection of isolates are included in Table S2.
Summary statistics from GrowthcurveR for the condensed set in multiple environmental conditions are included in Table S4.

[...] YPD in 96-well plates at 30°C, then subcultured into 100 μL of YPD at either 30°C or 42°C for three hours. After three hours, the plates were imaged on an Olympus IX80 microscope at 20X magnification. To test the response to serum, overnight cultures [...]

To identify single nucleotide variants (SNVs), we mapped 435 read libraries to [...] to convert alignment files, remove PCR duplicates, and assign read groups [49]. [...] include high-confidence SNVs, we removed low quality variant calls at the (i) individual [...] individual sample genotype calls if their genotype quality (i.e., GQ) was < 0.99 (out of [...] allele (i.e., not diallelic), an alternate allele longer than 1 nucleotide (i.e., indels), or with >5% missing genotype calls across samples. Finally, we removed entire samples from the combined VCF if their genotype calls across all loci were more than 90% missing, [...]

To remove redundancy and focus on natural C. albicans diversification, we [...] studies (i.e., [34,51,52] [...] generate an input matrix for distance calculation and NJ clustering, we coded genotypes [...] genotypes (i.e., coded as N) were ignored. Using these raw distances, we calculated [...] generalized distance function for sample i and sample j...
…where nSNPs is the total number of high-quality SNPs in our dataset (i.e., [...]), GT_ix is the genotype of sample i at locus x, and GT_jx is the genotype of sample j at locus x.
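The distance formula itself did not survive in this text, so the following is only a sketch of one plausible reading of the surviving definitions: genotypes are compared locus by locus, loci with a missing call in either sample are ignored, and the raw sum is scaled by nSNPs. The numeric genotype coding (0, 1, 2 as alternate-allele counts), the use of None for the missing code N, the absolute-difference metric, and the sample names are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_distance(gt_i, gt_j):
    """Sum |GT_ix - GT_jx| over loci called in both samples, scaled by nSNPs.

    ASSUMPTIONS: genotypes are coded 0/1/2 (alternate-allele count) and a
    missing call (N in the text) is represented as None and skipped.
    """
    if len(gt_i) != len(gt_j):
        raise ValueError("genotype vectors must cover the same loci")
    n_snps = len(gt_i)
    if n_snps == 0:
        raise ValueError("at least one locus is required")
    raw = sum(
        abs(a - b)
        for a, b in zip(gt_i, gt_j)
        if a is not None and b is not None
    )
    return raw / n_snps


def distance_matrix(genotypes):
    """Build a symmetric sample-by-sample distance table (dict of dicts)."""
    names = list(genotypes)
    dist = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        dist[a][b] = dist[b][a] = pairwise_distance(genotypes[a], genotypes[b])
    return dist


# Hypothetical genotype vectors over four loci, for illustration only.
genotypes = {
    "sampleA": [0, 1, 2, None],
    "sampleB": [0, 2, 0, 1],
    "sampleC": [0, 1, 2, 1],
}
dist = distance_matrix(genotypes)
```

A symmetric table of this kind is the sort of input a neighbor-joining routine expects; the clustering step itself is left to the R tooling named in the text.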
The resulting distance matrix was clustered with the nj function in phytools in R [83]. We visualized the resulting tree structure and associated data with ggtree in R [84-86]. [...]) for each sample was calculated using the following formula: [...] calls of a given type which overlapped with each bin.

To identify potential translocations from the TELL-seq data, we used the Tell-Link [...] where consecutive aligned segments longer than 10,000 bp on the same contig aligned to different chromosomes. To experimentally validate the predicted junctions, we [...] i.e., one from each of the two chromosomes predicted to be joined together. Primers were selected from the 500 bp directly upstream and downstream from the predicted [...] used. Primers are in Table S5.

To make the ZMS1 allele swap strains, we used the NEBuilder HiFi assembly kit to clone both alleles of ZMS1 from the 814-168 strain background into the pUC19 [...] presence of specific ZMS1 alleles by Sanger sequencing using oTO736.

To generate the deletion strains, the ZMS1::NAT cassette was amplified from the [...] Integration was tested using oTO5 and oTO1218 and loss of the wild-type ZMS1 gene was tested using oTO736 and oTO1218.
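The translocation criterion described in the methods above (consecutive aligned segments longer than 10,000 bp on the same contig that map to different chromosomes) can be sketched as a simple scan. The input layout (tuples of contig, contig start, contig end, chromosome), the function name, and the coordinates are hypothetical; the real assembler output format is not described in the surviving text. Applying the length filter before pairing neighbors is also an assumption.

```python
from collections import defaultdict


def find_candidate_fusions(segments, min_len=10_000):
    """Flag neighboring long segments on one contig that hit different chromosomes.

    segments: iterable of (contig, contig_start, contig_end, chromosome).
    Returns (contig, left_chrom, right_chrom, left_end, right_start) tuples;
    the gap between left_end and right_start is any intervening sequence.
    """
    by_contig = defaultdict(list)
    for contig, start, end, chrom in segments:
        # Keep only segments strictly longer than the threshold.
        if end - start > min_len:
            by_contig[contig].append((start, end, chrom))

    candidates = []
    for contig, segs in by_contig.items():
        segs.sort()  # order along the contig
        for (s1, e1, c1), (s2, e2, c2) in zip(segs, segs[1:]):
            if c1 != c2:
                candidates.append((contig, c1, c2, e1, s2))
    return candidates


# Hypothetical alignments: ctg1 joins Chr3 and Chr4 across a short
# intervening block; ctg2 maps entirely to Chr1.
segments = [
    ("ctg1", 0, 15_000, "Chr3"),
    ("ctg1", 15_000, 16_300, "Chr6"),
    ("ctg1", 16_300, 40_000, "Chr4"),
    ("ctg2", 0, 20_000, "Chr1"),
    ("ctg2", 20_000, 45_000, "Chr1"),
]
fusions = find_candidate_fusions(segments)
```

Each reported junction would still need experimental confirmation, which is the role of the junction-spanning PCR described in the text.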