Ploidy tug-of-war: evolutionary and genetic environments influence the rate of ploidy drive in a human fungal pathogen

Variation in baseline ploidy is seen throughout the tree of life, yet the factors that determine why one ploidy level is selected over another remain poorly understood. Experimental evolution studies using asexual fungal microbes with manipulated ploidy levels intriguingly reveal a propensity to return to the historical baseline ploidy, a phenomenon that we term ‘ploidy drive’. We evolved haploid, diploid, and polyploid strains of the human fungal pathogen Candida albicans under three different nutrient limitation environments to test whether these conditions, hypothesized to select for low ploidy levels, could counteract ploidy drive. Strains generally maintained or acquired smaller genome sizes in minimal medium and under phosphorus depletion than in a complete medium, whereas they mostly maintained or acquired increased genome sizes under nitrogen depletion. Surprisingly, improvements in fitness often ran counter to changes in total nuclear genome size; in a number of scenarios, lines that maintained their original genome size increased in fitness more than lines that converged towards diploidy. Combined, this work demonstrates a role for both the environment and genotype in determining the rate of ploidy drive, and highlights questions that remain about the force(s) that cause genome size variation.


Introduction
Baseline ploidy levels vary among closely related extant species, and whole genome sequencing has revealed widespread paleopolyploidy throughout the eukaryotic tree of life (reviews on animals: Gregory and Mable 2008; fungi: Albertin and Marullo 2012; and plants: Wendel 2015). Ploidy variation has the potential to affect many aspects of evolutionary dynamics, and there are theoretical advantages to both high and low ploidy levels, primarily related to differences in the mutation rate and the efficiency of selection (Thompson and Lumaret 1992; Orr and Otto 1994; Otto and Whitton 2000; Gerstein and Otto 2009, and references within). In brief, more mutations arise in higher ploidy backgrounds; thus, if new beneficial mutations are the rate-limiting step in adaptation, higher ploidy populations should be advantageous. However, if the adaptive mutations are not completely dominant or have low penetrance, they will take longer to reach high frequency in higher ploidy populations and may be lost due to chance.
Additional complexities regarding the influence of ploidy level on the characteristics of beneficial mutations have been revealed in recent experimental studies with yeast. Different ploidy levels may have different suites of mutations available to them (Orr and Otto 1994;Gerstein and Otto 2009;Selmecki et al. 2015). Furthermore, the same mutations can have different effect sizes in different ploidy backgrounds, independent of dominance (Gerstein 2012; Selmecki et al. 2015), implying that ploidy itself has important consequences on the mutational pathways available to evolution.
Ploidy also has the potential to directly or indirectly influence the physiological characteristics of an organism. In single-celled eukaryotes, ploidy directly influences the volume of the cell (likely mediated through differences in nuclear volume, Schmoller and Skotheim 2015), and ploidy/cell volume have been directly implicated in differences in amino-acid pools and enzyme activity (Weiss et al. 1975), gene expression (Galitski et al. 1999;Storchová et al. challenges it commonly faces. If the nutrient limitation hypothesis has a significant effect on ploidy drive, we predicted that compared to a complete medium, under nutrient limitation initially haploid strains should stay haploid, while initially polyploid strains should converge to diploidy faster. Unexpectedly, we found that a more nuanced view of the influence of nutrient limitation on ploidy drive was required. Strains evolved under phosphorus depletion and minimal medium generally behaved as predicted, while strains evolved under nitrogen depletion behaved more similarly to those evolved under complete medium.

Strains & environments
We utilized nine C. albicans strains for this study that vary in their relationship to each other ( Table 1). The first set are homozygous lab strains, consisting of two related haploid strains (1N1 & 1N2, that are ~91% similar) and a diploid strain (2N1) that is isogenic to strain 1N2. The second set of strains is of clinical origin, and contains a heterozygous diploid strain (2N2) and a polyploid strain with a complex karyotype (4N1) isolated from the same patient, and a euploid tetraploid strain isolated from a different patient (4N2). The remaining strains include the diploid laboratory reference strain SC5314 (2N3), and two related polyploids (4N3 & 4N4). All strains have previously been genotyped by either CGH-SNP arrays or whole genome sequencing (see Table 1 for references).
Each strain was initially streaked from glycerol stocks stored at -80°C onto an SDC (synthetic defined complete medium) plate. After 48 h at 30°C, a single colony was arbitrarily chosen and transferred to a microcentrifuge tube in 15% glycerol. We refer to these single-genotype stocks as the "ancestral strains."
[Table 1 fragment: mating product between two SC5314-derived strains (Bennett and Johnson 2003)]
We assessed ploidy drive in four different environments. Standard complete yeast medium (SDC) contains 1.7 g/L yeast nitrogen base (which includes the phosphorus source, 1.0 g/L KH2PO4), 5.0 g/L ammonium sulfate (the primary nitrogen source), 0.08 g/L uridine, 0.04 g/L adenine, and standard amino acids (Rose et al. 1990

Fitness measurements of ancestral strains and evolved lines
We use the term biological replicate to refer to measurements obtained from overnight cultures that were grown up from different colonies, and technical replicate for culture that came

Genome size analysis with flow cytometry
Flow cytometry was used to determine the genome size of evolved populations from all lines. This type of flow cytometry uses a dye that stains nuclear DNA, and thus enables total nuclear genome size quantification by measurement of fluorescence intensity. Throughout, when we use the term "genome size", we are referring to this measure, not the haploid genome size ("C-value"). Sample preparation was similar to previously published methods in C. albicans (Hickman et al. 2013) with the following modification: SybrGreen dye was diluted 1:100 in 50:50 TE and the final incubation step was conducted overnight in the dark at room temperature (the previously published protocol used 1:85 diluted SybrGreen solution and incubated overnight at 4°C). All ancestral and evolved lines from all environments from the same initial ploidy group were always prepared simultaneously and run on the same day on an LSRII flow cytometer (BD Biosciences); this enables us to accurately compare ancestral and evolved genome sizes. We collected FITC-A fluorescence data from 10,000 cells from each population. Cells are collected from all stages of the cell cycle, leading to some unavoidable ambiguity in ploidy determination (e.g., a haploid cell in the G2 phase of the cell cycle contains the same amount of DNA as a diploid cell in G1 phase). As we purposefully did not bottleneck populations prior to analysis, in some cases multiple peaks representing subpopulations of differing genome sizes were present (Figure 1).
As nuclear genome size is determined by fluorescence intensity, the units are arbitrary and depend entirely on the machine settings, which were chosen to enable quantification of haploid to octoploid G1 peaks, and kept constant throughout our experiments.
The mean G1 peak of the predominant ("major") genome size peak was recorded for each replicate in FlowJo (Tree Star) using the cell cycle platform to fit the Watson pragmatic model.
Ancestral haploid cells always contained a large G2 peak, thus when a visible haploid peak was retained after evolution we recorded the major peak for that population as haploid ( Figure 1A).
Conversely, ancestral diploid and polyploid cells are predominantly in G1 phase after overnight growth in YPD; thus, for initially diploid and initially polyploid evolved populations we recorded the major ploidy population as the highest peak (Figure 1B). All samples were prepared and measured twice by flow cytometry with extremely consistent results. We based the majority of our analyses on the major peak observed but also recorded additional ("minor") peaks when they were present. Total nuclear genome size change was calculated by comparing the major evolved peak to the median ancestral genome size.
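The genome size change calculation described above can be illustrated with a minimal sketch. It assumes the comparison is a simple difference between the evolved major G1 peak and the median of the ancestral G1 peaks; the text does not specify whether a difference or a ratio was used. The function name and all intensity values are hypothetical and are not data from this study.

```python
from statistics import median


def genome_size_change(evolved_major_g1, ancestral_g1_values):
    """Return the evolved major G1 peak minus the median ancestral G1 peak.

    Assumption: "comparing" is taken to mean a difference in FITC
    intensity (arbitrary units); a ratio would be an equally valid reading.
    """
    return evolved_major_g1 - median(ancestral_g1_values)


# Hypothetical FITC intensities (arbitrary units), for illustration only.
ancestral = [188.0, 190.0, 192.0, 189.0, 191.0]
evolved = {"line_1": 190.0, "line_2": 305.0}

changes = {
    line: genome_size_change(peak, ancestral) for line, peak in evolved.items()
}
```

Using the median rather than the mean of the ancestral replicates makes the baseline less sensitive to a single atypical ancestral measurement.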

Figure 1. Ploidy analysis from plots of FITC intensity.
A) Three representative flow profiles after evolution from initially haploid lines. In each plot the major ploidy peak (G1 maj) is indicated by a red star. Due to ambiguity between the haploid G2 and diploid G1 peak, if a haploid G1 peak was present, regardless of height, this peak was recorded as G1 maj (center panel). B) Three representative flow profiles after evolution from initially polyploid lines. In each plot the mean of the major ploidy peak (G1 maj) is indicated by a blue star. For initially diploid and polyploid lines, when more than two peaks were present (indicative of a polymorphic population), we recorded G1 maj from the highest peak (blue star) and G1 min from the next highest peak (yellow star). Note that because we assay cells at all stages of the cell cycle, we did not record the mean of peaks that are consistent with the G2 phase (open stars).

Statistical analyses
All analyses were conducted in the R programming language (R Core Team 2014). To examine the factors that influence genome size change we ran two-way ANOVAs with strain, evolutionary environment, and their interaction as the predictor variables and change in genome size as the response. When the interaction was not significant we re-ran the model without this term. To determine the relationship among environments we ran the HSD.test function (i.e., Tukey test with multiple comparisons) from the agricolae package (de Mendiburu 2015).
Among-strain differences for initial fitness were separately examined for each environment with an ANOVA followed by a Tukey test. To test for the correlation between initial fitness and change in fitness in each environment we used the non-parametric Kendall's rank association test.
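To make the rank association measure concrete, the following is a minimal sketch of the Kendall tau statistic. It is not the R routine used for the analyses: it computes only the tau-a statistic (no tie correction and no p-value), and the function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Pairs tied in either variable count as neither concordant nor
    discordant, so this sketch is only appropriate when ties are rare.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs
```

A value near -1 for initial fitness against change in fitness would indicate that initially less fit lines tended to improve the most, and a value near 0 would indicate no monotonic association.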
To test whether initial ploidy and/or evolutionary environment significantly influenced the observed fitness changes we ran a linear mixed-effect model for each fitness measure (i.e., growth rate and yield) using the lme function from the nlme package (Pinheiro et al. 2016) with environment, ploidy and their interaction as predictor variables and strain as a random effect.
The predicted marginal means, confidence intervals, and significance of contrasts between each pair of environments for each ploidy group were obtained using the lsmeans function.
Finally, to test whether change in genome size predicted improvement in fitness, for each initial ploidy group we ran a linear mixed-effect model with the change in fitness as the response variable, change in genome size as the predictor variable, and strain background and

Evolutionary environment influenced the rate of ploidy drive
After ~140 generations of evolution in synthetic defined complete medium (SDC), minimal medium (MM), phosphorus-depleted medium (Pdep) and nitrogen-depleted medium (Ndep), only 50% (49/98) of initially haploid and ~74% (142/192) of initially polyploid lines maintained even a minor cell population of their initial ploidy state. By contrast, 96% of diploid lines remained exclusively diploid (138/144 lines) (Figure S1). As expected, ploidy drive was thus broadly observed. To determine whether evolution under nutrient limitation specifically could counteract or enhance ploidy drive, we compared the rate of ploidy drive under the different environmental conditions.
Figure 2. The frequency of ploidy drive was influenced by evolutionary environment and strain background. Haploid, diploid, and polyploid lines were evolved in complete (SDC), minimal (MM), phosphorus-depleted (Pdep), and nitrogen-depleted (Ndep) medium. Plotted is the mean (± SE) major-population genome size of 12 replicate lines evolved for ~140 generations (see Figure S1 for evolved genome size of each line individually). Genome size is measured as the fluorescence intensity in the FITC channel of SybrGreen dye. This dye stains nuclear DNA and thus fluorescence intensity reflects the total nuclear genome size (note that fluorescence intensity does not scale linearly with genome size; intensity of ~90 corresponds to haploidy, ~190 corresponds to diploidy, while ~310 corresponds to tetraploidy). Grey squares indicate the range of genome sizes of twelve ancestral replicates measured on the same day as evolved lines.
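Because fluorescence intensity does not scale linearly with genome size, one simple way to read the intensities in Figure 2 is to assign each measurement to the nearest of the three approximate reference values given in the caption. The sketch below illustrates this; the nearest-value rule and the function name are assumptions made for illustration and are not the peak-calling procedure used in the study.

```python
# Approximate reference intensities taken from the Figure 2 caption.
REFERENCE_INTENSITY = {"haploid": 90.0, "diploid": 190.0, "tetraploid": 310.0}


def nearest_ploidy(fitc_intensity):
    """Return the ploidy label whose reference intensity is closest.

    Assumption: absolute distance on the raw intensity scale. Exact ties
    resolve to the first label in REFERENCE_INTENSITY, and intermediate
    values (e.g. aneuploid or mixed populations) are forced into one of
    the three classes, so borderline calls should be treated with care.
    """
    return min(
        REFERENCE_INTENSITY,
        key=lambda level: abs(REFERENCE_INTENSITY[level] - fitc_intensity),
    )
```

For example, an intensity of 200 would be labeled diploid under this rule, while an intensity of 300 would be labeled tetraploid.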

Initial fitness does not predict ploidy changes
We tested whether the observed genome size changes could be explained by variation in initial growth rate or yield (optical density at the time of transfer: 24 hours in SDC, 48 hours in MM, Ndep and Pdep), two measures typically used to characterize fitness in microbes. The homozygous strains (1N1, 1N2 and 2N1) were always amongst the slowest growing and lowest yield strains for all growth environments, including MM and Pdep, the environments where haploidy was maintained ( Figure 3). For the remaining heterozygous diploid and polyploid strains, strain background rather than ploidy group per se seemed to determine initial growth rates. Although the two heterozygous diploid strains were always among the most fit, there were no clear or consistent growth differences between them and the polyploid laboratory or clinical strains that could have predicted genome size changes. Differences in initial fitness thus did not explain the observed variation among environments for ploidy drive.

Changes in genome size and improvement in fitness are not correlated
The majority of evolved lines increased in growth rate and yield relative to the ancestral strains ( Figure S2). Across the entire dataset, the fitness changes were significantly influenced by

Discussion
The evolutionary environment, initial ploidy, and genetic background all significantly influenced the rate of genome size change in C. albicans. Consistent with previous experiments in both C. albicans (Hickman et al. 2013; 2015) and other diverse fungal microbes (Gerstein et al. 2006; Gresham 2006; Schoustra et al. 2007; Seervai et al. 2013; Voordeckers et al. 2015), we found that the genome size changes observed over ~140 generations of batch culture evolution were nearly always towards the baseline ploidy (diploidy for C. albicans), a phenomenon that we term 'ploidy drive'. By tracking genome size changes during evolution under nutrient limitation we tested whether the predicted costs of higher genome sizes (e.g., the extra P and N demands for DNA/RNA and proteins) can counteract ploidy drive. We found that the evolutionary environment did significantly influence the frequency of genome size change (Figure 2): lower ploidy levels were selected or maintained in phosphorus-depletion (Pdep) and minimal medium (MM), while higher genome sizes were generally selected or maintained in complete medium (SDC) and nitrogen-depletion (Ndep). in Pdep exhibited changes in genome size (in all cases an increase in genome size, Figure S1).
Interestingly, the evolved lines that increased in genome size had reduced growth rates relative to the evolved lines that stayed diploid (mean growth rate of the five measured lines that increased in ploidy: 0.26 ± 0.02; mean growth rate of the five lines that remained diploid: 0.31 ± 0.01), a further indication that genome size transitions occur independently of increases in growth rates.
Of the initially polyploid strains, all evolved lines from the clinical strain 4N1 showed genome size reduction (Figure 2 and S1), while many evolved lines from the other clinical strain (4N2) and the two laboratory strains (4N3, 4N4) retained sub-populations of cells with their initial genome size ( Figure S1). 4N1 has a unique karyotype that contains both tetrasomies and trisomies as well as two copies of isochromosome 5L ( Future studies are needed to shed light on particular genomic features important in promoting (or counteracting) ploidy drive.
Initial fitness was more influenced by genetic background than initial ploidy in all environments. Genome-wide homozygosity is known to reduce competitive fitness in C. albicans (Hickman et al. 2013), and clinical homozygous strains have yet to be obtained. Accordingly, the three homozygous strains (1N1, 1N2 and 2N1) were consistently among the least fit strains regardless of environment (Figure 3), with the caveat that these three strains are nearly isogenic. While the nutrient limitation hypothesis predicts that higher ploidy cells will grow slower in nutrient limitation relative to lower ploidy cells, C. albicans polyploids typically had similar fitness to the heterozygous diploids (Figure 3), despite having larger cell size and volume (Hickman et al. 2013). Furthermore, if initial fitness dictated the rate of ploidy drive we would expect to see higher rates of ploidy drive in strains that were initially less fit. This prediction was not borne out; strain 4N1 had an increased rate of ploidy drive yet similar fitness to the other polyploid lines (with the exception of MM). Thus the frequency of ploidy drive of polyploid lines cannot be due to differences in initial fitness.
While the majority of lines increased in fitness after evolution (Figure 5 & S2), neither initial ploidy nor final genome size correlated with the observed changes (Figure 6). An a priori explanation for ploidy drive is that cells at their baseline ploidy have higher fitness than ploidy- . Notably, in our experiments genome size stability often ran counter to changes in fitness: haploid lines stayed haploid in MM and Pdep yet improved in fitness the least, the diploid lines that changed in genome size in Pdep were less fit than diploid lines that stayed diploid, and polyploid lines were most stable in Ndep yet also improved in fitness the most. The force driving genome size dynamics is thus independent of the measured fitness parameters in these microbial taxa, which have streamlined

Supporting Tables
Table S1. Tukey test results following ANOVA tests to examine the influence of environment and strain on the rate of genome size change.
Table S2. Least square means pairwise posthoc tests to determine the influence of evolutionary environment on change in growth rate. Degrees of freedom were 342 in all tests.
Table S3. Least square means pairwise posthoc tests to determine the influence of evolutionary environment on change in yield. Degrees of freedom were 342 in all tests.
Figure S1. Strain background and environment influence genome stability. Twelve lines of each ancestral strain were evolved in complete (SDC, black points), minimal (MM, blue), phosphorus-deprivation (Pdep, green) and nitrogen-deprivation (Ndep, red) medium. Filled points represent the major peak and hollow points indicate the minor peaks in FITC intensity (see Figure 1) after ~140 generations of evolution. Top row = initially haploid lines; middle row = initially diploid lines; bottom row = initially polyploid lines. Grey boxes indicate the genome size range measured from twelve ancestral replicates for each strain.