Seed Banks Alter the Rate and Direction of Molecular Evolution in Bacillus subtilis

Fluctuations in the availability of resources constrain the growth and reproduction of individuals, which in turn affects the evolution of their respective populations. Many organisms are able to respond to fluctuations by entering a reversible state of reduced metabolic activity, a phenomenon known as dormancy. This pool of dormant individuals (i.e., a seed bank) does not reproduce and is expected to act as an evolutionary buffer, though it is difficult to observe this effect directly over an extended evolutionary timescale. Through genetic manipulation, we analyze the molecular evolutionary dynamics of Bacillus subtilis populations in the presence and absence of a seed bank over 700 days. We find that the ability to enter a dormant state increases the accumulation of genetic diversity over time and alters the trajectory of mutations, findings that are recapitulated using simulations based on a simple mathematical model. While the ability to form a seed bank does not alter the degree of negative selection, we find that it consistently alters the direction of molecular evolution across genes. Together, these results show that the ability to form a seed bank affects the direction and rate of molecular evolution over an extended evolutionary timescale.

Nature is rarely static. Temporal fluctuations in abiotic and biotic environmental factors often reduce the rate at which an organism can grow and reproduce. To contend with such fluctuations, many species enter a reversible state of reduced metabolic activity, an adaptation known as dormancy [1]. In this state, individuals can endure environmental stressors until they subside, a temporary cessation of short-term reproductive efforts in favor of long-term reproductive gains. This evolutionary trade-off, and the life-history strategies through which it is implemented, have received substantial attention by means of theoretical [2][3][4][5][6] and empirical investigations [7][8][9], spurred in part by the observation that dormancy has independently evolved multiple times across the tree of life [10][11][12].

While the trade-off aspect of dormancy has been of considerable historical interest, life-history traits do not operate in a population genetic vacuum. The fitness benefit of a life-history trait is often a consequence of its effect on birth-death processes [13], population dynamics that subsequently alter the dynamics and fates of genetic variation. The ability to enter a dormant state is no exception. The accumulation of dormant individuals within a system can lead to the formation of seed banks [10], demographic structures that can reshape the molecular evolutionary dynamics of a population.

The presence of a seed bank primarily alters the molecular evolutionary dynamics of a population through two means. First, the ability to enter a dormant state dampens the accumulation of de novo genetic diversity and its subsequent fluctuations, as dormant individuals do not reproduce and the vast majority of mutations are typically acquired during the process of genome replication [1,14]. Second, seed banks can act as reservoirs of genetic and phenotypic diversity.
These reservoirs reduce the efficiency of natural selection [15][16][17], dampen the loss of genetic diversity due to random genetic drift [17][18][19][20], and allow for temporarily deleterious variants to be retained until environmental changes occur [21,22]. Alternatively stated, the presence of a seed bank reduces the rate of molecular evolution while increasing the maximum amount of genetic diversity that can be retained. Finally, because the ability to form a seed bank is the result of a life-history strategy maintained by natural selection, it is possible that the formation of seed banks restricts the targets of molecular evolution, constraining the direction of evolution as well as its rate. While substantial progress has been made towards developing mathematical models that describe these patterns within the discipline of theoretical population genetics [16][17][18][19][20], there remains a comparative lack of empirical data necessary to evaluate and test their central predictions.

The role of seed banks as an evolutionary buffer means that time is an essential factor when considering an empirical system. Namely, if the per-generation rate of change of genetic diversity is reduced by a given amount, it is necessary to observe a proportionately greater number of generations. This constraint makes it challenging to directly observe seed bank dynamics over extended evolutionary timescales. Given their short generation times, large population sizes, propensity for rapid adaptation, and the prevalence of dormancy among their lineages [23], microorganisms are an ideal group of organisms in which to characterize the extent to which dormancy alters molecular evolutionary dynamics. In addition, certain lineages of microorganisms have evolved the ability to form complex protective structures (i.e., endospores) that allow them to survive and form long-lasting seed banks [24,25].
While these structures are not the only means through which microorganisms can enter a dormant state [26], their existence provides a means through which the formation of seed banks can be genetically manipulated.

Furthermore, while evolution experiments have been performed using dormancy-capable microorganisms [27], questions pertaining to dormancy have primarily been restricted to examining the phenotypic decay of endospore formation via the acquisition of de novo mutations under relaxed selection [28][29][30], whereas the effect of endospore formation on the molecular evolutionary dynamics of microorganisms remains unexplored.

In this study, we examine the molecular evolutionary dynamics of Bacillus subtilis populations that differ in their ability to form protective endospores, a non-reproductive structure that is the primary mechanism through which this species enters a dormant state to form a seed bank. Replicate populations were maintained for over 700 days, generating a molecular fossil record which was reconstructed to determine how the presence of a seed bank altered the trajectories of de novo mutations. We then recapitulated the dynamics we observed using simulations based on a stochastic model of molecular evolution in dormancy-capable populations. Finally, we identified the sets of genes that were enriched for mutations within each transfer regime for dormancy-capable and incapable populations, allowing us to quantify parallel evolution among replicate populations as well as the degree of divergent evolution between dormancy-capable and incapable populations.

To manipulate endospore formation we deleted spo0A, the master regulatory gene for sporulation pathways in B. subtilis. Gene deletion was performed using Gibson assembly and PCR amplified dsDNA fragments upstream and downstream of spo0A.

Purified ligated plasmid was transformed into E. coli DH5α and plasmid DNA was purified from cultured positive transformants. Purified plasmid product was injected into E. coli TG1, positive transformants were selected, and plasmid DNA was purified before a single B. subtilis NCIB 3610 colony was grown in inoculate containing a purified plasmid aliquot and identified using antibiotic plating. Transformation was confirmed via PCR and loss of antibiotic resistance was confirmed via antibiotic plating (see Supplemental Methods for additional detail).

B. subtilis ∆spo0A evolution experiment

Fitness assay

We performed fitness assays to determine the degree to which the ability to form a protective endospore provided a fitness advantage across energy-limited environments.

At late-exponential phase, aliquots of B. subtilis WT and B. subtilis ∆spo0A cultures were transferred to three replicate flasks with new media at an equal ratio. Flasks were periodically plated over 100 days in duplicate for a range of serial dilutions. Colonies were distinguishable by their morphology, providing estimates of WT ($N_{WT}$) and ∆spo0A ($N_{\Delta spo0A}$) population sizes. The relative log fitness after t days, X(t), was defined from these estimates. Because we were taking measurements from a population that will inevitably go extinct in an increasingly energy-limited environment and we do not know the number of generations that occur after exponential growth, we chose to examine X at each given time point rather than use the typical per-unit-time fitness estimate $\Delta X = \frac{1}{\Delta t} \cdot X(t)$.

contamination. An identical experiment was concurrently run with the WT strain, which was previously described [31].

Under our chosen transfer regime with 1:10 dilutions and assuming that the decline in population size is negligible, replicate populations evolved for $\log_2(N_f/N_i) \approx 3.3$ generations per transfer, with a cumulative ∼3,300, 330, and 30 generations for 1, 10, and 100-day transfer regimes, respectively. However, that is not necessarily the case, as population size can decline by several orders of magnitude depending on the transfer regime, meaning that the number of generations per transfer can be substantially higher than ≈ 3.3. In addition, CFU counts suggest that ∆spo0A has a different time-dependent net rate of growth than the wild type, meaning that 10 and 100-day transfer regimes of those two strains undergo a slightly different number of generations each transfer. While we cannot account for any generations that occur within a transfer, we can account for the change in population size after resource replenishment for each transfer regime and strain to get a more accurate estimate of the timescale of the experiment. However, an unknown number of generations likely occurred while populations remained in stationary phase due to cells using dead cells as a nutrient source. This "cryptic growth" [25][31][32][33] suggests that the estimated number of generations likely represents the true number of generations for 1-day transfers, while it is closer to an estimate of the minimum number of generations for 10 and 100-day transfers.
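The per-transfer and cumulative generation estimates above follow from simple arithmetic, sketched here in Python. The function names and the `total_days` argument are illustrative assumptions and not part of the original analysis. Note that the quoted totals of ∼3,300, 330, and 30 generations correspond to roughly 1,000, 100, and 10 transfers at ≈ 3.3 generations each.

```python
import math


def generations_per_transfer(dilution_factor: float) -> float:
    """Minimum doublings needed to regrow after a 1:dilution_factor dilution."""
    return math.log2(dilution_factor)


def cumulative_generations(total_days: float, days_between_transfers: float,
                           dilution_factor: float = 10.0) -> float:
    """Minimum cumulative generations, ignoring population decline and cryptic growth."""
    n_transfers = total_days / days_between_transfers
    return n_transfers * generations_per_transfer(dilution_factor)


# total_days = 1000 is an assumed value chosen to match the quoted totals.
for regime in (1, 10, 100):
    total = cumulative_generations(1000, regime)
    print(f"{regime}-day transfers: about {total:.0f} generations")
```

Because the calculation ignores declines in population size between transfers, the values for the 10 and 100-day regimes are best read as lower bounds, as the text notes.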

DNA extraction, library preparation, and pooled population sequencing were performed on all ∆spo0A timepoints for all replicate populations as previously described [31]. The first 20 bp of all reads were trimmed and all read pairs where at least one read had a mean Phred quality less than 20 were removed using cutadapt v1.9.1 [34]. Candidate variants were identified using a previously published approach [35] that relied on alignments generated from breseq v0.32.0 [36], which was modified as previously described [31].

Mutation trajectory analyses

We estimated the frequency of the mth mutation candidate in the pth population at the tth timepoint using the naive estimator $\hat{f}_{pmt} \equiv A_{pmt}/D_{pmt}$, where $A_{pmt}$ and $D_{pmt}$ are the total number of reads containing the alternate allele and the total depth of coverage, respectively. We examined the accumulation of mutations by time t as the sum of derived allele frequencies

$$M_p(t) = \sum_{m} \hat{f}_{pmt} \qquad (2)$$

Given that the log of M(t) over time often appeared to saturate in this study as well as in previous studies [31,35], we modelled the relationship between M(t) and t using a saturating function (Eq. 3), where $[\log_{10} M]_{max}$ is the maximum value of $\log_{10} M(t^*)$ and $t^*_{1/2}$ is the value of $t^*$ where $\log_{10} M(t^*)$ is half of $[\log_{10} M]_{max}$. The variable $t^*$ represents the shift in time so that Eq. 3 reduces to the intercept parameter ($\log_{10} M(0)$) at the first temporal sample, in this case, $t^* = t - 100$ days. We then multiplied $t^*_{1/2}$ by the estimated minimum number of per-day generations, the product of which we define as $\tau_{1/2}$. While this model is phenomenological in that we do not posit a microscopic mechanism, much like the Michaelis-Menten kinetic model from which it is derived [37], it effectively captures the hyperbolic pattern. Numerical optimization was performed over 54 initial conditions using the BFGS algorithm in Python using statsmodels [38].

We examined three different measures to determine how the ability to enter a dormant state affected molecular evolutionary dynamics. First, $f_{max}$ is the maximum estimated frequency of a given mutation over T observations. Second, $|\Delta f|/\Delta\tau$ is the magnitude of change in f between two observations. Finally, Qf(τ) is the direction of change for f between two observations, where Q corresponds to the quotient. We compared the empirical cumulative

Parallelism at the gene level

We identified potential targets of selection by examining the distribution of nonsynonymous mutations across genes using a previously published approach [35]. To briefly summarize, gene-level parallelism was assessed by calculating the multiplicity of each gene as $m_i = n_i \cdot \bar{L} / L_i$, where $n_i$ and $L_i$ are the number of mutations observed in and the length of the ith gene, and $\bar{L}$ is the mean length of all genes. Under this definition, the null hypothesis is that all genes have the same multiplicity $\bar{m} = n_{tot}/N_{genes}$. Using the observed and expected values, we can quantify the net increase of the log-likelihood of the alternative hypothesis relative to the null, where significance was assessed using permutation tests. To identify specific genes that are enriched for mutations, we calculated a P-value for each gene ($P_i$), where FDR correction was performed by defining a critical P-value ($P^*$) based on the survival curve of a null Poisson distribution. We then defined the set of significant genes for each strain-transfer combination for α = 0.05 as $I = \{ i : P_i < P^*(\alpha) \}$.
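The multiplicity calculation described in this section can be sketched in Python as follows. This is a minimal illustration and not the authors' pipeline: the function names are hypothetical, the upper-tail Poisson P-value is an assumed form suggested by the null Poisson distribution mentioned in the text, and the FDR step that sets the critical P-value is not implemented.

```python
import numpy as np
from scipy.stats import poisson


def gene_multiplicity(n_mutations, gene_lengths):
    """m_i = n_i * mean(L) / L_i: mutation counts corrected for gene size."""
    n = np.asarray(n_mutations, dtype=float)
    lengths = np.asarray(gene_lengths, dtype=float)
    return n * lengths.mean() / lengths


def enrichment_pvalues(n_mutations, gene_lengths):
    """Assumed null: mutations land on genes in proportion to gene length.

    Returns the upper-tail Poisson probability P(X >= n_i) for each gene.
    """
    n = np.asarray(n_mutations)
    lengths = np.asarray(gene_lengths, dtype=float)
    expected = n.sum() * lengths / lengths.sum()
    # poisson.sf(k, mu) is P(X > k), so evaluate at n - 1 for an inclusive tail.
    return poisson.sf(n - 1, expected)


# Toy input: three hypothetical genes with 4, 0, and 1 mutations.
counts = [4, 0, 1]
lengths = [1000, 2000, 3000]
print(gene_multiplicity(counts, lengths))
print(enrichment_pvalues(counts, lengths))
```

In this toy input the shortest gene carries the most mutations, so it has both the highest multiplicity and the smallest P-value.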

(Con/di)vergence at the gene level

We tested for divergent/convergent evolution by examining the overlap of I between WT and ∆spo0A populations and populations from all three transfer regimes and comparing them to a null hypergeometric distribution as previously described [31]. To examine convergent/divergent evolution among enriched genes, we calculated the vector of relative multiplicities ($M_i = m_i / \sum_j m_j$) and compared the mean absolute difference across the genes in I for a given pair of transfer regimes or genetic backgrounds, which we denote $\langle \Delta M \rangle$. Null distributions of $\langle \Delta M \rangle$ were generated by constructing a gene-by-strain distribution as previously described [41].

Simulating evolution with a seed bank

We performed simulations to determine whether the empirical patterns of genetic diversity we observed were consistent with outcomes expected from seed bank effects.

Given that the rapid evolution we observed was likely driven by selection, we focused on a distribution of fitness effects of beneficial mutations, ρ(s), that followed an exponential distribution with mutations occurring at a per-individual rate $U_b$. Though, ultimately, the rate at which de novo mutations accumulate during adaptation is mediated by the distribution of fitness effects [42], we note that there is no a priori reason to suspect that the removal of endospore formation would alter the distribution of fitness effects of B. subtilis. We simulated a set of coupled forward-in-time master equations. These coupled master equations are extensions of previous theoretical efforts to characterize adaptation dynamics in asexual microbial populations [43], where we have incorporated the seed bank and resuscitation and dormancy dynamics. Simulations were performed with $U_b = 10^{-4}$ per individual. While the true value of $U_b$ is unknown, we selected a value that was on the order of magnitude of 10% of the total mutation rate (all non-lethal mutations) obtained from a previously published mutation accumulation experiment [44]. The scale parameter of ρ(s) was set to $10^{-2}$. Simulations were run for 3,300 generations with $N = 10^6$ for $c = 10^{-5}$ and values of M ranging from $10^1$ to $10^6$. Only values of M were manipulated as the same transition rates can be obtained by manipulating c or K. Ten replicate simulations were performed for each value of M. All simulations were performed using custom Python scripts.

The results of this long-term experiment test, confirm, and challenge long-standing hypotheses regarding the effect of seed banks on the dynamics of molecular evolution.

The effect of seed banks on the accumulation and fate of de Deinococcus-Thermus) frequently went extinct in 10 and 100-day transfer regimes [31]. This comparison suggests that Bacillus is exceptionally capable of surviving and evolving in harsh environments, even without access to what is generally considered its primary survival strategy.

Through pooled population sequencing, we reconstructed the trajectories of de novo mutations for all replicate populations from all transfer regimes (Fig. S1-2). By for $\log_{10} M$ to reach half of its maximum value ($\tau_{1/2}$). After obtaining estimates of these parameters via numerical optimization, we found that $[\log_{10} M]_{max}$ remained fairly constant as the time between transfers increased for the WT, but sharply decreased for ∆spo0A (Fig. 1d). In contrast, $\tau_{1/2}$ remained constant for the WT, but decreased for ∆spo0A as transfer time increased (Fig. 1e). This pattern confirms our prediction that the rate of accumulation of genetic diversity will be higher in populations that cannot form a seed bank. While the trends we observed were generally robust, the error in our estimates for the 100-day transfer regime was considerable (Table S2). This error was likely a result of the small number of mutations acquired among populations in the 100-day treatment as well as their small population sizes, increasing the variance in our estimates of M(t). Regardless, parameter estimates consistently change in directions predicted by the seed bank effect.
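A fit of this kind can be sketched in Python. The functional form below, an intercept plus a Michaelis-Menten-style hyperbolic term, is an assumption consistent with the description of the model in the text and is not taken from the paper's equation. SciPy's `minimize` with `method="BFGS"` stands in for the statsmodels optimizer named in the Methods, and the function names, the least-squares objective, and the choice of starting values are all hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def saturating_model(t_star, intercept, log_m_max, t_half):
    """Assumed hyperbolic form: intercept + log_m_max * t* / (t* + t_half)."""
    return intercept + log_m_max * t_star / (t_star + t_half)


def fit_saturating_model(t_star, log_m, n_starts=5):
    """Least-squares fit with BFGS, restarted from several initial t_half values."""
    t_star = np.asarray(t_star, dtype=float)
    log_m = np.asarray(log_m, dtype=float)

    def sse(params):
        intercept, log_m_max, log_t_half = params
        # t_half is optimized on a log scale so that it stays positive.
        pred = saturating_model(t_star, intercept, log_m_max, np.exp(log_t_half))
        return np.sum((log_m - pred) ** 2)

    positive_times = t_star[t_star > 0]
    starts = np.geomspace(positive_times.min(), positive_times.max(), n_starts)
    best = None
    for t_half0 in starts:
        x0 = [log_m[0], log_m[-1] - log_m[0], np.log(t_half0)]
        result = minimize(sse, x0, method="BFGS")
        if best is None or result.fun < best.fun:
            best = result
    intercept, log_m_max, log_t_half = best.x
    return float(intercept), float(log_m_max), float(np.exp(log_t_half))


# Synthetic, noise-free example with t* = t - 100 days already applied.
t_star = np.arange(0, 700, 100.0)
log_m = saturating_model(t_star, 0.5, 2.0, 150.0)
print(fit_saturating_model(t_star, log_m))
```

Multiplying the fitted half-saturation time by an estimate of the number of generations per day would convert it to the generation-scaled quantity $\tau_{1/2}$ described in the text.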

Given that seed banks altered the accumulation of genetic diversity, we examined how the presence of a seed bank altered the fate of a given mutation. The presence of a seed bank reduces the efficiency of selection and retains genetic diversity [15], suggesting that a given mutation would have a lower probability of extinction as well as a lower substitution rate [16,17]. Our estimates of the probability of extinction for each replicate population provide little evidence to support this claim, as there is substantial variation across replicate populations for a given strain-transfer combination (Fig. 1f). While the rate of fixation was similar for WT and ∆spo0A in the 1-day regime, as expected given that the ability to form a seed bank does not contribute to survival on that time scale, few fixation events occurred within replicate populations in the 10-day and 100-day transfer regimes (Fig. 1g). This paucity of fixations meant that the  slightly higher level of genome-wide parallelism across transfer regimes (Fig. S5). This increased level of parallelism suggests that the presence of a seed bank can make evolution more predictable. However, it is worth considering that this result may be an artifact of this particular experiment, given that spo0A is a master regulatory gene that controls cellular processes in addition to endospore formation [46,47], meaning that the decrease in parallelism in ∆spo0A could be a consequence of pleiotropy.

To deconstruct this genome-wide pattern of parallelism, we examined the mutation counts at each gene corrected for gene size (i.e., their multiplicity [35]). As expected, based on the likelihood tests, the multiplicity of the WT was consistently higher than ∆spo0A across transfer regimes (Fig. S8a-c). Identifying the set of significantly enriched genes revealed that genes enriched within the WT for a given transfer regime tended to also be enriched within ∆spo0A (Fig. S8d-f). This pattern of consistent enrichment occurred across transfer regimes as well as among transfer regime comparisons within a given strain (Fig. S5,6), suggesting that, generally, the direction of evolution at the gene-level tended towards convergence rather than divergence. We found that this is the case, as the degree of overlap in enriched genes relative to a null distribution [31] suggests convergent evolution (Fig. S9). While certain transfer regime and strain comparisons had stronger signals of convergence than others, overall convergent evolution overwhelmingly occurred.
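The overlap comparison above can be illustrated with a generic upper-tail hypergeometric test. This is a sketch under stated assumptions: the exact null used in the cited approach [31] may differ, and the function name and gene identifiers are hypothetical.

```python
from scipy.stats import hypergeom


def overlap_pvalue(genes_a, genes_b, n_genes_total):
    """P(overlap >= observed) if both gene sets were drawn at random from the genome."""
    set_a, set_b = set(genes_a), set(genes_b)
    observed = len(set_a & set_b)
    # hypergeom.sf(k, M, n, N) is P(X > k); shift by one for an inclusive tail.
    p_value = hypergeom.sf(observed - 1, n_genes_total, len(set_a), len(set_b))
    return observed, float(p_value)


# Hypothetical enriched-gene sets for two strains in a genome of 4,000 genes.
wt_genes = ["gene1", "gene2", "gene3", "gene4"]
mutant_genes = ["gene2", "gene3", "gene4", "gene5", "gene6"]
print(overlap_pvalue(wt_genes, mutant_genes, 4000))
```

A small P-value indicates more shared enriched genes than expected by chance, which is the sense in which overlap is read as a signal of convergence.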

Similar to a previous analysis [31], it is likely that gene identity was again too coarse a measure to determine whether convergent or divergent evolution occurred. While spo0A is a master regulatory gene, its removal may only have slightly perturbed the rates of evolution for a large number of genes in a given environment. If true, then it is arguably more appropriate to examine the difference in mutation counts among enriched genes in order to assess whether convergent or divergent evolution occurred.

By examining the mean absolute difference in mutation counts across enriched genes between two transfer regimes ($\langle \Delta M \rangle$) and standardizing the observed value with respect to an appropriate null distribution ($Z_{\langle \Delta M \rangle}$) we can establish whether convergent or divergent evolution occurred. The WT strain exhibited significant divergent evolution for the 1-day vs. 10-day and 1-day vs. 100-day comparisons, a result that is consistent with the WT surviving resource-poor environments by forming endospores as a life-history strategy (Fig. 5a). This conclusion is strengthened by the evidence of convergent evolution for the 10-day vs. 100-day comparison, though it was ultimately not significant. For ∆spo0A the pattern is inverted, as there is a significant signal of convergent evolution for the 1-day vs. 10-day comparison. For the 1-day vs. 100-day and 10-day vs. 100-day comparisons we find a strong signal of divergent evolution, suggesting that the direction of molecular evolution shifts between 10 and 100 days for ∆spo0A with no such shift occurring between 1 and 10 days.
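The standardized statistic described above can be sketched as follows. Several pieces are assumptions made for illustration: relative multiplicities are normalized by their sum, the null is a simple binomial reshuffling of pooled per-gene counts that stands in for the gene-by-strain null of the cited approach [41], and positive values of the standardized score are read as divergence. All function names are hypothetical.

```python
import numpy as np


def relative_multiplicity(counts, lengths):
    """Length-corrected mutation counts, normalized to sum to one."""
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    m = counts * lengths.mean() / lengths
    return m / m.sum()


def mean_abs_diff(counts_a, counts_b, lengths):
    """Mean absolute difference in relative multiplicity across genes."""
    diff = relative_multiplicity(counts_a, lengths) - relative_multiplicity(counts_b, lengths)
    return float(np.mean(np.abs(diff)))


def standardized_diff(counts_a, counts_b, lengths, n_null=1000, seed=0):
    """Z-score of the observed difference against a binomial-reshuffling null."""
    rng = np.random.default_rng(seed)
    counts_a = np.asarray(counts_a, dtype=int)
    counts_b = np.asarray(counts_b, dtype=int)
    pooled = counts_a + counts_b
    # Stand-in null: each pooled mutation is assigned to group a with the
    # probability implied by the two groups' total mutation counts.
    p_a = counts_a.sum() / pooled.sum()
    observed = mean_abs_diff(counts_a, counts_b, lengths)
    null = np.empty(n_null)
    for k in range(n_null):
        sim_a = rng.binomial(pooled, p_a)
        null[k] = mean_abs_diff(sim_a, pooled - sim_a, lengths)
    return (observed - null.mean()) / null.std()


# Hypothetical counts for three enriched genes of equal length in two regimes.
print(standardized_diff([20, 1, 1], [1, 1, 20], [1000, 1000, 1000]))
```

Under this convention, a score well above zero means the two groups differ more than the reshuffled counts do, and a score well below zero means they are more similar than expected.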

To examine how seed bank formation affected the direction of molecular evolution within a given transfer regime, we repeated the convergent/divergent analysis between the WT and ∆spo0A populations, analyses that can be bolstered through comparisons to fitness estimates (Fig. 5b). Divergent evolution overwhelmingly occurred for all three comparisons, though it was at its highest for the 10-day transfer regime (Fig. 5c). This increased divergence for the 10-day transfer regime, along with evidence of convergent evolution for the 1-day vs. 10-day comparison, may reflect adaptation of ∆spo0A to this specific regime. Our estimates of $\langle \Delta M \rangle$ can be compared to fitness estimates, allowing us to examine how the sign and magnitude of fitness change with the direction of molecular evolution. Starting from the 1-day transfer regime, we found that the fitness of ∆spo0A was effectively zero, a result that is consistent with endospore formation having a negligible fitness effect in environments where it would not be advantageous.

The increase in fitness for the 10-day transfer regime corresponded with an increase in the degree of divergent evolution in the 10-day transfer regime (Fig. 5d). By day 100, however, the fitness benefit held by ∆spo0A had dissipated and the magnitude of divergent evolution had diminished. This result suggests that the ability to form endospores actually conferred a fitness disadvantage at the 10-day mark. However, the fitness benefit was temporary and the decrease in the degree of divergent evolution suggests that it is unlikely that ∆spo0A was able to adapt to the harsh 100-day environment. An analysis of the set of genes that were enriched for mutations within a specific strain-transfer combination supports this conclusion [41]. The 100-day ∆spo0A was the only strain-transfer combination with no unique enriched genes (Table ; File S2), suggesting that ∆spo0A may have been unable to adapt to this extremely resource-poor environment.

While it is unlikely that ∆spo0A was able to adapt to 100 days of energy limitation, the 10-day ∆spo0A harbored the highest number of unique enriched genes of all three ∆spo0A transfer regimes, suggesting that adaptation may have occurred even in the absence of endospore formation as a life-history strategy. The mechanism responsible for the temporary gain in fitness of ∆spo0A is unknown, though it is likely partially due to the recycling of dead cells, a phenotype that allows individuals to exploit an untapped resource [25,48,49]. Naturally, dormant cells cannot use this resource as their metabolism is effectively nonexistent and, in the case of endospores, completely inert, leaving ∆spo0A with unrestricted access. Regardless, for the purposes of this study, the removal of endospore formation as a life-history strategy provides a clear temporary fitness benefit in environments where resources become increasingly limited.

Finally, given that endospore formation did not occur in the 1-day transfer regime, it is possible that the pathways encoding said life-history trait were susceptible to decay due to relaxed selective pressure. By calculating the fraction of nonsynonymous mutations in genes that encode for endospore formation and calculating the difference between ∆spo0A and the WT, we found that endospore-associated genes were slightly enriched in the WT for all transfer regimes, a difference that was significant using null distributions simulated using binomial sampling (Table S4). Operating under the premise that the majority of endospore-forming genes are nonfunctional in ∆spo0A populations, for WT populations in the daily transfer regime this result can be viewed as the outcome of positive selection for the removal of endospore formation as an energetically costly trait in resource-replete environments [50,51].
Prior studies [28][29][30] as well as spore accumulation assays support this conclusion, as by day 500 the ability to form endospores was rapidly reduced for populations in the 1-day transfer regime, to the point that it took 10 days for 10% of the population to form endospores (Fig. S10). However, the question remains as to why 10 and 100-day WT populations acquired a greater fraction of nonsynonymous mutations. Endospore formation undergoes no noticeable decline in these transfer regimes (Fig. S10), suggesting that these mutations had a negligible or even positive effect.

leveraged to identify environment-dependent fitness effects as well as the mode in which traits mediate fitness effects [53]. Given that endospore formation is a complex trait with many loci that are likely interacting, it may be a suitable candidate to apply recently developed models that predict the form of the distribution of fitness effects when epistatic interactions are prevalent [54].

We demonstrated that the ability to form seed banks altered the molecular evolutionary dynamics of microbial populations. Populations capable of forming seed banks consistently accumulated higher levels of genetic diversity and had a reduced rate of molecular evolution. Through forward-time simulations, we were able to recapitulate empirical observations on the effect of seed banks on the rate and direction of allele frequency changes as well as the maximum attainable frequency. In addition to testing previously proposed predictions on the effect of seed banks on genetic diversity, new patterns were found. Specifically, we determined that endospore formation has the capacity to alter the direction of molecular evolution within a population. Stated inversely, the absence of endospore formation contributed to a substantial signal of divergent evolution for populations transferred at intermediate intervals. This signal of divergence, alongside the observation that the absence of endospore formation provided a substantial fitness benefit, suggests that adaptation to energy-limited environments may be possible in the absence of a highly conserved life-history strategy. However, any such adaptation would likely be transitory, as the absence of endospore formation resulted in an increasingly strong fitness disadvantage as the degree of energy-limitation increased.

Figure 1. The presence of a seed bank altered the accumulation of genetic diversity. a-c) By examining the sum of derived allele frequencies (M(t), Eq. 2), we were able to summarize the accumulation of de novo mutations over time for all strains and transfer regimes. The WT strain typically had higher estimates of M(t) than ∆spo0A and the relationship between t and M(t) became noticeably more linear as transfer time increased for the WT strain.
d,e) To quantify the effect of seed bank formation on this empirical relationship we formulated a phenomenological model that allowed us to summarize the curve through two parameters: the maximum amount of genetic diversity that could accumulate ($[\log_{10} M]_{max}$) and the number of generations until half of the maximum is reached ($\tau_{1/2}$; Eq. 3). Values of $[\log_{10} M]_{max}$ for the WT strain steadily increased with transfer time while ∆spo0A remained consistent, in line with the prediction that the presence of a seed bank increases the amount of genetic diversity that a population can maintain. Conversely, $\tau_{1/2}$ decreased for ∆spo0A but remained constant for the WT, consistent with the prediction that the rate of molecular evolution would increase in the absence of a seed bank. f,g) However, the effect of seed banks on the final states of alleles was less straightforward. While fixation events occurred across transfer regimes and strains, there was substantial variation across replicate populations that made it difficult to determine whether the presence of a seed bank affected the probability of fixation or the rate of molecular evolution (per-generation number of substitutions).

Figure 2. Due to the low number of fixation events, it was necessary to devise alternative measures of molecular evolution to evaluate the effect of seed banks. We examined three measures corresponding to a) the maximum frequency realized by an allele, b) the per-generation magnitude of change in allele frequency, and c) the change in the direction of allele frequencies between time points (Eq. 10). These measures were examined by calculating the empirical survival distribution (the complement of the empirical cumulative distribution function) for a given strain-transfer combination.
The typical value of all three measures was higher for ∆spo0A than the WT across transfer regimes, results that are consistent with the predicted effect of a seed bank. The difference we observed between strains was confirmed via Kolmogorov-Smirnov tests for all transfer regimes (P < 0.05 marked by asterisk).

By comparing the set of genes that contributed towards parallel evolution within a given strain-transfer combination, we visualize patterns of convergent and divergent molecular evolution. While all strain-transfer combinations consistently acquired more nonsynonymous mutations than expected by chance at a large number of genes ($P < P^*$), very few genes were enriched exclusively within a given strain. Those genes that were significantly enriched within a given strain combination typically also acquired mutations at a non-significant level ($P \nless P^*$) in the remaining strain, suggesting that the removal of endospore formation did not generate evolutionary trajectories that were divergent in terms of gene identity. To increase statistical power we ignored all genes that acquired fewer than three mutations ($n_{mut} < 3$). Gene names are listed as provided in the annotated reference genome; all other names were acquired using RefSeq IDs (see File S1 for gene metadata).

Figure 5. By examining the number of mutations within each gene, we can determine whether convergent or divergent evolution occurred between a given pair of transfer regimes or strains by calculating the mean absolute difference of mutation counts across genes ($\langle \Delta M \rangle$; Eq. 9). a) A comparison between all transfer regime combinations within each strain reveals contrasting dynamics of divergence/convergence. Divergent evolution initially occurs among the WT background for 1 vs. 10 and 1 vs. 100-day comparisons, with 10 and 100-day transfers having a weak signal of convergence.
This pattern is inverted for ∆spo0A, as 1 and 10-day transfers converged and the remaining transfer regime combinations diverged. b) For comparisons between strains, we can compare signals of convergent/divergent evolution between WT and ∆spo0A strains with the transfer regime-dependent fitness effects of removing spo0A. c) Generally, we found that divergent evolution consistently occurred across transfer regimes, with the 10-day transfers harboring the strongest signal of divergent evolution. d) Mapping signals of divergent evolution to estimates of fitness, we can see how the sign and magnitude of selection changes with the degree of divergent evolution. Asterisks denote P < 0.05.