Profiling the initial burst of beneficial genetic diversity

24 Clonal populations of cells continuously evolve new genetic diversity, but it takes a significant 25 amount of time for the progeny of a single cell with a new beneficial mutation to outstrip both its 26 ancestor and competitors to fully dominate a population. If these driver mutations can be 27 discovered earlier—while they are still extremely rare—and profiled in large numbers, it may be 28 possible to anticipate the future evolution of similar cell populations. For example, one could 29 diagnose the likely course of incipient diseases, such as cancer and bacterial infections, and 30 better judge which treatments will be effective, by tracking rare drug-resistant variants. To test 31 this approach, we replayed the first 500 generations of a >70,000-generation Escherichia coli 32 experiment and examined the trajectories of new mutations in eight genes known to be under 33 positive selection in this environment in six populations. By employing a deep sequencing 34 procedure using unique molecular identifiers and target enrichment we were able to track 301 35 beneficial mutations at frequencies as low as 0.01% and infer the fitness effects of 240 of these. 36 Distinct molecular signatures of selection on protein structure and function were evident for the 37 three genes in which beneficial mutations were most common (nadR, pykF, and topA). We 38 detected mutations hundreds of generations before they became dominant and tracked beneficial 39 alleles in genes that were not mutated in the long-term experiment until thousands of generations 40 had passed. This type of targeted adaptome sequencing approach could function as an early 41 warning system to inform interventions that aim to prevent undesirable evolution. 42


43
New genetic variation naturally arises in lineages of cells and organisms during genome 44 replication and repair. These de novo mutations are the main drivers of adaptive evolution in 45 many populations, particularly those with little or no recombination or standing genetic variation. 46 In large laboratory populations of asexual microbes, numerous lineages with different beneficial 47 mutations arise and contend within a population before any one outcompetes the ancestor and its 48 competitors (Good et al., 2017;Lang et al., 2013;Maddamsetti et al., 2015). This 'clonal 49 interference' leads to heterogeneous populations with many lineages simultaneously adapting via 50 distinct sets of mutations, though often these mutations are in a small subset of genes that are 51 under the strongest selection (Desai et al., 2012;Gerrish and Lenski, 1998;Park and Krug, 52 2007). 53 In human cancers and chronic microbial infections, single cells clonally expand in a similar 54 fashion by evolving driver mutations that lead to disease progression and drug resistance. Both 55 solid tumors and blood cancers have been shown to be genetically heterogeneous (Marusyk et  centers). Reads in pairs that have the same dual unique molecular identifiers, which implies that 156 they were PCR amplified during library preparation from the same original genomic DNA 157 fragment, are used to construct consensus reads to eliminate sequencing errors. Consensus reads 158 are mapped to the reference genome to call sequence variants. (B) Enrichment of reads mapping 159 to eight genes known to be early targets of selection in this environment from the long-term 160 evolution experiment. The final coverage depth of consensus reads in and around these genes is 161 shown for a typical sample (population A7 at 500 generations).  This fitness trajectory fit accounts for all cells in the population, regardless of whether they have  174  a mutation in the targeted genes or elsewhere in the genome. The red line is the maximum  175  likelihood estimate of the population fitness trajectory. The red shading around it shows 95%  176  confidence intervals on this value in each interval. The black line shows the increase in fitness  177  estimated for a consensus model that was jointly fit to all mutations tracked in all six populations  178 for which sequencing data was collected in this window. The consensus population fitness 179 trajectory was used when estimating the fitness effects of individual mutations (see Methods) 180  The same plots  182  described in Figure 2C-E for population A1 are shown for populations A2, A3, and A7 (top three  183  sets of panels). For populations A6 and A9, sequencing was only performed at time points during  184  the selective sweep window so only the plots corresponding to Figure 2D-E are shown (bottom  185  two sets of panels).  186  187 Mutation trajectories in all four populations exhibited a burst of genetic diversity in the 188 targeted genes followed by loss of this diversity. The initial dynamics are expected to be largely 189 driven by new genotypes that each evolve a single beneficial mutation very early in the 190 experiment. If their descendants escape stochastic loss, they will gradually increase in frequency 191 over the first few hundred generations as they outcompete the ancestral genotype. Once the 192 population becomes dominated by these first-step mutants, their frequency trajectories plateau 193 because of clonal interference: they are now mainly competing against one another and are 194 relatively evenly matched. In populations A1, A2, and A7, the total frequencies of the mutations 195 we identified sums to 50-62% at generation 270, indicating that each population is mostly 196 composed of genotypes with a single mutation in one of the focal genes. We recovered less of 197 the initial beneficial mutation diversity in population A3 where this sum was only 13%. 198 After around 300 generations, there is a steady decline in the frequencies of most mutations 199 in the eight targeted genes. At this point, new more-fit genotypes that have evolved begin to 200 exert their influence and displace the genotypes that we initially tracked. Many of the most 201 successful new genotypes are descended from cells that already had a mutation in one of the 202 targeted genes. In these cases, the original mutations serve as markers for the further expansion 203 of these subpopulations after a period during which their frequencies stagnate or decrease, but 204 the new beneficial mutations responsible for this further increase in fitness are outside of the 205 genomic regions we surveilled. The converse situation, in which a beneficial mutation in one of 206 the eight focal genes appears in a cell with an untracked beneficial mutation elsewhere in the genome, also occurs in a few cases. Most strikingly, a mutation in pykF that only appears after 208 300 generations in population A3 rapidly increases in frequency and becomes dominant, strongly 209 suggesting that it appeared in a genetic background with a prior, unknown beneficial mutation. 210 211 Selection coefficients can be inferred from initial mutation trajectories 212 We next sought to calculate the fitness benefits of individual mutations by tracking how rapidly 213 their frequencies rose early in the experiment when they were largely competing versus the 214 ancestral genotype because all new mutations in the population were still rare. To that end, we 215 sequenced six populations at ~13-generation increments from 133 to 213 generations (Fig. 2D, 216  proportion of the population at this point, and making it necessary to account for how they are 239 increasingly competing against one another to estimate the fitness effect. 240 We accounted for clonal interference by adding a stepwise increase in the average fitness of 241 the entire cell population over time as an additional set of parameters to the binomial logistic 242 model (Fig. 2E, Fig. 3). That is, we estimated how the population fitness as a whole was generations. This rapid change followed by stasis may seem at odds with the continuing increase 250 in the trajectories of many beneficial mutations. However, this type of stepwise increase is a 251 typical result of clonal dynamics in models and experiments (Gerrish and Lenski, 1998;Lenski 252 et al., 1991). It could reflect the influence of many mutations with small fitness effects, no one of 253 which reaches an observable frequency, peaking and then being outcompeted by the most 254 beneficial mutations that we are able to track, for example. 255 The mean fitness effect that we inferred for the 240 tracked mutations in all six populations 256 was 9.00% with a standard deviation of 1.33%. Although the distributions of the fitness effects 257 estimated for mutations in nadR, pykF, and topA overlap (Fig. 4B), there was a significant stratification among these genes. Mutations in nadR were 0.44% more beneficial than mutations 259 in topA, on average, and this difference was significant (p = 0.022, one-tailed Mann-Whitney U 260 test). In turn, mutations in topA were 0.70% more beneficial than those in pykF (p = 0.00046, 261 one-tailed Mann-Whitney U test). The fitness effects of the 16 mutations in the other genes 262 (spoT, fabR, ybaL, and iclR) were not significantly different from the those of the 224 mutations 263 in nadR, pykF, and topA (p = 0.33, two-tailed Mann-Whitney U test). Thus, highly beneficial 264 mutations are possible in these genes as well, but they apparently occur at a much lower rate 265 relative to that at which similarly beneficial mutations are generated in nadR, pykF, and topA. number of mutations identified in each gene is indicated above its column. 273 274

Beneficial mutations reveal different signatures of selection on gene function 275
Of the 301 mutations that we were able to track in complete or window time courses, 272 were 276 in the nadR, pykF, or topA genes. These large sets of highly beneficial mutations in these genes 277 gave us the statistical power to test for several signatures of molecular evolution to predict what 278 types of changes in the function of each gene improved E. coli fitness in this environment. Each 279 of the three genes exhibited a distinct spectrum of beneficial mutations (Fig. 5). In some cases, 280 different types of mutations were also unevenly distributed throughout the sequences of these 281 three commonly hit genes and had noticeably different effects on bacterial fitness (Fig. 6A). Fisher's exact test) (Fig. 6A). Yet, there was not a significantly greater selection coefficient for 316 disruptive mutations compared to nonsynonymous mutations overall (p = 0.063, one-tailed 317 Mann-Whitney U test). These results suggest that complete inactivation of nadR yields the 318 maximum benefit possible for a mutation in this gene. Disrupting all three of its distinct 319 functions does not appear to be necessary for achieving this full benefit. Consistent with this 320 prediction, deletion of nadR has been shown to be highly beneficial in the very similar 321 environment of the LTEE ( intermediate representation of disruptive mutations in pykF, fewer than in nadR but more than in 326 topA (Fig. 5). Interestingly, nonsynonymous base substitutions in pykF tend to have a larger 327 selection coefficient than disruptive mutations (p = 0.00031, one-tailed Mann-Whitney U test) 328 ( Fig. 6A). This finding is in agreement with a study of various pykF alleles that arose in the 329 LTEE which found that nearly all pykF point mutations were more beneficial than deletion of the 330 gene sequence (p = 0.0018 one-tailed binomial test) (Fig. 6B). Overall, these results suggest that 337 complete inactivation of pykF is highly beneficial in the environment of our evolution 338 experiment, but mutations that alter its activity-likely in ways that reduce glycolytic flux-are 339 even more so. It has been suggested that reducing pykF activity is beneficial in the similar 340 glucose-limited conditions of the LTEE because this allows more PEP to be diverted to power 341 import of glucose into cells via the phosphotransfer system (Woods et al., 2006). 342 DNA topoisomerase I (topA) relaxes negative supercoiling introduced into the chromosome 343 by replication and transcription (Massé and Drolet, 1999). The mutations we observed in topA 344 are almost exclusively single-base substitutions (Fig. 5), suggesting that modulating the activity of this enzyme provides a fitness benefit. Indeed, complete loss of topA function is lethal to E. 346 coli without compensatory mutations in DNA gyrase (Dinardo et al., 1982;Pruss et al., 1982). 347 The structure of E. coli TopA consists of four N-terminal domains (D1-D4) that make up the 348 catalytic core and five C-terminal zinc finger and ribbon domains (D5-D9) (Tan et al., 2015). 349 The few out-of-frame indels and the large deletion that we observe truncate TopA within 350 domains D7-D9, which interact with single-stranded DNA and RNA polymerase but are not 351 critical for catalysis. Considering only the catalytic core, we find that nonsynonymous mutations 352 are more concentrated in domains D1 and D4 versus D2 and D3 than expected from their relative 353 sizes (p = 0.00068, one-tailed binomial test) (Fig. 6C) populations was founded from single cells, so we can conclude that these recurrent genetic 366 changes are due to independent mutational events. We observed a total of 252 distinct genetic 367 changes across all eight profiled genes and 31 of these were found in more than one population. While no single genetic change was detected in all six populations, 2, 2, 8 and 19 changes were 369 detected in 5, 4, 3, and 2 populations, respectively. Most of these were in the three genes that 370 were the main targets of selection (nadR, pykF, and topA), but one that occurred in three 371 populations was in fabR. These mutations may be recurrent because they have a higher fitness 372 benefit than other mutations, occur at a higher rate than other mutations, are more easily detected 373 in the sequencing data, or due to some combination of these factors. We had a fitness estimate 374 for each of the 31 recurrent mutations from tracking cells with that genetic change in at least one 375 of the window time courses and had 167 fitness measurements for mutations that were observed 376 in only one population. The recurrent mutations had a 0.12% greater fitness effect, on average, 377 compared to the singleton mutations, but this difference was not significant (p = 0.25, one-tailed 378 Mann-Whitney U test). Thus, it is unlikely that many cases of exact genetic parallelism are due 379 to these mutations being more beneficial than others in our dataset. 380 381

382
We examined bacterial evolution during the initial stages of clonal competition when there is a 383 burst of beneficial genetic diversity as many new subpopulations with different mutations evolve 384 and begin to displace the ancestral genotype. We focused on eight genes known to accumulate 385 adaptive mutations in the >70,000 generation Lenski long-term evolution experiment (LTEE) 386 with E. coli that used nearly the same environment as our experiments. The only difference was 387 that we added four times as much of the limiting nutrient (glucose). By combining Illumina 388 sequencing using unique molecular identifiers for error correction, hybridization-based capture 389 of DNA encoding these genes, and dense temporal sampling, we were able to identify more 390 beneficial mutations and track them at much lower frequencies than is possible with standard 391 metagenomic sequencing. We detected a total of 301 mutations in the focal genes: 181 in the 392 complete time courses of four populations and 240 in the window time courses of these 393 populations and two others, with 120 mutations overlapping between the two sets. 394 By densely sampling and deeply sequencing E. coli populations, we were able to characterize 395 many beneficial mutations that never reach the detection limits of standard Illumina sequencing 396 before they become casualties of clonal interference. Only 13 of the 181 mutations we detected 397 in the complete time courses ever achieved a frequency of 5% or more, which can be reliably 398 distinguished from noise by standard metagenomic sequencing, and only seven were this 399 common for 100 or more generations, such that they were likely to be detected by a typical time- but that there are so many fewer possibilities for these mutations that they were not often 426 sampled in our evolving populations. Despite these subtle differences between the LTEE and our 427 experiment, we were still able to account for majority of the genetic variation present in three of 428 the four populations that we profiled over the entire 500 generations by analyzing evolution in 429 the eight candidate genes. 430 We also wanted to understand to what extent we gained early warning of driver mutations by 431 deeply profiling evolution in genes we expected to be under strong selection. In general, we were 432 able to begin tracking most mutations when they were above a frequency of 0.01%. This level of 433 profiling enabled us to first detect mutations an average of 69, 150, and 290 generations before 434 they surpassed frequencies of 0.1%, 1%, and 5%, respectively. Under the conditions of our 435 experiment these intervals take roughly 10, 23, and 44 days, respectively; so, even though we 436 made these predictions retrospectively, there would have been sufficient time to complete the 437 DNA isolation, library preparation, sequencing, and analysis steps quickly enough for this 438 approach to give early warning of specific genetic variants driving evolution of these 439 populations. The amount of lead time becomes disproportionately longer at higher frequencies 440 due to clonal interference between beneficial mutations. The chances and timescales of earlier which makes them more common than mutations in spoT and ybaL in the long run. Therefore, 470 mutations in iclR and hslU appear to either require the presence of mutations in other genes to 471 become highly beneficial or may not be able to experience any mutations that are beneficial 472 enough to make them competitive early on in our experiment. 473 The nature of epistasis and the limits that it imposes on predicting the future evolution of a 474 cell population could be further probed using our approach in several ways. One could repeat the 475 evolution experiment beginning with genotypes containing different first-step beneficial 476 mutations as starting points to more finely map the fitness landscape. One could also interrogate 477 the diverse collections of cells containing different beneficial alleles that we have evolved, by 478 taking the 150-generation populations and further evolving them under different conditions to 479 map genotype by environment effects, for example. Such experiments might also reveal latent 480 beneficial mutation in other genes (e.g., iclR and hslU) that were able to outcompete the ancestor 481 in our experiment but remained below the detection limit because they were not as beneficial as mutations in topA, pykF, and nadR in this environment. There is precedent for changes in the 483 environment deflecting selection to different subsets of the same genes. In an offshoot of the 484 LTEE that began with a clone that had spoT, topA, and pykF mutations, selection was focused on 485 either hslU, iclR, or nadR depending on changes in temperature (Deatherage et al., 2017). would therefore be expected to contribute the most to a cell's adaptome. 508 Exhaustively mapping paths that clonal evolution is likely to follow is of particular interest 509 and utility in systems that evolve repeatedly from a defined starting point. These range from 510 bioreactors that are seeded with the same strain in different production runs to lung infections in 511 cystic fibrosis patients that start from similar, but not identical, opportunistic pathogens. The 512 ability to identify mutations in key genes while they are still very rare may also be used to 513 improve the early detection of emerging drug resistance in other human infections and cancer. 514 The evolutionary dynamics will be more complex in many of these systems, but they may also 515 unfold more slowly. For example, biofilm formation and the necessity of invading already 516 colonized niches will slow the dynamics of competition. This potentially makes the therapeutic 517 window for detecting incipient evolution by profiling the adaptome even greater. 518 519

Evolution experiment 521
Strains and growth conditions are derived from the Lenski long-term evolution experiment 522 (Lenski and Travisano, 1994;Lenski et al., 1991). Nine colonies of E. coli B strain REL606 and 523 nine of strain REL607 were selected at random. Each was used to inoculate a separate flask 524 containing 10 mL of Davis Minimal (DM) media supplemented with 100 µg/L glucose 525 (DM100). This is a slightly higher concentration of glucose than the 25 µg/L glucose (DM25) 526 used in the LTEE, but still well below the ~1000 µg/L concentration at which nutrients other 527 than glucose begin to limit growth in this medium. We used this higher glucose concentration to ensure we had a sufficient number of cells for sequencing and archiving. These initial cultures 529 were grown overnight at 37°C with orbital shaking over a one-inch diameter at 120 RPM. 530 Approximately 30 generations of growth occurred starting from the initial single cell that gave 531 rise to each colony until saturation of these cultures. Next, 10 mL of fresh DM100 were 532 inoculated with 50 µL of one REL606 culture and 50 µL of one REL607 culture for overnight 533 growth in the same conditions. The remaining culture was archived at −80°C with 2 mL 534 dimethyl sulfoxide (DMSO) added as cryoprotectant. Daily transfer of 100 µL of overnight 535 culture to 10 mL of fresh DM100 and archival of the remaining culture volume in the same way 536 continued through 75 daily transfers. Periodically 1 µL of culture was diluted 10,000-fold in 537 sterile saline and plated on tetrazolium arabinose (TA) agar to allow growth of ~200 colonies. 538 REL606 and REL607 differ by a mutation in an arabinose utilization gene that makes REL606 539 (Ara − ) colonies red and REL607 (Ara + ) colonies pink (Lenski et al., 1991). The ratio of red to 540 pink colonies was used to monitor these populations for selective sweeps ( Genomic DNA (gDNA) was isolated from frozen population samples by first thawing each 15 545 mL conical tube on ice. Of the ~12 mL total volume of culture plus cryoprotectant, 1.2 mL was 546 transferred to a 2 mL cryovial and refrozen. The remaining ~10.8 mL was centrifuged at 6,500 × 547 g at 4°C for 15 minutes. The resulting cell pellets were transferred with a volume of remaining 548 solution to 1.7 mL Eppendorf tubes. Then, gDNA was isolated using the PureLink Genomic 549 DNA Mini kit (Life Technologies). For each sample, 1 µg of gDNA was randomly fragmented 550 on a Covaris S2 focused-ultrasonicator.
Illumina libraries were constructed using the Kappa Biosystems LTP Library Preparation Kit 552 with the following modifications. End-repaired, fragmented DNA was T-tailed (rather than A-553 tailed) in a 50 µl reaction including 10 mM dTTP and 5 units of Klenow fragment, exo -(New 554 England Biolabs). Illumina adapters containing 12-base unique molecular identifiers were ligated 555 to the T-tailed fragments as previously described (Schmitt et al., 2012), except full-length 556 adapter sequences containing unique external sample barcodes were directly ligated to the T-557 tailed dsDNA inserts to reduce the risk of cross-contamination between samples. The full list of 558 DNA sequence adaptors used is provided in Table S1.  Table S2. 568 Capture was performed using a SeqCap EZ Exome Enrichment kit v3.0 (NimbleGen) with 569 several modifications to the protocol. First, 18 to 20 population samples with different sample 570 barcodes were pooled together in a single capture reaction that contained a total of 1 µg of 571 library DNA from all samples, 1 mmol of a universal blocking oligo, and 1 mmol of a degenerate 572 sample barcode blocking oligo. The sequences of these blocking oligos are provided in custom Python scripts that carried out the following steps. First, the beginning of each read was 583 evaluated for the presence of the expected 5′-end tag, consisting of a random 12-base unique 584 molecular identifier (UMI) followed by four fixed bases (5′-N12CAGT). Read pairs lacking the 585 correct 5′-end tag on either read were discarded. Across all samples, 80.2% of read pairs had 586 both UMIs. For remaining read pairs, the UMIs found on each end were concatenated to create a 587 24-base dual-UMI that uniquely identifies the original DNA fragment that was amplified and 588 sequenced. To group all reads corresponding to the same initial DNA molecule, a FASTA file of 589 all dual-UMIs was used as input into the ustacks program from the Stacks software pipeline 590 (Version 1.48) (Catchen et al., 2013) with the following options: a single read was sufficient to 591 seed a stack, a single mismatch within the dual-UMI was allowed in assigning a read to a stack, 592 secondary reads and haplotypes were disabled, and stacks with high coverage were preserved. 593 Then, CSRs were generated for all dual-UMI groups sequenced at least twice by taking the 594 straight consensus of all reads that were merged into that stack. If no base exceeded 50% 595 frequency at a given position in this set of reads, then that base was set as unknown (N). Of the 596 read pairs with valid dual-UMIs, 41.6% were incorporated into consensus reads across all 597 samples. The average number of dual-UMI read pairs utilized to create each consensus read was 598 2.46, which gives an overall yield of consensus reads per read in a pair with a valid dual-UMI 599 16.9%. with probes-extended with hundreds of bases of flanking sequence on both sides-were input 607 as "targeted" sequences, and the remainder of the genome with the identical eight regions 608 masked to degenerate N bases was supplied as a "junction-only" reference (to which reads are 609 mapped without variant calling). All 116 samples were analyzed using breseq in polymorphism 610 prediction mode with all bias, minimum allele frequency, and read-count filters disabled. 611 Evidence items in the Genome Diff (GD) files for all samples were combined using the gdtools 612 utility program to generate a single merged GD file with each piece of evidence listed a single 613 time, regardless of how many times it was detected in different samples. We then re-ran breseq 614 using the same parameters except that this GD file was supplied as an input user-evidence file to 615 force output of variant and reference information for these putative variants in every sample. 616 Then, we extracted the number of variant reads supporting each putative variant allele and the 617 total number of reads at that reference location from the GD file output by breseq. Subsequent 618 statistical tests and fitting steps were performed in R (version 4.0.0) (R Core Team, 2016) using the ggplot2 package for data visualization (Wickham, 2016