Evolutionary epidemiology of Streptococcus iniae: linking mutation rate dynamics with adaptation to novel immunological landscapes

Pathogens continuously adapt to changing host environments where variation in their virulence and antigenicity is critical to their long-term evolutionary success. The emergence of novel variants is accelerated in microbial mutator strains (mutators) deficient in DNA repair genes, most often from mismatch repair and oxidized-guanine repair systems (MMR and OG respectively). Bacterial MMR/OG mutants are abundant in clinical samples and show increased adaptive potential in experimental infection models, yet the role of mutators in the epidemiology and evolution of infectious disease is not well understood. Here we investigated the role of mutation rate dynamics in the evolution of a broad host range pathogen, Streptococcus iniae, using a set of 80 strains isolated globally over 40 years. We have resolved phylogenetic relationships using non-recombinant core genome variants, measured in vivo mutation rates by fluctuation analysis, identified variation in major MMR/OG genes and their regulatory regions, and phenotyped the major traits determining virulence in streptococci. We found that both mutation rate and MMR/OG genotype are remarkably conserved within phylogenetic clades but significantly differ between major phylogenetic lineages. Further, variation in MMR/OG loci correlates with occurrence of atypical virulence-associated phenotypes, infection in atypical hosts (mammals), and atypical (osseous) tissue of a vaccinated primary host. These findings suggest that mutators are likely to facilitate adaptations preceding major diversification events and may promote emergence of variation permitting colonization of a novel host tissue, novel host taxa (host jumps), and immune-escape in the vaccinated host.


Microbial evolution during infection can increase pathogen fitness, which, while in the host, is largely determined by interactions with immune responses 1,2. Variation in virulence and antigenicity determinants allows pathogens to escape immune clearance and spread in a host population 2,3. Therefore, antigenic variation plays a major role in the epidemiology of infectious diseases and often compromises their control by vaccination 4,5. As vertebrate immune responses are remarkably diverse and complex, every individual host and tissue represents a unique 'immunological niche' that requires adaptation 6. In bacteria, host-adaptive variation can be obtained via accrual of mutation, recombination, and transposition 7. Notably, genes involved in host-pathogen interactions often represent "mutation hot-spots" as they contain sequences physically predisposed to errors during replication 8. Further, the overall fidelity of replication in bacteria is influenced by environmental conditions, and elevation of mutation rate, known as stress-induced mutagenesis (SIM), occurs in adapting populations 9,10. One aspect of SIM is a temporary physiological shift producing a higher mutation rate, such as up-regulation of error-prone polymerases that bypass DNA lesions during the bacterial general stress response, SOS 9,10.

Another aspect of SIM is an increased frequency of mutators: heritably hypermutable strains containing mutator alleles, also referred to as constitutive or permanent mutators to distinguish them from temporarily (physiologically) hypermutable cultures 9-11. Mutator strains emerge via disruption of DNA repair genes, most often belonging to the mismatch-repair (MMR) and oxidized guanine (OG) repair systems 12,13. MMR corrects base mispairings and short insertion/deletion loops that occur during DNA replication 14,15. It also prevents incorporation of divergent DNA sequences via non-homologous and homeologous (partially homologous) recombination; thus MMR dysfunction produces not only hypermutable but also

Although mutator alleles can accelerate short-term adaptation, they are also indirectly selected against along with deleterious variants 24,39-41, and emerging anti-mutator genotypes start to dominate in the adapted mutator clone 42,43.

When a copy-number variant confers a mutator phenotype, the reverse mutation can fully restore the non-mutator genotype and phenotype 44. The latter is also possible when prophage integration confers the mutator phenotype and excision the non-mutator phenotype, or vice versa 45.

Quantification of biofilms was performed using COMSTAT 86. Structures with biomass larger than 3.5 and average thickness over 5 were classified as showing increased biofilm-forming activity (Fig. 3 D).
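The classification rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the thresholds 3.5 and 5 are taken from the text, whose units are not given, so the values are assumed to be in the units of the COMSTAT output. The class name, function name, and example strain labels are hypothetical and do not come from the study.

```python
from dataclasses import dataclass

# Thresholds quoted in the text. Units are not stated there, so they are
# assumed to match whatever units the COMSTAT output uses.
BIOMASS_THRESHOLD = 3.5
THICKNESS_THRESHOLD = 5.0


@dataclass
class BiofilmMeasurement:
    """One COMSTAT summary per strain (hypothetical container)."""
    strain: str
    biomass: float
    average_thickness: float


def has_increased_biofilm(m: BiofilmMeasurement) -> bool:
    """Return True only when BOTH criteria from the text are exceeded."""
    return (m.biomass > BIOMASS_THRESHOLD
            and m.average_thickness > THICKNESS_THRESHOLD)


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    examples = [
        BiofilmMeasurement("strain_a", biomass=4.2, average_thickness=6.1),
        BiofilmMeasurement("strain_b", biomass=2.0, average_thickness=7.5),
    ]
    for m in examples:
        print(m.strain, has_increased_biofilm(m))
```

Both conditions are combined with a logical AND, following the wording "biomass larger than 3.5 and average thickness over 5". Values exactly at a threshold are treated as not increased.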

Maximum likelihood phylogenetic analysis of 80 S. iniae isolates (Table 1) Table 2). Most SNPs in MMR and OG genes were non-synonymous and present in protein functional domains, and most amino acid substitutions were predicted to have a deleterious effect on protein function ( general mutation rate difference is insignificant within but significant between major phylogenetic lineages (Suppl. Table 1). Although mutation rate variation between divergent strains is significant, the magnitude of the differences is modest (around 5-fold) (Figure 1, 2, Suppl. Table 2).

The accrual of random mutations in MMR/OG loci appears to be the primary mechanism of molecular evolution of mutation rate within the analysed set of S. iniae genomes. Firstly, we used a maximum likelihood inference method to exclude areas of recombination from our phylogenetic analyses 67, and none of the MMR and OG gene loci were excluded as recombinant.

Within the core genome, where MMR and OG gene loci are located, recombination rates were extremely low, with mean µr/m 0.032 and only 5 of the 80 strains containing recombination blocks (Suppl.  Fig. 1). This variant is an amino acid substitution in the DNA glycosylase domain of mutM that is predicted to deleteriously affect protein function (PROVEAN score of -2.990, Table 2).
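As a brief illustration of how a PROVEAN score maps to a "deleterious" call, the sketch below applies the PROVEAN default cutoff of -2.5 (scores at or below the cutoff are called deleterious). That this default was the threshold applied in the analysis is an assumption, since the text does not state it. The function name is hypothetical.

```python
# PROVEAN's documented default cutoff; assumed here, since the text does
# not say which threshold was applied.
PROVEAN_DEFAULT_CUTOFF = -2.5


def provean_call(score: float, cutoff: float = PROVEAN_DEFAULT_CUTOFF) -> str:
    """Classify a variant: scores at or below the cutoff are 'deleterious'."""
    return "deleterious" if score <= cutoff else "neutral"


if __name__ == "__main__":
    # Score reported in the text for the mutM glycosylase-domain substitution.
    print(provean_call(-2.990))
```

Under that assumed cutoff, the reported score of -2.990 falls on the deleterious side, consistent with the prediction stated in the text.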

To investigate whether variation in mutation rate phenotype and MMR/OG genotype is associated with adaptation to the host, we identified variation in key virulence traits in streptococci (capsular polysaccharide, hemolysin, length of cell chains) and bacteria in general (resistance to ROS, and biofilm formation). We found that the number of atypical phenotypes for these traits correlates significantly (p = 0.0331, PGLS) with the number of MMR and OG variants (Fig. 2, Table 3 (Table 2). The latter mutation encompasses the -35 box, two binding sites of the RpoD17 general transcription factor, and the first bp of the -10 box, and was predicted to abolish the original promoter (Suppl. Fig. 4). However, mutY might still be transcribed at some level since two other potential promoters were identified in a nearby sequence (Suppl. Fig. 3). Altered mutY expression also appears to be affected by the rest of the genetic background, as isolates exhibit significant differences (Suppl. the mutation rate to a value not significantly different from the core rate found in clade A (Fig. 1, 2)

Clade D consists of strains from trout (Oncorhynchus mykiss) isolated in Réunion and Israel. These strains have an estimated mutation rate of 5 × 10⁻⁸, contain a glutamate to aspartate substitution in mutM predicted to be deleterious to protein function and a synonymous SNP in recD2, and exhibit impeded haemolytic activity (Fig 1,2, Table 2). Also, isolates from Israel form thicker and denser biofilm structures (Fig. 3, D2).

Isolates from humans and fish are found in clade E, where multiple phenotypic profiles for the virulence-associated traits are observed (Figs. 2-3, Table 1). This clade contains two nested subclades that share a SNP in mutS, but have other unique MMR/OG SNPs (Fig. 2, Table 2) and exhibit mutation rates that are significantly different in most pairwise comparisons (Suppl. Table 1). Subclade E1 contains USA isolates from humans (QMA0133-35, 37-38) and hybrid striped bass (QMA0447-48), which have a substitution in the Shine-Dalgarno sequence of recD2 and a mutation rate of 6.5 × 10⁻⁸ (Fig 1, Table 2).

Fish strains and three human strains (QMA0135, 37-38) form shorter cell chains and are less sensitive to H₂O₂. Two human strains, QMA0133-34, produce thicker and denser biofilm structures (Fig. 3, D2). QMA0133 is non-encapsulated (Fig 3, A2), and QMA0134 appears to differentially express the capsular polysaccharide in culture (Fig 3, A3). Subclade E2 contains two human strains from Canada (QMA0130-31) and a tilapia strain from USA (QMA0466). These strains have a unique methionine to isoleucine substitution at the N-terminus of MutX, and a mutation rate of 4.5 × 10⁻⁸ (Fig 1, Table 2). Both human strains are non-encapsulated (Fig. 2). In contrast, the tilapia strain expresses the capsule but forms short cell chains in common with most strains from the subclade E1 (Fig 2, Fig 3, C3). increased cell-chain length, and denser biofilms 53-56,84,99,100 (Fig. 3). Bone strains have a unique MMR/OG genotype (Fig. 2, Table 2) and a mutation rate that is different from QMA0139 and QMA0190 (Figs. 1, 2 exhibits a mutation rate of 1 × 10⁻⁸, which is unique among the strains and significantly lower than the core mutation rate, perhaps linked to increased translation of recD2 helicase resulting from a SNP in the ribosome binding site (Fig 2, Table 2).