Genetic determinants in Salmonella enterica serotype Typhimurium required for overcoming stressors in the host environment

Salmonella enterica serovar Typhimurium (S. Typhimurium), a non-typhoidal Salmonella (NTS), causes a range of diseases, including self-limiting gastroenteritis, bacteremia, enteric fever, and focal infections, and represents a major disease burden worldwide. For a significant portion of Salmonella genes, the functional basis by which they help overcome host innate defense mechanisms, and consequently cause disease in the host, remains largely unknown. Here, we applied a high-throughput transposon sequencing (Tn-seq) method to unveil the genetic factors required for the growth or survival of S. Typhimurium under various host stressors simulated in vitro. A highly saturated Tn5 library of S. Typhimurium 14028s was subjected to selection during growth in the presence of a short-chain fatty acid (100 mM propionate), osmotic stress (3% NaCl), or oxidative stress (1 mM H2O2), or during survival in extreme acidic pH (30 min at pH 3) or starvation (12 days in 1X PBS). We identified an overlapping set of 339 conditionally essential genes (CEGs) required by S. Typhimurium to overcome these host insults. Interestingly, all eight genes encoding F0F1-ATP synthase subunit proteins were required for fitness in all five stresses. Intriguingly, a total of 88 genes in Salmonella pathogenicity islands (SPIs), including SPI-1, SPI-2, SPI-3, SPI-5, SPI-6, and SPI-11, were also required for fitness under the in vitro conditions evaluated in this study. Additionally, by comparing the genes identified in this study with genes previously shown to be required for in vivo fitness, we identified novel genes (marBCT, envF, barA, hscA, rfaQ, rfbI and putative proteins STM14_1138, STM14_3334, STM14_4825, and STM_5184) that have compelling potential to be exploited as vaccine or drug targets to curb Salmonella infection.


Introduction
on LB plates supplemented with NA and Km to recover the transformants. With three electroporations we were able to collect 350,000 Tn5 mutants and stored them in LB medium with 50% glycerol at -80°C (Figure 1).

In vitro growth assay of transposon mutant library
In vitro selection of the transposon mutant library was done as described by Opijnen and Camilli (2010) (van Opijnen et al., 2014) with some modifications. Briefly, the transposon mutant library was thawed on ice and an aliquot of 300 µl was added to 60 ml LB broth with NA and Km (OD600 = 0.131). The library was incubated at 37°C on a shaking rack for 30 min (OD600 = 0.135) and centrifuged at 5,500 rpm for 8 min at room temperature. The library pellet was resuspended in 50 ml 1X phosphate-buffered saline (PBS) (OD600 = 0.143) and the CFU count (4 × 10^7/ml) was measured (t1). This step was included to adapt the mutant cells to LB medium and shorten the lag phase in the subsequent selective conditions. A 10 ml aliquot was saved from t1 as an input pool (IP1). The above procedure was repeated to make a technical replicate of IP1, input pool 2 (IP2). An aliquot of 0.5 ml from t1 was inoculated into 10 ml each of LB (LB), LB with 3% NaCl (NaCl), LB with 100 mM propionate adjusted to pH 7 (PA), and LB with 1 mM H2O2 (H2O2). The initial OD600 of the inoculated medium was 0.009. We then incubated the libraries on a shaking rack (225 rpm) at 37°C for 3.75 h to 7 h, depending on the condition, until they reached mid-logarithmic phase (t2). The final OD600 of all output pools at t2 was similar, around 0.64. Input pool and output pool libraries were centrifuged and the pellets were stored at -80°C for DNA extraction (Figure 1).
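As a rough check on how much growth the selection step allowed, the number of population doublings between inoculation and t2 can be estimated from the reported OD600 values. The sketch below assumes that OD600 is proportional to cell density over this range (an approximation, not something stated in the protocol); the function name is illustrative.

```python
import math

def doublings(od_start: float, od_end: float) -> float:
    """Estimate population doublings from two OD600 readings.

    Assumes OD600 scales linearly with cell density (an approximation).
    """
    if od_start <= 0 or od_end <= 0:
        raise ValueError("OD600 values must be positive")
    return math.log2(od_end / od_start)

# OD600 values reported in the text: 0.009 at inoculation, ~0.64 at t2.
n = doublings(0.009, 0.64)
print(f"approximate doublings: {n:.1f}")

# Mean doubling time implied by the shortest and longest incubations.
for hours in (3.75, 7.0):
    print(f"{hours} h -> ~{hours * 60 / n:.0f} min per doubling")
```

Under that assumption, each growth condition allowed roughly six doublings, with the slower-growing conditions simply taking longer to reach the same density.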

In vitro survival assay of transposon mutant library
To identify genes negatively selected during starvation, an aliquot of 0.5 ml from t1 was transferred to 10 ml PBS and incubated at 37°C on a shaking rack for 12 days. On the 12th day, the tube was centrifuged and the pellet was resuspended in 1 ml PBS. A 100 µl aliquot was plated on an LB plate (NA + Km) and incubated overnight at 37°C. The cells were collected in PBS and stored at -80°C for DNA extraction. For survival at pH 3, 0.5 ml from t1 was exposed to LB medium adjusted to pH 3 for 30 min at 37°C and immediately transferred to 40 ml PBS. The cells were centrifuged at 8,000 rpm for 8 min and the pellet was resuspended in 1 ml PBS. An aliquot of 250 µl was plated on an LB plate (NA + Km) and incubated overnight at 37°C. Colonies were collected in PBS and stored at -80°C for DNA extraction (Figure 1).

Genomic DNA (gDNA) from the bacterial cell pellets of the input and output libraries stored at -80°C was extracted using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA) following the manufacturer's protocol. The purity and concentration were checked using a Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA) with Qubit Assay Kits (dsDNA BR Assay) following the manufacturer's manual.

The sample for Illumina sequencing was prepared as previously described (Dawoud et al., 2014; Mandal and Mandal, 2016). All the DNA primers (Table S1) used for the Tn-seq library were custom designed using Primer3 (v. 0.4.0) (Untergasser et al., 2012) and ordered from Integrated DNA Technologies (Coralville, Iowa). A simplified diagram of the preparation of the Tn-seq amplicon library is shown in Figure S1A. Briefly, Tn5-junctions at the right end of the transposon were enriched from gDNA extracted from the input and output libraries. The single

(1 mM), 1 µl; nuclease-free H2O, 1.6 µl; and Terminal Transferase, 1 µl, making a total volume of 20 µl. The reaction was incubated at 37°C for 1 h, followed by heat inactivation of the enzyme at 75°C for 20 min on a thermocycler. The C-tailed products were purified using the MinElute PCR purification kit and eluted in 10 µl.

Subsequently, the C-tailed product was enriched by exponential PCR. The PCR reaction contained: nuclease-free H2O, 35 µl; Thermopol Buffer (10X), 5 µl; dNTPs (2.5 mM each), 4 µl; IR2 BC primer (with Illumina adapter and barcode, 10 µM), 2 µl; HTM primer (with Illumina adapter, 20 µM), 1 µl; C-tailed DNA, 2 µl; and Taq DNA Polymerase (NEB), 1 µl, making a total volume of 50 µl. The manual hot start PCR cycle comprised 95°C for 2 min, followed by 25 cycles of 95°C for 30 s, 58°C for 45 s, and 72°C for 20 s, and a final extension at 72°C for 10 min.
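The final working concentrations in the 50 µl exponential PCR follow from C1V1 = C2V2. The sketch below works through that arithmetic using only the stock concentrations and volumes listed for the reaction; the dictionary layout and function name are illustrative, not part of the published protocol.

```python
# Component -> (stock concentration, volume in µl).
# None marks components for which the text gives no stock concentration.
PCR_MIX = {
    "nuclease-free H2O": (None, 35.0),
    "Thermopol Buffer (X)": (10.0, 5.0),
    "dNTPs (mM each)": (2.5, 4.0),
    "IR2 BC primer (µM)": (10.0, 2.0),
    "HTM primer (µM)": (20.0, 1.0),
    "C-tailed DNA": (None, 2.0),
    "Taq DNA Polymerase": (None, 1.0),
}

def final_concentrations(mix):
    """Return (total volume, {component: final concentration}) via C1*V1 = C2*V2."""
    total = sum(volume for _, volume in mix.values())
    finals = {
        name: stock * volume / total
        for name, (stock, volume) in mix.items()
        if stock is not None
    }
    return total, finals

total, finals = final_concentrations(PCR_MIX)
print(f"total volume: {total} µl")
for name, conc in finals.items():
    print(f"{name}: {conc:.2f}")
```

The volumes sum to the stated 50 µl, and both primers end up at the same final concentration (0.4 µM) even though their stock concentrations differ.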

Finally, the exponential PCR products were pulse heated at 65°C for 15 min and run on a 1.5% agarose gel. The Tn-seq library showed a smear pattern, whereas gDNA of S. Typhimurium (negative control) showed almost no amplification (Figure S1B). The gel region from 300 to 500 bp was excised and DNA was extracted using the QIAquick Gel Extraction Kit (Qiagen, Valencia, CA). The purity and concentration of the DNA were measured using a Qubit 2.0 Fluorometer. An equal amount (~10 ng) of gel-purified DNA from each library was mixed together and sent for next

Raw reads from HiSeq Illumina sequencing were de-multiplexed based on the barcodes to their respective libraries using a custom Perl script. The barcode and transposon sequence were trimmed off from the 5' end. The remaining sequence was therefore the Tn5-junction sequence with or without a poly-C tail. Only 20 bp from the Tn5-junction were kept, discarding most of the poly-C tails. The reads were then aligned against the S. Typhimurium 14028s complete genome (NC_016856.1) using Bowtie version 0.12.7 (Langmead and Salzberg, 2012). The aligned sequences (SAM mapping file) were fed to the ARTIST pipeline to identify conditionally essential genes (CEGs) using Con-ARTIST (Pritchard et al., 2014). Briefly, Tn5 insertion frequency was assigned to the S. Typhimurium 14028s genome divided into 100 bp windows. Uncorrected

We compared the in vitro essential genes identified in this study and our previous study (Khatiwara et al., 2012) with the previously identified in vivo fitness genes. CEGs for acute infection of mice (A-Mice), macrophage survival (MΦ) (Chan et al., 2005) and persistent infection of mice (P-Mice) (Lawley et al., 2006) were previously identified in the S. Typhimurium strain SL1344 background. Additionally, Salmonella genes required for gastrointestinal colonization of pig, calf and chicken were identified in S. Typhimurium strain ST4/74 (Chaudhuri et al., 2013), and those for intraperitoneal infection of mice (Sp-Liv) were reported in the S. Typhimurium strain 14028s background (Silva-Valenzuela et al., 2015). The CEGs from the different strains were searched for the corresponding orthologous genes in the S. Typhimurium strain 14028s background using the Prokaryotic Genome Analysis Tool (PGAT) (Brittnacher et al., 2011).
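The read trimming and 100 bp windowing described in the read-processing paragraph above can be sketched in a few lines. This is a Python illustration only: the study used a custom Perl script, Bowtie, and the ARTIST pipeline, none of which is reproduced here. The barcode length, the transposon-end sequence, and the function names are hypothetical placeholders, and the alignment step is omitted (insertion coordinates are taken as given).

```python
from collections import Counter

JUNCTION_LEN = 20   # bp of genomic junction kept per read (from the text)
WINDOW = 100        # bp window size for insertion counts (from the text)

def trim_read(read: str, barcode_len: int, tn_end: str):
    """Strip the 5' barcode and transposon end; keep 20 bp of junction.

    Returns None if the transposon end does not follow the barcode or
    if fewer than 20 bp of junction remain.
    """
    body = read[barcode_len:]
    if not body.startswith(tn_end):
        return None
    junction = body[len(tn_end):len(tn_end) + JUNCTION_LEN]
    return junction if len(junction) == JUNCTION_LEN else None

def window_counts(positions, genome_len: int):
    """Count insertion reads per 100 bp window (0-based coordinates)."""
    n_windows = -(-genome_len // WINDOW)  # ceiling division
    counts = Counter(pos // WINDOW for pos in positions)
    return [counts.get(i, 0) for i in range(n_windows)]

# Toy example with made-up sequences and coordinates.
read = "ACGTAC" + "GATTACA" + "T" * 20 + "C" * 12
print(trim_read(read, barcode_len=6, tn_end="GATTACA"))
print(window_counts([0, 99, 100, 250], genome_len=300))
```

Keeping a fixed 20 bp after the transposon end is what discards most of the poly-C tail, since the tail sits at the far end of the junction fragment.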

Overall evaluation of resulting Tn-seq profiles
We have constructed a highly saturated transposon mutant library of S. Typhimurium 14028s

Figure S1A and S1B). This efficient Tn-seq protocol, developed in our laboratory, offers distinctive advantages over other Tn-seq library preparation methods, including the low amount (~100 ng) of DNA required and no need for physical shearing or restriction digestion (Dawoud et al., 2014; Karash et al., 2017; Kwon et al., 2016).

Illumina sequencing using HiSeq 3000 produced 163,943,475 reads from a single flow cell lane.

The raw reads were demultiplexed using a custom Perl script, requiring a perfect match to the barcodes used (Table S1) but allowing up to two mismatches within the Tn5 mosaic end (ME).

Besides, we looked for the occurrence of any hot spots of Tn5 insertion in the sample libraries. We found an even distribution of Tn5 insertion reads across the libraries throughout the genome.
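The demultiplexing rule described above (exact barcode match, up to two mismatches in the mosaic end) can be illustrated as follows. This is a minimal Python sketch, not the authors' Perl script: the read layout (barcode immediately followed by the ME), the barcode and ME sequences, and the function names are all assumptions made for the example.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def assign_library(read, barcodes, me_seq, max_me_mismatches=2):
    """Return the library name for a read, or None if it is rejected.

    Assumes equal-length barcodes at the 5' end, directly followed by
    the mosaic end (a hypothetical layout for this sketch).
    """
    for barcode, library in barcodes.items():
        if read.startswith(barcode):  # barcode: perfect match only
            me = read[len(barcode):len(barcode) + len(me_seq)]
            if len(me) == len(me_seq) and hamming(me, me_seq) <= max_me_mismatches:
                return library
            return None
    return None

# Made-up barcodes and a made-up stand-in for the ME sequence.
barcodes = {"ACGTAC": "IP1", "TGCATG": "NaCl"}
me = "GATTACAGATTACA"

print(assign_library("ACGTAC" + "GATTACAGATTACA" + "TTTT", barcodes, me))
print(assign_library("TGCATG" + "CATTACAGATTAGA" + "TTTT", barcodes, me))
print(assign_library("TGCATG" + "CATTACAGATTAGT" + "TTTT", barcodes, me))
```

Requiring an exact barcode keeps libraries from bleeding into one another, while the tolerance in the ME recovers reads that carry a sequencing error in the constant transposon region.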

Some genomic regions lacking insertions are clearly visible as white stripes (Figure S2) across all the samples; these represent essential loci in the S. Typhimurium 14028s genome.

In this study, we used two strategies to identify the conditionally essential genes (CEGs) that S. Typhimurium requires to overcome host stressors. The first was negative selection of complex Tn5 mutant libraries based on growth fitness under mild stressors (3% NaCl, 100 mM propionate, 1 mM H2O2), and the second was based on survival of Tn5 mutant libraries under harsher stressors (12 days of starvation and pH 3), as shown in Figure 1.

The ARTIST pipeline can identify whether genes are entirely essential or domain essential in a given condition. In our study, only a few genes were identified as domain essential; the majority were entirely essential. For simplicity, we combined both categories, entirely essential and domain essential, into a single category: conditionally essential genes (CEGs).
separately. As expected, most of the CEGs overlapped between these two comparisons. For the conditions PA, NaCl, and H2O2, we took the CEGs identified in common by comparing the output library with both IP1 and LB as the CEGs for each condition. However, the output libraries for PH3 and Starvation were compared only with IP1, because selection of the Tn5 library was based on surviving mutants and the mutant cells did not multiply during selection in liquid media.

We identified an overlapping set of 339 CEGs that are required for fitness of S. Typhimurium 14028s in at least one of the five conditions (Figure 3A). Starvation had the most CEGs (241), followed by PH3 (103), NaCl (60), H2O2 (40) and PA (19), as shown in Tables S2 and S3. This likely reflects that starvation is a severe stressor involving diverse genetic pathways for survival, while PA is a mild stressor for the fitness of S. Typhimurium. More than half of the CEGs were on the lagging strand (56.63%), which is somewhat contrary to the responsive genes in Escherichia coli and Streptococcus pneumoniae (Nichols et al., 2011; van Opijnen and Camilli, 2012). We assigned a functional role to 96 CEGs that were putative proteins and 21 CEGs belonging to hypothetical proteins. The stress tolerance proteins commonly identified in at least two of the in vitro stressors included ATP synthase, a transcriptional regulator, 3-dehydroquinate synthase, site-specific tyrosine recombinase xerC, flavin mononucleotide phosphatase, ribulose-phosphate 3-epimerase, and DNA-dependent helicase II, among others (Tables S2 and S3).

Intriguingly, we found that many genes in the Salmonella pathogenicity islands (SPIs) were required for fitness in the presence of the in vitro stressors used in this study. Numerous genes in SPI-1, SPI-2, SPI-3, SPI-5, SPI-6, and SPI-11 were required for resistance against Starvation (n=68), NaCl (n=28), and PH3 (n=27) (Table S4). However, no SPI genes were required for fitness in PA and H2O2. SPI-5 and SPI-11 genes were conditionally essential only in PH3 (n=4 and 6, respectively), SPI-3 genes only in NaCl (n=7), and SPI-6 genes only in Starvation (n=7). The Tn-seq profile for the SPI-1 region is shown in Figure S3A as an example.
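The way per-condition CEG sets are combined, as described earlier in this section, amounts to simple set operations: an intersection of the two comparisons for the growth conditions, a single comparison for the survival conditions, and a union across conditions for the overall set. The sketch below uses toy gene labels rather than genes from this study, and the function names are illustrative.

```python
def condition_cegs(vs_ip1, vs_lb=None):
    """CEGs for one condition.

    Growth conditions: genes called in BOTH comparisons (vs IP1 and vs LB).
    Survival conditions: only the IP1 comparison is available (vs_lb=None).
    """
    vs_ip1 = set(vs_ip1)
    return vs_ip1 if vs_lb is None else vs_ip1 & set(vs_lb)

def summarize(per_condition):
    """Return the union of CEGs across conditions and per-condition counts."""
    overall = set().union(*per_condition.values())
    counts = {cond: len(genes) for cond, genes in per_condition.items()}
    return overall, counts

# Toy labels (g1..g6) for illustration only.
per_condition = {
    "NaCl": condition_cegs({"g1", "g2", "g3"}, {"g2", "g3", "g4"}),
    "Starvation": condition_cegs({"g3", "g5", "g6"}),
}
overall, counts = summarize(per_condition)
print(sorted(overall), counts)
```

The same counting explains why the per-condition totals reported above (241 + 103 + 60 + 40 + 19 = 463) exceed the 339 distinct CEGs: a gene required in more than one condition is counted once in the union.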

For a broader insight into the pathways involved in stress resistance, we assigned each CEG to a cluster of orthologous groups (COG) category using the eggNOG database (evolutionary genealogy of genes: Non-supervised Orthologous Groups) (Jensen et al., 2008). The CEGs having a top hit for the COG in S. Typhimurium LT2 were kept, and CEGs with no orthologous group were allotted to group XX (Figure 3B; Table S3). Overall, 21.83% of CEGs belonged to the category "function unknown", followed by "intracellular trafficking, secretion, and vesicular transport" (10.91%),

The substantial portion of CEGs (30.6%) falling into either "function unknown" or "no orthologs found" shows that our data set is rich in novel genotype-phenotype relationships.

Additionally, we were interested to see whether any CEGs identified in our study fell into the essential genomes of S. Typhimurium in other strain backgrounds. The essential genomes of S. Typhimurium strain SL3261 (selected on LB agar) (Barquist et al., 2013) and S. Typhimurium strain LT2 (selected on rich medium) (Knuth et al., 2004; Zhang et al., 2004) were compared with the CEGs of S. Typhimurium 14028s identified in this study. Genes from the other strain backgrounds were mapped to their corresponding orthologous genes in the S. Typhimurium 14028s background.

Interestingly, 10 and 15 CEGs in this study were shared with the essential genes of S. Typhimurium SL3261 and LT2, respectively (Table S5; Figure S4). This indicates that these genes, although essential in other strain backgrounds, are dispensable in the S. Typhimurium 14028s strain background, since a gene can only be scored as conditionally essential if viable Tn5 mutants in it were present in the input library.

Molecular and phenotypic basis of CEGs in S. Typhimurium
Next, we examined the genetic and biochemical mechanisms related to the CEGs identified in our study. For convenience, we split this section into specific CEGs, required for fitness in only one stressor, and common CEGs, shared by at least two of the five host stressors.

CEGs specifically required for propionate (100 mM PA) stress resistance. The CEGs specific for fitness of S. Typhimurium in propionate were yiiD and sdhAD. YiiD is a putative acetyltransferase (read coverage shown in Figure 4C). Acetylation, a post-translational modification of proteins, was previously shown to enable prokaryotes to increase stress resistance (Ma and Wood, 2011). Additionally, the succinate dehydrogenase flavoprotein (sdhA) and cytochrome b566 (sdhD) subunit proteins were up-regulated by intestinal SCFA in S.

CEGs specifically required for osmotic (3% NaCl) stress resistance. Twenty-six resistance genes of S. Typhimurium were required for fitness in osmotic stress (3% NaCl) alone. Protein-protein network analysis using the STRING database (http://string-db.org) against S. enterica LT2 showed three distinct gene clusters: SPI-3 (mgtBC, misL, cigR, slsA, fidL and marT), a two-component system (dcuBRS), and sodium ion transport (yihPO), along with other nodes (http://bit.ly/2bCKGVG). SPI-3 genes are important for intracellular replication inside the phagosome, where Salmonella experiences hyperosmotic stress (Schmidt and Hensel, 2004). Expression of the virulence protein mgtC and the Mg2+ transporter mgtB was five-fold higher when S. Typhimurium was exposed to 0.3 M NaCl (Lee and Groisman, 2012). MisL, an autotransporter protein, is an intestinal colonization factor (activated by marT, a transcriptional regulator) that binds to extracellular matrix fibronectin in an animal host and is also involved in adhesion to plant tissue (Dorsey et al., 2005; Kroupitski et al., 2013). Deletion of cigR in S. Pullorum resulted in significantly decreased biofilm formation and increased virulence (Yin et al., 2016). Additionally, Figueira et al. showed that a ∆cigR strain of S. Typhimurium had attenuated replication in mouse bone marrow-derived macrophages (Figueira et al., 2013).

The yihPO genes are essential for capsule assembly, which Salmonella requires for persistence under environmental stresses such as desiccation (Gibson et al., 2006). The absence of ompL (ortholog of yshA) leads to solvent hypersensitivity, as it helps stabilize cell wall integrity and acts as a physical barrier against solvent penetration (Murinova and Dercova, 2014). In E. coli, the genes under the control of dcuS-dcuR, a two-component system, were not affected by hyperosmotic shock (Weber and Jung, 2002). However, dcuBRS were conditionally essential in

Hydrogen peroxide kills E. coli cells by two distinct modes: mode-1 killing occurs at lower concentrations of H2O2 due to DNA damage, and mode-2 killing occurs at higher concentrations of H2O2 due to damage to other structures such as proteins and lipids (Imlay and Linn, 1986).

High osmolality, low oxygen, and late log phase induce hilA expression in vitro, which in turn regulates the expression of SPI-1 genes (Lostroh and Lee, 2001). Interestingly, we identified SPI-1 genes as fitness genes required under the in vitro NaCl stressor. Similarly, lipopolysaccharide (LPS) biosynthetic process genes were enriched in LB42 and Bile, and in pig, calf, and chicken for fitness during enteric infection. LPS, a critical virulence factor in gram-negative bacterial infection, is required for intestinal colonization, resistance to killing by macrophages, swarming motility, serum resistance and bile stress (Khatiwara et al., 2012; Kong et al., 2011). A csgBA (curli subunit protein) mutant of S. Typhimurium was attenuated in its ability to elicit fluid accumulation in bovine ligated ileal loops (Tükel et al., 2005), and these genes, along with csgF and csgG, were required for fitness in PH3. Additionally, putative proteins STM14_1138, STM14_1486, STM14_1981, STM14_3333 and STM14_4826, STM14_4828, STM14_5184, STM14_5185 (hypothetical protein) were required for fitness in in vitro acidic and osmotic stress, respectively, and for enteric infection in all three hosts.

Other than SPI genes, the most enriched genes belonged to nucleic acid metabolic process (dam, trpS, mnmE, truA, serC, csgD, ompR and cra), lipopolysaccharide biosynthetic process (rfbABCNPU, rfaB, udg, galF), oxidative phosphorylation (ATP synthase genes, NADH dehydrogenase genes), and two-component systems (ompR, barA, phoQ, glnDL, pagKO), among others (Figure 6). The dam gene was required for fitness in H2O2, NaCl, A-Mice, and Sp-Liv. XerC

Salmonella pathogenicity islands in vitro conditions such as early stationary phase, anaerobic growth, oxygen shock, nitric oxide shock as well as in pH3, NaCl, bile, and peroxide shock among others (Kröger et al., 2013). However, transcription of a gene does not necessarily indicate that the gene's function is needed for fitness in a given condition. The transcript may reflect leaky expression, or it may be required for fitness in an upcoming environment in a cost-effective way through predictive adaptation, a phenomenon in which bacteria anticipate and pre-emptively respond to regular environmental fluctuations (temporally distributed stimuli), conferring a considerable fitness advantage for survival (Mitchell et al., 2009; Tagkopoulos et al., 2008). Traditionally, under the "central dogma", the flow of information from DNA to RNA to protein has been assumed to be highly concordant. However, there is only a modest correlation between the levels of transcripts and their corresponding proteins (Foss et al., 2007; Fu et al., 2009; Ghazalpour et al., 2011). Thus, functional genomics screening such as Tn-seq is expected to reveal more direct functional aspects of the genes involved in responding to the current stresses.

Figure S1. Preparation of Tn-seq amplicon library for Illumina sequencing. A) Genomic DNA of the Tn5 mutant library was linearly extended using Tn-specific primer 1 (Ez-Tn5 primer3 in Table S1). Then a C-tail was attached to the 3' end of the purified single-stranded DNA. The C-tailed product was purified and exponential PCR was performed using Tn-specific primer 2 (Barcoded primers in Table S1) and a C-tail specific primer (HTM-Primer in Table S1), with Illumina adapters attached to the primers. B) Exponentially amplified DNA was then run on a 1.5% agarose gel. DNA from 300 bp to 500 bp was extracted from the gel and sent for Illumina sequencing. [M: Hi-Lo DNA marker; 1, 2, 3, 4: Tn5 mutant libraries; and C: negative control (gDNA of the wild type S. Typhimurium 14028s)].
Figure S2. Overlay plot displaying a global view of the genome-wide quantitative distribution of Tn5 insertion read counts for all samples. X-axis: position on the genome; Y-axis: number of reads per 100 bp, scaled in log10.
Figure S3. Tn-seq profiles around the selected genomic regions. A) Salmonella pathogenicity island 1 (SPI-1) genes encoding the type III secretion system (TTSS). Screenshot produced using Integrative Genomics Viewer (IGV) showing raw read coverage [100-600] in seven conditions. (Blue asterisk: conditionally essential in NaCl and Starvation; red asterisk: conditionally essential in Starvation only). B) CpxAR were conditionally essential in starvation only.
Figure S4. Comparison of the overlapping set of conditionally essential genes of S. Typhimurium 14028s (this study) with the essential genomes of S. Typhimurium SL3261 and S. Typhimurium LT2.

Tables
Table S1. Oligonucleotides used in this study.
Table S2. All conditionally essential genes (CEGs) in S. Typhimurium 14028s identified in this study.
Table S3. Comparison of the conditionally essential genes (CEGs) in S. Typhimurium 14028s identified in this study across the 5 stress conditions.
Table S4. The conditionally essential genes (CEGs) in S. Typhimurium 14028s identified in this study that are located in Salmonella Pathogenicity Islands.
Table S5. Comparison of the conditionally essential genes (CEGs) of S. Typhimurium 14028s (this study) with the essential genes of S. Typhimurium identified from previous studies.
Table S6. The conditionally essential genes (CEGs) in the presence of the in vitro host stressors (PA, NaCl, pH3, Bile, and LB42) that are also required for enteric infection in farm animals (cattle, pig, and chicken).
infections (MΦ, A-Mice, P-Mice, Sp-Liv).