G-quadruplex stabilization in the ions and maltose transporters inhibit Salmonella enterica growth and virulence

G-quadruplex-forming motifs have recently emerged as novel therapeutic drug targets in various human pathogens. Herein, we report three highly conserved G-quadruplex motifs (SE-PGQ-1, 2, and 3) in the genomes of all 412 strains of Salmonella enterica. Bioinformatics analysis placed SE-PGQ-1 in the regulatory region of mgtA, SE-PGQ-2 in the open reading frame of entA, and SE-PGQ-3 in the promoter region of the malE and malK genes. The products of mgtA and entA are involved in the transport and homeostasis of Mg2+ and Fe3+ ions and are thereby required for bacterial survival in the presence of the reactive nitrogen/oxygen species produced by host macrophages, whereas malK and malE are involved in the transport of maltose, one of the major carbon sources in the human gastrointestinal tract. The formation of stable intramolecular G-quadruplex structures by the SE-PGQs was confirmed by CD spectroscopy, EMSA and NMR spectroscopy. Cellular studies revealed an inhibitory effect of 9-amino acridine on Salmonella enterica growth, and CD melting analysis demonstrated its stabilizing effect on the SE-PGQs. Further, polymerase inhibition and RT-qPCR assays emphasize the biological relevance of the predicted G-quadruplexes in the expression of PGQ-possessing genes and demonstrate G-quadruplexes as a potential drug target for developing novel therapeutics to combat Salmonella enterica infection.

Author Summary
Over the last several decades, the scientific community has witnessed a rapid increase in the number of human pathogenic bacterial species that have acquired resistance to multiple antibacterial agents. The emergence of multidrug-resistant strains remains a major public health concern for clinical investigators and raises a global alarm to search for novel and highly conserved drug targets. Recently, G-quadruplex-forming nucleic acid sequences have been endorsed as highly conserved drug targets for preventing infection by several human pathogens, including viral and protozoan species. Therefore, we explored the presence of G-quadruplex-forming motifs in the genome of Salmonella enterica, a bacterium that causes food poisoning and enteric fever in humans. The formation of intramolecular G-quadruplex structures in four genes (mgtA, entA, malE and malK) was confirmed by NMR, CD and EMSA. 9-amino acridine, a known G-quadruplex binder, was shown to stabilize the predicted G-quadruplex motifs and to decrease the expression of the G-quadruplex-harbouring genes, as assessed by RT-PCR and a cellular toxicity assay. This study identifies G-quadruplex motifs in essential genes of the Salmonella enterica genome as novel and conserved drug targets, and 9-amino acridine as a candidate small molecule for preventing Salmonella enterica infection through a G4-mediated inhibition mechanism.


Introduction
Salmonella enterica belongs to the Enterobacteriaceae family and is known to cause typhoid fever and food poisoning in humans. Salmonella enterica consists of six subspecies, namely 1) enterica, 2) salamae, 3) arizonae, 4) diarizonae, 5) houtenae and 6) indica (Fig 1a). Among these, subspecies enterica is known to infect humans and comprises ~2463 serovars that can be further divided into two classes [1,2].
The typhoidal class of S. enterica includes S. enterica subsp. enterica ser. Typhi (S. ser. Typhi) and S. enterica subsp. enterica ser. Paratyphi (S. ser. Paratyphi), which cause typhoid or enteric fever, whereas the non-typhoidal class includes S. enterica subsp. enterica ser. Enteritidis (S. ser. Enteritidis) and S. enterica subsp. enterica ser. Typhimurium (S. ser. Typhimurium), which cause food poisoning in humans [3]. They have been reported to cause more than ~140 foodborne illnesses throughout the world [4][5][6][7][8]. As per the Centers for Disease Control and Prevention (CDC), typhoid fever accounts for ~22 million new cases and ~200,000 deaths every year across the world [9,10]. The emergence of antimicrobial resistance to chloramphenicol, co-trimoxazole, ampicillin, ciprofloxacin, ofloxacin, azithromycin and cephalosporins makes the situation more dangerous and leads to an increased death rate due to clinical treatment failure [11][12][13]. More recently, severe complications of Salmonella infection have emerged and have been linked to multiple invasive disorders such as irritable bowel syndrome [14], reactive arthritis [15], bacteremia [16], focal infection [17], meningitis [18] and infectious aortitis [19]. Owing to its high prevalence and the rapid emergence of drug resistance, Salmonella raises a global alarm for the development of novel and promising therapeutic approaches. A more effective therapeutic approach would target the expression of genes that are associated with essential nutrient acquisition systems and that have remained conserved throughout evolution in the genome of this deadly human pathogen.
S. enterica is an intracellular pathogen that grows in phagocytes and macrophages. During the growth phase, the host innate immune system generates various oxidative stresses to eradicate this pathogen. However, S. enterica possesses a magnesium homeostasis mechanism that controls the intracellular Mg2+ concentration and helps the bacterium survive under nitro-oxidative stress [20].
Magnesium homeostasis is also required for the virulence, thermo-tolerance and survival of S. enterica [21].
The bacterium contains three genes that help in Mg2+ uptake from the host, namely i) mgtA, ii) mgtB and iii) corA. Neutralization of reactive nitrogen stress (RNS, nitro-oxidative) is mainly regulated by the Mg2+ transport ATPase encoded by the mgtA gene, which therefore plays a vital role in bacterial survival inside macrophages (Fig 2a) [22]. Hence, targeting the conserved region of the mgtA gene may serve as a promising therapeutic approach to combat S. enterica pathogenesis.
Like Mg2+, iron is also required by this bacterium, but its uptake from the host environment is more challenging because it is sequestered by various host proteins such as siderocalin, transferrin and lactoferrin, which makes it unavailable in the host environment (Fig 2b). S. enterica produces enterobactin and salmochelin, two low-molecular-weight catecholate-type siderophores that have a higher affinity for iron than the host iron-binding proteins. Enterobactin and salmochelin are produced from chorismate by several enzymes encoded by the co-transcribed entABCDEF operon (Fig 2b) [23]. The conversion of 2,3-dihydro-2,3-dihydroxybenzoate (2,3 DHB) to another intermediate, 2,3-dihydroxybenzoate (DHB), is a crucial step of this pathway and requires the 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase enzyme (EntA), encoded by the entA gene of the entABCDEF operon. Inhibition of entA gene expression has been observed to abolish DHB formation, leading to reduced production of enterobactin and salmochelin [24]. Interestingly, salmochelins are resistant to the antimicrobial peptides (lipocalins) secreted by host cells and act as an essential factor in the pathogenesis of systemic S. ser. Typhimurium infection [25]. Salmochelins also protect this bacterium from oxidative stress mediated by reactive oxygen species (ROS) [26]. Targeting siderophore production has previously been shown to have antimicrobial activity against Mycobacterium tuberculosis [27], Aspergillus fumigatus [28], Yersinia pestis [29], Pseudomonas aeruginosa [30], Bacillus subtilis, Acinetobacter baumannii and Vibrio cholerae [31]. Therefore, targeting the entA gene may prove to be another therapeutic approach to combat the infection and virulence of S. enterica.
S. enterica grows in the human gastrointestinal tract, which contains ample maltose and maltodextrin. Therefore, along with glucose, S. enterica utilizes maltose as a major source of carbon.
The uptake of maltose from the host environment is tightly regulated by two genes, malK and malE (Fig 2c) [32]. Inhibiting the synthesis of MalK and MalE has been shown to reduce the growth rate of S. enterica [33].
Hence, targeting these genes would impair the ability of S. enterica to grow inside the gastrointestinal tract.
In addition to its canonical B-form, DNA can fold into a non-canonical G-quadruplex (G4) structure. G-rich regions of the genome carrying the motif $G_{\geq 2}N_{L}G_{\geq 2}N_{L}G_{\geq 2}N_{L}G_{\geq 2}$ tend to form these specific secondary structures. G4s are stabilized by monovalent and some divalent cations, in the order K+ > Na+ > Mg2+ > Li+, and can adopt various topologies (Fig 1b & 1c) [34,35]. This structural diversity has been exploited for diagnosis and therapeutic targeting [36,37]. G4s are highly ordered and have been shown to be evolutionarily conserved in eukaryotes [38], prokaryotes [39], protista [40], plants [41] and viruses [42]. G4-binding proteins and antibody-based approaches have confirmed their presence in vivo, and G4s are reported to play regulatory roles such as specifying origin of replication (ORI) sites during DNA replication, maintaining telomeres in human cells, driving antigenic variation by regulating recombination, and modulating transcription and translation [43]. G4 motifs present in the telomeric regions of chromosomes and in the promoter regions of oncogenes such as Bcl, c-Kit, MYC and KRAS have been explored in various human cancers [44,45].
Currently, G4s are being investigated for their involvement in the virulence and survival mechanisms of various human pathogens [46]. Stabilization of G4s in the protozoans Plasmodium falciparum, Trypanosoma brucei and Leishmania donovani has shown anti-protozoal activity [47]. [42]. In bacteria, the G4s present upstream of the pilE locus, the B31 vlsE locus and the tprK antigen protein in Neisseria gonorrhoeae, Borrelia burgdorferi and Treponema pallidum, respectively, act as activators for the initiation of antigenic variation and help the pathogens evade the immune system of the host [46]. In Deinococcus radiodurans, G4 sequences are present in the regulatory regions of various genes and contribute to radioresistance [48]. A G4 present 150 nt upstream of nasT in the soil bacterium Paracoccus denitrificans PD1222 is reported to be involved in nitrite assimilation [49].
All these reports demonstrate the pivotal role of G-quadruplexes in human pathogens, and their conservation suggests them as promising drug targets in both drug-susceptible and drug-resistant strains. Therefore, a comprehensive study that identifies highly conserved G-quadruplexes in the genome of S. enterica as drug targets may provide a suitable therapeutic approach for fighting infection by this deadly pathogen and for overcoming the emergence of drug resistance in this bacterium.
In the present study, we sought to explore highly conserved potential G-quadruplex forming sequences (SE-PGQs) in all 412 available, completely sequenced strains of S. enterica.
Bioinformatics sequence analysis revealed the presence of three SE-PGQs (SE-PGQ 1-3) at three different gene locations in the S. enterica genome. SE-PGQ-1 was found in the regulatory region of mgtA and SE-PGQ-2 in the open reading frame of entA, whereas SE-PGQ-3 was found to lie in the regulatory region of the malK and malE genes (Fig 3a). To confirm the formation of G-quadruplex structures by the SE-PGQs, circular dichroism (CD) spectroscopy, electrophoretic mobility shift assay (EMSA) and one-dimensional 1H NMR spectroscopy were employed. Further, to validate these SE-PGQs as potential drug targets, CD melting and Taq polymerase stop assays were performed; these confirmed that 9-amino acridine, a known G4-binding molecule, interacts with and stabilizes the SE-PGQ motifs with high affinity and selectivity over duplex DNA. Disc diffusion and MTT cytotoxicity assays confirmed the growth inhibition of S. enterica cells by 9-amino acridine. Further, real-time quantitative PCR (RT-qPCR) revealed reduced expression of the genes that harbor the SE-PGQs in either their coding or regulatory region upon treatment with 9-amino acridine. This change in the expression of PGQ-harboring genes in the presence of a G4-binding ligand suggests a G4-mediated regulatory mechanism in the expression of these genes.
As mentioned above, these genes are essential for bacterial survival and virulence inside host macrophages. Therefore, the G-quadruplex motifs found in these genes can be utilized as potential drug targets to develop promising anti-microbial therapeutics. Moreover, the high conservation of these SE-PGQs, even in drug-resistant strains, could help overcome the problem of emerging drug resistance in S. enterica.

Salmonella enterica genome harbors three most conserved G-quadruplexes
A comprehensive mining of potential G-quadruplex forming motifs (SE-PGQs) was performed on 412 completely sequenced strains of S. enterica (Supplementary Table S1). The bioinformatics analysis identified a total of 109400 PGQs across the 412 strains (Supplementary File S2). Given that similar sequences may correspond to similar structures and evolutionarily conserved functions, all the predicted PGQs were clustered by the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) using the Clustal Omega tool. Conservation is an essential parameter that makes these PGQ motifs suitable as drug targets. Therefore, we next examined the conservation of each PGQ cluster using the following equation:

$$p = \frac{n}{N} \times 100$$

where p is the frequency of occurrence, n is the number of strains with the specific G4 sequence, and N is the total number of strains of S. enterica. This conservation analysis revealed 187 PGQ clusters that were conserved in more than 90% of S. enterica strains (Table S3). G-quadruplexes with a loop length of 1-7 and G-tracts of ≥3 form more stable structures [50]. Therefore, for further study, we selected only those PGQs that satisfied these criteria; they are listed in Table 1 (Table S3a and S3b). Interestingly, out of these 18 PGQ clusters, three PGQs (SE-PGQ-1, SE-PGQ-2 and SE-PGQ-3) were found to be conserved in more than 98% of S. enterica strains (S2 File) and to be present in four essential genes, namely mgtA, entA, malK and malE (Fig 3). The consensus sequence depicts the G residues of the SE-PGQ motifs that were conserved during evolution (Fig 3b).
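The conservation filter described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the frequency is expressed as the percentage p = n / N × 100 (the percentage form is inferred from the 90% and 98% thresholds quoted in the text); the function names and the example counts are hypothetical and are not taken from the study.

```python
def conservation_percent(n_strains_with_motif: int, n_total_strains: int) -> float:
    """Return p = n / N * 100, the percentage of strains carrying a PGQ cluster."""
    if n_total_strains <= 0:
        raise ValueError("total number of strains must be positive")
    if not 0 <= n_strains_with_motif <= n_total_strains:
        raise ValueError("n must lie between 0 and N")
    return 100.0 * n_strains_with_motif / n_total_strains


def conserved_clusters(cluster_counts: dict, n_total_strains: int,
                       threshold: float = 90.0) -> dict:
    """Keep clusters whose conservation exceeds the threshold (in percent).

    cluster_counts maps a cluster name to the number of strains containing it.
    """
    result = {}
    for name, n in cluster_counts.items():
        p = conservation_percent(n, n_total_strains)
        if p > threshold:
            result[name] = p
    return result
```

For example, with 412 strains, `conserved_clusters({"cluster_A": 410, "cluster_B": 300}, 412)` keeps only the hypothetical `cluster_A`; raising `threshold` to 98.0 reproduces the stricter cut-off used to select the three SE-PGQs.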

In vitro 1H NMR analysis affirms the formation of G-quadruplex
NMR spectroscopy is considered one of the most reliable techniques for confirming the formation of G-quadruplex structures by nucleic acid sequences. Therefore 1D 1

Evaluating the topology and stability of the PGQs using Circular Dichroism
Circular dichroism is one of the most widely used techniques for analyzing the topology of G-quadruplex structures. Depending upon its sequence, loop length and bound cation, a G-quadruplex can form a parallel, antiparallel or hybrid conformation. A positive peak at ~260 nm and a negative peak at ~240 nm signify a parallel G-quadruplex topology. A positive peak at ~290 nm and a negative peak at ~260 nm signify an antiparallel topology, whereas two positive peaks at 260 nm and 290 nm with a negative peak at 240 nm indicate a mixed or hybrid topology [51]. Different cations affect the stability of the G-quadruplex structure to different extents. The ranking of the stabilizing ability of some well-studied cations is as follows: K+ > Na+ > Mg2+ > Li+ [35]. Therefore, we performed CD spectroscopy of the SE-PGQs in buffers containing four different cations (K+, Na+, Li+ and Mg2+) (Fig 5 & S2 Fig).
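The peak rules above can be written as a small decision function. The sketch below is illustrative only: it assumes the positions of the positive and negative CD bands have already been extracted from a spectrum, and the ±5 nm tolerance and the function name are assumptions rather than values from the study.

```python
def classify_cd_topology(positive_peaks_nm, negative_peaks_nm, tolerance_nm=5.0):
    """Assign a G-quadruplex topology from CD band positions (in nm).

    Rules follow the text: parallel = +260/-240, antiparallel = +290/-260,
    hybrid = +260 and +290 with -240. Anything else is left unassigned.
    """
    def has_peak(peaks, target):
        # True if any observed band lies within the tolerance of the target.
        return any(abs(p - target) <= tolerance_nm for p in peaks)

    pos_260 = has_peak(positive_peaks_nm, 260)
    pos_290 = has_peak(positive_peaks_nm, 290)
    neg_240 = has_peak(negative_peaks_nm, 240)
    neg_260 = has_peak(negative_peaks_nm, 260)

    # Check the hybrid signature first, since it contains the parallel one.
    if pos_260 and pos_290 and neg_240:
        return "hybrid"
    if pos_260 and neg_240:
        return "parallel"
    if pos_290 and neg_260:
        return "antiparallel"
    return "unassigned"
```

The hybrid case is tested before the parallel case because a hybrid spectrum also satisfies the parallel criteria; real spectra with shoulders or weak bands would need more careful treatment than this rule set provides.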
CD spectra analysis revealed a predominantly parallel G-quadruplex topology for SE-PGQ-1 and SE-PGQ-3 in the presence of K+, whereas SE-PGQ-2 showed a hybrid G-quadruplex topology in the presence of K+ (Fig 5a). As expected, CD spectral scanning performed at increasing K+ concentrations showed the maximum molar ellipticity at the highest K+ concentration (S3 Fig). Adenine and CD spectra analysis was performed in 50 mM K+ ion (S4 Table). The mutants (mut-PGQ-1, mut-PGQ-2 and mut-PGQ-3) failed to show the characteristic CD signal of a G-quadruplex, i.e. a positive band at 260/290 nm and a negative band at 240 nm, suggesting that mutation of the G-tract disrupted G-quadruplex formation (Fig 5a).

Electrophoretic Mobility Shift Assay (EMSA) supports intramolecular conformations of SE-PGQs
Next, an electrophoretic mobility shift assay (EMSA) was performed to check the molecularity (inter- or intramolecular G-quadruplex) of the SE-PGQs in solution. An intramolecular G-quadruplex has a compact topology and migrates faster than its linear counterpart, whereas an intermolecular G-quadruplex has a comparatively wider topology and migrates more slowly than its linear counterpart [52]. All three SE-PGQs and the positive control (Tel22 DNA G-quadruplex) showed faster mobility than their respective linear counterparts, suggesting the formation of intramolecular G-quadruplexes by the SE-PGQs (Fig 6).

9-amino acridine inhibits Salmonella enterica growth
Various small molecules that either stabilize or destabilize G-quadruplex conformations, such as BRACO-19, TMPyP4 and several 9-amino acridine derivatives, are under investigation for therapeutic intervention in various human pathogenic infections [42]. Previously, 9-amino acridine and its derivatives have been observed to have anti-proliferative properties in cancer cells [53] through binding to the telomeric region [54], the c-Myc gene [55] and the c-Kit promoter [56]. Therefore, we analyzed the effect of 9-amino acridine on S. enterica growth by performing an agar disc diffusion assay and an MTT assay. A clear zone of inhibition, comparable to those of ampicillin and penicillin, was observed on the agar plate (S4 Fig), indicating an inhibitory effect of 9-amino acridine on S. enterica growth, and the MTT assay gave an IC50 value of 10.5 μM (Fig 7).

9-amino acridine stabilized the SE-PGQs and thereby stalls the movement of polymerase
To understand the role of the SE-PGQs in the cytotoxic effect of 9-amino acridine, the binding of 9-amino acridine to these SE-PGQs was analyzed by CD melting studies. An increase in melting temperature (ΔTm ~8.5 °C) was observed upon addition of 9-amino acridine compared with the SE-PGQs alone, indicating that 9-amino acridine increased the thermodynamic stability of the SE-PGQs (Fig 8). Further, we employed a Taq polymerase PCR stop assay to investigate whether complex formation between 9-amino acridine and the SE-PGQs can stop the movement of the polymerase replication machinery. To test this hypothesis, we incubated the PCR reaction mixture with increasing concentrations of 9-amino acridine and then performed PCR amplification. Band intensity diminished with increasing concentration of 9-amino acridine, whereas in the absence of 9-amino acridine the band intensity was maximal, indicating that Taq polymerase was able to extend through the SE-PGQ motifs. This shows that binding of 9-amino acridine to the SE-PGQ motifs stabilized the G-quadruplex structure and inhibited the movement of the replication machinery relative to the untreated SE-PGQs. In contrast, when mutant PGQs lacking the G-tract were used as the DNA template, 9-amino acridine was not able to bind and thus could not inhibit the movement of Taq polymerase, and a PCR product was obtained (Fig 9).
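As an illustration of how a ΔTm value can be derived from CD melting curves, the sketch below estimates Tm as the temperature at which the normalized signal falls to one half, using linear interpolation between measured points. This is an assumed, simplified procedure (the fitting method used in the study is not stated), it assumes the signal decreases as the structure unfolds, and the function names are hypothetical.

```python
def melting_temperature(temps_c, signal):
    """Estimate Tm (deg C) as the half-height point of a melting curve.

    temps_c : temperatures in ascending order
    signal  : CD signal at each temperature, high when folded, low when unfolded
    """
    if len(temps_c) != len(signal) or len(temps_c) < 2:
        raise ValueError("need two equally long lists with at least two points")
    high, low = max(signal), min(signal)
    if high == low:
        raise ValueError("flat curve: no transition to locate")
    # Normalize to a folded fraction between 1 (folded) and 0 (unfolded).
    frac = [(s - low) / (high - low) for s in signal]
    for i in range(len(frac) - 1):
        f0, f1 = frac[i], frac[i + 1]
        if f0 >= 0.5 >= f1 and f0 != f1:
            t0, t1 = temps_c[i], temps_c[i + 1]
            # Linear interpolation to the temperature where frac == 0.5.
            return t0 + (f0 - 0.5) * (t1 - t0) / (f0 - f1)
    raise ValueError("curve never crosses the half-height point")


def delta_tm(temps_c, signal_free, signal_with_ligand):
    """Return Tm(with ligand) - Tm(free); positive values indicate stabilization."""
    return (melting_temperature(temps_c, signal_with_ligand)
            - melting_temperature(temps_c, signal_free))
```

A positive `delta_tm` corresponds to ligand-induced stabilization of the kind reported for 9-amino acridine; in practice, a sigmoidal fit to the full curve would usually give a more robust estimate than two-point interpolation.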

9-amino acridine decreases the transcription rate of the genes harboring PGQs:
Further, we performed qRT-PCR assay to check the effect of 9-amino acridine on the expression of the binding with small molecule are being extensively investigated as a promising therapeutic approach for combating various types of human pathogen infection [42,46]. For example, the HIV-1 promoter in the long terminal repeat (LTR) region of the viral genome possesses a G-quadruplex motif that was observed to be critical for its proliferation. BRACO-19 has shown anti-HIV-1 activity by stabilizing the G-quadruplex motif present in the LTR region [42]. Similarly, stabilization of the G-quadruplex structure present in the core gene of the HCV genome by PDP halts its replication and translation, and PDP can therefore be used as a potential anti-hepatitis therapeutic [57]. Pyridostatin has also been observed to stabilize the G-quadruplex structure formed in the mRNA of the nuclear antigen 1 protein of EBV, leading to suppression of its translation [58]. Recently, BRACO-19 and quarfloxin have shown inhibitory effects on Mycobacterium tuberculosis and Plasmodium falciparum by stabilizing G-quadruplexes present in various regions of their genomes [40,59].
Considering the suitability of G-quadruplex structure as a promising drug target in both drug susceptible and drug-resistant strain of pathogens, here we sought to search for G-quadruplex motifs in

Salmonella enterica strains
Completely sequenced strains of S. enterica (S1 Table) were downloaded from the National Center for Biotechnology Information (NCBI). These strains were then extensively mined for potential G-quadruplex motifs in both the sense and antisense strands using our previously developed G-quadruplex predictor tool [63]. This prediction tool used the following regular expression.
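The expression used by the tool is not reproduced here. Purely as an illustration of how such a search can work, the following Python sketch scans both strands for four G-tracts separated by three loops. The default tract and loop lengths (G-tracts of at least 2, loops of 1-7 nt) are assumptions chosen for the example and are not necessarily the parameters of the predictor tool [63]; all function names are hypothetical.

```python
import re

# Translation table used to build the reverse complement of a DNA sequence.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def build_g4_pattern(min_g=2, min_loop=1, max_loop=7):
    """Compile a regex for four G-tracts separated by three loops."""
    tract = "G{%d,}" % min_g
    loop = "[ACGT]{%d,%d}" % (min_loop, max_loop)
    return re.compile(tract + ("(?:" + loop + tract + "){3}"))


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pgqs(seq, min_g=2, min_loop=1, max_loop=7):
    """Find non-overlapping putative G4 motifs on both strands.

    Returns (strand, start, end, motif) tuples; start and end are 0-based,
    end-exclusive coordinates on the sense strand.
    """
    seq = seq.upper()
    pattern = build_g4_pattern(min_g, min_loop, max_loop)
    hits = []
    for m in pattern.finditer(seq):
        hits.append(("+", m.start(), m.end(), m.group()))
    n = len(seq)
    # Antisense motifs are found as G-rich matches on the reverse complement,
    # then mapped back to sense-strand coordinates.
    for m in pattern.finditer(reverse_complement(seq)):
        hits.append(("-", n - m.end(), n - m.start(), m.group()))
    return hits
```

Calling `find_pgqs(sequence, min_g=3)` restricts the search to G-tracts of three or more, matching the stability criterion quoted from [50]. Note that `finditer` reports non-overlapping matches only, so nested or overlapping candidates would need a lookahead-based variant.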
To find the conserved PGQs present in all the strains, multiple sequence alignment (MSA) was performed using the Clustal Omega tool, and clustering was done using the UPGMA method.
Consensus sequences representing the whole G4 sequence with -5 and +5 flanking regions were constructed using the Glam2 tool of the MEME Suite [66].
The resultant PGQ clusters were then mapped to their gene locations in the genomes of the individual S. enterica strains using the coordinates extracted from our G4 prediction tool and the Graphics mode of the GenBank database (https://www.ncbi.nlm.nih.gov/nuccore/).

Salmonella genus G4 homolog prediction
To check the conservation of the predicted PGQs at the Salmonella genus level, NCBI nucleotide BLAST was performed with each consensus PGQ as the query sequence and the Salmonella bongori genome sequences as the target (NCBI taxid: 590). The e-value threshold was set to 1e-3 to exclude chance matches.
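The e-value cut-off can be applied programmatically once the BLAST results are exported. The sketch below assumes the hits were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`, where the eleventh column is the e-value); the export format, the function name and the inclusive comparison (e-value ≤ threshold) are assumptions made for the example, not details given in the text.

```python
def filter_blast_hits(tabular_text, max_evalue=1e-3):
    """Keep BLAST hits whose e-value does not exceed max_evalue.

    tabular_text: contents of a 12-column tab-separated BLAST report
    (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend,
    sstart, send, evalue, bitscore).
    """
    hits = []
    for line in tabular_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        evalue = float(fields[10])
        if evalue <= max_evalue:
            hits.append({
                "query": fields[0],
                "subject": fields[1],
                "identity": float(fields[2]),
                "evalue": evalue,
            })
    return hits
```

Each returned dictionary carries the query name, so hits can be grouped by consensus PGQ to see which motifs have a homolog in the target genome.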

Oligonucleotide preparation for CD and ITC analysis
The predicted G4 oligonucleotide sequences were procured from Sigma Aldrich Chemicals Ltd. (St. Louis, MO, USA). Stock solutions of 100 μM were prepared as per the manufacturer's instructions. Before each set of experiments, the oligonucleotides were re-annealed by heating at 95 °C for 10 minutes followed by slow cooling at room temperature for 2 hrs. The oligonucleotides were dissolved in four different Tris buffers (10 mM, pH 7.0) containing 50 mM of K+, Na+, Li+ or Mg2+, respectively.

One dimensional 1 H-NMR Spectroscopy
An AVANCE 500 MHz spectrometer (BioSpin International AG, Switzerland) equipped with a 5 mm broadband inverse probe was used for NMR spectroscopic analysis. All NMR experiments were performed in H2O/D2O solvent at a 9:1 ratio, at a temperature of 298 K and with a spectral width of 20 ppm; 3-(trimethylsilyl)propionic-2,2,3,3-d4 acid sodium salt (TSP) was used as an internal reference. NMR data processing, integration and analysis were done using Topspin 1.3 software.

Bacterial strain culture and growth conditions
The S. ser. Typhimurium strain ATCC 14028 was procured from HiMedia and streaked on Nutrient Agar (HiMedia). A single colony was inoculated in Nutrient Broth (HiMedia) and kept overnight at 37˚C and 220 rpm in an incubator shaker.

Growth inhibition assay -Disk Diffusion and MTT
Disk diffusion method was employed to identify the susceptibility of bacteria to 9-amino acridine. well served as blank (without 9-amino acridine). The plates were kept at 37˚C, 220 rpm for 3 hr.
Afterward, 10 µl of MTT (5 mg/mL) was added to each well and incubated for 3 hr. Finally, 20 µl of DMSO was added in each well to dissolve formazan crystal, and the plate was examined using microplate reader (BioTek) at 590 nm [68].

PCR Stop Assay
Templates and Primers were procured from Sigma-Aldrich Chemicals Ltd. (St. Louis, MO, USA) (

Authors Contribution
Data conceptualization and methodology were performed by A.K.