Gut feeling: Extent of virulence and antibiotic resistance genes in Helicobacter pylori and campylobacteria

Background: Helicobacter pylori, a member of the campylobacteria, is the leading cause of chronic gastritis and gastric cancer. Virulence and antibiotic resistance of H. pylori are of great concern to public health. However, the relationship between virulence and antibiotic resistance genes in H. pylori in relation to other campylobacteria remains unclear.
Materials and Methods: Using the virulence factor database and the comprehensive antibiotic resistance database, we explored all 354 available complete genomes of H. pylori and compared them with 90 species of campylobacteria for virulence and antibiotic resistance genes/proteins.
Results: On average, H. pylori had 129 virulence genes, the highest among Helicobacter spp., and 71 antibiotic resistance genes, one of the lowest among campylobacteria. Just 2.6% of virulence genes were shared by all campylobacterial members, whereas 9.4% were unique to H. pylori. The cytotoxin-associated genes (cags) seemed to be exclusive to H. pylori. The majority of the isolates from Asia and South America were cag2-negative, and many antibiotic resistance genes showed isolate-specific patterns of occurrence. Just 15 (8.8%) antibiotic resistance genes, but 103 (66%) virulence genes including 25 cags, were proteomically identified in H. pylori. Arcobacterial members showed large variation in the number of antibiotic resistance genes, which was positively related to genome size.
Conclusion: The large repository of antibiotic resistance genes in campylobacteria and a unique set of virulence genes might have important implications in shaping the course of virulence and antibiotic resistance in H. pylori.


Introduction
Helicobacter pylori is a gram-negative microaerophilic bacterium that persistently colonizes the epithelial lining of the stomach, and it is estimated that more than half of the world population is chronically infected (Hooi et al., 2017). H. pylori has been associated with numerous pathophysiological conditions including chronic gastritis, peptic ulcer, gastric cancer, and gastric mucosa-associated lymphoid tissue lymphoma (Suerbaum and Michetti, 2002). Annual stomach cancer rates are high: there were more than 1.22 million incident cases and 865,000 deaths in 2017, which contributed to 19.1 million disability-adjusted life-years (Etemadi et al., 2020). H. pylori infection is the most important established risk factor for stomach cancer (Chiang et al., 2021; Choi et al., 2020; Malfertheiner et al., 2017). The more virulent cagA-positive H. pylori strains also lead to persistent inflammatory stimulation and have been associated with ischemic heart disease (Pasceri et al., 1998).
Given the high burden of H. pylori, especially in developing countries (Etemadi et al., 2020; Hooi et al., 2017), elimination of H. pylori infection has been suggested and attempted (Chiang et al., 2021; Choi et al., 2020). H. pylori eradication quadruple therapy usually includes the heavy metal bismuth, a proton pump inhibitor such as lansoprazole, and at least two antibiotics among amoxicillin, clarithromycin, furazolidone, levofloxacin, metronidazole, and tetracycline (Hu et al., 2017). However, as H. pylori resistance to many of these antibiotics is already at alarming levels worldwide (Savoldi et al., 2018), the increasing prevalence of antibiotic resistance in H. pylori has been recognized by the WHO as a "high priority" with an urgent need for new therapies (Tacconelli et al., 2018).
There are attempts at finding novel and effective therapeutic regimens for H. pylori (Hu et al., 2017), including antibiotics susceptibility-guided treatment (Gingold-Belfer et al., 2021).
However, as H. pylori seems to be resistant to multiple antibiotics such as clarithromycin, levofloxacin, metronidazole, and tetracycline at high concentration (Megraud et al., 2021; Nestegard et al., 2022; Shu et al., 2022), there is a need to understand antibiotic resistance in H. pylori from different perspectives. Two areas stand out: the first is the relationship between virulence and antibiotic resistance genes within H. pylori, and the second is the extent of virulence and antibiotic resistance genes in H. pylori relative to other campylobacteria.
Recent studies are revealing some interesting relationships between virulence and antibiotic resistance. For instance, Brennan et al. (2018) found that less virulent (cagA-negative and vacA S2-containing) strains of H. pylori are associated with primary clarithromycin resistance, and da Silva Benigno et al. (2022) concluded that virulent H. pylori strains (cagA and cagE positive) may be more susceptible to clarithromycin treatment. However, Hosseini et al. (2021) found that isolates with virulence genes oipA, vacA, and iceA1 were resistant to clarithromycin, while Liu et al. (2022) found no relationship between the presence of virulence factors cagA and vacA, and resistance to clarithromycin, levofloxacin, and metronidazole.
The comparison of H. pylori with other campylobacteria is also relevant and useful for multiple reasons. Many campylobacterial members are zoonotic and, similar to H. pylori, cause gastroenteritis. Further, apart from being associated with human extragastrointestinal infections, Campylobacter spp. are the leading cause of bacterial foodborne and waterborne infections (Igwaran and Okoh, 2019). In fact, Campylobacter infection is widespread, and the incidence and prevalence of campylobacteriosis have been increasing (Kaakoush et al., 2015).
For example, Campylobacter concisus gastritis was often misattributed to H. pylori on gastric biopsy (Ferreira et al., 2022). Campylobacter spp. infection and antibiotic resistance are widespread, especially in sub-Saharan Africa (Hlashwayo et al., 2021). Fluoroquinolone-resistant Campylobacter spp. are recognized by the WHO as "high priority" (Tacconelli et al., 2018), and campylobacteriosis was the most reported zoonosis in the European Union in 2020 (EFSA, 2022).
While knowledge of the relationship between virulence and antibiotic resistance genes within H. pylori is sparse and uncertain (Brennan et al., 2018; da Silva Benigno et al., 2022; Hosseini et al., 2021; Liu et al., 2022; Wang et al., 2019), the extent of virulence and antibiotic resistance genes in H. pylori relative to other campylobacteria is largely unexplored.
In this study, we sought to investigate the relationship between virulence and antibiotic resistance genes among all 354 available complete genomes of H. pylori, and to gain a broader perspective by comparing them with 90 species of other campylobacteria. We used the comprehensive antibiotic resistance database (CARD) (Alcock et al., 2019) and the virulence factor database (VFDB) (Liu et al., 2019), two of the most inclusive and widely used resources for the genome-wide identification of antibiotic resistance and virulence genes, respectively. We reveal interesting patterns of virulence and antibiotic resistance genes among campylobacteria and H. pylori, and discuss their relevance in shaping the scope of antibiotic resistance.

Genome sequence acquisition
The complete genome reference sequences (RefSeqs) for all campylobacterial species were downloaded from the NCBI website (https://www.ncbi.nlm.nih.gov/, last accessed on Dec 15, 2022). There were 91 species (from 15 genera) of campylobacteria with a RefSeq. Further, all 353 available complete genome sequences (excluding the RefSeq) for H. pylori were also downloaded. For most of the other campylobacterial species, apart from the RefSeq, the NCBI database did not contain additional complete genome sequences. Strict filter criteria were used to ensure quality: only annotated genomes with an assembly level of "complete" were used.
They were also confirmed to have a single circular chromosome with a size matching the average genome size of the species. Genome sequences were downloaded in ".fna" file format.
Based on the coordinate information available in the corresponding ".gtf" file, the 16S rRNA sequences were extracted from the above genomes and used for phylogenetic analysis. All sequence-related metadata, including the geographical origins of the isolates, were also collected from the NCBI. Basic information about the genomes, such as length, GC content (%), and geographical origin of the isolates, is given in Table S1.
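The coordinate-based extraction of 16S rRNA sequences can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names, the toy genome, the choice of "exon" as the feature type, and the assumption that 16S rRNA features carry the text "16S ribosomal RNA" in the attribute column are all hypothetical and may need adjusting for a real annotation file.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def extract_16s(genome_seq, gtf_lines, marker="16S ribosomal RNA", feature="exon"):
    """Collect sequences of GTF features of type `feature` whose attribute
    column mentions `marker`.

    GTF coordinates are 1-based and inclusive; minus-strand features are
    reverse-complemented. `marker` and `feature` are assumed defaults.
    """
    sequences = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        if fields[2] != feature or marker not in fields[8]:
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        sub = genome_seq[start - 1:end]
        sequences.append(reverse_complement(sub) if strand == "-" else sub)
    return sequences


# Toy data (hypothetical): a 12-bp "genome" and four annotation records.
genome = "AAACCCGGGTTT"
attr_16s = 'gene_id "r1"; product "16S ribosomal RNA";'
gtf = [
    "\t".join(["chr", "RefSeq", "transcript", "4", "9", ".", "+", ".", attr_16s]),
    "\t".join(["chr", "RefSeq", "exon", "4", "9", ".", "+", ".", attr_16s]),
    "\t".join(["chr", "RefSeq", "exon", "1", "6", ".", "-", ".", attr_16s]),
    "\t".join(["chr", "RefSeq", "CDS", "1", "3", ".", "+", "0",
               'gene_id "p1"; product "hypothetical protein";']),
]
sixteen_s = extract_16s(genome, gtf)
```

Filtering on a single feature type avoids collecting the same locus twice when an annotation lists it on several record types.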
The gene sequences and their mutations, if any, are known, and clear experimental evidence exists (Alcock et al., 2019). As a curated data resource, the CARD has 4336 antibiotic resistance ontology (ARO) terms for 2923 known antimicrobial resistance (AMR) determinants/genes and 1304 resistance variant mutations (Alcock et al., 2019). The web interface of CARD (https://card.mcmaster.ca/analyze/rgi) identifies putative antibiotic resistance genes from experimentally confirmed AMR gene models based on multiple approaches such as BLAST, sequence alignment, regular expressions (RegEx), hidden Markov models (HMMs), and/or position-specific SNPs (Alcock et al., 2019). We used the web interface with RGI version 6.0.1 and CARD 3.2.7. Further, we used complete genomic DNA sequences of high quality/coverage, and predictions of partial genes were excluded. Each campylobacterial genome sequence was submitted to CARD's resistance gene identifier (RGI) tool to obtain annotations based on the perfect, strict, or loose paradigm and complete gene match criteria for the identification of antibiotic resistance genes (Rao et al., 2023; Zhang et al., 2022). For each genome, the CARD outputs a list of ARO terms; the length of this list gives the total number of antibiotic resistance genes.
Duplicate (two or more) entries, if any, of the ARO term indicate multiple copies of an antibiotic resistance gene (Rao et al., 2023).
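The distinction between total and unique antibiotic resistance genes can be illustrated with a short sketch. It assumes the ARO terms for one genome are already available as a plain list of strings; the function name and the example hits are hypothetical and do not reflect the actual RGI output format.

```python
from collections import Counter


def summarize_aro_terms(aro_terms):
    """Summarize a per-genome list of ARO terms.

    Repeated terms are interpreted as multiple copies of the same
    antibiotic resistance gene.
    """
    counts = Counter(aro_terms)
    return {
        "total": sum(counts.values()),   # all entries, duplicates included
        "unique": len(counts),           # distinct ARO terms
        "multi_copy": {term: n for term, n in counts.items() if n > 1},
    }


# Hypothetical list of hits for a single genome.
hits = ["rpoB", "gyrA", "rpoB", "tetA", "gyrA", "rpoB"]
summary = summarize_aro_terms(hits)
```

Here the total count is six entries and the unique count is three terms, with two of the terms present in more than one copy.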

Phylogenetic analysis
To understand virulence and antibiotic resistance genes among campylobacteria from a phylogenetic perspective, complete 16S rRNA sequences from 91 species of campylobacteria were used to construct a phylogenetic tree in MEGA11 using the maximum likelihood method (Tamura et al., 2021). A thousand bootstrap iterations were performed. Nautilia profundicola was taken as the outgroup.

Data analyses
The total and unique numbers of genes were counted for individual species, and the extent of overlap of genes among different species/clades was represented using a Venn diagram (Rao et al., 2023). The R function ggvenn() from the ggvenn package was used to make the Venn diagram. Scatter plots were used to visualize the relationship between two sets of data, for example, the number of virulence genes versus the number of antibiotic resistance genes. To account for the different genome sizes, the number of genes was normalized by dividing by the individual genome size (in bp) and multiplying by the average genome size. Spearman's rank correlation, which is less sensitive to outliers, was used to examine the relationship between the numbers of antibiotic resistance genes and virulence genes. The significance of the correlation coefficient was tested using cor.test (which is based on the t-distribution or an approximation) in R. As the extent of antibiotic resistance is also associated with the differential presence of virulence genes, in particular cytotoxin-associated genes (cags) (Brennan et al., 2018; da Silva Benigno et al., 2022; Hosseini et al., 2021; Liu et al., 2022), genomes were grouped based on cags into three categories to further understand their interplay: "all cags" when all cags were present, "cag2-" when only cag2 was missing, and "others" when most of the cags including cag2 and cagA were missing.
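The genome-size normalization and the rank correlation described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the analysis code of the study, which used R: the function names and toy numbers are hypothetical, and the t statistic shown is one common large-sample approximation for testing a correlation coefficient, not necessarily the exact computation performed by cor.test.

```python
import math


def normalize_counts(gene_counts, genome_sizes):
    """Scale each gene count to the average genome size:
    count / genome_size * mean(genome_sizes)."""
    mean_size = sum(genome_sizes) / len(genome_sizes)
    return [c / s * mean_size for c, s in zip(gene_counts, genome_sizes)]


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def t_statistic(rho, n):
    """Approximate t statistic (n - 2 degrees of freedom) for a correlation."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


# Toy data (hypothetical): gene counts and genome sizes in bp.
normalized = normalize_counts([10, 20, 30], [1_000_000, 2_000_000, 3_000_000])
rho = spearman_rho([1, 2, 3, 4, 5], [5, 3, 4, 1, 2])
t_value = t_statistic(rho, 5)
```

In the toy normalization all three genomes carry the same gene density, so their normalized counts coincide even though the raw counts differ threefold.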
An unpaired t-test (two-tailed, unequal variance) was used to compare the means (for instance, the average number of genes) between two groups (Rao et al., 2023). In addition, a chi-squared test was used (as chisq.test() in R with the parameter simulate.p.value = TRUE) for the analysis of contingency tables, for example, to check whether the numbers of genes/genomes in different categories differed significantly (Agresti, 2018). A Bonferroni correction was applied to account for multiple comparisons. The extent of overlap of genes between two sets was quantified using the overlap coefficient, which is defined as the size of the intersection of two sets divided by the size of the smaller of the two sets (Vijaymeena and Kavitha, 2016). Finally, to see if virulence and antibiotic resistance genes were proteomically identified, we searched the literature for proteomic studies using PubMed and Google Scholar with keywords such as "Helicobacter pylori" AND "proteomics" (Karlsson et al., 2016). The overlaps between different sets were depicted using a Venn diagram. Routine data handling/analysis was done in Python and Microsoft Excel.
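The overlap coefficient and the Bonferroni correction defined above are simple enough to state as code. The sketch below is illustrative only; the function names, gene sets, and p-values are hypothetical and are not taken from the study data.

```python
def overlap_coefficient(set_a, set_b):
    """Size of the intersection divided by the size of the smaller set."""
    a, b = set(set_a), set(set_b)
    if not a or not b:
        raise ValueError("both sets must be non-empty")
    return len(a & b) / min(len(a), len(b))


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of
    comparisons and capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical gene sets for two groups of genomes.
group_a = {"cagA", "vacA", "oipA", "iceA1"}
group_b = {"cagA", "vacA", "flaA"}
coefficient = overlap_coefficient(group_a, group_b)

# Hypothetical raw p-values from three comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
```

Because the denominator is the smaller set, the coefficient reaches 1 whenever one set is fully contained in the other, which suits comparisons between gene repertoires of very different sizes.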

Number of antibiotic resistance and virulence genes in H. pylori and campylobacteria
The H. pylori and other campylobacterial complete genomes were explored using CARD and VFDB for the identification of antibiotic resistance and virulence genes, respectively (Tables S2-S9). There were, on average, 90.9 (SD ±2.0, range 86-95) antibiotic resistance genes and 132.8 (SD ±11.0, range 104-155) virulence genes in the genomes of H. pylori (Table 1). As a few genes were present in multiple copies, after ignoring those duplicate entries, there were 71.4 (SD ±2.2, range 66-77) unique antibiotic resistance genes and 128.9 (SD ±10.9, range 101-145) unique virulence genes. It should be noted that CARD annotates antibiotic resistance genes based on protein homolog and variant models, and on the perfect/strict/loose paradigm. For "loose" hits, the median bit score of 99 (average = 158.5) with a median sequence length of 1074 (average = 1309.7) would translate to an E-value of 7.35E-24 (1.1E-41 for the average). In the case of virulence genes, the majority of the hits had far higher sequence identity and coverage than the assigned cutoffs of ≥60% and ≥40%, respectively. For example, 92.8% of hits had ≥85% sequence identity and ≥70% coverage. The lowest-coverage sequence, at 40.04% with a length of 473 and 69.86% identity, had an E-value of 2.44E-51, whereas the lowest-identity sequence, at 63.47% with a length of 1095 and 54.11% coverage, had an E-value of 1.02E-121. Compared to H. pylori, other species of Helicobacter had a slightly higher number of antibiotic resistance genes (82.5±9.3, 68-98, p = 0.001, t-test), but only half the number of virulence genes (65.5±20.8, 34-109, p = 1.1E-07, t-test). Wolinella, a close relative of Helicobacter (Fig. S1), had just 28 virulence genes, but 115 antibiotic resistance genes with some of them in multiple copies (Table S6). Compared to Helicobacter spp., Campylobacter spp., on average, had a similar number of antibiotic resistance genes (88.2±6.7, 77-106, p = 0.057, t-test), but more virulence genes (86.3±40.4, 9-155, p = 0.029, t-test). Arcobacterial species and Sulfurospirillum spp., on average, had far fewer virulence genes but many more unique antibiotic resistance genes, with many in numerous copies that nearly doubled their total number (Tables 1 and S6).
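The E-values quoted above for the "loose" CARD hits can be related to the bit scores through the standard BLAST relation E = m × n × 2^(-S), where S is the bit score, m the query length, and n the size of the search space. The sketch below is illustrative only. The text does not state the search-space size; n = 4336 is an assumption made here because it reproduces the reported values to within rounding (it coincides with the number of ARO terms cited for CARD), and the function name is hypothetical.

```python
def evalue_from_bitscore(bit_score, query_len, search_space):
    """Expected number of chance hits for a given bit score:
    E = m * n * 2**(-S), with m = query_len and n = search_space."""
    return query_len * search_space * 2.0 ** (-bit_score)


# ASSUMPTION: search-space size is not given in the text; 4336 is chosen
# only because it reproduces the E-values reported for the loose hits.
ASSUMED_SEARCH_SPACE = 4336

e_median = evalue_from_bitscore(99, 1074, ASSUMED_SEARCH_SPACE)
e_average = evalue_from_bitscore(158.5, 1309.7, ASSUMED_SEARCH_SPACE)

print(f"median loose hit:  E = {e_median:.2e}")
print(f"average loose hit: E = {e_average:.2e}")
```

Under this assumption, even the weakest category of hits corresponds to E-values many orders of magnitude below conventional cutoffs, which is the point the text makes.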

Relationship between antibiotic resistance and virulence genes
There was a negative correlation (ρ = -0.64, p = 1.3E-11, t-test for correlation) between the total number of antibiotic resistance genes and the total number of virulence genes among the 91 species of campylobacteria (Fig. 2A). Further, species from different clades/genera formed distinct clusters based on the numbers of antibiotic resistance and virulence genes. Individually, the numbers of virulence genes were negatively correlated (ρ = -0.68, p = 1.3E-13, t-test for correlation) with genome size, whereas the numbers of antibiotic resistance genes were strongly positively correlated (ρ = 0.91, p = 2.2E-16, t-test for correlation; Fig. 2B and 2C).
Except for the Campylobacter clade, the overall patterns remained similar after adjusting for genome size.

Antibiotic resistance versus virulence genes in H. pylori
There was a weak negative correlation (ρ = -0.15, p = 3.96E-3, t-test for correlation) between the numbers of antibiotic resistance genes and virulence genes in H. pylori (Fig. 3A). However, genomes could be grouped based on the presence of cags: in 133 (37.6%) genomes all cags were present, in 147 (41.5%) genomes only cag2 was absent, whereas in the remaining 74 (20.9%) genomes many or all cags were missing. Interestingly, the cag2-negative H. pylori genomes were almost all exclusive to Asia and South America (Fig. 3B). Further, apart from all cags, a few more virulence genes such as htpB, groEL, etc. were also significantly different (p < 0.05, chi-squared test with Bonferroni correction) in one of the groups based on cags (Fig. S3).
For instance, different variants of rpoB which impart resistance to rifampicin were present in different groups based on cags (Fig. 3C and Table S10).
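The three-way grouping of genomes by cags can be sketched as follows. The gene names in the toy panel, the function name, and the rule that any genome not fitting the first two categories falls into "others" are assumptions for illustration. The percentages at the end are recomputed from the genome counts reported in the text.

```python
from collections import Counter


def classify_by_cags(present_genes, cag_panel):
    """Assign a genome to "all cags", "cag2-", or "others".

    ASSUMPTION: every genome that is neither complete nor missing only
    cag2 is placed in "others".
    """
    missing = set(cag_panel) - set(present_genes)
    if not missing:
        return "all cags"
    if missing == {"cag2"}:
        return "cag2-"
    return "others"


# Hypothetical mini panel and genomes (not the full set of cags).
cag_panel = ["cag1", "cag2", "cagA", "cagE"]
genomes = {
    "genome_1": ["cag1", "cag2", "cagA", "cagE", "vacA"],
    "genome_2": ["cag1", "cagA", "cagE", "vacA"],
    "genome_3": ["vacA"],
}
groups = Counter(classify_by_cags(g, cag_panel) for g in genomes.values())

# Group sizes as reported in the text, converted to percentages.
reported = {"all cags": 133, "cag2-": 147, "others": 74}
total = sum(reported.values())
percentages = {k: round(100 * v / total, 1) for k, v in reported.items()}
```

The recomputed shares agree with the percentages given in the text for the 354 genomes.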
H. pylori and C. jejuni in particular seem to have the highest numbers of virulence genes among the 91 species of campylobacteria. Further, along with the numerous unique virulence genes, cags seemed to be exclusive to H. pylori. This hints at the high virulence and global prevalence of H. pylori and C. jejuni (EFSA, 2022; Hooi et al., 2017; Kaakoush et al., 2015; Tacconelli et al., 2018). The others, more specifically the arcobacterial members, showed low numbers of virulence genes. Most members of this clade are free living in a wide range of habitats, although a few are considered emergent enteropathogens and/or potential zoonotic agents (Collado et al., 2011; Pérez-Cataluña et al., 2018). On the contrary, while H. pylori had one of the lowest numbers of antibiotic resistance genes, arcobacterial members showed a wide range, with many containing several hundred. It may be noted that the members of Arcobacteraceae are very diverse and their hosts/habitats include aquatic animals and plankton, cyanobacterial mats, sludge and marine sediments, estuarine and river water, etc. (Pérez-Cataluña et al., 2018). It is believed that free-living bacteria acquire resistance factors to counter environmental challenges including antibiotic pollution and biotic hostility such as cyanobacterial blooms (Larsson et al., 2022; Zhang et al., 2020).
It is well known that H. pylori contains as many as 32 cags that encode a bacterial type IV secretion system, and it was observed that cags were present in approximately 60-70% of Western H. pylori strains and virtually 100% of East-Asian H. pylori strains (Noto et al., 2012).
As a pro-oncogenic protein, CagA requires tyrosine phosphorylation at the EPIYA motif, and structural polymorphism of CagA influences its scaffold function, which was thought to underlie the geographic difference in the incidence of gastric cancer (Hatakeyama et al., 2017). We found that the majority of the isolates from Asia and South America were cag2-negative. The hp0521 (cag2) gene encodes the Cag2 protein. Xu et al. (2022) found that deletion of the cag2 gene had no effect on bacterial growth but increased cagA mRNA expression and CagA protein, thereby likely affecting the virulence of H. pylori. This may partly explain why most Asian strains of H. pylori are more virulent (Hatakeyama et al., 2017; Yuan et al., 2017).
The interplay between virulence and antibiotic resistance is documented. For example, Pseudomonas aeruginosa strains with oprD mutations that impart carbapenem resistance were also more virulent (Geisinger and Isberg, 2017). H. pylori strains with the virulence genes oipA, vacA, and iceA1 were resistant to clarithromycin (Hosseini et al., 2021). On the contrary, there might also be a trade-off between the two: less virulent (cagA-negative) strains of H. pylori were found to be clarithromycin resistant (Brennan et al., 2018) and vice versa (da Silva Benigno et al., 2022). The differential presence of antibiotic resistance genes among cag2-positive and cag2-negative strains observed in this study further hints at such an interplay and trade-off.
Resistance-conferring variants of genes, for example for erythromycin (ermB), quinolone (gyrA and gyrB), and tetracycline (tetA and tetB), are also known in Campylobacter species (Igwaran and Okoh, 2019). Bacteria may also contain inherent genes such as AbaF, a well-known efflux pump, that might impart antibiotic resistance but remain functionally silent until sufficiently challenged with selection pressure (Abdi et al., 2020; Nikaido, 2009). Perturbations under antibiotic exposure, such as mutations leading to increased expression of efflux pumps, might result in antibiotic resistance (Nikaido, 2009; Salini et al., 2022). Antibiotic resistance can also be acquired by spontaneous mutations or through horizontal gene transfer (HGT) via plasmids (van Hoek et al., 2011). The "silent reservoir" of antibiotic resistance genes can lead to the emergence of multidrug-resistant "superbugs" through HGT (Kent et al., 2020). Numerous arcobacterial members, close relatives of H. pylori, contain such a silent reservoir of hundreds of antibiotic resistance genes. While HGT from H. pylori to C. jejuni is known (Oyarzabal et al., 2007), H. pylori has a composite system for DNA uptake and a natural transformation ability with which it could possibly acquire additional antibiotic resistance (Stingl et al., 2010).
This study has some strengths and limitations. We ignored the more numerous partial/incomplete genome sequences to avoid incomplete patterns, but this reduced the number of samples available for analysis. Being external tools, CARD and VFDB have their own limitations in the computational identification of target genes (Alcock et al., 2019; Liu et al., 2019), and these might have indirectly affected our analyses and results. For example, it should be noted that all the genes mentioned in this paper are listed as either virulence or antibiotic resistance genes in the respective databases based on evidence from the existing scientific literature. Finally, this being purely a bioinformatics work, like others (Rao et al., 2023; Her et al., 2021), no attempts were made at experimental validation or functional characterization of these genes. However, we attempted to put the results in a phenotypic/functional context.
In conclusion, we showed that the H. pylori genome, on average, contained 129 virulence genes, the highest among Helicobacter spp., and 71 antibiotic resistance genes, one of the lowest among campylobacteria. While Campylobacter spp. showed a large difference in the number of virulence genes, with C. jejuni containing the highest at 155, arcobacterial members showed a large difference in the number of antibiotic resistance genes, which was positively related to genome size. The cags were exclusive to H. pylori, and isolates from Asia and South America were mostly cag2-negative, with many antibiotic resistance genes having isolate-specific patterns of occurrence. The large repository of antibiotic resistance genes along with the unique set of virulence genes might change the course of virulence and antibiotic resistance of H. pylori under selection pressure and in turn might pose a challenge to global public health.

Figure legends
Fig. 1. Venn diagrams showing the overlap in the presence of (A) antibiotic resistance genes and (B) virulence genes in campylobacterial clades. (A) Eighty (12.8%) antibiotic resistance genes were common to all clades, while only 16 (2.6%) were unique to H. pylori. (B) Just 12 (2.6%) virulence genes were common to all clades, whereas 43 (9.4%, primarily cags) were unique to H. pylori. Note: H. pylori was not included in the Helicobacteraceae set.

Fig. 2. (A) Overall, the total numbers of antibiotic resistance genes were negatively correlated (ρ = -0.64) with the total numbers of virulence genes in campylobacteria. Species from different clades/genera form distinct clusters based on the number of virulence and antibiotic resistance genes. Further, (B) the numbers of virulence genes were negatively correlated (ρ = -0.68) with genome size, whereas (C) the numbers of antibiotic resistance genes were positively correlated (ρ = 0.91).

Fig. 3. Antibiotic resistance versus virulence genes in H. pylori. (A) Correlation between the numbers of antibiotic resistance and virulence genes was low (ρ = -0.15). (B) The cag2-negative H. pylori genomes were almost exclusive to Asia and South America. (C) Heat map showing the relative proportion of genes in three groups: at least 25 antibiotic resistance genes were preferentially present (p < 0.05, chi-squared test with Bonferroni correction) in one of the groups based on cags. The bar graph at the right shows the number of genomes.

Table 1. Summary of virulence and antibiotic resistance genes in campylobacteria.