Uncovering bacterial hosts of class 1 integrons in an urban coastal aquatic environment with a single-cell fusion-polymerase chain reaction technology

Horizontal gene transfer (HGT) is a key driver of bacterial evolution via transmission of genetic materials across taxa. Class 1 integrons are genetic elements that correlate strongly with anthropogenic pollution and contribute to the spread of antimicrobial resistance (AMR) genes via HGT. Despite their significance to human health, there is a shortage of robust, culture-free surveillance technologies for identifying uncultivated environmental taxa that harbour class 1 integrons. We developed a modified version of epicPCR (emulsion, paired isolation and concatenation polymerase chain reaction) that links class 1 integrons amplified from single bacterial cells to taxonomic markers from the same cells in emulsified aqueous droplets. Using this single-cell genomic approach and Nanopore sequencing, we successfully assigned class 1 integron gene cassette arrays containing mostly AMR genes to their hosts in coastal water samples that were affected by pollution. Our work presents the first application of epicPCR for targeting variable, multi-gene loci of interest. We also identified the Rhizobacter genus as novel hosts of class 1 integrons. These findings establish epicPCR as a powerful tool for linking taxa to class 1 integrons in environmental bacterial communities and offer the potential to direct mitigation efforts towards hotspots of class 1 integron-mediated dissemination of AMR.

Synopsis

We present a novel single-cell genomic surveillance technology for identifying environmental bacterial hosts of a class of mobile genetic elements that are linked to anthropogenic pollution and contribute to the dissemination of antimicrobial resistance.

The emulsified PCR mix for each technical replicate was pooled, vortexed in 900 µL of isobutanol and 200 µL of 5 M NaCl, and centrifuged for 1 min at ~20,000 g with a soft brake.

The aqueous phase at the bottom of the tubes was extracted, purified using the Select magnetic beads to deplete epicPCR products containing cassette-less integrons.

To confirm that fusion between class 1 integron and 16S DNA fragments has occurred (Figure […]

Sanger sequencing of epicPCR products from the Rhizobacter genus

To confirm that class 1 integrons can be found in the Rhizobacter genus, we pooled approximately 150,000 cells from the three biological replicates and performed the epicPCR procedures in three technical replicates with the minor modifications shown in Figure S3B […]

Analysis of Nanopore sequencing data of epicPCR amplicons

All filtering and processing steps were carried out using an in-house pipeline (https://github.com/timghaly/Int1-epicPCR). First, the pipeline quality-filters reads using NanoFilt v2.8.0 [45], removing those with an average read quality of <7 and read length <670 bp, which represents the minimum length of an epicPCR product with a cassette-less integron. […]

Cd-hit v4.8.1 [51] was used to cluster chimeric epicPCR products comprising class 1 integron gene cassette arrays and V4 hypervariable regions of the 16S rRNA gene with ≥99% pairwise identity in nucleotide sequences in at least 3 epicPCR replicates. For epicPCR products that were found in fewer than 3 replicates, the NCBI Genome Workbench was used to perform blastn searches of these sequences against a local database created using all the Nanopore reads obtained in this study. Reads with ≥98% pairwise identity in nucleotide sequences and ≥98% coverage were aligned using the Geneious bioinformatic software (Biomatters, New Zealand), and the consensus sequences were generated from these alignments within each set of epicPCR replicates. Consensus sequences that could be found in at least 3 epicPCR replicates and showed ≥99% pairwise identity in nucleotide sequences and ≥99% identical sites were added to the list of epicPCR products that had already been shortlisted by the Cd-hit algorithm.
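The read-filtering thresholds described above (average read quality of at least 7 and read length of at least 670 bp) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and does not call NanoFilt. The function names are hypothetical, and averaging the quality in error-probability space is an assumption about how a mean Phred score is best computed.

```python
import math

# Thresholds taken from the text; everything else here is illustrative.
MIN_MEAN_QUALITY = 7   # reads with an average quality below 7 are removed
MIN_LENGTH = 670       # reads shorter than 670 bp are removed


def decode_phred(quality_string, offset=33):
    """Convert a FASTQ quality string (Phred+33 assumed) to integer scores."""
    return [ord(char) - offset for char in quality_string]


def mean_quality(phred_scores):
    """Mean Phred score, averaged over error probabilities (an assumption)."""
    if not phred_scores:
        return 0.0
    mean_error = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_error)


def passes_filter(sequence, phred_scores):
    """Keep a read only if it meets both the length and the quality threshold."""
    return (
        len(sequence) >= MIN_LENGTH
        and mean_quality(phred_scores) >= MIN_MEAN_QUALITY
    )
```

Averaging error probabilities rather than raw Phred scores means that a few very low-quality bases pull the mean down sharply, so a read with scores of 20 and 0 is treated as poor even though its arithmetic mean is 10.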
[…] in at least three independent replicates (n≥3, ≥99% pairwise nucleotide identity and ≥99% identical sites relative to the consensus sequence for each epicPCR product). In addition, there were five combinations of class 1 integrons and hosts that were detected in fewer than 3 replicates and were consequently excluded. Taxonomic classifications were obtained using the SILVA ribosomal RNA gene database (Table S1) […] (Table 1). The nucleotide sequences of all the gene cassettes reported in this study are summarised in Table S3. We compared their nucleotide sequences against reference class 1 integron gene cassettes from the NCBI database and found that both the coding and non-coding regions of these gene cassettes were highly conserved relative to the reference sequences [34, 59].
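The replicate-support criterion used here (an integron-host combination is retained only if it is detected in at least three epicPCR replicates) can be sketched as follows. This is a hypothetical illustration, not the authors' code. It assumes that reads have already been assigned to a product cluster and tagged with the replicate they came from, and the identifiers in the usage note are invented.

```python
from collections import defaultdict

MIN_REPLICATES = 3  # threshold stated in the text


def clusters_with_support(observations, min_replicates=MIN_REPLICATES):
    """Return cluster IDs seen in at least `min_replicates` distinct replicates.

    `observations` is an iterable of (cluster_id, replicate_id) pairs; this
    input layout is an assumption made for illustration.
    """
    replicates_by_cluster = defaultdict(set)
    for cluster_id, replicate_id in observations:
        replicates_by_cluster[cluster_id].add(replicate_id)
    return sorted(
        cluster_id
        for cluster_id, replicates in replicates_by_cluster.items()
        if len(replicates) >= min_replicates
    )
```

For example, a cluster observed in replicates `r1`, `r2` and `r3` is retained, whereas a cluster observed twice in `r1` and once in `r2` is excluded, because repeated reads from the same replicate count only once.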

In our dataset, there are six different attC recombination sites with 100% nucleotide sequence identity to those in the respective reference gene cassettes in Table 1. We also aligned all the […] Alphaproteobacteria host has not been reported previously.

As expected, the majority of the class 1 integron gene cassettes we detected are AMR related.

The dihydrofolate reductase encoded by dfrA5 confers high levels of trimethoprim resistance in Gram-negative bacteria [63]. We found an association between the dfrA5 gene cassette and a […] previously. In all currently available NCBI RefSeq genomes for the Rhizobacter genus (n = 11), we detected various non-class 1 integrons but did not find any bioinformatic evidence of class 1 integrons.

We found two Rhizobacter isolates with an identical nucleotide sequence in the 16S rRNA V4 hypervariable region to that of the Rhizobacter host in our dataset (NCBI accession no. […]

A recent study applied a modelling approach to understand the emergence and spread of novel antimicrobial resistance genes [80]. The dispersal of antibiotic resistance genes from environmental bacteria to humans has been highlighted as a major knowledge gap that contributes to uncertainties in these models. Acquiring more experimental data for these gene mobilisation and HGT events can improve future quantitative risk assessment models for antibiotic resistance and our ability to curb transfer of antibiotic resistance genes between bacteria. epicPCR has the potential to contribute to this endeavour given its strengths in identifying environmental microbes that cannot be cultivated under laboratory conditions.