Genome-resolved insight into the reservoir of antibiotic resistance genes in aquatic microbial community

Aquatic microbial communities are an important reservoir of antibiotic resistance genes (ARGs). However, distribution and diversity of different ARG categories in environmental microbes with different ecological strategies is not yet well studied. Despite the potential exposure of the southern part of the Caspian Sea to the release of antibiotics, little is known about its natural resistome profile. We used a combination of Hidden Markov model (HMM), homology alignment and a deep learning approach for comprehensive screening of the diversity and distribution of ARGs in the Caspian Sea metagenomes at genome resolution. Detected ARGs were classified into five antibiotic resistance categories including prevention of access to target (44%), modification/protection of targets (30%), direct modification of antibiotics (22%), stress resistance (3%), and metal resistance (1%). The 102 detected ARG containing metagenome-assembled genomes of the Caspian Sea were dominated by representatives of Acidimicrobiia, Gammaproteobacteria, and Actinobacteria classes. Comparative analysis revealed that the highly abundant, oligotrophic, and genome streamlined representatives of taxa Acidimicrobiia and Actinobacteria modify the antibiotic target via mutation to develop antibiotic resistance rather than carrying extra resistance genes. Our results help with understanding how the encoded resistance categories of each genome are aligned with its ecological strategies.

Antibiotic resistance gene profile of the Caspian Sea Bacteria. Using six different tools we initially detected in total 259 potential ARGs in 110 MAGs. All predicted genes were further manually checked for conserved domains to confirm the functional predictions. For detected genes that confer resistance to antibiotics due to mutations we manually checked the alignments and report them as potential resistance genes only when they contained the exact mutation as those reported to cause resistance. A total of 82 genes conferring antibiotic resistance due to mutation were initially detected. Multiple sequence alignment together with reference genes confirmed mutation in 56 genes, two parC gene, three murA genes, 31 rpsL genes and 20 rpoB genes (multiple sequence alignments and mutations conferring antibiotic resistance are shown in the Supplementary Figure S2). There are ongoing debates regarding the relevance of detected mutations in annotated genes to exhibiting resistant phenotype in the organism [33][34][35] . While our additional alignment results confirm the presence of exact mutations for antibiotic resistance via these genes, experimental tests are the ultimate confirmation of the resistant phenotype.
After this curation step, annotations of 33, 95, and 105 predicted genes was confirmed as putative ARGs in respectively 15, 40, and 150 m depth (Fig. 1d). These ARGs were distributed in 102 bacterial genomes (Fig. 1a). Confirmed ARGs identified via each screening tool are detailed in the Supplementary Table S1. Antibiotic resistant bacteria (ARB) were more diverse in 150 m (13 different classes) and 40 m (12 different classes) depths as compared to the 15 m (6 different classes) depth. In general, a higher phylogenetic diversity was detected in the deeper strata of the Caspian Sea 36 .
The Caspian ARGs were classified into 5 different antibiotic resistance categories according to their annotated functions: (I) prevention of access to target, (II) modification/protection of targets, (III) direct modification of antibiotics, (IV) stress resistance, and (V) metal resistance (stats of these categories and their subcategories are shown in the Table 1). The most frequently detected category (44%) was prevention of access to target (due to prevalence of antibiotic efflux pumps) followed by, modification and protection of targets (30%) (Fig. 2) and direct modification of antibiotics (22%). The rest of identified ARGs were classified in two categories of stress (6 genes or 3%) and metal (3 genes or 1%) resistance. Categories (I), (II) and (III) ARGs were less prevalent in 15 m depth metagenome of the Caspian Sea (Fig. 1e).
We detected six cyclic AMP (cAMP) receptor protein (CRP) genes in the stress resistance category. All of these six stress resistance genes belong to the class Gammaproteobacteria (Fig. 1f)  www.nature.com/scientificreports/ in bacteria through its regulatory role in multiple cellular pathways, such as anti-oxidation and DNA repair pathways. Stress responses play an important role in integron rearrangements, facilitating the antibiotic resistance acquisition and development, and ultimately the emergence of multidrug-resistant bacteria. So, understanding the evolution of bacterial stress responses is critical, since they have a major impact on the evolution of genome plasticity and antibiotic resistance 37 . A recent study demonstrates the potential contribution of metal resistance genes and plasmidome to the stabilization and persistence of the antibiotic resistome in aquatic environments 38 . We identified three ferritin genes classified in the metal resistance category, in the Caspian Sea MAGs affiliated to genus Mycolicibacterium. Ferritin (bfr) is an iron storage protein involved in protection of cells against oxidative stress (iron-mediated oxidative toxicity) and iron overload 39,40 .
The most frequent subcategory detected in the Caspian Sea MAGs was RND efflux pump (50 ARGs), β-lactamase (40 ARGs), and mutation in rpsL gene (31 ARGs), respectively (Table 1). In category Prevention of access to target, Caspian ARGs are classified into different types of efflux pumps and their regulatory sequences (Fig. 2a,b). Besides, all resistances caused by mutational changes are in category modification/protection of targets (Fig. 2c,d). In the category of direct modification of antibiotics, Caspian ARGs belong to β-lactamases and some transferases (Fig. 2e,f). Many ARGs provide resistant to several classes of antibiotics in bacteria thus majority of the Caspian Sea ARGs belong to the multidrug antibiotic class (Fig. 3). The term "multidrug" here refers to different classes of antibiotics that a certain ARG can offer resistance against. For instant, genes related to efflux pumps can offer resistance to a number of antibiotic classes and hence here they were considered as a part of the multidrug class in Fig. 3. Genes encoding multidrug efflux pumps are evolutionarily ancient elements and are highly conserved 11 . The frequency of the efflux mediated antibiotic resistance in other environments 23,28 supported that efflux pumps have other physiologically relevant roles such as detoxification of intracellular metabolites, stress response and cell homeostasis in the natural ecosystems 11 . The second antibiotic class that the Caspian Sea ARGs provide resistance to is the β-lactams class. many soil bacteria have been isolated that can grow on β-lactam antibiotics as the sole source of carbon 41,42 . The abundance of β-lactamases in the Caspian Sea MAGs could also be related to other ecological roles of β-lactam.   Figure S1). Moreover, 64% of the Caspian Sea ARG containing MAGs had a single ARG, 18% had two ARGs, and rest of them (18%) had multiple ARGs. Multidrug resistant microbes are defined as those with resistance to three or more classes of antibiotics 43 . Multidrug resistant bacteria constitute 18% of Caspian Sea ARG containing MAGs dominated by representatives of Proteobacteria. (Supplementary Figure S7). The casp40-mb.75 and casp150-mb.119 MAGs contained ARGs belonging to four different groups of resistance genes (Supplementary Figure S4). Both MAGs contain an ARG in the metal resistance group (and no stress resistance gene). These MAGs are taxonomically affiliated to the genus Mycolicibacterium and show a higher abundance at 40 and 150 m depths (ca. 1300 TPM) (Supplementary Figure S5). The genus Mycolicibacterium comprise a wide range of environmental and pathogenic bacteria that are potential hosts of ARGs and MGEs. This may contribute to their diversity and evolution or even to their success as opportunistic pathogens 44 . Studies conducted in Japan suggest that livestock could acquire Mycolicibacterium peregrinum from their environment 45 . Presence of Mycolicibacterium representatives containing a set of ARGs in the natural environment could be a reservoir of genes for potential development of resistance in pathogenic groups.
Two MAGs affiliated to Pseudomonadales order (casp40-mb.215 and casp150-mb.169) contain ARGs belonging to categories I, II and III (Supplementary Figure S4). Among 22 ARG containing MAGs affiliated to Gammaproteobacteria, 14 MAGs belonged to Pseudomonadales order with estimated genome sizes in the range of 2.1 to 5.4 Mbp. Among all ARG containing MAGs, the casp40-mb.215 (n = 21 ARGs) and casp150-mb.169 (n = 18 ARGs) affiliated with Acinetobacter venetianus had the highest number of ARGs (Supplementary Figure S6). Representatives of genus Acinetobacter are commonly found in soil and water 46 . This genus contains Acinetobacter baumannii that is a pathogen with known antibiotic resistance complications for infection treatment 47 .
Representatives of the Acidimicrobiia class are ubiquitous aquatic microbes with high relative abundances in the brackish Caspian Sea 36 . These MAGs have the estimated genome size in the range of 1.3 to 2.9 Mbp and their ARGs belong to the category II and are mainly caused by mutations ( Fig. 4 and Supplementary Figure S3). In addition to the Acidimicrobiia class, there is a high frequency of antibiotic resistance mechanisms based on target modification and protection detected in the Actinobacteria affiliated MAGs (Fig. 4). For streamlined members of this taxon that are highly abundant in the ecosystem and have adapted to the oligotrophic environments, it www.nature.com/scientificreports/ could potentially be advantageous to modify the antibiotics target to develop antibiotic resistance so they can avoid the cost of carrying a new gene for developing resistant phenotype. 12 MAGs affiliated to Nanopelagicales order in Actinobacteria class contain 20 ARGs. All these ARGs are in category II and subcategories mutation in rpoB and rpsL genes (12 rpoB gene and 8 rpsL gene). Unlike other Actinobacteria, members of the order Nanopelagicales, family Nanopelagicaceae and AcAMD-5, have a low G + C% content (38% to 47%) in their genome and have streamlined genomes in the range of 1.3 to 1.6 Mbp. Members of this order are present in freshwater and brackish environments such as the Caspian Sea in high abundances making up more than 30% of the microbial community in the surface layer of freshwater ecosystems 48 . According to streamlining theory, these organisms remove unnecessary genes from their genomes, thereby lowering the cellular metabolic costs 49 .
In line with this strategy, the use of antibiotic resistance mechanisms based on modification or protection of the target, especially based on mutations in the antibiotic target, seems to be one of the best options to achieve antibiotic resistance in members of such lineages (Supplementary Figure S3). Although this order does not contain a known pathogenic representative, their ubiquitously high abundance in the ecosystem could offer a new perspective on the ecological role of antibiotic resistance genes. The family S36-B12 from order Nanopelagicales that is one of the high G + C% content (about 60%) groups with the estimated genome size in the range of 2-3 Mbp, also developed their resistance through mutations. While we attribute the resistant streamlined genomes to the target modification and protection mechanisms (especially based on mutations), this may be a feature of class Actinobacteria. Among all ARG containing MAGs, class Acidimicrobiia affiliated MAGs show the highest abundance in three depths of Caspian Sea followed by class Actinobacteria (Supplementary Figure S5). The MAG of casp15mb.93 and casp15-mb.71 are among the most abundant bacteria with detected ARGs in 40 and 150 m depth metagenomes. These MAGs belong to order Microtrichales and their ARGs were classified in category II having mutations in the rpsL genes. While these MAGs were reconstructed from the 15 m depth metagenomes, they show a higher abundance at the lower strata (Supplementary Figure S5).
Prior culture based studies on the Zarjoub 50 and Gowharrood 51 rivers that are entering the Caspian Sea basin report antibiotic resistant coliform bacteria. These studies do not report the antibiotic concentrations of the www.nature.com/scientificreports/ natural environment but claim that presence of antibiotic resistant bacteria is due to uncontrolled discharge of agricultural and livestock effluents upstream of the river and the entry of municipal and hospital wastewater into these two rivers and later the Caspian Sea. Hence, it is important to understand the accurate resistome profile of this natural ecosystem as a step toward sustaining its ecosystem services. The Caspian Sea ARG containing MAGs are dominated by representatives of Acidimicrobiia, Gammaproteobacteria and Actinobacteria classes. A recent study on the deep-sea water (more than 1000 m deep) suggest that even deep marine environments could be an environmental reservoir for ARGs mainly carried by representatives of Gammaproteobacteria (70%) and Alphaproteobacteria (20%) 28 . The identified ARGs were classified based on the classes of antibiotics they provide resistance to and most abundant identified ARG types respectively included multidrug, peptide and aminoglycoside 28 . Exploring the diversity and abundance of ARGs in global  www.nature.com/scientificreports/ ocean metagenomes using machine-learning approach (DeepARG tool) showed that ARGs conferring resistance to tetracycline are the most widespread followed by those providing resistance to multidrug and β-lactams. In the contigs containing ARGs, Alphaproteobacteria was identified as the largest taxonomic unit, followed by Gammaproteobacteria 23 . In the Caspian Sea however, similar to the global ocean 23 most identified ARGs provide resistance to multidrug class followed by β-lactams (Fig. 3). Caspian ARGs conferring resistance to tetracycline were annotated as transporter groups and consequently we classified them into category I and multidrug class. We additionally explored the distribution of ARGs in Caspian viral contigs and six viral contigs identified by virsorter2 contained ARGs however in the follow up manual curations we could not confirm the viral origin of these contigs and removed them from the results.

Phylogenetic analysis of the Caspian Sea β-lactamases.
A total of 40 ARGs classified as β-lactamase genes (bla), were detected in the MAGs of the Caspian Sea and their phylogenetic relations were analyzed (reference sequences and tree file are accessible in Supplementary Data File S1). Beta-lactamases are classified into four molecular classes (A to D classes) based on their amino acid sequences. Class A, C, and D enzymes utilize serine for β-lactam hydrolysis and class B are metalloenzymes that require divalent zinc ions for substrate hydrolysis 52 . Among identified Caspian bla, 9, 22, 2, and 7 are classified in respectively class A, B, C, and D (Supplementary Table S4). As shown in Fig. 5, most of the Caspian bla are metallo-β-lactamases. Metallo-beta-lactamase enzymes pose a particular challenge to drug development due to their structure and diversity 53 . These enzymes escape most of the recently licensed beta-lactamase inhibitors. Acquired metallo-beta-lactamases, which are prevalent in Enterobacterales and Pseudomonas aeruginosa, are usually associated with highly drug-resistant phenotypes and are more dangerous 53 . While the bla containing reference genes included in this phylogeny (collected from the KEGG database) mostly belong to the Gammaproteobacteria, Caspian bla containing genomes represent a higher diverse belonging to six different phyla (in 8 different classes). Some of these bla containing MAGs are affiliated to taxa that do not yet have a representative in culture. Natural ecosystems are known to be important reservoirs of β-lactamase gene homologs, however, exchange of β-lactamases between natural environments and human and bovine fecal microbiomes occurs at low frequencies 54 . Additionally, β-lactams can be used as a source of nutrient after β-lactamase cleavage. The β-lactam catabolism pathway has been detected in diverse Proteobacteria isolates from soil that is generating carbon sources for central metabolism 42,55 .

Conclusions
Antibiotic resistance is a global health challenge and according to One Health approach, attention to environmental antibiotic resistome is critical to combat AMR. Our study shows the distribution of antibiotic resistance genes in the Caspian Sea ecosystem, even though no accurate measurement of antibiotic contamination of the Caspian Sea has been reported so far. Moreover, our findings revealed the mechanism of resistance in streamlined genomes, which is based on target modifications. The increase of antibiotic concentrations in natural ecosystems, as a consequence of human activities, not only influences the Prevalence of antibiotic resistance genes, but also can alter the microbial populations and communities of the Caspian Sea. It can have adverse effects on the carbon and nitrogen cycle balance and hence may cause imbalance in the homeostasis of microbial communities in the Caspian Sea leading to potentially severe consequences for this ecosystem as a whole. However, as bacterial communities are formed by a complex array of evolutionary, ecological and environmental factors, it is difficult to obtain a clear understanding of the evolutionary and ecological consequences of antibiotic resistance in natural environments. The resistome profile and the type of resistance mechanism of the Caspian Sea MAGs provided in this study can be used as a reference database for monitoring the development and spread of antibiotic resistance in the Caspian Sea over time and can also guide future studies. Eventually, Global problems require global solutions and only a concerted and sustained international effort can succeed in dealing with AMR. The sequenced metagenomes were quality checked using bbduk.sh script (https:// sourc eforge. net/ proje cts/ bbmap) and assembled using metaSPAdes 57 . Metagenomic reads were mapped against assembled contigs using bbmap.sh script (https:// sourc eforge. net/ proje cts/ bbmap). Contigs ≥ 2 kb were binned based on differential coverage and composition using Metabat2 58 . Quality of the reconstructed MAGs was assessed using CheckM 59 and bins with completeness ≥ 40% and contamination ≤ 5% were used for further analysis. Taxonomy of these MAGs was assigned using GTDB-tk (v0.3.2) and genome taxonomy database release R89 60 62 . NCBI AMRFinderPlus v3.9.3 (https:// github. com/ ncbi/ amr/ wiki) command line tool and its associated database, The Bacterial Antimicrobial Resistance Reference Gene Database (which contains 4,579 antimicrobial resistance proteins and more than 560 HMMs), were used for screening ARGs. The protein sequences of all reconstructed MAGs were analyzed with parameter "-p" 63 . Additionally, all ARGs present in the MAGs protein sequences were screened using a deep learning approach, DeepARG v1.0.2 command line tool, (https:// bitbu cket. org/ gusph dproj/ deepa rg-ss/ src/ master/) with DeepARG-DB database (-model LS-type nucl-arg-alignmentidentity 60) 64 .

Methods
The nucleotide sequences of the reconstructed MAGs were searched for ARGs using ResFinder 4.1 command line tool (https:// bitbu cket. org/ genom icepi demio logy/ resfi nder/ src/ master/) and its associated database, ResFinder database with parameters "-ifa -acq -l 0.6 -t 0.8" 65 . They were also searched using ARGminer v1.1.1 database 66 and BacMet v2.0 database 67 using sraX v1.5 command line tool (https:// github. com/ lgpde vtools/ srax) www.nature.com/scientificreports/ with parameters "-db ext -s blastx" 68 . These sequences were also searched against ARG-ANNOT v4 database 69 using ABRicate v0.8 command line tool 70 75 . All functional annotation results were compiled and results were compared to obtain a consensus assignment. Then, ARGs were manually curated into 5 antibiotic resistance categories and 21 subcategories based on their functional annotations. The overall workflow of this study is shown in Supplementary Figure S8. Gene alignment. To confirm resistance due to mutation events in the candidate Caspian ARGs, multiple amino acid sequence alignment was carried out Using Clustal-W (default parameters) 76 embedded in MEGA-X software 77 . For each type of the ARGs, reference gene with specific mutations was downloaded from CARD database (Supplementary Table S2 shows the detail of mutations involved in antibiotic resistance).
Beta-lactamase phylogeny. To understand the evolutionary relationship of the recovered β-lactamase enzymes, firstly, 1141 reference protein sequences (beta-Lactamase gene variants) were downloaded from KEGG database, https:// www. genome. jp/ kegg/ annot ation/ br015 53. html and combined with β-lactamases recovered from Caspian MAGs (40 protein sequences). Then, all β-lactamase sequences were subjected to multiple sequence alignment using Clustal-W embedded in MEGA-X (Molecular Evolutionary Genetics Analysis) software 77 . Phylogenetic tree was constructed using the maximum likelihood method, JTT matrix-based model, and 100 bootstrap replications in MEGA-X software. The bootstrap consensus tree inferred from 100 replicates is taken to represent the evolutionary history. This analysis involved 1181 amino acid sequences in total. Taxonomic assignment of MAGs was extended to the of β-lactamases and iTOL v6.3.1 was used to annotate and visualize the final phylogenetic tree 78 .
Identification of viral contigs. Viral contigs were identified in contigs longer than 1 kb using VirSorter2 tool at the score threshold of 0.8 79 . These contigs were further checked manually to ensure the viral origin.

Data availability
The Caspian Sea metagenomes used for this study have been deposited to GenBank by Mehrshad et al. 36 and are accessible via the bioproject PRJNA279271. Genomes containing ARGs were also deposited to GenBank and are accessible under the accession number Bioproject PRJNA279271. All alignments used for manual evaluation of mutations, detailed stats of detected ARGs, and their sequences are accompanying this manuscript as Supplementary data S2. www.nature.com/scientificreports/