Novel hydrogen- and iron-oxidizing sheath-producing Zetaproteobacteria thrive at the Fåvne deep-sea hydrothermal vent field

Iron-oxidizing Zetaproteobacteria are well known to colonize deep-sea hydrothermal vent fields around the world where iron-rich fluids are discharged into oxic seawater. How inter-field and intra-field differences in geochemistry influence the diversity of Zetaproteobacteria, however, remains largely unknown. Here, we characterize Zetaproteobacteria phylogenomic diversity, metabolic potential, and morphologies of the iron oxides they form, with a focus on the recently discovered Fåvne vent field. Located along the Mohns Ridge in the Arctic, this vent field is a unique study site: its vent fluids contain both iron and hydrogen, and thick iron microbial mats (Fe mats) cover porously venting high-temperature (227-267 °C) black smoker chimneys. Through genome-resolved metagenomics and microscopy, we demonstrate that the Fe mats at Fåvne are dominated by tubular iron oxide sheaths, likely produced by Zetaproteobacteria of the genus Ghiorsea. With these structures, Ghiorsea may provide a surface area for members of other abundant taxa such as Campylobacterota, Gammaproteobacteria and Alphaproteobacteria. Furthermore, Ghiorsea likely oxidizes both iron and hydrogen present in the fluids, with several Ghiorsea populations co-existing in the same niche. Homologues of Zetaproteobacteria Ni,Fe hydrogenases and the iron oxidation gene cyc2 were found in genomes of other community members, suggesting that exchange of these genes could have happened in similar environments. Our study provides new insights into Zetaproteobacteria in hydrothermal vents, their diversity, energy metabolism and niche formation.

Importance
Knowledge of microbial iron oxidation is important for understanding the cycling of iron, carbon, nitrogen, nutrients, and metals. The current study yields important insights into the niche sharing, diversification, and Fe(III) oxyhydroxide morphology of Ghiorsea, an iron- and hydrogen-oxidizing Zetaproteobacteria representative belonging to ZetaOTU9. The study proposes that Ghiorsea exhibits a more extensive morphology of Fe(III) oxyhydroxide than previously observed. Overall, the results increase our knowledge of potential drivers of Zetaproteobacteria diversity in iron microbial mats and can eventually be used to develop strategies for the cultivation of sheath-forming Zetaproteobacteria.

[...] Other lineages frequently observed at vents (57-61) were more abundant in the Fe mat sample than Zetaproteobacteria, including members of Gammaproteobacteria and Campylobacterota (mainly Sulfurovum), which comprise 31% and 30% of the community, respectively (Figure S12). A single Alphaproteobacteria Robiginitomaculum MAG comprised ~2% of total MAG coverage. Interestingly, three high-quality (CheckV) viral genomes (vMAGs) identified in the Fe mat are predicted to have the abundant Fe mat bacteria Sulfurimonas (Campylobacterota) and Gammaproteobacteria as potential hosts (Table S9).

[...] dominating Ghiorsea Fåvne MAGs, Faavne_M6_B18 and AMOR20_M1306, are members of each of these clusters, and were present at 5 and 2%, respectively. Based on ANI estimates, the closest publicly available genome to the highest quality Ghiorsea MAG from Fåvne is a Ghiorsea MAG (64.8% complete) from a cold oxic subseafloor aquifer (66) with an ANI value of 81.3% ( [...]

[...] Figure 10, Figure S11) were all assigned to the genus Ghiorsea, low-temperature diffuse venting at Fåvne supported a higher number of other Zetaproteobacteria taxa (Table S11). Twenty-eight unique species-representative genomes of Zetaproteobacteria were recovered at Fåvne (based on a 95% ANI cutoff and publicly available MAGs), consisting of high- and medium-quality MAGs (average completeness 81.7%, contamination 2.1% based on CheckM2; Table S4). These Fåvne MAGs were associated with 2 families and 7 genera defined by GTDB, with 3 MAGs remaining unclassified at the genus level and most taxa lacking cultured representatives (Figure S9).

[...] (Table S12). A broader functional screening revealed that H2-based metabolism with a Group 1d hydrogenase is common to other dominant MAGs within the Fe mat belonging to the Gammaproteobacteria, Ignavibacteria, Calditrichia, KSB1 and Aquificae (Figure S18, Table S8A).
Ghiorsea and some Gammaproteobacteria in the Fe mat also encode [...]

In contrast, a higher diversity of Zetaproteobacteria is present in low-temperature diffuse-venting areas at ~10 °C (Table S11, Figure S7, Figure S8). All these genomes, except for Ghiorsea, lack uptake hydrogenases (Figure S9). Low-temperature diffuse-venting areas may reflect a low availability of H2 relative to Fe(II), lost by abiotic or other subsurface mixing processes and low-temperature fluid formation (84). Ghiorsea, with its hydrogen uptake capability, emerges as the sole specialist in the presence of H2. While Ghiorsea is also observed in likely H2-poor diffuse-flow environments, the higher diversity of Zetaproteobacteria, reflected by a diversity of Fe(III) oxyhydroxide structures (Figure S5, Figure S6), indicates an absence of a monopolizing niche player. This pattern of distribution supports the hypothesis that H2 acts as a niche-determining factor for Ghiorsea at Fe(II)-rich hydrothermal vents (1). The ability of Ghiorsea to utilize H2 affords it a competitive advantage, as H2 is a thermodynamically more favorable energy source than Fe(II), supporting faster cell growth (53). The competitive advantage of growing on H2 is likely linked to evading the need for reverse electron flow to replenish the reducing agent NADH needed for CO2 fixation (Figure 6).

Hydrogenases restricted to Ghiorsea ZetaOTU9 at Fåvne show that the potential for growth on H2 is a trait limited to ZetaOTU9. Through the analysis of publicly available genomes of Zetaproteobacteria, however, transmembrane uptake hydrogenases were detected in two other species-representative genomes of Zetaproteobacteria outside of Ghiorsea, sourced from freshwater and a subseafloor aquifer (Figure S9, Figure S16). Even so, not all Ghiorsea necessarily share the ability to oxidize H2.
Outside of Ghiorsea Clusters A and B, two species-representatives of Ghiorsea from freshwater and a subsea tunnel [...]

[...] MAGs from Fåvne (Sfz1-6 genes; Figure 2) or the full metagenome assembly even at low sequence identity, suggesting a different genetic mechanism for sheath formation.

[...] (Table S7) are predicted to be highly expressed (Table S12) [...]

[...] 4 Table S1). Samples retrieved were centrifuged at 6000 rcf for 5 minutes and the supernatant was removed. Iron microbial mat pellet for on-ship metagenome sequencing using Nanopore MinION was processed directly. Aliquots for other analyses were frozen in liquid nitrogen and stored at -80 °C until processing. Samples for scanning electron microscopy were fixed in 2.5% glutaraldehyde and stored at 4 °C until further processing.

Illumina sequencing workflow
Whole-sample genomic DNA was extracted using the MO BIO PowerSoil kit and sent to the Norwegian Sequencing Centre (University of Oslo, Norway) for shotgun metagenomic sequencing. Paired-end sequencing (150 bp) was performed using an Illumina NovaSeq S4 flow cell. Raw reads were scanned for quality, duplication rate, and adapter contamination using FastQC v0.11.9 (https://github.com/s-andrews/FastQC), and the reports were visualized across samples with MultiQC (102). Strand-specific quality filtering methods recommended in (103) were implemented using the "iu-filter-quality-minoche" script of the illumina-utils Python package (104). Quality-filtered reads were subsequently cleaned of contaminating human DNA by mapping them to the hg19 human genome, masked at highly conserved genomic regions, using the bbmap.sh script of the BBTools package (105) and the human genome mask developed by Bushnell.

Reads were assembled separately for each metagenomic sample with MEGAHIT v1.2.9 (106) using a minimum contig length of 1000 bp. Reads from each sample were consecutively mapped to individual Illumina sample assemblies, effectively "co- [...]

Hybrid assembly of Nanopore and Illumina reads was also performed using metaSPAdes (116). This hybrid assembly had lower quality than the wtdbg2 and MetaFlye-only assemblies and the Illumina-polished MetaFlye assembly (Table S2), with lower quality bins and fewer 16S rRNA genes assigned to the genomes. The Nanopore-only assembly generated 1 Zetaproteobacteria MAG in the FeMat sample, while 2 distinct Zetaproteobacteria genomes were obtained using the polished MetaFlye assembly. With this in mind, we decided to go forward with the MetaFlye long-read assembly polished with Illumina reads. Assembly read information and quality metrics are shown in Table S2.
Several other assembly and binning strategies were attempted and compared using QUAST v5.0.2 with MetaQUAST output (117).

Combining long reads and short reads
To obtain high-quality metagenome-assembled genomes (MAGs) with better sequencing depth, a MetaFlye long-read assembly, generated with Flye v2.9 from filtered Nanopore reads (118), was polished with Illumina short reads using Pilon v1.23 (119). Illumina short reads were mapped to the assembly using bwa v0.7.17 (120), and Nanopore long reads using minimap2 (121). The mapping file was then reformatted using samtools (108). [...]

The dereplication resulted in 111 MAGs. Relative abundances were calculated using the abundance output of relative coverage within one sample (Anvi'o v7), and this was normalized to one.
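As an illustration of the final normalization step, the following minimal Python sketch scales per-MAG coverage values within one sample so that they sum to one. The function name, the dictionary input and the example values are hypothetical and are not taken from the study or from the Anvi'o output format.

```python
def normalize_relative_abundance(coverage_by_mag):
    """Scale per-MAG coverage values so that they sum to one within a sample."""
    total = sum(coverage_by_mag.values())
    if total <= 0:
        raise ValueError("total coverage must be positive")
    return {mag: cov / total for mag, cov in coverage_by_mag.items()}


# Hypothetical coverage values for three MAGs in one sample (not study data).
example = {"MAG_a": 30.0, "MAG_b": 15.0, "MAG_c": 5.0}
fractions = normalize_relative_abundance(example)

for mag, fraction in fractions.items():
    print(f"{mag}\t{fraction:.3f}")
```

Normalizing within a sample makes the values comparable as fractions of the binned community, but it does not account for reads that were not assigned to any MAG.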

Taxonomic classification
The reconstructed MAGs were taxonomically classified using the Genome Taxonomy Database Toolkit (GTDB-Tk) v1.7.0 (127) with GTDB release 202. In addition, [...]

[...] Table S1). The choice was made to concentrate efforts on the black smoker iron microbial mat (FeMat) after identifying the presence of only the genus Ghiorsea and iron oxide sheaths and since FeMat was the most precise sample of the iron microbial mat. All publicly available Zetaproteobacteria genomes (73; taxid 580370) and corresponding metadata at NCBI [...] Alphaproteobacteria closely related to the Fåvne MAGs were downloaded from NCBI as references. A threshold cut-off of high- and medium-quality genomes (min. 50% completeness, max. 10% redundancy) was used before further analysis. Phylogenomic analyses included 148 Zetaproteobacteria genomes in addition to the MAGs from this study. All selected genomes are presented in Supplementary Table S4.
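The quality cut-off described above (min. 50% completeness, max. 10% redundancy) can be sketched in Python as follows. This is only an illustration: the record layout, the function names and the example genomes are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
MIN_COMPLETENESS = 50.0  # percent, from the stated cut-off
MAX_REDUNDANCY = 10.0    # percent, from the stated cut-off


def passes_quality_cutoff(completeness, redundancy):
    """Return True for genomes meeting the medium-quality threshold or better.

    Both bounds are treated as inclusive, which is an assumption.
    """
    return completeness >= MIN_COMPLETENESS and redundancy <= MAX_REDUNDANCY


def filter_genomes(genomes):
    """Keep only the genome records that pass the quality cut-off."""
    return [
        g for g in genomes
        if passes_quality_cutoff(g["completeness"], g["redundancy"])
    ]


# Hypothetical genome records (not study data).
example_genomes = [
    {"name": "genome_1", "completeness": 95.2, "redundancy": 1.3},
    {"name": "genome_2", "completeness": 48.0, "redundancy": 0.5},
    {"name": "genome_3", "completeness": 72.4, "redundancy": 12.1},
]
kept = filter_genomes(example_genomes)
print([g["name"] for g in kept])
```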

Phylogenetic and phylogenomic analyses of Zetaproteobacteria
Phylogenomics was used to determine the closest evolutionary relationships. Single-copy marker genes present in all genomes were detected and extracted using Anvi'o v7.0 (109) with anvi-get-sequences-for-hmm-hits, using Anvi'o's Bacteria_71 and GTDB's bac_120 collections of single-copy marker genes (127). Marker genes were selected if they were present only in a single copy, found in at least 70% of all Zetaproteobacteria genomes, and supported Zetaproteobacteria monophyly in individual marker phylogenetic trees (Table S5) [...]

[...] (161), tRNAscan (162) and WIsH (163). A score of 3 was used as a threshold value to assign hosts based on previous approaches (164). Host-virus pairs were also analyzed with the PHP host predictor software using k-mer predictions (165).
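The first two marker-gene selection criteria given in this section (single copy only, found in at least 70% of genomes) can be sketched in Python as below. This is a minimal sketch under stated assumptions: the input layout, the function name and the example counts are hypothetical, "single copy" is interpreted as never exceeding one copy in any genome, and the third criterion (support for Zetaproteobacteria monophyly in individual gene trees) is not covered.

```python
def select_markers(copy_counts, min_fraction=0.70):
    """Select markers that are single copy and sufficiently widespread.

    copy_counts maps genome name -> {marker name: number of copies}.
    A marker is kept if no genome carries more than one copy (assumed
    interpretation) and at least min_fraction of genomes carry it.
    """
    n_genomes = len(copy_counts)
    if n_genomes == 0:
        return []
    markers = sorted({m for counts in copy_counts.values() for m in counts})
    selected = []
    for marker in markers:
        copies = [counts.get(marker, 0) for counts in copy_counts.values()]
        single_copy_only = all(c <= 1 for c in copies)
        prevalence = sum(1 for c in copies if c == 1) / n_genomes
        if single_copy_only and prevalence >= min_fraction:
            selected.append(marker)
    return selected


# Hypothetical copy counts for four genomes (not study data).
example_counts = {
    "genome_1": {"marker_A": 1, "marker_B": 1, "marker_C": 1, "marker_D": 1},
    "genome_2": {"marker_A": 1, "marker_B": 1, "marker_C": 2, "marker_D": 1},
    "genome_3": {"marker_A": 1, "marker_C": 1, "marker_D": 1},
    "genome_4": {"marker_A": 1, "marker_C": 1},
}
selected_markers = select_markers(example_counts)
print(selected_markers)
```

In this toy input, marker_B fails the prevalence criterion (2 of 4 genomes) and marker_C fails the single-copy criterion, so only marker_A and marker_D are retained.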

Data availability
All MAGs in the study were deposited in NCBI. Accession numbers, the associated BioProject and BioSamples, and corresponding metadata are listed in Supplementary Material 4 (Table S1) and Supplementary Material 2 (Table S4 and Table S5).

[...]

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.