Corynebacterium Comparative Genomics Reveals a Role for Cobamide Sharing in the 1 Skin Microbiome 2 3

The human skin microbiome is a key player in human health, with diverse functions ranging from defense against pathogens to education of the immune system. Recent studies have begun unraveling the complex interactions within skin microbial communities, shedding light on the invaluable role that skin microorganisms have in maintaining a healthy skin barrier. While theCorynebacteriumgenus is a dominant taxon of the skin microbiome, relatively little is known how skin-associated Corynebacteria contribute to microbe-microbe and microbe-host interactions on the skin. Here, we performed a comparative genomics analysis of 71Corynebacteriumspecies from diverse ecosystems, which revealed functional differences between host- and environment-associated species. In particular, host-associated species were enriched forde novobiosynthesis of cobamides, which are a class of cofactor essential for metabolism in organisms across the tree of life but are produced by a limited number of prokaryotes. Because cobamides have been hypothesized to mediate community dynamics within microbial communities, we analyzed skin metagenomes forCorynebacteriumcobamide producers, which revealed a positive correlation between cobamide producer abundance and microbiome diversity, a trait associated with skin health. We also provide the first metagenome-based assessment of cobamide biosynthesis and utilization in the skin microbiome, showing that both dominant and low abundant skin taxa encode for thede novobiosynthesis pathway and that cobamide-dependent enzymes are encoded by phylogenetically diverse taxa across the major bacterial phyla on the skin. Taken together, our results support a role for cobamide sharing within skin microbial communities, which we hypothesize mediates community dynamics.


61
The human skin supports a diverse and complex ecosystem of bacterial, fungal, viral, and 62 microeukaryote species, termed the skin microbiome. Highly adapted to live on the skin, these 63 microorganisms form distinct and specialized communities across the various microenvironments 64 of the skin, which include sebaceous, moist, dry, and foot sites. Rather than existing as innocuous 141 (porphyrin and chlorophyll metabolism) were used to create a custom profile for KOfamscan, a 142 functional annotation program based on KO and HMMs [29]. Each genome was queried against 143 this profile, and hits to the KOs above the predefined KOfamscan threshold were considered for 144 further analysis. Visualization of the phylogenomic tree and cobamide biosynthesis pathway 145 completeness was performed in R with the ggtree package [30].

147
Microbiome diversity analysis of Corynebacterium cobamide producers 148 Using the taxonomic abundance information from 585 skin metagenomes from Oh et al. [8], the 149 total sum abundance of 5 Corynebacterium species (C. amycolatum, C. kroppenstedtii, C. 150 ulcerans, C. pseudotuberculosis, C. diphtheriae) shown to encode for de novo cobamide 151 biosynthesis was calculated for each metagenome. Richness was determined by calculating the 152 total number of observed species within each sample. Alpha diversity was determined by 153 calculating the Shannon index using the diversity() function from the phyloseq v1.26.1 package 154 in R. The Spearman correlation between Corynebacterium cobamide producer abundance and 155 alpha diversity was calculated and plotted with ggscatter() and stat_cor() from the ggpubr v0.4.0 156 package. For beta diversity analysis, relative abundances were square root transformed to give 6 Profile HMMs, retrieved from the TIGRfam and Pfam databases, were used to detect cobamide 163 biosynthesis, cobamide transport, and cobamide-dependent genes within skin metagenomic 164 sequencing data. A total of 12 cobamide biosynthesis genes were selected because of their broad 165 distribution throughout both the aerobic and anaerobic biosynthesis pathways and their presence 166 within taxonomically-diverse cobamide producer genomes [31][32][33]. CbiZ was included as a 167 marker of cobamide remodeling, and BtuB was included to assess cobamide transport. 19 168 cobamide-dependent enzymes and proteins with B12-binding domains were chosen to evaluate 169 cobamide use. The single copy gene rpoB was used as a phylogenetic marker to assess microbial 170 community structure within each metagenome and as a proxy for sequence depth. All cobamide-171 associated genes used in this analysis can be found in Supplementary Material S3.

173
Metagenomic sequence search using HMMER

174
Raw sequencing data from 906 skin metagenomes was retrieved from the Sequence Read 175 Archive (SRA) and converted to FASTA format, retaining only forward read files for analysis 176 (Supplemental Table S2). Metagenomes were translated to each of 6 frame translations using 177 transeq from the emboss v6.6.0 package [34]. The program hmmsearch from HMMER v3.3.1 [35] 178 was used with default parameters and an E-value cutoff of 1E-06 to scan the metagenomic 179 sequencing reads for homology to each cobamide-related HMM. The resulting hits were

209
Species within the Corynebacterium genus occupy highly diverse habitats, including soil, cheese 210 rinds, coral mucus, and human skin, with certain species causing serious disease in humans and 211 animals. To explore the genomic diversity within the Corynebacterium genus, we performed a 212 pangenomic analysis using anvi'o. These analyses included 50 host-associated (HA) and 21 213 environment-associated (EA) genomes (Supplemental Table S1), acquired as complete 214 assemblies from NCBI (n=68) or as draft assemblies from human skin strains we isolated (n=3).

215
Gene clusters (GCs), which represent one or more genes contributed by one or more genomes, 216 were inferred using the anvi'o anvi-pan-genome function. Across all species, we identified 42,154 217 total GCs, 495 of which are core GCs present in all genomes. In a subset of genomes, 13,235

218
GCs are shared (dispensable) and 28,424 GCs are found in only one genome (species-specific) 219 ( Figure 1). Genome size ranged from 2.0 to 3.6 Mbp, with an average of 2.7 ± 0.3 Mbp, and the 220 number of GCs per genome ranged from 1858 to 3170 GCs, with an average of 2365 ± 294 GCs 221 (Supplemental Table S1). Further, HA species have a significantly reduced average number of 222 GCs per genome compared to EA species (2242 vs. 2661, p-value<0.0001) and a significantly

225
Because we observed significant differences in gene cluster number and genome size between 226 HA and EA genomes, we hypothesized that HA and EA Corynebacterium species would form 227 phylogenetically distinct clades. To test this hypothesis, we generated a Corynebacterium 228 phylogenetic tree based on 71 conserved single copy genes. We found that generally, EA and 229 HA species were interspersed throughout the phylogeny, with a few clear HA-or EA-specific 8 subgroups ( Figure 3). For example, the HA C. amycolatum subgroup forms the deepest subline 231 within this genus, consistent with previous findings [40]. Species within this subgroup are unique 232 compared to most other Corynebacteria in that they do not contain mycolic acids in their cell walls.

233
In addition, the clade containing C. diphtheriae and C. pseudotuberculosis also consists uniquely 234 of HA species, several of which are serious pathogens in humans and other animals.

235
Furthermore, a distinct lineage of EA soil dwelling Corynebacteria is formed by several species 236 including C. glutamicum, which is an industrial producer of amino acids and other diverse 237 products. We also observed that genome length was reflected by the phylogeny, where species 238 within the same clade possessed similar genome length and EA clades having notably larger 239 genomes ( Figure 3).

241
De novo cobamide biosynthesis is enriched in host-associated Corynebacteria.

242
To identify functional gene predictions that differ between HA and EA genomes, we performed a 243 functional enrichment analysis in anvi'o using Clusters of Orthologous Groups (COG) annotations.

244
Within the top significantly enriched functions in EA genomes, we observed functions putatively 245 involved in amino acid transport, metabolism of various substrates, including aromatic 246 compounds, tetrahydropterin cofactors, citrate/malate, and alcohols, and other uncharacterized 247 functions (q < 0.05) ( Figure 2D). Within HA genomes, we observed a significant enrichment of 248 functions involved in the transport of various substrates, as well as 8 COG functions putatively 249 involved in cobamide biosynthesis in HA genomes (q < 0.05) (Fig. 2C). Cobamides, which consist 250 of the vitamin B12 family of cofactors, are required for metabolism by organisms across all domains 251 of life. Specifically, they function in the catalysis of diverse enzymatic reactions, ranging from 252 primary and secondary metabolism, including methionine and natural product synthesis, to 253 environmentally impactful processes, such as methanogenesis and mercury methylation.

255
To validate the presence of the enriched cobamide biosynthesis genes and other genes of the 256 approximately 25-enzyme cobamide biosynthesis pathway within the 71 Corynebacterium 257 genomes, we scanned the genomes for these genes using KOfamScan [29]. We found that 258 tetrapyrrole precursor synthesis, which is shared among the cobamide, heme, and chlorophyll 259 biosynthesis pathways [31], was highly conserved throughout the genus (Figure 3). Corrin ring 260 and nucleotide loop synthesis was predominantly intact and conserved within 5 distinct

261
Corynebacterium lineages, including those of C. diphtheriae, C. epidermidicans, C. 262 argentoratense, C. kroppenstedtii, and C. amycolatum. The species within these groups encode 263 for all or nearly all of the genes required for cobamide biosynthesis, and notably, 21 out of 22 of 9 these predicted cobamide producers are host-associated ( Figure 3). In addition, three EA species 265 encode for the later portion of the pathway involved in nucleotide loop assembly, which can 266 function in cobinamide salvaging. Taken together, these observations demonstrate a range of 267 cobamide biosynthetic capabilities by Corynebacteria, including de novo producers, cobinamide 268 salvagers, and non-producers.

270
Microbiome diversity is associated with the abundance of Corynebacterium cobamide 271 producers. While 86% of bacteria have been found to encode at least one cobamide-dependent 272 enzyme, only 37% of bacteria can produce the cofactor de novo [31]. Therefore, within microbial 273 communities, cobamide sharing likely exists as a means to fulfill this nutritional requirement and 274 is hypothesized to mediate community dynamics. Because several skin-associated

275
Corynebacteria were identified to encode for de novo cobamide biosynthesis, we hypothesize 276 that cobamide sharing occurs in the skin microbiome and mediates community dynamics. To 277 assess changes at the community level associated with the presence of Corynebacterium 278 cobamide producers, we explored the relationship between microbiome diversity and the 279 abundance of Corynebacterium species producing cobamides. To do this, we analyzed published 280 relative abundance data from 585 skin metagenomes [8]. Within each metagenome, we summed 281 the abundance of five Corynebacterium species that encode for de novo cobamide biosynthesis 282 (C. amycolatum, C. kroppenstedtii, C. diphtheriae, C. ulcerans, C. tuberculosis) and analyzed 283 diversity within each community. We found that the abundance of these cobamide producers was 284 positively correlated with alpha diversity (Shannon index) in 12/18 skin sites (p<0.05) ( Figure 4A),

285
suggesting that increased abundance of cobamide-producing Corynebacteria (CPC) within the 286 community may increase diversity. We also observed a positive correlation between CPC 287 abundance and community richness, although only in 4/18 skin sites (p<0.05) (Supplemental 288 Figure 1). The observation that CPC abundance overall has a higher correlation with Shannon 289 diversity, a metric taking into account total number of species and their overall proportion, 290 compared to total species counts suggests that the increase in diversity we observe arises 291 predominantly from an increase in species evenness within a given community. Supporting this 292 argument, we found that communities with a low CPC abundance were usually dominated by 293 single species including Cutibacterium acnes (formerly Propionibacterium acnes) and

294
Propionibacterium phage, while communities with high CPC showed an expansion of other skin 295 taxa and an overall more even species distribution within the community (Supplemental Figure   296 2).

298
To determine if CPC also shapes community composition, we calculated the Bray-Curtis 299 dissimilarity index of square-root transformed abundance data of all samples within each skin site.

300
A square root transformation was applied in order to give less weight to dominant species and 301 more weight to low abundant species, which were prevalent in the dataset. The resulting matrices 302 were ordinated by non-metric multidimensional scaling to visualize relatedness between 303 individual communities. Within skin sites, particularly those that demonstrated a positive 304 correlation between Shannon diversity and CPC abundance, samples clustered along gradients 305 of both alpha diversity and CPC abundance ( Figure 4B). This was especially evident for samples 306 in the alar crease, back, manubrium, and inguinal crease, each of which had the strongest 307 correlation coefficients between Shannon diversity and CPC abundance (R=0.82, R=0.87, 308 R=0.67, R=0.79 respectively) ( Figure 4A).

310
Cobamide biosynthesis, salvage, and remodeling genes are encoded by skin taxa.

311
Having identified that several skin-associated Corynebacterium species encode for de novo 312 cobamide biosynthesis, we were interested in determining if other members within the skin 313 microbiome also had this genomic capacity. We searched for cobamide biosynthesis genes within  Table S2). We used 12 profile HMMs representing 316 genes that demonstrated broad distribution within the de novo cobamide biosynthesis pathway 317 and have been previously used to identify cobamide biosynthesis genes within metagenomic data 318 [32,33]. Among site microenvironments, sebaceous sites were revealed to harbor the overall 319 highest number of hits to cobamide biosynthesis genes (n=139,128 reads), followed by moist

327
we calculated the contribution of each taxa to cobamide biosynthesis gene hits by dividing the 328 number of gene hits assigned to a given taxa by the total number of gene hits within the sample.

329
We found that Propionibacteriaceae was the dominant contributor to cobamide biosynthesis gene 330 hits, particularly in sebaceous sites ( Figure 5). However, metagenomes from moist, dry, and foot sites exhibited higher variability in Propionibacteriaceae contribution, which was also reflected in 332 variable contribution by the other four taxa.

334
For a finer resolution of taxa contribution across individual biosynthesis genes, we determined 335 taxa contribution to 12 cobamide biosynthesis marker genes for the top 40 abundant taxa within 336 the dataset. This analysis revealed that the number of taxa encoding the full or nearly complete 337 suite of cobamide biosynthesis markers varied according to the microenvironment ( Figure 6). In 338 particular, cobamide biosynthetic potential within the sebaceous microenvironment is uniquely 339 restricted to Propionibacteriaceae, whereas numerous taxa within the moist, dry, and foot 340 microenvironments encode for most cobamide biosynthesis markers. These include diverse taxa 341 across the Firmicutes, Actinobacteria, and Proteobacteria phyla, such as Pseudomonadaceae,

342
Veillonellaceae, and Rhodobacteraceae, all of which have previously been implicated in 343 cobamide biosynthesis [41][42][43]. Corynebacteriaceae were also found to encode for nearly all 344 cobamide biosynthesis markers, validating our comparative genomics analysis that identified 345 specific Corynebacterium species as de novo producers. We also observed that a proportion of 346 low abundant skin taxa encode for the full suite of cobamide biosynthesis markers, grouped into 347 the "Other" category (<0.15% abundance based on rpoB hits), demonstrating that cobamide 362 [46][47][48]. Therefore, the identification of taxa that encode for cobamide remodeling enzymes 363 suggests that structurally distinct cobamides are produced and used for microbial metabolism on 364 the skin. Overall, these results demonstrate that certain skin taxa have the genetic potential to 12 produce complete cobamides through de novo biosynthesis, precursor salvage, or lower ligand 366 remodeling.

370
Our findings suggest that several taxa within the skin microbiome synthesize cobamides de novo.

371
Therefore, cobamides and cobamide precursors are likely available in the community for uptake

378
Across sebaceous, moist, and dry microenvironments, Propionibacteriaecae was the dominant 379 taxa encoding for cobamide-dependent enzymes D-ornithine aminomutase, methylmalonyl-CoA 380 mutase, and ribonucleotide reductase class II, with this taxon contributing overall the most 381 cobamide-dependent sequences within the dataset (Supplemental Table S7). In contrast, we 382 found across the remaining cobamide-dependent enzymes, hits were assigned to 383 phylogenetically diverse taxa across the four major phyla on the skin (Actinobacteria, Firmicutes,

384
Proteobacteria, and Bacteroidetes) [17]. In particular, cobamide-dependent methionine synthase, 385 epoxyqueosine reductase, ethanolamine lyase, and ribonucleotide reductase were the most 386 common cobamide-dependent enzymes in the dataset, when considering the number of unique 387 taxa that encode for these enzymes (Supplemental Figure 3A). Notably, more unique species 388 within the dataset encode for at least one cobamide-dependent enzyme (n=456 species 389 contributing at least five reads to any cobamide-dependent enzyme) than appear to appreciably 390 contribute to de novo cobamide biosynthesis on the skin (n=23 species contributing at least five 391 reads to at least 7/10 biosynthesis genes) (Supplemental Table S9). This supports a model for

404
To further explore the observed role that cobamides play in C. acnes, we mapped the identified 405 riboswitch metagenomic reads to the C. acnes KPA171202 reference genome and found that the 406 reads mapped to several genomic regions, with nearby genes having functions involved in ABC 407 transport, cobalt transport, cobamide biosynthesis, and cobamide-dependent and -independent 408 reactions ( Figure 8A). Regions 6, 7, and 8 did not encode for known cobamide-related genes, but 409 all were found directly upstream of a small ~250 bp pseudogene and either a 1321 bp pseudogene     Corynebacteria. Alternative cobamide-independent pathways exist for these functions, therefore 463 cobamides may confer a distinct advantage for Corynebacterium cobamide-producing species.

464
Indeed, metE, the cobamide-independent methionine synthase, is sensitive to oxidative stress

498
In support of a key role for cobamides within the skin microbiome, we found that other 499 phylogenetically diverse skin taxa, both dominant and low abundance, encode for metabolically diverse cobamide-dependent enzymes, as well as proteins involved in cobamide transport, 501 salvage, and remodeling (Figures 6 and 7). Because cobamide structure can dictate microbial 502 growth and metabolism [74], microorganisms exhibit cobamide selectivity, employing numerous 503 mechanisms for acquiring and using their preferred cobamide(s). These include selectivity in 504 cobamide-dependent enzymes, differential cobamide import, and cobamide-specific gene

515
Corynebacterium cobamide-producing species may promote microbiome diversity (Figure 4) 516 through an increase in community alpha diversity that arises from an overall more evenly 517 distributed community. Interestingly, we observed that communities with low abundance of 518 cobamide-producing Corynebacteria and thus low diversity were dominated by Propionibacterium 519 species, predicted to be the dominant cobamide producer on the skin. Because we anticipate 520 communities that produce more cobamides to have increased diversity, this result is confounding.

521
However, because a spectrum of ecological niches exists on the skin, ranging from the anaerobic 522 sebaceous follicle to the aerobic and desiccate skin surface, we hypothesize that cobamide 523 interactions are dependent upon spatial structure of skin microbial communities. For example, C.  Overall, our findings reveal that skin-associated species within the Corynebacterium genus 536 encode for the de novo cobamide biosynthesis pathway, which is a relatively rare metabolic 537 function. Within skin microbial communities, abundance of these cobamide-producing 538 Corynebacteria is correlated with increased microbiome diversity, suggesting that cobamides are 539 important mediators of microbiome structure. We also demonstrate that cobamide use is 540 widespread across phylogenetically diverse skin taxa, therefore we hypothesize that cobamides 541 are likely to play a significant role in skin microbial community dynamics. Future studies to 542 interrogate cobamide interactions between skin commensals will provide insight into the effects 543 of cobamide biosynthesis and use within microbial communities.