Snow Microbiome Functional Analyses Reveal Novel Microbial Metabolism of Complex Organic Compounds

Microbes active in extremely cold environments are not as well explored as those of other extreme environments. Studies have revealed a substantial microbial diversity and identified cold-specific microbiome molecular functions. We analyzed the metagenomes and metatranscriptomes of twenty snow samples collected during early and late spring in Svalbard, Norway using our computational read-based microbiome function annotation tool, mi-faser. Our results revealed a more diverse microbiome functional capacity and activity in the early compared to in the late spring samples. The dissimilarity between the metagenomes and metatranscriptomes of the same samples was also significantly higher in the early spring. These findings suggest that early spring samples may contain a larger fraction of DNA of dormant organisms, while late spring samples reflect a new community that is metabolically active. We additionally showed that the abundance of the sequencing reads mapping to the fatty acid synthesis-related microbial pathways was significantly positively correlated with organic acid levels, in both our late spring metagenomes and metatranscriptomes. Moreover, the geraniol degradation pathway and the styrene degradation pathway read abundances correlated and inversely correlated, respectively, with the organic acid levels. These results suggest a possible nutrient switch. Our study thus highlights the activity of microbial degradation pathways of complex organic compounds previously unreported at low temperatures.


Introduction
Abiotic parameters, such as temperature, pH, and pressure, create stress on microorganisms, especially in extreme environments [1]. The cryosphere, an extreme cold environment, covers a large portion of Earth's surface. Over 14% of the world's biosphere is located at the planetary poles, while 90% (by volume) of the ocean is colder than 5°C [2]. Taxonomic surveys based on 16S rRNA gene sequencing have described significant microbial diversity in glacial ice [3][4][5], cryoconite [6,7], sea ice [8], and polar and alpine snow [9][10][11][12][13]. Bacteria seem to be ubiquitous in snow and belong to numerous taxa such as Proteobacteria (Alpha-, Beta- and Gamma-), the Cytophaga-Flexibacter-Bacteroides group, Actinobacteria, and Cyanobacteria [10,12,14,15], although the results can vary based on season, sampling location and analysis methods. For example, the diversity of organisms in snow from the Canadian high Arctic ice sheet was 20 times lower than that measured in Tibetan plateau snow [12,16]. A variety of approaches, such as cultivation, ribosomal profiling, and stable isotope probing, have been used to detect and measure microbial activity at subzero temperatures in permafrost soils (for review, see Nikrad et al. 2016 [17]). These offer insights into the microbial interactions with the soil environment in the cold. Considerably less is known about the functional capacity of microorganisms in the snow. One pioneering metagenomic study identified microbiome functional capacity correlating with chemical parameters (e.g. mercury concentration) in Arctic spring snow samples [18]. If organisms are active at subzero temperatures in the snow, then they are likely involved in a range of processes involving organic matter, which could impact atmospheric and biogeochemical cycles [19]. One poorly constrained source and potential modifier of organic compounds is biological activity in snow [20].
Large-scale metagenome sequencing has drastically increased the publicly available metagenomic data from high-profile projects such as Terragenome [21], the Global Ocean Sampling Expedition [22], the Human Microbiome Project [23] and Tara Oceans [24]. Studies have been carried out in a wide variety of environments, including the human gut [25][26][27], groundwater [28], acid mine drainage [29], and beach sand [30], and have identified potential diagnostic, therapeutic, or bioremediation targets. With ample data, comparative analysis of meta-genomes/transcriptomes under different conditions highlights the key microbial members and functions that result from and/or contribute to niche differences [31,32]. Comparative meta-genome/transcriptome analyses have not been commonly applied to cold environment samples. Such analyses should help elucidate microbial mechanisms of survival and adaptation at low temperatures.
Bergk-Pinto et al. studied the microbial ecology in twenty snow samples collected during early and late spring (mid-April to mid-June, 2011, Svalbard, Norway) [33]. Using a combination of marker genes and network analysis, Bergk-Pinto et al. revealed that from early to late spring, the microbial community in the snow shifted from cooperation to competition, accompanied by enrichment of antibiotic resistance genes [33]. Here, we investigated microbial metabolism of organic compounds at low temperature by further analyzing these metagenomic and metatranscriptomic datasets using mi-faser [27]. This new bioinformatic tool allows functional annotation of sequencing reads based on experimentally verified microbial enzymes at high accuracy (>90%). Our results revealed significantly lower metagenome-to-metatranscriptome similarity in the early spring than in the late spring samples. We also found that in the late spring samples the abundance of sequencing reads mapping to the components of the fatty acid synthesis-related microbial pathways significantly correlated with organic acid levels. Both of these findings are consistent with a period of microbial community growth, in line with the previous finding of the switch from microbial cooperation to competition [33]. We further observed that the rise in organic acid levels correlated with the appearance of the geraniol degradation pathway and disappearance of the styrene degradation pathway. This finding might represent a change in nutrient conditions during the community growth process. To summarize, we observed the presence of microbial functionality necessary for degradation of complex organic compounds in both metagenomes and metatranscriptomes of the late spring snow samples. This is the first time this functionality has been reported to be under active expression at temperatures below 0°C.

Data collection and preprocessing
We obtained the metagenomic, metatranscriptomic, and chemistry data for twenty snow samples from the FTP site of the Environmental Microbial Genomic Group (ftp://ftp-adn.eclyon.fr/Snow_organic_acids_bacterial_interactions/metagenomes_and_metatranscriptomes_svalbard2011/). The technical details of the sampling and sequencing process were described in a previous study [33]. Briefly, snow was collected over two months, from mid-April to mid-June, at Ny Ålesund on Spitsbergen Island, Svalbard, Norway (78°56'N, 11°52'E). Surface snow layers were collected into sterile bags using a sterilized shovel from a 50 m² perimeter with restricted access to reduce contamination from human sources. Details on sampling conditions, sample site and chemical analyses can be found in Bergk-Pinto et al. [33].
The sequencing data were quality filtered with Mothur [34] using the settings in Schloss et al. [35]. Base overrepresentation was checked using FastQC [36], and Usearch [37] was used to identify and remove remaining adaptors.

Analysis
The post-quality-control samples were submitted to the mi-faser web service [27,38] for functional annotation. For each sample, mi-faser returns a read abundance table of enzyme functionality detected in the sample, the EC-profile (EC stands for Enzyme Commission [39]). For each known functional pathway [40], we divided the sum of all reads mapping to the enzyme members of the pathway by the number of these enzyme members to create an entry in the pathway-profile of the sample. The NMDS diagrams were generated with the (enzyme and pathway) profiles of samples assigned to four groups: early_DNA (early spring metagenomes), early_RNA (early spring metatranscriptomes), late_DNA (late spring metagenomes), and late_RNA (late spring metatranscriptomes). The Euclidean distances between the same-sample DNA and RNA NMDS points were calculated and compared across groups. Note that our computational method (mi-faser followed by NMDS analysis) has successfully revealed microbial functional diversity in various environmental samples, such as beach sand and human gut [27]. The reliability of mi-faser annotation stems from 1) its function mapping algorithm, which is specifically trained for short reads, and 2) its manually curated reference database, which contains only protein sequences with experimentally verified function [27]. Organic acid levels were standardized by their corresponding total concentrations across all samples. The Pearson correlation coefficients, as well as the significance of correlations, were calculated by the R function cor.test [41].
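The pathway-profile calculation described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name, the EC labels, and the read counts are hypothetical, and the sketch assumes that the divisor is the number of all enzyme members listed for a pathway, whether or not each was detected in the sample.

```python
def pathway_profile(ec_profile, pathway_members):
    """Return the average read abundance per pathway for one sample.

    ec_profile: dict mapping an EC label to its read count in the sample.
    pathway_members: dict mapping a pathway name to its list of EC labels.
    Assumption: undetected member enzymes count as zero reads but still
    contribute to the divisor.
    """
    profile = {}
    for pathway, ecs in pathway_members.items():
        if not ecs:
            continue  # a pathway without listed enzymes has no defined entry
        total_reads = sum(ec_profile.get(ec, 0) for ec in ecs)
        profile[pathway] = total_reads / len(ecs)
    return profile


# Made-up numbers, for illustration only.
ec_profile = {"EC_a": 30, "EC_b": 10, "EC_c": 5}
pathway_members = {
    "pathway_A": ["EC_a", "EC_b", "EC_c", "EC_d"],  # EC_d was not detected
    "pathway_B": ["EC_e"],                          # no member detected
}
result = pathway_profile(ec_profile, pathway_members)
print(result)
```

Dividing by the number of member enzymes keeps pathways with many enzymes from dominating the profile simply because of their size.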

Early-to-late-spring dissimilarity and metagenome-to-transcriptome divergence highlight community activity in late spring samples
While the metagenome reflects the overall potential function of a microbial community, metatranscriptomic analyses are based on genes that are transcribed and thus provide more information on the active fraction of these functions. Using mi-faser (microbiome functional annotation of sequencing reads), our read-based microbiome function annotation tool, to analyse the metagenomes and metatranscriptomes of early and late spring polar snow samples, we observed that (1) the early spring samples were more diverse (measured as the Euclidean distance between entries on the NMDS plot; Methods) in both potential and active microbial functionality than the late spring samples (EC-profiles sample distance: early spring = 4.8±2.3, late spring = 0.4±0.3, Figure 1A, SOM Figure 1A,B; pathway-profiles sample distance: early spring = 1.4±0.9, late spring = 0.1±0.1, Figure 1B, SOM Figure 1C,D) and that (2) metagenome-to-transcriptome similarity of the same sample (measured as the Euclidean distance between entries on the NMDS plot; Methods) was significantly lower in early than in late spring (in comparisons of both the EC-profiles, t-test p-value <0.001, Figure 2A, and the pathway-profiles, p-value=0.025, Figure 2B). Note that for all comparisons ~29% of ECs (195 of 683; SOM Table 1) in our data could not be mapped to known KEGG pathways. The discrepancy in annotation between metagenomes (DNA) and metatranscriptomes (cDNA) has previously been observed in environments such as human gut [42] and open ocean [43]. The genes observed in the metagenomes represent potential functions that may or may not be expressed in the environment at the time of sampling and could belong to inactive community members; the metatranscriptome-specific functions belong to active members of the community at the time of sampling [44].
The low metagenome-to-transcriptome similarity in the early spring samples (Figure 2) suggests that the active members in early spring occur at such low abundance that metagenomic sequencing fails to detect them. We speculate that the potential functional diversity in the early spring metagenome samples (Figure 1; SOM Figure 1; DNA datasets) might come from the DNA of dead or inactive cells preserved in the snow. Interestingly, many microorganisms identified in snow and ice via 16S rRNA gene surveys are non-psychrophiles [45] and their importance in the community needs further investigation. Meanwhile, the diversity of active microbial functionality in the early spring metatranscriptomes (Figure 1; SOM Figure 1; RNA datasets) reflected diverse microbial activities (238 enzymatic functions involved in 84 metabolic pathways including cell size reduction, changes in fatty acid and phospholipid membrane composition, and decrease in the fractional volume of cellular water). This observation is in line with the known variety of survival strategies employed by microbes at low temperature [2,17]. With the warming in the late spring, the active community made up a larger fraction of the sequenced reads and thus appeared more homogeneous. Previous 16S rRNA-based taxonomic analysis of the same dataset observed a shift in the community from early to late spring [33]. While the early spring samples contained a core community of 59 OTUs, there were only 29 OTUs in the late spring samples, with 42 early spring core OTUs disappearing from the core community of late spring samples [33]. The early spring community contained a higher diversity of core organisms, of which only a small fraction were likely active, and the inactive community members could no longer be detected in the late spring samples. As a result, we observed a decrease in functional diversity (Figure 1; SOM Figure 1) and an increase in the metagenome-to-transcriptome similarity (Figure 2).
In addition, our results suggest that despite the taxonomic diversity in the late spring samples, their functional capacity and activity were highly similar (Figure 1; SOM Figure 1; RNA datasets), highlighting the advantages of functional -omics analysis over 16S rRNA gene surveys.
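The same-sample metagenome-to-metatranscriptome comparison used in this section can be sketched as follows. This is an illustrative outline only: the ordination coordinates are made up, the function names are hypothetical, and Welch's unequal-variance form of the t statistic is an assumption, since the text does not state which t-test variant was applied. The sketch reports the statistic only, not a p-value.

```python
import math
from statistics import mean, variance


def paired_distances(dna_points, rna_points):
    """Euclidean distance between each sample's DNA and RNA ordination points."""
    return {
        sample: math.dist(dna_points[sample], rna_points[sample])
        for sample in dna_points
        if sample in rna_points
    }


def welch_t(a, b):
    """Welch's t statistic for two independent groups (assumed test variant)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up two-dimensional NMDS coordinates, for illustration only.
early_dna = {"E1": (0.0, 0.0), "E2": (1.0, 1.0), "E3": (2.0, 0.0)}
early_rna = {"E1": (3.0, 4.0), "E2": (1.0, 7.0), "E3": (2.0, 8.0)}
late_dna = {"L1": (0.0, 0.0), "L2": (1.0, 0.0), "L3": (0.0, 2.0)}
late_rna = {"L1": (0.0, 1.0), "L2": (1.0, 1.0), "L3": (0.0, 4.0)}

early = paired_distances(early_dna, early_rna)
late = paired_distances(late_dna, late_rna)
t_stat = welch_t(list(early.values()), list(late.values()))
print(early, late, t_stat)
```

A positive statistic in this toy setup means the early spring DNA and RNA points lie farther apart, on average, than the late spring ones.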

Microbial use of complex organic compounds in the snow
Snow provides a medium and nutrients for microbial growth and associated physicochemical processes [46], and growth implies the utilization of nutrients. Numerous genes detected in environmental ice metagenomes related to xenobiotics, biopolymers and other carbon sources suggest that glacial ice microorganisms have the potential to degrade a wide range of substrates [47]. The hypothesized increase in active microbial community members in the late spring snow might be related to the changes in organic acid levels in the samples (oxalate, acetate and formate; SOM Table 2). All three organic acids remained low in concentration in the early spring samples. They increased in the late spring (SOM Figure 2), possibly concomitant with increased microbial activity. Microbial preferences for different carbon classes were studied in Antarctic snow, and the results showed a higher rate of carbon uptake when snow microcosms were amended with a combination of simple and complex carbon sources [48]. The appearance of organic acids in the snow may have both abiotic (e.g. aerial deposition) and biotic (e.g. microbial activity) origins. In our study, however, the clear correlation of their per-sample concentrations with different microbial activity levels, captured by metatranscriptomes, strongly indicates active metabolism in the late spring samples. Note that both EC-profiles and pathway-profiles correlated with the organic acid levels more significantly than the EggNog Mapper derived functional profiles (Methods; SOM Table 3) from Bergk-Pinto et al. [33], highlighting mi-faser's adequacy for this study. Among the enzymes that were not mapped to known KEGG pathways, two tRNA-methyltransferases (2.1.1.61 and 2.1.1.217; p-value<0.05, Methods) showed significant correlation with organic acid levels. tRNA methylation regulates important steps in protein synthesis and is essential for microbial growth at high temperature [49]. Our results suggest that it could also play a role under low-temperature conditions.
We further identified five pathways in our meta-genomes/transcriptomes that significantly correlated with the organic acid levels in the late spring samples (Table 1; SOM Figure 3-7; p-value<0.05, Methods): fatty acid biosynthesis, biosynthesis of unsaturated fatty acids, fatty acid elongation, geraniol degradation, and styrene degradation. The first three pathways are related to fatty acid synthesis and elongation. Fatty acids are essential for living organisms due to their role in membrane synthesis, which is even more critical when low temperature affects membrane fluidity [50]. Geraniol degradation is of interest because geraniol is a terpene produced by a variety of plants for its antibacterial activities [51]. Terpenes are released from plants to the atmosphere [52] and deposited in Arctic snowpacks like other volatile organic compounds [53]. Some bacteria, e.g. Pseudomonas putida, are able to utilize geraniol as their sole carbon and energy source [54]. P. putida is also known to degrade styrene [55] and polystyrene [56]. Therefore, the correlation and anticorrelation between the organic acid levels and, respectively, the microbial geraniol degradation and styrene degradation may suggest a switch of nutrients in the environment. P. putida is known to possess diverse metabolic capabilities to degrade a variety of organic solvents. Most of its strains are mesophilic, but one (KT2440) has been reported as psychrotolerant (optimal growth at 30°C but can proliferate at 4°C) [57]. To the best of our knowledge, no microbial metabolism of geraniol and styrene has been reported at low temperatures with evidence at the transcription level. Our functional -omics study thus indicates, for the first time, the activity of microbial degradation pathways of complex organic compounds at sub-zero temperatures.
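The correlation and anticorrelation discussed above can be illustrated with a short Python sketch of the Pearson coefficient. The study itself used the R function cor.test, which also reports significance. The sketch below computes only the coefficient, uses made-up values, and assumes that standardization means dividing each sample's concentration by the total concentration across all samples; the function names are hypothetical.

```python
import math


def standardize(levels):
    """Divide each sample's level by the total across samples (assumed scheme)."""
    total = sum(levels)
    return [value / total for value in levels]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up per-sample values, for illustration only.
organic_acid = standardize([1.0, 2.0, 3.0, 4.0])
rising_pathway = [10.0, 20.0, 30.0, 40.0]   # abundance rises with acid level
falling_pathway = [8.0, 6.0, 4.0, 2.0]      # abundance falls with acid level

r_rising = pearson_r(rising_pathway, organic_acid)
r_falling = pearson_r(falling_pathway, organic_acid)
print(r_rising, r_falling)
```

In this toy case the rising pathway yields a coefficient near +1 and the falling pathway a coefficient near -1, mirroring the opposite signs described for geraniol and styrene degradation.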

Conclusions
We characterized microbial activity at low temperature at the gene expression level in metagenomic and metatranscriptomic datasets from snow in early and late spring. Our results highlight the novel microbial activity of complex organic compound degradation at low temperature. Further in-depth exploration of the functionality of the cryosphere inhabitants can contribute to our understanding of microbial metabolism at low temperatures and aid in the discovery of novel enzymes with potential industrial and bioremediation value.