Abstract
The anadromous Atlantic salmon undergo a preparatory physiological transformation before seawater entry, referred to as smoltification. Key molecular developmental processes involved in this life stage transition, such as remodeling of gill functions, are known to be synchronized and modulated by environmental cues like photoperiod. However, little is known about the photoperiod influence and genome regulatory processes driving other canonical aspects of smoltification such as the large-scale changes in lipid metabolism and energy homeostasis in the developing smolt liver.
Here we generate transcriptome, DNA methylation, and chromatin accessibility data from salmon livers across smoltification under different photoperiod regimes. We find a systematic reduction in the expression of genes with metabolic functions, such as lipid metabolism, and increased expression of energy-related genes, such as those involved in oxidative phosphorylation, during smolt development in freshwater. However, in contrast to similar studies of the gill, smolt liver gene expression prior to seawater transfer was not impacted by photoperiodic history. Integrated analyses of gene expression and transcription factor (TF) binding signatures highlight likely important TF dynamics underlying smolt gene regulatory changes. We infer that ZNF682, KLFs, and NFY TFs are important in driving a liver metabolic shift from synthesis to breakdown of organic compounds in freshwater. Moreover, the increased expression of ribosome-associated genes after smolts were transferred to seawater was associated with increased occupancy of NFIX and JUN/FOS TFs proximal to transcription start sites, which could be the molecular consequence of rising levels of circulating growth hormone after seawater transition. We also identified differential methylation patterns across the genome, but associated genes were not functionally enriched or correlated with observed gene expression changes across smolt development. This contrasts with changes in TF binding, which were highly correlated with gene expression, underscoring the relative importance of chromatin accessibility and transcription factor regulation in smoltification.
Author summary
Atlantic salmon migrate between freshwater and seawater as they mature and grow. To survive the transition between these distinct environments, salmon transform their behavior, morphology, and physiology through the process of smoltification. One important adaptation to life at sea is remodeling of metabolism in the liver. It is unknown, however, whether this is a preadaptation that occurs before migration, to what degree it is influenced by day length like other aspects of smoltification, and how gene regulatory programs shift to accomplish this transformation. We addressed these questions through a time course experiment where salmon were exposed to short and long day lengths, smoltified, and transferred to seawater. We sampled the livers and measured changes in gene expression, DNA methylation, chromatin accessibility, and transcription factor binding. We found that metabolic remodeling occurred in freshwater before exposure to seawater and that day length did not have any long-term effects in the liver. Transcription factor binding dynamics were closely linked to gene expression changes, and we describe transcription factors with key roles in smoltification. In stark contrast, we found no links between gene expression changes and DNA methylation patterns. This work deepens our understanding of the regulatory gear shifts associated with metabolic remodeling during smoltification.
Introduction
Atlantic salmon are an anadromous species. They begin life in freshwater riverine habitats, then migrate to sea to grow and mature before returning to freshwater to spawn. The seawater migration is preceded by a “preparatory” process that influences a range of behavioral, morphological and physiological traits, referred to as smoltification [1]. This includes changes in pigmentation and growth [2], ion regulation [3, 4], the immune system [5], and various functions of the metabolism [6, 7].
The timing of smoltification is regulated by the physiological status of the fish [8], as well as external environmental signals such as temperature and day length [2, 9, 10]. Salmon smoltify in the spring, and the transition from short to long days is believed to drive changes in hormonal regulation and initiate smoltification. In line with this model, we recently demonstrated that exposure to a short photoperiod (i.e. a simulated winter photoperiod) induces transcription of a subset of photoperiod-history sensitive genes [3], dampens acute transcriptomic responses to increased salinity, and results in enhanced seawater growth [11]. These findings support a model of smolt development regulation in which photoperiodic history drives the genome regulatory remodeling underlying key smoltification-associated phenotypes.
Although gill physiology has received most attention in the smoltification literature, other organs such as the liver also undergo large changes in function upon smoltification and seawater migration, with large implications for key metabolic traits. It has been shown that lipid composition in Atlantic salmon reared on different diets converges after smoltification [12, 13]. This is likely a consequence of smoltification associated increase in lipolytic rates and decreased lipid biosynthesis [6, 7]. In a recent study we demonstrated large changes in lipid metabolism gene regulation across the fresh-saltwater transition following smoltification [14]. Unfortunately, in this study smoltification and seawater transfer were confounded (i.e. smolts in freshwater were not sampled), hence it remains unclear if photoperiodic history is involved in shaping the molecular phenotype of the smolt liver as we observe in gills.
In this study, we conducted a smoltification trial to test if the photoperiodic history is a major factor impacting the genome regulatory landscape of Atlantic salmon liver. To do this we generated transcriptome, chromatin accessibility, and DNA methylation data across the smolt development and seawater transfer to characterize the transcriptomic changes in smolts reared with a short winter-like photoperiod (8:16) compared to smolts reared on constant light (24:0). We test if photoperiodic history affects the smolt liver phenotype at the level of gene expression and use chromatin accessibility data to identify putative regulatory pathways and transcription factors involved in life-stage associated changes in liver function from the juvenile stage in the freshwater environment to an adult fish in seawater.
Results
Gene expression changes support decreased lipid metabolism and increased protein metabolism and energy production during smoltification
A main goal was to determine the effects of smoltification on metabolism and whether exposure to a short photoperiod (i.e. a winter) affected gene regulation in the liver. To accomplish this, we reared three groups of salmon for 46 weeks on commercial diets, from parr, through smoltification, and until 6 weeks following transfer to seawater (Fig 1). The experimental group was given an artificial winter-like short photoperiod (8 hours light, 16 hours dark) for 8 weeks before being returned to constant light, while the photoperiod control group was reared under constant light throughout the experiment. Finally, the freshwater control group contained fish from the experimental group that were not transferred to sea. Following smoltification, fish transferred to seawater grew more slowly than fish that remained in freshwater (Fig 1, Table S1). There was no mortality throughout the freshwater portion of the trial, but there was some mortality (8 fish) in one tank due to improper oxygenation after seawater transfer.
Schematic of the experimental design and weight of salmon over time. Fish were reared for 21 weeks after first feeding in constant light conditions prior to week 1 sampling. The experimental group (black, solid line) was exposed to a short photoperiod before being switched back to constant light and sampled at week 10. After a smoltification period, fish were sampled at week 19, then transferred to seawater and sampled a final time at week 25. A photoperiod control group (grey, dashed line) received constant light throughout the experiment, and a freshwater control group branched off from the experimental group by remaining in freshwater. Four fish were sampled at each timepoint.
To characterize global transcriptome changes through key life stages, under a semi-natural developmental trajectory, we sampled liver tissue from fish at each sampling point for RNA sequencing. We first tested for changes in gene expression in fish experiencing artificial winter and transfer to seawater (experimental group) using an ANOVA-like test. This yielded 3,845 differentially expressed genes (DEGs, FDR <0.05) which were assigned to seven co-expression clusters using hierarchical clustering (Fig 2A, Table S2). These clusters reflected major patterns of gene regulatory changes (Fig 2B); peak expression levels in smolts (clusters 2 and 3), peak expression following the short photoperiod (clusters 4 and 5), decreased expression after short photoperiod and in smolts relative to all other time points (cluster 6), steady decrease in expression from parr throughout the experiment (cluster 7), and strong increase in expression in seawater (cluster 1).
A) Relative liver expression of genes differentially expressed between any time points in the experimental fish cohort (FDR <0.05). Scaled expression is denoted as gene-scaled transcripts per million. Genes were partitioned into seven co-expression clusters by hierarchical clustering. Colored bars indicate cluster membership when correlation to the mean cluster pattern was >0.5. Genes with correlation ≤0.5 were excluded. B) Gene expression trends over time by cluster. Colored lines indicate mean relative expression while grey lines are individual genes within the cluster. C) KEGG pathway enrichment by cluster. For each pathway, colored diamonds indicate the clusters in which it is significantly enriched (adjusted p <0.05). Colored bars indicate the proportion of genes within each pathway that belong to each cluster.
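The membership rule described in the caption (a gene is kept in a cluster only if its correlation to the mean cluster pattern exceeds 0.5) can be sketched as follows. This is a minimal illustration in plain Python and not the pipeline used in the study: the function names (`scale_gene`, `pearson`, `filter_cluster`) and the toy expression values are hypothetical, the use of z-scores for gene scaling and of Pearson correlation are assumptions, and the hierarchical clustering step is assumed to have been run already.

```python
from math import sqrt

def scale_gene(values):
    """Center and scale one gene's expression values across time points (z-scores)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def filter_cluster(cluster, threshold=0.5):
    """Keep genes whose profile correlates with the cluster mean above `threshold`.

    `cluster` maps gene name -> scaled expression profile.
    """
    profiles = list(cluster.values())
    n_points = len(profiles[0])
    mean_profile = [sum(p[i] for p in profiles) / len(profiles) for i in range(n_points)]
    return {g: p for g, p in cluster.items() if pearson(p, mean_profile) > threshold}

# Hypothetical expression values at weeks 1, 10, 19, and 25 for one candidate cluster.
raw = {
    "gene_a": [10.0, 12.0, 30.0, 11.0],  # peaks at week 19
    "gene_b": [5.0, 6.0, 20.0, 7.0],     # peaks at week 19
    "gene_c": [40.0, 35.0, 10.0, 38.0],  # opposite trend, should be excluded
}
scaled = {gene: scale_gene(values) for gene, values in raw.items()}
kept = filter_cluster(scaled)
print(sorted(kept))
```

In this toy case the two genes peaking at week 19 are retained and the gene with the opposite trend is dropped, mirroring how genes with correlation ≤0.5 were excluded from the heatmap.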
To associate well-defined metabolic or signaling processes with the different gene expression trends, we performed KEGG enrichment analysis on each co-expression cluster, yielding 56 unique significantly enriched (adjusted p <0.05) pathways (Fig 2C). Genes in clusters 2 and 3, which increased during smoltification and sharply decreased after seawater transfer, were enriched in pathways related to genetic information processing, cell growth, protein metabolism, and oxidative phosphorylation. Genes in clusters 4 and 5, which had peak expression after a short photoperiod and decreased during smoltification and seawater transfer, were similarly enriched in genetic information processing pathways and energy metabolism; however, they were also enriched in several pathways related to amino acid metabolism, including cysteine and methionine metabolism, glutathione metabolism, and selenocompound metabolism. Cluster 1 genes strongly increased in relative expression after seawater transfer and were exclusively enriched in the ribosome pathway. Genes in cluster 7, which decreased in relative expression during smoltification and remained low after seawater transfer, were enriched mainly in lipid, amino acid, and carbohydrate metabolic pathways, ABC transporters, and signaling pathways including FoxO signaling and PPAR signaling.
Since many KEGG pathways contain enzymes with reciprocal activities, we manually examined genes within select enriched KEGG pathways to determine what was driving enrichment trends. In lipid metabolic pathways we observed a distinct bias towards downregulation in freshwater smolts among genes relating to long-chain fatty acids. Seven long-chain-fatty-acyl-CoA ligase (acsl) genes (acsl1, three acsl3 and three acsl4), acetyl-CoA carboxylase (acc1), three acetyl-CoA synthetase genes (acs2l-1, acs2l-2, and acs2l-3), several key genes related to polyunsaturated fatty acid (PUFA) biosynthesis (5fad, 6fada, 6fadb, elovl5b, and elovl2), both copies of fatty acid synthase (fas1 and fas2), and three copies of the key gene diacylglycerol acyltransferase (two dgat1 and one dgat2) all significantly decreased during smoltification and remained lowly expressed through seawater transfer (Fig 3B).
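The text does not specify how the enrichment p-values were computed or adjusted. As a hedged sketch of one common approach, the example below uses a one-sided hypergeometric over-representation test followed by Benjamini-Hochberg adjustment. Both choices are assumptions rather than the documented method, and the function names, pathway names, and counts are hypothetical.

```python
from math import comb

def enrichment_p(cluster_size, pathway_size, overlap, background_size):
    """P(X >= overlap) when drawing `cluster_size` genes from the background
    without replacement, with `pathway_size` genes annotated to the pathway."""
    total = comb(background_size, cluster_size)
    upper = min(cluster_size, pathway_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical toy data: 20 background genes, a cluster of 5 genes,
# and three pathways given as (pathway size, overlap with the cluster).
pathways = {"pathway_A": (5, 3), "pathway_B": (5, 1), "pathway_C": (4, 4)}
names = list(pathways)
raw_p = [enrichment_p(5, size, overlap, 20) for size, overlap in pathways.values()]
adj_p = benjamini_hochberg(raw_p)
significant = [name for name, p in zip(names, adj_p) if p < 0.05]
print(significant)
```

Only the pathway whose genes all fall inside the cluster passes the adjusted threshold in this toy case, which illustrates why the reported 56 pathways are those surviving multiple-testing correction rather than all pathways with some overlap.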
A) Schematic of the lipid biosynthesis pathway in Atlantic salmon. B) Relative expression of genes in the pathway over time. Acetyl-CoA and malonyl-CoA synthesis (red) displays genes acc1, acs2l-1, acs2l-2, and acs2l-3. Fatty acid activation (green) displays genes acsl1, acsl3l-1, acsl3l-2, acsl3l-3, acsl4, acsl4l-1, acsl4l-2. Fatty acid synthesis (purple) displays gene fas1 and fas2. Poly unsaturated fatty acid (PUFA) synthesis (blue) displays genes 5fad, 6fada, 6fadb, elovl2, and elovl5b.
Synthesis of acetyl-CoA by acs and activation of long-chain fatty acids by acsl are the first obligatory steps for entry into beta-oxidation or biosynthesis pathways (Fig 3A), so a decrease in these gene products likely means that metabolism of their substrates (acetate and C12 to C20 fatty acids) also decreases [15]. Finally, the three copies of the key gene diacylglycerol acyltransferase (two dgat1 and one dgat2), which catalyzes the last committed step in triacylglycerol biosynthesis [16], decreased in freshwater smolts. Collectively, co-downregulation of these important lipid-associated genes is a strong indicator of decreased utilization and processing of fatty acids, especially long-chain polyunsaturated fatty acids (LC-PUFA), in smolts preparing to enter a seawater environment.
We identified genes directly influenced by seawater transfer by performing a pair-wise test for gene expression changes between the experimental group and freshwater control group at week 25. This resulted in 2,121 DEGs (FDR <0.05, Table S5), most of which (1,227) were downregulated in seawater compared to freshwater (Fig S1). Many of these DEGs belonged to co-expression clusters 1 (281) and 2 (529) in Fig 2. Regarding lipid metabolism, processes related to de novo fatty acid synthesis decreased in seawater relative to the control. Both copies of fatty acid synthase (fas1 and fas2) and one acetyl-CoA carboxylase (acc1), all of which catalyze key steps in de novo fatty acid synthesis [17], decreased in expression in seawater. Additionally, two other acsl genes (acsbg2 and acsl1) known to be involved in saturated and monounsaturated fatty acid activation were downregulated in seawater [18]. This coincided with an increase in several thioesterase genes, including acot1l and acot5l, responsible for de-activation of fatty acids through the hydrolysis of acyl-CoAs [19]. It is unlikely that de-smoltification occurred in the freshwater control smolts because expression of these genes remained stable between weeks 19 and 25 in the control fish. This combination of decreased expression of key de novo biosynthesis genes and increased fatty acid de-activation through greatly increased thioesterase expression suggests a reduced capacity to synthesize fatty acids in the liver of fish after transition to sea, in line with previous findings [14].
No long-term effect of short photoperiod exposure in liver
To evaluate the role of photoperiodic history in the development of liver function during smoltification, we performed pair-wise tests for gene expression changes between experimental fish exposed to a short photoperiod and control fish on a constant light regime at week 10 (at the end of the short photoperiod exposure) and at week 19 (just prior to seawater transfer). We identified a relatively short list of DEGs (532, FDR <0.05, Table S3) associated with photoperiodic history differences at week 10, but only a few DEGs at week 19 (15, Fig 4, Table S4). At week 10 we found that a vitamin D 25-hydroxylase gene, whose product catalyzes the first step in the formation of biologically active vitamin D [20], was strongly downregulated after a short photoperiod. This is likely due to decreased UV-mediated vitamin D synthesis in the skin from less exposure to light [21]. The low number of DEGs at week 19, and the low overlap between week 10 and week 19 expression changes, indicate that different photoperiodic histories did not have longer-term effects on gene expression in the liver following smoltification.
Relative liver expression of differentially expressed genes (DEGs, FDR <0.05) between the short photoperiod exposed experimental group and long photoperiod control group at weeks 10 and 19. Genes are marked to the right if differentially expressed higher (black) or lower (yellow) in response to a short photoperiod at weeks 10 and/or 19.
Remodeling of liver transcription factor binding across smoltification
To better understand the mechanistic drivers shaping changes in liver gene expression through salmon life stages, we generated ATAC-seq data to measure chromatin accessibility and used this to indirectly quantify transcription factor (TF) occupancy at predicted TF binding sites (TFBS) by assessing local drops in chromatin accessibility (i.e. footprints). For each time point of the experimental group across the 25-week experiment, we generated two replicate samples of ATAC-seq data from the same livers sampled for RNA-seq, at a depth of 55-72M reads. Reads were aligned to the genome (41-63M aligned reads), and peaks where reads were concentrated were called to represent regions of accessible chromatin. A unified set of ATAC peaks was made by merging peaks across the different weeks (File S1). A principal component analysis (PCA) on the samples' read counts over the unified peaks showed pairing of replicates and separation between the weeks (Fig 5A). PC1 separated weeks 1 and 19 from weeks 10 and 25, while PC2 separated the pre-smolt weeks (1 and 10) from the post-smolt weeks (19 and 25). The unified set of 201k peaks was composed of peak sets from each week, with week 19 having the highest number of peaks (181k, Fig 5B). Most of the peaks at each week were shared across sets, with week 19 standing out as having the greatest number of unique peaks (Fig 5C). Peaks were highly concentrated around the TSS of genes, as expected for regions associated with gene regulation (Fig 5D). Peaks were mostly found in introns or intergenic regions, suggesting, as expected, a higher proportion of peaks at enhancer than at promoter elements (Fig 5E).
A) Similarity of ATAC-seq samples at different weeks by principal component (PC) analysis of ATAC-seq read counts within a unified set of ATAC peaks (shared and unique between weeks). B) Number of ATAC peaks called at each week. C) Number of peaks intersecting between sets or unique to each week. D) Distribution of distances (in base pairs) of unified peaks to the nearest gene transcription start site (TSS). E) Genomic locations of unified peaks.
Little is known about the environmentally driven changes in gene regulatory pathways in salmon. We therefore used TF footprinting analysis to identify local drops in read coverage at TFBS within peaks, which indicate a bound TF (i.e. occupancy) at that site in the given sample. We first describe the TFs showing genome-wide changes in TFBS binding between livers sampled in different photoperiods and water salinities (Table S6). Since developmental stage (age) can also impact TF binding patterns, we focused on TFs with differences in occupancy that persist across environmental contrasts with fish from different developmental stages (Fig 6A and B). Using a cutoff for differential TF occupancy (absolute log2 fold change in genome-wide TF-motif occupancy >0.1), we identified 33 and 35 TF binding motifs that were associated with photoperiod and salinity, respectively (Fig 6C, Table S7).
Volcano plots show genome-wide fold changes and significance for transcription factor binding site (TFBS) binding scores between A) fresh- to saltwater weeks and B) short to long photoperiod weeks. Transcription factor (TF) motifs with significant changes in global binding scores (absolute log2 fold change >0.1) across each contrast are colored. C) Heatmap shows the scaled number of total bound TFBS across the weeks for the TF motifs that are significant in A) and B). The ‘regulation’ color indicates in which environment the TF motif had the greater TFBS binding score.
Most (30/35) TF motifs associated with water salinity differences were found to have a marked drop in genomic occupancy after transition to saltwater (Fig 6C). These include several motifs known to bind TFs associated with energy homeostasis related processes, such as TEFs, GATA4, NR5A1, MAF, and KLFs [22-24]. Only five TF motifs had a significantly higher occupancy in saltwater, including LIMs and two homeobox binding motifs (PHOX2B and HOXC8). The photoperiod contrast (Fig 6B and 6C) revealed that most TFBSs with induced occupancy after a short day period were binding sites for E-26 family transcription factors (ETS, ERG, ETV, SPIC, ELK, SPI1), which have been associated with regulation of circadian genes in other species [25, 26]. It is interesting to note that occupancy of these TF binding sites, which spikes after a short photoperiod, drops dramatically towards the end of smoltification (week 19) and stays low after seawater transfer (week 25). TF binding sites with reduced occupancy following a period of shorter days were mostly related to homeostasis of cellular metabolism, including key liver glucose and lipid metabolism regulators PPARD [27], RXRs [28] and HNF4A [29], as well as thyroid hormone receptor beta (THRB) [30].
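To give a sense of scale for the cutoff, an absolute log2 fold change of 0.1 corresponds to roughly a 7% difference in binding score between conditions (2^0.1 ≈ 1.07). The sketch below applies such a cutoff to per-motif scores. It is illustrative only: the function names, motif names, and score values are hypothetical, and it ignores the significance axis of the volcano plots.

```python
from math import log2

def log2_fold_change(score_a, score_b):
    """log2 ratio of a motif's genome-wide binding score in condition A over B."""
    return log2(score_a / score_b)

def differential_motifs(scores_a, scores_b, cutoff=0.1):
    """Return motifs whose absolute log2 fold change exceeds `cutoff`."""
    hits = {}
    for motif, score_a in scores_a.items():
        lfc = log2_fold_change(score_a, scores_b[motif])
        if abs(lfc) > cutoff:
            hits[motif] = lfc
    return hits

# Hypothetical mean binding scores per motif in two environments.
seawater = {"MOTIF_1": 1.00, "MOTIF_2": 0.80, "MOTIF_3": 1.02}
freshwater = {"MOTIF_1": 1.00, "MOTIF_2": 1.00, "MOTIF_3": 1.00}

hits = differential_motifs(seawater, freshwater)
for motif, lfc in hits.items():
    direction = "higher in seawater" if lfc > 0 else "higher in freshwater"
    print(f"{motif}: log2FC = {lfc:.2f} ({direction})")
```

Here only the motif with a 20% lower score in seawater passes the cutoff, while a 2% difference does not.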
Next, we wanted to link TF binding patterns to specific gene regulatory dynamics. We assigned TF motifs to genes (i.e. motif-gene pairs) by closest proximity and asked if genes with a particular expression pattern (Fig 7A) were enriched for nearby TF motifs with a corresponding pattern of TF occupancy (Fig 7B). For example, for genes in expression cluster 1 with highest expression at week 25, we expect nearby binding sites to be enriched in TF motifs that are occupied by TFs in week 25, but not at the other time points. Indeed, genes in most expression clusters displayed significant enrichments of TF motifs with expected binding patterns (colored red or orange), and these signatures were quite distinct for each gene expression cluster (Fig 7C, Table S8). A diagram illustrates how Fisher’s exact tests compared gene expression clusters with TF binding patterns to assess TF motif significance (Fig 7D).
A) Gene expression trends of clusters from Fig 2. B) Assumed weeks the transcription factor binding sites (TFBS) near genes in the cluster would be bound by a transcription factor (TF) to regulate transcription at the weeks of highest expression. A primary and secondary assumed binding pattern is colored red and orange respectively. C) For each set of genes in a cluster, each TF was tested if the TFBS near to those genes were enriched for any combination of binding pattern. Fisher’s exact test results for all TF-binding pattern combinations are plotted per cluster as odds ratios against the significance. The test results for the assumed primary and secondary binding patterns in B) are colored red and orange, respectively. A top proportion of the most significant TFs (in a top quantile per test) are labeled. D) Diagram showing how each combination of TFBS motif, expression cluster, and binding pattern was tested for enrichment with a Fisher’s exact test.
From the enrichment results (Fig 7C) we see that genes in cluster 1, enriched for ribosome related functions, had nearby TFBS with NFIX binding following transition to seawater. NFIX is a transcriptional regulator known to be involved in ribosome biogenesis [31]. In addition, cluster 1 TFBS were associated with binding of FOS and JUN after seawater entry. These TFs are major components of the Activator Protein 1 (AP-1) transcription factor complex, which is responsive to growth factors and drives cell proliferation and differentiation [32-34]. The top associated TFs for cluster 2 genes were ZNF341, known to be involved in immune homeostasis [35], and several Fox TFs (A, F, L, I, K) linked to various cell physiological processes [36, 37]. Cluster 3 genes were associated with several unnamed zinc finger transcription factors (ZNFs and ZBTBs) binding at week 19 prior to seawater transfer, as well as binding of RREB1 and EGR1 in week 19 and week 25. Among TFs with binding in week 19 only, we find TFs linked to regulation of immune cell function (ZBTB32 and ZNF263) [38, 39] as well as oxidative phosphorylation (ZNF682) [40]. RREB1 and EGR1 are well-described players in RAS signaling pathways [41-43] involved in cell growth and proliferation. Among cluster 4 genes we find enrichment of insulin and sugar metabolism functions, with PKNOX1 and NFYB TF binding significantly associated with their expression patterns. Both these TFs have been shown to function in lipid metabolism and to be linked to insulin signaling [44-46]. The top TF associated with cluster 6 gene expression is NR1D1 (also called Rev-erbα), a core component of the circadian clock and regulator of lipid metabolism [47]. Cluster 6 is not enriched for any KEGG pathways but has a marked drop in expression after the short photoperiod exposure.
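The enrichment test outlined in panel D can be illustrated with a small 2x2 contingency table. The sketch below is a simplified, gene-level stand-in for the motif-gene pair test described in the text: it asks whether genes in a cluster are over-represented among genes whose nearby sites for one motif show a given binding pattern. The function name, the gene sets, and the choice of a one-sided test are assumptions made for illustration.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for enrichment on the table [[a, b], [c, d]].

    a: in cluster, with binding pattern      b: in cluster, without pattern
    c: not in cluster, with binding pattern  d: not in cluster, without pattern
    Returns (odds_ratio, p_value), where p_value = P(X >= a) under the
    hypergeometric null with fixed margins.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    p_value = tail / comb(n, row1)
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, p_value

# Hypothetical gene sets: background genes, one expression cluster, and the
# genes whose nearest sites for a motif follow the assumed binding pattern.
background = {f"g{i}" for i in range(1, 9)}
cluster = {"g1", "g2", "g3", "g4"}
with_pattern = {"g1", "g2", "g3", "g5"}

a = len(cluster & with_pattern)
b = len(cluster - with_pattern)
c = len((background - cluster) & with_pattern)
d = len(background - cluster - with_pattern)

odds_ratio, p_value = fisher_exact_greater(a, b, c, d)
print(f"odds ratio = {odds_ratio:.1f}, p = {p_value:.3f}")
```

With only eight genes the odds ratio is high but the p-value is not small, which shows why panel C plots odds ratios against significance rather than reporting either alone.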
In the final cluster 7, enriched for genes playing roles in amino acid, glucose, and lipid metabolism, we find very strong associations with binding of several TFs, including KLF and SP family members. Indeed, these TFs are known to play important roles in regulating gluconeogenesis and lipid and amino acid metabolism in mammalian livers [24, 48].
DNA methylation not linked with gene expression
To investigate the role of DNA methylation in the gene expression changes throughout salmon life-stages, we produced an RRBS dataset from the same liver samples at a depth of 26-40M reads. A consensus set of 1.2M CpG positions was used for differential methylation analysis (Table S9). To assess genome-wide differences in CpG methylation, we performed a principal component analysis (PCA) on the methylation levels across the CpG consensus set. There was no clear separation of samples by timepoint or experimental condition, with PC1 and PC2 each explaining less than 5% of the variance (Fig S3A). To find specific sites of differential methylation, we tested all CpGs for differences in methylation score across any timepoints of the experimental group with an ANOVA-like approach (Table S10). Out of the consensus set of CpGs, 2535 (0.2%) were differentially methylated cytosines (DMC) across life-stages (FDR <0.05, fold change >25%, Fig S3D-F). 209 of these were present in promoter regions, 103 in exons, 664 in introns, and 782 in intergenic regions (Fig S3E). Most genomic regions with a DMC had one differentially methylated site and only a few regions contained longer stretches (up to 26 bp in an intron) of differentially methylated CpGs (Fig S3D). We assessed whether these CpGs were associated with genes with a specific function, but enrichment tests for GO or KEGG terms gave no significant results. We next examined whether the methylation percentages of DMCs correlated with changes in their corresponding gene’s expression across timepoints. 157 DMC-gene pairs were significantly correlated (p <0.05, Pearson correlation coefficient >0.95); however, most of these genes were relatively lowly expressed.
Simulations of random DMC-gene pairs showed that the distribution of correlation values from our real data was not significantly different from that of simulated pairs (p-value <0.62, Fig S3G), arguing against strong links between regulation of CpG methylation and gene expression in our experiment.
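The idea of comparing real DMC-gene correlations with randomly paired ones can be sketched with a simple permutation scheme. The text does not describe the exact simulation or the statistic used to compare distributions, so everything below is an assumption made for illustration: the function names, the choice to shuffle gene profiles against DMC profiles, the use of the count of strongly correlated pairs as the statistic, and the toy values.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def count_strong(meth, expr, threshold=0.95):
    """Number of DMC-gene pairs with absolute correlation above `threshold`."""
    return sum(1 for m, e in zip(meth, expr) if abs(pearson(m, e)) > threshold)

def empirical_p(meth, expr, n_perm=1000, threshold=0.95, seed=0):
    """Compare the observed count of strong pairs with counts from random pairings."""
    rng = random.Random(seed)
    observed = count_strong(meth, expr, threshold)
    shuffled = list(expr)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the true DMC-gene pairing
        if count_strong(meth, shuffled, threshold) >= observed:
            at_least += 1
    # Add-one correction keeps the empirical p-value above zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical methylation fractions for five DMCs at four time points; the toy
# expression profiles are exact multiples, so every true pair is correlated.
meth = [
    [0.1, 0.2, 0.3, 0.9],
    [0.8, 0.1, 0.5, 0.2],
    [0.3, 0.9, 0.2, 0.4],
    [0.5, 0.4, 0.9, 0.1],
    [0.2, 0.6, 0.1, 0.8],
]
expr = [[value * 100 for value in profile] for profile in meth]

observed, p_value = empirical_p(meth, expr)
print(observed, p_value)
```

In this constructed case the true pairing is far stronger than random pairings, so the empirical p-value is small. A large p-value, as reported for the real data, would instead mean that random pairings produce strong correlations about as often as the true DMC-gene pairs do.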
Discussion
Lipid metabolism remodeling as a pre-adaptation to life at sea
An important feature of smoltification is how the process prepares the juvenile fish for life at sea, i.e. the physiological necessities for survival are already present while the fish is still in freshwater. This is well documented for the osmoregulatory machinery in the gills [49-54]. For example, the likely causal agent for saltwater tolerance, NKA α1b, increases in abundance in freshwater gill tissues [8]. While lipid metabolism related gene expression is known to decrease in the liver of seawater-stage Atlantic salmon [14], this is the first report that systemic downregulation of lipid metabolism gene expression actually occurs before transition to sea (Figs 2 and 3). Given that the availability of polyunsaturated fatty acids is higher in seawater than in freshwater environments [55], and that the body lipid composition changes to match this in freshwater smolts [56], it is likely that the observed decrease in lipid metabolism is a genetically programmed preadaptation to life at sea. This study therefore adds another feature to the list of pre-adaptations in freshwater smolts.
The effect of photoperiod history on genome regulation during smolt liver development
Decades of research have revealed that many features of smolt physiology development can be affected by photoperiodic history [3, 57-64], including gene expression patterns. For example, a recent study of gill transcriptomes identified a subset of 96 genes with significantly increased gene expression levels in smolt exposed to a short photoperiod (8:16) during development compared to smolts kept on constant light (24:0) [3]. In this study, however, no long-term effects of short photoperiod exposure were found in liver transcriptomes (Fig 4). We do, however, show that salmon liver gene regulation is responsive to variation in photoperiods. Transcriptome profiles through smolt development show distinct gene sets with increased and decreased expression after reduction in photoperiod (Fig 2A clusters 5 and 6). Furthermore, analyses of TF binding dynamics (Fig 6) identify photoperiod sensitive TFs encoded by genes known to have repressed expression under short photoperiod in mammals, such as retinoic acid related TFs (RXRs) and thyroid hormone receptors (THRB) [65, 66]. In addition, our integrated analyses of gene expression profiles and TF binding (Fig 7) associated NR1D1, a core component of the mammalian circadian clock [47], with genes having lower expression after exposure to short photoperiod. Taken together, we conclude that smolt liver development does not seem to rely on having experienced a winter-like photoperiod. Yet since acute effects of reduced photoperiods had a large impact on gene regulatory networks related to metabolism (Figs 6 and 7), it is likely that highly divergent photoperiodic histories can lead to delayed spillover effects and result in differences in metabolic states.
Linking genome regulatory layers to understand the developing smolt liver
Salmon experience gene regulatory changes [13, 14] and function [6, 7] in liver during smolt development in freshwater, and following seawater entry. Yet, no genome wide studies of DNA methylation and TFs involved in driving these transcriptional and physiological changes have been conducted. Here, we generated an RRBS dataset as well as an ATAC-seq dataset across smolt development in liver and used the latter to predict TF occupancy dynamics and map out putative major regulators of key developmental processes during smoltification (Figs 6 and 7).
Firstly, we showed that dynamic DNA methylation has a limited role in gene regulation in the liver during smoltification. Of the 1.17 M CpGs in our dataset, only about 2500 of these showed dynamic methylation during smoltification, and few of these were in the vicinity of differentially expressed genes. This echoes an earlier study on methylation changes associated with early maturation in Atlantic salmon in which the liver exhibits less dynamic methylation overall than brain and gonads [67]. Also in other organs of Atlantic salmon, gene expression changes seem to be controlled by other gene regulatory features than DNA methylation [68]. Despite the intriguing hypothesis of DNA methylation being an important gene expression regulator, and an epigenetic one at that, our study questions this role in the context of post-embryonic development. Indeed, many studies describe methylome changes during metamorphosis or other post-embryonic transitions but does not provide strong evidence for a causative connection between changes in gene expression and changes in DNA methylation [69-71].
Previous studies of liver physiology during parr-smolt transformation highlight decreased lipid and glycogen biosynthesis and increased levels of glyco- and lipolysis [7]. In line with this, about 600 genes enriched for lipid, carbohydrate, and amino acid metabolism related functions displayed a clear decreasing trend in gene expression from parr (week 1) to smolt (week 19) (Fig 2, cluster 7). Several TFs showed highly significant TFBS binding associations with the cluster 7 gene expression profile (Fig 7C), including KLF/SP gene family members known to play important roles in regulating gluconeogenesis and lipid and amino acid metabolism in mammalian livers [24, 48]. Finally, genes in expression cluster 4, also showing a marked decrease in expression in smolts (week 19), were enriched for TFBS that had a significant drop in NFY binding from week 19. This TF is known to be a major regulator of lipid metabolism, including biosynthesis of fatty acids [44]. Concurrently, but with opposite expression trends, genes involved in oxidative phosphorylation related functions (the last step in breaking down amino acids, lipids, and carbohydrates to energy) increased in expression from parr to smolts (Fig 2, cluster 3). The TFBS of these same genes were enriched for binding of ZNF682 in smolts in week 19 (Fig 7C), a nuclear-encoded TF that regulates oxidative phosphorylation in human cells [40]. Together, these results suggest that increased ZNF682 occupancy, in combination with reduced KLF and NFY promoter binding, is linked to the liver metabolic shift from synthesis to breakdown of organic compounds as fish undergo parr-smolt transformation.
Following the parr-smolt transformation, the transition to a life in seawater is also known to be associated with additional changes in physiology in Atlantic salmon. Genes increasing in expression in seawater (Fig 2, cluster 1) were involved in ribosome biogenesis, and their TFBS were associated with increased occupancy in seawater (Fig 7C) of NFIX, a TF reported to impact ribosome biogenesis in mammals [31]. Another well-known route to increased ribosome gene expression and protein synthesis is the induction of the mTOR pathway [72, 73]. Interestingly, seawater entry is known to trigger increased growth hormone levels in salmon [74, 75], and this hormone acts as a rapid activator of protein synthesis through the mTOR pathway [72]. Furthermore, the seawater growth hormone increase can also be linked to the second group of TFs putatively involved in gene expression induction after seawater transition (Fig 2), namely JUN and FOS (Fig 7C). These genes, originally known as oncogenes, are also responsive to growth hormones and regulate cell proliferation and differentiation [32-34]. These TFs provide a possible molecular basis for linking growth hormones to the increased growth capacity of smolt in seawater [4]. In our experiment, freshwater control fish were larger than fish transferred to sea, but this can be explained by a known initial suppression in growth and feeding followed by increased growth rates [76]. Finally, genome-wide footprint signals (Fig 6) also pointed to large changes in the binding of TFs involved in energy homeostasis after seawater entry [22-24], further underpinning the metabolic gear shift. In conclusion, our data suggest that the genome regulatory dynamics in smolt livers across the fresh- to seawater transition are likely driven to a large extent by an increase in circulating growth hormone, resulting in activation of major regulatory pathways (e.g. JUN/FOS) for cell growth and differentiation.
Materials and methods
Smoltification trial
Atlantic salmon eggs, provided by AquaGen Breeding Centre Kyrksæterøra, Norway, were sterilized at the Norwegian University of Life Sciences (NMBU) fish lab and incubated at 350 to 372 day-degrees until hatching. First feeding of fry was five weeks after hatching, when the egg sac had been depleted. Fry were reared in two replicate tanks on a standard commercial diet high in EPA and DHA fats for the duration of the trial. As the fish grew, some occasionally needed to be euthanized to maintain adequate dissolved oxygen levels in the tanks. Sampling began 21 weeks after first feeding (designated week 1) and was repeated at weeks 10, 19, and 25. Sampled fish were euthanized by a blow to the head, and samples of liver tissue were cut into ∼5 mm cubes, placed in RNAlater, and incubated for at least 30 minutes at room temperature before long-term storage at -20°C. One week after the first sampling, some fish from each tank were transferred to replicate photoperiod control tanks where the day length remained unchanged. At the same time, the experimental tanks’ photoperiod was switched to “winter-like” lighting conditions with 8 hours of light per day for 8 weeks to trigger smoltification before returning to “spring-like” conditions with 24 hours of light per day. Immediately after the week 19 sampling, some fish from each experimental tank were transferred to seawater conditions at the Norwegian Institute for Water Research (NIVA), Solbergstranda, Norway. UV-sterilized seawater used in this life stage had a salinity of 3%-3.5% and was obtained from the Oslofjord. Fish were sedated before transport and allowed to acclimatize for several hours before being slowly introduced to the new water conditions. The fish that were not transferred to seawater were sampled as freshwater controls at the same time as the experimental fish. All animals used in this study were handled in accordance with the Norwegian Animal Welfare Act of 19th June 2009.
RNA sequencing
For RNA sequencing we extracted total RNA from liver samples of the experimental and control groups taken on weeks 1, 10, 19, and 25, in replicates of four, with the RNeasy Plus Universal Kit (QIAGEN). Concentration was determined with a NanoDrop 8000 spectrophotometer (Thermo Scientific) and quality was assessed on a 2100 Bioanalyzer using the RNA 6000 Nano Kit (Agilent). Extracted RNA with an RNA integrity number (RIN) of at least eight was used to make RNA-seq libraries using the TruSeq Stranded mRNA HT Sample Prep Kit (Illumina). Mean length and library concentration were determined by running libraries on a 2100 Bioanalyzer using a DNA 1000 Kit (Agilent). RNA-seq libraries were sequenced by the Norwegian Sequencing Center (Oslo, Norway) on an Illumina HiSeq 4000 using 100 bp single-end reads at a depth of 25-43M reads per sample.
Gene expression quantification
Gene expression was quantified from RNA-seq fastq files through the nf-core rnaseq pipeline (v3.9), which involves quality control, read trimming and filtering, read alignments to the Atlantic salmon genome and gene annotations (NCBI refseq 100: GCF_000233375.1_ICSASG_v2) with STAR aligner, and read quantification from alignment with the salmon program. See pipeline documentation for further details on all steps: nf-co.re/rnaseq. Gene level counts, length scaled, were used for differential expression testing, and gene transcript per million (TPM) values used for visualizations including expression heatmaps and line plots. A PCA of gene TPMs showed one sample at week 10 (week_10_2_3, Biosample: SAMEA14383461) as an outlier (Fig S2) so we removed this sample from the analysis.
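As an illustration of the TPM unit used for the visualizations, the following minimal Python sketch converts read counts to TPM. In the study itself these values come from the salmon program within the pipeline; the function name `tpm`, the use of effective gene lengths, and the example numbers are assumptions made only for this illustration.

```python
def tpm(counts, effective_lengths):
    """Convert per-gene read counts to transcripts per million (TPM).

    counts and effective_lengths are parallel lists (hypothetical inputs).
    """
    if len(counts) != len(effective_lengths):
        raise ValueError("counts and effective_lengths must have equal length")
    # Reads per base of transcript: corrects for gene length.
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that the values of one sample sum to one million.
    return [r / total * 1e6 for r in rates]


# Made-up example: three genes in one sample.
example = tpm([100, 200, 300], [1000.0, 2000.0, 1000.0])
```

Because the values of each sample sum to one million, TPM is convenient for comparing relative expression in heatmaps and line plots, whereas the differential expression tests use counts.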
Differential expression analysis
Genes were first tested for differential expression across all experimental group time points using an ANOVA-like test with the edgeR R package (v3.36) [77], using the generalized linear model and quasi-likelihood F-test function (glmQLFTest) to test for differences between weeks 1, 10, 19, and 25. Differentially expressed genes (DEGs) were selected from the results using an FDR cutoff of <0.05. Euclidean distances between DEGs were calculated from TPM values across the samples, and the DEGs were separated into 7 clusters. We chose 7 clusters based on observing patterns in the expression heatmap and comparing the within- and between-cluster sums of squares for different numbers of clusters. DEGs had to have a correlation >0.5 to the mean expression values of their assigned cluster; otherwise they were excluded from the cluster. Genes were also tested for differential expression between experimental and control groups (short and long photoperiod history, respectively) at weeks 10 and 19 separately, in pair-wise exact tests with edgeR (v3.36), selecting DEGs with an FDR cutoff of <0.05. Enrichment of KEGG pathways in sets of DEGs was tested with the clusterProfiler R package (v4.2.2), using the pathway data for Atlantic salmon genes within the KEGG database.
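The cluster membership rule described above (correlation >0.5 to the cluster mean) can be sketched as follows. This is a minimal Python illustration, not the authors' R code: it assumes Pearson correlation and an unscaled mean profile, neither of which is specified in the text, and the gene names and values are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def filter_cluster(profiles, min_r=0.5):
    """Keep genes whose profile correlates > min_r with the cluster mean.

    profiles: dict mapping gene -> list of expression values, with the
    same sample order for every gene (hypothetical input format).
    """
    genes = list(profiles)
    n_samples = len(profiles[genes[0]])
    # Mean expression of the cluster at each sample.
    mean_profile = [
        sum(profiles[g][i] for g in genes) / len(genes) for i in range(n_samples)
    ]
    kept = [g for g in genes if pearson(profiles[g], mean_profile) > min_r]
    return kept, mean_profile


# Made-up cluster: geneC runs against the overall trend and is excluded.
cluster = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 3.0, 5.0, 6.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
kept, mean_profile = filter_cluster(cluster)
```

A filter of this kind removes genes that a distance-based clustering assigned to a cluster without sharing its overall expression trend.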
ATAC sequencing
The protocol for the ATAC assay was based on that in Buenrostro et al. 2013 [78]. Two replicate liver tissue samples were used from the previous sampling of the experimental group on weeks 1, 10, 19 and 25. The liver tissues were washed and perfused with cold PBS to remove blood before being dissociated and strained through a cell strainer. The nuclei were isolated from the cell homogenate by centrifugation and counted on an automated cell counter (TC20 BioRad, range 4-6 µm). Transposition of 100k (weeks 1, 19 and 25) and 75k (week 10) nuclei was performed with Tn5 transposase from the Nextera DNA Library Preparation kit. The resulting DNA fragments were purified and stored at -20°C. PCR amplification with addition of sequencing indexes (Nextera DNA CD) was done according to Buenrostro et al. 2015 [79], with a test PCR performed to determine the correct number of amplification cycles. The ATAC libraries were cleaned with Ampure XP beads and assessed on a Bioanalyzer (Agilent) using High Sensitivity chips. Library quantity was determined using a Qubit Fluorometer (Thermo). Mean insert size for the libraries was 190 bp. Sequencing was done on a HiSeq X lane using 150 bp paired-end reads at a depth of 55-72M reads per sample.
ATAC peak calling
Calling of ATAC-seq peaks was done through the nf-core atacseq pipeline (v1.2.1), which involves quality control, read trimming and filtering, read alignments to the Atlantic salmon genome (BWA aligner), and calling of narrow peaks per sample as well as a unified narrow peak set across all samples (MACS2 peak-caller). Data for QC results including PCA of samples, intersection of peaks sets, peak distances to TSS, and peak genomic locations, were also computed through the pipeline. See pipeline documentation for further details on all steps: nf-co.re/atacseq.
TF footprinting
TF footprinting in the unified ATAC peak set previously generated was done using the TOBIAS program (v7.14.0) [80]. Using the ATAC-seq read alignment BAM files for each week (reads from replicates combined), we identified ‘footprints’ within peaks (i.e. dips in read depth within peaks), indicative of TF proteins binding to the DNA and locally blocking transposase activity during the ATAC protocol. We used a set of TFBS motifs from the JASPAR database (2020 CORE vertebrates non-redundant) to associate these footprints with specific TFs. Peaks were associated to a gene by closest proximity to the TSS during the atacseq pipeline. A blacklist file (generated in-house) was used to mask simple repetitive regions in the genome from analysis. Data for the genome-wide changes in TF binding were taken from the ‘bindetect_results.txt’ file produced by TOBIAS, plotting the change in TF binding scores between weeks against their p-values. The number of bound sites for each TF was scaled across weeks and used for the heatmap visualization of differences. We identified TFs changing in response to salinity or photoperiod by intersecting TF binding results between the different weeks. With a log2 fold change cutoff of >0.1, TFs that had an increase in binding in week 25 compared to both weeks 1 and 19 were assigned as ‘sea’, and inversely those with a decrease were assigned as ‘fresh’. Similarly for photoperiod, those increasing in week 10 compared to weeks 1 and 19 were assigned ‘short’ and those decreasing were assigned ‘long’. We tested for the enrichment of certain TF binding patterns (which weeks TFBS were bound) within the gene expression clusters previously identified. For this test we used the peak-gene annotation data mentioned previously to assign the TFBS to genes. Then, for each combination of TFBS motif type, expression cluster, and TF binding pattern, we used a Fisher’s exact test to test whether motifs more often had a specific TF binding pattern given a specific expression cluster than expected from the total background numbers (see Fig 7D for a diagram of the test). Results for binding patterns that did not involve a change, i.e. bound at all weeks or no weeks, were not shown in the results.
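The enrichment test described above can be sketched with a 2x2 contingency table. The Python example below is only an illustration under stated assumptions: it uses a one-sided (enrichment) Fisher's exact test, whereas the text does not say whether the test was one- or two-sided, and the table layout, function names, and counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` sites in the top-left cell by chance.
    """
    row1 = a + b   # TFBS of this motif near genes in the cluster
    col1 = a + c   # TFBS of this motif showing the binding pattern
    n = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, row1)


def odds_ratio(a, b, c, d):
    """Sample odds ratio; infinite when b * c is zero."""
    return float("inf") if b * c == 0 else (a * d) / (b * c)


# Hypothetical counts for one motif, one cluster, one binding pattern:
# a: in cluster, with pattern      b: in cluster, other patterns
# c: background, with pattern      d: background, other patterns
p_value = fisher_exact_greater(3, 1, 1, 3)
ratio = odds_ratio(3, 1, 1, 3)
```

In practice one such test would be run for every combination of motif, expression cluster, and binding pattern, which is why the resulting table (Table S8) reports a p-value and odds ratio per combination.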
Reduced Representation Bisulfite Sequencing
Livers from four fish per time point (three fish for week 25) were sampled, and liver tissue was stored in RNAlater at -20°C. The samples were processed with the Ovation RRBS Methyl-Seq System (NuGen), and bisulfite treatment was done with the Epitect Fast Bisulfite Conversion kit (Qiagen). RRBS libraries were checked on a Bioanalyzer (Agilent) using DNA 1000 chips. Paired-end sequencing was performed by Novogene on a HiSeq X (Illumina). The mean library insert size was 168 bp and read depth was 26-40M reads.
Alignment of bisulfite-treated reads and cytosine methylation calls
Quality trimming of reads was done with Trim Galore (v0.6.4) [81] and adapters were removed with cutadapt (v2.7) [82]. Bismark Bisulfite Mapper (v0.22.3) [81] was run with Bowtie 2 [83] against the bisulfite genome of Atlantic salmon (ICSASG_v2) [84] with the specified parameters: -q --score-min L,0,-0.2 --ignore-quals --no-mixed --no-discordant --dovetail --maxins 500. Alignments to complementary strands were ignored by default. About 40-50% of the reads were mapped to the genome. Methylated cytosines in a CpG context were extracted from the report.txt-files produced by the Bismark methylation extractor. The resulting coverage files containing methylated and unmethylated CpG loci for each sample were first filtered for known SNPs in the salmon genome and then used in the analysis of differential methylation. Coverage distribution around transcription start sites (TSS) and 20 kb upstream and downstream showed that the highest coverage of reads was found near TSS (Fig S3B), indicating that MspI digestion of CCGG sites has resulted in enrichment around TSS, as expected for the RRBS method. Using genome annotation information, we classified CpGs according to their genomic context (Fig S3C).
Differential methylation analysis
Samples were first organized with the R package methylKit (v1.9.4) [85], and the CpG loci were filtered by read coverage, discarding those with fewer than 10 reads per locus or above the 99.9th percentile of coverage in each sample, and those not assigned to chromosomes. CpG loci between replicates were merged, keeping those present in at least three of the samples. The differential methylation analysis was done with an ANOVA-like test in edgeR [77], contrasting the counts of methylated reads at different time points. Differentially methylated CpGs (DMCs) were called with an FDR <0.05. A heatmap of the DMCs shows row-scaled methylation percentage values (Fig S3F).
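The coverage filtering and merging steps can be illustrated with a short Python sketch. This is not the methylKit implementation: the function names, the locus keys, and the linear-interpolation percentile are assumptions for the example, and the read depths are made up.

```python
def percentile(values, q):
    """Percentile (q in 0..100) of a non-empty list, by linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def filter_sample(coverage, min_reads=10, hi_perc=99.9):
    """Drop loci below min_reads or above the hi_perc coverage percentile.

    coverage: dict mapping locus -> read depth for one sample.
    """
    cap = percentile(list(coverage.values()), hi_perc)
    return {loc: n for loc, n in coverage.items() if min_reads <= n <= cap}


def loci_in_min_samples(samples, min_samples=3):
    """Return loci present in at least min_samples filtered samples."""
    seen = {}
    for sample in samples:
        for loc in sample:
            seen[loc] = seen.get(loc, 0) + 1
    return {loc for loc, k in seen.items() if k >= min_samples}


# Made-up depths for four replicates. With so few loci the percentile cap
# always removes the deepest locus; real samples hold far more CpGs.
raw = [
    {"chr1:100": 12, "chr1:200": 5, "chr1:300": 30, "chr1:400": 5000},
    {"chr1:100": 15, "chr1:200": 11, "chr1:300": 25, "chr1:400": 40},
    {"chr1:100": 20, "chr1:300": 8, "chr1:400": 22},
    {"chr1:100": 10, "chr1:300": 18, "chr1:500": 19},
]
filtered = [filter_sample(s) for s in raw]
kept_loci = loci_in_min_samples(filtered)
```

The lower bound guards against unreliable methylation percentages at shallow loci, while the upper percentile cap removes loci with unusually high depth, which often reflect PCR duplication or repeats.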
Data availability
Sequencing data are available in the European Nucleotide Archive for RNA-seq (PRJEB52829), ATAC-seq (PRJEB65073), and RRBS (PRJEB60411). Code for running all steps of the analysis and generating results and figures is available on GitLab (gitlab.com/sandve-lab/GSFsmolt).
Supplementary material
Fig S1. Gene expression changes in response to salinity. Relative liver expression of genes differentially expressed (FDR <0.05) between the seawater exposed experimental group and freshwater control group at week 25. Genes are marked to the right if differentially expressed at week 25, green if higher in seawater, blue if higher in freshwater.
Fig S2. Principal component analysis of gene expression differences between samples. A) Similarity between all samples in the study, by principal components (PC) of gene expression. Experimental samples are colored red, and control samples for the same time points are colored gray. Labeled are potential outlier samples. B) Effect of removing sample ‘week_10_2_3’ from the PCA.
Fig S3. Methylated CpGs from RRBS assay. A) Similarity of samples by principal components (PC) of the read coverage of consensus CpGs. B) Genomic context of the differentially methylated CpGs. C) Heatmap of methylation values of differentially methylated CpGs during the smoltification trial. D) Density of correlation values between differentially methylated CpGs and gene expression levels, and density of correlation levels between random gene-CpG pairs.
Table S1. Phenotypic data of sampled Atlantic salmon. Phenotype data of Atlantic salmon used in this study. Columns provide: Unique fish identifier (Fish ID), week fish was sampled (Week #), tank number (Tank #), fish number (Fish #), date fish was sampled (Date sampled), fish weight (Weight (g)), fish length (Length (mm)), fish sex (sex (M/F)).
Table S2. Differentially expressed genes across smoltification. Atlantic salmon genes differentially expressed (FDR <0.05) between any time points across the smoltification experiment. Columns provide: the NCBI id for genes (gene_id), available gene name (gene_name), description of coded protein (description), and expression cluster number genes were assigned to (deg_cluster).
Table S3. Differentially expressed genes in response to photoperiod at week 10. Atlantic salmon genes differentially expressed (FDR <0.05) between week 10 experimental samples exposed prior to short photoperiod conditions, and week 10 control samples kept under continuous long photoperiod conditions. Columns provide: the NCBI id for genes (gene_id), available gene name (gene_name), description of coded protein (description), log2 fold change in expression of experiment versus control values (logFC), average expression across samples in log2 counts per million (logCPM), p-value of the differential expression test (PValue), false discovery rate adjusted p-value (FDR), and the time point that was tested, in this case week 10 (week).
Table S4. Differentially expressed genes in response to photoperiod at week 19. Atlantic salmon genes differentially expressed (FDR <0.05) between week 19 experimental samples exposed prior to short photoperiod conditions, and week 19 control samples kept under continuous long photoperiod conditions. Columns provide: the NCBI id for genes (gene_id), available gene name (gene_name), description of coded protein (description), log2 fold change in expression of experiment versus control values (logFC), average expression across samples in log2 counts per million (logCPM), p-value of the differential expression test (PValue), false discovery rate adjusted p-value (FDR), and the time point that was tested, in this case week 19 (week).
Table S5. Differentially expressed genes in response to seawater transition. Atlantic salmon genes differentially expressed (FDR <0.05) between week 25 experimental samples after transition to seawater conditions, and week 25 control samples kept under freshwater conditions. Columns provide: the NCBI ID for genes (gene_id), available gene name (gene_name), description of coded protein (description), log2 fold change in expression of experiment versus control values (logFC), average expression across samples in log2 counts per million (logCPM), p-value of the differential expression test (PValue), false discovery rate adjusted p-value (FDR).
Table S6. Global changes in transcription factor binding. Results from TOBIAS ATAC-seq footprinting of transcription factor binding sites (TFBS) in the Atlantic salmon genome, showing the global changes in transcription factor (TF) binding for all TF motifs tested, across the different time points of the experiment. Columns provide: output file prefix of TF name with motif ID (output_prefix), TF name (name), motif ID (motif_id), name of the TF’s cluster group (cluster), total number of TFBS (total_tfbs), columns for the mean score of TF binding across all TFBS for each time point (columns week_1_mean_score to week_25_mean_score), total number of bound TFBS for each time point (columns week_10_bound to week_25_bound), and the fold change followed by the p-value of the significance of the change for each pair of different time points (columns week_1_week_10_change, week_1_week_10_pvalue to week_19_week_25_change, week_19_week_25_pvalue).
Table S7. Changes in transcription factor binding in response to salinity and photoperiod. Transcription factors (TF) with significant changes in global binding of transcription factor binding sites (TFBS) between time points, representing a concerted change due to experimental conditions: either photoperiod conditions, week 1 (light) vs week 10 (dark) and week 10 (dark) vs week 19 (light), or salinity conditions, week 1 (fresh) vs week 25 (sea) and week 19 (fresh) vs week 25 (sea). Columns provide: TF name (name), time point comparison (comparison), p-value of the significance of the change between time points (pvalue), the fold change in TF binding scores (change), the conditions compared, photoperiod or salinity (category), the time point where positive change means more binding (week_A), the time point where negative change means more binding (week_B), and the conditions where there is significantly more binding of the TF (sig).
Table S8. Enrichment in binding patterns of transcription factor binding sites of genes in expression clusters. Results of Fisher’s exact tests for the enrichment in different binding patterns of transcription factor binding sites (TFBS) of genes in different expression clusters. Tested for each transcription factor (TF). Columns provide: the gene expression cluster (deg_cluster), the binding pattern tested made up of 4 digits representing the 4 time points in chronological order with 0 equating to the TFBS not bound while 1 is bound (binding), the total number of TFBS with the binding pattern associated by nearest proximity to genes within the expression cluster (count), p-value of Fisher’s exact test (pval), odds ratio of test (OR), name of the TF motif for the TFBS (TFBS_name).
Table S9. Differentially methylated CpGs. CpG sites significantly differentially methylated between time points of the experiment (FDR <0.05). Columns provide: Percentage score of the number of reads methylated at the CpG site for each time point (columns week_1_score to week_25_score), unique position of site in Atlantic salmon genome (uniq_pos), chromosome of site (chr), base position on chromosome (locus), log2 fold change in methylated read count across time points (columns week_10_week_1_logFC to week_25_week_19_logFC), average count of methylated reads across samples in log2 counts per million (logCPM), p-value for significance in change between any time points (PValue), false discovery rate adjusted p-value (FDR), associated gene’s NCBI ID (gene_id), position of gene transcription start site (TSS) (tss), gene strand position (strand), distance of gene TSS to CpG site (distance), start and end positions of gene (gene_start, gene_end), gene width (gene_width), and type of genomic feature the CpG site is located in (genomic_feature).
Acknowledgments
Trond M. Kortner, from the Faculty of Veterinary Medicine, Norwegian University of Life Sciences, provided expertise during revision of the manuscript and interpretation of results.
References
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.
- 34.
- 35.
- 36.
- 37.
- 38.
- 39.
- 40.
- 41.
- 42.
- 43.
- 44.
- 45.
- 46.
- 47.
- 48.
- 49.
- 50.
- 51.
- 52.
- 53.
- 54.
- 55.
- 56.
- 57.
- 58.
- 59.
- 60.
- 61.
- 62.
- 63.
- 64.
- 65.
- 66.
- 67.
- 68.
- 69.
- 70.
- 71.
- 72.
- 73.
- 74.
- 75.
- 76.
- 77.
- 78.
- 79.
- 80.
- 81.
- 82.
- 83.
- 84.
- 85.