Effects of ancient anthropogenic clam gardens on the growth, survival, and transcriptome of Pacific littleneck clams (Leukoma staminea)

Clam gardens traditionally established and maintained by coastal Indigenous Peoples of northwest North America are habitat modifications that enhance intertidal clam productivity and therefore provide secure and reliable local food resources. In this study, transcriptomic and phenotypic responses of Pacific littleneck clams (Leukoma staminea) were investigated in relation to transplantation to either clam gardens or unmodified clam beaches and growth for 16 weeks. Sediment characteristics (e.g., grain-size, carbonate, and organic content) were also evaluated and considered in the response. Large differences in phenotypic and abiotic characteristics were observed among beaches but did not differ based on unmaintained clam garden presence. A de novo transcriptome for L. staminea containing 52,000 putative transcripts was assembled and used to identify differential expression in response to the clam gardens. This identified a relatively small effect, but found two transcripts that were differentially expressed in both the gill and digestive gland tissues. In addition, differential expression along survival gradients, as well a tissue-specific expression analysis provide insight into the characteristics of the transcriptome and its ecological associations of this non-model organism. Across the beaches, abiotic characteristics with negative effects on growth and/or survival included small rocks, very fine sand, silt, carbonate, and organic content, whereas positive effects were observed from coarse sand, sand and fine sand. In conclusion, here it was found that localized environmental factors are likely to have a greater influence on Pacific littleneck clam physiology, growth, and survival than the presence or absence of unmaintained clam gardens.

1 3 each gene, as predicted by TransDecoder, that had an ORF type of 'complete', which required protein 2 7 3 homology. In cases where there were multiple transcripts with complete ORFs for a gene, the transcript 2 7 4 with the largest ORF was chosen. Further filtering was carried out to choose only transcripts ≥ 100 bp (to 2 7 5 remove small transcripts that could cause analysis error), followed by scanning of pfam-a and Uniprot 2 7 6 annotations to remove repetitive element-related keywords to finally produce a transcriptome of 52,000 2 7 7 transcripts, each putatively representing a single gene of which 42,708 had Uniprot IDs. were inspected using FastQC (Andrews 2010) and multiQC (Ewels et al. 2016). The reference 2 8 5 transcriptome was indexed using bowtie2 (Langmead and Salzberg 2012) and reads were aligned against it 2 8 6 in local mode, allowing 40 alignments to be retained per read (-k 40). For comparison with the results 2 8 7 from the reference transcriptome, the Manila clam (Ruditapes philippinarum) genome 2 8 8 (GCA_009026015.1_ASM902601v1) (Yan et al. 2019) was also tested as a potential reference by 2 8 9 indexing the genome using hisat2 (Kim et al. 2015) and aligning the Pacific littleneck clam reads against 2 9 0 the Manila clam genome (also retaining a maximum of 40 alignments per read; -k 40). The littleneck clam 2 9 1 alignments against the littleneck reference transcriptome were quantified using eXpress using default 2 9 2 parameters (Roberts and Pachter 2013). The alignments against the Manila clam genome in the test were 2 9 3 also quantified using eXpress default parameters but with the flag --rf-stranded. Resultant effective read 2 9 4 counts quantified against the de novo reference transcriptome were used in downstream analyses.

9 5
Transcriptome analyses were all conducted as described in the repository ms_clam_garden (see Data 2 9 6 Availability). Transcript annotation, effective counts, and phenotypic data were all imported into R to be 2 9 7 analyzed with edgeR v.3.36.0 (Robinson et al. 2010). Datasets were constructed for all samples or only gill 2 9 8 or only digestive gland (DG) samples (N = 3 datasets). Each dataset was input into a DGEList, then 2 9 9 filtered for low expression by requiring at least 10 reads to be aligned to the transcript in at least five 3 0 0 individuals (i.e., counts per million (cpm) > 0.82-1.03, depending on the dataset). Samples were 3 0 1 normalized by TMM normalization as applied by calcNormFactors() and multidimensional scaling (MDS) 3 0 2 plots were produced for each dataset (i.e., both tissues combined, gill tissue only, DG tissue only).

0 3
Tissue-specific expression was determined by identifying transcripts expressed in one tissue and not 3 0 4 the other. The normalized counts for these transcripts were then obtained from the all-sample dataset and 3 0 5 200 of the highest expressed tissue-specific transcripts from both tissues were plotted using heatmap() of 3 0 6 the stats package in R. Gene Ontology (GO) enrichment was conducted using the tissue-specific 3 0 7 transcripts from both tissues compared to all expressed transcripts for that tissue. Enrichment analysis was 3 0 8 conducted in DAVID Bioinformatics (Huang et al. 2007). 3 0 9 Differential expression analysis was conducted using limma v.3.50.1 (Ritchie et al. 2015). Both 3 1 0 tissues were analyzed separately and differentially expressed genes were identified using beach type (i.e., 3 1 1 CG vs. Ref) as an explanatory variable using a gene-wise negative binomial generalized linear model with 3 1 2 quasi-likelihood tests (glmQLFit) and retaining all transcripts with p ≤ 0.001. Differentially expressed 3 1 3 genes were also identified based on survivorship, where the percent survival of each plot was used to bin 3 1 4 the plot into one of high, medium, or low survival based on the fourth quartile, second and third quartiles, 3 1 5 or the first quartile of percent survival, respectively. Statistical significance was conducted as above but 3 1 6 using contrasts between each survival group. All other analyses and plots were conducted in R and are 3 1 7 available in the repository listed above. There were no significant differences between CG and Ref beaches in any measured growth or survival 3 2 2 variable (ANOVAs, p ≥ 0.168, Table 1). The mean (± SD) initial clam wet weight and shell height for CG 3 2 3 and Ref beaches (N = 180) was 1.5 ± 0.2 g and 1.8 ± 0.3 g and 1.5 ± 0.1 cm and 1.6 ± 0.1 cm, respectively 3 2 4 (Table 1). After 16 weeks in situ, these increased to 3.9 ± 1.3 g and 3.8 ± 2.9 g and 1.8 ± 0.3 cm and 1.9 ± 3 2 5 0.4 cm, which in terms of height, was an increase in growth of 20.5 ± 15.3% and 18.0 ± 22.6%, 3 2 6 respectively. Calculating percent growth using height includes both the alive and dead clams and is 3 2 7 therefore considered to be a more accurate measurement of growth compared to using wet weights. Wet 3 2 8 weights only include surviving clams, and therefore does not capture the reduced growth that may have 3 2 9 occurred in clams that eventually died. Clam survival was 63.3 ± 28.3% in CG and 58.3 ± 37.0% in Ref 3 3 0 beaches, respectively. 3 3 1 Beach location (independent of the CG/Ref factor), however, significantly affected all growth and 3 3 2 survival variables (Table 2, SI Figure S1). An unplanned difference occurred in the initial size of clams 3 3 3 among the six beaches (initial wet weight and shell height ANOVAs p = 0.031 and p = 0.044, 3 3 4 respectively, Table 2, SI Figure S2). This difference was specifically due to significantly larger clams 3 3 5 inadvertently planted on beach D (Table 2, SI Figure S2). Percent growth was highest on beaches A (31.2 3 3 6 ± 7.2%), C (29.7 ± 2.8%), and D (47.0 ± 8.2%) and lowest in beaches B (6.6 ± 6.8%), E (0.8 ± 0.9%), and 3 3 7 F (0.3 ± 0.6%) (Table 2), while percent survival was highest in beaches A (88.3 ± 12.6%), B (71.7 ± 3 3 8 27.5%), C (68.3 ± 12.6%), and D (86.7 ± 5.8%) and lowest in beaches E (33.3 ± 23.6%) and F (16.7 ± 3 3 9 24.7%) ( Table 2).  (Table 3). These beaches had 68-88% survival, with the percentage of 3 4 2 clams considered weak (0-5%), emaciated (0-5%), or having very pale digestive glands (0-22%) being 3 4 3 generally low, with the exception of beach D, where 22% of clams had pale digestive glands (Table 3). In 3 4 4 contrast, survivors from beaches E (CG) and F (Ref) had only 17-33% survival, and had more clams 3 4 5 considered weak (13-15%), emaciated (15-22%), and with digestive glands being very pale (13-22%) 3 4 6 than the other beaches. Beaches E and F are therefore considered poorer performing sites. There were no significant differences between CG and Ref beaches for any of the sediment characteristics 3 5 0 (ANOVAs, p ≥ 0.392, Table 1). Beach location (independent of the CG/Ref factor), however, significantly 3 5 1 affected all sediment characteristics except percentage of small rocks (Table 2). A PCA of these sediment 3 5 2 characteristics separates the beach locations into three groups (Figure 2), where pairs of beaches are 3 5 3 clustered based on geographical locations ( Figure 1). Beach A (CG), on a west-facing bay, was exposed to 3 5 4 0 (Table 5;  The gill dataset analyzed by multidimensional scaling (MDS) has 13% and 11% of the total variation 4 1 4 explained by dimensions 1 and 2, respectively ( Figure 4A). In general, in the gill samples, no clear 4 1 5 clustering was observed by beach type (i.e., CG vs. Ref) nor any other variables (e.g., survival, sand, silt, 4 1 6 or carbonates). There were two outlier samples (P12 from beach B and P7 from beach D) in the MDS plot, 4 1 7 both of which were Ref beaches and had 45% and 80% survival, respectively. 4 1 8 The DG dataset analyzed by MDS resulted in 14% and 9% of the total variation being explained by 4 1 9 dimensions 1 and 2, respectively ( Figure 4B). Some grouping was observed along dimension 2 with CG 4 2 0 samples clustering together, except for sample P15 from beach E. Ref samples were more spread out along 4 2 1 both axes. A gradient was observed across dimension 1 along survival, with high survival being positioned 4 2 2 in negative dimension 1, medium survival near 0-1, and low survival in positive dimension 1, although 4 2 3 the trend was not universal for all samples. Percent carbonate, which was correlated with survival (see 4 2 4 above), also trended along dimension 1 in the opposite direction to survival (i.e., high survival was 4 2 5 associated with low percent carbonate, SI Figure S7).

2 6
Differential expression between CG and Ref was evaluated for each tissue separately (Table 5) Consistency was observed between the tissues in the CG response, where two of the 91 unique 4 5 0 differentially expressed transcripts were differentially expressed in both tissues. This is notable given that 4 5 1 the datasets were normalized, filtered, and analyzed separately. This included the under expression in the 4 5 2 CG samples of a transcript annotated as von Willebrand factor type D domain (vwd) and of a transcript 4 5 3 annotated as platelet endothelial aggregation receptor 1 (pear1; Figure 5) in both tissues. 4 5 4 4 5 5 1 3.6 | Gene expression responses and differential survival 4 5 6 Given the differential survival observed among the beaches and plots, genes correlating with survival were 4 5 7 investigated to determine what might be driving these differences and to potentially identify genes that are 4 5 8 related to survival or mortality in this species. Genes were of particular interest that were differentially 4 5 9 expressed between low and medium survival, between medium and high survival, or incrementally 4 6 0 differentially expressed at each step between high, medium, and low survival (see Materials and Methods 4 6 1 for beach survivorship classification).

6 2
The majority of differentially expressed genes by survival in the gill were overexpressed between the 4 6 3 low and medium survival groups, rather than between medium and high survival (Table 5). Transcripts 4 6 4 overexpressed in the low survival group relative to the medium survival group (N = 26) were involved in 4 6 5 functions such as heat-shock, interferon response, and ligase activity (SD Additional Files S2, S3). For 4 6 6 example, heat-shock protein (hsp) 90 (FC > 60) and three different transcripts annotated as subunits of 4 6 7 hsp70 (FC > 2) were overexpressed specifically in the low survival group. Transcripts overexpressed in 4 6 8 the medium survival group relative to the high survival group included six transcripts of various functions 4 6 9 (SD Additional File S2). Fewer transcripts were overexpressed in the higher survival group (i.e., high vs. 4 7 0 medium, N = 11; medium vs. low, N = 14) and these generally were involved in immune functions, but 4 7 1 also various other functions. Overexpression of toll-like receptor 1 (FC > 2) and complement C1q-like 4 7 2 protein 4 (FC > 3.5) was observed in medium survivors relative to low survivors. 4 7 3 Similar to the gill, in the DG the majority of the differentially expressed transcripts by survival were 4 7 4 overexpressed in low relative to medium surviving clams (N = 34) and were generally involved in immune 4 7 5 system or heat-shock or transport. Transcripts annotated as hsp70 12a and 12b, as well as several 4 7 6 aminopeptidases, were overexpressed in low relative to medium and in medium relative to high survival 4 7 7 ( Figure 6). All the overexpressed genes in medium relative to high surviving clams (N = 18) were also 4 7 8 overexpressed in the low relative to medium survival group comparison. Very few genes were 4 7 9 overexpressed in the high survival clams relative to medium (N = 4) or the medium survival clams relative 4 8 0 to low survival (N = 8), and these transcripts had various functions, including a transcript annotated as 4 8 1 mucin-2 protein ( Figure 6). Results from the present study showed that there was no significantly-enhanced growth or survival on 4 8 6 unmaintained clam garden (CG) versus reference (Ref) non-walled beaches. Using similarly-sized clams 4 8 7 that were collected on the same days and at the same location as the present work Salter (2018) also 4 8 8 reported no significantly enhanced growth or survival on unmaintained CG compared to non-walled 4 8 9 beaches (controls) in Kanish and Waiatt Bays. It should be noted that the clam gardens in the present study 4 9 0 were not being maintained according to traditional tending practices and this maintenance may influence 4 9 1 clam survival, growth, and transcriptomic profile, as well as various abiotic characteristics (i.e., sediment 4 9 2 grain size, carbonate content, organic content). Salter (2018), however, also included some CG traditional 4 9 3 tending practices (i.e., adding a 1 cm-thick layer of crushed shell mixture three times throughout the 4 9 4 summer) on replicate treatment plots on both beach types and reported a positive effect of the shell hash 4 9 5 treatment on growth and survival in both CG and non-walled beaches. Therefore, both the present study 4 9 6 and Salter (2018) did not observe a positive effect of CG on growth and survival in the absence of tending 4 9 7 but Salter (2018) reported a positive effect with the addition of crushed shell hash in both beach types. 4 9 8 Some trends were observed in the transcriptomic data that suggest that the CG had a transcriptomic 4 9 9 effect on the clams, albeit not a strong effect. Most notably, consistent genes were identified in two tissues 5 0 0 even though very few genes were found to be differentially expressed, and the CG and Ref samples 5 0 1 showed a slight separation in the PCA of DG tissue. The two genes that were identified as underexpressed 5 0 2 in the CG in both tissues were annotated as von Willebrand factor type D domain (vwd) and platelet 5 0 3 endothelial aggregation receptor 1 (pear1). The former is present in many proteins, including mucins and 5 0 4 other extracellular glycoproteins, with one in particular being reported to be up-regulated in response to 5 0 5 heat shock in the Pacific oyster, Crassostrea gigas (Zhang et al. 2015). The function of the latter in 5 0 6 shellfish remains unknown; in vertebrates it is a cell membrane protein involved in platelet aggregation. 5 0 7 Homologous genes of platelet activation are present in the Hong Kong oyster (Crassostrea 5 0 8 hongkongensis) and are regulated in response to hyposalinity (Xiao et al. 2018). In terms of the other 5 0 9 genes differentially expressed by clams in CG, the general functions were not clear. Immune genes and 5 1 0 stress-response genes were found both over-and under-expressed in CG versus Ref (e.g., a heat-shock 5 1 1 protein transcript was found to be underexpressed in the CG and a universal stress protein-annotated 5 1 2 transcript was overexpressed in the CG). In general, the functions of these transcripts in the Pacific 5 1 3 littleneck clam responding to the CG factor will be useful to continue to characterize as the genomic 5 1 4 information develops for this species and additional transcriptomic studies are conducted in relation to 5 1 5 environmental stress and maintained or unmaintained clam gardens. These molecular phenotypes may 5 1 6 indicate a response to the CG from the 16-week in situ study period that is below the level of growth and 5 1 7 survival phenotypic responses. Further work with longer study periods would be required to determine 5 1 8 whether these transcriptome signatures precede macro-phenotypic responses. 5 1 9 Sediment grain sizes, percent carbonate content, and percent organic matter did not differ between 5 2 0 CG and Ref in the present study, although there was a significant level of variation with these factors 5 2 1 within both beach types and among locations. For example, carbonate content varied from 4 to 10% in CG 5 2 2 and from 2 to 15% in Ref beaches, while silt varied from 0 to 4% and from 1 to 5%, respectively. This is 5 2 3 somewhat contrary to Salter's (2018) study that in untreated plots found that CG contained 2.8 -12.9 5 2 4 1479 to 7371 nucleotides, one can deduce that they represent viral mRNA, instead of viral genomes. Some 6 3 8 dicistroviruses can be non-pathogenic or cause subtle impacts (reduced longevity and fecundity), while 6 3 9 others result in rapid paralysis (Bonning 2009). 6 4 0 Even though there was transcriptomic evidence of viral presence in all the beaches, most of the 6 4 1 Pacific littleneck clams from the medium-and high-survival groups did not exhibit any signs of 6 4 2 pathophysiology. Other clams (mostly from the low-survival group), however, were emaciated or watery 6 4 3 and had slightly to extremely pale-coloured digestive glands, an indicator of reduced feeding and 6 4 4 potentially compromised function and health. Pale digestive glands could also be the result of paramyxean 6 4 5 parasites responsible for disease in marine mollusks, with infection being linked to digestive tropism 6 4 6 interfering with food adsorption and causing pale-yellowish digestive glands with thin, watery flesh in 6 4 7 both mussels and oysters (Alfjorden 2017, Carella et al. 2011). In the present study, however, there was no 6 4 8 significant parasite gene expression at 16 weeks and histopathology would be needed to confirm any 6 4 9 paramyxean parasite infections. It is also worth noting that the single time point of sample collection in the 6 5 0 present study and the analysis of only surviving clams may have reduced the likelihood of viewing viral or 6 5 1 parasitic gene expression in these samples.