Invertebrate DNA methylation and gene regulation

As human activity alters the planet, there is a pressing need to understand how organisms adapt to environmental change. Of growing interest in this area is the role of epigenetic modifications, such as DNA methylation, in tailoring gene expression to fit novel conditions. Here, we reanalyzed nine invertebrate (Anthozoa and Hexapoda) datasets to validate a key prediction of this hypothesis: changes in DNA methylation in response to some condition correlate with changes in gene expression. While we detected both differential methylation and differential expression, there was no simple relationship between these differences. Hence, if changes in DNA methylation regulate invertebrate transcription, the mechanism does not follow a simple linear relationship.

The regulatory function of DNA methylation in invertebrates, if any, remains unclear. Methylation of 21 promoters, which is important for mitotically heritable gene silencing in vertebrates, has not been 22 observed in invertebrates (Zemach et al. 2010;de Mendoza et al. 2019;Xu et al. 2019). In contrast to 23 promoter methylation, gene body methylation (gbM) is positively correlated with gene expression in 24 both invertebrates and plants. There is little evidence however, that it directly regulates transcription. 25 For instance, several studies failed to detect substantial differences in gbM across invertebrate cell types Here, we re-analyzed publicly available methylomic and RNA-seq data from three Anthozoa and six 34 Hexapoda studies to evaluate relationships between invertebrate DNA methylation and transcription. 35 For each study, we contrast methylation-and transcriptional differences between two experimental 36 conditions. The Anthozoan studies contrasted polyp types in the coral Acropora millepora (Dixon and  37 Matz 2020), pH treatments in the coral Stylophora pistillata (Liew et al. 2018), and symbiotic state in the 38 sea anemone Exaiptasia pallida (Li et al. 2018 confirm previous findings that baseline gbM levels are bimodally distributed across coding genes, are 44 positively associated with baseline transcription level, and are negatively associated with transcriptional 45 differences between conditions. We next examine the hypothesis that changes in gbM and promoter 46 methylation between conditions correlate with changes in transcription, but find no support for this 47 prediction. 48 49 Methods 50 51 Previously published datasets 52 Previously published WGBS and RNA-seq datasets from invertebrate species are shown in Figure 1. The 53 criteria for selecting these projects were: 1) the project focused on an invertebrate species 2) the project 54 included at least two conditions, such as environmental exposure, or caste. 3) the project characterized 55 DNA methylation using Whole Genome Bisulfite Sequencing (WGBS) 4) the project characterized 56 transcription using RNA-seq 5) reads were available on the NCBI SRA database. Experimental methods 57 from some projects allowed for multiple comparisons, however for simplicity, we focused on contrasts 58 that seemed likely to induce the greatest epigenetic change. Raw reads were trimmed and quality filtered using cutadapt, simultaneously trimming low-quality bases 74 from the 3' end (-q 30) and removing reads below 50 bp in length (-m 50) (Martin 2011). Trimmed reads 75 for each dataset were mapped to the appropriate reference genome (Table S1; Xia et al. 2008;76 Baumgarten were removed from the Bismark alignment files using the deduplicate_bismark command. Methylation 81 levels were extracted from the alignments using bismark_methylation_extractor with the --82 merge_non_CpG, --comprehensive, and --cytosine_report arguments. Detailed steps used to process the 83 WGBS reads are available on the git repository (Dixon 2020). 84 85 86 87

RNA-seq data processing 95
Raw reads were trimmed and quality filtered using cutadapt, simultaneously trimming low-quality bases 96 from the 3' end (-q 30) and removing reads below 50 bp in length (-m 50)(Martin 2011). Trimmed reads 97 for each dataset were mapped to the appropriate reference genome (Table S1)  the log2 scale. Analyses of differential methylation based on bisulfite sequencing data were done using 118 MethylKit package (Akalin et al. 2012). 119 120 Relationships between gbM and mRNA level 121 For each dataset, we tested for expected relationships between gbM and mRNA expression patterns. 122 For our dataset, generated using Tag (genotype) as a factor to control for genetic effects. For simplicity, models for differential expression for 128 the published datasets included only the treatment groups indicated in Figure 1 (we did not include 129 additional factors, for instance, sex or colony identity). Differences between groups are reported as log2 130 fold differences. 131 132 Results and Discussion 133 134 Confirming previous relationships between gbM and transcription 135 We first sought to corroborate previous findings on the distribution of gbM, and its relationship to gene 136 expression patterns. Using three different methylation assays, we confirmed that gbM in A. millepora 137 shows a characteristic bimodal distribution, separating genes into methylated and unmethylated classes 138 ( Figure 2 A-C). We then confirmed that gbM level is associated with average mRNA abundance (Figure 2  139 D-F), and negatively associated with differential expression between polyp types (Figure 2 G-I). Hence, 140 regardless of the method used to measure methylation, gbM shows the expected distribution and 141 associations with gene expression in A. millepora. 142 143 We found similar results in other studies. While the relative sizes and means of the peaks varied by 144 dataset, these were also bimodally distributed (Figure 1), and similarly associated with mRNA expression 145 patterns (Figure 3). Relationships between average gbM and baseline mRNA expression tended to be 146 stronger among the insects than the cnidarians, both for mean mRNA abundance and differential mRNA 147 expression between groups.

159
No correlation between changes in gbM and changes in transcription. 160 As gbM is associated with elevated expression, a simple hypothesis is that increasing gbM increases 161 transcription. Our re-analysis of gbM and mRNA differences between treatment groups shows that this 162 is not the case. Using three different methylation datasets in A. millepora, we found that measurements 163 of gbM differences between polyp types showed no consistent association with transcriptional 164 differences ( Figure S1). This was also the case for each of the other datasets ( Figure S2). Hence, in 165 Anthozoa and Hexapoda, gbM and transcription show no simple covariation between study conditions. 166 167 No correlation between changes in promoter methylation and changes in transcription 168 As promoter methylation is associated with gene silencing in vertebrates, we tested whether changes in 169 promoter methylation correlate with gene expression changes in invertebrates. Specifically, we tested 170 whether differences in methylation in windows 1Kb upstream of genes between experimental groups 171 predicted differences in mRNA levels. As with gbM, we found no reproducible relationship between 172 differential promoter methylation and differential expression for A. millepora ( Figure S3), or any of the 173 other studies ( Figure  S4).

181
Evidence for the "seesaw" hypothesis 182 As invertebrate coding genes are separated into methylated and unmethylated classes, we hypothesized 183 that these class designations serve as a gene-regulatory signal, and that the methylated and 184 unmethylated classes of genes undergo group-level changes in expression. In a previous study of A. 185 millepora we observed inverse changes in both gbM and transcription depending on methylation class 186 (the "seesaw hypothesis", Dixon et al. (2018)) and sought to replicate those observations here. We first 187 examined the new dataset from A. millepora. Between polyp types, we detected weak patterns similar 188 to those previously reported. Specifically, based on MBD-seq and mdRAD, in axial compared to radial 189 polyps the high gbM class increased in methylation level while the low-gbM class decreased in 190 methylation. However, this pattern was not apparent for the WGBS dataset ( Figure S5). Based on all 191 three methylation assays, transcription of the low gbM class was somewhat upregulated, and the high 192 gbM class somewhat downregulated ( Figure S5). While this was consistent with our previous 193 observations (

203
Among the other studies, there were three cases (silkworm, termite, and carpenter bee) of change in 204 methylation based on gbM class, but in contrast to Dixon et al. (2018), in all these cases only the high-205 gbM class was changing ( Figure S6). Also in contrast to Dixon et al. (2018), these class-level changes in 206 methylation were not associated with class-level changes in transcription ( Figure S7). Conversely, two 207 studies demonstrated class-level changes in transcription but no corresponding class-level changes in 208 gbM. In these two cases (honeybee and bumblebee), the low-gbM class was downregulated on average, 209 and high-gbM class upregulated on average (Figure 4; Figure S7). In summary, while aspects of the 210 "seesaw" hypothesis proposed by Dixon et al. (2018) were detected in several cases, it was not fully 211 supported by any of the studies included here. 212 213 Conclusions 214 215 Here we used published methylomic and transcriptomic data from Anthozoa and Hexapoda to 216 examine how DNA methylation relates to transcriptional variation between different environmental and 217 phenotypic conditions. We found that, as previously reported, gbM is bimodally distributed, and that 218 higher gbM levels are associated with elevated transcription and less transcriptional variation between 219 conditions. Differences in gbM between conditions showed no consistent linear association with 220 differences in transcription. As there were often significant differences in both gbM and transcription 221 ( Figure S8), this indicates that changes in gbM alone are neither necessary nor sufficient to induce 222 changes in transcription. Methylation differences between conditions 1 Kb upstream up the first exon 223 also showed no association with differences in transcription. 224 In conclusion, if shifting methylation patterns regulate invertebrate transcription, the mechanism 225 is more complex than can be captured by a simple linear relationship between these two variables. Comparative analysis of the genomes of Stylophora pistillata and Acropora digitifera provides 318 evidence for extensive differences between species of corals. Sci. Rep. 7:1-14. 319 Figure  and baseline gbM level. X axes show baseline gbM level. Y axes shows differential gbM between polyp 356 types (top panels) and differential transcription between polyp types (bottom panels). Top panels: 357 Differential gbM between polyp types was linked with baseline gbM level for the MBD-seq and mdRAD 358 datasets, but not for WGBS. Bottom panels: For all three methylation datasets, the low gbM class of 359 genes was somewhat upregulated in axial compared to radial polyps, and the high gbM class was 360 somewhat downregulated. relationship. Each pair of volcano plots is associated with the scatterplot below. The first volcano plot 377 shows differences in gbM, with significant genes (q-value from MethylKit < 0.1) shown in red. The second 378 shows differences in transcription, with significant genes in blue (FDR < 0.1). The count of significant 379 genes is given above each volcano plot. The scatterplots are the same as those shown in Figure 5, with 380 expression differences on the Y axis and gbM differences on the X. The contrasts for each study are given 381 in Figure 1.