Parent of origin DNA methylation as a potential mechanism for genomic imprinting in bees

3 Genomic imprinting is defined as parent-of-origin allele-specific expression. In order for genes 4 to be expressed in this manner an ‘imprinting’ mark must be present to distinguish the parental 5 alleles within the genome. In mammals imprinted genes are primarily associated with DNA 6 methylation. Genes exhibiting parent-of-origin expression have recently been identified in two 7 species of Hymenoptera with functional DNA methylation systems; Apis mellifera and Bombus 8 terrestris. We carried out whole genome bisulfite sequencing of parents and o spring from 9 reciprocal crosses of two B. terrestris subspecies in order to identify parent-of-origin DNA 10 methylation. We were unable to survey a large enough proportion of the genome to draw 11 a conclusion on the presence of parent-of-origin DNA methylation however we were able to 12 characterise the sexand caste-specific methylomes of B. terrestris for the first time. We find 13 males di er significantly to the two female castes, with di erentially methylated genes involved in 14 many histone modification related processes. We also analysed previously generated honeybee 15 whole genome bisulfite data to see if genes previously identified as showing parent-of-origin 16 DNA methylation in the honeybee show consistent allele-specific methylation in independent 17 data sets. We have identified a core set of 12 genes in female castes which may be used for future 18 experimental manipulation to explore the functional role of parent-of-origin DNA methylation in 19 the honeybee. Finally, we have also identified allele-specific DNA methylation in honeybee male 20 thorax tissue which suggests a role for DNA methylation in ploidy compensation in this species. 21


Introduction
Genomic imprinting is defined as parent-of-origin allele-specific expression (Rodrigues and Figure 1: (a) Graphic display of the family-wise reciprocal crosses carried out between Bombus terrestris audax and Bombus terrestris dalmatinus. Each colour refers to related individuals, i.e. the queen from colony 08 is the sister of the male used in colony 19. This design reduces genetic variability between the initial and reciprocal crosses as we do not have inbred lines of B. terrestris. (b) Overview schematic for identifying allelic methylation di erences in the worker o spring. SNPs unique to either the mother or father are used to create N-masked reference genomes. The worker daughter sample is then aligned to the genome and reads are filtered to keep only those with an informative parental SNP. Methylation di erences between the alleles can then be assessed and parent-of-origin DNA methylation can be inferred from comparing reciprocal crosses.
testing (Benjamini and Hochberg, 1995). For a CpG to be di erentially methylated a minimum di erence of at least 10% methylation and a q-value of <0.01 were required. Genes were determined 142 as di erentially methylated genes if they contained an exon with at least two di erentially methylated 143 CpGs and an overall weighted methylation (Schultz et al., 2012) di erence across the exon of >15%.

144
Two CpGs were chosen based on Xu et al. (2021), they find the methylation of two CpGs is enough 145 to promote gene transcription in Bombyx mori via the recruitment of histone modifications.

146
Identification of parent-of-origin DNA methylation 147 Whole genome re-sequencing data of the parents were checked using fastqc v.0.11.5 (Andrews, only homozygous alternative SNPs which are unique to either the mother or father of each colony. 156 We also removed C-T and T-C SNPs as these are indistinguishable from bisulfite converted bases in which did not contain an informative SNP were discarded. Di erential methylation between the maternal and paternal reads of all workers was then carried out using the R package methylKit 167 v.1. 16.1 (Akalin et al., 2012) as above, with the exception of a minimum coverage of eight reads, as 168 previously described in (Wang et al., 2016). DNA methylation a minimum coverage of 10 was required and a minimum score of 0.8, which is 192 considered representative of true allele-specific DNA methylation according to Orjuela et al. (2020). 193 We then identified genes which contained allelically methylated regions using R and compared these 194 gene lists to those which show parent-of-origin DNA methylation, as identified in Wu et al. (2020).

195
Gene ontology enrichment was carried out as above using GO terms from the Hymenoptera Genome

198
Genome-wide sex-and caste-specific DNA methylation 199 It is currently unknown to what extent DNA methylation varies between sexes and castes of B.
200 terrestris. We have therefore taken this opportunity to also generally characterise the sex-and 201 caste-specific methylomes of this species. We find low genome-wide levels similar to those previously from the two female castes (Fig.2a). We also see no clustering by sub-species for the males or 206 queens, for example male 08 in Figure 2a represents with other genomic features, we also added UTR regions and intergenic regions to further explore 213 the genome-wide methylation profile. We find the highest levels of DNA methylation for all sexes 214 and castes are within exon regions, whilst promoter, and 5' UTR regions show a depletion in DNA 215 methylation compared to intergenic regions (Fig.2b).

216
We also segregated genes into categories of di ering levels of DNA methylation to explore 217 the potential function of highly methylated genes across sexes and castes. There are a small number

239
We next classed a gene as di erentially methylated if a given exon contained at least two 240 di erentially methylated CpGs and had an overall weighted methylation di erence of at least 15%.

241
We find 155 genes are di erentially methylated between males and workers, 165 between males 242 and queens and 37 between queens and workers (supplementary 1.0.7). We carried out a GO 243 enrichment analysis on all di erentially methylated genes and on hypermethylated genes for each 244 sex/caste per comparison (supplementary 1.0.8). Whilst most terms are involved in core cellular 245 processes, we specifically find di erentially methylated genes between queens and workers are 246 enriched for chromatin remodelling-related terms (e.g. "histone H3-K27 acetylation" (GO:0043974) 247 and "chromatin organization involved in negative regulation of transcription" (GO:0097549)) and 248 reproductive terms (e.g. "oogenesis" (GO:0048477)). Di erentially methylated genes between 249 males and workers were also enriched for a large number of histone modification related terms 250 (e.g. "histone H3-K27 acetylation" (GO:0043974), "histone H3-K9 methylation" (GO:0051567),

257
When looking specifically at hypermethylated genes per sex/caste compared to all di erentially 258 methylated genes per comparison we find only two enriched GO terms for hypermethylated genes 259 in queens compared to workers: "developmental process involved in reproduction" (GO:0003006) 260 and "gamete generation" (GO:0007276). In genes hypermethylated in males compared to queens 261 and workers separately we find a large number of enriched GO terms related to neuron development 262 amongst other cellular processes.

263
Most of the di erentially methylated genes are common between males, queens and work-264 ers, with only 178 total unique genes changing methylation levels between sexes/castes (Fig.2c).

265
Specifically, we find 31 genes are hypermethylated in queens and workers when compared to 266 males and 18 genes are hypermethylated in males when compared to queens and workers. We 267 carried out a GO enrichment on these genes using all di erentially methylated genes from all 268 comparisons as a background set. We find general cellular processes enriched in both gene lists with 269 hypermethylated genes in the female castes also enriched for some telomere-related functions, e.g.

302
In order to further explore the function of these core 12 genes we carried out a gene ontology 303 15 enrichment analysis using all unique genes with allele-specific DNA methylation as a background 304 set (n = 3,448). We find a variety of terms enriched, including many involved in nervous system 305 development and the term "social behaviour" (GO:0035176) (supplementary 1.1.1). In addition to identifying this core set of genes which show potentially consistent parent-316 of-origin DNA methylation across multiple independent data sets, we also ran this pipeline for 317 some male samples as previous research has shown a diploid genome exists in some tissues of A.  In this study we have explored the potential of DNA methylation as an imprinting mark in social bees. 345 We conducted reciprocal crosses to explore parent-of-origin DNA methylation in the bumblebee 346 Bombus terrestris. Whilst our crosses and data generation were successful, we were unable to 347 confidently identify genome-wide parent-of-origin DNA methylation. We were, however, able to 348 use these data to characterise the sex-and caste-specific DNA methylation profiles of B. terrestris 349 for the first time. We find genome-wide that sexes and castes show similar DNA methylation other epigenetic processes, such as histone modifications and chromatin dynamics. We also mined 353 previously generated honeybee whole genome bisulfite sequencing data to explore the consistency of 354 parent-of-origin DNA methylation, as identified in Wu et al. (2020), across independent data sets. 355 We find a core set of 12 genes which exhibit parent-of-origin DNA methylation show allele-specific DNA methylation in all 33 independently generated female data sets. We have also identified a potential role for allele-specific DNA methylation in some diploid tissues of male honeybees.

358
Recommendations for B. terrestris reciprocal cross design for parent-of-origin 359 DNA methylation 360 We used whole genome re-sequencing data of the mother and father from two sets of reciprocal SNPs, we were still unable to identify parent-of-origin DNA methylation across the entire genome 372 of B. terrestris. We explore the reasons for this and make the following recommendations for a 373 replication of this work.

374
Firstly, we sequenced the worker o spring samples to a depth of 30X, this coverage yields 375 enough data for standard di erential DNA methylation analysis even after data loss due to low 376 mapping rates of bisulfite converted data (generally less than 60% mapping e ciency (Tran et al.,377 2014)) and removal of PCR duplicates. An additional step required to identify parent-of-origin DNA i.e. the genomic sequence (Yagound et al., 2019;Marshall et al., 2019;Yagound et al., 2020).

395
This may be used, for example, to examine parent-of-origin methylation profiles in individual brain 396 samples.

397
Sex-and caste-specific methylomes of B. terrestris 398 We were able to use the data generated in this study to explore the sex-and caste-specific methylome terrestris queens is significantly longer than workers and males (Gree and Schmid-Hempel, 2008;416 Smeets and Duchateau, 2003;Duchateau and Marin, 1995). One role for DNA methylation in B.

417
terrestris may therefore be the regulation of caste di erences through core cellular processes, such 418 as telomere maintenance. Finally, we also find di erentially methylated genes between queens and 419 reproductive workers are involved in reproductive related processes. Previous work has suggested 420 a role for DNA methylation in reproduction in B. terrestris (Amarasinghe et al., 2014), as well 421 as other social insects (Wang et al., 2020;Bonasio et al., 2012), although this does not appear 422 to be consistent across Hymenoptera (Libbrecht et al., 2016;Patalano et al., 2015). Whilst the 423 di erentially methylated genes identified here suggest a role for DNA methylation in maintaining or 424 generating caste di erences, a direct causal link between DNA methylation and gene expression 425 changes mediating phenotypes has yet to be found.

427
Genes which show parent-of-origin expression have been identified in two social insect species to 428 date, B. terrestris (Marshall et al., 2020b) and A. mellifera (Wu et al., 2020). Whilst a direct link 429 between parent-of-origin DNA methylation and parent-of-origin expression has not been found in the 430 honeybee (Wu et al., 2020;Smith et al., 2020), it is possible parent-of-origin DNA methylation may 431 22 mediate imprinted genes in a trans-or temporal-acting fashion (Xu et al., 2021;Li-Byarlay et al., 432 2020 (Haig, 2000), and so a plastic response to queen presence of some imprinted genes may account for 460 the small number in common across independent samples. Finally, we cannot rule out that more of 461 the genes with parent-of-origin DNA methylation identified in Wu et al. (2020) are not consistent 462 across A. mellifera females. As discussed in the 'Recommendations for B. terrestris reciprocal 463 cross design for parent-of-origin DNA methylation' section above, it may be that we did not have 464 su cient coverage in some genome regions / samples from the data tested for these areas to show 465 significant allele-specific DNA methylation. Although, it's worth noting the advantage of identifying 466 allele-specific methylation through probabilistic models as opposed to using SNPs is that we can 467 survey homozygous regions which would usually be discounted when di erences in the underlying 468 genotype are needed for allele identification (Orjuela et al., 2020).

469
Finally, of the core 12 genes identified above we find three of those genes also show allele-470 specific DNA methylation in some male thorax tissue, including the gene involved in social behaviour.

471
Di erent tissues are known to vary in levels of ploidy in some social insects (Aron et al., 2005). DNA 472 methylation has also previously been suggested as a possible mechanism of ploidy compensation This study provides the groundwork for future research exploring parent-of-origin DNA methylation 479 as a potential imprinting mechanism in the bumblebee Bombus terrestris. We specifically highlight Li-Byarlay, H., Boncristiani, H., Howell, G., Herman, J., Clark, L., Strand, M. K., Tarpy, D., and