ABSTRACT
Transposable elements are primarily silenced by DNA methylation and the associated histone modification H3K9me2 in many multicellular eukaryotes, including plants. However, in the absence of DNA methylation due to mutations in the DNA methylation machinery or in certain developmental contexts, the same TEs can gain Polycomb-associated H3K27me3, another epigenetic silencing mark that is usually linked with the facultative silencing of genes. In this study, we initially aimed to investigate whether DNA methylation and H3K27me3 could compete during the establishment of silencing at TEs in the model plant Arabidopsis. Strikingly, we show that the deposition of the H3K27me3 mark at newly inserted transgenic TE sequences is impaired in plants in which the de novo DNA methyltransferase DRM2 is mutated, contrary to the competition model hypothesized. Further profiling of H3K27me3 in drm2 mutants and in the DNA demethylase mutant rdd confirmed this new role of DNA methylation in promoting H3K27me3 deposition at some TEs, in addition to the previously described antagonistic role at others. These findings further reveal a new function of DNA demethylation in modulating H3K27me3 levels in vegetative tissues, which we confirmed via targeted DNA methylation experiments. Together, our results uncover a novel crosstalk between DNA methylation and Polycomb at TEs and reveal that these two pathways, thought to be specialized and antagonistic, can be interdependent and cooperate more than anticipated to maintain genome and epigenome integrity in eukaryotes.
INTRODUCTION
Transposable elements (TEs) are repetitive DNA sequences that have the potential to multiply and move around the genome, thus compromising genome integrity1. Nevertheless, this threat is largely mitigated by genome defense mechanisms that efficiently target TEs and stably repress their expression across cell divisions and generations through epigenetic marks such as DNA methylation (5mC). These marks can persist even on TEs that have aged and lost mobility; as such, they can act as epigenetic regulatory modules that impact nearby gene regulation1. In plants, including Arabidopsis thaliana, DNA methylation is established and maintained by three classes of DNA methyltransferases: MET1, which maintains CG methylation; CMT3 and CMT2, which maintain CHG and CHH methylation, respectively, and ensure feedback loops with H3K9me2 histone methyltransferases; and DRM2, which participates in the maintenance of CHH methylation (at TE loci that are smaller than those targeted by CMT2) and possibly CG methylation2,3. Importantly, DRM2 is able to methylate DNA de novo in all sequence contexts as part of the RNA-directed DNA methylation (RdDM) pathway3. Thus, the drm2 mutation drastically impairs de novo methylation of newly inserted repeats or TE-derived sequences4,5, a molecular phenotype also observed to various extents in mutants for RNA interference or RdDM factors connected to DRM2 activity5,6.
The detection of a TE that precedes its de novo methylation can be achieved in two manners7. The first, referred to as “expression-dependent silencing”, is based on TE transcript detection by the RNA interference machinery, which leads to the production of small RNAs7. These small RNAs can mediate not only transcript degradation but also RNA-directed DNA methylation by guiding DRM2 to homologous sequences3. Alternatively, if the TE is not transcribed upon insertion but displays sequence homology with another DNA-methylated TE copy that accumulates small RNAs in the genome, then the preexisting small RNAs can similarly guide the de novo deposition of DNA methylation by DRM2 by virtue of sequence homology with the TE neo-copies (“homology-dependent silencing”)7. The initial DNA methylation levels on a newly introduced repeat-containing transgene or a TE neo-insertion are usually not as high as those of the corresponding endogenous copy, and several generations are needed to reach these and a stably repressed state8,9.
While very stable across generations and over evolutionary time, the DNA methylation patterns can nonetheless be constitutively pruned by the four members of the DEMETER (DME) family of 5-methylcytosine DNA glycosylases named DME (DEMETER), ROS1 (REPRESSOR OF SILENCING 1), DML2 and DML3 (DEMETER-LIKE 2 and 3). These DNA demethylases act both redundantly and in a locus-specific manner in vegetative tissues to counteract excessive DNA methylation10–13. DME (DEMETER), in particular, demethylates transposable elements in companion cells of male and female gametophytes14,15 and is required to establish imprints in the endosperm16. In contrast to DNA methyltransferases, which have sequence context-dependent activities, DNA glycosylases can demethylate 5mC regardless of the sequence context11,17,18. Several functions have been shown for active DNA demethylation, such as preventing the spread of DNA methylation outside of primary targeted sequences10 or actively participating in gene control, either at specific stages of development19 or by preventing stress-responsive genes from being locked in a constitutively silent state20,21.
Interestingly, in recent years, TEs have been shown to be decorated by another epigenetic silencing mark, H3K27me3 (trimethylation of lysine 27 of histone H3)22, associated with the evolutionarily conserved Polycomb-group (PcG) proteins. This pathway mediates a more dynamic form of transcriptional repression than DNA methylation and has long been thought to specifically target protein-coding genes, particularly developmental genes that establish cell identity, or stress-responsive genes23,24. H3K27me3 is catalyzed by Polycomb Repressive Complex 2 (PRC2), which is composed of four subunits and is present as different PRC2 cores in Arabidopsis. The catalytic activity is performed by either CLF, SWN or MEA, which are the three plant homologs of animal EZH223,24. PRC2 is associated with PRC1, which is composed of two core subunits and mediates the ubiquitination of H2A (H2Aub) to allow a silent yet responsive transcriptional state of H3K27me3-marked targets23. In Arabidopsis, the recruitment of PRC2 to genes has been shown to be mediated by transcription factors (TFs) that recognize short sequence motifs called Polycomb responsive elements (PREs), which were initially described for the recruitment of PRC2 to Drosophila genes23. Long noncoding RNAs have also been implicated in PRC2 targeting to genes25,26. Finally, PRC2 recruitment was also shown at some loci to be dependent on the presence of PRC1 and H2Aub23.
PcG has also emerged as the dominant system for silencing TEs in wild-type unicellular eukaryotes and plants from early lineages, thus illustrating the ancient role of PcG in regulating TE silencing22,27. In Arabidopsis, PcG can be recruited to three types of TEs28. First, approximately one-third of DNA-methylated TEs can gain H3K27me3 in mutants impaired for DNA methylation, such as met1 or ddm129–31, or in specific cell types that are naturally hypomethylated32. This finding implies that DNA methylation, particularly CG methylation, can exclude H3K27me3 deposition. Second, in some instances, the two marks can co-localize33–35, and this was proposed to be constrained by DNA methylation density33,35. In this context, the two marks can cooperate in restricting TE activation, for example upon biotic stress35. Finally, we recently reported that many TEs are covered by the H3K27me3 histone mark in the genome of WT Arabidopsis plants, not only short TE relics but also relatively intact copies, which are expected to be targets of DNA methylation28. Whether active DNA demethylation can promote H3K27me3 recruitment in some of these cases is unknown.
In this study, we aimed to further investigate the crosstalk between DNA methylation and PcG proteins and reveal novel relationships between these pathways at TEs. Using neo-inserted TE transgenic sequences that are able to recruit both H3K27me3 and DNA methylation de novo, we provide evidence that the deposition of H3K27me3 is dependent on DRM2. This finding sheds light on an unsuspected interconnection between the PcG and RdDM pathways and points to a positive crosstalk between them in the context of TE neo-insertion, rather than the antagonism previously proposed. We further identified a subset of endogenous loci that lose H3K27me3 upon the loss of non-CG methylation maintenance in drm2, in addition to an anticipated subset where the loss of CHH methylation leads to an increase in H3K27me3. These results highlight a dual, locus-specific effect of DNA methylation on PcG recruitment. Accordingly, in the ros1 dml2 dml3 triple mutant, the gain of 5mC leads to a gain of H3K27me3 at some loci but also to a loss of H3K27me3 at others, as validated by targeted DNA methylation experiments. These findings reveal a novel function of DNA demethylases in the modulation of H3K27me3 via DNA demethylation. Together, our results uncover new interdependencies between DNA methylation and the Polycomb machinery, particularly in the context of TE neo-insertion. Thus, the two major silencing pathways, generally thought to be separate, cooperate to shape the epigenome, a concept that could extend to other multicellular eukaryotes.
RESULTS
DRM2 is involved in the establishment of H3K27me3 at a newly inserted TE sequence
We previously showed that neo-inserted, transgenic sequences of the mobile ATCOPIA21 retrotransposon (AT5TE65370) consistently recruit H3K27me3 de novo28. The endogenous copy is DNA methylated in WT plants and accumulates abundant siRNAs that are likely to target the homologous transgenic sequences (Fig. S1A). To verify this, we performed BS-seq on the pools of primary COPIA21 transformants (TR) used for H3K27me3 analysis28. Because of the sequence homology between the endogenous and transgenic COPIA21 sequences, the DNA methylation signal observed in the transgenic plants is an average of the signals at the endogenous and transgenic copies (Fig. 1A left). To circumvent this issue, we calculated, as a negative control, the average DNA methylation signal that would be expected if the transgenes were not methylated (Fig. 1A right, grey bars). The observed average DNA methylation levels at COPIA21 in the transgenic pools were greater than those in the negative control, which confirms that the transgenic copies are DNA methylated, although less so than the endogenous copies except in the CHH context (Fig. 1A). This higher CHH methylation at transgenes is caused by high de novo CHH methylation of the long terminal repeats (Fig. S1B), presumably because small RNAs corresponding to these sequences are particularly abundant in WT plants (Fig. S1A).
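The logic of the 'expected' negative control can be illustrated with a minimal sketch. It assumes that endogenous and transgenic copies contribute to the pooled BS-seq signal in proportion to their copy number; the function name, the copy numbers and the methylation values are hypothetical and are not taken from this study.

```python
def expected_methylation(endo_level, endo_copies, transgene_copies,
                         transgene_level=0.0):
    """Copy-number-weighted average methylation over all COPIA21 copies.

    With transgene_level=0.0 this gives the negative-control value, i.e. the
    signal expected if the transgenic copies carried no methylation at all.
    """
    total = endo_copies + transgene_copies
    if total <= 0:
        raise ValueError("at least one copy is required")
    return (endo_level * endo_copies
            + transgene_level * transgene_copies) / total


# Hypothetical numbers, for illustration only.
endo_cg = 0.80       # methylation level of the endogenous copy
observed_cg = 0.60   # level measured in the transgenic pool

expected_cg = expected_methylation(endo_cg, endo_copies=1, transgene_copies=1)

# An observed level above the negative control implies methylated transgenes.
transgene_is_methylated = observed_cg > expected_cg
```

In this toy case the observed value exceeds the negative control, which is the pattern interpreted in the text as evidence that the transgenic copies are methylated.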
To test for competition between DNA methylation and H3K27me3 in this context of neo-insertion, we transformed the ATCOPIA21 construct into WT plants and drm2 mutant plants, which are impaired in de novo DNA methylation. We assessed H3K27me3 levels at the neo-inserted sequences in large pools of primary transformants of the same size for each genetic background. Surprisingly, we observed a drastic decrease in H3K27me3 at ATCOPIA21 copies in the drm2 mutant plants compared with the WT plants (Fig. 1B). This finding reveals that DRM2 is required for the de novo deposition of H3K27me3 at newly inserted TE sequences.
DRM2 is involved in the maintenance of H3K27me3 at endogenous TEs
Next, we profiled H3K27me3 in drm2 mutants and observed globally unchanged H3K27me3 levels at genes and endogenous TEs (Fig. S2A). Thus, DRM2 is not necessary for general PRC2 activity and does not appear to have indirect effects on the PcG machinery. Closer inspection revealed an increase in H3K27me3 in drm2 plants at a subset of ∼900 TEs (Fig. 2A left panel): 32% of them overlapped with previously described differentially methylated regions (DMRs36) that significantly lose CHH methylation (Fig. S2B). This indicates that CHH methylation can antagonize H3K27me3 deposition (Fig. 2A left, S2B and 2B left), as previously described for CG methylation30,31. Of note, the extent of H3K27me3 gain did not seem proportional to the extent of CHH loss (Fig. S2B), possibly because the loss of methylation at one or two given cytosines, while not being called as a DMR, is sufficient to allow H3K27me3 deposition.
On the other hand, we observed a drastic loss of H3K27me3 in drm2 at another subset of TEs (Fig. 2A right panel, S2C and 2B right panel), which is in line with the observations at the ATCOPIA21 transgene. 36% of these TEs contain previously described CHH-DMRs36 in drm2 and showed a more drastic loss of H3K27me3 in drm2 than the other TEs did (Fig. S2C). These results indicate that DRM2-mediated maintenance of CHH methylation at a TE subset can be important for proper H3K27me3 patterning, although a DNA methylation-independent role of DRM2 cannot be excluded.
DRM2-mediated H3K27me3 deposition is mostly specific to TEs
We next aimed to further characterize the contexts in which H3K27me3 deposition is dependent on DRM2 (i.e., 7.1% of the H3K27me3 peaks, Fig. S3A). First, DRM2-dependent peaks accounted for 17% and 3.6% of the H3K27me3 peaks detected at TEs and genes, respectively, showing that this crosstalk is more specific to TEs. Second, H3K27me3 levels are lower at loci where H3K27me3 is dependent on DRM2 than at loci where H3K27me3 is independent of DRM2 (Fig. 3B). In addition, the loci where H3K27me3 marks are DRM2-dependent harbor higher levels of 24-nt small RNAs and DNA methylation (in each context) than the loci where H3K27me3 marks are DRM2-independent (Fig. 3C). This indicates a particular state at H3K27me3-marked, DRM2-dependent loci, where two marks with normally separate functions can colocalize and even cooperate in their recruitment; we term this state ‘ambivalent’.
Given that H3K27me3 colocalizes with and can be dependent on H2AK121ub at genes and TEs23,28, we asked whether H2AK121ub is present at the ‘ambivalent’ loci. H2AK121ub peaks were detected at 92% of the loci with DRM2-independent H3K27me3 marks (Fig. S3B top) but at only 34% of the loci with DRM2-dependent H3K27me3 marks (Fig. S3B bottom). In addition, the overlap between H3K27me3 and H2AK121ub was shorter at DRM2-dependent loci than at loci where H3K27me3 is independent of DRM2 (mean of 271 bp and 1060 bp, respectively) (Fig. S3C). Thus, ‘ambivalent’ TEs are less often associated with H2AK121ub.
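The two summary statistics used above (the fraction of H3K27me3 loci that overlap an H2AK121ub peak, and the mean length of that overlap) can be sketched as follows. This is an illustrative sketch only, not the analysis code used in this study: it assumes half-open (start, end) intervals on a single chromosome and non-overlapping H2AK121ub peaks, and all names and coordinates are hypothetical.

```python
def overlap_bp(a, b):
    """Number of base pairs shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def summarize_overlap(h3k27_peaks, h2aub_peaks):
    """Return (fraction of H3K27me3 peaks overlapping any H2AK121ub peak,
    mean overlap length in bp among the overlapping peaks)."""
    lengths = []
    for peak in h3k27_peaks:
        shared = sum(overlap_bp(peak, other) for other in h2aub_peaks)
        if shared > 0:
            lengths.append(shared)
    fraction = len(lengths) / len(h3k27_peaks) if h3k27_peaks else 0.0
    mean_len = sum(lengths) / len(lengths) if lengths else 0.0
    return fraction, mean_len


# Hypothetical coordinates, for illustration only.
h3k27 = [(100, 600), (1000, 1500), (2000, 2400)]
h2aub = [(400, 700), (1200, 1300)]

fraction, mean_len = summarize_overlap(h3k27, h2aub)
```

Running the same function separately on DRM2-dependent and DRM2-independent peak sets would give the two pairs of values that are compared in the text.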
Finally, motif enrichment analyses further revealed that loci where H3K27me3 is dependent on DNA methylation are enriched in C-rich motifs (particularly in the CHH context) (Fig. 3D and S3D). Thus, TEs at which H3K27me3 depends on the presence of DRM2 display both genetic and epigenetic signatures that may underlie this ‘ambivalent’ chromatin state.
DNA demethylation impacts H3K27me3 patterning at endogenous TEs
To further explore the interconnections between the DNA methylation and PcG pathways, we profiled H3K27me3 in the triple DNA demethylase mutant rdd (for ros1/dml2/dml3). In a subset of TEs marked by H3K27me3 in WT and targeted by ROS1/DML2/DML337, we observed a loss of H3K27me3 upon DNA hypermethylation (Fig. 4A-B and S4A left panels). This finding indicates that H3K27me3 recruitment at some TEs, which we refer to as “Type 1-TEs”, could be promoted by active demethylation. This is consistent with the previously described antagonistic effect of DNA methylation on H3K27me3 deposition. Regions with loss of H3K27me3 did not always overlap with the published DMRs37, which could be explained by the observation that complete loss of H3K27me3 was associated with the gain of methylation at only a few cytosines at some loci (Fig. S4B). Interestingly, and in line with the loss of H3K27me3 observed at some TEs in drm2 mutants, we also identified a subset of targets where the gain of DNA methylation in rdd leads to a gain of H3K27me3 (Fig. 4A-B and S4A right panels). These TEs carry little or no H3K27me3 in WT, and we refer to them as “Type 2-TEs”. Again, regions with a gain of H3K27me3 did not always overlap with the published DMRs in rdd37: this is likely explained by the observation that a gain of H3K27me3 is associated with the gain of methylation at only a few cytosines at some loci (Fig. S4C).
A targeted DNA methylation approach shows the direct impact of DNA methylation on H3K27me3 deposition in a locus-dependent manner
We next wanted to verify whether the loss of H3K27me3 at the DNA hypermethylated regions was the direct consequence of ectopic DNA methylation. For this purpose, we constructed an inverted repeat transgene to produce small RNAs able to override the dominance of ROS1 activity over RdDM21 and induce DNA methylation38 at a specific TE locus (AT1TE59770) (Fig. 5A left panel, Fig. 5B). In the two independent RNAi lines that we isolated, we observed a clear decrease in H3K27me3 at the endogenous TE compared to the transgenic lines with an unrelated transgene (“Target 1”, Fig. 5C and Fig. S5A left panels). We thus show for the first time that targeting small RNA-directed DNA methylation to a H3K27me3-marked locus can cause a direct loss of H3K27me3.
To verify whether the gain of H3K27me3 at the DNA hypermethylated regions was the direct consequence of ectopic DNA methylation, we constructed another inverted repeat transgene, this time targeting a locus that gains H3K27me3 in rdd (AT5TE81105) (Fig. 5A right panel, Fig. 5B). In two independent RNAi lines, we observed a slight increase in H3K27me3 at the endogenous TE targeted by RNAi (“Target 2”, Fig. 5C and Fig. S5A right panels). This indicates that targeting DNA methylation to a Type 2-TE can increase the recruitment of PRC2. This finding is in accordance with the striking observation that H3K27me3 deposition at a newly inserted TE is dependent on DRM2 (Fig. 1C) and provides further evidence that DNA methylation can favour H3K27me3 deposition in a locus-specific manner.
DISCUSSION
We previously showed that H3K27me3 recruitment and patterning at TEs share commonalities with PcG target genes, such as a partial dependency on PRC1, H2AZ variant incorporation, and the activity of JMJ histone demethylases; besides, sequence recognition motifs such as PREs39 may also be involved in H3K27me3 deposition since H3K27me3 appears to be instructed by the TE sequence itself28. Here, by studying the establishment of H3K27me3 on a newly inserted TE sequence, we reveal a novel dependency of PRC2 activity on the DRM2 de novo methyltransferase, which is TE-specific (Fig. 6 A). De novo deposition of H3K27me3 in this context was impaired in the drm2 mutant, which has never been reported in any organism before.
In addition, DRM2 is also required for proper maintenance of H3K27me3 patterns at a subset of TE loci, as recently reported in rice genes, where H3K27me3 and non-CG methylation can colocalize40,41. Accordingly, we show that active DNA demethylation can result in the loss of H3K27me3 at some TEs. In contrast, DNA methylation can antagonize H3K27me3, as indicated by the increase in H3K27me3 in drm2 and the loss of H3K27me3 in rdd in different TE subsets. The antagonistic effect of CG methylation on H3K27me3 deposition was previously shown30; however, our results provide two novel insights. First, DRM2-mediated CHH methylation can also antagonize PRC2 recruitment, which is consistent with recent H3K27me3 profiles of rice and maize RdDM mutants42,43. Second, we demonstrate a novel role for DNA demethylases in promoting PRC2 recruitment. This result is in line with previous observations that ROS1 targets are enriched in H3K27me344 and could provide an explanation as to why metazoan TET enzymes are enriched at the hypomethylated DNA promoters of PcG targets45. Taken together, our results point to locus-specific rules for H3K27me3 deposition, which would either require DNA methylation or be antagonized by its presence (Fig. 6 B). This suggests a mode of PRC2 recruitment determined by specific genetic and epigenetic signatures and the existence of a set of transcription factors (TFs) or PRC2 cofactors that may display different affinities for DNA methylation and remain to be identified. This does not exclude direct interactions between PRC2 and DRM2 or the RdDM machinery in a locus-specific or context-specific manner (in the context of neo-insertion, for example). In that respect, the transgenic system that we have established provides a framework to dissect, in future work, the complex interactions between H3K27me3 and DNA methylation during establishment and maintenance throughout generations at a single TE copy.
The interconnection that we revealed between de novo DNA methylation and deposition of H3K27me3 on newly inserted TEs argues for active cooperation between the two marks at this stage of the TE life cycle. This raises the question as to why such cooperation would take place when a novel copy has just integrated the genome. When they are inserted into the genome, retrotransposons are DNA hypomethylated. The targeting of PcG at this stage could thus allow rapid silencing of the element while DNA methylation is being established progressively throughout successive generations. One exciting question to address in the future is whether H3K27me3 persists after one round of reproduction or if DNA methylation quickly becomes dense enough to antagonize H3K27me3 in subsequent generations. If this is the case, H3K27me3 could just be a transient silencing form established quickly by the plant to compensate for low de novo DNA methylation in the first generation(s) after TE neo-insertion. Our study thus points to a synergy between the two silencing pathways, a concept that also recently emerged at other types of repeats46,47.
For the endogenous TEs, we identified two additional layers of H3K27me3 regulation linked with DNA methylation. First, we demonstrated that DNA demethylases can target specific TEs to promote H3K27me3 deposition instead of DNA methylation. This could be important for TEs that, as epigenetic modules, negatively impact nearby gene regulation: when marked by H3K27me3 instead of DNA methylation, they would contribute to a partially repressed state as opposed to a locked silencing state, since H3K27me3 is more plastic and labile in response to developmental or environmental cues. Second, the identification of TEs where DNA methylation and H3K27me3 marks not only co-occur but are also interdependent points to what we refer to as an “ambivalent state”, which is actively maintained, and where the loss of one mark is linked to the loss of the other. This could similarly be advantageous for nearby gene regulation; for example, a decrease in CHH methylation in response to certain stresses (such as pathogen-induced stress48) could lead to a concomitant decrease in the H3K27me3 mark even if the latter is not sensitive per se to that stress. In that sense, these ambivalent TEs could constitute sensitized modules for the dynamic regulation of nearby genes.
By connecting PRC2 to DNA methylation at TEs, our work in Arabidopsis provides important insights into the separation and specialization of the two major silencing pathways throughout eukaryotic evolution. We previously proposed that PcG was an ancestral system of TE silencing based on H3K27me3 being the dominant mark at TEs in unicellular organisms or ancestral plants. Furthermore, in ciliates, a small RNA-guided enhancer of zeste (the conserved catalytic subunit of PRC2) results in both H3K27me3 and H3K9 methylation49,50, which either reflects ciliate-specific catalytic activity or suggests that ancestral PRC2 has both activities. We propose that with the evolution of multicellularity and the need for a dynamic system to control developmental transitions, the PcG and H3K9me2/DNA methylation pathways may have specialized for the silencing of genes and TEs, respectively. Our present results nevertheless show that the major silencing pathways in eukaryotes maintain mechanistic connections despite specialization for different functions in higher plants, which is likely to help their functional cooperation in silencing in specific contexts. Such interconnections may exist in other kingdoms, as suggested by the small RNA-driven deposition of H3K27me3 in C. elegans51,52 or the existence of AEBP2, a mammalian PRC2 cofactor that requires DNA methylation for its activity53. Conversely, DNA methylation could be dependent on PRC2, as previously suggested54,55, and it would be interesting to investigate this possibility in plants. Future work needs to further decipher the connections between PcG and H3K9/DNA methylation and how they evolved from unicellular to multicellular organisms. This should undoubtedly shed light on the evolution of silencing pathways in eukaryotes and how they shape host genome regulation.
MATERIAL AND METHODS
Plant material and growth conditions
All the experiments were conducted on A. thaliana grown on ½ MS plates under short-day conditions (8-h light/16-h dark photoperiod at 22°C). For the transgenic plants shown in Fig. 1, 4-week-old rosette leaves were pooled and collected for further ChIP-seq and BS-seq analysis.
Mutant lines
We used the drm1-2 drm2-2 double mutant (Salk-031705, Salk-150863)4 and the ros1 dml2 dml3 triple mutant (Col-0 background, derived from Salk_045303 Salk_056440 Salk_131712)13.
Generation of transgenic lines
The COPIA21 TE sequence was synthesized and cloned into pUC57 by GenScript. The COPIA21 TE was subsequently cloned into pCAMBIA3300. The plants were then transformed via Agrobacterium tumefaciens floral dip56. For the experiments shown in Fig. 1, the transgenic plants were selected on Basta after 2 weeks and transferred to soil, and 4-week-old rosette leaves were collected in pools of 15-20 plants to perform ChIP-seq and BS-seq (on the same ground tissue).
RNAi lines were obtained by cloning fragments of approximately 250 bp in an inverted orientation into the pFRN vector. The plants were subsequently transformed via Agrobacterium tumefaciens floral dip56, and T1 plants were selected in vitro on kanamycin.
Chromatin immunoprecipitation (ChIP) and ChIP‒qPCR/sequencing analyses
ChIP experiments were conducted in WT or appropriate mutant lines with an anti-H3K27me3 antibody. IP and INPUT DNA were eluted, purified and sequenced (100 bp paired-end; Illumina) by BGI. Reads were mapped with BWA57 onto the TAIR10 A. thaliana genome. Genomic regions significantly marked by H3K27me3 were identified with MACS258, and genes or TEs overlapping these regions were obtained with bedtools59. Heatmaps and profile plots were generated with deepTools: computeMatrix was used to create a score matrix, and plotHeatmap was used to generate a graphical output of the matrix. For ChIP-seq analyses in the rdd mutant, regions inherited from the Ws-2 ros1 and dml2 mutants after backcrossing into Col-0 were excluded from the analysis as previously described13.
Read count analyses
Reads overlapping with the SNPs (between transgenes and endogenes) at ATCOPIA21 were extracted, counted and normalized by total read number via SAMtools view. In the control regions (UBQ and FLC), reads were extracted at positions Chr2:15,143,214 (UBQ) and Chr5:3,178,750 (1st intron FLC), respectively.
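The counting and normalization step described above can be sketched as follows. This is a minimal illustration, not the code used in this study: it assumes that the base carried by each read at a given SNP position has already been extracted (for example from SAMtools output), and the function name, the reads-per-million scaling and the example values are hypothetical.

```python
from collections import Counter


def allele_counts_per_million(bases_at_snp, total_reads,
                              endogene_base, transgene_base):
    """Count reads carrying each allele at one SNP, scaled per million reads.

    bases_at_snp: iterable of single-letter base calls, one per read.
    total_reads: total number of reads in the library (normalization factor).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    counts = Counter(base.upper() for base in bases_at_snp)
    scale = 1_000_000 / total_reads
    return {
        "endogene": counts[endogene_base] * scale,
        "transgene": counts[transgene_base] * scale,
    }


# Hypothetical pileup at one SNP: 4 reads with A, 5 with G, 1 with T.
bases = list("AAAGGGGGAT")
result = allele_counts_per_million(bases, total_reads=2_000_000,
                                   endogene_base="A", transgene_base="G")
```

Reads carrying neither allele (the single T here) are ignored, so sequencing errors at the SNP position do not inflate either count.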
ACKNOWLEDGEMENTS
We thank N. Bouché for his help with some bioinformatic analyses. We thank ANRJCJC (ANR-19-CE12-0033-01 to A.D.) for funding and the Genome Biology Department of Institut de Biologie Intégrative de la Cellule (I2BC) for support. We thank the services and platforms of the Institut de Biologie Intégrative de la Cellule (I2BC) for excellent technical support and in particular Véronique Couvreux for excellent plant care.