Abstract
Linker histones play a pivotal role in shaping chromatin architecture, notably through their globular H1 (GH1) domain that contacts the nucleosome and linker DNA. Yet, interplays of H1 with chromatin factors along the epigenome landscape are just starting to emerge. Here, we report that Arabidopsis H1 occupies and favors both chromatin compaction and H3K27me3 enrichment on a majority of Polycomb-target protein-coding genes. In contrast, H1 prevents H3K27me3 accumulation on telomeres and pericentromeric interstitial telomeric repeats (ITRs) while orchestrating long-distance interactions regulating the 3D organization of these chromosome regions. Mechanistically, H1 prevents ITRs from being invaded by Telomere Repeat Binding 1 (TRB1), a GH1-containing telomere component with extra-telomeric functions in Polycomb recruitment. Based on these findings, we propose that H1 represses H3K27me3 accumulation on large blocks of telomeric repeats by antagonizing TRB1 association to linker DNA, conferring to linker histones an additional and sequence-specific role in modulating H3K27me3 epigenome homeostasis.
Introduction
Besides core histones, chromatin architecture and functionality rely on linker histone H1 whose central globular (GH1) domain sits on the nucleosome dyad while its intrinsically disordered carboxy-terminal domain binds linker DNA at the nucleosome entry and exit sites (Bednar et al., 2017; Zhou et al., 2015). H1 incorporation directly influences the physico-chemical properties of the chromatin fiber and further modulates nucleosome distribution and chromatin compaction. H1 also contributes to the local variation in transcriptional activity by affecting the accessibility of transcription factors and RNA polymerases to chromatin but also through interactions with histone and DNA modifiers (reviewed in Bednar et al., 2016; Fyodorov et al., 2017; Hergeth and Schneider, 2015).
Polycomb-Group (PcG) proteins are other important determinants of chromatin compaction and transcriptional activity, influencing cell identity and differentiation in metazoans (Grossniklaus and Paro, 2014; Schuettengruber et al., 2017), plants (Hugues et al., 2020) and unicellular organisms (Schubert, 2019). In metazoans, the chromatin of PcG target genes is highly compacted (Francis et al., 2004; Shao et al., 1999; Shu et al., 2012), a feature thought to hinder transcription (reviewed in Illingworth, 2019; Schuettengruber et al., 2017). The repressive activity of PcG proteins on transcription involves the enzymatic activity of Polycomb Repressive Complex 1 (PRC1) and 2 (PRC2) mediating histone H2A Lysine monoubiquitination (H2Aub) and histone H3 Lysine 27 trimethylation (H3K27me3), respectively (Grossniklaus and Paro, 2014; Schuettengruber et al., 2017).
Both nucleosomal and higher-order chromatin organization rely to a large extent on the regulation of chromatin compaction and accessibility (reviewed in Bonev and Cavalli, 2016), in which both H1 and Polycomb complexes play essential roles (Feng et al., 2014; Geeven et al., 2015; Grob et al., 2014; Liu et al., 2016; Sexton et al., 2012). Firstly, PRC1 subunits, such as Posterior sex combs (Psc) in Drosophila, Chromobox 2 (Cbx2) in mammals, or EMBRYONIC FLOWER1 (EMF1) in Arabidopsis, display highly positively charged regions that trigger chromatin compaction in vitro (Beh et al., 2012; Grau et al., 2011) and can mediate gene silencing and affect genome topology in vivo (Lau et al., 2017; Terranova et al., 2008). Secondly, mammalian PRC2 favors chromatin compaction, either by promoting PRC1 recruitment or through its subunit Enhancer of Zeste homolog 1 (Ezh1) in a mechanism not necessarily relying on the H3K27me3 mark itself (Margueron et al., 2008). Specific functional interplays between PcG and H1 in chromatin compaction have also emerged (Yuan et al., 2012). In humans, the preferential interaction of H1.2 with H3K27me3 nucleosomes promotes chromatin compaction (Kim et al., 2015), while, conversely, human and mouse PRC2 complexes display substrate preferences for H1-enriched chromatin fragments, their in vitro activity being more stimulated on dinucleosomes than on mononucleosomes (Martin et al., 2006; Willcockson et al., 2020). In agreement with these findings, mouse H1 has recently been identified as a critical regulator of H3K27me3 enrichment over hundreds of PRC2 target genes (Willcockson et al., 2020; Yusufova et al., 2020).
Interestingly, chromosome conformation capture (Hi-C) analysis of haematopoietic cells (Willcockson et al., 2020), germinal centre B cells (Yusufova et al., 2020), and embryonic stem cells (Geeven et al., 2015) showed that H1-mediated chromatin folding in the nuclear space triggers distinct genome topologies in diverse cell types or during differentiation in mammals. Still, a potential sequence specificity of these events and their conservation through evolution have not yet been addressed.
In Arabidopsis thaliana, the H1.1 and H1.2 canonical linker histone variants display similar chromatin association properties and represent the full H1 complement in most somatic cells while a third and atypical H1.3 variant is only expressed under stress conditions or in a few cell types (reviewed in Kotliński et al., 2017; Over and Michaels, 2014; Probst et al., 2020). H1.1 and H1.2, hereafter referred to as H1, are enriched over heterochromatic transposable elements (TEs), which also display high levels of cytosine methylation, of nucleosome occupancy, and of heterochromatic histone modifications such as H3K9me2 (Choi et al., 2019; Rutowicz et al., 2015). In contrast, H1 is less abundant over genes marked by transcriptionally permissive histone modifications (Rutowicz et al., 2015; Choi et al., 2019). Hence, as in mammals, incorporation of Arabidopsis H1 is thought to dampen transcription elongation, an effect that, in plants, also applies to the production of TE-derived short interfering RNAs (siRNAs) (Papareddy et al., 2020), thereby not only restricting RNA Polymerase II (Pol II) but also RNA Pol IV activity. Arabidopsis H1 further restricts accessibility of DNA methyltransferases and demethylases that target TE sequences to regulate their heterochromatinization and silencing (He et al., 2019; Liu et al., 2020; Lyons and Zilberman, 2017; Wollmann et al., 2017; Zemach et al., 2013). Interestingly, the H1 complement is massively degraded in the vegetative pollen cell nucleus (He et al., 2019; Hsieh et al., 2016) and during the formation of Arabidopsis spore mother cells (SMCs), a depletion that coincides with heterochromatin loosening and a reduction in most H3K27me3 signals in SMC nuclei (She et al., 2013; She and Baroux, 2015). 
Hence, despite evidence that control of H1 abundance is used by plants for a global reprogramming of their epigenome, there is currently no information on the consequences of H1 depletion on the H3K27me3 chromatin landscape and, more generally, in the dynamic equilibria modulating chromatin compaction and accessibility in plants.
Using cytogenetic and bulk analyses of chromatin, we previously reported that somatic cells of Arabidopsis H1 knockout plants not only display dramatic defects in heterochromatin compaction but also present globally reduced H3K27me3 abundance whilst, oppositely, a few discrete subnuclear foci of undetermined nature display increased H3K27me3 signals (Rutowicz et al., 2019). Here, we report that H1 is required for H3K27me3 enrichment and low accessibility on a majority of the PRC2 target genes while preventing the accumulation of this mark and influencing the 3D organization of telomeres and pericentromeric interstitial telomeric regions. We explored the specificity of these chromatin alterations through the involvement of Telomere Repeat Binding 1 (TRB1), a GH1-containing histone-related protein that displays extra-telomeric roles in PRC2 recruitment on telomeric motifs (Schrumpfová et al., 2016; Xiao et al., 2017; Zhou et al., 2018). Collectively, our findings led us to establish that H1-mediated chromatin compaction orchestrates Arabidopsis chromosomal organization and contributes to the control of H3K27me3 homeostasis between structurally distinct regions in a sequence-specific manner, notably by antagonizing DNA binding of TRB proteins.
Results
H1 is abundant in the body of H3K27me3-marked genes and reduces their chromatin accessibility
To assess the relationships between H3K27me3, chromatin accessibility and the H1 landscapes, we first compared the genomic distribution of H3K27me3 with that of H1.2, the most abundant canonical H1 variant. To maximize H1 ChIP-seq specificity, we used an H1.2 GFP-tagged version transcribed under the control of its own promoter in wild-type (WT) plants (Rutowicz et al., 2015). In agreement with previous studies in plants and other eukaryotes (Cao et al., 2013; Choi et al., 2019; Izzo et al., 2013; Rutowicz et al., 2015), H1 distribution covered most of the Arabidopsis genome without displaying clear peaks. However, examination of H1 profiles over protein-coding genes, which are the quasi-exclusive carriers of H3K27me3 in Arabidopsis (Roudier et al., 2011; Sequeira-Mendes et al., 2014; Wang et al., 2015), showed that H1 is highly abundant on H3K27me3-marked gene bodies, especially towards their 5’ region in comparison to non-marked genes (Figure 1A), and this independently from variations in histone H3 profiles used as a proxy of nucleosome occupancy (Figure S1A).
Examination of transcription start sites (TSS) confirmed that H1 is globally more abundant on H3K27me3-marked genes than on genes with hallmarks of active Pol II initiation (H3K4me3), elongation (H2Bub), or having high transcript levels (Figure S1A). H1 was also globally much more abundant on H3K27me3-marked genes than on heterochromatic H3K9me2-rich transposable elements (TEs) (Figure 1A), which are themselves H1-rich and heavily condensed (Choi et al., 2019). This genome-wide enrichment suggests local interplays between H1 and H3K27me3 on protein-coding genes.
We examined chromatin accessibility of H3K27me3-marked genes by Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) analysis of 4C nuclei, corresponding to the most abundant ploidy level in our samples. In WT plants, H3K27me3-marked genes displayed low chromatin accessibility as compared to expressed genes (Figure S1B), which typically show a sharp ATAC peak at their TSS in such analyses (Lu et al., 2016). We similarly examined h1.1h1.2 (2h1) double mutant plants to test whether H1 contributes to this low accessibility. In contrast to transcriptionally active genes, H3K27me3-marked genes display no change in the TSS accessibility peak upon H1 depletion but they are globally more accessible along their body as well as their promoter and terminator domains (Figure 1B and S1B). Hence, H1 strongly occupies H3K27me3-marked genes where it could favor low chromatin accessibility.
H1 is necessary for H3K27me3 enrichment at a majority of PRC2 target genes
As a first assessment of functional links between H1 and H3K27me3, we profiled the H3K27me3 landscape upon H1 depletion. Considering that H3K27me3 bulk levels are ~2-fold lower in immunoblot and immunocytological analyses of 2h1 seedling nuclei (Rutowicz et al., 2019), we employed a spike-in ChIP-seq with reference exogenous genome (ChIP-Rx) approach using a fixed amount of Drosophila chromatin added to each sample prior to immunoprecipitation and subsequently quantified in sequenced input and immunoprecipitated samples (as in Nassrallah et al., 2018) (Figure S2 and Additional File 1). Among the 7542 genes significantly marked by H3K27me3 in WT plants, 4317 (~57%) displayed lower levels of the mark in 2h1 plants while only 496 had the opposite tendency (Figure 2A-C, S3 and Additional File 1). Hence, the decreased levels of H3K27me3 in 2h1 seedlings previously identified by immunoblotting and immunocytology (Rutowicz et al., 2019) result from a widespread effect of H1 on PRC2 target genes.
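The spike-in normalization described above can be sketched as follows. This is a minimal illustration, assuming one common ChIP-Rx formulation in which the scaling factor equals the percentage of spike-in reads in the input divided by the number of spike-in reads (in millions) in the immunoprecipitated sample. The function names and read counts are hypothetical, and the exact computation used in this study may differ.

```python
def rx_scaling_factor(spike_reads_ip, spike_reads_input, target_reads_input):
    """Spike-in scaling factor: r / (spike-in IP reads, in millions).

    r is the percentage of spike-in (e.g. Drosophila) reads in the input.
    Assumed formulation for illustration; not taken from the study itself.
    """
    if spike_reads_ip <= 0:
        raise ValueError("IP sample contains no spike-in reads")
    total_input = spike_reads_input + target_reads_input
    r = 100.0 * spike_reads_input / total_input
    return r / (spike_reads_ip / 1e6)


def normalize_coverage(coverage, factor):
    """Multiply per-bin read coverage by the sample's scaling factor."""
    return [value * factor for value in coverage]


# Hypothetical read counts: both samples share the same input composition,
# but the second IP contains twice as many spike-in reads.
wt_factor = rx_scaling_factor(2_000_000, 500_000, 9_500_000)
mutant_factor = rx_scaling_factor(4_000_000, 500_000, 9_500_000)
```

With these made-up counts, the sample whose immunoprecipitate contains proportionally more spike-in reads receives the smaller factor, which is how a global loss of the mark is preserved after normalization instead of being erased by sequencing-depth scaling.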
We first envisaged that local H1 stoichiometry may contribute to determine PRC2 marking in cis as recently reported in mouse (Willcockson et al., 2020). We cannot rule out that a similar mechanism is at play in Arabidopsis but, oppositely, we found that H3K27me3 hypo-marked genes tend to display lower H1 levels in WT plants than unaffected ones (Figure 2D). Hence, in the absence of direct correlation underlying a selective influence of H1 on genes, we envisaged a sequence-dependent mechanism and explored it by examining sequence motifs in the promoters of H3K27me3 hypo/hyper-marked gene sets. This did not reveal any over-represented sequences in the promoters of the thousands of hypo-marked genes. In contrast, we found three prevalent motifs in the 496 promoter sequences of the hyper-marked gene set (Figure S4). Among them, monomeric AAACCCTA telomeric motifs, referred to as telobox regulatory elements (Regad et al., 1994; Tremousaygue et al., 1999), can serve as Polycomb Response Elements (PREs) in Arabidopsis (Xiao et al., 2017; Zhou et al., 2018). In total, 87 of the 496 hyper-marked genes contain one or more telobox motifs in their promoters (Additional File 1).
Having identified that H1 is required for H3K27me3 marking over a large set of genes, we examined in our ATAC-seq data whether they had distinct chromatin accessibilities as compared to unaffected genes. In line with their moderate H1 occupancy, H3K27me3 hypo-marked genes were globally more accessible in WT plants than unaffected or hyper-marked genes (Figure 2D and E). We then tested chromatin accessibility of these gene categories in 2h1 nuclei. Hypo-marked gene bodies were slightly more accessible in the mutant line (Figure 2E), thereby correlating with local reduction in H3K27me3. Interestingly, accessibility of hyper-marked genes was also globally increased in 2h1 plants despite a clear gain in H3K27me3. These discrepancies suggested that mostly H1 incorporation and not variations in H3K27me3 influence the accessibility of PRC2-target genes.
H1 contributes to define expression of PRC2 target genes
To get more insights into the biological role of H1 on H3K27me3-marked genes, hereafter referred to as PRC2-target genes, we compared their transcript levels in WT and 2h1 plants. Similarly to previous reports (Choi et al., 2019; Rutowicz et al., 2019), our RNA-seq analysis confirmed that H1 loss-of-function triggers only minor gene expression changes in Arabidopsis seedlings (Additional File 2), but focusing on the set of H3K27me3 hypo-marked genes showed a significant upregulation of this gene repertoire in 2h1 plants (Figure 2F). Functional categorization of the most significantly misregulated ones identified an over-representation of genes involved in transcriptional regulation, meristem maintenance, cell wall organization and vascular development (Figure S5A). These categories are consistent with the repression of these biological processes by PRC2 (de Lucas et al., 2016) and with the subtle PRC2-like phenotypes of H1 mutant plants (Rutowicz et al., 2019). Reciprocally, a survey of the hyper-marked genes identified a large proportion of TE-like features, as 25% (124/496) of them overlap with a TE annotation or are annotated as transposon genes (Figure S5B, Additional File 1). Collectively, we identified a positive effect of H1 on H3K27me3 enrichment and chromatin compaction; effects that may jointly influence expression. This trend is contrasted by H1 preventing H3K27me3 accumulation at a minority of H1-rich and poorly expressed genes that frequently display TE features.
H1 prevents H3K27me3 invasion over a specific family of heterochromatic repeats
Considering the unexpected de novo marking of H3K27me3 on several TE-related genes in 2h1 plants, we extended our analysis to all TEs that, in Arabidopsis, usually lack H3K27me3 at baseline. This revealed that 1066 TEs are newly marked by H3K27me3 in 2h1 plants, most frequently over their entire length, thereby excluding a priori the possibility of a spreading from neighboring genes (Figure 3A). We clustered them into two groups, TE cluster 1 (n=216) with high enrichment of H3K27me3, and TE cluster 2 (n=850) with modest H3K27me3 enrichment (Figure 3A). While TE cluster 2 is composed of a representative variety of TE super-families, TE cluster 1 mostly consists of “DNA/Others” repeat annotations prevalently corresponding to ATREP18 elements (Figure 3B). This is a strong over-representation as, in total, TE cluster 1 and 2 comprise 60% of the 391 ATREP18s Arabidopsis elements (189 and 47, respectively), including many of the longest units (Figure S6). Comparison of H3K27me3 variation with local H1 occupancy showed that ATREP18 elements stand out from the general population of TEs by their outstanding gain of H3K27me3 in 2h1 plants (red dots in Figure 3C). TE cluster 1, and more generally ATREP18 elements, display heterochromatic properties with H3K9me2 marks (Figure S7A-D), elevated nucleosome and H1 occupancies (Figure 3D) as well as an extremely low chromatin accessibility as compared to the majority of TEs (Figure 3E and S7E). Taken together, these observations indicate that H1 has a negative effect on H3K27me3 accumulation over a set of H1-rich, heterochromatic, and highly compacted repeats, which contrasts with H1’s positive influence on H3K27me3 marking over thousands of PRC2-target genes.
While TEs are, on average, more accessible in H1 loss-of-function mutant plants (Choi et al., 2019; Rutowicz et al., 2019), ATAC-seq analysis of TE cluster 1 and all ATREP18 repeats showed that chromatin of these elements remained very poorly accessible in 2h1 plants (Figure 3E and S6D). This observation indicates that chromatin “inaccessibility” of ATREP18 elements is H1-independent or that H1 depletion is compensated by other mechanisms, such as the strong H3K27me3 local enrichment.
Large blocks of heterochromatic interstitial telomeric repeats gain H3K27me3 in 2h1 plants
To further explore how H1 selectively contributes to prevent H3K27me3 marking on TE cluster 1-2, we first envisaged the H3K27me1 heterochromatic mark as a preferred substrate for H3K27 trimethylation. However, analysis of public datasets (Ma et al., 2018) showed that, as compared to other TEs, H3K27me1 is neither particularly abundant on TE cluster 1-2 nor on ATREP18 elements (Figure S7C). We further examined the epigenomic profile of loss-of-function plants for the three major histone H3K27me3 demethylases EARLY FLOWERING 6, RELATIVE OF ELF 6, and JUMONJI 13 (Yan et al., 2018). This revealed no evidence for selective H3K27me3 removal on TE cluster 1 elements in WT plants (Figure S8), altogether arguing against a specific role for H1 in promoting H3K27me3 demethylation at these loci.
We subsequently searched for over-represented DNA motifs in TE cluster 1 elements. Out of three 5-9 bp motifs identified, the most significantly enriched sequence corresponds to the telobox (ACCCTAA) motif (Figure 4A), as formerly found in TE-related genes gaining H3K27me3 in 2h1 plants (Figure S4). In total, 7328 teloboxes were found in 195 of the 216 TE cluster 1 elements (Additional File 1). Vice versa, searching this motif in all 391 annotated ATREP18 elements identified more than ten thousand motifs frequently organized as small clusters (Figure S9), which correspond to a ~100-fold over-representation as compared to other TEs (Figure 4B – note the two-colored scales). Besides these motifs, ATREP18s display neither typical TE functional features nor a predicted protein-coding region (Figure S10A). They are principally oriented on the minus DNA strand (Figure S10B) and most frequently located in close vicinity, nearly 90 % of them being positioned within 1 kb of each other (Figure S10C). Consistent with this spatial proximity, analysis of chromosomal distribution of ATREP18s revealed their concentration within two regions of ~355 kb and ~72 kb on chromosomes 1 and 4, respectively, which both colocalize remarkably well with outstanding H3K27me3 gain in 2h1 plants (Figure 4C and S11).
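Counting telobox occurrences in a repeat element amounts to scanning both DNA strands for the motif. The following is a minimal sketch of such a scan, assuming an exact-match search for the ACCCTAA sequence given above and counting overlapping hits. The function names and the test sequence are hypothetical, and this is not the motif-discovery tool used in the study.

```python
import re

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif="ACCCTAA"):
    """List (start, strand) for every exact motif match on either strand.

    Positions are 0-based on the forward strand. A zero-width lookahead is
    used so that overlapping occurrences are all reported.
    """
    seq = seq.upper()
    patterns = [("+", motif.upper())]
    rc = reverse_complement(motif)
    if rc != motif.upper():  # avoid double counting palindromic motifs
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        for match in re.finditer(f"(?={re.escape(pattern)})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


# Hypothetical sequence: two forward-strand motifs, then one on the minus strand.
example_hits = find_motif("ACCCTAAACCCTAAGGTTAGGGT")
```

Dividing the number of hits by element length would give a motif density that can be compared between ATREP18 elements and other TEs; degenerate repeats would require a tolerant matcher, which this sketch does not attempt.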
These specific features led us to consider these large blocks as two of the nine Arabidopsis genome loci previously identified as Interstitial Telomeric Repeat regions or ITRs (Uchida et al., 2002; Vannier et al., 2009), which comprise a mix of perfect and degenerated telomeric repeats (Schrumpfová et al., 2019). Altogether, 95 % of TE Cluster 1 elements are contained within two ITR blocks embedded in pericentromeric regions of chromosomes 1R and 4L, hereafter referred to as ITR-1R and ITR-4L (Figure 4C). Ectopic H3K27me3 deposition was also found on interspersed TE cluster 2 elements located in pericentromeric regions of the five chromosomes outside these two ITR blocks (Figure S11B), but our main conclusion is that H1 abundantly occupies two large blocks of pericentromeric ITRs where it prevents H3K27me3 marking.
H1 excludes the GH1-containing Telomere Repeat Binding 1 protein from ITR blocks
With the aim of assessing the molecular mechanisms that selectively drive PRC2 activity on ITRs in the absence of H1, we envisioned that Telomere Repeat Binding (TRB) proteins might have a prominent role. The TRB1-TRB3 founding members of this plant-specific family of GH1-containing Single-Myb-histone proteins constitute part of the telomere nucleoprotein structure required to maintain telomere length (Schrumpfová et al., 2014). The Myb domain of TRBs has strong affinity to the G-rich strand of telobox DNA motifs (Mozgová et al., 2008; Schrumpfová et al., 2004, 2014) and allows for a general role as transcriptional regulators onto gene promoters bearing telomeric telobox motifs (Schrumpfová et al., 2016; Zhou et al., 2016), notably through the recruitment of PRC2 (Xiao et al., 2017; Zhou et al., 2018). Interestingly, despite their low protein sequence similarity to H1 proteins (14±2%; Figure S12), TRBs display a typical GH1 domain (Charbonnel et al., 2018; Kotliński et al., 2017). Hence, we reasoned that reciprocal chromatin incorporation of these two categories of GH1-containing proteins might modulate PRC2 recruitment on ITRs.
To explore this possibility, we first examined the genomic distribution of TRBs with regard to H1 and telobox motifs using available TRB1 ChIP-seq data (Schrumpfová et al., 2016). This profiling analysis confirmed that TRB1 peaks were expectedly centered on telobox motifs (Schrumpfová et al., 2016) and that H1 is globally enriched over TE Cluster 1 elements, but the H1 pattern appeared anti-correlated with telobox density and with TRB1 peaks (Figure 5A and S13). To assess whether this apparent antagonism between H1 and TRB1 is a general property outside of teloboxes, we examined H1 occupancy over all TRB1 genome-binding sites and also found an inverse correlation between H1 and TRB1 on genes and TEs (Figure 5B). Consistent with the hypothesis that H1 prevents PRC2 recruitment over telomeric repeats, these observations hint at an antagonistic cis-enrichment of H1 and TRB1 over telobox-rich regions, including TE Cluster 1.
To better resolve this apparent mutual exclusion and link it to nucleosome core and linker DNA positioning, we took advantage of the well-positioned nucleosome (WPN) coordinates previously defined using MNase-seq (Lyons and Zilberman, 2017). We plotted the profiles of H1, TRB1, telobox motifs, and nucleosome occupancy over all WPNs. As expected, H1.2-GFP distribution matched the DNA linker regions, while TRB1 tended to have a much broader distribution spanning 4-5 nucleosomes. The TRB1 profile nonetheless displayed a periodic pattern matching linker DNA coordinates, and, surprisingly, the telobox distribution itself sharply coincided with regions serving as linker DNA (Figure 5C). Considering this unexpected preferential distribution over linker DNA, we concluded that telobox motifs may play a functional role in nucleosome distribution and are subject to antagonistic binding by H1 and TRB1 (schematized in Figure 5D).
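Profiles of this kind are typically built by averaging a per-base signal in a fixed window centered on each nucleosome position. A minimal sketch of that aggregation is given below, assuming a per-base signal stored in a list and nucleosome centers given as 0-based coordinates. The function name and the toy 10 bp repeat are hypothetical and only illustrate how a linker-associated signal produces a periodic average profile.

```python
def aggregate_profile(signal, centers, flank):
    """Mean signal at each offset in [-flank, +flank] around the centers.

    Windows that run past either end of the signal are skipped so that every
    offset is averaged over the same number of sites.
    """
    width = 2 * flank + 1
    totals = [0.0] * width
    used = 0
    for center in centers:
        start, end = center - flank, center + flank
        if start < 0 or end >= len(signal):
            continue
        for offset in range(width):
            totals[offset] += signal[start + offset]
        used += 1
    if used == 0:
        raise ValueError("no window fits inside the signal")
    return [total / used for total in totals]


# Toy data: a 10 bp repeat in which positions 7-9 of each unit stand for
# linker DNA (signal 1.0) and positions 0-6 for the nucleosome core (0.0).
toy_signal = [1.0 if position % 10 >= 7 else 0.0 for position in range(100)]
toy_centers = list(range(3, 100, 10))  # one center per repeat unit
toy_profile = aggregate_profile(toy_signal, toy_centers, flank=5)
```

In this toy case the averaged profile is zero at the window center and maximal at its edges, the same qualitative shape expected for a factor that binds linker DNA flanking well-positioned nucleosomes.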
Based on these findings, we tested whether H1 depletion triggers ectopic TRB1 recruitment to telobox-containing TE Cluster 1-2 elements by performing ChIP analyses of endogenous TRB1 in 2h1 plants. Analysis of two known TRB1 target gene promoters (Schrumpfová et al., 2014; Schrumpfová et al., 2016) and two non-target genes without teloboxes showed the specificity of our assay (Figure 5E top panel). We then tested telobox-rich loci subject to ectopic H3K27me3 deposition: two loci in ITR-1R selected from TE cluster 1 and three interspersed ATREP18 or LTR-Gypsy elements selected from TE cluster 2 that are located on other chromosomes. Corroborating the ChIP-seq analyses, no significant TRB1 association was detected over these five loci in WT plants but TRB1 association was reproducibly detected in 2h1 samples (Figure 5E bottom panel). These findings shed light on a role for H1 in preventing TRB recruitment on TE Cluster 1 and 2 elements, therefore providing a plausible mechanism for de novo H3K27me3 deposition on ITRs in the absence of H1.
A role for H1 in telomere chromatin composition and sub-nuclear positioning
Considering that telomeres display a continuum of perfect telobox motifs and a propensity to attract TRB proteins (Schrumpfová et al., 2014), we assessed whether these chromosome domains were, similar to ITRs, subject to H3K27me3 enrichment in 2h1 plants. Because the perfect continuum of terminal telomeric motifs is not suited for quantitative NGS analyses, we examined telomeric H3K27me3 using dot-blot hybridization of anti-H3K27me3 and anti-H3 ChIP DNA to radioactively labeled concatenated telomeric probes (Adamusová et al., 2020). This led to the estimation that telomeres display an average ~4-fold H3K27me3 enrichment in 2h1 as compared to WT plants, independent of any detectable change in nucleosome occupancy (Figure 6A and S14).
To assess whether H3K27me3 enrichment concerns a few telomeres or affects them generally, we explored the nuclear distribution of this histone mark by immunolabeling combined with telomere Fluorescence In Situ Hybridization (DNA FISH). As expected, in WT nuclei most visible telomeric foci were distributed around the nucleolus and colocalized with H3K27me3 signals. Consistent with our dot-blot analysis, H3K27me3 signal intensity at telomere foci was enhanced in the 2h1 line (Figure 6B-C). Moreover, 2-to-4 telomere FISH foci frequently presented outstandingly strong H3K27me3 labeling (Figure 6B). We did not ascertain whether some of these strong signals correspond to cross-hybridizing pericentromeric ITRs. Their frequent positioning near the nuclear periphery coincides with the typical localization of pericentromeres and might support this hypothesis, although many of the telomeres were also abnormally distant from the nucleolus in 2h1 plants (Figure 6C).
In this analysis, we unexpectedly detected a decreased number of telomeric foci in 2h1 as compared to WT plants (Figure 6C). This cytogenetic pattern may result from defective individualization of the telomeres and from indirect topological alterations leading to their mislocalization in the nucleus. Collectively, we concluded that H1 does not only prevent accumulation of H3K27me3 over ITRs and telomeres but also influences the sub-nuclear organization of interphase chromosomes (Figure 6D).
H1 promotes 3D chromatin packaging but attenuates the insulation of ITRs
To gain a more detailed view of the defects in chromosome organization induced by H1 loss-of-function and to investigate how telomeres and ITRs could be impacted, we employed in situ Hi-C on WT and 2h1 nuclei isolated from dissected cotyledons. The tissue homogeneity and high read coverage allowed us to reach a high resolution (Figure S15). Consistent with previous reports (Feng et al., 2014; Grob et al., 2014; Liu et al., 2016; Moissiard et al., 2012; Sun et al., 2020), WT plants displayed frequent intra-chromosomal interactions within the pericentromeric regions but fewer within the chromosomal arms (Figure S16A). Comparison of interaction frequencies as a function of genomic distance unveiled a tendency for fewer long-range interactions within chromosome arms and pericentromeric regions in 2h1 mutant nuclei (Figure 7A). Determination of interaction decay exponents (IDEs), which characterize chromatin packaging, confirmed a steeper decay with distance in 2h1 nuclei (Figure 7B). Both observations are consistent with a general trend for chromatin decompaction in the absence of H1. Accordingly, visualization of differential interaction frequencies between WT and 2h1 nuclei at a 100 kb resolution shed light on a clear decrease of intra-pericentromeric long-range interactions (i.e., blue squares surrounding the centromeres) and an increased association frequency of pericentromeric regions with their respective chromosome arm (i.e., red crosses), which together reflect the loosening of pericentromeric heterochromatin compaction in 2h1 cotyledon nuclei (Choi et al., 2019; Rutowicz et al., 2019) (Figure 7C and S17).
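An interaction decay exponent is commonly estimated as the slope of a linear fit of log contact frequency against log genomic distance, so that a more negative value indicates that contacts fall off faster with distance. The sketch below illustrates this under simple assumptions: a dense, symmetric contact matrix with equally sized bins, distances expressed in bins, and an ordinary least-squares fit. The function names and synthetic matrices are hypothetical, and the distance range and normalization used in the study are not reproduced here.

```python
import math


def interaction_decay_exponent(matrix, min_sep=1, max_sep=None):
    """Slope of log10(mean contact frequency) versus log10(bin separation)."""
    n = len(matrix)
    max_sep = n - 1 if max_sep is None else min(max_sep, n - 1)
    xs, ys = [], []
    for sep in range(min_sep, max_sep + 1):
        # Average all contacts between bins that are `sep` bins apart.
        diagonal = [matrix[i][i + sep] for i in range(n - sep)]
        mean = sum(diagonal) / len(diagonal)
        if mean > 0:  # log is undefined for empty diagonals
            xs.append(math.log10(sep))
            ys.append(math.log10(mean))
    if len(xs) < 2:
        raise ValueError("need at least two non-empty separations to fit")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


def power_law_matrix(n, exponent):
    """Synthetic contact matrix in which frequency = separation ** exponent."""
    return [[0.0 if i == j else abs(i - j) ** exponent for j in range(n)]
            for i in range(n)]


# Two synthetic samples: the second loses long-range contacts faster.
ide_shallow = interaction_decay_exponent(power_law_matrix(50, -1.0))
ide_steep = interaction_decay_exponent(power_law_matrix(50, -1.5))
```

On these synthetic matrices the fit recovers the exponents used to generate them, which makes the function easy to sanity-check before applying it to real, normalized Hi-C matrices restricted to a chosen distance range.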
We examined more precisely ITR-1R and 4L and observed that both chromatin regions form compacted structures resembling topologically associating domains (TADs) in WT plants (Figure 7D, S16A and S17A). Interestingly, intra-ITR interactions were strongly enhanced in 2h1 plants (Figure 7D, S16 and S17), indicating an enhanced insulation of the two ITR TAD-like structures in the absence of H1. Hence, the depletion of H1 triggered opposite trends of 3D chromatin organization defects between the two ITRs and their neighboring pericentromeric environments, the latter being characterized by extensive relaxation of heterochromatin.
Magnification of the ITR-1R region at a 2 kb resolution revealed a remarkably sharp correspondence between these TAD boundaries and H3K27me3-enrichment in 2h1 plants (Figure 7D bottom panel). To further assess the relationships between topological and H3K27me3 defects in 2h1 plants, we plotted all differences in H3K27me3 profiles over the chromosomes (sum of log2 ratios) and compared them to a similar representation of the Hi-C data (Figure 7E). The comparison sheds light on ITR-1R as a major H3K27me3-enriched genome locus with prominent differences in chromatin interactions in 2h1 chromatin.
H1 impacts long-distance interactions between chromosome ends
Considering that H3K27me3-enriched telomeres display altered sub-nuclear organization in 2h1 mutant plants (Figure 6B-C), we examined long-distance interactions among the corresponding chromosome regions in the Hi-C matrices. Because telomeres are not included in the Arabidopsis reference genome, we used the most terminal 100-kb regions of each chromosome sequence as a proxy to probe the 3D organization of sub-telomeric regions. In this analysis, we also considered an internal 100-kb region of each pericentromeric region, the ITR-1R and 4L, and several 100-kb regions randomly chosen in distal chromosomal arms. Possibly reflecting the capacity of different centromeres to aggregate within chromocenters, in WT plants we observed that pericentromeric regions frequently interact with each other as compared to chromosomal arm domains (Figure S17A-B). Similarly, in agreement with the preferential localization of centromeres at the nuclear periphery and of the telomeres around the nucleolus in the nuclear interior (reviewed in Pontvianne and Grob, 2020; Santos et al., 2020), most of the telomere-proximal regions frequently interacted with each other through long-range interactions, but less with ITR-1R and 4L (Figure S17A). One exception was the 100-kb region (referred to here as the SubNOR4 region) that is separated from the chromosome 4 telomere by the ~4 Mb Nucleolar Organizing Region 4 (NOR4) (Copenhaver and Pikaard, 1996), which encodes highly transcribed 45S ribosomal RNA genes (Mohannath et al., 2016) and tends to be more isolated than all other telomere-proximal regions (Figure S17A).
In 2h1 nuclei, we observed that, with the exception of the SubNOR2 and SubNOR4 regions that displayed specific defects, the frequencies of interaction between the different sub-telomeric regions were increased (Figure 7F and S17). This observation supports an organizational model in which telomere territories tend to coalesce more frequently in the absence of H1, as hinted by the lower number of telomere FISH signals and the appearance of strong H3K27me3-marked telomeric foci in 2h1 nuclei (Figure 6B). Examination of ITR-1R and 4L did not show such a clear tendency, apart from a decreased frequency of association with pericentromeric regions as detailed above, and a slight increase between them and sub-telomeric regions (Figure 7F and S17C). In brief, in contrast to its positive impact on intra- and inter-pericentromere associations, H1 appears to dampen long-distance associations between telomeres in addition to preventing H3K27me3 enrichment on these loci.
Discussion
We report that Arabidopsis H1 is highly enriched at PRC2 target genes where it contributes to efficient H3K27me3 marking and to low chromatin accessibility. Contrasting with this general tendency, we identified an opposite role of H1 in limiting H3K27me3 deposition over interstitial telomeric repeats, telomeres, and a few genes frequently displaying TE-like features or containing a telobox sequence motif in their promoter region. Thus, H1 has a differential effect on H3K27me3 levels over thousands of protein-coding genes on the one hand and over loci characterized by repeated telomeric motifs on the other hand. This antagonistic behavior provides an explanation for our former observation that most of H3K27me3 nuclear signals are low in H1 loss-of-function plants whilst, intriguingly, a few foci of undetermined nature remain enriched (Rutowicz et al., 2019). The large scale on which these antagonistic patterns are observed sheds light on the existence of strong mechanistic links between H1 and Polycomb-based regulation in Arabidopsis, two main factors in the instruction of DNA accessibility.
Promoting H3K27me3 enrichment on genes: an evolutionarily conserved function of H1
A majority (57%) of the H3K27me3-marked genes were hypo-methylated in 2h1 plants, yet most of the H3K27me3 peaks are not completely erased. This pattern suggests that H1 has a general influence on H3K27me3 deposition/maintenance or spreading but is not mandatory for the nucleation of this mark. Although we have not been able to identify any sequence specificity for H1-mediated H3K27me3 enrichment over most protein-coding genes, the dual influence of H1 on chromatin status and on transcriptomic patterns underlies its implication in selective mechanisms. While H1 depletion typically resulted in a global increase of gene chromatin accessibility, its impact on expression was apparently more related to variations in H3K27me3 marking. Hence, consistent with the functional categories of the misregulated genes, and with subtle phenotypes of H1-knockout plants reminiscent of PRC2 loss-of-function plants (Rutowicz et al., 2019), part of the defects in gene expression observed after H1 depletion might result from indirect consequences through PRC2 mis-function. These properties might represent an evolutionarily conserved function of H1, as suggested by recent findings in mouse where the depletion of H1 variants triggers widespread H3K27me3 loss and the misregulation of PRC2-regulated genes, phenocopying loss of EZH2 (Willcockson et al., 2020; Yusufova et al., 2020).
H1 may antagonize PRC2 activity on telomeric repeats through competitive binding with TRB proteins
Considering their capacity to recruit the somatic PRC2 methyltransferases CURLY-LEAF (CLF) and SWINGER (SWN) to telobox-containing genes (Zhou et al., 2018), TRBs represent excellent candidates for the sequence-specific regulation of H3K27me3 on ITRs and possibly also telomeres. Pending an assessment of the relative affinity of H1 and TRB1 for telobox elements in a chromatin context, several observations support a model in which H1 antagonizes PRC2 activity on telomeric repeats through competitive binding to DNA with TRB proteins. Firstly, on a genome-wide scale, H1 and TRB1 occupancies are negatively correlated. Secondly, analysis of nucleosome positioning showed that telobox motifs are preferentially situated in linker DNA where TRB1 association is also pronounced; hence competition with H1 might occur on linker DNA. Lastly, supporting this model, we showed that TRB1 ectopically invades ITRs and other telobox-rich elements gaining H3K27me3 in 2h1 plants. This indicates that, in wild-type plants, elevated H1 incorporation on these loci limits TRB1 enrichment and/or accessibility despite the presence of repeated telobox motifs for which the Myb domain of TRB1 has strong affinity (Mozgová et al., 2008; Schrumpfová et al., 2016). Given its highly dynamic association with chromatin (Dvořáčková et al., 2010), ectopic enrichment of TRB1 on ITR regions in 2h1 plants may therefore simply result from an increased accessibility of telobox sequences otherwise occupied by H1 (Figure 5D).
Our model echoes the recent report that Arabidopsis GH1-containing High Mobility Group A1 (GH1-HMGA1) is present at telomeres and minimizes H1 incorporation into chromatin (Charbonnel et al., 2018; Kotliński et al., 2017). Hence, in line with mammalian HMGs and H1 antagonistically binding to DNA (Catez et al., 2004; Krishnakumar et al., 2008), Arabidopsis H1 chromatin incorporation might act in competition with several GH1-domain-containing proteins, such as TRB1 and GH1-HMGA1. In Arabidopsis, the 15 GH1-domain-containing proteins (Charbonnel et al., 2018; Kotliński et al., 2017) may allow playing this H1 theme using multiple combinations of antagonistic or cooperative interactions to regulate chromatin.
Future studies will determine whether H3K27me3 enrichment on telomeric repeats directly relies on PRC2 recruitment by TRB proteins, as recently shown using telobox-containing promoter reporter lines (Zhou et al., 2018), or whether other chromatin modifiers influencing H3K27me3 are implicated. At this stage the latter possibility cannot be discarded as, for example, the Arabidopsis PRC1 subunit LIKE-HETEROCHROMATIN 1 (LHP1), acting as a chromatin reader of H3K27me3 (Turck et al., 2007), prevents TRB1 enrichment on PRC2 target genes displaying telobox motifs (Zhou et al., 2016). The outstanding pattern of telobox positioning in linker DNA also suggests a capacity of this sequence motif to influence chromatin organization, possibly by repelling nucleosomes away from telobox motifs.
A new role for H1 on telomeric chromatin structure
Owing to their repetitive nature (Fojtová and Fajkus, 2014; Majerová et al., 2014; Vaquero-Sedas et al., 2011, 2012; Vega-Vaquero et al., 2016), the chromatin composition and organization of plant telomeres have long remained enigmatic (Achrem et al., 2020; Dvořáčková et al., 2015). ChIP dot-blot analyses indicated a dominance of H3K9me2 over H3K27me3 histone marks but some of the Arabidopsis telomere regions also display euchromatic H3K4me2/3 marks (Adamusová et al., 2020; Grafi et al., 2007; Vaquero-Sedas et al., 2011). Here, combining ChIP-seq, ChIP dot-blot with telomeric probes and in situ immunolocalization, we showed that H1 moderates the accumulation of H3K27me3 on telomeres by 2- to 4-fold. However, our analyses did not allow assessing the precise distribution of H3K27me3 enrichment along each telomere, especially considering their mosaic chromatin status. Arabidopsis telomeres are indeed thought to comprise chromatin segments with distinct nucleosome repeat lengths (NRL), for instance one with an average NRL of 150 bp (Ascenzi and Gantt, 1999), which is much shorter than the 189 bp estimated for H1-rich TEs (Choi et al., 2020). Such a small DNA linker size, i.e., 5 bp, is seemingly incompatible with H1 incorporation into chromatin, as H1 protects about 20 bp of DNA in vitro (Simpson, 1978). Consistently, H1 has been proposed to be under-represented at telomeres in plants (Ascenzi and Gantt, 1999; Fajkus et al., 1995) as it is in mammals (Achrem et al., 2020; Déjardin and Kingston, 2009; Galati et al., 2013; Makarov et al., 1993). This could explain the short NRL of Arabidopsis and human telomeres (Ascenzi and Gantt, 1999; Lejnine et al., 1995) and led to the interpretation that some telomere segments would exist in an H1-free state and display a columnar structure in which nucleosome arrays are stabilized by stacking interactions mediated by the histone octamers themselves (Fajkus and Trifonov, 2001).
In conclusion, the existence of distinct chromatin states at Arabidopsis telomeres needs to be explored in more detail to establish whether the repressive influence of H1 on PRC2 activity is a general property of telomeres or rather impacts specific segments.
H1 has a profound influence on the Arabidopsis 3D genome topology
Using Hi-C we identified a reduced frequency of chromatin interactions within and among the pericentromeres in 2h1 nuclei. This is a typical feature of Arabidopsis mutants affecting chromocenter formation (Feng et al., 2014; Grob et al., 2014; Moissiard et al., 2012) or when chromocenters get decompacted in response to environmental stress (Sun et al., 2020). These analyses refine the recent observation that chromocenter formation is impaired in 2h1 leaf and cotyledon nuclei (Choi et al., 2019; He et al., 2019; Rutowicz et al., 2019), a defect that commonly reflects the spatial dispersion of pericentromeres within the nuclear space (Fransz et al., 2002). They also shed light on a complex picture in which ITR-1R and 4L embedded within the pericentromeres of chromosomes 1 and 4 escape the surrounding relaxation of heterochromatin induced by H1 depletion and organize themselves as TAD-like structures. H3K27me3 cis-enrichment resulting from H1 depletion might underlie the maintenance of compacted and very poorly accessible ITR chromatin while neighboring heterochromatic regions tend to become more accessible.
We also observed that H1 depletion leads to a reduction in the proportion of telomeric foci located near the nucleolus and in their total number. Using Hi-C, we could attribute this apparent lack of telomere spatial individualization to more frequent inter-chromosomal interactions between telomere proximal regions, which were used as a proxy for telomeres in the Hi-C analysis. As the preferential positioning of telomeres around the nucleolus and of the centromeres near to the nuclear periphery are important organizing principles of Arabidopsis chromosome territories (recently reviewed in Pontvianne and Grob, 2020; Santos et al., 2020), H1 therefore appears to be a crucial determinant of Arabidopsis interphase nuclear organization.
While both PRC1 and PRC2 participate in defining Arabidopsis genome topology (Feng et al., 2014; Veluchamy et al., 2016), H3K27me3 is favored among long-distance interacting gene promoters (Liu et al., 2016). This led to the proposal that, as in animals, this mark could contribute to shaping chromosomal organization in Arabidopsis, possibly through the formation of Polycomb subnuclear bodies (Liu et al., 2016). Here, we mostly focused on large structural components of the genome, such as telomeres, pericentromeres and ITR regions. In mammals, H1 depletion not only triggers higher-order changes in chromatin compartmentation (Willcockson et al., 2020; Yusufova et al., 2020), but also extensive topological changes of gene-rich and transcribed regions (Geeven et al., 2015). Future studies will help to determine to what extent the impact of H1 on the H3K27me3 landscape contributes to defining Arabidopsis genome topology.
H1 as a modulator of H3K27me3 epigenomic homeostasis
With thousands of perfect and degenerated telomeric motifs altogether covering ~430 kb, ITR-1R and 4L represent at least twice the length of all Arabidopsis telomeres combined, which span 2 to 5 kb at the end of each chromosome (Fitzgerald et al., 1999; Richards and Ausubel, 1988). Therefore, as illustrated in Figure 6D, the cumulated presence of thousands of teloboxes in ITR-1R and 4L forms an immense reservoir of PRC2 targets. Our observations show that H1 serves as a safeguard to avoid the formation of gigantic H3K27me3-rich blocks in both pericentromeric ITRs and telomeres, which would be on a scale potentially tethering many PRC2 complexes away from protein-coding genes. In Neurospora crassa, artificial introduction of an array of (TTAGGG)17 telomere repeats at interstitial sites triggers the formation of a large block of H3K27me2/3-rich chromatin (Jamieson et al., 2018). This example and our findings illustrate the intrinsic attractiveness of telomeric motifs for H3K27me3 deposition in multiple systems. Our study further suggests the implication of H1 in balancing PRC2 activity between protein-coding genes and telomeric repeats in plants, hence potentially acting as a modulator of the epigenome’s homeostasis.
Methods
Plant lines and growth conditions
All plants used in this study correspond to A. thaliana Col-0. The h1.1 h1.2 (2h1) Arabidopsis mutant line and the transgenic pH1.2::H1.2-GFP line (Rutowicz et al., 2015) were kindly provided by Dr. Kinga Rutowicz. Seeds were surface-sterilized, plated on half-strength Murashige and Skoog (MS) medium with 0.9% agar and 0.5% sugar, and cultivated under a long-day photoperiod (16 h light/8 h dark) at 23/19°C (100 μmol m⁻² s⁻¹) for 5 days unless otherwise stated. Cotyledons, when used, were manually dissected under a stereomicroscope.
Immuno-FISH
After fixation in 4% paraformaldehyde in 1X PME, cotyledons of 7-day-old seedlings were chopped directly in 1% cellulase, 1% pectolyase, and 0.5% cytohelicase in 1X PME, and incubated for 15 min. Nucleus suspensions were transferred to poly-Lys-coated slides. One volume of 1% lipsol in 1X PME was added to the mixture and spread on the slide. Then, 1 volume of 4% PFA in 1X PME was added and slides were dried. Immunodetection and FISH were conducted as described previously (Charbonnel et al., 2018) using the following antibodies: rabbit H3K27me3 (#07-449 - Merck) diluted 1:200, goat biotin anti-rabbit IgG (#65-6140 - ThermoFisher) at 1:500, mouse anti-digoxigenin (#11333062910 - Roche) at 1:125, rat anti-mouse FITC (#rmg101 - Invitrogen) at 1:500, goat Alexa 488 anti-rabbit IgG (#A11008 - Invitrogen) at 1:100, mouse Cy3 anti-biotin antibody (#C5585 - Sigma) at 1:1000. Acquisitions were performed on a structured illumination (pseudo-confocal) imaging system (ApoTome AxioImager M2; Zeiss) and processed using a deconvolution module (regularized inverse filter algorithm). To test for signal colocalization, the Pearson correlation coefficients of H3K27me3 vs telomeric FISH signals were calculated with the colocalization module of the ZEN software using the uncollapsed Z-stack files. Foci with coefficients above 0.5 were considered colocalized.
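The colocalization criterion above can be illustrated with a minimal sketch. The internals of the ZEN colocalization module are not described here, so the code below is only a generic Pearson computation on two intensity arrays of equal shape; the function names and the toy arrays are hypothetical, and only the 0.5 cut-off comes from the text.

```python
import numpy as np

def pearson_colocalization(channel_a, channel_b):
    """Pearson correlation between two intensity arrays of identical shape
    (for instance the voxels of one focus in two channels of a Z-stack)."""
    a = np.asarray(channel_a, dtype=float).ravel()
    b = np.asarray(channel_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("both channels must have the same number of voxels")
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        return 0.0  # a flat channel carries no correlation information
    return float((a * b).sum() / denom)

def is_colocalized(channel_a, channel_b, threshold=0.5):
    """Apply the cut-off stated in the text (coefficient above 0.5)."""
    return pearson_colocalization(channel_a, channel_b) > threshold

# Toy example (hypothetical intensities, not data from this study)
h3k27me3 = np.arange(8, dtype=float).reshape(2, 2, 2)
telomere_fish = 2.0 * h3k27me3 + 1.0       # perfectly correlated signal
anticorrelated = h3k27me3[::-1, ::-1, ::-1]  # reversed intensities
print(is_colocalized(h3k27me3, telomere_fish))
print(is_colocalized(h3k27me3, anticorrelated))
```

Computing the coefficient on uncollapsed stacks, as stated above, avoids merging foci that overlap only in projection.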
ATAC-seq
Nuclei were isolated from 200 cotyledons and purified using a two-layer Percoll gradient at 3000 g before staining with 0.5 μM DAPI and sorting by FACS according to their ploidy levels using a MoFlo Astrios EQ Cell Sorter (Beckman Coulter) in PuraFlow sheath fluid (Beckman Coulter) at 25 psi (pounds per square inch), with a 100-micron nozzle. We performed sorting with ~43 kHz drop drive frequency, plate voltage of 4000-4500 V and an amplitude of 30-50 V. Sorting was performed in purity mode. For each sample, 20000 sorted 4C nuclei were collected separately in PBS buffer and centrifuged at 3000 g and 4 °C for 5 min. The nuclei were resuspended in 20 μl Tn5 transposase reaction buffer (Illumina). After tagmentation, DNA was purified using the MinElute PCR Purification Kit (Qiagen) and amplified with Nextera index oligos (Illumina). A size selection was performed with AMPure® XP beads (Beckman Coulter) to collect library molecules longer than 150 bp. DNA libraries were sequenced by Beijing Genomics Institute (BGI Group, Hong Kong) using the DNA Nanoballs (DNB™) DNBseq in a 65 bp paired-end mode. Raw ATAC-seq data were treated using the custom-designed ASAP (ATAC-Seq data Analysis Pipeline; https://github.com/akramdi/ASAP) pipeline. Mapping was performed using Bowtie2 v.2.3.2 (Langmead and Salzberg, 2012) with parameters --very-sensitive -X 2000. Mapped reads with MAPQ<10, duplicate pairs, and reads mapping to the mitochondrial genome as well as repetitive regions giving aberrant signals (Quadrana et al., 2016) were filtered out. Concordant read pairs were selected and shifted by 4 bp as previously described (Schep et al., 2015). Peak calling was performed using MACS2 (Zhang et al., 2008) using broad mode and the following parameters: --nomodel --shift -50 --extsize 100. Heatmaps and metaplots were produced from depth-normalized read coverage (read per million) using the Deeptools suite (Ramírez et al., 2016).
In situ Hi-C
Hi-C was performed as in Grob et al. (2014) with downscaling using seedlings crosslinked in 10 mM potassium phosphate pH 7.0, 50 mM NaCl, 0.1 M sucrose with 4 % (v/v) formaldehyde. Crosslinking was stopped by transferring seedlings to 30 ml of 0.15 M glycine. After rinsing and dissection, 1000 cotyledons were flash-frozen in liquid nitrogen and ground using a Tissue Lyser (Qiagen). All samples were adjusted to 4 ml using NIB buffer (20 mM Hepes pH 7.8, 0.25 M sucrose, 1 mM MgCl2, 0.5 mM KCl, 40 % v/v glycerol, 1 % Triton X-100) and homogenized on ice using a Dounce homogenizer. Nuclei were pelleted by centrifugation and resuspended in the DpnII digestion buffer (10 mM MgCl2, 1 mM DTT, 100 mM NaCl, 50 mM Bis-Tris-HCl, pH 6.0) before adding SDS to a final concentration of 0.5 % (v/v). SDS was quenched by adding 2% Triton X-100. DpnII (200 u) was added to each sample for overnight digestion at 37 °C. dATP, dTTP, dGTP, biotinylated dCTP and 12 μl DNA Polymerase I (Large Klenow fragment) were added before incubation for 45 min at 37 °C. A total of 50 units of T4 DNA ligase along with 7 μl of 20 ng/μl of BSA (Biolabs) and 7 μl of 100 mM ATP were added to reach a final volume of 700 μl. Samples were incubated for 4 h at 16°C with constant shaking at 300 rpm. After overnight reverse crosslinking at 65°C and protein digestion with 5 μl of 10 mg/μl proteinase K, DNA was extracted by phenol/chloroform purification and ethanol precipitation before resuspension in 100 μl of 0.1X TE buffer. Biotin was removed from the unligated fragments using T4 DNA polymerase exonuclease activity. After biotin removal, the samples were purified using AMPure beads with a 1.6X ratio. DNA was fragmented using a Covaris M220 sonicator (peak power 75 W, duty factor 20, cycles per burst 200, duration 150 s). Hi-C libraries were prepared using the KAPA LTP Library Preparation Kit (Roche) as in Grob et al. (2014) with 12 amplification cycles. PCR products were purified using AMPure beads (ratio 1.85X).
Libraries were analyzed using a Qubit fluorometer (Thermofisher) and a TapeStation (Agilent) before sequencing in a 75 bp PE mode using a DNB-seq platform at the Beijing Genomics Institute (BGI Group; Hong Kong). Mapping of Hi-C reads was performed using the Hi-C Pro pipeline (Servant et al., 2015) with default pipeline parameters and merging data from three biological replicates at the end of the pipeline. Data were visualized using the Juicebox toolsuite (Durand et al., 2016) and represented in Log10 scale after SCN normalization (Cournac et al., 2012) with Boost-HiC (Carron et al., 2019) setting the alpha parameter to 0.2. In Figure S17, we normalized the sequencing depth in each sample and scored the number of reads in each combination of genomic regions using HOMER (Heinz et al., 2010). Read counts were further normalized for the bin size and the median value between the three biological replicates was reported.
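The region-pair scoring used for Figure S17 can be sketched as follows. This is only an illustration under stated assumptions: the exact depth scaling and the form of the bin-size correction are not specified above, so the sketch scales counts to reads per million and divides by the product of the two region sizes in kb. The function names and all numbers are hypothetical.

```python
from statistics import median

def normalized_contact(read_count, total_reads, size_a_kb, size_b_kb):
    """Depth- and size-normalized contact score for one pair of regions.

    Assumptions (not stated in the Methods): depth is scaled to reads per
    million, and the bin-size correction divides by the product of the two
    region sizes expressed in kb.
    """
    per_million = read_count / total_reads * 1e6
    return per_million / (size_a_kb * size_b_kb)

def replicate_median(replicates, size_a_kb, size_b_kb):
    """Median score over biological replicates.

    `replicates` is a list of (read_count, total_reads) tuples, one per
    replicate, for the same pair of regions.
    """
    scores = [
        normalized_contact(count, total, size_a_kb, size_b_kb)
        for count, total in replicates
    ]
    return median(scores)

# Hypothetical counts for two 100-kb regions in three replicates
toy_replicates = [(10, 1_000_000), (20, 1_000_000), (30, 1_000_000)]
print(replicate_median(toy_replicates, 100, 100))
```

Reporting the median rather than the mean limits the influence of a single outlier replicate on each region pair.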
RNA-seq
Seedlings grown in long days were fixed in 100% cold acetone under vacuum for 10 min. Cotyledons from 100 plants were dissected and ground in 2 ml tubes using a Tissue Lyser (Qiagen) for 1 min 30 sec at 30 Hz before RNA extraction using the RNeasy micro kit (Qiagen). RNA was sequenced using the DNBseq platform at the Beijing Genomics Institute (BGI Group) in a 100 bp paired-end mode. For raw data processing, sequencing adaptors were removed from raw reads with trim_galore! v2.10 (https://github.com/FelixKrueger/TrimGalore). Reads were mapped onto the combined TAIR10 genome using STAR version 2.7.3a (Dobin et al., 2012) with the following parameters: “--alignIntronMin 20 --alignIntronMax 100000 --outFilterMultimapNmax 20 --outMultimapperOrder Random --outFilterMismatchNmax 8 --outSAMtype BAM SortedByCoordinate --outSAMmultNmax 1 --alignMatesGapMax 100000”. Gene raw counts were scored using the htseq-count tool from the HTSeq suite version 0.11.3 (Anders et al., 2015) and analyzed with the DESeq2 package (Love et al., 2014) to calculate Log2-fold change and to identify differentially expressed genes (p-value < 0.01). TPM (Transcripts per Million) were retrieved by dividing the counts over each gene by its length and the total counts in the sample and multiplying by 10⁶. Mean TPM values between two biological replicates were used for subsequent analyses. To draw metagene plots, genes were grouped into expressed or not, and expressed genes were split into four quantiles of expression with the function ntile() of the R package dplyr (https://CRAN.R-project.org/package=dplyr).
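The expression normalization described above can be written out as a short sketch. The first function follows the wording of the text literally (counts divided by gene length and by the total counts of the sample, times 10⁶). The second shows the more common TPM definition for comparison, in which length-normalized rates are rescaled to sum to one million. Gene names, counts and lengths are hypothetical, and neither function is taken from the scripts used in this study.

```python
def length_and_depth_normalized(counts, lengths):
    """Normalization as worded in the text:
    counts / gene length / total counts in the sample * 1e6."""
    total = sum(counts.values())
    return {g: counts[g] / lengths[g] / total * 1e6 for g in counts}

def standard_tpm(counts, lengths):
    """Common TPM definition, shown only for comparison:
    per-gene rate = counts / length, rescaled so that rates sum to 1e6."""
    rates = {g: counts[g] / lengths[g] for g in counts}
    rate_sum = sum(rates.values())
    return {g: rates[g] / rate_sum * 1e6 for g in rates}

def mean_between_replicates(rep1, rep2):
    """Mean value per gene between two biological replicates."""
    return {g: (rep1[g] + rep2[g]) / 2 for g in rep1}

# Hypothetical toy data (not from this study)
toy_counts = {"geneA": 100, "geneB": 300}
toy_lengths = {"geneA": 1000, "geneB": 2000}  # in bp
print(length_and_depth_normalized(toy_counts, toy_lengths))
print(standard_tpm(toy_counts, toy_lengths))
```

The two definitions rank genes identically within a sample, but only the second yields values that sum to 10⁶ in every sample.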
H1 and H3 ChIP-seq experiments
H1.2-GFP and parallel H3 profiling were conducted as in Fiorucci et al. (2019) with slight modifications to sonicate chromatin to reach mono/di-nucleosome fragment sizes. WT Col-0 or pH1.2::H1.2-GFP seedlings were crosslinked for 15 min using 1 % formaldehyde. After dissection, 400 cotyledons were ground in 2 ml tubes using a Tissue Lyser (Qiagen) for 2 x 1 min at 30 Hz. After resuspension in 100 μl Nuclei Lysis Buffer 0.1 % SDS, the samples were flash-frozen in liquid nitrogen and chromatin was sheared using a S220 Focused-ultrasonicator (Covaris) for 17 min at peak power 105 W, duty factor 5%, 200 cycles per burst, to get fragment sizes between 75 and 300 bp. Immunoprecipitation was performed on 150 μg of chromatin quantified using the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific) with 60 μl of Protein-A/G Dynabeads and 3.5 μl of anti-GFP (Thermo Fisher #A11122) for H1.2-GFP and mock (WT) samples or anti-H3 (Abcam #Ab1791) for H3 IPs. Immunoprecipitated DNA was subjected to library preparation using the TruSeq® ChIP Sample Preparation Kit (Illumina) and sequenced using a NextSeq 500 system in a single-end 50 bp mode (Genewiz, USA).
TRB1 ChIP-qPCR
Four biological replicates of 8-day-old WT or 2h1 crosslinked seedlings were ground in liquid nitrogen. For each replicate, 2 g of tissue was resuspended in 30 ml EB1. Nuclei lysis buffer was added to 520 μg of chromatin to reach 1 μg/μl of proteins according to BCA assay. 20 μl was kept as an input. 1.44 ml of Dynabeads protein G slurry (ThermoFisher Scientific 10004D) was added to each sample and incubated for 1 h at 4°C with 6.5 μl of Bridging Antibody for Mouse IgG (Active Motif #53017) on a rotating wheel. 60 μl of bridged beads were then added to each sample for pre-clearing. The remaining bridged beads were incubated with 50 μl of anti-TRB1 5.2 antibody (Schrumpfová et al., 2014) for 3 hours at 4°C under rotation before transfer to the pre-cleared chromatin samples and incubation overnight at 4°C under rotation. Beads were washed and chromatin was eluted in 500 μl of SDS elution buffer (1 % SDS, 0.1 M NaHCO3) at 65°C before reverse crosslinking by adding 20 μl of 5 M NaCl overnight at 65°C. After Proteinase K digestion at 45°C for 1 h, DNA was purified by phenol-chloroform extraction and ethanol precipitated. The pellet of each input and IP was resuspended in 40 μl of 0.1X TE pH 8.0. DNA was analyzed by quantitative PCR using a LightCycler 480 SYBR green I Master mix and a LightCycler 480 (Roche) using the primer sequences given in Additional file 3.
ChIP-hybridization analysis of telomeric H3K27me3 and H3
Anti-H3K27me3 (Millipore, #07-449 antibody) and anti-H3 (Abcam #Ab1791 antibody) ChIPs were conducted using 2 g of tissue. Pellets of both inputs (20%) and immunoprecipitated DNA were resuspended in 40 μl of TE, pH 8.0 and analyzed through dot-blot hybridization using a radioactively labeled telomeric probe synthesized by non-template PCR (IJdo et al., 1991; Adamusová et al. 2020). ITRs contribution to the hybridization signal was minimized using high stringency hybridization as detailed in Adamusová et al. (2020).
H3K27me3 ChIP-Rx
ChIP-Rx analysis of H3K27me3 (Millipore, #07-449) was conducted as in (Nassrallah et al., 2018) using two biological replicates of 8-day-old WT and 2h1 seedlings. For each biological replicate, two independent IPs were carried out using 120 μg of Arabidopsis chromatin mixed with 3 % of Drosophila chromatin quantified using the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific). DNA samples eluted and purified from the two technical replicates were pooled before library preparation (Illumina TruSeq ChIP) and sequencing (Illumina NextSeq 500, 1×50bp) of all input and IP samples by Fasteris (Geneva, Switzerland).
ChIP-seq and MNase-seq bioinformatics
For H3K27me3 spike-in normalized ChIP-Rx, raw reads were pre-processed with Trimmomatic v0.36 (Bolger et al., 2014) to remove Illumina sequencing adapters. 5′ and 3′ ends with a quality score below 5 (Phred+33) were trimmed and reads shorter than 20 bp after trimming were discarded (trimmomatic-0.36.jar SE -phred33 INPUT.fastq TRIMMED_OUTPUT.fastq ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 LEADING:5 TRAILING:5 MINLEN:20). We aligned the trimmed reads against combined TAIR10 Arabidopsis thaliana and Drosophila melanogaster (dm6) genomes with Bowtie2 v.2.3.2 using the “--very-sensitive” setting. Alignments with MAPQ < 10, duplicated reads and reads mapping on repetitive regions (as defined in Quadrana et al., 2016) were discarded with sambamba v0.6.8 (Tarasov et al., 2015). Peaks of H3K27me3 read density were called using MACS2 (Zhang et al., 2008) with the command “macs2 callpeak -f BAM --nomodel -q 0.01 -g 120e6 --bw 300 --verbose 3 --broad”. Only peaks found in both biological replicates and overlapping for at least 10 % were retained for further analyses. We scored the number of H3K27me3 reads overlapping with marked genes using bedtools v2.29.2 multicov and analyzed them with the DESeq2 package (Love et al., 2014) in the R statistical environment v3.6.2 to identify the genes enriched or depleted in H3K27me3 in mutant plants (p-value < 0.01). To account for differences in sequencing depth we used the function SizeFactors in DESeq2, applying a scaling factor calculated as in Nassrallah et al. (2018).
For the H1.2-GFP and H3 ChIP-seq datasets, raw reads were processed as for H3K27me3. We counted the reads of H1.2-GFP and ATAC-seq over genes and TEs using bedtools v2.29.2 multicov and converted them into median counts per million, dividing the counts over each gene or TE by its length and by the total counts in the sample and multiplying by 10⁶. The mean value between biological replicates of IP was used in Figure 1, while the ratio between IP and Input was used for violin-plot analysis of H1.2-GFP in Figure S13. Genes and TEs overlapping with peaks of the histone marks H3K27me3, H3K4me3, and H2Bub were identified using bedtools v2.29.2 intersect. To include nucleosomes in close proximity of the TSS, an upstream region of 250 bp was also considered for the overlap for H3K27me3, TRB1 and H3K4me3. H3K27me3 TE cluster 1 and TE cluster 2 were identified using Deeptools plotHeatmap with the --kmeans setting. Tracks were visualized using Integrative Genomics Viewer (IGV) version 2.8.0 (Thorvaldsdóttir et al., 2012). Metagene plots and heatmaps were generated from depth-normalized read densities using Deeptools computeMatrix, plotHeatmap, and plotProfile. Violin-plots, histograms and box-plots were drawn using the package ggplot2 v3.2.1 (https://cran.r-project.org/web/packages/ggplot2/) in the R statistical environment. All scripts used will be made publicly available. Shuffled controls, where present, were produced with random permutations of the genomic positions of the regions of interest. The permutations were generated with bedtools v2.29.2 and the command “bedtools shuffle -chromFirst -seed 28776 -chrom”.
For MNase-seq analyses, MNase read density (Lyons et al., 2017) was obtained from NCBI GEO under the accession GSE96994. Genomic locations of WPNs shared between WT and 2h1 plants were identified as overlapping WPN coordinates between the two genotypes calculated with bedtools v2.29.2 intersect.
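The replicate-consistency filter on peaks can be illustrated with a minimal sketch. The text does not state which peak the 10 % refers to, so the code assumes that the overlap must cover at least 10 % of the replicate-1 peak length, similar in spirit to a minimum-overlap fraction on the first file in interval tools. Function names and coordinates are hypothetical.

```python
def overlap_bp(peak_a, peak_b):
    """Number of overlapping bases between two (chrom, start, end) intervals."""
    if peak_a[0] != peak_b[0]:
        return 0
    return max(0, min(peak_a[2], peak_b[2]) - max(peak_a[1], peak_b[1]))

def reproducible_peaks(rep1, rep2, min_fraction=0.10):
    """Keep replicate-1 peaks overlapped by a replicate-2 peak over at least
    `min_fraction` of their own length (assumed reading of the 10 % rule)."""
    kept = []
    for peak in rep1:
        length = peak[2] - peak[1]
        if any(overlap_bp(peak, other) >= min_fraction * length for other in rep2):
            kept.append(peak)
    return kept

# Hypothetical peak coordinates (not data from this study)
rep1_peaks = [("Chr1", 100, 1100), ("Chr1", 5000, 6000), ("Chr2", 100, 1100)]
rep2_peaks = [("Chr1", 900, 2000), ("Chr1", 5950, 7000), ("Chr2", 2000, 3000)]
print(reproducible_peaks(rep1_peaks, rep2_peaks))
```

In this toy case only the first peak is retained: the second overlaps its counterpart over 50 bp of 1000 bp (5 %), and the third has no counterpart on the same chromosome.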
Telobox positioning was analyzed using the coordinates described in (Zhou et al., 2018) and obtained from https://gbrowse.mpipz.mpg.de/cgi-bin/gbrowse/arabidopsis10_turck_public/?l=telobox;f=save+datafile. Telobox repeat numbers were scored over 10-bp non-overlapping bins, smoothed with a 50-bp sliding window and subsequently used to plot telobox density.
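The binning and smoothing of telobox positions can be sketched as below. This is an illustration only: the text specifies 10-bp non-overlapping bins and a 50-bp sliding window, and the sketch assumes a centered moving average over five bins with zero padding at the edges. The function name and the example positions are hypothetical.

```python
import numpy as np

def telobox_density(positions, region_length, bin_size=10, window_bp=50):
    """Count motif start positions in non-overlapping bins, then smooth.

    Assumption: smoothing is a centered moving average spanning
    window_bp / bin_size bins, with zero padding at both ends.
    Returns (raw_counts, smoothed_counts), one value per bin.
    """
    n_bins = -(-region_length // bin_size)  # ceiling division
    bin_index = np.asarray(positions, dtype=int) // bin_size
    counts = np.bincount(bin_index, minlength=n_bins).astype(float)
    n_window_bins = window_bp // bin_size
    kernel = np.ones(n_window_bins) / n_window_bins
    smoothed = np.convolve(counts, kernel, mode="same")
    return counts, smoothed

# Hypothetical motif positions (0-based, in bp) within a 100-bp region
raw, smooth = telobox_density([3, 7, 25, 95], region_length=100)
print(raw)
print(smooth)
```

Smoothing over a window several bins wide trades positional resolution for a less noisy density profile, which matters when motifs are sparse relative to the bin size.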
DNA sequence motif search
Motifs enriched in gene promoters (−500 bp to +250 bp relative to the TSS) and in annotated units of TE cluster 1 elements were identified using MEME version 5.1.1 (Bailey et al., 2015). The following options were used for promoters: “-dna -mod anr -revcomp -nmotifs 10 -minw 5 -maxw 9” and for TEs: “-dna -mod anr -nmotifs 10 -minw 5 -maxw 9 -objfun de -neg Araport11_AllTEs.fasta -revcomp -markov_order 0 -maxsites 10000” where Araport11_AllTEs.fasta corresponds to the fasta sequence of all TEs annotated in Araport11.
Gene ontology analysis
Gene ontology analysis of H3K27me3 differentially marked genes was performed using the GO-TermFinder software (Boyle et al., 2004) via the Princeton GO-TermFinder interface (http://go.princeton.edu/cgi-bin/GOTermFinder). The REVIGO (Supek et al., 2011) platform was utilized to reduce the number of GO terms and redundant terms were further manually filtered. The log10 p-values of these unique GO terms were then plotted with pheatmap (https://CRAN.R-project.org/package=pheatmap) with no clustering.
Protein alignment
Protein sequences of H1.1, H1.2, H1.3, TRB1, TRB2 and TRB3 were aligned using T-Coffee (http://tcoffee.crg.cat/apps/tcoffee/do:regular) with default parameters. Pairwise similarity and identity scores were calculated using the Ident and Sim tool (https://www.bioinformatics.org/sms2/ident_sim.html).
Data and materials availability
This study did not generate new unique reagents. All public genomic data used in this study are listed in Additional file 4. Additional files will be made available upon request to the corresponding authors.
Graphical abstracts were created using Biorender.com
Funding
FB benefitted from grants of the Agence Nationale de la Recherche projects ANR-10-LABX-54, ANR-18-CE13-0004-01, ANR-17-CE12-0026-02 at IBENS and ANR-11-EQPX-0029, ANR-10-INBS-04; ANR-11-IDEX-0003-02 at the Imagerie-Gif core facility. Collaborative work between FB and CeB was supported by a research grant from the Velux Foundation (Switzerland) and the Ricola Foundation (Switzerland). Collaborative work between FB and AP was supported by CNRS EPIPLANT Action (France). GT benefitted from a short-term fellowship of the COST Action CA16212 INDEPTH (EU) for training in Hi-C by SG in UG’s laboratory, which is supported by the University of Zurich (Switzerland) and the Swiss National Science Foundation (project 31003A_179553). SA benefitted from a CAP20-25 Emergence research grant from Région Auvergne-Rhône-Alpes (France). Work in JF’s team was supported by the Czech Science Foundation (project 20-01331X) and Ministry of Education, Youth, and Sports of the Czech Republic - project INTER-COST (LTC20003).
Author contributions
GT, LW and ClB performed ChIP and ChIP-RX experiments; GT and MB generated ATAC-seq datasets; KA and MF performed telomere dot-blots; MB contributed to FACS nucleus sorting; SA performed cytological experiments and quantification; GT and SG generated the Hi-C datasets. AK and VC developed ATAC-seq bioinformatics tools. LoC, GT, ClB performed RNA-seq, ChIP-seq, and ATAC-seq bioinformatics analyses. LoC and LeC performed Hi-C bioinformatics. GT, ClB, SG, and FB conceived the study. FB, ClB, AC, CeB, ChB, AC, SA, AP, UG, PPS, JF, and SG supervised research, discussed the results and edited the manuscript. FB and ClB jointly wrote the paper and directed the study.
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Fredy Barneche (barneche{at}bio.ens.psl.eu).
Acknowledgements
The authors are grateful to Erwann Cailleux (IBENS, Paris, France) and David Latrasse (IPS2, Orsay, France) for technical guidance with ATAC-seq; Nicolas Valentin (I2BC, Gif, France) for assistance with FACS; Magali Charvin (IBENS, Paris, France) for technical assistance with the IBENS plant growth facility; Frédérique Perronet (IBPS, France) for providing Drosophila samples; and Kinga Rutowicz (University of Zurich, Switzerland) and Angélique Déléris (IBENS, Paris and I2BC, Gif-Sur-Yvette, France) for sharing unpublished work.
Footnotes
We corrected spelling errors in the author names' metadata and in the graphical abstract