Dynamic DNA methylation turnover at the exit of pluripotency epigenetically primes gene regulatory elements for hematopoietic lineage specification

Epigenetic mechanisms govern developmental cell fate decisions, but how DNA methylation coordinates with chromatin structure and three-dimensional DNA folding to enact cell-type specific gene expression programmes remains poorly understood. Here, we use mouse embryonic stem and epiblast-like cells deficient for 5-methyl cytosine or its oxidative derivatives (5-hydroxy-, 5-formyl- and 5-carboxy-cytosine) to dissect the gene regulatory mechanisms that control cell lineage specification at the exit of pluripotency. Genetic ablation of either DNA methyltransferase (Dnmt) or Ten-eleven-translocation (Tet) activity yielded largely distinct sets of dysregulated genes, revealing divergent transcriptional defects upon perturbation of individual branches of the DNA cytosine methylation cycle. Unexpectedly, we found that disrupting DNA methylation or oxidation interferes with key enhancer features, including chromatin accessibility, enhancer-characteristic histone modifications, and long-range chromatin interactions with putative target genes. In addition to affecting transcription of select genes in pluripotent stem cells, we observe impaired enhancer priming, including a loss of three-dimensional interactions, at regulatory elements associated with key lineage-specifying genes that are required later in development, as we demonstrate for the key hematopoietic genes Klf1 and Lyl1. Consistently, we observe impaired transcriptional activation of blood genes during embryoid body differentiation of knockout cells. Our findings identify a novel role for the dynamic turnover of DNA methylation at the exit of pluripotency to establish and maintain chromatin states that epigenetically prime enhancers for later activation during developmental cell diversification. 
Highlights
We perform a detailed epigenetic characterisation of the mouse embryonic stem cell (ESC) to epiblast-like cell (EpiLC) transition in wild type, Tet triple-knockout (TKO) and Dnmt TKO lines and develop a novel clustering approach to interrogate the data.
Tet TKO reduces H3K4me1 and H3K27ac levels across enhancer elements upon pluripotency exit whilst Dnmt TKO affects only H3K4me1 levels, suggesting a novel role for oxidative derivatives in H3K4me1 deposition.
Tet TKO and Dnmt TKO affect enhancer priming in EpiLCs, which is associated with failure to upregulate hematopoietic genes upon differentiation.
Long-range chromosomal interactions between primed enhancers and their target genes are weakened in both Dnmt and Tet TKO.


Introduction
The early preimplantation epiblast (~embryonic day E3.75-E4.5 in mice) is populated by stem cells that are able to self-renew but lack the capacity to differentiate into the major cell lineages 1,2 . In order to acquire specific cell fates during development, epiblast cells are thought to exit this "naïve" state of pluripotency and progress towards a "formative" state where the transcriptional, epigenetic, and metabolic landscape is remodelled in a way that cells acquire multi-lineage competence (in the E5.5 -E6.0 post-implantation epiblast in mice) 3 . This transition from naïve to formative pluripotency can be modelled in vitro by transitioning naïve mouse embryonic stem cells (ESCs) from media containing leukaemia inhibitory factor (LIF), MAPK inhibitor and GSK3 inhibitor (known as 2i+LIF conditions) to media containing FGF2 and Activin A, which promotes the formation of epiblast-like cells (EpiLCs). EpiLCs closely resemble cells in a formative state in vivo 4,5 , and they acquire germ line competence that is missing in the early epiblast and naïve ESCs 4,6 .
DNA methylation (DNAme) is an important epigenetic mark that is dramatically remodelled during this developmental period 7 . In mammals, DNAme principally occurs at the 5th position of the base cytosine in the context of CpG dinucleotides, generating 5-methyl-cytosine (5mC). Found at high levels in the gametes, DNAme is depleted in the zygote following fertilisation but is subsequently re-established around implantation. Genome-wide levels increase from ~25% in the early preimplantation epiblast to ~75% in the post-implantation epiblast 7 . Whilst these high global levels are then sustained in the majority of somatic tissues following differentiation, the precise distribution of DNAme is highly variable between tissues and cell types, especially at regulatory elements such as enhancers [8][9][10][11] .
The molecular machinery that deposits and removes DNAme is relatively well understood: DNA methyltransferases 3A and 3B (DNMT3A/B) deposit DNAme de novo and DNMT1 maintains DNAme patterns following cell division (together with cofactors) 12 . Conversely, DNAme can be lost passively when the levels or activity of the DNMTs are reduced, or it can be removed in an active process involving sequential oxidation by Ten Eleven Translocation enzymes (TET1, 2, 3) to 5-hydroxy-, 5-formyl- and 5-carboxy-cytosine (5hmC, 5fC and 5caC respectively) [13][14][15][16] . These derivatives are not efficiently recognised by DNMT1 and are therefore lost upon replication [17][18][19] . Alternatively, 5fC and 5caC can also be removed in an active and cell-cycle independent manner by thymine-DNA glycosylase and base excision repair 15,20 .
As cells exit naive pluripotency, both DNMTs and TETs are paradoxically co-expressed at high levels and target overlapping genomic regions leading to a cyclical turnover of methylation. This DNAme turnover occurs genome-wide but is especially prevalent across regulatory elements such as enhancers [21][22][23][24] . Whilst the role of DNAme turnover is not clear, the remodelling of DNAme in early development is functionally critical. Embryonic stem cells lacking all three DNMTs or TETs (triple knockout cells, TKO) remain pluripotent but they do not differentiate effectively 25,26 . Dnmt1 or Dnmt3b knockout embryos and Tet TKO embryos die shortly after gastrulation with severe but poorly characterised defects including abnormal primitive streak patterning and differentiation of mesodermal tissues [27][28][29] .
Mutations in both Dnmt and Tet genes have been observed frequently in haematological malignancies and clonal hematopoiesis [30][31][32][33][34][35][36] , and more recently lineage specific knockouts and chimeric mice have demonstrated that TET enzymes are required for efficient hematopoiesis 37,38 , suggesting an important role for methylation and oxidation in this lineage. Bulk RNA-sequencing and whole genome bisulfite-sequencing (WGBS) of Tet TKO embryos implicated Lefty1/2 hypermethylation and perturbed NODAL and WNT signalling as a driver of these defects 28,39,40 , but the broader effect of TET activity loss on the histone modification landscape and enhancer activity was not investigated, despite enhancers being primary regions of DNAme turnover.
In contrast to active enhancers (marked by H3K4me1 and H3K27ac) that drive expression of associated genes, primed or poised enhancers (marked by H3K4me1 alone or a combination of H3K4me1 and H3K27me3, respectively) are silent but molecularly prepared for gene activation upon stimulation (e.g. following differentiation) [41][42][43] . Recent multi-omic data has demonstrated that enhancers driving ectoderm specific gene expression programmes are already hypomethylated and accessible in the epiblast, suggesting priming, whilst mesoderm and endoderm enhancers are not primed at this stage 10 .
However, the role of DNA methylation and oxidation in regulating this epigenetic environment is unknown.
Here, we profile in depth the transcriptome, chromatin structure and epigenome of mouse ESCs across culture conditions that model the exit of pluripotency, and we develop a novel analysis approach that uses promoter-capture HiC data (PCHi-C) to identify promoter-regulatory element pairs. To understand the effects of DNAme turnover on this transcriptional and epigenetic landscape, we perform the same molecular characterisation in cells lacking DNAme (Dnmt TKO) or oxidation (Tet TKO). Excitingly, we find that both knockouts result in reduced H3K4me1 levels whilst H3K27ac is only affected upon Tet TKO, suggesting a novel role for methylation turnover in H3K4me1 deposition. Gene expression changes upon Tet TKO were linked to enhancer number, where highly expressed genes interacting with multiple enhancers were more likely to be downregulated than those with fewer putative enhancer interactions. Loss of methylation and loss of oxidation result in distinct transcriptional defects upon differentiation into embryoid bodies (EBs): Dnmt TKO cells fail to exit pluripotency effectively whilst Tet TKO EBs have distinct lineage biases including defects in blood specification. Finally, we describe extensive loss of enhancer priming at regulatory elements associated with key hematopoietic transcription factors in both knockouts at the EpiLC stage, which may explain the failure to upregulate this transcriptional programme upon differentiation.

Results
EpiLC establishment is accompanied by epigenetic remodelling
To understand the interplay between higher-order chromatin structure and epigenetic chromatin modifications, and how this governs the transcriptional programmes that underpin formative pluripotency, we transitioned wild type (WT) naïve ESCs (cultured in 2i+LIF conditions) to EpiLCs 4 and profiled chromatin states using a broad range of genomics approaches. These included RNA-sequencing, chromatin immunoprecipitation and sequencing (ChIP-seq), assay for transposase accessible chromatin and sequencing (ATAC-seq), whole genome bisulfite sequencing (WGBS), 5hmC pulldown and sequencing (HMCP-seq), and PCHi-C (Fig. 1a).
Over a period of 48 hours cells became morphologically flatter and began expressing key post-implantation associated genes including Pou3f1, Fgf2, Otx2 and Dnmt3a/b (Fig. 1b).
Naïve pluripotency genes including Tbx3, Klf4 and Esrrb were also robustly downregulated whilst core pluripotency genes such as Oct4 and Sox2 remained expressed at high levels (Fig. 1b). More broadly, we identified 434 significantly upregulated genes and 765 significantly downregulated genes (DESeq p<0.05 and dynamic fold-change filter) that were enriched in gene ontology (GO) terms such as MAPK signalling and focal adhesion, respectively (Supplemental Fig. 1a).
The promoters of both activated and repressed genes remained largely hypo-methylated (Supplemental Fig. 1b) despite the expected global accumulation of DNA methylation observed (Supplemental Fig. 1c), suggesting that promoter methylation plays a minor role in gene regulation during the formation of EpiLC cells. Histone modification changes at promoters reflected expression change for some but not all genes. For example, 316 of 434 upregulated genes (~73%) accumulated H3K4me3 (DESeq p<0.05), a mark associated with active promoter elements. Similarly, 300 of 765 downregulated genes lost H3K4me3 (~39%) from their promoters (Fig. 1c, Supplemental Fig. 1d). Similar correlations were observed for H3K27ac levels and accessibility levels (Supplemental Fig. 1d).
Regulatory elements, such as enhancers, are thought to interact with cognate promoters via three-dimensional loops 44 . To connect enhancers with their putative target genes, we used PCHi-C data to build a network 45 in which each node represented a genomic location (either a promoter or a promoter interacting region [PIR]) and each edge represented a PCHi-C interaction. For visualisation we used a force-directed graph layout in which highly interacting regions are pulled close together, and epigenetic information can be superimposed (similar to Canvas in 46 , Supplemental Fig. 2a, b). Colouring edges by the levels of histone modifications at promoters or PIRs did not reveal substantial changes between ESC and EpiLC cells on a global scale; however, we detected clusters of active or inactive genes and regulatory elements (Supplemental Fig. 2b). For example, there were distinct clusters of H3K27me3 marked genes and regulatory elements, including clusters of homeobox (Hox) genes 47 known to be repressed by Polycomb repressive complex 2 (PRC2) and PRC1 in embryonic stem cells (Supplemental Fig. 2b) 48,49 .
As anticipated, genes activated upon ESC to EpiLC transition tended to interact with a greater number of PIRs that gain the active histone modifications H3K27ac and H3K4me1, whilst repressed genes tended to interact with more PIRs that lose these marks (Supplemental Fig. 2c-f). For example, of the PIRs interacting with upregulated genes, 11.2% accumulated H3K27ac (change in RPKM >1 between conditions), whilst just 0.1% of PIRs interacting with downregulated genes gained H3K27ac upon ESC to EpiLC transition (Supplemental Fig. 2f). A subnetwork centred on the well-characterised post implantation gene Dnmt3b showed that all significant PIRs detected (4/4) gained H3K27ac upon the EpiLC transition, concomitant with activation of the gene (Fig. 1b, d). These PIRs are putative enhancer elements that may play a role in gene activation. Conversely, the naïve pluripotency associated gene Tbx3 formed significant interactions with 27 different PIRs. Of these, 8 were marked by H3K27ac in ESCs, and 5 of these lost H3K27ac upon transition to the EpiLC state, consistent with transcriptional repression (Fig. 1b, d). Similar networks built around other genes involved in naïve pluripotency and the post implantation epiblast showed similar patterns, where histone modifications on some but not all PIRs correlated with expression change (Supplemental Fig. 3a, b). As some of these PIRs are likely gene regulatory elements such as enhancers, we next sought a method to define these systematically.
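The promoter-PIR network described above can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the PIR identifiers and signal values are invented, and only the rule of counting a PIR as gaining H3K27ac when its RPKM increases by more than 1 between conditions is taken from the text.

```python
from collections import defaultdict


def build_network(interactions):
    """Map each promoter (gene) to the set of PIRs it contacts."""
    network = defaultdict(set)
    for promoter, pir in interactions:
        network[promoter].add(pir)
    return dict(network)


def pirs_gaining_mark(network, gene, signal_esc, signal_epilc, min_gain=1.0):
    """Return (number of PIRs gaining the mark, total PIRs) for one gene.

    A PIR counts as gaining the mark when its signal rises by more than
    `min_gain` RPKM from ESC to EpiLC.
    """
    pirs = network.get(gene, set())
    gained = {p for p in pirs if signal_epilc[p] - signal_esc[p] > min_gain}
    return len(gained), len(pirs)


# Hypothetical PCHi-C interactions (promoter, PIR); identifiers are made up.
interactions = [
    ("Dnmt3b", "pir_1"), ("Dnmt3b", "pir_2"),
    ("Dnmt3b", "pir_3"), ("Dnmt3b", "pir_4"),
    ("Tbx3", "pir_5"), ("Tbx3", "pir_6"),
]

# Hypothetical H3K27ac RPKM values per PIR in each condition.
h3k27ac_esc = {"pir_1": 0.5, "pir_2": 0.8, "pir_3": 1.0,
               "pir_4": 0.2, "pir_5": 6.0, "pir_6": 0.3}
h3k27ac_epilc = {"pir_1": 3.1, "pir_2": 2.4, "pir_3": 4.0,
                 "pir_4": 1.9, "pir_5": 1.5, "pir_6": 0.4}

network = build_network(interactions)
dnmt3b_result = pirs_gaining_mark(network, "Dnmt3b", h3k27ac_esc, h3k27ac_epilc)
tbx3_result = pirs_gaining_mark(network, "Tbx3", h3k27ac_esc, h3k27ac_epilc)
```

With these toy values every PIR of the activated gene gains the mark and none of the PIRs of the repressed gene do, which is the kind of per-gene summary the subnetworks are used for.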
A novel clustering approach to dissect gene regulatory element-chromatin interactions
To define promoter-PIR interactions in a systematic manner, we developed a novel method whereby we treated each promoter-PIR pair individually and used the epigenetic and transcriptional data that we had collected for each pair to perform dimensionality reduction and expectation-maximisation based clustering (methods, Fig. 2a). The resulting UMAP separated promoter-PIR pairs into 20 clusters based on transcription of the associated gene and epigenetic information from both interacting regions including histone modifications, accessibility, DNAme and interaction strength (Fig. 2b, c, Supplemental Fig. 4a). For example, cluster 16 consisted of active promoters (transcribed, marked by H3K4me3 and H3K27ac) interacting with active enhancers (marked by high levels of H3K27ac and H3K4me1) (Fig. 2b-d). Cluster 17 consisted of inactive promoters interacting with putative "poised" enhancers (marked by H3K4me1 and H3K27me3) (Fig. 2b, c). Clusters 6, 12 and 19 consisted of lowly expressed or unexpressed genes interacting with H3K4me1 marked PIRs, which are putative "primed" enhancers [41][42][43] . Clusters 1, 2 and 15 were marked by varying levels of H3K27me3 at the promoter, including repressed and bivalent genes, interacting with unmarked PIRs. Finally, there were a number of clusters that contained active promoters interacting with unmarked PIRs (3, 10, 13, 14, 18), as well as some clusters containing inactive promoters interacting with unmarked PIRs (0, 5, 7, 8, 11).
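To make the expectation-maximisation step concrete, the sketch below fits a two-component Gaussian mixture to a single feature. It is a deliberate simplification under stated assumptions: the method described above clusters many features per promoter-PIR pair after dimensionality reduction and yields 20 clusters, whereas this toy uses one invented feature (an H3K4me1-like signal at the PIR), two components and a simple min/max initialisation.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_em(values, n_iter=50, min_var=1e-3):
    """Fit a two-component 1-D Gaussian mixture by EM; return (means, labels)."""
    # Deterministic initialisation: one component at each extreme of the data.
    means = [min(values), max(values)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in values:
            dens = [weights[k] * gaussian_pdf(x, means[k], variances[k])
                    for k in range(2)]
            total = sum(dens)
            if total == 0.0:
                resp.append([0.5, 0.5])
            else:
                resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)  # floor avoids a collapsing variance
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, labels


# Hypothetical signal at eight PIRs: four unmarked, four marked.
pir_signal = [0.2, 0.4, 0.3, 0.1, 4.8, 5.1, 5.3, 4.9]
means, labels = fit_two_component_em(pir_signal)
```

In a multi-feature setting the same E-step and M-step apply with multivariate densities, and the number of components becomes a modelling choice.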
Interestingly, the cluster of active promoters interacting with active enhancers (cluster 16) contained the promoters for pluripotency associated genes (e.g. Sox2, Sall1, Nanog) that are highly expressed at the epiblast stage. Clusters of active promoters interacting with unmarked PIRs (e.g., cluster 10) frequently involved housekeeping genes, such as genes involved in mRNA processing. Polycomb marked clusters (e.g., cluster 2) were enriched for lineage-specific genes that are commonly upregulated upon differentiation (e.g. Olig2, Hand1, Hoxb1, Neurog1). Meanwhile, clusters of inactive promoters interacting with unmarked PIRs (e.g., cluster 8) contained highly lineage specific genes, such as olfactory receptors (Supplemental Fig. 4a). These results suggest fundamentally different modes of regulation for genes required at distinct times during development.
As anticipated, single genes were involved in more than one promoter-PIR pair, appearing in more than one cluster. For example, Sox2 was involved in promoter-PIR pairs found in clusters 16 (active promoter -active enhancer), 6 (promoter -primed enhancer), and 13 (active promoter -unmarked PIR) (Fig. 2d-f). Indeed, the active enhancer region identified using our method overlapped the previously described Sox2 control region (SCR), essential for Sox2 activity 50 .
Putative active enhancers defined using our method overlapped well with elements defined by others using ChIP-seq (Supplemental Fig. 4b) 51 . Moreover, consistent with previous work, we found that polycomb marked promoters (clusters 1, 2, 15) and poised enhancers (cluster 17) dramatically accumulate H3K27me3 and form more significant Hi-C interactions upon ESC to EpiLC transition and global methylation of the genome (Supplemental Fig. 4c-e) 52 . Interestingly, H3K4me1 also accumulated upon transition from ESC to EpiLC at putative active (cluster 16) and primed enhancers (clusters 6, 12 and 19) whilst H3K27ac levels were unchanged (Supplemental Fig. 4f-h). Thus, our data show that some chromatin modifications are stable whilst others are re-organised during pluripotency exit.
Overall, our dataset and clustering method form a comprehensive picture of the transcriptional and epigenetic landscape of EpiLCs that (1) integrates numerous layers of molecular information, (2) clearly defines pairs of promoters and putative regulatory elements (allowing one to identify elements of particular interest such as the SCR) and (3) can be easily interrogated by, for example, filtering for different types of elements or filtering for promoter-PIR pairs involving genes of interest.
H3K4me1 and H3K27ac are differentially affected by DNA methylation
DNA methylation (5mC) and oxidative derivatives (5hmC, 5fC, 5caC) accumulate during the transition from naïve ESC to EpiLC when Dnmt and Tet genes are co-expressed at high levels 22 . We therefore asked whether DNA methylation and oxidation are important to establish the histone post-translational modification landscape of EpiLC cells that we have defined.
We generated epigenetic datasets (as in Fig. 1a) from Dnmt triple knockout (TKO) 53 and Tet  Fig. 5a). Importantly however, DNA mass spectrometry and WGBS confirmed that 5hmC was absent from both lines, whilst 5mC was absent from the Dnmt TKO line but accumulated in the Tet TKO line (Supplemental Fig. 5b, c).
We first plotted the levels of H3K4me1 and H3K27ac across the active enhancer elements defined in this study (cluster 16) in TKO EpiLCs relative to matched WT cells (Fig. 3a-d).
Excitingly, we found that H3K4me1 was depleted in both TKO lines relative to their WT controls, whilst H3K27ac was depleted only in the Tet TKO line (Fig. 3a-d). The same was true across primed and poised enhancers (e.g., clusters 12 and 17 respectively), though H3K27ac levels were expectedly low at these (Fig. 3c, d).
Considering the accumulation of H3K4me1 we observed during the ESC to EpiLC transition (Supplemental Fig. 4f) required both DNMT and TET activities (Fig. 3a), our results suggest that oxidation derivatives (5hmC, 5fC or 5caC) are important for H3K4me1 deposition. Meanwhile, because H3K27ac levels are stable during the ESC to EpiLC transition (Supplemental Fig. 4g) and unaffected by loss of DNMT activity (Fig. 3a), we reason that deposition of this mark requires (maintenance of) unmethylated DNA.
Next, given that we observed an accumulation of H3K27me3 across bivalent and repressed promoters (clusters 1, 2, 15) and poised enhancers (cluster 17) upon ESC to EpiLC transition (Supplemental Fig. 4c, d), and that TET proteins are known to localise with PRC2 at bivalent promoters [54][55][56][57] , we asked what effect loss of DNMT or TET activity has on this mark. We observed a pronounced accumulation of H3K27me3 across repressed promoters and poised enhancers in Tet TKO EpiLCs relative to WT controls (Fig. 3e, Supplemental Fig. 5d) together with a loss of H3K27me3 largely from non-promoter regions that do not feature prominently in promoter-PIR clusters (Supplemental Fig. 5d). Given that these cells express Tet catalytic mutants (Supplemental Fig. 5a), our results suggest that TETs facilitate H3K27me3 deposition in a manner independent of their catalytic activity, and indeed that this can be boosted in the absence of the catalytic domain (e.g., by altering PRC2 targeting or activity). This is consistent with recent publications showing that full length Tet1 knockout cells, but not catalytic mutants, lose H3K27me3 from promoters 58,59 .
Highly expressed genes that interact with multiple enhancers are sensitive to Tet TKO
Enhancers are thought to drive expression of linked genes, and we therefore hypothesised that the dramatic changes at enhancer chromatin that we had observed would correlate with transcriptional downregulation of linked genes. To investigate this, we profiled mRNA expression levels in WT, Dnmt and Tet TKO EpiLCs using RNA-seq.
Dnmt TKO resulted in more upregulated (735) than downregulated (373) genes in EpiLC cells (Fig. 4a), consistent with the role of DNA methylation in gene repression. Indeed, the majority of the upregulated genes were ESC expressed genes that failed to be repressed upon EpiLC transition (445/735 genes, Fig. 4b). Dnmt TKO also resulted in downregulation of genes associated with MAPK signalling and pluripotency; the latter included the Dnmt3a/b and Dnmt1 genes themselves. Tet TKO resulted in upregulation of 834 and downregulation of 576 genes in EpiLC cells, the latter of which were especially enriched in pluripotency genes by gene ontology analysis (adjusted p-value = 5.214e-10) (Fig. 4c, d).
The downregulated pluripotency genes included Lefty1, which was previously shown to be aberrantly repressed by DNAme in Tet TKO mice 28 . Most of these genes were expressed at near normal levels in TKO ESCs (Fig. 4c), suggesting that their downregulation is caused by the inappropriate accumulation of DNAme and/or loss of oxidative derivatives (5hmC, 5fC, 5caC) at regulatory elements upon transition to EpiLC. Surprisingly, we did not observe a strong negative correlation between mRNA expression change and H3K4me1/H3K27ac change at linked enhancers (cluster 16) in Tet TKO (Fig. 4d, e) or Dnmt TKO (Supplemental Fig. 6a) EpiLCs. Significantly downregulated genes had a tendency to interact with enhancers that lose active histone modifications in Tet TKO cells but not in Dnmt TKO cells. Therefore, downregulation of genes in Dnmt TKO EpiLCs relative to WT controls may be a result of cells failing to exit pluripotency effectively rather than failing to activate enhancers (consistent with our finding that many ESC genes are not repressed effectively - Fig. 4a).
We hypothesised that the lack of correlation between enhancer modification change and expression change was a result of enhancer redundancy (multiple regulatory elements controlling single genes). To test this, we studied the change in ATAC-seq, H3K4me1 and H3K27ac signal upon Tet TKO at the putative enhancer elements associated with significantly downregulated genes versus a representative set of stable genes that were not differentially expressed (Fig. 4f, g, Supplemental Fig. 6b-d). We observed that enhancers associated with downregulated genes had a tendency to lose more ATAC-seq signal (accessibility) in the Tet TKO EpiLC cells than enhancers associated with stable genes (Fig. 4g), despite starting with similar levels of accessibility (Supplemental Fig. 6b). The same was true of H3K4me1 and H3K27ac levels across these two sets of enhancer elements (Supplemental Fig. 6c, d). Interestingly, genes that were downregulated upon Tet TKO were more highly expressed in WT EpiLCs than genes that were stable (Fig. 4f) and they, on average, interacted with more enhancer elements (Fig. 4h). These data suggest that highly expressed genes, which are dependent on multiple enhancer elements to maintain such expression levels, are sensitive to impaired enhancer function upon Tet TKO.
Meanwhile, stable genes interacting with fewer enhancer elements are less affected by enhancer disruption upon Tet TKO.
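The comparison of enhancer number between downregulated and stable genes amounts to counting distinct enhancers per gene and averaging within each gene set. The sketch below illustrates that bookkeeping only; the gene and enhancer names are placeholders and the counts are invented, so it shows the shape of the calculation and not the reported result.

```python
from collections import Counter
from statistics import mean


def enhancers_per_gene(pairs):
    """Count distinct enhancers per gene from (gene, enhancer) pairs."""
    return Counter(gene for gene, _enhancer in set(pairs))


def mean_enhancer_count(counts, genes):
    """Average enhancer count over a gene set; genes without pairs count as 0."""
    return mean(counts.get(gene, 0) for gene in genes)


# Hypothetical promoter-enhancer pairs (for example, drawn from an
# active-enhancer cluster); the duplicated pair is deliberately included.
pairs = [
    ("geneA", "e1"), ("geneA", "e2"), ("geneA", "e3"), ("geneA", "e1"),
    ("geneB", "e4"), ("geneB", "e5"),
    ("geneC", "e6"),
    ("geneD", "e7"),
]

downregulated = ["geneA", "geneB"]   # placeholder gene sets
stable = ["geneC", "geneD", "geneE"]

counts = enhancers_per_gene(pairs)
mean_down = mean_enhancer_count(counts, downregulated)
mean_stable = mean_enhancer_count(counts, stable)
```

Deduplicating the pairs before counting matters when the same promoter-enhancer contact is reported more than once, for instance across replicates.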
Tet TKO results in impaired upregulation of hematopoietic genes
Interestingly, many of the regulatory elements that lost active histone modifications in Tet TKO cells were linked to lowly expressed genes in Tet WT cells (Fig. 4e). One possibility is that these regulatory elements prime associated genes for expression later on during development, and that loss of active histone modifications at these elements may then affect their ability to be upregulated.
To test this hypothesis, we generated RNA-seq data across an 8 day embryoid body (EB) differentiation time course (sampling every 2 days) in Dnmt TKO and Tet TKO mESCs as well as matched WT controls (Fig. 1a). Principal component analysis (PCA) of the EB, ESC and EpiLC RNA-seq data separated the WT samples along PC1 indicating a gradual transcriptional change upon differentiation (Fig. 5a, b). According to GO analysis, genes associated with differentiated cell types were upregulated (e.g., terms related to ossification and heart development) whilst pluripotency genes were downregulated in both WT lines (Supplemental Fig. 7a, b). Many differentially expressed genes were common between the WT lines, indicating that our protocol was robust (Supplemental Fig. 7c, d).
Based on PCA, Tet TKO cells appeared to exit pluripotency more quickly than their WT counterparts but did not appear to reach the same endpoint (Fig. 5a), suggesting transcriptional defects in both pluripotency and differentiation programmes. Meanwhile, Dnmt TKO had a more severe phenotype where day 8 EBs transcriptionally resembled WT EBs at day 4, suggesting a differentiation block (Fig. 5b). A few hundred genes were differentially expressed in both knockout lines at the EpiLC stage and in EBs (Fig. 5c, d) but the limited overlap between these gene lists suggests fundamentally different transcriptional defects upon loss of DNMT and TET activity (Supplemental Fig. 7e, f). Based on GO analysis, Dnmt TKO EBs at day 8 expressed pluripotency genes at higher levels than their WT counterparts but failed to efficiently upregulate genes associated with heart, muscle, bone and blood development, consistent with a transcriptional block (Fig. 5d). Interestingly, Tet TKO EBs at day 8 showed lineage biases, including an upregulation of genes associated with Wnt signalling, muscle and neural crest formation, and a downregulation of genes strongly enriched in GO terms associated with the hematopoietic system (Fig. 5c). These results are consistent with recent chimaera studies 37,38 showing that a lack of TET enzyme activity impairs erythropoiesis.
Tet TKO results in priming defects at enhancers associated with hematopoietic genes
We hypothesised that some of the regulatory elements that are epigenetically perturbed in Tet TKO EpiLCs prime lowly expressed developmental genes for future activation (Fig. 4e).
In this scenario, these genes would fail to be upregulated upon differentiation of Tet TKO cells. To identify such cases, we looked for genes that were not expressed in EpiLC cells or early EBs but were significantly lower in Tet TKO EBs at day 4, day 6 and day 8 than in their WT counterparts (and then remained significantly lower for the rest of the time course). This analysis identified 14, 98 and 297 genes significantly lower than expected at day 4, 6 and 8, respectively (Fig. 6a, Supplemental Fig. 8a, b). Interestingly, the genes that failed to be upregulated in Tet TKO EBs also failed to be upregulated in Dnmt TKO EBs (Supplemental Fig. 8c).
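The selection of genes that fail to be upregulated can be written as a small filter over per-day test results. The sketch below is one possible reading of the criteria, not the authors' code: it assumes a boolean flag for early expression and a per-day flag for "significantly lower in TKO than WT", and it assigns each gene to the first day from which it stays significantly lower until the end of the time course. Gene names and flags are hypothetical.

```python
DAYS = (4, 6, 8)  # EB time points considered in the text


def onset_of_failed_upregulation(expressed_early, sig_lower_by_day, days=DAYS):
    """Return the first day from which a gene stays significantly lower in TKO.

    Returns None if the gene is already expressed in EpiLCs or early EBs, or
    if it is not persistently lower through the last time point.
    """
    if expressed_early:
        return None
    for i, day in enumerate(days):
        if all(sig_lower_by_day.get(d, False) for d in days[i:]):
            return day
    return None


# Hypothetical inputs: (expressed early?, {day: significantly lower in TKO?}).
genes = {
    "gene_1": (False, {4: True, 6: True, 8: True}),    # lower from day 4 onwards
    "gene_2": (False, {4: False, 6: True, 8: True}),   # lower from day 6 onwards
    "gene_3": (False, {4: True, 6: True, 8: False}),   # recovers by day 8
    "gene_4": (True, {4: True, 6: True, 8: True}),     # expressed early, excluded
    "gene_5": (False, {4: False, 6: False, 8: True}),  # lower only at day 8
}

onsets = {name: onset_of_failed_upregulation(early, flags)
          for name, (early, flags) in genes.items()}
```

Whether the reported gene counts per day are cumulative or assigned by onset is not stated in the text, so the onset convention here is an assumption.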
Consistent with GO analysis using all differentially expressed genes (Fig. 5d), these lists were significantly enriched for genes associated with hematopoietic stem cell differentiation (adjusted p-value 9.643e-7), and included genes encoding key transcription factors -Tal1, Nfe2, Runx1, Klf1, Lyl1 and Gata1 (Fig. 6a). To determine whether the regulatory elements associated with these blood transcription factor genes lose epigenetic priming at the EpiLC stage in the mutants, we plotted the levels of ATAC, H3K4me1 and H3K27ac signal as well as interaction strength at associated PIRs in WT controls versus Tet TKO and Dnmt TKO EpiLCs (Fig. 6b-e, Supplemental Fig. 8d-g). Strikingly, many of the PIRs that we had previously annotated as active or primed enhancers (Fig. 2b)  Syce2, demonstrating evolutionary conservation of the locus (Fig. 6f, top). Interestingly, subnetworks for those two genes revealed that many of these regulatory elements were shared by both Klf1 and Lyl1 (Fig. 6f, g). For example, one PIR that interacts with both genes is flanked by H3K4me1 and H3K27ac enriched regions. These histone marks and the strength of the interactions between the elements were substantially reduced upon Tet TKO in EpiLC cells (Fig. 6f, g), suggesting loss of epigenetic priming. Next, we analysed published 10x Multi-ome (combined single cell RNA-seq and ATAC-seq) 62 data of E8.5 mouse embryos to identify cell type specific accessibility profiles. Strikingly, the Klf1/Lyl1 interacting regulatory elements that lose epigenetic priming in Tet and Dnmt TKO EpiLC cells became accessible specifically in hematopoietic progenitor cells and erythroid lineages (Fig. 6f, bottom). Moreover, these accessible elements are later bound by blood associated transcription factors that are not expressed at the EpiLC stage, including GATA1, TAL1 and LYL1, according to reanalysed ChIP-seq data 63-65 from hemato-endothelial progenitor cells (Fig. 6f, bottom). 
Collectively, our data suggest that these regulatory elements function as epigenetically primed enhancers that are required for hematopoietic differentiation. We find that methylation and oxidation by DNMTs and TETs are required to maintain accessibility, active histone modifications and looping interactions between these elements and key blood TF genes.

Discussion
Dynamic turnover of DNA methylation at pluripotent stem cell enhancers
Paradoxically, the enzymes that deposit (DNMTs) and remove (TETs) DNA methylation are co-expressed during early mammalian development and target overlapping genomic regions, in particular distal gene regulatory regions such as enhancers 21,23,24,66,67 . As a consequence, DNA methylation states at gene regulatory elements cycle dynamically in mouse and human pluripotent stem cells, especially during the transition from naïve to formative pluripotency. This raises the possibility that the antagonistic enzymes of the DNA methylation cycle, or the modifications they generate, may be involved in modulating the activity of key gene regulatory elements during early developmental cell fate diversification 22 .
Here, to test this hypothesis, we combine an established in vitro cell model of formative pluripotency with genetic mutants that are defective for specific enzymatic steps in the DNA methylation and oxidation cycle. We extensively profile multiple chromatin modifications, DNA methylation, and transcription, and we develop a novel clustering approach to track promoter-gene regulatory element interactions during mouse pluripotency state transitions.
We believe that this rich resource, containing more than 300 new datasets, will be of great value to the community. We find that during the exit from pluripotency, dynamic turnover of cytosine methylation is required to activate enhancers that act in mouse pluripotent stem cells. Further, the cycle of DNA methylation and oxidation is also necessary to prime enhancers for future activation, and we show that aberrant enhancer priming in TET and DNMT mutants results in the failure to upregulate key transcription factor genes within the hematopoietic lineage tree. Our data thus suggest that the role of DNA oxidation at gene regulatory regions goes beyond removing DNA methylation as a repressive epigenetic mark, to keep the local chromatin accessible for trans-acting factors that mediate enhancer function 68,69 . We propose that in addition to this role, enhancer DNA demethylation functions in a dynamic DNA methylation and oxidation cycle to establish an active and primed enhancer landscape that controls cell-type specific gene expression programmes in pluripotent stem cells and their differentiated progeny.
Potential mechanisms of enhancer priming by cyclical DNA methylation and oxidation
Enhancers have been grouped into functional categories based on post-translational histone modifications they harbour, including primed (H3K4me1), active (H3K4me1 and H3K27ac) and poised (H3K4me1 and H3K27me3; 41,42 ). These post-translational histone modifications are deposited at enhancers by MLL3/MLL4 (H3K4me1), P300/CBP (H3K27ac), and Polycomb repressive complex 2 (H3K27me3). We demonstrate that interfering with DNA methylation dynamics affects the chromatin states at all three enhancer classes, and that the lack of DNMT activity causes changes in enhancer chromatin signatures that are distinct from those resulting from the absence of catalytically active TETs (Fig. 7). H3K4me1 occupancy at regulatory elements is markedly increased during the ESC to EpiLC transition, which we found is dependent on both DNMT and TET activity. In contrast, H3K27 acetylation is depleted at enhancers in oxidation deficient cells, but persists in the absence of DNMT activity. The Polycomb mark H3K27me3 is broadly redistributed in Tet knockout EpiLCs, resulting in a pronounced gain of H3K27me3 at poised enhancers. Our data suggest that oxidation derivatives of 5mC (5hmC, 5fC or 5caC) may be important for H3K4me1 deposition at enhancers. Intriguingly, we have previously identified WDR5, a member of the WRAD complex that may be involved in targeting histone methyltransferases to specific genomic sites 70 , as strongly interacting with 5fC modified DNA 71 . This interaction may provide a mechanistic link between oxidation derivatives and MLL3/MLL4 mediated H3K4me1 deposition at enhancers (Fig. 7).
Our data further imply that H3K27ac can persist when H3K4me1 is depleted, in agreement with a recent study reporting that H3K27ac is gained at hundreds of enhancers independently of MLL3/4 activity or H3K4me1 binding during ESC differentiation 72 . The requirement for enhancer priming, and the underlying mechanisms, are likely to be developmental context dependent. Ectoderm-specifying enhancers, unlike endoderm- and mesoderm-associated enhancers, are hypomethylated and accessible long before ectoderm specification during mouse gastrulation 10 . However, the low levels of DNA methylation at these enhancers are TET-dependent. It will be interesting to determine whether a dynamic DNA methylation turnover occurs at ectoderm enhancers during early mouse development, and if it does, how the kinetics differ compared to endo- and mesoderm-specifying enhancers. Moreover, we identify here a subset of mesodermal enhancers that are epigenetically primed by H3K4me1 in epiblast-like cells, and so early epigenetic priming is not completely exclusive to ectoderm enhancers.
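As a minimal illustration of the histone-mark categories described above, the following Python sketch assigns an enhancer class from the presence or absence of three marks. It is not the clustering approach used in this study. The function name, the boolean inputs, and the handling of regions that carry both H3K27ac and H3K27me3 are assumptions made for illustration only.

```python
def classify_enhancer(h3k4me1: bool, h3k27ac: bool, h3k27me3: bool) -> str:
    """Assign an enhancer category from the presence of three histone marks.

    Categories follow the definitions in the text:
      primed = H3K4me1 only
      active = H3K4me1 + H3K27ac
      poised = H3K4me1 + H3K27me3
    Anything else is an assumption of this sketch, not a published rule.
    """
    if not h3k4me1:
        # All three categories in the text require H3K4me1.
        return "unclassified"
    if h3k27ac and h3k27me3:
        # Assumption: co-occurrence of both marks is reported, not resolved.
        return "ambiguous"
    if h3k27ac:
        return "active"
    if h3k27me3:
        return "poised"
    return "primed"


# Hypothetical regions, encoded as (H3K4me1, H3K27ac, H3K27me3) calls.
regions = {
    "region_A": (True, False, False),
    "region_B": (True, True, False),
    "region_C": (True, False, True),
}
labels = {name: classify_enhancer(*marks) for name, marks in regions.items()}
```

In practice the mark calls would come from thresholded ChIP-seq signal, and the thresholds themselves would need to be chosen and justified separately.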
A recent study demonstrated that H3K27 acetylation is dispensable for gene activation during the ESC to EpiLC cell fate transition 73 . One possible explanation is that substrates other than H3K27, including non-histone targets such as transcription factors, are the functionally relevant targets of P300/CBP's lysine acetyltransferase activity at enhancers 74,75 . We note that our model (Fig. 7) is entirely compatible with this scenario, as it places the DNA methylation turnover upstream of enhancer-associated chromatin modifiers and their histone and non-histone targets.
Rewiring of 3D promoter chromatin interactions upon perturbed DNA methylation turnover
Intriguingly, we found that in DNA methylation or oxidation deficient cells, regulatory interactions are disrupted and active chromatin marks are reduced at specific promoter interacting regions, long before the linked target genes are expressed during development.
This priming defect appears to be an enhancer-specific rather than a global effect, with genes encoding key blood lineage transcription factors including Klf1 and Lyl1 among those affected. Catalytic inactivation of MLL3/4 has been shown to result in a loss of H3K4me1 at enhancers 76 , and MLL3/4 and H3K4me1 occupancy at enhancers correlates with increased levels of chromatin interactions via recruitment of cohesin 77 . Interestingly, MLL3/4 activity is specifically required at enhancers that gain H3K4me1 during ESC differentiation, and enhancer-promoter contacts at these de novo enhancers are severely disrupted in the absence of MLL3/4 78 . This indicates that MLL3/4 may be necessary to maintain enhancers in a plastic state that allows them to respond efficiently to developmental cues. How the dynamic turnover of DNA methylation contributes to enhancer plasticity, and what role it plays at MLL3/4-dependent versus MLL3/4-independent enhancers 78 , warrant future investigation.

Aberrant DNA methylation in haematological malignancies
Finally, our data may also provide an explanation for the paradoxical finding that mutations in DNMT and TET genes have both been causally linked to a range of blood cancers. For example, TET2 mutations are present in approximately 50% of chronic myelomonocytic leukaemia (CMML) and 10% of acute myeloid leukaemia (AML) cases 33 . Mutations in DNMT3A were first reported in AML patients and have since been discovered in several other haematological malignancies, pointing to DNMT3A as an important tumour suppressor 79 . We propose that the antagonistic activities of DNMTs and TETs converge on the dynamic methylation turnover at distal regulatory elements, which is required for their activity. The dysregulation of DNA methylation and oxidation at enhancers may contribute to the aetiology and pathology of blood cancers 80 , which could inform novel diagnostic approaches and treatment strategies to improve patient care.

Data Availability
Sequencing data will be available in the NIH GEO database upon publication. Quantitated

PCHi-C
PCHi-C was performed as described in and 45 using in-nucleus ligation 84 . The SureSelect Target Enrichment system (Agilent Technologies) described by 45 .

DNA Mass Spectrometry
A total of 200 ng of genomic DNA was digested using 5 U of DNA Degradase Plus (Zymo Research, E2020) overnight at 37 °C and the base content quantified using a nanoHPLC and Q Exactive mass spectrometer (as described in 85 ).

Processing of sequencing data
RNA-seq
RNA-seq libraries were sequenced as 75 base paired-end runs on an Illumina HiSeq2500 instrument. Quality and adapter trimming were performed using Trim Galore v0.6.1 (using Cutadapt v1.18) 86 . Mapping to the mouse reference genome (GRCm38) was performed using Hisat2 v2.1.0 (--no-softclip) 87 

Global interactions were also represented as a Canvas 46 style network. The R igraph package 102 was used to construct an undirected graph in which genomic regions (HindIII fragments) are represented as vertices/nodes and interactions are represented as edges. A combined network was produced using all significant interactions in any of the conditions. The network was visualised with a force-directed layout (ForceAtlas2) in Gephi v0.9 103 . This representation pulls highly interacting regions closer together while less interacting regions are kept apart. Details can be found in the accompanying material (https://christelkrueger.github.io/Exit-from-Pluripotency/).

Fig. 6a). ATAC and H3K27ac levels are quantified as reads per million (RPKM). PIRs are labelled by the gene they interact with and coloured by the cluster that the promoter-PIR pair was assigned to (see Fig. 2b).

[Figure 5 panel labels: Wikipathways; PCHi-C; Poised (17); Repressed (2)]
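
The combined interaction network described in the text above (HindIII fragments as nodes, significant interactions as edges, pooled across conditions) can be sketched in a few lines. The analysis itself used the R igraph package; the Python code below is only a dependency-free stand-in for the same idea. The function name, the condition labels, and the fragment identifiers are hypothetical.

```python
from collections import defaultdict


def build_combined_network(interactions_by_condition):
    """Pool significant interactions from all conditions into one undirected graph.

    interactions_by_condition maps a condition label to a list of
    (fragment_a, fragment_b) pairs. An edge is kept if it is present in
    any condition. Returns the edge set and an adjacency dictionary.
    """
    edges = set()
    for pairs in interactions_by_condition.values():
        for a, b in pairs:
            if a != b:  # ignore self-interactions in this sketch
                # frozenset makes (a, b) and (b, a) the same undirected edge
                edges.add(frozenset((a, b)))

    adjacency = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        adjacency[a].add(b)
        adjacency[b].add(a)
    return edges, dict(adjacency)


# Hypothetical significant interactions per condition.
example = {
    "WT_EpiLC": [("frag_1", "frag_2"), ("frag_2", "frag_3")],
    "TetTKO_EpiLC": [("frag_2", "frag_1"), ("frag_3", "frag_4")],
}
edges, adjacency = build_combined_network(example)

# Regions contacted by two promoters of interest are the intersection
# of their neighbour sets (here using frag_1 and frag_3 as stand-ins).
shared = adjacency["frag_1"] & adjacency["frag_3"]
```

The same neighbour-set intersection is one simple way to look for regulatory elements shared between two promoters, as described for Klf1 and Lyl1 in the Results. Layout and visualisation (ForceAtlas2 in Gephi) are outside the scope of this sketch.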