Integrative DNA methylation analysis of pediatric brain tumors reveals tumor type-specific developmental trajectories and epigenetic signatures of malignancy

Understanding oncogenic epigenetic mechanisms in brain tumors is crucial for improved diagnosis and treatment. To evaluate tumor type-specific features, we compared atypical teratoid/rhabdoid tumors (AT/RTs), medulloblastomas, and choroid plexus tumors with each other by integrating DNA methylation (507 samples), gene expression (120 samples), and transcription factor (TF) -binding data. Our results suggest that aberrant DNA methylation plays a vital role in the oncogenesis of AT/RTs, which are known for their aggressiveness and exceptionally low mutation rates. In AT/RT, elevated DNA methylation masks the binding sites of TFs driving neural development and is associated with reduced gene expression for specific neural regulators, whereas DNA methylation patterns predict a more advanced differentiation state for MB. By analyzing the data together with pluripotent stem cells, we found two regulatory programs in AT/RT: pluripotent stem cell -like DNA methylation patterns and AT/RT-specific DNA methylation uniquely associated with polycomb repressive complex 2 (PRC2) members. Both methylation patterns cover neural TF binding sites in a TF-specific manner, suggesting their relevance for oncogenic regulation.


Introduction
Central nervous system (CNS) tumors can arise at any age and have the highest cancerassociated mortality rate in pediatric patients 1,2 . Atypical teratoid rhabdoid tumors (AT/RTs), medulloblastomas (MBs), and choroid plexus tumors (PLEXs) are all detected in infants 3 . AT/RTs and MBs are aggressive, grade IV tumors according to the World Health Organization (WHO) classification 4 , and high-grade (III) PLEXs are typically detected in children [4][5][6] . In particular, AT/RTs, SHH and group 3 subtypes of MBs, and grade III PLEXs have a poor prognosis [7][8][9] ; thus, improving patient outcomes is an urgent task. Because aberrant epigenetic regulation plays a major role in many aggressive pediatric tumors 10 , including AT/RTs 11 , we wanted to specify the role of DNA methylation in AT/RT, MB, and PLEX. For AT/RT, the significance of epigenetics is also highlighted by their deserted DNA alteration landscape: the sole recurrent genetic alteration is the inactivation of SMARCB1 or SMARCA4, both of which are subunits of SWItch/Sucrose-Non-Fermentable (SWI/SNF) chromatin remodeling complex 12 . Surprisingly, many histone modifications are also depleted in AT/RT 11 , suggesting an alternate epigenetic driver of AT/RT development.
DNA methylation is one of the major mechanisms underlying epigenetic regulation. It contributes to cell differentiation 13 ; is altered in different malignancies [14][15][16] ; and is, together with histone methylation, a clinically relevant therapeutic target 17,18 . In CNS tumors, DNA methylation is used for accurate tumor classification and early cancer detection [19][20][21] . DNA methylation-based classification of CNS tumors also reveals tumor type-or subtype-specific DNA methylation patterns, linking DNA methylation to tumor type-specific oncogenesis 21 .
Traditionally, researchers have compared and analyzed subtypes of AT/RT, MB, and PLEX separately [22][23][24][25] . For example, molecular AT/RT subtypes have been reported to have distinct genomic profiles, SMARCB1 genotypes, and chromatin landscapes, including DNA methylation differences among the subtypes 22,26 . In this study, we aimed to define DNA methylation patterns and related gene regulation, which are characteristic of each tumor type irrespective of their internal heterogeneity, to gain insight into these tumors and their oncogenic regulation. AT/RT, MB, and PLEX were used for the comparison because they have a partly shared history and characteristics [27][28][29] .
Our analysis revealed elevated DNA methylation patterns in AT/RT, suggesting that the binding of TFs that drive neural differentiation is often inhibited in AT/RTs. In addition, polycomb repressive complex 2 (PRC2) sites were hypermethylated in AT/RTs. We combined gene expression data with DNA methylation and observed upregulation of especially neural genes in MBs when compared with AT/RTs and/or PLEXs. Because the SWI/SNF complex is critical for the targeted opening of chromatin at certain genomic loci during neural development by recruiting EP300, leading to H3K27 acetylation 12,30 , we analyzed our results with respect to DNA methylation data from pluripotent stem cells (PSCs) and fetal brain (FB). We found that some sites shared similar methylation patterns in AT/RTs and PSCs. However, these sites are different from PRC2 target sites, suggesting that there are two different DNA methylation programs in AT/RTs.

DNA hypermethylated in AT/RT when compared with MB or PLEX
To study oncogenic epigenetic regulation in AT/RTs, MBs, and PLEXs, we collected genomewide DNA methylation Illumina microarray data (i450K) from 497 tumors and expression data from 110 tumors. In addition, we generated data from 10 matched tumors with reduced representation bisulfite sequencing (RRBS) and RNA sequencing ( Figure 1A) (Supplementary Table 1), and 89 normal brain DNA methylation samples were used as controls in i450k-based DNA methylation analysis. DNA methylation data separated the tumor types and their subtypes in t-distributed stochastic neighbor embedding (tSNE) analysis ( Figure 1B). RRBS samples were positioned adjacent to the tumor subgroups that matched their clinical diagnosis 21 . To clarify the role of aberrant DNA methylation in AT/RTs, MBs and, PLEXs, we identified differentially methylated genomic regions between them, transcriptional regulators enriched in these regions, and how differential methylation was associated with gene expression.
Differential methylation analyses resulted in 11,016 and 64,983 differentially methylated regions (DMRs) in the i450k and RRBS data, respectively ( Figure 1C). Higher DMR counts in RRBS data analysis were predictable because the method has a wider genomic coverage and i450k DMRs were, on average, longer than those from RRBS (always 1kb) 31,32 . Furthermore, tumor location information and normal samples were used to filter out tumor location-related differences in i450k DNA methylation data, which decreased the DMR counts ( Figure 1C vs. Supplementary Figure 1A). There was a higher proportion of intergenic DMRs in RRBS than in i450k data (36%, 30%, and 36% vs. 14%, 10% and, 11% in AT/RT-MB, ATRT-PLEX, and PLEX-MB, respectively) (Supplementary Figure 1C, Supplementary Table 1). DMRs were distributed across all chromosomes (Supplementary Figure 1D).
The majority of DMRs were more methylated in AT/RT than in other tumor types (84% and 72% in i450k and RRBS data, respectively) and less methylated in MB than in other tumor types (77% and 79% in i450k and RRBS data, respectively) ( Figure 1D). Consistently, the highest DNA methylation levels were detected in AT/RTs and lowest in MBs ( Figure 1E).
As higher DNA levels were reported in AT/RT-TYR and AT/RT-SHH subgroups 26 , we asked whether higher DNA methylation was subgroup-specific. No AT/RT subgroup specificity was observed in our setting; the highest DNA methylation levels were consistently detected in each AT/RT subgroup when compared with the subgroups of other tumor types (Wilcoxon p values < 0.005, effect size Cliff's delta |δ| > 0.2) (Supplementary Figure 1B,  Regions inside topologically associating domains (TADs) potentially share similar patterns of epigenetic regulation 33 . Our analysis also found several large-scale DNA methylation differences within TADs ( Figure 1F). Especially in the AT/RT-MB comparison, there were clusters of large-scale differences in chromosomes 1, 11, 13, 17, and 19. Chromosome 17, which carries many DMRs (Supplementary Figure 1D) and harbors recurrent gains and losses in MB 34 , is notable, because it harbors 14 large-scale methylation-related TADs, 13 of which were differentially methylated in the AT/RT-MB comparison. a) Illustration of our data analysis and integration approach. The number of samples in each cohort is shown, and also for the three separate expression cohorts analyzed using expression microarrays. b) Tumor types are separated based on DNA methylation in a tSNE visualization, when the 10 000 most variable regions shared between i450k and RRBS data were used for the visualization. Methylation subgroups of each tumor type (omitted from 21 ) in i450k were also visualized. c) Venn diagram showing the number of differentially methylated regions (DMRs) in each comparison. Tumor type-specific DMRs are marked into the intersecting areas. A higher number of DMRs were detected in the RRBS than in the i450k data. For i450k results, DMRs were filtered using methylation data from normal brain samples. d) Numbers of hypermethylated and hypomethylated regions for each comparison and for i450k and RRBS data. e) AT/RT had the highest overall DNA methylation level; MB was the least methylated. Violin plots visualized the distribution of DNA methylation in the 10 000 most variable sites in the RRBS and i450k microarray datasets. f) Several TADs were influenced by large-scale methylation differences, especially in AT/RT-MB comparison. Karyoplot visualizes the topologically associating domains (TADs) that harbor large-scale DNA methylation differences. Color indicates a comparison in which a difference is observed.

AT/RT hypermethylated sites are enriched with neural differentiation TFs
To gather further information on the function and possible upstream regulation of cancerspecific DMRs, we analyzed which TFs bind to these sites by using the Gene Transcription Regulation Database (GTRD) 35 . Because most TFs do not bind to methylated DNA or closed chromatin 36,37 , we presumed that methylation in the TF binding site typically means decreased DNA binding and thus decreased activity for the enriched TF. DMRs, which are concordantly hypermethylated or hypomethylated in one tumor type when compared with the others, were used for the analysis (Figure 2A).
Next, we studied the expression of the enriched TFs in our datasets (Supplementary Figures  2 and 3, Supplementary Table 3). ERF was the only TF that had a significantly higher expression in AT/RTs than other tumors (adjusted p < 0.05, absolute logFC > 1), whereas 13 TFs, including EBF3 and NEUROD1, had significantly higher expression in MBs than in other tumors ( Figure 2D, Supplementary Figures 2, 3, and 4). The expression of 164 (90%) TFs was not significantly associated with any tumor type (Supplementary Figures 2 and 3). We observed that neural differentiation TFs enriched in AT/RT hypermethylated regions were expressed in AT/RTs and that some, such as NEUROG2 and HAND2, had even the highest expression in AT/RTs ( Figure 2D). Thus, we conclude that DNA methylation changes in TF binding sites affected the function of these TFs independently of their gene expression. a) The majority of AT/RT DMRs were hypermethylated (99% and 79% in i450K and RRBS data, respectively), whereas hypomethylated DMRs were more commonly observed in MB (85% and 88% in i450K and RRBS data, respectively) and PLEX (98% and 71% in i450K and RRBS data, respectively). Four-field plots for each tumor type show the number of concordant and non-concordant hypermethylated and hypomethylated regions. Concordant means that DMRs are either hypermethylated or hypomethylated in the specified tumor type when compared with two other tumor types, whereas an opposite difference was observed for non-concordant DMRs. Concordant DMRs were defined as tumor-specific, differentiating each tumor type from the others. b) The majority of TFs are specifically enriched in AT/RT hyper, MB hyper, or PLEX hypo DMRs and are linked to relevant cellular functions. Upper part: Upset plot of the number of enriched TFs for tumor-specific hypermethylated and hypomethylated regions. Some TFs were associated with several DMR groups. Lower part: TFs were annotated into themes based on their reported functions. The number of TFs in a given theme for each upset plot column is shown. The color of the heatmap shows the fraction of TFs in each theme (row). c) Binding sites of neural differentiation factors NEUROG2, ASCL1, and PAX7 are more methylated in AT/RT, irrespective of the tumor subtype. Out of TFs linked to histone lysine methylation, EZH2 (involved in histone H3 lysine 27 trimethylation) behaved similarly, but a distinct DNA methylation pattern was observed for MIER1 and EHMT2 (involved in histone H3 lysine 9 methylation) binding sites. Violin plots visualizing the distribution of DNA methylation with selected TF binding sites in all i450k DMRs. The tumor subgroups are presented separately. d) Some TFs enriched in DMRs hypermethylated in AT/RT (AT/RT hyper) showed differential expression between tumor types. The expression of selected TFs enriched in tumor-specific DMRs are visualized with boxplots (*,**, and *** refer to P < 0.05, P < 0.01 and P < 0.001, respectively). e) The cell-type-specific binding patterns provide information on the epigenetic state of TF regulomes in the analyzed tumors. GTRD data were split based on the sample of origin in the listed categories (at the bottom), and TF binding site enrichment was calculated for each of them separately. TFs were organized into the themes listed in Figure 2B. The dot is not marked when a given TF is not measured in a given GTRD category. f) PRC2 subunits rarely co-localize with neural TFs in AT/RT hypermethylated sites.
Heatmap visualization of the enrichment p-value (one-sided Fisher's exact test) for colocalization. All the adjusted p values of 0.05 or higher are marked in white. TF themes for each TF were annotated on the right-hand side of the heatmap.

Analysis of split GTRD data underlines the epigenetic states of neural cell differentiation in AT/RT and MB
GTRD information is from various cells of origin 35 ; thus, we analyzed TF binding data in detail by considering the sample source. We divided GTRD into 11 categories based on the cell of origin ( Table 2). Using this split GTRD data, we found 15 neural differentiation and neural cells-related TFs measured in brain tumors or neural cells. Ten of these (67%) were enriched in AT/RT hypermethylated sites, MYCN binding sites in brain tumors were enriched in AT/RT hypomethylated sites, and only ZEB1 (6.7%) was enriched in MB hypermethylated sites (Supplementary Table 2 Table 2). All 11 pluripotent TFs, measured from pluripotent cells, were enriched in regions hypermethylated in MB.

DNA methylation is associated with PRC2 and EP300 binding in a tumor-type-specific manner
Because dysfunctional PRC2 and EZH2 were linked to AT/RT development 11 , we analyzed the association of PRC2 members with tumor-type-specific DMRs. Five of six (83%) TFs involved in PRC2 function and all seven (100%) TFs involved in H3K27 methylation [59][60][61] were enriched in sites hypermethylated in both MB and AT/RT or in MB ( Figure 2B). EZH2 was enriched in DMRs hypermethylated in AT/RT, hypermethylated in MB, or hypomethylated in MB, suggesting different modes of regulation ( Figure 2E, Supplementary Table 2). The enriched binding of EZH2 to hypermethylated regions could mean that the PRC2 complex guides DNA methylation instead of advancing H3K27 methylation, as reported in the literature previously 62 .
We continued our analysis with split GTRD-data. When examining PRC2 subunits, all the EZH2 binding sites, but those originating from other brain tumors, were enriched in regions hypermethylated in AT/RT ( Figure 2E). EZH2 was also enriched in regions hypermethylated in MB irrespective of the sample source, but only the EZH2 binding sites in carcinomas and other cancers significantly overlapped with MB hypomethylated regions. The results were somewhat different for another PRC2 subunit SUZ12: only its binding sites in PSCs and other samples were enriched in regions hypermethylated in AT/RT or hypomethylated in PLEX, whereas in MB hypermethylated regions, the SUZ12 binding sites in rhabdoid tumors, carcinomas, and other cancers were also enriched. Overall, the results suggest that DNA methylation is directed to some of the sites EZH2 occupies in pluripotent cells and during neural differentiation in a cancer-type-specific manner.
Next, we examined the SWI/SNF complex because it is altered in AT/RTs due to SMARCB1 or SMARCA4 inactivation and acts with EP300 to open chromatin during neural differentiation 12,29 . We found that the SMARCA4 binding sites in rhabdoid tumors were enriched in sites hypermethylated in MB and in sites hypomethylated in AT/RT. The opposite is true in other brain and neural cells: the SMARCA4 binding sites are enriched in sites hypermethylated in AT/RT and sites hypomethylated in MB (Supplementary Table 2, Figure 2E). This finding was supported by EP300 binding sites in neural precursors or other brain and neural cells, which were enriched in regions hypermethylated in AT/RT. By contrast, the EP300 binding sites in pluripotent cells were enriched in regions hypermethylated in MB. The results suggest that the neural SWI/SNF-EP300 target sites have been methylated in AT/RT. Furthermore, SWI/SNF binding sites are partly unique in MB and AT/RT, but the complex remains able to promote open chromatin in the regions to which it binds.

PRC2 subunits and neural differentiation -related TFs rarely co-localize
We performed TF binding co-localization analysis for regions hypermethylated in AT/RT or MB, or hypomethylated in MB or PLEX (Figure 2F, Supplementary Figures 6 and 7). Most TFs related to neural differentiation share their binding sites and are clustered, but significant co-localization (adjusted p < 0.01, one-sided Fisher's exact test) was rarely observed between PRC2 members and TFs related to neural differentiation ( Figure 2F, Supplementary Figures  6 and 7). In regions hypermethylated in AT/RT, the only TFs co-localizing with PRC2 subunits were the neural TFs BCHE and TFAP2A, which were co-localized with all the PRC2 subunits. Only co-localization with one PRC2 subunit was detected for MBD1 and PHOX2B in MB hypermethylated and hypomethylated regions, respectively ( Figure 2F, Supplementary  Figure 6). This result suggests that PRC2-related DNA methylation largely affects parts of the genome other than the sites that recruit neural TFs during differentiation, BCHE, and TFAP2A, representing possible TFs that act at their intersection.

DNA methylation -related regulation of genes involved in neural differentiation
We next evaluated the association between DNA methylation and gene expression. Principal component analysis (PCA) visualization showed that samples clustered to their respective groups based on gene expression (Supplementary Figure 8), although the separation was less clear than with DNA methylation (Figure 1B).
We used Venn diagrams to summarize genes that were similarly regulated in both microarray and sequencing experiments ( Figure 3A). Because gene expression is generally inhibited by DNA methylation 63 Figure 9), these genes were largely hypermethylated in AT/RT (226 had a high methylation in AT/RTs) and overexpressed in PLEX or MB or in both in respect to AT/RT (179 and 150 genes in array and RNA-seq data, respectively) ( Figure 3B, Supplementary  Figure 8). The number of AT/RT-specific genes was partly reduced because DNA methylation was decreased in MB or PLEX alone or because decreased DNA methylation in MB and PLEX led to increased expression in only one of them.
Thirty-two DM-DE genes were specific to a given tumor type, that is, differential methylation in the same genomic context (genomic neighborhood, promoter or enhancer) was observed in two tumor comparisons (Figure 3B-C, Supplementary Figure 8). Of the seven tumor typespecific genes that are hypermethylated and underexpressed, CXXC5 and TCEA3 are specific to AT/RT, RSPH9, CLDN23, C5AR1, and IGDCC3 specific to MB, and CHTF18 specific to PLEX. Of the 25 tumor type-specific genes that were hypomethylated and overexpressed, nine genes, including neural genes NEUROG1, EBF3, NBEA, BARHL1, and AMER2, are specific to MB. Of these, EBF3, BARHL1, and AMER2 have been reported to be overexpressed in MB [64][65][66] .
We separately visualized the genomic neighborhoods of CXXC5, TCEA3, NEUROG1, and NEUROD2. Several DMRs were present in the CXXC5 locus but only one was present in the other gene loci (Figure 3D, Supplementary Figures 10 and 11, Supplementary Table 3). The CXXC5 locus was also located in a large-scale methylation area (in Chr5q31.2, Figure  1F), which is hypermethylated in ATRT when compared with MB or PLEX. In addition, TCEA3 had a DMR in the promoter region, indicating classical regulation through promoter methylation. The binding sites of the SWI/SNF subunits SMARCA4 or BRD9 measured from AT/RT cells were detected in the CXXC5, NEUROG1, and NEUROD2 DMRs. Notably, EP300 binding sites measured from neural cells were also found around CXXC5, representing a possible neural SWI/SNF-EP300 target site described earlier (Figure 3D).
We identified NEUROG1 and NEUROD2 as DM-DE genes that were less expressed in AT/RTs (Figure 3B), and NEUROG2 and NEUROD1 as TFs whose binding sites were enriched specifically in AT/RT hypermethylated regions ( Figure 2D, Supplementary Figure  5). Because these transcription factors (TFs) are key regulators of neural differentiation 67,68 , we visualized their expression together with the expression of their target genes TBR1, RCOR2, BLHE22, NHLH1, NHLH2, HES6, DAB1, and CDK5R1 ( Figure 3E, Supplementary  Figure 10). Except for NEUROG2, which had the highest expression in AT/RT, the highest expression of all these genes was observed in MB. This result suggests that their induction, which should occur during neural differentiation, has been successful mainly in MB. TFs. The highest expression is typically detected in MB (*,**, and *** refer to P < 0.05, P < 0.01 and P < 0.001, respectively).

AT/RTs harbor unique and PSC-like DNA methylation patterns
On the basis of TF binding patterns, genes regulated by DNA methylation, and recent results in the field 69 , we wanted to analyze how the observed DNA methylation differences relate to normal neural cell differentiation. To achieve this objective, we obtained i450k DNA methylation data from pluripotent cells (PSCs) and primary fetal brain (FB) 70,71 and defined their methylation differences with respect to tumors and to each other. PSC and FB samples clustered to separate clusters in the tSNE visualization (Figure 4A), as expected. We then divided the DMRs between the tumors to 1) those that are tumor type-specific, PSC-like, or FB-like and 2) whether they were differentially methylated between PSC and FB (Supplementary Table 4).
DMR counts in different categories were visualized to estimate how AT/RT hypermethylated, MB hypomethylated, and MB hypermethylated DMRs are related to DNA methylation in normal cells (Figure 4B, left side). Among the 898 DMRs hypermethylated in AT/RT, 470 (52%) were similarly methylated in PSCs as AT/RTs and hypermethylated in PSCs when compared with FBs (group 5, Figure 4B); thus, they were linked to neural differentiation. Only 31 (3.5 %) of AT/RT hypermethylated DMRs were FB-like (group 8), highlighting the higher similarity of AT/RTs to PSCs than FB. However, there were also 300 (33%) AT/RT-specific DMRs, 124 (14%) of which were not related to differentiation (group 4), and the remainder (176, 20%) were also hypermethylated in PSCs when compared with FB (group 1).
By contrast, 116 of 123 (94%) MB hypermethylated DMRs and 699 of 957 (73%) MB hypomethylated DMRs were MB specific ( Figure 4B). Furthermore, 241 (25%) MB hypomethylated DMRs were FB-like and differentiation-related (group 18). Within MB specific regions, 445 (51%) of the MB hypomethylated regions were differentiation-related and hypermethylated in PSC (group 15), 321 (34%) of which were also PSC-like when compared with AT/RT (group 16). A similar trend was observed for 123 MB-specific hypermethylated regions, 31 (25%) of which were also more methylated in FB than in PSC (group 13). Thus, a high proportion of even tumor-type-specific DMRs change their DNA methylation during differentiation and are differentiation-related in a manner that underlines the similarity of MBs to FB and AT/RTs to PSCs.
These results further support our prior findings that a majority of DMRs are hypermethylated in AT/RT when compared with MB or PLEX. The majority of these sites are demethylated upon neural differentiation, and a large proportion of them are PSC-like. However, we also observed AT/RT-specific DNA methylation, part of which is not differentiation-related and represents unique targeting of DNA methylation in these malignant cells.
We calculated the proportion of large-scale DMRs ( Figure 1F) in each DMR category ( Figure  4B, right side) to determine whether the observed DNA methylation differences were more linked to focal or large-scale regulation of DNA methylation. Notably only 14 (11%) of MB hyper DMRs represented large-scale DNA methylation differences, whereas 384 (49%) and 505 (55%) of AT/RT hyper and MB hypo DMRs, respectively, were involved in large-scale methylation differences ( Figure 4B). Thus, MB hyper DMRs appear to target specific cytosines or focal areas, such as specific promoters and enhancers, whereas focal and largescale regulation of DNA methylation is observed for regions hypermethylated in AT/RT or hypomethylated in MB. c) The DMR category-related binding patterns revealed TFs involved in tumor-specific, normal cell-like, and differentiation-related regulation of DNA methylation. TF binding site enrichment was calculated separately for each DMR category (bottom). TFs were organized into the themes listed in Figure 2B. The dot is not marked when a given TF is not measured in a given GTRD category.

AT/RT-specific methylation is associated with PRC2
We analyzed TF binding separately in the DMR categories presented in Figure 4B to define TFs relevant to different regulatory programs. This step was performed as we previously observed that PRC2 subunits are only rarely co-localized with neural TFs in AT/RT hypermethylated sites ( Figure 2F) and with other TFs in MB hypermethylated sites (Supplementary Figure 6 and 7). The binding of EZH2 and other PRC2 members was significantly enriched in AT/RT-specific, but not in PSC-like or FB-like, AT/RT hyper DMRs ( Figure 4C). Therefore PRC2 might target DNA methylation in regions specific to AT/RTs, which are not linked to DNA methylation in sites shared with PSCs. In MB, PRC2 members were enriched in hypermethylated regions, but EZH2 was also enriched in hypomethylated regions that were FB-like.
The SWI/SNF subunits SMARCA4 and SMARCC1 are especially associated with DMRs that are AT/RT-specific and not differentiation-related. SMARCA4, whose binding sites in other brain and neural cells are enriched in AT/RT hypermethylated DMRs (Figure 2E), is enriched in sites that are AT/RT-specific and demethylated during neural differentiation. These sites might represent hypermethylation of the neural SWI/SNF-EP300 target sites discussed earlier.
Contrary to prior findings, a large proportion of neural TFs was enriched in PSC-like AT/RT hyper DMRs. This finding suggests that these PSC-like sites remain methylated, and thus inaccessible, in AT/RT, which has the potential to partly maintain AT/RT in a low differentiation state and promote malignancy. PSC-like methylation was not linked to PRC2. Notably, neural TFs BCHE and TFAP2A, the only neural TFs co-localized with PRC2 subunits in AT/RT hypermethylated sites (Figure 2F), were enriched only in AT/RT-specific AT/RT hyper-DMR categories (Figure 4C), linking them closely to PRC2-related DNA methylation.
In conclusion, our results suggest that AT/RT-specific methylation is linked to malfunctional SWI/SNF and PRC2 complexes. This phenomenon has the potential to drive oncogenic epigenetic regulation, which is unique to AT/RT. In addition, we detected regions that harbor binding sites for several neural regulators and remained similar to PSCs that were originally methylated via a PRC2-independent mechanism.

Discussion
Our study aimed to identify DNA methylation patterns, epigenetically regulated TFs, and abnormally expressed genes specific to AT/RTs, PLEXs, or MBs by using both array and sequencing data. Although the characteristics of the materials, e.g., the genomic coverage, sample amounts, and suitable algorithms differed, the key findings align.
Our results showed that many TFs that would bind to the sites hypermethylated in AT/RT are the drivers of neural differentiation. NEUROG2 and NEUROD1 are especially notable because they regulate a wide range of factors that affect neural differentiation 72 . Both belong to the basic helix-loop-helix (bHLH) gene family, members of which bind to unmethylated DNA and are thus sensitive to DNA methylation 37 . Furthermore, e.g. NEUROD1 target genes 72,73 OTX2 and EBF3, both with the highest expression in MB, have their binding sites also hypermethylated in AT/RTs, indicating the silencing of the whole differentiation cascade of genes, which has been described 52,74-77 . In our downstream analysis, NEUROG1 and NEUROD2 were also defined as DM-DE genes. Thus, NEUROG-NEUROD family TFs appear to be targets of DNA methylation-driven regulation at different levels, via decreased gene expression or altered DNA binding ability, both of which contribute to suppressed neural differentiation. Several other neural genes, such as NBEA, BARHL1, and AMER2 78-80 , were more methylated and less expressed in AT/RT than in MB. Our results are supported by prior notions that neuronal differentiation factors have lowered functionality in AT/RTs 11,81,82 .
Erkek et al. reported that active enhancers and PRC2-repressed chromatin regions are more depleted in AT/RT than in other brain tumors or the pediatric normal brain 11 . Instead, a higher proportion of the genome was in a quiescent state and thus did not have the measured chromatin marks 11 . It is likely that some of the quiescent regions acquired DNA methylation. PRC2 marks many genes that are methylated during cell differentiation into neural precursor cells 62 . EZH2 has been reported to interact with DNA methyltransferases 62,83 and to directly control DNA methylation 83 . In rhabdoid tumor cells, DNA methylation is decreased, and PRC2 members concomitantly released from the p16 locus upon SMARCB1 reexpression 84 , also suggesting an interplay between PRC2 and DNA methylation. Our results suggest that EZH2 contributes to AT/RT-specific hypermethylation, whereas the majority of neural TFs were not associated with PRC2. BCHE and TFAP2A were exceptions: they co-localized with PRC2 members and were solely enriched in AT/RT-specific regions. TFAP2A is known to regulate neural crest induction and cranial placode specification 85,86 ; thus, it might mark regions relevant for developmental branches being silenced in AT/RT, MB, and even in the brain in general. BCHE is expressed early in nervous system development and plays a role in neural stem cell development 49 . This could possibly contribute to the observed differentiation blockage. Further studies are necessary to clarify the roles of these genes.
PSC-like DNA methylation and gene expression patterns were shown to exist in human AT/RT samples 69 . We showed that areas similarly methylated in both AT/RTs and PSCs are differentiation-related, giving them the function and strengthening the idea that these are important regulatory regions controlling the fate of these cells. We also showed that many of these regions involve large-scale methylation differences, implying larger changes in the 3Dstructure of chromatin. What is the relationship between these PSC-like sites and AT/RTspecific, PRC2-related sites that do not co-localize with the binding sites of most neural TFs? One possibility is that altered PRC2-mediated regulation is halting AT/RT cells to a specific point in the developmental trajectory, prohibiting the demethylation steps necessary for further neural development. This notion is supported by a report showing that EZH2 inhibition suppresses an ESC-like expression pattern in SMARCB1 deficient tumors 69 . EZH2 inactivation or deletion prevents AT/RT development and growth 69,81,87,88 , making it a relevant regulator for these tumors. The PSC-like hypermethylation in AT/RT possibly reflects an unaccomplished differentiation-related DNA demethylation and could thus be a downstream effect, which does not need to overlap with PRC2 binding.
In summary, our analysis revealed DNA methylation patterns specific to AT/RT, MB, and PLEX and associated them with gene regulation, cell differentiation, and malignancy. AT/RTs carry very few genetic alterations, which target the SWI/SNF epigenetic regulatory complex; therefore, aberrant epigenetic regulation is relevant for these tumors. Because other studied epigenetic marks are mainly depleted 11 , whereas DNA methylation is increased in AT/RT, DNA methylation is likely to play a role in AT/RT malignancy, at least partly by halting normal neural development. Further research will show whether treatment options affecting differentiation blockage could generate positive outcomes for patients suffering from this devastating disease.

Materials and Methods
Reduced representation bisulfite sequencing DNA was isolated from the frozen samples using a QIAamp DNA Mini kit (QIAGEN) with RNAse treatment and from the FFPE sample with a turXTRAC FFPE DNA kit (Covaris). Library construction and sequencing were performed at the Beijing Genomics Institute (BGI), Hong Kong. Samples were sequenced using Illumina HiSeq2000 technology. Paired-end sequencing of 50 bp was used. The data goal was 5G.
RNA sequencing RNA was isolated using a mirVana isolation kit (Invitrogen). Library construction and sequencing were performed at Novogene, Hong Kong. Samples were sequenced using Illumina HiSeq2000 technology. Paired-end sequencing of 150 bp was used. The data goal was 20 million raw reads per sample.
Preprocessing and analysis of DNA methylation datasets i450k data. The full CNS tumor reference cohort established by Capper et al. was filtered, batch-effect corrected, and normalized as described in their study 21  Mapping against hg38 (UCSC) reference and mapping quality control without duplicate removal was performed using Bismark v0.19.1 96 . The quality information of all these steps was summarized over the dataset using MultiQC 97 .
Methylation percentage calling and differential methylation analysis were performed using methylKit v1.8.1 91,98 . CpGs with coverage of less than 10 reads or hitting into scaffolds, sex chromosomes, or mitochondrial DNA were removed. The CpGs were tiled into 1000 bp nonoverlapping regions, i.e., tiles, which were used for DMR calling with overdispersion correction. Regions with methylation percentage difference greater than 25% and q value (FDR) smaller than 0.05, were extracted and then annotated similarly as i450k data.

Dimensionality Reduction.
The b values for all AT/RT, MB, and PLEXUS samples were aggregated on RRBS tiles based on lift-over hg38 probe coordinates. This step was performed by taking the complete overlaps and then calculating the mean beta value for probes hitting each tile 92 . The scales in the dataset were unified, and the 10 000 most varying regions were used as inputs for various dimensionality reduction algorithms.

TF binding analyses
Preliminary Statistical Testing. TF binding site (TFBS) enrichment analysis was performed using one-sided Fisher's exact test (FET). TFBS data were obtained from GTRD v. 19.04 metaclusters 35 . Before testing, DMRs with beta fold change >= 0.25 were filtered by overlap operations between the comparisons to obtain the regions specific to each cancer with a concordant direction of methylation change. Hypermethylated and hypomethylated regions were tested separately. For the i450k data, the background set was generated from previously filtered 396700 probes, and for the RRBS data, the methylKit tiles were used as a background.
When performing the tests with i450k data, the hg38 coordinates of each 50 bp probe overlapping with DMRs were first obtained. These were extended to 500 bp in both directions, and the methylation fold-change of the original DMR was used to represent the methylation difference in each extended probe. Regions were considered cancer-specific if they overlapped with at least 600 bp between comparisons, and with RRBS, because of the nonoverlapping nature of the DMRs, the exact overlaps could be used.
FET p values were adjusted by Benjamini-Hochberg (BH) correction 99 . TFs with adjusted pvalues smaller than 0.05, were considered significant. Sets of validated TFs with significant enrichment in the analogous DMR-sets, that is, in hypermethylated or hypomethylated cancer regions of the i450k and RRBS experiments, were constructed. Each validated TF had been studied in the literature, and because common themes existed, they were manually annotated into 11 categories when relevant.
Customized TFBS enrichment analysis with cell-type splitting. GTRD annotations for cell types linked to a given TFBS were further examined and manually divided into 11 groups (Supplementary Table 2). The GTRD was then split based on the terms of the groups, and these groups were used as input material for the FET to replace the full GTRD.
Studying average methylation levels at binding sites. Methylation of the relevant i450k probes near the binding sites of the validated TFs (maximum gap 500 bp) was visualized as violin plots. The probes were selected by pooling the cancer regions and then finding complete overlaps with the hg38 probe coordinates. The beta values over each tumor subclass were aggregated probewise using the mean.
TFBS co-localization. The co-localization of validated TFs was studied with TF binding sites hitting target areas (hypermethylated AT/RT regions, hypermethylated MB regions, hypomethylated MB regions, or hypomethylated PLEX regions). TFs were tested pairwise against each other to construct four-field matrices for each case. Regions with 1) no binding site from either TF, 2) binding sites from both TFs, and 3-4) binding sites from one TF but not the other were counted and the significance was tested using one-sided FET and stored as symmetric TF matrix of P values. These P value matrices were corrected using the BH method.
Adjusted P values were subsetted by the TF names of the validation group (AT/RThyper-TFs, MBhyper-TFs, MB-hypoTFs, PLEXhypoTFs) and visualized using the ComplexHeatmap package 100 .  106 . Genes with BH adjusted p values smaller than 0.05, and log2-transformed fold changes (absolute change) of at least 1 were considered significant.

Differential expression analysis for expression datasets and integration with methylation data
Integration. The DE results were merged with the annotated DMRs by symbolic gene names to find cases where the direction of the methylation change in a regulatory region (i.e., promoter, enhancer, genomic neighboorhood) was opposite to the gene expression change. GEO expression array results were integrated with i450k DMRs, and RNA-seq based results were integrated with RRBS DMRs from matched cases to provide data for validation. Validation was performed by extracting the genes that showed similar behavior in both approaches. Since annotatr uses transcript-level annotations, the validated hits, e.g., for promoters, represent DE genes that show regulatory methylation in at least one of its transcript specific promoters in both datasets. Additional filtering and validation steps were performed for the enhancers and genomic neighborhoods, described later. The results of the experiments were summarized using Venn diagrams and an oncoprint-like visualization using the ComplexHeatmap package 100 .
A recent regulatory genomics workflow was adapted to integrate DMRs and DE genes with precalculated promoter-enhancer annotations provided by the FANTOM5 database 94,107 . DE-DMR pairs in genomic neighborhoods were queried between i450k and RRBS experiments with a maximum gap of 5 kb, and the regions inside this "validation distance" with similar methylation direction were extracted. Moreover, to ensure that a given DMR can spatially interact with the TSS of a given gene, the results were studied with TAD information from Hi-C experiments 108,109 (http://3dgenome.fsm.northwestern.edu/). The region between the DMR and TSS of each record overlapped with TADs originating from five cell types: SKNDZ, Cortex DLPFC, Hippocampus, H1-ESC, and H1-NPC 108 . Genes without any TAD boundaries in any of their transcript TSS-DMR pairs were extracted.
Linking DMRs to TADs for large-scale DNA methylation analysis DMRs were overlapped with Cortex DLPFC and H1-NPC TADs by using Bedtools GroupBy v2.25.0 and in-house scripts to calculate hyper-and hypomethylated records in each TAD 108,110 . The input material was either a broadened (+-500bp) probe extracted from the i450k DMRs or the methylKit DMRs of the RRBS analysis. The resulting TADs for each comparison were filtered such that at least five similarly directed probes/tiles with both the i450k and RRBS approach should hit the TAD and, in addition, less than 1/10 of the hits can undergo methylation changes in the opposite direction. Moreover, the distance between the first and last probe or tile in the TAD had to be over 50 kb. The filtered TADs from different cell lines were pooled and reduced to produce single TAD coordinates for each comparison and then visualized with karyoploteR 111 .
Comparing brain tumors to ESCs and primary fetal brain Two ESC samples, nine primary FB samples (GSE116754) 71 , and 25 ESC samples (GSE31848) 70 , were preprocessed using the Capper et al. data 21 . Beta-value density plots and tSNE-clustering for the 10 000 most varying probes were used to check that samples were separated based on tumor and normal sample type. The DMRs were called with DMRcate by comparing AT/RT, MB, and PLEX samples against ESC and FB cells. In addition, ESC samples were compared with FBs.
From these seven comparisons, hypermethylated and hypometylated DMRs were extracted (q < 0.05, abs. average beta fold change >= 0.25 and length between 100 and 5000 bp). The region lists were overlapped with the same cancer-specific regions extracted for TFBS analysis (min 1 bp overlap). As a result, the experiment was summarized in a "developmental DMR matrix" of 2301 non-overlapping regions showing which of the comparisons or cancers each region overlapped. The resulting region sets were categorized based on the differences detected in tumor-tumor, tumor-normal, and ESC-FB comparisons. TFBS enrichment analysis was performed for these categories by extracting and extending the probes hitting the regions and testing them against GTRD using one-sided FET, as described.
acknowledge the CSC -IT Centre for Science, Finland, for providing computational resources and the Biocenter Finland (BF) and Tampere Genomics Facility for the service.