Loss of the H4 lysine methyltransferase KMT5B drives tumorigenic phenotypes by depleting H3K27me3 at loci otherwise retained in H3K27M mutant DIPG cells

DIPG are characterised by histone H3K27M mutations, resulting in global loss of the repressive mark H3K27me3, although certain key loci are retained. We recently identified subclonal loss-of-function mutations in the H4 lysine methyltransferase KMT5B to be associated with enhanced invasion/migration, but the mechanism by which this occurred was unclear. Here we use integrated ChIP-seq, ATAC-seq and RNA-seq on patient-derived, subclonal and CRISPR-Cas9-KD DIPG cells to show that loss of KMT5B/C causes depletion of these retained H3K27me3 loci via changes in chromatin accessibility, causing a raft of transcriptional changes which promote tumorigenesis. De-repression occurred at bivalent loci marked by H3K4me3, driving increased transcriptional heterogeneity and elevated gene expression associated with increased invasion, abrogated DNA repair and mesenchymal transition, along with a markedly altered secretome. These data suggest a previously unrecognised trans-histone (H4/H3) interaction in DIPG cells with a potentially profound effect on their diffusely infiltrating phenotype.


INTRODUCTION
Diffuse midline glioma with H3K27M mutation (DMG) comprise diffuse intrinsic pontine glioma (DIPG, arising in the pons) and high grade glioma occurring in other midline regions such as the thalamus (Jones and Baker, 2014). They present predominantly in young children (ages 5-10), and have a median survival of 9-12 months, with virtually all patients succumbing to the disease within 2-3 years of diagnosis . Radiation provides a transient response, however chemotherapeutics show little effect, and to-date targeted therapies have not provided any significant survival advantage (Jones et al., 2016). They are biologically distinct from histologically similar tumours arising in adults, harboring a key driver mutation, K27M, in genes encoding either the canonical histone H3.1 or variant H3.3 (Schwartzentruber et al., 2012;Wu et al., 2012). In a small proportion of cases, H3K27me3 is instead targeted through upregulation of the polycomb repressor complex (PRC2)-interacting protein EZHIP (Castel et al., 2020).
A major challenge in developing novel therapeutic interventions for DIPG is the extensive intra-tumoral heterogeneity present within an individual patient's specimen (Hoffman et al., 2016;Hoffman et al., 2019;Salloum et al., 2017). We identified for the first time genotypicallyand phenotypically-distinct cancer stem-like cells to be present in these tumours, and to be retained during in vitro and in vivo model propagation (Vinci et al., 2018). Notably, we discovered novel inactivating mutations in the gene KMT5B, an H4 lysine methyltransferase, to be present at subclonal frequencies in DIPG specimens (<1%), and derived subclonal paired models both with and without inactivating R187* mutations (Vinci et al., 2018). KMT5B inactivation appeared to confer an enhanced migratory and invasive phenotype both in vitro and in vivo, in addition to a differential sensitivity to DNA repair inhibitors (Vinci et al., 2018), although the precise mechanism was unclear. Critically, these subclones were found to interact to promote tumorigenic phenotypes, whereby co-culture of KMT5B-deficient cells with a non-motile isogenic subclonal counterpart substantially enhanced these properties, indicating the importance of dysregulation of H4 modifications in DIPG, even when it occurs in only a small proportion of cells (Vinci et al., 2018). KMT5B (also known as SUV420H1) and KMT5C (SUV420H2) have overlapping functions catalysing the addition of methyl groups to monomethylated histone H4K20, with a preference for H4K20me2 and H4K20me3, respectively (Jorgensen et al., 2013;Schotta et al., 2008).
Disruption of these marks appear to play a key role in the maintenance of genomic integrity (Bromberg et al., 2017;Jorgensen et al., 2013;Schotta et al., 2008), and loss-of-function mutations increasingly reported in other tumour types (Brohm et al., 2019). Although less well understood, a role for H4K20 methylation and invasion has also been reported in breast cancer (Yokoyama et al., 2014). How the control of H4 modifications impact those on neighbouring histones is also unclear, however has attracted a growing interest (Bhanu et al., 2019;Bo et al., 1988;Fingerman et al., 2008;Huang et al., 2015;Liu et al., 2019). In vitro studies in HeLa cells have shown that nucleosomes containing the histone variant H2A.Z are enriched for H4K20me2, and that regulated replication efficiency through a KMT5B-dependent highly consistent across samples, with 15.8% loci concordant between all five cells ( Figure   1G), and including intergenic as well as genic regions ( Figure 1H). The intergenic regions were enriched for transcription factor binding sites such as NRF1 (Supplementary Figure S2A) and, particularly, TP73 (Supplementary Figure S2B). Gene ontology analysis of the genes at these loci revealed a significant enrichment of processes involved in neural development and communication (Supplementary Figure S2C). Thus, KMT5B/C loss redistributes H4K20me3 at important genes and regulatory regions in DIPG (Supplementary Table S1).

Retained H3K27me3 loci in H3K27M DIPG are lost in KMT5B/C deficient cells
In addition to the changes in H4K20me3 in response to KMT5B/C loss, we observed that whilst there were tight correlations between all cells at loci bound by H3K4me3, there were distinct patterns in our models with other marks (Supplementary Figure S2D), both in the H3K27M (Supplementary Figure S2E) and EZHIP-overexpressing backgrounds (Supplementary Figure   S2F). Most notably, there was a marked difference in F8 cells and those lacking the H4-lysine methyltransferases in respect of regions occupied by H3K27me3 (Figure 2A). There was a marked reduction in H3K27me3 at the transcriptional start sites in KMT5B/C-deficient cells ( Figure 2B), far more pronounced than the changes seen with H4K20me3 ( Figure 2C).
Although global loss of H3K27me3 is a hallmark of H3K27M mutation (Bender et al., 2013;Chan et al., 2013;Lewis et al., 2013), certain loci retain occupancy (Bender et al., 2013;Chan et al., 2013;Funato et al., 2014). We saw a remarkable ablation of these retained H3K27me3 marks seen in the KMT5B-deficient cells, and to a lesser degree the KMT5C-deficient and double-KD ( Figure 2D). This was true at all levels of H3K27me3 occupancy in F8, with 2553/4641 (55%) of bound loci lost in F10 and/or K5B cells ( Figure 2E). Notably, this transhistone effect on H3K27me3 had a profound effect on transcription, with the inhibitory effect on gene expression exerted by H3K27me3 in KMT5B wild-type F8 cells absent with KMT5B-KD, although H3K4me3 was unchanged ( Figure 2F). To investigate whether these loci were cell line-dependent, we assessed H3K27me3 levels in these regions in further H3.3K27M (n=3) and H3.1K27M samples (n=3) (Castel et al., 2018) (Figure 2G). There was a high degree of concordance in retention of these loci in 5/6 tumours, more so in H3.3 than H3.1 K27M tumours, and with a total of 1626/2553 (64%) common with those in our models ( Figure 2H) (Supplementary Table S2). Importantly, some of these 'retained' loci were not subsequently lost with KMT5B/C deficiency, and included key genes reported to play an important role in DIPG tumorigenesis such as CDKN2A/B (Mohammad et al., 2017) ( Figure   2I), though curiously this locus was lost in B193.

KMT5B-driven transhistone interactions drives aberrant expression of bivalent genes marked by H3K4me3
The genomic regions in which otherwise retained H3K27me3 occupancy was lost in the absence of KMT5B/C encompassed both relatively large loci spanning several genes, such as GLI1 / ARHGAP9 ( Figure 3A Table S3).
Such an effect was also noted by pharmacological inhibition of KMT5B/C by the selective inhibitor A-196 (Bromberg et al., 2017), which causes reduction in global H4K20me2 and H4K20me3 ( Figure 3G) in the absence of direct effects on enzyme protein levels Analysing RNAseq data on cells treated with the drug for 24hrs at 1µM, we found 345 shared differentially expressed genes in common with the double-KD cells, with a higher degree of overlap in those down-regulated ( Figure 3H). Such genes ( Figure 3I) were significantly enriched for process associated with cell adhesion, integrin signalling and interleukin secretion, amongst others, by gene ontology analysis (Supplementary Figure S3I). Thus, the KMT5B/C-associated trans-histone effects on gene expression appear to be directly mediated by H4K20me2/me3 deposition.

H3/H4-mediated gene expression changes drive functionally relevant processes in DNA repair and tumour cell invasion
To explore the functional consequences of the observed epigenomic/transcriptomic changes, we first focussed on the DNA damage response, a process known to be mediated by H4K20me2/me3. The histone acetyltransferase KAT5 (TIP60) was one the genes identified as having lost H3K27me3 in our models, (Supplementary Figure 4A), and treatment with 2Gy ionising radiation (IR) resulted in a significant reduction in the number of 53BP1 foci observed in KMT5B-deficient cells (p=0.0376 -<0.0001, ANOVA) (Supplementary Figure 4B). These models were also significantly more sensitive to DNA repair modulation such as the PARP inhibitor talazoparib (p=0.0209 -<0.0001, ANOVA) (Supplementary Figure 4C), as we had previously demonstrated for our single cell-derived subclones (Vinci et al., 2018).
We next sought to explore whether the increased invasive capacity was a result of secreted factors produced in the KMT5B-deficient cells, and performed a comprehensive secretome analysis by LC-MS/MS ( Figure 4J), identifying 170 differentially secreted proteins in F10 and/or K5B (Supplementary Table S4). Although there were common proteins identified in the two KMT5B-deficient models, the majority of these were unique to F10, which has diverged in vitro, and plays a role in signalling to F8 cells in heterogeneous cultures (Vinci et al., 2018).
Proteins observed in both the lines included know secreted factors previously associated with a more invasive phenotype in this context, such as MMP2, VCAN, SERPINH1, ITGAV and IGFBP3 ( Figure 4K), and were strongly enriched in processes linked to extracellular matrix reorganisation/degradation and cell adhesion, amongst others ( Figure 4L).

Changes in chromatin accessibility drives gene expression via distinct cis-regulatory elements
Having determined the profound and functionally relevant effects caused by KMT5B/C loss in DIPG through bulk profiling, we next explored the differences in our model systems at the single cell level. Using UMAP projections from 10X expression data (n = 16552 cells, median = 2798 per library), we were able to discretely resolve the libraries not only from B193 and the single cell-derived F10 culture, but also our KMT5B/C KD cells from the parental F8 cells, particularly the double-KD ( Figure 5A). Overlaying gene expression signatures relating to H3K27me3 targets ( Figure 5B), we observed a significant over-expression in KMT5B/Cdeficient cells compared to F8, in keeping with bulk profiling ( Figure 5C), though not homogenous across all cells from our models. This heterogeneity was particularly pronounced in relation to gene signatures associated with glioma cell identity (Neftel et al., 2019). As well as the substantial plasticity between mesenchymal (MES), astrocytic (AC), oligodendrocyte precursor-like (OPC) and neural precursor-like (NPC) cell states observed in our unselected B193 glioma stem-like cell culture, we also observed marked differences in distinct cell subpopulations in our KMT5B and/or KMT5C-KD clones (Supplementary Figure S5A). In particular, there was a shift in some cells towards OPC and/or NPC-like identity associated with the loss of the H4 methyltransferases (p<0.0001, ANOVA) (Supplementary Figure S5B).
Of note, inferred DNA copy number profiles derived from 10X Genomics single-cell RNA sequencing (scRNA-seq), demonstrated largely consistent genomic profiles of our CRISPR-KD models, though some divergence such as an absence of 1q gain in F10 (Supplementary Figure S5C).
To determine the extent to which these changes were associated with chromatin dynamics, we carried out single cell ATAC-seq, also using the 10X platform (n = 6310 cells, median = 1020 per library). Here the libraries were resolved to an even greater extent by UMAP projection, particularly among the single gene KD cells ( Figure 5D). Again focusing on H3K27me3 targets, we observed a distinct increase in open chromatin in F10 and K5B compared to F8 ( Figure 5E), which could also be observed more globally (Supplementary Figure S5D). Most strikingly we observed key differences in co-accessibility of cis-regulatory domains associated with KMT5B loss. Using Cicero (Pliner et al., 2018) to identify coaccessible pairs of DNA elements, we identified numerous differentially expressed genes in F10 and K5B compared to F8 to be associated with open chromatin at distinct regulatory elements which seemingly account for their transcriptional activation. These included key genes from our previously identified gene signatures associated with mesenchymal transition, such as SERPINH1 ( Figure 5F) and MMP24 ( Figure 5G). These and similarly differentially cis-regulated genes demonstrated heterogenous accessibility profiles at the single cell level, demonstrating the wide-ranging effects on chromatin architecture driving differential expression wrought by KMT5B loss (Supplementary Table S5).

KMT5B loss drives transcriptional diversity, and confers poor prognosis in paediatric high grade glioma
Finally, we further carried out SmartSeq2 single cell RNA sequencing in order to assess transcriptional heterogeneity by examining full-length transcripts. These data, although derived from fewer cells than 10X (n=746, median=131 per library), were able to discriminate the various cell model libraries to a similar extent (Supplementary Figure S6A), and showed similar patterns of inferred DNA copy number change (Supplementary Figure S6B). Applying the GINI index to the expression data reinforced the transcriptional heterogeneity in this sample compared to the clones, but also demonstrated a shift towards a more diverse transcriptional profile in KMT5B/C-deficient cells compared to F8 ( Figure 6A). This was even more apparent by calculating Shannon diversity indices for the different libraries, with the specifically KMT5B-deficient libraries (and not KMT5C) showing significantly more heterogeneity than the wild-type (p< 0.001 for F10, K5B, KCo, B193 compared to F8, t-test) ( Figure 6B). This was found to be particularly true for certain key genes, such as NCAN, and VGF, in which expression was markedly different across cellular subpopulations, unlike housekeeping genes such as GAPDH provided as a comparator ( Figure 6C) (Supplementary Table S6).
To determine whether this association held true in patient tumour samples, we reverted to publicly available bulk RNAseq data, and explored the associations between KMT5B expression and Shannon diversity. Across all paediatric high grade glioma, there was a significant reduction in transcriptional heterogeneity in samples with high levels of KMT5B (p>0.0001, t-test) ( Figure 6D), recapitulating the single-cell data. Importantly, paediatric high grade glioma with lower levels of KMT5B had a significantly poorer prognosis than those with high expression (p=0.0126, log-rank test) ( Figure 6E). Although the numbers were smaller, the same held true when considering H3K27M-DMG alone (p=0.05, log-rank test) ( Figure   6F,G), demonstrating the importance of transcriptional heterogeneity in providing the evolutionary substrate for overcoming therapeutic interventions. This was again in contrast to KMT5C, whereby diversity increased with higher expression, and there was no significant impact on clinical outcome (Supplementary Figure S6C-F).
In summary, we have identified a novel mechanism whereby loss of an H4 methyltransferase results in changes in chromatin accessibility which conveys an ablation of H3K27me3 binding at bivalent loci otherwise retained in H3K27M DIPG cells. This results in profound transcriptional changes associated with pro-tumorigenic processes including mesenchymal transition, invasion and cellular heterogeneity, and highlights the role of transhistone mechanisms in these childhood brain tumours of unmet clinical need ( Figure 7).

DISCUSSION
H3K27M DMG/DIPG maintain a cellular hierarchy reflecting the stalled developmental origins of these tumour subtypes (Filbin et al., 2018;Jessa et al., 2019). We have shown the genotypic and phenotypic divergence of these cancer stem-like cells, whereby subclones derived from them have distinct tumorigenic capabilities acting in concert; the heterogeneity itself appears to be the evolutionary unit of selection (Little et al., 2012;Vinci et al., 2018). Specific subclones acquire unique genetic aberrations, and generate and respond to diverse signals into the tumour milieu -driving infiltration, early escape from the pons, and colonisation of distant metastatic sites (Qin et al., 2017;Venkatesh et al., 2015;Vinci et al., 2018).
Mechanistically, the manner in which these cells arise, and interact, is driven by a complex interplay between genetic and epigenetic mechanisms (Jones, 2020;Jones and Baker, 2014;Sturm et al., 2014). Genetic insults arising in precisely defined developmental contexts lock in profound changes in chromosomal architecture, chromatin accessibility and gene expression, which defines cellular states (Filbin et al., 2018;Nagaraja et al., 2017). Here for the first time, we begin to unravel the mechanism by which this takes place, whereby loss-of-function mutations in histone H4 modifying enzymes cause profound changes in gene expression via altered chromatin accessibility and H3K27me3 binding.
The lysine methyltransferases KMT5B (SUV420H1) and KMT5C (SUV420H2) catalyse H4K20 di-and tri-methylation, a post-translational modification associated with constitutive heterochromatin, telomeres and centromeres (Brohm et al., 2019); it is also involved with euchromatic gene silencing, and plays a key role in the maintenance of genomic integrity via DNA repair, replication and chromatin compaction (Jorgensen et al., 2013;Schotta et al., 2008). Although originally described in subclonal proportions (Schwartzentruber et al., 2012;Vinci et al., 2018), our identification of a DIPG sample with a clonal truncating mutation in KMT5B demonstrates the importance of this novel tumour suppressor gene in this context. In our 'natural' isogenic subclones, engineered KD cells, and in bulk clonally mutant cultures, we observe the rewiring of the epigenome caused by loss of the H4 methyltransferase in the background of H3K27M / EZHIP-overexpressing DIPG cells. A key feature of these tumours is the global loss of the repressive H3K27me3 mark, though the retention of this modification at key loci has been shown to regulate key oncogenic pathways (Bender et al., 2013;Chan et al., 2013;Funato et al., 2014;Larson et al., 2019;Mohammad et al., 2017). These retained loci were found to be consistent across individuals tumour specimens, with our data suggestive of an H4-related mechanism maintaining the modifications. Loss of this additional fine-control appears to drive profound transcriptional changes promoting tumorigenesis, particularly in relation to cell identify, mesenchymal transition, and invasion/migration; even if only occurring in a subset of tumour cells, a co-operative advantage may be conferred on neighbouring cells through the upregulation and exocytosis of important secreted factors (Vinci et al., 2018).
The importance of such trans-histone effects in cancer is only recently garnering attention, some 20 years after the discovery of a 'trans-tail' regulation of histone modification, linking H2B ubiquitination to H3 methylation and gene silencing in yeast (Briggs et al., 2002;Sun and Allis, 2002). Whilst most attention to this cross-talk has focussed on the links between H2B ubiquitination and H3K79 trimethylation by DOT1L (Anderson et al., 2019;Valencia-Sanchez et al., 2021;Valencia-Sanchez et al., 2019;Worden et al., 2019), H3K4me3 has also been implicated via COMPASS . The H3K4 methyltransferases MLL3/4 have been shown to bind to histone H4 during stem cell differentiation (Liu et al., 2019), and moreover MLL1-mediated histone H3K4 methylation has been found to act synergistically with histone H4K16 via the histone acetyltransferase MOF (Dou et al., 2005).
H4K20me3 has been found to co-localise with H3K9me3 in proliferating myoblasts (Bhanu et al., 2019) and ES cells (Xu and Kidder, 2018), whilst bivalent chromatin domains consisting of H3K4me3 and H3K27me3, enriched at developmental genes that are repressed in ES cells but active during differentiation, were found to have a simultaneous presence of H4K20me3 (Liu et al., 2019). A comparative genomics approach highlighted a substantial set of genes enriched with H3K27me3 and H4K20me3 in baboon subventricular zone (SVZ)-derived NSCs are altered in human GBM, suggesting a mechanism to control gene expression and proliferation in the progenitor cell compartment, protecting against abnormal cell cycle entry (Rhodes et al., 2016).
The profound shift in co-accessibility of DNA elements observed with KMT5B/C loss demonstrates the complexity of cis-regulation in these tumours. Heterogenous chromatin landscapes have recently been shown to play an important role in glioma stem-like cell diversity in adult glioblastoma (Guilhamon et al., 2021), whilst in H3G34R/V mutant glioma, it was recently shown that the specific lineage context of these tumours may facilitate PDGFRA co-option through a chromatin loop connecting PDGFRA to GSX2 regulatory elements, promoting PDGFRA overexpression . A fundamental consequence of the altered gene expression landscape observed with KMT5B loss is a significant increase in the transcriptional heterogeneity, in patient samples as well as model systems, as has also been observed with the histone demethylase KDM5B in breast cancer (Hinohara et al., 2018). This likely provides the substrate for cellular phenotypic diversity and treatment escape, and confers the poor prognosis found in patients with these alterations.
These data add a layer of complexity to devising treatment strategies for these oncohistone-         Chromatin was isolated by the addition of lysis buffer, followed by disruption with a Dounce homogenizer. Lysates were sonicated and the DNA sheared to an average length of 300-500 bp. Genomic DNA (Input) was prepared by treating aliquots of chromatin with RNase, proteinase K and heat for de-crosslinking, followed by ethanol precipitation. Pellets were resuspended and the resulting DNA was quantified on a NanoDrop spectrophotometer.

Supplementary
Extrapolation to the original chromatin volume allowed quantitation of the total chromatin yield.
An aliquot of chromatin (30 ug) was precleared with protein A (or G; for H4K20me3) agarose beads (Invitrogen). Genomic DNA regions of interest were isolated using 4 µg of antibody against H4K16ac (#39167; 5µl), H4K20me3 (#91107; 5µl), H3K27me3 (#39155;4µl), H3K9me3 (#39161; 5µl), and H3K4me3 (#39159; 3µl). Antibody against H2Av (0.4 ug) was also present in the reaction to ensure efficient pull-down of the spike-in chromatin (Egan et al., 2016). All antibodies were from Active Motif (Carlsbad, CA, USA). Complexes were washed, eluted from the beads with SDS buffer, and subjected to RNase and proteinase K treatment. Crosslinks were reversed by incubation overnight at 65 C, and ChIP DNA was purified by phenol-chloroform extraction and ethanol precipitation. Illumina sequencing libraries were prepared from the ChIP and Input DNAs by the standard consecutive enzymatic steps of end-polishing, dA-addition, and adaptor ligation. Steps were performed on an automated system (Apollo 342, Wafergen Biosystems/Takara). After a final PCR amplification step, the resulting DNA libraries were quantified and sequenced on Illumina's NextSeq 500 (75 nt reads, single end).
PCR duplicates were removed using PICARD tool, and the read extension was counted using GAlignments. Significant peaks were identified using Model-Based Analysis for ChIP-Seq MACS version 2 (Zhang et al., 2008) using broadpeak with a p value cut-off of 10 -9 . We counted merge peaks for each mark by counting all the BAMs within the genome region.
Peaks were annotated using ChIPeakAnno) with promoter region classified at peaks within +/-2.5kb of a transcription start site (TSS). We down-sampled the tags from each mark and each sample to achieve the normalized tag counts.

RNA-seq
RNA was isolated from cell pellets using the RNeasy Mini Kit protocol (Qiagen,74104), and measured using a Nanodrop spectrophotometer. RNA integrity was analysed and quantified using 2200-Tapestation (Agilent). A minimum of 150 ng of RNA was sequenced at Eurofins Genomics after generation of strand-specific cDNA libraries by purification of poly-A containing mRNA molecules, mRNA fragmentation, random primed cDNA synthesis (strand specific), adapter ligation and adapter specific PCR amplification. Library was then run on an Illumina Hiseq or Illumina Novaseq sequencer at 2x150 bp read length generating at least 30 million reads.
Gene expression from bulk RNASeq was quantitated in 14918 known protein-coding genes using HTSeq based upon an Ensembl gene model. Gene Set Enrichment Analysis (software.broadinstitute.org/gsea) was performed using the fgsea R package based upon pairwise comparisons. Differential expression analysis used raw HTSeq counts compared using DESeq2 and expression values were summarised both as rlog transformed expression values and RPKM.

Drug treatment
Parental and genetically engineered cell lines were seeded at 3000 cells density per well into black wall 96-well plates (#655090, Greiner) with 50ul of complete media, 6 replicates per condition. Cells were incubated at 37C, 5% CO2. Three days after seeding, 50ul of media with either A-196 (KMT5B/C inhibitor), Talazoparib (PARP inhibitor) or DMSO were added to each well at a final concentration range between 0-10µM for A-196 and 0-1µM for Talazoparib. The cell proliferation was assessed 5 days after treatment by measuring viable cells in culture based on quantitation of the ATP present using CellTiter-Glo® Luminescent Cell Viability Assay (Promega) according to the manufacturer's instructions using a Victor X5 Multi-label plate reader luminescence protocol (Perkin Elmer, Waltham, MA, USA). Luminescence data from each well was normalised to the median signal from DMSO-containing wells to calculate the survival fraction.
Immunofluorescence DIPG cells were grown on laminin pre-coated 8 well-chamber slides (Cole Palmer). Cells on chamber slides were fixed with 4% paraformaldehyde at room temperature for 15 minutes, quenched with 50mM glycine and washed with phosphate buffered saline (PBS) solution.
Invasive neurospheres were collected embedded in Matrigel after 3 days of invasion assay (see method below) and were fixed in conical tube with an excess of 4% paraformaldehyde for 5h at 4°C; then washed, permeabilised with 1% TritonX-100 PBS solution for 45min. Cells were permeabilised with 0.5% Triton X-100 solution for 10 minutes at room temperature. Cells and neurospheres were then blocked with appropriate serum according to the species of secondary antibody for 1 hour at room temperature. Primary antibodies directed against 53BP1 (#CS-4937; Cell signaling) were added to the cells and incubated overnight at 4°C; H4K20me3 at 1/500 (#ab9053; Abcam), H4K20me2 at 1/1000 (#ab9052; Abcam), H4K20me1 at 1/400 (#ab9051; Abcam) were added to the neurospheres for 48h at 4C. Samples were then washed in PBS three times and incubated with Alexa Fluor488-conjugated secondary antibodies for 1 hour at room temperature for the cells or 48h at 4C for the neurospheres.
Nuclei were counterstained with DAPI for all samples. The neurospheres were transferred into 8-well chamber slides. All samples were mounted with Vectashield (Vector Laboratories) and examined using a Zeiss LSM700 confocal microscope. The number of 53BP1 foci were quantified using the function Tissue Studio IF from the software Definiens Software XD64.
Invasion 50-100µm diameter neurospheres were harvested after three days culture and invasion assays performed essentially as described (Vinci et al., 2015;Vinci et al., 2013;Vinci et al., 2018). Briefly, a total of 100μl medium was removed from ULA 96-well round-bottomed plates and 100μl of matrigel gently added to each well (6 replicates). Plates were incubated at 37°C, 5% CO2, 95% humidity for 1hr. Once the matrigel solidified, 100μl/well of culture medium was added on top. Starting from time zero, and at 3h intervals up to 6 days, automated image analysis was carried out on the live-cell imaging system IncucyteS3 (Essen Bioscience) using the Tumor Spheroid Assays application. Cell spread was measured after segmentation using MATLAB to perform threshold-based image segmentation on pixel intensity values followed by filtering by size, a minimum bounding circle algorithm was used to quantify cell spread from the resulting segmentation masks (n>=3).
Proteins were precipitated by 20% (w/v) trichloroacetic acid to remove the amino acids in the samples. Proteins were resuspended in 100 mM TEAB buffer (triethylammonium bicarbonate, Sigma) and then digested by trypsin (Pierce MS grade, Thermo Fisher) for 18 hours at 37°C.
Peptides were labelled by 0.8 mg of TMT10plex according to the manufacturer's instruction.
Samples were pooled and dried in SpeedVac, and then fractionated either by Pierce™ high pH reversed-phase peptide fractionation spin column to 8 fractions from 10% -50% ACN/0.1% triethlyamine as instructed by the manufacturer, or online on an XBridge BEH C18 column intensity threshold at 5000 were fragmented in ion trap at 35% collision energy, with AGC at 10,000 and 35 ms maximum injection time, and isolation width at 0.7 Da in quadrupole. The top 5 MS2 fragment ions were SPS selected with the isolation width at 0.7 Da, and fragmented in HCD at 65% collision energy, and detected in the Orbitrap to get the report ions' intensities at a better accuracy. The resolution was set at 50,000, and the AGC at 50,000 with maximum injection time at 86 ms. The dynamic exclusion was set 40 s with ± 10 ppm exclusion window.
The raw files were processed with Proteome Discoverer 2.3 or 2.4 (Thermo Fisher) using the Sequest HT search engine. Spectra were searched against fasta files of reviewed Uniprot homo Sapiens entries (April 2019 for or February 2020) and an in-house contaminate database. Search parameters were: trypsin with 2 maximum miss-cleavage sites, mass tolerances at 20 ppm for Precursor, and 0.5 Da for fragment ions, dynamic modifications of Deamidated (N, Q), Oxidation (M) and Acetyl (Protein N-terminius), static modifications of Carbamidomethyl (C) and TMT6plex (Peptide N-terminus and K). Peptides were validated by Percolator with q-value set at 0.01 (strict) and 0.05 (relaxed). The TMT6plex reporter ion quantifier included 20 ppm integration tolerance on the most confident centroid peak at the MS3 level. Only unique peptides were used for quantification. The co-Isolation threshold was set at 100%. Peptides with average reported S/N>3 were used for protein quantification, and the SPS mass matches threshold was set at 55%. The abundance normalization mode used the total peptide amount, and scaling mode used on all average. Only master proteins were reported.
Single cell RNA-Seq Each sample was prepared as a single cell suspension, with neurospheres grown until ~100µm diameter, pelleted down, gently dissociated with accutase (#A6964, Sigma), washed and resuspended with complete media, counted and then washed again in PBS with 0.04% BSA. Single cell suspensions were processed through the 10X Chromium Single Cell Platform using Chromium Single Cell 3' v3.1 Chemistry (10X Genomics, Pleasanton, CA) following the manufacturer's protocol. A total of 4,000 cells were added to each channel of a chip to be partitioned into Gel Beads in Emulsion (GEMs) in the Chromium instrument, followed by cell lysis and barcoded reverse transcription of RNA in the droplets. Breaking of the emulsion was followed by amplification, fragmentation, and addition of adaptor and sample index. Single Cells 3' Gene Expression libraries were sequenced on the NextSeq 500 and processed with Cell Ranger analysis pipeline. For SmartSeq2, Single cell suspensions were sorted for viable cells (calcein positive,TO-PRO-3 negative) into 96-well plates for whole transcriptome amplification, library preparation and pooling for sequencing on a NextSeq 500 as previously described (Picelli et al., 2014). Raw sequencing reads were aligned to hg19 genome using Bowtie, and gene expression was quantified as transcripts per million (TPM) with RSEM. Analysis was performed using Seurat version 3.0 (Butler et al., 2018)) and visualised using UMAP projection. Copy number profiles from single cell RNA sequencing were generated with infercnv (Patel et al., 2014).

Single cell ATAC-Seq
Each sample was prepared as a single cell suspension, with neurospheres grown until ~100µm diameter, pelleted down, gently dissociated with accutase (#A6964, Sigma), washed and resuspended with cryopreservation media (STEM-CELLBANKER; Amsbio) at 4 million cells/mL, then cells were cryopreserved in a Mr Frosty. Cells were thawed in a 37 o C water bath and prepared as described by 10X Genomics Demonstrated Protocol -Nuclei Isolation for Single Cell ATAC Sequencing Rev B. Briefly, cell pellets were resuspended in lysis buffer and incubated on ice for 5 minutes. Lysed cells were washed, strained, and nuclei were resuspended and counted using a Countess II FL Automated Cell Counter. Isolated nuclei were then used as input following the 10X Genomics Chromium Next GEM Single Cell ATAC Reagent Kits v1.1 manual. Targeting a 5,000 nuclei recovery, samples were added to the tagmentation reaction, loaded into the Chromium Controller for nuclei barcoding, and prepared for library construction following manufacturer's protocol (10X Genomics PN-1000175).
Resulting libraries were quantified using the KAPA Library Quantification Kit for Illumina platforms (KAPA Biosystems), and sequenced with PE34 sequencing on the NextSeq 500/550 sequencer (Illumina).
Sequenced data were processed with the Cell Ranger ATAC software, with alignment to the human (hg38) genome. The Cell Ranger output files were used as input to Active Motif's proprietary analysis program, which creates Excel tables containing detailed information on cluster-specific peak locations, gene annotations, and motif enrichment. The alignment files generated by Cell Ranger were also processed as pseudo-bulk ATAC-Seq samples. Duplicate reads were removed, only reads mapping as matched pairs and only uniquely mapped reads (mapping quality >= 1) were used for further analysis. Alignments were extended in silico at their 3'-ends to a length of 200 bp and assigned to 32-nt bins along the genome. The resulting histograms (genomic "signal maps") were stored in bigWig files. Peaks were identified using the MACS 2.1.0 algorithm at a cutoff of p-value 1x10-7, without control file, and with thenomodel option. Peaks that were on the ENCODE blacklist of known false ChIP-Seq peaks were removed. Signal maps and peak locations were used as input data to Active Motif's proprietary analysis program, which creates Excel tables containing detailed information on sample comparison, peak metrics, peak locations and gene annotations.

Statistics
Statistical analysis was carried out using R 3.5.0 (www.r-project.org) and GraphPad Prism 8.
Categorical comparisons of counts were carried out using Fishers exact test, comparisons between groups of continuous variables employed Student's t-test or ANOVA. Univariate differences in survival were analysed by the Kaplan-Meier method and significance determined by the log-rank test. All tests were two-sided and a p value of less than 0.05 was considered significant. Normalised and scaled abundances proteins were compared by using a Student's t-test adjusted for false discovery rate (FDR) according to Benjamini and Hochberg. For gene expression data analysis multiple testing corrections were made using FDR according to Benjamini and Hochberg. Effects of drug treatment on survival as the primary endpoint and overall survival in the orthotopic in vivo models were assessed using Mantel Cox log-rank test. Adjusted P-values < 0.05 was considered significant.

Data accessibility
All newly generated sequencing data have been deposited in the European Genomephenome Archive (ebi.ac.uk/ega) with accession number EGAS00001005365.