ABSTRACT
Chromosomal translocations frequently promote carcinogenesis by producing gain-of-function fusion proteins. Recent studies have identified highly recurrent chromosomal translocations in patients with Endometrial Stromal Sarcomas (ESS) and Ossifying FibroMyxoid Tumors (OFMT) leading to an in-frame fusion of PHF1 (PCL1) to six different subunits of the NuA4/TIP60 complex. While NuA4/TIP60 is a co-activator that acetylates chromatin and loads the H2A.Z histone variant, PHF1 is part of the Polycomb repressive complex 2 (PRC2) linked to transcriptional repression of key developmental genes through methylation of histone H3 on lysine 27. In this study, we characterize the fusion protein produced by the EPC1-PHF1 translocation. The chimeric protein assembles a mega-complex harboring both NuA4/TIP60 and PRC2 activities and leads to mislocalization of chromatin marks in the genome. These are linked to aberrant gene expression, in particular over an entire topologically-associated domain including part of the HOXD cluster. Furthermore, we show that JAZF1, implicated with PRC2 components in the most frequent translocations in ESS, is a potent transcription activator that physically associates with NuA4/TIP60. Altogether, these results indicate that most chromosomal translocations linked to these sarcomas employ the same molecular oncogenic mechanism through a physical merge of NuA4/TIP60 and PRC2 complexes leading to mislocalization of histone marks and aberrant polycomb target gene expression.
Highlights
Recurrent oncogenic chromosomal translocations fuse Polycomb-like-1 (PHF1) with different subunits of the NuA4/TIP60 complex
Translocation produces a mega-complex containing NuA4/TIP60 and PRC2 activities
Translocation leads to mistargeting of histone marks throughout the genome and changes in gene expression
Loss of H3K27me3 and gain of H4ac on HOXD cluster are restricted to a topologically-associated domain (TAD)
A highly recurrent JAZF1-SUZ12 translocation uses the same mechanism merging NuA4/TIP60 with PRC2
INTRODUCTION
ATP-dependent remodelers and histone modifiers are key regulators of the structure and function of chromatin, essential for genome expression and stability, cell proliferation, development and response to environmental cues. Post-translational modifications of specific residues on histones are also part of the epigenetic mechanisms ensuring gene expression memory during cell divisions (Zentner and Henikoff, 2013). Histone modifying enzymes are often part of large multisubunit protein complexes. They are highly conserved and composed of various combinations of subunits like readers, writers, and erasers of histone marks, histone chaperones, and chromatin remodelers. The combination of different modules ensures specific localization, histone target selection, enables epigenetic crosstalk and context-specific activity. The strictly coordinated activities of chromatin modifying complexes ensure the proper functioning of the cell (Lalonde et al., 2014). Disruption of chromatin marks and their regulators can lead to various pathologies including cancer (Shen and Laird, 2013).
Recurrent chromosomal translocations producing oncogenic fusion proteins are common in many cancers and particularly prevalent in hematopoietic malignancies and sarcomas. These fusions frequently involve chromatin and transcription regulators and, in many cases, are thought to be the primary drivers of cancer. In recent years, disruption of chromatin dynamics has emerged as a consistent oncogenic mechanism used by these fusion proteins (reviewed in (Brien et al., 2019)). Sarcomas are rare mesenchymal tissue cancers with distinct molecular profiles. About one-third of all sarcomas harbor chromosomal translocations with otherwise normal karyotypes, showing robust clustering of gene expression profiles compared to normal tissues. These chromosomal translocations are the driver oncogenes, and the common oncogenic mechanism employed is either transcriptional deregulation (EWS-FLI in Ewing’s Sarcoma) or aberrant signaling (ALK fusions in Inflammatory Myofibroblastic Tumor) (Taylor et al., 2011). Of particular interest are the fusions involving proteins associated with chromatin regulators that impact gene expression programs. Recurrent translocations are found in Endometrial Stromal Sarcomas (ESS) and Ossifying FibroMyxoid Tumors (OFMT) that potentially fuse subunits of distinct chromatin modifying complexes with opposite functions in gene regulation, namely the NuA4/TIP60 histone acetyltransferase (HAT) complex and Polycomb Repressive complexes (Figure 1A).
Among the recurring chromosomal translocations that characterize endometrial stromal sarcoma, genes for five different subunits of the NuA4/TIP60 complex (EPC1/2, MBTD1, MEAF6, BRD8) are repeatedly found fused to genes for different PRC2 components (PHF1/SUZ12/EZHIP) (Ferreira et al., 2018; Hoang et al., 2018). Interestingly, other soft tissue sarcomas such as ossifying fibromyxoid tumors also harbor EPC1-PHF1, EP400-PHF1 and MEAF6-PHF1 fusions. OFMTs are rare cancers of uncertain cellular origin, and ∼50-85% of cases show PHF1 translocations, with EP400-PHF1 being the most frequent (40%) (Schneider et al., 2016). JAZF1-PHF1 was reported in a case of Cardiac Ossifying sarcoma and fusions of NuA4/TIP60 subunits with the PRC1.1 component BCOR have also been reported in ESS, demonstrating that these types of fusion events represent a critical oncogenic mechanism in mesenchymal cancers (Schoolmeester et al., 2013) (Figure 1A).
NuA4/TIP60 is an evolutionarily conserved and multi-functional MYST-family HAT complex with 17 distinct subunits (Figure 1B)(Avvakumov and Cote, 2007; Doyon and Cote, 2004; Judes et al., 2015; Sheikh and Akhtar, 2019; Steunou et al., 2014; Voss and Thomas, 2018). The catalytic subunit KAT5/Tip60 acetylates histones H4 and H2A, as well as variants H2A.Z and H2A.X in the context of chromatin. In addition, the ATP-dependent remodeler subunit EP400 allows the exchange of canonical histone H2A in chromatin with variant histone H2A.Z (Billon and Cote, 2013; Pradhan et al., 2016). There are also multiple subunits with different reader domains that allow context-dependent site-specific activity of the complex and epigenetic crosstalk (Figure 1B)(Jacquet et al., 2016; Steunou et al., 2014). The 836aa EPC1 protein is a non-catalytic scaffolding subunit of the TIP60 complex. The conserved N-terminal EPcA domain (aa1-280) interacts with KAT5/Tip60, MEAF6, and ING3 thereby forming the human Piccolo NuA4 complex, enabling binding and acetylation of chromatin substrates (Boudreault et al., 2003; Chittuluru et al., 2011; Doyon et al., 2004; Huang and Tan, 2013; Lalonde et al., 2013; Selleck et al., 2005; Xu et al., 2016). The C-terminal part of EPC1 is thought to associate with the rest of the NuA4/TIP60 complex based on data from the yeast NuA4 complex (Auger et al., 2008; Boudreault et al., 2003; Setiaputra et al., 2018; Wang et al., 2018) (Figure 1B). The NuA4/TIP60 complex is an important transcriptional co-activator that can be recruited to gene regulatory elements by several transcription factors. Through acetylation of histones as well as non-histone substrates, it activates a multitude of gene expression programs, including: proliferation, stress response, apoptosis, differentiation and stem cell identity (Judes et al., 2015; Steunou et al., 2014; Voss and Thomas, 2018). 
The NuA4/TIP60 complex is also a key player in the response to DNA damage, assisting repair pathway choice as well as the repair process itself ((Cheng et al., 2018; Jacquet et al., 2016) and references therein). All these functions are essential for cellular homeostasis, hence NuA4/TIP60 subunits are often mutated or deregulated in cancers (Avvakumov and Cote, 2007; Judes et al., 2015; Voss and Thomas, 2018) and Tip60/KAT5 itself is a haplo-insufficient tumor suppressor (Gorrini et al., 2007).
Polycomb Group proteins (PcG) are evolutionarily conserved proteins involved in development and transcriptional regulation, originally linked to HOX gene repression during Drosophila development (Schuettengruber et al., 2017). They are key regulators of mammalian development, differentiation, and cell fate decisions. PcG proteins assemble into multisubunit complexes, major ones being Polycomb Repressive Complexes 1 and 2 (PRC1 and PRC2). These complexes co-operate with each other to create repressive chromatin regions through histone modifications and chromatin organization. While PRC1 catalyzes H2AK119 mono-ubiquitination and chromatin compaction, PRC2 catalyzes H3K27 methylation. The PRC2 complex is composed of the core components EZH2, SUZ12, EED, and RBBP4/7. Through a read-write mechanism, it can deposit H3K27me3 over large chromatin regions such as the HOX clusters. Other associated factors function to stabilize the PRC2 complex on chromatin and modulate its activity, defining distinct variant complexes PRC2.1 and PRC2.2. PRC2.1 contains one of the Polycomb-like (PCL) proteins, PHF1, MTF2 or PHF19, as well as either PALI1/2 or EPOP (Figure 1C) (Laugesen et al., 2019; Loubiere et al., 2019). The 567aa PHF1 protein binds unmethylated CpG islands particularly at long linker DNA regions, stabilizes the binding of PRC2.1 complex on chromatin and increases its activity (Cao et al., 2008; Choi et al., 2017; Li et al., 2017; Sarma et al., 2008). PHF1 is also known to bind the H3K36me3 mark through its Tudor domain (Cai et al., 2013; Musselman et al., 2012), restricting the catalytic activity of EZH2 (Musselman et al., 2012) (Figure 1C).
Clinical studies frequently report new rearrangements in sarcomas, and many groups are working on the classification of ESS using recurrent fusions as molecular markers. However, there is currently little understanding of the molecular consequences of recurrent translocations and their primary role in tumorigenesis. To address this, we used biochemical and genomic approaches to study a recurrent translocation that fuses the EPC1 subunit of the NuA4/TIP60 complex to the PHF1 subunit of the PRC2.1 complex (Figure 1D). Low grade ESS, as well as OFMTs, harbor the EPC1-PHF1 fusion protein (Antonescu et al., 2014; Micci et al., 2006). We investigated the molecular impact of the EPC1-PHF1 fusion protein by generating isogenic cell lines, using affinity purification followed by mass spectrometry to identify interactors, and analyzing chromatin occupancy of the fusion protein. Importantly, we also analyzed the effect of the fusion protein on global chromatin dynamics and correlated it with differential gene expression.
RESULTS
To construct the fusion EPC1-PHF1 gene (Figure 1D), we used a portion of the chimeric gene recovered by RT-PCR from total RNA isolated from a surgically removed endometrial stromal sarcoma (a kind gift from Francesca Micci’s group) (Micci et al., 2006). As necessary controls, we also subcloned full-length PHF1 and EPC1, as well as a portion of EPC1 corresponding to the fragment found in the fusion, referred to as EPC1(1-581). We confirmed the expression of all constructs and then tested whether the EPC1-PHF1 fusion protein can act as an oncogenic driver by performing a colony formation assay in HEK293T cells (Figure S1A-B). In contrast to full-length and truncated EPC1, which strongly inhibit growth, we observed that expression of EPC1-PHF1 leads to a greater number of colonies compared to controls (Figure S1B), supporting the idea that this gene fusion may be a driver event, giving a growth advantage to the expressing cells.
EPC1-PHF1 forms a mega-complex merging NuA4/TIP60 and PRC2.1
To isolate the native complex(es) formed by the EPC1-PHF1 fusion protein, we generated isogenic K562 cell lines by targeted integration of C-terminally TAP-tagged cDNAs into the AAVS1 safe harbor genomic locus (Dalvai et al., 2015). To circumvent issues related to protein overexpression, we used the moderately active PGK1 promoter which was shown to achieve expression of NuA4/TIP60 and PRC2 subunits within 2.5-fold of the native levels (EP400, EPC1, MBTD1 and EZH2) (Dalvai et al., 2015). Our previous studies have demonstrated that this system enables purification of stable and stoichiometric protein complexes (Dalvai et al., 2015; Doyon and Cote, 2016; Jacquet et al., 2016). We chose the K562 cell line as it is an ENCODE tier-1 cell line and can grow in suspension culture to high cell densities, ideal for biochemical and genomic studies.
Nuclear extracts from K562 cell lines expressing the fusion protein, individual fusion partners or the empty tag (Mock, C-terminal 3xFLAG-2xHA-2A-puromycin tag) were allowed to bind to anti-FLAG resin and the bound material was eluted with 3xFLAG peptides. An alternate K562 cell line expressing EPC1-PHF1 with a C-terminal 3xFLAG-2xStrep tag was also used to improve yield and purity by binding the FLAG-eluted fraction to Streptactin beads followed by elution with biotin. SDS-PAGE and silver-staining identified the components of the purified complexes (Figure 2A). Mass spectrometry analysis identified all expected components of NuA4/TIP60 and PRC2 complexes (Figure 2B, S2A). We further confirmed the complex subunits by western blotting with NuA4/TIP60 and PRC2 specific antibodies (Figure 2C). The PHF1 fraction contained the core subunits EZH2, EED and SUZ12 of the PRC2 complex, as well as RBBP4 and the PRC2.1 specific PALI1/LCOR. The EPC1(1-581) fraction contained all the subunits of the NuA4/TIP60 complex except MBTD1, which was expected since it associates through an interaction with the EPC1 C-terminus (aa644-672) (Jacquet et al., 2016; Zhang et al., 2020). Strikingly, the EPC1-PHF1 fraction contained subunits of both the TIP60 and PRC2.1 complexes. These results indicate that the EPC1-PHF1 fusion protein efficiently associates with both NuA4/TIP60 and PRC2.1. This was also supported by data in HEK293T cells (Figure S1C).
To determine whether the fusion protein’s associations with the two complexes occur independently of each other or can occur simultaneously, we integrated 3xHA-tagged EPC1-PHF1 at the AAVS1 locus in K562 cell lines in which TALEN/CRISPR was used to introduce a 3xFLAG tag at the endogenous EZH2 or EP400 gene, respectively (Dalvai et al., 2015). This allowed us to purify endogenous PRC2 and NuA4/TIP60 complexes and verify whether the fusion protein can physically bridge the two complexes. As shown in Figure 2D and 2E, expression of the EPC1-PHF1 fusion leads to the association of the PRC2 complex with NuA4/TIP60 and vice versa, demonstrating the formation of a mega-complex.
We then performed in vitro histone acetyltransferase (HAT) and histone methyltransferase (HMT) assays with purified fractions on native chromatin as substrate. We observed robust HAT activity towards histone H4 and H2A in both EPC1(1-581) and EPC1-PHF1 complexes, as expected for NuA4/TIP60 (Figure 2F, S2B). Simultaneously, we detected HMT activity towards histone H3 in PHF1 and EPC1-PHF1 complexes, as expected for PRC2 (Figure 2F, S2C). This demonstrates that the merged assembly maintains the enzymatic activities of both original complexes. Altogether, these results indicate that expression of the EPC1-PHF1 fusion protein in cells leads to the formation of a hybrid TIP60-PRC2 mega-complex that harbours opposite functions in terms of chromatin modifications and impact on gene regulation (Figure 2G). It is important to note that we did not detect an effect of expressing the fusion protein or its partners on the bulk level of H3K27me3 (Figure S2D), in contrast to what has been previously proposed for the JAZF1-SUZ12 fusion (Ma et al., 2017).
The EPC1-PHF1 fusion complex is enriched at genomic loci containing bivalent chromatin
To determine the genomic locations where the EPC1-PHF1 mega-complex may act, we performed anti-FLAG chromatin immunoprecipitation coupled to high-throughput sequencing (ChIP-seq) using the isogenic cell lines. We analyzed binding regions (significant peaks) of EPC1-PHF1, EPC1(1-581) and PHF1 in comparison to the empty vector cell line (Figure 3A, S3A) and correlated them with chromatin regions containing H3K4me3 (usual NuA4/TIP60 targets), H3K27me3 (PRC2 targets) or a combination of both, i.e. bivalent chromatin. We found that the fusion protein is specifically enriched in regions of bivalent chromatin, which is likely facilitated by the presence of both H3K4me3 and H3K27me3 readers in the mega-complex (ING3 in NuA4/TIP60 and EED in PRC2) (Figure 3B).
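The classification of bound regions described above can be illustrated with a minimal sketch. It assumes that peaks and histone-mark domains are available as simple (chromosome, start, end) intervals with half-open coordinates; the function names and the toy coordinates are hypothetical and do not reproduce the actual analysis pipeline, which is not detailed in this section.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical representation
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two intervals lie on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_peak(peak: Interval,
                  k4_regions: List[Interval],
                  k27_regions: List[Interval]) -> str:
    """Label a bound region by the histone-mark domains it overlaps."""
    has_k4 = any(overlaps(peak, r) for r in k4_regions)
    has_k27 = any(overlaps(peak, r) for r in k27_regions)
    if has_k4 and has_k27:
        return "bivalent"
    if has_k4:
        return "H3K4me3-only"
    if has_k27:
        return "H3K27me3-only"
    return "neither"


# Toy example with made-up coordinates (not real data)
k4 = [("chr2", 150, 250)]
k27 = [("chr2", 50, 120)]
labels = [classify_peak(p, k4, k27)
          for p in [("chr2", 100, 200), ("chr2", 200, 210), ("chr2", 300, 400)]]
```

A linear scan over all regions is used here for clarity; genome-scale data would normally call for a sorted or indexed interval structure.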
Effect of EPC1-PHF1 on the chromatin and transcriptional landscape
Two major possible effects are expected from the presence of the fusion mega-complex at specific genomic loci: (1) appearance of chromatin acetylation where PRC2.1 is normally bound on silenced regions of the genome; and (2) appearance of H3K27 methylation where TIP60 is normally bound on expressed regions of the genome. We also speculated that the fusion protein may mislocalize TIP60-mediated H4 acetylation to poised bivalent chromatin regions, pushing them towards transcriptional activation. To test this, we performed H4 Penta-acetyl (RRID:AB_310310) and H3K27me3 (RRID:AB_27932460) ChIP-sequencing in our isogenic cell lines. A first observation was that 28% of the genomic regions bound by EPC1-PHF1 show increased H4 acetylation but only 2.9% have increased H3K27me3, arguing for a dominant effect of the Tip60 acetyltransferase over the PRC2 methyltransferase in the mega-complex. Furthermore, to understand the effect of EPC1-PHF1-induced changes in local histone modifications on transcription, we performed gene expression analysis. Upon integrative analysis, we found that EPC1-PHF1-induced changes in H4 acetylation correlate with changes in local gene expression (Figure 3C, S3B). In parallel, significant changes in H3K27me3 anti-correlate with changes in gene expression (Figure 3D, S3C).
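The kind of integrative comparison described above can be sketched as a correlation between per-gene log2 fold changes in a histone mark and in expression. The statistical procedure actually used is not specified in this section, so the pseudocount, the choice of Pearson correlation and the function names below are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of two signal values; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under this sketch, a positive coefficient between H4 acetylation changes and expression changes, and a negative one for H3K27me3, would correspond to the correlation and anti-correlation reported above.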
Since the appearance of new H4 acetylation signals at EPC1-PHF1-bound regions was predominant over the appearance of new H3K27me3 signals, this suggests that subverting NuA4/TIP60 activity to new genomic loci by the fusion protein was the primary outcome. We found that the HOX gene clusters, in particular, demonstrated this effect in a striking fashion (Figures 4, S4 and S5). Representative ChIP-seq tracks of FLAG, H4 acetylation and H3K27me3 at the posterior HOXD genes in the isogenic cells are shown in Figure 4A (see also Figure S4A for HOXC genes). Binding of EPC1(1-581) alone is detected on two precise locations coinciding with small pre-existing H4 acetylation islands at the end of the EVX2 gene and between HOXD12 and HOXD13. In contrast, PHF1 associates with the entire region, as expected from the H3K27me3 signal detected throughout the same region. Interestingly, while the H3K27me3 may be increased by the expression of exogenous PHF1, the two small H4 acetylation islands persist. Most strikingly, the EPC1-PHF1 cell line, while expressing much less fusion protein compared to EPC1(1-581) and PHF1 (Figure S4C), clearly demonstrates appearance and mislocalization of H4 acetylation over a relatively large region (∼25kb highlighted in yellow). Importantly, we can also observe a decrease in the levels of H3K27me3 in this same region, compared to the isogenic cell lines and also compared internally to the neighboring region (starting from the HOXD12 gene). This local increase in H4 acetylation correlates with specific increase in expression of the EVX2 and HOXD13 genes located within the region (Figure 4B, see also Figure S4B for HOXC genes). To validate this observation, we performed ChIP-qPCR with primers to four sites of de novo H4 acetylation (HOXD13-A, B, C, and D highlighted in Figure 4A). 
Correcting for nucleosome occupancy with total H2B signal, we could confirm the specific increase in H4 acetylation at these locations in the cell line expressing EPC1-PHF1 (Figure 4C).
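The correction for nucleosome occupancy can be illustrated with a short sketch of a common ChIP-qPCR calculation: each immunoprecipitation is expressed as percent of input, and the histone-mark signal is then divided by the total H2B signal at the same locus. The exact normalization used here is not spelled out in the text, so the percent-input formula, the assumption of 100% amplification efficiency and the function names are illustrative only.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Percent of input recovered in the IP, assuming perfect doubling per PCR cycle.

    input_fraction is the fraction of chromatin saved as input (e.g. 0.01 for 1%).
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in (0, 1]")
    # Shift the input Ct to what it would be for 100% of the chromatin
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def occupancy_corrected(mark_percent_input: float, h2b_percent_input: float) -> float:
    """Ratio of histone-mark signal to total H2B signal at the same locus."""
    if h2b_percent_input <= 0:
        raise ValueError("H2B signal must be positive")
    return mark_percent_input / h2b_percent_input
```

Dividing by the H2B signal means that a locus with fewer nucleosomes is not mistaken for one with less modification per nucleosome.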
Replacement of canonical H2A with H2A.Z in chromatin is another enzymatic activity of the NuA4/TIP60 complex through its EP400 subunit (Billon and Cote, 2013; Pradhan et al., 2016). ChIP-qPCR with a H2A.Z specific antibody revealed a similar mislocalization of this variant histone at the HOXD sites (Figure 4D). These data clearly validate our hypothesis that the EPC1-PHF1 fusion complex can localize at genomic regions normally occupied by PRC2, mislocalizing the activities of the TIP60 complex to deregulate transcription.
We were intrigued at the apparent reduction in the levels of H3K27me3 at the HOXD gene cluster in the EPC1-PHF1 expressing cells (Figure 4A-highlighted in yellow). We observed neither a global decrease in H3K27me3 levels (Figure S2D) nor destabilization of the PRC2 complex in our cell line (Figure 2). Moreover, we could detect productive histone methylation by the EPC1-PHF1 mega-complex (Figure 2F, Figure S2C). In a previous study (Musselman et al., 2012), we demonstrated that the Tudor domain of PHF1 binds to H3K36me3 and constrains PRC2 mediated H3K27me3 activity. We hypothesized that this could be part of the mechanism leading to a local decrease of H3K27me3. We checked for the levels of H3K27me3 and H3K36me3 at the HOXD13-C locus by ChIP-qPCR to confirm this possible cross-talk. A decrease in H3K27me3 was observed in EPC1-PHF1 cells compared to controls, validating our ChIP-seq results (Figure 4E). We also saw an increase in H3K36me3 at the HOXD13-C locus in EPC1-PHF1 cells (Figure 4F), likely linked to the increased transcription of the gene. This cross talk was not observed at the NuA4/TIP60-bound and highly transcribed gene RPSA (Figure S4D)(Jacquet et al., 2016) or at HOXA9, a repressed gene not occupied by the EPC1-PHF1 fusion complex (Figure S4E). We conclude that the mislocalization of H4 acetylation and the variant histone H2A.Z at the posterior HOXD gene locus in cells expressing EPC1-PHF1 leads to de-repression and productive transcription. Subsequent deposition of H3K36me3 – a histone mark linked to transcription elongation - blocks the spread of the repressive H3K27me3 mark, possibly in part through its recognition by the Tudor domain of PHF1 and direct effect on EZH2/PRC2 activity (Finogenova et al., 2020; Jani et al., 2019; Musselman et al., 2012; Schmitges et al., 2011).
The EPC1-PHF1 mediated effect on histone acetylation/methylation at HOXD is restricted to a specific Topologically Associated Domain (TAD)
The HOXD gene cluster is involved in mammalian axial patterning, limb and genital development. Spatiotemporal regulation of the HOXD locus depends on long-acting multiple enhancer sequences located in “gene deserts” outside the HOXD cluster. Mammalian cells organize such regulatory landscapes in structural units called Topologically Associated Domains (TADs), which maintain genomic contacts even in the absence of transcription. These pre-formed 3D structures, delimited by CTCF and cohesin proteins, remain globally similar in cell types and are conserved from mouse to humans (reviewed in (Bompadre and Andrey, 2019; Szabo et al., 2019)). The HOXD locus is positioned between two large TADs (Rodriguez-Carballo et al., 2017)(Figure 5A). During limb development, the telomeric TAD is activated and controls the transcription of early HOXD genes (forearm), while later the centromeric TAD controls the posterior HOXD genes (digits)(Andrey et al., 2013). A strong boundary limits the posterior HOX genes from getting activated aberrantly (Lonfat and Duboule, 2015).
Chromatin modifications influence the dynamic mammalian TADs. The repressive H3K27me3 modification segregates chromatin into discrete sub-domains, and studies suggest that these 3D clusters require PcG proteins and H3K27me3 (Szabo et al., 2019). The HOXD cluster is known to harbor an H3K27me3 dense region, and the 1.8kb region between HOXD11 and HOXD12 (D11.12) may act as a putative mammalian Polycomb Responsive Element (PRE) (Woo et al., 2010). Recent studies have shown that this region acts as a nucleation hub for PRC2 to stably bind and spread H3K27me3 through intra- and interchromosomal interactions (Oksuz et al., 2018; Vieux-Rochas et al., 2015).
We observed that the Tip60/KAT5-mediated H4 acetylation and reduction of H3K27me3 at the HOXD locus in EPC1-PHF1 cells localized to the posterior HOXD genes. This led us to explore the possibility that a TAD boundary exists in this region. When we aligned the publicly available CTCF ChIP-seq data and Hi-C dataset in K562, we observed that the chromatin changes induced by the EPC1-PHF1 fusion protein are restricted to a large portion of the centromeric TAD at the HOXD locus (Figure 5B). Since Polycomb group protein-mediated long-range contacts are also at play at the HOXD locus, we looked for spreading of EPC1-PHF1-mediated H4 acetylation to other genomic loci that were shown to interact with the HOXD locus. Although existing Hi-C/5-C studies in K562 cells are not detailed enough to detect PcG-mediated TADs (Kundu et al., 2017) and many of the explored regions are not conserved from mouse to humans (Oksuz et al., 2018; Vieux-Rochas et al., 2015), we could detect the presence of H4 acetylation at the HOXC locus (H3K27me3 spreading site), EVX1, HOXB13, CYP26B1 and LHX2 (H3K27me3 nucleation sites) (Figure S5A-F). This observation suggests that the EPC1-PHF1 fusion protein could mislocalize activating histone marks through PcG-mediated chromatin contacts, activating an oncogenic transcriptional network. A similar mechanism was observed recently for an oncogenic EZH2 mutant that co-opted PcG-mediated long-range chromatin contacts to repress multiple tumor suppressors (Donaldson-Collier et al., 2019). Further work will be required to fully dissect the effect of the EPC1-PHF1 fusion complex on the structure and dynamics of TADs/polycomb-mediated chromatin contacts.
JAZF1 is a stoichiometric interaction partner of the NuA4/TIP60 complex
Various subunits of the NuA4/TIP60 complex can be fused to different subunits from Polycomb Repressive complexes 1 and 2 in Endometrial Stromal Sarcoma (Figure 1A). However, the most frequently observed translocation is that of JAZF1-SUZ12, which occurs in more than 50% of all low grade ESS (LG-ESS (Ferreira et al., 2018)). There are other recurrent translocations involving the JAZF1 protein in sarcomas, namely JAZF1-PHF1 and JAZF1-BCORL1 (Allen et al., 2017; Micci et al., 2006; Schoolmeester et al., 2013). This connection led us to investigate whether JAZF1 interacts with the TIP60 complex.
The molecular function of the JAZF1 protein is largely unexplored. It is a transcription factor with three C2H2 zinc fingers, and studies have linked it to glucose production, lipid metabolism, ribosome biogenesis, metabolic disorders and pancreatic cancer (Kobiita et al., 2020; Liao et al., 2019; Nakajima et al., 2004; Zhou et al., 2020). The JAZF1 protein has a homolog in S. cerevisiae called Sfp1 (Figure 6A), an essential transcription factor involved in the regulation of ribosomal protein (RP), ribosome biogenesis (RiBi) and cell cycle genes, responding to nutrients and stress through regulation by TORC1 (Lempiainen et al., 2009; Marion et al., 2004). Crucially, Sfp1 and the yeast NuA4 complex cooperate functionally to regulate RP and RiBi genes (Figure 6B) (Loewith and Hall, 2011; Reid et al., 2000; Rossetto et al., 2014). High-throughput human interactome studies have also suggested a potential interaction of JAZF1 with TIP60 complex components (Hein et al., 2015; Huttlin et al., 2015).
Since JAZF1 is not expressed in K562 cells (www.proteinatlas.org), we performed multiple cell compartment affinity purification coupled to tandem mass spectrometry (MCC-AP-MS/MS) in HEK293 cells as described in (Lavallee-Adam et al., 2013). This technique allowed us to isolate protein-protein interactions occurring in the chromatin fraction. By purifying JAZF1, we recovered NuA4/TIP60 core components KAT5, EP400, RUVBL1/2, EPC1/2, TRRAP, BAF53a, BRD8, MBTD1, ING3, and DMAP1 with good reliability (Figure S6A). JAZF1 is highly expressed in the male and female reproductive tissues and may thus be involved in the tumorigenesis of LG-ESS of all fusion subtypes. To confirm this interaction, we ectopically expressed JAZF1-3xFLAG-2xSTREP in K562 cells from the AAVS1 locus and tandem affinity-purified the protein from nuclear extracts. Silver staining of the purified fraction on gel, western blot and mass spectrometry analyses all confirm a very tight association of JAZF1 with the NuA4/TIP60 complex (Figure 6C, 6D and S6B), suggesting that JAZF1 can be a stable stoichiometric subunit. Accordingly, JAZF1 occupies the promoters of known TIP60-bound ribosomal protein genes RPSA and RPL36AL as determined by ChIP-qPCR (Jacquet et al., 2016) (Figure 6E).
To analyze this tight interaction in the context of the JAZF1-SUZ12 translocation, we performed co-immunoprecipitation experiments showing that the TIP60 complex still interacts strongly with the portion of JAZF1 (JAZF1(1-124)) that is fused to SUZ12 (Figure 6F, S6C). The same experiments showed that JAZF1 also still interacts with EPC1(1-581) and the EPC1-PHF1 mega-complex, but not with PHF1/PRC2.1. These data suggest that the pathogenesis of JAZF1 fusions could also involve the physical merge of NuA4/TIP60 with PRC2.
Conserved oncogenic mechanism of EPC1-PHF1 and JAZF1-SUZ12 fusion proteins
We have shown that the oncogenic EPC1-PHF1 fusion complex can lead to local transcription activation despite associating with a functional PRC2.1 complex. To further confirm the effect on transcription, we used a chemically inducible dCas9-based site-specific recruitment assay described in (Gao et al., 2016) in HEK293T cells (Figure 7A). In this system we could detect specific transcriptional activation by the EPC1 protein, EPC1(1-581) and EPC1-PHF1, but not by PHF1 (Figure 7B). Interestingly, we found that the JAZF1 protein is a very robust transcriptional activator in this reporter system. This is also the case for the JAZF1-SUZ12 fusion protein, as well as for the truncated JAZF1 (1-124), but not SUZ12 by itself (Figure 7C).
To confirm that the JAZF1-SUZ12 fusion leads to the formation of a similar TIP60-PRC2 mega-complex, we repeated the purification of endogenous TIP60 and PRC2 subunits (as in Figure 2D-E), this time from cells expressing the JAZF1-SUZ12 fusion protein. Again, a merged TIP60-PRC2 mega-complex co-fractionated with TIP60 and PRC2 when the fusion protein was expressed in the cells (Figure 7D-E). Accordingly, the gene expression profiles of ESS with different fusions (JAZF1 and TIP60) cluster together (Dewaele et al., 2014; Micci et al., 2016; Przybyl et al., 2018). A recent interactome analysis of JAZF1-SUZ12 also validates our results (Piunti et al., 2019). Altogether, these data confirm that JAZF1 interacts with the TIP60 complex and that all of the currently detected fusion proteins in LG-ESS bridge the TIP60 complex to Polycomb repressive complexes.
To confirm that the JAZF1-SUZ12 fusion generates the same molecular events on chromatin as EPC1-PHF1 we looked again at the HOXD locus by ChIP-qPCR. In JAZF1-SUZ12 expressing cells, we again detected an increase in H4 acetylation (Figure 7F), a decrease in H3K27 methylation (Figure 7G, S7D) and an increase in transcriptional elongation-associated H3K36me3 (Figure 7H, S7C) in the region flanking the HOXD13 gene. JAZF1-SUZ12 is thus able to mislocalize the TIP60 complex-associated histone mark through the JAZF1 protein, drawing parallels with the mislocalization mediated by EPC1-PHF1 (Figure 4).
The loss of H3K27me3 in these data can be explained by the loss of the ZnB domain of SUZ12 in the JAZF1-SUZ12 fusion protein, which has been shown to reduce the binding of JARID2 and EPOP subunits (Chen et al., 2018). SUZ12 acts as a platform to assemble the non-canonical subunits of the PRC2 complex, and a recent study shows that mutations in SUZ12 can modulate PRC2.1 vs PRC2.2 formation (Youmans et al., 2021). Furthermore, recent structural data suggest that the JARID2-containing PRC2.2 complex can partially override H3K36me3-mediated inhibition of EZH2 catalytic activity (Kasinath et al., 2021), whereas the PRC2.1 complex is known to be catalytically inhibited by the presence of H3K36me3 (Finogenova et al., 2020; Musselman et al., 2012). We hypothesize that the loss of JARID2 in JAZF1-SUZ12 skews it to associate with the PRC2.1 complex, which is catalytically inhibited by the presence of the H3K36me3 modification. Overall, our data indicate that the mislocalization of TIP60 activities could be the predominant oncogenic mechanism of the recurrent fusion proteins found in Endometrial Stromal Sarcomas and also other Sarcomas (Figure 7I).
DISCUSSION
Our study demonstrates that binding of the EPC1-PHF1 fusion protein increased histone acetylation and H2A.Z incorporation within regions regulated by PRC2, notably within the HOXD and HOXC gene clusters. This is usually correlated with increased gene expression, consistent with the notion that EPC1-PHF1 mistargets the HAT and histone exchange activities of NuA4/TIP60 to chromatin regions that are normally maintained in the silenced state (Figure 7D). Interestingly, mistargeting of histone acetylation at the HOXD cluster leads to loss of H3K27 trimethylation over an entire specific topology-associated domain (TAD) without affecting the neighboring ones, although these are also bound by PRC2 complexes. In theory, since EPC1-PHF1 is integrated in a mega-complex with PRC2.1, it could act at most PRC2.1 binding sites. We speculate that the presence of small preexisting TIP60 binding sites within or in the vicinity of Polycomb/repressed regions, as we see at the HOXD cluster in the centromeric TAD, could make a big difference. The TIP60 moiety of the EPC1-PHF1 mega-complex could be efficiently recruited there (by DNA-bound factors and/or small regions of H3K4me3 in bivalent chromatin) while the PRC2 moiety would allow it to spread along the region like PRC2 normally does, leading to de novo acetylation. This would lead to increased transcription, coupled with H3K36 tri-methylation, which in turn inhibits the HMT activity of PRC2 (through recognition by PHF1 Tudor domain), leading to a decrease of H3K27me3 over the region (Figure 7D, middle panel). It is important to note that H3K36 methylation has also been described as a potent boundary mark to block H3K27me3 spreading (Streubel et al., 2018).
In addition, our study validates the protein JAZF1 as a potent transcription activator that stably associates with the NuA4/TIP60 complex. This information further integrates all of the reported translocations in Low-grade Endometrial Stromal sarcomas (LG-ESS) as producing a physical merge of TIP60 and PcG complexes, unifying the underlying molecular mechanism (Figure 7D, right panel). This is supported by the robust clustering of gene expression profiles of the LG-ESS with various fusion proteins when compared to High-grade ESS or tumors without fusion genes (Micci et al., 2016). Our findings further underline the importance of deregulated epigenetic modifiers in promoting oncogenesis, particularly in sarcomas (Nacev et al., 2020), in part through changes of the chromatin landscape over large genomic regions. The highly recurrent physical merger of TIP60 and PcG complexes in these sarcomas, through several distinct subunits, indicates that this event is the driving force in the oncogenic mechanism. It is somewhat reminiscent of the MLL fusion proteins in pediatric acute myeloid leukemia, in which the H3K4 HMT MLL is recurrently fused with diverse components of the super-elongation complex, creating a large complex driving oncogenesis through altered transcription programs involving chromatin modifiers and readers as well as HOXA cluster misregulation (Mohan et al., 2010). A parallel can also be drawn with the discovery of various BRD4-linked Z4 complex proteins that are fused to NUT in NUT midline carcinoma (Shiota et al., 2018).
The chromatin modification changes associated with EPC1-PHF1 are delimited to a TAD (Figure 5). This is especially interesting given recent studies demonstrating the nucleation and spreading of H3K27me3 through Polycomb domain contacts (Oksuz et al., 2018; Vieux-Rochas et al., 2015). Interestingly, EPC1-PHF1 localizes to the HOXD13/EVX2 nucleation region and mislocalizes H4 acetylation only in the centromeric TAD. We could also observe EPC1-PHF1 localization and H4 acetylation presence at “spreading” sites (Figure S5). A similar observation was made for the Ewing sarcoma fusion protein at the HOXD centromeric TAD leading to overexpression of HOXD13 as seen here (Svoboda et al., 2014; von Heyking et al., 2016). A question that emerges from our observation is whether mislocalization of activating marks will lead to disruption of these H3K27me3-centric repressive domains and disrupt the organization of chromatin domains. Disruption of polycomb-mediated repression by fusion proteins to activate stem cell-related gene expression programs has emerged as a common theme recently (Brien et al., 2019). Unlike in Synovial sarcoma or NUT midline carcinoma, we do not observe extensive mislocalization/disruption in chromatin. This could be an artifact of our system of study. Alternatively, since LG-ESS is rare, milder, not rapidly progressing and shows limited metastasis, it may deregulate only a few genes. In the BRD-NUT fusion cancers, the fusion of an acetylation reader protein (BRD4) and HAT interacting protein (NUT) creates a complex that can allosterically activate its activity, forming mega-domains of chromatin changes (Alekseyenko et al., 2017). Unlike BRD-NUT, in the EPC1-PHF1 fusion, the “Read-Write” allosteric activation ability of PRC2 is curtailed by the decrease in H3K27me3 and increase in H3K36me3.
Intriguingly, a newly described subunit of PRC2 complexes, EZHIP (CXORF67), was also found fused to the TIP60 subunit MBTD1 in LG-ESS (Dewaele et al., 2014). EZHIP functions similarly to the oncogenic H3K27M mutant by binding to EZH2 and inhibiting its activity in trans (Hubner et al., 2019; Jain et al., 2019; Piunti et al., 2019). This mechanism seems similar to the trans-inhibition of EZH2 through PHF1 Tudor-H3K36me3 binding and recently described direct inhibitory effect of H3K36me3 on EZH2 catalytic activity (Finogenova et al., 2020; Jani et al., 2019). The molecular characterization of MBTD1-EZHIP will further reveal if parallels can be drawn with EPC1-PHF1.
While the HOXA9 gene has been implicated in many cancers, HOXD13 and EVX2 are also deregulated in some cancers albeit more rarely (Luo et al., 2019). Posterior HOX genes are involved in the development of reproductive organs, are endometrial developmental markers and are implicated in the differentiation of Endometrial stromal cells (Akbas and Taylor, 2004; Miyazaki et al., 2018). Since Endometrial Stromal sarcomas and OFMTs are very rare, we could not have timely access to patient tissues or patient tissue-derived cell lines to model EPC1-PHF1 fusion-mediated oncogenesis. Moreover, the cell of origin of soft tissue sarcomas is still not clear and differentiated endometrial cells are difficult to expand and clone. Our cellular model system for the study is undifferentiated/immature and highly proliferative; it is therefore heterologous and does not truly represent the pathways and gene expression programs in ESS. Considering these limitations, we use caution in identifying ESS relevant gene expression changes from our study. Nevertheless, the data presented indicate that the mislocalization of TIP60 leads to histone H4 acetylation and H2A.Z incorporation in chromatin, favoring local gene expression that can subsequently inhibit PRC2 activity. This succession of molecular events is likely the predominant oncogenic mechanism of the recurrent fusion proteins found in Endometrial Stromal Sarcomas, Ossifying Fibromyxoid Tumors, and Cardiac Ossifying Sarcomas.
Since the mutational burden of translocation-carrying cancers is relatively lower than that of other cancers (clonal homogeneity), a true characteristic of sarcomas, targeted therapeutic approaches specific to the fusion protein are plausible (Brien et al., 2019; Taylor et al., 2011). According to our study, targeting TIP60 HAT activity with available small molecule inhibitors seems a viable approach (Brown et al., 2016). Alternatively, development of inhibitors designed to disrupt the interaction between JAZF1 and the TIP60 complex may also be a promising therapeutic approach.
Author Contributions
D.S., N.A., M.-E.L., B.C., M.T., Y.D. and J.C. designed the experiments. D.S., N.A., M.-E.L., N.A., A.M., K.J., J.R., J.-P.L., J.L., and Y.D. performed the experiments. E.P., S.T.S. analyzed the genomic data. A.-C.G., M.T., B.C., Y.D. and J.C. supervised and secured funding. D.S. and J.C. wrote the manuscript with the help of co-authors.
Declaration of Interests
The authors declare no conflict of interest.
STAR Methods
CONTACT FOR REAGENT AND RESOURCE SHARING
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Jacques Côté (Jacques.Cote{at}crhdq.ulaval.ca)
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Cell culture
K562 cells were obtained from the ATCC and maintained at 37°C under 5% CO2 in RPMI medium supplemented with 10% fetal bovine serum and GlutaMAX. 25 mM HEPES-NaOH (pH 7.4) was added during culture in Spinner flasks.
HEK293 cells were obtained from the ATCC and maintained at 37°C under 5% CO2 in DMEM medium supplemented with 10% fetal bovine serum.
METHOD DETAILS
Construction of Recombinant DNA
RT-PCR product from total RNA isolated from a surgically removed endometrial stromal sarcoma tumor was a kind gift from Dr. Francesca Micci (Micci et al., 2006). The cDNA contained the entire truncated portion of EPC1 and a fused portion of PHF1 up to base pair (bp) 139. The included portion of PHF1 contained a BglI restriction site, which was then utilized to reconstruct the full length fusion gene using a clone of PHF1 obtained from Open Biosystems Products (Huntsville, AL). This construct was then subcloned into AAVS1-PGK Puro_2XHA-3XFlag (Jacquet et al., 2016), AAVS1_Puro_PGK1_3XFlag_Twin_Strep (Addgene plasmid #68375) (Dalvai et al., 2015) or pREV-CMV-3xFLAG (Avvakumov et al., 2012). We also sub-cloned full length PHF1 and EPC1, as well as a portion of EPC1 that included bp 1-1743, corresponding to the fragment of this gene found in the fusion. All constructs were verified by sequencing, and their expression was tested by transient transfection.
Human cDNA of JAZF1 was purchased from GE Healthcare. The open reading frame (ORF) of JAZF1 was subcloned into the pMZI vector (Zeghouf et al., 2004) containing a tandem affinity purification tag at its C terminus (Rigaut et al., 1999). The JAZF1 ORF was also subcloned into the p3X-CMV-Flag-14 vector (Sigma-Aldrich) and/or the pcDNA3Myc6 vector (a kind gift from Dr. Sylvain Meloche) as previously described (Julien et al., 2003).
Cell Line Generation
EPC1-PHF1 and controls were targeted to the AAVS1 safe harbour locus in K562 cells using ZFN (Hockemeyer et al., 2009), as described previously (Dalvai et al., 2015).
Briefly, Lipofectamine 2000 was used to transfect 1.25 million cells with 1 μg ZFN expression vector and 4 μg donor constructs according to the manufacturer’s instructions. Starting 48 hr after transfection, clonal selection was performed with 0.25 μg/ml puromycin for 10 days in methylcellulose-based semi-solid RPMI medium.
Purification of Complexes
Native complexes were purified as described in detail previously (Dalvai et al., 2015; Doyon and Cote, 2016). K562 cells expressing EPC1(1-581), PHF1, the EPC1-PHF1 fusion or JAZF1 with a C-terminal 3XFlag-2XHA or 3xFlag-2xStrep tag, or mock controls, were expanded in spinner flasks (3 L cultures at 0.6–1.0 million cells/ml density). Nuclear extracts were prepared as described (Abmayr et al., 2006) and ultracentrifuged at 100,000 × g for 1 hr. Extracts were precleared with 300 μl Sepharose CL-6B (Sigma). 250 μl of anti-FLAG M2 affinity resin (Sigma) was added to the extract and incubated for 2 hr at 4°C. The resin was washed consecutively with buffer containing 300 mM salt and 150 mM salt. Complexes were eluted in two fractions with 200 μg/ml of 3xFLAG peptide (Sigma) and 150 mM salt for 1 hr at 4°C. These fractions were used for experiments or pooled to perform a second round of affinity purification with Strep-Tactin resin (3xFlag-2xStrep cell lines). Eluted fractions were mixed with 125 μl Strep-Tactin Sepharose (IBA) affinity matrix for 2 hr at 4°C, and the beads were washed with buffer containing 150 mM salt. Complexes were eluted in two fractions with buffer containing 4 mM D-Biotin and 150 mM salt. The purified complexes were loaded on NuPAGE 4–12% Bis-Tris gels (Invitrogen) and visualized by silver staining. Fractions were then analyzed by mass spectrometry.
Sample Digestion for Mass spectrometry Analysis
Purified complexes were run on preparative gels (Bolt 12% Bis-Tris, Invitrogen) over 1 cm, to stack all proteins into 1 band, stained with Sypro Ruby (Bio-Rad) and excised under UV lamp. The gel slice was reduced, alkylated and digested based on the original protocol (Wilm et al., 1996). Briefly, proteins were reduced with 10 mM DTT at 56°C for 15 minutes and alkylated with 100 mM iodoacetamide for 15 minutes. Trypsin digestion was performed using modified porcine trypsin (Sequencing grade, Sigma) at 37°C for 16 h. Digestion products were extracted sequentially using 25 mM ammonium bicarbonate, then 50% acetonitrile with 5% formic acid and finally 100% acetonitrile. The recovered extracts were pooled, vacuum centrifuge dried and stored at −80°C until MS analysis.
Proteins identification by mass spectrometry – TripleTOF 5600
Each sample (5 μL) was directly loaded at 400 nL/min onto an equilibrated HPLC column. The peptides were eluted from the column over a 90 min gradient generated by a NanoLC-Ultra 1D plus (Eksigent, Dublin CA) nano-pump and analysed on a TripleTOF 5600 instrument (AB SCIEX, Concord, Ontario, Canada). The gradient was delivered at 200 nL/min starting from 2% acetonitrile with 0.1% formic acid to 35% acetonitrile with 0.1% formic acid over 90 min followed by a 15 min clean-up at 80% acetonitrile with 0.1% formic acid, and a 15 min equilibration period back to 2% acetonitrile with 0.1% formic acid, for a total of 120 min. To minimize carryover between each sample, the analytical column was washed for 3 h by running an alternating sawtooth gradient from 35% acetonitrile with 0.1% formic acid to 80% acetonitrile with 0.1% formic acid, holding each gradient concentration for 5 min. Analytical column and instrument performance were verified after each sample by loading 30 fmol bovine serum albumin (BSA) tryptic peptide standard (Michrom Bioresources Inc. Fremont, CA) with 60 fmol α-casein tryptic digest and running a short 30 min gradient. TOF MS calibration was performed on BSA reference ions before running the next sample to adjust for mass drift and verify peak intensity. The instrument method was set to data dependent acquisition (DDA) mode, which consisted of one 250 millisecond (ms) MS1 TOF survey scan from 400–1300 Da followed by twenty 100 ms MS2 candidate ion scans from 100–2000 Da in high sensitivity mode. Only ions with a charge of 2+ to 4+ that exceeded a threshold of 200 cps were selected for MS2, and former precursors were excluded for 10 s after one occurrence.
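The timing figures quoted above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the acquisition software; the function names are hypothetical, and the duty-cycle estimate assumes that scan times simply add up, ignoring any instrument overhead between scans.

```python
# Segment durations in minutes, taken from the gradient description in the text.
GRADIENT_MIN = 90        # 2% to 35% acetonitrile
CLEANUP_MIN = 15         # 80% acetonitrile
EQUILIBRATION_MIN = 15   # back to 2% acetonitrile


def total_run_minutes():
    """Total length of one LC run (gradient + clean-up + equilibration)."""
    return GRADIENT_MIN + CLEANUP_MIN + EQUILIBRATION_MIN


def dda_cycle_seconds(ms1_ms=250, n_ms2=20, ms2_ms=100):
    """Nominal DDA cycle: one MS1 survey scan followed by n_ms2 MS2 scans.

    Assumption: inter-scan overhead is ignored, so this is a lower bound.
    """
    return (ms1_ms + n_ms2 * ms2_ms) / 1000.0


print(total_run_minutes())   # 120
print(dda_cycle_seconds())   # 2.25
```

Under these assumptions, each acquisition cycle takes at least 2.25 s, which is shorter than the 10 s exclusion window applied to former precursors.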
Proteins identification by mass spectrometry – Orbitrap Fusion
Peptide samples were separated by online reversed-phase (RP) nanoscale capillary liquid chromatography (nanoLC) and analyzed by electrospray mass spectrometry (ESI MS/MS). The experiments were performed with a Dionex UltiMate 3000 nanoRSLC chromatography system (Thermo Fisher Scientific) connected to an Orbitrap Fusion mass spectrometer (Thermo Fisher Scientific) equipped with a nanoelectrospray ion source. Peptides were trapped at 20 μl/min in loading solvent (2% acetonitrile, 0.05% TFA) on a 5 mm x 300 μm C18 pepmap cartridge pre-column (Thermo Fisher Scientific) for 5 minutes. Then, the pre-column was switched online with a self-made 50 cm x 75 μm internal diameter separation column packed with ReproSil-Pur C18-AQ 3-μm resin (Dr. Maisch HPLC) and the peptides were eluted with a linear gradient from 5-40% solvent B (A: 0.1% formic acid, B: 80% acetonitrile, 0.1% formic acid) in 90 minutes, at 300 nL/min. Mass spectra were acquired in data dependent acquisition mode using Thermo XCalibur software version 3.0.63. Full scan mass spectra (350 to 1800 m/z) were acquired in the orbitrap using an AGC target of 4e5, a maximum injection time of 50 ms and a resolution of 120 000. Internal calibration using lock mass on the m/z 445.12003 siloxane ion was used. Each MS scan was followed by acquisition of fragmentation spectra of the most intense ions for a total cycle time of 3 seconds (top speed mode). The selected ions were isolated using the quadrupole analyzer in a window of 1.6 m/z and fragmented by Higher energy Collision-induced Dissociation (HCD) with 35% of collision energy. The resulting fragments were detected by the linear ion trap in rapid scan rate with an AGC target of 1e4 and a maximum injection time of 50 ms. Dynamic exclusion of previously fragmented peptides was set for a period of 20 sec and a tolerance of 10 ppm.
Multiple cell compartment affinity purification coupled to tandem mass spectrometry (MCC-AP-MS/MS)
Inducible EcR-293 stable cell lines containing the pMZI empty vector (as a control) or pMZI-JAZF1 were generated as previously described (Jeronimo et al., 2007). Protein expression was induced with 3 µM ponasterone A (Life Technologies) and cells were harvested after 48 h. Protein-protein interactions of JAZF1 were identified using two biological replicate runs for each bait of the multiple cell compartment affinity purification coupled to tandem mass spectrometry (MCC-AP-MS/MS) procedure, as recently described (Lavallee-Adam et al., 2013; Cloutier et al., 2014). The eluates were precipitated with trichloroacetic acid (TCA) and stored at −80 °C until liquid chromatography (LC) MS/MS analysis.
Affinity Purification followed by Immunoblotting
Figure 2D, 2E and 6G: K562 cells with endogenously tagged (3XFLAG-2XStrep) EP400 or EZH2 (from (Dalvai et al., 2015)) were used to stably express EPC1-PHF1 or JAZF1-SUZ12 3X-HA tagged constructs from the AAVS1 safe harbor locus. Selected clones were expanded to obtain 6-8 million cells. Cells were collected, washed twice in 1X PBS and lysed for 30 min in 2X volume of Lysis Buffer (450 mM NaCl, 10% Glycerol, 50 mM Tris-HCl pH 8, 1% Triton X-100, 2 mM MgCl2, 0.1 mM ZnCl2, 2 mM EDTA, 1 mM DTT, protease inhibitors). The same volume of lysis buffer without salt was then added to obtain a final salt concentration of 225 mM. The lysates were centrifuged to prepare whole cell extracts. The extracts were incubated with FlagM2 agarose resin (Sigma) for 4 hr at 4°C. The resin was centrifuged, washed in Lysis Buffer with 225 mM salt and eluted with 3X Flag peptide (Sigma). The eluted fraction was loaded on 4-15% gradient gels with input and immunoblotted with anti-HA, anti-KAT5, anti-DMAP1, anti-SUZ12 or anti-EZH2 antibody as appropriate.
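The salt adjustment in the lysis step above is a simple mixing calculation: adding an equal volume of salt-free buffer halves the NaCl concentration from 450 mM to 225 mM, while components present in both buffers keep their concentration. A minimal Python sketch follows; the function name and arguments are hypothetical and serve only to illustrate the arithmetic.

```python
def mixed_concentration_mM(stock_mM, stock_volume, diluent_volume, diluent_mM=0.0):
    """Concentration after mixing two solutions (volumes in any common unit)."""
    total_volume = stock_volume + diluent_volume
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (stock_mM * stock_volume + diluent_mM * diluent_volume) / total_volume


# One volume of 450 mM NaCl lysate plus one volume of salt-free buffer.
print(mixed_concentration_mM(450, 1, 1))  # 225.0
```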
Figure 6F, S6: EPC1-PHF1-Flag tagged K562 Cells and controls were grown in 100mm dishes, transiently transfected with JAZF1-full length or JAZF1 (1-281) Myc tagged constructs. Cells were collected after 48 hours and affinity purified as described above. The eluted fraction was loaded on 4-15% gradient gels with Input and immunoblotted with anti-Myc antibody.
Chromatin Immunoprecipitation
FLAG ChIPs were performed according to the protocol described in (Jacquet et al., 2016). 1mg of cross-linked chromatin from K562 cells was incubated with 10µg of anti-FLAG antibody (Sigma, M2) pre-bound on 300 µl of Dynabeads Protein G (Invitrogen) overnight at 4°C. The beads were washed extensively and eluted in 0.1% SDS, 0.1M NaHCO3. Crosslink was reversed with 0.2M NaCl and incubation overnight at 65°C. Samples were treated with RNase and Proteinase K for 2h and DNA was recovered by phenol chloroform and ethanol precipitation.
Histone mark ChIPs were performed according to the protocol described in (Lalonde et al., 2013). Briefly, 200 µg of chromatin was immunoprecipitated with 1-3 µg of histone mark antibodies overnight at 4°C and then incubated with 100 µl of Dynabeads Protein A for 4 h at 4°C. The beads were washed extensively and eluted in 0.1% SDS, 0.1M NaHCO3. Crosslink was reversed with 0.2M NaCl and incubation overnight at 65°C. Samples were treated with RNase and Proteinase K for 2 h and DNA was recovered by phenol chloroform extraction and ethanol precipitation.
Libraries for sequencing were prepared as described (Jacquet et al., 2016). Samples were sequenced by 50 bp single reads on HiSeq 2000 platform (Illumina).
Quantitative real-time PCRs were performed on a LightCycler 480 (Roche) with SYBR Green I (Roche) to confirm the specific enrichment at defined loci. The error bars represent standard errors based on two independent experiments. Quantitative real-time PCR primers used are listed in Supplementary Table.
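As an illustration of how error bars of this kind can be computed, the Python sketch below returns the mean and the standard error of the mean for a set of replicate values. It is a generic calculation and not the authors' analysis script; the function name and the example enrichment values are hypothetical.

```python
import math


def mean_and_sem(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicate values are required")
    mean = sum(values) / n
    sample_variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(sample_variance / n)


# Hypothetical enrichment values from two independent experiments.
mean, sem = mean_and_sem([2.0, 4.0])
print(mean, sem)  # 3.0 1.0
```

With only two independent experiments, the standard error reduces to half the absolute difference between the two values.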
HAT and HMT Assays
Fractions of purified complexes were assayed for enzymatic activity on short oligonucleosomes and free histones isolated from HeLa S3 cells as described previously (Doyon et al., 2004; Musselman et al., 2012; Utley et al., 1996).
Histone Acetyltransferase (HAT) Assay: In vitro histone acetylation assays were performed with 0.5 µg of the indicated substrates in a 15 µl reaction containing 50 mM Tris-HCl pH 8.0, 10% glycerol, 1 mM EDTA, 1 mM DTT, 1 mM PMSF, 10 mM sodium butyrate and 0.25 µCi/µl (4.9 Ci/mmol) of [3H]-labeled Acetyl-CoA (Perkin Elmer) or 0.15 mM unlabeled Acetyl-CoA (Sigma). Samples were either spotted on P81 membrane (GE Healthcare) for scintillation counting or analyzed on 18% SDS-PAGE. For radioactive gel assays, Coomassie staining was followed by treatment with Enhance® (Perkin Elmer) and fluorography; non-radioactive assays were analyzed by western blot. In results from liquid counting, error bars represent standard deviation of independent reactions.
Histone Methyltransferase (HMT) Assay: In vitro histone methylation assays were performed with 0.5 µg of the indicated substrates in a 15 µl reaction containing 20 mM Tris-HCl pH 8.0, 5% glycerol, 0.1 mM EDTA, 1 mM DTT, 1 mM PMSF, 100 mM KCl and 1 mCi of [3H]AdoMet (S-adenosyl-L-[methyl-3H] methionine, 80 Ci/mmol; SAM) (Perkin Elmer) or 0.15 mM unlabeled S-adenosyl methionine for 45 min at 30°C. Samples were either spotted on P81 membrane (GE Healthcare) for scintillation counting or analyzed on 15% SDS-PAGE. For radioactive gel assays, Coomassie staining was followed by treatment with Enhance® (Perkin Elmer) and fluorography; non-radioactive assays were analyzed by western blot. In results from liquid counting, error bars represent standard deviation of independent reactions.
Recruitment-activator assay
The reporter HEK293T cell line with a TRE3G-EGFP reporter was a kind gift from Stanley Qi (Stanford University). The reporter cell line was transduced with a lentivirus expressing ABI-dCas9 followed by two rounds of selection under 6 µg/ml blasticidin. These cells were then transduced with a lentivirus expressing EBFP2 and a gRNA targeting seven tetO repeats in the TRE3G promoter (gRNA sequence GTACGTTCTCTATCACTGATA). Single-cell derived clonal cell lines were generated and a clone showing robust EGFP induction by a strong transcriptional activator VPR was selected for downstream assays. 96-well plates were seeded with 3×10^4 cells per well one day prior to transfection. 150 ng of each construct was transfected using polyethylenimine (PEI). Transfected cells were induced 24 hours after transfection by treatment with 100 µM abscisic acid. 48 hours after induction, cells were dissociated and resuspended in flow buffer using a liquid handling robot and analyzed on an LSRFortessa (BD). Flow cytometry data were analyzed using FlowJo by gating for positive gRNA (EBFP2) expression, then further for construct (TagRFP) expression. At least 25,000 cells were analyzed for each replicate.
Microarray
RNA samples were extracted using TRIzol reagent (Invitrogen), following manufacturer’s instructions. Duplicate RNA samples from EPC1-PHF1 expressing and control K562 cell lines were compared. We performed gene expression microarray experiments using the Human Illumina HumanHT-12_V4 platform.
Data Analysis
ChIP-seq
Reads from ChIP-seq experiments were obtained from the Genome Innovation Center at McGill University. All the single-end reads were mapped on the human genome build hg18 using bowtie version 0.12.8 (Langmead et al., 2009). The read alignments in SAM format were then converted to BAM files using samtools. Samtools was then successively utilized to remove duplicated reads, sort, and index BAM files. Peak calling was performed using MACS version 1.4.0 by comparing EPC (1-581), PHF1 and EPC-PHF1 ChIP-seq signals to the control signal (Zhang et al., 2008). We identified the intersection between two sets of peaks in hg18 coordinates using the bedtools “intersect” function (Quinlan and Hall, 2010). We identified significant changes in H4ac and H3K27me3 ChIP-seq read counts at EPC-PHF1 bound regions using DESeq (Anders and Huber, 2010). Custom analyses specifically tailored to answer questions presented in this study were implemented in R version 2.14.0. For all the analyses using bivalent chromatin regions in K562 cells, we defined bivalent chromatin regions as the overlap between H3K4me3 and H3K27me3 significant ChIP-seq peaks (adjusted Bonferroni P-value < 0.05) identified in K562 cells by the ENCODE project (Consortium, 2012) and available from the UCSC genome browser website (Kent et al., 2002). We used the UCSC hg18 gene definition for all the analyses that necessitate transcription start and end coordinates in genomic context.
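The definition of bivalent regions as the overlap between two peak sets can be illustrated with a small interval-intersection routine. The Python sketch below is not the pipeline used in this study (which relied on bedtools); it is a simplified stand-in that assumes BED-style half-open coordinates and that peaks within each set do not overlap one another. The function name and the example coordinates are hypothetical.

```python
from collections import defaultdict


def intersect_peaks(peaks_a, peaks_b):
    """Return the overlapping segments between two peak sets.

    Each peak is a (chrom, start, end) tuple with half-open coordinates.
    Assumption: peaks within one set are non-overlapping (already merged).
    """
    by_chrom_a, by_chrom_b = defaultdict(list), defaultdict(list)
    for chrom, start, end in peaks_a:
        by_chrom_a[chrom].append((start, end))
    for chrom, start, end in peaks_b:
        by_chrom_b[chrom].append((start, end))

    overlaps = []
    for chrom in sorted(by_chrom_a):
        a = sorted(by_chrom_a[chrom])
        b = sorted(by_chrom_b.get(chrom, []))
        i = j = 0
        # Two-pointer sweep over both sorted interval lists.
        while i < len(a) and j < len(b):
            start = max(a[i][0], b[j][0])
            end = min(a[i][1], b[j][1])
            if start < end:
                overlaps.append((chrom, start, end))
            # Advance whichever interval finishes first.
            if a[i][1] < b[j][1]:
                i += 1
            else:
                j += 1
    return overlaps


# Hypothetical peak coordinates, for illustration only.
h3k4me3 = [("chr2", 100, 200), ("chr2", 500, 600)]
h3k27me3 = [("chr2", 150, 550), ("chr7", 10, 20)]
print(intersect_peaks(h3k4me3, h3k27me3))
# [('chr2', 150, 200), ('chr2', 500, 550)]
```

Regions returned by such an intersection carry both marks and would be treated as candidate bivalent chromatin under the definition given above.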
Microarray
We first log2 transformed and quantile normalized the data before using them for the analyses presented in the paper. All the data were pre-processed using the lumi Bioconductor package (Du et al., 2008).
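For readers unfamiliar with this pre-processing, the Python sketch below shows the general idea of a log2 transform followed by quantile normalization. It is a simplified illustration and not the lumi implementation: ties are broken by position rather than averaged, all intensities are assumed to be positive, and the function name and example values are hypothetical.

```python
import math


def log2_quantile_normalize(arrays):
    """Log2-transform and quantile-normalize intensity vectors.

    arrays: list of equal-length lists of positive intensities, one per array.
    Every output vector shares the same distribution: the mean of the
    sorted log2 values across arrays.
    """
    n_probes = len(arrays[0])
    logged = [[math.log2(v) for v in arr] for arr in arrays]
    sorted_arrays = [sorted(arr) for arr in logged]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(arr[k] for arr in sorted_arrays) / len(sorted_arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for arr in logged:
        order = sorted(range(n_probes), key=lambda idx: arr[idx])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized


# Hypothetical raw intensities for two arrays and three probes.
print(log2_quantile_normalize([[2, 8, 4], [16, 4, 64]]))
# [[1.5, 4.5, 3.0], [3.0, 1.5, 4.5]]
```

After normalization, the two arrays have identical value distributions while each probe keeps its within-array rank.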
Mass Spectrometry
Mass spectrometry data were stored, searched and analyzed using the ProHits laboratory information management system (LIMS) platform (Liu et al., 2016). Within ProHits, AB SCIEX WIFF files were first converted to an MGF format using WIFF2MGF converter and to an mzML format using ProteoWizard (v3.0.4468) (Kessner et al., 2008) and the AB SCIEX MS Data Converter (V1.3 beta). The mzML files were searched using Mascot (v2.3.02). The spectra were searched with the RefSeq database (version 57, January 30th, 2013) acquired from NCBI against a total of 72,482 human and adenovirus sequences supplemented with common contaminants from the Max Planck Institute (http://141.61.102.106:8080/share.cgi?ssid=0f2gfuB) and the Global Proteome Machine (GPM; http://www.thegpm.org/crap/index.html). The database parameters were set to search for tryptic cleavages, allowing up to two missed cleavage sites per peptide with a mass tolerance of 40 ppm for precursors with charges of 2+ to 4+ and a tolerance of +/− 0.15 amu for fragment ions. Deamidated asparagine and glutamine and oxidized methionine were allowed as variable modifications.
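The two tolerance settings above differ in kind: the precursor tolerance is relative (parts per million of the theoretical mass), whereas the fragment tolerance is absolute (in amu). The Python sketch below illustrates both checks. It is not how Mascot is implemented internally; the function names and the example masses are hypothetical.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def precursor_within_tolerance(observed_mass, theoretical_mass, tol_ppm=40.0):
    """Relative tolerance check, as used for precursor ions (40 ppm in the text)."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


def fragment_within_tolerance(observed_mass, theoretical_mass, tol_amu=0.15):
    """Absolute tolerance check, as used for fragment ions (0.15 amu in the text)."""
    return abs(observed_mass - theoretical_mass) <= tol_amu


# Hypothetical masses: about 30 ppm and 50 ppm away from a 1000 amu precursor.
print(precursor_within_tolerance(1000.03, 1000.0))  # True
print(precursor_within_tolerance(1000.05, 1000.0))  # False
print(fragment_within_tolerance(500.1, 500.0))      # True
```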
Multiple cell compartment affinity purification coupled to tandem mass spectrometry (MCC-AP-MS/MS)
Proteins were solubilized, digested with trypsin and analysed by tandem mass spectrometry as previously described (Cloutier et al., 2014). Protein database searching was performed with Mascot 2.2 (Matrix Science) against the human NCBInr protein database. Reliability assessment of protein-protein interactions identified with MCC-AP-MS/MS was performed using our previously published software package Decontaminator (Lavallee-Adam et al., 2011). pMZI empty vector controls were used to train Decontaminator’s Bayesian inference algorithm. Decontaminator assigned a False Discovery Rate (FDR) to each protein-protein interaction in our dataset using a leave-one-out procedure. Interactions with an FDR below 20% in both replicate runs of MCC-AP-MS/MS and interactions observed in only one replicate run with an FDR < 10% were reported.
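The reporting rule stated above can be written as a small decision function. The Python sketch below is an illustration of that rule only and is not part of Decontaminator; the function name is hypothetical, and `None` is used here, by assumption, to indicate that an interaction was not observed in a given replicate.

```python
def is_reported(fdr_rep1, fdr_rep2):
    """Apply the two-replicate reporting rule to one interaction.

    fdr_rep1, fdr_rep2: FDR as a fraction (0.20 = 20%), or None when the
    interaction was not observed in that replicate run.
    """
    observed = [fdr for fdr in (fdr_rep1, fdr_rep2) if fdr is not None]
    if len(observed) == 2:
        # Seen in both runs: FDR must be below 20% in each.
        return all(fdr < 0.20 for fdr in observed)
    if len(observed) == 1:
        # Seen in a single run: stricter 10% threshold.
        return observed[0] < 0.10
    return False


print(is_reported(0.15, 0.05))   # True
print(is_reported(0.15, 0.25))   # False
print(is_reported(0.08, None))   # True
print(is_reported(None, 0.15))   # False
```

The stricter threshold for single-replicate observations compensates for the lack of independent confirmation.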
Hi-C data alignment
Alignment of the ChIP sequencing data with previously published Hi-C data in K562 cells (Rao et al., 2014) was performed by converting the ChIP sequencing alignment from hg18 to hg19 using the CrossMap tool (Zhao et al., 2014). Hi-C data and ChIP sequencing data were then visualized using pyGenomeTracks v2.1 (Ramirez et al., 2018). The CTCF ChIP-sequencing data used are from GEO: GSM733719 (Broad Institute/ENCODE group). The Hi-C data (heat map, domains and loops) were downloaded from the datasets provided at Chorogenome (http://chorogenome.ie-freiburg.mpg.de/data_sources.html).
DATA AND SOFTWARE AVAILABILITY
The ChIP-seq data and expression analysis reported in this study are available at the GEO repository under accession number GSE162544.
All mass spectrometry files generated as part of this study were deposited at MassIVE (http://massive.ucsd.edu). The MassIVE IDs are MSV000083618 and MSV000086476. The MassIVE FTP download links are ftp://massive.ucsd.edu/MSV000083618 and ftp://massive.ucsd.edu/MSV000086476. The password for download prior to final acceptance is “fusion”.
SUPPLEMENTARY FIGURE LEGENDS
SUPPLEMENTARY METHODS
Acid Extraction of Histones
EPC1-PHF1-Flag tagged K562 and control cells were grown to a density of 10^7 cells in 100 mm dishes, harvested and washed twice in 1X PBS. Cells were pelleted and resuspended in 1 ml lysis buffer (PBS containing 0.5% Triton X-100 (v/v), 2 mM phenylmethylsulfonyl fluoride (PMSF), 0.02% (w/v) NaN3, 5 mM sodium butyrate) for 10 min. Nuclei were spun down, resuspended in half the volume of lysis buffer and centrifuged. The pellet was resuspended in 250 µl 0.2 N HCl, incubated overnight at 4°C and centrifuged. The supernatant containing acid-extracted histones was neutralized with 1/10 volume of NaOH. Histones were quantified by Bradford assay, resolved on 15% SDS-PAGE and immunoblotted with anti-H3K27me3 and anti-H4 antibodies (https://www.abcam.com/protocols/histone-extraction-protocol-for-western-blot).
Colony formation assay
HEK293T cells were transfected using the calcium phosphate method and then grown under selection with 300 µg/ml hygromycin B for two weeks. Colonies that formed were fixed with methanol and stained with Giemsa solution for easier visualization. Plates were scanned using an Epson Perfection 4870 scanner; colonies in the images were counted using Alpha Imager software.
KEY RESOURCES TABLE
Acknowledgments
We thank Céline Roques, Catherine Lachance, Valérie Côté and Philippe Cloutier for important technical support. We are very grateful to Prof. Francesca Micci for providing the cDNA obtained from patient sample that covered the EPC1-PHF1 fusion. We thank Compute Canada for the use of supercomputers, McGill Genome Center for sequencing/expression microarray and the CHUL proteomic platform. This work was supported by grants from the following: the Canadian Institutes of Health Research (CIHR) to J.C. (FDN-143314); the Government of Québec, Ministry of Economy and Innovation to B.C.; the Natural Sciences and Engineering Research Council of Canada to Y.D. (RGPIN-2014-059680) and J.-P.L. (RGPIN-2017-06124); University of Toronto startup funds to M.T. D.S., M.-E.L. and K.J. were supported by PhD studentships from Fonds de la Recherche Québec-Santé (FRQS) and Nature/Technologie (FRQNT). N.A. was supported by a CIHR post-doctoral fellowship. B.C. holds the IRCM Bell-Bombardier Research Chair. A.-C.G. holds the Canada Research Chair in Functional Proteomics and the Lea Reichmann Chair in Cancer Proteomics. J.-P.L. and Y.D. are Junior 1 and Junior 2 FRQS scholars, respectively. J.C. holds the Canada Research Chair in Chromatin Biology and Molecular Epigenetics.
Footnotes
6 Lead contact