The DBD-α4 helix of EWS::FLI is required for GGAA microsatellite binding that underlies genome regulation in Ewing sarcoma

Ewing sarcoma is the second most common bone cancer in children and young adults. In 85% of patients, a translocation between chromosomes 11 and 22 results in a potent fusion oncoprotein, EWS::FLI. EWS::FLI is the only genetic alteration in an otherwise unaltered genome of Ewing sarcoma tumors. The EWS portion of the protein is an intrinsically disordered domain involved in transcriptional regulation by EWS::FLI. The FLI portion of the fusion contains a DNA binding domain shown to bind core GGAA motifs and GGAA repeats. A small alpha-helix in the DNA binding domain of FLI, DBD-α4 helix, is critical for the transcription function of EWS::FLI. In this study, we aimed to understand the mechanism by which the DBD-α4 helix promotes transcription, and therefore oncogenic transformation. We utilized a multi-omics approach to assess chromatin organization, active chromatin marks, genome binding, and gene expression in cells expressing EWS::FLI constructs with and without DBD-α4 helix. Our studies revealed DBD-α4 helix is crucial for cooperative binding of EWS::FLI at GGAA microsatellites. This binding underlies many aspects of genome regulation by EWS::FLI such as formation of TADs, chromatin loops, enhancers and productive transcription hubs.


Introduction
Ewing sarcoma is an aggressive bone-associated tumor currently treated with dose-intense chemotherapy, radiation, and surgery (Pappo and Dirksen, 2018).It affects adolescents and young adults with an incidence rate of 3 per million (Ozaki, 2015).Of these patients, 25-35% have overt metastatic disease with a recurrence rate of 50-80% (Gaspar et al., 2015).Roughly a quarter of patients with localized disease also relapse (Van Mater and Wagner, 2019).The five-year survival rate of metastatic and relapsed patients is only 10-30% (Stahl et al., 2011).Treatment options for relapsed/metastatic Ewing sarcoma patients have not improved for the last four decades.The lack of efficient and tar-1 of 25 geted treatment for Ewing sarcoma can be attributed to our poor understanding of how precisely Ewing sarcoma is driven by a fusion oncoprotein called EWS::FLI.. Expression of EWS::FLI results from a translocation between chromosomes 11 and 22 that fuses two genes, EWSR1 and FLI1 (Delattre et al., 1992).As a result, the transcription activation domain of EWSR1 and the DNA binding domain of FLI1 fuse together to create a potent transcription factor (May et al., 1993).The EWS domain is highly disordered and is important for multimerization and transcriptional hub formation (Chong et al., 2018).The FLI portion contains a DNA binding domain that binds at two distinct sites: canonical FLI1 binding sites containing a GGAA core and microsatellites containing GGAA repeats (Gangwal et al., 2008).At these GGAA microsatellites, EWS::FLI binding creates de-novo enhancers driving genome-wide reprogramming of enhancers to Ewing-specific enhancers (Riggi et al., 2014).Regulation of Ewing-specific enhancers underlies the mechanisms by which EWS::FLI regulates expression of thousands of target genes (Grunewald et al., 2015).More specifically, activation of genes involved in proliferation, migration and invasion pathways leads to oncogenic transformation of precursor cells to Ewing sarcoma cells (Kauer et al., 2009).
In our recent studies of EWS::FLI, we found a small alpha helix in the DNA binding domain, DBD-4, to be required for transcription and regulation by the fusion protein (Boone et al., 2021).
Interestingly, this study did not find any change in chromatin accessibility (ATAC-Seq) and genome localization of EWS::FLI constructs (CUT&RUN) when DBD-4 helix was deleted leaving the mechanistic basis for the requirement of DBD-4 in transcription regulation unclear.In parallel studies, we also observed that EWS::FLI expression results in widespread changes to 3D chromatin organization (Showpnil et al., 2022).These results together prompted us to consider whether DBD-4 helix is contributing towards EWS::FLI's abilities to make changes in the epigenome and the local chromatin structure.To assess whether DBD-4 helix is important in chromatin organization, we used our "knock-down/rescue" system in A673 cells (Theisen et al., 2019;Boone et al., 2021;Showpnil et al., 2022) in conjunction with genomics techniques including RNA-Seq, CUT&Tag, and Micro-C.These sets of experiments allowed us to characterize the involvement of DBD-4 helix in modulating the chromatin organization, epigenetic reprogramming of enhancers, binding at GGAA microsatellites, and promoting transcription leading to transformation.

DBD-𝛼4 helix of FLI domain is required to restructure 3D chromatin in A673 cells
To assess the mechanism by which DBD-4 promotes transcription, we depleted endogenous EWS::FLI expression with shRNA (KD) and rescued with previously published DBD and DBD+ EWS::FLI constructs with 3X-FLAG tags (Figure 1A, Boone et al. (2021)).The main difference between DBD and DBD+ constructs is the presence of DBD-4 helix in DBD+ (Supplementary Figure 1A).We recapitulated our previous findings that DBD is unable to rescue the same number of Ewing-sarcoma specific genes as DBD+, therefore incapable of driving oncogenic transformation (Supplementary Next, we sought to understand the role of DBD-4 helix in 3D chromatin organization.We carried out Micro-C experiments in KD, DBD and DBD+ cells.First, we performed multidimensional scaling (MDS) analysis of the top 1000 interactions of Micro-C matrices at 500kb resolution to assess the global regulation of 3D chromatin organization (Figure 1B).KD replicates clustered together with DBD replicate 1 on both axes and with DBD replicate 2 on y-axis.DBD+ replicates, on the other hand, clustered away from both KD and DBD replicates.These observations suggest that the global chromatin structure of DBD replicates are more similar to KD than DBD+ replicates.
We also assessed the contact frequency across the genome in KD, DBD and DBD+ cells at 5kb resolution (Figure 1C).We found that at the mid region of the plot (105-106 bp), DBD+ cells make more contacts in comparison to KD and DBD cells.At this genome-wide level, we observe that DBD cells display similar contact frequency as KD cells suggesting that DBD-4 helix is required for To determine if these changes were significant, we conducted differential interaction analysis using 500kb matrices to compare DBD and DBD+ biological replicates to KD replicates.First, we detected 14,326 and 65 interactions in DBD+ and DBD cells respectively using cut-off values of adjusted p-value < 0.2 and log 2 FC > 0.5.Then, we used cut-off values of adjusted p-value < 0.05 and log 2 FC > 1.5 to identify differentially interacting regions (DIRs).We found that DBD+ cells have 151 DIRs compared to KD cells (Figure 1D) whereas DBD cells only have 7 DIRs (Figure 1E).When we used all interactions detected including DIRs for an overlap analysis, we found 31 common interactions detected in both DBD and DBD+ (Figure 1F).We further assessed the distance between these interactions and identified that DBD unique interactions (mean=54Mbp) were the most distant followed by common (mean=39.5Mbp)and DBD+ unique interactions (mean=29.7Mbp).Taken together, these data demonstrate that the DBD-4 helix of FLI domain is required for EWS::FLI to restructure 3D chromatin in A673 cells.

Altered TAD structure in DBD cells is linked to GGAA microsatellite binding
To investigate the necessity of the DBD-4 helix for higher-order chromatin organization, we carried out topologically associated domain (TAD) analysis using HiCExplorer suite of tools (Wolff et al., 2020).TADs are functional subunits of chromatin that define regulatory landscape (Szabo et al., 2019).Within these domains, chromatin interactions including those involving regulatory elements are spatially confined, often by insulators (Dixon et al., 2012;Hou et al., 2012;Nora et al., 2012).
Moreover, genes located in the same TAD are typically co-regulated together (Dixon et al., 2016;Ramírez et al., 2018) and this co-regulation is facilitated by interactions between enhancers and promoters, which also tend to occur within the confines of a same TAD (Bonev et al., 2017).Our initial step involved identifying TADs in both DBD and DBD+ matrices with hicFindTADs at 10kb, 25kb, 50kb and 100kb resolutions.Subsequently, we computed differential TADs with hicDifferential-TAD.This analysis entailed a comparison of precomputed TAD regions of DBD and DBD+ matrices with the same regions of KD matrix.The purpose was to detect TADs that appeared specifically in response to the expression of DBD and DBD+ constructs in comparison to KD sample (Figure 2A).
At each resolution, the number of TADs detected when DBD+ was expressed was far greater than when DBD was expressed.
To exclude duplicated TADs when combining TADs from different resolutions, we used the default threshold value of 5000bp as the minimum boundary difference required to consider two TADs as distinct.With this approach, we found 6,690 total TADs for DBD and 15,412 total TADs for DBD+ (Figure 2B).To understand the local biological context of these total TADs in DBD and DBD+ cells, we annotated the whole width of the TADs with CTCF and FLAG CUT&Tag peaks (FDR < 0.05, FC >8, counts > 80, IDR < 0.01) (Figure 2B).In DBD cells, we detected 11.2% of the TADs bound by CTCF only, 12.9% bound by DBD FLAG only, 36.3% bound by both CTCF and DBD FLAG, and 39.6% bound by neither.In contrast, for DBD+, we detected 7.4% and 7.0% of the TADs bound by only CTCF and only DBD+ FLAG, respectively, 29.0% bound by both and 56.6% bound by neither.These observations highlight both the increased number of total TADs detected in DBD+ cells and a notable increase in the percentage of TADs detected in DBD+ cells with no CTCF or FLAG association.This suggests a broader role for DBD-4 helix in genome-wide reorganization of TADs by EWS::FLI, consistent with the results discussed above relating to Figure 1.
Next, we overlapped total TADs from DBD and DBD+ conditions and identified 2,530 TADs common to both.There were 4,345 unique TADs in DBD cells and 12,971 unique TADs in DBD+ cells (Figure 2C).To understand the transcriptional activity driven by the DBD-4 helix at GGAA repeats, we focused on the whole width of unique TADs containing DBD and DBD+ FLAG peaks at GGAA microsatellites (Figure 2D-2H).This analysis narrowed the TADs down to 868 unique ones for DBD cells and 2,460 unique TADs for DBD+ cells (Figure 2E).We observed that within these TAD subsets containing FLAG peaks at GGAA microsatellites, the intensity of the DBD+ FLAG peaks was higher compared to DBD FLAG peaks (Figure 2D).Moreover, TADs uniquely bound by DBD+ at  These findings suggest that DBD+ accesses longer and denser microsatellites within larger TADs 144 and binds to these sites with higher binding intensity than DBD.Furthermore, these TADs with 145 DBD+ bound microsatellites show greater changes in genes expression suggesting a functional 146 role for the DBD-4 helix in gene regulation within TADs.

147
TAD boundaries are regions between TADs that act as insulators (Dixon et al., 2012) and are 148 often marked by active chromatin (Ay et al., 2014).We conducted a similar analysis as above in a 149 20kb region around unique TAD boundaries (Supplementary Figure 2B-2G).We observed a compa-150 rable percentage of the unique TAD boundaries were bound by only CTCF peaks in DBD and DBD+ 151 cells (36.3% and 32.2%, respectively).The percentage of boundaries with FLAG peaks in DBD and 152 DBD+ cells were 15.2% and 8.6%, respectively (Supplementary Figure 2B).When we considered 153 boundaries that overlap with FLAG peaks at microsatellites (Supplementary Figure 2C-2G), we ob-154 serve similar findings as those spanning the whole width of TADs (Figure 2D, 2F-2H).The FLAG 155 peaks at DBD+ specific TAD boundaries were of higher intensity than FLAG peaks at DBD specific 156 TAD boundaries (Supplementary Figure 2C).Next we assessed gene expression at TAD boundaries 157 and found DBD+ boundaries contained more significantly regulated genes (Supplementary Figure  based on the linear genomic distance between loop anchors (Figure 3A).The only range with a 171 difference between DBD and DBD+ cells was short-range loops.Specifically, we found that DBD+ 172 expressed cells gained 1,913 more short-range loops than DBD expressed cells compared to KD.

173
Both conditions had a similar number of short-range lost loops (5,100 and 5,463 for DBD and DBD+ 174 respectively, Figure 3A).For an overlap analysis between DBD and DBD+ conditions, we included all 175 lost loops and found 5610 loops that were commonly lost in both conditions (Supplementary Figure 176 3A).When the same overlap analysis was carried out for DBD and DBD+ gained loops of all ranges, 177 we found only 219 common loops (Figure 3B).This small overlap of gained loops suggested that Transcription factors have been shown to establish super-enhancers at key genes that drive cellular identity (Whyte et al., 2013).We therefore sought to understand whether DBD-4 helix participates in formation of super-enhancers.We utilized ROSE algorithm to define enhancers and super-enhancers in DBD and DBD+ cells (Lovén et al., 2013;Whyte et al., 2013).We found 9,480 and 10,370 typical enhancers in DBD and DBD+, respectively (Figure 4D).We also identified 707 and 454 super-enhancers in DBD and DBD+, respectively.Although the super-enhancers associated with DBD+ were fewer in numbers compared to DBD, these super-enhancers were larger in size (Figure 4E) and contained more H3K27ac peaks (Figure 4D).We also assessed the expression level of genes associated with super-enhancers and found DBD+ super-enhancer genes had more gene activation (Figure 4F) and deactivation (Supplementary Figure 4A).These data suggest DBD-4 helix is required in formation of enhancers and super-enhancers, therefore important in regulation of genes from those highly clustered regions of the genome.

DBD-𝛼4 helix promotes binding at longer and denser GGAA microsatellites
The results of TAD, loop and enhancer analysis hinted towards a mechanism of DBD-4 helix in binding GGAA microsatellites that differed from our prior CUT&RUN analysis which showed no difference in binding (Boone et al., 2021).We decided to further investigate binding patterns of DBD and DBD+ FLAG peaks genome-wide from our CUT&Tag dataset generated for this study.We found 8,118 DBD+ and 10,101 DBD FLAG peaks genome-wide that passed the threshold of FDR < 0.05, FC > 8, counts > 80, IDR < 0.01.When we overlapped the two sets of FLAG peaks, we found 6,102 common, 3,999 DBD unique and 2,016 DBD+ unique FLAG peaks (Figure 5A).When each group of FLAG peaks was overlapped with genomic GGAA microsatellites, we found that 70.68% of DBD+ unique and 73.04% of common peaks bind at GGAA microsatellites (Figure 5B).When we considered DBD unique peaks, we found GGAA microsatellites binding shifted down to 53.04% (Figure 5B, DBD bar).Common FLAG peaks were found to have the highest peak intensity, followed by DBD+ peaks then DBD peaks (Figure 5C).Taken together, these data demonstrate a role of the DBD-4 helix in binding at a subset of GGAA microsatellites.
We therefore further characterized the GGAA microsatellites bound by these 3 sets of peaks: DBD unique, common and DBD+ unique peaks (Figure 5D-H).We assessed the length of microsatellites and found uniquely DBD bound GGAA microsatellites to be shorter (mean=36.61bp)than uniquely DBD+ bound GGAA microsatellites (43.25bp) (Figure 5D).GGAA microsatellites bound by both DBD and DBD+ were the longest (50.94bp) (Figure 5D).When we quantified the total number of GGAA repeats, we found that the mean number of GGAA motifs in uniquely DBD bound GGAA microsatellites was 4.66 compared to 7.78 in uniquely DBD+ bound GGAA microsatellites and 8.70 in GGAA microsatellites commonly bound in both conditions that were significantly different from each other (Figure 5E).Next, we quantified the maximum consecutive GGAA repeats within a microsatellite and found that the DBD unique peaks contained GGAA repeats no longer than 2 repeats (Figure 5F) whereas common and DBD+ unique peaks were localized at approximately 5 consecutive GGAA repeats (mean = 5.21 and 5.32 for common and DBD+, respectively, Figure 5F).Next we plotted GGAA motif density and observed that DBD unique microsatellites are on average (mean=0.54)less dense compared to common (0.65) and DBD+ (0.69) bound GGAA microsatellites (Figure 5G).Inverse of GGAA density is the max gap or number of non-GGAA bp in the microsatellites.We then assessed the gap size in each of the groups and found DBD+ binds microsatellites that contain the least amount of gaps (mean=8.3)followed by common (9.1) then DBD unique peak microsatellites (10.1) (Figure 5H).These analyses demonstrate that the DBD-4 helix is required for effective binding at GGAA microsatellites that are medium in length and denser in its GGAA motifs.

DBD-𝛼4 helix is necessary for productive transcription hub formation at GGAA microsatellites
Transcription is a process that is regulated across large genomic distances, time, and space (Cramer, 2019;Lee and Young, 2013).One of the emerging concepts that unifies observations of such a regulation is the concept of transcriptional hubs.Transcriptional hubs are actively transcribed regions containing clusters of transcription factors and RNA Pol II, and they are highly characterized by enhancers (Lim and Levine, 2021).EWS::FLI has been shown to form such hubs via the intrinsically disordered region of EWS at GGAA microsatellites (Chong et al., 2018).Since DBD-4 helix is required in effective binding at GGAA microsatellites and the downstream regulation of 3D chromatin, we sought to understand if DBD-4 helix participates in the function of EWS::FLI in the formation of transcriptional hubs.
To characterize EWS::FLI-driven transcription hubs, we turned to a region containing the FCGRT gene since EWS::FLI at this locus is shown to form a hub at GGAA repeats (Figure 6A, Chong et al. (2018)).The FCGRT gene resides in a 250kb TAD (top panel) in both DBD and DBD+ cells.Considering that TADs are highly self-interacting regions and that interactions across different TADs are limited by insulator elements (Dixon et al., 2012;Hou et al., 2012;Nora et al., 2012), we used TADs as proxy for hubs as we did not expect hubs to extend beyond TAD boundaries.Within the hub, we observed instances where DBD and DBD+ clusters appear in similar locations.In DBD cells, however, we noticed increases in the intensity of all peaks except the FCGRT promoter GGAA microsatellite peak (Figure 6A, magenta tracks).When we zoomed in on this microsatellite, DBD and DBD+ peaks had a similar binding pattern (Figure 6B, magenta and green tracks).Despite this similarity, DBD binding did not facilitate the formation of chromatin loops at the FCGRT promoter as efficiently as DBD+ (inverted red arcs).Furthermore, this lack of loop formation correlated with decreased enhancer marks (Figure 6A, green tracks and bar) and decreased expression of the FC-GRT gene (black tracks, padj < 0.05, FC > 2).Even though binding at the FCGRT microsatellite is comparable between DBD and DBD+, DBD is less able to modulate transcription and chromatin at this microsatellite with high GGAA motif density (0.79, Figure 6B).
We also studied the DBD peaks that had higher intensity compared to DBD+ peaks.The majority of these peaks were at microsatellites of varying density (0.35-0.75) and two peaks at the promoters of PRR12 and RRAS were not at GGAA repeats (Figure 6A).These peaks were still near GGAA microsatellites with densities of 0.53 and 0.28 for PRR12 and RRAS peaks respectively.Upon analysis of the RNA-sequencing data, we found that the expression of genes associated with this subset of peaks were not significantly differentially expressed in both DBD and DBD+ cells compared to KD (Supplementary Table 1).The only gene with significantly changed expression was FCGRT and Another locus we investigated is CCND1, where hub formation at GGAA repeats is also reported (Chong et al., 2018).The CCND1 locus is located at the border of a 700kb TAD in both DBD and DBD+ cells (Supplementary Figure 5).However, in DBD+ cells, formation of two smaller TADs nested within the larger TAD is detected with an increased number of loops (Supplementary Figure 5B).DBD and DBD+ peaks cluster in similar locations across the hub.However, DBD+ binding at MYEOV and CCND1 promoters were the only ones that effectively modulated chromatin at these microsatellites to promote gene expression (black tracks, padj < 0.05, FC >2).The CCND1 promoter microsatellite has a density of 1 meaning that it only consists of GGAA repeats without any gaps, whereas MYEOV microsatellites densities were 0.44 and 0.28.These findings further demonstrate the preference of DBD-4 helix for binding at dense GGAA repeats and the increased binding intensity seen for DBD without the DBD-4 helix at shorter and less dense GGAA microsatellites.Taken together, these data provide examples of transcriptional hubs spanning a large genomic distance containing numerous genes and show how transcriptional output at these hubs can be regulated by EWS::FLI binding at GGAA microsatellites.

Discussion
Detailed differential analysis of genomic binding patterns of DBD and DBD+ was not previously performed.In our previous study, we used CUT&RUN to identify DBD and DBD+ peaks and found 90% overlap (Boone et al., 2021).In this study, CUT&Tag showed that this overlap is only 60%.This With the current results, we favor a model whereby DBD-4 helix stabilizes collective binding at high density GGAA microsatellites.Our findings underscore the specific characteristics of GGAA repeats bound by EWS::FLI to drive pathogenesis of Ewing sarcoma.At Ewing-specific gene loci such as FCGRT and CCND1 (Figure 6 and Supplementary Figure 5), we found dense microsatellites.This finding was also recapitulated at the genome-wide level as seen in the preference of DBD+ unique peaks for denser microsatellites compared to common peaks (Figure 5G).The DBD+ unique peaks also show a preference of binding at microsatellites that are longer than the DBD unique peaks and shorter than the common peaks (Figure 5D) aligning with our previous findings that an optimal number of GGAA repeats is required for binding by EWS::FLI in Ewing sarcoma transformation (Johnson et al., 2017b).
Although our data demonstrates that DBD-4 helix is required in collective binding at GGAA repeats, we are still unable to decipher the exact mechanism of stabilization at such repeats with our current set of data.There are a few possible mechanisms that exist.DBD-4 helix could be directly interacting with the DNA or the DNA binding domain of the adjacently bound EWS::FLI molecule.Another possibility is DBD-4 helix affects its function through interactions with other epigenetic regulators or transcription machinery proteins.Alternatively, intramolecular interaction with the EWS domain may stabilize DNA binding and could further promote phase condensates.
We favor the last mechanism since de-novo ability of EWS::FLI binding GGAA repeats depends on the presence of the EWS domain (Johnson et al., 2017a).Further directed studies are needed to address these possible mechanisms.
The multi-omics approach utilized in this study provided us with an opportunity to characterize transcription hubs driven by EWS::FLI genome-wide.We have shown the clustering of EWS::FLI at GGAA microsatellites underlies the formation of local 3D features such as TADs and chromatin loops.If we use TADs as proxies for hubs, we detected thousands more of these hubs in DBD+ cells compared to DBD cells, highlighting the importance of DBD-4 helix in binding at dense GGAA repeats and formation of hubs across the genome.We also observed thousands of loops at unique microsatellites for both DBD and DBD+ cells adding detail to the architecture of hubs, often represented as flower-shaped structures of many loops (Zhu et al., 2021).We also probed another aspect of hubs with our H3K27ac CUT&Tag data: the presence of enhancers and super-enhancers.
We showed that DBD-4 helix promotes more active marks genome-wide compared to DBD leading to formation of enhancers and super-enhancer.Finally, we presented FCGRT and CCND1 loci as specific examples of productive transcription hubs driven by EWS::FLI binding at dense GGAA microsatellites.Our study thus uncovered a surprising role for FLI DBD in the process of hub formation which is usually attributed to the EWS low complexity domain.
Because EWS::FLI is the sole driver mutation in Ewing sarcoma tumors, it is an attractive therapeutic target.However, many approaches to target EWS::FLI have been hampered by common issues that arise with targeting transcription factors, such as its location in the nucleus, its abundance in cells, and its lack of an enzymatic pocket to design a small molecule tailored to target it (Knott et al., 2019).We propose DBD-4 helix is a promising therapeutic target as it doesn't directly bind the major groove of the DNA.Additionally, because the DBD-4 is a helix, it provides a structured region to design small molecules to disrupt its interaction unlike the EWS low complexity domain.Moreover, its importance in maintaining effective binding at GGAA microsatellites offers an opportunity to target EWS::FLI at the most mechanistically important sites in pathogenesis of Ewing sarcoma.Further studies are needed to clarify the mechanism by which the DBD-4 helix promotes effective binding at GGAA microsatellites and regulates transcription hub formation.

Transfection, virus production and transduction
For the generation of retroviruses, HEK293-EBNA cells were co-transfected with retroviral expression plasmids, vesicular stomatitis virus G glycoprotein (VSV-G) and gag/pol packaging plasmids.
Briefly, 2.5 x 106 HEK293-EBNA cells were seeded in a 10 cm tissue culture dish the day before transfection, resulting in 60-70% confluency the day of transfection.10 µg of each plasmid (gagpol, vsv-g, and transfer plasmid) were combined with 2 ml Opti-MEM I Reduced Serum Medium (Gibco 31985070) and 90 µl MirusBio TransIT-LT1 Transfection Reagent (Mirus MIR2306) and incubated at room temperature for 20 minutes.The transfection mix was then added drop wise to the cells in 3 ml culture medium.Virus-containing supernatant was collected every 4 hours on day 2 (48 hours) and 3 (72 hours) post transfection, pooled, filtered and stored at -80°C.Ewing Sarcoma cells were transduced with viral supernatants using polybrene (8 µg/ml), followed by selection with appropriate antibiotics at 48 hours post infection.In the case of knockdown/rescues, cells were selected with 0.5-2 µg/ml Puromycin and 50-150 µg/ml Hygromycin B (Thermo 10687010) for 7-10 days.

Immunodetection
Whole-cell protein extraction was completed using Pierce™ RIPA buffer (ThermoFisher 88901) supplemented with Protease Inhibitor Cocktail (Sigma P8340-5ML) on ice for 30 minutes.For nuclear extracts cell pellets were resuspended in LB1 buffer (50 mM Tris-HCl pH 7.5, 20 mM NaCl, 1 mM EDTA, 0.5% NP-40, 0.25%, Triton-X 100, 10% Glycerol, 1 mM DTT, Protease Inhibitor Cocktail) and incubated on a nutator for 10 minutes at 4C.After a 5 minute spin at 400xg and 4C, the pellets were washed with LB2 buffer (10 mM Tris-HCl pH 7.5, 20 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 1 mM DTT, Protease Inhibitor Cocktail) and centrifuged again for 5 minutes at 400xg and 4C.The nuclear pellets were resuspended in a small volume of RIPA buffer, incubated on ice for 30 minutes, followed by a 30 minute centrifugation step at maximum speed and 4C.
ImageJ software was used to perform densitometry analysis.

Soft agar assays
The anchorage-independent growth capacity of Ewing sarcoma cells was assessed using soft agar assays.Cells were seeded at a density of 7500 cells in 6-cm3 tissue culture dish in duplicate in 0.8% SeaPlaque GTG agarose (Lonza 50111) mixed with Iscove's Modified Dulbecco's medium (Gibco 12200-036) containing 20% FBS, penicillin/streptomycin/glutamine and puromycin/hygromycin. Agars were imaged at least 14 days after seeding and colony counts were quantified using ImageJ software (V1.51).

RNA-sequencing experiments, data processing, and analysis
RNA-sequencing was performed on 3 biological replicates of KD, DBD and DBD+.Total RNA were extracted using RNeasy Extraction Kit and submitted to the Nationwide Children's Hospital Institute for Genomic Medicine for RNA quality measurement, library preparation, and sequencing.Briefly, cDNA libraries were prepared from total RNA with TruSeq Stranded mRNA Kit (Illumina 20020594) and sequenced on Illumina NovaSeq SP to generate 150-bp paired-end reads.We used in-house RNA-sequencing pipeline to process and analyze the data.Low-quality reads (q < 10) and adapter sequences were trimmed to align to hg19 genome using STAR (Dobin et al., 2013).After alignment, the reads were counted and differential analysis performed using DESeq2 (Love et al., 2014).

Micro-C experiments
Micro-C kits (Catalog 21006) purchased from Dovetail Genomics were used to prepare Micro-C libraries.For each condition, multiple aliquots of 110 6 cells were harvested and frozen at -80°C for at least 30 minutes.Cells were thawed in room temperature and resuspended in first PBS containing 0.3M DSG then in 37% formaldehyde to crosslink DNA.Cells were then digested with various amount of MNase to achieve the digestion profile of 40%-70% mononucleosome peak observed on TapeStation D5000 HS Screen Tape.Conditions that are to be analyzed comparatively were digested to a similar range of mononucleosome peaks (50%-70%).Once desired digestion profiles achieved, the cells were lysed and the chromatin was captured with beads to perform proximity ligation.Libraries were prepared per the protocol of Micro-C kit and each library was indexed with unique primer pairs from IDT (10009816 and 10010147).Micro-C libraries were then shallow sequenced at 7 to 8 million (2 x 150bp) read pairs on Illumina NovaSeq6000 and the QC analysis pipeline provided from Dovetail Genomics were used to assess the quality of each library.Libraries that passed the QC step was then sequenced up to 300M read pairs on NovaSeq6000.

Micro-C data processing and analysis
Micro-C libraries of 2 biological replicates of KD, DBD and DBD+ were prepared.Sequenced libraries were processed per instructions of Dovetail Genomics.Briefly, fastq files were aligned to hg19 reference genome using BWA-MEM algorithm with options -5SP to map mates independently (Li and Durbin, 2010).Next, parse module from pairtools (?) was used to find ligation junctions in Micro-C libraries with options min-mapq 40 (alignment with mapq < 40 will be marked as multi) and max-inter-align-gap 30 (if the gap is 30 or smaller, ignore the map, if the gap is >30, mark as "null" alignment).The parsed pair is then sorted using pairtools sort and PCR duplicates were removed with pairtools dedup.The pairtools split command was used to split the final .pairsaminto .bam(then sorted with samtools sort) and .pairsfiles.Using Juicer Tools, .pairsfiles were converted into HiC contact matrices (Durand et al., 2016b).HiC matrices were then converted to mcool matrices using hic2cool (Durand et al., 2016a).
For MDS plot of individual replicates (Figure 1B), 500kb resolution cool matrices were converted gi interaction object using hicConvertFormat from HiCExplorer (v.3.7.2) (Wolff et al., 2020).Then using diffHiC package (Lun and Smyth, 2015), bins with low average abundance and low absolute counts were filtered out.Filtered reads were then scaled using library size and bin pairs that were on the diagonal line were also removed from analysis.Joint normalization of all replicates were carried out with diffHiC.Specifically, normOffsets function was used to remove trended biases with loess normalization and then a new set of log2-transformed counts adjusted by the negative binomial offset were computed.Batch effect was removed using removeBatchEffect from limma package (Ritchie et al., 2015) before plotting the top 1000 interactions.For distance-decay plot (Figre 1C), individual replicates were combined.First, the combined matrices were normalized using For TAD analysis, individual replicates were combined and then scaled using the smallest library size with hicNormalize from HiCExplorer (v.3.7.2) (Wolff et al., 2020).Then matrices at 10kb, 25kb, 50kb and 100kb resolutions were ICE-corrected with the thresholds determined from diagnostic plots.First, using hicFindTADs function, we called TADs at the previously mentioned 4 resolutions for DBD and DBD+ matrices.Then, hicDifferentialTAD used to compute differential TADs by comparing the precomputed DBD and DBD+ TAD regions with the same regions of KD matrix.Differential TADs from each resolution were combined using hicMergeDomains with default -value of 5000 to account for duplicated TADs.For CUT&Tag peak annotation, peaks (FDR < 0.05, log2 Fold-Change > 8, mean normalized counts of signal > 80, and IDR < 0.01) were overlapped with findOverlaps functions from GenomicRanges package (Lawrence et al., 2013).
For loop calling, combined replicates at 1kb resolution matrices that were scaled and normalized in same manner for TAD analysis were used with Mustache (v.1.2.0) (Roayaei Ardakany et al., 2020).Mustache uses scale-space theory in computer vision to detect chromatin loops.For differential loops compared to KD matrix, diff_mustache.py was used to detect loops that were gained in DBD and DBD+ compared to KD and loops that were lost in DBD and DBD+ compared to KD matrix.

Statistical analysis
When comparing means of two groups, two-sided Student's t test was used.When comparing more than two groups, ANOVA test was used with Tukey honest significant differences test.* P value < 0.05, ** P value < 0.01, and *** P value < 0.001

Figure 1 .
Figure 1.DBD-4 helix of FLI domain is required to restructure chromatin in A673 cell. A. A schematic of DBD and DBD+ constructs used in shRNA knock-down and rescue experiments.B. Multidimensional scaling (MDS) plot of top 1000 interactions (500kb resolution) in each biological replicates.C. Genome-wide interaction frequency (ICE-corrected Micro-C counts) over genomic distance (bp) at 5kb resolution.D. Volcano plot showing differentially interacting regions (DIRs) detected at 500kb resolution for DBD+ replicates versus KD replicates.E. Volcano plot showing DIRs detected at same resolution for DBD replicates versus KD replicates.F. Violin plot of interaction distance (unit length of 500kb) for interactions (p adj < 0.2, log2FC > 0.5) detected in DBD only (mean=54Mbp), in both DBD and DBD+ conditions (mean=39.5Mbp),and in DBD+ only (mean=29.7Mbp).Boxplots depict the minimum, first quartile, median, third quartile, and maximum.*P value < 0.05, *** P value < 0.001

Figure 2 .
Figure 2. Altered TAD structure in DBD cells is linked to GGAA microsatellite binding.A. Number of TADs detected in DBD and DBD+ compared to KD at resolutions of 10kb, 25kb, 50kb and 100kb.B. Proportion of TADs (compared to KD) bound by FLAG, CTCF, both, or neither.C. Venn diagram of overlap between DBD and DBD+ TADs (compared to KD).D-H.Comparison of DBD and DBD+ unique TADs.D. Binding intensity of unique FLAG peaks (FDR < 0.05, FC > 8, counts > 80, IDR < 0.01) across the width of DBD and DBD+ unique TADs.E. Width of DBD and DBD+ unique TADs in bp.F. Expression level of significantly upregulated genes within unique TADs in DBD and DBD+ bound by FLAG.G. Length of microsatellites bound by unique FLAG peaks in DBD and DBD+ conditions in bp.H. Density of GGAA motif in the microsatellites calculated as (# of motif x 4)/(length of microsatellites) in DBD and DBD+ unique TADs bound by unique FLAG peaks.Boxplots depict the minimum, first quartile, median, third quartile, and maximum.*** P value < 0.001 143 158

Figure 3 .
Figure 3. DBD and DBD+ form loops at GGAA microsatellites, but DBD+ rescues more and shorter loops.A. Number of loops detected in DBD and DBD+ compared to KD at resolutions of 1kb at short-range (<50kb), mid-range (50kb-500kb), and long-range (>500kb).B. Venn diagram of overlap between DBD and DBD+ uniquely gained loops (compared to KD). C. Expression level of significant genes overlapped with uniquely gained loop anchors of DBD and DBD+.Means = 0.45 (DBD) and 0.77 (DBD+) D. Peak intensity of unique FLAG peaks (FDR < 0.05, FC > 8, counts > 80, IDR < 0.01) at anchors of uniquely gained loops in DBD and DBD+ cells.Means = 5.01 and 6.30 per DBD and DBD+ E. Length of microsatellites bound by unique FLAG peaks at the anchors of DBD and DBD+ uniquely gained loops in bp.Means = 22.5bp (DBD) and 38bp (DBD+) F. Density of GGAA motif in the microsatellites calculated as (# of motif x 4)/(length of microsatellites) at the anchors of DBD and DBD+ uniquely gained loops bound by unique FLAG peaks.Means = 0.51 (DBD) and 0.70 (DBD+).Boxplots depict the minimum, first quartile, median, third quartile, and maximum.*P value < 0.05, *** P value < 0.001

178Figure 4 .
Figure 4. DBD+ rescues de-novo enhancer formation at microsatellites. A. PCA plot of H3K27ac peaks in biological replicates of KD, DBD and DBD+.B. Venn diagram of overlap of H3K27ac peaks (FDR < 0.05, FC > 8, counts > 80, IDR < 0.01) of DBD and DBD+.C. Percentage of H3K27ac peaks at microsatellites in common, DBD unique and DBD+ unique peaks.D. Number of H3K27ac peaks constituting typical and super enhancers called in DBD and DBD+ conditions.E. Constituent size (in bp) of typical and super enhancers in DBD and DBD+ conditions.F. Expression level of significantly upregulated genes at DBD and DBD+ super enhancers.Boxplots depict the minimum, first quartile, median, third quartile, and maximum.Circles depict outliers.*** P value < 0.001.

Figure 6 .
Figure 6.DBD-4 helix promotes formation of transcription hubs by effective binding at microsatellites. A. 250kb region on chr 19 containing FCGRT and other genes.TADs are depicted on 1kb matrices (DBD/KD and DBD+/KD).Uniquely gained loops are shown as red inverted arcs.FLAG CUT&Tag bigwig tracks depicted in magenta.GGAA microsatellites in hg19.CTCF CUT&Tag track is in blue middle row.H3K27ac tracks are in green.Enhancers and super-enhancers are shown as green bars.Gene expression is in black tracks.B. FCGRT promoter region containing GGAA microsatellites.C. NOSIP promoter region containing GGAA repeats.D. Model of EWS::FLI-driven transcription hub provided us with an opportunity to discern meaningful difference in genomic localization and sequence characteristics of bound loci across three groups: DBD unique, common, and DBD+ unique peaks.Importantly, the methods utilize different enzymes for the digestion step: Micrococcal nuclease (MNase) for CUT&RUN and Tn5 transposase for CUT&Tag.Thus, a few possible factors might explain the differences in peak detection between the two techniques.MNase enzyme has cleavage preferences at sites rich in adenylate, deoxyadenylate, and thymidylate(Cuatrecasas et al., 1967).It also shown to have a preference for open chromatin or nucleosome-free regions(Henikoff et al., 2011;Weiner et al., 2010).Because the specific effects of EWS::FLI binding on GGAA repeats in chromatin is unknown, these loci may be particularly susceptible to excess cleavage by MNase in a way that biases CUT&RUN results.In contrast, the Tn5 transposase is especially efficient at integrating adapters into open chromatin without chromatin degradation and may, therefore, be particularly efficient at capturing binding at GGAA repeat regions.Additionally, CUT&RUN generally has lower signals, higher background, and lower yields compared to CUT&Tag(Kaya-Okur et al., 2019).We thus attribute the difference in detection of FLAG peaks in our previous and current studies to the different enzymes and their ability to recognize and access repetitive elements such as GGAA repeats.
hicNormalize function from HiCExplorer to scale the libraries to the smallest library(Wolff et al., 2020).Scaled libraries were then plotted for diagnostic plots to determine the thresholds to use in hicCorrectMatrix function for ICE normalization.ICE-corrected 5kb matrices were then plotted with hicPlotDistVsCounts.For differentially interacting region analysis (Figure1D-F), multiHiCcompare package was used(Stansfield et al., 2019).Briefly, individual replicates of KD, DBD and DBD+ were loess normalized and pairwise comparison of DBD+ to KD and DBD to KD was done using QLF (quasi-likelihood) method with batch effect correction.Volcano plots of differentially interacting regions with padj < 0.05 and fold-change > 1.5 plotted for each comparison (DBD+ and DBD vs KD).
A. DBD-4 helix of FLI1 depicted on dsDNA (PDB) B. Knock-down of endogenous EWS::FLI detected with FLI1 ab and rescue of wtEF, DBD, and DBD+ detected with FLAG ab. C. A PCA plot of RNA-seq experiments in A673 cells.D. Representative image of soft agar colony plates and quantification of three biological replicates.

Table 1 .
Supplementary Table1.Differential expression of FCGRT hub genes in DBD and DBD+ compared to KD.