HOTAIR primes the Ewing sarcoma family of tumors for tumorigenesis via epigenetic dysregulation involving LSD1

The EWS-FLI1 fusion protein drives oncogenesis in the Ewing sarcoma family of tumors (ESFT) in humans, but its toxicity in normal cells requires additional cellular events for oncogenesis. We show that the lncRNA HOTAIR maintains cell viability in the presence of EWS-FLI1 and redirects epigenetic regulation in ESFT. HOTAIR is consistently overexpressed in ESFTs and is not driven by EWS-FLI1. Repression of HOTAIR in ESFT cell lines significantly reduces anchorage-independent colony formation in vitro and impairs tumor xenograft growth in vivo. Overexpression of HOTAIR in human mesenchymal stem cells (hMSCs), a putative cell of origin of ESFT, and IMR90 cells induces colony formation. Critically, HOTAIR-expressing hMSCs and IMR90 cells remain viable with subsequent EWS-FLI1 expression. HOTAIR induces histone modifications and gene repression through interaction with the epigenetic modifier LSD1 in ESFT cell lines and hTERT-hMSCs. Our findings suggest that HOTAIR maintains ESFT viability through epigenetic dysregulation.

Significance
While the EWS-FLI1 fusion gene was determined to be the oncogenic driver in the overwhelming majority of ESFT, it is toxic to cell physiology and requires one or more additional molecular events to maintain cell viability. As these tumors have surprisingly few genetic mutations at diagnosis, epigenetic changes have been considered to be such an event, but the mechanism by which these changes are driven remains unclear. Our work shows that HOTAIR is consistently expressed among ESFT and induces epigenetic and gene expression changes that cooperate in tumorigenesis. Furthermore, expression of HOTAIR allows for cell viability in the setting of subsequent EWS-FLI1 expression. Our findings elucidate new steps of malignant transformation in this cancer and identify novel therapeutic targets.


The Ewing sarcoma family of tumors (ESFT) consists of primitive cancers of the bone and soft tissues that arise in children and young adults. These tumors harbor chromosomal translocations that result in the fusion of the 5' portion of the EWSR1 gene to the 3' end of an ETS family member, with >85% resulting in the EWS-FLI1 fusion gene(1). The resultant oncoprotein alters gene expression and alternative splicing(2, 3) and is necessary for tumor viability in laboratory models of ESFT(4). However, exogenous expression of EWS-FLI1 is toxic to normal cells and other cancer cell types, inducing rapid apoptosis or senescence(5). This toxicity suggests that additional cellular events must allow tolerance of the oncoprotein.
Three large genomic sequencing studies identified some recurrent genetic mutations in ESFT, including loss-of-function mutations in TP53 and CDKN2A, but these mutations were found in only 25-30% of all samples. Thus, additional molecular changes, such as transcriptional or epigenetic events, must occur to allow tumor cell survival.

Long noncoding RNAs (lncRNAs) have significant roles in the regulation of gene expression, either directly or epigenetically. The lncRNA HOTAIR was specifically shown to direct epigenetic repression in trans across the genome, in part through recruitment of the LSD1/REST/CoREST complex at its 3' end(9). This RNA-protein complex alters histone methylation at histone 3 lysine 4 (H3K4), which LSD1 demethylates from a dimethylated state (Me2) to a monomethylated (Me1) or unmethylated state. This histone modification represses gene expression and maintains an embryonic state in tissues where HOTAIR is expressed. HOTAIR is abnormally overexpressed in numerous cancers (reviewed in (10)), and HOTAIR has been shown to epigenetically modify gene expression in these cancers (11)(12)(13).

In this study, we evaluated the function of HOTAIR in ESFT. We confirmed that HOTAIR is overexpressed in ESFT cell lines and primary tumors, as compared to normal tissues and to human mesenchymal stem cells (hMSCs), a putative cell of origin of these tumors (14,15). We demonstrated that HOTAIR is necessary for the formation and viability of ESFT cell line-derived anchorage-independent colonies. We also showed that repression of HOTAIR by shRNA reduced tumor xenograft formation from ESFT cell lines in immunodeficient mice. In contrast, overexpression of HOTAIR in primary and hTERT-immortalized hMSCs induces anchorage-independent colony formation. HOTAIR expression in hMSCs and IMR90 fibroblasts also allows for subsequent viable expression of EWS-FLI1.
We verified that HOTAIR associates with LSD1 in ESFT cell lines, and interaction with LSD1 is necessary for colony formation in hMSCs. We modulated HOTAIR expression in ESFT cell lines and hMSCs, alone and with concomitant EWS-FLI1 expression. This change in HOTAIR expression was associated with significant gene expression changes across the transcriptome, including a set of genes modulated across all models, as determined by next-generation RNA-Sequencing (RNA-Seq). We further demonstrated by Chromatin Immunoprecipitation and Sequencing (ChIP-Seq) that HOTAIR expression induced H3K4 demethylation across the genome including in a significant number of genes with corresponding repression of expression.

HOTAIR is overexpressed, independently of EWS-FLI1, in ESFT cell lines and primary tumors as compared to normal tissues and mesenchymal stem cells
We first assessed the expression of HOTAIR in ESFT as compared to normal tissues, utilizing the OncoGenomics DB of the National Cancer Institute (https://pob.abcc.ncifcrf.gov/cgi-bin/JK). We examined next-generation RNA-sequencing (RNA-Seq) data annotated there for ESFT, including 50 cell lines and 72 primary tumor RNA samples (6), with expression normalized to that of a set of samples of normal adult tissues (Figure 1A). HOTAIR was expressed in all samples tested, with over a log-fold higher expression in >90% of cell lines and tumors as compared to normal tissues.

We next examined expression in ESFT as compared to other cancer types. Using the R2 Genomics Analysis and Visualization platform (http://R2.amc.nl), we compared HOTAIR expression among cancer gene expression datasets that were analyzed using Affymetrix u133 microarrays (Figure 1B). The three datasets with the highest average HOTAIR expression were comprised of ESFT samples (16)(17)(18), as compared to 311 other datasets including other cancer cell lines, non-ESFT primary tumor sets, and sets of mixed normal and cancerous samples.

We validated these findings directly by analysis of 13 ESFT cell lines, 22 primary tumor RNA samples, and 2 primary hMSC samples by RT-qPCR. Using a threshold of two-fold expression as compared to hMSC, 11/13 cell lines and 17/22 tumor samples had high HOTAIR expression (Figure 1C). This consistent overexpression of HOTAIR in ESFT supports a functional role for the lncRNA in this cancer.

We hypothesized that HOTAIR expression in ESFT is an EWS-FLI1-independent event. We confirmed this by knocking down EWS-FLI1 expression in 4 ESFT cell lines by siRNA (Figure 1D). In all four cell lines, knockdown of EWS-FLI1 did not result in loss of expression of HOTAIR, and in two lines this knockdown actually led to a modest upregulation of HOTAIR expression. Thus, HOTAIR expression in ESFT cell lines is not driven by EWS-FLI1 and represents an independent biological pathway.
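
As an illustration only, the two-fold threshold applied to the RT-qPCR data can be sketched in Python. The sketch assumes that relative expression is computed with the 2^-ΔΔCt method against a single reference gene and that "high" means at least two-fold over the hMSC control; neither detail is specified in the text. The function names and Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change versus a control sample, using 2^-ddCt (assumed method)."""
    delta_sample = ct_target - ct_reference
    delta_control = ct_target_ctrl - ct_reference_ctrl
    return 2.0 ** -(delta_sample - delta_control)


def is_high_expresser(fold_change, threshold=2.0):
    """Apply a two-fold cutoff relative to the control (>= is an assumption)."""
    return fold_change >= threshold


# Hypothetical Ct values, for illustration only.
control = {"hotair": 30.0, "reference": 18.0}  # stands in for the hMSC control
samples = {
    "line_A": {"hotair": 26.0, "reference": 18.0},
    "line_B": {"hotair": 29.5, "reference": 18.0},
}

calls = {}
for name, ct in samples.items():
    fold = relative_expression(
        ct["hotair"], ct["reference"], control["hotair"], control["reference"]
    )
    calls[name] = (fold, is_high_expresser(fold))
    print(f"{name}: fold change {fold:.2f}, high expression: {calls[name][1]}")
```

Counting the samples flagged as high in such a table would give tallies of the kind reported above (for example, 11/13 cell lines).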

HOTAIR expression correlates with anchorage independent colony formation in ESFT cell lines and hMSCs while maintaining MSC characteristics
We examined the phenotypic effects of HOTAIR in ESFT cell lines by first knocking down expression using shmiRNA in the ES2, TC32, and SK-ES cell lines. We repressed expression to 20-50% of baseline HOTAIR expression (Figure 2A), with some variation among the cell lines. We were unable to maintain viable cells with expression below this level for each cell line, supporting a role for HOTAIR in maintaining cell viability. We also confirmed that loss of HOTAIR expression had no significant effect on EWS-FLI1 protein expression (SI 1A).

We next examined the effects of HOTAIR repression on the growth of ESFT cells. Proliferation of these cells in two-dimensional tissue culture did not show any significant difference at this level of repression (SI 1B). However, prior work in Ewing sarcoma biology demonstrated that inhibition of LSD1, shown in other cancers to interact with HOTAIR, had no significant effect on proliferation but did alter 3D tumorsphere growth (19). Accordingly, we evaluated growth of tumorspheres in soft agar. We found in all three cell line models that repression of HOTAIR resulted in a significant decrease in anchorage-independent colony formation in soft agar (Figure 2B).

hMSCs are the presumptive cell of origin for Ewing sarcoma but they poorly form colonies in 3D culture in vitro and lack tumorigenicity in vivo(20, 21). We evaluated how exogenous HOTAIR expression affected the growth of these cells. We initially used hTERT-immortalized hMSCs for this assay for ease of use in in vitro culture. These cells phenocopy primary hMSCs except for avoiding senescence, and they poorly form colonies in 3D culture. We overexpressed HOTAIR or a control vector in these cells, at levels comparable to ESFT cell lines (Figure 2C). HOTAIR induced colony formation in soft agar, whereas expression of the empty vector did not (Figure 2D). This ability to enable anchorage-independent growth supports a role for HOTAIR not only in cell proliferation and viability but also in tumorigenesis. We repeated this experiment with early passage primary mesenchymal stem cells and found that overexpression of HOTAIR induced colony formation in soft agar (Figure 2C and D). We also used the wildtype IMR90 lung fibroblast cell line, which has been shown to be useful as a model of EWS-FLI1 expression (22) and does not form colonies in 3D culture. Again, overexpression of HOTAIR induced anchorage-independent colony formation in these cells (Figure 2C and D).

MSCs are defined by characteristics that include the ability to grow on plastic, expression of the surface markers CD73, CD90, and CD105, and maintenance of differentiation capacity along mesenchymal lineages(23). ESFT also have most of these properties, distinct from other sarcomas(24, 25). We confirmed that HOTAIR-expressing hTERT-hMSCs maintained these properties, as did the ESFT cell lines with knockdown of HOTAIR expression. Analysis by flow cytometry showed that control cells and cells with modulated HOTAIR expression had strong surface expression of CD73, CD90, and CD105, and had no significant alteration of other lineage markers (SI 2). The cells with modulated HOTAIR expression were also able to be differentiated into osteoblasts or adipocytes in patterns similar to their GFP-expressing controls (SI 3).

HOTAIR interacts with LSD1 in ESFT and requires its 3' interaction domain for tumorsphere formation capacity in hMSCs
We performed formaldehyde-crosslinked RNA immunoprecipitation and confirmed that, in ESFT cell lines, HOTAIR interacts with LSD1 (SI 4A). We wanted to evaluate if this interaction is necessary for the anchorage-independent colony formation seen in the hMSC models. The interaction domain of HOTAIR with LSD1 has been previously defined(9). We generated a HOTAIR deletion construct that lacked the 3' LSD1 interaction domain (Figure 3A). We then expressed this construct or wild-type HOTAIR in hTERT-hMSCs. We confirmed that expression of each of the constructs was not significantly different amongst the pools of cells (SI 4B), then repeated the anchorage-independent colony formation assay. As compared to the cells with overexpression of wild-type HOTAIR, those cells with expression of the mutant HOTAIR had a markedly diminished or absent colony formation capacity in soft agar (Figure 3B). This loss of function suggests that HOTAIR must interact with LSD1 to support tumor formation in hMSCs and, analogously, in ESFT.

HOTAIR primes hMSCs for tolerance of EWS-FLI1 and maintains cell viability
As previously noted, exogenous expression of EWS-FLI1 in most normal and malignant cell lines rapidly causes cell death. We confirmed this phenotype by expression of either GFP alone or EWS-FLI1 and GFP in hTERT-hMSCs. Within 48 hours, the EWS-FLI1-overexpressing hTERT-hMSCs morphologically changed with cell shrinkage and nuclear collapse, and no viable cells could be seen at 72 hours (Figure 3C), whereas the GFP-expressing cells remained unchanged. We created a vector for simultaneous co-expression of HOTAIR and EWS-FLI1 and transfected hTERT-hMSCs with this vector. The hTERT-hMSCs again underwent rapid apoptosis. We next used the hTERT-hMSCs that stably expressed HOTAIR and overexpressed EWS-FLI1 in those cells. In marked contrast, these cells stably expressed both EWS-FLI1 and HOTAIR, as confirmed by western blot and RT-qPCR respectively (SI 5A and B), and remained viable without morphologic evidence of differentiation (Figure 3C). These cells also formed tumorspheres in soft agar, at an increased rate as compared to hTERT-hMSC-HOTAIR cells.
We were similarly able to stably express EWS-FLI1 in primary hMSCs and IMR90 cells with stable HOTAIR expression (SI 5A-C). These data support the hypothesis that HOTAIR expression in the ESFT precursor cell primes the cell to tolerate the subsequent chromosomal translocation that results in EWS-FLI1, driving malignant transformation to ESFT.

HOTAIR is necessary but not sufficient for tumor xenograft growth in immunodeficient mice
We next evaluated how HOTAIR affects the tumor-initiating capacity of ESFT cells in vivo. We implanted the ESFT cell lines above, with shRNA-mediated knockdown of HOTAIR or nonsilencing control, into the flanks of SCID mice. The tumor xenografts of cells with repressed HOTAIR expression had significantly slower tumor growth across all three cell lines as compared to control (p<0.001 for all three cell lines, Figure 4). We also implanted into the SCID mice the primary hMSC, hTERT-hMSC, and IMR90-GFP, -HOTAIR, and -HOTAIR-EWS-FLI1 cells, at cell numbers up to 1x10^7 cells/implant, but no tumors grew in any of the mice after 12 weeks. These data suggest that HOTAIR is necessary but not sufficient for tumor growth in cells in the context of concomitant EWS-FLI1 expression.

HOTAIR expression in hMSCs and ESFT cells is associated with gene expression
cell types (Figure 5A, SI data). Among these transcripts, we found a set of genes that are downregulated in cells with increased HOTAIR expression independently of EWS-FLI1. We also defined a set of transcripts whose expression is upregulated in cells with increased HOTAIR expression independently of EWS-FLI1, as well as a larger set of transcripts specifically upregulated by EWS-FLI1. These patterns are similarly seen in the complete dataset of all
differentially expressed genes (SI Data were upregulated when cells were treated with GapmeRs as compared to control (Figure 5B, SI data). This suggests that HOTAIR expression plays a greater role in directing gene repression in ESFT cells than in driving gene expression.

We evaluated which biological pathways were affected by HOTAIR expression in these cell line model sets. We noted significant heterogeneity of the specific genes affected within each cell line, both in terms of basal expression and change in expression relative to HOTAIR. As such, we identified those differentially-expressed genes in the hTERT-hMSC cells and any of the ESFT cell lines and defined common biological pathways affected by those genes (Table 1, SI Data). HOTAIR expression correlated with repression of genes involved in cell differentiation, extracellular matrix organization, and cell-cell adhesion, and with upregulation of genes affecting angiogenesis, cell motility, and biological adhesion. These functions correlate with tumorigenesis, metastasis, and inhibition of differentiation.

HOTAIR expression induces H3K4 demethylation at the promoters of HOTAIR-repressed genes
As previously mentioned, RNA immunoprecipitation experiments in ESFT cell lines confirmed that HOTAIR associates with LSD1 (SI 4A). LSD1 has been shown in other cancers to repress gene expression through this association with HOTAIR by demethylation of histone 3 at lysine 4 (H3K4), particularly from a dimethylated (Me2) to a monomethylated (Me1) state(9). We hypothesized that differential HOTAIR expression in our model systems would alter LSD1-directed histone methylation at gene promoters, repressing gene expression. As such, we performed Chromatin Immunoprecipitation-Sequencing (ChIP-Seq) for H3K4Me1 and H3K4Me2, using ES2 and SK-ES cells with shRNA-mediated knockdown of HOTAIR or nonsilencing control as described above, and the hTERT-hMSC-GFP and hTERT-hMSC-HOTAIR cells. In the ES2 cell lines, we examined those genes that had increased expression when HOTAIR expression was repressed (Figure 5A and B). We found that the loss of HOTAIR also resulted in decreased H3K4Me1 and increased H3K4Me2 around the transcriptional start sites (TSS) of these genes (Figure 6, SI data). In the hTERT-hMSC cells, HOTAIR expression similarly induces increased H3K4Me1 around the TSS of HOTAIR-repressed genes, with a less-pronounced but still significant decrease in H3K4Me2 (SI 7, SI data). In the SK-ES cells, loss of HOTAIR is again associated with significantly decreased H3K4Me1 broadly adjacent to the TSS, though not as markedly as in the ES2 or hTERT-hMSC cells, and with a particularly increased amount of H3K4Me2 immediately adjacent to the TSS (SI 8, SI data). These data demonstrate that genes repressed in the context of HOTAIR expression consistently have histone methylation changes at their promoters associated with LSD1-mediated epigenetic repression.

DISCUSSION
The Ewing sarcoma family of tumors is characterized by the EWS-FLI1 fusion gene or similar fusion genes. The resultant fusion proteins are the drivers of oncogenesis in these tumors, but these proteins are toxic to most studied normal cells and even other types of cancer cells.

limit the ability to measure HOTAIR's activity in modulating epigenetic regulation, which has some rapidly altered features but also additional characteristics only seen over time. These different methods used to modulate HOTAIR expression likely contribute to the difference in the number of genes identified in the different models. In the stable hTERT-hMSC system, thousands of genes were found to be differentially expressed, while comparatively fewer were identified in the transient ESFT models. Nonetheless, key effects were still identified. In particular, there is a set of genes whose expression is regulated by HOTAIR independently of EWS-FLI1 and conversely another set regulated specifically by EWS-FLI1, suggesting complementary functions of the two genes in ESFT tumorigenesis.

In both ESFT and hTERT-hMSC models of HOTAIR and EWS-FLI1 expression, differential expression of genes integral to tumor formation was identified, including cytoskeletal and adhesion proteins (collagens, keratins, cadherins and protocadherins) and matrix metalloproteases. The specific genes affected in each individual cell line are variable, consistent with data showing significant diversity in the gene expression profiles of ESFT in general (26, 28). Regardless, we identified key pathways affecting tumorigenesis, including cell motility and migration, across the different models. It is particularly noteworthy that these pathways were previously identified to be biomarkers of survival in patients (28).

We also identified other critical pathways that are differentially affected by HOTAIR, including DNA repair pathways and normal differentiation and development. ESFT are characteristically undifferentiated, which contributes to the inability to as yet define the cell of origin for these tumors. Additionally, errors in DNA repair pathways have been hypothesized to contribute to oncogenesis(29, 30). HOTAIR may function in the cell of origin to maintain the cells in an undifferentiated and EWS-FLI1 receptive state, tolerant of DNA damage induced by pathways activated by the fusion protein.

HOTAIR was originally demonstrated in fibroblasts to function as a scaffold for the LSD1/REST complex at its 3' end and PRC2 at its 5' end(9, 31). We confirmed the interaction of HOTAIR and LSD1, the correlated effect on histone methylation and gene expression in ESFT cell lines, and the necessity of this interaction for the anchorage-independent growth phenotype seen in the hTERT-hMSC model. It is important to acknowledge that the effects of HOTAIR varied among the disease models, with far greater effects seen on gene expression in the hMSC models than in ESFT cell lines. This may have been due to greater effects of stable HOTAIR expression in contrast to the transient effects of HOTAIR repression by the GapmeRs, as noted above. Alternatively, HOTAIR may induce gene expression changes during ESFT transformation but may not be required to maintain gene expression at all sites after transformation occurs. A comprehensive evaluation of HOTAIR binding across the genome, and a comparison of that binding to histone methylation, will elucidate the direct effects of HOTAIR in ESFT.

Additional work is warranted in investigating other functions of HOTAIR on gene expression, including its interaction with PRC2 and other regulatory mechanisms, such as effects on DNA methylation independent of LSD1 or PRC2, as recently described(32). We have begun additional work on the interaction of HOTAIR and PRC2, but these studies are much more complicated. A prior study showed a lack of effects on H3K27 methylation despite the presence of HOTAIR (9), which may be due to additional cofactors that allow PRC2 binding but prevent methyltransferase activity(33).
We felt it important to make our present findings available while we perform more expansive studies on HOTAIR in ESFT, particularly because of the demonstration of LSD1's importance in ESFT (34)

conservation of the domains responsible for EZH2 and LSD1 interaction(37). As they state, "it nevertheless suggests that the function of this RNA in mice is not identical to that described for its human cognate (37)." This fact is further supported by the scores of reports on the expression of HOTAIR in multiple cancers, its correlation with disease outcome, and its biological functions in these cancers. While studies are needed to discriminate the direct function of HOTAIR from its indirect effects on downstream targets, our work supports these effects and merits such additional study.

The identification of HOTAIR in ESFT has implications on the understanding of the disease and on its treatment. The driver of HOTAIR expression has yet to be confirmed, but its expression was previously described in normal embryonic stem cells(11) and cancer stem cells(38, 39). The ESFT

Lentivirus was generated using HEK-293T cells. Briefly, cells were seeded onto a 10 cm tissue culture plate at 70% confluency, then transfected with shmiRNA plasmid, pCMV-dR8.2, and pMD2.G at a ratio of 4:3:1 using the calcium phosphate precipitation method. Lentivirus-containing media were collected at 48 and 72 hours, pooled and concentrated using Lenti-X Concentrator (Takara Bio USA). Target cells were then transduced with lentiviral particles using polybrene (8 µg/ml), with exposure for 24 hours then selection with puromycin for 3-7 days.

Expression is compared to basal HOTAIR expression in three ESFT cell lines. Expression is normalized to expression in the hTERT-hMSC-GFP control. D) Soft agar colony formation assay of hTERT-hMSC, 1° hMSC, and IMR90 cells (from Figure 2C) after two weeks of growth.
Representative photomicrographs are shown in the left panels. Colony counts from three independent experiments are shown in the bar graph to the right (error bars SD). More colonies were formed from hTERT-hMSCs, 1° hMSCs, and IMR90 cells expressing HOTAIR than GFP alone. p<0.001 for all three lineages.
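
The summary shown in such bar graphs (mean colony count with SD error bars across three independent experiments) can be sketched as follows. This is a minimal illustration, assuming the sample standard deviation is used, which the caption does not state. The counts below are hypothetical placeholders, not data from this study, and no statistical test is implied.

```python
from statistics import mean, stdev


def summarize(counts):
    """Return (mean, sample SD) for colony counts from independent experiments."""
    return mean(counts), stdev(counts)


# Hypothetical colony counts from three independent experiments per condition.
colony_counts = {
    "GFP": [2, 4, 3],
    "HOTAIR": [40, 46, 52],
}

summary = {condition: summarize(counts) for condition, counts in colony_counts.items()}
for condition, (avg, sd) in summary.items():
    print(f"{condition}: {avg:.1f} ± {sd:.1f} colonies")
```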

Figure 3: Anchorage independent growth mediated by HOTAIR requires both 5' and 3' ends. A) Schematic of wild-type HOTAIR and of the deletion mutant generated with the LSD1-interacting 3' domain deleted. The variant was cloned into a GFP-coexpressing plasmid vector for transfection into hTERT-hMSCs, which were then selected for stable integrants. B) Soft agar colony formation assay of hTERT-hMSC (left) and 1° hMSC cell lines with GFP control, 3'-deleted HOTAIR mutant or wild-type HOTAIR expression, after three weeks of growth. Representative photomicrographs are shown of each variant, with colony counts of three independent experiments shown in the bar graph (error bars SD). Only cells expressing wild-type HOTAIR form significantly more colonies than GFP-expressing control. C) Phase contrast and fluorescent photomicrographs of hTERT-hMSC cells transfected with GFP alone (left), GFP and EWS-FLI1 (middle) or GFP and HOTAIR first followed by EWS-FLI1 (right). Cells expressing only EWS-FLI1 undergo apoptosis rapidly, but cells expressing HOTAIR stably first then EWS-FLI1 are fully viable.

Figure 2A, were implanted into SCID mice at 1x10^6 cells/injection, suspended in 1:1 mix of PBS and Matrigel, n=5 mice/group. Tumor volumes were measured twice weekly. In all three cell lines, tumor xenografts from cells with repressed HOTAIR expression grew at a significantly lower rate than those from nonsilencing controls (p<0.05 for days 3-17 for SK-ES cells, days 3-21 for ES2 cells, and days 14-21 for TC32, error bars SD).

Figure 6: Genes repressed by HOTAIR expression have increased H3K4 monomethylation and decreased H3K4 dimethylation around their transcriptional start sites (TSS) in ES2 cells. Chromatin Immunoprecipitation and Sequencing (ChIP-Seq) identified genomic regions with differential H3K4Me1 and H3K4Me2 binding in ES2-pGIPZ cells as compared to ES2-shHOTAIR, with FDR<0.05. A) Profiles of HOTAIR-induced H3K4Me1 modification around TSS of HOTAIR-repressed genes. Each row corresponds to HOTAIR-repressed genes (195 genes). Color for H3K4Me1 profile represents log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4Me1. Mean H3K4Me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown on the right panel. Negative (purple) represents decreased expression in ES2-pGIPZ compared to ES2-shHOTAIR (i.e. genes that are repressed by HOTAIR). H3K4Me1 was differentially bound adjacent to the TSS of these HOTAIR-repressed genes in the context of HOTAIR expression in ES2-pGIPZ cells as compared to ES2-shHOTAIR2 cells. B) Profiles of HOTAIR-repressed H3K4Me2 binding in ES2-pGIPZ and ES2-shHOTAIR around TSS of HOTAIR-repressed genes. Each row corresponds to HOTAIR-repressed genes (495 genes). Color for H3K4Me2 profile represents log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4Me2. Mean H3K4Me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown on the right panel. Negative (purple) represents decreased expression in ES2-pGIPZ compared to ES2-shHOTAIR (i.e. genes that are repressed by HOTAIR). H3K4Me2 was differentially bound adjacent to the TSS of HOTAIR-repressed genes in the context of loss of HOTAIR in ES2-shHOTAIR cells as compared to the ES2-pGIPZ control cells (i.e., loss of HOTAIR correlates with increased H3K4Me2 binding and increased expression of these correlated genes).

RNA-Seq analysis summaries. A) HOTAIR expression in ESFT cell lines treated with nonsilencing GapmeR control or HOTAIR-targeting GapmeR. All cell lines had repression of HOTAIR to 20-30% of baseline.
C) Numbers of common and disparate genes differentially expressed (adjusted p-value -fold relative change) in different cell line models. Upper panels are genes differentially expressed in hTERT-hMSC-HOTAIR or hTERT-hMSC-HOTAIR-EWS-FLI1 versus GFP control. Lower panels are HOTAIR-GapmeR-treated ESFT cells vs nonsilencing control. Left panels identify genes whose expression directly correlates with HOTAIR expression; right panels identify genes whose expression inversely correlates with HOTAIR expression. Proportionately more genes were found to be upregulated than downregulated with overexpression of HOTAIR in hTERT-hMSC cells, with a majority of genes similarly affected regardless of EWS-FLI1 expression. In contrast, more genes were found to be downregulated than upregulated in correlation with HOTAIR expression in ESFT cells.
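
The counts of common and disparate genes in such a comparison amount to set intersections and differences between lists of differentially expressed genes. The sketch below is illustrative only: the function name and gene identifiers are hypothetical placeholders, and the significance and fold-change cutoffs used to build the real lists are not reproduced here.

```python
def partition_gene_sets(set_a, set_b):
    """Split two differentially expressed gene sets into shared and model-specific genes."""
    a, b = set(set_a), set(set_b)
    return {"common": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical placeholder identifiers, not genes reported in this study.
hotair_model = {"GENE1", "GENE2", "GENE3", "GENE4"}
hotair_ewsfli1_model = {"GENE2", "GENE3", "GENE5"}

parts = partition_gene_sets(hotair_model, hotair_ewsfli1_model)
counts = {label: len(genes) for label, genes in parts.items()}
print(counts)
```

The same partition can be applied separately to upregulated and downregulated lists to fill each panel of a two-set comparison.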
SI 7: Chromatin Immunoprecipitation and Sequencing (ChIP-Seq) identified genomic regions with differential H3K4Me1 and H3K4Me2 binding in hTERT-hMSC-HOTAIR cells ("MSCHOT", left) as compared to hTERT-hMSC-GFP ("MSCGFP", right), with FDR<0.05. A) Profiles of HOTAIR-induced H3K4Me1 modification around TSS of HOTAIR-repressed genes. Each row corresponds to HOTAIR-repressed genes (286 genes). Color for H3K4Me1 profile represents log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4Me1. Mean H3K4Me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown on the right panel. Negative (purple) represents decreased expression in hTERT-hMSC-HOTAIR cells compared to hTERT-hMSC-GFP controls (i.e. genes that are repressed by HOTAIR). H3K4Me1 was differentially bound adjacent to the TSS of these HOTAIR-repressed genes in the context of HOTAIR expression in hTERT-hMSC-HOTAIR cells as compared to control. B) Profiles of HOTAIR-repressed H3K4Me2 binding in hTERT-hMSC-HOTAIR and hTERT-hMSC-GFP around TSS of HOTAIR-repressed genes. Each row corresponds to HOTAIR-repressed genes (287 genes). Color for H3K4Me2 profile represents log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4Me2. Mean H3K4Me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown on the right panel. Negative (purple) represents decreased expression in hTERT-hMSC-HOTAIR compared to hTERT-hMSC-GFP (i.e. genes that are repressed by HOTAIR). H3K4Me2 was differentially bound adjacent to the TSS of HOTAIR-repressed genes in the context of low HOTAIR expression in hTERT-hMSC-GFP cells as compared to the hTERT-hMSC-HOTAIR cells (i.e., low HOTAIR expression correlates with increased H3K4Me2 binding and increased expression of these correlated genes). Differential H3K4Me2 binding is significant for each gene identified (FDR <0.05 by MACS2 analysis).
SI8: Chromatin immunoprecipitation and sequencing (ChIP-Seq) identified genomic regions with differential H3K4me1 and H3K4me2 binding in SKES-pGIPZ cells ("SKEGIPZ", left) as compared to SKES-shHOTAIR cells ("SKEshHOT", right), with FDR < 0.05.
A) Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-repressed gene (44 genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in SKES-pGIPZ cells compared to SKES-shHOTAIR cells (i.e., genes that are repressed by HOTAIR). H3K4me1 was differentially bound adjacent to the TSS of these HOTAIR-repressed genes in the context of HOTAIR expression in SKES-pGIPZ cells as compared to SKES-shHOTAIR cells.
B) Profiles of HOTAIR-repressed H3K4me2 binding in SKES-pGIPZ and SKES-shHOTAIR cells around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-repressed gene (287 genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in SKES-pGIPZ compared to SKES-shHOTAIR (i.e., genes that are repressed by HOTAIR). H3K4me2 was differentially bound adjacent to the TSS of HOTAIR-repressed genes in the context of loss of HOTAIR expression in SKES-shHOTAIR cells as compared to SKES-pGIPZ cells (i.e., loss of HOTAIR expression correlates with increased H3K4me2 binding and increased expression of these correlated genes).

H3K4me1 enrichment in ES2 in relation to HOTAIR expression
81,362 and 55,991 H3K4me1 broad peaks were identified at FDR < 0.05 using MACS2 in ES2GIPZ and ES2shHOTAIR, respectively. Peaks in each cell line were normalized to their corresponding input samples. 1,410 genes were found to be differentially regulated by HOTAIR at FDR < 0.05 (ES2Control vs ES2HOTAIRGapmer).

Figure 1.
Heatmap of H3K4me1 profiles in ES2GIPZ and ES2shHOTAIR plotted ± 5000 bp around HOTAIR-regulated genes in ES2 (FDR < 0.05). Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. As shown in the heatmap, HOTAIR expression induces increased H3K4me1 binding around the TSS of HOTAIR-regulated genes. (hm_ES2_H3K4M1_norm_diff_GIPZ_shHOT.pdf)

Figure 2.
36,202 H3K4me1 binding peaks were found to be increased in ES2GIPZ and 12,333 peaks were found to be increased in ES2shHOTAIR. Regions with increased binding in ES2shHOTAIR were adjacent to 3,648 genes, of which 193 are HOTAIR-regulated (40 up- and 153 down-regulated by HOTAIR). Regions with increased binding in ES2GIPZ were adjacent to 10,556 genes, of which 544 are HOTAIR-regulated (141 up- and 403 down-regulated by HOTAIR). The Venn diagram shows the overlap between the HOTAIR-regulated genes adjacent to significantly increased H3K4me1 binding in ES2GIPZ (dark blue) and those adjacent to significantly increased H3K4me1 binding in ES2shHOTAIR (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_ES2GIPZvsshHOT_H3K4M1.pdf.

Figure 3
Plot of significantly increased H3K4me1 binding in ES2GIPZ compared to ES2shHOTAIR around the TSS of HOTAIR-regulated genes (417 genes in the Venn diagram in Figure 2). Color for H3K4me1 ChIP-seq represents the log2 ratio of ChIP to input samples. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 signals in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_ES2_H3K4M1_norm_GIPZenrich_shHOT.pdf"

Figure 4.
Plot of significantly increased H3K4me1 binding in ES2shHOTAIR compared to ES2GIPZ around the TSS of HOTAIR-regulated genes (66 genes in the Venn diagram in Figure 2). Color represents the log2 ratio of ChIP vs. input samples. Mean signals in each cell line are plotted on top. The expression of adjacent genes is plotted in the right panel (log2 ratio of ES2HOTAIRGapmer/ES2Control). Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). hm_ES2_H3K4M1_norm_GIPZ_shHOTenrich.pdf

Figure 5.
Plot of the read count difference between H3K4me1 and input in hMSC-HE samples (left panel) and in ES2GIPZ samples (right panel). As shown by the heatmap, there are more H3K4me1-enriched regions in ES2GIPZ than in hMSC-HE. Color represents the density of regions from high (red) to low (blue).

Figure 6.
Heatmap of H3K4me2 profiles in ES2GIPZ and ES2shHOTAIR plotted ± 5000 bp around HOTAIR-regulated genes in ES2 (FDR < 0.05). Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Overall, knockdown of HOTAIR induces an increase of H3K4me2 binding around the TSS of HOTAIR-regulated genes. (hm_ES2_H3K4M2_norm_diff_GIPZ_shHOT.pdf)

Differential H3K4me2 binding in ES2GIPZ versus ES2shHOTAIR
There are 70,018 regions associated with a significant increase of H3K4me2 binding in ES2shHOTAIR compared to ES2GIPZ. These regions are adjacent to 17,171 unique genes, 750 of which are HOTAIR-regulated (544 down-regulated and 206 up-regulated by HOTAIR). In contrast, only 3 regions are associated with a significant increase of H3K4me2 binding in ES2GIPZ compared to ES2shHOTAIR. These 3 regions are adjacent to 3 unique genes, none of which is HOTAIR-regulated.
ES2_H3K4M1_norm_GIPZenrich_shHOT.xlsx contains the list of regions with gain of H3K4me1 in ES2GIPZ compared to ES2shHOTAIR, with their adjacent genes.
ES2_H3K4M2_norm_GIPZ_shHOTenrich.txt contains the list of regions with gain of H3K4me2 in ES2shHOTAIR compared to ES2GIPZ, with their adjacent genes.
ES2_H3K4M2_norm_GIPZenrich_shHOT.txt contains the list of regions with gain of H3K4me2 in ES2GIPZ compared to ES2shHOTAIR, with their adjacent genes.
Assessing batch effects with additional sequencing reads.
As of 8/8/2017 there are a total of 53 ChIP-seq runs: 35 were run previously in high-output mode (_1 files), and the new set of 18 was run in rapid mode on two lanes (_2 and _3 files).
Read counts (corrected for total counts) in 7,000 random windows of 7,000 bp were evaluated and compared across all 53 ChIP-seq runs. Duplicated reads were counted only once. Counting and random selection were done using SeqMonk (https://www.bioinformatics.babraham.ac.uk/projects/seqmonk/).
The genes in HEH3K4me2 and HOTH3K4me1 overlap more than expected by chance (p-value < 10^-120). The genes in HEH3K4me1 and HOTH3K4me1 overlap more than expected by chance (p-value < 10^-10). P-values of overlap were calculated using Fisher's exact test as implemented in the R package GeneOverlap.
Because HEH3K4me1 contains far fewer genes than the other sets, the sets can appear not to overlap much. However, HEH3K4me1 contains only 42 genes, and 40 of them overlap with HOTH3K4me1 (95%), whereas HEH3K4me2 contains 3,731 genes, 2,396 of which overlap with HOTH3K4me1 (64%).
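The overlap test can be illustrated with a minimal Python sketch. It assumes a one-sided Fisher's exact test for over-representation, which is equivalent to the upper tail of a hypergeometric distribution; the analysis itself used the R package GeneOverlap, so this is not the authors' code. The function name `overlap_pvalue`, the background size of 20,000 genes, and the HOTH3K4me1 set size of 5,000 are hypothetical placeholders, because neither value is stated in the text.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def overlap_pvalue(size_a, size_b, overlap, universe):
    """Probability of observing at least `overlap` shared genes when
    `size_b` genes are drawn at random from `universe` genes, `size_a`
    of which belong to the first set (hypergeometric upper tail)."""
    log_denominator = log_comb(universe, size_b)
    log_terms = []
    for k in range(overlap, min(size_a, size_b) + 1):
        if size_b - k > universe - size_a:
            continue  # this table is impossible, skip it
        log_terms.append(
            log_comb(size_a, k)
            + log_comb(universe - size_a, size_b - k)
            - log_denominator
        )
    if not log_terms:
        return 0.0
    # Sum in log space; extremely small p-values may underflow to 0.0.
    largest = max(log_terms)
    total = exp(largest) * sum(exp(t - largest) for t in log_terms)
    return min(1.0, total)


# Overlap fractions quoted in the text.
print(round(100 * 40 / 42))      # 95
print(round(100 * 2396 / 3731))  # 64

# Hypothetical background (20,000 genes) and HOTH3K4me1 set size (5,000).
print(overlap_pvalue(size_a=42, size_b=5000, overlap=40, universe=20000))
```

The p-value depends strongly on the chosen background, so the printed value is only meaningful once the real gene universe and set sizes are substituted.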

From here on, analyses are done on merged reads (from two different ChIP-seq runs).
H3K4me1 enrichment in ES2 in relation to HOTAIR expression
95,615 (adjacent to 19,061 unique genes) and 90,201 (adjacent to 17,498 unique genes) H3K4me1 broad peaks were identified at FDR ≤ 0.05 using MACS2 in ES2GIPZ and ES2shHOTAIR, respectively. Peaks in each cell line were normalized to their corresponding input. 1,181 genes were found to be differentially regulated by HOTAIR (FDR ≤ 0.05 and fold-change ≥ 2). Genes regulated by HOTAIR have positive fold-change. 665 and 617 HOTAIR-regulated genes are found adjacent to H3K4me1 peaks in ES2GIPZ and ES2shHOTAIR, respectively.
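As an illustration of the gene-selection thresholds (FDR ≤ 0.05 and fold-change ≥ 2), the following Python sketch filters a small table of expression results. It assumes the fold-change cutoff is applied symmetrically, so that a two-fold change in either direction qualifies; the text does not say whether this is the case. The gene names, values, and the function name `hotair_regulated` are hypothetical.

```python
import math


def hotair_regulated(genes, fdr_cutoff=0.05, min_fold=2.0):
    """Return names of genes passing both thresholds.

    `genes` is an iterable of (name, fold_change, fdr) tuples, where
    fold_change is a positive expression ratio between two conditions.
    Assumption: a change of at least `min_fold` in either direction
    counts, i.e. |log2(fold_change)| >= log2(min_fold).
    """
    threshold = math.log2(min_fold)
    selected = []
    for name, fold_change, fdr in genes:
        if fdr <= fdr_cutoff and abs(math.log2(fold_change)) >= threshold:
            selected.append(name)
    return selected


# Hypothetical example rows: (gene, fold change, FDR).
example = [
    ("GENE_A", 4.0, 0.01),   # passes: 4-fold up
    ("GENE_B", 0.25, 0.04),  # passes: 4-fold down
    ("GENE_C", 1.5, 0.001),  # fails: fold change too small
    ("GENE_D", 8.0, 0.20),   # fails: FDR too high
]
print(hotair_regulated(example))  # ['GENE_A', 'GENE_B']
```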

H3K4me1-induced and -repressed binding by HOTAIR expression
We found 66,387 H3K4me1 binding regions that are induced by HOTAIR and 48,859 H3K4me1 binding regions that are repressed by HOTAIR. HOTAIR-induced H3K4me1 binding regions are adjacent to 544 HOTAIR-regulated genes, while HOTAIR-repressed H3K4me1 binding regions are adjacent to 318 HOTAIR-regulated genes. For 295 HOTAIR-regulated genes, the adjacent H3K4me1 binding is exclusively induced by HOTAIR. On the other hand, 69 genes have H3K4me1 binding that is exclusively repressed by HOTAIR.
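The exclusive counts follow from simple set arithmetic on the two gene lists: genes adjacent to both induced and repressed binding are removed from each side. A minimal Python sketch of this bookkeeping is given below; the helper names and the toy gene identifiers are hypothetical, and only the four totals (544, 295, 318, 69) come from the text.

```python
def venn_counts(induced_genes, repressed_genes):
    """Sizes of the three Venn regions for two gene collections."""
    induced, repressed = set(induced_genes), set(repressed_genes)
    return {
        "induced_only": len(induced - repressed),
        "both": len(induced & repressed),
        "repressed_only": len(repressed - induced),
    }


def shared_from_totals(total_a, only_a, total_b, only_b):
    """Infer the size of the intersection from totals and exclusive
    counts, raising an error if the two sides disagree."""
    shared_a = total_a - only_a
    shared_b = total_b - only_b
    if shared_a != shared_b:
        raise ValueError(f"inconsistent counts: {shared_a} vs {shared_b}")
    return shared_a


# Toy example with hypothetical gene names.
print(venn_counts({"G1", "G2", "G3"}, {"G3", "G4"}))
# {'induced_only': 2, 'both': 1, 'repressed_only': 1}

# Reported H3K4me1 counts in ES2: 544 induced (295 exclusive),
# 318 repressed (69 exclusive).
print(shared_from_totals(544, 295, 318, 69))  # 249
```

With the reported numbers, both sides give 249 genes adjacent to both HOTAIR-induced and HOTAIR-repressed H3K4me1 binding.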

Methods:
66,387 regions with a significant increase of H3K4me1 binding (HOTAIR-induced binding) in ES2GIPZ compared to ES2shHOTAIR were identified with FDR ≤ 0.05 using MACS2. 48,859 regions with a significant increase of H3K4me1 binding (HOTAIR-repressed) were identified in ES2shHOTAIR compared to ES2GIPZ.

Figure 13. Overlap between HOTAIR-induced and -repressed H3K4me1 binding. The Venn diagram shows the overlap between the HOTAIR-regulated genes adjacent to increased H3K4me1 binding in ES2GIPZ (dark blue) and those adjacent to increased H3K4me1 binding in ES2shHOTAIR (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_ES2GIPZvsES2shHOT_H3K4M1.pdf.

Figure 14. Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-regulated genes (295 genes in the Venn diagram in Figure 13). Each row corresponds to a HOTAIR-regulated gene (195 down- and 100 up-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustES2GIPZenrich_H3K4M1_vsES2shHOT.pdf"

Figure 15. Profiles of HOTAIR-repressed H3K4me1 binding in ES2GIPZ and ES2shHOTAIR around the TSS of HOTAIR-regulated genes (69 genes in the Venn diagram in Figure 13). Each row corresponds to a HOTAIR-regulated gene (54 down- and 15 up-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustES2shHOTenrich_H3K4M1_vsES2GIPZ.pdf"

H3K4me2 modulation in ES2 cell line
28,843 (adjacent to 13,337 unique genes) and 109,565 (adjacent to 19,193 unique genes) H3K4me2 peaks were identified at an FDR cut-off of 0.05 using MACS2 in ES2GIPZ and ES2shHOTAIR, respectively. Peaks in each cell line were normalized to their corresponding input. 416 and 657 HOTAIR-regulated genes are found adjacent to H3K4me2 peaks in ES2GIPZ and ES2shHOTAIR, respectively.

Figure 16. Overlap between genes adjacent to H3K4me1 peaks in (A) ES2GIPZ and (B) ES2shHOT with the HOTAIR-regulated genes.

In contrast to the H3K4me1 mark, HOTAIR expression globally represses H3K4me2 modification around the TSS of HOTAIR-regulated genes in the ES2 cell line.

Figure 17. HOTAIR expression induces decreased H3K4me2 binding around the TSS of HOTAIR-regulated genes. Visualization of H3K4me2 profiles in ES2GIPZ (left) and ES2shHOTAIR (right) within ± 5000 bp of HOTAIR-regulated genes in ES2. Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Each row represents a HOTAIR-regulated gene. hm_ES2shHOT_ES2GIPZ_H3K4M2_norm_diff.pdf
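The quantity shown in these heatmaps, a log2 ratio of normalized ChIP read counts over input with a mean profile plotted on top, can be sketched in Python as follows. The normalization details are assumptions made for illustration only: library-size scaling to counts per million, a pseudocount of 1 to avoid division by zero, and fixed bins across the ± 5000 bp window. The text does not specify which normalization was actually used, and the function names are hypothetical.

```python
import math


def log2_ratio_profile(chip_counts, input_counts, chip_total, input_total,
                       pseudocount=1.0):
    """Per-bin log2(ChIP / input) for one gene.

    Counts are scaled to counts per million using each library's total
    read count (an assumed normalization), and a pseudocount is added
    to both numerator and denominator.
    """
    chip_scale = 1e6 / chip_total
    input_scale = 1e6 / input_total
    return [
        math.log2((c * chip_scale + pseudocount)
                  / (i * input_scale + pseudocount))
        for c, i in zip(chip_counts, input_counts)
    ]


def mean_profile(rows):
    """Column-wise mean across genes (the trace plotted above a heatmap)."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


# Hypothetical read counts in four bins spanning a TSS for two genes.
gene1 = log2_ratio_profile([3, 7, 15, 3], [1, 1, 1, 1], 1e6, 1e6)
gene2 = log2_ratio_profile([1, 3, 7, 1], [1, 1, 1, 1], 1e6, 1e6)
print(gene1)                         # [1.0, 2.0, 3.0, 1.0]
print(mean_profile([gene1, gene2]))  # [0.5, 1.5, 2.5, 0.5]
```

Positive values correspond to enrichment of the mark over input (red in the figures), and values near zero indicate no enrichment.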

H3K4me2-induced and -repressed binding by HOTAIR expression
We found 185 H3K4me2 binding regions that are induced by HOTAIR and 92,516 H3K4me2 binding regions that are repressed by HOTAIR. HOTAIR-induced H3K4me2 binding regions are adjacent to only 9 HOTAIR-regulated genes, while HOTAIR-repressed H3K4me2 binding regions are adjacent to 620 HOTAIR-regulated genes. Only 1 HOTAIR-regulated gene has adjacent H3K4me2 binding that is exclusively induced by HOTAIR. On the other hand, 612 genes have H3K4me2 binding that is exclusively repressed by HOTAIR.

Methods:
185 regions with a significant increase of H3K4me2 binding (HOTAIR-induced binding) in ES2GIPZ compared to ES2shHOTAIR were identified with FDR ≤ 0.05 using MACS2. 92,516 regions with a significant increase of H3K4me2 binding (HOTAIR-repressed) were identified in ES2shHOTAIR compared to ES2GIPZ.

Figure 18. Overlap between HOTAIR-induced and -repressed H3K4me2 binding. The Venn diagram shows the overlap between the HOTAIR-regulated genes adjacent to increased H3K4me2 binding in ES2GIPZ (dark blue) and those adjacent to increased H3K4me2 binding in ES2shHOTAIR (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_ES2GIPZvsES2shHOT_H3K4M2.pdf.
We found decreased H3K4me2 binding induced by HOTAIR in the ES2 cell line around the TSS of HOTAIR-regulated genes. Interestingly, although H3K4me2 binding around some of the non-HOTAIR-regulated genes is also increased without HOTAIR expression, this increased binding was not at the TSSs. "hm_ES2shHOT_ES2GIPZ_H3K4M2_norm_nodiff.pdf"

Figure 19. Profiles of HOTAIR-repressed H3K4me2 binding in ES2GIPZ and ES2shHOTAIR around the TSS of HOTAIR-regulated genes (612 genes in the Venn diagram in Figure 18). Each row corresponds to a HOTAIR-regulated gene (439 down- and 172 up-regulated genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustES2shHOTenrich_H3K4M2_vsES2GIPZ.pdf"

Figure 19b. Profiles of HOTAIR-repressed H3K4me2 binding in ES2GIPZ and ES2shHOTAIR around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-repressed gene. Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in ES2Control compared to ES2shHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustES2shHOTenrich_H3K4M2_vsES2GIPZ_rep_genes.pdf"

hMSC cell line
H3K4me1 enrichment in hMSC in relation to HOTAIR expression
48,017 (adjacent to 15,314 unique genes) and 1,885 (adjacent to 696 unique genes) H3K4me1 broad peaks were identified at FDR ≤ 0.05 using MACS2 in hMSCHOTAIR and hMSCGFP, respectively. Peaks in each cell line were normalized to their corresponding input.
3,657 genes were found to be differentially regulated by HOTAIR (FDR ≤ 0.05 and fold-change ≥ 2) (fold-change = hMSC-hTERTHOTAIR/hMSC-hTERTGFP, i.e., genes regulated by HOTAIR have positive fold-change). 15,314 and 696 HOTAIR-regulated genes are found adjacent to H3K4me1 peaks in hMSCHOTAIR and hMSCGFP, respectively.
Visualization of H3K4me1 profiles in hMSCHOTAIR (left) and hMSCGFP (right) within ± 5000 bp of HOTAIR-regulated genes. Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Each row represents a HOTAIR-regulated gene. hm_MSCGFP_MSCHOT_H3K4M1_norm_diff.pdf.

H3K4me1 binding around the TSS of non-HOTAIR-regulated genes is shown in hm_MSCGFP_MSCHOT_H3K4M1_norm_nodiff.pdf

H3K4me1-induced and -repressed binding by HOTAIR expression
We found 12,089 H3K4me1 binding regions that are induced by HOTAIR and 1,394 H3K4me1 binding regions that are repressed by HOTAIR. HOTAIR-induced H3K4me1 binding regions are adjacent to 667 HOTAIR-regulated genes, while HOTAIR-repressed H3K4me1 binding regions are adjacent to 7 HOTAIR-regulated genes. For 662 HOTAIR-regulated genes, the adjacent H3K4me1 binding is exclusively induced by HOTAIR. On the other hand, 7 genes have H3K4me1 binding that is exclusively repressed by HOTAIR.

Methods:
12,089 regions with a significant increase of H3K4me1 binding (HOTAIR-induced binding) in hMSCHOTAIR compared to hMSCGFP were identified with FDR ≤ 0.05 using MACS2. 1,394 regions with a significant increase of H3K4me1 binding (HOTAIR-repressed) were identified in hMSCGFP compared to hMSCHOTAIR.

Figure 22. Overlap between HOTAIR-induced and -repressed H3K4me1 binding. The Venn diagram shows the overlap between the HOTAIR-regulated genes adjacent to increased H3K4me1 binding in hMSCHOTAIR (dark blue) and those adjacent to increased H3K4me1 binding in hMSCGFP (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_MSCHOTvsMSCGFP_H3K4M1.pdf

Figure 23. Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-regulated genes (662 genes in the Venn diagram in Figure 22). Each row corresponds to a HOTAIR-regulated gene (286 down- and 376 up-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCHOTenrich_H3K4M1_vsMSCGFP.pdf"

Figure 23b. Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-regulated gene (286 down-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCHOTenrich_H3K4M1_vsMSCGFP_rep_genes.pdf"

Figure 24. Profiles of HOTAIR-repressed H3K4me1 binding around the TSS of HOTAIR-regulated genes (7 genes in the Venn diagram in Figure 22). Each row corresponds to a HOTAIR-regulated gene (2 down- and 5 up-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCGFPenrich_H3K4M1_vsMSCHOT.pdf"

H3K4me2 modulation in hMSC cell lines
37,543 (adjacent to 15,427 unique genes) and 35,678 (adjacent to 14,251 unique genes) H3K4me2 peaks were identified at an FDR cut-off of 0.05 using MACS2 in hMSCHOTAIR and hMSCGFP, respectively. Peaks in each cell line were normalized to their corresponding input. 1,823 and 1,709 HOTAIR-regulated genes are found adjacent to H3K4me2 peaks in hMSCHOTAIR and hMSCGFP, respectively. While H3K4me2 binding around HOTAIR-regulated genes in the ES2 cell line is globally repressed by HOTAIR, in the hMSC cell line the repression by HOTAIR is minimal (Figure 26).

Figure 26. HOTAIR expression induces a slight decrease of H3K4me2 binding around the TSS of HOTAIR-regulated genes. Visualization of H3K4me2 profiles in hMSCHOTAIR (left) and hMSCGFP (right) within ± 5000 bp of HOTAIR-regulated genes. Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Each row represents a HOTAIR-regulated gene. hm_MSCGFP_MSCHOT_H3K4M2_norm_diff.pdf.
H3K4me2 binding around the TSS of non-differentially expressed genes is shown in hm_MSCGFP_MSCHOT_H3K4M2_norm_nodiff.pdf

We found 1,876 H3K4me2 binding regions that are significantly induced by HOTAIR and 8,417 H3K4me2 binding regions that are repressed by HOTAIR. HOTAIR-induced H3K4me2 binding regions are adjacent to 186 HOTAIR-regulated genes, while HOTAIR-repressed H3K4me2 binding regions are adjacent to 748 HOTAIR-regulated genes. Only 106 HOTAIR-regulated genes have adjacent H3K4me2 binding that is exclusively induced by HOTAIR. On the other hand, 668 genes have H3K4me2 binding that is exclusively repressed by HOTAIR.

Methods:
1,876 regions with a significant increase of H3K4me2 binding (HOTAIR-induced binding) in hMSCHOTAIR compared to hMSCGFP were identified with FDR ≤ 0.05 using MACS2. 8,417 regions with a significant increase of H3K4me2 binding (HOTAIR-repressed) were identified in hMSCGFP compared to hMSCHOTAIR.

Figure 27. Overlap between HOTAIR-induced and -repressed H3K4me2 binding. The Venn diagram shows the overlap between the HOTAIR-regulated genes adjacent to increased H3K4me2 binding in hMSCHOTAIR (dark blue) and those adjacent to increased H3K4me2 binding in hMSCGFP (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_MSCHOTvsMSCGFP_H3K4M2.pdf.

Figure 28. Profiles of HOTAIR-induced H3K4me2 binding in hMSC cell lines around the TSS of HOTAIR-regulated genes (106 genes in the Venn diagram in Figure 27). Each row corresponds to a HOTAIR-regulated gene (45 down- and 61 up-regulated genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCHOTenrich_H3K4M2_vsMSCGFP.pdf"

Figure 29. Profiles of HOTAIR-repressed H3K4me2 binding in hMSC cell lines around the TSS of HOTAIR-regulated genes (668 genes in the Venn diagram in Figure 27). Each row corresponds to a HOTAIR-regulated gene (287 down- and 381 up-regulated genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCGFPenrich_H3K4M2_vsMSCHOT.pdf"

Figure 29b. Profiles of HOTAIR-repressed H3K4me2 binding in hMSC cell lines around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-repressed gene (287 down-regulated genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in hMSC-hTERTHOTAIR compared to hMSC-hTERTGFP (i.e., genes that are repressed by HOTAIR). "hm_clustMSCGFPenrich_H3K4M2_vsMSCHOT.pdf"

In the SKES cell line, HOTAIR expression seems to induce increased H3K4me1 in broad regions surrounding the TSS of HOTAIR-regulated genes. Without HOTAIR expression, there is loss of H3K4me1 binding at their TSSs.

Figure 31. HOTAIR expression induces H3K4me1 binding around the TSS of HOTAIR-regulated genes. Visualization of H3K4me1 profiles in SKESGIPZ (left) and SKESshHOTAIR (right) within ± 5000 bp of HOTAIR-regulated genes in ES2. Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Each row represents a HOTAIR-regulated gene. hm_SKEshHOT_SKEGIPZ_H3K4M1_norm_diff.pdf

Figure 33. Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-regulated genes (21 genes in venn diagram on Figure 13). Each row corresponds to a HOTAIR-regulated gene (9 down- and 12 up-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in SKESGIPZ compared to SKESshHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustSKEGIPZenrich_H3K4M1_vsSKEshHOT.pdf"

2,646 (adjacent to 2,090 genes) and 19,586 (adjacent to 10,265 genes) H3K4me2 peaks were identified at an FDR cut-off of 0.05 using MACS2 in SKESGIPZ and SKESshHOTAIR, respectively. Peaks in each cell line were normalized to their corresponding input. 416 and 657 HOTAIR-regulated genes are found adjacent to H3K4me2 peaks in SKESGIPZ and SKESshHOTAIR, respectively.

Figure 34. Overlap between genes adjacent to H3K4me2 peaks in (A) SKESGIPZ and (B) SKESshHOT with the HOTAIR-regulated genes.

Globally, HOTAIR expression induces H3K4me2 modification at the TSS of HOTAIR-regulated genes and the broad region surrounding the TSS in the SKES cell line.

Figure 35. HOTAIR expression induces decreased H3K4me2 binding around the TSS of HOTAIR-regulated genes. Visualization of H3K4me2 profiles in SKESGIPZ (left) and SKESshHOTAIR (right) within ± 5000 bp of HOTAIR-regulated genes. Color represents the log2 ratio of normalized read counts over input. Mean signals in each cell line are plotted on top. Each row represents a HOTAIR-regulated gene. hm_SKEshHOT_SKEGIPZ_H3K4M2_norm_diff.pdf

Profiles of H3K4me2 binding around non-differentially expressed genes are shown in "hm_SKEshHOT_SKEGIPZ_H3K4M2_norm_nodiff.pdf"

H3K4me1-induced and -repressed binding by HOTAIR expression
We found 1,757 H3K4me1 binding regions that are induced by HOTAIR and 51 H3K4me1 binding regions that are repressed by HOTAIR. HOTAIR-induced H3K4me1 binding regions are adjacent to 94 HOTAIR-regulated genes, while only 5 HOTAIR-regulated genes are near HOTAIR-repressed H3K4me1 binding regions.

Methods:
1,757 regions with significant increase of H3K4me1 binding (HOTAIR-induced binding) in SKESGIPZ compared to SKESshHOTAIR were identified with FDR ≤ 0.05 using MACS2 diff. 51 regions with significant increase of H3K4me1 binding (HOTAIR-repressed) were identified in SKESshHOTAIR compared to SKESGIPZ.
Figure 32. The Venn diagram shows the number of HOTAIR-regulated genes adjacent to increased H3K4me1 binding in SKESGIPZ (dark blue) and increased H3K4me1 binding in SKESshHOTAIR (magenta). Note that multiple differentially bound regions may be adjacent to the same gene. venn_SKEGIPZvsSKEshHOT_H3K4M1_2.pdf.

Figure 33. Profiles of HOTAIR-induced H3K4me1 modification around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-regulated gene (44 down-regulated genes). Color for the H3K4me1 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me1. Mean H3K4me1 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in SKESGIPZ compared to SKESshHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_clustSKEGIPZenrich_H3K4M1_vsSKEshHOT_rep_genes_2.pdf"

H3K4me2 modulation in SKES cell line
2,646 (adjacent to 2,090 genes) and 19,586 (adjacent to 10,265 genes) H3K4me2 peaks were identified at an FDR cut-off of 0.05 using MACS2 in SKESGIPZ and SKESshHOTAIR, respectively. Peaks in each cell line were normalized to their corresponding input. 205 and 657 HOTAIR-regulated genes are found adjacent to H3K4me2 peaks in SKESGIPZ and SKESshHOTAIR, respectively.

Figure 34. Overlap between genes adjacent to H3K4me2 peaks in (A) SKESGIPZ and (B) SKESshHOT with the HOTAIR-regulated genes.

Figure 38. Profiles of HOTAIR-repressed H3K4me2 binding around the TSS of HOTAIR-repressed genes. Each row corresponds to a HOTAIR-regulated gene (347 down-regulated genes). Color for the H3K4me2 profile represents the log2 ratio of ChIP to input control. Positive (red) represents enrichment of H3K4me2. Mean H3K4me2 profiles in each cell line are plotted on top. The corresponding mRNA expression (log2 ratio) is shown in the right panel. Negative (purple) represents decreased expression in SKESGIPZ compared to SKESshHOTAIR (i.e., genes that are repressed by HOTAIR). "hm_SKEshHOTenrich_H3K4M2_vsSKEshHOT_rep_genes_2.pdf"

File lists:
_peaks.xls contains ChIP-seq peak location output files from MACS2.
SI10 (Excel workbook): Gene expression datasets, including differentially expressed gene lists and quality-control files for the RNA-Seq analyses described in the manuscript, and Figures 5A and B with gene names.