ABSTRACT
The pediatric extra-cranial tumor neuroblastoma (NB) is characterised by a low mutation burden while copy number alterations are present in most high-risk cases. We identified SOX11 as a strong lineage dependency transcription factor in adrenergic NB based on recurrent chromosome 2p focal gains and amplifications, its specific expression in the normal sympatho-adrenal lineage and adrenergic NBs and its regulation by multiple adrenergic specific cis-interacting (super-)enhancers. Adrenergic NBs are strongly dependent on high SOX11 expression levels for growth and proliferation. Through genome-wide DNA-binding and transcriptome analysis, we identified and validated functional SOX11 target genes, several of which are implicated in chromatin remodeling and epigenetic modification. SOX11 controls chromatin accessibility, predominantly affecting distal adrenergic lineage-specific enhancers marked by binding sites of the adrenergic core regulatory circuitry. During normal sympathoblast differentiation, we find expression of SOX11 prior to members of the adrenergic core regulatory circuitry. Given the broad control by SOX11 of multiple epigenetic regulatory complexes and its presumed pioneer factor function, we propose that adrenergic NB cells have co-opted the normal role of SOX11 as a crucial regulator of chromatin accessibility and cell identity.
INTRODUCTION
Neuroblastoma (NB) is the most common extra-cranial solid childhood cancer, originating from the developing sympatho-adrenergic nervous system1. The genomic landscape of NB is characterized by a low mutational burden and highly recurrent structural rearrangements. NB is considered a developmental disorder that is controlled by the complex interplay of multiple transcription factors (TFs) and reshaping of epigenetic landscapes1. Tumor cells can co-opt normal developmental pathways for functions that are linked to tumor progression and may become addicted to survival mechanisms controlled by developmental master regulators. Recent studies in NB revealed two distinct super-enhancer-associated differentiation states, i.e. adrenergic (ADRN) and early neural crest/mesenchymal (MES), each programmed by a specific core regulatory circuitry (CRC) defined by multiple lineage-specific transcription factors2,3. Furthermore, lineage identity switching and plasticity is an emerging key factor in therapy resistance of several cancers. Therefore, further insights into the nature and contribution of lineage dependency transcription factors may be important to understand frequently occurring relapses in high-risk NBs4.
Given that (1) in other tumors oncogenic lineage dependency factors were overexpressed through amplification4 and that (2) in addition to frequent MYCN amplification also other oncogenic (co-)drivers were found to be amplified in NB1, we aimed to identify novel putative lineage dependency TFs implicated in NB by delineating rare focal copy number gain and amplification events. We identified the SRY-related HMG-box transcription factor 11 (SOX11) as the sole protein coding gene residing in the shortest region of overlap at chromosome 2p distal to MYCN and pinpointed SOX11 as an important transcription factor implicated in adrenergic NB development.
SOX11 belongs to the SOX family of proteins, which are critical regulators of many developmental processes, including neurogenesis5. These TFs bind and bend the minor groove of the DNA using their highly similar high mobility group (HMG) domains. SOX11 belongs to the SoxC subgroup, which also includes SOX4 and SOX12, and the expression of these proteins is of key importance for the survival and development of the neural crest, multipotent neural and mesenchymal progenitors, and the sympathetic nervous system6,7. In addition to its presumed canonical transcription factor activity, SOX11 was recently also shown to have pioneering activity and thus can be assumed to direct chromatin accessibility at loci controlling cell fates8. Here, we report that SOX11 acts as a lineage-dependency factor in adrenergic NB cells. We performed in depth analysis of functional SOX11 target genes which include multiple genes involved in chromatin remodeling and modification and DNA methylation and show that SOX11 impacts chromatin accessibility of lineage-specific adrenergic enhancers. Furthermore, a comparative analysis of SOX11 and MYCN binding sites suggests functional interaction between both transcription factors.
RESULTS
Rare focal amplifications and lineage-specific expression of SOX11 in neuroblastoma
To identify lineage dependency TFs implicated in NB, we reanalysed DNA copy number profiling data of 556 high-risk primary NB tumors9 together with those from 263 additional published and 223 unpublished NB tumors10,11 and 39 NB cell lines. We specifically searched for focal gains and/or amplifications of chromosomal segments encompassing TFs with a putative or known role in normal (neuronal) development. Within the 270 kb shortest region of overlap of the commonly gained large chromosome 2p segment (31% of cases), we identified focal amplifications (three primary NB cases with log2 ratio > 2) and high-level gains (two primary NB cases and one cell line with log2 ratio > 0.5) encompassing the TF SOX11 as the only protein coding gene (Fig. 1a). Furthermore, all tumors showing SOX11 focal amplification or high-level gain were also MYCN amplified (Fig. S1a). FISH analysis could be performed in two tumors and showed that SOX11 and MYCN reside in two independently amplified segments (Fig. 1b). SOX11 mRNA expression levels were found to be elevated in primary NB tumors with higher SOX11 copy numbers (p-value = 1.82e-09, t-test) (n=276) (Fig. 1c). Next, we observed that high SOX11 mRNA expression levels (fourth quartile) are significantly associated with worse overall and progression free survival in two independent NB cohorts of 276 and 498 patients (Fig. 1d, Fig. S1b). In addition, SOX11 immunohistochemical analysis using two independent SOX11 antibodies (Fig. S1c) showed that high SOX11 protein expression levels are associated with worse overall survival in a series of 56 NB tumors (Fig. 1e).
SOX11 is significantly higher expressed in high risk MYCN amplified tumors as compared to high risk MYCN non-amplified tumors and low risk tumors (Fig. S1d). Analysis of SOX11 tissue specific expression patterns in R2 platform (http://r2amc.nl) and the Cancer Cell Line Encyclopedia (CCLE) showed the highest SOX11 expression levels, highest copy number ratio and lowest methylation levels in NB tumors and cell lines as compared to other entities (Fig. 1f, Fig. S1e-f). Lineage restricted expression was evident from high expression levels in human fetal neuroblasts and in sympathetic neuronal lineages during early development as compared to normal cortex from the adrenal gland (Fig. S1g) and temporal increase of SOX11 expression was also noted in mouse NB tumor models in early hyperplastic lesions and full-blown tumors as compared to normal adrenal gland (Fig. S1h-i). Higher expression levels of SOX11 both at mRNA and protein level were observed in adrenergic NB cell lines compared to mesenchymal NB cell lines and tumors (Fig. 1g, Fig. S1j-k). Taken together, we identified recurrent focal copy number alterations of the SOX11 locus in MYCN amplified tumors and adrenergic lineage-specific SOX11 expression levels that are associated with poor prognosis in NB patients.
SOX11 is flanked by multiple cis-interacting distal adrenergic super-enhancers
Transcription factors implicated in defining cell lineage and identity are typically under the control of super-enhancers12. SOX11 was previously identified as a super-enhancer-associated TF in adrenergic NB cell lines2. To further map putative (super-)enhancers regulating SOX11 expression, we investigated the SOX11 locus and its neighbouring region for H3K27ac marks in 23 NB cell lines, a non-malignant neural crest cell line (hNCC) and the breast cancer cell line MCF-7 as non-embryonal control3,13. Distal to the SOX11 locus, we noted a large (1.1 Mb) region without protein coding genes (gene desert) which was marked by multiple H3K27ac peaks, indicative of the presence of multiple active enhancers, and in 7/23 NB cell lines we could identify a super-enhancer region (Fig. 2a, Fig. S2a). In accordance with the absence of SOX11 expression in the breast cancer cell line MCF-7, the non-malignant neural crest cell line and the mesenchymal/neural crest like NB cell lines (GI-ME-N, SH-EP), H3K27ac peaks were absent in the gene desert, although in the latter two NB cell lines SOX11 promoter H3K27ac peaks were seen. In further support of this cell identity specific H3K27ac pattern, we looked into H3K27ac data before and after NOTCH3-driven adrenergic-to-mesenchymal transition in NB cells14. As shown in Fig. S2b, a reduction in H3K27ac peak size was noted for the SOX11 associated super-enhancer after NOTCH3 induction and thus mesenchymal transition. Interestingly, the Sciatic Injury Induced LincRNA Upregulator Of SOX11 (SILC1) lncRNA is transcribed from the super-enhancer region and has been shown to play a critical role in the induction of SOX11 during neurite outgrowth and neuron regeneration15. This lncRNA is strongly expressed in adrenergic NB cells and highly correlated with SOX11 expression levels in both NB cell lines (p=0.00806, R=0.773, Spearman, n=11) and tumors (p=3.01e-12, R=0.308, Spearman, n=498) (Fig. 2b, Fig. S2c).
Also, in the normal developing sympathetic lineage and across a wide panel of cancer cell lines and adrenergic/mesenchymal isogenic cell line pairs2, we observed a similar pattern of SILC1 expression as for SOX11 (Fig. 2c, Fig. S2d-e). To investigate looping and physical contact of the cell type-specific enhancers with the promoter of SOX11 and SILC1, we performed 4C-seq analysis for the SOX11 and SILC1 locus in the CLB-GA (adrenergic) and SH-EP (mesenchymal) NB cell lines and observed looping in this highly active region between the enhancer loci, the SOX11 and SILC1 promoter in the adrenergic cell line CLB-GA while this interaction was not detectable in the mesenchymal cell line SH-EP (Fig. 2d, Fig. S2f). In summary, multiple adrenergic specific enhancers are flanking the SOX11 locus with putative roles in SOX11 regulation and SOX11 mediated cell identity, including the lncRNA SILC1.
SOX11 is a lineage dependency factor in adrenergic NB cells
In a next step, we investigated whether adrenergic NB cells are dependent on SOX11 expression for growth and survival as previously noted for MYCN and CRC members16. According to the publicly available CRISPR screen data in 769 cell lines (AVANA CRISPR 19Q1, available via the DepMap Portal)17, SOX11 is identified as a strongly selective gene with dependency in 19 NB cell lines (p=6.6e-17) and more specifically in 13 MYCN amplified NB (p=2.3e-20) cell lines. We further assessed the phenotypic effects of SOX11 knockdown in adrenergic NB cell lines, including two MYCN amplified cell lines (NGP and IMR-32) and two MYCN non-amplified cell lines with high MYC activity (SK-N-AS) or hTERT activation (CLB-GA), using RNA interference knockdown experiments (siRNAs and/or shRNAs) (Fig. 3a-b). Transient siRNA mediated knockdown of SOX11 for 48h in NGP, SK-N-AS and CLB-GA cells resulted in the expected decreased number of colonies, as compared to transfected control cells (Fig. 3c). Concomitantly, long-term phenotype assessment after SOX11 knockdown in the NGP, CLB-GA and IMR-32 cell lines, using the two most efficient shRNAs (sh3 and sh4), induced a significant G0/G1 growth arrest and reduction of proliferation (Fig. 3d-e). Our results confirm the CRISPR screen predicted strong lineage dependency role for SOX11 in the adrenergic NB cell line models we tested, both MYCN amplified and non-amplified.
The SOX11 regulated transcriptome is involved in epigenetic control, cytoskeleton and neurodevelopment
To identify the key factors and pathways contributing to the adrenergic NB cell phenotype, we performed global transcriptome analysis upon transient siRNA-mediated knockdown of SOX11 in IMR-32 cells for 48 hours (Fig. 4a, Fig. S4a, Supplementary Table 1). As an orthogonal experiment, we explored the transcriptome landscape after exogenously overexpressing SOX11 for 48h in the mesenchymal SH-EP NB cell line (Fig. 4b, Fig. S4a, Supplementary Table 1). First, validating our SOX11 knockdown specificity, we observed that both the significantly downregulated genes upon SOX11 knockdown (n=1016, adj. pval <0.05) and the upregulated genes upon SOX11 overexpression (n=4980, adj. pval <0.05) (further referred to as SOX11 activated targets) revealed enrichment of predicted SOX11 binding sites in their promoter (Fig. S4b). In line with the observed cell cycle arrest upon SOX11 knockdown, the CDK inhibitor CDKN1A (also known as p21) (p=6.8e-07) was one of the top upregulated genes (Fig. S4c). Unexpectedly, multiple epigenetic modifiers involved in DNA and histone methylation, SWI/SNF and PRC1/2 complex members were strongly enriched, several of which were identified through ChIP sequencing as direct SOX11 targets (see further) (Fig. 4c, Fig. S4d). Of further interest and in concordance with the above mentioned putative epigenetic drivers regulated by SOX11, upregulated genes upon SOX11 overexpression are involved in promoter and enhancer binding (Fig. 4c, Fig. S4e). In accordance with the putative role of SOX11 in adrenergic cell identity, we find enrichment for the genes of the proneuronal subtype in glioblastoma and adrenergic subtype in NB amongst the SOX11 activated targets. Vice versa, genes of the mesenchymal subtype are enriched amongst upregulated genes upon SOX11 knockdown and downregulated genes upon SOX11 overexpression (further referred as the SOX11 repressed targets) (Fig. S4f)18. 
Altogether, this is suggestive of a possible role for SOX11 in maintenance of cell identity. Furthermore, gene set enrichment analysis for the SOX11 activated targets revealed strong enrichment for axon outgrowth, neural crest cell migration, cytoskeleton and RHO signalling (Fig. 4c, Fig. S4g). The strong induction of MARCKSL1 upon SOX11 overexpression (log2 fold change 5.88) (Fig. S4c), an important regulator of actin stability and migration in neurons19, and the SOX11 regulated genes FSCN1 and TEAD2, two published SOX11 targets involved in cytoskeleton and neurodevelopment7,20, further supports the observed enriched gene sets.
Finally, SOX11 repressed targets are enriched for gene sets involved in translation initiation, ribosome hyperactivity and mRNA processing, which is indicative of an enhanced ribosome biogenesis response as previously reported in MYCN driven NB21(Fig. 4c). In conclusion, modulation of SOX11 expression affects a broad range of phenotypic features in NB cells, including epigenetic control, cytoskeleton and neurodevelopment.
SOX11 directly regulates the main modulators of the epigenome
To more comprehensively investigate direct functional targets of SOX11, we performed SOX11 DNA-occupancy analysis (ChIP-sequencing) in one of the prototypical MYCN amplified adrenergic cell lines, IMR-32. DNA binding motif analysis revealed a de novo SOX motif (24% of targets, binomial test, p=1e-223) in the significant SOX11 bound sites (n=3105), providing support for the validity of our analytical procedure (Fig. S5a) (Supplementary Table 2). The majority of these SOX11 peaks showed significant overlap with the H3K27ac mark for active chromatin (88%), promoter mark H3K4me3 (67%) and open chromatin defined by ATAC-seq (95%)13, consistent with binding of SOX11 to both proximal and distal active transcriptional regulatory regions of both protein coding and noncoding genes (Fig. 5a-b, Fig. S5b) (p< 2.2e-16, Fisher exact test for overlap). We further validated the ChIP-seq targets by correlation with the transcriptome signatures generated above upon SOX11 knockdown and overexpression in a panel of 29 NB cell lines and in two primary NB tumor transcriptome datasets, which showed strong overlap between SOX11 ChIP-seq activity score, the above transcriptome derived SOX11 signatures, SOX11 expression levels and NB patient survival outcome (Fig. 5c, Fig. S5c-e). Functional SOX11 targets were identified based on binding of SOX11 to their promoter/enhancer regions and when expression levels were significantly affected by SOX11 knockdown and overexpression. From a total of 2626 identified SOX11 binding sites, 313 functional targets were selected using these criteria. Strong enrichment of SOX11 binding was preferentially found in downregulated genes upon SOX11 KD, suggesting that SOX11 acts mainly as transcriptional activator within this cell model (Fig. 5c-d).
Also, further supporting the pathway analysis of the SOX11 transcriptome data, the activated targets bound by SOX11 (n=234) showed chromatin remodeling, transcriptional regulation and axonogenesis as the most significantly enriched pathways (Fig. 5e, Fig. S5f). Next, we narrowed down the target list and established an 88-gene signature with SOX11 top direct functional targets containing genes with SOX11 DNA binding, differential expression upon SOX11 perturbation and positive correlation with SOX11 expression in 2 independent primary NB cohorts (Fig. 5f, Supplementary Table 3). Strongly bound SOX11 targets included genes implicated in chromatin remodeling and enhancer activation (SWI/SNF core components SMARCC1, SMARCAD1), chromatin modification (PRC1/2-like complex components including CBX1 and CBX2), DNA methylation (DNMT1 binding partner UHRF1) and pioneer transcription factors including c-MYB, suggesting an important role for SOX11 in epigenetic regulation and control (Fig. 5b, Fig. 5g). In addition to c-MYB and SMARCC1, we observe elevated and reduced protein levels for the SWI/SNF core ATPase SMARCA4, upon SOX11 overexpression and SOX11 knockdown respectively (Fig. 5g, Fig. S5g-h). In summary, integration of SOX11 ChIP-seq data with SOX11 expression gene signatures reveals epigenetic modulators as bona-fide SOX11 direct targets in IMR-32.
SOX11 acts in concert with MYCN to regulate a subset of downstream targets
The finding of a canonical MYCN motif (13.5% of the SOX11 targets, p=1e-28, Fig. S5a) and MAX motif (11.5% of the SOX11 targets, p=1e-32) in the SOX11 ChIP-seq targets (Supplementary Table 2) prompted us to perform ChIP-sequencing for MYCN in IMR-32 cells to explore the relation between SOX11 and MYCN in more depth. We observed an enrichment for both MYC(N) and SOX motifs in the identified MYCN targets in IMR-32 cells (Supplementary Table 2). Indeed, SOX11 and MYCN share common binding sites, both at promoters and enhancers (Fig. 5h) (Fisher exact test called peaks p= 5.5e-06). The enrichment for SOX11 bound activated genes among the genes downregulated 6h and 12h after inducible knockdown of MYCN in IMR-5/75 (Fig. S5f) further confirmed SOX11 and MYCN shared targets. Interestingly, SOX11-only bound genes at enhancers revealed strong enrichment for a non-canonical E-box motif, while SOX11-MYCN co-bound genes or MYCN-only bound genes at enhancers show enrichment for a canonical E-box motif (Supplementary Table 2). This suggests that SOX11 co-binds enhancers with bHLH TFs binding at non-canonical E-box motifs, such as the TFs HAND2, TWIST1, TCF3 or ASCL1 (Fig. S5a). Of further interest, genes that were directly bound by SOX11-MYCN showed enrichment for the genes downregulated 6h and 12h upon inducible knockdown of MYCN in IMR-5/75, as well as for gene sets involved in histone mediated repression, such as targets of EZH2, a common binding target of SOX11 and MYCN, and the SWI/SNF complex (Fig. S5i). In summary, our data indicate co-binding of SOX11 and MYCN, suggesting a putative cooperative function between both transcription factors.
SOX11 regulates chromatin accessibility at active enhancers
SOX TFs have the exceptional ability to bind the minor groove of DNA and subsequently bend the DNA, which is essential for the enhanceosome architecture as well as for the distortion of DNA wrapped around histones, suggesting putative pioneer factor function8. As our current findings also reveal a role for SOX11 in transcriptional regulation of crucial components of several epigenetic protein regulatory complexes, including PRC2 and SWI/SNF, we propose a master epigenetic regulator function of SOX11 in adrenergic NB cells. To explore this further, we mapped chromatin accessibility changes by ATAC-sequencing 48h after SOX11 knockdown in adrenergic CLB-GA cells. Differential ATAC-seq peaks in CLB-GA revealed only significantly closed regions upon SOX11 knockdown (n=2875), indicating a broad effect of SOX11 on chromatin accessibility. Of further interest, these sites of chromatin accessibility changes are predominantly observed at enhancers (Fig. 6a). Moreover, differential ATAC peaks overlapping with active chromatin marks (H3K27ac and H3K4me1 binding, n=1345) were highly enriched for motifs of the adrenergic core regulatory circuitry (CRC), including GATA3, PHOX2B, ISL1 and HAND2 (Fig. 6a). Indeed, 89% of the active enhancers closed upon SOX11 knockdown overlap with binding of at least one member of the adrenergic CRC (PHOX2B, GATA3, HAND2) (Fig. S6a). Adrenergic specific super-enhancer regions2 show impaired chromatin accessibility upon SOX11 knockdown (Fig. 6b). Given the adrenergic specific expression of SOX11, the presence of an active downstream super-enhancer, and the enrichment for CRC binding in the regions closed upon SOX11 knockdown in adrenergic cells, we investigated the putative role of SOX11 in the adrenergic core regulatory circuitry. While the broad SOX11 enhancer binding activity also includes adrenergic CRC super-enhancers with lost chromatin accessibility upon SOX11 knockdown (Fig. S6b), SOX11 binding is not enriched in enhancer regions that are occupied by several CRC members (Fig. S6c-d). In addition, SOX11 does not bind the super-enhancer and SILC1 lncRNA locus that loops to the SOX11 promoter. However, SOX11 binding is shown to occur at a more distal super-enhancer present in only a limited number of NB cell lines (3/23) (data not shown). Furthermore, we do not see transcriptional impact upon SOX11 knockdown on the published CRC signatures or the key CRC members such as GATA3 and HAND2 (data not shown). To further understand the dynamic relation of SOX11 towards the CRC, we looked into expression levels of SOX11 and key CRC members PHOX2B, HAND2 and GATA3 in human pluripotent stem cell (hPSC)-derived developing sympathoblasts (Van Haver et al., in preparation). SOX11 was found to be expressed in earlier developmental stages (Fig. 6c, Fig. S6e). Taken together, our findings support the notion that SOX11 is not a canonical CRC member but plays a distinct role during early sympathoblast development prior to emergence of the adrenergic master regulator PHOX2B and the other CRC members including HAND2 and GATA3. As SOX11 mediates chromatin accessibility for adrenergic core regulatory circuitry gene binding, we propose a role for SOX11 in the epigenetic establishment and/or maintenance of the adrenergic CRC through activation of various key epigenetic components.
DISCUSSION
Cellular mechanisms that govern lineage-specific proliferation and survival during development may be co-opted by tumor cells. Consequently, these tumors will be selectively dependent on such lineage factors offering interesting options for targeting these tumor cells whilst sparing normal tissues. We identified SOX11 as a lineage dependency gene in neuroblastoma (NB) which is recurrently affected by large segmental 2p gains in high risk NBs as well as recurrent focal gains and amplifications. SOX11 was identified as the sole protein coding gene residing in the shortest region of overlap at 2p distal to MYCN, suggesting a role as driver for selection of the respective amplicons during tumor formation. SOX11 is known to exhibit lineage specific expression during normal development, predominantly in the neuronal lineage. In line with the data from the CRISPR screen available through the DepMap portal, our SOX11 knockdown data support that SOX11 is a dependency factor in NB. Furthermore, higher SOX11 expression levels were found to be correlated with poor prognosis for NB patients. To gain insight into the functional contribution to the tumor phenotype, we identified functional SOX11 target genes through genome wide DNA binding combined with transcriptome analysis after SOX11 knockdown and overexpression yielding 79 upregulated and 234 downregulated direct target genes indicating a primarily gene transcription activating role for SOX11. Both the diverse functionality of target genes and our in vitro experiments suggest a possible broad functional impact of SOX11 on the NB phenotype. Here, we further focussed on the possible role of SOX11 as epigenetic master regulator, given its direct role in regulation of components of the PRC2 and SWI/SNF complexes, the DNMT1 recruiting protein UHRF1 and pioneering TF c-MYB22. 
While these targets require further individual functional validation, the finding of multiple functional targets implicated in a broad range of essential epigenetic regulatory processes is intriguing. SOX11 was previously found to be affected by loss-of-function germline mutations causing the intellectual disability Coffin-Siris syndrome23, a disease mostly caused by mutations in SWI/SNF components including SMARCE1, and thus suggesting that SWI/SNF regulation is one of the major functions of SOX11 during neuronal development. We therefore propose that this function is also critically important in adrenergic NB cells. Possibly, SOX11 may contribute to the differentiation arrest in NB initiation as it has been reported that in normal development (e.g. cardiogenesis), activation of specific SWI/SNF components can have an impact on developmental transcriptional programs24. In this respect, it is of interest that overexpression of SMARCA4/BRG1, the catalytical component of the SWI/SNF complex, is essential for NB cell viability25. Furthermore, the previously established role of SWI/SNF chromatin remodeling in maintenance of lineage-specific enhancers leads us to hypothesize that the SWI/SNF complex could impact on NB tumor maintenance26.
While we observed an enrichment of adrenergic and mesenchymal gene signatures upon SOX11 overexpression and knockdown respectively, SOX11 overexpression in itself was not sufficient to induce a transition in cell lineage, at least not at 48 hours after induction of SOX11 overexpression. The exact role of SOX11 in the development of the sympathetic neuronal lineage and its functional interconnection with the adrenergic TFs of the CRC will require further investigation. Our current data do not support an unequivocal role for SOX11 as a canonical CRC member, despite high SOX11 expression being restricted to the adrenergic lineage. There is further no indication for SOX11 being part of the extended regulatory network (ERN), reported as downstream genes regulated by super-enhancers and the CRC transcriptional factors27, that do not necessarily bind to the super-enhancers of the CRC genes, nor have an autoregulatory function. Rather than a CRC TF-like function, we propose a CRC upstream hierarchical function based on gene expression analysis of SOX10 positive neural crest derived maturing sympathetic adrenergic neuroblasts. While the adrenergic master regulator PHOX2B is strongly induced at day 23 of differentiation together with several CRC TFs including HAND2 and GATA3, SOX11 is clearly expressed much earlier, from day 16 of the differentiation track onward. Given the recently identified role for SOX11 as pioneer factor and recent insights into the interaction of non-pioneering factors to modulate cell fate and identity28, together with the potential of SOX11 to broadly activate multiple epigenetic regulatory protein complexes, we propose a critical direct or cooperative role for SOX11 in induction of the CRC of TFs in developing sympathoblasts. We also propose that several SOX11 driven functions are co-opted by the transformed neuroblasts, contributing to the aggressive phenotype of high-risk adrenergic NBs.
Further support for this view comes from the recent description of a unique aggressive transitional cell state important for the inter-transition between adrenergic and mesenchymal cells through single-cell RNA-sequencing analysis of peripheral neuroblastic tumors29. Interestingly, SOX11 is described as a marker gene of the transitional state, amongst MYCN and others, while GATA3 and HAND2 mark the adrenergic state. This is in line with the fact that SOX11 was recently reported to be activated in a so-called proliferative active bridging population (transient cellular state) connecting a progenitor cell type coined Schwann cell precursors and their differentiated counterpart, chromaffin cells30, while the transitional signature in the above described tumors is enriched in the bridging population.
Our study also highlights long noncoding RNA SILC1 as a putative regulator of SOX11 activity. SILC1 is marked by strong adrenergic-specific H3K27ac peaks and maps to a putative regulatory, protein-coding-gene-poor region distal to SOX11. Interestingly, the SILC1 sequence and expression, in contrast to most other lncRNAs, is conserved in various mammals, and has been shown to regulate SOX11 expression in cis during neurite outgrowth and neuron regeneration15. Additionally, this lncRNA was previously shown to be implicated in neuronal cells with evidence for chromatin looping towards the SOX11 promoter as shown by HiC genome wide chromatin conformation data in mice31,32. Recent data also associated SILC1 expression with proliferation and apoptosis in NB33.
In conclusion, we identify SOX11 as a novel dependency factor in adrenergic high-risk NB with a putative function as epigenetic master regulator upstream of the recently discovered CRC. Further studies in in vitro cellular models and targeted overexpression to the sympathetic adrenergic lineage in mice or zebrafish as well as developmental studies in animals or hPSC differentiation models are needed to further explore the complex interplay of the broad range of transcription factors in adrenergic neuroblasts and how these interact with unscheduled MYCN overexpression leading to NB formation.
MATERIALS AND METHODS
Samples and cell lines
All patient specimens and samples were used in accordance with institutional and national policies at the respective locations, with appropriate approval provided by the relevant ethical committees at the respective institutions. All patient-related information was anonymized.
Copy number analysis was performed on primary untreated NB tumors, representative of all genomic subtypes including 263 and 223 samples10,11 of the NRC cohort (GSE85047), 556 samples of the NB high-risk cohort9 (GSE103123), and one unpublished in-house sample. In addition, copy number data of 33 NB cell lines10 and the cell line COG-N-373h (Fig. 1a) were used. SOX11 expression analysis was performed on 283 NB tumors for which copy number (n=218), mRNA expression (n=283) and patient survival (n=276) data were available from the Neuroblastoma Research Consortium (NRC, GSE85047), which is a collaboration between several European NB research groups. Additionally, the NB dataset from Su et al. (n=489, GSE45547) was used as validation cohort34.
All NB cell lines used in this manuscript (genotype and mutation status in Supplementary Table 4) were grown in RPMI1640 medium supplemented with 10% foetal bovine serum (FBS), 2 mM L-Glutamine and 100 IU/ml penicillin/streptomycin (further referred to as complete medium) at 37°C in a 5% CO2 humidified atmosphere. Short Tandem Repeat (STR) genotyping was used to validate cell line authenticity prior to performing the described experiments. Mycoplasma testing was done every two months and no mycoplasma was detected.
High-resolution DNA copy number analysis
DNA was obtained using the QiaAmp DNA Mini kit (Qiagen #51304) according to the manufacturer’s instructions and the concentration was determined by Nanodrop (Thermo Scientific) measurement. Array comparative genomic hybridisation (arrayCGH) and shallow whole genome sequencing were performed as previously described35,36. Copy number data were processed, analysed and visualised using VIVAR35. Fluorescence in situ hybridization (FISH) was performed as previously described37 using the CTD-2037E22 probe for the SOX11 locus. SOX11 amplifications and high-level focal gains were identified as copy number segments overlapping the SOX11 locus with a maximal size of 5 Mb and a log2 ratio >= 2 (amplification) or >= 0.3 (high-level focal gain).
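The segment-calling rule above can be sketched as follows. This is a minimal illustration and not the VIVAR implementation: the `Segment` and `Locus` types, the function name and the locus coordinates in the usage note are hypothetical, and only the thresholds (log2 ratio >= 2, log2 ratio >= 0.3, maximal size 5 Mb) are taken from the text.

```python
from typing import NamedTuple, Optional

MAX_SIZE_BP = 5_000_000   # maximal segment size (5 Mb), from the text
AMP_LOG2 = 2.0            # log2 ratio threshold for amplification
GAIN_LOG2 = 0.3           # log2 ratio threshold for high-level focal gain


class Locus(NamedTuple):
    chrom: str
    start: int
    end: int


class Segment(NamedTuple):
    chrom: str
    start: int
    end: int
    log2_ratio: float


def classify_segment(seg: Segment, locus: Locus) -> Optional[str]:
    """Return 'amplification', 'focal gain' or None for one segment."""
    overlaps = (
        seg.chrom == locus.chrom
        and seg.start <= locus.end
        and seg.end >= locus.start
    )
    # Segments that miss the locus or exceed 5 Mb are not called.
    if not overlaps or (seg.end - seg.start) > MAX_SIZE_BP:
        return None
    if seg.log2_ratio >= AMP_LOG2:
        return "amplification"
    if seg.log2_ratio >= GAIN_LOG2:
        return "focal gain"
    return None
```

For example, with a placeholder locus `Locus("2", 5_000_000, 5_010_000)` (illustrative coordinates, not the annotated SOX11 position), a 2 Mb overlapping segment with log2 ratio 2.4 is called an amplification, whereas a 40 Mb segment with the same ratio is not called because it exceeds the size limit.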
Tissue micro-array
A NB tissue micro-array was used as previously described38. For immunohistochemical staining, 5-μm sections were made, antigen retrieval was done in citrate buffer and endogenous peroxidases were blocked with H2O2 (DAKO). The sections were incubated with primary antibodies (SOX11-C1 antibody39 = Antibody 1, SOX11 antibody from Klinipath (cat#ILM3823-C01) = Antibody 2), followed by incubation with the Dako REAL™ EnVision™-HRP Rabbit/Mouse system, and substrate development was done with DAB (DAKO). The slides were scanned using the Zeiss Axio Scan.Z1 (Zeiss) and SOX11-positive NB cells were quantified by H-scoring. In brief, the percentage of SOX11-positive cells at each staining intensity (0, 1, 2 or 3) is multiplied by that intensity and the products are summed: [1 × (% cells 1+) + 2 × (% cells 2+) + 3 × (% cells 3+)]. Blind scoring was done by two independent persons. Each sample was present in triplicate and scores are presented as the average of the three replicates. Fifteen samples were omitted due to lack of survival data.
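As a worked illustration of the H-score formula, the short sketch below computes the score for one core and averages it over the replicates of a sample. The function names and input layout are hypothetical; only the formula and the averaging over triplicates come from the text.

```python
def h_score(pct_1: float, pct_2: float, pct_3: float) -> float:
    """H-score for one core: 1*(% cells 1+) + 2*(% cells 2+) + 3*(% cells 3+)."""
    if min(pct_1, pct_2, pct_3) < 0 or (pct_1 + pct_2 + pct_3) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_1 + 2 * pct_2 + 3 * pct_3


def sample_h_score(replicates):
    """Average H-score over the replicate cores of one sample.

    `replicates` is a list of (pct_1, pct_2, pct_3) tuples, one per core.
    """
    scores = [h_score(*core) for core in replicates]
    return sum(scores) / len(scores)
```

The score ranges from 0 (no positive cells) to 300 (all cells at intensity 3+). A core with 10% of cells at 1+, 20% at 2+ and 30% at 3+ scores 10 + 40 + 90 = 140.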
Statistical and transcriptomic analysis of NB cohorts and other entities
Neuroblastoma transcriptomic analysis was performed on a dataset of 283 NB tumors for which copy number (n=275), mRNA expression (n=283) and survival (n=276) data were available from the Neuroblastoma Research Consortium (NRC, GSE85047), which is a collaboration between several European NB research groups. Additionally, the NB dataset from Su et al. (n=489, GSE45547) was used as validation cohort34. The SILC1 expression data were analysed using the RNA Atlas (Lorenzi et al., in preparation). H3K27ac ChIP-seq data and super-enhancer annotation were publicly available from Boeva et al.3 and Decaesteker et al.13. The Depmap array17 and R2 platform (http://r2amc.nl) were used as repositories for gene expression and dependency data of different tumor entities.
All statistical analyses (two-sided t-test, Wilcoxon test, Kruskal-Wallis test, ANOVA, post-hoc Dunn's or Tukey test, Kaplan-Meier analysis, Spearman and Pearson correlation) were done using R (version 3.5.1). For correlation analysis, genes were ranked according to Pearson correlation coefficient.
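The correlation-based ranking can be illustrated with a small, dependency-free sketch. The analyses themselves were run in R; the Python code below only shows the idea, and the function names, the input layout (gene name mapped to per-sample expression values) and the toy gene names are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return None  # correlation is undefined for constant profiles
    return cov / (sd_x * sd_y)


def rank_by_correlation(expression, target):
    """Rank all other genes by Pearson correlation with `target`.

    `expression` maps gene name -> list of expression values (same samples,
    same order). Returns (gene, r) pairs, highest correlation first.
    """
    reference = expression[target]
    scored = []
    for gene, values in expression.items():
        if gene == target:
            continue
        r = pearson(reference, values)
        if r is not None:
            scored.append((gene, r))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Genes at the top of the returned list are the most positively correlated with the target gene, and genes at the bottom the most negatively correlated.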
4C-sequencing
4C-sequencing was performed as previously described13. For visualization purposes, the viewpoint was removed (chr2:58210000-4835000) and the plot was generated using R package Sushiplot with normalized bedgraph files.
Transfection and nucleofection of cell cultures
Cells were seeded in 6-well tissue culture plates 24 hours prior to transfection. 100 nM of siRNA non-targeting control (siRNA NTC, D-001810-10-05) or siRNA SOX11 (L-017377-01-0005, Dharmacon) was transiently transfected using DharmaFect 2 (Thermo Fisher Scientific) according to the manufacturer's guidelines. For nucleofection, cells were nucleofected with 100 nM of the above-described siRNA NTC or siRNA SOX11 using the Neon Transfection System (Thermo Fisher Scientific) and subsequently seeded in 6-well or T25 tissue culture plates.
Generation of stable cell lines
Four different MISSION shRNAs from the TRC1 library (Sigma-Aldrich, TRCN0000019174, TRCN0000019176, TRCN0000019177, TRCN0000019178, referred to in the manuscript as sh1, sh2, sh3 and sh4 respectively) targeting SOX11 and one non-targeting shRNA control (SHC002, NTC) were used to generate neuroblastoma cell lines with SOX11 knockdown.
Virus was produced by seeding 3×10⁶ HEK-293TN cells in a 15cm2 dish 24h prior to transfection. Transfection of the cells was done with trans-lentiviral packaging mix and lentiviral transfection vector DNA according to the Thermo Scientific Trans-Lentiviral Packaging Kit (TLP5913) using CaCl2 and 2x HBSS. 16 hours after transfection, the medium was replaced with reduced serum medium and lentivirus-containing medium was harvested 48 hours later. Virus was concentrated by adding 2500 μl ice-cold PEG-IT (System Biosciences) to 10 ml harvested supernatant and incubating overnight at 4°C, after which the virus was pelleted by centrifugation and the pellet was resuspended in complete medium. NGP, CLB-GA and IMR-32 cells were transduced by adding 250 μl concentrated virus to 1750 μl complete medium. 24h after transduction the medium was refreshed and 48h after transduction cells were selected using 1 μg/ml puromycin.
For SOX11 inducible overexpression, the SOX11 cDNA was amplified by PCR from the OriGene vector SC303275 and the obtained fragment was gel purified and ligated into the opened NdeI site of response vector pLVX-TRE3G-Zsgreen1 (Takara, cat#631353), producing pLVX-TRE3G-Zsgreen1-IRES-hSOX11. The constructed plasmid was verified by restriction digest and Sanger DNA sequencing (GATC). Lenti-X 293T Cells (Takara, cat#632180) were transfected with the regulator vector pLVX-pEF1a-Tet3G (cat#631353) and Lenti-X Packaging Single Shots (VSV-G) (cat#631275) according to the manufacturer’s instructions. The supernatant containing the lentivirus was collected, filtered through a 0.45 μm filter and concentrated using PEG-IT. SH-EP cells were infected with the concentrated virus and, after 48 hours of incubation, the transduced cells were selected using 500 μg/ml G418. Three individual clones were obtained by limiting dilution. After clonal expansion, the TET protein expression in each clone was checked by immunoblotting using TetR monoclonal antibody (Clone 9G9) (Clontech, cat#631131). In addition, induction of each expressing clone was tested after transduction with the pLVX-TRE3G-Luc control vector. Selected clones were transduced with lentivirus produced as described above from vector pLVX-TRE3G-Zsgreen1-IRES-hSOX11 and subsequently selected with 1 μg/ml puromycin. The SH-EP SOX11 clones were grown in complete medium supplemented with 10% tetracycline-free FBS to avoid leaky expression.
Phenotypic assessment of cells
For the colony formation assay, 2000 viable NGP, CLB-GA and SK-N-AS cells with or without SOX11 knockdown were seeded in a 6-cm dish in a total volume of 5 ml complete medium and were then left undisturbed for 10-14 days at 37°C. After an initial evaluation under the microscope, the colonies were stained with 0.005% crystal violet and digitally counted using ImageJ. The IncuCyte® Live Cell imaging system (Essen BioScience) was used for assessment of proliferation after SOX11 knockdown or overexpression. Briefly, 15×10³ viable NGP or 10×10³ viable SH-EP cells, with or without SOX11 knockdown or overexpression, were seeded in 5 replicates in a 96-well plate (Corning Costar 3596) containing complete medium. Cell viability was measured in real-time using the IncuCyte by imaging the whole well (4x) every 3 hours. Masking was done using the IncuCyte® ZOOM Software.
For cell cycle analysis, 7×10⁵ cells were seeded in a T25 flask in complete medium and nucleofected with SOX11 siRNA or transduced with SOX11 shRNAs or the respective controls and selected with puromycin, as described above. Cells were trypsinized and washed with PBS. The cells were resuspended in 300 μl cold PBS and, while vortexing, 700 μl of 70% ice-cold ethanol was added dropwise to fix the cells. Following incubation of the sample for a minimum of 1 hour at −20°C, cells were washed in PBS and resuspended in 500 μl PBS with RNase A at a final concentration of 0.25 mg/ml. After 1 hour of incubation at 37°C, 20 μl Propidium Iodide solution was added to a final concentration of 40 μg/ml. Samples were loaded on a BioRad S3™ Cell sorter and analysed with the Dean-Jett-Fox algorithm for cell-cycle analysis using the ModFit LT™ software package.
Culture and RNA-sequencing of hPSC differentiation track
Utilizing a modified dual-SMAD inhibition differentiation protocol developed by the Studer laboratory at the Memorial Sloan Kettering Cancer Center, we performed in vitro differentiations of hPSCs into SAPs. Over the course of a 40-day differentiation, cells were cultured and sorted on day 16 for the CD49d marker (SOX10-positive cells), when cells are committed to trunk neural crest cells. Cells were harvested at the neural crest and hSAP stages. RNA was isolated from the collected cell pellets by lysing the cells in TRIzol Reagent (ThermoFisher catalog # 15596018) and inducing phase separation with chloroform. Subsequently, RNA was precipitated with isopropanol and linear acrylamide and washed with 75% ethanol. The samples were resuspended in RNase-free water. After RiboGreen quantification and quality control by Agilent BioAnalyzer, 534-850 ng of total RNA with DV200% varying from 38-74% was used for ribosomal depletion and library preparation using the TruSeq Stranded Total RNA LT Kit (Illumina catalog # RS-122-1202) according to manufacturer’s instructions with 8 cycles of PCR. Samples were barcoded and run on a HiSeq 4000 in a 50bp/50bp paired end run, using the HiSeq 3000/4000 SBS Kit (Illumina). On average, 48 million paired reads were generated per sample and 35% of the data mapped to the transcriptome.
RNA-sequencing of perturbed NB cells
Poly-adenylated stranded mRNA sequencing was performed as previously described13. In brief, the samples were prepared using the TruSeq Stranded mRNA Sample Prep Kit from Illumina and subsequently sequenced on the Nextseq 500 platform. Sample and read quality was checked with FastQC (v0.11.3). Preprocessing of the fastq reads was performed as previously described13. Reads were aligned to the human genome GRCh38 with the STAR aligner (v2.5.3a) and gene count values were obtained with RSEM (v1.2.31). Genes were only retained if they were expressed at counts per million (cpm) above 1 in at least four samples. Counts were normalized with the TMM method (R-package edgeR), followed by voom transformation and differential expression analysis using limma (R-package limma). A general linear model was built with the treatment groups (knockdown or overexpression) and the replicates as a batch effect. Statistical testing was done using the empirical Bayes quasi-likelihood F-test.
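The expression filter described above (cpm above 1 in at least four samples) can be sketched as follows. The analysis itself was done in R with edgeR and limma; this dependency-free Python version is only an illustration, it computes cpm from raw library sizes without normalization factors, and the function name and input layout are assumptions.

```python
def filter_by_cpm(counts, min_cpm=1.0, min_samples=4):
    """Keep genes with cpm strictly above `min_cpm` in at least `min_samples` samples.

    `counts` maps gene name -> list of raw read counts, one per sample,
    with samples in the same order for every gene.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size of a sample = total counts over all genes in that sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    if any(size == 0 for size in lib_sizes):
        raise ValueError("every sample needs a non-zero library size")

    kept = {}
    for gene, row in counts.items():
        cpms = [count * 1e6 / size for count, size in zip(row, lib_sizes)]
        n_expressed = sum(1 for value in cpms if value > min_cpm)
        if n_expressed >= min_samples:
            kept[gene] = row
    return kept
```

Note that library sizes are computed before filtering, so removing lowly expressed genes does not change the cpm values of the genes that remain.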
Gene Set Enrichment Analysis was performed on the genes ordered according to differential expression statistic value (t). Signature scores were computed using a rank-scoring algorithm40. A custom-made ReplotGSEA function was used to generate gene set enrichment plots. For the data generated on the foetal adrenal glands and differentiation along the sympatho-adrenal lineage, normalisation was done using DESeq2 and rlog transformation, which is more robust when the size factors vary widely.
Western blot analysis and antibodies
Protein isolation and western blot were performed as previously described13. The membranes were probed with the following primary antibodies: anti-SOX11 antibody (SOX11-PAb, 1:1,000 dilution), anti-c-MYB antibody (12319S, Cell Signaling, 1:1,000 dilution), anti-MYCN antibody (SC-53993, Santa Cruz, 1:1,000 dilution), anti-SMARCC1 antibody (11956S, Cell Signaling, 1:1,000 dilution), and anti-SMARCA4 antibody (3508S, Cell Signaling, 1:500 dilution). As secondary antibodies, we used HRP-labeled anti-rabbit (7074S, Cell Signaling, 1:10,000 dilution) and anti-mouse (7076P2, Cell Signaling, 1:10,000 dilution) antibodies. Antibodies against Vinculin (V9131, Sigma-Aldrich, 1:10,000 dilution), alpha-Tubulin (T5168, Sigma-Aldrich, 1:10,000 dilution) or beta-actin (A2228, Sigma-Aldrich, 1:10,000 dilution) were used as loading control. The rabbit polyclonal antibody, SOX11-PAb, was custom made (Absea biotechnology, China) against the immunogenic peptide p-SOX11C-term DDDDDDDDDELQLQIKQEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASP TLSSSAESPEGASLYDEVRAGATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSS and used for western blot and chromatin immunoprecipitation for SOX11. All antibodies were diluted in milk/TBST (5% non-fat dry milk in TBS with 0.1% Tween-20).
Chromatin immunoprecipitation (ChIP) assay and ATAC-seq
Chromatin immunoprecipitation (ChIP) and Assay for Transposase-Accessible Chromatin (ATAC) using sequencing were performed as previously described13. Briefly, for ChIP-seq a total of 10×10⁷ cells were crosslinked with 1% formaldehyde, quenched with 125 mM glycine, lysed and sonicated with the S2 Covaris for 30 min to obtain 200-300 bp long fragments. Chromatin fragments were immunoprecipitated overnight using 1 μg of SOX11-PAb antibody and 20 μl Protein A UltraLink® Resin (Thermo Scientific) beads per 10×10⁶ cells. Reverse crosslinking was done at 65°C for 15h and chromatin was resuspended in TE-buffer and incubated with RNase A and proteinase K. DNA was isolated and the concentration was measured using the Qubit® dsDNA HS Assay Kit. Library prep was done using the NEBNext Ultra DNA library Prep Kit for Illumina (E7370S) with 500 ng starting material and using 8 PCR cycles according to the manufacturer’s instructions. For ATAC-seq, 50,000 cells were lysed and fragmented using digitonin and Tn5 transposase. The transposed DNA fragments were amplified and purified using Agencourt AMPure XP beads (Beckman Coulter). ChIP-seq and ATAC-seq libraries were sequenced on the NextSeq 500 platform (Illumina) using the NextSeq 500 High Output kit V2 75 or 150 cycles (Illumina).
ChIP-seq and ATAC-seq data-processing and analysis
Prior to mapping to the human reference genome (GRCh37/hg19) with bowtie2, quality of the raw sequencing data of both ChIP-seq and ATAC-seq was evaluated using FastQC and adapter trimming was done using TrimGalore. Aligned reads were filtered for a minimum MAPQ of 30 and reads in regions of known low sequencing confidence were removed using the ENCODE blacklist regions. Peak calling was performed using MACS2 taking a q value of 0.05 as threshold and peaks were filtered for chr2p amplified regions in the case of IMR-32 cells. DiffBind was used for differential ATAC-peak analysis. HOMER41 was used to perform motif enrichment analysis, with 200 bp around the peak summit as input. Overlap of peaks, annotation, heatmaps and pathway enrichment were analysed using DeepTools, the R package ChIPpeakAnno, and the web tool enrichR. Sushiplot was used for visualization of the data upon RPKM normalization or log likelihood ratio calculation with MACS2.
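Two of the interval operations above, taking a fixed window around each peak summit and discarding peaks that fall in blacklisted regions, can be sketched in a few lines. This is an illustration only and not the pipeline used: the function names are hypothetical, intervals are assumed to be half-open (chrom, start, end) tuples, and "200 bp around the summit" is interpreted here as 100 bp on either side.

```python
def summit_window(chrom, summit, size=200):
    """Return a `size`-bp window centred on a peak summit, clipped at position 0."""
    half = size // 2
    return (chrom, max(0, summit - half), summit + half)


def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def remove_blacklisted(peaks, blacklist):
    """Drop every peak that overlaps any blacklisted interval."""
    return [
        peak for peak in peaks
        if not any(overlaps(peak, region) for region in blacklist)
    ]
```

The linear scan over the blacklist is adequate for a few hundred regions; for genome-wide interval sets a sorted or tree-based lookup would be the usual choice.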
Data availability
The RNA-sequencing, ChIP-sequencing and ATAC-sequencing datasets generated during this study were deposited in the ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress) with accession numbers: E-MTAB-9338, E-MTAB-9340, E-MTAB-9463 and E-MTAB-9464. Tumor data that support the findings of this study are available from the Neuroblastoma Research Consortium (NRC, GSE85047), Su et al. [GSE45547]34 and Depuydt et al. [GSE103123]9, and ChIP-seq data are available from E-MTAB-6562 and E-MTAB-6570.
AUTHOR CONTRIBUTIONS
B.D., A.L., S.L., S.VHo., S.D., W.V., G.D., C.V., E.S., and N.R. contributed to the development and design of methodology; B.D., C.V., E.D., J.R., J.V., D.C. and J.K. performed computational and statistical analysis; B.D., A.L., S.L., S.D., F.D., S.VHa. and E.S. performed experiments; R.V., J.N., J.K., N.R., M.F., J.S. and S.E. provided material, data and analysis tools; B.D. managed the maintenance of data; B.D., A.L., F.S. and K.D. wrote the original draft; S.L., S.VHo., C.V., J.N., J.K., W.V., S.S.R., T.P. and P.V. contributed to manuscript review and editing; B.D., A.L. and G.D. contributed to data representation and visualization; F.S. and K.D. directed the project and were responsible for funding.
Supplementary Tables
Supplementary Table 1: Differentially expressed genes (adj.P.Val < 0.05) upon SOX11 knockdown in IMR-32 cells or SOX11 overexpression in SH-EP cells
Supplementary Table 2: SOX11 and MYCN ChIP-seq targets in IMR-32 cells (MACS2, q.Val < 0.05, gene annotation with homer), homer motif enrichment (known motifs) 200 bp size around peak summit, for enhancer or promoter binding
Supplementary Table 3: 88-gene signature of SOX11 top direct targets (up upon SOX11 overexpression, down upon SOX11 knockdown, ChIP-seq target and positively correlated expression in 2 NB tumor cohorts)
Supplementary Table 4: Cell lines used in the manuscript with the sample ID, origin and MYCN amplification status.
Supplementary Information
ACKNOWLEDGMENTS
The authors would like to thank C. Nunes, L. Mus, K. Verboom, S. Claeys, J. Van Laere and E. De Smet for technical assistance, Lucia Lorenzi for providing the data of the RNA atlas, and P. Reynolds and M. Hogarty for providing the COG-N-373 cell line. We acknowledge the use of the Integrated Genomics Operation Core, funded by the NCI Cancer Center Support Grant (CCSG, P30 CA08748), Cycle for Survival, and the Marie-Josée and Henry R. Kravis Center for Molecular Oncology. This research was supported by the following funding agencies: the Belgian Foundation against Cancer (project 2015-146 and F/2018/1246) to F.S., Ghent University (BOF10/GOA/019 and BOF16/GOA/23) to F.S., the Belgian Program of Interuniversity Poles of Attraction (IUAP Phase VII-P7/03) to F.S., the Fund for Scientific Research Flanders (Research projects G053012N, G050712N and G051516N to F.S., G021415N to K.D and F.S.), ‘Kom op tegen Kanker’ (Stand up to Cancer) the Flemish cancer society (Research grant to F.S.), the European Union H2020 (OPTIMIZE-NB GOD9415N and TRANSCAN-ON THE TRAC GOD8815N to F.S.) and FP7 (ENCCA 261474 and ASSET 259348 to F.S.), ‘Kinderkankerfonds’ (Research grant to F.S.), Olivia Fund to F.S. and Villa Joep to F.S. The following authors B.D., A.L., S.V. and C.V. are supported by an FWO grant.