Opposing immune and genetic forces shape oncogenic programs in synovial sarcoma

Synovial sarcoma is an aggressive mesenchymal neoplasm, driven by the SS18-SSX fusion, and characterized by immunogenic antigens expression and exceptionally low T cell infiltration levels. To study the cancer-immune interplay in this disease, we profiled 16,872 cells from 12 human synovial sarcoma tumors using single-cell RNA-sequencing (scRNA-Seq). Synovial sarcoma manifests antitumor immunity, high cellular plasticity and a core oncogenic program, which is predictive of low immune levels and poor clinical outcomes. Using genetic and pharmacological perturbations, we demonstrate that the program is controlled by the SS18-SSX driver and repressed by cytokines secreted by macrophages and T cells in the tumor microenvironment. Network modeling predicted that SS18-SSX promotes the program through HDAC1 and CDK6. Indeed, the combination of HDAC and CDK4/6 inhibitors represses the program, induces immunogenic cell states, and selectively targets synovial sarcoma cells. Our study demonstrates that immune evasion, cellular plasticity, and cell cycle are co-regulated and can be co-targeted in synovial sarcoma and potentially in other malignancies.


INTRODUCTION
Therapeutic strategies harnessing the cytotoxic capacity of the adaptive immune response to target tumor cells have radically changed clinical practice, but response varies dramatically across patients and tumor types (1,2). Cancer types with defined genetics and exceptionally low T cell infiltration levels could help provide clues to some of the immune escape mechanisms underlying lack of response to immune therapies.
The cellular plasticity (4), stem-like features (9,10), and unique genetics of SyS may explain its exceptional ability to escape immune surveillance despite the expression of immunogenic antigens. SyS is invariably driven by the SS18-SSX oncoproteinwhere the BAF subunit SS18 is fused to SSX1, SSX2 or, rarely, SSX4 (11). The BAF complex, the mammalian ortholog of SWI/SNF, is a major chromatin regulator (11), which has been previously shown to mediate resistance to immune checkpoint blockade in melanoma and renal cancer (12,13). SSX genes are a family of CTAs involved in transcriptional repression (14)(15)(16)(17). The resulting SS18-SSX oncoprotein leads to massive dysregulation of the chromatin architecture and transcriptional regulation (11,(18)(19)(20), generating a spectrum of malignant cellular morphologies (4), including epithelial-like malignant cells (in biphasic tumors), suggestive of pluripotential differentiation or mesenchymal to epithelial transitions.
Studies of human SyS to date have either relied on bulk tissue (21,22) or on established cellular models (11,18,19), masking important aspects of the tumor ecosystem. Moreover, given this cancer's rarity, even concerted efforts, such as TCGA, profiled only limited numbers of tumors (21)(22)(23). Here, we leveraged single-cell RNA-Seq (scRNA-Seq), imaging, functional perturbations, and computational modeling, to study the cancer-immune interplay in SyS. We Collectively, our findings demonstrate a strong connection between SyS development and immune evasion, and strengthen the notion that de-differentiation, immune evasion, and cell cycle are coregulated, such that cellular immunity can be targeted through modulation of cell cycle and epigenetic processes. Fig. 1D). Malignant cells primarily grouped by their tumor of origin, while their non-malignant counterparts (immune and stroma) grouped primarily by cell type (Fig. 1B,C), as we have observed in other tumor types (28)(29)(30). The malignant cells of each of the biphasic (BP) tumors (SyS1 and SyS12) formed two distinct subsetsepithelial and mesenchymalwhich clustered together with malignant cells of the other biphasic tumor (Fig. 1B,C, black, cyan and magenta dots, Methods). We next focused on characterizing the states of immune cells in SyS.

Evidence of antitumor immune activity despite low immune infiltration
The lack of effective antitumor immunity in SyS may results from: either the inactivity of immune cells, limiting their recognition of or response to SyS malignant cells, or hampered immune cell infiltration and recruitment into the tumor parenchyma. To test the first possibility, we examined CD8 T cell states ( Fig. 2A, Supp. Table 3A), and found clear hallmarks of antitumor immunity and recognition. T cell subsets span naïve, cytotoxic, exhausted, and regulatory T cells ( Fig. 2B; Methods), with evidence of expansion based on TCR reconstruction (31) (showing 57 clones, all patient-specific, with shared clones between matched samples from the same patient). While cytotoxic and exhaustion markers were generally co-expressed in T cells (Fig. 2B, consistent with previous reports (29)), clonally expanded T cells had unique transcriptional features (Methods, Supp. Table 3A), suggestive of an effector-like non-exhausted state (Fig. 2B, P < 6.6*10 -12 , mixed-effects). These expanded T cells might respond to SyS-specific CTAs, which were specifically expressed in large fractions of the malignant cell populations (Supp. Fig. 2A).
Other immune cells in the tumor microenvironment also showed features of antitumor immunity.
We next examined the alternative hypothesis that T cell abundance might be a limiting factor in SyS, despite these favorable T cell states. We compared SyS to 30 other cancer and sarcoma types.
SyS tumors showed extremely low levels of immune cells, which cannot be explained by variation in the mutational load ( Fig. 2D; P = 2.58*10 -11 , mixed effects when conditioning on the tumor mutational load), and despite the malignant-cell specific expression of immunogenic CTAs (Supp. Fig. 2A). In addition, unlike melanoma (Supp. Fig. 2E, left), T cell levels were not correlated with prognosis in SyS (Supp. Fig. 2E, right), indicating that they may not cross the critical threshold to impact clinical outcomes. Only mastocytes had a moderate positive association with improved prognosis (P = 0.012, Cox regression). These findings suggest that the lack of proper immune cell recruitment and infiltration is a key immune evasion mechanism in SyS, potentially mediated by the SyS cells.

Cellular plasticity and a core oncogenic program characterize synovial sarcoma cells
To identify malignant cell functions that may impact immune cell infiltration, we characterized the cellular programs in SyS malignant cells. We identified three co-regulated gene modules, which repeatedly appeared across multiple tumors in our cohort ( Fig. 3A-D, Supp. Table 4, Methods). The first two modules reflected mesenchymal and epithelial cell states (Fig. 3B, Supp. Among mesenchymal cells with a relatively low Overall Expression (Methods) of the mesenchymal program, one subset also expressed epithelial markers, reminiscent of transitioning to/from an epithelial state, while another underexpressed both programs, reminiscent of a poorly differentiated state. These poorly differentiated cells were highly enriched with cycling cells (P = 2.44*10 -60 , mixed effects), indicating that they might function as the tumor progenitors, fueling tumor growth (Fig. 3E,F, Supp. Fig. 3B,C). Diffusion map analysis of the cells based on these two programs highlighted putative differentiation trajectories, and found structured differentiation patterns only in the biphasic tumors (Fig. 3G, Methods). RNA velocity (38) demonstrated that epithelial to mesenchymal transitions may also take place (Supp. Fig. 3D), suggestive of cellular plasticity. Further supporting this hypothesis, the post-treatment sample of patient SyS12 includes a new subpopulation of mesenchymal cells, which was absent from the pre-treatment sample, and resembles the epithelial cells in terms of its CNAs (Supp. Fig. 3E).
The third module highlighted a new program present in a subset of cells in each tumor (25.2-84.7% per tumor, Fig. 3D,H, Supp. Fig. 4), which we named the core oncogenic program. The program is characterized by expression of genes from respiratory carbon metabolism (oxidative phosphorylation, citric acid cycle, and carbohydrate/protein metabolism, P < 1*10 -8 , hypergeometric test, Supp. Table 4), and repression of genes involved in TNF signaling, apoptosis, p53 signaling, and hypoxia processes (P < 1*10 -10 , hypergeometric test, Supp. Table   4), including known tumor suppressors, such as p21 (CDKN1A) and KLF4. The program was expressed in a higher proportion of cycling and poorly differentiated cells (P < 2.94*10 -4 , mixedeffects, Fig. 3I).
To test the clinical value of these transcriptional programs, we reanalyzed two independent bulk gene expression cohorts (21,22). Both dedifferentiation (Methods) and the core oncogenic program were substantially more pronounced in the more aggressive poorly differentiated SyS tumors (P < 2.76*10 -4 , one-sided t-test, Fig. 4A, Methods), and were associated with increased risk of metastatic disease (P < 1.36*10 -3 , Cox regression, Fig. 4B).
The core oncogenic program is associated with the cold phenotype and spatial niche Next we turned to explore the connection between the malignant cells' state and the tumor microenvironment and composition. Using our single-cell immune signatures we first estimated the composition of bulk SyS tumors in two published cohorts (18,22) and stratified them into "hot" or "cold", based on their relative inferred proportions of immune cells (Methods). "Hot" tumors showed the repression of the core oncogenic program and had significantly higher differentiation scores (P < 5.34*10 -3 , r = -0.44 and 0.48, respectively, partial Pearson correlation, conditioning on inferred tumor purity, Methods; Fig. 4C).
Interestingly, the core oncogenic program shows some degree of similarity to a transcriptional signature we recently associated with T cell exclusion in melanoma (32) (P < 7.16*10 -10 , hypergeometric test), although most genes in the program (~92%) were distinct from melanoma.
Among the overlapping genes we find the induction of the CTA MAGEA4, the BAF complex unit SMARCA4, and repression of apoptosis and p53 signaling (e.g., ATF3, JUN, KLF4, and SAT1).
The melanoma T cell exclusion signature and the synovial sarcoma mesenchymal state also overlapped (P = 6.33*10 -8 , hypergeometric test), for example, in the induction of SNAI2 and repression of 23 epithelial genes, including CDH1. Nevertheless, the programs were largely distinct, likely given the different tissues, microenvironments, cell of origin and genetic drivers.
The association between the core oncogenic program and T cell exclusion is observed in situ in the SyS samples from our single-cell cohort. We measured in situ expression of 12 proteins across 4,310,120 cells in 9 samples using multiplexed immunofluorescence (t-CyCIF) (39) (Fig. 4D,E; Methods), and profiled the in situ expression of 1,412 genes in 24 spatially distinct areas in two samples using the GeoMx high plex RNA Assay (early version for Next-Generation Sequencing; Methods). Both approaches showed that CD45 + immune cells were exceptionally low in SyS (<0.4%, compared to >8.7% in melanoma samples (32)). Moreover, the malignant cells in the more immune infiltrated areas show a marked decrease in the core oncogenic program (r = -0.53, P = 6.9*10 -3 , Pearson correlation, and P < 1*10 -10 , mixed effects; Methods). This suggests that the status of the malignant cells and the composition of the tumor microenvironment might be interconnected in SyS.

SS18-SSX sustains the core oncogenic program and blocks differentiation
To decouple the intrinsic and extrinsic factor determining the malignant cell states in SyS we first tested whether the core oncogenic and other programs were co-regulated by the genetic fusion driving SyS. We depleted SS18-SSX in two SyS cell lines (SYO1 and Aska) using shRNA, and profiled 12,263 cells by scRNA-Seq. The fusion knock-down (KD) led to extensive and highly consistent transcriptional alterations in both cell lines (Fig. 5A, Supp. Fig. 5A, Supp. Table 5): it substantially repressed the core oncogenic program and cell cycle genes (P < 8.05*10 -107 , t-test, and VIM (P < 1*10 -50 , t-test and likelihood-ratio test Fig. 5A,B,). The KD impact on the core oncogenic and differentiation programs was decoupled from the repression of cellular proliferation ( Fig. 5B), such that the impact on the core oncogenic and differentiation programs was observed even when controlling for the cycling status of the cells, and when considering only cycling or non-cycling cells (P < 1.54*10 -13 , t-test, Fig. 5B, Methods). Thus, the fusion's impact on cell cycle may be secondary or downstream to its impact on the core oncogenic program. In addition the fusion KD led to an induction of antigen presentation and cell autonomous immune responses, such as TNF and IFN signaling (P < 1*10 -30 , mixed-effects, Supp. Fig. 5A).
Using these SS18-SSX KD experiments we defined an SS18-SSX program, which we then stratified to direct and indirect fusion targets based on available SS18-SSX ChIP-Seq profiles (13,  Table 5A). According to the SS18-SSX program, the fusion directly dysregulates developmental programs and promotes the core oncogenic program (P < 2.51*10 -5 , hypergeometric test, Methods, Supp. Fig. 5B, Supp. Table 5), while its impact on cell cycle genes is mostly indirect (P < 1.2*10 -9 , hypergeometric test, Supp.  5B) and mediated by cyclin D2 (CCND2) and CDK6 -the only cell cycle genes that are members of the direct SS18-SSX program. Taken together, our findings support a model in which SS18-SSX directly promotes the core oncogenic program, blocks differentiations, drives cell cycle progression, and represses features necessary for immune recognition and recruitment.

TNF and IFN synergistically repress the core oncogenic and SS18-SSX programs
The association between the core oncogenic program and the cold phenotype suggest that the program promotes T cell exclusion in SyS. Another (non-mutually exclusive) hypothesis is that, despite their low numbers, the immune cells in the tumor microenvironment may nonetheless impact the state of the malignant cells, for example, through the secretion of different molecules and cytokines. To test this, we implemented a mixed-effects inference approach that uses scRNA-Seq data to find associations between the expression of secreted molecules and ligands in immune cells and the state of the malignant cells (Methods).
According to this analysis, the expression of IFN and TNF specifically from CD8 T cells and macrophages, respectively (Fig. 6A), was strongly associated with the repression of the core oncogenic program in the malignant cells (P < 9.4*10 -39 , mixed-effects). We further stratified the core oncogenic program to its predicted TNF/IFN -dependent and -independent components, by the association of each gene's expression in the malignant cells with the TNF and IFN expression levels in the corresponding macrophages and CD8 T cells, respectively (Methods, Supp. Table   6A).
Interestingly, TNF also induced TNF expression in the SyS cells (P < 5.57*10 -8 , mixed-effects), suggesting that autocrine signaling might induce the effect. Taken together, these findings demonstrate that macrophages and T cells can suppress the SS18-SSX program by secreting TNF and IFN .

HDAC and CDK4/6 inhibitors synergistically repress the immune resistant features of synovial sarcoma cells
Lastly, we turned to examine whether pharmacological agents could potentially repress the core oncogenic program and induce more immunogenic cell states in SyS cells. Computational modeling of the core oncogenic regulatory network (Methods) highlighted the SSX-SS18-HDAC1 complex (20) as the program's master regulator (Fig. 6C), and the tumor suppressor CDKN1A (p21) as its most repressed target. The latter indicates that the core oncogenic program is regulating, rather than regulated by, cell cycle genes through the p21-CDK2/4/6 axis, potentially reinforcing the direct induction of cyclin D and CDK6 by SS18-SSX (Fig. 6C,D). According to this model (Fig. 6D), modulators of cell cycle (e.g., CDK4/6 inhibitors) and SS18-SSX (e.g., HDAC inhibitors) could synergistically target the immune resistance features of SyS cells, especially in the presence of tumor microenvironment cytokines as TNF.
To test these predictions, we treated SyS lines and primary mesenchymal stem cells (MSCs) with low doses of HDAC and CDK4/6 inhibitors, in order to avoid global toxicity-related effects, and examined their impact on the transcriptional state of the cells. As predicted, the HDAC inhibitor panobinostat markedly repressed the core oncogenic program (P = 3.34*10 -14 , mixed-effects; The metabolic features of the core oncogenic program may also impact the tumor microenvironment. Supporting this notion, recent studies have shown that malignant cells use oxidative phosphorylation to create a hypoxic niche and promote T cell dysfunction (41). These metabolic features might reflect the conserved role of the SWI/SNF complex in regulating carbon metabolism and sucrose non-fermenting phenotypes in the yeast Saccharomyces cerevisiae (42).
These connections might generalize to other cancer types, as mutations in the BAF complex have been recently shown to induce a targetable dependency on oxidative phosphorylation in lung cancer (43).
Despite the extremely cold phenotypes displayed by SyS (Fig. 2D), expanded effector T cells are present in SyS tumors (Fig. 2B,C), potentially responding to the CTAs expressed specifically in the malignant cells, including NY-ESO-1 and PRAME (Supp. Fig. 2A). Consistently, vaccines triggering dendritic-cells to prime NY-ESO-1 specific T cells can lead to durable responses in SyS patients (7), further supporting the notion that SyS immune evasion operates primarily through impaired T cell or dendritic cell recruitment (44). The latter may also be mediated through Wnt/βcatenin signaling pathway, which has been previously shown to interfere with CD8 T cell recruitment to tumors by dendritic cells (44), and is indeed active in all the malignant SyS cells and directly induced by SS18-SSX (Supp. While the core oncogenic program shares some similar features with a T cell exclusion program we recently identified in melanoma (32), there are also substantial distinctions between the two programs, and >90% do not overlap between the two, likely reflecting the dramatic differences in driving events, cell of origin and tissue environment of the two tumors. This emphasizes the importance of understanding immune evasion for each tumor context. In particular, unlike the melanoma program, the core oncogenic program highlights a metabolic shift and is strongly connected to the genetic driver. In SyS tumors (but not in melanoma) we successfully decoupled, through computational inference, the intrinsic and extrinsic signals which modulate this transcriptional program, facilitating the reconstruction of multicellular circuits. This new approach revealed a bi-directional interaction between malignant and immune cells where CD8 T cells and macrophages can in turn repress the core oncogenic program through the secretion of TNF and IFN . Thus, beyond their direct cytotoxic activity, immune cells can alleviate some of the aggressive features of SyS cells through cytokine secretion, targeting also malignant cells with repressed antigen presentation or unrecognized epitopes.
Our findings also demonstrate that immune resistance, metabolic processes, cell cycle and dedifferentiation are tightly co-regulated in SyS. Hence, certain targeted therapies may be able to sensitize the tumor to immune surveillance. Supporting this notion, we demonstrate that the combined inhibition of HDAC and CDK4/6, two known repressors of SS18-SSX (45,46) and cellular proliferation (47), respectively, trigger immunogenic cell states even at sub-cytotoxic doses. This combinatorial treatment is also selectively cytotoxic to SyS cells, consistent with previous reports where HDAC and CDK4/6 inhibitors were used separately to induce cell death in SyS (45,47). The basal antitumor immune response we report, and the ability of T cells and macrophages to repress the core oncogenic and SS18-SSX programs support the potential of exploiting HDAC and CDK4/6 inhibitors together with immunotherapy.
Taken together, our study comprehensively maps and interrogates cell states in SyS and its surrounding tumor microenvironment, along with their multicellular regulatory circuits and clinical implications. We demonstrated that the SS18-SSX oncoprotein and the tumor microenvironment coordinately shape cell states in SyS, resulting in the establishment of an immune privileged environment (Fig. 7G). The possibility to selectively target the underlying mechanisms to reverse immune evasion offers a new perspective for the clinical management of SyS, and potentially other malignancies driven by similar genetic events.        Clinical annotations are provided in Supp. Table 1.

Fluorescence-activated cell sorting (FACS)
Tumor cells were kept in Phosphate Buffered Saline with 1% bovine serum albumin (PBS/BSA) while staining. Cells were stained using calcein AM (Life Technologies) and TO-PRO-3 iodide Plates were snap frozen on dry ice right after sorting and stored at -80°C prior to whole transcriptome amplification, library preparation and sequencing.

Library construction and sequencing
For plate-based scRNA-seq, Whole transcriptome amplification was performed using the SMART-seq2 protocol (24), with some modifications as previously described (30,49,50). The Nextera XT Library Prep kit (Illumina) was used for library preparation, with custom barcode adapters (sequences available upon request). Libraries from 384 to 768 cells with unique barcodes were combined and sequenced using a NextSeq 500 sequencer (Illumina).
In addition to SMART-seq2, cells from three samples (SS12pT, SS13 and SS14) were also sequenced using droplet-based scRNA-Seq with the 10x genomics platform. The samples were partitioned for SMART-seq2 and 10x genomics after dissociation. For each tumor, approximately two thirds of the sample was used for SMART-seq2 and one third for droplet based scRNA-seq (10x genomics). We sorted viable cells using MACS (Dead Cell Removal Kit, Miltenyi Biotec) and ran up to 2 channels per sample with a targeted number of cell recovery of 2,000 cells per channel. The samples were processed using the 10x Genomics Chromium 3' Gene Expression Solution (version 2) based on manufacturer instructions and sequenced using a NextSeq 500 sequencer (Illumina).

Whole exome sequencing (WES)
DNA and RNA were extracted from fresh frozen tissue or Formalin-Fixed Paraffin-Embedded (FFPE) blocks for each patient (obtained according to their respective Institutional Review Boardapproved protocols) using the AllPrep DNA/RNA extraction kit (Qiagen). We used tumor tissue and matched normal muscle tissue from the same patient as reference. Library construction was performed as previously described (50) Imager5 software (RareCyte Inc.) was used to sequentially scan the region of interest in 4 fluorescence channels. Image processing, background subtraction, image registration, single-cell segmentation and quantification were performed as previously described (51).

RNA in situ hybridization
Paraffin-embedded tissue sections from human tumors from Massachusetts General Hospital and and University Hospital of Lausanne were obtained according to their respective Institutional Review Board-approved protocols. Sections were mounted on glass slides and stored at -80°C.

scRNA-seq pre-processing and gene expression quantification
BAM files were converted to merged, demultiplexed FASTQ files. The paired-end reads obtained with SMART-Seq2 were mapped to the UCSC hg19 human transcriptome using Bowtie (55), and transcript-per-million (TPM) values were calculated with RSEM v1.2.8 in paired-end mode (56).
The paired-end reads obtained with droplet scRNA-Seq (10x Genomics) were mapped to the UCSC hg19 human transcriptome using STAR (57), and gene counts/TPM values were obtained using CellRanger (cellranger-2.1.0, 10x Genomics). For each cell, we quantified the number of genes with at least one mapped read, and the average expression level of a curated list of housekeeping genes (58). We excluded all cells with either fewer than 1,700 detected genes or an average housekeeping expression (E, as defined above) below 3 (Supp.

Copy number and copy ratio analysis
To infer somatic copy number from WES, we used ReCapSeg (http:// gatkforums.broadinstitute.org/categories/recapseg-documentation), calculating proportional coverage for each target region (i.e., reads in the target/total reads) followed by segment normalization using the median coverage in a panel of normal samples. The resulting copy ratios were segmented using the circular binary segmentation algorithm (66). To infer allele-specific copy ratios, we mapped all germline heterozygous sites in the germline normal sample using GATK Haplotype Caller (67) and then evaluated the read counts at the germline heterozygous sites in order to assess the copy profile of each homologous chromosome. The allele-specific copy profiles were segmented to produce allele specific copy ratios.

Gene sets Overall Expression
We used the following scheme to compute the Overall Expression (OE) of a gene set (signature).
The OE metric (32) filters technical variation and highlights biologically meaningful patterns. The procedure is based on the notion that the measured expression of a specific gene is correlated with its true expression (signal), but also contains a technical (noise) component. The latter may be due . Given a gene signature S that consists of K genes, with kb genes in bin b, we sample random S-compatible signatures for normalization. A random signature is S-compatible with a signature S if it consists of overall K genes, such that in each bin b it has exactly kb genes. The OE of signature S in cell or sample j is then defined as: Where S is a random S-compatible signature, and Cij is the centered expression of gene i in cell or In cases where the OE of a given signature/program has a bimodal distribution across the cell population, it can be used to naturally separate the cells into two subsets. To this end, we applied the Expectation Maximization (EM) algorithm for mixtures of normal distributions to define the two underlying normal distributions. We then assigned cells to two subsets, depending on the distribution (high or low) they were assigned to.

Cell type assignments
Cell type assignments were performed based on genetic and transcriptional features, according to the following four analyses: (1) Fusion detection. Fusion detection was performed with STAR-Fusion (26), to detect any transcript that indicates the fusion of two genes.  (Fig. 1G) using a Bayesian approach, as described in https://github.com/broadinstitute/infercnv/wiki/infercnv-i6-HMM-type.
(3) Differential similarity to bulk tumors. We compared the scRNA-Seq profiles to those of bulk sarcoma tumors (23). RNA-Seq of bulk sarcoma tumors was downloaded from TCGA Cell type assignments were preformed separately for the SMART-Seq2 and droplet scRNA-Seq datasets cohort. Fusion detection was performed only with the full length SMART-Seq2 data.

Cell type signatures
Cell type signatures were generated based on pairwise comparisons between identified cell subtypes: malignant cells, epithelial cells, CAFs, CD8 and CD4 T cells, B cells, NK cells, macrophages, and mastocytes. For each pair of cell subtypes we identified differentially expressed genes using the likelihood-ratio test (69), as implemented in the Seurat package (http://www.satijalab.org/seurat). Genes were considered as cell type specific if they were overexpressed in a particular cell subtype compared to all other cell subtypes (log-fold change > 0.25 and p-value < 0.05, following Bonferroni correction). We defined a general T cell signature for both CD4 and CD8 cells by identifying genes that were overexpressed in both CD4 and CD8 compared to all other (non T) cells. A more permissive version of this generic T cell signature includes genes which were overexpressed in CD4 or CD8 T cells compared to all other (non T) cells.

Inferring tumor composition
Tumor composition was assessed based on the Overall Expression of the different cell type specific signatures we identified from the scRNA-seq data (Supp. Table 2). For example, the CD8 T cell signature was used to infer the level of CD8 T cells in the tumor, and likewise for other cell types.
To estimate tumor purity we used the malignant SyS signature identified here (Supp. Table 2 We also used cell type signatures we previously derived from melanoma scRNA-Seq data (32) to predict the tumor composition of the simulated SyS bulk RNA-Seq profiles, and vice versa. We then compared the predictions to the known cell type composition. The predicted composition was highly correlated with the known composition (r > 0.9, P < 1*10 -30 , Spearman correlation) for all cell types.

Multilevel mixed-effects models
To examine the association between two cell features, denoted here as x and y, across different patients or experiments we used multilevel mixed-effects regression models (random intercepts models). The models include patient/experiment-specific intercepts to control for the dependency between the scRNA-seq profiles of cells that were obtained from the same patient/experiment. The models also control for data quality by providing the number of reads (log-transformed) that were detected in each cell as a covariate. To compute the association between features x and y we provided x as another covariate and used y as the dependent variable. The models were implemented using the lme4 and lmerTest R packages (https://CRAN.Rproject.org/package=lme4, https://CRAN.R-project.org/package=lmerTest).
For example, to test if malignant cycling cells were more frequent in treatment naïve samples, we used a logistic mixed-effects model as described above. The dependent variable y was the cycling status of the malignant cells. The independent covariate x was a binary variable denoting if the sample was obtained before or after treatment. Only malignant cells were included in this model.

T Cell Receptor (TCR) reconstruction and T cell expansion program
TCR reconstruction was performed using TraCeR (31), with the Python package in https://github.com/Teichlab/tracer. To characterize the transcriptional state of clonally expanded T cells, we first identified the clonality level of the T cells in our cohort. T cell that were obtained from tumors with a larger number of T cells with reconstructed TCRs were more likely to be defined as expanded. To control for this confounder we performed the following down-sampling procedure. First, we removed T cells without a reconstructed alpha or beta TCR chain, and samples with less than 20 T cells with a reconstructed TCR. Next, we computed the probability that a given cell will be a part of a clone when subsampling 20 T cells from each tumor. T cells with a high probability to be a part of a clone (above the median) were considered expanded, and nonexpanded otherwise. To identify the genes differentially expressed in expanded CD8 T cells we used mixed-effects models with a binary covariate, denoting if the cell was classified as expanded or not.

CD8 T cell analyses
The analysis of T cell exhaustion vs. T cell cytotoxicity was performed as previously described (58), with the exhaustion signature provided in (58). First, we computed the cytotoxicity and exhaustion scores of each CD8 T cell. Next, to control for the association between the expression of exhaustion and cytotoxicity markers, we estimated the relationship between the cytotoxicity and exhaustion scores using locally-weighted polynomial regression (LOWESS, black line in Fig. 2B).
Based on these values we classified the CD8 T cells into four groups: Cells with a low cytotoxicity score (below the 25 th percentile) were classified as naïve or memory-like cells, while the others were considered effector or exhausted if their cytotoxicity scores were significantly higher or lower than expected given their exhaustion scores, respectively. According to this classification, we examined if the clonal expansion program was higher in the effector-like cells. In addition, we compared the SyS CD8 T cells to CD8 T cells from human melanoma tumors (32) using mixedeffects models with a sample-level covariate denoting if the sample was obtained from a SyS or melanoma tumor.

Malignant epithelial and mesenchymal differentiation programs
The epithelial and mesenchymal signatures were obtained through intra-tumor differential expression analysis, using the likelihood-ratio test for single cell gene expression (69), as implemented in the Seurat package (http://www.satijalab.org/seurat). We compared the mesenchymal to epithelial cells in each of the three biphasic tumor samples (SyS1, SyS12 and SyS12pt). The tumor SyS16 was not included in this analysis (although it was annotated as partially biphasic according to its histology), because its scRNA-Seq sample did not include any epithelial malignant cells. Genes that were up-regulated in the epithelial cells compared to the mesenchymal cells in all three samples were defined as epithelial genes, and likewise for mesenchymal genes. When using the epithelial and mesenchymal signatures in the analysis of bulk gene expression we removed from these signatures those genes that are also part of non-malignant cell type signatures.
Using these signatures we defined: (1) the epithelial vs. mesenchymal differentiation score as the OE of the epithelial signature minus the OE of the mesenchymal signature, and (2) the differentiation score as the OE of the epithelial signature plus the OE of the mesenchymal signature. An alternative way to define the differentiation score of a particular cell is first to assign it to the epithelial or mesenchymal subset, and then use only the pertaining signature to estimate its differentiation level. However, this approach will not distinguish between poorly-differentiated mesenchymal cells, and mesenchymal cells which have begun to transition to an epithelial state.
Hence, we used the inclusive definition of differentiation.
Based on the genes in the epithelial and mesenchymal signatures we then generated diffusion maps (70) for each one of the tumors in our cohort, using the density R package (https://bioconductor.org/packages/release/bioc/html/destiny.html) with the default parameters.

Identifying co-regulated gene modules
To identify co-regulated gene modules that capture intra-tumor heterogeneity we analyzed each tumor separately. To identify patterns that explain the cell-cell variation both in epithelial and in mesenchymal malignant cells, we further divided the biphasic samples (SyS1, SyS12, and SyS12pt) to their epithelial and mesenchymal compartments. We used PAGODA (71) as implemented in https://github.com/hms-dbmi/scde to filter technical variation and identify coregulated gene modules in each sample. To identify genes that were repeatedly co-regulated we then constructed a gene-gene co-regulation graph. In this graph, an edge between two genes denotes that the two genes appeared together in the same gene module in at least five samples.
Next, we identified dense clusters in the graph using the Newman-Girvan (72) community clustering as previously implemented (73). We filtered out small gene clusters (< 20 genes). Lastly, for each gene cluster we identified the opposing gene module by identifying genes that were negatively correlated with its Overall Expression (OE) across the malignant cells. Correlation was computed using partial Spearman correlation, when controlling for the number of genes and (log-transformed) reads detected per cells, and correcting for multiple hypotheses testing using the Benjamini-Hochberg procedure (74).

Quantifying RNA velocity
Estimates of RNA velocity were computed using the Velocyto toolkit (http://velocyto.org/). We applied Velocyto with the default parameters, using a gene-relative model. To explore the potential transitions between the epithelial and mesenchymal cell states and avoid confounders, we used only the genes from these differentiation programs (Supp. Table 4) for the analysis.

Predicting patient prognosis
To test if a given program predicts metastasis free-survival or overall survival, we first computed extracted UMIs were then aligned to a genome containing the 12bp reference sequence tags using bowtie2 (version 2.3.4.1) in end-to-end mode with a seed length of four. Using a custom python function, the generated SAM files were split into multiple SAM files based on the tag to which they aligned to limit memory usage when removing PCR duplicates. The split SAM files were converted to bam files, sorted, and index using samtools (version 1.9) with the import, sort, and index options respectively. PCR duplicates were removed from the sorted and indexed bam files using the dedup command from umi tools with an edit distance threshold of three. An edit distance threshold of three was used. Using custom python functions, the SAM files with PCR duplicates removed were merged for each sample and used to generate digital counts of the tags.
Outlier counts were removed before generating a consensus count for each target. Outlier tags were identified as those with counts 90% below the mean of the probe group in at least 20% of the

Identifying SS18-SSX targets
The fusion program consists of genes that were differentially expressed in the Aska or SYO1 cells with the SS18-SSX shRNA (shSSX) compared to those with control shRNA (shCt) after 3 or 7 days post-infection. Gene that were previously reported (18,19) to be bound by the SS18-SSX oncoprotein in at least two SyS cell lines were defined as direct SS18-SSX targets, and were used to stratify the SS18-SSX program to direct and indirect targets.

Mapping cancer-immune interactions
The association between the core oncogenic program in the malignant cells and the expression of different ligands/cytokines in the immune cells was examined using the multilevel mixed-effects regression model described above, using the scRNA-Seq data collected from SyS tumors. The dependent variable y was the OE of the core oncogenic program and the covariate x was the average expression of a certain ligand/cytokine in a specific type of immune cells (e.g., macrophages) that were profiled from the same tumor. The model also corrected for inter-patient dependencies using the patient-specific intercepts and for cell complexity (log(number of reads)).
We restricted the analysis to ligands/cytokines that can physically bind to proteins expressed by the malignant cells (77). The immune cells were either macrophages or CD8 T cells, as other immune cell types were not sufficiently represented in the data.
We used a similar approach to further stratify the program to its TNF/IFN-dependent and independent components. We repeated the same analysis described above, using each one of the genes in the core oncogenic program as the dependent variable. Genes which were associated with both TNF and IFN (P < 0.05, following Bonferroni correction) were considered as TNF/IFN-dependent, and genes which were not associated with both cytokines (P > 0.05) were considered as TNF/IFN-independent.

TNF and IFN impact on SyS cell cultures
SyS cell cultures were treated with TNF and IFN , separately and in combination (see In vitro IFN/TNF experiment section), and profiled with scRNA-Seq. Given this data, differentially expressed genes and gene sets were identified using mixed-effects regression models (Multilevel

Reconstructing regulatory networks
To reconstruct the gene regulatory network controlling the core oncogenic program we assembled a database of transcription factor (TF) to target mapping based on four sources: JASPAR (78), HTRIdb (79), MSigDB (80,81), and TRRUST (82), and augmented it with the direct SS18-SSX targets identified here (Supp . Table 5A) and TF-target pairs we identified in a cis-regulatory motif analysis of the core oncogenic program. Specifically, for the cis-regulatory analysis, we used RcisTarget (83) (a R/Bioconductor implementation of icisTarget (84) and iRegulon (85)) to identify cis-regulatory elements significantly overrepresented in a window of 500bp around the transcription start site of the core oncogenic genes (normalized enrichment score > 3.0) along with their cognate TFs.
We pruned the resulting network to include only core oncogenic program genes (and SS18-SSX) (i.e., all TFs and targets aside from SS18-SSX are program genes). An edge in the network between a TF and its target denotes that: (1) the TF is regulating the target according to at least one of the sources described above, and (2) there is an association between their expression levels in the scRNA-Seq data of SyS tumors. Edges are weighted 1 and -1 to reflect positive and negative associations. We used pageRank (86) (with the R implementation as provided in igraph (https://igraph.org/r/)) as a measure of TF and target importance in the network. To compute TF importance we first flipped the direction of the edges in the network, going from target to TFs.
Consistent with the network weights, targets from the up-or down-regulated side of the network were considered induced or repressed, respectively. Likewise, TFs from the up-or down-regulated side of the network were considered activators and repressors, respectively.

Selectivity and synergy in drug experiments
To evaluate the impact of each drug on the expression of a certain program or gene in a specific

Data availability
Processed scRNA-seq data and interactive plots generated for this study will be provided through the Single Cell Portal. The processed scRNA-seq data will be provided via the Gene Expression Omnibus (GEO).

SUPPLEMENTAL TABLES LEGENDS
The Supplemental Tables are provided in separate (Excel) files.
Supplementary Table 1