Single-cell transcriptomics reveals varying degrees 1 of epithelial-mesenchymal plasticity in lymph node 2 metastasis from oral cavity squamous cell carcinoma 3

Kai Horny, Christoph Sproll, Lukas Peiffer, Frauke Furtmann, Patricia Gerhardt, Jan 4 Gravemeyer, Nikolas H. Stoecklein, Ivelina Spassova, Jürgen C. Becker 5 1 Translational Skin Cancer Research, German Cancer Consortium (DKTK), 45141 Essen, 6 Germany 7 2 German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany 8 3 Department of Oral-, Maxilloand Plastic Facial Surgery, Medical Faculty, University Hospital 9 of the Heinrich-Heine-University Düsseldorf, Germany 10 4 Department of Dermatology, University Medicine Essen, 45141 Essen, Germany 11 5 Department of General, Visceral and Pediatric Surgery, Medical Faculty, University Hospital 12 of the Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany 13

showing a multitude of intermediate or partial states [5,8,9,[11][12][13][14][15]. Therefore, it has recently 72 been recommended that this EMT continuum is referred to as epithelial-mesenchymal 73 plasticity (EMP) [7,16]. During the metastatic cascade, cells evolve through this continuum, 74 changing from epithelial to more mesenchymal or partial EMT states and back to epithelial 75 phenotypes [17]. and discovered a partial EMT state with high variability in EMP-related gene expression [12]. 80 Here, we report comprehensive single-cell RNA sequencing data for 1,946 malignant cells 81 isolated from an advanced metachronous metastasis of an OSCC from the lower lip mucosa. 82 Our data demonstrates varying degrees of EMP within a single metastasis, driven in different 83 directions and towards environmental adaptations such as hypoxia, supply of nutrients and 84 possible cell-cell interactions. 85

Patient and tumor characteristics 87
The patient initially suffered from an OSCC of the lower lip mucosa that was surgically removed 88 by wedge excision with primary wound closure. Neither subsequent elective neck dissection 89 nor irradiation was performed presumably because of the multimorbidity of the patient. One 90 year later, the patient noticed a swelling under the left mandible, which was suspected to be 91 multiple metastases on computed tomography. With a delay of an additional four months, a 92 left sided neck dissection was performed confirming the diagnosis. One of these lymph node 93 metastases was selected for single cell RNA sequencing (scRNAseq). number variations (CNVs) inferred from scRNAseq data as well as the expression of S100A2, 106 S100A14, cytokeratins (KRT5, KRT14, KRT17) and stratifin (SFN), the latter highlighting their 107 epithelial origin ( Figure 1C, D, Supplementary Figure 2E). 108

Phenotypic heterogeneity of tumor cell populations is characterized by specific gene 109 expression patterns 110
Within tumor cells, we identified seven different cell populations based on gene expression 111 profiles using clustering and uniform manifold approximation and projection (UMAP) ( Figure 7 2A). The relationship between these populations is reflected by a trajectory branching into two 113 curves that largely follow the first and second principal component (PC) (Figure 2B). The 114 variation reflected in the first PC is mostly driven by EMP-related genes such as vimentin (VIM), 115 cytokeratins (KRT17, KRT6B, KRT6A), S100 proteins (S100A8, S100A9) and matrix 116 metallopeptidases (MMP10) ( Figure 2C). The second curve varies in the expression of cell 117 cycle (e.g., HMGB2, TUBB), metabolism (e.g., ASNS, SLC2A1) and hypoxia (e.g., NDRG1) 118 related genes. The majority of malignant cells are accumulating closely on the center of the 119 trajectory within clusters 2, 3 and 6, while the remaining cells showed larger variation in their 120 expression patterns as indicated by larger distances in PCs and on the trajectory ( Figure 2B, 121 D). 122 Since the beginning and end of such trajectories cannot be defined with certainty, we estimated 123 RNA velocity to predict the short-term future development of individual cells using the ratio of 124 spliced and non-spliced mRNA counts ( Figure 2E). While the cells in the center of the trajectory 125 did not show a uniform developmental direction, the cells from cluster 4 were developing 126 towards cluster 5, adopting a progressive epithelial differentiation, and cells from cluster 6 127 towards cluster 7 resembling cell cycle progression. Another possibility to determine the 128 direction of a trajectory is the accumulation of genetic aberrations in the cancer cells. The 129 inferred CNVs revealed an increasing frequency of copy number gains on chromosome 1, 8, 130 17, and 19 for cells developing towards cluster 4 and cluster 5 ( Figure 2F). These gains 131 suggest that these cells developed later during tumor evolution. However, it should be noted 132 that two of the four identified CNVs on chromosome 1 and 17 are associated with upregulated 133 epithelial genes in cluster 4 and 5 that are in close genomic proximity; thus, these two may not 134 represent true genomic CNVs, but rather reflect the high expression of these genes 135 (Supplementary Figure 3). 136

Epithelial-mesenchymal plasticity (EMP) drives heterogeneity of metastatic OSCC cells 137
The gene expression patterns of the different tumor cell populations reflect varying degrees of 138 EMP, cell cycle, and metabolic adaptation ( Figure 3A, B). Two clusters (cluster 3 & 6) express 139 genes associated with a partial EMT state, two clusters (cluster 4 & 5) express genes related 140 to epithelial differentiation, two clusters (cluster 1 & 2) express genes linked to metabolic 141 changes for hypoxia-response or in amino acid metabolism and one cluster (cluster 7) express 142 genes related to cell cycle. A large number of differentially expressed genes (DEGs) of cluster 143 2, 3, 6 and 7 are within the EMT hallmark gene set from the molecular signatures database 144 (MSigDB) and are known to contribute to previously reported partial EMT gene expression  Specifically, cells of cluster 4 express classical epithelial markers such as SPRR1B, KRT16, 162 KRT6B, S100A7, S100A8 and S100A9. With advancing epithelial differentiation towards 163 cluster 5 the expression of KRT13, KRT15, PSCA and kallikrein-related peptidases KLK5/6/7 164 increases. We confirmed the gene expression-based observation by comparing our epithelial 165 and partial EMT signatures to a broader set of gene expression signatures from the EMTome 166 database (Supplementary Figure 5). 167

Gene expression patterns indicate differences in metabolic adaptation of OSCC cells 168
As already mentioned above, cluster 1 and 2 were characterized by DEGs that are essentially 169 associated with changes in metabolism ( Figure  factors is not well reflected by its mRNA expression alone, as it depends also on protein 205 stability and posttranslational modifications; for example, ZEB1 is more stable than SNAI1 [28]. 206 To address this notion, we inferred the activity of transcription factors based on the mRNA 207 expression profile of their target genes using the algorithm VIPER with regulons defined by 208 DoRothEA [29][30][31]. Surprisingly, despite the fact that we could only detect mRNA for SNAI2, 209 we found similarly high ZEB1 and SNAI2 activity with some variation along the branching 210 trajectory ( Figure 4C). Surprisingly, the epithelial differentiating cells in cluster 4 and 5 showed 211 higher ZEB1, SNAI1 and TWIST1 and less TWIST2 activities than the other cells. As some 212 cells indicated a response to hypoxic conditions, we further investigated the possible link 213 between hypoxia-response and EMP. Initially, both EMT and hypoxia were considered as 214 separate events promoting invasion and metastasis. More recently, it was shown that the HIF 215 pathway is interrelated with EMP, assuming a hypoxia-induced EMT [32, 33]. However, while 216 we detected HIF1 mRNA expression, it appeared not to be truly active within the OSCC cells 217 ( Figure 4A, C). 218

Partial EMT OSCC cells have more interactions with stromal cells 219
In addition to OSCC cells, we identified fibroblasts, endothelial cells, macrophages, and 220 dendritic cells ( Figure    Therefore, the presented scRNAseq-based trajectory reflects the developmental relationships 310 between tumor cell populations rather than passing through evolutional processes.