The evolution of genomic, transcriptomic, and single-cell protein markers of metastatic upper tract urothelial carcinoma

The molecular characteristics of metastatic upper tract urothelial carcinoma (UTUC) are unknown. The genomic and transcriptomic differences between primary and metastatic UTUC is not well described either. We combined whole-exome sequencing, RNA-sequencing, and Imaging Mass Cytometry™ (IMC™) of 44 tumor samples from 28 patients with high-grade primary and metastatic UTUC. IMC enables spatially resolved single-cell analyses to examine the evolution of cancer cell, immune cell, and stromal cell markers using mass cytometry with lanthanide metal-conjugated antibodies. We discovered that actionable genomic alterations are frequently discordant between primary and metastatic UTUC tumors in the same patient. In contrast, molecular subtype membership and immune depletion signature were stable across primary and matched metastatic UTUC. Molecular and immune subtypes were consistent between bulk RNA-sequencing and mass cytometry of protein markers from 340,798 single-cells. Molecular subtyping at the single cell level was highly conserved between primary and metastatic UTUC tumors within the same patient.


Introduction
Upper tract urothelial carcinoma (UTUC) is defined as urothelial carcinoma (UC) arising from the ureter or the renal pelvis. UTUC is a rare tumor compared to UC of the bladder (UCB), accounting for 5-10% of UC, with aggressive clinical presentation. While 5-year survival for nonmuscle invasive UTUC is more than 90%, it drops to < 40% in patients with regional nodal metastases and < 10% in patients with distant metastases 1,2 . Even though systemic chemotherapies are given after relapsing, most patients die from the disease within 3 years 3 . A better understanding of the underlying molecular basis of metastatic UTUC is needed to develop targeted therapeutic strategies and improve clinical outcomes for patients with metastatic UTUC.
Recent studies by next-generation sequencing (NGS) have detailed the genomic landscape of primary UTUC [4][5][6] . Regarding molecular and immune phenotyping of UTUC, our group recently demonstrated that primary UTUC has a luminal-papillary T-cell depleted phenotype and a lower total mutational burden than UCB 7 . However, previous studies on UTUC including ours mainly focused on primary tumors and limited molecular data currently exists for metastatic UTUC 6 . Furthermore, the extent of genomic and phenotypic heterogeneity in each individual patient with metastatic UTUC is still unknown.
In this study, we performed whole-exome sequencing (WES), RNA-sequencing (RNA-seq), and multiplexed imaging cytometry which leverages antibodies conjugated to rare metals and detection by mass spectrometry to provide spatial protein expression patterns at a single cell resolution 8,9 from patients with primary and metastatic UTUC enrolled in our precision medicine study to characterize the genomic, transcriptomic and immunophenotypic features of metastatic UTUC, and understand the degree of heterogeneity between primary and metastatic UTUC.

Genomic heterogeneity between primary and matched metastatic UTUC
To compare the clonal structure of primary and metastatic UTUC, we investigated the numbers of private and shared mutations between the primary and metastatic UTUC within each patient by calculating the percentage of non-synonymous mutations that were private to the primary tumor, the metastatic tumor or shared within individual patients. Of the 7 analyzed patients with paired primary and metastatic UTUC, 4 patients received chemotherapy treatment before sampling of the primary or metastatic tumor tissue, and all the tumors from two patients, WCM052 and WCM068, were chemotherapy-naïve (Supplementary Figure 2). On average, only 17.6% (range 7.3-36.4%) of mutations were shared by primary and matched metastatic samples (Figure 3a). The percentages of shared mutations were higher in the chemotherapy naïve patients than patients who received prior chemotherapy treatment (32.6% vs. 11.5%), suggesting that chemotherapy potentially plays a role in increasing genomic heterogeneity by inducing mutations [13][14][15] .
To determine the biological impact of genomic heterogeneity, we next sought to define the differences in mutations and CNAs affecting cancer associated genes between primary and matched metastatic UTUC within each patient. A median of 13.5 cancer gene alterations per sample was identified (range 2 -51), including a median of 2 (range 1-9) alterations predicted to be likely oncogenic by the OncoKB database 16 . When we compared genomic alterations between paired primary and metastatic tumors, intra-patient heterogeneity in cancer gene alterations was observed (Figure 3b). The primary and matched metastatic tumors shared at least one pathogenic cancer gene alteration in all cases (Supplementary Figure 3). However, compared to primary UTUC, we identified additional pathogenic mutations or CNAs in all but one of the matched metastatic UTUC tumors (Figure 3b, Supplementary Figure 3).

Stability of molecular subtype and immune contexture assignments between primary and metastatic UTUC tumors
The next step was to see if there was any transcriptome difference between the primary and matched metastatic UTUC. To that goal, we performed whole-transcriptome sequencing (RNAseq) on 17 UTUC tumors (six primary and eleven metastatic), including 3 patients with primary and matched metastatic UTUC and 1 patient with two metastatic tumors (Figure 4a). Using the recently published single sample consensus molecular classifier 17 , we found that 83.3% of primary tumor samples were luminal-papillary, and the rest were basal/squamous (Figure 4a).
Of the metastatic tumors, 45.5%, 27.3%, 9.1% and 18.2% were luminal-papillary, luminalunstable, stroma-rich and basal/squamous, respectively (Figure 4a). Clustering analysis using the BASE47 gene classifier 18 showed a similar frequency of basal subtype between primary and metastatic tumors (33.3% vs. 27.3%, P value > 0.05). When we evaluated the molecular subtypes between primary and the matched metastatic UTUC tumors, we found that they were relatively stable in comparison to genomic changes, but discordance in molecular subtype was observed in one out of three cases (Figure 4d). In WCM010, one of three lymph node metastases was classified as luminal-unstable, while the primary and remaining lymph node metastases were classified as luminal-papillary. In case WCM031, only metastatic tumors were interrogated by RNA-seq due to the unavailability of frozen tissue from the primary tumor. Interestingly. molecular subtyping revealed discordance of molecular subtypes between the asynchronous liver metastases (Figure 4b). The initial metastasis (M1) and the second metastasis (M2) demonstrated stroma-rich and luminal unstable, respectively.
Next, we applied a classifier of 170 immune-related genes which we previously developed 7 Table 2) to the RNA-seq data obtained from 17 samples (6 primary and 11 metastatic UTUC) and the TCGA-BLCA cohort. The majority of UTUC tumors (76.5%, 13/17) clustered into the T-cell depleted subgroup (Figure 4c-d) consistent with our previous finding in primary UTUC 7 . Only 16.7% of primary UTUC (1/6) and 27.3% of metastatic UTUC (3/11) were T-cell inflamed. The immune phenotypes were concordant within individual patients.

Single cell spatial profiling of UTUC's tumor-immune contexture reveals intra-tumoral plasticity
In order to validate our findings showing differential immune states within the tumors and further investigate the tumor microenvironment, we employed Imaging Mass Cytometry™ (IMC™) on samples of UTUC -a method for multiplexed tissue imaging using antibodies conjugated to rare metals and detection by mass spectrometry 19 Table 3). Upon inspection, IMC™ images generally revealed structured boundaries between tumor and stromal compartments (Figure 5a-

Discussion
Here, we report the genomic, transcriptomic and immunophenotypic landscape and the differences between primary and metastatic UTUC. Furthermore, for the first time, we used multiplexed tissue imaging using Imaging Mass Cytometry™, to describe the intra-and intertumoral phenotypic heterogeneity of cancer cells and their microenvironment at the single-cell level. Although genomic data from paired primary and metastatic UTUC tumors in small cohorts has been reported 6,20 , neither transcriptomic data nor single-cell imaging analysis have been described.
WES analysis from paired primary and metastatic UTUC revealed that metastatic tumors harbored oncogenic genomic alterations which were not identified in the matched primary tumors. Two possibilities can be considered to account for this heterogeneity: (a) Clones harboring these additional alterations are resistant to chemotherapy. (b) Metastatic lineages result from early branched evolution in the primary tumors. These possibilities were discussed in previous reports which suggested that systemic spread is a very early event in cancer history and that chemotherapy selects clones harboring drug-resistant mutations in breast, colorectal, lung, and bladder cancers 13,21 . In our cohort, intra-patient heterogeneity of likely oncogenic alterations was observed also in the chemotherapy-naïve cases (WCM052 and WCM068). However, the percentages of shared mutations between primary and matched metastases were higher than those who had a history of chemotherapy before sampling. The observation suggests that chemotherapy potentially plays a role in increasing genomic heterogeneity by inducing mutations [13][14][15] . Regarding UTUC, previous genomic studies in small cohorts showed discordance of mutations between paired primary and metastatic UTUC 6,20 . Furthermore, we show that intrapatient genomic diversity in clinically targetable alterations is also common in UTUC. This finding is of clinical importance because evaluating only a primary or metastatic sites could potentially miss therapeutic targets in patients with metastatic UTUC. However, we acknowledge a limitation that the collected samples may not fully represent the tumors. There is a possibility that some of the alterations detected only in metastases were present in unsampled regions of their paired primary tumors. Liquid biopsy, which can analyze circulating cell-free DNA, could complement tissue biopsy by capturing tumor heterogeneity that single-lesion tumor biopsies may miss 22 .
We observed intra-patient discrepancy of molecular subtypes within 2 patients (WCM01031 and WCM031) at the transcriptomic level using a recently published molecular classification system 17

Patient enrollment and tissue acquisition.
Patients with primary and metastatic UTUC were prospectively enrolled in an

DNA extraction and WES.
WES was performed on each patient's tumor/matched germline DNA pair using previously described protocols 31,32 . After macrodissection of target lesions, tumor DNA was extracted from formalin-fixed, paraffin-embedded (FFPE) or cored OCT-cryopreserved tumors using the Promega Maxwell 16 MDx (Promega, Madison, WI, USA). Germline DNA was extracted from blood, buccal mucosa and normal lung and lymph node tissue using the same method. A minimum of 200 ng of DNA was used for WES. DNA quality was determined by TapeStation Instrument (Agilent Technologies, Santa Clara, CA) and was confirmed by real-time PCR before sequencing. Sequencing was performed using Illumina HiSeq 2500 (2 × 100 bp). A total of 21,522 genes were analyzed with an average coverage of 85× using Agilent HaloPlex Exome (Agilent Technologies, Santa Clara, CA).

WES data processing pipeline
All the sample data were processed through the computational analysis pipeline of the

Institute for Precision Medicine at Weill Cornell Medicine and NewYork-Presbyterian (IPM-
Exome-pipeline) 31 . Raw reads quality was assessed with FASTQC. Pipeline output includes segment DNA copy number data, somatic copy-number aberrations (CNAs), and putative somatic single-nucleotide variants (SNVs) 32 . Somatic variants were filtered using the following criteria: (a) read depth for both tumor and matched normal samples was ≥30 reads, (b) the variant allele frequency (VAF) in tumor samples was ≥10% and >5 reads harboring the mutated allele, (c) the VAF of matched normal was ≤1%, or there was just one read with mutated allele. Pathogenicity and actionability for each mutation and CNA were determined by the OncoKB database 16 . Tumor mutation burden (TMB) was calculated as the number of mutations divided by the number of bases in the coverage space per million. TMB status (high vs. low) was determined using a urothelial cancer-specific threshold which our group recently reported 10 .

Computational detection of MSI
MSI was detected by the MSI sensor. MSI sensor is a software tool that quantifies MSI in paired tumor-normal genome sequencing data and reports the somatic status of corresponding microsatellite sites in the human genome 11 . MSIsensor score was calculated by dividing the

RNA extraction, RNA sequencing, and data analysis.
RNA-seq and data processing was performed according to the protocol described in previous papers 16,33 . Briefly, RNA was extracted from frozen material for RNA-seq using Promega Maxwell 16 MDx instrument, (Maxwell 16 LEV simplyRNA Tissue Kit (cat. # AS1280)). Specimens were prepared for RNA sequencing using TruSeq RNA Library Preparation Kit v2 or riboZero as previously described 33 . RNA integrity was verified using the Agilent Bioanalyzer 2100 (Agilent Technologies). cDNA was synthesized from total RNA using Superscript III (Invitrogen).

Molecular subtyping
The log-transformed expression data was used to infer molecular subtypes using a recently published classification system as implemented in the consensusMIBC R package 17 .
Default parameters were set except for minCor representing a minimal threshold for best

T cell inflammation classification analysis
The previously published T cell inflammation gene signature was used to classify tumors into T-cell inflamed or T cell depleted 7 . The expression data, quantified as FPKMs, was obtained for the EIPM UTUC patients, of which RNA-seq was available (n=17). The FPKMs for the primary tumors were obtained from the GDC/TCGA bladder cohort (TCGA BLCA, n=414). The genes from the signature were selected for expression-based supervised clustering. The FPKMs were logtransformed and median centered, and partitioning around medoids (PAM) algorithm was applied to cluster the transformed expression data. This led to the identification of two broad clusters: one with higher expression of the signature genes was labeled as T cell inflamed while that with low expression of signatures genes was labeled as T cell-depleted.

Imaging Mass Cytometry™
Antibodies were conjugated in BSA and Azide free format using the MaxPar X8 multimetal labeling kit (Fluidigm) as per the manufacturer's protocol. Freshly cut 4-micron thick FFPE tissue sections were stored at 4oC for a day before staining. Slides were first incubated for 1 hour at 60oC on a slide warmer followed by dewaxing in fresh CitriSolv (Decon Labs) twice for 10 minutes, rehydrated in descending series of 100%, 95%, 80%, and 75% ethanol for 5 minutes each. After 5 minutes of MilliQ water wash, slides were treated with antigen retrieval solution (Tris-EDTA pH 9.2) for 30 minutes at 96oC, cooled to room temperature (RT), washed twice in TBS, and blocked for 1.5 hours in SuperBlock Solution (ThermoFischer). Overnight incubation occurred at 4oC with the prepared antibody cocktail containing the metal-labeled antibodies.
The next day, slides were washed twice in 0.2% Triton X-100 in PBS, and twice in TBS. DNA staining was performed using Intercalator-Iridium in PBS solution for 30 minutes in a humid chamber at room temperature, followed by a washing step in MilliQ water and air drying. The Hyperion instrument was calibrated using a tuning slide to optimize the sensitivity of the detection range. Hematoxylin and Eosin (H&E) stained slides were used to guide the selection of regions of interest in order to obtain representative regions. All ablations were performed with a laser frequency of 200 Hz. Tuning was performed intermittently to ensure the signal detection integrity was within the detectable range.

Analysis of Imaging Mass Cytometry™ data
Imaging Mass Cytometry™ data were preprocessed as previously described 9 with some modifications. Briefly, image data was extracted from MCD files acquired with the Hyperion instrument. Hot pixels were removed using a fixed threshold. The image was amplified two times, Gaussian smoothing applied and, from each image, a square 500-pixel crop was saved as an HDF5 file for image segmentation. Segmentation of cells in the image was performed with ilastik

Statistical analyses
For statistical tests, a two-sided Mann-Whitney test was used to check for significant differences between two distributions. The two-sided Fisher's exact test was applied to determine whether the deviations between the observed and the expected counts were significant. We used a P value threshold of 0.05.