Localization of T cell clonotypes using spatial transcriptomics

Spatial transcriptomics is an emerging technology that measures gene expression while preserving spatial information. Here, we present a method to determine localization of specific T cell clones by obtaining T cell receptor (TCR) sequences from spatial transcriptomics assays. Our method uses an existing commercial spatial transcriptomics platform and open-source software for analysis, allowing simple and inexpensive integration with archived samples and existing laboratory workflows. Using human brain metastasis samples, we show that TCR sequences are readily obtained from human tumor tissue and that these sequences are recapitulated by single-cell sequencing methods. This technique will permit detailed studies of the spatial organization of the human T cell repertoire, such as the identification of tumor-infiltrating and tumor-excluded T cell clones.


Main Text
Spatial transcriptomics permits measurement of gene expression without compromising spatial information. Given the importance of location in many immune processes, such as T cell killing and germinal center formation, the technology promises to transform our understanding of immune function and dysfunction in healthy and diseased tissues. Spatial transcriptomics has been used to describe immune cell infiltrate in diverse tissues including tendons, bone marrow, tumors, joint tissue, and neural tissue [1][2][3][4][5][6], and platform commercialization will allow more widespread adoption of the technique 7. However, detailed understanding of adaptive immune processes requires determination of B and T cell receptor sequences, which determine lymphocyte antigen specificities and can serve as a barcode for lymphocyte clones. Antigen receptor sequencing has transformed single-cell studies of B and T cells 8,9 but is not yet available in spatial transcriptomics methods. Here, we report a method to determine T cell receptor (TCR) sequences from the commercially available Visium Spatial Gene Expression platform from 10X Genomics. Our technique allows simultaneous visualization of T cell receptor clonotypes and overall gene expression within tissue, uses open-source analysis software, and can be performed on archived cDNA from previous Visium experiments.
The Visium spatial transcriptomics platform uses slide-bound, single-stranded DNA probes to capture polyadenylated mRNA. Each probe contains a partial Read 1 sequence at the 5' end, a 16-nucleotide spatial barcode, a 12-nucleotide unique molecular identifier (UMI), and a 3' poly(dT) tail (Fig. 1a) 10. The Read 1 primer sequence is initially used for cDNA amplification, the spatial barcode links the probe to a particular spatial location, and the UMI identifies unique cDNA molecules generated during first-strand synthesis. After imaging and tissue permeabilization, a polyadenylated RNA molecule anneals to the poly(dT) tail, allowing subsequent reverse transcription to extend the ssDNA probe to contain the cDNA sequence and an added template switching sequence (Fig. 1a). The second strand is generated with a primer complementary to the template switching sequence; this full-length second strand is eluted from the slide by the addition of potassium hydroxide. The eluted second strand is then amplified to generate a full-length cDNA library in which the spatial barcode and UMI lie 3' of the poly(A) tail.
To generate cDNA molecules containing both the CDR3 region of the TCRβ as well as the UMI and spatial barcodes containing the location information, TCRβ amplification must occur with primers at the 5' end of the TCR (Fig. 1a). Since the TCRβ constant region is 3' to the CDR3, a pool of variable (TRBV) gene primers is required to generate cDNA molecules containing both the CDR3 and spatial information. To generate human libraries, we added a partial Read 2 sequence 5' to a set of 45 human TRBV primers described previously (Supplementary Table 1) 11. PCR was performed with this pool of modified TRBV primers and a Read 1 primer using amplified cDNA from the Visium assay as template to generate TCR-enriched libraries. While the standard Visium gene expression protocol calls for fragmentation of the cDNA library 10, TCR-enriched libraries generated here cannot be fragmented without losing linkage between TCR and spatial barcode sequence. Thus, the TCR-enriched cDNA library was cleaned with bead purification and directly subjected to sample index PCR for multiplexed paired-end sequencing (Fig. 1a).
Read 1 of the paired-end sequencing must be performed for at least 28 cycles to provide the sequences of the 16-nucleotide spatial barcode and 12-nucleotide UMI. Read 2 contains the TCRβ-identifying CDR3 sequence and should preferably be longer than 100 nucleotides. In the current study, we performed paired-end sequencing with read lengths ≥150 nucleotides on an Illumina MiSeq instrument. For analysis, Read 2 was subjected to the MiXCR pipeline 12 to identify the TCRs present and the reads supporting each identified TCR sequence. UMIs and spatial barcodes were identified from the paired Read 1 to correct for amplification bias and to match TCRs to a spatial location.
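The read layout described above can be illustrated with a minimal Python sketch. It assumes that the first 16 nucleotides of Read 1 are the spatial barcode and the next 12 are the UMI, and that each paired Read 2 has already been assigned a clonotype label (for example, from MiXCR output). The function names and toy records are hypothetical and are not taken from the Supplementary Protocol.

```python
from collections import defaultdict

BARCODE_LEN = 16  # spatial barcode length (nt), from the probe design
UMI_LEN = 12      # UMI length (nt)


def split_read1(read1_seq):
    """Return (spatial_barcode, umi) from the first 28 nt of Read 1.

    Assumption: the barcode precedes the UMI in Read 1.
    """
    if len(read1_seq) < BARCODE_LEN + UMI_LEN:
        raise ValueError("Read 1 must cover at least 28 cycles")
    barcode = read1_seq[:BARCODE_LEN]
    umi = read1_seq[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
    return barcode, umi


def count_umis(records):
    """Count unique UMIs per (spatial barcode, clonotype).

    records: iterable of (read1_sequence, clonotype_label) pairs, where the
    label is whatever clonotype was assigned to the paired Read 2.
    """
    umis = defaultdict(set)
    for read1_seq, clonotype in records:
        barcode, umi = split_read1(read1_seq)
        umis[(barcode, clonotype)].add(umi)  # PCR duplicates collapse here
    return {key: len(values) for key, values in umis.items()}


# Toy data (hypothetical): the first two reads share a barcode and UMI,
# so they count as a single molecule.
toy_records = [
    ("A" * 16 + "C" * 12 + "TTTT", "clone1"),
    ("A" * 16 + "C" * 12 + "TTTT", "clone1"),
    ("A" * 16 + "G" * 12 + "TTTT", "clone1"),
    ("T" * 16 + "C" * 12 + "TTTT", "clone2"),
]
umi_counts = count_umis(toy_records)
```

Counting sets of UMIs rather than reads is what corrects for amplification bias: a clone supported by many reads from one molecule still contributes a single UMI at its spot.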
We performed this protocol and analysis on six fresh-frozen tissue sections of resected human brain metastases, representing four primary tumor types (Supplementary Table 2). The number of TCR UMIs identified with this method ranged from 1 to 1,973 per sample, and TCR recovery was loosely correlated with both CD3E transcript counts by the standard Visium gene expression assay (Fig. 1b, Supplementary Table 2) and CD8+ T cell infiltration by flow cytometry (Supplementary Fig. 1). The number of unique clones identified ranged from 1 to 113 and diversity (Shannon index) from 0 to 3.89 (Fig. 1c). To determine whether the TCRs identified by our technique were recapitulated by other methods, we performed single-cell RNA sequencing (scRNA-seq) on T cells from four matched patients. CD8+ T cells were sequenced from all patients, and CD4+ T cells were sequenced from three of the four (Fig. 1c). On average, 74.8% of TCR UMIs identified by the spatial method described here were also called in matched scRNA-seq samples with the cellranger VDJ pipeline (Fig. 1d).
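The two summary statistics used above, the Shannon index and the fraction of spatial TCR UMIs recapitulated by scRNA-seq, can be sketched as follows. This is an illustration only: the natural logarithm is an assumption (the base is not stated in the text), and the clone names and counts are toy values, not study data.

```python
import math


def shannon_index(clone_counts):
    """Shannon diversity H = -sum(p * ln p) over clone frequencies.

    Assumption: natural logarithm; the text does not state the base.
    """
    total = sum(clone_counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in clone_counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h


def fraction_umis_recapitulated(clone_counts, reference_clones):
    """Fraction of spatial TCR UMIs whose clone appears in a reference set."""
    total = sum(clone_counts.values())
    if total == 0:
        return 0.0
    shared = sum(n for clone, n in clone_counts.items()
                 if clone in reference_clones)
    return shared / total


# Toy example (hypothetical): UMI counts per clone from the spatial assay,
# and the set of clones also seen by scRNA-seq.
spatial_counts = {"c1": 5, "c2": 3, "c3": 2}
scrna_clones = {"c1", "c3"}

diversity = shannon_index(spatial_counts)
overlap = fraction_umis_recapitulated(spatial_counts, scrna_clones)
```

A sample containing a single clone has a Shannon index of 0, which matches the lower bound of the range reported above.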
An example application of this method is shown in Figure 2. A lung cancer metastasis to the brain (patient 27; ref. 13) was stained with hematoxylin and eosin (Fig. 2a) and subjected to the standard Visium gene expression assay and spatial TCR method as described above. Gene expression analysis revealed seven clusters of spatial gene expression (Fig. 2b). scRNA-seq on FACS-sorted CD4+ T cells was performed in parallel, identifying seven clusters of CD4+ T cells, including regulatory T cells (Tregs), granzyme-expressing cells, and naïve cells from a healthy donor (Fig. 2c-d). TCR overlap was minimal between these scRNA-seq clusters (Fig. 2g-h), and diversity trended higher in CD4+ T cells than in CD8+ T cells. CD4+ T cell clones identified by scRNA-seq were detectable in the Visium TCR results, including single and multiple clones associated with the Treg and granzyme-expressing phenotypes (Fig. 2e-f). In Sudmeier et al. 13, we use this technique to show that TCR clonotypes expressed by exhausted CD8+ T cells localize to the tumor parenchyma.
Here, we have shown that human TCRβ sequences can be accurately identified and assigned to spatial locations in tumor tissue using the Visium spatial transcriptomics assay. This technique can be used in parallel with the standard gene expression protocol, allowing determination of both TCRβ sequences and the phenotype of the tissue in which they are embedded. The spatial TCR repertoire of tissue types other than cancer can also be explored; for example, the localization of T cell clones in germinal centers or other lymphoid tissues may be of interest. This method should be generalizable to TCRα sequences by using primer pools specific for TRAV genes 11. Primer pools specific to variable T cell receptor genes of other species should allow this method to be applied to tissue sections from those species (such as mouse 14,15). Additionally, it should be possible to obtain B cell receptor sequences with this method. Igκ- and Igλ-specific primer pools 16,17 should generate libraries with insert sizes comparable to the TCR libraries described here, whereas sequencing of heavy chains may require long-read sequencing.

Data deposition
Raw and processed data for scRNA-seq have been deposited in the Gene Expression Omnibus (GEO) under accession GSE179373. Raw and processed data for the gene expression spatial transcriptomics experiments have been deposited in the GEO under accession GSE179572. Reads from targeted TCR sequencing for spatial transcriptomics have been deposited in the Sequence Read Archive under BioProject accession PRJNA742564. Sample code for analysis has been posted on GitHub (https://github.com/whhudson/spatialTCR).

Samples
Detailed information on the patient cohort and acquisition of tumor samples can be found in ref. 13. For spatial transcriptomics, as soon as possible after surgical resection, a piece of tumor tissue was embedded in OCT and flash frozen in a dry ice/2-methylbutane bath. Sections 10 μm thick were cut and placed onto a 10X Genomics Gene Expression Slide. Slides were stored at -80 °C for up to one week until methanol fixation, H&E staining, and imaging according to the manufacturer's instructions 18. Gene expression library preparation was carried out as specified in the manufacturer's protocol 10. 5 μl of amplified cDNA from step 3.4 of the manufacturer's instructions 10 was used for TCRβ enrichment and library preparation as described in the Supplementary Protocol. Data analysis for TCRβ sequencing from Visium is also described in the Supplementary Protocol. Spatial transcriptomics gene expression analysis was performed with the 10X Genomics spaceranger pipeline. Clusters identified by graph-based clustering were annotated by neuropathologists as described in ref. 13.
Remaining tumor not used for spatial transcriptomics was used for cell isolation as described previously 19: tumor samples were cut into small pieces in L-15 medium and digested with shaking at 37 °C for 1 hour with a mixture of elastase, DNase, and collagenases I, II, and IV. Samples were forced through a cell strainer, and white blood cells were isolated with a 44%/67% Percoll gradient. Cells were counted by flow cytometry and either sorted immediately for scRNA-seq or frozen in FBS with 10% DMSO. For frozen samples, cells were incubated in a 37 °C water bath until thawed and washed twice with RPMI medium supplemented with 10% FBS. For naïve T cell isolation, PBMCs from a healthy donor were used. For CD8+ T cells from tumors, CD45+ CD14- CD19- CD3+ CD4- CD8+ cells were isolated by FACS, and for CD4+ T cells from tumors, CD45+ CD14- CD19- CD3+ CD4+ CD8- cells were isolated. CD8+ T cells from patients 15 and 16 were sorted at the time of resection. CD4+ and CD8+ T cells were sorted simultaneously from frozen samples of patients 16, 26, and 27, using hashtag antibodies from Biolegend (product numbers 394661, 394663, 394665, and 394667). TCR data from the frozen and fresh scRNA-seq experiments were combined for patient 16. Naïve CD4+ and CD8+ T cells from PBMCs were sorted with the strategies above and the addition of CCR7+ CD45RA+ gates (BD 562381 and Biolegend 304134).

scRNA-seq analysis
The cellranger pipeline (v6.0.0) 20 was used to align sequencing reads. The Seurat package (v4.0.1) in R was used for data analysis 21. CD4+ T cells were identified by having CITE-seq counts for CD4 > 70 and CITE-seq counts for CD8 < 90. A cell was identified as originating from a particular patient if 90% of its hashtag CITE-seq reads originated from a single barcode. Cells not definitively determined as CD4+ or CD8+ and/or as from a single patient by CITE-seq were discarded. Doublets were excluded with a maximum cutoff of detected UMIs (10,000), and dead cells were excluded with a maximum percentage of counts originating from mitochondrial genes (5%). TCR genes and Y chromosome genes were excluded from gene expression analysis. Count data were normalized and scaled, and variable features were identified with Seurat. Ten principal components, 50 neighbors, and a minimum distance of 0 were used for UMAP reduction. Clustering was performed with the Louvain algorithm with multilevel refinement and a resolution of 0.3 in Seurat's FindClusters.
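The cell-level filters above can be summarized in a short sketch. The analysis itself was performed in R with Seurat; this Python version only illustrates the thresholds and is not the authors' code. Treating the hashtag cutoff as "at least 90%" and the UMI and mitochondrial cutoffs as inclusive maxima are assumptions, and the function names are hypothetical.

```python
CD4_MIN_COUNT = 70           # CITE-seq CD4 counts must exceed this
CD8_MAX_COUNT = 90           # CITE-seq CD8 counts must be below this
MIN_HASHTAG_FRACTION = 0.90  # assumed to mean "at least 90%"
MAX_UMIS = 10_000            # doublet cutoff (assumed inclusive)
MAX_PERCENT_MITO = 5.0       # dead-cell cutoff (assumed inclusive)


def is_cd4_t_cell(cd4_counts, cd8_counts):
    """Apply the CD4+ gate: CD4 counts > 70 and CD8 counts < 90."""
    return cd4_counts > CD4_MIN_COUNT and cd8_counts < CD8_MAX_COUNT


def assign_patient(hashtag_counts):
    """Return the dominant hashtag, or None if no hashtag reaches 90%."""
    total = sum(hashtag_counts.values())
    if total == 0:
        return None
    top_tag, top_count = max(hashtag_counts.items(), key=lambda item: item[1])
    if top_count / total >= MIN_HASHTAG_FRACTION:
        return top_tag
    return None


def passes_qc(n_umis, percent_mito):
    """Exclude likely doublets (too many UMIs) and dead cells (high mito)."""
    return n_umis <= MAX_UMIS and percent_mito <= MAX_PERCENT_MITO


# Toy hashtag counts (hypothetical labels and values).
example_patient = assign_patient({"hashtag_A": 95, "hashtag_B": 5})
ambiguous_patient = assign_patient({"hashtag_A": 60, "hashtag_B": 40})
```

A cell for which `assign_patient` returns `None`, or that fails the lineage gate, corresponds to the "not definitively determined" cells that were discarded.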

TCR determination
cellranger vdj was used to call TCR sequences. cellranger vdj gives paired TCRα/TCRβ sequences, but only unpaired TCRβ sequences are available from our spatial transcriptomics method. Thus, we defined TCRβ sequences sharing the same TRBV gene family, the same TRBJ gene family, and an identical CDR3 amino acid sequence as an individual clonotype. This definition was used for both cellranger vdj and MiXCR output.
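The clonotype definition above amounts to grouping TCRβ calls by a three-part key. The sketch below assumes that a gene family can be read from a gene name by dropping the allele suffix (after `*`) and the member number (after `-`), so that `TRBV7-9*01` maps to `TRBV7`; that parsing rule, the function names, and the example CDR3 strings are illustrative assumptions.

```python
def gene_family(gene_name):
    """Reduce a gene name to its family, e.g. 'TRBV7-9*01' -> 'TRBV7'.

    Assumption: the family is the part before any '*' (allele) and '-'
    (member number).
    """
    return gene_name.split("*")[0].split("-")[0]


def clonotype_key(v_gene, j_gene, cdr3_aa):
    """Key that is identical for calls belonging to the same clonotype."""
    return (gene_family(v_gene), gene_family(j_gene), cdr3_aa)


# Toy calls (hypothetical): one from each pipeline, differing only in
# gene member and allele annotation.
spatial_call = clonotype_key("TRBV7-9*01", "TRBJ2-1*01", "CASSLGQGYEQYF")
scrna_call = clonotype_key("TRBV7-2", "TRBJ2-7", "CASSLGQGYEQYF")
same_clonotype = spatial_call == scrna_call
```

Applying one key function to both outputs makes the comparison between the spatial and single-cell data insensitive to differences in how the two tools annotate gene members and alleles.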

Data visualization
Data were visualized using ggplot2 22 and GraphPad Prism v9.

Human subjects
Experiments were carried out with the approval of the Emory University Institutional Review Board under protocols IRB00045732, IRB00095411, and STUDY00001995.

Figure 1: A method for obtaining TCR sequences from the Visium spatial transcriptomics assay. a) Schematic of TCRβ library generation from Visium cDNA. b) Correlation of TCR sequences obtained from the assay with expression of CD3E (a pan-T cell marker) by Visium gene expression analysis. c) For four patients, scRNA-seq on T cells was performed to validate TCR sequences obtained by the spatial assay. Pie charts show frequency of clones obtained from the spatial assay and are colored based on whether the clones were independently found through scRNA-seq in CD4+ or CD8+ T cells. d) Summary of clone recovery statistics. CD4+ T cells were not sequenced from patient 15 due to lack of cryopreserved cells.

Figure 2: Example spatial localization of CD4+ T cell clones. a) H&E staining of a lung cancer metastasis to the brain. b) Clusters of gene expression within the tissue. c) scRNA-seq was performed on CD4+ T cells from three patients. Cells are plotted based on their UMAP dimensionality reduction coordinates and colored by gene expression cluster. d) Expression of selected genes. e-f) Spatial localization of a granzyme-expressing CD4+ T cell clone and multiple Treg clones. The scRNA-seq phenotypes of these clones are shown in panels g-h.