Single Cell T Cell Receptor Repertoire Profiling for Dogs

Background Spontaneous cancers in companion dogs are increasingly recognized as robust models of human disease. This recognition has led to translational clinical trials in companion dogs with osteosarcoma, lymphoma, melanoma, squamous cell carcinoma, and soft tissue sarcoma. The ability to precisely track tumor-specific immune responses in such clinical trials would benefit from reagents to perform species-specific single cell T cell receptor sequencing (scTCRseq). This technology defines clones of T cells reacting to immune interventions and can help identify the specific epitope of response. Single cell gene expression data give insights into the activity and polarization of the T cell. To date, scTCRseq has not been demonstrated for canine samples. Methods Samples from two responding dogs in a trial of an autologous deglycosylated melanoma vaccine were selected to demonstrate applicability of scTCRseq in a cancer immunotherapy setting. A single-cell suspension of cryopreserved peripheral blood mononuclear cells (PBMC) was prepared for 10X single cell sequencing. Full length 10X cDNA was amplified using a custom-designed nested PCR of the alpha/beta V(D)J region. A library made from this enriched product (scTCRseq) and a 10X gene expression (GEX) library (scRNAseq) were sequenced on the NovaSeq 6000. Results 1,850-2,172 estimated V(D)J-expressing cells yielded 87-103.7 million reads with 73.8%-75.8% mapped to a V(D)J gene (beta/alpha chains ratio 1.5:1). 43 TRAJ, 29 TRAV, 12 TRBJ, and 22 TRBV gene segments were observed representing 72.9%, 51.8%, 100%, and 62.9% of all known V and J gene segments respectively. A large diversity of clonotypes was captured with 966-1,253 TRA/TRB clonotypes identified. Both dogs also exhibited a small number of highly abundant T cell clonotypes suggesting the presence of an anti-tumor T cell population. GEX enriched libraries successfully defined large clusters of CD8+ and CD4+ T cells that overlapped with V(D)J-expressing cells. Discussion The developed reagents successfully generated scTCRseq data, for the first time, which allowed the T cell repertoire to be surveyed in dogs responding to anti-tumor immunotherapy. These reagents will allow longitudinal tracking of anti-tumor T cell dynamics in canine cancer immunotherapy trials.


Introduction
The biomedical research community, as well as the National Cancer Institute, have recognized the value of companion dog spontaneous models of cancer with release of several targeted funding programs. This recognition has led to the completion of translational clinical trials in companion dogs studying osteosarcoma 1 , lymphoma 2 , melanoma, squamous cell carcinoma, and soft tissue sarcoma 3 . Recently, immunotherapy trials evaluated a human chimeric HER2 Listeria vaccine in osteosarcoma as well as an antibody-linked IL-12 conjugate in melanoma of companion dogs 4 . The outbred genetic variability, body size, cancer-conditioned immune system, and shared environment of companion dogs and humans have been cited as particular advantages of dogs as pre-clinical or interposed post-clinical subjects in veterinary clinical trials of interventions destined for human application 5 .
Immunotherapy has become increasingly important in the treatment of cancer with the rise of checkpoint blockade inhibitors, personalized cancer vaccines, CAR T-cell therapies, and more. As a result, profiling the T cell receptor (TCR) repertoire has become an important tool for understanding, measuring, tracking and even predicting T cell mediated immune responses in a cancer setting 6 . The ability to accurately profile the TCR repertoire can also provide a powerful means of diagnosing and tracking T cell malignancies (e.g., minimal residual disease monitoring). Finally, TCR profiling may play an important role in understanding and treating a host of other immune-related (e.g., autoimmune) diseases.
It has also become increasingly apparent in recent literature that the tumor microenvironment (TME) plays a broad and complex role in tumorigenesis and response to many cancer therapies 7 . In particular the TME is known to be capable of producing an immunosuppressive effect. For example the presence of cytokines such as IL-6, IL-12, and TGFβ can increase PD-L1 expression allowing immune escape of tumor cells via checkpoint pathways 8 . The ability to examine the T cell repertoire, in the context of the TME, using single cell approaches, in a model with a functionally similar environment to that of humans could yield significant insights into the role of the TME in success or failure of immune therapy. Furthermore, single cell TCR profiling facilitates the reconstruction of the complete TCR by matching the alpha-beta pairs of the TCR. This in turn may allow for refinement of TCR-Neoantigen-MHC binding prediction algorithms. Such refinement could have implications for personalized therapy, allowing researchers to determine if a T-cell is reactive to a given neoantigen and opening avenues into engineering T-cells matched to an individual tumor profile.
A limitation of the companion dog model of cancer is the relative paucity of molecular profiling reagents available compared to human or mouse settings. Dog genome-specific reagents are required to effectively evaluate tumor-specific immune responses in cancer immunotherapy trials. While studies have piloted scRNAseq for canine samples 9,10 , to our knowledge comprehensive single cell profiling of the TCR repertoire (scTCRseq) of canine samples has not been demonstrated. There are no published protocols or commercially available kits for popular single cell platforms like the Chromium Single Cell TCR Amplification kits available for human and mouse cells.
Here we address this gap by developing and validating canine alpha (TRA) and beta (TRB) chain TCR single cell profiling for the Chromium 10X platform. We show robust isolation of individual canine cells, including a large proportion of T cells, and establish the first detailed single cell TCR repertoires for dog cells.

Samples and single cell suspension
To demonstrate applicability of canine TCR profiling in a cancer immunotherapy setting we identified two individual dogs (Dog_A and Dog_B) with metastatic melanoma from an ongoing immune therapy trial (See Suppl Table 1 for complete clinical information). Both dogs were treated with autologous deglycosylated vaccines derived from primary and metastatic tumor cultures and showed evidence of clinical response. Dog_A had a progression free interval of 280 days after the first dose of the autologous vaccine. Dog_B had resolution of progressive pulmonary nodules following the vaccine series and survived more than a year without clinical recurrence. Ficoll-separated peripheral blood mononuclear cells (PBMCs) were obtained from each dog and cryopreserved. PBMCs for Dog_A were collected at the time of the first tyrosinase vaccine (Oncept) and for Dog_B were collected 14 days after the initial autologous vaccine treatment. Dog_B was free of macroscopic disease at the time of blood draw. Cryopreserved PBMCs were thawed and a single cell suspension of approximately 20,000 cells was generated according to the 10x Genomics Demonstrated Protocol (CG00039 Rev D). Viability was greater than 90% as assessed by trypan blue exclusion staining.
Primer design Primer design and application (Figure 1) was modeled from the Chromium Single Cell V(D)J Reagent Kits User Guide (CG000086 Rev L). This protocol uses a nested PCR design. The two forward primers from the 10x protocol were left unchanged. In the first cycle, the forward primer (TRA/TRB Forward 1) primes at the Illumina read 1 (R1) sequencing adapter sequence that is incorporated during generation of 10x Barcoded, full-length cDNA from polyadenylated mRNA. This primer includes a 5' tail sequence consisting of the P5 priming site used in Illumina sequencing. In the nested primer set, the forward primer (TRA/TRB Forward 2) primes at the P5 sequence and a portion of R1. The forward primers correspond to the 5' end of 10x cDNA fragments. In the case of full length TCR cDNA this would represent the V gene segment end of the cDNA fragment. Neither forward primer has any specificity for the TCR which comes entirely from the reverse primers. The first reverse primer (TRA/TRB Reverse 1, Outer) primes at the constant (C) region gene segment. The second reverse primer (TRA/TRB Reverse 2, Inner) similarly primes at the C region at an inner, 5' position relative to the outer primer. In order to design appropriate 3' primers for use with canine cells we first constructed (in silico) a reference complete V(D)JC TCR cDNA sequence along with 10X adapter sequences for both alpha (TRA) and beta (TRB) chains. The alpha chain was based on a representative dog TCR alpha rearranged partial mRNA (GenBank: M97511.1). The closest V gene segment to this partial mRNA was determined by blast alignment against canine V gene sequences in the IMGT database. The constructed cDNA was extended to include (from 3' to 5') the Illumina R1 adaptor, 16 nucleotide (nt; 16 x N) 10x cell barcode, 10 nt (10 x N) unique molecular identifier (UMI), 13 nt template switch oligo (TSO), complete V gene segment, complete J gene segment, and complete C gene segment. The beta chain was based on a representative dog TCR beta rearranged partial mRNA (GenBank: HE653957.1). The closest V and C gene segments to this partial mRNA were determined by blast alignment against canine V and C gene sequences in the IMGT database. The constructed cDNA was extended to include (from 3' to 5') the R1 adaptor, 10x cell barcode, UMI, TSO, V, D, J, and C gene segments as described above. These constructed cDNA sequences were then used as input to primer3plus (4.0) 11,12 , with forward primers provided as described above, and a target region for reverse primer design specified in the C region. The product of the first (outer) design was used as input for the second (inner) design. Primer oligonucleotides were ordered from Integrated DNA Technologies (Coralville, IA). Final primer sequences are included in Table 1. cDNA generation cDNA generation was performed according to the Chromium Single Cell V(D)J Reagent Kits User Guide (CG000086 Rev L), with the exception of the TCR amplification (described below). Briefly, approximately 20,000 cells suspended in phosphate-buffered saline containing 0.4% bovine serum albumin were used for Gel Bead-in-Emulsion (GEM) generation, barcoding, post-GEM clean-up, and cDNA amplification. Quality controls were performed using the Agilent Bioanalyzer high sensitivity chip. 2 uL of cDNA were used to separately amplify TCR α and β chains using custom dog-specific primers (see Primer design; TCR amplification). The remaining cDNA were set-aside for gene expression (GEX) profiling.

TCR amplification
In order to amplify dog TCR sequences, the nested PCR approach utilized for human and mouse protocols was adopted, with dog-specific reverse primers located in the constant region of the TCR cDNA (see Primer design; Table 1). 2 uL of cDNA were amplified in a Mastercycler Gradient instrument (Eppendorf) in 100 uL total reaction volume using Phusion high fidelity polymerase (Thermo Fisher). The first reaction consisted of an initial denaturation step (98 °C for 45 s) followed by 12 cycles of 98 °C for 20 s, 65 °C for 30 s, and 72 °C for 60 s. An additional extension step (72 °C for 60 s) was added at the end. 10 uL of the resulting product was amplified in the second reaction. Amplification conditions were identical except for the annealing temperature which was 62 °C. The amplification product was purified using SPRI beads before quality control on the Agilent Bioanalyzer high sensitivity chip and further processing for library generation.
Library generation TCR enriched PCR products and set-aside GEX cDNAs underwent library construction and final quality control as described in Chromium Next GEM Single Cell 5' Reagent Kits v2 (Dual Index) User Guide (CG000331 Rev A).
Sequencing GEX and TCR VDJ enriched libraries, produced by the MU Core (as described above), were shipped to the McDonnell Genome Institute for sequencing. The concentration of each 10x single cell and V(D)J library was determined through qPCR (Kapa Biosystems) to produce cluster counts appropriate for the NovaSeq 6000 platform. 500M read pairs were targeted for each gene expression library and 25M read pairs were targeted for each V(D)J library. The libraries were sequenced on the S4 300 cycle kit flow (2x151 paired end reads) using the XP workflow as outlined by Illumina.
Data processing All data processing and subsequent analysis made use of Cell Ranger 5.0.1 unless otherwise noted. Raw fastq files were generated from Illumina instrument data using cellranger mkfastq. A canine-specific reference package was created using cellranger mkref with CanFam3.1 (INSDC Assembly GCA_000002285.2, Sep 2011) canine reference genome and cellranger mkgtf filtered CanFam3.1 GTF from Ensembl v102 as input. The GTF file was filtered according to the example provided in the documentation on the 10X Genomic site 13 . A canine-specific V(D)J reference package was created using the fetch-imgt script and cellranger mkvdjref. The fetch-imgt script failed to obtain C-REGION sequences causing errors in subsequent steps. These sequences (IMGT000004|TRAC*01, IMGT000005|TRBC1*01, HE653929|TRBC1*02, and HE653929|TRBC2*01) were obtained from IMGT 14 , artificially spliced, and added manually to the fetch-imgt file before completing V(D)J reference generation (Suppl Data 1). Single cell feature counts were generated with cellranger count. Counts were combined, normalized, and batch corrected across multiple libraries with cellranger aggregate. Single cell VDJ sequence assembly and paired clonotype calling was performed with cellranger vdj. Genes present in less than 10 cells and cells with less than 100 genes were filtered out of the dataset. Cells whose mitochondrial expression was in the top 5% across all cells were filtered out and DoubletFinder 15 was used to identify and filter out expected doublets. DoubletFinder was run with parameters derived using the example provided at https://github.com/chris-mcginnis-ucsf/DoubletFinder. Cells were annotated to cell types with SingleR 16 using expression profiles derived from the DMAP dataset available on Haemopedia, a hematopoiesis cell expression database 17 . The DMAP dataset was converted from Affymetrix probes to human gene IDs and then to dog gene names using the probe to human gene name mapping available on Haemopedia and the human to dog gene mapping acquired from Ensembl v102 BioMart. If different probes mapped to the same human gene the probe with the highest coefficient of variance across all samples in the dataset was kept and all other probes that mapped to that gene were discarded. If a probe entry mapped to multiple genes then that probe-to-gene mapping was duplicated for each gene it mapped to. For the human gene ID to dog gene name conversion all many-to-many mappings were removed, mappings that had multiple human genes to a single dog gene were removed, and all the mappings from one human gene to multiple dog genes were duplicated for each dog gene.

Single cell V(D)J enrichment and V(D)J/GEX sequencing results
Nested PCR primers were successfully designed for dog TCR alpha (TRA) and TCR beta (TRB) chains (Figure 1; Suppl Figure 1; Table 1). PCR products of the expected size (approximately 650 bp) were observed by gel electrophoresis (Figure 2). Sequencing libraries from TCR enriched products (TRA and TRB) and unenriched GEX cDNA were successfully generated, for both dogs, with suitable quantities, that also showed expected fragment size distributions (Figure 3; Suppl Figure 2). Sequencing data metrics approximated comparable human data for the TCR enriched libraries ( Table 2)  valid barcodes. Q30 bases for barcodes, Read 1, Read 2, and UMI sequences were greater than 89.8% in all cases. For the GEX enriched libraries we also obtained results comparable to human data ( Table 3) 21 Table 2). Each dog had one or a few CDR3 clonotypes accounting for a substantial fraction of all cell barcodes (Suppl Table 2-3). For example, for Dog_B, just two TRA and two TRB clonotypes account for approximately 20% and 40% of all barcodes respectively (Figure 5). At the same time, a large diversity of clonotypes was captured with 966-1,253 paired or unpaired unique TRA/TRB clonotypes identified for 1,850-2,172 cell barcodes. As expected, individual clonotypes were characterized by evidence of germline variation, V(D)J recombination diversity, as well as somatic hypermutation and/or recombination-related mutations at V(D)J junctions (See Figure 6 for a representative example clonotype).
Integration of single cell V(D)J repertoire and gene expression data For both dogs TSNE clustering based on gene expression patterns identified a number of distinct clusters of cells (Figure 7; Suppl Figure 5). Expression of CD3E (a general T cell marker) clearly demarcated a subset of major and minor clusters, representing a substantial fraction (~50%) of all cells (Figure 7A; Suppl Figure 5A). Cell typing based on established blood cell type gene expression signatures (mapped from human genes to dog orthologs) for Dog_A identified 47.8% T cells (26.2% CD4+; 21.6% CD8+), 22.7% monocytes, 21.1% B cells, 3.5% erythroid cells, 1% dendritic cells and less than 1% of any other cell type (Suppl Figure 5B; Suppl Table 4). For Dog_B, cell typing identified 49.3% T cells (19.4% CD4+; 29.9% CD8+), 33.5% monocytes, 11.1% B cells, 1.2% erythroid, 1.2% NK cells and less than 1% of any other cell type (Figure 7B; Suppl Table 4). These cell type proportions were consistent with previously reported results in human PBMC samples 22 and expectations for dog PBMC samples 23 (Figure  7; Suppl Figure 5). Cells specifically identified with V(D)J rearrangements expressed ( Figure 7C; Suppl Figure 5C) overlapped almost completely with those identified as CD3E-positive or as CD4/CD8 T cells. Finally, cells corresponding to the most dominant clonotypes largely overlapped with CD8 T cells.

Discussion
The biomedical community increasingly recognizes the value of companion dogs as a model system for human cancer. However, there remains a lack of the necessary reagents and methods for molecular profiling of canine clinical samples. Here we define a protocol for single cell TCR sequencing that should enhance the utility of canine samples as they relate to the study of cancer and disease. In this work we were able to successfully adapt existing protocols to perform single cell TCR profiling for two individual companion dogs. We were able to detect a majority of known V(D)J gene segments amongst our samples for both alpha and beta TCR chains. Integration with single cell transcriptome sequencing demonstrated the expected relationship between cells expressing rearranged V(D)J sequences and T cell markers or cell type inferred from global gene expression patterns. Somewhat surprisingly, both canine samples showed strong evidence of dominant clonotypes. Indeed, we observed that a number of the more frequent clonotypes could be merged based on CDR3 sequence similarity, even further emphasizing the dominant clonotype pattern (Suppl Table 2-3). As melanoma patients undergoing active immunotherapy treatments these patterns might indicate an active T-cell population responding to their tumors. However, such conclusions await further immune studies of these canine patient samples.
We did not detect all known V(D)J gene segments. In theory, all V, D, or J gene segments may participate in V(D)J rearrangement and form functional TCRs. However, there are a number of plausible explanations for unobserved gene segments. Missing V/J gene segments may simply be the result of insufficient sampling. D gene segments are difficult to determine due to their small size and the introduction of junction recombination mutations that make mapping to reference D gene sequences difficult. Despite the very comprehensive sequencing performed, only a few thousand cells were sequenced in relation to the very large possible diversity of T cells. This situation was compounded by the fact that the samples showed evidence of dominant clonotypes. As a result, a significant fraction of cells and sequence reads were associated with these few dominant clonotypes, decreasing our power to detect rare clonotypes. In human TCR repertoires it has been suggested that V(D)J gene segment usage is not random with some gene segments, segment combinations, and CDR3 lengths preferentially used and others rarely used [24][25][26][27][28] . The same could also be true in dog TCR repertoires. Finally, it is also possible that the current database of known V(D)J gene segments is incomplete. Modern canine genome assemblies 29 , future annotation work, and de novo analysis of data like ours may identify novel gene segments in the future. Profiling of additional samples will be required to further validate this approach and delineate any technological or biological gaps in V(D)J representation.
The methods developed in this study could be immediately applied to current canine immunotherapies studies to help identify predictors of response and evaluate candidate immunotherapeutic targets and efficacy of immune stimulating protocols. Future work should include adapting the protocols herein to be applicable for additional TCR loci (e.g., TRD and TRG) as well as B-cell receptor (BCR) sequencing. The ability to perform BCR sequencing would further increase the utility of canine samples, allowing for a more complete understanding of adaptive immunity in cancer, auto-immune diseases and B-cell malignancies.

Data availability
All raw scRNAseq and scTCRseq data have been deposited at SRA:SUB9918174.    The TCR V(D)J enrichment strategy is depicted using the beta chain for illustration (See Suppl Figure 1 for alpha chain). At top, the genomic un-rearranged TRB locus is shown (Note: TRB is located on the negative strand in dogs). During T cell development from progenitor T cells to mature T cells, individual V, D, J and C gene segments are rearranged by somatic recombination to produce a functional TRB locus. Transcription and splicing produce a pre-mRNA and then mRNA for the complete V(D)JC transcript sequence (not shown). In the modified 10X protocol, mRNA (including TRB mRNA) is converted to cDNA. TRB cDNA is then amplified using a nested PCR design. The two forward primers from the 10x protocol were left unchanged. In the first cycle, the forward primer (TRB Forward 1) primes off the Illumina read 1 (R1) sequencing adapter that is incorporated during generation of cDNA. TRB Forward 1 includes a tail sequence consisting of the P5 priming site used in Illumina sequencing. In the second cycle, the forward primer (TRB Forward 2) primes off the P5 sequence and a portion of R1. The first reverse primer (TRB Reverse 1, Outer) primes off the constant (C) region gene segment. The second reverse primer (TRB Reverse 2, Inner) similarly primes off the C region at an inner, 5' position relative to the outer primer. The beta chain primer design was based off of a dog TCR beta rearranged partial mRNA (GenBank: HE653957.1) which was extended to include (from 3' to 5') the R1 adaptor, 10x cell barcode, UMI, TSO, V, D, J, and C gene segments. The constructed cDNA sequence was then used as input to primer3plus (4.0), with forward primers provided as described above, and a target region for reverse primer specified in the C region. The product of the first (outer) design was used as input for the second (inner) design.

Figure 2. Gel image of final TCR amplification product (Dog_A and Dog_B).
Gel electrophoresis result visualizing product of TCR-alpha (TRA) and TCR-beta (TRB) nested PCR amplification for two dogs using custom primers ( Table 1) and a modified Chromium 10X protocol (Methods). Gel bands of the expected size (~650bp) were observed for all 4 reactions.