CAR T cells for solid tumors: genome-wide dysfunction signature confirms value of c-Jun overexpression, but signals heterogeneity

Chimeric antigen receptor (CAR) T cells still have limited effects in cancer, and especially in solid tumors, due to T cell dysfunction and exhaustion. CAR T cells overexpressing c-Jun (JUN CAR T cells) have been introduced to solve this problem. In this paper, we analyze JUN CAR T cells scRNA-seq data in solid tumors, by applying a genome-wide signature of T cell dysfunction, TID. This signature comes from the bulk RNA-seq signature TIDE, introduced to predict immune checkpoint inhibitor response. Our analysis confirms that on average, JUN CAR T cells are less dysfunctional than non-JUN CAR T cells. However, it also shows heterogeneity within JUN CAR T cells, which brings uncertainty about possible tumor resistance. We conclude that genome-wide dysfunction signature TID helps de-risking CAR T cell therapy for solid tumors.


Introduction
CAR T cells still have limited results against solid tumors, despite their successes in some hematological malignancies. One reason of failure is T cell dysfunction, due to T cell exhaustion [2,7,8]. To address this issue, [5] modifies CAR T cells of the HER2-BBz type, in order to overexpress c-Jun, an AP-1 transcription factor associated with productive T cell activation.
In mouse models of osteosarcoma (a solid tumor affecting bones), they report better survival However, in solid tumor transcriptomics, single-gene signatures are now not sufficient. They are complemented with genome-wide signatures, which, for example, better predict response to immune checkpoint inhibitors than single-gene signatures like PDL1 [4].
Among all genome-wide signatures, Tumor Immune Dysfunction and Exclusion, or TIDE [4], is particularly relevant, because it includes a distinctive T cell dysfunction signature, TID (Tumor Immune Dysfunction). In this paper, we apply TID signature to CAR T single-cell RNA-seq data from [5].

Highlights:
1 TID is second-best at separating averages of JUN and control CAR T cells 5 2 CAR T cells heterogeneity is higher when measured with TID than with single-gene signatures, and is higher for JUN than for Control cells 10 3 TID and single-gene signatures distributions don't show inter-mouse heterogeneity 12 4 TID has a one-sided correlation with single-gene signatures 15

TIDE and TID genome-wide signatures
Coefficient d i is the coupling coefficient between gene expression level h i and T cell level CTL.
When d i > 0, there is an antagonistic interaction between gene i and T cell level. A higher value of gene expression h i will decrease the beneficial association between T cell level and survival (higher h i will increase association between T cell level and death). For example, that's the case for TGFB1.
When d i < 0, that's the opposite: there is a synergistic interaction between the gene i and T cell level. A higher value of gene expression h i will increase the beneficial association between T cell level and survival. For example, that's the case for SOX10.
When d i = 0, there is no interaction between gene i and T cell level. TIDE has been successfully applied to the prediction of outcomes of immune checkpoint inhibitor therapies, where it is one of the best classifiers [1]. However, TIDE could also be applied elsewhere. To our knowledge, that's the first public attempt to apply TIDE to CAR T cell therapy for solid tumors.
1 TID is second-best at separating averages of JUN and control CAR T cells

Single-gene signatures averages
In [5, figure 6g] (copied here in figure 1a), there is a comparison of average expressions of 12 single-gene signatures. More precisely, [5, figure 6g] is a plot of: where jun i and ctrl i are the average expressions of gene i for JUN and Control CAR T cells respectively. They are given by: with #JUN and #Ctrl being the number of JUN and Control cells respectively, and with the nonnegative integer g i,j being the expression of gene i for cell j.
[5, figure 6g] tells that, on average, JUN cells always express less exhaustion markers, and always more activation and self-renewal markers, than Control cells.

Based on single-cell gene expression data (publicly available in NCBI Gene Expression
Omnibus, GEO series accession number: GSE136805), we tried to reproduce this [5, figure   6g]. Instead, we got figure 1b, which is different. We obtained relatively similar values for PDCD1, TIGIT, CD200, NR4A2, KLF2, ILR2A, and IL2. However, we obtained significantly different values for BTLA, ENTPD1, and even opposite signs for TOX, IL7R and CD28.
We have no explanation for this difference.
However, based on our figure 1b, the same conclusion remains, although a bit weakened, because compared with Control cells, JUN cells express all exhaustion markers less (6 over 6), and half of activation/self-renewal markers more (3 over 6).

TID average
In order to compare TID averages for JUN and control cells, we can't generalize equation (2) to the genome-wide signature TID. That is to say, we can't compute: where d i is the TID coefficient for gene i, because single-cell gene expression has a large amount of zeros. We have a lot of genes i such that: jun i = 0 or ctrl i = 0, with d i = 0.
Therefore, we would take logarithms of zero and infinity.
And for each of the 12 single-gene signatures i, we plot: This corresponds to the best practice of log-normalization, i.e. applying the transformation x → log 2 (1 + x) to gene expression. That is the recommended practice for TIDE. By the way, in TIDE definition in the hazard model of equation (1), the normalized gene expression is: In figure 1c, we see that (log-normalized) TID average is lower in JUN than in Control cells.
So according to TID, JUN cells are less dysfunctional than Control cells on average. Moreover, this difference is higher than with all other single-gene signatures, except IL2RA.
These results are even more remarkable when we see that d i coefficients have the "wrong" sign for many genes: d i is negative for the dysfunction gene CD200, and positive for most of activation genes: TOX, IL7R, KLF2, IL2RA, CD28 (figure 2).
However, contributions from those 6 genes with "wrong" signs together only weight 1.5 % of the average TID score.
One possible explanation for these "wrong" signs is that TID is not only a signature of T cell exhaustion, but a bulk RNA-seq signature of dysfunction, which takes into account different

cells, such as other immune cells, cancer cells and stromal cells.
Another remark is that log-normalization is important for TID, because otherwise, the result is reversed: the average TID score for non-normalized JUN cells (-0.087) is larger than for Control cells (-1.148), which would mean that JUN cells would be more dysfonctional than Control cells.
Finally, in equation (3), TID average is a simplification that does not take into account the varying "CTL" weight of each T cell, which appears in equation (1)

CAR T cells heterogeneity is higher when measured with TID than with single-gene signatures, and is higher for JUN than for Control cells
Heterogeneity is an important feature of solid tumors, and a factor of therapy resistance [3].
Single cell RNA-seq facilitates analyzing this heterogeneity. With data available, we can study CAR T cells heterogeneity within JUN and Control cells categories. [5] measures CAR T cells heterogeneity in different ways, in [5, figure 6] and [5, extended figure 10]. However, they didn't measure variance between cells.
The variance of gene i log-normalized expression, for JUN cells, is given by: where the nonnegative integer g i,j is the expression of gene i for cell j.
The TID variance for JUN cells is given by: We did not need to normalize gene expression by substracting the average expression, because variance does not change by subtracting the average.

The variance for Control cells is given by similar formulas.
In figures 3a and 3b, we can see that TID variance is the highest, by a large margin. Moreover, TID variance is higher in JUN than in Control cells. Another possibility is that these high variances might not reflect a real-world phenomenon, and only come from noise in the TID signature, which has a lot of counter-intuitive coefficients, as it appears in figure 2. This question would require further investigations with different singlecell datasets.

TID and single-gene signatures distributions don't show inter-mouse heterogeneity
To refine our observations about variance, we plot distributions.
In particular, these plots might be a way to detect inter-mouse heterogeneity. In single-cell data published by [5], mouse labels are missing: only their categories (JUN or Control) are provided. Moreover, mice survival follow-up is too short to be informative about heterogeneity in JUN mice.
That's why workarounds to detect inter-mouse variation can be useful, and multi-modal distributions could be a hint. They are superpositions of uni-modal (single-peaked) distributions. However, it is still possible to have uni-modal distribution with inter-mouse variation.
This is an important question because without inter-mouse variation, it would mean that TID variance in CAR T cells observed in section 2 is intra-mouse and intra-tumoral.
Moreover, immuno-oncology is often about personnalized oncology. In the case of immune checkpoint inhibitors, there is large inter-patient heterogeneity in clinical outcomes.

Single-gene signatures
Single-gene signatures take integer values, so it's convenient to plot histograms. We did it for PDCD1, in figures 4. We didn't log-normalized gene expression, for better readability.
We see a sharp and almost regular decrease in cell numbers, when gene expression per cell grows. As expected, the decrease is sharper for JUN than for Control cells. We don't see any significantly multi-modal distribution, although this does not rule out inter-mouse variation, which appears in [5, extended figure 10b] for PD-1.
Other single-gene signatures are left as exercises. Another exercise is to check whether the distribution is geometric. Again, we don't see any multi-modal distribution that could suggest significant inter-mouse variation, although this possibility is not ruled out either.
An exercise is to check whether distributions are log-normal, and compute their parameters.

TID has a one-sided correlation with single-gene signatures
We examine the relationship between TID signature and single-gene signatures of a CAR T cell. In figure 6a,  In contrast, in figure 6b, we plot TID scores against IL2RA, which is an activity signature.
Cells with low levels of TID can have high IL2RA, but can't have low IL2RA.
In figure 6c, we compare two single-gene signatures, PDCD1 and IL2RA. We make analogous observations, and in particular, high PDCD1 cells almost never have high IL2RA.
These partial correlations show that TID is complementary with single-gene signatures.

CAR T cells with low TID levels make distinct clusters
As in [5, extended figure 10f and 10g], we plot 2-dimensional UMAP projections of CAR T cells. This gives a qualitative way to visualize similarity between cells. UMAP is an embedding of high dimensional data, analogous to t-SNE, with improvements in terms of computational speed and global distance preservation [6].
In figure 7b, we project expression data. We obtain a different result from [5, extended figure 10f] (reproduced in figure 7a). This difference might be due to UMAP stochastic nature [6, section 5.2], which gives different results from run to run, like t-SNE.
In figure 7c, we project the log-normalized gene expression data, which is more relevant in our study. In both cases (with and without log-normalization), we can see that most of JUN and Control cells belong to different clusters. This confirms that JUN cell engineering significantly impacts cell activity (this was less obvious from UMAP visualization in [5, extended figure   10f]).
Moreover, the JUN cluster is larger than the control cluster, which supports our conclusion drawn from variance computation in section 2, according to which JUN cells are more heterogeneous than Control cells.
In figure 8a, we can see that zones of lowest TID scores for JUN cells (the least TIDdysfunctional JUN cells) are located at the tip, the most further away from Control cells. For Control cells, zones of lowest TID scores are the most further away from the core cluster. Figure 8b shows cells expressing IL2. They overlap with TID-low cells but they can be located elsewhere too.
The same conclusion holds for cells expressing IL7R (figure 8c): they overlap with the TID-low cluster, but most of their concentration is located far away. Figure 8d shows that cells expressing IL2RA are spread out. In conclusion, UMAP visualizations show that the least TID-dysfunctional CAR T cells form specific sub-clusters.

Conclusion
We have shown how genome-wide signature TID can enrich analysis of CAR T for solid tumors scRNA-seq data. First, it confirms that on average, JUN CAR T cells are less dysfunctional than Control CAR T cells. However, it also suggests to be careful about JUN CAR T cells heterogeneity, which brings uncertainty. In conclusion, TID helps de-risking this difficult pipeline in oncology drug development.