Alternative paths to immune activation: the role of costimulatory risk genes for polygenic inflammatory disease in T helper cells

T cell activation pathways have been repeatedly implicated by genetic studies as being enriched for risk genes for immune and inflammatory diseases. Many of these risk genes code for costimulatory receptors or ligands. Costimulatory receptors are cell surface proteins on T cells, which are engaged by costimulatory ligands on antigen-presenting cells. Both costimulation and antigen binding are required to trigger T cell activation. In order to study the different pathways activated by these costimulatory risk molecules, and the role they may play in inflammatory disease genetics, we carried out gene expression (RNA-seq) and chromatin accessibility (ATAC-seq) profiling of naive and memory CD4+ T cells (N=5 donors) activated via four different costimulatory receptors: CD28 (the standard molecule used for in vitro activation studies), along with alternative costimulatory molecules ICOS, CD6, and CD27. Most, but not all, activation genes and regions are shared by different costimulation conditions. Alternative costimulation induced lower proliferation and cytokine production, but higher lysosome production, altered metabolic processing, and indications of “signal seeking” behaviour (homing and expression of costimulatory and cytokine receptors). We validated a number of these functions at the surface protein level using orthogonal experimental techniques. We found the strongest enrichment of heritability for inflammatory bowel disease in shared regions upregulated by all costimulatory molecules. However, some risk variants and genes were only induced by alternative costimulation, and the impact of these variants on expression were less often successfully mapped in studies of T cells activated by traditional CD28 costimulation. This suggests that future genetics studies of gene expression in activated T cells may benefit from including alternative costimulation conditions.


Introduction
Genetic studies of immune and inflammatory diseases have identified many hundreds of risk variants and genes across a wide range of pathways. Cellular functions of CD4+ T cells have been consistently identified as key genetic risk pathways, and risk variants are heavily enriched in or near genes expressed by CD4+ T cells [1,2], particularly in activated cells [3]. Expression quantitative trait (eQTL) studies of activated CD4+ T cells have demonstrated the impact of risk variants on gene expression, with many variants having effects specifically in activated conditions [4,5].
T cells require multiple activation signals from antigen presenting cells (APCs), including a primary signal of antigen recognition (via the T cell receptor, TCR) and secondary signal called costimulation (via costimulatory molecules). Since the 1980s, in vitro experimental models of T cell activation have commonly used agonistic antibodies against CD3 to stimulate the T cell receptor and against CD28 (first called Tp44) to provide costimulation [6]. The CD28 ligands, CD80 and CD86, are expressed by dendritic cells (DCs), monocytes and even other T cells in a cell-subset and context-specific way, and changes in their expression impact both the strength of activation and the cell subsets T cells subsequently differentiate into [7][8][9].
However, beyond CD28, many other molecules can also provide costimulation or coinhibition via other receptor/ligand pairs expressed on T cells and APCs, respectively. The particular set of ligands and receptors that are involved in the interaction are determined by the cell subset, tissue context and cell history of the T cell and APC [10,11]. This can have important implications for disease treatment, as certain T cell subsets evade the CD28-blocker belatacept [12], likely due to the availability of alternative costimulation.
Costimulation pathways were identified as key pathways in the genetics of immune and inflammatory disease early on in GWAS [13]. The list of inflammatory bowel disease risk loci alone [14] includes variants mapped to costimulatory receptors on CD4+ T cells (CD6, CD27, CD226, CD137) and to costimulatory ligands on APCs (CD40, ICOSL). CD28 itself also showed evidence of association for a range of diseases [15]. Some locus-specific research has investigated whether risk variants in alternative costimulatory molecules inhibit their ability to costimulate T cells (e.g. CD6 [16], CD226 [17]), with the most striking example being a 2-fold increase in IL-17 production in CD4+ T-cells from CD226-risk-variant carriers compared to non-risk-variant carriers after CD226 costimulation.
It is known that many T cell genes and pathways are impacted by the level of CD28 costimulation [18], and by the cytokine applied during stimulation [19], and it is likely that differences in other costimulation molecules have similar impacts. Previous work on the transcriptional response to alternative costimulation molecules in human [20] and mouse cells [21] using gene expression microarrays has shown that the same core activation pathways tend to be upregulated by most costimulatory molecules, but that individual costimulatory molecules also often have biases or unique genes or pathways.
It is likely that standard in vitro studies of T cell costimulation miss or understudy pathways, genes and gene regulatory elements that are regulated specifically or predominantly by alternative (i.e. non-CD28) costimulatory pathways, and in particular by inflammatory-risk-associated costimulatory genetic variation. For instance, eQTL studies may have lower sensitivity to detect the impact of risk variants on activation-responding genes if they lie in enhancers that are underactive in CD28-dependent costimulation compared to alternative costimulation, or are dependent on regulatory pathways or transcription factors that lie downstream of alternative costimulatory molecules.
In this study, we set out to use modern functional genomics techniques to assess the difference in gene expression, activity of gene regulatory elements and pathway activity between standard CD28-induced costimulation and alternative genetic-risk-implicated costimulatory molecules. We had two main questions: Firstly, how many genes and regulatory elements are preferentially regulated by non-CD28 costimulation, and what sort of functions or pathways are they associated with? Secondly, do we find evidence that genetic risk for inflammatory disease is present and/or enriched in any genes downstream of these costimulatory molecules?
We selected four costimulatory molecules that have previously been identified as candidate causal genes for inflammatory bowel disease, two of which have been well studied in the genome-wide costimulation-induced expression papers reviewed above (CD28 and ICOS), and two not previously studied (CD6 and CD27).

Experimental set-up
We used an experimental approach that sorted live CD4+ naive and memory T cells (removing regulatory T cells, Tregs) from N=5 donors, activated them for 24 hours using biotin-conjugated agonistic antibodies bound to streptavidin plates [22], sorted out activated (CD69+ cells) and carried out RNA-seq and ATAC-seq assays to measure gene expression and chromatin accessibility ( Figure 1A). The four antibodies were titrated to give comparable levels of activation (measured via CD69 expression, Figure 1B), though in practice CD27 was unable to achieve high levels of activation in memory cells.
For RNA-seq, 58 samples were sequenced (after two failed library QC), with coverage per sample ranging from 22.4M to 50.2M mapped reads. For ATAC-seq, 57 samples were sequenced with coverage per sample ranging from 163.6M to 473.2M mapped reads.

Overall changes in gene expression and regulation
Principal component analysis demonstrated that the type of costimulation molecule used was the primary driver of genome-wide gene expression across all 58 samples (Figure 2A): the first principal component distinguishes different forms of costimulation, with control samples (cultured with a non-binding IgG2a isotype control antibody) and non-costimulated (cultured with anti-CD3 alone) on one end, standard CD28 costimulation on the other, and CD6, CD27 and ICOS costimulation falling in the middle. The second principal component separated naive from memory cells.
We tested for differential expression using DEseq2 [23]. Very large numbers of genes were differentially regulated by costimulation in naive cells, ranging from 6150 upregulated and 5963 downregulated by CD28 to 3806 upregulated and 3893 downregulated by CD27 ( Figure 2B).
For most genes, the log fold change in gene expression was correlated across the different costimulation molecules ( Figure S1), representing the core set of T cell activation genes. The overall strength of the impact of costimulation on gene expression was largest for CD28, and the effect on the majority of genes in the alternative costimulations can be viewed as a "scaled down" version of the impact of that seen in CD28.
To identify genes that show proportionally larger changes in particular alternative costimulatory molecules, we used two different approaches to define "costimulation-biased" genes. The first, which we called the "shape-based method", selected genes that were significantly upregulated in the alternative costimulation compared to both CD3-alone and CD3+CD28 (or were significantly downregulated in both, Figure S1). The second, which we called the "linear model-based" or "LM-based" approach, tested whether genes had a significantly larger or smaller log fold change than would be predicted based on the log fold change in CD3+CD28 and the genome-wide slope of the regression line (i.e. identifying genes that had significantly larger or smaller effects than would be expected based on a simple change in costimulation strength, Figure S2). The alternative costimulatory molecule with the largest number of costimulation-biased genes was CD6, which had 843 shaped-based and 2279 LM-based genes in the naive condition (Table 1). Overall, there were many more costimulation-biased genes in naive than in memory cells, likely due to the relatively smaller effect of alternative costimulation in memory cells.
In overall genome-wide pattern, gene regulation, measured by accessibility in called peaks, showed a similar pattern to gene expression on its principal components, differentially accessible peaks and costimulation-biased peaks (Supp Figure S3).

Alternative costimulation condition
Memory Naive  Table 1: Number of costimulation-biased genes in memory and naive CD4+ T cells under different alternative costimulation conditions compared to CD28 costimulation, estimated using two different methods (LM-based and shape-based).

Figure 2: Global changes in gene expression under different costimulation conditions. A)
Principal component analysis of RNA-seq samples, coloured by stimulation condition and shaded by memory/naive subset. B) Count of differentially expressed genes between costimulation conditions and control conditions, broken down by UP-vs DOWN-regulation, naive/memory subset and control condition (shown in gray boxes above the plot, costimulation conditions are compared to aCD3-stimulated-only controls, and aCD3 is compared to mIgG2a isotype controls). C) Count of differentially expressed genes between alternative costimulation conditions (aCD6, aCD27 and aICOS) and the aCD28-costimulated condition, broken down by UP-vs DOWN-regulation and naive/memory subset.

Figure 3: Pathway and transcription factor enrichment analysis. A) Hallmark and B)
KEGG terms significantly enriched (FDR < 0.01) in costimulation-biased genes in at least one alternative costimulation condition NES=Normalized Effect Size, padj=Benjamini-Hochberg adjusted q-value C) Enrichment of transcription factor motifs in the CD6 and CD28 costimulation conditions compared to CD3-alone. Yellow dots indicate significant costimulation motifs, defined as motifs which are significantly enriched (FDR < 0.05) both in peaks that are upregulated in the CD6 condition compared to all peaks and compared to peaks that are upregulated in either the CD6 or CD28 conditions.

Pathways and transcription factors characterizing alternative costimulation
By applying the fast GSEA [24] method to the LM-based Z scores, we tested for enrichment of Hallmark [25] and KEGG [26] terms among genes relatively upregulated in each alternative costimulation compared to CD28 ( Figure 3A and B). A number of effector pathways were downregulated in these alternative costimulations (i.e. upregulated in CD28), including proliferation pathways (G2-M checkpoint genes, E2F targets and cell cycle genes) and, for CD6 and CD27 in naive cells, cytokine response pathways (IFNg and IFNa response genes). Pathways that were characteristic of alternative costimulation include oxidative phosphorylation and lysosome pathways.
To investigate the DNA binding proteins driving differences in differential accessible peaks, we tested for enrichment of transcription factor motifs using HOMER [27]. As with gene regulation, most transcription factors were equally active in all costimulation conditions. However, using the same shape-based approach, we found a number of transcription factors differentially active in alternative costimulations, including NFAT, the NFAT:AP1 complex, and Nur77 in CD6 ( Figure  3C), the last of which seemed like a unique binding signature in CD6 costimulation and not present in CD28 costimulation. The Nur77 gene (NR4A1) was not upregulated in CD6 compared to CD28, suggesting that its activity is not driven by an increased production, but two NFAT genes (NFATC1 and NFATC3) were upregulated in the CD6 condition relative to the CD28 condition.
We carried out further bioinformatic analyses to validate and follow up on the candidate hypotheses suggested by the network analysis. We first tested effector functions (proliferation, cytokine production), which the pathway analysis suggested were lower in alternative costimulations. To validate the impact of costimulation on proliferation, we scored each sample for a cross-species measure of cell proliferation [28] (Figure 4A), demonstrating a clear effect in both memory and naive cells of CD6 and CD27 (and, in naive cells, ICOS) costimulation leading to lower proliferation.
We next turned to the pathways that were enriched in alternative costimulation conditions. To follow up on the lysosome gene set enrichment, we visualized the log fold changes ( Figure S4), demonstrating clear signals of upregulation in CD6 across a range of lysosome degradation components, including proteases, glycosidases, sulfatases and phosphatases. The role of the lysosome in costimulation was hard to establish; it was not explained by increased autophagy (LC3 transcripts were not upregulated, and the autophagy KEGG pathway was not enriched), or by antigen presentation (HLA alleles were not upregulated, the antigen processing marker TAP1 was downregulated in CD6 compared to CD28, and genes in the KEGG antigen presenting pathway were downregulated).
To investigate the role of oxidative phosphorylation metabolism, we used the COMPASS method [29] to infer metabolic reaction rates across 7,440 reactions for each of our samples, and found that, as predicted, OxPhos reactions were more highly upregulated in CD6 compared to CD28 than would be expected under the trend line ( Figure 4B). There was no corresponding downregulation in glycolysis reactions in CD6, suggesting that this is not a simple inverse case of oxphos-to-glycolysis metabolic switching as seen in differentiating T cells [30].
For the production of cytokines, we found that almost all cytokine genes were either expressed at a lower level in the alternative costimulation (e.g. TNF, IFNG in CD6) or were not expressed at all (e.g. IL1 and IL17 in CD6) ( Figure S5). The single exception to this rule was the production of XCL1 and XCL2 in naive cells, which was expressed specifically in response to ICOS and no other costimulatory molecules. For cytokine receptors, the pattern was more mixed, with some cytokine receptors expressed at lower levels in alternative costimulations (TNFR, IFNAR2 in CD6), whereas others expressed at higher levels (e.g. CCR11). We also investigated other genes towards the top of our list. CD9 stood out as a candidate that was highly expressed by naive cells in response to all costimulation conditions except CD28. While CD9 is involved in the creation of exosomes [31], we did not find other exosome components upregulated in alternative costimulations. Instead, this molecule seems to follow a pattern of cell adhesion genes being upregulated in alternative costimulations, with both the "biological adhesion" and the "leukocyte cell-cell adhesion" GO terms enriched in CD6, CD27 and ICOS, and including a number of adhesion genes (CD99, LYPD3, CD96).
We also tested the extent to which costimulatory molecules coregulate each other ( Figure 4C). The primary finding was that the CD27 transcript in naive cells was strongly upregulated by all alternative costimulation molecules (particularly CD6), but weakly, if at all, by CD28. By contrast, the ICOS transcript is induced most strongly by CD28, particularly in memory cells, though it is also induced more weakly by CD6. By contrast the CD6 transcript was only weakly regulated by any costimulation.

The role of alternative costimulation in genetic risk of inflammatory bowel disease
The ATAC-seq open chromatin data allowed us to test whether regions of the genome that were regulated by different costimulation molecules were enriched for genetic risk of inflammatory diseases. Testing for enrichment of inflammatory bowel disease (IBD) SNP heritability using partitioned LD score regression (LDSC [32]), we found significantly higher heritability in peaks that were upregulated in activated, costimulated naive T-cells compared to CD3-stimulation alone ( Figure 5A). This was equally true regardless of the costimulation molecule used. We also tested whether peaks that were shared or unique across both CD28 and alternative costimulation conditions were enriched for IBD risk ( Figure 5B), and found the strongest enrichment of risk in peaks that were induced by both, with the second largest enrichment in CD28-specific peaks, and the weakest in alternative costimulation-specific peaks.
We intersected our costimulation-biased peaks with genetic fine-mapping results for immune and inflammatory disease associations [1,33] to find candidate risk variants that may act through regulatory elements specific to alternative costimulation. We found nine loci where costimulation-biased peaks contained potential causal variants for inflammatory disease, including seven for inflammatory bowel disease, one for multiple sclerosis and one for primary biliary cirrhosis (Supplementary Table S1). One example of an alternative costimulation-specific risk locus is in the IBD-associated 8q24 gene desert [34], where the lead risk variant is contained within an accessible element that is repressed by CD6 costimulation but not by CD28 costimulation ( Figure 5C).  CIs). B) Enrichment of heritability in regions specifically upregulated in CD28 or in one or more alternative costimulation conditions (CD28-specific and alternative-specific peaks), in both CD28 and at least one alternative costimulation condition (shared peaks) or not upregulated in activated T cells in any condition (neither). C) The 8q24 IBD GWAS locus, causality posteriors for candidate causal variants [1], and estimated log-fold changes (with 95% CIs) in the CD28 and CD6 conditions as compared to aCD3-only control. Dashed line shows the location of the highest posterior variant. D) Enrichment of IBD associations colocalizing with gene expression in an eQTL study of CD28-costimulated CD4+ T cells [4] among genes upregulated during activation (in any costimulation condition) or specifically upregulated by alternative costimulation and not by CD28.  We also intersected high-confidence candidate risk genes for IBD (L2G score > 0.5 [35]) with our costimulation-biased gene list. We looked for genes that are either upregulated specifically in alternative costimulation (i.e. not differentially expressed in the CD28 condition), or had opposite effects in CD28 vs alternative costimulation, as these were genes that would be more likely to be unmapped or give false direction of effect in eQTL studies. Of 76 high-confidence IBD risk genes, 8 met these criteria ( Table 2). Of these, five were specific to alternative costimulations, and three went in opposite directions between standard and alternative costimulation. Six of the eight were induced or repressed by CD6.

CD4+ subset Conditions
We hypothesized that these alternative-costimulation-biased risk genes would be harder to successfully map in traditionally activated (i.e. CD28-costimulated) eQTL studies, and we tested this hypothesis in a recent single-cell eQTL map of CD28+CD3 stimulated CD4+ T cells [4]. Of our 8 candidate IBD risk genes, only one (IL18R1) had a colocalizing eQTL in the CD3+CD28 eQTL dataset (compared to 15/68, or 22%, of the remaining risk genes). Genome-wide, costimulation-specific genes have a lower (and non-significant) enrichment of successfully colocalised IBD risk loci (OR = 1.37, 95% CI = 0.82 -2.18) than T cell activation genes in general (OR=2.59, 95% CI = 1.72 -3.71, Figure 5D), providing further evidence that alternative costimulation-specific risk genes are harder to map in standard eQTL studies.

Experimental validation of key findings at the protein level
We validated a number of our key findings using orthogonal experimental assays of protein expression.
We measured the surface expression of CD6, CD27 and ICOS after CD28 and CD6 costimulation using flow cytometry. We validated that CD27 is upregulated by CD6 compared to CD28 in naive (but not memory) T cells, and that ICOS is more strongly induced by CD28 than CD6 ( Figure 6A). As predicted CD6 was not impacted by CD28 costimulation, though it was almost entirely lost from the surface after CD6 costimulation (an effect of ligation-dependent cleavage which has been described previously [36]).
We validated the reduced effector function of alternative costimulated cells using a multiplex cytokine assay, showing either absence or significantly reduced secretion of key effector cytokines by CD4+ T cells after costimulation with CD6 compared to CD28. The level of reduction in protein concentration was consistent with the level of reduction of gene expression ( Figure 6B).
Finally, we validated the increase in lysosomes using intracellular staining of LAMP1 measured by flow cytometry, demonstrating that T cells accumulate lysosomes after CD6 costimulation compared to CD28 costimulation ( Figure 6C). We further investigated whether this increased lysosome formation could be explained by increases in either autophagy [37,38] or lysosome exocytosis [39], but saw neither evidence of lysosome secretion or an increase in autophagic flux after 24 hours of CD6 costimulation ( Figure 6C,D).

Discussion
We have carried out a detailed assessment of the different effects genetic-risk-associated costimulatory molecules play on gene regulation, gene expression and downstream pathways in activated CD4+ T cells. This has revealed a number of key findings about the functional impacts of these signaling molecules, and the role they may play in the genetics of inflammatory disease. We have also successfully validated a set of these findings at the protein level using orthogonal experimental approaches.
The majority of the changes downstream of T cell activation, as has been noted by previous research, are activated (to some degree) regardless of the specific costimulation molecule used, though usually with a larger effect in the CD28 condition. Costimulation with CD28 has a more dramatic impact on T cell function than other costimulators across three levels of the signaling cascade. Firstly, anti-CD28 is a more potent costimulator than other molecules, and in our experiments we required 6.7x the concentration of anti-CD6 to achieve a comparable level of T cell activation (measured by CD69 positivity). Next, even at comparable levels of activation, and looking at only activated (CD69+) cells, we saw a larger degree of chromatin remodeling and a more dramatic impact on gene expression in the CD28 condition than for other costimulation molecules. Finally, even after controlling for the genome-wide patterns, we found that genes underlying effector functions, such as proliferation and cytokine release, were further enriched in CD28 compared to other costimulators, relative to other genes.
There were numerous genomic regions, genes and pathways that diverged from this overall trend and responded exclusively or predominantly to alternative costimulation. This was particularly true in naive cells, and particularly during CD6 costimulation. Alternative costimulation called T cells to have a relative shift towards what appears to be a "getting ready" cell state. These cells secrete less cytokine and proliferate less. They begin to overexpress genes that may play roles in searching out further activation signals, including further costimulation (by upregulating CD27), homing to inflammatory areas (via CCR11 [40]) upregulating certain cytokine receptors (TNFR, IFNAR2) and increasing activity of activation-mediating transcription factors (Nur77/NR4A1 [41]). They also undergo a number of intracellular changes, including the overproduction of lysosomes and a relative increase in oxidative phosphorylation. The lysosome overproduction did not appear to be related to autophagy or antigen presentation, and we did not see evidence of secretory exocytosis (though the 24-hour time point may have been too early to detect it). Lysosome-mediated protein degradation plays a role in many other T cell functions [42], some general (e.g. increasing protein turn-over) and some specific (including sensitizing cells to activation by degrading CTLA-4 [43]), and the same is true for metabolic state [44]. Further investigation into these pathways, and how they are altered by genetic risk variants for inflammatory disease, will shed light into the downstream functions of these pathways and their role in immunopathology.
In this study, we were specifically interested in whether previous studies of CD3+CD28 activation of T cells could have missed important genetic risk pathways, for example by failing to find eQTLs in genes expressed only downstream of genetically implicated alternative costimulation molecules. Our heritability analysis suggests that the majority of risk variants lie in general T cell activation genes, and that the alternative costimulation molecules are no better at inducing the expression of risk genes than CD28. In fact, regulatory regions that were only activated by alternative costimulation and not by CD28 had significantly less enrichment of heritability than regions regulated by all costimulatory molecules. However, this does not mean that alternative-costimulation-specific pathways play no role in disease risk: we found both risk variants and risk genes that are exclusively regulated by alternative costimulation, both for inflammatory bowel disease (including IRGM and ZFP36L1 in CD6 costimulation) and for psoriasis (XCL1/2 in ICOS costimulation [45]). As predicted, alternative costimulation risk genes are less likely to colocalize with gene expression in CD28-costimulated CD4+ T cells in a recent single-cell eQTL study [4].
There are many limitations of this study, and much more could be done. We picked a 24 hour time-point in order to capture an early pre-proliferative, pre-differentiation state. However, both earlier time-points (to capture the immediate effects of costimulation) and later time points (to capture the effects of differentiation and T cell polarization) could also include valuable information. There are further experimental studies that could and should be done to validate more findings at the protein level, and measure their impacts on downstream cellular functions. There is also further investigation that can be done on the impact of combinations of costimulation -here we have modeled the effect of "pure" alternative costimulation (as might be seen from a CD80/86-negative APC, or a CD28-negative T cell), but in practice a T cell will often receive a combination of costimulation signals, which may have non-additive impacts. We have also used bulk RNA-seq and ATAC-seq, sorted only on activation state and naive vs memory subset, which may miss strong impacts of certain costimulatory molecules on certain rare T cell subsets.
A final limitation of the study is that, while we have adjusted for stimulation strength as much as possible (by achieving comparable levels of CD69-positivity, sorting CD69+ cells and adjusting for global changes in gene regulation/expression), it remains difficult to distinguish between the impacts on T cell function due to a weaker stimulation compared to specific signaling pathways downstream of a given costimulation molecule. The "getting ready" T cell states we describe, induced by alternative costimulation, may in fact be "weaker activation" states, where T cells receive a signal sufficient to activate but not sufficient to commit to full effector status. Ongoing studies of the impact of TCR stimulation affinity on T cell activation will test this hypothesis.
A key hypothesis that we began this study with was that genetic risk variants in alternative costimulatory molecules could be increasing risk by impacting genetic risk pathways that were specifically downstream of these molecules. This is still an active hypothesis, and we have found candidate genes and variants where this may be the case. However, the overall similarity in the downstream effects of CD28 and other risk-associated costimulation molecules also leaves open the possibility that these variants are acting as overall modulators of the shared T cell activation response (with any costimulation-specific effects being coincidental to disease risk). Ultimately, it will require costimulation-specific eQTL maps of CD4+ T cells to answer the question of how costimulation and disease risk interact, by mapping cis-eQTLs for risk variants and genes that are only regulated by alternative costimulation, and by mapping the downstream effects of costimulatory risk variants by finding their trans-eQTLs effects on downstream activation genes.