Abstract
Alloreactivity compromising clinical outcomes in stem cell transplantation is observed despite HLA matching of donors and recipients. This has its origin in the variation between the exomes of the two, which provides the basis for minor histocompatibility antigens (mHA). The presentation of mHA on HLA class I and II molecules and the ensuing T cell response to these antigens result in graft versus host disease. In this paper, results of a whole exome sequencing study are presented, with resulting alloreactive polymorphic peptides and their HLA class I and HLA class II (DRB1) binding affinity quantified. Large libraries of potentially alloreactive recipient peptides binding both sets of molecules were identified, with HLA-DRB1 presenting an order of magnitude greater number of peptides. These results are used to develop a quantitative framework to understand the immunobiology of transplantation. A tensor-based approach is used to derive the equations needed to determine the alloreactive donor T cell response from the mHA-HLA binding affinity and protein expression data. This approach may be used in future studies to simulate the magnitude of expected donor T cell response and risk for alloreactive complications in HLA matched or mismatched hematopoietic cell and solid organ transplantation.
Introduction
Graft-versus-host disease (GVHD) represents a significant cause of morbidity and mortality in stem-cell transplant (SCT) recipients1. GVHD in an HLA-matched allogeneic stem cell transplant is the archetype of an adaptive immune response with donor derived T cells responding to recipient antigens presented on shared HLA class I and class II antigens2, 3, 4. Since the beginning, HLA matching has been the bedrock principle of donor selection in SCT, and this is particularly so when the donor is not a close relative5, 6. Improvements in the fidelity of HLA matching between unrelated transplant donors and recipients have yielded incremental benefits in patient outcomes, with improvements in survival resulting from a reduction in GVHD risk, a reduction in graft loss, and optimization of relapse risk. Nevertheless, GVHD remains a therapeutic challenge, and there is little that can be done to predict the outcomes of specific donor-recipient pairs.
This challenge may be surmounted by accounting for genomic variation between the donors and recipients which yields the peptides presented on HLA molecules, known as minor histocompatibility antigens (mHA)7, 8. While mHA have had a recognized pathophysiologic role in allogeneic SCT outcomes, especially in GVHD pathogenesis, it has not been possible to apply the notion to clinical practice because mHA characterization is a cumbersome process9, 10, 11, 12. Two developments in the past decade have changed this situation. The first is the emergence of next generation DNA sequencing techniques, such as single nucleotide polymorphism mapping13, 14 and whole exome sequencing (WES), which identify the potential antigenic differences15, 16. The second is the development of machine learning algorithms which allow determination of the binding affinity that different antigens may have for specific HLA molecules17, 18, 19. These two techniques have been combined to develop algorithms that may be used to determine the complex array of recipient antigens that a given donor's T cells may encounter in a recipient20, 21. This knowledge of mHA in turn may allow simulation of alloreactive T cell responses in equivalently HLA matched SCT donor-recipient pairs (DRP) to identify donors with optimal alloreactivity.
Studies reporting exome-wide or other genomic disparities in donors and recipients have demonstrated a large body of DNA sequence differences between transplant donors and recipients, independent of relatedness and HLA matching14, 15, 16. These large genomic differences have been translated to peptides, and the HLA affinities of the resulting peptides have been determined20. This too yields large libraries of antigens which may be analyzed by either simulating alloreactive T cell responses or by statistical methodology to determine predictive power for alloreactive T cell responses22, 23. To date, these models have examined recipient peptide presentation on HLA class I and studied the resulting associations.
As noted above, HLA-matched SCT remains fraught with uncertainty as patients with HLA-matched donors continue to have disparate outcomes24, 25. A quantitative model of transplant alloreactivity would allow a more complete understanding of the molecular immunology of SCT, help to identify the most suitable donors for specific recipients, and allow personalized determination of the optimal level of immunosuppression. A central assumption in one such quantitative model, the dynamical system model of T cell responses, is that alloreactivity (such as GVHD) risk is a function of the cumulative mHA variation in the context of the HLA type of each donor-recipient pair (DRP), and may thus be regarded as an alloreactivity potential for that pair15, 20, 26. Clinical outcomes partially depend on the cumulative donor T cell responses to the burden of polymorphic recipient peptides. Previous work applying this dynamical system model to HLA class I presented molecules demonstrates that there are large differences in the simulated T cell responses between different HLA matched DRP22, 23. Herein, previously reported findings of WES of SCT DRP are extended with an analysis of the HLA class II presentation of polymorphic peptides. A comparison of the difference in magnitude of the derived peptide libraries presented on the HLA class I and HLA class II molecules in the DRP is presented. Next a mathematical model is developed which may allow the development of a new approach to the study of such large data sets and their eventual application to clinical medicine. The previously reported dynamical systems model of alloreactive T cell responses is generalized to include both HLA class I and HLA class II presented peptides. The model is expanded to account for different conditions T cells may be subject to, specifically their own state of antigen-responsiveness and the cytokine milieu. 
This quantitative perspective may, in the future, permit successful simulation of alloreactive T cell responses between different donors and recipients in SCT.
Methods
After obtaining approval from the institutional review board at the Virginia Commonwealth University, whole exome sequencing (WES) was performed on previously cryopreserved DNA samples from 77 HLA-matched DRP (Supplementary Table 1) as previously described15, 22. Briefly, whole exome sequences from each DRP were compared with each other, as well as to a standard reference exome. All nonsynonymous single-nucleotide polymorphisms (nsSNPs) present in the recipient and donor were identified and recorded in VCF format. Subsequent processing of the VCF files was done using custom Python scripts to remove synonymous mutations, eliminate duplicates, and record the coordinates of the SNPs. nsSNPs that exist in the recipient but not in the donor were recorded and identified as potential sources of alloreactive antigens. The nsSNPs in each DRP would correspond to potential antigens due to the resulting amino acid substitution in oligopeptides which bind HLA in that DRP (Figure 1A).
To derive the peptide sequences for this study, an average peptide length of 15 amino acids for HLA class II was used27. HLA class I bound 9-mer peptides were generated as previously described20. Each of the nsSNPs could potentially be incorporated into an alloreactive peptide of 15 amino acids. The position of the nsSNP encoded polymorphic amino acid in the peptide could vary from the N-terminus to the C-terminus of the peptide. The possible library of peptides will thus be contained within a 29-mer oligopeptide (Figure 1B), and there are 15 different HLA-II binding peptides that could potentially be generated from each nsSNP identified by WES. ANNOVAR was used to generate a 29-mer peptide for each nsSNP to study HLA class II presentation. In ANNOVAR, a sliding window method was used with the “seq_padding” option of the “annotate_variation” function to generate the 15 different 15-mers resulting from each nsSNP. Tissue expression of the proteins from which the peptides were derived was determined as previously described23.
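The sliding window described above can be sketched in a few lines of Python. This is a minimal illustration, not the ANNOVAR implementation: the function name `sliding_windows` and the toy sequences are hypothetical, and the sketch assumes that the padded sequence has length 2k - 1 with the polymorphic residue at its center.

```python
from typing import List


def sliding_windows(padded: str, k: int) -> List[str]:
    """Return all k-mers of a (2k - 1)-residue sequence.

    Because the polymorphic residue sits at the center of the padded
    sequence, every k-mer returned contains it, at positions running
    from the C-terminal end (first window) to the N-terminal end (last).
    """
    if len(padded) != 2 * k - 1:
        raise ValueError("padded sequence must have length 2k - 1")
    return [padded[start:start + k] for start in range(k)]


# Toy 29-mer: 'X' marks the polymorphic amino acid (illustrative only).
class_ii_peptides = sliding_windows("A" * 14 + "X" + "C" * 14, k=15)

# The same idea with a 17-mer yields the nine 9-mers used for HLA class I.
class_i_peptides = sliding_windows("G" * 8 + "X" + "T" * 8, k=9)
```

With k = 15 the sketch yields 15 peptides per nsSNP, and with k = 9 it yields 9, matching the peptide counts per nsSNP quoted for HLA class II and class I.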
Once the peptide library was created for each DRP, the HLA types for the recipient were tabulated from the medical records. For class II HLA, HLA-DRB1 alleles for each patient were recorded. Each patient’s HLA-DRB1 allele types (and HLA class I alleles, as previously described) along with the peptide library were analyzed using NetMHCIIpan 2.0 to derive the binding affinity of each peptide-HLA complex. This was given as an IC50 (half-maximal inhibitory concentration) for each peptide, measured in nanomolar (nM). This measure of binding affinity is the concentration of peptide required to displace 50% of a standard peptide from the HLA molecule to which it is bound; a lower IC50 therefore indicates stronger binding.
Peptides present in the recipient but absent in the donor, generated from the ANNOVAR sliding window, were tabulated with their IC50 values for all the different patient HLA types, and duplicates were deleted. Any peptide with the same amino acid sequence but a different SNP position along the peptide must have been generated from a different area of the exome and was therefore retained in the enumeration. When compiling the peptides binding to different HLA alleles, patients homozygous for a DRB1 allele had their peptide counts doubled to simulate having double the normal number of allele-specific HLA bound peptides presented. Analysis of the number of strongly bound (SB; IC50 ≤50 nM) and bound peptides (BP; IC50 ≤500 nM) for each patient-HLA allele combination was done by listing the peptides in descending order of binding affinity, i.e. ascending IC50 (Table 1A & 1B). HLA class I and HLA class II bound peptides were compared numerically for this perspective paper.
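The counting rule above can be illustrated with a short Python sketch. The thresholds (50 nM and 500 nM) and the doubling for homozygous alleles come from the text; the function name `count_binders`, the data layout, and the IC50 values are hypothetical and serve only as an example, not as the scripts used in the study.

```python
SB_MAX_IC50_NM = 50.0    # strongly bound: IC50 <= 50 nM
BP_MAX_IC50_NM = 500.0   # bound: IC50 <= 500 nM (includes the SB peptides)


def count_binders(ic50_by_allele, homozygous=()):
    """Count BP and SB peptides per HLA allele.

    ic50_by_allele maps an allele name to the IC50 values (nM) of the
    recipient-only peptides scored against that allele. Counts for
    alleles listed in `homozygous` are doubled, as described in the text.
    """
    counts = {}
    for allele, ic50_values in ic50_by_allele.items():
        weight = 2 if allele in homozygous else 1
        counts[allele] = {
            "BP": weight * sum(v <= BP_MAX_IC50_NM for v in ic50_values),
            "SB": weight * sum(v <= SB_MAX_IC50_NM for v in ic50_values),
        }
    return counts


# Illustrative IC50 values only; not data from the study.
example = {
    "DRB1*15:01": [12.0, 48.5, 320.0, 4100.0],
    "DRB1*03:01": [75.0, 900.0],
}
counts = count_binders(example, homozygous={"DRB1*15:01"})
```

In this toy input the homozygous allele has three peptides at or below 500 nM and two at or below 50 nM, so its doubled counts are BP = 6 and SB = 4.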
Results
HLA class II bound alloreactive peptides
Whole exome sequencing (WES) was performed on the cohort of 77 donor recipient pairs (DRP), of which 75 were evaluable for this analysis. SNPs were identified, following which alloreactive peptides and HLA-DRB1 binding affinities were derived. HLA matched unrelated donor (MUD) DRP exhibited a higher number of HLA-DRB1-BP than HLA matched related donor (MRD) DRP; mean 67,987 alloreactive peptides in MUD vs. 39,584 in MRD (t-test p<0.001). When only the SB peptides are analyzed, this trend, while present, is no longer statistically significant; mean SB 9,535 alloreactive peptides in MUD vs. 6,077 in MRD (p=0.168) (Figure 2A & 2B). This is consistent with the larger burden of exome variation in MUD transplant recipients. Significantly more MUD DRP than MRD DRP had BP above the whole-cohort median of 52,983 peptides (34/49 vs. 4/26, Fisher's exact test p<0.0001), as well as SB >4,245 (30/49 vs. 8/26, p=0.012). There was marked variability in the HLA-DRB1 allele binding affinity of the various peptides as well as in the tissue expression of the proteins from which the peptides were derived (Table 1A). This is likely an effect of the randomness observed in exome sequence variation and the variation in HLA binding affinity of the resulting alloreactive peptides, and it illustrates the potential for variability in alloreactive antigen presentation between different donors and recipients who undergo SCT.
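The 2×2 comparisons quoted above (34/49 vs. 4/26 for BP, 30/49 vs. 8/26 for SB) can be checked with a two-sided Fisher's exact test. The sketch below uses only the Python standard library; the function name is hypothetical, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention and is assumed here, since the text does not state which software or convention was used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Rows: MUD, MRD. Columns: above threshold, at or below threshold.
p_bp = fisher_exact_two_sided(34, 15, 4, 22)   # BP > 52,983
p_sb = fisher_exact_two_sided(30, 19, 8, 18)   # SB > 4,245
```

The resulting `p_bp` and `p_sb` can then be compared with the reported p<0.0001 and p=0.012; small differences are possible if a different two-sided convention was used.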
Comparing HLA class I and II bound alloreactive peptides
The HLA class II binding peptide libraries were compared to previously determined numbers of BP and SB on all HLA class I alleles for the same patients. On average, the number of alloreactive peptides bound to the two HLA-DRB1 alleles with an IC50 <500 nM was far greater than the number bound to the HLA class I loci (all 6 HLA-A, B & C alleles). Significantly more peptides bound HLA-DRB1 molecules compared to all the HLA class I molecules put together; BP for HLA-DRB1 median 52,983 compared with BP for all HLA class I molecules 4,532, yielding a median ratio BP-HLA class I/BP-HLA DRB1 per DRP of 0.09 (0.03-0.29; t-test p<0.0001). The same trend was observed with SB, with a median ratio of 0.23 per DRP (0.02-4.48; p=0.0001) (Figure 3A). The numbers of BP and SB were correlated for HLA class I and, to a lesser extent, for HLA class II molecules in the DRP studied; Pearson correlation coefficient, R 0.71, p<0.0000001 for HLA-DRB1 & 0.94, p<0.0000001 for all HLA class I molecules together (Figure 3B). Nevertheless, HLA class II molecules presented an order of magnitude greater number of peptides. There was little overlap in the binding affinities of various alloreactive peptides to different HLA class I molecules (Table 1B). The difference observed in HLA class I and II antigen presentation is likely a consequence of the larger peptide length presented on the dimeric HLA class II molecules. This increases the size of the peptide pool on offer (9 alloreactive peptides/SNP for HLA I vs. 15 for HLA II), and consequently the likelihood that alloreactive peptides will be presented. Tissue expression of the proteins from which the peptides presented on HLA class I were derived was also determined, and marked variation was observed in the RPKM values of the proteins of origin (Tables 1A & 1B).
One DRP (#26) was analyzed to determine the likelihood of peptide presentation from the same proteins on both HLA class I and II molecules. This would result in activation of both CD4+ and CD8+ T cells in the tissues expressing that protein, and greater potential for tissue injury. A comparison of strongly bound peptides (IC50 ≤50 nM) demonstrates that this DRP had 143 genes that yielded peptides binding both HLA class I and HLA class II. Different degrees of sequence homology between these 9-mer and 15-mer peptides were observed (Table 2 & Supplementary Figure 1). This overlap suggests that if the degree of exome sequence variation in a DRP is sufficiently large, it is plausible that most tissues will potentially present mHA to both helper and cytotoxic T cells.
Demographic factors influencing HLA class II bound alloreactive peptides
Finally, demographic factors, including race and gender, that impact genetic disparity were analyzed. African-American vs. Caucasian DRP demonstrated a non-significant trend for increased HLA-DRB1 bound mHA in African American DRP, for both BP (74,179 vs. 53,735 in Caucasian patients; p=0.075) & SB peptides (11,972 vs. 7,503; p=0.36). There was no significant difference in the number of BP or SB in the gender-matched male or female DRP, not accounting for Y chromosome disparity in male patients receiving transplant from a female donor.
Discussion
The data presented in this paper illustrate the large potential that HLA class I and especially HLA class II molecules have for recipient peptide antigen presentation in the context of allogeneic SCT. The magnitude of this antigen burden across the patient population makes it difficult to predict which patient will have a poor outcome. However, while in and of themselves these parameters may not yet be definitive for GVHD prediction, given the uniformly large magnitude of mHA identified in the patient cohort examined, these measures if appropriately analyzed may give insight into the quantitative principles of the T cell immune response. The following discussion gives a quantitative perspective of the impact the presentation of recipient antigens on HLA class I and II molecules may have on donor T cell responses following allogeneic SCT. The Dynamical System model of alloreactive T cell growth previously developed for HLA class I-presented-mHA is further developed for use in future studies to simulate universal alloreactive T cell responses.
An important clinical question in transplant immunology is how data from next generation sequencing (NGS) and novel machine-learning algorithms can be used to help identify optimal donors for SCT. To do this it is imperative to understand the quantitative principles at work in donor immune response and use these principles to develop methodology to simulate transplants with different donors in silico. Such simulations may then be used to identify both the ideal donor and the level of immunosuppression needed for optimal clinical outcomes. The mHA prediction methodology presented previously and extended herein, augmented by analysis of peptide cleavage sites to more accurately determine the probability of the generation of specific HLA binding alloreactive peptides, may allow this prediction in the future28. As a first step towards this goal, it was shown that donor CD8+ T cell growth simulations may identify patients at risk for moderate to severe GVHD; however, these associations were relatively weak23. While one possible explanation for this is the stochastic nature of alloreactive antigen presentation on HLA molecules (both alloreactive and non-alloreactive peptides may bind HLA), an important limitation in the special case of the model described (HLA class I antigen presentation) was its lack of information on HLA class II mHA presentation and consequent CD4+ helper T cell responses in the donor-recipient pairs involved. Normally, CD4+ T helper cells play an important role in the homing of cytotoxic T cells to infected tissues, and in the case of GVHD to the target tissues29, 30, 31. In the transplant setting, T helper cells will recognize their target alloreactive antigens bound to HLA class II molecules; notably, these differ from the antigens recognized by CD8+ cytotoxic T cells and presented by HLA class I. The T helper cells initiate signaling by secretion of appropriate cytokines (IFN-γ, IL-2, IL-12, IL-17 etc.) and set up the homing signal for the cytotoxic T cells to invade the target tissue (Figure 4), where they cause tissue injury through direct cytolytic activity. In the present study we estimate the magnitude of alloreactive antigen burden encountered by donor cytotoxic T cells and helper T cells in HLA matched DRP. This estimate may allow a more accurate calculation of the likelihood that a patient may develop T cell mediated tissue injury following SCT than was previously possible23.
T cell clonal proliferation in response to mHA-HLA complexes: The logistic equation of growth
Previous work has shown there to be far greater diversity in the T cell repertoire of CD4+ T cells than in the CD8+ T cells in the post-transplant period in both allogeneic and autologous SCT32. In fact, CD4+ T cell diversity has been found to be about 50 times greater than CD8+ T cell diversity33. The relative magnitude of antigen presentation by HLA class II compared with HLA class I molecules allows one to understand this difference in clonal diversity between the helper and cytotoxic T cells. The ability of HLA class II molecules to present larger peptide sequences is related to their structure compared to HLA class I molecules. The antigen-binding region of HLA class II molecules consists of both an invariant α and a variable β domain, whereas that of HLA class I molecules contains only α domains; the class II structure results in the binding of a wider range of peptide sequences6,34. This differential antigen presentation results in the quantitative difference observed between the two classes of T cells and may be understood using the dynamical systems approach. In this model, growth equations have been used to simulate the cytotoxic T cell growth in response to HLA class I presented antigen,
This iterating equation describes the logistic growth of a CD8+ T cell clone Tx in a polyclonal T cell graft infused into a recipient (Figure 4 & Supplementary Table 2). N0 (Tx) is the T cell count at the time of transplantation (assumed to be 1 for this equation), and Nt (Tx) is the T cell count after t iterations (time) following SCT. Nt-1 (Tx) represents the T cell count for the previous iteration, and K is the constant that determines the T cell count at the asymptote (steady state conditions after infinite iterations), K (Tx), representing the maximum T cell count the system would support (carrying capacity); r is the growth rate. In the logistic equation, the steady state count for each T cell clone (KBZ) will be proportional to the product of two affinities: the binding affinity of the target mHA peptide y for the HLA molecule (By in this paper; afmHA = 1/IC50 in Koparde et al) and the affinity of the T cell receptor of clone Tx for the mHA-HLA complex (Zx in this paper; afTCR = 1/IC50 in Koparde et al)23. In this model, the parameter r determines the growth rate of the specific clone and reflects the effect of the costimulatory molecules and cytokines driving T cell proliferation. This iterating equation gives the instantaneous T cell count (magnitude of the proliferative response) in response to the antigens presented. The tissue expression of the protein from which peptide y is derived (Py) is a coefficient/multiplier for the steady state T cell population KBZ; it may be estimated by RNA sequencing techniques and is reported as Reads or Fragments Per Kilobase of transcript per Million mapped reads (RPKM or FPKM)35. In real-world situations the term Py will have a time modifier, e^t, associated with it, as protein expression and antigen amount decline over time because of tissue injury. This time relationship will be ignored for simplicity at this time.
It is important to recognize that in HLA class I-presented antigen-driven T cell expansion, this term is utilized in its entirety given that HLA class I molecules are loaded using peptides derived from proteins present in the cytosol. This however is not the case for HLA class II molecules, which present antigens endocytosed from the extracellular environment36. This means that when calculating helper T cell growth, the term P will be modified to P·c, with a constant, c, reflecting the attenuation of antigen concentration given its ‘scavenged’ nature as opposed to direct cytosolic presence. In other words, 0 < c < 1 for CD4+ T cells, whereas c = 1 for CD8+ T cells. Thus, the equation for determining helper T cell growth will take the general form,
Adjusting the variable P means that the absolute magnitude of the steady state T cell population for each of the dominant (high-ranked) helper T cell clones will be smaller than that for each of the dominant cytotoxic T cell clones, nevertheless because of the greater number of antigens presented by HLA class II molecules there will be a greater number of CD4+ T cell clones, and thus greater clonal diversity of helper T cells when compared to cytotoxic T cells. This also means that in a Power law clonal frequency distribution analysis37, 38, the contribution of the highest-ranking (most numerous) T cell clones to the entire repertoire will be higher with cytotoxic T cells. Conversely, in the T helper cell population there will be a larger number of high-ranking clones which contribute a larger component of the overall repertoire. Given the greater number of antigens there may be greater competition between the clones, which in a model accounting for competition between clones will lead to slower growth of helper T cell clones, a relatively frequent clinical observation39. Also, given the restriction of HLA class II molecules to antigen presenting cells the absolute magnitude of steady-state helper T cell clonal populations will be smaller; however, since HLA class I molecules are expressed on all nucleated cells, cytotoxic T cells get a proliferative signal from many different cell types, therefore steady state T cell clonal counts can be further augmented. From an evolutionary and T cell response sensitivity and specificity standpoint, it is logical that the cytotoxic T cell-recruiting signal provided by CD4+ T helper cells should be more sensitive, triggered by a greater variety of antigens, but when it comes to actual tissue destruction by CD8+ cytotoxic T cells, a more fine-tuned HLA class I bound, shorter peptide with greater specificity required for presentation, provides the necessary stimulus. 
This would come from the prevention of non-specific binding of peptide antigens to the more ‘discriminating’ HLA class I molecules.
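The growth model described in this section can be sketched numerically. The sketch below is an illustration under stated assumptions, not the equations of the cited model: it takes the steady state count as K = P·c·B·Z, as described in the text, and advances the count with the exact one-step solution of the continuous logistic equation, which is assumed here in place of the iterating form used in equations 1 and 2. All function names and parameter values are hypothetical and in arbitrary units.

```python
import math


def steady_state(b: float, z: float, p: float, c: float = 1.0) -> float:
    """Carrying capacity K = P * c * B * Z (c = 1 for CD8+, 0 < c < 1 for CD4+)."""
    return p * c * b * z


def logistic_growth(k: float, r: float, steps: int, n0: float = 1.0):
    """Iterate a logistic update from n0 towards the steady state k.

    Assumed update: N_t = K * N_(t-1) / (N_(t-1) + (K - N_(t-1)) * exp(-r)).
    """
    counts = [n0]
    for _ in range(steps):
        prev = counts[-1]
        counts.append(k * prev / (prev + (k - prev) * math.exp(-r)))
    return counts


# Hypothetical inputs: B = 1/IC50 for a 25 nM binder, Z and P arbitrary.
B_y, Z_x, P_y = 1 / 25.0, 0.5, 50_000.0

k_cd8 = steady_state(B_y, Z_x, P_y)           # cytosolic antigen, c = 1
k_cd4 = steady_state(B_y, Z_x, P_y, c=0.2)    # 'scavenged' antigen, c assumed 0.2

cd8_counts = logistic_growth(k_cd8, r=0.5, steps=60)
cd4_counts = logistic_growth(k_cd4, r=0.5, steps=60)
```

For identical B, Z and P, the helper T cell clone plateaus at a lower count than the cytotoxic clone because of the factor c, which is the qualitative behavior the text describes for dominant clones.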
Quantifying mHA-HLA-TCR interactions: On matrices, vectors & tensors
Following the above general discussion about T cell behavior, it is necessary to develop a model that will account for the potentially large arrays of antigens being presented in allogeneic SCT. As noted earlier, immunotherapy and SCT are fraught with the risk of treatment failure either in the form of relapsed malignancy or immune mediated normal tissue injury (GVHD or graft rejection). Various outcome prediction algorithms and models have been developed using increasingly sophisticated characteristics studied statistically40, 41. These may allow improvement in clinical outcomes prediction, but often do not provide mechanistic insight into the reason for the observed clinical outcomes. Further, while principles of immune therapy and the mechanisms of T cell action are well known from work on mouse models and in vitro42, 43, when the antigenic complexity encountered in vivo in human SCT recipients is considered, the existing models do not reliably predict individual clinical outcomes. This is also true of the T cell repertoire that emerges following SCT.
Nevertheless, mathematical methods are available that have long been used in physics to understand natural phenomena and may be extrapolated to biological systems such as immune response modeling. For example, the concept of vectors and operators has been used to simulate aggregate T cell clonal responses to antigen arrays22, 23. However, this method is limited in that it requires identification of unique mHA-HLA and cognate TCR for application. To overcome this limitation, a related mathematical method, tensor analysis, may be used to simulate the immune responses to the vast library of tissue specific antigens presented by the entire spectrum of HLA molecules in an individual. In physics, tensors describe interaction between vector quantities and their components, so they enable determination of variation in vector magnitude and direction and subsequent mapping to a different ‘state’. In other words, tensors help describe vector transformation when multiple forces are acting upon an object, which itself may be a vector44, 45, 46. It is important to recognize that these methods have been developed for use in ‘linear’ physical systems, whereas biological systems are seldom linear. They follow nonlinear dynamics such as power laws and exponential growth patterns, which require methodology that can account for the complexity of biologic systems arising from the multiplicity of variables encountered. It is for this reason that tensor methodology may lend itself to the study of the alloreactive immune response problem. In the example at hand, the donor T cell array infused into the recipient may be considered as a vector, which is modified by the interaction between the T cell receptors (TCR) on the donor T cell clones and the recipient mHA-HLA complexes and is transformed to a new state following SCT. The interacting TCR & mHA-HLA complex in this example may be considered as a tensor, modifying the T cell clonal vector.
Tensors remain invariant in different frames of reference and in this application of the concept, the mHA-HLA-TCR interactions, determined by the protein sequences remain constant, regardless of tissues and individuals where the interactions may be occurring. In other words, the unique peptide sequences’ affinity to specific HLA molecules and TCR will remain the same across individuals and tissues. In essence, such an alloreactivity tensor comprised of recipient mHA and HLA, in the presence of donor T cell repertoire influences the relative growth of alloreactive T cell clones versus the non-alloreactive clones. Accordingly, clinical GVHD may or may not manifest.
To understand this notion, consider a basic adaptive immune response to a recipient alloreactive peptide following SCT (or any other antigenic peptide); the first interaction is between the alloreactive recipient peptide and the HLA molecule resulting in the binding and presentation of the peptide on the HLA molecules (Figure 5). Consider two HLA molecules H1 and H2, and two peptides p1 and p2, each recognized by only one of these two HLA molecules; a matrix may be constructed showing the peptides bound to the relevant HLA molecules47. The possible interactions between the peptides p1 and p2 in a system of two HLA molecules H1 and H2, may be depicted in matrix form as follows.
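From the description that follows (vector quantities H1p1, H1p2, H2p1 and H2p2 on the left-hand side, and an identity matrix resulting from the exclusive affinity of H1 for p1 and of H2 for p2), equation 3 can be reconstructed, as an assumption about its layout, in the form:

```latex
\begin{equation}
\begin{pmatrix}
H_1 p_1 & H_1 p_2 \\
H_2 p_1 & H_2 p_2
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 \\
0 & 1
\end{pmatrix}
\tag{3}
\end{equation}
```

Here each entry on the left denotes the interaction of one HLA molecule with one peptide, and the entries on the right encode whether that interaction occurs (1) or not (0).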
The 0 and 1 represent conditionality of interaction between the peptides and HLA. The matrix on the left-hand side of equation 3 represents vector quantities, H1p1, H1p2, H2p1 or H2p2, which have a magnitude (binding affinity, expressed in 1/IC50, nM-1) and a ‘direction’ given by the specificity, i.e. unique affinity of the peptide for the HLA molecule. Given affinity of H1 for p1 and H2 for p2, this interaction yields an identity matrix. The interactions between the peptides and HLA molecules constitute a matrix where peptide recognition and binding by an HLA molecule is represented by 1, and the converse situation by 0. Thus, the numbers 1 & 0 represent the selectivity of peptides with a certain sequence (and commensurate length) for specific HLA and vice versa. These two alloreactive HLA-peptide complexes may then be presented to donor T cell clones by the antigen presenting cells (Figure 4), and specific donor T cell receptors may recognize these unique HLA-peptide combinations and bind. In this example, TCR1 only recognizes H1p1 and TCR2 only recognizes H2p2. The resulting matrices are given below. The right-hand side of equation 4 is a tensor with two vector quantities, the affinity of HLA molecule for the peptide and the affinity of the TCR for the peptide-HLA complex, which may be summarized as follows
The matrix depicted in equation 5 is a tensor of the second rank with two vector quantities, i.e. the affinities B and Z (specific binding between HLA & peptide (B) and between HLA-peptide & TCR (Z)), which are depicted by HpT1,1 and HpT2,2. HpT in this case symbolizes the HLA molecules, peptides and TCR interacting with each other, and the subscripts 1 and 2 are called indices in tensor terminology, identifying interactions between specific molecules (e.g., p1 and p2). The identity matrix reflects the affinity of specific TCR for specific mHA-HLA combinations. It is to be noted that the same peptides given above may bind other HLA molecules with a different affinity, and there may be TCR which bind these alternative antigen complexes with different affinities, constituting different vectors (Figure 5A & 5B). Along the same lines, a given peptide or TCR may interact with different partners yielding different vector components. For example, in the above matrices, TCR1 may interact with both H1p1 and H1p2; the magnitude of the former will be 1 and that of the latter, 0. However, given the continuous nature of the IC50s observed for different peptides with different HLA molecules in the analysis presented in this paper, it is unlikely that the vector magnitudes are going to be binary in nature. The well-known phenomenon of immune cross reactivity is an example of vector components which are not binary48. It is also important to note that the forces (vectors) represented by B (H1p1) and Z (TCR1) may be considered orthogonal (perpendicular) because their direction is imparted by the unique recognition of peptide sequence by HLA, and that of peptide-HLA complex by TCR, respectively. Thus, the growth of the T cell clone resulting from this interaction may be considered a ‘cross’ product of these two forces (sin 90° = 1 for orthogonal vectors) (Figure 5C).
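The non-binary case described above can be made concrete with a small Python sketch. It replaces the 0/1 entries with graded affinities computed as 1/IC50 and forms each HpT component as the product B·Z. The IC50 values, the dictionary layout and the function names are hypothetical, chosen only to show how on-target and off-target components would differ in magnitude.

```python
def affinity(ic50_nm: float) -> float:
    """Affinity as the reciprocal of IC50 (nM^-1), as defined in the text."""
    return 1.0 / ic50_nm


# Hypothetical IC50 values (nM) for peptide binding to HLA.
hla_peptide_ic50 = {
    ("H1", "p1"): 20.0, ("H1", "p2"): 4000.0,
    ("H2", "p1"): 9000.0, ("H2", "p2"): 35.0,
}

# Hypothetical IC50 values (nM) for TCR binding to each HLA-peptide complex.
tcr_complex_ic50 = {
    ("TCR1", ("H1", "p1")): 10.0, ("TCR1", ("H1", "p2")): 2500.0,
    ("TCR2", ("H2", "p2")): 15.0, ("TCR2", ("H2", "p1")): 6000.0,
}


def hpt_components(hla_peptide, tcr_complex):
    """Return B * Z for every (TCR, HLA, peptide) combination supplied."""
    components = {}
    for (tcr, complex_key), z_ic50 in tcr_complex.items():
        b = affinity(hla_peptide[complex_key])   # HLA-peptide affinity, B
        z = affinity(z_ic50)                     # TCR-complex affinity, Z
        components[(tcr,) + complex_key] = b * z
    return components


hpt = hpt_components(hla_peptide_ic50, tcr_complex_ic50)
```

In this toy example the cognate combinations (TCR1 with H1p1, TCR2 with H2p2) have components several orders of magnitude larger than the cross-reactive ones, which remain small but nonzero rather than exactly 0.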
T cell vector transformation: Enter Operators
In the SCT context the alloreactivity tensor, HpT, determines the magnitude (and direction) of the T cell clonal growth vector in response to antigens. T cell clones with receptors TCR1 and TCR2 respectively will grow in response to the HpT tensor. It is to be noted that the HLA-peptide driven T cell clonal growth vector is distinct from the TCR affinity vector for the HLA-peptide complex, even if one considers that the mHA-HLA affinity vector drives clonal growth of the relevant TCR-bearing clone. This relationship is analogous to an applied force resulting in motion at a certain velocity and a consequent mass displacement, which are distinct vector quantities pointing in the same direction (with time being the scalar distinguishing between them; T cell clonal growth is also a time-dependent function). In the above example, the T cell clonal growth vectors, comprising the two T cell clones bearing the T cell receptors TCR1 and TCR2, are termed T1 and T2 respectively. These constitute a vector matrix, which is transformed over time t by the HpT tensor into a new pair of clonal growth vectors.
In equation 6, the T cell clonal vector is transformed by the HpT tensor and the logistic operator, L, previously defined as the logistic equation for T cell growth, which incorporates the term ByZx included in the HpT tensor.
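The action of a logistic operator on a clonal vector can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the paper's equation: the iterative logistic form in `logistic_step`, the scaling of each clone's steady-state size by a normalized B×Z value, and the values of `K_max`, `bz` and `r` are all hypothetical.

```python
import math


def logistic_step(n_prev, K, r, t):
    """Assumed iterative logistic form:
    N_t = N_{t-1} * K / ((K - N_{t-1}) * exp(-r * t) + N_{t-1})."""
    return n_prev * K / ((K - n_prev) * math.exp(-r * t) + n_prev)


def transform_repertoire(T0, K, r, steps):
    """Apply the logistic operator to every clone in the vector T0.

    K[j] is the steady-state size of clone j, assumed here to be set by
    the matching B*Z entry of the HpT tensor (hypothetical scaling).
    """
    T = list(T0)
    for t in range(1, steps + 1):
        T = [logistic_step(n, k, r, t) for n, k in zip(T, K)]
    return T


# Two clones, T1 and T2, each starting from a single cell.
K_max = 1000.0        # hypothetical ceiling on clone size
bz = [1.0, 0.25]      # hypothetical normalized B*Z values for T1 and T2
K = [K_max * x for x in bz]

grown = transform_repertoire([1.0, 1.0], K, r=0.5, steps=30)
arrested = transform_repertoire([1.0, 1.0], K, r=0.0, steps=30)
print(grown)     # each clone approaches its own steady state K[j]
print(arrested)  # with r = 0 the clones do not grow
```

With this assumed form, a clone with the higher B×Z value settles at a proportionally larger steady-state population, and setting r to zero leaves the vector unchanged.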
T cell growth: the effect of co-stimulation, checkpoints and cytokines
In equation 2 the term r, quantifying growth rate, is an aggregate measure of different influences on T cells and may be considered a scalar multiple of a tensor quantity. This term represents the cumulative growth effect of the costimulatory and inhibitory molecules present on the T cells and the cytokines present in the environment. In the dynamical system model of T cell growth, the T cell steady state numbers are determined by TCR-mHA-HLA affinity (BZ), also called ‘Signal 1’. A second critical influence on T cell growth is provided by ‘Signal 2’ (S2), mediated by the costimulatory molecule CD28 and the inhibitory molecule CTLA-4, which may be mathematically represented by CD28 = 1 and CTLA-4 = 0. Additionally, the checkpoint mechanism (CP) comprising the PD1 receptors may be represented by a variable valued at 0 when engaged, because no T cell growth will occur, and at 1 when not engaged. Finally, ‘Signal 3’ (S3) represents the effect of cytokines on T cell growth (Supplementary Figure 2)49, 50, 51. Considering that all these variables contribute to T cell growth, the term r is a composite of the following factors. Solving this equation for lack of PD1 engagement (1) and the presence of CD28 expression (1) yields:
Solving the equation for CTLA-4 expression or PD1 engagement gives r a value of 0, which yields e⁰ = 1 in equations 1 & 2, consistent with suppression of T cell growth. In other words, PD1 engagement by PD-L1, or the engagement of CTLA-4 instead of CD28 by CD80 on the APC, changes r to zero, eliminating the effect of time t; the exponential term in equation 2 then takes the value 1, leading to growth arrest of the T cell clone.
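The binary treatment of Signal 2 and the checkpoint can be written out as a small function. This is a minimal sketch assuming a multiplicative combination of the terms; the function name `growth_rate` and the baseline rate `r_max` are hypothetical and do not appear in the text.

```python
import math


def growth_rate(r_max, cd28_engaged, ctla4_engaged, pd1_engaged):
    """Return r as r_max * S2 * CP (assumed multiplicative form).

    S2 is 1 when CD28 (and not CTLA-4) is engaged, otherwise 0.
    CP is 0 when PD1 is engaged, otherwise 1.
    """
    s2 = 1 if (cd28_engaged and not ctla4_engaged) else 0
    cp = 0 if pd1_engaged else 1
    return r_max * s2 * cp


r_costimulated = growth_rate(0.5, cd28_engaged=True, ctla4_engaged=False, pd1_engaged=False)
r_ctla4 = growth_rate(0.5, cd28_engaged=False, ctla4_engaged=True, pd1_engaged=False)
r_pd1 = growth_rate(0.5, cd28_engaged=True, ctla4_engaged=False, pd1_engaged=True)

t = 10
print(r_costimulated, math.exp(-r_costimulated * t))  # r > 0, exponential term < 1
print(r_ctla4, math.exp(-r_ctla4 * t))                # r = 0, exponential term = 1
print(r_pd1, math.exp(-r_pd1 * t))                    # r = 0, exponential term = 1
```

Either inhibitory signal sets r to zero, so the exponential term equals 1 regardless of t, which corresponds to the growth arrest described above.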
As for S3, the cytokine-mediated signal may also be considered a second order tensor quantity, consisting of a matrix of cytokine and cytokine receptor vectors, because cytokines and their receptors have different magnitudes and varying receptor-specific effects (directionality) on T cell growth and differentiation. Ignoring the di- or trimerization of cytokine-receptor protein subunits, a simplified version of the cytokine tensor may be constructed as follows:
This is the cytokine tensor, Ck, with the example showing the interaction between IL-12 and IL-10 and their respective receptors. It should be noted that cytokines may bind related receptors with different affinities, providing different vector components. The negative sign denotes a growth-suppressive effect. The net effect of cytokines can be either negative or positive and, as a multiple of the CD28-PD1 expression term, Ck can alter the magnitude and direction of the effect of the exponent in equation 2 (by changing the sign of r from − to +). Equation 7 is therefore modified to
Further complicating the estimation in equation 8 from a physical standpoint, cytokine exposure at the cellular level will be variable, since these effects are ‘local’ to the tissue or lymph nodes. Cytokines likely depend on diffusion via capillary action in the extracellular matrix to create a ‘field’ in which the T cells experience the cytokine effects. Because r is an exponent in equations 1 & 2, these effects on growth are exponential in nature52. The receptor expression levels also vary on different cells and confer a direction by influencing the differentiation and functional specificity of the T cell clones with unique TCR.
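One way to picture the cytokine tensor acting on r is to collapse it to a single signed number for a given clone. The text does not specify how this reduction is done, so the following is only a sketch: the weighted-sum rule, the unit effect sizes, and the concentration and receptor values are all assumptions introduced for illustration.

```python
def net_cytokine_effect(ck, concentration, receptor_level):
    """Sum signed cytokine effects, weighted by local cytokine concentration
    and by receptor expression on the clone (assumed reduction rule)."""
    total = 0.0
    for (cytokine, receptor), effect in ck.items():
        total += effect * concentration.get(cytokine, 0.0) * receptor_level.get(receptor, 0.0)
    return total


def growth_rate_with_cytokines(s2, cp, ck_net):
    """r as the product of Signal 2, the checkpoint term and the net cytokine term."""
    return s2 * cp * ck_net


# Simplified tensor from the text: IL-12 promotes growth, IL-10 suppresses it.
ck = {("IL-12", "IL-12R"): 1.0, ("IL-10", "IL-10R"): -1.0}
receptors = {"IL-12R": 1.0, "IL-10R": 1.0}   # hypothetical expression levels

inflamed = net_cytokine_effect(ck, {"IL-12": 2.0, "IL-10": 0.5}, receptors)
regulated = net_cytokine_effect(ck, {"IL-12": 0.5, "IL-10": 2.0}, receptors)

print(growth_rate_with_cytokines(1, 1, inflamed))   # 1.5
print(growth_rate_with_cytokines(1, 1, regulated))  # -1.5
```

The sign of the net term, and hence of r, flips with the local cytokine balance, which is the change of direction the text attributes to Ck.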
Evolution of the T cell repertoire: Putting it all together
The above discussion illustrates the complexity inherent in the multiple factors influencing T cell responses to antigens presented by HLA molecules. Nevertheless, it also makes clear that, despite this complexity, the immune interactions can be described in mathematical terms and can therefore be simulated, especially when antigen presentation data are available. To do so one may take the example of a random collection of tissue-associated peptides. First, consider an alloreactive peptide of any length between 7 and 18 amino acids. This peptide may bind to HLA class I and II molecules (there are six of each). Depending on its size and mode of acquisition (extracellular or cytosolic), it will bind to the relevant HLA molecules with a unique binding affinity. It is to be noted that, depending on the number of binding HLA molecules and the concentration of competing peptides, there will be a probability function associated with each of these interactions. As demonstrated above in equations 4 and 5, the mHA (polymorphic peptide) binding affinities to the available HLA molecules may be considered to represent the components of the immune response vector to this antigen (or degrees of freedom for the peptide). For most peptides, only one component (one HLA-mHA complex) with a strong interaction will be relevant, and others with weak interactions may be ignored. With the peptide bound to one of the HLA molecules (or more, depending on its binding affinity for other HLA molecules), it is presented on the APC. If a T cell clone with a TCR that has affinity for the HLA-peptide complex is present (a second probability term), then, depending on the CD28/CTLA-4 and PD1 expression levels in the T cell clone, it will grow in the cytokine ‘field’ present in the tissue.
Thus, consider peptides (p1, p2…pn) with high affinities B1, B2…Bn for HLA molecules H1, H2…Hn respectively, but with a very low affinity for the non-corresponding HLA molecules present in the individual (e.g., the components p1H2, p2Hn, pnH1, which are not considered here for simplicity of illustration but are fundamental to the tensor concept). If these mHA-HLA complexes have corresponding T cell receptors TCR1, TCR2…TCRm with affinities Z1, Z2…Zm, the tensor HpT may be written as follows:
Here n and m are indices which indicate the HLA-peptide affinity (Bi) and TCR binding affinity to the HLA-peptide complex (Zj). This is the alloreactivity tensor, and it reflects the interaction of the alloreactive peptides with the HLA molecules in that individual and transforms the T cell clonal vector comprised of the array of the T cell clones bearing the above TCR <Tm> according to the logistic function.
This results in the transformation of the infused donor T cell repertoire, with the clones T1, T2…Tm being transformed following transplant. The logistic growth equation provides the rule for transformation, so equation 1 may also be rewritten as follows for the ith HLA-bound peptide, pi, and the responding jth T cell clone <Tj> in a repertoire comprising T cell clones T1 through Tm.
Substituting the value of r from equation 9 in this equation, we get,
The aggregate alloreactive T cell response at time t is then
This general equation describes the transforming effect of the alloreactivity tensor and the cytokine tensor on the T cell repertoire following SCT. The risk of alloreactivity developing clinically will in this instance be proportional to Nt(TM).
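The aggregate response can be approximated by summing the simulated size of every responding clone. The sketch below is illustrative only: it assumes an iterative logistic form, a steady-state size proportional to a normalized HpT entry, and invented values for `K_max`, `r` and the tensor; none of these are taken from the study.

```python
import math


def logistic_step(n_prev, K, r, t):
    """Assumed iterative logistic form for one clone."""
    return n_prev * K / ((K - n_prev) * math.exp(-r * t) + n_prev)


def aggregate_response(hpt, n0, K_max, r, steps):
    """Sum clone sizes over all HLA-peptide (i) and TCR (j) pairs.

    hpt[i][j] is a normalized B*Z value in [0, 1]; the steady state of the
    responding clone is assumed to be K_max * hpt[i][j].
    """
    total = 0.0
    for row in hpt:
        for affinity in row:
            if affinity <= 0:
                continue  # no binding, so this pair contributes no clone
            n = n0
            K = K_max * affinity
            for t in range(1, steps + 1):
                n = logistic_step(n, K, r, t)
            total += n
    return total


# Hypothetical tensor with one cross-reactive (off-diagonal) entry.
hpt = [[1.0, 0.1],
       [0.0, 0.5]]

responding = aggregate_response(hpt, n0=1.0, K_max=1000.0, r=0.5, steps=30)
suppressed = aggregate_response(hpt, n0=1.0, K_max=1000.0, r=0.0, steps=30)
print(responding)  # close to 1000 + 100 + 500
print(suppressed)  # three clones that never expand
```

Comparing the two totals shows how the same antigen array yields very different aggregate responses depending on whether r is permissive or zero.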
Dynamical system model of alloreactive T cell response and clinical observations
Does this model explain observations in clinical transplantation? To determine this, one may consider HLA-DPB1 mismatching and alloreactivity in 10/10 HLA-matched donor-recipient pairs (DRP)53, 54 and, for that matter, the general problem of HLA-mismatched SCT and associated negative clinical outcomes55. In the dynamical system model this phenomenon may be easily understood: the mismatched HLA-DPB1 epitopes are highly expressed, so instead of a fraction of the expressed protein (the term P.c in Eq. 2) governing CD4+ T cell clonal growth, T cell clones bearing TCR that recognize epitopes on HLA-DPB1 encounter an order of magnitude higher target concentration, with a marked amplification of the steady state alloreactive T cell clonal populations compared to a standard HLA class II bound mHA. Indeed, polymorphisms impacting the level of HLA-DPB1 expression correlate with the likelihood of GVHD developing56. Further, any peptides bound to the mismatched HLA will be novel antigen complexes for the donor T cell clones to recognize. This would result in a strong aggregate immune response to the mismatched HLA (and its presented peptides), which is widely expressed, and this response would be significantly larger than a mHA-HLA directed immune response.
Despite the ability of the model to explain some common clinical observations (logistic growth of T cells, power law distributions, and CD4/CD8 clonal distribution), it will not be validated unless it explains the random occurrence of GVHD following allografting. A discussion of this has previously been presented (Koparde et al 2017), where the competition between non-alloreactive and alloreactive peptides for HLA binding and presentation was invoked as a possible reason for this, resulting in a probability distribution (ρHpn) for the alloreactive peptide pn to be presented on HLA molecule H. A further consideration in the development of GVHD from these alloreactive T cell clonal growth simulations is the probability function introduced by peptide cleavage potential, which affects the likelihood of antigen presentation, as well as whether the relevant T cell clones are present following transplantation (ρTm). The probability of peptide cleavage is determined by the amino acid sequence at the C terminal of the peptide antigens57. As such, several peptides in our study may have a low likelihood of presentation and may be ignored to simplify the model. The likelihood of alloreactive antigen response (ρGVHD) may then be calculated as
Computed for each alloreactive peptide, the probability of clonal expansion of the mHA-targeting T cells is significantly diminished as more probability terms are introduced into the computation. This explains why, despite many potential alloreactive antigens being present in each donor and recipient, not every patient develops GVHD.
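The diminishing effect of stacking probability terms can be shown with simple arithmetic. The sketch below assumes that the terms named in the text (HLA presentation, peptide cleavage, presence of the relevant T cell clone) are independent and simply multiply; that assumption, the function name and the example values are not taken from the study.

```python
def response_probability(p_presentation, p_cleavage, p_clone_present):
    """Probability that one peptide elicits a clonal response, assuming the
    three terms are independent and combine by multiplication."""
    return p_presentation * p_cleavage * p_clone_present


# Hypothetical values for a single alloreactive peptide.
p_presentation = 0.5   # peptide wins the competition for HLA binding
p_cleavage = 0.4       # peptide is generated by proteasomal cleavage
p_clone = 0.3          # a matching donor T cell clone was transferred

one_term = p_presentation
two_terms = p_presentation * p_cleavage
all_terms = response_probability(p_presentation, p_cleavage, p_clone)

print(one_term, two_terms, all_terms)  # each added term lowers the probability
```

Under these assumed values the probability falls from one half to a few percent once all three terms are included, which illustrates why a large peptide library need not translate into a clinical response.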
Another well-recognized clinical phenomenon is the T cell growth amplification effect of cytokines. This is seen both in the need for lymphodepletion prior to adoptive immunotherapy and in the cytokine release syndrome that follows it58, 59. Thus far in the dynamical system model discussed above, the cytokine tensor effect has been described as modulating the rate of T cell clonal growth. However, cytokines affect not only the rate but also the magnitude of clonal expansion, amplifying T cell clonal growth. This may be modelled using the iterating equation
This equation demonstrates the effect of the cytokine tensor, Ck, as a time-dependent function, which in the beginning increases the magnitude of T cell clonal growth for clones expressing the relevant cytokine receptors by an order of magnitude. As the number of T cells increases, this time-dependent effect declines to a steady state level, since the cytokines are taken up and utilized by the growing T cell population. This relationship plotted over time demonstrates the familiar T cell antigen response curve and mirrors the effect of antigen presenting cell growth previously described (Supplementary Figure 3) (see Koparde et al23 for a discussion of APC-T cell interactions).
A final consideration in building this model is that the antigen matrices presented above are ‘identity matrices’, with binary values of 1 along the diagonal of a square matrix and 0 elsewhere. In physiologic conditions, however, there will be a continuum of values because of differential binding of peptides to various HLA molecules and cross-reactivity of T cell receptors with such antigen complexes, generating random number matrices rather than identity matrices60. This will add another element of complexity to the antigen-effector interactions, and possibly provides a rationale for the complex GVHD phenotypes observed.
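The contrast between an identity matrix and a graded interaction matrix can be made concrete with a small generator. This is a toy sketch: the function name, the uniform distributions, the value ranges and the choice to keep the diagonal dominant are all assumptions chosen for illustration, not properties measured in the study.

```python
import random


def graded_interaction_matrix(n, off_diagonal_max=0.2, seed=0):
    """Build an n x n matrix with strong diagonal entries (specific
    recognition) and weak random off-diagonal entries (cross-reactivity)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(rng.uniform(0.5, 1.0))
            else:
                row.append(rng.uniform(0.0, off_diagonal_max))
        matrix.append(row)
    return matrix


identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
graded = graded_interaction_matrix(3)

for row in graded:
    print([round(value, 3) for value in row])
```

Unlike the identity matrix, every clone in the graded matrix receives some stimulation from every antigen complex, so the simulated repertoire response is no longer a set of independent one-to-one interactions.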
In conclusion, there is considerable genetic variation between HLA-matched transplant donors and recipients. In silico, this yields a large putative array of recipient mHA bound to both HLA class I and class II molecules, which, when viewed from the frame of reference of responding donor T cells, may be used to develop a mathematical model that allows simulation of generalized T cell responses in allograft recipients. The quantitative understanding of alloreactivity thus gained may allow greater precision in donor selection and management of immunosuppression.
Acknowledgements
This study conducted at Virginia Commonwealth University’s Massey Cancer Center was supported, in part, by research funding from the NIH-NCI Cancer Center Support Grant (P30-CA016059; PI: Gordon Ginder, MD) and by research funding from Virginia’s Commonwealth Health Research Board Grant #236-11-13 (PI: Michael Neale, PhD). We acknowledge Allison Scalora and David J Kobulnicky for collection of clinical outcomes data and analysis. Sequencing and Bioinformatics Analysis was performed in the Genomics Core of the Nucleic Acids Research Facilities at VCU, supervised by GB. VK performed bioinformatic analysis of the sequencing data to identify unique peptides and their HLA binding affinity, as well as tissue expression and wrote the paper. MS performed sequencing on samples identified and procured by CR. AS analyzed the data, performed statistical analysis and wrote the paper. AS, CH and MJ-L created data files with unique peptides and HLA IC50 values and wrote the paper. AT, MN, GB developed the WES study. AT developed the vector-operator and tensor models and wrote the manuscript. JR, RQ, MN, SH, DW and BAR critically reviewed the manuscript and provided expert commentary. All the authors contributed to writing the manuscript.