ABSTRACT
Birdshot Uveitis (BU) is a blinding inflammatory eye condition that only affects HLA-A29-positive individuals. Genetic association studies linked ERAP2 with BU, an aminopeptidase which trims peptides before their presentation by HLA class I at the cell surface, which suggests that ERAP2-dependent peptide presentation by HLA-A29 drives the pathogenesis of BU. However, it remains poorly understood whether the effects of ERAP2 on the HLA-A29 peptidome are distinct from its effect on other HLA allotypes. To address this, we focused on the effects of ERAP2 on the immunopeptidome in patient-derived antigen presenting cells. Using complementary HLA-A29-based and pan-class I immunopurifications, isotope-labelled naturally processed and presented HLA-bound peptides were sequenced by mass spectrometry. We show that the effects of ERAP2 on the N-terminus of ligands of HLA-A29 are shared across endogenous HLA allotypes, but discover and replicate that one peptide motif generated in the presence of ERAP2 is specifically bound by HLA-A29. This motif can be found in the amino acid sequence of putative autoantigens. We further show evidence for internal sequence specificity for ERAP2 imprinted in the immunopeptidome. These results reveal that ERAP2 can generate an HLA-A29-specific antigen repertoire, which supports that antigen presentation is a key disease pathway in BU.
Introduction
Birdshot uveitis (BU) is a rare form of uveitis characterized by distinctive inflammatory foci across the retina, hypopigmented choroidal lesions, and cystoid macular edema, which causes visual impairment when undertreated1,2. Infiltration of T cells and elevated levels of T cell cytokines in eye tissues of patients suggest that T cell-mediated inflammation is among the driving disease mechanisms3–6. This is further supported by the fact that all patients with BU carry at least one copy of the Human leukocyte antigen (HLA)-A*29 allele, now widely considered as a prerequisite for diagnosis7,8. How HLA-A29 causes BU has remained unsolved, however, genetic association studies identified that in addition to the extreme association with the HLA-A*29:02 allele, polymorphisms in endoplasmic reticulum aminopeptidase (ERAP)-1 and ERAP2 confer strong disease risk9,10. Within the endoplasmic reticulum, ERAP aminopeptidases destroy or trim peptides to a length that is considered to influence their binding to HLA class I and presentation at the cell surface11. Importantly, of the two major haplotypes of ERAP2, the haplotype associated with canonical full-length ERAP2 (termed Haplotype A) is associated with BU9. The other common haplotype (haplotype B) encodes a transcript that undergoes alternative splicing and nonsense-mediated RNA decay, resulting in undetectable ERAP2 protein12. Because the risk haplotypes of ERAP genes for BU have been shown to result in lower cellular expression and activity of ERAP1 in combination with high cellular expression of functional ERAP210, it is likely that ERAP2 generates a so far unknown, but highly HLA-A29-restricted antigen repertoire that dictates T cell-or NK cell responses. This renders antigen processing and presentation a key disease pathway in BU9.
ERAPs trim the N-terminal residues of peptide substrates by loading the entire substrate inside the enzyme’s cavity where the sum of interactions of amino acid side chains are considered to determine the rate and outcome of peptide proteolysis13,14. Both ERAP1 and ERAP2 have been shown to have preferences for the internal sequence of the peptide, although these preferences are broad and no specific motif has been identified 13–16. These and other observations17 support that ERAPs predominantly modulate the ‘free’ peptide cargo before binding to HLA, which suggests that physiologically-relevant sequence specificities for ERAP2 may be deciphered from the presented peptide repertoire.
Mass-spectrometry based peptidomic studies of model high-passage cell lines have revealed that ERAPs can influence the peptide repertoire presented by HLA-A2918,19. However, to date, no studies have been conducted that studied the interaction of the major genetic risk haplotypes for ERAP1, ERAP2, and HLA-A*29:02 simultaneously in patient-derived tissues and compared the effects of ERAP2 on HLA-A29 to the other competing alleles expressed by the same cell. Knowing the potential effects of ERAP2 across HLA class I alleles is important to be able to separate potential disease effects from canonical antigen processing in studies of the immunopeptidome and may help predict the outcome of pharmacological interference of ERAP2 activity using small molecule inhibitors20.
We generated patient-derived lymphoblastoid cells that naturally express high levels of HLA and ERAPs, in which we stably expressed an autoantigen for BU (i.e. the retinal S-antigen, which is only expressed in the retina). An advantage of using lymphoblastoid cells is that they express high levels of the immunoproteasome (e.g., LMP7 subunit)21, which is also highly expressed in photoreceptors of the retina where the immunoproteasome is essential for the maintenance of normal retinal function and vision transduction22. The use of newly established low-passage patient-derived antigen presenting cell lines better preserves the genetic architecture critically involved with BU in the context of physiologically relevant antigen processing.
In this study, we compared the immunopeptidomes of ERAP2-wild-type and ERAP2-deficient cells using mass spectrometry profiling of elutions from immunopurification with a HLA-A29-binding antibody and subsequent pan-class I antibody. Using several unbiased computational analyses, we accurately dissect the immunopeptidomes of HLA-A29 and other allotypes, which revealed commonly shared effects on position (P)1 and P7 of peptides across alleles, and hitherto unknown, specific effects on P2 in the HLA-A29 immunopeptidome with potential implications for the disease mechanisms of BU.
Materials & Methods
Generation of patient-derived EBV-immortalized B cell lines
EBV-immortalized lymphoblastoid cell lines (EBV-LCL) were generated from peripheral blood mononuclear cells (PBMC) from Birdshot patients, from which we selected a cell-line from a female patient (80 years old during sampling) homozygous for the risk haplotypes for ERAP1 (Hap10/Hap10) and ERAP2 (HapA/HapA)10. B95-8 marmoset-derived EBV supernatant was a kind gift from Dr. Willemijn Janssen, Center for Translational Immunology, UMC Utrecht. Cryopreserved PBMC were thawed and the cell number was determined. In a 24-well plate, 5-106 cells were plated and cultured in freshly thawed EBV supernatant overnight at 37°C, 5% CO2. The next day, transformation-medium (RPMI 1640 + 10% FBS + 1μg/ml cyclosporine) was added into the wells. The EBV-infected cells were observed under the microscope to look for transformed LCLs in clusters. Patient-derived cell lines were cultured in Roswell Park Memorial Institute 1640 medium (RPMI 1640, Thermo Fisher Scientific) supplemented with 10% heat-inactivated fetal bovine serum (FBS, Biowest Riverside) and 1% penicillin/streptomycin (Thermo Fisher Scientific). To obtain stable cell lines overexpressing S-antigen, EBV-LCLs were transduced with the concentrated lentiviral supernatants (see Supplemental Info).
ERAP2 KO using CRISPR-Cas9
For the generation of ERAP2 KO EBV-LCLs the Alt-R CRISPR-Cas9 system (Integrated DNA Technologies) was used and cells were electroporated with the Neon Transfection System (Thermo Fisher Scientific). First, the RNP complex was assembled by combining the crRNA CTAATGGGGAACGATTTCCT with the Alt-R tracrRNA (at a ratio of 1:1) and incubated at 95°C for 5 min, cooled down at room temperature and mixed with the Alt-R S.p Cas9 Nuclease and Buffer R (Neon system). After incubating the RNP complex for 10 minutes at room temperature, 8×105 EBV-LCLs were mixed with the crRNA:tracrRNA-Cas9 complex and electroporated with two pulses of 1100 V and 30 ms each using the 10 μl Neon pipette tip. Electroporated cells were immediately taken up in antibiotic-free medium and cultured for minimal 7 days. This procedure was repeated for 3 times before ERAP2 protein expression levels were analyzed by western blot. A total of 5 rounds was required to reduce to levels of ERAP2 expression to near undetectable levels (Figure 1B).
Cell Culture and HLA-Peptide Immunopurification
For stable isotope labeling by amino acids in cell culture (SILAC), EBV-LCLs were cultured in customized RPMI with the same formula but lacking the two amino acids tyrosine and phenylalanine (Thermo Fisher Scientific) and with dialysed FBS (Thermo Fisher Scientific) in order to avoid unlabeled (i.e., ‘light’) amino acid carry-over. The medium was supplemented with L-Tyrosine-13C9,15N (Sigma Aldrich) and L-Phenylalanine-13C9,15N (Cortecnet). Wildtype EBV-LCLs were cultured with the customized medium (‘heavy’ labeled) and ERAP2-KO EBV-LCLs were cultured in RPMI with 10% non-dialyzed FBS (‘light’, without the labeled amino acids). Two independent experimental cultures were performed; Biological replicates were defined as two separate experiments starting from the CRISPR-Cas9-mediated ERAP2-KO (i.e., independent SILAC-cultures, immunopurification, elution and mass spectrometry profiling). In each experiment, cells from each condition were cultured to obtain 1×109 cells in total per cell line. Cell pellets were stored at −20°C before mass spectrometry was performed. HLA class I molecules were isolated using standard immunoaffinity purification (IP) as described before23 from a fixed sample volume of 2.0◻ml cell pellet per condition and biological replicate. IP was done using the human monoclonal antibody (mAb) DK1G8 (IgG1)24 derived from a HLA-A29-negative multiparous woman sensitized to HLA-A29 due to pregnancy, which specifically binds to 63-L-63-Q epitope in HLA-A*29:01 and A*29:02 and the very rare allele A*43:01, in a single antigen bead test. (https://www.epregistry.com.br/index/databases/database/ABC/), and a pan-HLA class I-specific mAb W6/32. Cell pellets from light and heavy labeled cell lines (ERAP2-WT and ERAP2-KO conditions) were combined and stored at −80 °C until mass spectrometry analysis.
HLA-A29-binding and W6/32 antibodies
The hybridoma cell line producing HLA-A29-binding mAb DK1G8 was cultured in protein-free hybridoma medium supplemented with penicillin/streptomycin and L-glutamine in roller bottles. Cell culture supernatant was treated with Protein-A Sepharose beads to capture the mAb and eluted with glycine pH 2.5. Eluted mAb was covalently bound to Protein-A with dimethylpimelimidate for use in an immunoaffinity column (HLA-A29-Protein-A, W6/32-Protein-A Sepharose at 2.5 mg/ml). The columns were stored in PBS pH 8.0 and 0.02% NaN3 at 4 °C. HLA-bound peptides were extracted as described previously23.
Isolation of HLA Class I–presented Peptides
The extraction of peptides associated with HLA class I molecules was performed as described elsewhere23. Briefly, pellets from a total of 2 × 109 LCLs were lysed for 2 hours at 4 °C in 50 mm Tris-HCl, 150 mm NaCl, 5 mm EDTA, and 0.5% Zwittergent 3-12 (N-dodecyl-N,N-dimethyl-3-ammonio-1-propanesulfonate) (pH 8.0) and the presence of Complete® protease inhibitor (Roche). The preparation was centrifuged for 10 min at 2500 rpm and 4 °C and supernatant was transferred to a new tube and centrifuged for 40 min at 30,000 x g and 4 °C. The supernatant was pre-cleared with a 2-ml CL4B column and subjected to the immunoaffinity column (2ml with 5 mg ml). After washing, bound HLA class I–peptide complexes were eluted from the column and dissociated with 10% acetic acid. Peptides were separated from the HLA class I molecules via passage through a 10 kDa membrane (Microcon YM-10). The filtrate was freeze dried, dissolved in 50mM NH4HCO3 pH 8.4 and the peptides were further purified via ‘high pH reverse phase’ fractionation on a C18 column (Oasis HLB, Waters, Milford, MA). The peptides were eluted from the C18 Oasis column with successively 400 μl 10/90/0.1, 20/80/0.1 and 50/50/0.1 water/acetonitrile (ACN)/formic acid (FA), v/v/v.
MS analysis
Peptides were lyophilized, dissolved in 95/3/0.1 v/v/v water/acetonitrile/formic acid and subsequently analysed by on-line C18 nanoHPLC MS/MS with a system consisting of an Easy nLC 1200 gradient HPLC system (Thermo, Bremen, Germany), and a LUMOS mass spectrometer (Thermo). Fractions were injected onto a homemade precolumn (100 μm × 15 mm; Reprosil-Pur C18-AQ 3 μm, Dr. Maisch, Ammerbuch, Germany) and eluted via a homemade analytical nano-HPLC column (30 cm × 50 μm; Reprosil-Pur C18-AQ 3 um). The gradient was run from 2% to 36% solvent B (20/80/0.1 water/acetonitrile/formic acid (FA) v/v) in 120 min. The nano-HPLC column was drawn to a tip of ∼5 μm and acted as the electrospray needle of the MS source. The LUMOS mass spectrometer was operated in data-dependent MS/MS mode for a cycle time of 3 seconds, with a HCD collision energy at 32 V and recording of the MS2 spectrum in the orbitrap. In the master scan (MS1) the resolution was 60,000, the scan range 300-1400, at the standard AGC target @maximum fill time of 50 ms. Dynamic exclusion was after n=1 with an exclusion duration of 20s. Charge states 1-3 were included. For MS2 precursors were isolated with the quadrupole with an isolation width of 1.2 Da. Precursors of charge 1 were selected in the range of 800-1400, precursors of charge 2 were selected in the range 400-800, and precursors of charge 3 were selected in the range 300-600. The first mass was set to 110 Da. The MS2 scan resolution was 30,000 at the standard AGC target of 50,000 @dynamic injection time.
In a post-analysis process, raw data were first converted to peak lists using Proteome Discoverer version 2.1 (Thermo Electron), and then submitted to the Uniprot Homo sapiens canonical database (67911 entries), using Mascot v. 2.2.07 (www.matrixscience.com) for protein identification. Mascot searches were with 10 ppm and 0.02 Da deviation for precursor and fragment mass, respectively, and no enzyme was specified. Methionine oxidation was set as a variable modification.
Differential expression analysis
Peptide confidence False Discovery Rates (FDRs) were calculated with the Mascot Percolator25 plug◻in in Proteome Discoverer version 2.1 (Thermo Electron) and we used a strict target FDR of 1% (q<0.01) to obtain peptides detected with high confidence. To retrieve labeled peptides for downstream analysis, the high confidence peptides were further filtered to remove peptides with flags “InconsistentlyLabeled”, “NoQuanValues”, “Redundant”, “IndistinguishableChannels”. To detect significant changes in ligand abundance, we used the empirical Bayes workflow for mass spectrometry data based on the limma26 and qvalue27 R packages following Kammers and associates28 (see Supplemental Info). The qvalue R package was used to provide an unbiased estimate of the false discovery rate (FDR). Changes in peptide abundance between light and heavy conditions below a moderated q<0.01 (i.e., 1% empirical FDR) was considered affected by ERAP2. After differential expression analysis, peptides were assigned to HLA alleles using the HLAthena algorithm29, a state-of-the-art neural-network prediction algorithm trained on mass-spectrometry derived peptides from 95 mono-HLA expressing cell lines, which provides the binding score metric ‘MSi’ for each peptide and corresponding allele (range [0,1], MSi >0.6 was considered good, MSi>0.8 was considered strong). We used the GibbsCluster 2.0 server30 to deconvolute the detected 9-mers into a deconvolution solution of maximum three clusters (seeds=5, A=0.7, a=5, t=3). We picked a three-cluster solution that best matched the canonical binding motifs of the HLA-A alleles HLA-A*29:02 (Pfl-Tyr/Y or Phe/F) and HLA-A*03:01 (Pfl-Lys/K or Arg/R).
Non-metric mutidimensional scaling of peptides
Non-metric multidimensional scaling of 9-mers using entropy-weighted (MolecularEntropy() function from HDMD R package31 peptide distances in two-dimensional space was conducted following the method of Sarkizova and associates29,32. This method uses a Hamming distance calculated with an amino acid substitution matrix (adapted from Kim et al. 33) that is inversely weighted according to positional entropy to obtain the pairwise “distance” between 9-mers. To map the peptide distances in two dimensions, for each analysed HLA allele, non-metric multidimensional scaling (NMDS) was used with 10 separate ordinations of 500 iterations using the nmds() function from the ecodist R package34. The configuration with the least stress was used for visualization of the peptidome. We next used density-based spatial clustering of applications with noise (DBSCAN)35 within the fpc R package36 to cluster peptides using the elbow method (KNNdisplot function() in dbscan R package35 to estimate the number of clusters that fit the data. Sequence logo plots were generated using the ggseqlogo R package37. The positional amino acid usage differences were calculated by determining the count for each amino acid at indicated positions (e.g., P1, P2) in the peptides using the MolecularEntropy() function from the HDMD R package and a fisher exact test was used (fisher.test () function in r base) to assess the differences at indicated positions. A chi-squared test (chisq.test() in r base) was used to assess for differences in the number of ERAP2 affected peptides per cluster. All P values were adjusted (termed Padj) using the bonferroni method as indicated. A grand average of hydropathicity (GRAVY) hydrophobicity index on the Kyte-Doolittle scale for each peptide was calculated with the hydrophobicity() function in Peptides R package38. Differences in binding scores and hydrophobicity index were assessed using the dunnTest() function in the FSA R package39.
Western Blot analysis
Protein levels of S-antigen, ERAP1 and ERAP2 were analysed using western blotting. Total cell lysates were prepared using the NP40 lysis buffer (1% NP40, 135 mM NaCl, 5 mM EDTA, 20 mM Tris-HCl, pH□=□7.4) complemented with complete protease inhibitor cocktail (Roche). Protein lysates (10 μl/lane) were separated on a 4-20% Mini-PROTEAN TGX gel (Bio-Rad Laboratories) and transferred to a polyvinylidene difluoride membrane (Immobilon-P PVDF, Millipore). Membranes were blocked in 5% nonfat dry milk in TBST and probed overnight at 4°C with antibodies recognizing ERAP1 (AF2334, R&D Systems), ERAP2 (AF3830, R&D Systems), S-antigen (α-mGFP, TA180076, Origene, to detect the fusion protein S-antigen-GFP) or α-tubulin (T6199, Sigma). After washing, membranes were incubated with anti-mouse secondary antibody conjugated to horseradish peroxidase (HRP) (DAKO) or anti-goat secondary antibody conjugated to HRP (DAKO). Protein bands were detected with Amersham Prima Western Blotting (RPN22361, GE Healthcare) on the ChemiDoc Gel Imaging System (Bio-Rad Laboratories). The ratio of the intensity was calculated using Image Lab 5.1 (Bio-Rad Laboratories) for each experiment.
High-density SNP-array analysis
SNP-array copy number profiling and analysis of regions of homozygosity were performed on DNA isolated from WT and CRISPR-Cas9 edited LCLs (ERAP2-KO) according to standard procedures using the Infinium Human CytoSNP-850K v1.2 BeadChip (Illumina, San Diego, CA, USA). Samples were scanned using the iScan system (Illumina). Subsequently, visualizations of SNP-array results and data analysis were carried out using NxClinical software v5.1 (BioDiscovery, Los Angeles, CA, USA). Human genome build Feb. 2009 GRCh37/hg19 was used.
Data availability
Analysis code, genotype data, and supporting data files can be found at https://github.com/jonaskuiper/ERAP2_HLA-A29_peptidome. Mass spectrometric raw data has been deposited in the MassIVE depository (MassIVE dataset XXXXX) under the creative commons zero license (CC0 1.0).
Results
Generation of a model for ERAP2-mediated antigen processing and presentation
We generated lymphoblastoid cells (LCLs) from a HLA-A*29:02-positive Birdshot patient homozygous for risk haplotypes of ERAP1 (Hap10/Hap10) and ERAP2 (HapA/HapA)(Figure 1A)9,10 and the retinal S-antigen was stably expressed by lentiviral transduction (see Supplemental Notes). Genotyping of the patient revealed HLA-A*29:02, HLA-A*03:01, HLA-B*40:01, HLA-B*44:03, HLA-C*16:01, and HLA-C*03:04 alleles. Because the risk allotype of ERAP1 shows relatively low aminopeptidase activity10, we focused our analysis on the effects of ERAP2 on the immunopeptidome. We used CRISPR-Cas9 ribonucleoprotein delivery with a guideRNA targeting exon 2 in ERAP2 (Figure 1A) to disrupt protein expression of ERAP2 and generate an ERAP2-KO LCL, while preserving the protein expression of ERAP1 (Figure 1B). SNP-array copy number profiling and analysis of regions of homozygosity were performed using the Infinium Human CytoSNP-850K capable of detecting genomic gains and losses with an approximate resolution of ~10 kb by profiling 850,000 single nucleotide polymorphism (SNP) markers spanning the entire genome. SNP-array analysis resulted in a normal female array profile (arr(1-22,X)x2) for both cell conditions (Figure S1), and detected no changes between the WT and ERAP2-KO clones, including the ERAP region 5q15 (Figure S2). This confirms that our editing strategy did not introduce wide spread genomic changes and thus that the conditions are highly suitable for comparison. Genotype date for 92 SNPs at 5q15 for these cell lines is shown in Table S1.
Next, we used stable isotope labeling by amino acids in cell culture (SILAC) to incorporate “heavy” L-Tyrosine-13C9,15N (Tyr/Y) and L-Phenylalanine-13C9,15N (Phe/F) in the ‘wild type’ (WT) LCLs and compare these to unlabeled (“light”) culture conditions for the ERAP2-KO LCL cells (Figure 1C, 1D). The amino acids Y/F are observed in 95% of previously identified HLA-A29 ligands (Figure 2A), but are also found in the majority of peptides presented by the other HLA allotypes - with exception of HLA-B40:01.
Capture of a high-quality HLA-A29 peptidome
Using the HLA-A29-binding antibody, a total of 2315 unique peptides were identified with high confidence (Mascot Percolator q<0.01) between biological replicates (Jaccard similarity = 0.64)(Figure 1D) that were used for further analysis. These were predominantly 9-11 mers (88%), which fits the length distribution40 of HLA-A29 ligands (Figure 2B). The HLA-A29-binding antibody may weakly cross-react with other HLA-A allotypes (see Methods). This is of relevance given that HLA-A*29 alleles are low expressed HLA-A alleles41 compared to high expressed HLA-A*03 alleles. We used GibbsCluster 2.0 for unbiased clustering of the peptides, which found a deconvoluted solution that consisted of three clusters; two motifs fitting the canonical HLA-A29:02 binding motif (C-terminal position Y or Pfl-Y) and one cluster highly similar to the dominant HLA-A03:01 motif (Pfl Lysine (K) or Arginine (R)(Figure 2C) and shows that the HLA-A29 antibody cross-reacts with HLA-A03:01. Indeed, when we used the HLAthena algorithm29, ligands in cluster 1 and 2 were predominantly assigned to HLA-A29:02 (84% and 86%, respectively), and 93% of ligands in cluster 3 were assigned to HLA-A03:01 (Figure 2D). However, because 66% and 20% of peptides in clusters 1 and 2 assigned to other endogenous HLA alleles also showed high binding scores for HLA-A29:02 (Figure 2E), we later choose to filter the dataset using bindings scores for HLA-A29:02 (Figure 2G).
Because we were interested in determining significant changes in peptide abundance associated with ERAP2, we first jointly analyzed the relative abundance (fold change) of light (KO) over heavy (WT) labeled peptides from both experiments using limma28. A total of 1,896 peptides (Figure 2F) were detected in both light and heavy channels and used for analysis. Analysis of peptides unique to one of the conditions is shown in the Supplemental Info. Note that the log fold changes of pooled normalized peptides abundances from light and heavy channels by limma strongly correlate (spearman r = 0.95) with the light/heavy ratio abundance of each experiment (Figure 2F), thus the normalization steps preserve the data structure, while improving the power to detect significant changes28. From the 1,330 8-to 11-mers HLA-A29 epitopes (MSi>0.6 by HLAthena)(Figure 2G), 1,195/1,330 (89%) of the peptides in our HLA-A29 dataset have been reported as ligands for HLA-A29 of which 78% detected in mono-allelic or homozygous HLA-A29-expressing cell systems 18,29, supporting the notion that the approach taken yields an accurate representation of the peptide-presenting properties of HLA-A29:02.
ERAP2 shapes P1 of HLA-A29 ligands
At a false discovery rate of 1%, in ERAP2-WT compared to ERAP2-KO cells, a total of 226 peptides were detected at decreased abundance in the binding groove of HLA-A29 (termed ERAP2-“sensitive” peptides), and 228 peptides were increased in abundance (termed ERAP2-“dependent” peptides) (Figure 2H and Table S2). We detected the 9-mer VTLTCAFRY from retinal S-antigen, which was ~6-fold higher (Log2[FC] = 2.45) in ERAP2-KO cells compared to ERAP2-WT cells, indicating ERAP2 destroys this epitope (Figure 2H and Table S2). We observed moderate changes in the length distribution (Figure 2I and Figure S3) or predicted binding affinities of peptides affected by ERAP2 (Kruskal-Wallis P = 0.06)(Figure 2J). In contrast, comparison of the peptide motifs revealed evident and consistent changes at the N-terminal amino acid positions for ERAP2-sensitive 9-11 mer peptides compared to peptides not affected by ERAP2 (Figure 2K, Figure S4), which aligns with the current view that ERAP2 trims the N-terminal amino acids of peptide substrates13. In detail, P1 of 9-mers revealed a contrasting residue preference for ERAP2-sensitive and ERAP2-dependent peptides (Figure 3A); Alanine(A), K, and R amino acids were seen significantly more often, while amino acids Y and F were seen significantly less often in sensitive peptides compared to non-affected peptides (Fisher’s Exact test, Padj <0.05, Table S3). In contrast, the most common P1 residues for dependent and non-affected peptides were Y and F (Y/F at P1; 45% and 30%, respectively) with F statistically more abundant at P1 and P2 in dependent peptides (Figure 3A, Table S3-4). Intriguingly, we detected no significant effects of ERAP2 at the N-terminal residue of the precursor peptide (position P-1)(Figure S5). Together these data show that ERAP2 has a selective effect on P1 of the HLA-A29 immunopeptidome in part by driving the depletion of peptides with preferred P1 substrates (e.g., A,K,R)42 of ERAP2. This finding is consistent with previous reports that ERAP2 has primarily a destructive role by over-trimming susceptible peptide sequences and thus removing them from the immunopeptidome42.
ERAP2 increases the abundance of peptides with a cryptic aromatic P2 motif
ERAP2 trims peptides by sequestering them into the relatively large internal enzyme cavity13, where peptide side chains across the amino acid sequence can interact with pockets inside the cavity of ERAP213,14. To evaluate if sequence-specific selectivity16 by ERAP2 could be interpreted from the HLA-A29 peptidome, we conducted non-metric multidimensional scaling (NMDS) of all 9-mers29. This analysis projects peptides in two-dimensional space based on the similarity of the amino acid sequences (Figure 3B). Considering peptides with significant changes between ERAP2-WT and - KO conditions revealed distinct patterns for co-clustered (“similar”) peptides, with ERAP2-sensitive peptides located ‘away’ from ERAP2-dependent peptides (Figure 3C). To quantify these differences, we compared the amount of ERAP2-sensitive (Figure 3B in blue, n=155) versus ERAP2-dependent peptides (in red, n=164) across five clusters of peptides or ‘submotifs’32. This analysis revealed that ERAP2-sensitive peptides were overrepresented in cluster 3 (Χ2, Bonferroni n=clusters, Padj = 0.046) and ERAP2-dependent peptides overrepresented in cluster 2 (Padj = 5.1 × 10−6)(Figure 3D). Cluster 2 (n=172 in total) was defined by nonpolar aromatic residues F (Padj = 1.0 × 10−49), or Y (Padj = 2.1 × 10−22) at P2 (F/Y in 97% of peptides in cluster 2 compared to 13% of peptides in all other clusters). ERAP2-dependent peptides (n=53) made up a considerable proportion of cluster 2 (unaffected peptides; n=106).
Peptides in cluster 3 (n=356) were distinguished by a L at P7 (99% of peptides in cluster 3 compared to 3% in other clusters, Padj = 2.0 × 10−230)(Figure 3C and Table S5). Peptides in cluster 3 showed an overall higher binding score for HLA-A29:02 and higher hydrophobicity index compared to cluster 2 (Figure 3E). Note that submotifs cluster 2 and cluster 3 are bona fide submotifs of HLA-A29 that are highly reproducible in other datasets (cluster 1 and 4 in Figure S6 and cluster 1 and 3 in Figure S7). We further replicated our findings in HLA-A29 immunopeptidome data from Sanz-Bravo et al.18 of ERAP2-competent and naturally ERAP2-deficient HLA-A29-positive cell lines (Figure S6) and, thus, demonstrate that ERAP2-positive cell lines commonly display selectively increased peptides with the motif of cluster 2. In contrast, ERAP1 did not selectively contribute to cluster 2 peptides (Figure S7). Also, the effect of ERAP1 and ERAP2 on HLA-A29 peptides correlated weakly (spearman rho=0.12, Figure S8), suggesting non-redundant effects for ERAP1 and 2 on the HLA-A29 peptidome. Although the analysis for 10-mers (n=235) in our dataset was considered to lack sufficient resolution to map the effects of ERAP2 on the submotif level, most of the ERAP2-dependent 10-mers also mapped to a submotif of HLA-A29 with F at P2 (Figure S9). In summary, ERAP2 selectively increases the expression of HLA-A29-binding peptides with a submotif with aromatic residues at P2.
ERAP2-dependent peptides of cluster 2 are selective for HLA-A29
HLA class I peptides display promiscuity43 and it is therefore of interest that HLA-A03:01 can present peptides with a Y at Pfl (similar to HLA-A29) only with L at P7 is present40. As expected, peptides from cluster 3 (Figure 3C) were also predicted as potential binders for HLA-A03:01, while cluster 2 peptides (Figure 3C) were not (Figure S10). To further test the HLA allotype restriction, we compared the binding scores for the differentially expressed peptides in cluster 2 and 3 for eight alleles which display binding motifs that overlap with HLA-A29:02 (based on Sarkizova et al.29). As shown in Figure 3F, ERAP2-sensitive peptides in cluster 3 show relatively good (MSi>0.8) binding scores for several other alleles (e.g., HLA-A30:02). Note that the S-antigen peptide VTLTCAFRY (in cluster 1, Figure 3B,3C, 3F) also shows good binding scores for other alleles (e.g., HLA-A30:02 MSi = 0.86). In contrast, ERAP2-dependent peptides from cluster 2 are predicted to poorly bind other class I alleles (median MSi<0.6)(Figure 3F). We extended this analysis to 95 alleles, which supported that the ERAP2-dependent peptides in cluster 2 are highly specific for HLA-A29 (Figure S11), with the exception of HLA-C*14:03 (>100 times lower allele frequency compared to HLA-A*29:02 in European populations) (Table S6). The motif of cluster 2 peptides is present in the amino acid sequences of proteins encoded by ~300 genes highly expressed in the retina (Table S7), of which putative HLA-A29-restricted peptides (MSi>0.9 for HLA-A29 and MSi<0.6 for 94 other alleles) were found in key factors in melanocyte biology (ARMC9, OCA2, SLC45A2, PLXNC1)(Table S8). This is of significance, because progressive loss of ocular melanocytes is a hallmark feature of BU2,5,8,50,51. We conclude that these data support that ERAP2 may apply selective pressure on the repertoire of HLA-A29.
ERAP2 has similar effects on P1 across the HLA class I immunopeptidome
Next, we were interested to see how ERAP2 affects the global peptidome of the other class I alleles. We use the flow-through of the HLA-A29-binding antibody immunopurifications to capture HLA class I molecules (Figure 1D). After filtering, a total of 10,233 unique peptides were identified between biological replicates (Jaccard similarity = 0.73) of which 6,678 8-11 mers were considered for differential expression analysis (Figure 1D)(Table S9). A total of 2,170 peptides were differentially expressed (Table S9). Notwithstanding allele-specific differences, K, R, and A were seen more often at P1 of ERAP2-sensitive peptides, while F and Y were typically underrepresented across the other five alleles (Figure 4)(Tables S10-S14). This was supported by a global assessment of all 9-11-mers (Figure S12). These results indicate that ERAP2 has globally similar effects on P1 across HLA allotypes and in line with the observation that the P1 across HLA class I ligands is enrichment for residues A, K, and R 32.
Internal sequence preferences of ERAP2 can be interpreted from the immunopeptidome
We further conducted NMDS of the 9-mers for HLA-A03:01, HLA-B40:01, and HLA-B44:03 (Figure 5A, D, G and Figure S13). The HLA-C peptidomes captured were too sparse to provide sufficient resolution. Investigation of HLA-A03:01 was hampered by a relatively high level of submotifs, characteristic for this allele 29,32, in comparison to the density of the peptide data (Figure 5B), possibly due to loss of peptides by the initial immunopurification (Figure 1D). Regardless, ERAP2-sensitive peptides were enriched in cluster 4 (Χ2, Padj = 9.5 × 10−3)(Figure 5C), but 37/92 (40%) peptides of cluster 4 were also in the HLA-A29 peptidome (20 in cluster 3 of HLA-A29:02, Figure 3C). Reanalysis of immunopeptidome data from mono-allelic cell lines29 support that HLA-A29:02 and HLA-A03:01 can each present peptides with the motif of cluster 4 (Figure S14) and demonstrates that ERAP2 influences multiple alleles in part by peptide promiscuity. Considering the other clusters, no evidence for effects of ERAP2 beyond P1 could be observed.
In contrast to HLA-A03:01, strong residue preferences at P2 and PΩ of HLA-B40:01 resulted in few submotifs (Figure 5D). The distribution of ERAP2-sensitive ‘away’ from dependent peptides in two-dimensional space was reminiscent of the ‘pattern’ observed in the projection of HLA-A29:02 peptides (Figure 5D). Submotif analysis revealed that cluster 1 and 4 were enriched for sensitive peptides and were distinguished by a preference for F or Y at P3 (Figure 5E, F)(Table S15). Cluster 3 (enriched for dependent peptides) was distinguished by a F/Y at P1 (Figure 5E), similar to the overall motif of ERAP2-dependent peptides (Figure 3B, C).
Finally, HLA-B44:03 submotifs enriched for sensitive peptides (cluster 3 and cluster 2) (Figure 5G-I) showed a preference for F at P3, similar to HLA-B40:01(Table S15). These observations are consistent with recognition of P3 by a hydrophobic pocket revealed by structural analysis of ERAP2 (Figure S15). Note that cluster 4 was enriched for dependent peptides (Figure 5I) and enriched for E at P1 (Table S15), a negatively charged amino acid that is resistant to trimming by ERAP2. In summary, immunopeptidome data revealed internal peptide sequence preferences of ERAP2 that shape the ligand repertoire in a HLA class I-specific manner.
Discussion
In this study, we showed that ERAP2 shapes the HLA-A29 peptidome predominantly by over-trimming peptides carrying susceptible residues at their N-terminus while sparing others carrying a sub-optimal residue at the N-terminal positions. We showed that in the presence of ERAP2 preferred amino acids A,K, and R42 are underrepresented, while amino acids F and Y are over-represented at P1, but that these effects on P1 are commonly shared with other class I alleles. Strikingly, we identified that ERAP2 specifically increases the abundance of peptides with a distinct submotif (cluster 2, Figure 3C) defined by nonpolar aromatic residues F or Y at P2 that specifically binds to HLA-A29. Replication of these findings in non-related HLA-A29-positive cell lines suggests that these effects of ERAP2 on HLA-A29 are common. Indeed, in known crystal structures of ERAP2 with peptides, the P2 side-chain is accommodated in a very shallow pocket that cannot easily accommodate large residues such as F and Y due to steric clashes with nearby enzyme residues13 thus making peptides carrying large hydrophobic bulky residues at P2, poorer substrates (Figure S15). Note that we further showed that the effects of ERAP2 on this cluster of peptides is different from ERAP1, which did not show selectivity for this submotif of HLA-A29 (Figure S7). This fits with the observation that the pocket in ERAP1 that interacts with P2 provides more space for bulky residues14. In fact, using correlation as a metric of the effects of ERAP1 and ERAP2, we show that ERAP1 and ERAP2 show non-redundant effects on the HLA-A29 peptidome (Figure S8), which is in line with genetic studies that revealed that ERAP1 and ERAP2 independently contribute to the disease risk for BU10.
Structural studies support that ERAP2 trims the N-terminal residues from peptide substrates by first sequestering the entire peptide sequence inside the enzyme’s cavity. There, the peptide substrate interacts with amino acid side chains of the enzyme, which are considered to influence the stability of the interaction and thus the trimming rates of the peptides13,14. The exact internal peptide sequence preferences for ERAP2 remain poorly understood. In an attempt to map its relevance to antigen presentation, here we considered the entire peptide sequence to capture the full effects of ERAP2 on the class I immuno peptidomes, and identify functional submotifs which may be missed using traditional single residue or motif analysis. We describe highly reproducible motifs of HLA-A29 and identified that peptides that are destroyed by ERAP2 (i.e., ‘sensitive’ to trimming) showed a strong preference for Leucine at P7 and often are presented by multiple alleles (promiscuity). Although we formally cannot exclude the contribution of residual HLA-A29 molecules in the analysis of HLA-A03:01, data from single-HLA cell lines supported overlap in presented peptides with P7-L (Figure S14). Based on the crystal structure of ERAP213, the sidechain P7 can be accommodated within a shallow hydrophobic pocket, which suggests that hydrophobic residues like Leucine would be preferred (Figure S15).
Thus, structural analysis indicates that L at P7 is near-optimal for trimming by ERAP2, while bulky residues at P2 (e.g., F) reduce trimming by ERAP2. Therefore, we hypothesize that the increase in peptides with bulky residues at P2 in the presence of ERAP2 is a result of the decreased availability of competing peptides with P7-L due to overtrimming by ERAP2. Importantly, nonpolar aromatic residues F or Y at P3 were associated with peptides that are destroyed in the HLA-B40:01 an HLA-B44:03 peptidome, which is consistent with recognition of P3 by a hydrophobic pocket lined by two other aromatic residues (Tyr892 and Tyr455) that can make favorable pi-stacking interactions with the peptide aromatic side-chain (Figure S15). F at P3 was also the most common residue considering all 9-11-mers detected by immunoprecipitation of HLA class I (Figure S12). The seemingly contrasting preference of F dependent on the position in the peptide substrate, also suggests that predicting substrate specificity based on widely used fluorogenic aminopeptidase substrates (e.g., R-AMC) or peptide series that vary only the N-terminal residue may obscure the full breadth of substrate specificity for this amino peptidase. We do emphasize that the binding motif of HLA-A29 (and other alleles investigated) can obscure the detection of the full internal sequence preferences of ERAP2, but using the presented peptides as a read-out provides the net effect of any internal sequence preferences on antigen presentation.
We showed that the ERAP-sensitive peptides presented by HLA-A29:02 are promiscuous based on their predicted binding scores for other class I alleles, and their detection in the HLA-A29-negative fraction in mass spectrometry analysis. Since these peptides are also characterized by P1 composition (e.g, A, K, R) that is shared with the other HLA allotypes investigated, it is tempting to speculate that HLA-A29 epitope destruction by ERAP2 is a canonical phenomenon common to class I alleles. This is supported by the observation that HLA class I ligands in general show a depletion for residues A, K, and R at P1 in ERAP2-positive cell lines32, which are preferred substrates of ERAP2. High hydrophobicity of T-cell receptor contact residues in presented peptides - in particular a hydrophobic P7 - is associated with immunogenicity44,45. Perhaps a canonical function of ERAP2 is to destroy epitopes to lower the immunogenic index of peptide cargo presented. This is supported by observations in cancer immunotherapy, where high ERAP2 expression (the risk haplotype for BU) is a strong prognostic predictor of poor survival in patients receiving checkpoint inhibitor therapy to induce T-cell mediated antitumor immunity46. Of interest, the size of P1 of the presented peptide modulates the configuration of position 167 in HLA-A47, which was shown to critically influence T cell recognition47. F or Y at P1 gives a similar configuration for position 167, which is different from the conformation mediated by K and R at P1 in one study47, which suggests that the effects of ERAP2 on P1 may influence T cell receptor recognition.
Given that HLA-A29 is prerequisite for the development of BU, we hypothesize that disease mechanisms associated with antigen presentation are most likely driven by a limited set of epitopes (Table S8) because of promiscuity of peptides43. ERAP2 destroyed the only S-antigen peptide detected in the HLA-A29 peptidome, which also shows good binding scores for other alleles, suggesting that HLA-A29-mediated presentation of S-antigen fragments is less relevant during disease initiation, but perhaps more relevant in later stages of the disease after the blood retina barrier has been breached. This is supported by the common immune reactivity towards S-antigen in patients with clinically distinct phenotypes of uveitis.2 Based on the submotifs of peptides (i.e. cluster 2, Figure 3C), we hypothesize that ‘uveitogenic’ HLA-A29-restricted peptides may more likely harbor a F or Y at P2. The importance of P2 is supported by the fact that fine mapping studies of the MHC linked BU risk to amino acid positions 62-Leu and 63-Gln of HLA-A 48, which are unique to HLA-A29 and directly interact with P2 of the anchoring peptide. Although the HLA-C*14:03 allele also showed good binding scores for the ERAP2-dependent peptides with F or Y at P2, HLA-C alleles are notoriously low expressed49 and the allele frequency of HLA-C*14:03 is >100 times lower compared to HLA-A*29:02. Also, the peptidomes of HLA-A29 and HLA-C14:03 are starkly different (Jaccard similarity index ±1% using peptidome data from Sarkizova and associates29) and T cells recognizing the same peptide in a different HLA molecule may not show immune reactivity. Regardless, we show that the amino acid sequence of retina-expressed genes contain peptides with the motif of cluster 2, which supports that ERAP2-mediated HLA-A29-restricted presentation of ocular epitopes could be a key disease mechanism for BU. Of course, functional experiments of antigen presentation in the eye and tetramer-analysis of T cell immunity to these putative epitopes is warranted. It is, however, of interest that among the predicted epitopes we found peptides derived from key factors in melanocyte biology. A hallmark feature of BU is the progressive loss of stromal melanocytes in the choroid corresponding to the characteristic cream-colored birdshot fundus lesions2,8,50,51, and BU has been associated with melanoma5,6,52.
Previous HLA peptidomic studies of ERAPs are based on single-HLA or long-established cell lines which after years of continuous cultivation are notorious for their profound chromosomal aberrations reported to also affect ERAP and HLA genes42,53–55. In addition, these studies have been conducted with label-free approaches using independent experimental runs, which makes accurate quantification of effects of ERAPs on the immunopeptidome more challenging. To study ERAPs in a physiologically more relevant environment, we exploited MS analysis using newly-established patient-derived cell lines and SILAC labeling to address several potential sources of ambiguity that are non-trivial to resolve with in silico methods, including often unaccounted genetic variability (i.e., polymorphisms) in comparing different cell lines or quantitative error caused by the individual analysis of to be compared conditions. Regardless, the results in this study can also be influenced by several factors. Although abundant peptides are more likely to be sufficiently detected in individual elutions (~90% of peptides were reported before), less abundant peptides might be missed. This means that additional undiscovered effects of ERAP2 on the peptidomes investigated could be present. For example, we limited our labeling and analysis to peptides that contain F and/or Y for SILAC labeling, which obscured our capability to cover the majority of the HLA-B40:01 peptidome or potential uncharted domains of the peptidomes of the other alleles.
In conclusion, we show that ERAP2 significantly influences the immunopeptidome across the cellular HLA class I allotypes. The effects of ERAP2 are consistent with proposed ERAP2 sequence specificity and highlight that observed substrate-enzyme interactions can be translated to observed effects on the immunopeptidome. We have narrowed down the potential sequences for autoimmunity-inducing antigenic peptides based on the selective effect of ERAP2 on the peptide cargo of HLA-A29 in the pathogenesis of Birdshot Uveitis.
Supplemental Methods and Info
Lentiviral vector production
HEK-293T cells were seeded into 10 cm dishes (2×106 cells/dish) and cultured in Dulbecco’s Modified Eagle Medium (DMEM, Thermo Fisher Scientific). The next day, 293T cells were co-transfected with 2 μg transfer vector (Lenti ORF clone of Human S-antigen mGFP tagged, RC220057L2 from Origene) and components of 2nd generation packaging vectors: 8.33 μg psPAX2 packaging vector and 2.77 μg pMD2.G envelope vector at a ratio of 4:1. Transfection was done in serum-free DMEM using Lipofectamine 2000 (Thermo Fisher Scientific) according to manufacturer’s instructions. Medium was replaced with 10 mL DMEM supplemented with 10% FBS and incubated at 37°C, 5% CO2 after 24 hours. The conditioned medium containing lentiviral particles was collected 48 hours after transfection and an additional 10 mL of fresh culture medium was added to the cells. After 12 hours, harvested supernatants were combined and cleared by centrifugation at 1500 rpm for 5 minutes at 4°C then passed through a 0.45 μm filter.
Concentration of lentiviral supernatants using ultracentrifugation was performed with a Beckman Coulter Optima centrifuge using a SW32Ti rotor. Filtered supernatant was added to 38.5 mL Ultra-Clear tubes (Beckman Coulter). Centrifugation was performed for 120 minutes at 32,000□rpm. Supernatant was completely removed and virus pellets were resuspended in 1 mL RPMI (containing 10% FBS and 1% penicillin/streptomycin) and stored at −80°C.
Lentiviral transduction of S-antigen in EBV-LCL
To obtain stable cell lines overexpressing S-antigen, EBV-LCLs were transduced with the concentrated lentiviral supernatants. To transduce EBV-LCLs, 1×106 cells were seeded in a 24-wells plate with the lentivirus and a final polybrene concentration of 6 μg/mL. After 24 hours, the medium was replaced and the cells were cultured for another 3 days, without exceeding a cell concentration of 1.5 × 106 cells/mL. Transduction efficiency was monitored by fluorescent light microscopy. GFP-positive EBV-LCLs were sorted using the BD FACSAria™ III sorter and S-antigen expression levels were detected by western blot.
Differential expression analysis of peptides using limma
For differential expression analysis we used the workflow from Kammers et al., 2015 available at http://www.biostat.jhsph.edu/~kkammers/software/eupa/R_guide.html. Their method exploits the R package limma for shrinking a peptide’s sample variance towards a pooled estimate that boosts power for stable detection of (truly) significant changes in small proteomic data sets. Peptide data were preprocessed using the read.peptides() function, which excludes peptides with missing values (i.e., not detected in either the light or heavy channel). We computed dummy variables for the “Isolation.Interference”, “Quan.Usage”, “Quan.info” variables, because quality control of the input data was completed as described in the main manuscript. The peptide sequence was used as the “Protein.Group.Accessions” variable. Overlapping peptide data from the biological replicates were independently normalized using the quantify.proteins() function. Following the workflow of Kammers et al., we used peptides (with a Mascot Percolator q<0.01 in all analyses) detected in both biological replicates (i.e., peptides unique to one of the conditions are left out for normalization and statistical analysis). For example, for peptides detected by DK1G8 (anti-HLA-A29) with a HLAthena binding score [MSi]>0.6 for HLA-A*29:02 a total of 1330 peptides were detected in both channels, while 41 peptides in either the light or heavy channel (with consistent detection in the same channel in both experiments) and were not considered for statistical analyses. We blocked for batch effect (two independent experiments) in limma by including them in the design matrix. HLA-A29 peptidome analysis considering also peptides detected in either the heavy or light channels is provided in Figure S3. Here, we used dummy variables for the moderate q-value (set to 1 × 10−6) and log2FC (log2FC= −6.6 for peptides only detected in the ERAP2 KO-cell line and log2FC=6.6 for peptides detected only in the ERAP2 WT-cell line), because these parameters were only used to subset peptides unique to either of the conditions (using moderate q<0.01 as a threshold) together with the differentially expressed peptides detected in both channels. Also, although Mascot Percolator exploits a number of relevant peptide features and has been shown to be superior in accurate peptide identification compared to previous Mascot scoring based on one metric (Borsch et al.,2009), we also conducted this analysis of the HLA-A29 peptidome using the the percolator q-value in conjunction with the Mascot ions score >30, which showed similar effects for ERAP2 at the submotif level as the analysis using the percolator q-value (see Figure S3).
Supplemental Figures
Acknowledgements
We express gratitude for constructive input from prof. Debbie van Baarle.
Footnotes
SNP-array copy number profiling and analysis of regions of homozygosity were performed on DNA isolated from WT and CRISPR-Cas9 edited LCLs (ERAP2-KO) according to standard procedures using the Infinium Human CytoSNP-850K v1.2 BeadChip (Illumina, San Diego, CA, USA). Samples were scanned using the iScan system (Illumina). Subsequently, visualizations of SNP-array results and data analysis were carried out using NxClinical software v5.1 (BioDiscovery, Los Angeles, CA, USA). Human genome build Feb. 2009 GRCh37/hg19 was used.