Immunopeptidomic analysis of influenza A virus infected human tissues identifies internal proteins as a rich source of HLA ligands

CD8+ and CD4+ T cells provide cell-mediated cross-protection against multiple influenza strains by recognising epitopes bound as peptides to human leukocyte antigen (HLA) class I and class II molecules, respectively. Two challenges in identifying the immunodominant epitopes needed to generate a universal T cell influenza vaccine are a lack of cell models susceptible to influenza infection which present population-prevalent HLA allotypes, and the absence of a reliable in-vitro method of identifying class II HLA peptides. Here we present a mass spectrometry-based proteomics strategy for identifying viral peptides derived from the A/H3N2/X31 and A/H3N2/Wisconsin/67/2005 strains of influenza. We compared the HLA-I and -II immunopeptidomes presented by ex-vivo influenza challenged human lung tissues. We then compared these with a directly infected immortalised macrophage-like cell line (THP1) and primary dendritic cells fed apoptotic influenza-infected respiratory epithelial cells. In each of the three experimental conditions we identified novel influenza class I and II HLA peptides with motifs specific for the host allotype. Ex-vivo infected lung tissues yielded few class II HLA peptides despite significant numbers of alveolar macrophages, including directly infected ones, present within the tissues. THP1 cells presented HLA-I viral peptides derived predominantly from internal proteins. Primary dendritic cells presented predominantly viral envelope-derived HLA class II peptides following phagocytosis of apoptotic infected cells. The most frequent viral source protein for HLA-I and -II was matrix 1 protein (M1). This work confirms that internal influenza proteins, particularly M1, are a rich source of CD4+ and CD8+ T cell epitopes. Moreover, we demonstrate the utility of two ex-vivo fully human infection models which enable direct HLA-I and -II immunopeptide identification without significant viral tropism limitations.
Application of this epitope discovery strategy in a clinical setting will provide more certainty in rational vaccine design against influenza and other emergent viruses.

Influenza vaccines designed to induce strong neutralising antibody responses to haemagglutinin (HA) offer narrower and more short-lived immunity than naturally acquired infections, which also induce antibody responses predominantly to HA, but also stronger responses to neuraminidase (NA) and some internal viral proteins [6,7]. Although neutralising antibodies provide key protection against initial infection, T cells play an equally important role in limiting the consequent illness [8].

T cells recognise viral peptides bound to class I and II major histocompatibility (MHC) molecules which are presented at the cell surface. CD8+ T cells recognise endogenously processed viral peptides presented by class I MHC molecules on the surface of infected cells, whereas CD4+ T cells recognise exogenously processed peptides presented by class II MHC molecules, mainly on the surface of professional antigen presenting cells such as dendritic cells and macrophages [9].

Targeting conserved viral protein sequences, which are more commonly derived from internal viral proteins, should confer greater vaccine induced cross-protection against multiple influenza strains, and early evidence in mice supports this [10]. Previous evidence has shown that the influenza virus nucleoprotein (NP) is a major target of immunodominant CTLs in direct infections [11], and acid polymerase T cell epitopes are more abundant in mouse cross-presentation models, but matrix protein (M1) and the RNA-directed RNA polymerase catalytic subunit (PB1) also contain conserved immunogenic sequences [12]. Viral NP and M1 are also major targets for immunodominant CD4 T cell responses [13]. Human infection trials suggest that pre-existing influenza-specific T cells, particularly those recognising conserved epitopes of internal viral proteins, are central to limiting disease severity following experimental challenge with different influenza strains [14].
Infections stimulate both CD4+ and CD8+ T cell subsets, and optimal humoral and cellular immunity is dependent upon the activation of CD4+ T helper cells, which support CD8+ T cell function but can also themselves have effector functions [15,16]. Virus-specific CD4+ and CD8+ T cells specific for immunodominant influenza epitopes negatively correlate with disease severity and fever symptoms, respectively [17].

In animal models, long peptide vaccines designed to stimulate antibody and T cell responses have provided only minimal protection against infection, with limited evidence of symptom reduction [18]. Rationally designed T cell epitope targeted vaccines containing long peptide sequences from the extracellular domain of M2 (M2e) and NP have been tested in mice, but offered only limited to moderate protection with variable responses to each peptide [19]. This may arise because candidate T cell epitopes are commonly identified using machine learning based algorithms to predict binding of 9-mer or 15-mer peptides to specific HLA-I and HLA-II allotypes, respectively. Whilst peptide affinity predictions are reasonably accurate, at least for HLA-I, there are multiple other factors that influence the true efficacy of T cell epitopes, including, but not restricted to, the abundance of the source protein available for presentation by infected cells, the biochemical nature and structural stability of the epitope, the suitability of surrounding residues to endosomal processing, and the secondary structure of the source protein. Due to differences in mouse and human MHC, humanised mouse models must be utilised to examine influenza T cell epitopes in humans, but are then restricted to the transgenic allotypes, usually HLA-A*02:01, the most prevalent global allotype.
Recent improvements in the sensitivity of mass spectrometry combined with immunoprecipitation of peptides bound to HLA-I and HLA-II have enabled the field of immunopeptidomics to be utilised in the search for optimal T cell epitopes [20].

Results
We used three different models as sources for immunopeptidome isolation, to identify HLA-I and HLA-II restricted influenza immunopeptides (Fig 1) [24].

Fig 1: Workflow of the approach to identify HLA-I and -II influenza immunopeptides isolated from cell lines, dendritic cells and lung tissues.
Infection of resected lung tissues reveals novel influenza HLA-I and -II restricted epitopes

We examined infection rates and HLA-presented peptides using ex-vivo lung tissue samples from three different human donors with diverse HLA types (Table S1). Exposure of the ex-vivo lung parenchymal tissues (P1, P2 and P3) to the two viral strains studied herein led to variable infection rates in the two main cell types which have previously shown influenza susceptibility, epithelial cells and macrophages (Fig 2 A) [23]. We have previously shown that epithelial infection rates in resected lung tissues can be variable [23], and this study indicates that the viral strain also affects infection rate in the two target cell types. Infection rates varied from 2% to 70% in epithelial cells and from 1% to 15% in macrophages.

Following ex-vivo infection, we were able to identify a number of influenza-derived HLA-I restricted peptides across all samples, derived principally from M1, NP and NS proteins (Table 1). Similar to previous findings, despite consistent, if limited, expression of viral haemagglutinin and neuraminidase in the proteome, HLA-I peptides were generally of internal viral protein origins: M1, NP and non-structural protein 1 (NS1) (Fig 2 B [...] (Table S1)

Excitingly, from P3 we also identified one class II HLA peptide, derived from the haemagglutinin protein (Table 1). Although this potential CD4+ T cell epitope is not yet proven as functional, it is novel, and as the first such identification in ex-vivo infected tissues, it paves the way for the identification of further CD4-stimulatory peptides. The presentation of the only observed CD4+ T cell epitope derived from a membrane-resident protein is consistent with the predominantly extracellular/membrane origin of the HLA-II sourced proteins.

[...] indicated that all the cells expressed HLA-I but only a minority expressed HLA-II (Fig 3 A-B).
Following exposure to the Wisconsin H3N2 and X31 influenza strains at an MOI of 1.0, approximately 50% and 90% of the THP1 cells were infected, respectively (Fig 3 C, Fig S2). [...] abundant proteins were matrix protein 1, nucleoprotein and non-structural protein 1, whereas others, such as polymerase basic proteins 1 and 2, were the least abundant (Fig 3 D). Matrix protein 2 and RRBP2 were not detected in the analysis. This pattern was similar between the two influenza strains studied, with slight differences in the proportions of the most abundant proteins. This approximate pattern of expression has been previously observed in purified influenza virions [27,28]; however, our observation of the relatively high abundance of NP and NS1, similar to that observed in ex-vivo infected lung tissues, may be due to our examination of infected cells rather than virions [29], as these were over-represented in the infection models compared to our initial purified influenza stock (Fig S1). Notably, the five most abundant proteins are the same as those found to be principal targets for cell mediated immune responses in animal infection models [30].

Immunopeptidomic analysis of eluted HLA-I-bound peptides extracted from THP1MΦ infected with Wisconsin virus resulted in the detection of 10,709 unique peptide sequences matching 3,064 host proteins in the Uniprot human database (Table S2). Cluster analysis of these peptides indicated the presence of three strong HLA-I binding motifs (Fig 3 E), which were consistent with the HLA types of this cell line (Table S1). Of the observed peptides, 6,499 could be assigned to the homozygous HLA-A*02:01 and -B*15:11 allotypes on the surface of THP-1 cells on the basis of motif presence. Similarly, infection with X31 resulted in 11,643 unique host peptide sequences derived from 3,308 host proteins in the Uniprot human database (Table S2).
From the three biological replicates used in the study of THP1MΦ, we detected 9 unique influenza peptides associated with Wisconsin infection and 20 associated with X31 (Table 1). HLA-restricted influenza peptides contained the correct binding motifs for the HLA types of THP1 cells (Fig 3 E) [31]. There was only one unique Wisconsin strain peptide, which was derived from PA-X; thus all but one of the peptides found in the Wisconsin strain were from identical regions to those in X31 (with small like-for-like differences in amino acid sequence), and the one unique Wisconsin strain sequence has an identical amino acid sequence in the X31 strain. The additional X31 peptides were potentially due to the higher infection rate of X31 in these cells, leading to more intracellular viral protein.

Two of the HLA-A*02:01 immunopeptides, the NS1 protein-derived peptide AIMDKNIIL and the M1 peptide RMGAVTTEV, have been previously observed following X31 infection in respiratory epithelial cells [32].
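The comparison of peptides from the same region in the two strains can be illustrated with a short sketch. This is not the authors' analysis pipeline: the function names are hypothetical, and the default anchor positions (position 2 and the C-terminal residue) are an assumption that holds for many, but not all, HLA-I motifs. The variant peptide in the example is invented purely for illustration.

```python
def substitutions(peptide_a, peptide_b):
    """Return (1-based position, residue_a, residue_b) for every mismatch
    between two peptides of equal length."""
    if len(peptide_a) != len(peptide_b):
        raise ValueError("peptides must be the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(peptide_a, peptide_b), start=1)
        if a != b
    ]


def anchors_preserved(peptide_a, peptide_b, anchor_positions=None):
    """True if no substitution falls on an anchor position.

    ASSUMPTION: if no anchors are supplied, position 2 and the C-terminal
    residue are used, which is a simplification of real HLA-I motifs.
    """
    if anchor_positions is None:
        anchor_positions = {2, len(peptide_a)}
    return all(
        pos not in anchor_positions
        for pos, _, _ in substitutions(peptide_a, peptide_b)
    )


# RMGAVTTEV is the M1 peptide reported in the text; the second sequence is a
# HYPOTHETICAL variant used only to demonstrate the comparison.
observed = "RMGAVTTEV"
hypothetical_variant = "RMGTVTTEV"
diffs = substitutions(observed, hypothetical_variant)
kept = anchors_preserved(observed, hypothetical_variant)
```

Under these assumptions, a substitution away from the anchor residues leaves the predicted binding motif intact, which is one way strain variants of the same region could both be presented.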

Fig 3: Characteristics of the THP1MΦ used to identify HLA-binding viral peptides following direct infection with influenza virus. (A) Differentiation of THP1 cells alters cells to
The majority of observed viral peptides were predicted to be strong binders to HLA-B*15:01, whereas the majority of host immunopeptides were predicted to bind to HLA-A*02:01 (Fig 3 E and Table 2). It is unclear whether this is due to preferential tracking of viral proteins to the B allotype, or simply the presence of favourable B allotype binding motifs in the viral proteins.

The majority of our observed peptides in THP1MΦ have been previously characterised by in-vitro binding/cytotoxicity assays and were present in the Immune Epitope Database (IEDB), although not derived from the two strains studied herein. Most reported positive ELISpot outputs, confirming that they led to functional responses. The majority of these immunogenic peptides were previously identified because many influenza strains have been intensively studied.

All of the novel peptides observed in our THP1MΦ study were predicted to bind the HLA-C*03:03 allotype. The C allotype binding motifs are less clear than those of the A and B allotypes, rendering allotype assignation by predictive algorithm less efficient. Further examination by functional assay would be required to confirm their functionality.

Very few host cell HLA-II peptides could be detected on these cells, consistent with our flow cytometry data, suggesting that expression of HLA-II on the cell surface was low (Fig 3 B). No influenza peptides were detected bound to HLA-II.

We found some influenza immunopeptides from proteins which were undetectable in the proteome of infected cells. Such a finding is consistent with previous reports that immunopeptide selection is poorly correlated with source protein concentration [34], but may also reflect the challenges of detecting lower abundance proteins in a complex proteome such as that derived from infected cell lines.

Phagocytosis of apoptotic influenza-infected MoDCs reveals multiple nested MHC-II influenza epitopes
Using Wisconsin H3N2-infected A549 cells (80% infected, see Fig S3) as the source of viral proteins, we UV irradiated these cells to drive them into apoptosis prior to feeding them to

Fig 4: MoDCs can be used to identify HLA-II binding immunopeptides following phagocytosis of apoptotic influenza-infected respiratory epithelial (A549) cells. (A)
Dendritic cell morphology prior to exposure to apoptotic A549 cells, 20X magnification using

Phagocytosis of these infected cells led to preferential presentation of HLA-II bound influenza peptides (Table 3), with those peptides matching the HLA-II motifs from the DCs (Table S1, P4), and with no observable viral HLA-I peptides, despite robust host-derived HLA-I peptide presentation in these cells (Table S3). This lack of evidence of cross-presentation of influenza HLA-I peptides by human DCs has been previously observed when using an HLA-A*02 cell line (BEAS-2B) [32].

We observed 4,639 distinct class II HLA peptides deriving from 891 source proteins (Table S2). Motif deconvolution [35] was able to assign 2,597 peptides to the respective HLA-DRB1 allotypes of P4 (Fig 4 B). Within these immunopeptidomes, there were 29 influenza A derived HLA-II restricted peptide sequences. Contrary to viral presentation following direct infection of cells or tissues, there was a very strong bias to the processing and display of the membrane-bound proteins neuraminidase, haemagglutinin, and matrix protein 1 in the detected HLA-II peptides (Table 3).

[...] predominantly towards the C terminus of the amino acid sequence, whereas the HLA-I motifs were equally distributed over the protein, including regions also presented by HLA-II (Fig 4 C), [...] peptides, and no HLA-II restricted peptides (Table S3). All were consistent with the HLA-I allotypes of these cells. We found no evidence of either HLA-I or -II peptides sourced from the A549 cells (i.e. matching their allotypes) following engulfment by DCs. This was perhaps not surprising considering how few influenza peptides appear to be presented by these cells, and the fact that only 10^6 cells were used in the MoDC assay (only 10% of the amount normally required to achieve a peptidome of >1,000 unique peptides).
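HLA-II peptides are often detected as nested sets, that is, length variants sharing a common stretch of sequence. One simple way to collapse such sets is sketched below. It is an illustration only and not the method used in this study: the function name is hypothetical, the sequences are toy strings rather than influenza peptides, and the grouping uses strict containment, so staggered overlaps that share a core without one peptide containing the other are not merged.

```python
def group_nested(peptides):
    """Group peptides so that every member of a group is contained within
    the group's longest peptide.

    Returns a dict mapping each longest ("parent") peptide to the list of
    peptides nested within it, parent included.
    """
    groups = {}
    # Process longest peptides first so that parents exist before children.
    for pep in sorted(set(peptides), key=lambda p: (-len(p), p)):
        for parent in groups:
            if pep in parent:  # substring test: pep is nested in parent
                groups[parent].append(pep)
                break
        else:
            groups[pep] = [pep]
    return groups


# Toy sequences (NOT real influenza peptides) to show the behaviour.
toy_peptides = [
    "ACDEFGHIKLMNPQR",
    "CDEFGHIKLMNPQ",
    "DEFGHIKLMNP",
    "WYVTSRQPNMLKI",
]
nested_sets = group_nested(toy_peptides)
```

Here the three overlapping toy peptides collapse into one nested set and the unrelated sequence forms its own group, so the number of groups approximates the number of distinct presented regions.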

Discussion
Current subunit vaccine strategies to optimize T cell responses to influenza challenge are mostly directed towards the most mutable proteins such as haemagglutinin and neuraminidase. There is evidence to suggest that HLA-I restricted T cell responses are more directed towards the more highly conserved internal viral proteins, whereas humoral responses are dominated by envelope proteins [39]. Despite the proteome of influenza comprising only a dozen proteins, this yields many thousands of potential T cell epitopes. Therefore, identifying the epitopes most important for anti-influenza responses by predictive means is challenging. Most viral proteins will contain HLA binding motifs for multiple allotypes, but current evidence suggests that only a small minority of these will actually be presented [40].
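The scale of the candidate pool follows from simple window counting: a protein of length L contains L - k + 1 overlapping peptides of length k. The sketch below makes this explicit. The function names and the toy sequences are hypothetical and are used only to show the arithmetic; real predictions would additionally score each window against each HLA allotype.

```python
def kmers(sequence, k):
    """All overlapping windows of length k in a protein sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def count_candidates(proteome, lengths=(9,)):
    """Total number of candidate peptides of the given lengths across a
    proteome supplied as {protein_name: sequence}."""
    return sum(
        max(len(seq) - k + 1, 0)
        for seq in proteome.values()
        for k in lengths
    )


# Toy proteome (NOT real influenza sequences) to show the counting.
toy_proteome = {
    "toy_protein_a": "ACDEFGHIKLMNPQRSTVWY",  # 20 residues
    "toy_protein_b": "ACDEFGHIK",             # 9 residues
}
n_nine_mers = len(kmers(toy_proteome["toy_protein_a"], 9))
n_total = count_candidates(toy_proteome, lengths=(9, 15))
```

Because the count grows roughly linearly with total proteome length for each peptide length considered, even a small viral proteome gives thousands of windows once several lengths and allotypes are taken into account.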
Here we show how influenza epitope presentation is influenced by the presence of HLA binding motifs, source protein abundance, and the HLA pathway. We confirm that only a few internal viral proteins provide the main source of HLA-I immunopeptides in lung tissues, and we find that select immunopeptides are favoured in different influenza strains. Viral protein abundance influences, but is not the only factor in, HLA presentation. Using a MoDC model we show that viral membrane bound proteins such as NA, HA and M1 are preferentially presented by HLA-II, and that certain regions of these proteins may be more conducive to processing via the HLA-II pathway. These results demonstrate how peptidomics can reduce the potential pool of anti-influenza T cell epitopes from thousands to a few dozen. Furthermore, these candidates can be refined according to their relevant HLA pathway and help guide predictive algorithm epitope selection more effectively.

To address the issue of viral tropism, we have taken an approach using ex-vivo human lung [...]

To better understand the HLA-I viral immunopeptidome, we initially used a cell model (THP1) with the aim of identifying T cell epitopes for influenza. These cells were more susceptible to infection with the laboratory-adapted X31 strain than the more clinically relevant Wisconsin strain. We were able to identify a number of well-characterised HLA-A and B epitopes that had been previously observed in similar studies. The utility of this approach is limited by the molecular phenotype of THP1 cells, which are homozygous for the three indicated haplotypes. Bioinformatic comparison of our observed epitopes with a predicted list found that most of the HLA-A and B epitopes detected using this cell model were predicted binders for the known THP1 allotypes, but only represented a small proportion of the predicted binders.
The reasons for this are complex, and do not necessarily imply that others are not present, but rather that they may be unrecognised, since MS is biased to the detection of peptides with certain biophysical characteristics. For example, the well-characterised immunodominant M1(58-66) peptide GILGFVFTL is a high-ranking predicted HLA-A*02:01 immunopeptide which is refractory to identification by mass spectrometry.
A number of the T cell epitopes we identified were derived from the identical region of the same protein in the two influenza strains despite small differences in amino acid sequence. This may arise because the HLA anchor positions were not altered, but is also suggestive of intrinsic properties of these protein regions being conducive to antigen processing and presentation. Some of our observed influenza immunopeptides did not match with any allotype or were assigned to HLA-C, but with low predicted affinity. This may reflect the poor performance of predictive algorithms for the C allotype, which is less well characterised. Often where the motifs for HLA binding are not clearly defined, prediction tools are less useful, meaning direct observation could play a more significant role, not only in identifying novel peptides, but also in improving the algorithms for future searches. HLA-II prediction algorithms are thought to be even less reliable [42].

[...] It has been proposed that alternative pathways of antigen processing in infected APCs, rather than virion or infected cell uptake, are the primary driver of the CD4+ T cell response to influenza infection [46]. Further work might reveal differential immunopeptidomes in MoDCs infected directly with influenza virus, rather than following uptake of infected cells as in our study. Here we have demonstrated the potential for fully human ex-vivo models as tools to identify viral immunopeptides which could be used to design strain-specific T cell vaccines against influenza. We have shown evidence that these epitopes will be conserved between different donors if they share the same HLA allotype.

Much of the vaccine design process employs machine learning algorithms to predict the relevant HLA-I CD8 T cell epitopes by searching for motifs and predicting their affinity in-silico.
There is increasing evidence, however, including in our data, that the epitopes actually presented in-vivo are influenced by a complex series of additional factors, including source protein abundance, protease type in the proteasome, protein turnover, transporter protein expression, PTMs and many others, making the prediction algorithms liable to prioritise non-immunogenic peptides.
There are many potential combinations of HLA allotypes in humans, and the best T cell epitopes need to be selected. The current approach is limited by the quality of HLA-I and -II epitopes that can be predicted using the current algorithms. These algorithms are trained using mass spectrometry (MS) data from peptides eluted from the HLA complex, and thus rarer HLA types often have less well-defined motifs, and the predictions are therefore less accurate. In the case of HLA-II this is even more apparent, as the paucity of eluted peptide datasets for each haplotype means the predictions are probably highly inaccurate. Furthermore, model cell lines may not be susceptible to pathogen infections.

[...] protocols. To generate UV-inactivated virus, aliquots were irradiated for 30 min on ice using an ultraviolet microbicidal crosslinker (Steristrom 2537a) as previously described [47].

Assessment of influenza infection in resected human lung tissue samples
To analyse influenza infection in resected tissue samples, post-infection, tissues were weighed and enzymatically dispersed with 1 mg/mL type I collagenase in RPMI as previously described [24]. Dispersed cells were re-suspended in 100 μL FACS buffer containing human IgG (as Fc block) prior to the addition of antibodies directed against surface proteins: CD45-PECF594 (to differentiate leukocytes from structural cells), CD3-PECy7, HLA-DR-APCH7 and CD326-PerCPCy5.5, or relevant fluorophore-conjugated isotype controls, and incubated for 30 min on ice. Cells were then fixed and permeabilised as previously described, prior to intracellular staining to quantify viral infection using FITC-conjugated anti-viral nucleoprotein (NP) antibody. All flow cytometry was performed using a BD FACSAria equipped with the relevant lasers and filters, and data were analysed using BD FACS DIVA software. Epithelial cells were identified using the following gating strategy: size/scatter, CD45-, CD326+. Macrophages were identified using the following gating strategy: size/scatter, CD45+, CD3-, HLA-DR++ (Fig S2). Infected epithelial and macrophage cell populations were identified by NP-FITC staining, gated against mock-infected controls using a 1% overlap (Fig S2 E-F).
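Gating against a mock-infected control with a 1% overlap can be thought of as placing the NP-FITC threshold so that about 1% of mock events fall in the positive gate, and then counting the fraction of infected-sample events above it. The sketch below illustrates that idea on plain lists of fluorescence intensities. It is an assumption about how such a gate may be expressed numerically, not a description of the BD FACS DIVA software, and the function names and values are hypothetical.

```python
def gate_threshold(mock_intensities, overlap_percent=1.0):
    """Threshold such that at most `overlap_percent` of mock-infected
    events lie strictly above it."""
    ranked = sorted(mock_intensities)
    if not ranked:
        raise ValueError("mock control must contain at least one event")
    n = len(ranked)
    # Number of mock events allowed to fall inside the positive gate.
    allowed_above = min(int(n * overlap_percent / 100), n - 1)
    return ranked[n - allowed_above - 1]


def infected_fraction(sample_intensities, threshold):
    """Fraction of sample events with intensity strictly above the gate."""
    if not sample_intensities:
        raise ValueError("sample must contain at least one event")
    positive = sum(1 for value in sample_intensities if value > threshold)
    return positive / len(sample_intensities)


# Toy intensities (arbitrary units, NOT measured data).
mock_events = list(range(1, 101))          # 100 mock-infected events
sample_events = [50, 99, 100, 150, 200]    # 5 events from an infected sample
threshold = gate_threshold(mock_events)    # 1% of mock events lie above this
fraction = infected_fraction(sample_events, threshold)
```

A practical consequence of this gating choice is that the reported infection rate has a background of up to about 1%, so values near that level should be interpreted with caution.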

Isolation and preparation of MoDCs
PBMCs were isolated from buffy coats and allowed to adhere to tissue culture treated flasks for 2 h in Promocell monocyte attachment buffer (Sigma, Dorset, UK). Monolayers were then rinsed thoroughly with Promocell DC generation medium to remove non-adherent cells and subsequently were cultured for 6 days in dendritic cell generation medium supplemented with cytokines (Sigma) to generate immature MoDCs.

A549 cell infection and apoptosis
Monolayer cultures of A549 cells were infected and sent into apoptosis essentially as previously described [48]. Briefly, 90% confluent monolayers of A549 cells were infected with A/Wisconsin/67/2005 influenza at an MOI of 1.0 in serum-free medium for 2 h. Monolayers were then rinsed twice with serum-free medium to remove excess inoculum, and cultured for a further 12 h in serum-free DMEM supplemented with penicillin-streptomycin and L-glutamine. The infection rate of >80% was confirmed by flow cytometry (Fig S3) using detection of intracellular viral NP protein as previously described. The infected cell monolayers were rinsed twice with PBS and then irradiated with 150 mJ/cm^2 of UV light using a Stratalinker 1800 (Agilent Technologies, Santa Cruz, CA, USA) to induce apoptosis. The cells were then incubated for a further 2 h in serum-free medium, prior to enzymatic dispersal to collect the cells. These were then re-suspended at a concentration of 1x10^7/mL in monocyte generation medium and added to the MoDCs for 3 h. DC activation supplement was then applied and the cells incubated for a further 4 h. Cells were then harvested by trypsinisation and washed twice by centrifugation in PBS before storage at -80°C.

Immunopeptidome analysis
Purification of HLA-I and -II immunopeptides

Protein-A sepharose beads (Repligen, Waltham, MA, USA) were covalently conjugated to 10 mg/mL W6/32 (pan-anti-HLA-I) or 5 mg/mL HB145 (pan-anti-HLA-II) monoclonal antibodies (SAL Scientific, Hampshire, UK) using DMP as previously described [49]. Snap frozen tissue samples were briefly thawed and weighed prior to 30 s of mechanical homogenization using a 150W handheld mechanical homogeniser with disposable probes (Thermo Fisher Scientific) in 4 mL lysis buffer (0.02M Tris, 0.5% (w/v) IGEPAL, 0.25% (w/v) sodium deoxycholate, 0.15mM NaCl, 1mM EDTA, 0.2mM iodoacetamide supplemented with EDTA-free protease inhibitor mix). For cell cultures, frozen cell pellets were re-suspended in 5 mL of lysis buffer and rotated on ice for 30 min to solubilise.

Homogenates were clarified for 10 min at 2,000 g and 4°C, and then for a further 60 min at 13,500 g and 4°C. 2 mg of anti-HLA-I conjugated beads were added to the clarified supernatants and incubated with constant agitation for 2 h at 4°C. The captured HLA-I/β2-microglobulin/immunopeptide complex on the beads was washed sequentially with 10 column volumes of low (isotonic, 0.15M NaCl) and high (hypertonic, 0.4M NaCl) TBS washes prior to elution in 10% acetic acid, and dried under vacuum. The MHC-I-depleted lysate was then incubated with 1 mg of anti-HLA-II mouse monoclonal antibodies, and MHC-II bound peptides were captured and eluted under the same conditions. Column eluates were diluted with 0.5 volumes of 0.1% TFA and then applied to HLB-prime reverse phase columns (Waters, 30 mg sorbent/column). The columns were rinsed with 10 column volumes of 0.1% TFA and then the peptides were eluted with 12 sequential step-wise increases in acetonitrile from 2.5-30%. Alternate eluates were pooled and dried using a centrifugal evaporator and re-suspended in 0.1% formic acid.
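The step-wise acetonitrile elution and the pooling of alternate eluates can be laid out numerically as follows. This sketch rests on two assumptions that the text does not state explicitly: that the 12 steps are evenly spaced between 2.5% and 30% (which gives 2.5% increments), and that "alternate" pooling means combining the odd-numbered steps into one pool and the even-numbered steps into another. The function names are hypothetical.

```python
def step_gradient(start=2.5, stop=30.0, steps=12):
    """Acetonitrile percentages for an evenly spaced step gradient.

    ASSUMPTION: steps are equally spaced between start and stop.
    """
    increment = (stop - start) / (steps - 1)
    return [round(start + i * increment, 2) for i in range(steps)]


def pool_alternate(fractions):
    """Split fractions into two pools of alternating steps.

    ASSUMPTION: 'alternate eluates' means steps 1, 3, 5, ... in one pool
    and steps 2, 4, 6, ... in the other.
    """
    return fractions[0::2], fractions[1::2]


gradient = step_gradient()
pool_odd, pool_even = pool_alternate(gradient)
```

Under these assumptions each pool spans the full hydrophobicity range at a coarser resolution, which halves the number of LC-MS injections while keeping peptides of different hydrophobicity distributed across both pools.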