Abstract
Peptide binding to MHC class I molecules is the single most selective step in antigen presentation and the strongest single correlate of peptide cellular immunogenicity. Experimentally characterizing the rules of peptide presentation for a given MHC-I molecule is costly, and predictors of peptide-MHC interactions constitute an attractive alternative.
Recently, an increasing number of MHC-presented peptides identified by mass spectrometry (MS ligands) have been published. Handling and interpreting MS ligand data is in general challenging because the data are poly-specific: a single sample typically contains ligands eluted from several different MHC molecules. We here outline a general pipeline for dealing with this challenge and for accurately annotating each ligand to the MHC-I molecule it was eluted from, using GibbsClustering and binding motif information inferred from in silico models. We illustrate the approach in the context of the MHC-I molecules of cattle (BoLA). Next, we demonstrate how such annotated BoLA MS ligand data can readily be integrated with in vitro binding affinity data in a prediction model with unprecedented performance for identification of BoLA-I restricted T cell epitopes.
Although the approach is applied here to the BoLA-I system, the pipeline is readily applicable to MHC systems in other species.