Abstract
In this work, we propose a new deep learning model, MHCrank, to predict the probability that a peptide will be processed for presentation within the MHC Class I pathway. We find that our model significantly outperforms two previously published baseline methods, MHCflurry and netMHCpan. The gains in performance result from the use of cleavage site-specific kernels and learned representations for amino acids. By visualizing site-specific amino acid enrichment among top-ranked peptides, we find that MHCrank’s top-ranked peptides are enriched at biologically relevant positions with amino acids consistent with previous work. Furthermore, the cosine similarity matrix derived from MHCrank’s learned amino acid embeddings correlates highly with physicochemical properties that have been experimentally shown to be important in determining how favorably a peptide is processed. Altogether, the results reported in this work indicate that MHCrank performs strongly compared with existing methods and could be broadly applicable to drug and vaccine development.
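As an illustration of the embedding analysis mentioned above, the following is a minimal sketch of how a cosine similarity matrix can be computed from a set of per-amino-acid vectors. It is not the MHCrank implementation: the function name `cosine_similarity_matrix`, the three-dimensional vectors, and their values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def cosine_similarity_matrix(embeddings):
    """Return (names, matrix) of pairwise cosine similarities.

    `embeddings` maps a label (here, a one-letter amino acid code)
    to a list of floats; all vectors must have the same length
    and a non-zero norm.
    """
    names = list(embeddings)
    # Euclidean norm of each vector, computed once.
    norms = {n: math.sqrt(sum(x * x for x in embeddings[n])) for n in names}
    matrix = []
    for a in names:
        row = []
        for b in names:
            dot = sum(x * y for x, y in zip(embeddings[a], embeddings[b]))
            row.append(dot / (norms[a] * norms[b]))
        matrix.append(row)
    return names, matrix


# Hypothetical toy embeddings (made-up values, not learned by any model).
toy_embeddings = {
    "L": [1.0, 0.0, 0.0],
    "I": [1.0, 1.0, 0.0],
    "D": [0.0, 0.0, 2.0],
}

names, sim = cosine_similarity_matrix(toy_embeddings)
for name, row in zip(names, sim):
    print(name, [round(v, 3) for v in row])
```

In this toy setting, vectors pointing in similar directions receive a similarity near 1 and orthogonal vectors receive 0. One way to compare such a matrix with physicochemical properties, under the same assumptions, would be to correlate it with a similarity matrix built from measured property values.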
Competing Interest Statement
The authors have declared no competing interest.