Abstract
The sensitivity of malignant tissues to T cell-based cancer immunotherapies depends on the presence of targetable HLA class I ligands on the tumor cell surface. Peptide-intrinsic factors, such as HLA class I affinity, likelihood of proteasomal processing, and transport into the ER lumen, have all been established as determinants of HLA ligand presentation. However, the role of sequence features at the gene and protein level as determinants of epitope presentation has not been systematically evaluated. To address this, we performed HLA ligandome mass spectrometry on patient-derived melanoma lines and used this dataset to evaluate the contribution of 7,124 gene and protein sequence features to HLA sampling. This analysis reveals that a number of predicted modifiers of mRNA and protein abundance and turnover, including predicted mRNA methylation and protein ubiquitination sites, are informative of the presence of HLA ligands. Importantly, integrating gene and protein sequence features into a machine learning approach augments HLA ligand predictions to a degree comparable to that of predictive models that include experimental measures of gene expression. Our study highlights the value of gene and protein features for HLA ligand predictions.