Abstract
Background Tumor neoantigens are a driver of cancer immunotherapy response; however, current neoantigen prediction tools produce many candidates that require further prioritization for research and clinical applications. Additional filtration criteria and a population-level understanding may help to produce refined lists of putative neoantigens. Herein, we show that neoepitope immunogenicity is likely related to measures of peptide novelty, and we report the population-level behavior of these and other metrics.
Methods We propose four peptide novelty metrics to refine predicted neoantigenicity: tumor vs. paired normal peptide binding affinity difference, tumor vs. paired normal peptide sequence similarity, tumor vs. closest human peptide sequence similarity, and tumor vs. closest microbial peptide sequence similarity. We apply these metrics to tumor neoepitopes predicted from somatic missense mutations in The Cancer Genome Atlas (TCGA) and a cohort of melanoma patients, as well as to a group of peptides with neoepitope-specific immune response data using an extension of pVAC-Seq [1].
Results We show that neoepitope burden varies across TCGA disease sites and HLA alleles, with surprisingly little repetition of neoepitope sequences across patients or of neoepitope preferences among sets of HLA alleles. Only 20.3% of predicted neoepitopes across TCGA patients displayed novel binding change based on our binding affinity difference criteria. Amino acid sequence similarity was typically high between paired tumor-normal epitopes, but in 24.6% of cases neoepitopes were more similar to other human peptides than to their paired normal counterparts, and they were more similar to bacterial peptides in 56.8% of cases and to viral peptides in 15.5% of cases. Applied to peptides with neoepitope-specific immune response data, a linear model incorporating neoepitope binding affinity, protein sequence similarity between neoepitopes and their closest viral peptides, and paired binding affinity difference predicted immunogenicity with an AUROC of 0.66.
Conclusions Our proposed neoepitope prioritization criteria emphasize neoepitope novelty and refine patient neoepitope predictions for focus on biologically meaningful candidate neoantigens. We have demonstrated that neoepitopes should be considered not only with respect to their paired normal epitope, but with respect to the entire human proteome, as well as bacterial and viral peptides, with potential implications for neoepitope immunogenicity and personalized vaccines for cancer treatment. We conclude that putative neoantigens are highly variable across individuals as a function of both cancer genetics and personalized HLA repertoire, while the overall behavior of filtration criteria reflects predictable patterns.
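As a rough illustration of the novelty metrics named in Methods, the following minimal sketch computes a paired binding affinity difference and a simple sequence similarity against a paired normal peptide or a set of reference peptides. It rests on several assumptions not stated in this abstract: the affinity difference is taken as normal minus tumor predicted IC50 (in nM), similarity is approximated by position-wise identity between equal-length peptides rather than by whatever alignment scoring the study used, and all function names and example values are hypothetical.

```python
from typing import Iterable, Tuple


def affinity_difference(normal_ic50_nm: float, tumor_ic50_nm: float) -> float:
    """Paired binding affinity difference (ASSUMED: normal minus tumor IC50, nM).

    Positive values mean the tumor peptide is predicted to bind more strongly
    (lower IC50) than its paired normal peptide.
    """
    return normal_ic50_nm - tumor_ic50_nm


def identity(a: str, b: str) -> float:
    """Fraction of positions with identical residues (equal-length peptides).

    This is a simple stand-in for an alignment-based similarity score.
    """
    if len(a) != len(b) or not a:
        raise ValueError("peptides must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_peptide(query: str, references: Iterable[str]) -> Tuple[str, float]:
    """Return the most similar equal-length reference peptide and its identity."""
    candidates = [
        (ref, identity(query, ref)) for ref in references if len(ref) == len(query)
    ]
    if not candidates:
        raise ValueError("no reference peptide of matching length")
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical peptides and IC50 values, for illustration only.
    tumor, normal = "SIINFEKL", "SIINFEKQ"
    print(affinity_difference(normal_ic50_nm=500.0, tumor_ic50_nm=50.0))
    print(identity(tumor, normal))
    print(closest_peptide(tumor, ["SIINFEKQ", "AIINFEKL", "GILGFVFTL"]))
```

The same `closest_peptide` helper could be applied in turn to a human, bacterial, or viral reference set, which mirrors the three "closest peptide" comparisons described above; a real analysis would replace the identity score with a proper alignment-based measure.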
Abbreviations
- AUROC: Area Under the Receiver Operating Characteristic Curve
- DAI: Differential Agretopicity Index
- HLA: Human Leukocyte Antigen
- MAF: Mutation Annotation Format
- MCC: Merkel Cell Carcinoma
- MHC: Major Histocompatibility Complex
- NCBI: National Center for Biotechnology Information
- ROC: Receiver Operating Characteristic
- TCGA: The Cancer Genome Atlas
- VCF: Variant Call Format
- VEP: Variant Effect Predictor
- pVAC-Seq: personalized Variant Antigens by Cancer Sequencing