PT - JOURNAL ARTICLE
AU - Lena Pfitzer
AU - Lien Lybaert
AU - Cedric Bogaert
AU - Bruno Fant
TI - Improving T-cell mediated immunogenic epitope identification via machine learning: the neoIM model
AID - 10.1101/2022.06.03.494687
DP - 2022 Jan 01
TA - bioRxiv
PG - 2022.06.03.494687
4099 - http://biorxiv.org/content/early/2022/06/04/2022.06.03.494687.short
4100 - http://biorxiv.org/content/early/2022/06/04/2022.06.03.494687.full
AB - The identification of immunogenic peptides that will elicit a CD8+ T cell-specific immune response is a critical step for various immunotherapeutic strategies such as cancer vaccines. Significant research effort has been directed towards predicting whether a peptide is presented on class I major histocompatibility complex (MHC I) molecules. However, only a small fraction of the peptides predicted to bind to MHC I turn out to be immunogenic. Prediction of immunogenicity, i.e. the likelihood that CD8+ T cells recognize and react to a peptide presented on MHC I, is of high interest to reduce validation costs, de-risk clinical studies and increase therapeutic efficacy, especially in a personalized setting where in vitro immunogenicity pre-screening is not possible. To address this, we present neoIM, a random forest classifier specifically trained to classify short peptides as immunogenic or non-immunogenic. This first-in-class algorithm was trained using a positive dataset of more than 8000 non-self immunogenic peptide sequences, and a negative dataset consisting of MHC I-presented peptides with one or two mismatches to the human proteome, chosen to more closely resemble a background of mutated but non-immunogenic peptides. Peptide features were constructed by performing principal component analysis on amino acid physicochemical properties and concatenating the values of the ten main principal components for each amino acid in the peptide, combined with a set of peptide-wide properties.
The neoIM algorithm outperforms the currently publicly available methods and is able to predict peptide immunogenicity with high accuracy (AUC=0.88). neoIM is MHC-allele agnostic, and in vitro validation through ELISPOT experiments on 33 cancer-derived neoantigens has confirmed its predictive power, showing that 71% of all immunogenic peptides were contained within the top 30% of neoIM predictions and that all immunogenic peptides were included when selecting the top 55% of peptides with the highest neoIM score. Finally, neoIM results can help to better predict the response to checkpoint inhibition therapy, especially in low-TMB tumors, by focusing on the number of immunogenic variants in a tumor. Overall, neoIM enables significantly improved identification of immunogenic peptides, allowing the development of more potent vaccines and providing new insights into the characteristics of immunogenic peptides.
Competing Interest Statement: Lena Pfitzer, Lien Lybaert and Bruno Fant are employees and shareholders at Myneo NV, a company developing neoantigen immunotherapies, and Cedric Bogaert is a founder and shareholder at MyNeo NV.
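
The feature construction described in the abstract (principal component analysis on amino acid physicochemical properties, then concatenating ten principal-component values per residue) can be illustrated with a minimal sketch. This is not the authors' implementation: the property table below is filled with random placeholder numbers rather than real physicochemical data, the function names are hypothetical, and the peptide-wide properties and the handling of peptides of different lengths are left out because the abstract does not specify them.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def make_property_table(n_properties=20, seed=0):
    """Placeholder table (20 residues x n_properties).

    Random numbers stand in for real physicochemical properties,
    which the abstract does not enumerate.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(size=(len(AMINO_ACIDS), n_properties))


def pca_scores(table, n_components=10):
    """Project each residue onto the leading principal components."""
    centered = table - table.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T  # shape (20, n_components)


def encode_peptide(peptide, scores):
    """Concatenate the per-residue component values along the peptide."""
    idx = [AMINO_ACIDS.index(aa) for aa in peptide]
    return scores[idx].ravel()


scores = pca_scores(make_property_table())
features = encode_peptide("SIINFEKL", scores)
print(features.shape)  # 8 residues x 10 components
```

A vector built this way, extended with whatever peptide-wide properties are chosen, could then be passed to a random forest classifier; the exact feature set and training setup of neoIM are given only in the full paper.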