TY  - JOUR
T1  - EpitopeVec: Linear Epitope Prediction Using Deep Protein Sequence Embeddings
JF  - bioRxiv
DO  - 10.1101/2020.11.26.395830
SP  - 2020.11.26.395830
AU  - Akash Bahai
AU  - Ehsaneddin Asgari
AU  - Mohammad R.K. Mofrad
AU  - Andreas Kloetgen
AU  - Alice C. McHardy
Y1  - 2020/01/01
UR  - http://biorxiv.org/content/early/2020/11/26/2020.11.26.395830.abstract
N2  - Motivation: B-cell epitopes (BCEs) play a pivotal role in the development of peptide vaccines, immunodiagnostic reagents, and antibody production, and thus more generally in infectious disease prevention and diagnosis. Experimental methods for determining BCEs are costly and time-consuming, so computational methods for the rapid identification of BCEs are essential. Although several computational methods have been developed for this task, cross-testing of classifiers trained and tested on different datasets revealed their limitations, with accuracies of 51 to 53%. Results: We describe a new method called EpitopeVec, which uses residue properties, modified antigenicity scales, and a ProtVec representation of peptides for linear BCE prediction with machine learning techniques. Evaluation on several large and small datasets, as well as cross-testing, demonstrated an improvement over state-of-the-art performance in terms of accuracy and AUC. Predictive performance depended on the type of antigen (viral, bacterial, eukaryote, etc.). In view of that, we also trained our method on a large viral dataset to create a linear viral BCE predictor. Availability: The software is available at https://github.com/hzi-bifo/epitope-prediction under the GPL 3.0 license. Contact: alice.mchardy{at}helmholtz-hzi.de. Supplementary information: Supplementary data are available at Bioinformatics online. Competing Interest Statement: The authors have declared no competing interest.
ER  -