Abstract
Reverse vaccinology (RV) provides a systematic approach to identifying potential vaccine candidates based on protein sequences. The integration of machine learning (ML) into this process has greatly enhanced our ability to predict viable vaccine candidates from these sequences. We previously developed Vaxign-ML, a program based on eXtreme Gradient Boosting (XGBoost). In this study, we extend that work to develop Vaxign-DL, a program based on deep learning techniques. Deep neural networks assemble non-linear models and learn multilevel abstractions of data using hierarchically structured layers, offering a data-driven approach in computational design models. Vaxign-DL uses a three-layer fully connected neural network model. Using the same bacterial vaccine candidate training data used in Vaxign-ML development, Vaxign-DL achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.94, specificity of 0.99, sensitivity of 0.74, and accuracy of 0.96. Using the Leave-One-Pathogen-Out Validation (LOPOV) method, Vaxign-DL was able to predict vaccine candidates for 10 pathogens. Our benchmark study shows that Vaxign-DL achieved results comparable to those of Vaxign-ML in most cases, and that it outperforms Vaxi-DL in accurately predicting bacterial protective antigens.
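The LOPOV scheme and the reported metrics can be illustrated with a minimal sketch. This is not the Vaxign-DL implementation: the function names `leave_one_pathogen_out` and `binary_metrics` and the toy labels are hypothetical, chosen only to show how proteins from one pathogen are held out per fold and how sensitivity, specificity, and accuracy follow from a confusion matrix.

```python
from collections import defaultdict


def leave_one_pathogen_out(pathogens):
    """Yield (held_out, train_idx, test_idx), holding out one pathogen per fold.

    `pathogens` is a list giving the source pathogen of each protein
    (hypothetical input format, assumed for illustration).
    """
    groups = defaultdict(list)
    for i, name in enumerate(pathogens):
        groups[name].append(i)
    for held_out, test_idx in groups.items():
        # Train on every protein that does not come from the held-out pathogen.
        train_idx = [i for i, name in enumerate(pathogens) if name != held_out]
        yield held_out, train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,  # true positive rate
        "specificity": tn / (tn + fp) if (tn + fp) else nan,  # true negative rate
        "accuracy": (tp + tn) / len(y_true) if y_true else nan,
    }


if __name__ == "__main__":
    # Toy example: four proteins from three made-up pathogens.
    for held_out, train_idx, test_idx in leave_one_pathogen_out(["A", "A", "B", "C"]):
        print(held_out, train_idx, test_idx)
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a real evaluation, a model would be retrained on `train_idx` in each fold and scored on `test_idx`, so that every prediction for a pathogen comes from a model that never saw proteins from that pathogen.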
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* Co-first author.