RT Journal Article
SR Electronic
T1 Predicting Protein Binding Affinity With Word Embeddings and Recurrent Neural Networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 128223
DO 10.1101/128223
A1 Carlo Mazzaferro
YR 2017
UL http://biorxiv.org/content/early/2017/04/18/128223.abstract
AB At the core of our immune system lies a group of proteins named the Major Histocompatibility Complex (MHC), to which epitopes (also proteins, sometimes called antigenic determinants) bind to elicit a response. These responses are extremely varied and differ widely in nature. For instance, Killer and Helper T cells are responsible for, respectively, counteracting viral pathogens and tumorous cells. Many other types exist, but their underlying structure can be very similar because they are all proteins and bind to the MHC receptor in a similar fashion. With this framework in mind, being able to predict with precision the structure of a protein that will elicit a specific response in the human body represents a novel computational approach to drug discovery. Although many machine learning approaches have been used, no attempt to solve this problem using Recurrent Neural Networks (RNNs) exists. We extend the current efforts in the field by applying a variety of network architectures based on RNNs and word embeddings (WE). The code is freely available and currently under development at https://github.com/carlomazzaferro/mhcPreds