PT - JOURNAL ARTICLE
AU - Vladimir Gligorijevic
AU - P. Douglas Renfrew
AU - Tomasz Kosciolek
AU - Julia Koehler Leman
AU - Kyunghyun Cho
AU - Tommi Vatanen
AU - Daniel Berenberg
AU - Bryn Taylor
AU - Ian M. Fisk
AU - Ramnik J. Xavier
AU - Rob Knight
AU - Richard Bonneau
TI - Structure-Based Function Prediction using Graph Convolutional Networks
AID - 10.1101/786236
DP - 2019 Jan 01
TA - bioRxiv
PG - 786236
4099 - http://biorxiv.org/content/early/2019/10/04/786236.short
4100 - http://biorxiv.org/content/early/2019/10/04/786236.full
AB - Recent massive increases in the number of sequences available in public databases challenge current experimental approaches to determining protein function. These methods are limited by both the large scale of these sequence databases and the diversity of protein functions. We present a deep learning Graph Convolutional Network (GCN) trained on sequence and structural data and evaluate it on ~40k proteins with known structures and functions from the Protein Data Bank (PDB). Our GCN predicts functions more accurately than both Convolutional Neural Networks trained on sequence data alone and competing methods. Feature extraction via a language model removes the need for constructing multiple sequence alignments or feature engineering. Our model learns general structure-function relationships, as shown by its robust prediction of functions for proteins with ≤ 30% sequence identity to the training set. Using class activation mapping, we can automatically identify the structural regions, at the residue level, that lead to each function prediction for every confidently predicted protein, advancing site-specific function prediction. De-noising inherent in the trained model allows only a minor drop in performance when predicted structures are used, including those from multiple de novo protocols. We use our method to annotate all proteins in the PDB, making several new confident function predictions spanning both fold and function trees.