RT Journal Article
SR Electronic
T1 Structure-Based Function Prediction using Graph Convolutional Networks
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 786236
DO 10.1101/786236
A1 Vladimir Gligorijevic
A1 P. Douglas Renfrew
A1 Tomasz Kosciolek
A1 Julia Koehler Leman
A1 Kyunghyun Cho
A1 Tommi Vatanen
A1 Daniel Berenberg
A1 Bryn Taylor
A1 Ian M. Fisk
A1 Ramnik J. Xavier
A1 Rob Knight
A1 Richard Bonneau
YR 2019
UL http://biorxiv.org/content/early/2019/10/04/786236.abstract
AB Recent massive increases in the number of sequences available in public databases challenge current experimental approaches to determining protein function. These methods are limited by both the large scale of these sequence databases and the diversity of protein functions. We present a deep learning Graph Convolutional Network (GCN) trained on sequence and structural data and evaluate it on ~40k proteins with known structures and functions from the Protein Data Bank (PDB). Our GCN predicts functions more accurately than Convolutional Neural Networks trained on sequence data alone and competing methods. Feature extraction via a language model removes the need for constructing multiple sequence alignments or feature engineering. Our model learns general structure-function relationships by robustly predicting functions of proteins with ≤ 30% sequence identity to the training set. Using class activation mapping, we can automatically identify structural regions at the residue level that lead to each function prediction for every protein confidently predicted, advancing site-specific function prediction. De-noising inherent in the trained model allows only a minor drop in performance when structure predictions are used, including multiple de novo protocols. We use our method to annotate all proteins in the PDB, making several new confident function predictions spanning both fold and function trees.