Abstract
Protein structure prediction has greatly improved, but a substantial fraction of predicted models are still not of very high quality. Protein model refinement is one way to further improve model quality, yet refining a protein model towards better quality remains very challenging. Currently, the most successful refinement methods rely on extensive conformation sampling and thus take hours or days to refine even a single protein model. Here we propose a fast and effective method that refines protein models with very limited conformation sampling. Our method applies graph neural networks (GNNs) to predict a refined inter-atom distance probability distribution from an initial model and then rebuilds the model using the predicted distances as restraints. On the CASP13 refinement targets, our method refines models to a quality comparable to that of the two leading human groups (Feig and Baker) and greatly outperforms the others. On the CASP14 refinement targets, our method is second only to Feig’s method, comparable to Baker’s method, and much better than the others (which worsened rather than improved model quality). Our method achieves this result by generating only 5 refined models for each initial model, which can be done in ∼15 minutes. Our study also shows that GNNs perform much better than convolutional residual neural networks for protein model refinement when conformation sampling is limited.
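To illustrate how a predicted distance probability distribution could serve as a restraint, the following minimal sketch converts a binned distribution for one atom pair into per-bin energies. It is an illustration only: the negative log-probability form, the function names, the bin layout and the `eps` floor are all assumptions made here, and the abstract does not specify how the restraints are actually constructed.

```python
import math


def distance_potential(probs, bin_edges, eps=1e-6):
    """Turn a binned distance distribution into per-bin energies.

    Assumption (not from the paper): energy = -log(p), shifted so that the
    most probable bin has energy 0.

    probs     -- probability of each distance bin for one atom pair
    bin_edges -- bin boundaries in Angstrom, len(probs) + 1 values
    eps       -- floor that keeps log() finite for empty bins
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("bin_edges must have one more entry than probs")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probs must contain positive mass")
    normalized = [p / total for p in probs]
    centers = [(lo + hi) / 2 for lo, hi in zip(bin_edges[:-1], bin_edges[1:])]
    energies = [-math.log(max(p, eps)) for p in normalized]
    lowest = min(energies)
    return centers, [e - lowest for e in energies]


def most_likely_distance(probs, bin_edges):
    """Return the center of the lowest-energy (most probable) bin."""
    centers, energies = distance_potential(probs, bin_edges)
    best = min(range(len(energies)), key=energies.__getitem__)
    return centers[best]


# Hypothetical three-bin distribution for a single atom pair.
example_probs = [0.1, 0.6, 0.3]
example_edges = [4.0, 6.0, 8.0, 10.0]
example_centers, example_energies = distance_potential(example_probs, example_edges)
example_best = most_likely_distance(example_probs, example_edges)
```

In a sketch of this kind, a sharply peaked distribution gives a steep energy well and therefore a strong restraint, whereas a flat distribution contributes little to model rebuilding.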
Availability The code will be released once the manuscript is published and will be available at http://raptorx.uchicago.edu
Contact jinboxu{at}gmail.com
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
1. Corrected the description of the message block for edges.
2. Added citations for FastRelax and the ref2015 scoring function.