INPS: predicting the impact of non-synonymous variations on protein stability from sequence

Bioinformatics. 2015 Sep 1;31(17):2816-21. doi: 10.1093/bioinformatics/btv291. Epub 2015 May 7.

Abstract

Motivation: A tool for reliably predicting the impact of variations on protein stability is extremely important both for protein engineering and for understanding the effects of Mendelian and somatic mutations in the genome. Next Generation Sequencing studies are constantly increasing the number of known protein sequences. Given the huge disproportion between the number of protein sequences and the number of protein structures, there is a need for tools that can annotate the effect of mutations starting from the protein sequence, without relying on the structure. Here, we describe INPS, a novel approach for annotating the effect of non-synonymous mutations on protein stability from sequence. INPS is based on SVM regression and is trained to predict the thermodynamic free energy change upon single-point variations in protein sequences.

Results: We show that INPS performs similarly to the state-of-the-art methods based on protein structure when tested in cross-validation on a non-redundant dataset. INPS also performs very well on a newly generated dataset consisting of a number of variations occurring in the tumor suppressor protein p53. Our results suggest that INPS is a tool suited for computing the effect of non-synonymous polymorphisms on protein stability when the protein structure is not available. We also show that INPS predictions are complementary to those of the state-of-the-art, structure-based method mCSM. When the two methods are combined, the overall prediction on the p53 set scores significantly higher than that of either method alone.
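The abstract does not state how the two predictors are combined or which score is used. As an illustration only, the sketch below assumes the simplest scheme: a weighted average of the two predicted free energy changes, evaluated with the Pearson correlation against measured values. The function names, the `weight` parameter and the toy numbers are all hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def combine(seq_pred: Sequence[float],
            struct_pred: Sequence[float],
            weight: float = 0.5) -> List[float]:
    """Weighted average of two sets of predicted free energy changes.

    ASSUMPTION: plain averaging is a stand-in; the combination actually
    used by the authors is not described in the abstract.
    """
    if len(seq_pred) != len(struct_pred):
        raise ValueError("prediction lists must have the same length")
    return [weight * s + (1.0 - weight) * t
            for s, t in zip(seq_pred, struct_pred)]


if __name__ == "__main__":
    # Made-up toy values (kcal/mol), for illustration only.
    measured = [-1.2, 0.3, -2.5, -0.4, 0.8]
    sequence_based = [-0.9, 0.1, -1.8, -1.0, 0.2]   # hypothetical sequence-based output
    structure_based = [-1.6, 0.6, -2.0, 0.1, 0.5]   # hypothetical structure-based output

    merged = combine(sequence_based, structure_based)
    for name, pred in [("sequence", sequence_based),
                       ("structure", structure_based),
                       ("combined", merged)]:
        print(f"{name:>9}: r = {pearson(measured, pred):.3f}")
```

Averaging can only help when the errors of the two predictors are not strongly correlated, which is one way to read the abstract's claim that the predictions are complementary.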

Availability and implementation: The presented method is available as a web server at http://inps.biocomp.unibo.it.

Contact: piero.fariselli@unibo.it

Supplementary information: Supplementary Materials are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Machine Learning
  • Mutation / genetics*
  • Protein Engineering
  • Protein Stability*
  • Proteins / chemistry*
  • Proteins / genetics
  • Software*
  • Thermodynamics
  • Tumor Suppressor Protein p53 / chemistry*
  • Tumor Suppressor Protein p53 / genetics

Substances

  • Proteins
  • TP53 protein, human
  • Tumor Suppressor Protein p53