Structure-based predictive models for allosteric hot spots

PLoS Comput Biol. 2009 Oct;5(10):e1000531. doi: 10.1371/journal.pcbi.1000531. Epub 2009 Oct 9.

Abstract

In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used: Feature Set 1, consisting of dynamical, structural, network, and informatic measures, and Feature Set 2, consisting of structural measures defined by Daily and Gray. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68-81% of known hotspots, and 58-67% of all hotspot predictions were actual hotspots. Hence, these models have precision P = 58-67% and recall R = 68-81%. The corresponding models for Feature Set 2 had P = 55-59% and R = 81-92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73-81% and P = 64-71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. The pattern of predicted hotspots also corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues.
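The precision and recall figures quoted in the abstract follow the standard definitions: recall is the fraction of known hotspots that a model predicts as hotspots, and precision is the fraction of hotspot predictions that are actual hotspots. A minimal sketch of that arithmetic is given here, assuming binary per-residue labels (1 = hotspot, 0 = non-hotspot). The function name `precision_recall` and the toy labels are hypothetical illustrations, not data or code from the study.

```python
def precision_recall(true_labels, predicted_labels):
    """Return (precision, recall) for binary labels: 1 = hotspot, 0 = non-hotspot."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(true_labels, predicted_labels))
    # True positives: known hotspots that were predicted as hotspots.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    # False positives: non-hotspots that were predicted as hotspots.
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # False negatives: known hotspots that the model missed.
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against division by zero when there are no predictions or no hotspots.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example (hypothetical labels for ten residues, not from the paper).
true_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision, recall = precision_recall(true_labels, predicted_labels)
print(f"P = {precision:.0%}, R = {recall:.0%}")
```

In this toy case, three of the four known hotspots are recovered and three of the five hotspot predictions are correct. This also shows how a model can trade precision for recall, as the Feature Set 2 models in the abstract do relative to the Feature Set 1 models.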

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Allosteric Site / genetics*
  • Artificial Intelligence
  • Cluster Analysis
  • Lac Repressors / chemistry
  • Lac Repressors / genetics
  • Lac Repressors / metabolism
  • Models, Chemical*
  • Models, Molecular
  • Protein Binding
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism
  • Reproducibility of Results
  • Signal Transduction
  • Structure-Activity Relationship*
  • Thermodynamics
  • Transcription Factors

Substances

  • Lac Repressors
  • Proteins
  • Transcription Factors