Prediction of protein function using protein-protein interaction data

J Comput Biol. 2003;10(6):947-60. doi: 10.1089/106652703322756168.

Abstract

Assigning functions to novel proteins is one of the most important problems in the postgenomic era. Several approaches have been applied to this problem, including the analysis of gene expression patterns, phylogenetic profiles, protein fusions, and protein-protein interactions. In this paper, we develop a novel approach that employs the theory of Markov random fields to infer a protein's functions from protein-protein interaction data and the functional annotations of the protein's interaction partners. For each protein and each function of interest, we use Bayesian approaches to predict the probability that the protein has that function. Unlike other available approaches to protein annotation, in which a protein either has or does not have a function of interest, we give a probability of having the function. This probability indicates how confident we are in the prediction. We apply our method to predict the functions of yeast proteins in the "biochemical function," "subcellular location," and "cellular role" categories defined in the Yeast Proteome Database (YPD, www.incyte.com), using protein-protein interaction data from the Munich Information Center for Protein Sequences (MIPS, mips.gsf.de). We show that our approach outperforms other available methods for function prediction based on protein interaction data. Supplementary data are available at www-hto.usc.edu/~msms/ProteinFunction.
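
The abstract does not state the model's equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a logistic (auto-logistic) conditional in which the log-odds that a protein has a function depend on how many of its interaction partners have or lack that function, and it estimates posterior probabilities for unannotated proteins by Gibbs sampling. The parameters alpha, beta, and gamma, the function names, and the toy network with proteins A, B, C, D, X, and Y are all hypothetical.

```python
import math
import random


def conditional_prob(n_with, n_without, alpha, beta, gamma):
    """P(protein has the function | partner labels) under an assumed
    logistic conditional: logit = alpha + beta*n_with + gamma*n_without."""
    logit = alpha + beta * n_with + gamma * n_without
    return 1.0 / (1.0 + math.exp(-logit))


def gibbs_posterior(edges, known, unknown, alpha, beta, gamma,
                    sweeps=2000, burn_in=500, seed=0):
    """Estimate P(function) for each unannotated protein by Gibbs sampling.

    edges   : iterable of (protein, protein) interaction pairs
    known   : dict protein -> 0/1 annotation, held fixed
    unknown : list of proteins whose labels are sampled
    Every protein that appears in edges must be in known or unknown.
    """
    rng = random.Random(seed)

    # Build an undirected adjacency map from the interaction pairs.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Fixed labels for annotated proteins, random start for the rest.
    state = dict(known)
    for p in unknown:
        state[p] = rng.randint(0, 1)

    counts = {p: 0 for p in unknown}
    for sweep in range(sweeps):
        for p in unknown:
            nbrs = neighbors.get(p, ())
            n_with = sum(state[q] for q in nbrs)
            n_without = len(nbrs) - n_with
            prob = conditional_prob(n_with, n_without, alpha, beta, gamma)
            state[p] = 1 if rng.random() < prob else 0
        # After burn-in, count how often each protein is labeled 1.
        if sweep >= burn_in:
            for p in unknown:
                counts[p] += state[p]

    kept = sweeps - burn_in
    return {p: counts[p] / kept for p in unknown}


# Hypothetical toy network: X interacts with three annotated proteins
# that have the function; Y interacts with X and with D, which lacks it.
edges = [("X", "A"), ("X", "B"), ("X", "C"), ("Y", "X"), ("Y", "D")]
known = {"A": 1, "B": 1, "C": 1, "D": 0}
posterior = gibbs_posterior(edges, known, ["X", "Y"],
                            alpha=-1.0, beta=1.0, gamma=-0.5)
print(posterior)
```

In this toy example X receives a higher estimated probability than Y, because more of its partners carry the function. The output is a probability for each protein rather than a yes/no call, which is the property the abstract emphasizes. In practice the parameters would have to be estimated from annotated data rather than fixed by hand as they are here.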

MeSH terms

  • Bayes Theorem
  • Computational Biology / methods*
  • Databases, Protein
  • Fungal Proteins / classification
  • Fungal Proteins / metabolism
  • Models, Statistical
  • Predictive Value of Tests
  • Protein Binding
  • Protein Interaction Mapping / methods*
  • Proteins / classification
  • Proteins / metabolism*
  • Sensitivity and Specificity
  • Subcellular Fractions / metabolism

Substances

  • Fungal Proteins
  • Proteins