Predicting proteomes of mitochondria and related organelles from genomic and expressed sequence tag data

Methods Enzymol. 2009:457:21-47. doi: 10.1016/S0076-6879(09)05002-2.

Abstract

In eukaryotes, determining the subcellular location of a novel protein encoded in genomic or transcriptomic data provides useful clues as to its possible function. However, experimental localization studies are expensive and time-consuming. As a result, accurate in silico prediction of subcellular localization from sequence data alone is an extremely important field of study in bioinformatics. This is especially so as genomic studies expand beyond model organisms to encompass the full diversity of eukaryotes. Here, we review some of the more commonly used programs for prediction of proteins that function in mitochondria or mitochondrion-related organelles in diverse eukaryotic lineages, and provide recommendations on how to apply these methods. Furthermore, we compare the predictive performance of these programs on a mixed set of mitochondrial and non-mitochondrial proteins. Although N-terminal targeting peptide prediction programs tend to have the highest accuracy, they cannot be effectively used for partial coding sequences derived from high-throughput expressed sequence tag surveys, where data for the N-terminus of the encoded protein is often missing. Therefore, methods that do not rely solely on the presence of an N-terminal targeting sequence are extremely useful, especially for expressed sequence tag data. The best strategy for classifying unknown proteins is to use multiple programs that incorporate a variety of prediction strategies, and to examine the predictions closely with an understanding of how each program is likely to handle the data.
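
The multi-program strategy described above can be illustrated with a minimal Python sketch. It is not taken from the reviewed chapter: the program labels, the `Prediction` record, and the simple majority-vote rule are all hypothetical and chosen only for illustration. The sketch assumes each program returns a yes/no mitochondrial call, and that calls from N-terminus-dependent programs should be ignored when the sequence is a partial fragment lacking its N-terminus. A small helper also shows how sensitivity and specificity could be computed on a labeled test set.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    program: str           # hypothetical program label, not a real tool name
    uses_n_terminus: bool  # True if the method relies on an N-terminal targeting peptide
    mitochondrial: bool    # the program's yes/no call for this protein


def consensus_call(predictions, has_n_terminus):
    """Majority vote over the predictions applicable to this sequence.

    Calls from N-terminus-dependent programs are dropped when the sequence
    lacks its N-terminus (e.g. a partial EST-derived fragment).
    Returns True or False for a clear majority, or None for a tie or when
    no prediction is usable.
    """
    usable = [p for p in predictions if has_n_terminus or not p.uses_n_terminus]
    if not usable:
        return None
    yes = sum(p.mitochondrial for p in usable)
    no = len(usable) - yes
    if yes == no:
        return None
    return yes > no


def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity) for paired boolean calls and labels.

    A value is None when its denominator is zero.
    """
    tp = sum(c and t for c, t in zip(calls, truth))
    fn = sum((not c) and t for c, t in zip(calls, truth))
    tn = sum((not c) and (not t) for c, t in zip(calls, truth))
    fp = sum(c and (not t) for c, t in zip(calls, truth))
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Illustrative input only: two N-terminus-dependent calls and one that is not.
example = [
    Prediction("targeting_peptide_tool_a", True, False),
    Prediction("targeting_peptide_tool_b", True, False),
    Prediction("composition_based_tool", False, True),
]
full_length_call = consensus_call(example, has_n_terminus=True)
fragment_call = consensus_call(example, has_n_terminus=False)
```

In this toy input the vote differs between the full-length case and the fragment case, because the two N-terminus-dependent calls are excluded for the fragment. A real analysis would still require inspecting each program's output individually, as the abstract recommends, rather than relying on a vote alone.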

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Databases, Protein
  • Expressed Sequence Tags*
  • Humans
  • Mitochondria / chemistry
  • Mitochondria / genetics*
  • Mitochondria / metabolism*
  • Mitochondrial Proteins / analysis*
  • Mitochondrial Proteins / genetics*
  • Mitochondrial Proteins / metabolism
  • Proteome / analysis
  • Proteome / genetics
  • Proteome / metabolism
  • Proteomics / methods*
  • Software

Substances

  • Mitochondrial Proteins
  • Proteome