SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome

Bioinformatics. 2014 Dec 1;30(23):3356-64. doi: 10.1093/bioinformatics/btu550. Epub 2014 Aug 22.

Abstract

Motivation: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory.

Results: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein-protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors.
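To illustrate how a naive Bayes classifier can turn several conflicting location calls into one consensus call with a probability, here is a minimal sketch. It is not the SUBAcon implementation. The source names (`predictor_a`, `gfp`), the toy training records, the add-one smoothing, and the function names `train` and `consensus` are all assumptions made for illustration. Each evidence source is treated as conditionally independent given the true location, which is the defining simplification of naive Bayes.

```python
import math
from collections import Counter, defaultdict


def train(records, alpha=1.0):
    """Count how often each source reports each call for each true location.

    records: list of (true_location, {source_name: called_location}).
    alpha: additive smoothing constant (an assumption of this sketch).
    """
    prior = Counter(loc for loc, _ in records)
    counts = defaultdict(lambda: defaultdict(Counter))  # source -> true loc -> call
    calls_seen = defaultdict(set)                       # source -> distinct calls
    for loc, calls in records:
        for source, call in calls.items():
            counts[source][loc][call] += 1
            calls_seen[source].add(call)
    return {
        "locations": sorted(prior),
        "prior": prior,
        "counts": counts,
        "calls_seen": calls_seen,
        "alpha": alpha,
        "n": len(records),
    }


def consensus(model, calls):
    """Return (best_location, {location: posterior probability})."""
    alpha = model["alpha"]
    log_post = {}
    for loc in model["locations"]:
        # Start from the log prior, then add one log likelihood per source.
        lp = math.log(model["prior"][loc] / model["n"])
        for source, call in calls.items():
            if source not in model["counts"]:
                continue  # source never seen in training: contributes nothing
            c = model["counts"][source][loc]
            k = len(model["calls_seen"][source] | {call})
            lp += math.log((c[call] + alpha) / (sum(c.values()) + alpha * k))
        log_post[loc] = lp
    # Normalize in log space to avoid underflow with many sources.
    top = max(log_post.values())
    z = sum(math.exp(v - top) for v in log_post.values())
    probs = {loc: math.exp(v - top) / z for loc, v in log_post.items()}
    return max(probs, key=probs.get), probs


# Toy training set (hypothetical): true location plus what each source reported.
records = [
    ("mitochondrion", {"predictor_a": "mitochondrion", "gfp": "mitochondrion"}),
    ("mitochondrion", {"predictor_a": "mitochondrion", "gfp": "plastid"}),
    ("plastid", {"predictor_a": "plastid", "gfp": "plastid"}),
    ("plastid", {"predictor_a": "mitochondrion", "gfp": "plastid"}),
]
model = train(records)
best, probs = consensus(
    model, {"predictor_a": "mitochondrion", "gfp": "mitochondrion"}
)
print(best, probs)
```

A source that is often wrong for a given compartment gets a flatter likelihood and so moves the posterior less, which is how this kind of classifier can weigh unreliable evidence without discarding it.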

Availability: SUBAcon is a useful tool for recovering proteome-wide subcellular locations of Arabidopsis proteins and is displayed in the SUBA3 database (http://suba.plantenergy.uwa.edu.au). The source code and input data are available through the SUBA3 server (http://suba.plantenergy.uwa.edu.au//SUBAcon.html), and the Arabidopsis SUbproteome REference (ASURE) training set can be accessed using the ASURE web portal (http://suba.plantenergy.uwa.edu.au/ASURE).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Arabidopsis / chemistry*
  • Arabidopsis / genetics
  • Arabidopsis / metabolism
  • Arabidopsis Proteins / analysis*
  • Arabidopsis Proteins / genetics
  • Arabidopsis Proteins / metabolism
  • Bayes Theorem
  • Databases, Protein
  • Green Fluorescent Proteins / genetics
  • Mass Spectrometry
  • Membrane Proteins / analysis
  • Protein Interaction Mapping
  • Proteome / analysis*
  • Proteome / genetics
  • Proteome / metabolism
  • Software

Substances

  • Arabidopsis Proteins
  • Membrane Proteins
  • Proteome
  • Green Fluorescent Proteins