Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics

Proteomics Clin Appl. 2015 Aug;9(7-8):745-54. doi: 10.1002/prca.201400164. Epub 2015 Apr 2.

Abstract

Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently, the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans-Proteomic Pipeline (TPP) is a robust open-source standardized data processing pipeline for large-scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features.

Keywords: Bioinformatics; Mass spectrometry.

Publication types

  • Research Support, American Recovery and Reinvestment Act
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Computational Biology / methods*
  • Humans
  • Proteome / metabolism
  • Proteomics / methods*
  • Reproducibility of Results
  • Software
  • Statistics as Topic*

Substances

  • Proteome