Elsevier

Journal of Proteomics

Volume 73, Issue 9, 5 August 2010, Pages 1740-1746

Implementation and evaluation of relative and absolute quantification in shotgun proteomics with label-free methods

https://doi.org/10.1016/j.jprot.2010.05.011

Abstract

Tandem mass spectrometry allows fast protein identification in complex samples. As mass spectrometers have become faster, more sensitive and more accurate, many academic research groups and commercial suppliers have devised methods that also make quantitative protein research possible. Because label-free methods are an attractive alternative to labeling approaches for proteomics researchers seeking accurate quantitative results, we evaluated several open-source analysis tools in terms of their performance on two reference data sets generated explicitly for this purpose.

In this paper we present an implementation, T3PQ (Top 3 Protein Quantification), of the method suggested by Silva and colleagues for LC-MSE applications, and we demonstrate its applicability to data generated on FT-ICR instruments operated in data-dependent acquisition (DDA) mode. To validate this method and to show that it is also useful for absolute protein quantification, we generated a reference data set from a sample containing four different proteins of known concentration. Furthermore, we compare three other label-free quantification methods using a complex biological sample spiked with a standard protein at defined concentrations. We evaluate the applicability of these methods and the quality of their results in terms of robustness and dynamic range, both for the spiked-in protein and for other proteins detected in the mixture. We discuss the drawbacks of each method individually and consider crucial points for experimental design. The source code of our implementation is available under the terms of the GNU GPLv3 and can be downloaded from SourceForge (http://fqms.svn.sourceforge.net/svnroot/fqms). A tarball containing the data used for the evaluation is available on the FGCZ web server (http://fgcz-data.uzh.ch/public/T3PQ.tgz).

Introduction

In recent years, technological developments in mass spectrometry have dramatically enhanced the throughput, sensitivity and resolution of analytical technologies. Over the same period, computational methods have emerged that focus on the quantitative assessment of proteins in a complex sample [1]. Quantitative results can be obtained by either labeled [2], [3], [4] or unlabeled approaches [5], [6], [7], [8], [9], [10], [11].

Labeling techniques introduce heavy isotopes such as 2H, 13C or 15N into proteins or peptides. Protein quantification is achieved by comparing the relative intensities of the MS signals of peptides in a combined sample of labeled and unlabeled proteins. Commonly used isotopic labeling methods are ICAT (isotope coded affinity tag) [12], iTRAQ (isobaric tag for relative and absolute quantification) [13], and SILAC (stable isotope labeling with amino acids in cell culture) [14]. SILAC is a metabolic labeling technique that uses isotopically labeled amino acids to label proteins, whereas ICAT and iTRAQ are chemical labeling strategies that use labeled tags. ICAT is used for relative quantification of two samples, whereas iTRAQ can be used for relative quantification of up to 8 samples.

In general, straightforward sample preparation and the absence of extra costs make label-free methods attractive compared to labeling approaches, which are both expensive and time consuming. Moreover, label-free methods can be multiplexed to a higher degree and can even be applied to data that have already been acquired. These approaches will prove essential for proteomics to move into a phase beyond mere protein identification. If quantitative information on proteins in complex mixtures becomes more robust and more complete, it will enable modeling efforts in systems biology that currently lack solid quantitative protein data. Building on this, questions about the perturbation and regulation of whole systems could be investigated in silico.

While labeling approaches are always peptide oriented, label-free methods can be divided into peptide oriented and protein oriented approaches. The label-free approach we used in our implementation was originally suggested by Silva et al. [8]. It takes the intensities of the precursor ions of the three most efficiently ionized and identified peptides of a protein to calculate a measure of its abundance. Two other protein oriented approaches are emPAI (exponentially modified protein abundance index; here we used the Mascot server 2.2 implementation of the emPAI value, herein called Mascot-emPAI) [6] and APEX (absolute protein expression) [9], which are both based on spectral counting to provide a measure of relative protein abundance. An example of a label-free peptide oriented approach is the program SuperHirn [10], which uses the feature intensity on the LC–MS level to quantify each peptide of a particular protein. The presented evaluation of three different protein oriented methods and one peptide oriented method on a reference data set will allow readers to make an informed choice for their own research. Comparing a spiked-in protein of known concentration with the change in quantity of other proteins identified in the mixture allows the reliability of the results to be evaluated in terms of quality, dynamic range, and linearity towards concentration changes. Silva et al. [8] showed in their work on a Q-ToF type instrument that unknown protein samples can be quantified in an absolute manner using a known unified signal response factor. We show here that this technique can also be used with ion trap based instruments, and we present software that offers an automated workflow to retrieve quantitative data from LC–MS/MS runs.
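The top-three idea described above can be illustrated with a minimal sketch. It assumes that the abundance measure is the average of the three highest precursor intensities of a protein, and that a response factor (signal per fmol) derived from a standard of known amount converts this measure into an absolute estimate. The function names, the peptide labels, the intensities and the rule of rejecting proteins with fewer than three peptides are hypothetical choices made for illustration; they are not taken from the T3PQ source code.

```python
from statistics import mean
from typing import Dict


def top3_abundance(peptide_intensities: Dict[str, float]) -> float:
    """Average the three highest precursor intensities of one protein.

    Raising an error for fewer than three identified peptides is an
    assumption of this sketch, not a rule stated in the text.
    """
    if len(peptide_intensities) < 3:
        raise ValueError("at least three identified peptides are required")
    top3 = sorted(peptide_intensities.values(), reverse=True)[:3]
    return mean(top3)


def response_factor(standard_top3: float, standard_fmol: float) -> float:
    """Signal per fmol, derived from a standard protein of known amount."""
    return standard_top3 / standard_fmol


def absolute_amount(protein_top3: float, factor: float) -> float:
    """Estimate the amount (fmol) of a protein from its top-three signal."""
    return protein_top3 / factor


# Hypothetical precursor intensities (arbitrary units), keyed by peptide.
standard = {"PEP_A": 9.0e6, "PEP_B": 6.0e6, "PEP_C": 3.0e6, "PEP_D": 1.0e6}
unknown = {"PEP_X": 4.0e6, "PEP_Y": 3.0e6, "PEP_Z": 2.0e6}

# The standard is assumed to be loaded at 40 fmol on column.
factor = response_factor(top3_abundance(standard), standard_fmol=40.0)
estimate = absolute_amount(top3_abundance(unknown), factor)
print(f"estimated amount: {estimate:.1f} fmol")
```

With these made-up numbers the standard yields 150,000 intensity units per fmol, so the unknown protein is estimated at 20 fmol. Only the three strongest peptides contribute, which is why the weak fourth peptide of the standard does not change the result.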

We focused on protein oriented approaches to circumvent the difficulty and uncertainty of peptide to protein inference. Besides using the three published programs (APEX, SuperHirn, Mascot-emPAI), we implemented the idea of Silva et al. [8] in our T3PQ software. This software uses result files from a Mascot search and the corresponding generic mzXML files of the LC–MS/MS runs as input. The APEX [9] method was used with the output of the ISB Trans-Proteomic Pipeline (TPP), which is also able to deal with Mascot results. For emPAI we used the built-in implementation of the Mascot server 2.2 software (for details, see the Computational methods section), which is a slightly modified version of the original emPAI value as described by Ishihama et al. [6]. To compare the linearity of the quantitative response of these methods with that of peptide oriented methods such as SuperHirn, we normalized all read-outs of the different spike concentrations to the 40 fmol condition as a reference measurement.
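The normalization step can be sketched as follows. Dividing every read-out of a method by its own read-out at the 40 fmol condition puts intensity-based and count-based measures on a common relative scale, so that their response to concentration changes can be compared directly. Apart from the 40 fmol reference, the spike concentrations and all values below are hypothetical, as is the function name.

```python
from typing import Dict


def normalize_to_reference(
    readouts: Dict[float, float], reference_fmol: float = 40.0
) -> Dict[float, float]:
    """Divide each read-out by the read-out at the reference condition.

    Keys are spike amounts in fmol, values are the raw read-outs of one
    quantification method (intensity, spectral count, index, ...).
    """
    reference = readouts[reference_fmol]
    if reference == 0:
        raise ValueError("reference read-out must be non-zero")
    return {conc: value / reference for conc, value in readouts.items()}


# Hypothetical raw read-outs of two methods on different scales.
intensity_based = {20.0: 1.5e6, 40.0: 3.0e6, 80.0: 6.0e6}
count_based = {20.0: 6.0, 40.0: 10.0, 80.0: 15.0}

for name, raw in [("intensity", intensity_based), ("counts", count_based)]:
    relative = normalize_to_reference(raw)
    print(name, {conc: round(value, 2) for conc, value in relative.items()})
```

A perfectly linear method would give relative values equal to the concentration ratio (0.5, 1.0 and 2.0 in this made-up series); deviations from that line indicate compression or saturation of the response.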

Section snippets

Standard protein and yeast extract preparation

Fetuin A (P12763, bovine, Fetuin), beta-lactoglobulin (P02754, bovine, b-LAC), glyceraldehyde-3-phosphate dehydrogenase (P46406, rabbit, GAPDH), and beta-galactosidase (P00722, E. coli, b-GAL) were purchased from Sigma-Aldrich. All protein samples were prepared following standard protocols. The proteins as well as a complex yeast extract were independently reduced, alkylated and digested. Proteins were denatured and reduced for 45 min at 60 °C in 50 mM ammonium bicarbonate buffer (pH 8.0) containing 8

Data set 1

We analyzed four tryptically digested proteins with known concentrations within a range of 0.7 fmol–135 fmol on column. The measurements were performed on an LTQ-FT-ICR Ultra mass spectrometer in data-dependent acquisition (DDA) mode. Fig. 2 shows a linear relationship between the average of the three most intense MS signals of the tryptic peptides of one protein and the protein abundance. We used four different proteins with different molecular masses to show that the linear dependency is not affected by the

Conclusion and remarks

Our evaluation shows that the currently publicly available label-free protein quantification methods are limited in terms of dynamic range, variance, and accuracy of protein abundance calculation. Although all tested methods are able to capture the increasing concentration of a spiked-in protein, there are differences with respect to the linear response and variance of the protein abundance values. We show that for higher protein concentrations (> 100 fmol on column) or if the sample complexity

Author contributions

JG: performed APEX, SuperHirn and Mascot-emPAI analysis and data analysis, wrote the paper; BR: prepared all extracts, performed all mass spectrometry experiments, outlined the paper; CF: prepared extracts, performed T3PQ analysis, compiled part of supplementary material; CP: implemented the T3PQ method, performed the data analysis, drew the plots, and outlined the paper; SB: extracted emPAI values, compiled part of supplementary material, revised the paper; DR: intellectual input, revised the

Acknowledgements

This work was supported by the UZH Research Priority Program (URPP) Systems Biology/Functional Genomics, the European Sixth Framework Programme SYSPROT (LSHG-CT-2006-37457) and AGRON-OMICS (LSHG-CT-2006-037704). The authors declare no conflict of interest.

The authors thank Prof. Jiricny's laboratory for providing the yeast samples which were used as biological background. We would also like to thank our colleagues at the FGCZ and Dr. Ermir Qeli for the intellectual input and critical reading

References (15)

  • P.L. Ross et al., Mol Cell Proteomics (2004)
  • S.E. Ong et al., Mol Cell Proteomics (2002)
  • Y. Ishihama et al., Mol Cell Proteomics (2005)
  • M. Ono et al., Mol Cell Proteomics (2006)
  • J.C. Silva et al., Mol Cell Proteomics (2006)
There are more references available in the full text version of this article.

1 These authors contributed equally to this work.
