Training, selection, and robust calibration of retention time models for targeted proteomics

J Proteome Res. 2010 Oct 1;9(10):5209-16. doi: 10.1021/pr1005058.

Abstract

Accurate predictions of peptide retention times (RT) in liquid chromatography have many applications in mass spectrometry-based proteomics. Most notably, such predictions are used to weed out incorrect peptide-spectrum matches and to design targeted proteomics experiments. In this study, we describe an RT predictor, ELUDE, which can be employed in both applications. ELUDE's predictions are based on 60 features derived from the peptide's amino acid composition and optimally combined using kernel regression. When sufficient data is available, ELUDE derives a retention time index for the condition at hand, making it fully portable to new chromatographic conditions. When little training data is available, as is often the case in targeted proteomics experiments, ELUDE selects and calibrates a model from a library of pretrained predictors. Both model selection and calibration are carried out via robust statistical methods, so ELUDE can handle situations where the calibration data contains erroneous data points. We benchmarked our method against two state-of-the-art predictors and showed that ELUDE outperforms these methods, tracking up to 34% more peptides in a theoretical SRM method creation experiment. ELUDE is freely available under the Apache License from http://per-colator.com.
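
The robust calibration step can be illustrated with a minimal sketch. The abstract does not say which robust estimator ELUDE uses, so the code below swaps in the Theil-Sen estimator (median of pairwise slopes) as one common robust choice. It fits a linear map from a pretrained model's predicted retention index to observed retention times. The function names and the example numbers are hypothetical and are not taken from ELUDE.

```python
from itertools import combinations
from statistics import median


def theil_sen_calibration(predicted, observed):
    """Fit observed ~ slope * predicted + intercept with the Theil-Sen estimator.

    Assumption: a linear calibration fitted by Theil-Sen stands in for the
    unspecified robust method mentioned in the abstract.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    points = list(zip(predicted, observed))
    # Slope of every pair of calibration points with distinct x values.
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct predictions")
    slope = median(slopes)
    # The median residual gives an intercept that also resists outliers.
    intercept = median(y - slope * x for x, y in points)
    return slope, intercept


def apply_calibration(predicted, slope, intercept):
    """Map predicted index values to retention times in the new condition."""
    return [slope * x + intercept for x in predicted]


# Hypothetical calibration set: observed = 2 * predicted + 5, except the last
# point, which mimics an erroneous identification.
predicted_index = [10, 20, 30, 40, 50, 60]
observed_rt = [25, 45, 65, 85, 105, 300]

slope, intercept = theil_sen_calibration(predicted_index, observed_rt)
calibrated = apply_calibration([35], slope, intercept)
```

Because both the slope and the intercept are medians, the single erroneous point in this example does not shift the fitted line, whereas an ordinary least-squares fit would be pulled toward it.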

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Caenorhabditis elegans / metabolism
  • Caenorhabditis elegans Proteins / analysis*
  • Calibration
  • Chromatography, Liquid / methods*
  • Internet
  • Mass Spectrometry / methods*
  • Models, Theoretical
  • Proteomics / methods*
  • Reproducibility of Results
  • Software
  • Time Factors

Substances

  • Caenorhabditis elegans Proteins