New Results

Bolt: A new age peptide search engine for comprehensive MS/MS sequencing through vast protein databases in minutes

Amol Prakash, Shadab Ahmad, Swetaketu Majumder, Conor Jenkins, Ben Orsburn
doi: https://doi.org/10.1101/551622
Amol Prakash
1 Optys Tech Corporation, Shrewsbury MA 01545
Shadab Ahmad
1 Optys Tech Corporation, Shrewsbury MA 01545
Swetaketu Majumder
1 Optys Tech Corporation, Shrewsbury MA 01545
Conor Jenkins
2 Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., Frederick, Maryland
3 Hood College Department of Biology, Frederick, MD 21702
Ben Orsburn
2 Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research Inc., Frederick, Maryland

Abstract

The standard platform for proteomics experiments today is mass spectrometry, particularly for samples derived from complex matrices. Recent increases in mass spectrometry sequencing speed, sensitivity and resolution now permit comprehensive coverage of even the most precious and limited samples, particularly when coupled with improvements in protein extraction techniques and chromatographic separation.

However, the results obtained from laborious sample extraction and expensive instrumentation are often hindered by suboptimal data processing pipelines. One critical data processing step is peptide sequencing, which is most commonly done through database search engines. In almost all MS/MS search engines, users must limit their search space because of time constraints and q-value considerations. In nearly all experiments, the search is limited to a canonical database that typically does not reflect the individual genetic variations of the organism being studied. Searching for post-translational modifications can increase the search space exponentially, so the modifications to include must be selected carefully. In addition, engines nearly always assume the presence of only fully tryptic peptides. Despite these stringent parameters, proteomic data searches may take hours or even days to complete, and relaxing even one of these criteria toward more realistic biological settings leads to detrimental increases in search time on expensive, custom data processing towers. Even on high-performance servers, these search engines are computationally expensive, and most users decide to dial back their search parameters. We present Bolt, a new search engine that can search more than nine hundred thousand protein sequences (canonical, isoform, mutations, and contaminants) with 31 post-translational modifications and N-terminal and C-terminal partial tryptic search in a matter of minutes on a standard configuration laptop. Along with increases in speed, Bolt provides the additional benefit of more high-confidence identifications, as demonstrated by manual validation of unique peptides identified by Bolt that were missed in parallel searches with standard engines. Where the engines disagree, 67% of peptides identified by Bolt could be manually validated by strong fragmentation patterns, compared with 14% of peptides uniquely identified by SEQUEST.
Bolt represents, to the best of our knowledge, the first fully scalable, cloud-based quantitative proteomic solution that can be operated within a user-friendly graphical interface. Data are available via ProteomeXchange with identifier PXD012700.
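
To give a sense of why partial tryptic search and variable modifications enlarge the search space, the following is a minimal Python sketch. It is not Bolt's algorithm, which the abstract does not describe. It assumes the common trypsin rule (cleave after K or R unless the next residue is P), and all function names, the toy protein, and the example modification table are hypothetical choices made for illustration.

```python
def cleavage_pieces(protein):
    """Split a protein after K or R, except when the next residue is P."""
    pieces, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])  # C-terminal remainder
    return pieces


def tryptic_peptides(protein, missed_cleavages=0, min_len=6, max_len=30):
    """Fully tryptic peptides, allowing up to `missed_cleavages` skipped sites."""
    pieces = cleavage_pieces(protein)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            pep = "".join(pieces[i:j + 1])
            if min_len <= len(pep) <= max_len:
                peptides.add(pep)
    return peptides


def semi_tryptic_peptides(protein, missed_cleavages=0, min_len=6, max_len=30):
    """Fully tryptic peptides plus every truncation that keeps one tryptic end."""
    full = tryptic_peptides(protein, missed_cleavages, min_len, max_len)
    semi = set(full)
    for pep in full:
        for k in range(min_len, len(pep)):
            semi.add(pep[:k])   # original N-terminus kept, C-terminus ragged
            semi.add(pep[-k:])  # original C-terminus kept, N-terminus ragged
    return semi


def count_modified_forms(peptide, variable_mods):
    """Count peptide forms when each listed residue may carry any one of its mods."""
    count = 1
    for aa in peptide:
        # Each residue is either unmodified or carries one of its variable mods.
        count *= 1 + len(variable_mods.get(aa, ()))
    return count


if __name__ == "__main__":
    toy_protein = "AAAKGGGRPCCCKDDD"  # hypothetical sequence, not real data
    full = tryptic_peptides(toy_protein, min_len=3)
    semi = semi_tryptic_peptides(toy_protein, min_len=3)
    print("fully tryptic:", len(full), "semi-tryptic:", len(semi))

    # Hypothetical variable modification table: residue -> possible mods.
    mods = {"M": ["Oxidation"], "S": ["Phospho"], "T": ["Phospho"]}
    print("forms of MSTK:", count_modified_forms("MSTK", mods))
```

In this sketch the number of modified forms is a product over residues, so every additional modifiable site multiplies the candidate count. That multiplication is the exponential growth the abstract refers to, and allowing a ragged terminus adds further candidates for every fully tryptic peptide.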

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted February 18, 2019.

Subject Area

  • Bioinformatics