bioRxiv
New Results

“A novel paradigm for optimal mass feature peak picking in large scale LC-MS datasets using the ‘isopair’: isoLock, autoCredential and anovAlign”

Allen Hubbard, Louis Connelly, Shrikaar Kambhampati, Brad Evans, Ivan Baxter
doi: https://doi.org/10.1101/2021.12.05.471237
Allen Hubbard
1 Donald Danforth Plant Science Center, Saint Louis, MO
Louis Connelly
1 Donald Danforth Plant Science Center, Saint Louis, MO
2 Saint Louis University, Saint Louis, MO
Shrikaar Kambhampati
1 Donald Danforth Plant Science Center, Saint Louis, MO
Brad Evans
1 Donald Danforth Plant Science Center, Saint Louis, MO
Ivan Baxter
1 Donald Danforth Plant Science Center, Saint Louis, MO
For correspondence: ibaxter@danforthcenter.org

Abstract

Untargeted metabolomics enables direct quantification of metabolites without a priori knowledge of their identity. Liquid chromatography-mass spectrometry (LC-MS), a popular method for untargeted metabolomics, identifies metabolites as mass features defined by their combined mass-to-charge ratio (m/z) and retention time. Improvements in the sensitivity of mass spectrometers have increased the complexity of the data produced, leading to computational obstacles. One outstanding challenge is calling metabolite mass feature peaks rapidly and accurately in large LC-MS datasets (dozens to thousands of samples) in the presence of measurement and other noise. While existing algorithms are useful, they have limitations that become pronounced at scale and lead to false positive metabolite predictions as well as signal dropouts. To overcome some of these shortcomings, biochemists have developed hybrid computational and carbon labeling techniques, such as credentialing. Credentialing can validate metabolite signals, but it is laborious and its applicability is limited. We have developed a suite of three computational tools to overcome the challenges of unreliable algorithms and inefficient validation protocols: isoLock, autoCredential and anovAlign. isoLock uses isopairs, or metabolite-isotopologue pairs, to calculate and correct for mass drift noise across LC-MS runs. autoCredential leverages statistical features of LC-MS data to amplify naturally present 13C isotopologues and validate metabolites through isopairs, which obviates the need to introduce carbon labeling artificially. anovAlign, an ANOVA-derived algorithm, aligns retention time windows across samples so that the retention time window of each mass feature is delineated accurately. Using a large published clinical dataset as well as a plant dataset with biological replicates across time, genotype and treatment, we demonstrate that this suite of tools is more sensitive and reproducible than both the open source metabolomics pipeline XCMS and the commercial software Progenesis QI. This software suite opens a new era of enhanced accuracy and increased throughput for untargeted metabolomics.
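
To make the isopair idea concrete, the following is a minimal Python sketch of how candidate metabolite-isotopologue pairs could be located in a list of m/z values. It is an illustration only, not the authors' implementation: the function name `find_isopairs`, the `ppm_tol` parameter and the example masses are hypothetical. The only fixed quantity is the 13C-12C mass difference of about 1.0033548 Da, which becomes an m/z spacing of that value divided by the charge.

```python
# Mass difference between 13C and 12C in daltons.
C13_C12_DELTA = 1.0033548


def find_isopairs(mzs, charge=1, ppm_tol=5.0):
    """Return index pairs (i, j) where mzs[j] sits one 13C spacing above mzs[i].

    Hypothetical helper for illustration; the matching rule (a single 13C
    substitution within a ppm tolerance) is an assumption, not the
    published algorithm.
    """
    spacing = C13_C12_DELTA / charge
    # Visit peaks in ascending m/z order so the inner scan can stop early.
    order = sorted(range(len(mzs)), key=lambda i: mzs[i])
    pairs = []
    for pos, i in enumerate(order):
        target = mzs[i] + spacing
        tol = target * ppm_tol * 1e-6  # absolute tolerance at the target m/z
        for j in order[pos + 1:]:
            if mzs[j] > target + tol:
                break  # every remaining peak is heavier still
            if abs(mzs[j] - target) <= tol:
                pairs.append((i, j))
    return pairs


# Made-up m/z values: only the first two are one 13C spacing apart.
print(find_isopairs([180.0634, 181.06675, 200.0, 181.5]))  # -> [(0, 1)]
```

In a real pipeline the two members of a pair would presumably also need to co-elute and show a plausible intensity ratio; those checks are left out here to keep the sketch short.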

Competing Interest Statement

AH, SK and BH have filed an invention disclosure related to this work.

Footnotes

  • Made sure to add the full citation of Lloyd-Price in the references.

  • https://www.metabolomicsworkbench.org/data/DRCCMetadata.php?Mode=Study&StudyID=ST000923

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 08, 2021.

Subject Area

  • Bioinformatics