bioRxiv
New Results

One Codex: A Sensitive and Accurate Data Platform for Genomic Microbial Identification

Samuel S Minot, Niklas Krumm, Nicholas B Greenfield
doi: https://doi.org/10.1101/027607
Samuel S Minot
One Codex (Reference Genomics, Inc.)
  • For correspondence: sam@onecodex.com
Niklas Krumm
One Codex (Reference Genomics, Inc.)
Nicholas B Greenfield
One Codex (Reference Genomics, Inc.)

Abstract

High-throughput sequencing (HTS) is increasingly being used for broad applications of microbial characterization, such as microbial ecology, clinical diagnosis, and outbreak epidemiology. However, the analytical task of comparing short sequence reads against the known diversity of microbial life has proved to be computationally challenging. The One Codex data platform was created with the dual goals of analyzing microbial data against the largest possible collection of microbial reference genomes and presenting those results in a format that is consumable by applied end-users. One Codex identifies microbial sequences using a "k-mer based" taxonomic classification algorithm through a web-based data platform, using a reference database that currently includes approximately 40,000 bacterial, viral, fungal, and protozoan genomes. In order to evaluate whether this classification method and associated database provided quantitatively different performance for microbial identification, we created a large and diverse evaluation dataset containing 50 million reads from 10,639 genomes, as well as sequences from six novel species not included in the reference databases of any of the tested classifiers. Quantitative evaluation of several published microbial detection methods shows that One Codex has the highest degree of sensitivity and specificity (AUC = 0.97, compared to 0.82-0.88 for other methods), both when detecting well-characterized species and when detecting newly sequenced, "taxonomically novel" organisms.
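
As a rough illustration of what "k-mer based" classification means in general, the sketch below indexes toy reference sequences by their k-mers and assigns a read to the reference that shares the most unique k-mers with it. This is not the One Codex algorithm, which the abstract does not describe in detail; the function names (`kmers`, `build_index`, `classify`), the taxon labels, the toy sequences, and the choice of k = 8 are all assumptions made for this example.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, genome in references.items():
        for kmer in kmers(genome, k):
            index.setdefault(kmer, set()).add(label)
    return index


def classify(read, index, k):
    """Return the label with the most uniquely matching k-mers, or None."""
    votes = Counter()
    for kmer in kmers(read, k):
        labels = index.get(kmer, ())
        if len(labels) == 1:  # count only k-mers found in exactly one reference
            votes[next(iter(labels))] += 1
    if not votes:
        return None  # no informative k-mers: leave the read unclassified
    return votes.most_common(1)[0][0]


# Hypothetical toy references; real databases hold whole genomes.
references = {
    "taxon_A": "ACGTACGTTAGCCGATAGGCTA",
    "taxon_B": "TTGACCGGTAACCGTTAGGACC",
}
index = build_index(references, k=8)
```

With this toy index, `classify("CGTTAGCCGATAGG", index, 8)` votes only for `taxon_A`, while a read with no indexed k-mers returns `None`. Production classifiers typically resolve k-mers shared between references through a taxonomy rather than discarding them, as this sketch does.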

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted September 28, 2015.

Subject Area

  • Bioinformatics