New Results

Ultra-accurate Microbial Amplicon Sequencing Directly from Complex Samples with Synthetic Long Reads

Benjamin J Callahan, Dmitry Grinevich, Siddhartha Thakur, Michael A Balamotis, Tuval Ben Yehezkel
doi: https://doi.org/10.1101/2020.07.07.192286
Benjamin J Callahan
1Department of Population Health and Pathobiology, North Carolina State University, College of Veterinary Medicine, Raleigh, NC USA
2Bioinformatics Research Center, North Carolina State University, Raleigh, NC USA
  • For correspondence: benjamin.j.callahan@gmail.com
Dmitry Grinevich
1Department of Population Health and Pathobiology, North Carolina State University, College of Veterinary Medicine, Raleigh, NC USA
Siddhartha Thakur
1Department of Population Health and Pathobiology, North Carolina State University, College of Veterinary Medicine, Raleigh, NC USA
Michael A Balamotis
3Loop Genomics, San Jose, CA USA
Tuval Ben Yehezkel
3Loop Genomics, San Jose, CA USA

Abstract

Of the many known pathogenic bacterial species, only a fraction are readily identifiable directly from a complex microbial community using standard next-generation DNA sequencing technology. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge. Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads. LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq applied to full-length 16S rRNA genes perfectly recovered the full diversity of full-length exact sequence variants in a known microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, a level of accuracy exceeding that reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq sequencing maintains that accuracy for reads up to 6 kilobases in length. Analysis of rinsate from retail meat samples demonstrated that LoopSeq full-length 16S rRNA synthetic long reads could accurately classify organisms down to the species level, and could differentiate between strains within species identified by the CDC as potential foodborne pathogens. The order-of-magnitude improvement in both length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain-level identification from complex and low-biomass microbiome samples.
The ability to generate accurate and long microbiome sequencing reads using standard short-read sequencers will accelerate the building of quality microbial sequence databases and remove a significant hurdle on the path to precision microbial genomics.
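
To put the reported per-base error rate in context, the following minimal sketch converts 0.005% into an expected number of errors per read, the probability that a read is entirely error-free, and an equivalent Phred quality score. It is illustrative arithmetic only, not the authors' analysis: it assumes errors are independent and uniformly distributed along the read, and it assumes a full-length 16S rRNA gene of roughly 1,500 bases (a typical figure, not stated in the abstract). The function names are hypothetical.

```python
import math

# Per-base error rate reported in the abstract: 0.005% expressed as a fraction.
ERROR_RATE = 0.005 / 100

def expected_errors(read_length, per_base_error_rate):
    """Expected number of erroneous bases in one read."""
    return read_length * per_base_error_rate

def prob_error_free(read_length, per_base_error_rate):
    """Probability that a read has no errors, assuming independent errors."""
    return (1.0 - per_base_error_rate) ** read_length

def phred_quality(per_base_error_rate):
    """Phred-scaled quality score, Q = -10 * log10(p)."""
    return -10.0 * math.log10(per_base_error_rate)

# 1500 bases: assumed full-length 16S rRNA gene length (not from the abstract).
# 6000 bases: the upper read length mentioned in the abstract.
for length in (1500, 6000):
    print(
        f"length={length} "
        f"expected_errors={expected_errors(length, ERROR_RATE):.3f} "
        f"p_error_free={prob_error_free(length, ERROR_RATE):.3f}"
    )

print(f"Phred quality ~ Q{phred_quality(ERROR_RATE):.1f}")
```

Under these assumptions, most full-length 16S reads would contain no errors at all, which is consistent with the abstract's claim that exact sequence variants can be recovered directly from the reads.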

Competing Interest Statement

Michael Balamotis and Tuval Ben Yehezkel are employees of Loop Genomics, the vendor for the synthetic long-read sequencing technology analyzed in this manuscript.

Footnotes

  • https://github.com/benjjneb/LoopManuscript

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 07, 2020.
Subject Area

  • Microbiology