bioRxiv
New Results

A robust statistical framework for reconstructing genomes from metagenomic data

Dongwan D. Kang, Jeff Froula, Rob Egan, Zhong Wang
doi: https://doi.org/10.1101/011460
1Department of Energy, Joint Genome Institute, Walnut Creek, CA 94598, USA
2Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
Affiliations 1 and 2 apply to all four authors.
For correspondence: zhongwang@lbl.gov (Zhong Wang)

ABSTRACT

We present software that reconstructs genomes from shotgun metagenomic sequences using a reference-independent approach. This approach permits the identification of operational taxonomic units (OTUs) in large, complex communities where many species are unknown. Binning reduces the complexity of a metagenomic dataset, enabling many downstream analyses that were previously unavailable. In this study, we developed MetaBAT, a robust statistical framework that integrates probabilistic distances of genome abundance with sequence composition for automatic binning. Applying MetaBAT to a human gut microbiome dataset identified 173 highly specific genome bins, including many representing previously unidentified species.
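
To make the idea of combining abundance with sequence composition concrete, the following is a minimal illustrative sketch, not MetaBAT's actual implementation. Everything in it is an assumption made for illustration: the use of 4-mer frequencies as the composition signal, the Euclidean and relative-difference distances, the equal weighting, and all function names. The abstract does not specify any of these, and MetaBAT's probabilistic distances are not reproduced here.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k=4):
    """Return normalized k-mer frequencies of seq over the ACGT alphabet.

    k=4 is an assumption for illustration only.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


def composition_distance(seq_a, seq_b, k=4):
    """Euclidean distance between two k-mer profiles (hypothetical choice)."""
    pa, pb = kmer_profile(seq_a, k), kmer_profile(seq_b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))


def abundance_distance(cov_a, cov_b):
    """Relative difference between two non-negative mean coverages, in [0, 1]."""
    if cov_a == 0 and cov_b == 0:
        return 0.0
    return abs(cov_a - cov_b) / max(cov_a, cov_b)


def combined_distance(seq_a, cov_a, seq_b, cov_b, weight=0.5):
    """Weighted mix of composition and abundance distances.

    The linear weighting is a placeholder, not the published method.
    """
    return (weight * composition_distance(seq_a, seq_b)
            + (1.0 - weight) * abundance_distance(cov_a, cov_b))


if __name__ == "__main__":
    # Two toy contigs with made-up sequences and coverages.
    d = combined_distance("ACGTACGTACGT", 12.0, "ACGTTTGACCGA", 30.0)
    print(f"combined distance: {d:.3f}")
```

In a sketch like this, contig pairs with a small combined distance would be candidates for the same bin; a real binner would additionally need a clustering step and a principled way to weight the two signals.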

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted November 15, 2014.
Subject Area

  • Bioinformatics