bioRxiv
New Results

MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis

Gherman V Uritskiy, Jocelyne DiRuggiero, James Taylor
doi: https://doi.org/10.1101/277442
Gherman V Uritskiy
1 Department of Biology, Johns Hopkins University, Baltimore MD. ORCID: orcid.org/0000-0002-3332-0815
Jocelyne DiRuggiero
2 Department of Biology, Johns Hopkins University, Baltimore MD. ORCID: orcid.org/0000-0001-6721-8061
  • For correspondence: jdiruggiero@jhu.edu james@taylorlab.org
James Taylor
3 Departments of Biology and Computer Science, Johns Hopkins University, Baltimore MD. ORCID: orcid.org/0000-0001-5079-840X
  • For correspondence: jdiruggiero@jhu.edu james@taylorlab.org

Abstract

Background: Whole-metagenome shotgun sequencing of microbiomes enables the analysis of uncultivated microbial populations that may have important roles in their environments. Extracting individual draft genomes (bins) facilitates metagenomic analysis at the single-genome level. Software and pipelines for such analysis have become diverse and sophisticated, which places a significant burden on biologists who need to access and use them. Furthermore, while bin extraction algorithms are rapidly improving, tools for evaluating and visualizing the resulting bins are still lacking.

Results: To address these challenges, we present metaWRAP, a modular software pipeline for shotgun metagenomic data analysis. MetaWRAP deploys state-of-the-art software to handle metagenomic data processing, starting from raw sequencing reads and ending with metagenomic bins and their analysis. MetaWRAP is flexible enough to give investigators control over the analysis, while still being easy to install and easy to use. It includes hybrid algorithms that leverage the strengths of a variety of software to extract and refine high-quality bins from metagenomic data through bin consolidation and reassembly. MetaWRAP’s hybrid bin extraction algorithm outperforms individual binning approaches and other bin consolidation programs on both synthetic and real datasets. Finally, metaWRAP comes with numerous modules for the analysis of metagenomic bins, including taxonomy assignment, abundance estimation, functional annotation, and visualization.

Conclusions: MetaWRAP is an easy-to-use modular pipeline that automates the core tasks in metagenomic analysis, while contributing significant improvements to the extraction and interpretation of high-quality metagenomic bins. The bin refinement and reassembly modules of metaWRAP consistently outperform other binning approaches. Each module of metaWRAP is also a standalone component, making it a flexible and versatile tool for tackling metagenomic shotgun sequencing data. MetaWRAP is open-source software available at https://github.com/bxlab/metaWRAP.
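One way to make "high-quality bin" concrete is to threshold on estimated completion and contamination, the two quantities behind the -c (minimum completion) and -x (maximum contamination) parameters listed in the abbreviations. The sketch below is only an illustration of that idea and is not metaWRAP's code: the `BinQuality` record, the `filter_bins` function, the inclusive comparisons, the default thresholds, and the example values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinQuality:
    """Hypothetical record holding quality estimates for one bin."""
    name: str
    completion: float     # estimated completion, in percent (0-100)
    contamination: float  # estimated contamination, in percent


def filter_bins(bins, min_completion=70.0, max_contamination=10.0):
    """Keep bins meeting both thresholds.

    min_completion plays the role of -c and max_contamination the role of -x.
    The default values and the inclusive comparisons are illustrative assumptions.
    """
    return [
        b for b in bins
        if b.completion >= min_completion and b.contamination <= max_contamination
    ]


# Made-up example values, not results from the paper.
example_bins = [
    BinQuality("bin.1", completion=95.2, contamination=1.3),
    BinQuality("bin.2", completion=62.0, contamination=0.5),   # too incomplete
    BinQuality("bin.3", completion=88.4, contamination=14.9),  # too contaminated
]

kept = filter_bins(example_bins)
print([b.name for b in kept])  # prints ['bin.1']
```

Loosening both thresholds, for example `filter_bins(example_bins, min_completion=50, max_contamination=20)`, keeps all three example bins, which shows how the two parameters trade bin count against bin quality.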

  • Abbreviations

    WMG: whole metagenome
    comp: completion
    cont: contamination
    -c: minimum completion parameter
    -x: maximum contamination parameter
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
    Posted March 06, 2018.
    MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis
    Gherman V Uritskiy, Jocelyne DiRuggiero, James Taylor
    bioRxiv 277442; doi: https://doi.org/10.1101/277442