bioRxiv
New Results

Accessible, curated metagenomic data through ExperimentHub

Edoardo Pasolli, Lucas Schiffer, Audrey Renson, Valerie Obenchain, Paolo Manghi, Duy Tin Truong, Francesco Beghini, Faizan Malik, Marcel Ramos, Jennifer B Dowd, Curtis Huttenhower, Martin Morgan, Nicola Segata, Levi Waldron
doi: https://doi.org/10.1101/103085
Edoardo Pasolli
Centre for Integrative Biology, University of Trento, Trento, Italy;
Lucas Schiffer
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
Audrey Renson
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
Valerie Obenchain
Roswell Park Cancer Institute, University of Buffalo, Buffalo, United States of America;
Paolo Manghi
Centre for Integrative Biology, University of Trento, Trento, Italy;
Duy Tin Truong
Centre for Integrative Biology, University of Trento, Trento, Italy;
Francesco Beghini
Centre for Integrative Biology, University of Trento, Trento, Italy;
Faizan Malik
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
Marcel Ramos
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
Jennifer B Dowd
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
Curtis Huttenhower
Biostatistics Department, Harvard School of Public Health, Boston, United States of America
Martin Morgan
Roswell Park Cancer Institute, University of Buffalo, Buffalo, United States of America;
Nicola Segata
Centre for Integrative Biology, University of Trento, Trento, Italy;
Levi Waldron
Institute for Implementation Science and Population Health, CUNY, New York, United States of America;
  • For correspondence: levi.waldron@sph.cuny.edu

Abstract

We present curatedMetagenomicData, a Bioconductor and command-line resource providing thousands of metagenomic profiles from the Human Microbiome Project and other publicly available datasets, and ExperimentHub, for convenient cloud-based distribution of data to the R desktop. curatedMetagenomicData provides standardized per-participant metadata linked to bacterial, fungal, archaeal, and viral taxonomic abundances, as well as quantitative metabolic functional profiles, generated by the HUMAnN2 and MetaPhlAn2 pipelines. The resulting datasets can be immediately analyzed with a wide range of statistical methods, requiring a minimum of bioinformatic expertise and no preprocessing of data. We demonstrate exploratory data analysis, an investigation of gut "enterotypes", and a comparison of the accuracy of disease classification from different data types. These documented analyses can be reproduced efficiently on a laptop, without the barriers of working with large-scale, raw sequencing data. The development of curatedMetagenomicData will continue with the addition, curation, and analysis of further microbiome datasets.
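The linkage between per-participant metadata and taxonomic abundances described in the abstract can be pictured with a minimal sketch. It assumes two small tables keyed by a shared sample ID; the sample IDs, metadata fields, abundance values, and the helper names `link_samples` and `dominant_taxon` are all hypothetical illustrations and do not reflect the package's actual data or interface.

```python
# Toy illustration only: sample IDs, metadata fields, and abundance values are
# hypothetical and are not taken from curatedMetagenomicData.

# Per-participant metadata, keyed by sample ID.
metadata = {
    "S1": {"body_site": "stool", "disease": "healthy"},
    "S2": {"body_site": "stool", "disease": "T2D"},
}

# Taxonomic relative abundances (percent), keyed by sample ID, then by taxon.
abundances = {
    "S1": {"Bacteroides": 60.0, "Prevotella": 40.0},
    "S2": {"Bacteroides": 25.0, "Prevotella": 75.0},
}


def link_samples(metadata, abundances):
    """Join metadata and abundances on the sample IDs present in both tables."""
    shared = sorted(metadata.keys() & abundances.keys())
    return {s: {"metadata": metadata[s], "abundances": abundances[s]} for s in shared}


def dominant_taxon(profile):
    """Return the taxon with the highest relative abundance in one profile."""
    return max(profile, key=profile.get)


linked = link_samples(metadata, abundances)
for sample_id, record in linked.items():
    print(sample_id, record["metadata"]["disease"], dominant_taxon(record["abundances"]))
```

Keeping both tables keyed by the same sample ID is what allows an abundance profile to be analyzed against participant characteristics without further preprocessing.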

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted January 27, 2017.

Subject Area

  • Bioinformatics