bioRxiv
New Results

MetaMaps – Strain-level metagenomic assignment and compositional estimation for long reads

Alexander Dilthey, Chirag Jain, Sergey Koren, Adam M. Phillippy
doi: https://doi.org/10.1101/372474
Alexander Dilthey
Institute of Medical Microbiology, University Hospital of Dusseldorf, Dusseldorf, North Rhine-Westphalia, Germany; Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD 20892, USA
  • For correspondence: dilthey@med.uni-duesseldorf.de
Chirag Jain
Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD 20892, USA; Georgia Institute of Technology, Atlanta, GA 30332, USA
Sergey Koren
Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD 20892, USA
Adam M. Phillippy
Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD 20892, USA

Abstract

Metagenomic sequence classification should be fast, accurate and information-rich. Emerging long-read sequencing technologies promise to improve the balance between these factors, but most existing methods were designed for short reads. MetaMaps is a new method, specifically developed for long reads, that combines the accuracy of slower alignment-based methods with the scalability of faster k-mer-based methods. Using an approximate mapping algorithm, it is capable of mapping a long-read metagenome to a comprehensive RefSeq database with >12,000 genomes in <30 GB of RAM on a laptop computer. Integrating these mappings with a probabilistic scoring scheme and EM-based estimation of sample composition, MetaMaps achieves >95% accuracy for species-level read assignment and r² > 0.98 for the estimation of sample composition on both simulated and real data. Uniquely, MetaMaps outputs mapping locations and qualities for all classified reads, enabling functional studies (e.g. gene presence/absence) and the detection of novel species not present in the current database.

Availability and Implementation: MetaMaps is implemented in C++/Perl and freely available from https://github.com/DiltheyLab/MetaMaps (GPL v3).
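
The abstract mentions EM-based estimation of sample composition from read mappings. As a rough illustration of that general idea only, the following Python sketch runs a textbook expectation-maximization loop over per-read likelihoods. It is not the MetaMaps implementation: the function name `em_composition`, the input layout (one dictionary of genome identifier to likelihood per read), and the convergence parameters are all assumptions made for this example.

```python
def em_composition(read_likelihoods, n_iter=100, tol=1e-9):
    """Estimate genome abundances from per-read mapping likelihoods.

    Illustrative sketch only (hypothetical names, not the MetaMaps code).
    read_likelihoods: list of dicts, one per read, mapping a genome id
    to the likelihood of that read given that genome.
    Returns a dict mapping genome id to its estimated relative abundance.
    """
    genomes = sorted({g for read in read_likelihoods for g in read})
    if not genomes:
        return {}

    # Start from a uniform composition.
    abundance = {g: 1.0 / len(genomes) for g in genomes}

    for _ in range(n_iter):
        # E-step: split each read across genomes in proportion to
        # (current abundance) x (read likelihood).
        counts = {g: 0.0 for g in genomes}
        for read in read_likelihoods:
            weights = {g: abundance[g] * lik for g, lik in read.items()}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read carries no information; skip it
            for g, w in weights.items():
                counts[g] += w / total

        n_assigned = sum(counts.values())
        if n_assigned == 0.0:
            break  # nothing could be assigned; keep the current estimate

        # M-step: new abundances are the normalized expected read counts.
        updated = {g: c / n_assigned for g, c in counts.items()}
        delta = max(abs(updated[g] - abundance[g]) for g in genomes)
        abundance = updated
        if delta < tol:
            break

    return abundance
```

For example, with two reads that map only to a hypothetical genome `"A"`, one that maps only to `"B"`, and one that maps equally well to both, calling `em_composition([{"A": 1.0}, {"A": 1.0}, {"B": 1.0}, {"A": 0.5, "B": 0.5}])` converges to an abundance of about 2/3 for `"A"` and 1/3 for `"B"`: the ambiguous read is shared out according to the composition implied by the unambiguous ones.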

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.
Posted July 20, 2018.

Subject Area

  • Bioinformatics