bioRxiv
New Results

StrainSeeker: fast identification of bacterial strains from unassembled sequencing reads using user-provided guide trees

Märt Roosaare, Mihkel Vaher, Lauris Kaplinski, Märt Möls, Reidar Andreson, Maarja Lepamets, Triinu Kõressaar, Paul Naaber, Siiri Kõljalg, Maido Remm
doi: https://doi.org/10.1101/040261
Märt Roosaare
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
  • For correspondence: mart.roosaare@ut.ee mihkel.vaher@ut.ee reidar.andreson@ut.ee lauris.kaplinski@ut.ee triinu.koressaar@ut.ee maarja.lepamets@ut.ee martm@ut.ee Paul.Naaber@synlab.ee siiri.koljalg@ut.ee maido.remm@ut.ee
Mihkel Vaher
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
Lauris Kaplinski
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
Märt Möls
2Estonian Biocentre, Riia 23, EE-51010 Tartu, Estonia
3Institute of Mathematical Statistics, University of Tartu, Liivi 2, EE-50409 Tartu, Estonia
Reidar Andreson
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
Maarja Lepamets
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
Triinu Kõressaar
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
Paul Naaber
4Department of Microbiology, University of Tartu, Ravila 19, EE-50411 Tartu, Estonia
5synlab Eesti, Väike-Paala 1, EE-11415 Tallinn, Estonia
Siiri Kõljalg
4Department of Microbiology, University of Tartu, Ravila 19, EE-50411 Tartu, Estonia
6United Laboratories, Tartu University Clinics, L. Puusepa 1 a, EE-50406 Tartu, Estonia
Maido Remm
1Department of Bioinformatics, IMCB, University of Tartu, Riia 23, EE-51010 Tartu, Estonia
2Estonian Biocentre, Riia 23, EE-51010 Tartu, Estonia

Abstract

Background Fast, accurate and high-throughput detection of bacteria is in great demand. The present work was conducted to investigate the possibility of identifying both known and unknown bacterial strains from unassembled next-generation sequencing reads using custom-made guide trees.

Results A program named StrainSeeker was developed that constructs a list of specific k-mers for each node of any given Newick-format tree and enables identification of bacterial genomes within minutes. StrainSeeker has been tested and shown to identify Escherichia coli strains from mixed samples in less than 5 minutes. StrainSeeker can also identify bacterial strains from highly diverse metagenomic samples. StrainSeeker is available at http://bioinfo.ut.ee/strainseeker.
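
As an illustration of the node-specific k-mer idea described in the Results, the following is a minimal Python sketch, not StrainSeeker's actual implementation. It assumes that a k-mer counts as "specific" to a tree node when it occurs in every genome below that node and in no genome outside it. That definition, the toy sequences, the choice k = 3, and all function names (parse_newick, leaves, node_specific_kmers) are assumptions made for this example. Reverse complements and sequencing errors are ignored, and branch lengths are parsed but discarded.

```python
import re

DELIMS = {"(", ")", ",", ";"}


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parse_newick(text):
    """Parse a Newick string into nested tuples; leaves are name strings.

    Branch lengths and internal-node labels are skipped (toy parser).
    """
    tokens = re.findall(r"[(),;]|[^(),;]+", text.strip())
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while tokens[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            # Skip an optional internal label and/or branch length.
            if pos < len(tokens) and tokens[pos] not in DELIMS:
                pos += 1
            return tuple(children)
        name = tokens[pos].split(":")[0].strip()
        pos += 1
        return name

    return node()


def leaves(node):
    """Return the set of leaf names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def node_specific_kmers(tree, genomes, k):
    """Map each node (as a frozenset of its leaf names) to its specific k-mers.

    Assumed definition: present in all genomes under the node,
    absent from all genomes outside it.
    """
    leaf_kmers = {name: kmers(seq, k) for name, seq in genomes.items()}
    result = {}

    def visit(node):
        inside = leaves(node)
        shared = set.intersection(*(leaf_kmers[n] for n in inside))
        outside = set().union(
            *(leaf_kmers[n] for n in leaf_kmers if n not in inside)
        )
        result[frozenset(inside)] = shared - outside
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return result


# Hypothetical toy data: three short "genomes" and a guide tree over them.
toy_genomes = {"A": "AAACCC", "B": "AAACGG", "C": "TTTGGG"}
toy_tree = parse_newick("((A:0.1,B:0.2):0.3,C:0.4);")

for clade, specific in node_specific_kmers(toy_tree, toy_genomes, 3).items():
    print(sorted(clade), sorted(specific))
```

In this toy setting, k-mers shared by A and B but absent from C would mark the internal (A,B) node, while k-mers unique to a single genome would mark its leaf. A sample whose reads contain only the internal-node k-mers could then be placed at that node rather than at a known strain, which is one plausible way to read the abstract's mention of identifying both known and unknown strains.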

Conclusions Our novel approach can be useful for both clinical diagnostics and research laboratories because novel bacterial strains are constantly emerging and their fast and accurate detection is very important.

  • List of abbreviations

    bp: base pair
    NCBI: National Center for Biotechnology Information
    MLST: multi-locus sequence typing
    SRA: Sequence Read Archive
    WGS: whole-genome sequencing
    UPGMA: unweighted pair group method with arithmetic mean
  • Copyright 
    The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
    Posted February 19, 2016.
    Subject Area

    • Bioinformatics