bioRxiv
New Results

BlobToolKit – Interactive quality assessment of genome assemblies

Richard Challis, Edward Richards, Jeena Rajan, Guy Cochrane, Mark Blaxter
doi: https://doi.org/10.1101/844852
Richard Challis
1Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
2Wellcome Sanger Institute, Cambridge CB10 1SA, UK
  • For correspondence: rc28@sanger.ac.uk
Edward Richards
3European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
Jeena Rajan
3European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
Guy Cochrane
3European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, UK
Mark Blaxter
1Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
2Wellcome Sanger Institute, Cambridge CB10 1SA, UK

Abstract

Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether contaminant DNA is introduced during sample processing or co-extracted alongside the target DNA, the final assembled genome may be a mixture of data from several species if insufficient care is taken during assembly. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in downstream analyses by users unaware of underlying problems.

We present BlobToolKit, a software suite to aid researchers in identifying and isolating non-target data in draft and publicly available genome assemblies. BlobToolKit can be used to process assembly, read and analysis files for fully reproducible interactive exploration in the browser-based Viewer. BlobToolKit can be used during assembly to filter non-target DNA, helping researchers produce assemblies with high biological credibility.

We have been running an automated BlobToolKit pipeline on eukaryotic assemblies publicly available in the International Nucleotide Sequence Database Collaboration and are making the results available through a public instance of the Viewer at https://blobtoolkit.genomehubs.org/view. We aim to complete analysis of all publicly available genomes and then keep pace with the flow of new genomes. We have worked to embed these views into the presentation of genome assemblies at the European Nucleotide Archive, providing an indication of assembly quality alongside the public record with links out to allow full exploration in the Viewer.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted November 15, 2019.
Subject Area

  • Bioinformatics