Genome assembly forensics: finding the elusive mis-assembly

Genome Biol. 2008;9(3):R55. doi: 10.1186/gb-2008-9-3-r55. Epub 2008 Mar 14.

Abstract

We present the first collection of tools aimed at automated genome assembly validation. This work formalizes several mechanisms for detecting mis-assemblies and describes their implementation in our automated validation pipeline, called amosvalidate. We demonstrate the application of our pipeline to both bacterial and eukaryotic genome assemblies, and highlight several assembly errors in both draft and finished genomes. The software described is compatible with common assembly formats and is released as open source at http://amos.sourceforge.net.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Artifacts*
  • Bacillus anthracis / genetics
  • Drosophila / genetics
  • Genome*
  • Quality Control
  • Sensitivity and Specificity
  • Sequence Analysis, DNA / methods*
  • Software*
  • Tandem Repeat Sequences