Review of general algorithmic features for genome assemblers for next generation sequencers

Genomics Proteomics Bioinformatics. 2012 Apr;10(2):58-73. doi: 10.1016/j.gpb.2012.05.006. Epub 2012 Jun 9.

Abstract

In the realm of bioinformatics and computational biology, the most rudimentary data upon which all the analysis is built is the sequence data of genes, proteins and RNA. The sequence data of the entire genome is the solution to the genome assembly problem. The scope of this contribution is to provide an overview on the art of problem-solving applied within the domain of genome assembly in the next-generation sequencing (NGS) platforms. This article discusses the major genome assemblers that were proposed in the literature during the past decade by outlining their basic working principles. It is intended to act as a qualitative, not a quantitative, tutorial to all working on genome assemblers pertaining to the next generation of sequencers. We discuss the theoretical aspects of various genome assemblers, identifying their working schemes. We also discuss briefly the direction in which the area is headed towards along with discussing core issues on software simplicity.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Review

MeSH terms

  • Algorithms*
  • Computational Biology
  • Contig Mapping
  • Genome*
  • Sequence Analysis, DNA / methods*
  • Software