Comparison of the two major classes of assembly algorithms: overlap-layout-consensus and de-bruijn-graph

Brief Funct Genomics. 2012 Jan;11(1):25-37. doi: 10.1093/bfgp/elr035. Epub 2011 Dec 19.

Abstract

Since the completion of the cucumber and panda genome projects using Illumina sequencing in 2009, the global scientific community has had to pay much more attention to this new cost-effective approach for generating draft sequences of large genomes. To help new users understand the assembly algorithms and choose the optimum software packages for their projects, we make a detailed comparison of the two major classes of assembly algorithms, overlap-layout-consensus and de Bruijn graph, from how they match the Lander-Waterman model to the sequencing depth and read length they require. We also discuss the computational efficiency of each class of algorithm, the influence of repeats and heterozygosity, and points of note in the subsequent scaffold linkage and gap closure steps. We hope this review can help further promote the application of second-generation de novo sequencing, as well as aid the future development of assembly algorithms.
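
As a rough illustration of how the Lander-Waterman model ties sequencing depth and read length to expected assembly contiguity, the following is a minimal sketch using the standard textbook form of the model. It is not taken from the review itself. The function name `lander_waterman` and its parameter names are hypothetical, and the model assumes reads are sampled uniformly at random from a repeat-free genome, which real data only approximate.

```python
import math


def lander_waterman(genome_size, read_length, num_reads, min_overlap):
    """Expected statistics under the classic Lander-Waterman model.

    Illustrative helper (hypothetical name and signature). Assumes reads
    are placed uniformly at random on a genome without repeats.
    """
    # Sequencing depth (coverage): c = N * L / G
    coverage = num_reads * read_length / genome_size
    # theta = T / L, the fraction of a read required to detect an overlap
    theta = min_overlap / read_length
    sigma = 1.0 - theta
    # Expected number of contigs ("islands"): N * exp(-c * sigma)
    expected_contigs = num_reads * math.exp(-coverage * sigma)
    # Expected fraction of the genome covered by at least one read
    fraction_covered = 1.0 - math.exp(-coverage)
    return coverage, expected_contigs, fraction_covered


# Example: 1 Mb genome, 100 bp reads, 100,000 reads, 25 bp minimum overlap
coverage, contigs, covered = lander_waterman(1_000_000, 100, 100_000, 25)
print(f"depth={coverage:.1f}x contigs~{contigs:.1f} covered={covered:.5f}")
```

In this simplified model, raising the minimum overlap relative to the read length (a larger `min_overlap / read_length`) increases the expected number of contigs at a fixed depth, which is one way to see why short reads call for higher depth.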

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Algorithms*
  • Animals
  • Databases, Nucleic Acid
  • Models, Genetic
  • Sequence Analysis, DNA / methods*
  • Statistics as Topic