MT-Toolbox: improved amplicon sequencing using molecule tags

BMC Bioinformatics. 2014 Aug 22;15(1):284. doi: 10.1186/1471-2105-15-284.

Abstract

Background: Short oligonucleotides can be used as markers to tag and track DNA sequences. For example, barcoding techniques (i.e. Multiplex Identifiers or Indexing) use short oligonucleotides to distinguish between reads from different DNA samples pooled for high-throughput sequencing. A similar technique called molecule tagging uses the same principles but is applied to individual DNA template molecules. Each template molecule is tagged with a unique oligonucleotide prior to polymerase chain reaction. The resulting amplicon sequences can be traced back to their original templates by their oligonucleotide tag. Consensus building from sequences sharing the same tag enables inference of original template molecules thereby reducing effects of sequencing error and polymerase chain reaction bias. Several independent groups have developed similar protocols for molecule tagging; however, user-friendly software for build consensus sequences from molecule tagged reads is not readily available or is highly specific for a particular protocol.

Results: MT-Toolbox recognizes oligonucleotide tags in amplicons and infers the correct template sequence. On a set of molecule tagged test reads, MT-Toolbox generates sequences having on average 0.00047 errors per base. MT-Toolbox includes a graphical user interface, command line interface, and options for speed and accuracy maximization. It can be run in serial on a standard personal computer or in parallel on a Load Sharing Facility based cluster system. An optional plugin provides features for common 16S metagenome profiling analysis such as chimera filtering, building operational taxonomic units, contaminant removal, and taxonomy assignments.

Conclusions: MT-Toolbox provides an accessible, user-friendly environment for analysis of molecule tagged reads thereby reducing technical errors and polymerase chain reaction bias. These improvements reduce noise and allow for greater precision in single amplicon sequencing experiments.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computer Graphics
  • DNA / genetics*
  • Metagenomics
  • Oligonucleotides / genetics
  • Polymerase Chain Reaction
  • Sequence Analysis, DNA / methods*
  • Software*
  • Time Factors
  • User-Computer Interface

Substances

  • Oligonucleotides
  • DNA