TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences

BMC Bioinformatics. 2004 Oct 26;5:163. doi: 10.1186/1471-2105-5-163.

Abstract

Background: In the emerging field of environmental genomics, direct cloning and sequencing of genomic fragments from complex microbial communities has proven to be a valuable source of new enzymes and has expanded our knowledge of basic biological processes. The central problem of this so-called metagenome approach is that the cloned fragments often lack suitable phylogenetic marker genes, which makes it difficult or impossible to identify clones that are likely to originate from the same genome. In such cases, the analysis of intrinsic DNA signatures such as tetranucleotide frequencies can provide valuable hints about fragment affiliation. With this application in mind, the TETRA web-service and the TETRA stand-alone program have been developed, both of which automate the task of comparative tetranucleotide frequency analysis.

Availability: http://www.megx.net/tetra.

Results: TETRA provides a statistical analysis of tetranucleotide usage patterns in genomic fragments, either via a web-service or a stand-alone program. In terms of discriminatory power, such an analysis outperforms the assignment of genomic fragments based on (G+C)-content, a widely used sequence-based measure for assessing fragment relatedness. While the web-service is restricted to calculating correlation coefficients between the tetranucleotide usage patterns of submitted DNA sequences, the stand-alone program generates a much more detailed output, comprising all raw data and graphical plots. The stand-alone program is controlled via a graphical user interface and can batch-process a multitude of sequences. Furthermore, it comes with pre-computed tetranucleotide usage patterns for 166 prokaryote chromosomes, providing a useful reference dataset and a source for data-mining.
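
As an illustration of the kind of comparison described above, the following Python sketch counts tetranucleotides in two sequences and correlates the resulting usage patterns. It is not TETRA's code: the abstract does not specify the statistic, so the use of raw relative frequencies, the counting of both strands, the choice of the Pearson coefficient, and all function names (`tetra_frequencies`, `pearson`, `gc_content`) are assumptions made for this example.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides in a fixed order (hypothetical helper constant).
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def reverse_complement(seq):
    """Return the reverse complement; non-ACGT symbols are left unchanged."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tetra_frequencies(seq):
    """Relative tetranucleotide frequencies, counted on both strands.

    Windows containing symbols other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TETRAMERS, 0)
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - 3):
            word = strand[i:i + 4]
            if word in counts:
                counts[word] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in TETRAMERS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant vector has no defined correlation; return 0.0 in that case.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases, for comparison."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0
```

Under these assumptions, two fragments would be compared with `pearson(tetra_frequencies(a), tetra_frequencies(b))`, and `gc_content` gives the single-number measure that the 256-dimensional pattern is contrasted with. Fragments of realistic length (several kilobases) would be needed for the counts to be meaningful.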

Conclusions: Up to now, the analysis of skewed oligonucleotide distributions within DNA sequences has not been a commonly used tool within metagenomics. With the TETRA web-service and stand-alone program, the method is now accessible in an easy-to-use form for a broad audience. This will hopefully facilitate the interrelation of genomic fragments from metagenome libraries, ultimately leading to new insights into the genetic potential of as yet uncultured microorganisms.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition / genetics
  • Bradyrhizobium / genetics
  • Chromosomes, Bacterial / genetics
  • DNA, Bacterial / genetics
  • Escherichia / genetics
  • Genome, Bacterial
  • Internet*
  • Microsatellite Repeats / genetics*
  • Prochlorococcus / genetics
  • Sequence Analysis, DNA / methods*
  • Shigella / genetics
  • Sinorhizobium / genetics
  • Software*
  • Yersinia / genetics

Substances

  • DNA, Bacterial