BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences

J Mol Biol. 2016 Feb 22;428(4):726-731. doi: 10.1016/j.jmb.2015.11.006. Epub 2015 Nov 14.

Abstract

BlastKOALA and GhostKOALA are automatic annotation servers for genome and metagenome sequences, which perform KO (KEGG Orthology) assignments to characterize individual gene functions and reconstruct KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem. Both servers are made freely available at the KEGG Web site (http://www.kegg.jp/blastkoala/). In BlastKOALA, the KO assignment is performed by a modified version of the internally used KOALA algorithm after the BLAST search against a non-redundant dataset of pangenome sequences at the species, genus or family level, which is generated from the KEGG GENES database by retaining the KO content of each taxonomic category. In GhostKOALA, which utilizes more rapid GHOSTX for database search and is suitable for metagenome annotation, the pangenome dataset is supplemented with Cd-hit clusters including those for viral genes. The result files may be downloaded and manipulated for further KEGG Mapper analysis, such as comparative pathway analysis using multiple BlastKOALA results.

Keywords: KEGG Orthology; KEGG pathway mapping; genome annotation; metagenome analysis; taxonomic composition.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computational Biology / methods*
  • Genome*
  • Internet
  • Metagenome*
  • Sequence Analysis, DNA / methods*