Ergatis: a web interface and scalable software system for bioinformatics workflows

Bioinformatics. 2010 Jun 15;26(12):1488-92. doi: 10.1093/bioinformatics/btq167. Epub 2010 Apr 22.

Abstract

Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and to make large-scale bioinformatics analysis reproducible and accessible to a wide class of target users.

Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user-friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects.

Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Databases, Protein
  • Internet*
  • Software*
  • Workflow