MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes

  1. Brandi L. Cantarel (1),
  2. Ian Korf (2),
  3. Sofia M.C. Robb (3),
  4. Genis Parra (2),
  5. Eric Ross (4),
  6. Barry Moore (1),
  7. Carson Holt (1),
  8. Alejandro Sánchez Alvarado (3, 4), and
  9. Mark Yandell (1, 5)

  1. Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA;
  2. Department of Molecular and Cellular Biology and Genome Center, UC Davis, Davis, California 95616, USA;
  3. Department of Neurobiology & Anatomy, University of Utah School of Medicine, Salt Lake City, Utah 84132, USA;
  4. Howard Hughes Medical Institute, University of Utah School of Medicine, Salt Lake City, Utah 84132, USA

Abstract

We have developed a portable and easily configurable genome annotation pipeline called MAKER. Its purpose is to allow investigators to independently annotate eukaryotic genomes and create genome databases. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab initio gene predictions, and automatically synthesizes these data into gene annotations having evidence-based quality indices. MAKER is also easily trainable: Outputs of preliminary runs are used to automatically retrain its gene-prediction algorithm, producing higher-quality gene-models on subsequent runs. MAKER’s inputs are minimal, and its outputs can be used to create a GMOD database. Its outputs can also be viewed in the Apollo Genome browser; this feature of MAKER provides an easy means to annotate, view, and edit individual contigs and BACs without the overhead of a database. As proof of principle, we have used MAKER to annotate the genome of the planarian Schmidtea mediterranea and to create a new genome database, SmedGD. We have also compared MAKER’s performance to other published annotation pipelines. Our results demonstrate that MAKER provides a simple and effective means to convert a genome sequence into a community-accessible genome database. MAKER should prove especially useful for emerging model organism genome projects for which extensive bioinformatics resources may not be readily available.

Footnotes

  • 5 Corresponding author. E-mail myandell{at}genetics.utah.edu; fax (801) 585-3214.

  • [Supplemental material is available online at www.genome.org.]

  • Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.6743907

    • Received May 25, 2007.
    • Accepted September 18, 2007.