GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments

Bioinformatics. 2011 Jan 15;27(2):270-1. doi: 10.1093/bioinformatics/btq636. Epub 2010 Nov 15.

Abstract

Summary: Accurate prediction of transcription factor binding motifs that are enriched in a collection of sequences remains a computational challenge. Here we report on GimmeMotifs, a pipeline that incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing (ChIP-seq) data. Similar, redundant motifs are compared using the weighted information content (WIC) similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several evaluation metrics for comparing and assessing the results. Benchmarks show that the method performs well on human and mouse ChIP-seq datasets. GimmeMotifs consists of a suite of command-line scripts that can easily be integrated into a ChIP-seq analysis pipeline.
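The idea of comparing redundant motifs and clustering them iteratively can be illustrated with a minimal Python sketch. It is not the GimmeMotifs implementation and does not reproduce the published WIC score: the functions column_ic, similarity, average and cluster, the 0.9 threshold, and the restriction to equal-length, pre-aligned motifs are all assumptions made for illustration. Only the per-column information content (2 + sum of p * log2 p for DNA) is a standard definition.

```python
import math

def column_ic(col):
    """Information content in bits of one DNA motif column (standard definition)."""
    return 2.0 + sum(p * math.log2(p) for p in col if p > 0)

def similarity(a, b):
    """Hypothetical information-content-weighted similarity (NOT the published WIC).

    Assumes both motifs have the same length and are already aligned.
    Each motif is a list of columns; each column is [pA, pC, pG, pT].
    """
    score = 0.0
    total_weight = 0.0
    for col_a, col_b in zip(a, b):
        # Informative columns count more than near-uniform ones.
        weight = (column_ic(col_a) + column_ic(col_b)) / 2
        # 1 for identical columns, 0 for columns with no shared probability mass.
        overlap = 1.0 - 0.5 * sum(abs(x - y) for x, y in zip(col_a, col_b))
        score += weight * overlap
        total_weight += weight
    return score / total_weight if total_weight else 0.0

def average(a, b):
    """Merge two aligned motifs by averaging their columns."""
    return [[(x + y) / 2 for x, y in zip(col_a, col_b)]
            for col_a, col_b in zip(a, b)]

def cluster(motifs, threshold=0.9):
    """Repeatedly merge the most similar pair until no pair reaches the threshold."""
    clusters = list(motifs)
    while len(clusters) > 1:
        pairs = [(similarity(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        best_score, i, j = max(pairs, key=lambda t: t[0])
        if best_score < threshold:
            break
        merged = average(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Toy input: two near-identical motifs and one unrelated motif.
m1 = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]
m2 = [[0.95, 0.02, 0.02, 0.01], [0.01, 0.96, 0.02, 0.01]]
m3 = [[0.01, 0.01, 0.01, 0.97], [0.01, 0.01, 0.97, 0.01]]
result = cluster([m1, m2, m3])
```

With this toy input the two near-identical motifs are merged and the unrelated one is kept separate. A real pipeline would also have to handle motifs of different lengths, alignment offsets and reverse complements, which this sketch omits.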

Availability: GimmeMotifs is implemented in Python and runs on Linux. The source code is freely available for download at http://www.ncmls.eu/bioinfo/gimmemotifs/.

Contact: s.vanheeringen@ncmls.ru.nl

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Binding Sites
  • Chromatin Immunoprecipitation*
  • Computational Biology / methods
  • Humans
  • Mice
  • Sequence Analysis, DNA
  • Software*
  • Transcription Factors / metabolism*

Substances

  • Transcription Factors