Composite Module Analyst: identification of transcription factor binding site combinations using genetic algorithm

Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W541-5. doi: 10.1093/nar/gkl342.

Abstract

Composite Module Analyst (CMA) is a novel software tool for identifying promoter-enhancer models based on the composition of transcription factor (TF) binding sites and their pairs. CMA is closely integrated with the TRANSFAC database. In particular, CMA uses the positional weight matrix (PWM) library collected in TRANSFAC and can therefore search for a large variety of TF binding sites. We model the structure of long gene regulatory regions by a Boolean function that joins several local modules, each consisting of co-localized TF binding sites. Given a set of co-regulated genes as input, CMA builds the promoter model and optimizes its parameters automatically by applying a genetic-regression algorithm. The algorithm uses a multicomponent fitness function that combines several statistical criteria in a weighted linear function. We show examples of the successful application of CMA to microarray data on transcription profiling of TNF-alpha-stimulated primary human endothelial cells. The CMA web server is freely accessible at http://www.gene-regulation.com/pub/programs/cma/CMA.html. An advanced version of CMA is also part of the commercial system ExPlain™ (www.biobase.de), designed for causal analysis of gene expression data.
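The modelling idea in the abstract (local modules of co-localized sites joined by a Boolean function, scored by a weighted linear fitness and optimized by a genetic algorithm) can be illustrated with a minimal Python sketch. This is not the CMA implementation. Everything in it is an assumption made for illustration: the site hits and their scores are invented toy data standing in for PWM matches, the Boolean function is a plain OR over modules, the fitness combines only sensitivity, specificity, and a size penalty with arbitrary weights, and the regression component of the published algorithm is omitted.

```python
import random

# A site hit is (tf_name, position, pwm_score); a sequence is a list of hits.
# A module is (set of TFs, window width, score cutoff). All values are hypothetical.

def module_matches(module, hits):
    """True if every TF of the module has a hit >= cutoff inside one window."""
    tfs, window, cutoff = module
    good = [(tf, pos) for tf, pos, score in hits if tf in tfs and score >= cutoff]
    for _, start in good:
        in_window = {tf for tf, pos in good if start <= pos <= start + window}
        if in_window == set(tfs):
            return True
    return False

def model_matches(model, hits):
    """Assumed Boolean function: OR over the local modules."""
    return any(module_matches(m, hits) for m in model)

def fitness(model, positives, negatives, w_sens=0.5, w_spec=0.5, w_size=0.05):
    """Weighted linear mix of sensitivity and specificity minus a size penalty."""
    sens = sum(model_matches(model, s) for s in positives) / len(positives)
    spec = sum(not model_matches(model, s) for s in negatives) / len(negatives)
    size = sum(len(m[0]) for m in model)
    return w_sens * sens + w_spec * spec - w_size * size

def random_module(tf_names, rng):
    k = rng.randint(1, 2)
    return (frozenset(rng.sample(tf_names, k)),
            rng.choice([20, 50, 100]), rng.choice([0.7, 0.8, 0.9]))

def crossover(a, b, rng):
    """Child takes one or two modules drawn from both parents."""
    pool = a + b
    return rng.sample(pool, rng.randint(1, min(2, len(pool))))

def mutate(model, tf_names, rng):
    """Replace one randomly chosen module with a fresh random one."""
    model = list(model)
    model[rng.randrange(len(model))] = random_module(tf_names, rng)
    return model

def evolve(positives, negatives, tf_names, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    score = lambda m: fitness(m, positives, negatives)
    pop = [[random_module(tf_names, rng)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]  # elitist selection: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.5:
                child = mutate(child, tf_names, rng)
            children.append(child)
        pop = parents + children
    return max(pop, key=score)

# Toy data: "co-regulated" promoters carry NFKB and AP1 sites close together.
tf_names = ["NFKB", "AP1", "SP1"]
positives = [
    [("NFKB", 100, 0.95), ("AP1", 120, 0.90), ("SP1", 400, 0.75)],
    [("NFKB", 50, 0.92), ("AP1", 60, 0.91)],
]
negatives = [
    [("NFKB", 100, 0.95), ("AP1", 500, 0.90)],
    [("SP1", 10, 0.95), ("AP1", 30, 0.90)],
]

best = evolve(positives, negatives, tf_names)
print(best, fitness(best, positives, negatives))
```

In this toy setting a single module requiring NFKB and AP1 within a short window separates the two sets, while any single-TF module does not, which is the kind of pairwise composition the abstract refers to. The size penalty is one simple way to discourage overly complex models; the statistical criteria actually used by CMA are not specified in the abstract.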

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binding Sites
  • Endothelial Cells / metabolism
  • Gene Expression Profiling
  • Humans
  • Internet
  • Promoter Regions, Genetic*
  • Sequence Analysis, DNA / methods
  • Software*
  • Transcription Factors / metabolism*
  • User-Computer Interface

Substances

  • Transcription Factors