Comparing association network algorithms for reverse engineering of large-scale gene regulatory networks: synthetic versus real data

Bioinformatics. 2007 Jul 1;23(13):1640-7. doi: 10.1093/bioinformatics/btm163. Epub 2007 May 7.

Abstract

Motivation: Inferring a gene regulatory network exclusively from microarray expression profiles is a difficult but important task. The aim of this work is to compare the predictive power of some of the most popular algorithms under different conditions (such as data taken at equilibrium or as time courses) and on both synthetic and real microarray data. We are particularly interested in comparing similarity measures of both linear type (correlations and partial correlations) and non-linear type (mutual information and conditional mutual information), and in investigating the underdetermined case (fewer samples than genes).
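To make the linear and non-linear similarity measures named above concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the equal-width binning, the number of bins, and the toy profiles are all hypothetical, and the mutual information is a simple plug-in estimate.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a third profile z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def discretize(x, bins=3):
    """Equal-width binning (an assumed choice; other schemes are possible)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant profile
    return [min(int((v - lo) / width), bins - 1) for v in x]


def mutual_information(x, y, bins=3):
    """Plug-in mutual information estimate (in nats) from binned profiles."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    pxy, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


# Toy profiles (hypothetical): x and y are both driven by a common regulator z,
# with perturbations that are orthogonal to z and to each other.
z = [1, 1, 1, 1, -1, -1, -1, -1]
x = [1.5, 1.5, 0.5, 0.5, -0.5, -0.5, -1.5, -1.5]
y = [1.5, 0.5, 1.5, 0.5, -0.5, -1.5, -0.5, -1.5]

print(pearson(x, y))          # plain correlation links x and y
print(partial_corr(x, y, z))  # conditioning on z removes the indirect link
print(mutual_information(x, y, bins=2))
```

In this toy case x and y are correlated only through z, so the partial correlation vanishes while the plain correlation does not. This is the kind of indirect association that conditioning is meant to discount.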

Results: In our simulations, all network inference algorithms perform better on data produced with 'structural' perturbations, like gene knockouts at steady state, than on data produced with any dynamical perturbation. The predictive power of all algorithms is confirmed on a reverse engineering problem from Escherichia coli gene profiling data: the edges of the 'physical' network of transcription factor-binding sites are significantly overrepresented among the highest-weighted edges of the graph that we infer directly from the data without any structure supervision. Comparing synthetic and in vivo data on the same network graph allows us to give an indication of how much more complex a real transcriptional regulation program is with respect to an artificial model.
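One way to picture the overrepresentation check described above is to rank the inferred edges by weight, keep the top k, and ask how likely that many known edges would be under random selection. The sketch below uses a hypergeometric tail probability for this. That statistic, the function names, and the toy weights are assumptions made for illustration; the abstract does not specify which test the authors used.

```python
from math import comb


def top_k_edges(weights, k):
    """Return the k edges with the largest absolute weight."""
    ranked = sorted(weights, key=lambda e: abs(weights[e]), reverse=True)
    return ranked[:k]


def hypergeom_tail(hits, k, n_true, n_total):
    """P(X >= hits) when k edges are drawn at random from n_total candidates,
    n_true of which belong to the reference ('physical') network."""
    upper = min(k, n_true)
    favourable = sum(comb(n_true, i) * comb(n_total - n_true, k - i)
                     for i in range(hits, upper + 1))
    return favourable / comb(n_total, k)


# Hypothetical inferred weights for all 6 candidate edges among 4 genes.
weights = {
    ("a", "b"): 0.9, ("a", "c"): 0.8, ("b", "c"): 0.1,
    ("a", "d"): -0.7, ("b", "d"): 0.2, ("c", "d"): 0.05,
}
true_edges = {("a", "b"), ("a", "d")}  # hypothetical reference network

top = top_k_edges(weights, k=3)
hits = sum(edge in true_edges for edge in top)
p_value = hypergeom_tail(hits, k=3, n_true=len(true_edges), n_total=len(weights))
print(top, hits, p_value)
```

With so few edges the toy p-value is not small. On a genome-scale network the same calculation would run over many thousands of candidate edges.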

Availability: Software is freely available at the URL http://people.sissa.it/~altafini/papers/SoBiAl07/.

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms*
  • Biomedical Engineering / methods
  • Computer Simulation
  • Databases, Protein
  • Gene Expression Profiling
  • Gene Expression Regulation / physiology*
  • Models, Biological*
  • Oligonucleotide Array Sequence Analysis / methods*
  • Proteome / metabolism*
  • Signal Transduction / physiology*

Substances

  • Proteome