Advances to Bayesian network inference for generating causal networks from observational biological data

Bioinformatics. 2004 Dec 12;20(18):3594-603. doi: 10.1093/bioinformatics/bth448. Epub 2004 Jul 29.

Abstract

Motivation: Network inference algorithms are powerful computational tools for identifying putative causal interactions among variables from observational data. Bayesian network inference algorithms hold particular promise in that they can capture linear, non-linear, combinatorial, stochastic and other types of relationships among variables across multiple levels of biological organization. However, challenges remain when applying these algorithms to limited quantities of experimental data collected from biological systems. Here, we use a simulation approach to make advances in our dynamic Bayesian network (DBN) inference algorithm, especially in the context of limited quantities of biological data.

Results: We test a range of scoring metrics and search heuristics to find an effective algorithm configuration for evaluating our methodological advances. We also identify sampling intervals and levels of data discretization that allow the best recovery of the simulated networks. We develop a novel influence score for DBNs that attempts to estimate both the sign (activation or repression) and relative magnitude of interactions among variables. When faced with limited quantities of observational data, combining our influence score with moderate data interpolation removes a significant portion of the false-positive interactions from the recovered networks. Together, our advances allow DBN inference algorithms to be more effective in recovering biological networks from experimentally collected data.
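
The two preprocessing steps named in the Results, interpolation of sparse time series and discretization into a small number of levels, can be illustrated with a minimal sketch. The abstract does not specify how either step was performed, so everything here is an assumption made for illustration: linear interpolation, equal-frequency (quantile) binning, and the function names `interpolate_series` and `discretize_quantiles`. It should not be read as the authors' implementation, and it does not attempt the influence score, which the abstract does not define.

```python
import numpy as np


def interpolate_series(values, factor):
    """Linearly interpolate a time series (assumed method, not from the paper).

    Each original sampling interval is split into `factor` equal
    sub-intervals, so n points become (n - 1) * factor + 1 points.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    old_t = np.arange(n)                                  # original time indices
    new_t = np.linspace(0, n - 1, (n - 1) * factor + 1)   # denser time grid
    return np.interp(new_t, old_t, values)


def discretize_quantiles(values, levels):
    """Map continuous values to integer states 0..levels-1.

    Uses equal-frequency bins (assumed policy, not from the paper): the cut
    points are the interior quantiles of the observed values.
    """
    values = np.asarray(values, dtype=float)
    interior = np.linspace(0.0, 1.0, levels + 1)[1:-1]    # e.g. 1/3, 2/3 for 3 levels
    cuts = np.quantile(values, interior)
    return np.searchsorted(cuts, values, side="right")


if __name__ == "__main__":
    # Hypothetical expression measurements for one variable over six time points.
    series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    dense = interpolate_series(series, factor=2)
    states = discretize_quantiles(dense, levels=3)
    print(dense)
    print(states)
```

In a sketch like this, the interpolation factor and the number of discretization levels are the two settings that would be varied against simulated data to see which combination gives the best network recovery.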

Availability: Source code and simulated data are available upon request.

Supplementary information: http://www.jarvislab.net/Bioinformatics/BNAdvances/

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Algorithms*
  • Bayes Theorem
  • Computer Simulation
  • Gene Expression Profiling / methods*
  • Gene Expression Regulation / physiology*
  • Models, Genetic*
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*
  • Signal Transduction / physiology*
  • Software