Elsevier

Biosystems

Volume 84, Issue 2, May 2006, Pages 81-90
An analysis of the class of gene regulatory functions implied by a biochemical model

https://doi.org/10.1016/j.biosystems.2005.09.009

Abstract

Understanding the integrated behavior of genetic regulatory networks, in which genes regulate one another's activities via RNA and protein products, is emerging as a dominant problem in systems biology. One widely studied class of models of such networks includes genes whose expression values assume Boolean values (i.e., on or off). Design decisions in the development of Boolean network models of gene regulatory systems include the topology of the network (including the distribution of input- and output-connectivity) and the class of Boolean functions used by each gene (e.g., canalizing functions, Post functions, etc.). For example, evidence from simulations suggests that biologically realistic dynamics can be produced by scale-free network topologies with canalizing Boolean functions. This work seeks further insights into the design of Boolean network models through the construction and analysis of a class of models that include more concrete biochemical mechanisms than the usual abstract model, including genes and gene products, dimerization, cis-binding sites, promoters and repressors. In this model, it is assumed that the system consists of N genes, with each gene producing one protein product. Proteins may form complexes such as dimers, trimers, etc. The model also includes cis-binding sites to which proteins may bind to form activators or repressors. Binding affinities are based on structural complementarity between proteins and binding sites, with molecular binding sites modeled by bit-strings. Biochemically plausible gene expression rules are used to derive a Boolean regulatory function for each gene in the system. The result is a network model in which both topological features and Boolean functions arise as emergent properties of the interactions of components at the biochemical level. A highly biased set of Boolean functions is observed in simulations of networks of various sizes, suggesting a new characterization of the subset of Boolean functions that are likely to appear in gene regulatory networks.

Introduction

Genetic regulatory networks, the linked network of genes and their products that regulate one another's activities, form the basis of the integrated behavior of the genome, regulating some 30,000 genes and their products in humans. Understanding this system is at the forefront of contemporary molecular, cellular, and computational biology. The development of DNA microarrays (Eisen et al., 1998) has provided methods to measure the expression level of thousands of genes at one time, opening the door to the inference of regulatory interactions from high-throughput data. At this early stage when we know little of the architecture and logic governing the network, it is important to investigate which classes of models are most appropriate to describe such interactions. In particular, this paper addresses the class of control rules governing gene activities that can be generated via specified molecular mechanisms.

Approaches to modeling genetic regulatory networks have been developed using a variety of formal models, including synchronous Boolean networks (Kauffman et al., 2003, Kauffman, 1971), continuous-time switching networks (Glass and Kauffman, 1973), linear systems (Deng et al., 2005), S-systems (Judd et al., 2000), oscillators (Judd et al., 2000), and specialized models for individual circuits (McAdams and Shapiro, 1995). This paper addresses Boolean network models in which nodes represent genes, and edges represent regulatory interactions among genes. Nodes have binary values $v_i$, where $v_i = 1$ indicates that gene $i$ is on (expressed) and $v_i = 0$ indicates that gene $i$ is off (not expressed). Each node $i$ has $K_i$ inputs (regulatory genes), and each node uses a deterministic Boolean (logical) function to update its value based on the values of its inputs:

$$v_i = B_i(v_{i_1}, v_{i_2}, \ldots, v_{i_{K_i}})$$

It is usually assumed that each node updates its value synchronously. While Boolean networks are clearly simplified models of genetic regulatory networks, they are nonetheless useful as a first approximation for many purposes. In developing Boolean network models for gene regulation, two critical questions arise: first, what is the topology, or pattern of input- and output-connections? Several classes of topologies have been considered, including random (Fox and Hill, 2001, Kauffman, 1969a), scale-free (Barabasi and Bonabeau, 2003, Fox and Hill, 2001, Goldberg and Roth, 2003), small-world (Goldberg and Roth, 2003), and hierarchical networks (Ravasz and Barabasi, 2003). An interesting approach to studying network topologies involves proposing a specific mechanism for network evolution, for example, gene duplication events (Babu et al., 2004), and then investigating the class of topologies that arise under this assumption.
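The synchronous update rule described above can be illustrated with a minimal sketch. The three-gene wiring and the NOT/AND/OR rules below are hypothetical, chosen only to make the example concrete; they are not taken from the paper, and the names `inputs`, `rules`, `step` and `trajectory` are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical three-gene network, for illustration only (not from the paper).
# inputs[i] lists the regulators of gene i; rules[i] plays the role of B_i.
inputs: Dict[int, Tuple[int, ...]] = {
    0: (2,),     # gene 0 is regulated by gene 2
    1: (0, 2),   # gene 1 is regulated by genes 0 and 2
    2: (0, 1),   # gene 2 is regulated by genes 0 and 1
}
rules: Dict[int, Callable[..., int]] = {
    0: lambda c: 1 - c,      # NOT
    1: lambda a, c: a & c,   # AND
    2: lambda a, b: a | b,   # OR
}


def step(state: Sequence[int]) -> Tuple[int, ...]:
    """Synchronous update: every gene reads the old state, then all change at once."""
    return tuple(
        rules[i](*(state[j] for j in inputs[i])) for i in range(len(state))
    )


def trajectory(state: Sequence[int], n_steps: int) -> List[Tuple[int, ...]]:
    """Return the initial state followed by n_steps successive updates."""
    states = [tuple(state)]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


if __name__ == "__main__":
    for s in trajectory((1, 0, 0), 5):
        print(s)
```

Because each new state is computed entirely from the previous one, the order in which genes are evaluated does not matter; an asynchronous scheme would need a different `step`.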

A second question concerns the class of logical functions to be used to describe the regulatory interactions among genes. The number of possible Boolean functions giving the possible logical rules for gene interactions grows very rapidly as the number of molecular inputs increases. In general, there are $2^{2^K}$ Boolean functions of $K$ inputs, hence 16 functions for $K = 2$, 256 functions for $K = 3$, and over four billion functions for $K = 5$. However, it is possible that a much more limited set of regulatory functions arises in nature. For example, there may be biases on the class of regulatory rules based on the interaction of biochemical principles and the mechanisms of natural selection. It is desirable to limit the Boolean functions in gene regulatory network models to the appropriate subset of functions that may arise in nature, since a model with biologically plausible functions is more likely to yield simulations that accurately predict the behaviors of real biological systems. Moreover, assuming a limited class of Boolean functions facilitates the task of inferring a model from experimental data, sometimes called reverse engineering (D’Haeseleer et al., 2000, Gat-Viks and Shamir, 2003, Liang et al., 1998).
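The counts quoted above follow from the fact that a function of K binary inputs has 2^K truth-table rows, each of which can independently map to 0 or 1. A short sketch (function names are illustrative) checks the closed form against brute-force enumeration for small K:

```python
from itertools import product
from typing import List, Tuple


def count_boolean_functions(k: int) -> int:
    """Closed form: 2**k input rows, each of which can map to 0 or 1."""
    return 2 ** (2 ** k)


def enumerate_truth_tables(k: int) -> List[Tuple[int, ...]]:
    """Brute-force list of every truth table on k inputs (small k only)."""
    n_rows = 2 ** k
    return list(product((0, 1), repeat=n_rows))


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(k, count_boolean_functions(k))
```

Enumeration is only practical for very small K, which is one reason a restricted function class makes model inference more tractable.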

Several classes of Boolean functions have been investigated in the context of genetic regulatory models, including random functions (Kauffman, 1969b), canalizing functions (Kauffman, 1971), hierarchical canalizing functions (Szallasi and Liang, 1998), and Post classes (Shmulevich et al., 2003). Several studies suggest that canalizing functions play an important role in genetic regulatory systems. An analysis of gene regulatory function in yeast has indicated that experimentally derived regulatory functions are primarily canalizing functions (Harris et al., 2002, Kauffman et al., 2003). In addition, networks controlled by canalizing functions exhibit more stable dynamics than networks controlled by random Boolean functions (Kauffman, 1993). A recent analysis of gene regulation relationships seems to favor the class of so-called chain functions (Gat-Viks and Shamir, 2003, Kauffman, 1969b). At this point, there still may be insufficient data to determine the proper class of Boolean functions merely by generalizing from experimental data.
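As an illustration of the canalizing property, the sketch below uses the commonly cited definition: a function is canalizing if at least one input has a value that, by itself, fixes the output regardless of the other inputs. This is an assumption about the intended definition, since the paper's own footnote is not reproduced here; under this definition constant functions count as canalizing, although some authors exclude them. The name `is_canalizing` is illustrative.

```python
from itertools import product
from typing import Callable, Tuple


def is_canalizing(f: Callable[[Tuple[int, ...]], int], k: int) -> bool:
    """True if some input, held at some value, fixes the output of f."""
    rows = list(product((0, 1), repeat=k))
    for i in range(k):
        for a in (0, 1):
            # Collect the outputs over all rows where input i equals a.
            outputs = {f(x) for x in rows if x[i] == a}
            if len(outputs) == 1:
                return True
    return False


if __name__ == "__main__":
    print(is_canalizing(lambda x: x[0] & x[1], 2))  # AND
    print(is_canalizing(lambda x: x[0] ^ x[1], 2))  # XOR
```

For two inputs, AND and OR satisfy the test (a single 0 or 1, respectively, decides the output), while XOR does not.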

In this article, we describe an alternative approach to selecting the class of Boolean functions, based on a model that explicitly includes selected underlying regulatory mechanisms. The objective is to identify the class of regulatory functions that arise under a set of realistic assumptions about the kinds of protein–DNA interactions known to occur in cells. In the next section, we propose a specific model of genes and molecular interactions, and use this model to derive distributions over the set of Boolean regulatory rules. The results demonstrate a remarkable limitation on the number of Boolean functions generated and suggest a central role for unate Boolean functions. In comparison with the approach of inferring Boolean function classes from experimental databases, the current method carries the possibility of providing stronger mechanistic explanations for regularities in the topology or logical functions that occur. The current model is an initial effort, but we expect that building increasingly accurate models of the mechanisms of gene regulation will produce networks with specific topological and logical features that can then be calibrated against experimental data.
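For readers unfamiliar with the term, a unate function is usually defined as one that is monotone in each variable taken separately: raising any single input from 0 to 1 either never lowers the output or never raises it. The sketch below tests this directly on a truth table. It assumes this standard definition, which may differ in detail from the paper's formal statement, and the name `is_unate` is illustrative.

```python
from itertools import product
from typing import Callable, Tuple


def is_unate(f: Callable[[Tuple[int, ...]], int], k: int) -> bool:
    """True if f is monotone (increasing or decreasing) in every input."""
    rows = list(product((0, 1), repeat=k))
    for i in range(k):
        rising = falling = False
        for x in rows:
            if x[i] == 0:
                y = x[:i] + (1,) + x[i + 1:]  # same row with input i set to 1
                lo, hi = f(x), f(y)
                if hi > lo:
                    rising = True
                elif hi < lo:
                    falling = True
        # Input i acts as both activator and repressor: not unate.
        if rising and falling:
            return False
    return True


if __name__ == "__main__":
    print(is_unate(lambda x: x[0] & (1 - x[1]), 2))  # x0 AND NOT x1
    print(is_unate(lambda x: x[0] ^ x[1], 2))        # XOR
```

In regulatory terms, a unate rule is one in which each input acts consistently as an activator or consistently as a repressor of the target gene.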

A biochemical gene regulatory model

The approach here is to first create abstract models of known regulatory mechanisms, and then to observe what classes of switching functions occur within the model. In the proposed model, genes are associated with a regulatory site and a coding region. Coding regions are expressed as proteins, which may form complexes (dimers, trimers, tetramers, etc.). A protein may regulate a gene by binding to the gene's regulatory site. Molecular binding is assumed to occur with affinity, or strength, based

Boolean functions from ensemble simulations

The primary goal of this work is to identify subclasses of biologically plausible Boolean functions that reflect the biochemical mechanisms of gene regulation. The model described above was used to generate an ensemble of regulatory networks in search of regulatory motifs, or classes of Boolean functions that occur more often than expected by chance. Simulations were performed using the model parameters shown in Table 1. One thousand networks of size N = 250, 500, 750 and 1000 genes were

Discussion

There are several plausible explanations for the preponderance of unate functions observed under this model. First, non-unate functions require a rare combination of events. For a variable $x_i$ to occur in both a positive subterm of a Boolean function and a negative subterm requires that the monomer represented by $x_i$ occur both as part of an activating protein and as part of a repressing protein for the same regulatory site. This can only occur if the monomer participates as part of a protein

Summary

A model of gene regulation has been developed that includes protein–protein and protein–DNA interactions. This model can be used to simulate ensembles of gene regulatory networks, and to generate statistics concerning the distribution of Boolean regulatory functions. An objective of this study was to identify a biologically plausible subset of Boolean functions that represent the set of control rules that occur in biological networks. Simulations of many thousands of genetic regulatory networks

Acknowledgments

We thank the anonymous reviewers for helpful comments and suggestions, and Ilya Shmulevich for suggestions on improving the terminology.

References (26)

  • I. Gat-Viks et al. Chain functions and scoring functions in genetic networks. Bioinformatics (2003)
  • D.S. Goldberg et al. Assessing experimentally derived interactions in a small world. Proc. Natl. Acad. Sci. U.S.A. (2003)
  • S.E. Harris et al. A model of transcriptional regulatory networks based on biases in the observed regulation rules. Complexity (2002)