An analysis of the class of gene regulatory functions implied by a biochemical model

doi:10.1016/j.biosystems.2005.09.009

Biosystems

Volume 84, Issue 2, May 2006, Pages 81-90


Abstract

Understanding the integrated behavior of genetic regulatory networks, in which genes regulate one another's activities via RNA and protein products, is emerging as a dominant problem in systems biology. One widely studied class of models of such networks includes genes whose expression values assume Boolean values (i.e., on or off). Design decisions in the development of Boolean network models of gene regulatory systems include the topology of the network (including the distribution of input- and output-connectivity) and the class of Boolean functions used by each gene (e.g., canalizing functions, post functions, etc.). For example, evidence from simulations suggests that biologically realistic dynamics can be produced by scale-free network topologies with canalizing Boolean functions. This work seeks further insights into the design of Boolean network models through the construction and analysis of a class of models that include more concrete biochemical mechanisms than the usual abstract model, including genes and gene products, dimerization, cis-binding sites, promoters and repressors. In this model, it is assumed that the system consists of N genes, with each gene producing one protein product. Proteins may form complexes such as dimers, trimers, etc. The model also includes cis-binding sites to which proteins may bind to form activators or repressors. Binding affinities are based on structural complementarity between proteins and binding sites, with molecular binding sites modeled by bit-strings. Biochemically plausible gene expression rules are used to derive a Boolean regulatory function for each gene in the system. The result is a network model in which both topological features and Boolean functions arise as emergent properties of the interactions of components at the biochemical level. 
A highly biased set of Boolean functions is observed in simulations of networks of various sizes, suggesting a new characterization of the subset of Boolean functions that are likely to appear in gene regulatory networks.

Introduction

Genetic regulatory networks, the linked network of genes and their products that regulate one another's activities, form the basis of the integrated behavior of the genome, regulating some 30,000 genes and their products in humans. Understanding this system is at the forefront of contemporary molecular, cellular, and computational biology. The development of DNA microarrays (Eisen et al., 1998) has provided methods to measure the expression level of thousands of genes at one time, opening the door to the inference of regulatory interactions from high-throughput data. At this early stage when we know little of the architecture and logic governing the network, it is important to investigate which classes of models are most appropriate to describe such interactions. In particular, this paper addresses the class of control rules governing gene activities that can be generated via specified molecular mechanisms.

Approaches to modeling genetic regulatory networks have been developed using a variety of formal models, including synchronous Boolean networks (Kauffman et al., 2003, Kauffman, 1971), continuous-time switching networks (Glass and Kauffman, 1973), linear systems (Deng et al., 2005), S-systems (Judd et al., 2000), oscillators (Judd et al., 2000), and specialized models for individual circuits (McAdams and Shapiro, 1995). This paper addresses Boolean network models in which nodes represent genes, and edges represent regulatory interactions among genes. Nodes have binary values $v_{i}$, where $v_{i} = 1$ indicates that gene $i$ is on (expressed) and $v_{i} = 0$ indicates that gene $i$ is off (not expressed). Each node $i$ has $K_{i}$ inputs (regulatory genes), and each node uses a deterministic Boolean (logical) function to update its value based on the values of its inputs: $v_{i} = B_{i}(v_{i_1}, v_{i_2}, \dots, v_{i_{K_i}})$. It is usually assumed that all nodes update their values synchronously. While Boolean networks are clearly simplified models of genetic regulatory networks, they are nonetheless useful as a first approximation for many purposes. In developing Boolean network models for gene regulation, two critical questions arise. The first is: what is the topology, or pattern of input- and output-connections? Several classes of topologies have been considered, including random (Fox and Hill, 2001, Kauffman, 1969a), scale-free (Barabasi and Bonabeau, 2003, Fox and Hill, 2001, Goldberg and Roth, 2003), small-world (Goldberg and Roth, 2003), and hierarchical networks (Ravasz and Barabasi, 2003). An interesting approach to studying network topologies involves proposing a specific mechanism for network evolution, for example, gene duplication events (Babu et al., 2004), and then investigating the class of topologies that arise under this assumption.
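The synchronous update rule described above can be illustrated with a minimal sketch. The three-gene wiring, the particular Boolean functions, and the names `step` and `trajectory` are hypothetical choices made only for illustration; they are not taken from the paper's model.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]


def step(state: State,
         inputs: Sequence[Sequence[int]],
         functions: Sequence[Callable[..., int]]) -> State:
    """Apply every gene's Boolean function to the current state at once."""
    return tuple(
        functions[i](*(state[j] for j in inputs[i]))
        for i in range(len(state))
    )


def trajectory(state: State,
               inputs: Sequence[Sequence[int]],
               functions: Sequence[Callable[..., int]],
               n_steps: int) -> List[State]:
    """Return the initial state followed by n_steps synchronous updates."""
    states = [state]
    for _ in range(n_steps):
        state = step(state, inputs, functions)
        states.append(state)
    return states


# Hypothetical wiring: inputs[i] lists the regulators of gene i.
inputs = [(1, 2), (0,), (0, 1)]
functions = [
    lambda a, b: a & b,  # gene 0 is on when genes 1 and 2 are both on
    lambda a: 1 - a,     # gene 1 is on when gene 0 is off
    lambda a, b: a | b,  # gene 2 is on when gene 0 or gene 1 is on
]

if __name__ == "__main__":
    for s in trajectory((1, 1, 1), inputs, functions, 5):
        print(s)
```

Because every gene reads the same previous state, the order in which genes are evaluated within a step does not matter; this is what distinguishes the synchronous scheme from asynchronous variants.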

A second question concerns the class of logical functions to be used to describe the regulatory interactions among genes. The number of possible Boolean functions giving the possible logical rules for gene interactions grows very rapidly as the number of molecular inputs increases. In general, there are $2^{2^{K}}$ Boolean functions of $K$ inputs, hence 16 functions for $K = 2$, 256 functions for $K = 3$, and over four billion functions for $K = 5$. However, it is possible that a much more limited set of regulatory functions arises in nature. For example, there may be biases on the class of regulatory rules based on the interaction of biochemical principles and the mechanisms of natural selection. It is desirable to limit the Boolean functions in gene regulatory network models to the appropriate subset of functions that may arise in nature, since a model with biologically plausible functions is more likely to yield simulations that accurately predict the behaviors of real biological systems. Moreover, assuming a limited class of Boolean functions facilitates the task of inferring a model from experimental data, sometimes called reverse engineering (D’Haeseleer et al., 2000, Gat-Viks and Shamir, 2003, Liang et al., 1998).
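The counts quoted above follow directly from the formula: a function of $K$ inputs has $2^{K}$ truth-table rows, and each row can independently output 0 or 1. A short check, with the helper name `count_boolean_functions` chosen here for illustration:

```python
def count_boolean_functions(k: int) -> int:
    """Number of distinct Boolean functions of k binary inputs."""
    rows = 2 ** k      # one truth-table row per input combination
    return 2 ** rows   # each row can map to 0 or 1


if __name__ == "__main__":
    for k in (2, 3, 5):
        print(f"K = {k}: {count_boolean_functions(k)} functions")
```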

Several classes of Boolean functions have been investigated in the context of genetic regulatory models, including random functions (Kauffman, 1969b), canalizing functions (Kauffman, 1971), hierarchical canalizing functions (Szallasi and Liang, 1998), and Post classes (Shmulevich et al., 2003). Several studies suggest that canalizing functions play an important role in genetic regulatory systems. An analysis of gene regulatory function in yeast has indicated that experimentally derived regulatory functions are primarily canalizing functions (Harris et al., 2002, Kauffman et al., 2003). In addition, networks controlled by canalizing functions exhibit more stable dynamics than networks controlled by random Boolean functions (Kauffman, 1993). A recent analysis of gene regulation relationships seems to favor the class of so-called chain functions (Gat-Viks and Shamir, 2003, Kauffman, 1969b). At this point, there still may be insufficient data to determine the proper class of Boolean functions merely by generalizing from experimental data.
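As a rough illustration of the canalizing property mentioned above, the sketch below assumes the standard definition: a function is canalizing if some input, held at some value, fixes the output regardless of the other inputs. The definition is supplied here as an assumption, since it is not spelled out in this excerpt, and `is_canalizing` is a hypothetical helper that checks the property by brute force over the truth table.

```python
from itertools import product
from typing import Callable


def is_canalizing(f: Callable[..., int], k: int) -> bool:
    """Brute-force test: does fixing some input to some value fix the output?

    Assumes the standard definition of a canalizing function. Under this
    test, constant functions count as (trivially) canalizing.
    """
    for i in range(k):
        for a in (0, 1):
            outputs = {
                f(*x) for x in product((0, 1), repeat=k) if x[i] == a
            }
            if len(outputs) == 1:
                return True
    return False


if __name__ == "__main__":
    print(is_canalizing(lambda a, b: a & b, 2))  # AND: a = 0 forces output 0
    print(is_canalizing(lambda a, b: a ^ b, 2))  # XOR: no single input suffices
```

The check enumerates all $2^{K}$ input combinations, so it is only practical for the small values of $K$ typical of individual regulatory functions.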

In this article, we describe an alternative approach to selecting the class of Boolean functions, based on a model that explicitly includes selected underlying regulatory mechanisms. The objective is to identify the class of regulatory functions that arise under a set of realistic assumptions about the kinds of protein–DNA interactions known to occur in cells. In the next section, we propose a specific model of genes and molecular interactions, and use this model to derive distributions over the set of Boolean regulatory rules. The results demonstrate a remarkable limitation on the number of Boolean functions generated and suggest a central role for unate Boolean functions. In comparison with the approach of inferring Boolean function classes from experimental databases, the current method carries the possibility of providing stronger mechanistic explanations for regularities in the topology or logical functions that occur. The current model is an initial effort, but we expect that building increasingly accurate models of the mechanisms of gene regulation will produce networks with specific topological and logical features that can then be calibrated against experimental data.
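For readers unfamiliar with the term, a unate function is assumed here to mean one that is monotone in each input: every variable acts either only as an activator or only as a repressor, never both. The following sketch tests that property by brute force; the definition and the helper name `is_unate` are assumptions added for illustration and are not the procedure used in the paper.

```python
from itertools import product
from typing import Callable


def is_unate(f: Callable[..., int], k: int) -> bool:
    """Return True if f is monotone (up or down) in every one of its k inputs."""
    for i in range(k):
        increases = decreases = False
        for x in product((0, 1), repeat=k):
            if x[i] == 1:
                continue  # compare each pair (x_i = 0, x_i = 1) only once
            low = f(*x)
            high = f(*(x[:i] + (1,) + x[i + 1:]))
            if high > low:
                increases = True
            elif high < low:
                decreases = True
        if increases and decreases:
            return False  # input i acts as both activator and repressor
    return True


if __name__ == "__main__":
    print(is_unate(lambda a, b: a & (1 - b), 2))  # a AND NOT b
    print(is_unate(lambda a, b: a ^ b, 2))        # XOR
```

In this sketch, `a AND NOT b` passes because `a` only ever raises the output and `b` only ever lowers it, while XOR fails because each input can push the output in either direction depending on the other.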

Section snippets

A biochemical gene regulatory model

The approach here is to first create abstract models of known regulatory mechanisms, and then to observe what classes of switching functions occur within the model. In the proposed model, genes are associated with a regulatory site and a coding region. Coding regions are expressed as proteins, which may form complexes (dimers, trimers, tetramers, etc.). A protein may regulate a gene by binding to the gene's regulatory site. Molecular binding is assumed to occur with affinity, or strength, based

Boolean functions from ensemble simulations

The primary goal of this work is to identify subclasses of biologically plausible Boolean functions that reflect the biochemical mechanisms of gene regulation. The model described above was used to generate an ensemble of regulatory networks in search of regulatory motifs, or classes of Boolean functions that occur more often than expected by chance. Simulations were performed using the model parameters shown in Table 1. One thousand networks of size N = 250, 500, 750 and 1000 genes were

Discussion

There are several plausible explanations for the preponderance of unate functions observed under this model. First, non-unate functions require a rare combination of events. For a variable x_i to occur in both a positive subterm of a Boolean function and a negative subterm requires that the monomer represented by x_i occur both as part of an activating protein and as part of a repressing protein for the same regulatory site. This can only occur if the monomer participates as part of a protein

Summary

A model of gene regulation has been developed that includes protein–protein and protein–DNA interactions. This model can be used to simulate ensembles of gene regulatory networks, and to generate statistics concerning the distribution of Boolean regulatory functions. An objective of this study was to identify a biologically plausible subset of Boolean functions that represent the set of control rules that occur in biological networks. Simulations of many thousands of genetic regulatory networks

Acknowledgments

We thank the anonymous reviewers for helpful comments and suggestions, and Ilya Shmulevich for suggestions on improving the terminology.

References (26)

M.M. Babu et al. Structure and evolution of transcriptional regulatory networks. Curr. Opin. Struct. Biol. (2004)
X. Deng et al. EXAMINE: a computational approach to reconstructing gene regulatory networks. Biosystems (2005)
L. Glass et al. The logical analysis of continuous, non-linear biochemical control networks. J. Theor. Biol. (1973)
S.A. Kauffman. Metabolic stability and epigenesis in randomly constructed genetic nets. J. Theor. Biol. (1969)
S.A. Kauffman. Gene regulation networks: a theory for their global structure and behaviors. Curr. Top. Dev. Biol. (1971)
A.L. Barabasi et al. Scale-free networks. Sci. Am. (2003)
P. D’Haeseleer et al. Genetic network inference: from co-expression clustering to reverse engineering. Bioinformatics (2000)
M.I. Diamond et al. Transcription factor interactions: selectors of positive or negative regulation from a single DNA element. Science (1990)
M.B. Eisen et al. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. U.S.A. (1998)
J.J. Fox et al. From topology to dynamics in biochemical networks. Chaos (2001)
I. Gat-Viks et al. Chain functions and scoring functions in genetic networks. Bioinformatics (2003)
D.S. Goldberg et al. Assessing experimentally derived interactions in a small world. Proc. Natl. Acad. Sci. U.S.A. (2003)
S.E. Harris et al. A model of transcriptional regulatory networks based on biases in the observed regulation rules. Complexity (2002)

