Relative stability of mRNA and protein severely limits inference of gene networks from single-cell mRNA measurements

Tarun Mahajan; Michael Saint-Antoine; Roy D. Dar; Abhyudai Singh

doi:10.1101/2022.03.31.486623

Abstract

Inference of gene regulatory networks from single-cell expression data, such as single-cell RNA sequencing, is a popular problem in computational biology. Despite diverse methods spanning information theory, machine learning, and statistics, it is unsolved. This shortcoming can be attributed to measurement errors, lack of perturbation data, or difficulty in causal inference. Yet, it is not known if kinetic properties of gene expression also cause an issue. We show how the relative stability of mRNA and protein hampers inference. Available inference methods perform benchmarking on synthetic data lacking protein species, which is biologically incorrect. We use a simple model of gene expression, incorporating both mRNA and protein, to show that a more stable protein than mRNA can cause loss in correlation between the mRNA of a transcription factor and its target gene. This can also happen when mRNA and protein are on the same timescale. The relative difference in timescales affects true interactions more strongly than false positives, which may not be suppressed. Besides correlation, we find that information-theoretic nonlinear measures are also prone to this problem. Finally, we demonstrate these principles in real single-cell RNA sequencing data for over 1700 yeast genes.

I. Introduction

According to the “central dogma” of molecular biology [1], genes on DNA are transcribed into messenger RNA (mRNA), which are translated into proteins. The proteins carry out various functions within the cell including regulation of expression of other genes. These regulatory relationships form gene regulatory networks (GRNs), which control the complexity of cellular life [2], [3], and malfunctions in GRNs can lead to diseases like cancer [4].

Understanding GRN function and structure is essential for cell biologists, and the inference of their topology from static transcriptomic data is important [5]. Researchers have used statistical relationships between levels of mRNA to identify the underlying GRN. These statistical methods include correlation [6], regression [7]–[9], information-theoretic techniques [10]–[14], Bayesian techniques [15], [16], and more [17]–[23]. Benchmarking studies have assessed the comparative performance of different methods [24]. An excellent review of GRN inference methods can be found in [25].

Many GRN inference techniques assume that mRNA and protein counts for a given gene are correlated, and use mRNA transcript abundance as a proxy for protein abundance, which is difficult to measure in a high-throughput manner. However, in experiments the correlation between mRNA and corresponding protein counts can be weak [26]. In this paper, we use a model of gene expression to explore the impact of relative stability of mRNA and protein on the correlation between their abundances. We also explore two information-theoretic measures, mutual information (MI) [27] and the phi-mixing coefficient [28], using stochastic simulations [29]. MI quantifies the statistical dependence between random variables in terms of the uncertainty in their individual and joint distributions [27]. The phi-mixing coefficient is a measure of the statistical dependence of two random variables based on the difference between their conditional and unconditional probability distributions [28]. Both MI ([10]–[14]) and the phi-mixing coefficient ([30]) have been used for GRN inference. Finally, we demonstrate the established principles on a real single-cell RNA sequencing (scRNA-seq) dataset for the yeast S. cerevisiae.

We show that when protein is more stable than mRNA, the GRN cannot be reliably inferred at the single-cell mRNA level. Even when the true GRN edges are identifiable, the false positive edges will dominate in correlation values. Collectively, this establishes a protein-mRNA lifetime-dependent loss in GRN signature in single-cell data.

II. A simple model of gene expression

We start with a simple model of gene expression. mRNA M is produced and degraded via a 1-dimensional Poisson birth-death process. Uppercase and lowercase letters represent a species and its molecular count, respectively; M and m are the mRNA and its count, respectively. Production of M is a Poisson birth process with rate k_mF(β), and each event creates mRNA in bursts of size β, which is distributed according to any arbitrary positive-valued probability distribution F(β). Protein P is created and degraded in a 1-dimensional conditional Poisson birth-death process. Each M molecule is translated into a molecule of P via a conditional Poisson birth process at a rate k_p. M and P are degraded as Poisson processes at rates 1/τ_m and 1/τ_p, respectively. τ_m and τ_p are the respective average lifetimes of M and P. For the rest of the paper, we assume that τ_m is constant, and τ_p = τ τ_m. This allows us to change τ_p/τ_m by varying only τ. The chemical reaction network for this system is

The time evolution of the joint probability distribution for m and p in (1) is given by the chemical master equation (CME) (2) [31], where P(m, p; t) is the joint probability distribution for m and p at time t. (2) can be solved exactly for the steady-state moments [31]–[33]. The first order moments are given by (3), where the angle brackets represent statistical expectation. The second order moments involving only one species can be written as (4) and (5), which give the total noise, or expression fluctuation, for M and P, respectively [32], [33]. Intrinsic noise is the fluctuation in m or p caused by the discrete birth-death events for M or P, respectively. Intrinsic noise for any mRNA species is given by (4) [32], [33]. Intrinsic noise for any protein species is given by the first term on the right-hand side (RHS) in (5) [32], [33]. Extrinsic noise in (5) is the noise propagated from m to p. The decomposition of total noise into intrinsic and extrinsic in (5) is fairly standard in noisy expression research [32]–[36]. The numerator for intrinsic noise in (4), B, is the average contribution to noise from birth and death events for M [34], [35]. We propose an expression for B in terms of the burst size β, where o is the little-o notation. If β is deterministic, B = (⟨β⟩ + 1)/2. If β is distributed geometrically, as is known for many genes in different species [37], [38], B = ⟨β⟩. B is an estimate of mean-independent intrinsic noise for mRNA, and we modulate noise by varying B. We vary B by changing ⟨β⟩ ((7)). Whenever ⟨β⟩ is increased/decreased by a factor, we decrease/increase k_m by the same factor to keep first order moments constant at fixed τ.
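The bookkeeping described above can be sketched in a few lines of Python. The closed form used here for B, (⟨β²⟩ + ⟨β⟩)/(2⟨β⟩), is an assumption: it is the standard Fano factor of a bursty birth-death process and reproduces both special cases stated above, but it is not transcribed from the numbered equations. The means ⟨m⟩ = k_m⟨β⟩τ_m and ⟨p⟩ = k_p⟨m⟩τ_p are likewise the standard first-order moments for this model. All function names are illustrative.

```python
def burst_factor(mean_burst, second_moment_burst):
    """Assumed form B = (<beta^2> + <beta>) / (2 <beta>)."""
    return (second_moment_burst + mean_burst) / (2.0 * mean_burst)


def steady_state_means(k_m, mean_burst, k_p, tau_m, tau):
    """Return (<m>, <p>) with tau_p = tau * tau_m (standard first-order moments)."""
    tau_p = tau * tau_m
    mean_m = k_m * mean_burst * tau_m
    mean_p = k_p * mean_m * tau_p
    return mean_m, mean_p


def rescale_bursting(k_m, mean_burst, factor):
    """Multiply <beta> by `factor` and divide k_m by it, leaving <m> unchanged."""
    return k_m / factor, mean_burst * factor


# Deterministic bursts of size 4: <beta^2> = 16.
b_det = burst_factor(4.0, 16.0)
# Geometric bursts on {1, 2, ...} with mean 4: <beta^2> = 2 * 4**2 - 4 = 28.
b_geo = burst_factor(4.0, 28.0)

# Kinetic parameters quoted in the caption of Fig. 1, evaluated at tau = 1.
tau_m = 1.0 / 0.0025
mean_m, mean_p = steady_state_means(0.0282, 4.0, 1.2 / tau_m, tau_m, 1.0)
print(b_det, b_geo, mean_m, mean_p)
```

Rescaling with `rescale_bursting` changes B while keeping ⟨m⟩ and ⟨p⟩ fixed, which is how noise is modulated independently of the means in the text.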

We are interested in the dependence of steady-state species correlations on B and τ. Since the moments in (3)–(5) are dependent on B and τ, we obtain the expression for correlation between m and p, cor(m, p), as a function of these moments.

Proposition 1

(mRNA-protein correlation in a single gene)

For a single gene (1), steady-state correlation between m and p is given by (8). (8) is obtained by solving (2) [31]–[33]. We examine the behavior of cor(m, p) when τ is fixed and B is variable, and establish the following upper bound:

Theorem 2

(Upper bound on mRNA-protein correlation in a single gene)

For a single gene (1), steady-state correlation between m and p is bounded from above by (9).

cor(m, p) is an increasing function of the total mRNA noise, which is an increasing function of B (see (4) and (6)). Consequently, cor(m, p) is an increasing function of B, and reaches its maximum value in (9) when the second term on the right-hand side (RHS) in (5) is much larger than the first term. This gives us the constraint on B in (9) while using (3)–(4). (9) is depicted by the green curve in Fig. 1b. There is a τ-mediated loss in correlation as protein becomes more stable than mRNA. For extremely stable proteins, cor(m, p) might completely vanish. Actual cor(m, p) values will always be lower than (9). The gap between the upper bound and the true cor(m, p) is governed by the relative amounts of intrinsic and extrinsic noise in p. This is evident from (8) and (9). cor(m, p) reaches the upper bound only when B is large enough. However, if B is low or moderate, cor(m, p) can fall significantly below it. The dependence of cor(m, p) on τ and B is shown in Fig. 1b. Further, we have also validated the analytical result using exact stochastic simulations in Fig. 1c. Dependence of cor(m, p) on τ and B motivates the central theme of the paper. We next explore whether this dependence propagates to downstream genes in small GRN topologies.
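The qualitative behavior described above can be reproduced with a short numerical sketch. The closed forms below are the standard steady-state results for a two-stage bursty model (mRNA noise B/⟨m⟩, protein noise 1/⟨p⟩ plus mRNA noise divided by 1 + τ, and a limiting correlation of 1/√(1 + τ)); they are assumed counterparts of (4), (5), (8) and (9), not transcriptions of them. Function names are illustrative.

```python
import math


def cor_m_p(B, mean_m, mean_p, tau):
    """Assumed steady-state mRNA-protein correlation for a single gene."""
    cv2_m = B / mean_m                   # total (intrinsic) mRNA noise
    extrinsic = cv2_m / (1.0 + tau)      # noise propagated from m to p
    cv2_p = 1.0 / mean_p + extrinsic     # intrinsic + extrinsic protein noise
    # The normalized covariance of m and p equals the extrinsic term.
    return extrinsic / math.sqrt(cv2_m * cv2_p)


def cor_m_p_upper_bound(tau):
    """Assumed large-B limit of cor_m_p: extrinsic noise dominates in p."""
    return 1.0 / math.sqrt(1.0 + tau)


# Sweep tau at low and high bursting, with the means held fixed.
for tau in (0.1, 1.0, 10.0):
    low = cor_m_p(2.5, 45.12, 54.144, tau)
    high = cor_m_p(512.5, 45.12, 54.144, tau)
    print(tau, low, high, cor_m_p_upper_bound(tau))
```

In this sketch the correlation increases with B toward a ceiling that depends only on τ, and the ceiling itself falls as protein becomes more stable than mRNA.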

Fig. 1:

Plot of mRNA-protein correlation, cor(m, p), for a single gene in (1) as a function of B, and τ. (a) GRN being considered–a single gene with its mRNA, M, and protein, P. (b) Analytical curves (black curves) obtained from (8) (one curve per B value). Nine different values were used for B: B ∝ ⟨β⟩ ∈ 4 × {2⁰, 2¹, …, 2⁸}. The upper bound in (9) is depicted by the green curve. (c) cor(m, p) as a function of B and τ using exact stochastic simulations [29]. B was varied over the same range as before. The other kinetic parameters are: k_m = 0.0282, τ_m = 1/0.0025, k_p = 1.2/τ_m. k_m and τ_m are for the nanog gene from mouse embryonic stem cells ([39]). We assume that β is deterministic. Error bars show one standard deviation.
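For readers who want to reproduce the kind of simulation referred to in Fig. 1c, the following is a minimal sketch of a direct-method stochastic simulation of the single-gene model with deterministic bursts. That this matches the implementation behind [29] is an assumption, and the parameter values in the usage example are hypothetical choices made to keep the run short, not the values used in the paper. Moments are accumulated as time-weighted averages over one long trajectory.

```python
import math
import random


def simulate_single_gene(k_m, burst, k_p, tau_m, tau_p, t_end, seed=0):
    """Direct-method simulation; returns time-averaged (mean_m, mean_p, cor_mp).

    `burst` is a fixed (deterministic) integer burst size.
    """
    rng = random.Random(seed)
    m = round(k_m * burst * tau_m)        # start near the deterministic means
    p = round(k_p * m * tau_p)
    t = 0.0
    s_m = s_p = s_mm = s_pp = s_mp = 0.0  # time-weighted sums
    while True:
        r_burst, r_mdeg, r_trans, r_pdeg = k_m, m / tau_m, k_p * m, p / tau_p
        total = r_burst + r_mdeg + r_trans + r_pdeg
        wait = rng.expovariate(total)     # time to the next reaction
        dt = min(wait, t_end - t)         # truncate the last interval at t_end
        s_m += m * dt
        s_p += p * dt
        s_mm += m * m * dt
        s_pp += p * p * dt
        s_mp += m * p * dt
        if wait >= t_end - t:
            break
        t += wait
        u = rng.random() * total          # pick a reaction in proportion to its rate
        if u < r_burst:
            m += burst
        elif u < r_burst + r_mdeg:
            m -= 1
        elif u < r_burst + r_mdeg + r_trans:
            p += 1
        elif p > 0:
            p -= 1
    mean_m, mean_p = s_m / t_end, s_p / t_end
    var_m = s_mm / t_end - mean_m ** 2
    var_p = s_pp / t_end - mean_p ** 2
    cov = s_mp / t_end - mean_m * mean_p
    return mean_m, mean_p, cov / math.sqrt(var_m * var_p)


if __name__ == "__main__":
    # Hypothetical parameters: tau = tau_p / tau_m = 2.
    print(simulate_single_gene(1.0, 2, 5.0, 1.0, 2.0, 2000.0, seed=1))
```

Repeating the run over several seeds and values of τ gives simulated points with error bars of the kind shown in the figure.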

III. Loss of Correlation for Simple GRNs

Count correlations in a two-gene cascade

For a two-gene cascade (Fig. 2a), gene 1 has mRNA M₁ and protein P₁, and gene 2 has mRNA M₂ and protein P₂. P₁ regulates the production of M₂. All kinetic parameters are identical for the two genes. M₁ is created and degraded via a 1-dimensional Poisson birth-death process. All the other species are created and degraded via 1-dimensional conditional Poisson birth-death processes. The chemical reaction network is given in (10), where β₁ and β₂ are the respective burst sizes for M₁ and M₂, P₁ regulates M₂ via h(p₁), and numbers (in curly braces) represent an ordering on the reactions. The CME for (10) in compact form is (11) [31], where R = 8 is the total number of reactions in (10), x = (m₁, p₁, m₂, p₂)^T, f_r(x) is the propensity for reaction r, each reaction r has an associated vector containing the jump in species counts caused by that reaction, and P(x; t) is the joint probability distribution for x. β₁ and β₂ are independent. We define B₁ and B₂ in (12) as the average contributions to noise from birth and death processes for M₁ and M₂, respectively ([32]–[35]).

(11) can be solved exactly for steady-state moments if h(p₁) is a linear function [31]–[33]. For nonlinear h(p₁), we solve (11) approximately using the linear noise approximation (LNA) [31]–[35], [40], [41]. For the LNA, we linearize h(p₁) around a deterministic concentration [32], [33], [41]. Then, the evolution of the first-order moments is obtained using the extended moment generator from [42], in terms of S, a diagonal matrix of inverse lifetimes, and g, the vector of species production rates. The time evolution of the mean-normalized covariance matrix Σ, with entries Σ_{ij} = (⟨x_i x_j⟩ − ⟨x_i⟩⟨x_j⟩)/(⟨x_i⟩⟨x_j⟩), is determined by A and D, the mean-normalized Jacobian and diffusion matrices, respectively ([31], [34], [35], [40], [41]). The ordering of the rows and columns of S, A, D and Σ corresponds to the species order in x. The non-zero entries of A encode the structure of the expanded GRN including both mRNA and protein; there exists a regulation edge from species j to i iff A_{ij} ≠ 0 ([31]–[35], [40], [41]). At steady state, the first-order moments are given by (14) and Σ is given by the Lyapunov equation (15) ([31]–[35]).

Fig. 2:

Plot of mRNA-mRNA, cor(m₁, m₂), and protein-protein, cor(p₁, p₂), correlation for a two-gene cascade in (10) as a function of B₁, B₂, and τ. (a) GRN being considered: a two-gene cascade. Analytical curves for cor(m₁, m₂) and cor(p₁, p₂) are shown in (b) and (c), respectively: black curves (one curve per value of B₁, while B₂ ∝ ⟨β₂⟩ = 4) and yellow curves (one curve per value of B₂, while B₁ ∝ ⟨β₁⟩ = 4). The upper bounds in (17) and (22) are depicted by the green curves in (b) and (c), respectively. cor(m₁, m₂) and cor(p₁, p₂) as functions of B₁ and τ using exact stochastic simulations [29] are shown in (d) and (e), respectively. B₁ was varied while B₂ was held constant (both in the same ranges as before). The values of other kinetic parameters are the same as mentioned in the caption for Fig. 1. Further, the regulation function h(p₁) has a parameter k_h, which was selected such that at steady state h(⟨p₁⟩) = 0.5 for all values of the kinetic parameters. Error bars show one standard deviation.

(15) gives all the second order moments ([31]–[35], [41]). We are interested in the dependence of steady-state correlation between m₁ and m₂, cor(m₁, m₂), on B₁, B₂ and τ.
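A steady-state Lyapunov equation of this kind is straightforward to solve numerically. The sketch below assumes the common convention AΣ + ΣAᵀ + D = 0 and builds A and D for the two-gene cascade under the LNA, with identical lifetimes for both genes and a log-sensitivity H of h(p₁); these matrix entries are an assumed construction from the model description, not copies of the paper's matrices. The solver is a plain Gaussian elimination so that the example has no dependencies, and all names and numerical values are illustrative.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def solve_lyapunov(A, D):
    """Solve A S + S A^T + D = 0 for S by flattening S row by row."""
    n = len(A)
    M = [[0.0] * (n * n) for _ in range(n * n)]
    rhs = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            row = i * n + j
            for k in range(n):
                M[row][k * n + j] += A[i][k]   # (A S)_ij = sum_k A_ik S_kj
                M[row][i * n + k] += A[j][k]   # (S A^T)_ij = sum_k S_ik A_jk
            rhs[row] = -D[i][j]
    s = solve_linear(M, rhs)
    return [s[i * n:(i + 1) * n] for i in range(n)]


def cascade_correlations(B1, B2, means, tau_m, tau, H):
    """Correlations for x = (m1, p1, m2, p2); `means` are the steady-state means."""
    m1, p1, m2, p2 = means
    a, b = 1.0 / tau_m, 1.0 / (tau * tau_m)    # inverse mRNA and protein lifetimes
    A = [[-a, 0.0, 0.0, 0.0],
         [b, -b, 0.0, 0.0],
         [0.0, H * a, -a, 0.0],                # P1 regulates M2 with log-sensitivity H
         [0.0, 0.0, b, -b]]
    d = [2 * B1 * a / m1, 2 * b / p1, 2 * B2 * a / m2, 2 * b / p2]
    D = [[d[i] if i == j else 0.0 for j in range(4)] for i in range(4)]
    S = solve_lyapunov(A, D)

    def cor(i, j):
        return S[i][j] / math.sqrt(S[i][i] * S[j][j])

    return {"m1,p1": cor(0, 1), "m1,m2": cor(0, 2), "p1,p2": cor(1, 3)}


# Hypothetical values: strong bursting in M1, H = 0.5, tau = 1.
print(cascade_correlations(1000.0, 1.0, (10.0, 100.0, 10.0, 100.0), 1.0, 1.0, 0.5))
```

Because the non-zero pattern of A is the expanded network, the same two solver functions apply unchanged to larger topologies once A and D are extended.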

Proposition 3

(mRNA-mRNA correlation in a two-gene cascade)

For the two-gene cascade in (10), correlation between m₁ and m₂ is given by (16) in terms of the total noises in m₁ and m₂, which are obtained from (15) ([31]–[35], [41]), and the log-sensitivity of m₂ to changes in p₁ at steady state ([32]–[35]). Assume that h(p₁) is saturating and h(⟨p₁⟩) is independent of ⟨p₁⟩. Then, the log-sensitivity is also independent of ⟨p₁⟩ at steady state [32]–[35]. (16) is obtained by solving (15) ([31]–[33], [41]). We next establish the following upper bound on cor(m₁, m₂):

Theorem 4

(Upper bound on mRNA-mRNA correlation in a two-gene cascade)

For the two-gene cascade in (10), correlation between m₁ and m₂ is bounded from above by (17) when (18) holds.

From (15), we obtain (19) ([32], [33]).

From (16) and (19), cor(m₁, m₂) is an increasing function of the total noise in m₁, and achieves the upper bound in (17) when the third term on the RHS in (19) is much larger than the first two terms. (17) is depicted by the green curve in Fig. 2b. There is a τ-mediated loss in cor(m₁, m₂). As protein becomes more stable than mRNA, cor(m₁, m₂) decays much faster than cor(m₁, p₁). For extremely stable proteins, cor(m₁, m₂) will completely vanish. Actual cor(m₁, m₂) could be much less than (17) (Fig. 2).

(18) defines a tug-of-war between B₁ and B₂. Their relative magnitudes dictate the gap between (17) and the actual cor(m₁, m₂). If B₁ is much higher than B₂, as required by (18), then cor(m₁, m₂) will reach its upper bound (17) (see the black curves in Fig. 2b). However, if B₁ is much smaller than B₂, cor(m₁, m₂) can vanish even when mRNA is more stable than protein, that is, when τ < 1 (see the yellow curves in Fig. 2b). We also performed exact stochastic simulations ([29]) to verify (16) and (17) (Fig. 2d). Next, we examine the steady-state correlation between p₁ and p₂.

Proposition 5

(protein-protein correlation in a two-gene cascade)

For the two-gene cascade (10), correlation between p₁ and p₂ is given by (20).

The total noise in p₁ is obtained from (5) by replacing the single-gene quantities with those of gene 1, and the total noise in p₂, given in (21), is obtained from (15) ([32], [33]).

(20) is obtained from (15) [31]–[33]. cor(p₁, p₂) is an increasing function of the total noise in m₁. Consequently, cor(p₁, p₂) is an increasing function of B₁, which establishes the following upper bound on cor(p₁, p₂):

Theorem 6

(Upper bound on protein-protein correlation in a two-gene cascade)

For the two-gene cascade (10), correlation between p₁ and p₂ is bounded from above by (22) when (23) holds.

Theorem 6 can be proved by substituting intrinsic noise terms in (20) to show that the upper bound is reached when the fourth term on the RHS of (21) is much larger than the first three terms. When B₁ is large enough to satisfy (23), cor(p₁, p₂) will reach its upper bound in (22) (see the black curves in Fig. 2c). For all values of τ, the upper bound on cor(p₁, p₂) is greater than or equal to 80% of its maximum (green curve in Fig. 2c). This is in contrast to (17), where the upper bound was a monotonically decreasing function of τ. The upper bound for cor(p₁, p₂) does not exhibit a τ-mediated loss in correlation. cor(p₁, p₂) also exhibits a tug-of-war between B₁ and B₂ (yellow curves in Fig. 2c). This is evident from (23) and the fact that (20) is a decreasing function of B₂.

We also performed stochastic simulations [29] to demonstrate the dependence of cor(p₁, p₂) on B₁, and τ (Fig. 2e).

Count correlations in a three-gene fanout motif

Next, we show that the false positive correlation between two genes having a common upstream TF, but not regulating each other, is less susceptible to a τ-mediated loss of correlation. This false positive correlation is, in most cases, greater than the true positive correlation in a two-gene cascade. For this, we study a fanout network, which has three genes. Genes 1 and 2 satisfy (10). Gene 3 satisfies the additional reaction channels in (24), where reaction numbering has been continued from (10), M₃ and P₃ are the mRNA and protein for gene 3, respectively, and β₃ is the burst size for M₃. We assume that β₂ and β₃ are identically distributed. We also define B₃, equal to B₂ in (12), as the average contribution of the birth and death events in (24) to noise in m₃. Genes 2 and 3 are identical, and do not regulate each other. They have a common regulator, P₁. Correlation between the species of genes 2 and 3 defines false positive correlation (cor(m₂, m₃) and cor(p₂, p₃)). Now, x = (m₁, p₁, m₂, p₂, m₃, p₃)^T. All moments up to the second order can be obtained by solving (14) and (15) with the additional species M₃ and P₃. All moments for m₃ and p₃ can be obtained from the moments of m₂ and p₂, respectively. Consequently, cor(m₁, m₃) = cor(m₁, m₂) and cor(p₁, p₃) = cor(p₁, p₂). Next, we find cor(m₂, m₃).

Proposition 7

(mRNA-mRNA correlation in a three-gene fanout)

For the three-gene fanout (10) and (24), correlation between m₂ and m₃ is given by (26).

The gene 2 quantities in (26) are interchangeable with the corresponding gene 3 quantities. Proposition 7 is proved by solving (15) [32], [33]. On substituting (19) in (26), we see that cor(m₂, m₃) is an increasing function of B₁, and establish the following upper bound on cor(m₂, m₃):

Theorem 8

(Upper bound on mRNA-mRNA correlation in a three-gene fanout)

For the three-gene fanout (10) and (24), correlation between m₂ and m₃ is bounded from above by (27).

cor(m₂, m₃) depends on B₁ through the total noise in m₂. Substituting (19) for this noise term in the denominator of (26), we get (27). Interestingly, the upper bound (27) is independent of τ. This is in contrast to cor(m₁, m₂), whose upper bound (17) decays with τ. This implies that cor(m₂, m₃) will always dominate cor(m₁, m₂) and cor(m₁, m₃) when protein is more stable than mRNA, and can even dominate when mRNA is more stable (Fig. 3b). Next, we directly compare cor(m₁, m₂) to cor(m₂, m₃).

Fig. 3:

Plot of mRNA-mRNA, cor(m₂, m₃), and protein-protein, cor(p₂, p₃), correlation for a three-gene fanout in (10) and (24) jointly as a function of B₁, B₂, and τ. (a) GRN being considered: a three-gene fanout. Analytical curves for cor(m₂, m₃) and cor(p₂, p₃) are shown in (b) and (c), respectively: cyan curves (one curve per value of B₁, while B₂ ∝ ⟨β₂⟩ = 4) and dark-green curves (one curve per value of B₂, while B₁ ∝ ⟨β₁⟩ = 4). The analytical curves from Figs. 2b and 2c are also shown as black and yellow curves in (b) and (c), respectively, for comparison. The upper bound of 1 is shown by the green curves in (b) and (c). cor(m₂, m₃) and cor(p₂, p₃) as functions of B₁ and τ using exact stochastic simulations [29] are shown in (d) and (e), respectively. B₁ was varied while B₂ was held constant (both in the same ranges as before). The values of other kinetic parameters are the same as before (see caption of Fig. 1). For details on the regulation function h(p₁), see caption of Fig. 2. Error bars show one standard deviation.

Theorem 9

(mRNA-mRNA correlation in two-gene cascade vs three-gene fanout)

For the three-gene fanout (10) and (24), cor(m₁, m₂) and cor(m₂, m₃) are related by (28).

On comparing (16) and (26), while using (19), we get (28). (28) demonstrates a tug-of-war between B₁ and B₂, where their relative magnitudes dictate whether cor(m₁, m₂) or cor(m₂, m₃) will dominate. The relative magnitudes of B₁ and B₂ also determine whether cor(m₁, m₂) and cor(m₂, m₃) will achieve their upper bounds (see Theorems 4 and 8). When (18) and (27) are true, cor(m₁, m₂) and cor(m₂, m₃) achieve their upper bounds. However, at the same time, cor(m₂, m₃) will begin to dominate cor(m₁, m₂). This implies that kinetic conditions which allow inference of the GRN also confound inference via the false positive edges.

The dependence of cor(m₂, m₃) on B₁ (cyan curves), B₂ (dark-green curves) and τ based on (26) is shown in Fig. 3b. The respective curves for cor(m₁, m₂) based on (16) are shown alongside for comparison. We also performed exact stochastic simulations to verify these results (Fig. 3d). Next, we study the correlation between p₂ and p₃, cor(p₂, p₃).
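The tug-of-war between the true positive cor(m₁, m₂) and the false positive cor(m₂, m₃) can be illustrated numerically. The closed forms in the sketch below were obtained by solving the mean-normalized Lyapunov equation by hand for identical genes under the LNA, with log-sensitivity H for h(p₁); they are assumptions of this sketch rather than transcriptions of (16), (19) and (26). In particular, the false positive correlation is taken to be one minus the ratio of intrinsic to total noise in m₂. Names and parameter values are illustrative.

```python
import math


def fanout_mrna_correlations(B1, B2, mean_m1, mean_m2, mean_p1, tau, H):
    """Return (cor(m1, m2), cor(m2, m3)) under the assumed LNA closed forms."""
    n1 = B1 / mean_m1                       # total noise in m1
    n2 = B2 / mean_m2                       # intrinsic noise in m2 (and in m3)
    cov_m1_p1 = n1 / (1.0 + tau)            # normalized covariance of m1 and p1
    var_p1 = 1.0 / mean_p1 + cov_m1_p1      # total noise in p1
    cov_m1_m2 = H * cov_m1_p1 / 2.0
    cov_p1_m2 = (cov_m1_m2 + H * tau * var_p1) / (1.0 + tau)
    var_m2 = n2 + H * cov_p1_m2             # total noise in m2
    cor_true = cov_m1_m2 / math.sqrt(n1 * var_m2)
    cor_false = (var_m2 - n2) / var_m2      # shared upstream noise / total noise
    return cor_true, cor_false


# Strong bursting in M1: the false positive edge dominates at every tau.
for tau in (0.1, 1.0, 10.0):
    print(tau, fanout_mrna_correlations(1000.0, 1.0, 10.0, 10.0, 100.0, tau, 0.5))

# Strong bursting in M2: both correlations are small and the order can reverse.
print(fanout_mrna_correlations(1.0, 1000.0, 10.0, 10.0, 100.0, 1.0, 0.5))
```

In the first regime the true positive correlation approaches its ceiling, yet the false positive correlation stays close to 1 regardless of τ, which is the confounding effect described above.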

Proposition 10

(protein-protein correlation in a three-gene fanout)

For the three-gene fanout (10) and (24), correlation between p₂ and p₃ is given by (29).

The gene 2 quantities in (29) are interchangeable with the corresponding gene 3 quantities. Proposition 10 is proved by solving (15) [32], [33]. As before, we assume that B₁, B₂ and B₃ are varied in a manner which preserves the first order moments at fixed τ. On substituting (21) in (29), we see that cor(p₂, p₃) is an increasing function of B₁. Now, we establish the following upper bound on cor(p₂, p₃):

Theorem 11

(Upper bound on protein-protein correlation in a three-gene fanout)

For the three-gene fanout (10) and (24), correlation between p₂ and p₃ is bounded from above by (30).

cor(p₂, p₃) depends on B₁ through the total noise in p₂. Substituting (21) for this noise term in the denominator of (29), we get (30) for reaching the upper bound. Similar to cor(m₂, m₃), the upper bound for cor(p₂, p₃) is independent of τ. Therefore, it is possible that cor(p₂, p₃) might dominate cor(p₁, p₂) and cor(p₁, p₃). Next, we directly compare cor(p₂, p₃) to cor(p₁, p₂):

Theorem 12

(Protein-protein correlation in two-gene cascade vs three-gene fanout)

For the three-gene fanout (10) and (24), cor(p₁, p₂) and cor(p₂, p₃) are related by (31).

On comparing (20) and (29), while using (19), we get (31). Similar to Theorem 9, Theorem 12 demonstrates a tug-of-war between B₁ and B₂, where their relative magnitudes dictate whether the true positive correlation cor(p₁, p₂) (or cor(p₁, p₃)) or the false positive correlation cor(p₂, p₃) will dominate in single-cell protein measurements. Strong bursting in M₁ relative to M₂ can make cor(p₂, p₃) dominate.

The kinetic regime which allows cor(p₁, p₂) and cor(p₁, p₃) to achieve their upper bounds also allows cor(p₂, p₃) to reach its upper bound (compare the constraints in Theorem 6 and Theorem 11). However, the upper bound of cor(p₂, p₃) is larger than that of cor(p₁, p₂) and cor(p₁, p₃): 1 versus the bound in (22). Even though protein-protein correlations are not subject to τ-mediated loss in correlation, the relative gap between the upper bounds for cor(p₂, p₃) and cor(p₁, p₂) or cor(p₁, p₃) will cause the false positive correlation cor(p₂, p₃) to dominate the true positive correlations cor(p₁, p₂) and cor(p₁, p₃). For GRN inference from single-cell protein measurements, this implies that true positives and false positives will always be observed together. Again, the kinetic conditions which allow inference of the network also confound inference via the false positive edges.

The dependence of cor(p₂, p₃) on B₁ (cyan curves), B₂ (dark-green curves) and τ based on (29) is shown in Fig. 3c. The respective curves for cor(p₁, p₂) based on (20) are shown alongside for comparison. We also performed exact stochastic simulations to verify these results (Fig. 3e).

IV. Loss of Information-Theoretic Measures for Simple GRNs

We also study the behavior of MI and the phi-mixing coefficient as a function of mRNA bursting and τ for the single-gene and two-gene cascade topologies using stochastic simulations ([29]). For a single gene, as with correlation, we observe a τ-mediated loss in MI between m₁ and p₁ (Fig. 4, top). For the two-gene cascade, there is a τ-mediated loss in MI between m₁ and m₂ as well (Fig. 4, middle). Similar to correlation, MI between m₁ and p₁ appears to be larger in magnitude than MI between m₁ and m₂. Finally, MI between p₁ and p₂ for the two-gene cascade in (10) exhibits the opposite trend to MI between m₁ and m₂: it is lost as τ decreases rather than as τ increases (Fig. 4, bottom). The phi-mixing coefficient exhibited similar behavior (Fig. 4b).
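As an illustration of how MI can be estimated from paired single-cell counts, the sketch below uses the plug-in (empirical frequency) estimator for discrete data, in nats. This is one common estimator and is an assumption here: the text does not state which estimator or binning was used for Fig. 4, and plug-in estimates are biased upward for small samples. The function name and toy data are illustrative.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # empirical joint counts
    px = Counter(xs)               # empirical marginal counts
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # (c / n) * log( p(x, y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Toy count pairs: perfectly dependent versus independent.
dependent = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(dependent, independent)
```

Applied to simulated (m₁, m₂) pairs at several values of τ, an estimator of this kind allows the loss of dependence to be tracked without assuming a linear relationship.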

Fig. 4:

Plot of mutual information and phi-mixing coefficient for a single-gene and the two-gene cascade as functions of B (B₁), and τ. (a) and (b) show results for mutual information (MI) and phi-mixing coefficient, respectively. (top) MI/phi-mixing coefficient between m₁ and p₁ for the single gene in (1) computed using exact stochastic simulations [29] for two values of B–one low ∝ ⟨β⟩= 4 (red), and one high ∝ ⟨β⟩ = 1024 (blue), and four different values of τ. (middle) MI/phi-mixing coefficient between m₁ and m₂ for the two-gene cascade in (10). (bottom) MI/phi-mixing coefficient between p₁ and p₂ for the two-gene cascade in (10). The values of other kinetic parameters are the same as before (see caption of Fig. 1). For details on the regulation function h (p₁), see caption of Fig. 2. Error bars show one standard deviation.

V. Loss of Inference Accuracy for real Yeast GRN in Single-Cell RNA Sequencing Data

We collected the experimentally inferred GRN for S. cerevisiae from the YEASTRACT database [43]. We collected transcriptome- and proteome-wide mRNA and protein degradation rates from [44] and [45], respectively. We used scRNA-seq data generated in [46]. We only retained cells grown under complete medium conditions, as degradation rate measurements were made under these conditions [44], [45]. Further, the scRNA-seq data has 12 different genotypes, including the wild type, and we used all the genotypes.

From the GRN, we extracted edges between master regulators (TFs without any incoming regulation) and their target genes. For these edges, we computed correlation between the mRNA counts of the TF and its target. These values are shown in red in Fig. 5a. We find that these true positive edges do not violate the upper bound on such correlations. Since the TF and the target genes can have different lifetimes, we recompute the upper bound in Theorem 4, which becomes (32).

Fig. 5: Loss in correlation and mutual information for yeast in single-cell RNA sequencing data.

Comparison of correlation (a) and normalized mutual information (NMI) (b) between a TF and its target gene (red spheres) against the same measure between two genes with common regulators but no edge between them (blue spheres). The green curves represent the upper bound on correlation between a TF and its target gene given in (32).

Interestingly, allowing different lifetimes enables the upper bound to reach the maximum possible value of 1 when M₁ is much more stable than M₂. The upper bound in (32) is shown by the green curve in Figs. 5a and 5b.

Within the limit of statistical variability, the true positive correlations from yeast scRNA-seq data do not violate (32), and consequently face a τ-mediated loss (Fig. 5a). The false positive edges (blue spheres in Fig. 5a) are not constrained by the upper bound. For false positive edges, we calculate correlation between genes which do not regulate each other, but are regulated by the master regulators. This observation shows that the insights generated from small network topologies are valid for larger GRNs as well. Further, we observed similar behavior when we calculated normalized mutual information [47] instead of correlation for the true positive and false positive edges (Fig. 5b).
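The edge selection described in this section can be sketched as follows, assuming the GRN is available as a list of directed (regulator, target) pairs. The gene names in the toy network are hypothetical and the function names are illustrative; the actual YEASTRACT parsing and filtering steps are not specified in the text and are not reproduced here.

```python
from itertools import combinations


def master_regulators(edges):
    """TFs that regulate at least one gene and have no incoming regulation."""
    sources = {tf for tf, _ in edges}
    targets = {tg for _, tg in edges}
    return sources - targets


def true_and_false_positive_pairs(edges):
    """Return (TF-target pairs of master regulators, co-regulated pairs with no edge)."""
    edge_set = set(edges)
    masters = master_regulators(edges)
    true_pairs = sorted((tf, tg) for tf, tg in edge_set if tf in masters)
    false_pairs = set()
    for tf in masters:
        targets = sorted(tg for src, tg in edge_set if src == tf)
        for g1, g2 in combinations(targets, 2):
            # Keep only pairs that do not regulate each other in either direction.
            if (g1, g2) not in edge_set and (g2, g1) not in edge_set:
                false_pairs.add((g1, g2))
    return true_pairs, sorted(false_pairs)


# Toy network: A is the only master regulator; B also regulates D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("A", "D")]
tp, fp = true_and_false_positive_pairs(toy_edges)
print(tp, fp)
```

Correlation or NMI would then be computed on the mRNA counts for each pair in the two lists and compared against the bound.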

Discussion

We have established fundamental limits on inferrability of GRN topology from static single-cell mRNA and protein measurements. We find a relative stability-mediated loss in correlation and information-theoretic measures for mRNA species; when protein is more stable than mRNA, steady-state mRNA counts are not enough to infer the underlying GRN. This is exacerbated by the robustness of false positive correlations to this loss. The kinetic conditions which allow discovery of true positive correlations between a TF and its target gene also hinder GRN inference by amplification of false positive correlations.

We also found these constraints to be true for scRNA-seq data for yeast, S. cerevisiae, suggesting that the relative stability issue is true for real systems. This raises an important question on the limits of identifiability from static data. What about the dependence of other inference tasks, such as kinetic estimation, trajectory inference, clustering and differential expression, on relative stability of mRNA and protein and its propagation over GRN?

We used a simple model of gene expression, which does not incorporate complex processes such as post-transcriptional and post-translational modifications. A future research direction is the establishment of constraints on GRN inferrability in more general settings.

scRNA-seq is not a static snapshot. Cells can be present in multiple states, and may not be at steady state prior to sequencing. Can this be leveraged to circumvent the issues we have raised? This is an interesting problem to unpack.

GRNs are essential for cellular functioning [2]–[4]. Consequently, knowledge of their topology is important for understanding and controlling cellular functions. Experimental and computational efforts have been spent over the last two decades to unravel GRN topology for different species. However, the computational problem still remains unsolved. We have provided one explanation for this difficulty. This should motivate development of methods to circumvent the limitations we have unraveled.

ACKNOWLEDGMENT

Footnotes

tarunm3{at}illinois.edu
mikest{at}udel.edu
roydar{at}illinois.edu
absingh{at}udel.edu

References

[1].↵
F. H. Crick, “On protein synthesis,” Symposia of the Society for Experimental Biology, vol. 12, pp. 138–63, 1958.
OpenUrl PubMed
[2].↵
M. Ptashne, “The chemistry of regulation of genes and other things,” Journal of Biological Chemistry, vol. 289, no. 9, p. 5417–5435, 2014.
OpenUrl FREE Full Text
[3].↵
G. Karlebach and R. Shamir, “Modelling and analysis of gene regulatory networks,” Nature Reviews Molecular Cell Biology, vol. 9, no. 10, p. 770–780, 2008.
OpenUrl CrossRef PubMed Web of Science
[4].↵
P. K. Kreeger and D. A. Lauffenburger, “Cancer systems biology: A network modeling perspective,” Carcinogenesis, vol. 31, no. 1, p. 2–8, 2009.
OpenUrl CrossRef PubMed
[5].↵
M. M. Saint-Antoine and A. Singh, “Network inference in systems biology: Recent developments, challenges, and applications,” Current Opinion in Biotechnology, vol. 63, p. 89–98, 2020.
OpenUrl
[6].↵
B. Zhang and S. Horvath, “A general framework for weighted gene co-expression network analysis,” Statistical Applications in Genetics and Molecular Biology, vol. 4, no. 1, 2005.
[7].↵
A.-C. Haury, F. Mordelet, P. Vera-Licona, and J.-P. Vert, “TIGRESS: Trustful inference of gene regulation using stability selection,” BMC Systems Biology, vol. 6, no. 1, 2012.
[8].
V. A. Huynh-Thu, A. Irrthum, L. Wehenkel, and P. Geurts, “Inferring regulatory networks from expression data using tree-based methods,” PLoS ONE, vol. 5, no. 9, 2010.
[9].↵
N. Singh and M. Vidyasagar, “bLARS: An algorithm to infer gene regulatory networks,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 13, no. 2, p. 301–314, 2016.
OpenUrl
[10].↵
A. A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. D. Favera, and A. Califano, “ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context,” BMC Bioinformatics, vol. 7, no. S1, 2006.
[11].
A. J. Butte and I. S. Kohane, “Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements,” Biocomputing 2000, 1999.
[12].
J. J. Faith, B. Hayete, J. T. Thaden, I. Mogno, J. Wierzbowski, G. Cottarel, S. Kasif, J. J. Collins, and T. S. Gardner, “Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles,” PLoS Biology, vol. 5, no. 1, 2007.
[13].
P. E. Meyer, K. Kontos, F. Lafitte, and G. Bontempi, “Information-theoretic inference of large transcriptional regulatory networks,” EURASIP Journal on Bioinformatics and Systems Biology, vol. 2007, p. 1–9, 2007.
[14].
T. E. Chan, M. P. Stumpf, and A. C. Babtie, “Gene regulatory network inference from single-cell data using multivariate information measures,” Cell Systems, vol. 5, no. 3, 2017.
[15].
N. Friedman, M. Linial, I. Nachman, and D. Pe’er, “Using Bayesian networks to analyze expression data,” Journal of Computational Biology, vol. 7, no. 3-4, p. 601–620, 2000.
[16].
J. Yu, V. A. Smith, P. P. Wang, A. J. Hartemink, and E. D. Jarvis, “Advances to Bayesian network inference for generating causal networks from observational biological data,” Bioinformatics, vol. 20, no. 18, p. 3594–3603, 2004.
[17].
T. Xu, L. Ou-Yang, X. Hu, and X.-F. Zhang, “Identifying gene network rewiring by integrating gene expression and gene network data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 15, no. 6, p. 2079–2085, 2018.
[18].
H. Zhao and Z.-H. Duan, “Cancer genetic network inference using Gaussian graphical models,” Bioinformatics and Biology Insights, vol. 13, p. 117793221983940, 2019.
[19].
M. Bansal, G. D. Gatta, and D. di Bernardo, “Inference of gene regulatory networks and compound mode of action from time course gene expression profiles,” Bioinformatics, vol. 22, no. 7, p. 815–822, 2006.
[20].
V. A. Huynh-Thu and G. Sanguinetti, “Combining tree-based and dynamical systems for the inference of gene regulatory networks,” Bioinformatics, vol. 31, no. 10, p. 1614–1622, 2015.
[21].
C. Biane, F. Delaplace, and T. Melliti, “Abductive network action inference for targeted therapy discovery,” Electronic Notes in Theoretical Computer Science, vol. 335, p. 3–25, 2018.
[22].
S. Barman and Y.-K. Kwon, “A Boolean network inference from time-series gene expression data using a genetic algorithm,” Bioinformatics, vol. 34, no. 17, p. i927–i933, 2018.
[23].
K. Kishan, R. Li, F. Cui, Q. Yu, and A. R. Haake, “GNE: A deep learning framework for gene network inference by aggregating biological information,” BMC Systems Biology, vol. 13, no. S2, 2019.
[24].
M. M. Saint-Antoine and A. Singh, “Evaluating pruning methods in gene network inference,” in 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE, 2019, pp. 1–7.
[25].
V. A. Huynh-Thu and G. Sanguinetti, “Gene regulatory network inference: An introductory survey,” Methods in Molecular Biology, p. 1–23, 2018.
[26].
Y. Liu, A. Beyer, and R. Aebersold, “On the dependency of cellular protein levels on mRNA abundance,” Cell, vol. 165, no. 3, p. 535–550, 2016.
[27].
C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, no. 4, p. 623–656, 1948.
[28].
I. A. Ibragimov, “Some limit theorems for stationary processes,” Theory of Probability and Its Applications, vol. 7, no. 4, p. 349–382, 1962.
[29].
D. T. Gillespie, “Exact stochastic simulation of coupled chemical reactions,” J. Phys. Chem., vol. 81, no. 25, pp. 2340–2361, 1977-12-01.
[30].
N. Singh, M. E. Ahsen, N. Challapalli, H.-S. Kim, M. A. White, and M. Vidyasagar, “Inferring genome-wide interaction networks using the phi-mixing coefficient, and applications to lung and breast cancer,” IEEE Transactions on Molecular, Biological and Multi-Scale Communications, vol. 4, no. 3, p. 123–139, 2018.
[31].
D. Schnoerr, G. Sanguinetti, and R. Grima, “Approximation and inference methods for stochastic biochemical kinetics—a tutorial review,” J. Phys. A: Math. Theor., vol. 50, no. 9, p. 093001, 2017-01.
[32].
A. Singh and M. Soltani, “Quantifying intrinsic and extrinsic variability in stochastic gene expression models,” PLoS ONE, vol. 8, no. 12, p. e84301, 2013.
[33].
A. Singh, “Transient changes in intercellular protein variability identify sources of noise in gene expression,” Biophysical Journal, vol. 107, no. 9, pp. 2214–2220, 2014.
[34].
J. Paulsson, “Summing up the noise in gene networks,” Nature, vol. 427, no. 6973, pp. 415–418, 2004.
[35].
A. Hilfinger, T. M. Norman, G. Vinnicombe, and J. Paulsson, “Constraints on fluctuations in sparsely characterized biological systems,” Physical Review Letters, vol. 116, no. 5, p. 58101, 2016.
[36].
T. Mahajan, A. Singh, and R. Dar, “Topological constraints on noise propagation in gene regulatory networks,” bioRxiv, 2021.
[37].
J. Peccoud and B. Ycart, “Markovian Modeling of Gene-Product Synthesis,” Theoretical Population Biology, vol. 48, no. 2, pp. 222–234, 1995-10-01.
[38].
V. Shahrezaei and P. S. Swain, “Analytical distributions for stochastic gene expression,” PNAS, vol. 105, no. 45, pp. 17256–17261, 2008-11-11.
[39].
H. Ochiai, T. Sugawara, T. Sakuma, and T. Yamamoto, “Stochastic promoter activation affects Nanog expression variability in mouse embryonic stem cells,” Scientific Reports, vol. 4, no. 1, pp. 1–9, 2014.
[40].
N. G. Van Kampen, Stochastic Processes in Physics and Chemistry. Elsevier, 1992, vol. 1.
[41].
S. Modi, M. Soltani, and A. Singh, “Linear Noise Approximation for a Class of Piecewise Deterministic Markov Processes,” in 2018 Annual American Control Conference (ACC), 2018-06, pp. 1993–1998.
[42].
A. Singh and J. Hespanha, “Models for Multi-Specie Chemical Reactions Using Polynomial Stochastic Hybrid Systems,” in Proceedings of the 44th IEEE Conference on Decision and Control, 2005-12, pp. 2969–2974.
[43].
P. T. Monteiro, J. Oliveira, P. Pais, M. Antunes, M. Palma, M. Cavalheiro, M. Galocha, C. P. Godinho, L. C. Martins, N. Bourbon, et al., “Yeastract+: a portal for cross-species comparative genomics of transcription regulation in yeasts,” Nucleic Acids Research, vol. 48, no. D1, pp. D642–D649, 2020.
[44].
B. Neymotin, R. Athanasiadou, and D. Gresham, “Determination of in vivo RNA kinetics using RATE-seq,” RNA, vol. 20, no. 10, pp. 1645–1652, 2014.
[45].
R. Christiano, N. Nagaraj, F. Fröhlich, and T. C. Walther, “Global proteome turnover analyses of the yeasts S. cerevisiae and S. pombe,” Cell Reports, vol. 9, no. 5, pp. 1959–1965, 2014.
[46].
C. A. Jackson, D. M. Castro, G.-A. Saldi, R. Bonneau, and D. Gresham, “Gene regulatory network reconstruction using single-cell RNA sequencing of barcoded genotypes in diverse environments,” eLife, vol. 9, p. e51254, 2020.
[47].
A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of Machine Learning Research, vol. 3, no. Dec, pp. 583–617, 2002.

Posted April 01, 2022.
[1] [1].↵
F. H. Crick, “On protein synthesis,” Symposia of the Society for Experimental Biology, vol. 12, pp. 138–63, 1958.
OpenUrl PubMed

[2] [2].↵
M. Ptashne, “The chemistry of regulation of genes and other things,” Journal of Biological Chemistry, vol. 289, no. 9, p. 5417–5435, 2014.
OpenUrl FREE Full Text

[3] [3].↵
G. Karlebach and R. Shamir, “Modelling and analysis of gene regulatory networks,” Nature Reviews Molecular Cell Biology, vol. 9, no. 10, p. 770–780, 2008.
OpenUrl CrossRef PubMed Web of Science

[4] [4].↵
P. K. Kreeger and D. A. Lauffenburger, “Cancer systems biology: A network modeling perspective,” Carcinogenesis, vol. 31, no. 1, p. 2–8, 2009.
OpenUrl CrossRef PubMed

[5] [5].↵
M. M. Saint-Antoine and A. Singh, “Network inference in systems biology: Recent developments, challenges, and applications,” Current Opinion in Biotechnology, vol. 63, p. 89–98, 2020.
OpenUrl

[6] [6].↵
B. Zhang and S. Horvath, “A general framework for weighted gene co-expression network analysis,” Statistical Applications in Genetics and Molecular Biology, vol. 4, no. 1, 2005.

[7] [7].↵
A.-C. Haury, F. Mordelet, P. Vera-Licona, and J.-P. Vert, “TIGRESS: Trustful inference of gene regulation using stability selection,” BMC Systems Biology, vol. 6, no. 1, 2012.

[8] [8].
V. A. Huynh-Thu, A. Irrthum, L. Wehenkel, and P. Geurts, “Inferring regulatory networks from expression data using tree-based methods,” PLoS ONE, vol. 5, no. 9, 2010.

[9] [9].↵
N. Singh and M. Vidyasagar, “bLARS: An algorithm to infer gene regulatory networks,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 13, no. 2, p. 301–314, 2016.
OpenUrl

[10] [10].↵
A. A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. D. Favera, and A. Califano, “ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context,” BMC Bioinformatics, vol. 7, no. S1, 2006.

[11] [11].
A. J. Butte and I. S. Kohane, “Mutual information relevance networks: Functional genomic clustering using pairwise entropy measurements,” Biocomputing 2000, 1999.

[12] [12].
J. J. Faith, B. Hayete, J. T. Thaden, I. Mogno, J. Wierzbowski, G. Cottarel, S. Kasif, J. J. Collins, and T. S. Gardner, “Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles,” PLoS Biology, vol. 5, no. 1, 2007.

[13] [13].
P. E. Meyer, K. Kontos, F. Lafitte, and G. Bontempi, “Informationtheoretic inference of large transcriptional regulatory networks,” EURASIP Journal on Bioinformatics and Systems Biology, vol. 2007, p. 1–9, 2007.
OpenUrl

[14] [14].↵
T. E. Chan, M. P. Stumpf, and A. C. Babtie, “Gene regulatory network inference from single-cell data using multivariate information measures,” Cell Systems, vol. 5, no. 3, 2017.

[15] [15].↵
N. Friedman, M. Linial, I. Nachman, and D. Pe’er, “Using Bayesian networks to analyze expression data,” Journal of Computational Biology, vol. 7, no. 3-4, p. 601–620, 2000.
OpenUrl CrossRef PubMed Web of Science

[16] [16].↵
J. Yu, V. A. Smith, P. P. Wang, A. J. Hartemink, and E. D. Jarvis, “Advances to Bayesian network inference for generating causal networks from observational biological data,” Bioinformatics, vol. 20, no. 18, p. 3594–3603, 2004.
OpenUrl CrossRef PubMed Web of Science

[17] [17].↵
T. Xu, L. Ou-Yang, X. Hu, and X.-F. Zhang, “Identifying gene network rewiring by integrating gene expression and gene network data,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 15, no. 6, p. 2079–2085, 2018.
OpenUrl

[18] [18].
H. Zhao and Z.-H. Duan, “Cancer genetic network inference using Gaussian graphical models,” Bioinformatics and Biology Insights, vol. 13, p. 117793221983940, 2019.

[19] [19].
M. Bansal, G. D. Gatta, and D. di Bernardo, “Inference of gene regulatory networks and compound mode of action from time course gene expression profiles,” Bioinformatics, vol. 22, no. 7, p. 815–822, 2006.
OpenUrl CrossRef PubMed Web of Science

[20] [20].
V. A. Huynh-Thu and G. Sanguinetti, “Combining tree-based and dynamical systems for the inference of gene regulatory networks,” Bioinformatics, vol. 31, no. 10, p. 1614–1622, 2015.
OpenUrl CrossRef PubMed

[21] [21].
C. Biane, F. Delaplace, and T. Melliti, “Abductive network action inference for targeted therapy discovery,” Electronic Notes in Theoretical Computer Science, vol. 335, p. 3–25, 2018.
OpenUrl

[22] [22].
S. Barman and Y.-K. Kwon, “A Boolean network inference from timeseries gene expression data using a genetic algorithm,” Bioinformatics, vol. 34, no. 17, p. i927–i933, 2018.
OpenUrl CrossRef

[23] [23].↵
K. Kishan, R. Li, F. Cui, Q. Yu, and A. R. Haake, “GNE: A deep learning framework for gene network inference by aggregating biological information,” BMC Systems Biology, vol. 13, no. S2, 2019.

[24] [24].↵
M. M. Saint-Antoine and A. Singh, “Evaluating pruning methods in gene network inference,” in 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE, 2019, pp. 1–7.

[25] [25].↵
V. A. Huynh-Thu and G. Sanguinetti, “Gene regulatory network inference: An introductory survey,” Methods in Molecular Biology, p. 1–23, 2018.

[26] [26].↵
Y. Liu, A. Beyer, and R. Aebersold, “On the dependency of cellular protein levels on mRNA abundance,” Cell, vol. 165, no. 3, p. 535–550, 2016.
OpenUrl CrossRef PubMed

[27] [27].↵
C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, no. 4, p. 623–656, 1948.
OpenUrl

[28] [28].↵
I. A. Ibragimov, “Some limit theorems for stationary processes,” Theory of Probability and Its Applications, vol. 7, no. 4, p. 349–382, 1962.
OpenUrl CrossRef

[29] [29].↵
D. T. Gillespie, “Exact stochastic simulation of coupled chemical reactions,” J. Phys. Chem., vol. 81, no. 25, pp. 2340–2361, 1977-12-01.
OpenUrl CrossRef PubMed Web of Science

[30] [30].↵
N. Singh, M. E. Ahsen, N. Challapalli, H.-S. Kim, M. A. White, and M. Vidyasagar, “Inferring genome-wide interaction networks using the phi-mixing coefficient, and applications to lung and breast cancer,” IEEE Transactions on Molecular, Biological and Multi-Scale Communications, vol. 4, no. 3, p. 123–139, 2018.
OpenUrl

[31] [31].↵
D. Schnoerr, G. Sanguinetti, and R. Grima, “Approximation and inference methods for stochastic biochemical kinetics—a tutorial review,” J. Phys. A: Math. Theor., vol. 50, no. 9, p. 093001, 2017-01.
OpenUrl

[32] [32].↵
A. Singh and M. Soltani, “Quantifying intrinsic and extrinsic variability in stochastic gene expression models,” Plos one, vol. 8, no. 12, p. e84301, 2013.
OpenUrl

[33] [33].↵
A. Singh, “Transient changes in intercellular protein variability identify sources of noise in gene expression,” Biophysical Journal, vol. 107, no. 9, pp. 2214–2220, 2014.
OpenUrl CrossRef PubMed Web of Science

[34] [34].↵
J. Paulsson, “Summing up the noise in gene networks,” Nature, vol. 427, no. 6973, pp. 415–418, 2004.
OpenUrl CrossRef PubMed Web of Science

[35] [35].↵
A. Hilfinger, T. M. Norman, G. Vinnicombe, and J. Paulsson, “Constraints on fluctuations in sparsely characterized biological systems,” Physical review letters, vol. 116, no. 5, p. 58101, 2016.
OpenUrl

[36] [36].↵
T. Mahajan, A. Singh, and R. Dar, “Topological constraints on noise propagation in gene regulatory networks,” bioRxiv, 2021.

[37] [37].↵
J. Peccoud and B. Ycart, “Markovian Modeling of Gene-Product Synthesis,” Theoretical Population Biology, vol. 48, no. 2, pp. 222–234, 1995-10-01.
OpenUrl CrossRef Web of Science

[38] [38].↵
V. Shahrezaei and P. S. Swain, “Analytical distributions for stochastic gene expression,” PNAS, vol. 105, no. 45, pp. 17 256–17 261, 2008-11-11.
OpenUrl Abstract/FREE Full Text

[39] [39].↵
H. Ochiai, T. Sugawara, T. Sakuma, and T. Yamamoto, “Stochastic promoter activation affects nanog expression variability in mouse embryonic stem cells,” Scientific reports, vol. 4, no. 1, pp. 1–9, 2014.
OpenUrl

[40] [40].↵
N. G. Van Kampen, Stochastic Processes in Physics and Chemistry. Elsevier, 1992, vol. 1.

[41] [41].↵
S. Modi, M. Soltani, and A. Singh, “Linear Noise Approximation for a Class of Piecewise Deterministic Markov Processes,” in 2018 Annual American Control Conference (ACC), 2018-06, pp. 1993–1998.

[42] [42].↵
A. Singh and J. Hespanha, “Models for Multi-Specie Chemical Reactions Using Polynomial Stochastic Hybrid Systems,” in Proceedings of the 44th IEEE Conference on Decision and Control, 2005-12, pp. 2969–2974.

[43] [43].↵
P. T. Monteiro, J. Oliveira, P. Pais, M. Antunes, M. Palma, M. Cavalheiro, M. Galocha, C. P. Godinho, L. C. Martins, N. Bourbon, et al., “Yeastract+: a portal for cross-species comparative genomics of transcription regulation in yeasts,” Nucleic acids research, vol. 48, no. D1, pp. D642–D649, 2020.
OpenUrl CrossRef

[44] [44].↵
B. Neymotin, R. Athanasiadou, and D. Gresham, “Determination of in vivo rna kinetics using rate-seq,” Rna, vol. 20, no. 10, pp. 1645–1652, 2014.
OpenUrl Abstract/FREE Full Text

[45] [45].↵
R. Christiano, N. Nagaraj, F. Fröhlich, and T. C. Walther, “Global proteome turnover analyses of the yeasts s. cerevisiae and s. pombe,” Cell reports, vol. 9, no. 5, pp. 1959–1965, 2014.
OpenUrl

[46] [46].↵
C. A. Jackson, D. M. Castro, G.-A. Saldi, R. Bonneau, and D. Gresham, “Gene regulatory network reconstruction using single-cell rna sequencing of barcoded genotypes in diverse environments,” elife, vol. 9, p. e51254, 2020.
OpenUrl

[47] [47].↵
A. Strehl and J. Ghosh, “Cluster ensembles—a knowledge reuse framework for combining multiple partitions,” Journal of machine learning research, vol. 3, no. Dec, pp. 583–617, 2002.
OpenUrl