## Abstract

Synthetic biological gene networks are typically conceptualized and visualized as static graphs with nodal and edge dynamics that are time invariant. This conceptualization of biological programming stands in stark contrast to the transient nature of biological dynamics, which are driven by labile biomolecules. Here we introduce the theory of dynamical structure functions as a tool for evaluating and visualizing network dynamics within synthetic gene networks. We show, in particular, that canonical biological crosstalk and resource loading effects in synthetic biology can be quantified directly using dynamical structure functions estimated from simulation and experimental data. We illustrate the importance of knowing these loading effects through several example systems, showing that crosstalk imbalance in feed-forward loops can explain circuit failure or performance limitations. Finally, we show how dynamical structure functions can be used to diagnose crosstalk and network imbalance to explain failure modes in two types of synthetic biocircuits: an *in vitro* genelet repressilator and an *E. coli*-based transcriptional event detector. Dynamical structure functions can thus serve as a form of inverse modeling, pinpointing biological parts within a complex biological circuit that need revision or improvement.

## 1 Introduction

Synthetic gene networks fulfill diverse roles in realizing circuit logic [1] and timing in living organisms [2]. Reductions in DNA synthesis and sequencing costs have made it possible to build increasingly complex genetic circuits, ranging from single-input inverters [3,4] to combinatorial input logic gates [5,6], with tens to hundreds of components. However, the ability to pilot novel biological circuitry often outpaces our ability to revise or evaluate designs and to take stock of the intricate details of what has been built. As the field continues to build on successes at the circuit and device level to engineer entire genetic systems or pathways, we consistently see failure modes that arise from a lack of modularity, from retroactivity [7–9], and from context effects [10].

Likewise, the expansion of CRISPR-based methods for genome editing and targeted gene knockdown [11,12] has enabled a broader category of systems biology design problems, centered on redesigning genomes [13] or reprogramming host regulatory networks [14] to target specific environmental niches or to exhibit a particular phenotype. The underlying genetic program implicit in these systems biology objectives is often a vast, complex, and dynamic network of interacting genes, mRNA, and proteins. The expansion in DNA sequencing read depth has made it possible to profile individual genes via the transcriptome [15], which, combined with quantitative proteomics [16] or metabolomics [17], enables a systems-level analysis of network activity. However, prohibitive sampling and library preparation costs make highly time-resolved omics measurements difficult to obtain. It is therefore difficult to infer dynamic network activity at the scale of whole-cell models [18] without extensive experimental investment.

Dynamic network models that describe the intricate interactions between every biomolecular state or species are referred to as state-space models. Two key variables that often determine the behavior of these network models are their network topology [19,20] and parametric realization [21,22]. The structure of a network is generally determined by how states in the system causally affect each other [22]; edges in the network are determined by causal dependence while nodes are determined by the states of the system [23]. Inferring accurate network models from data generally requires high-resolution time-series measurements of each state; when such measurements are unavailable, identifying a state-space model is ill-posed. Multiple network realizations can be consistent with under-sampled data, which makes it difficult to pinpoint specific parts or sequences for circuit refinement.

Identifying the active, dynamic network structure of a biological network is critical, since the hypothesized network architecture of a genetic circuit may be very different from the network architecture realized with a specific collection of parts, sequences, and composition approach. Network structure alone does not determine dynamical behavior; parametric information is also important in determining what dynamical behaviors a system can achieve [24]. Rather, network structure, or topology, often defines or narrows the possible behaviors a system can achieve. Without any structural constraints, a dynamical system can have arbitrary input-output behavior. Once network structure is imposed, the set of realizable input-output trajectories can be reduced [25,26]. If the realized network differs significantly from the intended network design, the dynamics of the system may produce faults or glitches when appropriately excited or interrogated [27,28]. Getting the actual network topology to match the intended network motif is thus a key element of robust synthetic biological design.

In systems and synthetic biology, canonical network motifs are broadly accepted as enabling useful dynamical behavior [26,29]. For example, an incoherent feedforward loop can be used for fold-change detection or adaptation [30,31]. A cyclic network of repressors is associated with either oscillations [32–34] or multi-stability [35], while a dual negative feedback network of two nodes is used as a memory module or toggle switch [36]. Still, the active, dynamic network architecture of most realizations of these network motifs in the form of genetic circuits has not been thoroughly studied or catalogued [37]. Systematic, generalizable tools that can verify realized network architecture with minimal measurement or data requirements are lacking [1].

Here we introduce a class of mesoscopic network reconstruction algorithms with adaptable-resolution network models, commensurate with the depth or coverage available from fluorimetric, spectrometry-based, or sequencing-based measurements. We introduce the dynamical structure function as a generalized representation of measured interactions between biological or biochemical states and show how a dynamical structure function can encode both direct and crosstalk network interactions using theoretical and simulation examples. We show how the notion of an edge in this network captures the transient dynamics of repression and activation. We then demonstrate the practical utility of these dynamical structure models by developing and implementing reconstruction algorithms on a genelet repressilator and a novel transcriptional event detector. We show how failure modes in the genetic circuits can be traced back to information about the performance of edges or nodes in the network, which ultimately constrains the design space for the genetic sequences comprising these graphical elements.

## 2 Representing Network Interactions in Partially Measured Biological Networks

The network structure of nonlinear dynamical systems is often implicitly defined by the state-space realization. Thus, the process of network reconstruction for the full system becomes a nonlinear parameter estimation [38] or state-space realization problem [39]. Such network reconstruction problems are non-convex, only locally identifiable at best, under-constrained due to the sampling limits of experimental data, and even ill-posed at times [38].

A class of dynamical systems for which the concept of network structure is well-defined and reconstruction results are readily available is that of linear time-invariant (LTI) dynamical systems [40]. The most detailed description of the network structure of an LTI system is the network defined by interactions between every state in the system [22,41–43]. Reconstructing this network structure is equivalent to finding a unique solution for the state-space realization. It is well known that uniquely determining the state-space realization is expensive, since it requires full-state measurements [44]. It is thus valuable to find different representations of network structure, consistent with the state-space realization, that encode essential structural information but impose less stringent constraints on network reconstruction [42,45].

Arguably the simplest yet most broadly employed representation of network structure is the system transfer function [44]. The transfer function describes the closed-loop causal dependencies of system outputs on system inputs. As such, it imposes weak information constraints on the process of network reconstruction. As long as it is possible to perturb the system with each input and measure each output, it is possible to reconstruct the transfer function of the system. Still, the price of reduced constraints on the network reconstruction problem is reduced information about the actual network structure of the system, e.g. how states in the system interact with each other. Only input-output relationships are encoded in the transfer function.

The state-space realization and the transfer function thus represent two distinct and qualitatively different representations of system structure. While the state-space realization encodes all direct interactions between system states and their dependencies on system input, the transfer function encodes closed-loop dependencies of system states and inputs. Each representation describes system structure with a different resolution of structural information, but likewise requires a concomitant cost in system measurement to infer or identify.

The tradeoffs between the cost of network reconstruction and the “informativity” of the structural representation are especially clear in synthetic and systems biology research. In this area, finding or verifying the network of a biological system is an important problem. However, discovering the entire chemical reaction network is typically an ill-posed problem, since additional reactions may be introduced due to host or environmental context [46], loading effects, or unanticipated retroactivity effects [47–50,7]. Even without these effects, the reconstruction problem is equivalent to finding a unique realization for the dynamical system, which is ill-posed without measurements of every chemical species in the system. On the other hand, there are many inputs that can be used to perturb the system of interest, e.g. silencing RNA [51], genetic knock-outs [52], and small chemical inducers [53]. Using these inputs, it is straightforward to reconstruct the transfer function of the system. However, the transfer function contains virtually no information about how chemical species within the system are interacting.

An intermediate representation of network structure that addresses this trade-off is the dynamical structure function [40]. It is a more detailed description of network structure than the transfer function since it models the causal interactions between measured outputs, in addition to the causal dependencies of outputs on input variables. At the same time, it does not require complete state feedback for reconstruction, since it only models the interactions among output states. In biological systems, this is especially applicable since the output variables of a system are also a subset of the state variables. All unmeasured states are subsumed in the edge-weight functions that describe interactions between measured variables. It is thus possible to experimentally target specific biochemical species to measure and verify that the network structure of a biological system is functioning as intended.

### 2.1 Dynamical Structure Functions

We briefly review the theory of dynamical structure functions as they pertain to biochemical reaction networks. In practice, the state of the dynamical system is partitioned as *x* = [*y*^{T} *x*_{h}^{T}]^{T} ∈ ℝ^{n}, where *y* ∈ ℝ^{p} are the measured chemical states of the dynamical system, corresponding to components of the biochemical reaction network tagged with fluorescent reporters, and *x*_{h} ∈ ℝ^{n−p} are the unmeasured chemical states. There are also exogenous inputs *u* ∈ ℝ^{m} that can be introduced to influence the dynamics of the state *x*. With the exception of oscillators, many biochemical reaction networks converge to a steady state. Moreover, it is generally the case that the parameters of biochemical reaction networks are time-invariant, so long as macroscopic experimental settings of the system such as temperature, growth media, and dissolved oxygen content remain fixed. Therefore, while the model of a biochemical reaction network is of the form

$$\dot{x} = f(x, u), \qquad y = \begin{bmatrix} I & 0 \end{bmatrix} x, \tag{1}$$

we will suppose that we can linearize the system about an equilibrium point (*x*_{e}, *u*_{e}), to write it in the form:

$$\begin{bmatrix} \dot{y} \\ \dot{x}_h \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} y \\ x_h \end{bmatrix} + \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} u, \qquad y = \begin{bmatrix} I & 0 \end{bmatrix} \begin{bmatrix} y \\ x_h \end{bmatrix}. \tag{2}$$

We also assume that the initial condition of the linearized system is *x*(0) = 0, and the entries in *A* ∈ ℝ^{n×n} and *B* ∈ ℝ^{n×m} are calculated as

$$A = \left.\frac{\partial f}{\partial x}\right|_{(x_e, u_e)}, \qquad B = \left.\frac{\partial f}{\partial u}\right|_{(x_e, u_e)}.$$

Taking Laplace transforms, solving for *X*_{h}(*s*), and substituting it into the equation for *Y*(*s*), we obtain

$$sY(s) = W(s)\,Y(s) + V(s)\,U(s), \tag{3}$$

where

$$W(s) = A_{11} + A_{12}(sI - A_{22})^{-1}A_{21}, \qquad V(s) = B_1 + A_{12}(sI - A_{22})^{-1}B_2.$$

Defining *D*(*s*) = diag(*W*(*s*)), subtracting *D*(*s*)*Y*(*s*) from both sides of equation (3), and solving for *Y*(*s*), we obtain the following equation

$$Y(s) = Q(s)\,Y(s) + P(s)\,U(s), \tag{4}$$

where *Q*(*s*) = (*sI* − *D*)^{−1}(*W* − *D*) is a *p* × *p* transfer function matrix and *P*(*s*) = (*sI* − *D*)^{−1}*V* is a *p* × *m* transfer function matrix. Each entry *Q*_{ij}(*s*) is a transfer function that describes the causal dependency of measured state *Y*_{i}(*s*) on measured state *Y*_{j}(*s*). Similarly, the transfer function *P*_{ij}(*s*) describes the causal dependency of measured state *Y*_{i}(*s*) on input *U*_{j}(*s*). The matrix pair (*Q*(*s*), *P*(*s*)) is known as the *dynamical structure function*, where *Q*(*s*) is referred to as the network structure and *P*(*s*) as the control structure. We illustrate these concepts with an example biochemical reaction network.
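The elimination of hidden states described above can be sketched numerically. The following is a minimal illustration, not the reconstruction code used in this work: it evaluates (*Q*(*s*), *P*(*s*)) at a single value of *s* for a linearized system whose first *p* states are measured. The function name `dsf_at`, the two-gene cascade, and all parameter values are hypothetical and chosen only so the result can be checked by hand.

```python
import numpy as np

def dsf_at(A, B, p, s):
    """Evaluate Q(s), P(s) for x' = A x + B u when y is the first p states."""
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    n = A.shape[0]
    A11, A12, A21, A22 = A[:p, :p], A[:p, p:], A[p:, :p], A[p:, p:]
    B1, B2 = B[:p], B[p:]
    # Eliminate hidden states: X_h = (sI - A22)^{-1} (A21 Y + B2 U).
    R = np.linalg.inv(s * np.eye(n - p) - A22)
    W = A11 + A12 @ R @ A21
    V = B1 + A12 @ R @ B2
    # Move the self-dynamics D = diag(W) to the left-hand side.
    D = np.diag(np.diag(W))
    M = np.linalg.inv(s * np.eye(p) - D)
    return M @ (W - D), M @ V

# Hypothetical two-gene cascade, state order [x1, x2, m1, m2]:
# proteins x1, x2 are measured, mRNAs m1, m2 are hidden.
dp, dm, rho, k = 1.0, 2.0, 3.0, -0.5   # k < 0: x1 represses gene 2
A = [[-dp, 0.0, rho, 0.0],
     [0.0, -dp, 0.0, rho],
     [0.0, 0.0, -dm, 0.0],
     [k,   0.0, 0.0, -dm]]
B = [[0.0], [0.0], [1.0], [0.0]]       # the input acts on m1 only

s = 1.0
Q, P = dsf_at(A, B, p=2, s=s)

# Closed-loop transfer function from the same realization, for comparison.
C = np.hstack([np.eye(2), np.zeros((2, 2))])
G = C @ np.linalg.inv(s * np.eye(4) - np.asarray(A)) @ np.asarray(B)
```

For this cascade the only nonzero edge is *Q*_{21}(*s*) = *kρ*/((*s* + *δ*_{m})(*s* + *δ*_{p})), which is negative for real *s* > 0 because *k* < 0, and (*I* − *Q*)^{−1}*P* reproduces the transfer function computed directly from (*A*, *B*).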

#### 2.1.1 The DSF of an Idealized Incoherent Feedforward Loop

Consider the following synthetic biology design problem: design and implement an incoherent feed-forward loop. Specifically, we consider implementing a feed-forward loop using the synthetic parts pLac-LasR-CFP-LVA, pLas-TetR-YFP-LVA, and pLas-Tet-RFP-LVA, with IPTG, C_{3}O_{6}H_{12} − HSL, and aTc as inputs (see Figure 1). We model the protein concentrations of LasR-CFP, TetR-YFP, and RFP as *x*_{1}, *x*_{2}, and *x*_{3}, respectively. We denote the corresponding mRNA species for each of these proteins as *m*_{1}, *m*_{2}, and *m*_{3}. A simple model describing the dynamics of these states, without any loading effects, can be written as:

The dynamical structure function for this system is derived by taking Laplace transforms and eliminating the hidden mRNA states *m*_{1}, *m*_{2}, and *m*_{3} associated with *x*_{1}, *x*_{2}, and *x*_{3}; see [40] or [45] for a detailed derivation of dynamical structure functions. The network and control structure matrix transfer functions are written (*Q*^{a}(*s*), *P*^{a}(*s*)) where *Q*^{a}(*s*) is written as
and *P*^{a}(*s*) is
The network, with edge weight functions corresponding to the entries of *Q*^{a}(*s*), is drawn in Figure 1B. Notice that if we take *s* ∈ ℝ_{>0}, the sign of the entries in *Q*^{a}(*s*) coincides with the form of transcriptional regulation implemented by TetR and LasR, respectively. In [54] it was shown that the sign definite properties of entries in *Q*(ℝ_{>0}) are useful for reasoning about the monotonicity of interactions between measured outputs and how fundamental limits in system performance relate to network structure.
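The sign property can be illustrated on a single edge. The sketch below assumes an edge of the second-order form *g*/((*s* + *δ*_{m})(*s* + *δ*_{p})), which arises when one hidden mRNA state sits between two measured proteins; this form, the function names, and the numerical values are assumptions made for illustration and are not taken from the model above. The sketch also evaluates the edge's inverse Laplace transform in closed form, since the time-domain view of an edge is discussed next.

```python
import math

def edge_gain(s, g, dm, dp):
    """Q_ij(s) = g / ((s + dm)(s + dp)) evaluated at a real s > 0."""
    return g / ((s + dm) * (s + dp))

def edge_kernel(t, g, dm, dp):
    """Inverse Laplace transform of edge_gain, valid for dm != dp and t >= 0."""
    return g * (math.exp(-dp * t) - math.exp(-dm * t)) / (dm - dp)

def peak_time(dm, dp):
    """Time at which the magnitude of edge_kernel is largest (dm != dp)."""
    return math.log(dm / dp) / (dm - dp)

# Hypothetical mRNA and protein decay rates.
dm, dp = 2.0, 0.5
s_values = (0.01, 0.1, 1.0, 10.0)
activation = [edge_gain(s, +1.5, dm, dp) for s in s_values]  # g > 0
repression = [edge_gain(s, -1.5, dm, dp) for s in s_values]  # g < 0
t_star = peak_time(dm, dp)
```

Under these assumptions the sign of the edge on the positive real axis equals the sign of *g*, and the time-domain response keeps that sign for all *t* > 0 while rising from zero, peaking at `t_star`, and decaying.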

Let us now consider the inverse Laplace transform *Q*^{a}(*t*) = *ℒ*^{−1}(*Q*^{a}(*s*)). We remark that the relation *Y*(*s*) = *Q*(*s*)*Y*(*s*) + *P*(*s*)*U*(*s*) reduces to *Y*(*s*) = *Q*(*s*)*Y*(*s*), which in the time domain reads

$$y(t) = \int_0^t Q^{a}(t - \tau)\, y(\tau)\, d\tau,$$

whenever *u*(*t*) ≡ 0 such that *U*(*s*) is 0. This argument holds in general for any system of the form (2). In particular, the entries of *Q*^{a}(*t*) act as convolution kernels and, taken with the integral, define an operator for mapping *y*_{j}(*t*) to *y*_{i}(*t*). Most interestingly, we can see that the network structure of this incoherent feedforward loop is *dynamical*, hence our usage of the term *dynamical structure* function to describe the network structure among the measured chemical species *y*(*t*). In this particular case, the time-domain analogue of the dynamical structure (or dynamical structure convolution kernel) is given as
Where
A visualization of each of these impulse kernel functions and their corresponding location in the dynamic adjacency matrix, defined by *Q*^{a}(*t*), is given in Figure 2. Notice how the activating or repressing nature of genetic regulation is encoded by the positivity or negativity of the corresponding kernel response. In addition to uncovering the Boolean network of interactions between biological states, the dynamical network convolution kernel *Q*^{a}(*t*) reveals the time-scales of response of each network edge, as well as the amplitude and the rate of decay of the gain. Interestingly, the transfer function *G*^{a}(*s*) of the system is likewise lower triangular, reflecting the feed-forward network topology in the genetic circuit. Specifically, *G*^{a}(*s*) has a sparsity structure of the form

$$G^{a}(s) = \begin{bmatrix} \ast & 0 & 0 \\ \ast & \ast & 0 \\ \ast & \ast & \ast \end{bmatrix},$$

where ∗ denotes a possibly nonzero entry.

#### 2.1.2 The DSF of an Incoherent Feedforward Loop with crosstalk

In prototyping a feedforward loop, it is important to anticipate *in vivo* context effects. We consider the same biocircuit as described in Example 2.1.1, except now we specifically consider loading effects that are frequently neglected in the design process of synthetic biology. First, we note that each gene may be susceptible to loading effects [7]. For each gene in Figure 1A, a degradation tag is added to make the rate of degradation of the protein tunable. Inside the cell, a protease called ClpXP targets these degradation tags and degrades the associated protein. Different tags can be incorporated to modulate the gain of the degradation process. Further, these degradation tags can be subjected to mutagenesis experiments as a means of modulating tunability.

Tunability of degradation introduces a tradeoff in performance. Since the ClpXP protease is a housekeeping protein expressed to form a common pool of proteases for all genes in the cell, there is a limit to the supply of free ClpXP protein at any instant of the cell’s growth cycle. When there are too many degradation-tagged proteins [55], the overloading of the protein degradation queue can trigger unwanted effects such as stress response. More directly, the competition for scarce proteases can induce coupled dynamics, that is, a *virtual* or *indirect* interaction between two genes competing for the same protease pool. Even if the genes were engineered to have no direct transcriptional or translational cross-regulation, the competition for the same protease effectively couples the protein states of both genes. Modifying the above model to account for these types of loading effects yields:
Computing the dynamical structure function, we obtain *Q*^{c}(*s*)
and *P*^{c}(*s*)
Notice that *Q*^{c}(*s*) is no longer lower-triangular, but fully connected. Introducing loading effects creates additional coupling between nodes in the network. If the coupling is significant, the *designed* network interactions of the incoherent feedforward loop are overcome by the *crosstalk* network interactions [54,50,56,57,20,58,8]. Thus, the coupling that is introduced into the biochemical reaction network by loading effects is reflected in the structure of (*Q*^{c}, *P*^{c})(*s*).
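To see how a shared protease fills in the zero entries of the network structure, consider the following sketch. It assumes a saturable degradation term of the form −*δ* *x*_{i}/(*K* + Σ_{k} *x*_{k}) for each tagged protein; this functional form, the function name, the operating point, and every numerical value are assumptions for illustration rather than the loading model used above. For brevity the sketch also omits the hidden mRNA states, so each edge reduces to *Q*_{ij}(*s*) = *A*_{ij}/(*s* − *A*_{ii}).

```python
import numpy as np

def shared_protease_jacobian(x, delta, K):
    """Jacobian of f_i(x) = -delta * x_i / (K + sum(x)) at the point x."""
    x = np.asarray(x, dtype=float)
    total = K + x.sum()
    # d f_i / d x_j contains delta * x_i / total^2 for every j (including j == i)
    J = delta * np.outer(x, np.ones_like(x)) / total**2
    # plus the direct term -delta / total on the diagonal.
    J -= (delta / total) * np.eye(x.size)
    return J

# Hypothetical operating point and designed (lower-triangular) couplings
# of a three-node incoherent feedforward loop.
x_op = [2.0, 1.0, 0.5]
J = shared_protease_jacobian(x_op, delta=1.0, K=1.0)
A_designed = np.array([[0.0,  0.0, 0.0],
                       [0.8,  0.0, 0.0],
                       [0.6, -0.7, 0.0]])
A = A_designed + J

# With every state measured, Q_ij(s) = A_ij / (s - A_ii) for i != j.
s = 0.1
Q = (A - np.diag(np.diag(A))) / (s - np.diag(A))[:, None]
```

In this sketch every off-diagonal entry of `J` is positive, so the upper-triangular entries of `Q`, which are zero in the designed network, become nonzero once the proteins compete for the same degradation machinery.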

In contrast, the transfer function of the crosstalk system only characterizes how system outputs causally depend on inputs. In particular, *G*^{c}(*s*) is also a full matrix, like *Q*^{c}(*s*), of 6th order SISO transfer functions
but all structural information about how loading effects cause interference *among* system states is mixed with the information about how outputs causally depend on inputs in *G*(*s*). An algorithm that identifies the entries of *G*(*s*) will thus be *unable* to quantify the size of crosstalk or interference among system states. To what extent can the entries of (*Q*(*s*), *P*(*s*)) be used to quantify the size of crosstalk in a synthetic gene network? Additional theoretical results in the Supplementary Information and [54] show that the dynamical structure function can be used to quantify crosstalk in biochemical reaction networks.
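A small numerical sketch makes the loss of structural information concrete. Since *Y* = *QY* + *PU* implies *G* = (*I* − *Q*)^{−1}*P*, two different pairs (*Q*, *P*) can produce the same *G*. The matrices below are hypothetical values of *Q* and *P* at one fixed frequency, chosen only to illustrate the point.

```python
import numpy as np

def closed_loop(Q, P):
    """G = (I - Q)^{-1} P, obtained from Y = Q Y + P U."""
    return np.linalg.solve(np.eye(Q.shape[0]) - Q, P)

# Structure 1: a repressing edge from node 1 to node 2; each input
# enters exactly one node (diagonal P).
Q1 = np.array([[0.0, 0.0],
               [-0.4, 0.0]])
P1 = np.diag([0.5, 0.8])

# Structure 2: no edges between nodes; input 1 reaches node 2 directly.
Q2 = np.zeros((2, 2))
P2 = closed_loop(Q1, P1)

G1 = closed_loop(Q1, P1)
G2 = closed_loop(Q2, P2)
```

Both structures give the same input-output map, so input-output data alone cannot tell them apart. Separating them requires additional structural knowledge, such as each perturbation acting directly on a single measured node, which is the role played by the identifiability conditions of [40].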

## 3 Method Validation with *in vitro* Biological Data: Identifying the Dynamical Structure of The Genelet Repressilator

We now turn to a practical illustration of dynamical structure reconstruction, one involving experimental data. In this section, we take as a first test case the synthetic genelet repressilator developed by Kim and Winfree [59]. The genelet repressilator consists of three DNA switches that repress one another through indirect sequestration. Specifically, each DNA switch transcribes its mRNA product only when its activator strand binds to complete its T7 RNA polymerase promoter sequence. The RNA product produced from each DNA switch, in turn, acts as an inhibitor to the downstream switch by binding to the downstream switch’s DNA activator molecule. Thus, by sequestering the DNA activator from completing the T7 RNA polymerase promoter region, the mRNA product of the upstream switch inhibits activation of the downstream switch. Figure 4A shows the mechanistic design of the genelet switch.

The genelet switch relies heavily on RNase H to degrade any activator-mRNA inhibitor complexes. Without degradation, the binding of activator to mRNA inhibitor is much faster than unbinding and so sequestration is effectively irreversible. Thus, in order for the repressilator to function properly, RNase H must degrade its target substrates sufficiently fast. If RNase H is saturated with high levels of a particular substrate, this slows the degradation of other substrates, creating a crosstalk interaction between competing DNA-RNA complexes.

By performing network reconstruction on the genelet repressilator, we can determine how much crosstalk exists in the biocircuit. To reconstruct *Q*(*s*) and *P*(*s*), we performed a single experiment with three perturbations applied in series. To perturb each switch we pipetted a small perturbative concentration of DNA inhibitor (a DNA analogue of RNA inhibitor). Since DNA inhibitor is not degraded by RNase H in a T7 expression system, it binds persistently to DNA activator and effectively acts as a step input. In this way, our perturbation design ensures sufficiency of excitation and independent perturbation of each activator (and downstream switch), thereby satisfying the identifiability conditions in [40] and the persistence of excitation conditions described in [60].

A detailed model of the repressilator can be found in the supplement of [59]. Since the derivation is lengthy, it suffices to write the idealized dynamical structure function *Q*^{a}(*s*) of this system, corresponding to the detailed model provided in Supplementary Section 1.6 [59]. The structure is obtained by linearizing the system, transforming into the Laplace domain, and eliminating hidden variables to obtain a matrix of the form

$$Q^{a}(s) = \begin{bmatrix} 0 & 0 & Q^{a}_{13}(s) \\ Q^{a}_{21}(s) & 0 & 0 \\ 0 & Q^{a}_{32}(s) & 0 \end{bmatrix},$$

reflecting the cyclic structure of the system. Though the parameters of *Q*^{a}(*s*) are unknown, we know that for every entry where *Q*^{a}_{ij}(*s*) ≡ 0, estimating the corresponding entry in *Q*^{c}(*s*) from experimental data gives a functional description of the crosstalk present in the network. The experimental data used to fit *Q*^{c}(*s*) and *P*^{c}(*s*) are plotted in Figure 5, along with their respective fits. For each row *i* of *Q*^{c}(*s*), we use *Y*_{j}, *j* ≠ *i*, and *U*_{i} as inputs and *Y*_{i} as the output for a direct multiple-input, single-output (*p* × 1) transfer function estimation problem. The impulse response for the convolution kernel *Q*(*t*) of the reconstructed *Q*(*s*) is plotted in Figure 6.
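The row-by-row estimation can be sketched with ordinary least squares. The example below is a deliberately simplified stand-in for the transfer function estimation used here: it assumes a first-order, discrete-time model for one row and fits it to synthetic, noise-free data. The function name `fit_row`, the model order, and all coefficients are hypothetical.

```python
import numpy as np

def fit_row(y_i, others, u_i):
    """Fit y_i[k+1] = a*y_i[k] + sum_j b_j*y_j[k] + c*u_i[k] by least squares."""
    Phi = np.column_stack([y_i[:-1]] + [y[:-1] for y in others] + [u_i[:-1]])
    theta, *_ = np.linalg.lstsq(Phi, y_i[1:], rcond=None)
    return theta  # ordered as [a, b_1, ..., b_{p-1}, c]

# Synthetic data for row 1 of a three-node network.
rng = np.random.default_rng(0)
N = 200
y2 = rng.normal(size=N)                     # stands in for measured node 2
y3 = rng.normal(size=N)                     # stands in for measured node 3
u1 = (np.arange(N) >= 50).astype(float)     # step perturbation on node 1

# Ground truth: designed repression from node 3 (-0.3) and a small
# crosstalk edge from node 2 (+0.05).
true = np.array([0.9, 0.05, -0.3, 0.2])
y1 = np.zeros(N)
for k in range(N - 1):
    y1[k + 1] = true[0] * y1[k] + true[1] * y2[k] + true[2] * y3[k] + true[3] * u1[k]

theta = fit_row(y1, [y2, y3], u1)
```

A nonzero fitted coefficient on a signal that has no designed edge into node *i* (here the coefficient on `y2`) is the discrete-time counterpart of a crosstalk entry in *Q*^{c}(*s*). With noisy experimental data, higher-order models and regularization would typically be needed.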

If we compute the ℋ_{∞} gain of each entry *Q*_{ij}(*s*) and scale by the maximum gain, we obtain
We see significant crosstalk on the edge *Q*_{23}(*s*) and minor crosstalk from entries *Q*_{31}(*s*) and *Q*_{12}(*s*). This crosstalk need not occur simultaneously, since the ℋ_{∞} gain calculates the worst-case or maximum gain over all possible frequencies. With the exception of *Q*_{23}(*s*), all other crosstalk entries have strictly smaller ℋ_{∞} gain than the designed edge. Examining the impulse response of the convolution kernel confirms these observations; the crosstalk edge *Q*_{23}(*t*) has a larger impulse response than the *designed* edge *Q*_{32}(*t*).
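For a stable scalar transfer function, the ℋ_{∞} gain is the supremum of |*Q*_{ij}(*jω*)| over all frequencies *ω*, so it can be approximated by a frequency sweep. The sketch below uses hypothetical edge transfer functions and a hypothetical helper `hinf_gain`; the numbers are not the fitted repressilator values.

```python
import numpy as np

def hinf_gain(tf, omegas):
    """Approximate the supremum over omega of |tf(j*omega)| on a frequency grid."""
    return max(abs(tf(1j * w)) for w in omegas)

omegas = np.logspace(-4, 3, 2000)

# Hypothetical stable edges, indexed (i, j) for Q_ij.
edges = {
    (1, 3): lambda s: -2.0 / (s + 1.0),                # designed
    (2, 1): lambda s: -1.0 / (s + 0.5),                # designed
    (3, 2): lambda s: -0.3 / (s + 1.0),                # designed, weak
    (2, 3): lambda s: 0.9 / ((s + 1.0) * (s + 0.5)),   # crosstalk
}

gains = {ij: hinf_gain(tf, omegas) for ij, tf in edges.items()}
top = max(gains.values())
relative = {ij: g / top for ij, g in gains.items()}
```

In this made-up example the crosstalk edge (2, 3) has a larger relative gain than the designed edge (3, 2), mirroring the kind of imbalance discussed above. A grid-based sweep can miss sharp resonant peaks, so a finer grid or a dedicated norm computation is preferable for lightly damped edges.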

There is also a gain imbalance between the *designed* edges *Q*_{32}(*s*), *Q*_{13}(*s*) and *Q*_{21}(*s*). In order for the oscillator to perform properly, it needs to have approximately the same gain along each edge in the network. Applying our network reconstruction algorithm thus allows us to identify design-level criteria for improving the oscillator. In particular, we can increase the gain of the edge *Q*_{32}(*s*) by adjusting the binding affinity of the activator DNA with its inhibitor RNA, or by increasing the concentration of the corresponding downstream switch *T*_{31}. This design insight is not obvious when perusing the experimental trajectories of each switch in Figure 5. Inferring dynamical structure functions yields a mesoscopic view of system interactions: enough detail to pinpoint the source of failure at the component level, but abstracted enough to avoid the ill-posed nature of full state-space realization problems.

## 4 Method Validation with *in vivo* Biological Data: Discovering the Dynamical Structure of a Transcriptional Event Detector

Discovering dynamical structure models can reveal systems-level understanding of a genetic circuit; specifically, how different parts interact with each other causally as a function of time. The gain of each interaction may change over time, which can explain failure modes of engineered synthetic biocircuits or give part-level insight into how components may need to be optimized [61]. Thus, in contrast to reductionist troubleshooting approaches, which involve exhaustive part-by-part optimization [1], network reconstruction enables a model-directed approach to troubleshooting. Reductionist approaches ensure that part-level function in isolation is optimized, but they fail to account for emergent behavior from biological part composition [62]. Dynamical structure function models capture systems-level dynamics, while retaining a network description that can describe parts-level interaction.

To illustrate these concepts, we designed and constructed a novel transcription-based event detector biocircuit. Event detectors are useful because of their ability to perform temporal logic. Temporal logic decisions enable applications such as programmed differentiation, where the goal is to perform some operation based on a combinatorial and temporal sequence of events that dictates cell fate.

So far there have been two demonstrations of temporal logic gates: 1) a temporal logic gate that differentiates start times of two chemical outputs [63] and 2) a molecular counter that counts the number of sequential pulses of inducers [64]. Both event detectors use serine integrases to perform irreversible recombination, while [64] also demonstrates the use of transcription-based event detection to perform event counting. The advantage of an integrase-based approach is the persistent nature of DNA-based memory. At the same time, the drawback of integrase-based event detection is that it is limited to one-time use.

In contrast, transcription-based event detectors use proteins instead of DNA to encode a memory state [64,65]. The advantage of a transcription-based event detector is that proteins are labile, since they are diluted through cell growth or can be tagged for degradation. Thus, a transcriptional event detector’s memory state can be reset after some period of time. On the other hand, maintaining protein state over multiple generations is metabolically expensive [58] and the dynamics of the circuit can become sensitive to the production and growth phase of the cells. Therefore, a transcription-based event detector biocircuit must be designed with precise timing, balanced production rates, and a carefully tuned gain for each transcriptional regulator. This provides a perfect use case for our network reconstruction algorithm.

### 4.1 Designing a transcriptional event detector

We designed our transcriptional event detector to be made of two constitutively expressed relay genes, AraC and LasR, and an internal toggle switch. The two relay genes transmit the arrival of two distinct induction events (arabinose and HSL) to relay output promoters pBAD and pLas respectively, which drive production of a fluorescent response in cyan fluorescent protein (CFP) and MG aptamer. To record these induction events historically, the output of each relay gene is coupled to one of two combinatorial promoters (pBAD-Lac or pLas-Tet) in a toggle switch. Each combinatorial promoter implements NIMPLY logic, e.g. pBAD-Lac (pLas-Tet) expresses TetR (LacI) only when arabinose (HSL) and AraC (LasR) are present and LacI (TetR) is absent. Thus, when one analyte (e.g. arabinose) arrives, it triggers latching of the toggle switch only if the toggle switch is unlatched to begin with or the prior latching protein state has been diluted out. The relay outputs thus transmit the *current or recent* induction event state while the toggle switch maintains the *historical* induction event state. Depending on the order of arrival of each inducer, we obtain different biocircuit states. Figure 7 details the genetic elements in the event detector biocircuit and the designed component interaction network.

We can write down an idealized model for the event detector that assumes no crosstalk and first-order degradation and production, with Hill functions encoding the NIMPLY logic of each promoter in the memory module:
where the measured outputs of the system are *y*_{i} = *x*_{i}, *i* = 2, 3; *ρ*_{i} is the translation rate of *m*_{i} into *x*_{i}; *δ*_{p} is the effective dilution rate of *x*_{i}, *i* = 1, …, 4; *δ*_{m} is the combined dilution and degradation rate of *m*_{i}, *i* = 1, …, 4; *k*_{M,u_i} is the Michaelis constant for *u*_{i}; *k*_{l} is the leaky catalytic transcription rate; *k*_{i} is the catalytic transcription rate for *m*_{i}; and *u*_{1}, *u*_{2} are arabinose and HSL, respectively.
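The NIMPLY behavior of a combinatorial promoter can be sketched with a product of an inducer-saturation term and a Hill repression term. The functional form below is an assumption made for illustration and is not necessarily the exact rate law of the model above; `k_leak` plays the role of *k*_{l} and `k_M` the role of the Michaelis constant, while `K_r`, `n`, and all default values are hypothetical.

```python
def nimply_rate(inducer, repressor, k=1.0, k_leak=0.01, k_M=1.0, K_r=1.0, n=2):
    """Transcription rate that is high only with inducer present and repressor absent."""
    activation = inducer / (k_M + inducer)              # saturating in the inducer
    repression = 1.0 / (1.0 + (repressor / K_r) ** n)   # Hill repression
    return k_leak + k * activation * repression

# Evaluate the four corners of the logic table (0 = absent, 100 = saturating).
truth_table = {
    (ind, rep): nimply_rate(ind, rep)
    for ind in (0.0, 100.0)
    for rep in (0.0, 100.0)
}
```

Only the corner with inducer present and repressor absent yields a rate close to `k`; the other three corners stay near the leak rate, which is the "A AND NOT B" pattern that lets the first-arriving inducer latch the toggle switch.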

Again, the dynamical structure function for this system is calculated by linearizing the system about a nominal operating point (*x*_{0}, *m*_{0}), taking a Laplace transform, and eliminating the hidden variables *m*_{1}, …, *m*_{4}. We present a simplified case here, assuming algebraic symmetry of the parameters *k*_{i} = *k*, *ρ*_{i} = *ρ*, *k*_{M,i} = *k*_{M}, as it does not qualitatively change the structure of (*Q*(*s*), *P*(*s*)). We obtain:
where *P*_{ii}(*s*) = *ρ*/((*δ*_{m} + *s*)(*δ*_{p} + *s*)) for *i* = 1, 2 and
In the absence of protein degradation, *Q*_{12}(*s*) and *Q*_{21}(*s*) can be approximated with first-order SISO transfer functions. These expressions for *Q*(*s*) and *P* (*s*) are for the idealized dynamical structure function of the *alternative* system. Notice that *Q*_{12}(*s*) and *Q*_{21}(*s*) are strictly negative transfer functions, indicating the repression present in an idealized simulation of the event detector circuit. This is the intended *dynamical network* structure of the event detector, in the absence of all genetic crosstalk or context effects.

Depending on the abundance of transcription factors such as LacI, TetR, and AraC, as well as commonly shared transcriptional and translational proteins, the *actual* dynamical structure function *Q*^{c}(*s*) may not exhibit monotonic repression or may even unveil unwanted interactions. This raises two important questions: 1) when the event detector fails, how is this failure characterized by the dynamical structure function and 2) how do we use the outcomes from this inverse modeling process to close the design-build-test-learn loop? These questions can be posed and answered for our event detector circuit using dynamical structure function estimation.

We constructed a biological implementation of the event detector, using the design specified in Figure 7. The logical components containing the relays and the memory module were encoded onto a plasmid vector with a kanamycin resistance marker and a ColE1 (high copy) replication origin. The fluorescent reporter elements with the relay promoters and readouts for the toggle switch were encoded on a plasmid vector with chloramphenicol resistance and the p15 replication origin.

### 4.2 Event Detector Latching Experiments

We evaluated the performance of our transcriptional event detector circuit using a temporal logic test. A standard temporal logic experiment for any two-input event detector is to evaluate the effect of varying the order of presentation of two input signals. In one test, we present the first input, arabinose, for 7.5 hours, followed by induction of the second input, a homoserine lactone (HSL) quorum sensing molecule, to activate the pLas-Tet promoter. In the second test, we swap the order of the inputs, presenting the HSL quorum sensing molecule to the event detector for 7.5 hours and then presenting the arabinose inducer as a second input. Both tests evaluate the ability of the memory module of the event detector to latch in the correct state in response to the first input, followed by a challenge to ignore the second input signal while the relays detect and read out the second input signal. The data for both of these *in vivo* tests are plotted in Figure 8B-C.

The event detector showed the correct latching response in all tests at standard maximum induction concentrations of arabinose (1 mM) and working induction concentrations of 1 *µ*M HSL. For example, Figure 8C shows that when the event detector is given arabinose followed by HSL, it generates the correct fluorescent response of yellow fluorescent protein (YFP), with a lower expression level of RFP. Conversely, when we add HSL first, followed by arabinose, the RFP signal ramps up immediately, beginning as early as 1-2 hours after induction, while YFP expression is abolished to background levels.

We tested a variety of combinations of high and low concentrations for arabinose and HSL. When the concentration of HSL was decreased to 1 nM, we observed consistent leaks in the memory module in either the YFP channel or the RFP channel. Decreasing arabinose down to 1 *µ*M still allows for latching of high YFP expression, but in the presence of 1 *µ*M HSL, any arabinose latching is reversed by HSL induction (data not plotted). Conversely, when we attenuate HSL induction to 1 nM, HSL does not prevent arabinose from reversing an HSL latch on the memory module (see Figure 8B). This leak is significant enough at the 1 nM HSL induction level that the difference in signal between the arabinose-HSL induction scenario and the HSL-arabinose induction scenario vanishes. This temporal logic response profile is evidence of a glitch in the event detector circuit that occurs at lower HSL and arabinose concentrations.

### 4.3 Network Reconstruction Experiments to Debug Circuit Failure

We conducted 4 *in vivo* network reconstruction experiments (2 inducers versus 2 concentrations), recording time-series data of the memory module relay elements, YFP and RFP. The memory module is designed using two hybrid promoters, so from a design standpoint, verification of the memory module was most critical. The arabinose inducer targets the pAra-Lac promoter, while the HSL inducer targets the pLas-Tet promoter (see Supplementary Information for sequences). It is known that the arabinose and HSL inducers have independent, orthogonal effects on their cognate activator proteins AraC and LasR, which allows us to model *P*(*s*) as diagonal. This assumption is not essential when performing direct estimation of *Q*(*s*) and *P*(*s*) from data, but it is still helpful to reduce the number of free parameters. For each network reconstruction experiment, we ran four biological replicates to account for pipetting and inoculation variability and stacked the replicate data to determine the best fit model parameters for *Q* and *P*. Since the parameters of *Q* and *P* are linearly related to the observed data, the unknown parameters can be stacked in vector form,
where Θ contains the stacked polynomial coefficients of *Q*(*z*) and *P* (*z*). After computing and populating the entries of *Q*(*z*) and *P* (*z*), we can calculate *Q*(*s*) and *P* (*s*) through a standard discrete-to-continuous model transformation. We tested both zero-order hold (given the step nature of our inducer inputs) and the Tustin transformation; both produced the same reconstruction result as expected.
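The stacking and conversion steps can be illustrated with a minimal sketch. It is a scalar toy problem and not the estimation code used for the event detector: it assumes a single first-order discrete-time edge, *y*[*k*+1] = *a* *y*[*k*] + *b* *u*[*k*], so that Θ = (*a*, *b*), and it uses four noise-free synthetic "replicates" driven by step inputs. The function names, the sampling period `T`, and the parameter values are hypothetical. The sketch stacks all replicate rows into one least-squares problem and then maps the estimated discrete pole to continuous time with both the zero-order hold and the Tustin rule.

```python
import math

def simulate(a, b, u, y0=0.0):
    """Generate y[k+1] = a*y[k] + b*u[k] for a given input sequence."""
    y = [y0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y

def fit_stacked(replicates):
    """Least-squares estimate of theta = (a, b) from stacked replicates.

    Each replicate is a (y, u) pair with len(y) == len(u) + 1. Rows from all
    replicates enter one regression y[k+1] = a*y[k] + b*u[k], solved through
    the 2x2 normal equations.
    """
    syy = syu = suu = sy1y = sy1u = 0.0
    for y, u in replicates:
        for k, uk in enumerate(u):
            syy += y[k] * y[k]
            syu += y[k] * uk
            suu += uk * uk
            sy1y += y[k + 1] * y[k]
            sy1u += y[k + 1] * uk
    det = syy * suu - syu * syu
    a = (sy1y * suu - syu * sy1u) / det
    b = (syy * sy1u - syu * sy1y) / det
    return a, b

def pole_zoh(a, T):
    """Continuous-time pole under a zero-order hold: a = exp(s*T)."""
    return math.log(a) / T

def pole_tustin(a, T):
    """Continuous-time pole under the Tustin (bilinear) map."""
    return (2.0 / T) * (a - 1.0) / (a + 1.0)

T = 0.1                      # sampling period in hours (hypothetical)
a_true, b_true = 0.99, 0.02  # hypothetical discrete-time parameters

# Four synthetic replicates with step inputs of two amplitudes.
replicates = []
for amplitude in (1.0, 0.5, 1.0, 0.5):
    u = [amplitude] * 200
    replicates.append((simulate(a_true, b_true, u), u))

a_hat, b_hat = fit_stacked(replicates)
s_zoh = pole_zoh(a_hat, T)
s_tustin = pole_tustin(a_hat, T)
print(a_hat, b_hat, s_zoh, s_tustin)
```

When the sampling period is short relative to the time constant of the edge, the two conversion rules give nearly the same continuous-time pole, which is consistent with the agreement reported above.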

The number of entries in Θ is directly dependent on the structural degree *n*_{o}, the order of the characteristic polynomial parameterizing the denominator of both *Q*(*s*) and *P*(*s*). In fitting the data to estimate the dynamical structure function, we optimized over a range of reasonable model orders, or what is known as the *structural minimal degree* of the system, ranging over *n*_{o} = 1, …, 5. For example, a 5th order system carries the interpretation that there are at least 5 nascent, unmeasured biological states with significant dynamics at the timescales exhibited by the fluorescent proteins. We found that a third order system produced the best fit to the data when running our identification algorithm for each order *n*_{o}. Finally, direct estimation of *Q*(*s*) and *P*(*s*) is extremely fast: it took 0.15 seconds for each order *n*_{o} in MATLAB. In practice, it appears feasible to conduct parameter sweeps on the model order to find the best overall mean square error fit, while minimizing bias on the training dataset.
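A sweep over model order can be sketched as follows. This is an illustration under stated assumptions, not the identification algorithm used here: it fits a generic single-output ARX model by least squares, generates synthetic data from a hypothetical second-order system with a random binary input, and scores each order by one-step-ahead mean square error on a separate validation record (the held-out scoring is a choice made for this sketch). All function names and numerical values are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def regressors(y, u, order):
    """Rows [y[k-1..k-order], u[k-1..k-order]] with targets y[k]."""
    rows, targets = [], []
    for k in range(order, len(y)):
        rows.append([y[k - i] for i in range(1, order + 1)] +
                    [u[k - i] for i in range(1, order + 1)])
        targets.append(y[k])
    return rows, targets

def fit_arx(y, u, order):
    """Least-squares ARX coefficients via the normal equations."""
    rows, targets = regressors(y, u, order)
    p = 2 * order
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve(AtA, Atb)

def mse(y, u, order, theta):
    """One-step-ahead mean square prediction error."""
    rows, targets = regressors(y, u, order)
    errs = [t - sum(c * v for c, v in zip(theta, r))
            for r, t in zip(rows, targets)]
    return sum(e * e for e in errs) / len(errs)

def make_data(n, rng):
    """Hypothetical second-order system with small equation noise."""
    u = [rng.choice([0.0, 1.0]) for _ in range(n)]
    y = [0.0, 0.0]
    for k in range(2, n):
        y.append(1.5 * y[k - 1] - 0.7 * y[k - 2] + 0.5 * u[k - 1]
                 + rng.gauss(0.0, 0.01))
    return y, u

rng = random.Random(0)
y_train, u_train = make_data(400, rng)
y_val, u_val = make_data(400, rng)

scores = {}
for order in range(1, 6):
    theta = fit_arx(y_train, u_train, order)
    scores[order] = mse(y_val, u_val, order, theta)
best_order = min(scores, key=scores.get)
print(scores, best_order)
```

In this toy problem the error drops sharply once the order reaches that of the data-generating system and changes little afterwards, which is the pattern a sweep of this kind looks for.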

As shown in the model (8) of the event detector, the actual event detector we constructed exhibits a nonlinear response. However, for any one parametric concentration regime, e.g. at a fixed arabinose or HSL concentration, the response of the system is similar to that of a linear system. Thus, our dynamical structure model allows us to reconstruct a *Q*(*s*) and *P*(*s*) for each inducer concentration level used. The accuracies in fitting dynamical structure models to the low gain condition (1 *µ*M arabinose, 1 nM HSL) and high gain condition (1 mM arabinose and 1 *µ*M HSL) were 99.996% and 99.9947%, respectively. Both of these model fit scores indicate a high fidelity representation of biological dynamics; they confirm the modeling hypothesis that for our transcriptional event detector, local behavior at fixed concentration points approximates a linear time-invariant system response. For each distinct concentration we can thus reconstruct a network model to track how changes in input concentration affect the realized network topology of our circuit and use it to explain variation in performance.

As in the case of the genelet repressilator, we can plot a dynamical network graph for the *in vivo* event detector to understand how the memory module components labeled by YFP and RFP, representing TetR and LacI respectively, interact with each other. A movie visualizing the dynamics of the edges of the graph is available for download (see Supplementary Information). Each edge represents the convolution kernel response of the edge to an impulse applied to that input. All responses are superimposed to form a dynamical graph. Snapshots of the graph are plotted in Figure 10, while time-lapse responses of the weights of each edge are plotted in Figure 9. Again as with the repressilator, we can see that the regulatory nature of edges in the event detector’s memory module manifests as two edges with negative or positive values indicating repression or activation, respectively.
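To make the edge-as-convolution-kernel picture concrete, the following minimal sketch evaluates the impulse response of a single hypothetical edge and labels each time point as repression or activation by the sign of the kernel. The edge is assumed to be a sum of two first-order terms, *g*_{1}/(*s* + *a*_{1}) + *g*_{2}/(*s* + *a*_{2}), whose kernel is *g*_{1}e^{−*a*_{1}*t*} + *g*_{2}e^{−*a*_{2}*t*}. The coefficients are invented for illustration and are not the reconstructed event detector values.

```python
import math

def edge_kernel(t, terms):
    """Impulse response of sum_k g_k / (s + a_k) evaluated at time t."""
    return sum(g * math.exp(-a * t) for g, a in terms)

def classify(value, tol=1e-9):
    """Label an edge weight by its sign."""
    if value > tol:
        return "activation"
    if value < -tol:
        return "repression"
    return "none"

# Hypothetical edge: a fast repressive term plus a slow activating term.
terms = [(-2.0, 1.0), (0.5, 0.1)]

times = [0.25 * k for k in range(41)]           # 0 to 10 hours
weights = [edge_kernel(t, terms) for t in times]
labels = [classify(w) for w in weights]

# First sampled time at which the edge has switched sign.
switch_time = next(t for t, w in zip(times, weights) if w > 0)
print(labels[0], labels[-1], switch_time)
```

An edge of this form starts out repressive and becomes activating at later times, which is the kind of time-dependent sign change that a static graph cannot display.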

The reconstructed network of our transcriptional event detector reveals the functional relationship between states in the circuit at different concentration regimes. At lower concentrations of arabinose and HSL, the reconstructed transcriptional event detector network reveals the functional cause of failed circuit latching. Neither edge in the memory module repressed its target promoter as intended, while the pLas-Tet promoter appears to enact a much higher gain of activated expression from HSL induction than the pAra-Lac promoter does in response to arabinose.

In the high gain setting, where arabinose is induced at 1 mM and HSL is induced at 1 *µ*M, we see that the memory module exhibits the proper mutually repressing motif characteristic of the genetic toggle switch up to 3 hours after the arrival of the HSL inducer. Since the inputs are presented at 7.5 hrs, this switch from repression in *Q*_{21}(*s*) to activation is consistent with the timescale of the leaky ramp in RFP signal observed after 3 hours in Figure 8C. From our reconstruction model, we can see that the edges are not perfectly balanced, even in the high gain condition, which spotlights an area for improvement for this circuit. The LacI regulator appears to have a much stronger effect on the pAra-Lac promoter than TetR has on its cognate pLas-Tet promoter. Thus, from network reconstruction experiments we can quantitatively infer latent or emergent functional relationships when assembling parts to form new circuits. These latent effects or emergent functional relationships are often the sources of failure in genetic circuits.

## 5 Conclusion

The dynamical structure function models the dependencies among measured states. It is a flexible representation of network structure that naturally adapts to the constraints imposed by experimental measurement. Since identifiability conditions of the dynamical structure function have been well characterized, appropriate experimental design can ensure that the process of network reconstruction produces a sensible answer.

Most importantly, network reconstruction of the dynamical structure function can be used to validate the intended network design of a synthetic biological system. In specific cases, where orthogonality between two chemical species is intended, the entries in a reconstructed dynamical structure function provide a direct estimate of crosstalk or interference between the two species of interest. More generally, the dynamical structure function allows us to characterize the operational or active network and study the relationships between environmental parameters, active network dynamics, and biocircuit performance. We have integrated theory, simulation, and experiments to demonstrate that dynamical structure functions can be a powerful tool for understanding, engineering, and validating synthetic gene networks and biological circuits.

## 6 Experimental Methods

All plasmids were constructed using either Golden Gate assembly [66] or Gibson isothermal assembly [67] in *E. coli*. Plasmids were sequence verified in JM109 cloning strains and transformed into the strain MG1655ΔLacI, provided as a courtesy by R. J. Krom and J. J. Collins. The event detector was transformed as a two-plasmid system with kanamycin and chloramphenicol selection. All *in vivo* experiments were carried out with *n* = 2 replicates using MatriPlates (Brooks Life Science Systems MGB096-1-2-LG-L) 96 square-well glass bottom plates at 29 °C in a BioTek Synergy H1 plate reader using 505/535 nm and 580/610 nm excitation/emission wavelengths. Cell density was quantified with optical density at 600 nm.

For *in vitro* experiments, all genelet repressilator reconstruction experiments were carried out at 37 °C in a Horiba spectrofluorometer with 1 minute readout times, using Rhodamine Green, TYE 563 and Texas Red fluorophores with 10 nm monochromator excitation and emission bands centered at 502/527, 549/563, and 585/615 nm, respectively. All event detector network reconstruction reactions were performed using 500 *µ*L reaction volumes in transformed *E. coli*, grown in square well glass-bottom plates using MatriPlates (Brook Life Science Systems MGB095-1-2-LG-L) with Luria-Bertani rich media broth at 29 °C.

## 7 Author Contributions

E. Y. wrote the paper. E. Y., J. G., Y. Y., J. K., and R. M. M. edited drafts of the paper. E. Y. and J. K. designed and carried out experiments and processed experimental data. E. Y. performed analysis and modeling. J. G. and R. M. M. secured research funding. R. M. M. supervised the research process.

## 9 Supplementary Information

### Experimental Methods for Circuit Preparation, Assembly, and Testing

#### 9.1 The Repressilator Genelet Circuit

The DNA sequences for the T31, T12, and T23 switches were obtained as a gift from the Winfree lab and are identical to the design of the repressilator genelet circuit used in [59]. Oligonucleotides were ordered with functionalized fluorophores or quenchers, corresponding to the original design of the genetic repressilator. DNA sequences were suspended in Tris-EDTA buffer for primary stock storage, while the genelet switches T12, T31, and T23 were added at concentrations of 75 nM, 75 nM, and 60 nM, respectively, to match previous tuning experiments to balance the repressilator, with 7.5 mM working concentration of mono-NTP solution, 24 mM MgCl2, and 1x T7 expression system buffer.

DNA analogues of RNA inhibitors were added to sequester DNA activator signal from the switches as an effective step input perturbation to each node. The switches produced an RNA signal that was designed to interfere with formation of a complete promoter region of the next downstream switch in the repressilator circuit. Adding DNA served as a step perturbation to the corresponding switch. Each DNA moiety added thus had the effect of an activator. Activator DNA molecules A1, A2, and A3, each containing an Iowa Black quencher, were added at 75 nM, 80 nM, and 75 nM working concentrations at 20 minutes from the onset of the reaction, to determine the maximum range of quenching. At 58 minutes, we added 0.7 *µ*L of pyrophosphatase, 3 *µ*L of T7 RNA Polymerase and 2.2 *µ*L of RNase H to achieve working concentrations identical to those described in [59].

#### 9.2 The Transcriptional Event Detector Circuit

The transcriptional event detector circuit, as illustrated in Figure 7 in the main text, is composed of four distinct gene expression cassettes that define the regulatory logic of the circuit and four distinct gene expression cassettes that generate the fluorescent reporter elements of the circuit. Each gene cassette defines a transcriptional unit, with a promoter element, an RBS, a coding sequence, and a terminator sequence. Each gene cassette was cloned using a 5-part Golden Gate assembly, with the Type IIS restriction enzyme BsaI and overhang sequences from [68,69]. Each assembled gene cassette was cloned in JM109 *E. coli* cloning strains and sequence verified by Sanger sequencing (Eurofins Genomics). Assembled plasmids were engineered to enable a second stage Golden Gate assembly, using the Type IIS restriction enzyme BbsI, and assembled to form either 1) a master regulatory logic plasmid (pEY15K), comprised of four distinct gene expression cassettes driving transcription factor or allosteric response, or 2) a master reporter plasmid (pEY14C), comprised of four distinct reporter elements. Both Stage 2 assembled regulatory logic and reporter plasmids were sequence verified using Sanger sequencing (Eurofins Genomics) and transformed into MG1655ΔLacI (a gift from the Collins laboratory). The sequences for all individual plasmids and the circuit plasmids are listed in Table 1.

### Sequences of Genetic Circuit Components

The sequences for all genetic components and circuits for the event detector circuit are listed in Table 1. All genelet repressilator sequences are identical to the sequences used and listed in [59]. All ribosome binding site (RBS) sequences were derived from the bicistronic design (BCD) ribosome binding site library [70], while all terminator sequences were drawn from the synthetic terminator library characterized in [71].

## 10 Data Accessibility

All data files and network reconstruction code can be obtained from the GitHub repository https://github.com/YeungRepo/NetworkRecon.

## 11 Quantifying Crosstalk in Biochemical Reaction Networks

A common way that crosstalk arises in biochemical reaction networks is when species compete for commonly shared enzymes. When this occurs, the sequestration of an enzyme by one competing species makes the enzyme less accessible to other competing species. For example, when two mRNA are competing for a single ribosome, the binding of one mRNA to the ribosome during translation makes it less accessible to other mRNA. At the core of any such crosstalk is a sudden increase in the dependency of one biochemical state on another. Though enzyme loading may be a common source of crosstalk, such interactions can be modeled at a higher level of abstraction, namely how the dynamics of a given state are affected by the concentration fluctuations of other states.

Nearly every synthetic gene network implements causal dependencies among states. Often, these “designed” interactions take the form of transcription factor binding, sense-anti-sense mRNA regulation, and sequestration events. In practice, every physical system exhibits trajectories that are a mixture of the consequences of both interaction types: designed and crosstalk interactions. Throughout the course of this paper, we will denote the physical system of interest in our models as

*ẋ*^{c} = *f*_{c}(*x*^{c}, *u*). (9)

To quantify crosstalk in such systems, we can compare the dynamics of system (9) against the dynamics of a reference or alternative system that is free of crosstalk. Such a reference system will still retain the *desired* interaction dynamics and reflects the idealized model often used to design a synthetic gene network, e.g. the feed-forward loop model in Example 2.1.1. Moreover, it can represent the desired behavior of the system in a regime where the magnitude of crosstalk effects is supposed to be minimal or engineered in such a way that they are suppressed [7]. We write the reference system as

*ẋ*^{a} = *f*_{a}(*x*^{a}, *u*). (10)

*For the comparison between the alternative and crosstalk system to be fair, it is important that* (*10*) *satisfies internal equivalence* [*72*]. *Specifically, we will suppose that any parameters or dynamics unassociated with crosstalk, e*.*g. interaction dynamics, catalytic reactions, or anabolic reactions with no loading effects, are held fixed. Thus, as we compare the behavior of both systems, any differences in the hidden state x*_{h} *or output y dynamics are purely due to effects of crosstalk*.

With the definition of an alternative system in place, it becomes possible to reason about the size of crosstalk, by comparing the dynamics of both systems. In particular, we can develop a rigorous notion for describing the amount of crosstalk arising from the difference of trajectories in both systems.

*Consider two systems, a crosstalk system and an alternative or reference system, initialized from the same initial condition x*(0). *For each initial condition x*(0) = (*y*(0), *x*_{h}(0)) ∈ ℝ^{n} *and input trajectory u*(*t*) *we define the crosstalk trajectory ζ*(*t*) *as*

*ζ*(*t*) ≜ *x*^{a}(*t*) − *x*^{c}(*t*).

The crosstalk trajectory is a time-evolving vector that describes the deviation of the physical system (subject to crosstalk) from the reference system’s trajectory. With this notion of crosstalk, we can also make precise the concept of crosstalk between states. We note that writing the quantity of interest ∂*ζ*_{i}/∂*x*_{j} involves a slight abuse of notation, since *ζ*_{i} depends on both *x*^{a}(*t*) and *x*^{c}(*t*). Mathematically, we are computing the *j*^{th} partial derivative of each term in *ζ*_{i} = *x*_{i}^{a} − *x*_{i}^{c}. Thus, to be clear, when we write ∂*ζ*_{i}/∂*x*_{j} it will be implicit that we mean ∂*x*_{i}^{a}/∂*x*_{j}^{a} − ∂*x*_{i}^{c}/∂*x*_{j}^{c}.

*Given an initial condition of* (*x*(0), *y*(0)) *and input trajectory u*(*t*) *we say that a chemical species x*_{j} *exerts a crosstalk effect on chemical species x*_{i} *if the i*^{th} *component of the crosstalk trajectory ζ*(*t*) *has nonzero partial derivative*

∂*ζ*_{i}/∂*x*_{j} ≠ 0

*for some initial condition of* (*x*(0), *y*(0)) *and input trajectory u*(*t*). *In general, we will refer to* ∂*ζ*_{i}/∂*x*_{j} *as the crosstalk sensitivity of x*_{i} *to x*_{j}.

Notice that the mathematical definition of crosstalk sensitivity depends on the initial condition *x*(0) and the input *u*(*t*). This dependency is consistent with the parametric sensitivity of biological function. Many genetic circuits in bacteria behave acceptably in one initial condition and for one input condition, e.g., in log-phase with an attenuated amount of a small molecule or sugar compound, but exhibit significantly different behavior when input concentrations are increased by an order of magnitude or when the cells are subjected to an alternate preparation method prior to the experiment. The latter imposes a state history that defines a distinct initial condition, which can drive a biological network to a highly coupled or decoupled state.

*Consider two mRNA species m*_{1} *and m*_{2} *competing for the same degradation enzyme D in a physical system. For simplicity of exposition, suppose their production dynamics do not depend on each other and can be modeled as P*_{1}(*t*) *and P*_{2}(*t*) *respectively. The crosstalk system is given as*
*while the reference system is given as*
*In both systems, we have supposed that time has been rescaled so that the customary parameter k*_{cat} *for degradation is unity. The crosstalk sensitivity of m*_{1} *and m*_{2} (*with respect to each other*) *are given as*
*respectively. The crosstalk sensitivity between m*_{1} *and m*_{2} *is nonzero whenever m*_{1} *or m*_{2} *have non-zero initial condition*.
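The crosstalk trajectory in an example of this kind can be explored numerically. The sketch below does not reproduce the model equations of the example; it assumes a standard competitive Michaelis-Menten form, in which each mRNA is degraded at rate *D* *m*_{i}/(*K* + *m*_{1} + *m*_{2}) in the crosstalk system and at rate *D* *m*_{i}/(*K* + *m*_{i}) in the reference system, with constant production rates. The parameter names and values are hypothetical, and forward Euler integration is used only for brevity.

```python
def simulate(p1, p2, d_total, k_m, m0, crosstalk, dt=0.001, t_end=20.0):
    """Forward-Euler trajectory of two mRNAs sharing a degradation enzyme.

    ASSUMED model form (not taken from the text):
      crosstalk: dm_i/dt = p_i - d_total * m_i / (k_m + m_1 + m_2)
      reference: dm_i/dt = p_i - d_total * m_i / (k_m + m_i)
    """
    m1, m2 = m0
    traj = [(m1, m2)]
    for _ in range(int(round(t_end / dt))):
        if crosstalk:
            load1 = load2 = k_m + m1 + m2   # shared enzyme pool
        else:
            load1, load2 = k_m + m1, k_m + m2
        dm1 = p1 - d_total * m1 / load1
        dm2 = p2 - d_total * m2 / load2
        m1, m2 = m1 + dt * dm1, m2 + dt * dm2
        traj.append((m1, m2))
    return traj

# Hypothetical parameters; both systems start from the same initial condition.
params = dict(p1=1.0, p2=1.0, d_total=5.0, k_m=1.0, m0=(0.5, 0.5))
x_c = simulate(crosstalk=True, **params)    # crosstalk system
x_a = simulate(crosstalk=False, **params)   # reference system

# Crosstalk trajectory zeta(t) = x^a(t) - x^c(t), component by component.
zeta = [(a1 - c1, a2 - c2) for (a1, a2), (c1, c2) in zip(x_a, x_c)]
zeta1_final = zeta[-1][0]
print(zeta1_final)
```

Under these assumptions the crosstalk system settles at a higher mRNA level than the reference system, because competition for the enzyme slows degradation, so the first component of *ζ*(*t*) ends up negative.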

*In synthetic biocircuit design, two chemical species x*_{i} *and x*_{j} *are often declared orthogonal when there is no designed interaction between them. Mathematically, in the crosstalk free system, this corresponds to*

∂*x*_{i}^{a}/∂*x*_{j}^{a} = 0

*for all x*(0) *and u*(*t*). *In such a situation, ζ*(*x*_{i}, *x*_{j}) ≠ 0 *if and only if*

∂*x*_{i}^{c}/∂*x*_{j}^{c} ≠ 0.

*This condition is interesting in experimental settings since a computational estimate of* ∂*x*_{i}^{c}/∂*x*_{j}^{c} *from perturbation experiments coincides with a direct estimate of the sensitivity of the crosstalk* ∂*ζ*_{i}/∂*x*_{j}. *More specifically, when x*_{i} *and x*_{j} *are measured outputs of the system, we will show in the sequel that quantifying* *Q*^{c}_{ij}(*s*) *is directly related to an estimate of the crosstalk sensitivity* ∂*ζ*_{i}/∂*x*_{j} *near the equilibrium point*.

*In general, estimating the crosstalk sensitivity for the nonlinear systems* (*9*) *and* (*10*) *can be challenging if either x*_{i} *or x*_{j} *is not measured directly. First, if experimental data is available, it will often consist of data for the measured species y in the crosstalk system, but not the reference system. Second, if only one of the species x*_{i} (*or none*) *is available for measurement, even if perturbation of x*_{j} *is possible, a nonlinear observer is required to estimate the trajectory of x*_{j}(*t*). *Unless the parameters of f*_{i}(*x, u*) *are known a priori* (*which is generally not the case*), *this then also requires system identification of the parameters of f*_{c}(*x, u*) *and f*_{a}(*x, u*), *which often results in a non-convex optimization problem*.

Thus, our goal is to estimate the observed crosstalk between measured species *Y*_{i} and *Y*_{j}. This crosstalk estimate will invariably include the dynamics of unmeasured chemical species (such as ATP, RNAP, untagged mRNA and protein species, DNA-protein complexes etc.). From a synthetic biology design standpoint, this is not a disadvantage, since the goal is to design a synthetic gene network with an *abstracted* circuit architecture operating reliably in the context of many unmeasured species. In any genetic circuit, there are always additional biochemical compounds that are unmeasured. Our goal is to validate that a biocircuit (e.g. an IFFL, repressilator, or a novel biocircuit) still manifests the intended network structure even in the presence of unmeasured dynamics.

*Let ℒ denote the two-sided Laplace operator. Suppose the states x*^{c} *and x*^{a} *of the systems* (*9*) *and* (*10*) *are shifted, so that the origin is a locally asymptotically stable equilibrium point and Q*^{c} *and Q*^{a} *are the respective dynamical structure functions calculated for each linearized system about the origin. Then*
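
∂ℒ(*ζ*_{i})/∂*Y*_{j} = *Q*^{a}_{ij}(*s*) − *Q*^{c}_{ij}(*s*), (11)

*and in particular, if*

*Q*^{a}_{ij}(*s*) = 0,

*then*

∂ℒ(*ζ*_{i})/∂*Y*_{j} = −*Q*^{c}_{ij}(*s*),

*and can be estimated from input-output data* (*Y*(*s*), *U*(*s*)).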

*First, notice that the Laplace transform of the crosstalk trajectory is ℒ*(*ζ*(*t*)) = *ℒ*(*x*^{a} − *x*^{c}) ≜ *X*^{a}(*s*) − *X*^{c}(*s*), *which can be decomposed into its measured and unmeasured states*
*Examining the i*^{th} *component equation and taking partials along Y*_{j}(*s*) *yields equation* (*11*).

This result is important, since it tells us when estimating *Q*^{c}(*s*) from experimental data will correspond to estimating crosstalk between measured states in *Y*(*s*). Since necessary and sufficient conditions for identifying *Q*(*s*) and *P*(*s*) have already been characterized [40], this provides conditions for inferring crosstalk from input-output data. For example, a sufficient condition is that there is an input variable available to excite each measured output of the genetic network being reconstructed. This allows for the possibility that some biological states are unmeasured and unexcited, but these will be viewed as hidden states that play a role in defining the edge dynamics in *Q*^{c}(*s*).

More generally, even if parameters for *f*^{a}(*x, u*)(*t*) are unknown, the structure of *Q*^{a}(*s*) can be analytically calculated (using a symbolic algebra package). For every zero entry in *Q*^{a}(*s*) (coinciding with designed orthogonality between measured states), we can then estimate *Q*^{c}(*s*) directly.

In practice, estimation of *Q*^{c}(*s*) is also confounded by noise. In our analysis in this paper, we suppose that a series of filters can be applied to eliminate the noise in the data. This may not be the case for biological systems that have been characterized as inherently stochastic, e.g. single cell gene expression dynamics. In such settings, the estimated dynamical structure *Q*^{c}(*s*) is a mixture of the process noise in the system and the crosstalk. From the standpoint of synthetic biocircuit prototyping, both are undesirable in the ultimate iteration of the biocircuit and thus need to be quantified. In this paper, we will demonstrate our theoretical and computational framework with experimental results derived from *in vitro* systems, where signal-to-noise ratios are high and the only sources of noise are measurement noise and pipetting error. For a theoretical treatment of how to reverse engineer *Q*^{c}(*s*) in the presence of process noise or system perturbation, see [74].

An advantage of using *Q*^{c}(*s*) to estimate the crosstalk is that we can use the ℋ_{∞} norm of the corresponding entries to calculate the worst-case crosstalk magnitude and the ℋ_{2} norm to calculate the average crosstalk across all frequencies.
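As a minimal numerical sketch of these two measures, consider a single crosstalk edge that is assumed, purely for illustration, to be first order, *Q*^{c}_{ij}(*s*) = *g*/(*s* + *a*). For this form the ℋ_{∞} norm is |*g*|/*a* (the peak gain, attained at zero frequency) and the ℋ_{2} norm is |*g*|/√(2*a*). The code approximates both from sampled frequency response values; the gain `g`, pole `a`, and grid settings are hypothetical choices.

```python
import math

def edge_gain(g, a, w):
    """Magnitude of g / (jw + a) at angular frequency w."""
    return abs(g / complex(a, w))

def hinf_norm(g, a, freqs):
    """Peak gain over a frequency grid (approximates the H-infinity norm)."""
    return max(edge_gain(g, a, w) for w in freqs)

def h2_norm(g, a, w_max, n):
    """sqrt((1/2pi) * integral of |Q(jw)|^2 dw), trapezoid rule on [0, w_max].

    The integrand is even in w, so the integral over the positive axis is
    doubled. Truncating at w_max slightly underestimates the norm.
    """
    dw = w_max / n
    total = 0.0
    for k in range(n + 1):
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * edge_gain(g, a, k * dw) ** 2
    return math.sqrt(2.0 * total * dw / (2.0 * math.pi))

g, a = 0.2, 0.5   # hypothetical edge gain and pole

# Log-spaced grid from 1e-3 to 1e3, plus the zero frequency.
freqs = [0.0] + [10 ** (-3 + 6 * k / 600) for k in range(601)]
hinf = hinf_norm(g, a, freqs)
h2 = h2_norm(g, a, w_max=1000.0 * a, n=200000)
print(hinf, h2)
```

For higher-order edges the same grid-based approach applies once the frequency response of the estimated entry is available, although the peak gain need not occur at zero frequency.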

### 11.0.1 Quantifying Crosstalk with Q^{c}(s)

Recall the incoherent feedforward loop in subsections 2.1.1 and 2.1.2. In particular, comparing *Q*^{a}(*s*) and *Q*^{c}(*s*) we see that *Q*^{c}(*s*) is a full transfer function matrix
and *Q*^{a}(*s*) is lower-triangular, reflecting the network structure of the intended IFFL. By examining the upper triangular entries in *Q*^{c}(*s*), we can directly examine the effects of degradation crosstalk. In the lower entries of *Q*^{c}(*s*), these crosstalk effects are confounded with the direct interactions modeled in *Q*^{a}(*s*). Although the gains of these entries in *Q*^{c}(*s*) are small, they can nonetheless have a significant effect on the dynamics of the IFFL.

In Figure 11 we plot the time-lapse response of *y*_{2}(*t*) and *y*_{3}(*t*) for varying parameter values of *k*_{2,d} in equation (7). The *k*_{2,d} parameter is a Michaelis constant that determines the effective affinity of substrate *x*_{2} in binding with *C*_{0}. As *k*_{2,d} increases, the affinity of substrate *x*_{2} is diminished, relative to the affinity of *x*_{1} and *x*_{3}. Increasing *k*_{2,d}, and thereby attenuating this affinity, can be viewed as similar to swapping out a strong degradation marker for protease degradation with a weaker degradation marker on the species *x*_{2}. In the experimental literature, there are multiple degradation markers for proteins that confer varying binding affinities to an associated protease [75]. In our simulation, we consider five potential values for *k*_{2,d}: 500, 1625, 2750, 3875, and 5000 *µ*M, corresponding to five artificial LVA markers of varying strengths for the protease ClpXP frequently used in *E. coli*.

Notice that as we decrease the affinity of *y*_{2} for ClpXP, this also coincides with an increased *ζ*_{2} crosstalk magnitude. Here, we have computed . We find that |*ζ*_{2}| increases as *k*_{2,d} increases. In Figure 11B-D, *ζ*_{2} is plotted as a percentage of maximum absolute change across all values of *k*_{2,d}.

We see that the time-lapse response of *y*_{2}(*t*) increases monotonically for all *t* as the crosstalk *ζ*_{2}(*t*) increases. This is consistent with biological intuition, since an increase in competition for resource loading (an increase in *k*_{2,d}) results in prolonged lifetimes of each individual *y*_{2} (TetR-YFP) protein. This in turn results in higher repression levels of *y*_{3} in the incoherent feedforward loop. Increased competition for ClpXP from substrates *y*_{3} and *y*_{1} has the effect of damping *y*_{3} dynamics and reinforcing the pulsatile response of the IFFL. The crosstalk in this circuit thus effectively strengthens the negative regulation of *y*_{2} on *y*_{3}, encouraging the downward transient after *t* ≈ 0.75 hours. Our network analysis shows we can improve the robustness of an IFFL’s pulse by attenuating the relative binding affinity of the repressor to its protease.

In general, crosstalk effects do not necessarily reinforce the feedback architecture of a biocircuit. This underscores the importance of having techniques for quantifying crosstalk in a synthetic gene network and validating that designed interactions are dominant over crosstalk interactions. In the next two sections, we illustrate these concepts with experimental systems implemented *in vitro* and *in vivo*.

## 8 Acknowledgments

We would like to acknowledge Sean Warnick, Vipul Singhal, Shara Balakrishnan, and Anandh Swaminanthan for insightful conversations on network reconstruction algorithms. We would like to thank and acknowledge Victoria Hsiao, Ophelia Venturelli, Clarmyra Hayes, Emmanuel de los Santos, Joe Meyerowitz, and Zachary Sun for guidance with experimental techniques. This work was supported by the Engineering and Physical Sciences Research Council, the Luxembourg National Research Foundation, Air Force Office of Scientific Research, Grant FA9550-14-1-0060, the Defense Advanced Research Projects Agency, Grants HR0011-12-C-0065 and FA8750-19-2-0502, the Army Research Office Young Investigator Program, Grant W911NF-20-1-0165, the National Science Foundation, Grant 1317291, and the John and Ursula Kanel Charitable Foundation.
