Abstract
Gene regulatory networks are ubiquitous in nature and critical for bottom-up engineering of synthetic networks. Transcriptional repression is a fundamental function that can be tuned at the level of DNA, protein, and cooperative protein – protein interactions, necessitating high-throughput experimental approaches for in-depth characterization. Here we used a cell-free system in combination with a high-throughput microfluidic device to comprehensively study the different tuning mechanisms of a synthetic zinc-finger repressor library, whose affinity and cooperativity can be rationally engineered. The device is integrated into a comprehensive workflow that includes determination of transcription factor binding energy landscapes and mechanistic modeling, enabling us to generate a library of well-characterized synthetic transcription factors and corresponding promoters, which we then used to build gene regulatory networks de novo. The well-characterized synthetic parts and insights gained should be useful for rationally engineering gene regulatory networks and for studying the biophysics of transcriptional regulation.
INTRODUCTION
Cell-free systems have emerged as versatile and efficient platforms for rapid engineering, characterization, and implementation of genetic networks. It has been demonstrated that linear genetic cascades (Noireaux et al. 2003), logic gates (Shin and Noireaux 2012), and oscillators (Karzbrun et al. 2014, Niederholtmeyer et al. 2013, 2015) could be implemented and characterized in cell-free systems, and that networks engineered in cell-free systems function in cells with remarkably similar characteristics, indicating that cell-free systems accurately emulate the cellular environment (Chappell et al. 2013, Niederholtmeyer et al. 2015). Besides these examples in molecular systems engineering and characterization of complex biological systems, cell-free systems provide a viable starting point for the bottom-up synthesis of artificial cells (Forster and Church 2006, Schwille et al. 2018). Work is progressing in establishing critical cellular sub-systems including DNA replication (van Nies et al. 2018), metabolism (Otrin et al. 2017), ribosome synthesis (Jewett et al. 2013), membrane synthesis (Bhattacharya et al. 2017), and protein structures (Furusato et al. 2018). Gene regulatory networks (GRNs) are one such critical sub-system, and here we demonstrate de novo bottom-up engineering and comprehensive characterization of synthetic GRNs in a cell-free system.
GRNs execute the genome and thus play a central role across all domains of life. Due to their importance and ubiquity, GRNs have been intensely studied and considerable progress is being made in deciphering components, topologies, and general mechanisms of GRNs, although a complete mechanistic understanding is still lacking. Because GRNs perform many sophisticated cellular tasks, synthetic biologists use GRNs to engineer new systems (Brophy and Voigt 2014) such as logic gates (Nielsen et al. 2016), toggle switches (Gardner et al. 2000), band-pass filters (Basu et al. 2005), and oscillators (Elowitz and Leibler 2000). Nonetheless, past and current efforts in engineering GRNs have shown that rational design is not yet possible, and that engineering GRNs still heavily relies on trial-and-error and high-throughput screening approaches (Nielsen et al. 2016). The inability to rationally design GRNs is in part due to the aforementioned lack of complete mechanistic understanding, and because basic GRN components such as transcriptional regulators and promoters are often neither fully characterized nor standardized. A corollary of the lack of an in-depth mechanistic understanding of these systems is that individual components are not yet readily composable. Nature provides a plethora of potential transcriptional regulators, but the number that have been tested and characterized remains rather limited. Most engineered GRNs make use of naturally occurring transcription factors, making it difficult to robustly engineer GRNs with such a non-standard set of proteins (Stanton et al. 2014). A library of well-characterized, synthetic transcription factors could alleviate many of these problems by providing a set of standardized transcription factors that are based on the same basic structural framework, and whose function can be extended by generating fusion proteins in a plug-and-play format.
Native GRNs employ a wide range of transcription factors that can be categorized into several structural families. The family with the largest number of members is the zinc-finger (ZF) family, followed by homeodomain, basic helix-loop-helix, and basic-leucine zipper (LZ) families (Vaquerizas et al. 2009). ZFs are of interest in biology as they represent the largest class of transcriptional regulators and are involved in diverse biological functions. ZFs are also appealing for bottom-up engineering as they consist of well-defined subunits that, in combination, determine DNA sequence specificity (Beerli and Barbas III 2002, Tebas et al. 2014). Many resources are therefore available that provide sequence specificity information for a large number of native (Najafabadi et al. 2015) and engineered (Fu and Voytas 2013) ZF transcription factors. An additional advantage is that ZFs are small (264 bp, 10.6 kDa (Zif268)) compared to other engineerable transcriptional regulators such as TALE (e.g. 1161–2397 bp, 39.9–82.6 kDa, DNA binding domain only (Moore et al. 2014)) or dCas9 (4107 bp, 158.3 kDa), so that the coding sequence for ZFs is easily obtainable and modifiable. Due to their small size and simple structure, ZFs can be readily expressed both in vivo and in vitro. Synthetic ZFs have already been successfully used as activators in S. cerevisiae (Khalil et al. 2012) and human cells (Lohmueller et al. 2012). Here we engineer and explore the use of synthetic ZF transcriptional regulators as ideal building blocks for bottom-up design and implementation of cell-free GRNs.
In this paper, we took advantage of an existing synthetic ZF library (Blackburn et al. 2015) to generate a well-characterized resource of transcriptional repressors and corresponding synthetic promoters that can be used for bottom-up design, implementation, and characterization of GRNs in cell-free systems. While the mechanism of action of the simplest prokaryotic repression is competitive inhibition (Ptashne et al. 1976), it has long been appreciated that both cis modifications to the promoter, such as operator position (Cox III et al. 2007) and basal promoter strength (Lutz and Bujard 1997), and trans modifications to the transcription factor itself strongly affect repression (Lanzer and Bujard 1988, Sharon et al. 2012). These inter-dependencies result in a large experimental space with many degrees of freedom. In order to tackle this complexity we developed a microfluidics-based method capable of performing 768 cell-free transcription-translation (TX-TL) reactions on a single device. The ability to rapidly generate ZF repressor and promoter variants using fast PCR assembly and the use of our high-throughput microfluidic device allowed us to perform a comprehensive characterization of repressors and promoters. We investigated the effects of binding site position, binding site affinity, binding site combinations, and cooperative interactions between the repressors on transcriptional repression performance. We generated quantitative position weight matrices (PWMs) for four ZF repressors with MITOMI (Maerkl and Quake 2007), which allowed us to rationally tune binding site affinity and promoter output. Finally, we used the parts library and insights acquired in this study to engineer logic gates, showing that de novo synthetic GRNs can be rationally engineered using a bottom-up approach.
The transcription factor / promoter parts library, data, and methods described here provide a resource that should facilitate efforts to build synthetic GRNs, serve as a viable approach for building GRNs for use in artificial cells, and establish an experimental platform for studying the biophysics of transcriptional regulation.
RESULTS
A. Design and characterization of a microfluidic device for high-throughput cell-free experiments
The design space of even a single TF – promoter pair is large, encompassing different binding site affinities, binding site positions, binding site sequences, and binding site combinations. This complexity necessitates high-throughput methods capable of the functional characterization of hundreds to thousands of engineered variants. Current approaches in cell-free synthetic biology primarily rely on standard microtiter plates, which require a minimal reaction volume of 5 – 10 μL. Such relatively large volumes quickly become cost-limiting in terms of how much cell-free reaction solution and DNA is required to perform the assays. Researchers recently made use of an acoustic liquid handling robot that reduced reaction volumes to 2 μL in a 384 well plate format (Moore et al. 2018). Here we repurposed the MITOMI platform, a microfluidic device originally developed for high-throughput molecular interaction analysis (Garcia-Cordero and Maerkl 2016, Maerkl and Quake 2007), and applied it to the high-throughput characterization of cell-free genetic networks. The repurposed device performs 768 cell-free reactions, and reduces volumes by ∼4 orders of magnitude to ∼690 pL per reaction.
The process involves the synthesis of DNA parts, followed by microarraying and incorporation into microfluidic unit cells where they serve as templates in cell-free TX-TL reactions (Figure 1A). To expedite the synthesis of large libraries of DNA parts we used an assembly PCR strategy to generate linear DNA templates with different promoter regions upstream of a deGFP gene. A microarray robot is used to spot the linear templates onto an epoxy-coated glass slide, on top of which the PDMS device is aligned. Immobilizing DNA within each reaction chamber first requires surface patterning in the assay section of each unit cell, resulting in a circular area of neutravidin to which biotinylated DNA can bind. Once DNA is surface immobilized, cell-free extract is flowed into the device and the unit cells are isolated from one another while the TX-TL reactions occur. A detailed schematic of the experimental procedure is shown in Figure S1A.
Controlling the precise amount of DNA in each unit cell is important for quantitative experiments. By simply varying the concentration of spotted biotinylated DNA templates we were unable to precisely control DNA concentration on-chip. We thus developed an approach based on spotting a mixture of single stranded biotinylated DNA oligos (ssDNA) and double stranded DNA templates (dsDNA). The amount of DNA immobilized on the surface reached saturation at a concentration of ∼100 nM spotted DNA (Figure S2). We therefore held the total concentration of spotted DNA above this saturation point. Changing the ratio of dsDNA:ssDNA gave rise to a linear correlation between the concentration of dsDNA free in solution (DNAF) and dsDNA bound to the surface (DNAB), and was insensitive to the total amount of DNA deposited during spotting (Figure 1B, C). This approach allowed us to immobilize DNA over a wide concentration range, which gave rise to corresponding levels of expressed deGFP (Figure 1D, E). The results obtained with the high-throughput microfluidic device are reproducible with a global normalized root-mean-square deviation of ∼14%, not only when a single dsDNA template is used, but also for more complex experiments requiring multiple templates in each unit cell (Figure S3). Furthermore, a subset of on-chip measurements was carried out in standard micro-well plate reactions, showing good correlation (Figure S4).
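The mixing scheme described above can be sketched as a small helper that holds the total spotted DNA concentration fixed and varies only the dsDNA fraction. This is a minimal illustration: the function name `spotting_mix` and the 200 nM total are hypothetical choices, and only the ∼100 nM saturation point comes from the text.

```python
SATURATION_NM = 100.0  # approximate saturation point reported in the text


def spotting_mix(ds_fraction, total_nM=200.0):
    """Return (dsDNA_nM, ssDNA_nM) for one spot at a fixed total concentration.

    total_nM is a hypothetical value chosen to lie above the saturation point.
    """
    if not 0.0 <= ds_fraction <= 1.0:
        raise ValueError("ds_fraction must lie between 0 and 1")
    if total_nM <= SATURATION_NM:
        raise ValueError("total concentration should exceed the saturation point")
    ds = ds_fraction * total_nM
    return ds, total_nM - ds


# Titrate the dsDNA template while keeping the surface saturated.
for frac in (0.0, 0.25, 0.5, 1.0):
    ds, ss = spotting_mix(frac)
    print(f"dsDNA fraction {frac:.2f}: {ds:.0f} nM dsDNA + {ss:.0f} nM ssDNA")
```

Because the total stays above saturation, the surface-bound dsDNA is expected to scale with the dsDNA fraction rather than with the absolute amount deposited.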
To demonstrate the high-throughput capabilities of our microfluidic chip we created and characterized a library based on the E. coli σ70 λPR promoter. We synthesized 124 promoter variants that covered all possible single base mutations within the −47 to −7 region of the λPR promoter (Figure 1F). Cell-free reactions for each promoter were run in 6 replicates on a single chip and yielded deGFP expression profiles revealing the impact of each mutation on protein expression (Figures 1F, S3A). As expected, mutations within the −10 and −35 boxes affected deGFP expression most strongly and the results are comparable to previous results obtained by an in vivo analysis of the lac promoter (Kinney et al. 2010).
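The size of this library follows from simple enumeration: the −47 to −7 region spans 41 positions, each with three alternative bases, giving 123 single-base mutants. Reading the count of 124 as these 123 mutants plus the unmutated promoter is an assumption rather than something stated in the text. The sketch below uses a placeholder sequence, not the actual λPR sequence.

```python
def single_base_variants(seq):
    """Yield (index, original_base, new_base, mutant_sequence) for every
    single-base substitution of seq."""
    for i, ref in enumerate(seq):
        for alt in "ACGT":
            if alt != ref:
                yield i, ref, alt, seq[:i] + alt + seq[i + 1:]


# Placeholder 41-nt region (hypothetical; NOT the real promoter sequence).
region = "ACGT" * 10 + "A"

variants = list(single_base_variants(region))
library = [region] + [mutant for _, _, _, mutant in variants]

print(len(region), len(variants), len(library))  # 41 123 124
```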
Protein synthesis eventually stops in cell-free batch reactions as seen in the saturation dynamics in time course measurements (Figure 1E); this is fundamentally different from cellular steady state protein levels which result from balancing production with degradation and dilution rates. In this paper we report end-point batch reaction values and derived quantities such as fold repression. It is thus important that the end-point values correspond to protein production rates. While the relationship between the initial rate of deGFP production and its final saturated level may be complex, we observe a linear relationship between the two quantities under our experimental conditions (Figure S5). This is an important validation of our use of end-point protein levels and linearly derived quantities such as fold repression as proxies for synthesis rates and their ratios.
B. Zinc-finger repressor and promoter library design
Using the characterization of the λPR promoter as a starting point, we applied our chip to the in-depth characterization of synthetic ZFs for use as transcriptional repressors. We adopted a ZF design based on Zif268, a three-finger Cys2His2 protein. A large ZF repressor library can be generated by combinatorially shuffling a small number of individual ZF domains (Figure S6A). We utilized ZF proteins drawn from a 64-member library that we previously synthesized and characterized (Blackburn et al. 2015) (Figure S6).
The affinity of a ZF repressor to DNA can be improved by increasing the number of finger domains (Kamiuchi et al. 1998, Kim and Pabo 1998, Moore et al. 2001, Pomerantz et al. 1998). The same effect can also be achieved by engineering dimerizing ZFs that bind cooperatively. An early example used structure-based design to engineer a two-finger ZF which dimerized via a LZ motif to form a four-finger complex (Wolfe et al. 2003, 2000). Three-finger ZFs have also been dimerized using PDZ domains (Khalil et al. 2012). Cooperative interactions are of interest because they potentially increase the nonlinearity of regulation, as well as decreasing non-specific binding compared to extended arrays of ZFs. To study cooperative interactions we built several different ZFs fused to either PDZ or LZ domains (Figure S6B).
In parallel, we designed corresponding repressible promoter libraries. As we use an E. coli cell-free system (Sun et al. 2013), we based our promoter designs on the strong λPR promoter in combination with transcription and translation elements optimized for E. coli cell-free expression (Sun et al. 2014). Previous work has shown that the most effective position for transcriptional repression is the space between the −35 and −10 boxes (Cox III et al. 2007); we thus generated a library with consensus ZF binding sites (ZFBSs) inserted into this location. Additionally, we built promoters with a second ZFBS upstream of the −35 box, allowing us to study the effect of multiple non-cooperative and cooperative ZFBSs (Figure S6B). The promoters drive expression of a deGFP reporter, a GFP protein previously optimized for cell-free translation (Shin and Noireaux 2010). All constructs were built and tested using linear DNA templates generated by PCR in concordance with recommended guidelines for cell-free expression (Sun et al. 2014).
C. Repression with single and multiple binding sites
We performed an in-depth characterization of 11 synthetic ZFs by assessing their repressive capacity in cell-free reactions, and by measuring their respective dissociation constants (Kd) with MITOMI. We used MITOMI to measure the Kds for each ZF against all possible target promoters. By localizing pre-synthesized his-tagged ZFs to the surface of each unit cell we are able to measure the binding of DNA sequences spanning the promoter region including the ZF binding site (Figure 3A, S1B). We obtained standard Gibbs free energies, ΔG = RT ln(Kd), for each ZF – target promoter complex (Figure 3B). A range of binding strengths was observed for the respective consensus ZF binding sequences, as well as low affinity off-target binding. The CBD zinc finger was included as a negative control as it does not bind to its own predicted binding site nor any of the other targets.
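The conversion ΔG = RT ln(Kd) can be made concrete with a short helper. This is a minimal sketch that assumes Kd is expressed in molar units relative to a 1 M standard state and a temperature of 298.15 K; the temperature is an assumption, not a value given in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


def kd_from_delta_g(dg_kcal, temp_k=298.15):
    """Inverse conversion: dissociation constant in M from ΔG in kcal/mol."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


# A ten-fold change in Kd corresponds to RT*ln(10), roughly 1.36 kcal/mol.
print(round(delta_g(1e-9), 2))                  # 1 nM binder
print(round(delta_g(1e-8) - delta_g(1e-9), 2))  # ten-fold weaker site
```

Tighter binding (smaller Kd) gives a more negative ΔG under this sign convention.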
To test whether the relative binding strength of each ZF related to functional gene repression, we implemented cell-free TX-TL reactions screening the same matrix of ZFs versus promoters. Each microfluidic unit cell contained a linear template encoding the ZF to be tested and a second linear template encoding deGFP downstream of a promoter with a single ZF binding site (Figure 3C). Binding of the expressed ZF to the target promoter would lead to down-regulation of deGFP expression. A common measure of repression performance is fold repression, or the ratio of unrepressed to repressed expression levels. Unrepressed measurements were obtained by co-expressing the target promoter template with the nonbinding ZFCBD template to control for loading effects (Siegal-Gaskins et al. 2014). Despite some off-target binding observed by MITOMI, functional repression of all ZF – target pairs was almost perfectly orthogonal (Figure 3D), with one exception: the repression of promoter BDD by ZFADD. Overall, however, weak off-target affinities translated to no or minimal off-target repression, resulting in functional repression only for cognate pairs. Furthermore, on-target fold repression directly correlated with the measured MITOMI affinity values (Figure 3E). Using two high-throughput microfluidic techniques we were able to characterize the binding affinity, repressive strength, and orthogonality of synthetic transcription factor – promoter pairs.
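Fold repression and the orthogonality matrix can be sketched as follows. The repressor and promoter names (`ZF1`, `P1`, and so on) and all signal values are illustrative placeholders, not data from this study.

```python
def fold_repression(unrepressed, repressed):
    """Ratio of unrepressed to repressed end-point expression."""
    if unrepressed < 0 or repressed <= 0:
        raise ValueError("signals must be positive")
    return unrepressed / repressed


# Illustrative end-point signals (arbitrary units); hypothetical values.
# "Unrepressed" stands for co-expression with a non-binding control repressor.
unrepressed = {"P1": 4000.0, "P2": 3000.0}
repressed = {
    ("ZF1", "P1"): 800.0,   # cognate pair
    ("ZF1", "P2"): 2900.0,  # off-target pair
    ("ZF2", "P1"): 3900.0,  # off-target pair
    ("ZF2", "P2"): 500.0,   # cognate pair
}

fold = {
    (zf, promoter): fold_repression(unrepressed[promoter], signal)
    for (zf, promoter), signal in repressed.items()
}

for pair, value in sorted(fold.items()):
    print(pair, round(value, 2))
```

In an orthogonal set, only the diagonal (cognate) entries of such a matrix rise clearly above 1.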
Promoters with a single ZF binding site achieved low to medium fold repression levels in the range of 1.5 to 7 (Figure 4A). We tested whether placing an additional binding site upstream of the −35 box could further improve fold repression levels. While fold repression is a convenient measure used to describe the functionality of a given repressor – promoter pair, for applying these repressors in genetic networks it is important to also consider basal promoter strength (unrepressed state) and leak (repressed state). These quantities are also shown in Figure 4, where we observed that variation in binding site sequence led to variations in basal promoter strength; this variation increased upon inclusion of the second binding site upstream of the −35 box. At the same time, the average leak from the repressed state decreased for the dual site library, resulting in higher fold repression values. Overall, fold repression improved for almost all two-binding-site promoters, with the best promoters achieving a fold repression level of 7 – 10 (Figure 4B). These results showed that good repression levels can be achieved by synthetic ZF repressors with either single or double binding site promoters in a cell-free system.
Next we characterized the effect of binding site position on repression strength. We generated a library of promoters containing a single ZF binding site that was placed in various positions relative to the −35 box. Best fold repression was achieved by positioning binding sites directly proximal to the −35 box, in the range of −2 to +4 bps relative to the start and end of the −35 box, respectively. We also observe that repression is sensitive to single bp shifts in position. For instance, the site at the +5 position is effectively non-functional compared to repressing neighbouring sites at +4 and +6; and the site at the −5 position exhibited significantly stronger repression than its neighbours at −4 and −6. Based on the crystal structure alignment of ZF and RNA polymerase bound to DNA containing the binding site at position +5, we note that it is possible for both proteins to bind simultaneously with minimal steric interference. To ascertain that the observed repression strengths were not due to changes in binding site affinity of the ZF, as each binding site is located in a different sequence context, we measured the binding affinity of the ZF repressor to each promoter using MITOMI. The results showed only minor differences in affinity across all promoters, suggesting that the ZF repressor bound to these promoters with equal strength. Promoter repression thus appears to be primarily a function of the ability of the ZF to sterically hinder and compete with RNA polymerase. These data are consistent with an occlusion mechanism whereby RNAP binding is competitively inhibited by ZF binding (Ptashne et al. 1976), and the effectiveness of the competition is dependent on the relative positions of ZF and RNAP on the promoter.
D. Engineering cooperativity
We showed that incorporating a second binding site can result in improved fold repression. However, engineering certain types of genetic circuits often requires an additional increase in the nonlinear response as well as a decrease in the leak for a given promoter – TF pair. Nonlinearity can be increased by introducing cooperativity via protein – protein interactions. We implemented two different protein interaction domains previously demonstrated to successfully dimerize ZFs.
PDZ domains enable natural protein – protein interactions by binding specific C-terminal peptide sequences with micromolar affinity (Khalil et al. 2012). We took advantage of this interaction to engineer cooperativity by linking ZFBCB to a mammalian α1-syntrophin PDZ domain, and ZFADD to its corresponding cognate C-terminal peptide ligand (VKESLV). Furthermore, we linked ZFADD with a non-cognate ligand (VKEAAA) to use as a noncooperative control.
The second type of interaction we explored was dimerization by linking ZFBCB and ZFADD to GCN4 LZ domains. The GCN4 LZ has previously been used in a structure-based design to enable homodimerization of two-finger ZFs (Wolfe et al. 2000), and we thus also tested this existing structure. In both cases, a mutated LZ was used as a negative control.
Preliminary studies on a plate reader demonstrated that ZFs containing interaction domains exhibited significantly increased fold repression and decreased leak (Figure 5A, B). Whereas two non-cooperative repressors gave a maximum fold repression of ∼6, this value was increased to ∼30 for PDZ and ∼16 for LZ-mediated cooperativity. Concurrently, leak values decreased four-fold from around 4000 to <1000 RFUs. One critical parameter affecting PDZ cooperativity was the choice of linker, with an optimized glycine-serine linker vastly outperforming a rigid proline linker. The two-finger LZ transcriptional repressor also performed very well, achieving a fold repression ratio of ∼28.
To investigate cooperativity in more detail, we measured dose response curves by titrating repressor DNA concentration. To keep the load on the transcription-translation machinery constant, the total ZF DNA concentration was kept constant by adding DNA coding for a non-binding ZF control (ZFCBD). Figure 5C shows dose response curves of ZFBCB – PDZ and ZFADD – L separately, together with those for the cooperative pair: ZFBCB – PDZ + ZFADD – L, and the non-cooperative pair: ZFBCB – PDZ + ZFADD – NL. An increase in the steepness of the dose response curve was observed as we proceeded from a single ZF to two non-cooperatively interacting ZFs, and finally to two cooperatively interacting ZFs. Similar results were obtained for the LZ designs (Figure 5D, E). The effect of cooperativity can be quantified by determining the sensitivity (Figure S7), which measures the steepness of the dose response curve (Bintu et al. 2005b), as well as the effective Hill coefficient, which is obtained by fitting phenomenological Hill functions (Figure S8). The results of this analysis are shown in Table S1. We observe that cooperativity increased sensitivity by nearly 50% with respect to the non-cooperative repression, as well as slightly increasing the Hill coefficient.
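The two steepness measures can be illustrated with a phenomenological Hill repression function and a numerical log-log slope. This is a sketch under assumptions: the exact definition of sensitivity used for Figure S7 is not given here, so the local logarithmic slope d ln(y)/d ln(x) is used as one common choice, and all parameter values are hypothetical.

```python
import math


def hill_repression(x, k, n, y_max=1.0, y_min=0.0):
    """Phenomenological Hill function for a repressor dose x."""
    return y_min + (y_max - y_min) / (1.0 + (x / k) ** n)


def log_sensitivity(f, x, rel_step=1e-4):
    """Central-difference estimate of d ln f / d ln x at dose x."""
    hi, lo = x * (1.0 + rel_step), x * (1.0 - rel_step)
    return (math.log(f(hi)) - math.log(f(lo))) / (math.log(hi) - math.log(lo))


# Without leak, the slope at x = k equals -n/2, so a larger Hill coefficient
# gives a steeper dose response at the midpoint.
for n in (1.0, 2.0):
    slope = log_sensitivity(lambda x: hill_repression(x, 1.0, n), 1.0)
    print(n, round(slope, 3))
```

A non-zero `y_min` (leak) flattens the log-log slope at high repressor dose, which is one reason the two measures need not change by the same amount.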
We sought to understand this behaviour quantitatively by developing a thermodynamic model that relates protein expression to the equilibrium occupancy of the promoter by RNAP (Bintu et al. 2005). We extended the standard competitive model of repression to include a term for the interaction between repressor and RNAP, which is quantified by an effective interaction energy. As this energy tends to large positive values, DNA binding by either RNAP or the repressor is exclusive, and the model tends towards that of competitive inhibition. As the energy approaches zero, both RNAP and the repressor can bind simultaneously, resulting in leaky expression at full repressor occupancy. This extension to the model was motivated by our results that a ZF with a fixed binding affinity represses with varying efficiency depending on the position of the binding site; the changing RNAP-ZF interaction energy therefore provides a simple description of this effect. We fit the model to the dose response curves using Markov chain Monte Carlo (MCMC) sampling (Figure S9), allowing us to consistently extract the posterior probability distributions of all parameters, which consist of fixed effective dissociation constants of each individual ZF, as well as the effective energies describing ZF-RNAP and ZF-ZF interactions. The fits are shown in Figure 5C–E as solid lines and shading, which represent the mean and 2 SD boundaries for model predictions, respectively. The values of all fitted parameters are given in Table S2, and a full description of the model is given in the Methods section. We find physically sensible values for all our parameters; in particular, the cooperative interaction energies for PDZ-L (−2.1 ± 0.2 kcal/mol) and LZ (−1.8 ± 0.2 kcal/mol) are consistent with literature values for similar domains (∼ −2 to −10 kcal/mol (Jana et al. 2000, Saro et al. 2007)).
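The limiting behaviour described above can be reproduced with a four-state occupancy sketch for a single repressor. This is an illustrative reconstruction under stated assumptions, not the model from the Methods section: the state weights, the dimensionless concentrations `p` and `r`, and the value of kT are all choices made for this example, and the ZF-ZF cooperative term is omitted.

```python
import math

KT = 0.593  # kcal/mol near 25 °C (assumed temperature)


def rnap_occupancy(p, r, eps_rp):
    """Equilibrium probability that RNAP is bound to the promoter.

    p = [RNAP]/K_P and r = [ZF]/K_R are dimensionless concentrations;
    eps_rp is the RNAP-ZF interaction energy (positive = repulsive).
    States: empty (1), RNAP only (p), ZF only (r), both (p*r*w).
    """
    w = math.exp(-eps_rp / KT)
    z = 1.0 + p + r + p * r * w
    return (p + p * r * w) / z


def fold_repression(p, r, eps_rp):
    """Unrepressed over repressed RNAP occupancy."""
    return rnap_occupancy(p, 0.0, eps_rp) / rnap_occupancy(p, r, eps_rp)


# Large positive energy: mutually exclusive binding (competitive inhibition).
print(round(fold_repression(0.1, 10.0, 50.0), 3))
# Zero energy: independent binding, so the repressor no longer represses.
print(round(fold_repression(0.1, 10.0, 0.0), 3))
# Intermediate energy: repression saturates at a finite, leaky level.
print(round(fold_repression(0.1, 1e6, 1.0), 3))
```

In this sketch the maximal fold repression at saturating repressor is (1 + p·w)/(w·(1 + p)), which approaches exp(ε/kT) for a weak promoter, so a position-dependent interaction energy changes the repression ceiling without any change in binding affinity.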
Since the location of the ZF binding site, and hence the relative positioning of ZF and RNAP, is an important determinant of repression efficiency, it is likely that the relative positioning of the ZFBCB – PDZ and ZFADD – L binding sites would also determine their ability to interact and subsequently alter their repressive strength. Keeping the ZFBCB – PDZ binding site position fixed, we shifted the ZFADD – L binding site further and further upstream. If the two ZFs are positioned on the promoter such that the cooperative PDZ-ligand interaction is unfavorable, we would expect fold repression to be similar to that of the non-cooperative ZFs. In other words, the ratio between the cooperative and the noncooperative fold repression, a quantity we call the cooperativity ratio, should go to unity when the PDZ-ligand interaction cannot occur.
We observed an effect due to this variation of spacing between the two binding sites (Figure 5F), and this behavior corresponded to the relative orientation of the PDZ-ligand domains. As the binding site is shifted, ZFADD – L rotates around the DNA, modulating its alignment with ZFBCB – PDZ. The cooperativity ratio fell to 1 when the interaction was unfavorably aligned, but increased again as the domains began to realign (Figure 5G). The cartoon in Figure 5H shows the predicted orientations of the two ZFs as the left-hand site is shifted. The ability of the ZFs to interact over distances of a few tens of bp is likely due to extension of the long flexible glycine-serine linker used to join the ZFBCB and the PDZ domain. It is unlikely that DNA bending plays a significant role at these distances, due to dsDNA’s much longer persistence length of ∼150 bp.
We incorporated into our model a phenomenological exponential decay of interaction energies with distance, both between the two ZFs as well as between the ZF and the RNAP. Additionally, the ZF-ZF interaction energy was modulated by a periodic function at the frequency of the DNA helical pitch (10.5 bp/turn). Using previously inferred parameters for energies and Kds from the dose response measurements, we performed a fit to determine the decay constant and phase shift; the results are shown as solid lines and shading in Figure 5F and G, and in Table S2. Fitting a model with an explicit position dependence for the binding sites illustrates the importance of site positioning for functional repression. More generally, while simplistic, our model fits demonstrate that it is possible to understand cell-free gene expression in terms of thermodynamic occupancy.
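A distance dependence of this kind can be sketched as an exponential envelope multiplied by a helical alignment term. The functional form below (a raised cosine for the alignment) and all parameter names are assumptions made for illustration; only the 10.5 bp/turn pitch comes from the text, and the fitted form in the Methods may differ.

```python
import math

HELICAL_PITCH_BP = 10.5  # bp per helical turn, as stated in the text


def zf_zf_energy(spacing_bp, eps0, decay_bp, phase=0.0):
    """Hypothetical ZF-ZF interaction energy (kcal/mol) versus site spacing.

    eps0     : interaction energy at zero spacing and perfect alignment
    decay_bp : decay constant of the exponential envelope, in bp
    phase    : phase shift of the helical modulation, in radians
    """
    envelope = math.exp(-spacing_bp / decay_bp)
    angle = 2.0 * math.pi * spacing_bp / HELICAL_PITCH_BP + phase
    alignment = 0.5 * (1.0 + math.cos(angle))  # 1 = aligned, 0 = opposite faces
    return eps0 * envelope * alignment


# The energy vanishes at half-turn spacings and partially recovers after a
# full turn, mirroring a cooperativity ratio that falls to 1 and rises again.
for spacing in (0.0, 5.25, 10.5, 15.75, 21.0):
    print(spacing, round(zf_zf_energy(spacing, eps0=-2.0, decay_bp=30.0), 3))
```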
E. Affinity tuning
To test whether fold repression levels could be precisely and predictively tuned, we investigated the effect of varying binding site affinity. To rationally tune binding site affinity, we first generated quantitative PWMs for three ZFs: ZFBCB, ZFAAA and ZFADD, covering the nine bp core sequence plus three flanking bases on either side (Figure 6A, S10A, B). The sequence logo determined for ZFAAA is in concordance with the consensus sequence determined by bacterial one-hybrid and in vitro SELEX assays (Meng et al. 2005, Wolfe et al. 1999). Based on our PWMs we designed a library of promoters that included a single binding site at a fixed position between the −35 and −10 boxes, with single or double mutations within or outside the core binding sequence. As binding site affinity decreased we observed corresponding decreases in fold repression for all ZFs tested (Figure 6B, S10C). By converting our macroscopically measured ΔG values into microscopic interaction energies Δε we found that the fold repression data could be described by the thermodynamic model presented in the previous section.
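Using a quantitative PWM to predict the affinity of a mutated site can be sketched with an additive energy model, in which each position contributes independently to ΔΔG. The three-position matrix below is a toy example with hypothetical values, not a measured PWM, and additivity across positions is an assumption.

```python
import math

RT = 0.593  # kcal/mol near 25 °C (assumed temperature)

# Toy energy matrix: ΔΔG (kcal/mol) relative to the consensus base at each
# position. All values are hypothetical and for illustration only.
PENALTIES = [
    {"A": 0.0, "C": 1.2, "G": 0.9, "T": 1.5},  # high information content
    {"A": 0.3, "C": 0.2, "G": 0.0, "T": 0.4},  # low information content
    {"A": 1.0, "C": 0.0, "G": 1.1, "T": 0.8},  # high information content
]


def site_ddg(site, penalties=PENALTIES):
    """Sum the per-position ΔΔG contributions for a candidate binding site."""
    if len(site) != len(penalties):
        raise ValueError("site length must match the matrix length")
    return sum(penalties[i][base] for i, base in enumerate(site))


def relative_kd(ddg):
    """Fold change in Kd relative to the consensus site."""
    return math.exp(ddg / RT)


for site in ("AGC", "ATC", "CGC", "CGA"):
    ddg = site_ddg(site)
    print(site, round(ddg, 2), round(relative_kd(ddg), 1))
```

Under this picture, a mutation at a low information content position gives a small ΔΔG and hence fine tuning, whereas a mutation at a high information content position shifts Kd several-fold.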
Mutating either a single base outside the core site, or one core position of low information content (high entropy), enabled fine tuning of fold repression, whereas a single mutation in the core site of high information content strongly decreased fold repression. Two core mutations decreased fold repression to baseline levels. Fold repression was therefore precisely tuneable over the entire dynamic range by modulating binding site affinity, and the affinity changes required to achieve tuning were relatively small. Affinity changes of ∼ 0.5 to 1 kcal/mol were sufficient to cover the entire dynamic range for each ZF repressor tested. The results are in line with previous findings that promoter tuning in S. cerevisiae can be accomplished by relatively subtle affinity changes in a single binding site created by mutations in flanking or single core site mutations of high entropy (Rajkumar et al. 2013). They also correspond to recent results obtained in E. coli (Barnes et al. 2018).
Given that a single ZF binding site could be mutated to yield varying levels of repression we investigated whether the same tuning could be applied to cooperative ZFs. We measured the binding affinity of the ZFAA – GCN homodimer versus a library of DNA targets that consisted of all single point mutations for the 10 bp core binding sequence plus 2 flanking bases on either side. The resulting sequence logo and PWM reveal the symmetric binding profile of the homodimer (Figure 6C). Mutating a single binding site between the −35 and −10 boxes led to a change in repression levels that reflected the measured Kds for both the cooperative and non-cooperative ZFAA – GCN variants (Figure 6D). As the two 6 bp binding sequences overlap, mutating a single base within the core site leads to a finer tuning of fold repression in comparison with the three-finger ZFs. Furthermore, we extended binding site tuning to the ZFADD – L + ZFBCB – PDZ heterodimer pair, taking advantage of the PWMs generated for ZFBCB and ZFADD. Implementing a subset of mutations to each ZF binding site yielded a range of fold repression values not only for the single ZF but also for the cooperative and non-cooperative ZF pairs (Figure 6E). As the affinity of one ZF is reduced we see that the fold repression observed for the cooperative and non-cooperative cases tends to the fold repression measured for the second ZF whose binding site remains constant.
F. Logic gate construction
Having established a well-characterized resource of transcriptional repressors and promoters, we applied them to designing logic gates. By combining two cooperative ZF repressors on a single promoter we were able to create NAND gates, which are of particular interest as they are functionally complete. An effective NAND gate should have low output only when both inputs are present (Figure 7A). We therefore placed the binding site for a strongly binding ZF (ZFBCB) 2 bp upstream of the −35 box, and a second binding site for different ZFs between the −35 and −10 boxes. ZFBCB cannot strongly repress by itself at the −2 position, and the second ZF should also not strongly repress on its own. Only when both ZFs are bound to the promoter should they strongly repress, which can be achieved by including a cooperative interaction between the two ZFs. Using this general design we tested NAND gates for ZFBCB – PDZ in combination with the remaining ZFs (Figure 7B). As expected, NAND gate performance improved as the affinity of the ZFXXX – L decreased. For instance, the combination of ZFBCB – PDZ and ZFBDD – L gave rise to a functional NAND gate, whereas a combination with ZFAAA – L did not, due to the high affinity of ZFAAA – L, which led to functional repression even when only ZFAAA – L was present.
Since we showed that binding affinity could be precisely tuned (Figure 6) we tested whether we could improve our non-functional NAND gates. Based on the PWM measured for ZFAAA we mutated the ZFAAA – L binding site sequence in the NAND gate promoter and showed that we could achieve tuning in this context as well (Figure S10D). We then investigated the effect of tuning the ZFAAA – L binding site for all possible input combinations and showed that the NAND gate improved as we weakened ZFAAA – L binding affinity (Figure 7C). Mutations +1C and +1A gave rise to functional NAND gates. Decreasing the binding site affinity increased the output when only ZFAAA – L was present; however, when the mutation resulted in a ΔΔG of greater than ∼ 0.5 kcal/mol (Δ2T), the cooperative binding output also suffered. Our synthetic ZF repressors can thus be used to build functional NAND gates, which can additionally be rationally optimized and precisely tuned by modifying binding site affinities.
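The trade-off described above can be illustrated with a generic four-state thermodynamic occupancy model. This is a simplified sketch and not necessarily the model fitted to the data in this work: it assumes that only occupancy of the site between the −35 and −10 boxes blocks transcription, that the upstream site does not repress by itself, and that all parameter values (scaled concentrations and the cooperativity factor omega) are hypothetical.

```python
def promoter_output(a, b, omega):
    """Relative promoter activity for a promoter with two repressor sites.

    a: [TF_A]/Kd_A for the upstream site (0 when TF_A is absent)
    b: [TF_B]/Kd_B for the site between the -35 and -10 boxes (0 when absent)
    omega: cooperativity factor for simultaneous binding (hypothetical value)
    Assumption: states with site B occupied produce no output.
    """
    active = 1.0 + a                      # empty promoter and A-only state
    total = 1.0 + a + b + omega * a * b   # all four occupancy states
    return active / total


def nand_table(a, b, omega):
    """Output for every input combination; an input of 0 means that TF is absent."""
    return {(in_a, in_b): promoter_output(a * in_a, b * in_b, omega)
            for in_a in (0, 1) for in_b in (0, 1)}


# Hypothetical parameters: a weak versus a strong binding site for the second TF.
for label, b in (("weak site B", 0.2), ("strong site B", 5.0)):
    table = nand_table(a=10.0, b=b, omega=50.0)
    row = ", ".join(f"{key}: {value:.2f}" for key, value in table.items())
    print(f"{label}: {row}")
```

In this sketch a strong second site represses on its own and breaks NAND behavior, a moderately weakened site restores high output for single inputs while cooperativity still gives low output when both repressors are present, and weakening the site much further would also raise the output of the doubly bound state.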
As a final example, we generated compound logic gates by combining NAND and NOT logic gates as linear cascades in order to create AND and OR gates. We created an AND gate by appending a NOT gate to the output of a NAND gate (Figure 7D). Specifically, we combined the ZFBDD – L - ZFBCB – PDZ NAND gate with four different ZFs. Each AND gate was tested and yielded the expected outputs (Figure 7E). We then generated OR logic gates by placing two NOT gates in front of different NAND gates to invert the inputs (Figure 7F). We used ZFADB and ZFBAB as the two NOT gate inverters and a set of NAND gates, all of which gave rise to functional OR gates (Figure 7G).
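The logic behind these two cascades can be checked with idealized Boolean functions. The sketch below only verifies the truth tables of the compositions (AND as a NOT gate applied to the NAND output, OR as a NAND gate fed by two inverted inputs); it does not model expression levels, and the function names are chosen here for illustration.

```python
from itertools import product


def not_gate(x):
    return not x


def nand_gate(a, b):
    return not (a and b)


def and_gate(a, b):
    """AND built by appending a NOT gate to the output of a NAND gate."""
    return not_gate(nand_gate(a, b))


def or_gate(a, b):
    """OR built by inverting both inputs before a NAND gate."""
    return nand_gate(not_gate(a), not_gate(b))


# Print the full truth table for both compound gates.
for a, b in product((False, True), repeat=2):
    print(f"A={a!s:5} B={b!s:5} AND={and_gate(a, b)!s:5} OR={or_gate(a, b)!s:5}")
```

The OR construction is an instance of De Morgan's law, which is why inverting both inputs of a NAND gate is sufficient.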
DISCUSSION
GRNs are of central importance in both native and engineered systems. They integrate, compute, and transduce input signals, leading to specific changes in gene expression. Many components contribute to the function of GRNs, and transcription factors and their interaction with promoters are core players. Due to the complexity of even a single transcription factor – promoter interaction it has proven difficult to quantitatively study these systems in vitro or in vivo. Although the development of new technologies is steadily enabling progress in this area, our understanding of GRNs remains limited, as exemplified by our inability to predict in vivo gene expression levels in essentially any organism, and the difficulty associated with de novo engineering of GRNs. Although methods exist for high-throughput in vitro characterization of transcription factor binding specificities (Bulyk et al. 2001, Jung et al. 2018, Maerkl and Quake 2007, Zhao et al. 2009) and medium to high-throughput approaches are used to understand gene regulation in vivo (Barnes et al. 2018, Mogno et al. 2013, Rajkumar et al. 2013, Sharon et al. 2012), both approaches have limitations. Both an advantage and disadvantage of in vitro methods is that they generally include only the smallest number of components necessary, i.e. a transcription factor, dsDNA target and a defined buffer solution. In vivo methods, on the other hand, are confounded by cellular complexity. Furthermore, generating and analyzing defined libraries in vivo remains labor intensive and difficult. Here we explored the use of a cell-free transcription-translation system to build and characterize GRNs in an environment that bridges the gap between in vitro and in vivo methods. This cell-free approach also has the advantage of allowing complex assays to be performed in high-throughput, in a well-controlled and accessible environment. As a consequence, the ability to study functional transcriptional regulation in an in vitro system has allowed us to delve into much greater depth than comparable in vivo methods have been able to achieve (Amit et al. 2011, Garcia and Phillips 2011, Rajkumar et al. 2013).
We chose to build GRNs from the bottom up using ZF transcription factors for several reasons. First, with regard to GRN engineering, researchers have long been hampered by the relatively small number and poor characterization of available transcriptional regulators. Khalil et al. have previously engineered ZF regulators, showing that they are viable tunable transcriptional regulators in vivo (Khalil et al. 2012). We built on this concept, generating additional ZF regulators and interaction domains. More importantly, we quantified the binding energy landscapes of several synthetic ZF regulators and were able to show that repression can be precisely tuned with small changes in affinity. These small changes were achieved by mutating the flanking bases lying outside of the consensus core sequence or by mutating one consensus core base of low information content. Hitherto, only coarse tuning has been accomplished, by varying the number of consensus sequence binding sites, which leads to rather large differences in output (Khalil et al. 2012, Lohmueller et al. 2012). The ability to predictively and precisely tune expression levels as demonstrated here is important in engineered GRNs where individual nodes of the network need to be matched in expression levels. For example, we show here that the ability to precisely adjust individual binding site affinities is crucially important for optimizing logic gate function.
With the advent of TALEs and dCas9, ZFs might be considered outdated technology, but there are a number of reasons why ZF TFs remain an appealing tool for GRN engineering. ZFs have several advantages such as small size, relatively easy gene synthesis, and good expressibility. The biggest advantage of dCas9 and TALEs is their programmability, allowing them to be precisely targeted to any DNA sequence. Conversely, for ZFs it remains relatively difficult to rationally design a particular binding site preference. For genome editing and in vivo targeting approaches, in which the target sequence is defined and immutable, programmability is crucial. In the context of bottom-up GRN design, this ability becomes less important as target sequences can be easily adjusted to a particular TF specificity. We argue that it is actually more important to be in possession of a well-characterized TF binding energy landscape, which can be obtained for ZF TFs using current methods (Blackburn et al. 2015).
A second argument in support of using ZF transcription factors over TALEs and dCas9 is the simple but important fact that ZFs are native transcriptional regulators and the most abundant class of transcriptional regulators in vivo. Cas9, to the best of our knowledge, has not been shown to be involved in gene regulation in native systems, while TALEs are injected into plant host cells by pathogenic bacteria to modulate gene expression (Boch et al. 2009). If cell-free approaches are to be used to understand the function of native systems, it is important to build GRNs with native transcription factors. For example, the protein – DNA interaction kinetics are very different in that dCas9 (Boyle et al. 2017) and TALE (Cuculis et al. 2016) tend to have very slow DNA dissociation rates, while native transcriptional regulators have fast dissociation rates (Geertz et al. 2012), which may make engineering dynamic GRNs using TALEs and dCas9 difficult.
In order to improve fold repression and to add more control over the system we engineered cooperative binding into our ZF TFs by including PDZ or LZ protein – protein interaction domains. These interactions improved repression from ∼10 to up to ∼30 fold and were functional for both two- and three-finger ZFs. We showed that the relative placement of binding sites for two cooperative TFs is a major determinant of interaction capacity and consequently repression strength. Repression was achieved when the TFs were located on the same face of the DNA, and repression strength followed the helical twist of DNA. Cooperative interactions consequently allowed us to engineer functionally complete NAND gates. In all cases we were able to explain our data with thermodynamic models. Combining these models with binding energy landscapes thus provides a viable and useful approach to rationally engineer GRNs.
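The dependence of cooperative repression on binding-site placement can be illustrated with a simple helical phasing calculation. The sketch below assumes a B-DNA helical repeat of about 10.5 bp per turn, a textbook value rather than a parameter fitted in this work, and uses the cosine of the rotation angle as a hypothetical alignment score; the name face_alignment is introduced here for illustration only.

```python
import math

HELICAL_REPEAT_BP = 10.5  # assumed B-DNA helical repeat in bp per turn


def face_alignment(spacing_shift_bp, repeat_bp=HELICAL_REPEAT_BP):
    """Cosine of the rotation introduced by shifting one binding site.

    spacing_shift_bp is measured relative to a reference spacing at which both
    sites lie on the same face of the DNA. Values near +1 indicate the same
    face, values near -1 indicate opposite faces.
    """
    return math.cos(2.0 * math.pi * spacing_shift_bp / repeat_bp)


# Scan one helical turn in single base pair steps.
for shift in range(0, 12):
    print(f"shift {shift:2d} bp -> alignment {face_alignment(shift):+.2f}")
```

Under this assumption, shifting a site by about 5 bp places the two TFs on opposite faces of the helix, and a shift of about 10 to 11 bp restores alignment, in line with repression strength that follows the helical twist of DNA.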
One outstanding problem encountered during this study is the issue of composability. Although transcription factor binding sites were only introduced in regions outside the −10 and −35 boxes of the original λPR promoter, many of the synthetic promoters had considerably different baseline (non-repressed) expression levels. In the future it will clearly be important to better understand and predict basal promoter strength from the underlying sequence, which would lead to models that allow introduction of transcription factor binding sites without affecting basal promoter output. Here we have seen that basal promoter strength itself can be finely tuned over a relatively large range of expression levels (Figure 1). It should therefore be possible to adjust promoter strength as desired: we demonstrate a basic example of this idea by tuning the basal expression level of a repressible promoter (Figure S11). Ultimately, understanding the outcome of multiple base changes in close context with each other remains a complex issue. Evaluating a greater number of sequences and systematically addressing all factors affecting transcription efficiency, similar to the approach taken by Cambray et al. towards translation, could lead to an improved understanding of promoter sequence design principles (Cambray et al. 2018).
In order to characterize and measure our synthetic ZF transcription factors and promoters in detail we repurposed a high-throughput microfluidic device that allowed us to measure 768 cell-free reactions in parallel. Eliminating cloning and transformation steps by relying on PCR-based assembly strategies allowed us to measure a large number of defined transcription factor and promoter variants. Over 13,000 on-chip cell-free TX-TL reactions were performed, encompassing replicates for ∼2000 unique reactions. We furthermore took over 8000 MITOMI measurements to provide binding energy landscapes for 4 synthetic ZF transcription factors. Together, these technologies allowed us to establish a quantitative and in-depth dataset and insights into transcriptional regulation that should be of general interest. The approach taken here nonetheless does not per se require these state-of-the-art technologies, and is easily transferable to standard lab equipment. Cell-free lysate can now be easily and cheaply generated, yielding sufficient material so that medium-scale screens in 384-well plates are feasible (Sun et al. 2013). Commercial liquid handling equipment can also be used to scale up throughput. Binding energy landscapes can be generated by many approaches including PBMs (Bulyk et al. 2001), MITOMI (Maerkl and Quake 2009), SELEX-seq (Zhao et al. 2009), and HiP-FA (Jung et al. 2018). While our binding energy landscapes are based on direct affinity measurements, it may be sufficient to use PWMs from indirect measurements as found in other high-throughput techniques.
Rapid progress is being made in the development and application of cell-free synthetic biology. Cell-free systems are being used to tackle fundamental problems in molecular engineering, are being applied to molecular diagnostics (Pardee et al. 2016), therapeutics (Pardee et al. 2016b), and synthesis (Goering et al. 2016), and are even being used for educational purposes (Stark et al. 2018). Cell-free systems are an appealing alternative to cellular systems, as they eliminate many of the complexities associated with working with cells. Cell-free systems are also a rapid prototyping platform for engineering molecular systems destined to be applied in cellular hosts (Niederholtmeyer et al. 2015). As engineered systems become more complex it will become increasingly important that a large number of standardized characterized components become available. It will be equally important to develop a comprehensive mechanistic understanding of these components and systems to allow parts to be standardized and rationally assembled without requiring extensive trial-and-error cycles or large screens, which may not be feasible for large systems. As work progresses on cellular sub-systems such as gene regulation, DNA replication, ribosome biogenesis, metabolic networks, and membrane and protein super-structures, it will be intriguing to contemplate whether it may be possible to integrate these individual systems to create a synthetic cell or cell-like mimic. Work in this area will not only provide tools and methods aiding engineering of synthetic systems, but is likely to provide insights into the function of native systems as well. Prior to being used as tools for protein synthesis and synthetic biology, cell-free systems already had a rich history in deciphering fundamental aspects of biochemistry including DNA replication (Fuller and Kornberg 1983) and the genetic code (Nirenberg and Matthaei 1961). It is likely that they will continue to provide fundamental insights into complex systems such as transcriptional regulation.
AUTHOR CONTRIBUTIONS
Z.S. and N.L. performed experiments. Z.S., N.L. and S.J.M. designed experiments, analyzed data and wrote the manuscript.
DECLARATION OF INTERESTS
The authors declare no competing interests.
ACKNOWLEDGEMENTS
We thank Samuel Clamons and Miki Yun from the Murray lab (Caltech) for providing the cell-free transcription-translation extract and Samuel Clamons and Richard Murray for helpful discussions. We also thank Malek Kabani, Eugenia Pankevich, and Stefan Bassler for their experimental contributions to this project. This work was supported by an HFSP Program Grant (RGP0032/2015) and the École Polytechnique Fédérale de Lausanne.