Abstract
Recently, a synthetic circuit in E. coli demonstrated that two proteins engineered with LAA tags targeted to the native protease ClpXP are susceptible to crosstalk due to competition for degradation between proteins. To understand proteolytic crosstalk beyond the single protease regime, we investigated in E. coli a set of synthetic circuits designed to probe the dynamics of existing and novel degradation tags fused to fluorescent proteins. These circuits were tested using both microplate reader and single-cell assays. We first quantified the degradation rates of each tag in isolation. We then tested if there was crosstalk between two distinguishable fluorescent proteins engineered with identical or different degradation tags. We demonstrated that proteolytic crosstalk was indeed not limited to the LAA degradation tag, but was also apparent between other diverse tags, supporting the complexity of the E. coli protein degradation system.
Introduction
Proper cell behavior is maintained using a finite pool of processing resources, such as the limited pool of enzymes required for gene transcription and protein translation1-2. Natural biological circuits are largely thought to have evolved to buffer against the effects of limited resources, but we are beginning to understand how processing machinery can form a bottleneck that is in fact leveraged as a control or signaling mechanism3-4. Proteolytic (protein degrading) pathways, in particular, have been found to form functional bottlenecks in a native E. coli network regulating the stationary phase sigma factor S (σS). The protein σS is degraded by the ClpXP proteolysis system (ClpXP protease and its chaperones) much faster during exponential growth phase5-6 than stationary phase7, and the corresponding buildup of σS in stationary phase acts as a signal triggering the stress response system for starvation8. An explanation for increased stability of σS in stationary phase is that there are an increased number of mistranslated proteins targeted for degradation by ClpXP. Mistranslated proteins are targeted to ClpXP because they have a C-terminal SsrA tag (sometimes labeled LAA tag) due to a special transfer-messenger RNA (tmRNA) being added to mRNA to flag peptides for degradation9-10. These proteins compete for a limited number of proteases, especially ClpXP, which results in the formation of a “queue” of substrates for the protease that increases the apparent half-life of σS8.
The complexity of natural proteolytic pathways serves as a barrier to understanding this phenomenon, and so synthetic circuits offer a valuable alternative approach11. It was predicted based on the theoretical modeling of a synthetic oscillator that overexpression of LAA-tagged proteins could lead to saturation of proteolytic machinery12-13, i.e. that proteolytic machinery could be limiting. Recently, a synthetic circuit more directly demonstrated that two distinguishable fluorescent proteins engineered with ClpXP-targeting LAA-tags can lead the formation of a queue that resulted in crosstalk, such that the buildup of one protein can increase the concentration of another (Fig. 1A)8. Queueing theory has since been adopted to describe how competition between substrates for protease can lead to pronounced coupling and statistical correlation8, 14-16. The impact of proteolytic queueing competition leads to a rewiring of natural and synthetic circuits to include mutual modulation of substrate degradation rates17. This, in particular, applies to all but one existing bacterial synthetic oscillator, which target multiple species of protein to a common protease ClpXP18. One exception is the recently modified repressilator19, where active degradation by protease was systematically removed to produce a more robust growth-dependent (dilution-dependent) oscillator, which interestingly was predicted based on a prior analysis of proteolytic competition17.
The single protease crosstalk picture is too simplistic for native circuits, and the reliance on a single degradation pathway for bacterial synthetic oscillators presents a scalability problem that limits the complexity of circuits that can be developed. To address this issue, we investigated the crosstalk between multiple native degradation pathways in E. coli. This study extends a prior investigation of computational models that suggested a multi-protease proteolytic bottleneck may still contribute substantially to crosstalk in simple and complex (oscillatory) networks20. The influence of crosstalk in multi-protease models was evident even between substrates with substantially different affinities for a protease. Crosstalk may then be generic and complex in native networks. In particular, the different proteolytic networks of E. coli may exhibit strong mutual crosstalk, as only three proteases (Lon, ClpXP, and ClpAP, see Table S1) together are required to account for approximately 70%–80% of ATP-dependent degradation in bacterial cells21-22. These few proteases thus establish the bulk of proteolytic degradation bandwidth that is shared among a diverse set of actively degraded proteins (about 20% of newly synthesized polypeptides are degraded23). This fact combined with the phenomenon of queueing coupling suggests that crosstalk between diverse cellular networks may be typical and cannot be easily relieved by the limited number of proteolytic pathways.
In this work, we used a synthetic biology approach to understand crosstalk between the proteolytic systems of E. coli. We designed several synthetic genes that produce fluorescent proteins with protease-targeting tags to serve as probes for and indicators of crosstalk between different protease queues. Using the same E. coli strain, we systematically investigated these genes in isolation and in combination. The tags that resulted in substantial degradation in isolation were further investigated for proteolytic crosstalk by co-expressing fluorescent proteins with specific tags. We observed that there is often measurable crosstalk when two fluorescent proteins engineered with identical degradation tags were targeted to the same protease, which is as expected based on the queueing analogy. We also identified a range of crosstalk strengths, ranging from undetectable to high, when proteins are selectively targeted to different protease pathways.
Results and Discussion
Multiple synthetic amino acid sequence tags target fluorescent proteins for degradation by different proteases
To study proteolytic degradation by different proteases, fluorescent proteins were engineered with different potential degradation tags on either the N-terminus or C-terminus (Table S2). Previously tested degradation tags could not be compared directly because they were characterized in different E. coli strains and under different conditions. To our knowledge, this is the first systematic investigation of multiple degradation tags in E. coli. We tested the previously determined degradation tags and several newly designed tags (Table 1A-B). Degradation tags were fused to multiple fluorescent proteins and tested for activity using a high-throughput in vivo microplate reader assay. All putative degradation tags tested targeted YFP for degradation (Fig. 2A and Fig. S1). HipBc20, MazE, SoxSn20, RepA15, and HipB were identified as poor degradation tags. In all upcoming experiments, only one poor degradation tag with a medium degradation rate, HipB, was tested for proteolytic crosstalk with other tags. All other tags (LAA, RepA70, MarA, and MarAn20) except SoxS were tested for proteolytic crosstalk.
The main proteases of an E. coli cell can be overloaded when proteins are targeted to a single protease via engineered degradation tags
Previous researchers utilized LAA tagged proteins to demonstrate that proteolytic queues form at ClpXP in the overloaded state 8. We recreated their results using our own circuits (Fig. 3B). We then explored if other degradation tags with presumably different affinities (Fig. 1A) would result in the formation of queues and if queues could form with proteases other than ClpXP. YFP and CFP proteins were engineered with RepA70 tags for degradation by ClpAP. The fluorescence levels of RepA70-YFP were monitored as we increase the level of RepA70-CFP proteins by adding the chemical inducer IPTG. The fluorescence of RepA70-YFP increased as more RepA70-CFP was produced (Fig. 3C). This indicated that ClpAP protease can be overloaded, and a proteolytic-queue forms, similar to what was observed with the LAA tagged proteins targeted to ClpXP8. We also tested two other tags, MarA and MarA20 (20 amino acids from the N-terminal of MarA), which target proteins to be degraded by the Lon protease. The Lon protease was weakly overloaded by MarA tagged proteins, but was overloaded more by MarAn20 tagged proteins (Fig. 3C-D). This made us wonder if Lon could be overloaded when both MarA and MarAn20 were co-produced. Indeed, this was the case (Fig. 3E).
The main proteases of E. coli can exhibit different levels of crosstalk depending on the degradation tags used
We have demonstrated that ClpXP, ClpAP, and Lon can be overloaded using two proteins engineered with identical degradation tags targeted for a specific protease. We have hypothesized that crosstalk between proteases may occur through shared information (Fig. 1B). To test this hypothesis in a synthetic system, we monitored the level of a fluorescent protein (YFP) targeted to one protease while producing another protein (CFP) targeted to a different protease. There was strong crosstalk when proteins with the LAA degradation tag (target to ClpXP) were co-produced with proteins with all other tags (RepA70, MarA, MarAn20, and HipB; Fig. 4A-D). Although ClpXP is the primary protease that degrades LAA-tagged proteins9, 24, several other proteases recognize this tag (Table S1). Perhaps due to the importance of removal of potentially harmful waste proteins, cells evolved to have “backup” (a cellular redundancy) proteolytic pathways to remove peptides even if ClpXP is overloaded. Although LAA-tagged proteins are often utilized in synthetic biology circuits18, our results suggest it is not an ideal candidate for orthogonal circuits that depend on a proteolytic pathway such as most oscillators. We have tested several other degradation tags that can be potentially used in future synthetic circuits. There was measurable crosstalk when proteins with RepA70 and MarA degradation tags were co-produced (Fig. 4E), but no detectable crosstalk when MarAn20 and RepA70 tagged proteins were co-produced (Fig. 4F). This suggests that MarAn20 and RepA70 may be useful for future synthetic circuits. All proteins with the degradation tags tested thus far were rapidly degraded. We tested a medium-degradation tag, HipB, and, interestingly, we observed no apparent crosstalk between proteins with HipB-tags and all other tagged tested tags (Fig. S2), except for the LAA tag (Fig. 4D).
Single-cell data from microscopy slides support the high-throughput microplate data
The high-throughput microplate batch data allowed us to get an overall picture of the queueing phenomena with a gradient of inducer concentrations, but it is an average measurement of 1000’s of cells. Analysis of single-cell images is an independent technique that is more sensitive than batch data. Single-cell analysis is less subjected to the averaging effects characteristic of batch (population-scale methods) and offers a level of discrete detection that is unobtainable with traditional microbiological techniques25. Single-cell data and microplate batch data had similar results when proteins were targeted to the same proteolytic pathways, although crosstalk is more apparent with the single-cell data, especially for the RepA70 tag (Fig. 5A). The single-cell data and microplate batch data also had similar results when proteins were targeted to different pathways (Fig. 5B and Fig. S3). The only minor difference detected by these methods was when proteins with RepA70 and MarA were co-produced. The batch data showed weak crosstalk (crosstalk was only detected at the highest Dox concentration; Fig. 4E), while the single-cell data showed no apparent crosstalk (Fig. 5B).
Conclusion
Using synthetic circuits, we demonstrated that crosstalk could arise between several different proteolytic pathways. We first characterized a collection of amino acid sequences (tags) that target proteins towards active degradation. Proteins fused to these tags were expressed in a common strain using identical promoters and identical plasmid origins, which allowed us to compare fairly the effectiveness of these tags as degradation signals. This initial characterization of molecular parts is itself of value to synthetic biology. We then co-expressed YFP and CFP with generally different tags to determine crosstalk between pathways. The LAA tag was particularly prone to exhibiting crosstalk with itself and other tags. Other tag combinations demonstrated a range of crosstalk, though not as strong as we measured with LAA. Since many current synthetic systems rely on proteins engineered with LAA tags (targeted to ClpXP)12, 16, 18, 26-35, our results strongly suggest that proteolytic crosstalk may be a major hurdle to scalability, indicating that new protease tags are required. Our select pairs of degradation tags with minimal to moderate crosstalk may have future applications in this direction, e.g. in the development of the first synthetic orthogonal to semi-orthogonal oscillators. Of course, tags with strong crosstalk may still be of value, as they could be used for more coordinated modules.
Our findings have implications for both the study of natural systems and the design of complex synthetic systems. In natural systems, cells may use transcriptional regulation and proteolytic coupling as a form of regulation based on the desired response. Transcriptional regulation allows for either a coupled or an uncoupled response. Proteolytic coupling has advantages over a transcriptional response; it may be a quicker mechanism that requires less energy. A transcriptional response to an overloaded protease requires transcription of RNA, production of protein, protein folding, and removal of excess protease after the overabundant substrates are removed. In contrast, proteolytic queueing utilizes other proteases already present in the cell, thus a subsequent response is quick and requires no additional energy to be effective nor to remove the protease later. This means cells can respond to misfolded proteins and the environmental conditions that cause an increase in faulty translation without significantly slowing growth. E. coli utilizes proteolytic queues for σS regulation8. Our recent stochastic models suggest that other natural systems such as toxin-antitoxin mechanism used to modulate bacterial persistence may also be augmented by proteolytic queues20.
Methods
Reagents
All of the reagents were reagent grade and purchased from Sigma-Aldrich, Fisher Scientific, or Thomas Scientific unless otherwise stated.
Strains and Plasmids
All strains were derived from E. coli DH5alphaZ1 (purchased from Dr. Rolf Lutz). All of the plasmid DNA sequences (in GenBank format) are provided in the supporting Information (Zip file 1). CFP and YFP gene derivatives were cloned downstream of the PLtetO of p31Cm (chloramphenicol 10 μg/ml)16 or downstream of the Plac/ara of p24Km (kanamycin 25 μg/ml). The plasmid p24Km was constructed by PCR amplification and cloning of a T1 terminator into KpnI-ClaI restriction sites of pZE24MCS (purchased from Dr. Rolf Lutz). The CFP and YFP gene derivatives (Table 1; Table S2) were purchased from ThermoFisher or GenScript. The cultures were grown in Lysogeny broth (LB).
Absorbance and fluorescence measurements with a microplate reader
Cells were passed (1:250 dilution) from an overnight culture or a culture stored at 4° C for less than two weeks into LB and antibiotics. Cells were grown at 37° C at a shaking rate of 250 rpm for 2-3 h, and then 100 μl cell culture was added to individual wells of a 96-Well Optical-Bottom Plates with Polymer Base (ThermoFisher) already containing 100 μl of regents (LB, antibiotics, and inducers). A Cytation 3 microplate reader (BioTek) was used to grow and monitor cells. Cell growth was measured at OD600 (Optical density at 600 nm). The excitation and emission (Ex/Em) used for YFP and CFP were 510/540 and 447/477 nm, respectively. The wavelengths for Ex/Em were empirically determined to minimize crosstalk between different fluorescent proteins. The background of the media (median absorbance and mean YFP fluorescence) was subtracted from the raw reads. The fluorescence values were compared at OD600nm ∼0.4 (mid-log growth phase) and then the fluorescent values were normalized by dividing by the OD. Four biological replicas were used to calculate the mean fluorescence and standard deviation. Bleed-through from one fluorescent channel into another was tested by two different methods. First, the potential bleed-through from the YFP channel into the CFP channel was tested by producing untagged YFP (the brightest YFP derivative) at different doxycycline (Dox) concentrations (inducing YFP) and monitoring CFP production. A similar test was done with only the untagged CFP protein (induced by isopropyl β-D-1-thiogalactopyranoside (IPTG) and arabinose). No apparent bleed-through was detected in the CFP channel when YFP was produced (Fig. S4A) and no apparent bleed-through was detected in the YFP channel when CFP was produced (Fig. S4B). Second, bleed-through was tested with E. coli strains carrying both untagged fluorescent proteins CFP and YFP. No apparent bleed-through was detected in the CFP channel when YFP was produced (Fig. S4C), and no apparent bleed-through was detected in the YFP channel when CFP was produced (Fig. S4D) despite carrying genes for both fluorescent proteins.
Single-cell snapshots
Cells were harvested from microplate reader wells at mid-log growth phase (near OD600nm 0.4) and single-cell snapshots were taken on a Nikon Ti microscope with CFP and YFP fluorescence cubes at 1000× magnification (a 100× objective coupled to additional 10× magnification). Phase-contrast images and fluorescence images were taken. All fluorescence images were taken with the same exposure (75 ms) and light intensity (6% solo-intensity) based on bleed-through tests. We empirically determined these settings had no apparent bleed-through (Fig. S5).
Analysis of Mean Fluorescence
To analyze single-cell snapshots, we used a custom pipeline for single-cell segmentation based on machine learning techniques. This process leveraged a FastRandomForest classifier trained and then applied using the Trainable Weka Segmentation tool in Fiji36. The classifier used phase contrast images (with pixel values normalized to a common mean and standard deviation) to identify individual cells that were in focus (Fig. S6), and the resulting segmented regions were saved as a separate collection of segmented images. Segmented images were then processed by custom Python 2.7 scripts using the SciPy and OpenCV packages. These scripts both measured single cell fluorescence in corresponding fluorescence images and also computed single cell statistics.
Model Fitting to Single Tag Microplate Data
Fluorescence data for the single tag constructs in microplate experiments were fit to a simple enzymatic degradation model in order to describe the data points by a mathematical function. We checked whether the measured YFP-tag fluorescence YT (for a range of tags T) for a given concentration [DOX] could be described by the steady-state of the ODE: i.e., with YT the solution to the equation In this model, we have rescaled time such that the dilution rate is 1.0, and we allow the apparent enzymatic velocity µT and molar constant KT to, in principle, have independent values for each tag. The parameters α and C0 characterize the production rate of protein for a given level of [DOX]. We assume that untagged YFP proteins are degraded with zero enzymatic velocity, i.e. μUntagged = 0. All data points, excepting data for YFP-RepA15, were jointly fit to this model using the scipy.optimize.minimize function from the SciPy library. Parameter values for one of our best fits are reported in SI Table S3, and the fit is displayed graphically in SI Fig. S1. It is worth noting that if fitted value for KT is large relative to typical values of YT, then the relevant degradation parameter is the first-order degradation rate constant, μT /KT, and only this ratio is likely very meaningful regarding model fit.
Supporting Information Captions
Fig. S1. Single degradation tag results with different levels of Dox. An in vivo microplate reader assay was used to determine the fluorescence of YFP proteins with and without degradation tags. Symbols (connected by dashed lines) represent the mean fluorescence from four biological replicates, and error bars are the calculated standard deviations from these replicates. Solid lines represent fits to a simple mathematical model (see Methods), with parameters given in Table S3. SoxSn20 tagged proteins deviated noticeably from our model fit, but were included in our fitting analysis. RepA15 tagged proteins were excluded from our model fit due to weak degradation and irregular response, though we still report the corresponding percent degradation in Fig. 2.
Fig. S2. Proteins engineered with the HipB tag and co-produced with (C) RepA70, (D) MarA, and (E) MarAn20 tagged proteins had no detectable crosstalk. YFP derivatives were expressed from the PLtetO promoter using the inducer Dox; while CFP derivatives were expressed from Plac/ara promoter using 0.5 mM IPTG (all experiments contained 1% arabinose). Each tag comparison is indicated by tag/tag with CFP being the first tagged protein and YFP being the second tagged protein. Four biological replicas were used to calculate the mean fluorescence and standard deviation from in vivo microplate reader batch data. FU: arbitrary fluorescence unit.
Fig. S3. Analysis of crosstalk using both single-cell and batch data with HipB-tagged proteins. See Fig. 5 and the Method section for details on acquisition and analysis of the single-cell data and microplate batch data.
Fig. S4. Test for bleed-through from one channel to another in in vivo microplate reader experiments. No apparent bleed-through was detected in the (A) CFP channel when YFP was produced and no apparent bleed-through was detected in the (B) YFP channel when CFP was produced. Bleed-through was tested with the E. coli strains carrying both untagged fluorescent proteins CFP and YFP. (C) No apparent bleed-through was detected in the CFP channel when YFP was produced. (D) No apparent bleed-through was detected in the YFP channel when CFP was produced. Dox was used to induce YFP expression and IPTG (with 1% arabinose) was used to induce CFP expression. Four biological replicas were used to calculate the mean fluorescence and standard deviation. FU: arbitrary fluorescence unit.
Fig. S5. Test for bleed-through from one channel to another in single-cell snapshot experiments. No apparent bleed-through was detected in the CFP channel when YFP was produced and no apparent bleed-through was detected in the YFP channel when CFP was produced. Bleed-through was tested with the E. coli strains carrying no fluorescent proteins and untagged fluorescent proteins CFP and YFP. When YFP was produced the CFP channel had no apparent change indicated; the CFP channel when YFP had similar mean fluorescence as the CFP channel with no fluorescent proteins (the mean values were within 1 SEM). The same was true when the CFP was produced and the YFP channel was monitored. Dox 200 ng/ml was used to induce YFP expression and 0.5 mM IPTG + 1% arabinose was used to induce CFP expression. Four biological replicas were used to calculate the mean fluorescence and standard deviation. FU: arbitrary fluorescence unit. Single-cell data was acquired by imaging cells at 1000X magnification using a fluorescent confocal microscope with an exposure of 75 ms and light intensity of 6% for both the CFP and YFP channel. Cells were identified and fluorescence levels were calculated using Fiji, machine learning, and in-house scripts (see Methods). The mean and SEM were calculated from 10-19 individual cells.
Fig. S6. Method for Single Cell Analysis. (a) To extract single cell fluorescence data, phase contrast images (shown) were taken in sequence with fluorescence images. Phase contrast images generally contained both cells in focus and out of focus. (b) Machine learning-based classification (see Methods) was applied to phase contrast images to identify individual cells that were in focus (cells indicated by colored regions). The regions in the image associated with individual cells were then used to measure fluorescence in corresponding images.
Acknowledgements
Funding for this research was provided by the 454 National Science Foundation under Grant No. MCB-1330180.