Meeting Measurement Precision Requirements for Effective Engineering of Genetic Regulatory Networks

Reliable, predictable engineering of cellular behavior is one of the key goals of synthetic biology. As the field matures, biological engineers will become increasingly reliant on computer models that allow for the rapid exploration of design space prior to the more costly construction and characterization of candidate designs. The efficacy of such models, however, depends on the accuracy of their predictions, the precision of the measurements used to parameterize the models, and the tolerance of biological devices for imperfections in modeling and measurement. To better understand this relationship, we have derived an Engineering Error Inequality that provides a quantitative mathematical bound on the relationship between predictability of results, model accuracy, measurement precision, and device characteristics. We apply this relation to estimate measurement precision requirements for engineering genetic regulatory networks given current model and device characteristics, recommending a target standard deviation of 1.5-fold. We then compare these requirements with the results of an interlaboratory study to validate that these requirements can be met via flow cytometry with matched instrument channels and an independent calibrant. Based on these results, we recommend a set of best practices for quality control of flow cytometry data and discuss how these might be extended to other measurement modalities and applied to support further development of genetic regulatory network engineering.


Introduction
In mature fields of engineering, most design work is carried out "virtually" using quantitative numerical models to efficiently explore design space. Physical prototypes, on the other hand, are primarily used to validate models and designs and to refine predictions, rather than as a part of the search for some viable design. Once the goals of an engineering project reach even a moderate degree of complexity, the modeling and simulation of a proposed design is much faster and cheaper than the construction of a physical prototype, but such models are only useful if they are sufficiently accurate to reliably predict whether the physical instantiation is likely to satisfy the engineering goals. For a synthetic biology project, the number of potentially viable designs rapidly exceeds the number of combinations that can be readily brute-forced with high-throughput screening methods 1 . Moreover, high-throughput screening methods are still effectively out of reach of all but the most well-financed labs, meaning that most synthetic biology practitioners are only able to build and test a small number of constructs in each iteration of their engineering cycle.
Quantitative predictive modeling should allow practitioners to more efficiently explore design space, increase the likelihood that a physical prototype meets design goals, and thus decrease the barriers to entry for synthetic biology engineering. The efficacy of such predictions involves three components: accurate biological models, precise measurements providing parameters for the models, and biological devices with some degree of tolerance for imperfection in modeling and measurements. Recent years have seen much improvement in all three of these areas, including models that provide good accuracy in predicting certain compositional behaviors of engineered genetic constructs (e.g., 2-10 ), protocols for reproducibly precise measurement of genetic construct behavior [11][12][13] , and families of biological regulatory devices with improved performance (e.g., 6,8,10,[14][15][16][17][18][19][20] ). Despite all of these advances, however, effective predictive engineering with genetic constructs remains extremely difficult and is currently done only under specific narrow circumstances.
This difficulty likely stems at least partly from persistent shortcomings in measurement reproducibility and precision. But just how reproducible do measurements actually need to be? Moreover, can we achieve a sufficient level of precision with available methods, and what best practices can help to ensure data actually attains the desired precision? To shed light on these questions, we derive a mathematical bound, which we call the Engineering Error Inequality, that quantifies the relationship between predictability of results, model accuracy, measurement precision, and device characteristics. We then apply this bound to estimate requirements for precision for measurements used to engineer genetic regulatory networks.
Finally, we use data from an interlaboratory study to validate that these requirements can be met via flow cytometry with the aid of matched instrument channels and an independent calibrant. Based on these results, we recommend a set of best practices for quality control of flow cytometry data and discuss how these results might support more effective engineering of genetic networks.

Notation:
µ — geometric standard deviation of measurement error (fold)
m — geometric standard deviation of model error (fold)
p_err — probability a design instantiation fails to satisfy its specification
p_s — probability a design instantiation succeeds in satisfying its specification

The Engineering Error Inequality
We propose that a conservative approximation of the difficulty of engineering a genetic construct can be made by considering a simple "atomic" act of engineering decision-making: checking whether a property value v of a proposed system falls into a range of viable values r = [r−, r+]. Practically every choice in genetic construct design can be interpreted in terms of one or more such evaluations, for example:
• Will a given promoter drive production of an enzyme to a high enough level to support the intended reaction to generate a desired product? Here, v is promoter activity and r is the range of activity levels that will result in the intended production rate.
• Is the high signal supplied to a genetic inverter sufficient to suppress its output below a desired level of expression? Here, v is the input signal level and r is the range of input levels that will produce outputs below the threshold.
• Will a given chemical sensor detectably vary its output when exposed to the chemical at various concentration levels of interest? Here, v is the minimum difference in outputs produced by the target concentrations and r is the range of output differences that can be detected.
The full details of any such engineering process will typically be much more complex. Most systems will involve more than one decision and it may be difficult to actually achieve a value in the viable range. The basic ability to check whether a value falls in a desired range, however, is a necessary precondition for any such more sophisticated decision-making. This implies an inequality relation (per details in the derivation given in the Methods), such that any metric quantifying the difficulty of evaluating a single value check is also a lower bound on the difficulty of evaluating any other design decision.
If we know the actual values of v and r, then making such a check is trivial. In practice, however, uncertainty enters the decision-making process because we cannot directly access the true values of either v or r. Instead, we must work with estimated values. Experimental characterization produces an estimated value v̂ that varies from the unknown true value v.
Applying predictive models produces an estimated range r̂ that varies from the unknown true value r. In the context of the engineering of genetic regulatory networks, gene expression distributions are typically log-normal 21 (or a close approximation 22 thereto). As a consequence, the errors of measurements and model predictions are typically well-described in terms of ratios, rather than differences. In this case, we can thus describe the error of measuring v (i.e., for which precision gives a lower bound) as the geometric standard deviation µ of values of v̂. Likewise, model accuracy may be described as the geometric standard deviation m of predictions of the magnitude or bounds of r̂. Figure 1(a) provides a diagram illustrating this relationship.
For example, consider the case given above of selecting a constitutive promoter to drive production of an enzyme to an expression level v within a range r of expression levels that will support the intended reaction. Here, the measured expression level v̂ might be an average protein expression level of 3.5e4 molecules/cell to a precision µ of 2.5-fold standard deviation (i.e., 95% chance of falling within a range of 5.6e3 to 2.2e5 molecules/cell). Likewise, the estimate of viable range r̂ might be made with a model that predicts good enzymatic function at expression levels within a range of 2e3 to 5e4 molecules/cell, with an empirical accuracy m of 3-fold standard deviation on either bound (i.e., 95% chance that the true range of viability has a lower bound between 6.7e2 and 6.0e3 and an upper bound between 1.7e4 and 1.5e5, assuming that m and µ are themselves well known). In this case, the promoter would produce good enzymatic function if the true values turn out to be close to the estimated values, but fail if they are farther away. For example, the system will fail if the measurement produced an underestimate, with the true expression level v being 8.0e4 molecules/cell, and the model produced an overestimate, with the true upper bound of r being 3.0e4 molecules/cell.
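Because expression is modeled as log-normal, these confidence intervals multiply and divide by powers of the geometric standard deviation rather than adding and subtracting. A minimal sketch of the measurement interval quoted above (two geometric standard deviations on either side of the estimate):

```python
def geo_interval(value, gsd, n_sds=2):
    """Interval spanning +/- n_sds geometric standard deviations around
    a log-normally distributed estimate (multiply/divide, not add/subtract)."""
    factor = gsd ** n_sds
    return value / factor, value * factor

# Measured expression level: 3.5e4 molecules/cell, 2.5-fold geometric SD
lo, hi = geo_interval(3.5e4, 2.5)
print(f"{lo:.1e} to {hi:.1e}")  # ≈ 5.6e3 to 2.2e5 molecules/cell
```

The same helper applied to the model's range bounds gives the geometric intervals around 2e3 and 5e4 quoted for r̂.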
The probability of success p_s of an engineered design can then be conservatively bounded as being no greater than the probability that a system that has been estimated to have a value in the viable range actually does in fact have that value in the viable range:

p_s ≤ P(v ∈ r | v̂ ∈ r̂)    (1)

If we evaluate Equation 1 in this context, then by following the derivation provided in the Methods we may compute a conservative bound on the probability of success p_s, which we term the Engineering Error Inequality:

p_s ≤ erf( |r_log| / (2√2 · √(µ² + m²)) )    (2)

where µ and m are the geometric standard deviations (log fold) of the measurement and model error distributions, respectively, |r_log| is the magnitude of r̂ on the log scale (i.e., log(r̂+/r̂−)), and erf is the standard error function. The Engineering Error Inequality thus provides a bound on the probability that an instantiation of a design satisfies its specification, establishing a four-way relationship between the predictive accuracy m of a model, the precision µ of property value measurements, the estimated range r̂ over which a property value will fulfill its specification, and the probability p_s that a particular implementation will actually fulfill said specification.

Figure 1: (a) Minimum requirements for effective engineering can be estimated by considering the effect of errors on an "atomic" engineering decision: checking whether an estimated property value v̂ (red) lies within the estimated range r̂ (blue) that can correctly realize an engineering specification, given a measurement error distribution of magnitude µ (light red) estimating the property value and a model error distribution of magnitude m (light blue) in predicting the acceptable range. The relationship of these values determines the probability that the true value v (orange) actually falls within the true range r (purple); in the illustrated case, they do not, and the design will fail.
(b) Minimum probability of failure with respect to target range and total error (combining both measurement error and model error); the red line highlights the typical heuristic significance cutoff p_err = 0.05. (c, d, e) Examples of the measurement precision required to hit a target range with no more than a given probability of failure, given a model geometric standard deviation of 1.2-fold (c), 1.5-fold (d), or 2-fold (e). The black cross highlights the example of a 10-fold range and 5% chance of failure, which requires approximately 1.7-fold measurement precision with 1.2-fold model error, 1.5-fold precision with 1.5-fold model error, and is not possible with 2-fold model error.

Figure 1(b) illustrates how the Engineering Error Inequality may be thought of with respect to a total "budget" for error in both modeling and measurement. For example, consider a potential definition of a "significant" chance of failure (p_err = 1 − p_s = 0.05). The transition around this range is fairly sharp: small changes in total measurement and model error result in large changes in the likelihood that the system operates as intended. A p_err = 0.05 cutoff is approximately equivalent to two geometric standard deviations under the assumptions of Equation 2, so avoiding a significant chance of failure requires that the geometric standard deviation of total error be less than the fourth root of the target range (i.e., the range being two geometric standard deviations up and two down from its geometric center). In current engineering of genetic regulatory networks, the estimated range of values |r_log| that satisfies a design specification typically spans one or two orders of magnitude (e.g., the range of gene expression levels for a repressor that are high enough to fully repress the promoter it targets, but not too high for the cell to sustain).
At these levels, selecting a design that places a value within a 10-fold range requires that the total error (√(µ² + m²)) be less than 1.78-fold, while doing the same for a 100-fold range requires that the total error be less than 3.16-fold.
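These thresholds can be checked directly from Equation 2. A small sketch, with both the target range and the combined error expressed as fold-ratios and converted to the natural-log scale internally:

```python
import math

def p_success_bound(range_fold, total_error_fold):
    """Engineering Error Inequality: upper bound on success probability
    given a target range and a combined (measurement + model) geometric
    standard deviation, both expressed as fold-ratios."""
    r_log = math.log(range_fold)
    sigma = math.log(total_error_fold)
    return math.erf(r_log / (2 * math.sqrt(2) * sigma))

# Total error at the fourth root of the range gives ~95% success
# (two geometric SDs up and two down from the range's center).
print(p_success_bound(10, 10 ** 0.25))    # ≈ 0.954
print(p_success_bound(100, 100 ** 0.25))  # ≈ 0.954 (same by construction)
print(p_success_bound(10, 2.0))           # larger error, lower success bound
```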

Estimating Measurement Precision Requirements
So far, we have discussed the combined error of both model and measurement. Equation 2 tells us that our ability to engineer effectively will be limited by the larger of the two errors. Improvements in the smaller error can improve the situation by a factor of up to 1/√2 (i.e., about a third). Thus, improving one of the two errors can make a significant difference, but the situation rapidly becomes one of diminishing returns.
Of the two, model uncertainty m is generally the more difficult to quantify and control in novel contexts, given its relationship to fundamental unknowns about biology. For example, a prediction regarding transcriptional activity may be thrown off by an unexpected contextual sequence interaction out of scope of the model that is being applied. The precision of measurements µ, on the other hand, can be estimated through interlaboratory studies: even though our understanding of the implications of a measurement is imperfect, we can at least determine how precisely that measurement can be reproduced. Focusing specifically on the question of requirements for measurement precision µ, Figure 1(c-e) illustrates this relationship by evaluating Equation 2 to determine the maximum measurement error µ for which a desired probability of failure p_err can be achieved, given a specified target range size |r| and model error standard deviation m. Complementarily, more tolerant system designs and better-performing devices can relax demands for both measurement precision µ and model accuracy m by providing a larger acceptable range of values to target. For example, with m = 1.5-fold model accuracy, changing from a 10-fold target |r| to a 30-fold or 100-fold target |r| relaxes the required measurement precision µ to 2.16-fold and 3.01-fold, respectively.
These analyses are conservative, meaning that in practice the de facto requirements for effective engineering may be significantly more stringent. However, these estimates can at least be used to determine minimum requirements for use of a measurement technique in engineering biological organisms. Thus, given the typical prediction power of current models and the typical operating ranges of current biological devices 6,8,10,14-20 , we may set a nominal 1.5-fold standard deviation target for measurement precision µ as an approximate mid-point in the range where a 5% chance of failure is achieved with 1.2-fold to 2.0-fold m models and 10-fold to 100-fold |r| devices, per Figure 1(c-e). Further improvements will bring additional marginal gains, with rapidly diminishing returns for anything tighter than approximately 1.2-fold. On the other hand, measurement precision worse than approximately 2.0-fold is effectively unusable for predicting the behavior of anything but the most tolerant systems and highest-performance devices.
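Inverting Equation 2 for µ reproduces these targets. A sketch using only the standard library (the z-score for a two-sided 5% failure chance comes from the normal quantile function):

```python
import math
from statistics import NormalDist

def required_precision(range_fold, model_gsd_fold, p_err=0.05):
    """Largest measurement geometric SD (fold) that keeps the failure
    probability below p_err under the Engineering Error Inequality;
    returns None if model error alone already exceeds the error budget."""
    z = NormalDist().inv_cdf(1 - p_err / 2)    # ≈ 1.96 for p_err = 0.05
    budget = math.log(range_fold) / (2 * z)    # allowed combined log-scale SD
    m_log = math.log(model_gsd_fold)
    if m_log >= budget:
        return None
    return math.exp(math.sqrt(budget**2 - m_log**2))

print(required_precision(10, 1.2))   # ≈ 1.7-fold
print(required_precision(10, 1.5))   # ≈ 1.5-fold
print(required_precision(10, 2.0))   # None: 2-fold model error exceeds budget
print(required_precision(100, 1.5))  # ≈ 3.0-fold
```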
Given the landscape defined by evaluating Equation 2 against the current state of the art in modeling and available devices, we suggest that these numbers provide a reasonable set of first approximation targets against which to evaluate modalities of measurement. This, then, is the first key result of this work: in order for a measurement to support effective engineering of genetic regulatory networks given current model precision and device characteristics, we recommend a target precision µ of 1.5-fold standard deviation, with a maximum allowable value of 2.0-fold and no significant improvement expected for µ less than 1.2-fold.

Precision of Gene Expression Quantification with Flow Cytometry
We now apply these results to ask whether flow cytometry can satisfy the requirements for measurement precision. Flow cytometry is one of the most frequently used methods for quantifying gene expression, e.g., by co-expressing a fluorescent protein with a gene of interest or by using dye-tagged antibodies that attach to membrane proteins. These instruments produce data in arbitrary units whose values depend on the specific instrument and its configuration, but these data can be mapped to reproducible units by measuring a calibrant of known fluorescence and scaling the unknown data accordingly. These units are given as Molecules of Equivalent Soluble Fluorophores (MESF) or MEx, where x is a particular fluorophore, e.g., MEFL for Molecules of Equivalent Fluorescein. Flow cytometry is also the only currently widely accessible assay modality that can collect expression data from large numbers of single cells, which can make it particularly valuable for modeling and prediction. Indeed, flow cytometry has been the primary assay used in a number of recent studies demonstrating effective prediction of multi-element genetic expression systems from models of their constituent components [4][5][6][8][9][10] . Alternatives either gather population data (e.g., plate reader, RNAseq, proteomics), gather single-cell data from relatively small numbers of cells (e.g., microscopy), or are not yet widely accessible (e.g., droplet microfluidic sequencing 23 , various single-cell transcriptomic methods 24 ).
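The unit-scaling step works by fitting observed bead peaks against the manufacturer-assigned fluorophore-equivalent values. The sketch below is illustrative only: the peak values are invented, and in practice they come from peak detection on the bead data and from the bead lot's data sheet.

```python
import numpy as np

# Hypothetical bead-peak medians (arbitrary units from one instrument
# channel) and the corresponding manufacturer-assigned MEFL values.
peak_au   = np.array([2.1e2, 8.4e2, 3.2e3, 1.2e4, 4.5e4, 1.7e5])
peak_mefl = np.array([6.9e2, 2.8e3, 1.1e4, 4.1e4, 1.5e5, 5.8e5])

# Fit a single scaling factor in log space (slope fixed at 1): a simple
# linear-rescaling model for converting arbitrary units to MEFL.
scale = np.exp(np.mean(np.log(peak_mefl) - np.log(peak_au)))

def au_to_mefl(events_au):
    """Map raw events in arbitrary units to calibrated MEFL."""
    return scale * np.asarray(events_au)
```

With calibration in place, the same cells measured on different instruments (or at different gains) should map to comparable MEFL values, which is precisely the property the interlaboratory study tests.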
Prior results with calibrated flow cytometry suggest that this method is indeed likely to be able to achieve the required degree of precision. A study by the International Society for Advancement of Cytometry (ISAC) and the US National Institute of Standards and Technology (NIST) demonstrated that currently available fluorescence calibration beads achieve no better than approximately 1.2-fold variation 25 , a number consistent with prior studies on the variation of dye-based quantification of fluorescence (e.g., 26 ). This provides an anticipated lower bound on measurement error, since measurements that are calibrated using fluorescent beads cannot be more precise than the measurement of the calibrant. Additionally, two interlaboratory studies organized by iGEM 11,12 suggest a likely upper bound. In these studies, each laboratory independently transformed several genetic constructs into cells, then cultured the cells and measured their fluorescence using plate readers and/or flow cytometry. The flow cytometry measurements in these two studies, with units determined using SpheroTech calibration beads 27 , produced values with geometric standard deviations of 1.7-fold 11 and 2.3-fold 12 , but they include possible error introduced by independent transformation and culturing methods. To the best of our knowledge, no prior study has specifically examined the precision of measurements quantifying fluorescent protein expression via bead-calibrated flow cytometry.

Figure 2: Interlaboratory study design: samples prepared in one lab were split and shipped to all participant labs. Once received at each lab, samples were measured first to determine instrument settings, then to acquire study data, which was uploaded for unified analysis.

Study Design and Instruments
To address this question, we organized an interlaboratory study to measure the same cells on different instruments. Figure 2 shows the overall design of this study. First, five strains of E. coli were transformed and cultured in a single laboratory, each strain with a different plasmid: a blank plasmid expressing no fluorescent protein, two plasmids for constitutive expression of GFP at different expression levels, and two plasmids for constitutive expression of mCherry at different expression levels (full details are provided in S1: Interlab Plasmid Designs). All participating laboratories were sent frozen aliquots of each of the samples, along with SpheroTech RCP-30-5A rainbow calibration beads 27 , such that all measurements were made with aliquots from the same biological material and the same set of calibration beads. The laboratories performed flow cytometry measurements in two stages, first using one set of samples to determine appropriate instrument settings, then measuring three sets of samples to acquire study data. For instruments with variable channel gain, each sample was measured at three different gain levels. Three technical replicates of measurement were requested for each sample in every measurement condition. All data was then uploaded for a unified analysis process. Full details are provided in Methods, S2: Protocol, and S4: Flow Cytometry Data Processing.
In total, measurements and configuration information were collected from 23 flow cytometers at 15 institutions. We deliberately did not attempt to select for identical instruments, but instead gathered as much information as was available about each instrument in order to allow our study to consider both identical and non-identical channels and instruments.
The full list of instruments and channels is provided in S3: Participating Instruments and Channels. In total, the study comprised 12 different models of flow cytometer from three manufacturers. The overall variety of available channels is quite wide. For subsequent analysis, we selected channels that best matched the excitation and emission spectra of GFP and mCherry, as shown in Figure 3 (details given in S4: Flow Cytometry Data Processing).
For GFP, the 488 nm excitation, 530/30 nm emission filter channel is widespread, while the channels in use for mCherry are far more heterogeneous (Figure 3).

Figure 3: Comparison of best match channels for calibration of GFP to MEFL and mCherry to MEPE using SpheroTech RCP-30-5A beads (sorted first by laser then by emission filter center). Relative spectra for GFP and mCherry are plotted in the background for comparison (source: fpbase.org). For GFP, most channels are quite similar, while for mCherry there is a high degree of heterogeneity. Note: for the 473 nm excitation instrument at bottom, emission filter information was not available and is thus not shown. Full channel details are provided in S3: Participating Instruments and Channels.
In principle, every set of channels with identical excitation and emission filters should produce identical results. For non-matched channels, the degree of disparity should depend on the relative difference between the spectral properties of the calibration beads and those of the fluorescent proteins. In principle, such spectral information might be used to adjust units to be equivalent across unmatched channels. For the present, however, we will ask only the empirical question of how much variability is observed for both matched and unmatched channels.
Furthermore, in all cases, we cannot truly say that the calibration is to the designated units of the manufacturer (molecules of equivalent fluorescein, MEFL, and molecules of equivalent phycoerythrin, MEPE), since no channel precisely matches the channels for which the calibrant manufacturer has provided values. As the target of our study is not absolute units but variation in measurement, however, this is not a limitation, and we will use the designations nonetheless as a shorthand for beads calibrated using the values provided for MEFL and MEPE in the closest matching available channel. Note that this gap does indicate a need for coordination between calibration bead manufacturers and flow cytometry providers on standard channels and/or for more complete spectral information to be made available about calibration beads.

Study Results
Of the 23 participating instruments, 19 produced data suitable for comparison across different instruments. Figure 5 shows the geometric standard deviation for each fluorescent sample for each cluster of instruments with matching channels. Overall, the differences are notably higher than for different voltages on a single instrument, but the geometric mean of these geometric standard deviations is still only 1.52-fold. Notably, the only two channels shared by high numbers of instruments (488 ex, 530/30 em for GFP; 561 ex, 610/20 em for mCherry) have values very close to this for both the high and medium constructs. These results indicate that flow cytometry with bead-based calibration can meet the 1.5-fold target precision estimated above.
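The per-channel statistics reported here are geometric means and geometric standard deviations, computed on the log scale. A minimal sketch (the MEFL readings are invented for illustration):

```python
import numpy as np

def geo_stats(values):
    """Geometric mean and geometric standard deviation (as a fold-ratio)
    of a set of calibrated measurements, e.g., one construct's MEFL
    values across instruments sharing a matched channel."""
    logs = np.log(values)
    return np.exp(logs.mean()), np.exp(logs.std(ddof=1))

# Hypothetical calibrated readings of one construct on five instruments
gm, gsd = geo_stats([2.0e4, 3.1e4, 2.6e4, 1.6e4, 2.9e4])
print(f"geometric mean {gm:.2e}, geometric SD {gsd:.2f}-fold")
```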
Finally, we consider the relationship of calibrated values as measured by instruments with different channels. Figure 6 shows these values for each of the four fluorescent constructs. Further investigation is needed, however, to determine the bounds of tolerable variation between channels, the degree to which other requirements can also be relaxed (e.g., using calibrants from different lots or different manufacturers), and the degree to which more sophisticated unit computation can compensate for differences between channels.

Discussion
The results that we have presented establish an error budget for engineering genetic regulatory networks, with the Engineering Error Inequality quantifying the tension between device characteristics, error in measurement and modeling, and the difficulty of engineering. Once established, this relationship can be used to derive requirements for any one of these properties based on the available capabilities for the others. In particular, with the current state of the art in genetic regulatory network engineering, we have estimated a target precision for measurements of µ = 1.5-fold geometric standard deviation, with approximate upper and lower bounds of 2.0-fold and 1.2-fold.
By measuring a uniform set of cell samples on a heterogeneous collection of flow cytometers, we determined that bead-calibrated flow cytometry can indeed meet this error budget, with a geometric mean value of 1.52-fold geometric standard deviation across matched channels. At this level of precision, we should expect that flow cytometry can be effectively used for engineering with reasonably good models (m ≤ 2.0-fold), using biological devices with strongly differentiated outputs (|r| at least 10-fold to 100-fold, depending on µ and m), collectively meeting a sufficiently tight bound on predictability. Flow cytometry measurements do not, however, appear to be sufficiently precise to allow for engineering with either poor models or weak devices. Bead-based calibration is also sufficient to allow comparison of data from channels with similar emission filters, even when these filters are not matched. As a byproduct of this study, we also confirm that bead-based calibration allows highly precise matching across different voltage levels on a single instrument (finding a precision of 1.09-fold), which has the potential to expand effective measurement range and to allow fusion of otherwise incompatible datasets.
Based on these results, we recommend that the following be adopted as best practices for use of flow cytometry measurements as a basis for engineering genetic regulatory networks:
• Calibrate units using independent reference materials (e.g., NIST traceable dye-based calibration beads). Use of a reference strain (e.g., RPU 6,28 ) is not sufficient, as demonstrated in 11 .
• Collect technical replicates for controls in triplicate to validate that measurement replicability on the instrument is better than 1.2-fold standard deviation.
• Use negative and single positive controls for background removal and compensation (not part of our study, but standard recommended practice, per 29,30 ).
• Standardizing machine, light path, and laser power is not necessary, as long as calibration beads are used and controls indicate the instrument is performing well.
The final recommendation also points to important additional work that is still needed.
Notably, coordination between calibration bead manufacturers and flow cytometry providers to standardize channels and/or to provide more complete spectral information about calibration beads would allow device characterization data to be shared across a broader range of laboratories and instruments. Likewise, with a profusion of available fluorescent proteins, it would benefit the engineering community to consider reusability of data when selecting fluorescent proteins. There is also a gap between calibration in terms of equivalent dye molecules and the ability to interpret these numbers in terms of the actual fluorescent protein used, as illustrated by the high degree of variation between channels in Figure 6(c,d). Shifting units from equivalent dye molecules to protein molecules would also address the problem of channel-to-channel conversion, though in some circumstances this can also be handled by multi-color controls, as shown in 5,31 .
Beyond flow cytometry, it will be important to establish that other measurement modalities can also achieve the requisite level of measurement precision. Plate reader data seems readily able to achieve similarly usable, if imperfect, error levels, given the precision and unit equivalence demonstrated in 12 , though this has not yet been explicitly confirmed. Calibration of RNAseq with spike-in controls also appears likely to provide sufficient precision, based on the results in 13 . Other forms of measurement may likewise be evaluated using methods similar to those presented here.
Finally, effective engineering will only be actually enabled by the availability of sufficient data collected using methods such as these. The synthetic biology community has built up large public repositories of reusable genetic parts, through organizations such as iGEM, Addgene, and SEVA. The ability to engineer with such parts will be advanced by the community building up public repositories of reusable data describing part behavior, with the data collected using comparable fluorescent proteins and sufficient precision and calibration to allow its application in the design of new genetic regulatory networks.

Derivation of Engineering Error Inequality
Consider an engineered design with a collection of measurable property values v, such that the design will meet its specification only if every property value is within some interval range r = [r − , r + ] specified for that property.
Note that this is formulated as a necessary condition, but not a sufficient one: there may be other interactions within the system that impose additional constraints on relationships between properties, but we may at least take the conservative view that success cannot be achieved without getting every property to a value within that property's specified range of potentially acceptable values. Likewise, we may assume without loss of generality that there is only a single range of viable values: a disjoint range r = [r−,1, r+,1] ∪ [r−,2, r+,2] yields a strictly lower success probability than a single range of the same total magnitude r = [r−,1, r+,1 + (r+,2 − r−,2)], as the impact of uncertainty is greatest at an interval boundary and a disjoint range has more such boundaries.
The probability of a particular instantiation of a design satisfying its specification is thus bounded above by the joint probability of all property values being within their respective ranges. This, in turn, is bounded above by the probability of any individual value being within its specified range. Thus p_s, the probability of successful realization of a design specification with a particular set of design choices, given an estimated property value v̂ whose distribution is determined by measurement precision and an estimated range r̂ whose accuracy is limited by the accuracy of available models, may be bounded above by:

p_s = P(s | v̂ ∈ r̂) ≤ P(v ∈ r | v̂ ∈ r̂)    (1)

where s is the event of success, in which an instantiation of a design is able to meet its specification.
The distribution of genetic expression levels in cells is typically log-normal 21 (or a close approximation 22), which implies that estimates of expression levels made from measurements will typically have a log-normal estimation error distribution as well, unless another larger source of error dominates (e.g., stochasticity due to very low expression levels). Likewise, effective quantitative predictive models have typically shown error distributions that are well described on a logarithmic scale (e.g., [3][4][5][6][7][8]). When dealing with genetic regulatory networks, then, a reasonable null hypothesis is that both measurement errors and model errors have independent log-normal distributions.
When this hypothesis holds, the composition of these two perturbations is a log-normal whose variance is the sum of their variances:
$$\sigma = \sqrt{\mu^2 + m^2}$$
taking $\mu$ as the (log-scale) standard deviation of the distribution of measurement error and $m$ likewise for model error.
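This composition rule is easy to verify by Monte Carlo; the error magnitudes below are illustrative, not values from the study:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_sd, m_sd = 0.2, 0.3  # illustrative log-scale standard deviations

# independent log-normal measurement and model perturbations
meas = rng.lognormal(0.0, mu_sd, 1_000_000)
model = rng.lognormal(0.0, m_sd, 1_000_000)

# the composed (multiplicative) perturbation is log-normal with
# log-scale standard deviation sqrt(mu^2 + m^2)
combined_sd = np.log(meas * model).std()
expected_sd = np.sqrt(mu_sd**2 + m_sd**2)
```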
If it is possible to choose any value for $\hat{v}$, then $p_s$ may be maximized by choosing $\hat{v}$ to be at the center of $\hat{r}$ (if not all values can be chosen, the best available value may have an even lower chance of success, so the inequality is unaffected). The actual values of $\hat{v}$ and $\hat{r}$ then become irrelevant, and we may refactor the bound in terms of the magnitude of the estimated range versus the magnitude of uncertainty from model and measurement around a zero mean:
$$p_s \le P\left(|X| \le \frac{|r_{\log}|}{2}\right), \quad X \sim \mathcal{N}(0, \mu^2 + m^2)$$
with the magnitude of the range $|r_{\log}|$ also quantified on a log scale (note, however, that this equation does assume that the values of $\mu$ and $m$ are well constrained).
We can then normalize the range to express this probability in terms of the number of standard deviations with respect to a standard normal distribution:
$$p_s \le P\left(|Z| \le \frac{|r_{\log}|}{2\sqrt{\mu^2 + m^2}}\right), \quad Z \sim \mathcal{N}(0, 1)$$
This probability is the cumulative distribution of a folded normal distribution evaluated at $\frac{|r_{\log}|}{2\sqrt{\mu^2 + m^2}}$, which is:
$$p_s \le \operatorname{erf}\left(\frac{|r_{\log}|}{2\sqrt{2}\sqrt{\mu^2 + m^2}}\right)$$
where $\operatorname{erf}$ is the Gauss error function. We thus have an inequality that relates model accuracy, measurement precision, and design tolerance (in terms of the magnitude of the acceptable range) to the probability that an engineered genetic regulatory network satisfies its design specification.
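As a worked example of the inequality: the 10-fold acceptable range below is illustrative, and the 1.5-fold error values match the precision target recommended in this work:

```python
from math import erf, log10, sqrt

def success_bound(r_log, mu, m):
    """Upper bound on p_s: erf(|r_log| / (2 * sqrt(2) * sqrt(mu^2 + m^2)))."""
    n_std = abs(r_log) / (2 * sqrt(mu**2 + m**2))  # half-range in std devs
    return erf(n_std / sqrt(2))                    # folded-normal CDF at n_std

mu = log10(1.5)  # measurement error: 1.5-fold std dev on a log10 scale
m = log10(1.5)   # model error: likewise 1.5-fold

# a design whose property must land within a 10-fold range (|r_log| = 1)
p_max = success_bound(1.0, mu, m)  # roughly 0.95 for these values
```

With 1.5-fold measurement and model error, a property confined to a 10-fold acceptable range can still be hit with probability up to about 0.95; doubling either error term shrinks this bound rapidly.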

Plasmid construction
Plasmids were constructed by Golden Gate 32 assembly in the MoClo 33 architecture.

Strain Culturing
Plasmids were transformed into 10-beta competent E. coli cells (New England Biolabs cat. …). At this point all strains had an OD600 of 0.5 or less and were concentrated to an OD600 of 0.5; 150 µL of this culture was combined with 90 µL of 0.45 µm filter-sterilized 50% glycerol and mixed gently, then kept on ice. Pre-labelled 2 mL Eppendorf tubes chilled to −80°C were then used to aliquot 10 µL of culture and were immediately placed back at −80°C until distribution.

Flow Cytometry
Determining Instrument Pre-Collection Gating Each recipient was instructed to set FSC and SSC voltages (if permitted by their instrument) such that the peak density of cells was centered in the range of their detector. As pre-collection gating options can vary significantly by instrument and software, recipients were instructed to set gating to retain as many cell events as possible, even if this resulted in more non-cell events being captured as well. Additional details can be found in S2: Protocol. See Data Processing and S4: Flow Cytometry Data Processing for the separation of cell and non-cell events post-capture.
Determining High, Medium, and Low Channel Voltages Each recipient was instructed to take the brightest strain for red and green and raise the corresponding channel's voltage until the distribution was as high as possible without clipping the sample. This was used as the High voltage; the process was repeated, targeting distributions 10-fold and 100-fold lower in arbitrary units for the Medium and Low voltage settings, respectively. For instruments with low maximum values, recipients were instructed to still space the Medium and Low voltages evenly, but to lower the overall separation so that the sample was not clipped significantly on the low end. For instruments without a variable voltage setting, this step was necessarily omitted. Additional details can be found in S2: Protocol.
Sample Preparation Bacterial cultures were prepared for flow cytometry as follows: remove one replicate of samples from −80°C and transfer to a 42°C heat bath for 60 s. Add 990 µL of sterile-filtered 1X PBS to the bacterial culture (or 75 µL for calibration beads) to bring the sample to 1X concentration for running on the flow cytometer, then run the sample immediately. This procedure is repeated independently for each replicate.
Event Collection Samples were run as close to the following conditions as possible, given the local instrument and software: each sample at 1 µL/s for either 150 µL or 10^5 total events, whichever was reached first.
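The interplay of the two stopping conditions can be made explicit; a small sketch, assuming a constant event rate (the helper and its names are illustrative, not part of the protocol):

```python
# Collection settings from the protocol above
FLOW_UL_PER_S = 1.0     # flow rate: 1 microliter per second
MAX_VOLUME_UL = 150.0   # volume cap per sample
MAX_EVENTS = 100_000    # event cap per sample

def stop_condition(events_per_s):
    """Return which cap is reached first at a given steady event rate."""
    t_volume = MAX_VOLUME_UL / FLOW_UL_PER_S  # 150 s to exhaust the volume
    t_events = MAX_EVENTS / events_per_s      # time to hit the event cap
    return "events" if t_events < t_volume else "volume"

# the event cap wins whenever the rate exceeds 100000/150, about 667 events/s
```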

Data processing
Cell events were then separated from non-cell events by means of a Gaussian mixture model filter trained on the negative sample. Flow cytometry data was processed using the TASBE Flow Analytics software package 31, using the recommended practices for gating and background subtraction.
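A minimal numpy-only sketch of such a mixture-model gate, assuming one-dimensional log-scale scatter values and synthetic data; the study itself used the TASBE package's implementation, and the distributions here are illustrative:

```python
import numpy as np

def fit_gmm_1d(x, n_iter=200):
    """Fit a 2-component 1-D Gaussian mixture by expectation-maximization."""
    x = np.asarray(x, dtype=float)
    # initialize one component near each tail of the data
    mu = np.array([np.percentile(x, 5), np.percentile(x, 95)])
    sigma = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each event
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and standard deviations
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-6)  # guard against variance collapse
    return w, mu, sigma, resp

rng = np.random.default_rng(0)
debris = rng.normal(1.0, 0.3, 2000)  # non-cell events: low log-scale scatter
cells = rng.normal(3.0, 0.4, 8000)   # cell events: high log-scale scatter
x = np.concatenate([debris, cells])

w, mu, sigma, resp = fit_gmm_1d(x)
cell_comp = int(np.argmax(mu))       # component with the higher mean = cells
is_cell = resp[:, cell_comp] > 0.5   # gate: keep events assigned to cells
```

In practice the mixture would be fit on the negative control and the resulting gate applied to all samples, rather than fitting each sample independently as done here for illustration.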

S3: Participating Instruments and Channels
List of instruments used in the study and all information that could be collected regarding the available channels for each instrument, highlighting the channels selected as "best match" for GFP and RFP measurements.

S4: Flow Cytometry Data Processing
Details of how flow cytometry data processing was conducted.