Quantitative characterization of the eukaryotic transcription cycle using live imaging and statistical inference

The eukaryotic transcription cycle consists of three main steps: initiation, elongation, and cleavage of the nascent RNA transcript. Although each of these steps can be regulated as well as coupled with each other, their in vivo dissection has remained challenging because available experimental readouts lack sufficient spatiotemporal resolution to separate the contributions from each of these steps. Here, we describe a novel computational technique to simultaneously infer the effective parameters of the transcription cycle in real time and at the single-cell level using a two-color MS2/PP7 reporter gene and the developing fruit fly embryo as a case study. Our method enables detailed investigations into cell-to-cell variability in transcription-cycle parameters with high precision. These measurements, combined with theoretical modeling, reveal a significant variability in the elongation rate of individual RNA polymerase molecules. We further illustrate the power of this technique by uncovering a novel mechanistic connection between RNA polymerase density and nascent RNA cleavage efficiency. Thus, our approach makes it possible to shed light on the regulatory mechanisms in play during each step of the transcription cycle in individual, living cells at high spatiotemporal resolution.


30
The eukaryotic transcription cycle consists of three main steps: initiation, elongation, and cleavage of the nascent RNA transcript ( Fig. 1A; Alberts (2015)). Crucially, each of these three steps can be controlled to regulate transcriptional activity. For example, binding of transcription factors to enhancers dictates initiation rates (Spitz and Furlong, 2012), modulation of elongation rates helps determine splicing efficiency (De La Mata et al., 2003), and regulation of cleavage controls aspects 35 of 3' processing such as alternative polyadenylation (Tian and Manley, 2016).
1 of 37 bioRχiv preprint The steps of the transcription cycle can be coupled with each other. For example, elongation rates contribute to determining mRNA cleavage and RNA polymerase (RNAP) termination efficiency (Pinto et al., 2011;Hazelbaker et al., 2013;Fong et al., 2015;Liu et al., 2017), and functional linkages have been demonstrated between transcription initiation and termination (Moore and Proudfoot, 40 2009; Mapendano et al., 2010). Nonetheless, initiation, elongation, and transcript cleavage have largely been studied in isolation. In order to dissect the entire transcription cycle, it is necessary to develop a holistic approach that makes it possible to understand how the regulation of each step dictates mRNA production and to unearth potential couplings among these steps.
To date, the processes of the transcription cycle have mostly been studied in detail using 45 in vitro approaches (Bai et al., 2006;Herbert et al., 2008) or genome-wide measurements that require the fixation of cellular material and lack the spatiotemporal resolution to uncover how the regulation of the transcription cycle unfolds in real time (Roeder, 1991 Despite the great promise of MS2 and PP7, using these techniques to comprehensively analyze 60 the transcription cycle is hindered by the fact that the signal from these in vivo RNA-labeling technologies convolves contributions from all aspects of the cycle. Specifically, the fluorescence signal from nascent RNA transcripts persists throughout the entire cycle of transcript initiation, elongation, and cleavage; further, a single gene can carry many tens of transcripts. Thus, at any given point, an MS2 or PP7 signal reports on the contributions of transcripts in various stages of the transcription cycle (Ferraro et al., 2016). Precisely interpreting an MS2 or PP7 signal therefore demands an integrated approach that accounts for this complexity.
Here, we present a method for analyzing live-imaging data from the MS2 and PP7 techniques in order to dynamically characterize the steps-initiation, elongation, and cleavage-of the full transcription cycle at single-cell resolution. This method combines a dual-color MS2/PP7 fluorescent 70 reporter (Hocine et al., 2013;Coulon et al., 2014;Fukaya et al., 2017) with statistical inference techniques and quantitative modeling. As a proof of principle, we applied this analysis to the transcription cycle of a hunchback reporter gene in the developing embryo of the fruit fly Drosophila melanogaster. We validate our approach by comparing our inferred average initiation and elongation rates with previously reported results. 75 Crucially, our analysis also delivered novel single-cell statistics of the whole transcription cycle that were previously unmeasurable using genome-wide approaches, making it possible to generate distributions of parameter values necessary for investigations that go beyond simple population- . We show that, by taking advantage of time-resolved data, our inference is able to filter out experimental noise in these distributions and retain sources of biological variability. By combining these statistics with theoretical models, we revealed substantial variability in RNAP stepping rates between individual molecules, demonstrating the utility of our approach for testing hypotheses of the molecular mechanisms underlying the 85 transcription cycle and its regulation.
This unified analysis enabled us to investigate couplings between the various transcription cycle parameters at the single-cell level, whereby we discovered a surprising correlation of cleavage rates with nascent transcript densities. These discoveries illustrate the potential of our method to sharpen hypotheses of the molecular processes underlying the regulation of the transcription cycle 90 and to provide a framework for testing those hypotheses.

Results
To quantitatively dissect the transcription cycle in its entirety from live imaging data, we developed a simple model (Fig. 1A) in which RNAP molecules are loaded at the promoter of a gene of total length with a time-dependent loading rate ( ). For simplicity, we assume that each individual RNAP 95 molecule behaves identically and independently: there are no interactions between molecules. While this assumption is a crude simplification, it nevertheless allows us to infer effective average transcription cycle parameters. We parameterize this ( ) as the sum of a constant term ⟨ ⟩ that represents the mean, or time-averaged, rate of initiation, and a small fluctuation term given by ( ) such that ( ) = ⟨ ⟩ + ( ). After initiation, each RNAP molecule traverses the gene at 100 a constant, uniform elongation rate . Upon reaching the end of the gene, there follows a deterministic cleavage time, , after which the nascent transcript is cleaved. We do not consider RNAP molecules that do not productively initiate transcription (Darzacq et al., 2007) or that are paused at the promoter (Core et al., 2008), as they will provide no experimental readout. Based on experimental evidence (Garcia et al., 2013), we assume that these RNAP molecules are processive, 105 such that each molecule successfully completes transcription, with no loss of RNAP molecules before the end of the gene (see Section S5 for a validation of this hypothesis).

Dual-color reporter for dissecting the transcription cycle
Here we studied the transcription cycle of early embryos of the fruit fly D. melanogaster. We focused on the P2 minimal enhancer and promoter of the hunchback gene during the 14th nuclear cycle The lacZ sequence and a portion of the lacY sequence from Escherichia coli were placed as a neutral spacer (Chen et al., 2012) between the MS2 and PP7 stem loops. As an individual RNAP molecule transcribes through a set of MS2/PP7 stem loops, constitutively expressed MCP-mCherry and PCP-GFP fusion proteins bind their respective stem loops, resulting in sites of nascent transcript formation that appear as fluorescent puncta under a laser-scanning confocal microscope 125 ( Fig. 2A and Video S1). The intensity of the puncta in each color channel is linearly related to the number of actively transcribing RNAP molecules that have elongated past the location of the associated stem loop sequence (Garcia et al., 2013), albeit with different arbitrary fluorescence units (Section S4). Upon reaching the end of the gene, which contains the 3'UTR of the -tubulin gene (Chen et al., 2012)   The reporter construct, which is driven by the hunchback P2 minimal enhancer and promoter, is expressed in a step-like fashion along the anterior-posterior axis of the fruit fly embryo. (C) Transcription of the stem loops results in fluorescent puncta with the 5' mCherry signal appearing before the signal from 3' GFP. Only one stem loop per fluorophore is shown for clarity, but the actual construct contains 24 repeats. (D) Intuition for how MS2 and PP7 fluorescence depend on the model parameters. (i) Example transcription activity that consists of a pulse of transcription initiation of constant magnitude . (ii) At first, the zero initiation rate results in no fluorescence.
(iii) When initiation commences, RNAP molecules load onto the promoter and elongation of nascent transcripts occurs, resulting in a constant increase in the MS2 signal. (iv) After time = , the first RNAP molecules reach the PP7 stem loops and the PP7 signal also increases at a constant rate. (v) After time = , the first RNAP molecules reach the end of the gene, and after the cleavage time , these first nascent transcripts are cleaved. The total time a nascent RNA transcript spends on the gene is given by = + . The subsequent loss of fluorescence is balanced by the addition of new nascent transcripts, resulting in a plateauing of the signal. (vi) Once the initiation rate shuts off, no new RNAP molecules are added and the overall fluorescence signal starts to decrease due to cleavage of the nascent transcripts still on the gene. Data in (B) adapted from Garcia et al. (2013); the line represents the mean and error bars represent standard error across 24 embryos. timescale of mRNA diffusion is about two order of magnitudes faster than the time resolution of our experiment, we approximate the cleavage of a single transcript as resulting in the instantaneous loss of its associated fluorescent signal in both channels (Section S2). The qualitative relationship between the model parameters and the fluorescence data is described in Figure 1D, which considers the case of a pulse of constant initiation rate.

Transcription cycle parameter inference using Markov Chain Monte Carlo
Our statistical framework extracts quantitative estimates of transcription-cycle parameters (Fig. 1A) from fluorescence signals. From microscopy data ( Fig. 2A and Video S1), time traces of mCherry and eGFP fluorescence intensity are extracted to produce a dual-signal readout of nascent RNA transcription at single-cell resolution (Fig. 2B, data points). To extract quantitative insights from 140 the observed fluorescence data, we used the established Bayesian inference technique of Markov Chain Monte Carlo (MCMC) (Geyer (1992) and Section S3.1) to infer the effective parameter values in our simple model of transcription: the time-dependent transcription initiation rate, separated into the constant term ⟨ ⟩ and fluctuations ( ), the elongation rate , and the cleavage time . We included a few additional free parameters: basal levels of fluorescence in each channel, the 145 time of transcription onset after the previous mitosis, and the scaling factor between the arbitrary fluorescence units of the two different fluorophores (Section S1 and S3.1). Altogether, the model produced the best-fit to the data shown in Figure 2B. The inference was run at the single-cell level, resulting in separate parameter estimates for each cell. Additionally, we used a two-step process to sequentially determine parameters with higher accuracy (Section S3.2). Inference results were 150 filtered both automatically and via manual curation to disregard results that were obscured by experimental limitations such as incomplete fluorescent signals; this curation process did not substantially affect the ultimate inference results (Section S3.3 and Fig. S3). To aggregate the results, we constructed a distribution from the single-cell results for each inferred parameter. Intra-embryo variability between single cells was greater than inter-embryo variability (Section S6 and Fig. S5). As 155 a result, unless stated otherwise, all statistics reported here were aggregated across single cells combined between embryos, and all shaded errors reflect the standard error of the mean.
The aggregated inference results produced a suite of quantitative measurements of transcription initiation, elongation, and cleavage dynamics for the hunchback reporter gene as a function of the position along the anterior-posterior axis of the embryo (Fig. 2C-H  scaling factor between the mCherry and eGFP fluorescence units (Section S4 and Fig. S4). This inferred fluorescence scaling factor agreed with an independent calibration control experiment (Section S4 and Fig. S4), showing that our methodology calibrates the intensities of distinct fluorescent proteins without resorting to independent control experiments.
Inference of single-cell initiation rates recapitulates and improves on previous 165 measurements Control of initiation rates is one of the predominant, and as a result most well-studied, strategies for gene regulation (Roeder, 1991;Spitz and Furlong, 2012;Lenstra et al., 2016). Thus, comparing our inferred initiation rates with previously established results comprised a crucial benchmark for our methodology. Our inferred values of the mean initiation rate ⟨ ⟩ exhibited a step-like pattern 170 along the anterior-posterior axis of the embryo, qualitatively reproducing the known hunchback expression profile (Fig. 2C, blue). As a point of comparison, we also examined the mean initiation rate measured by Garcia et al. (2013), which was obtained by manually fitting a trapezoid ( Figure 1D) to the average MS2 signal (Fig. 2C, black). The quantitative agreement between these two dissimilar analysis methodologies demonstrates that our inference method can reliably extract the average Our inference approach isolates the transcription initiation rate from the remaining steps of the transcription cycle at the single-cell level, making it possible to calculate, for example, the coefficient 190 of variation (CV; standard deviation divided by the mean) of the mean rate of initiation. Our results yielded values for the CV along the embryo that peaked around 35% (Fig. 2D, top). This value is roughly comparable to that obtained for hunchback using single-molecule FISH ( One of the challenges in measuring CV values, however, is that informative biological variability 195 is often convolved with undesired experimental noise. Although we currently cannot separate these sources with our data and inference method, a strategy for this separation was recently implemented in the context of snapshot-based fluorescent data (Zoller et al., 2018). Building on this strategy, we took a single snapshot from our live-imaging data and calculated the total squared CV of the fluorescence of spots at a single time point (Fig. 2D, bottom, purple). 200 Following Zoller et al. (2018) and as described in detail in Section S7 and Figure S6, we calculated the biological and experimental noise in this snapshot-based measurement. The bar graph shown in the bottom of Figure 2D shows that, once the experimental noise (light purple) is subtracted from the total noise of our snapshot-based measurement, the remaining biological variability (dark purple) is comparable to the variability of our inference results (blue). Thus, our inference mostly 205 captures biological variability and filters out experimental noise, similarly to techniques such as single-molecule FISH (Zoller et al., 2018) but with the added advantage of also being able to resolve temporal information. These results further validate our approach and demonstrate its capability to capture measures of cell-to-cell variability in the transcription cycle with high precision.

Elongation rate inference reveals single-molecule variability in RNAP stepping
210 rates Next, we investigated the ability of our inference approach to report on the elongation rate . Nascent RNA elongation plays a prominent role in gene regulation, for example, in dosage com- To illustrate the resolving power of examining elongation rate distributions, we performed theoretical investigations of elongation dynamics. Following Klumpp and Hwa (2008), we considered a model where RNAP molecules stochastically step along a gene and cannot overlap or pass each other (Section S9). First, we considered a scenario where the stepping rate of each RNAP molecule is identical. As shown in brown in Figure 2F, this model cannot account for the wide distribution 240 of observed single-cell elongation rates. In contrast, the model can reasonably describe data by allowing for substantial variability in the elongation rate of individual RNAP molecules. As shown in gold in Figure Figure 2G, the inferred mRNA cleavage time was strongly dependent on anterior-posterior positioning along the embryo, with high values (∼4 min) in the anterior end and lower values 260 toward the posterior end (∼2 min). Such a modulation could not have been easily revealed using genome-wide approaches that, by necessity, average information across multiple cells. The CV of the cleavage time also increased toward the posterior end of the embryo (Fig. 2H), providing fertile ground for uncovering the molecular underpinnings of these processes using theoretical models analogous to those discussed previously. Thus, although cleavage remains an understudied 265 process compared to initiation and elongation, both theoretically and experimentally, these results provide the quantitative precision necessary to carry out such mechanistic analyses.

Uncovering mechanistic correlations between transcription cycle parameters
In addition to revealing trends in average quantities of the transcription cycle along the length of the embryo, the simultaneous nature of the inference afforded us the unprecedented ability to Possessing units of (AU/kb), this mean transcript density estimates the average number of nascent RNA transcripts per kilobase of template DNA. Plotting the cleavage time as a function of the mean 290 transcript density yielded a negative correlation that, while moderate ( 2 = 0.18, = × − ), was stronger than any of the previous correlations between transcription-cycle parameters (Fig. 3D). Mechanistically, this correlation suggests that, on average, more closely packed nascent transcripts at the 3' end of a gene cleave faster.
Using an absolute calibration for a similar reporter gene (Garcia et al., 2013) led to a rough 295 scaling of 1 AU ≈ 1 molecule corresponding to a maximal RNAP density of 20 RNAP molecules/kb in Figure 3D. With a DNA footprint of 40 bases per molecule (Selby et al., 1997), this calculation suggests that in this regime, the RNAP molecules are densely distributed enough to occupy 80% of the reporter gene. We hypothesize that increased RNAP density could lead to increased pausing as a 9 of 37 bioRχiv preprint result of traffic jams (Klumpp and Hwa, 2008;Klumpp, 2011). Due to this pausing, transcripts would 300 be more available for cleavage, increasing overall cleavage efficiency. Regardless of the particular molecular mechanisms underlying our observations, we anticipate that this ability to resolve singlecell correlations between transcription parameters, combined with perturbative experiments, will provide ample future opportunities for studying the underlying biophysical mechanisms linking transcription processes.  320 In this work, we established a novel method for inferring quantitative parameters of the entire transcription cycle-initiation, elongation and cleavage-from live imaging data. Notably, this method offers high spatiotemporal resolution at the single-cell level. These features are unattainable by widespread, but still powerful, genome-wide techniques that examine fixed samples, such as global run-on sequencing (GRO-seq) to measure elongation rates in vivo ( quantities. The single-cell measurements afforded by our approach made it possible to infer full distributions of transcription parameters (Fig. 2D, F, and H). This single-cell resolution motivates a dialogue between theory and experiment for studying transcription initiation, elongation, and cleavage at the single-cell level. Specifically, we showed how our inferred distributions of initiation rates effectively filter out experimental noise while retaining temporal information, and how elongation 340 rate distributions make it possible to test molecular models of RNAP transit along the gene.
Finally, the simultaneous inference of various transcription-cycle parameters granted us the novel capability to investigate couplings between aspects of transcription initiation, elongation, and cleavage, paving the way for future studies of mechanistic linkages between these processes. In particular, the observed coupling of the mRNA cleavage time with RNAP density (Fig. 3D , 2006). Finally, while our experimental setup utilized two fluorophores, we found that the calibration between their intensities could be inferred directly from the data (Section S4 and Fig. S4), rendering independent calibration 360 and control experiments unnecessary. Thus, we envision that our analysis strategy will be of broad applicability to the quantitative and molecular in vivo dissection of the transcription cycle and its regulation across many distinct model systems.

Methods and Materials
DNA constructs 365 The fly strain used to express constitutive MCP-mCherry and PCP-eGFP consisted of two transgenic constructs. The first construct, MCP-NoNLS-mCherry, was created by replacing the eGFP in MCP-NoNLS-eGFP (Garcia et al., 2013) with mCherry. The second construct, PCP-NoNLS-eGFP, was created by replacing MCP in the aforementioned MCP-NoNLS-eGFP with PCP, sourced from Larson et al. (2011a). Both constructs were driven with the nanos promoter to deliver protein maternally 370 into the embryo. The constructs lacked nuclear localization sequences because the presence these sequences created spurious fluorescence puncta in the nucleus that decreased the overall signal quality (data not shown). Both constructs were incorporated into fly lines using P-element transgenesis, and a single stable fly line was created by combining all three transgenes.
The reporter construct P2P-MS2-lacZ-PP7 was cloned using services from GenScript, and was 375 incorporated into a fly line using PhiC31-mediated Recombinase Mediated Cassette Exchange (RMCE), at the 38F1 landing site. Full details of construct and sequence information can be found in a public Benchling folder.

Fly strains
Transcription of the hunchback reporter was measured by imaging embryos resulting from crossing Biberach, Germany). The MCP-mCherry, PCP-eGFP, and Histone-iRFP were excited with laser wavelengths of 488 nm, 587 nm, and 670 nm, respectively, using a White Light Laser. Average laser powers on the specimen (measured at the output of a 10x objective) were 35 W and 20 W for the eGFP and mCherry excitation lasers, respectively. Three Hybrid Detectors (HyD) were used to acquire the fluorescent signal, with spectral windows of 496-546 nm, 600-660 nm, and 700-800 nm 395 for the eGFP, mCherry, and iRFP signals, respectively. The confocal stack consisted of 15 equidistant slices with an overall z-height of 7 m and an inter-slice distance of 0.5 m. The images were acquired at a time resolution of 15 s, using an image resolution of 512 x 128 pixels, a pixel size of 202 nm, and a pixel dwell time of 1.2 s. The signal from each frame was accumulated over 3 repetitions. Data were taken for 299 cells over a total of 7 embryos, and each embryo was imaged 400 over the first 25 min of nuclear cycle 14.

Image analysis
Images were analyzed using custom-written software following the protocols in Garcia et al. (2013) and Lammers et al. (2020). Briefly, this procedure involved segmenting individual nuclei using the Histone-iRFP signal as a nuclear mask, segmenting each transcription spot based on its fluorescence, 405 and calculating the intensity of each MCP-mCherry and PCP-eGFP transcription spot inside a nucleus as a function of time. The Trainable Weka Segmentation plugin for FIJI (Arganda-Carreras  et al., 2017), which uses the FastRandomForest algorithm, was used to identify and segment the transcription spots.

410
Inference was done using MCMCstat, an adaptive MCMC algorithm (Haario et al., 2001, 2006). Figures were generated using the open-source gramm package for MATLAB, developed by Pierre Morel (Morel, 2018). Generalized linear regression used in Fig. 3

S1 Full Model
To predict MS2 and PP7 fluorescence traces, we utilized a simple model of transcription initiation, elongation, and cleavage. The entire model has the following free parameters: • ⟨ ⟩, the mean transcription initiation rate After initiation, each RNAP molecule then proceeds forward with the constant elongation rate . Once an RNAP molecule reaches the end of the gene, an additional cleavage time elapses, after 720 which the nascent transcript is cleaved and disappears instantly. This assumption of instantaneous disappearance following cleavage is justified in Section S2 based on the diffusion time scale of individual mRNA molecules. From this position map, and based on the locations of the stem loop sequences along the reporter construct (Fig. S1A), we calculate the predicted MS2 and PP7 fluorescence signals. the 725 contribution to the MS2 signal 2 ( ) of an individual RNAP molecule at position ( ) is given by where 2 and 2 are the start and end positions of the MS2 stem loop sequence, respectively, and 2 is the mCherry fluorescence produced by a single RNAP molecule that has transcribed the entire set of MS2 stem loops. Here, we also assume that RNAP molecules that have only partially transcribed the MS2 stem loops result in a fractional fluorescence given by the fractional length where 7 and 7 are the start and end positions of the PP7 stem loop sequence, respectively, and 7 is the GFP fluorescence produced by a single RNAP molecule that has transcribed the entire set of PP7 stem loops. 735 The temporal dynamics of the total MS2 and PP7 signals 2 ( ) and 7 ( ) are then obtained by summing over all the individual RNAP molecule contributions for each timepoint where is the index of each individual RNAP molecule and is the total number of loaded RNAP molecules. The final signal is then modified by accounting for the scaling factor and the basal fluorescence values of MS2 and PP7 . is necessary because the two fluorescent protein signals have different arbitrary units (see Section S4). Further, the two basal fluorescence values are incorporated to account for the experimentally observed low baseline fluorescence in each 740 fluorescent channel. The final signals ′ 2 ( ) and ′ 7 ( ) are then given by All of the model parameters introduced in this section were used as free parameters in the fitting procedure described in Section S3. Figure S1B and C show a qualitative description of how the various parameters influence the 745 shape of the simulated MS2 and PP7 fluorescence traces. Here, we consider an idealized scenario where the initiation rate consists of a pulse that starts at zero and switches to a constant magnitude < > at time = (Fig. S1B). Figure S1C show the resulting simulated fluorescence values. As a baseline, the basal fluorescence of MS2 and PP7 cause both signals to always be nonzero (red and green horizontal dashed 750 lines). Once the simulation time exceeds the value of (blue dashed line), RNAP molecules begin initiating transcription at the promoter.
MS2 fluorescence (red curve) begins to rise once the first RNAP molecule reaches the start of the MS2 sequence (red vertical bar). The slope of the increase in MS2 fluorescence value is given by the mean initiation rate ⟨ ⟩ (black triangle). 755 When the first RNAP molecule reaches the start of the PP7 sequence (green bar), the PP7 fluorescence also begins to rise with the same slope (green curve). The time between MS2 and PP7 signal onset is given by the distance between the start of the two stem loop repeats divided by the elongation rate, . After a time , where is the total length of the reporter gene, the first RNAP molecule 760 reaches the end of the gene (black dashed line). From here, after the cleavage time has elapsed (brown dashed line), nascent RNA transcripts begin cleaving and the two fluorescence signals plateau at a constant value.

S2 Justification for approximating transcript cleavage as instantaneous
In the model presented in Section S1, we assumed that, when a nascent RNA transcript is cleaved 765 at the end of the reporter gene, its MS2 and PP7 fluorescence signals disappear instantaneously. Here, we justify this assumption by demonstrating that the timescale of mRNA diffusion away from the active locus is much shorter than the experimental resolution of our system. When a nascent RNA transcript is cleaved, it diffuses away from the gene locus. For a free particle with diffusion coefficient , the characteristic timescale to diffuse a length scale is given 770 by ∼ 2 . (S8) In the context of the experiment performed here, this can interpreted as the timescale for a cleaved mRNA transcript to diffuse away from the diffraction-limited fluorescence punctum at the locus due to labeled nascent transcripts. We can estimate the characteristic timescale by plugging in the following values. Assume that 775 the completed transcript possesses a typical mRNA diffusion coefficient of ∼ 0.1 2 ∕ (Gorski  et al., 2006). The length scale corresponds to the Abbe diffraction limit, which yields ∼ 250 for green light with a wavelength of about 500 and a microscope with a numerical aperture of 1.

Plugging these values into the equation yields a diffusion time scale of
As a result, a newly cleaved mRNA transcript will typically diffuse away from the locus in less than 780 a second, meaning that its MS2 and PP7 fluorescence signal will vanish much faster than the experimental time resolution of 15 . For this reason, we can justify approximating the cleavage process as instantaneously removing the fluorescent signals of newly cleaved transcripts.

785
The inference procedures described in the main text were carried out using the established technique of Markov Chain Monte Carlo (MCMC). Specifically, we used the MATLAB package MCMCstat, an adaptive MCMC technique (Haario et al., 2001, 2006). For detailed descriptions, we refer the reader to the the MCMCstat website (https://mjlaine.github.io/mcmcstat/), as well as to a technical overview of MCMC (Geyer, 1992). Briefly, MCMC allows for an estimation of the parameter values 790 of a model that best fit the experimentally observed data along with an associated error. In this work, we use MCMC to infer the best fit values of the transcription cycle parameters given observed fluorescence data at the single-cell level. Then, we combine these inference results across cells to construct distributions of inferred values across the ensemble of cells. MCMC calculates a Bayesian posterior probability distribution of each free parameter given 795 the data by stochastically sampling different parameter values. For a given set of observations and a model with parameters , the so-called posterior probability distribution of possessing a particular set of values is given by Bayes' theorem (S10) This posterior distribution is a combination of three components -the likelihood, prior, and evidence. This latter term represents the probability of the observations possessing their particular 800 values, and allows the overall posterior distribution to be normalized. In practice, the evidence term is often dropped since MCMC can still yield accurate results without requiring this normalization. Thus, we have The prior function contains a priori assumptions about the probability distribution of parameter values , and the likelihood function represents the probability of obtaining the observations, given 805 a particular set of parameters . Thus, the most likely set of parameters occurs when the product of the likelihood and prior is maximized, resulting in a maximum in the posterior function. MCMC extends this by sampling different values of such that an approximation of the full posterior distribution is also obtained. The prior distributions for the inferred parameters were set as follows. The prior distribution for 810 the fluctuations in the initiation rate ( ) at each time point was assumed to be a Gaussian prior centered around 0 AU/min with a standard deviation of 30 AU/min. This penalized fluctuations that strayed too far from zero, smoothing the overall initiation rate ( ). For the rest of the parameters, a uniform distribution was chosen using the following uniform intervals: where SS is a sum-of-squares residual function given by Here, the summation runs over individual time points, corresponds to the MS2 or PP7 fluorescence at a given timepoint, and corresponds to the predicted MS2 or PP7 fluorescence 825 according to the model, for a given set of parameter values. That is, where the subscripts indicate the time index over time points. Similarly, where the superscripts indicate that these are model predictions evaluated at the experimental time points. The MCMC approach samples values of parameters to approximate the posterior probability 830 distribution. There are several algorithms that achieve this -the adaptive technique used in the MCMCstat package is an efficient algorithm that updates the sampling technique to more quickly arrive at the converged distribution. For each inference run, an initial condition of parameter values is chosen. The algorithm then stochastically updates the next set of parameter values based on the current and previous values 835 bioRχiv preprint of the posterior distribution function. After a preset number of updates (typically at least on the order of thousands), the algorithm stops, resulting in a chain of MCMC parameter value samples. The initial period following the initial condition, known as the burn-in time, is typically discarded since the results are not reliable. The remaining values of the chain comprise an approximation of the underlying posterior probability distribution, with smaller errors for longer run times. 840 For the purposes of this work, the MCMC procedure was run by separately inferring parameter values for the data corresponding to each single cell. For each inference, random parameter values were chosen for the initial condition of the sampling algorithm in order to prevent initial condition bias from affecting the inference results. The algorithm was run for a total of 20, 000 iterations, which, after removing a burn-in window of length 10, 000, resulted in a chain of length 10, 000 for 845 each of the 299 cells examined. To assess whether or not the algorithm was run for a sufficient number of iterations, the final chain was examined for rapid mixing, where the sampled values of a particular parameter rapidly fluctuate around a converged value. Figure S2A highlights this rapid mixing in the inferred transcription cycle parameters of a sample single cell. The lack of long-timescale correlations, also exemplified by the quick decay of the autocorrelation function of 850 each chain (Fig. S2B), indicates that the algorithm has converged. In addition, a corner plot of the three transcription cycle parameters (Fig. S2C) illustrates the pairwise correlations between them, demonstrating that the inference did not encounter degenerate solutions, and that each parameter has a fairly unimodal distribution.
These diagnostics provided a check on the quality of the inference results. Afterwards, the mean 855 value of each parameter's final chain was then retained for each single cell for use in the further statistical analysis carried out in the main text.

S3.2 Hierarchical inference procedure
Naively running the MCMC inference on the entire 18 min window of time for each single cell's worth of data compromised the quality of the inferred elongation rate. This is highlighted in 860 Figure S3A, which shows the results of running the MCMC procedure on the entire 18 min window of fluorescence data for sample data from a single cell. The fit deviates from the true MS2 and PP7 signals at early timepoints (black lines), resulting in an unreliable measure of the elongation rate. This deviation stems from the inference procedure treating each time point as equally important, whereas, in reality, earlier time points are more important for estimating the elongation rate since 865 they capture the onset of the MS2 and PP7 fluorescence signals.
To resolve this problem, we split the inference procedure into a hierarchy, where we chose to first run the MCMC procedure on an initial subset of the whole time window, consisting of the first 10 min of fluorescent signal. This time window was long enough to capture the initial rise in both fluorescent channels, but neglected the later time points that did not matter as much for 870 the estimation of the elongation rate. Figure S3B shows the best fit of the MCMC procedure run on only this initial period, where the fit better captures the onset of signal in both fluorescent channels. Comparing the overall inferred elongation rates using the short and long time windows indicated that the long time-window inference scheme yielded systematically lower and more variable single-cell elongation rate values, with a mean and standard deviation of 1.90 kb/min and 875 1.24 kb/min, respectively (Fig. S3D, purple). In contrast, the inference performed over the short time window yielded a more constrained distribution, with a mean and standard deviation of 1.84 kb/min and 0.75 kb/min, respectively (Fig. S3D, yellow).
We next we ran the MCMC procedure on the whole 18 min time window for each single cell, but fixed the value of the elongation rate, , to the best-fit value from the initial inference done 880 previously for that same cell. This did not introduce significant error to the fit, as seen in Figure S3C. Instead, this hierarchical method simply biased earlier time points to be more important for the elongation rate in particular. Thus, with this scheme, we were able to establish a robust procedure for reliably estimating the elongation rate while still analyzing the whole time window of data to be able to measure the other parameters such as the cleavage time.

S3.3 Curation of inference results
Individual single cell inference results were filtered automatically and then curated manually for final quality control. First, due to experimental and computational imaging limits, some MS2 or PP7 trajectories were too short to run a meaningful inference on. As a result, we automatically skipped over any cell with an MS2 or PP7 signal with fewer than 10 datapoints. This amounted to 260 cells 890 skipped out of a total of 1053, with 567 (54%) retained.
Second, some single cells yielded poor fits, due to effects such as low signal-to-noise ratio. For example, some single-cell fluorescent signals did not capture enough of the early part of the nuclear cycle, resulting in ambiguity over when the locus actually began transcribing (Fig. S3E). In such cases, the resulting inferred parameter values were not reliable, and we rejected these fits. Additionally, 895 the model failed to fit to the some fluorescent traces (Fig. S3F). For these cases, we surmised that this poor fit could be due to a multitude of factors, ranging from experimental acquisition noise to intrinsic biological deviation from the model. Due to our inability to further resolve these details, there was no one-size-fits-all approach to automatically filtering these inference results, and so we opted for conservative manual curation, retaining only the fits that were decent. 900 In sum, 299 cells of data were retained out of 567 total after this curation process. We reasoned that, since we still ended up with hundreds of single cells of data, the resultant statistical sample size was large enough to extract meaningful conclusions. To verify that the curation process did not introduce substantial bias into our results, we compared the mean inferred transcription cycle parameters from the 299 curated cells with those from the entire 567 cells, both curated and 905 uncurated (Fig. S3G). The inferred results were nearly identical, with a minor systematic decrease in the mean initiation rate and cleavage time in the case of the uncurated results that nevertheless did not alter any position-dependent trends. Thus, we were confident that the curated dataset constituted a representative sample of the whole embryo. 910 In the inference scheme presented in the main text, we allowed the scaling factor between the MS2 and PP7 fluorescence signals to be a free model parameter. Allowing this scaling factor to be a free parameter facilitates adoption of the method and obviates the need for an external control measurement to calibrate the MS2 and PP7 signals. Here, we show that the inferred scaling parameter is comparable to that resulting from such calibration measurements. 915 Due to the fact that the MS2 and PP7 stem loop sequences were associated with mCherry and eGFP fluorescent proteins, respectively, the two experimental fluorescent signals possessed different arbitrary fluorescent units, related by the scaling factor

S4 Calibration of MS2 and PP7 signals
where 2 and 7 are the fluorescence values generated by a fully transcribed set of MS2 and PP7 stem loops, respectively. Although has units of MS2 ∕ PP7 , we will express without units in 920 the interest of clarity of notation. The value of was inferred as described above in Section S3. As an independent validation, we measured by using another two-color reporter, consisting of 24 alternating, rather than sequential, MS2 and PP7 loops (Wu et al., 2014; Chen et al., 2018) inserted at the 5' end of our reporter construct (Fig. S4A). Figure S4B shows a representative trace of a single spot containing our calibration construct. 925 For each time point in nuclear cycle 14, the mCherry fluorescence in all measured single-cell traces was plotted against the corresponding eGFP fluorescence (Fig. S4C, yellow points). The mean was t fi l a n fi l a n g then calculated by fitting the resulting scatter plot to a line going through the origin (Fig. S4C, black line). The best-fit slope yielded the experimentally calculated value of = 0.154 ± 0.001 (SEM). A distribution for was also constructed by dividing the mCherry fluorescence by the corresponding 930 eGFP fluorescence for each datapoint in Figure S4D, yielding the histogram in Figure S4D (yellow), which possessed a standard deviation of 0.0733.
Binning the cells by position along the embryo revealed a slight position dependence in the scaling factor, with higher values of in the anterior, about 0.15, and lower values in the posterior, about 0.1 (Fig. S4E, yellow). 935 We then compared the values of from the single-cell inference in the main text to those of the control experiment. Figure S4D shows a histogram of inferred values of from the inference procedure (blue), with a mean of 0.161 ± 0.003 (SEM) and a standard deviation of 0.049, in agreement with the independent measurement described above. Furthermore, the inference revealed the same position-dependent trend as the control experiment, with higher mean values of in the 940 anterior of the embryo (Fig. S4E, blue line).
The position dependence observed both in the calibration experiments and inference suggests that this spatial modulation in the value of is not an artifact of the constructs or our analysis, but a real feature of the system. We speculated that this spatial dependence could stem from differential availability of MCP-GFP and PCP-GFP along the embryo, leading to a modulation in the maximum 945 occupancy of the MS2 stem loops versus the PP7 stem loops. Regardless, our data demonstrate that the inferred and calibrated can be used interchangeably.
With the inference of validated against the independent control calculation, the MS2 signals for each single cell could be rescaled to the same units as the PP7 signal (Fig. S4F). All plots in the main text and supplementary information, unless otherwise stated, reflect these rescaled values 950 using the overall mean value of = 0.1539 obtained from the inference.

S5 Validation of the RNAP processivity assumption
The calibration between the MS2 and PP7 (Section S4) signals provided an opportunity to test the processivity assumption presented in the main text, namely that the majority of loaded RNAP molecules transcribe to the end of the gene without falling off. To estimate the processivity 955 quantitatively, we assume that a series of RNAP molecules transcribes past the MS2 stem loop sequence at the 5' end of the reporter gene, and that only successfully transcribe past the PP7 stem loop sequence at the 3' end. Here, we define to be the processivity factor, and require 0 < < 1. Thus, = 1 indicates maximal processivity where every RNAP molecule that transcribes the MS2 sequence also transcribes the PP7 sequence, and = 0 indicates minimal processivity, 960 where no RNAP molecules make it to the PP7 sequence.
We assume that no RNAP molecules fall off the gene while they transcribe the interlaced MS2/PP7 loops used in the calibration experiment described in Figure S4A. Under this assumption, RNAP molecules will fully transcribe both sets of stem loop sequences, allowing us to define the scaling factor as the ratio of total fluorescence values 965 calib = 2 7 = 2 7 . (S17) Note that, in this simple model, RNAP molecules can still fall off the gene after they transcribe the set of MS2/PP7 loops. Now, we consider the construct with MS2 and PP7 at opposite ends used in the main text. Allowing a fraction of RNAP molecules to fall off the gene between the MS2 and PP7 loops, we arrive at a scaling factor infer = We can thus calculate the processivity from taking the ratio of the true and biased scaling factors 970 = calib infer . (S19) Taking the mean value of calib from our control experiment to be the true value and the mean value of infer from the inference from the main text to be the biased value, we calculate a mean processivity of = 0.96, with a negligible standard error of . × − . Thus, on average, 96% of RNAP molecules that successfully transcribe the 5' MS2 stem loop sequence also successfully transcribe the 3' PP7 stem loop sequence, confirming previous results (Garcia et al., 2013) and 975 lending support to the processivity assumption invoked in our model.

S6 Comparing intra-and inter-embryo variability
In the analysis in the main text, we treated all single cell inference results equally within one statistical set. In principle, this is justified only if the variability between single cells is at least as large as the variability between individual embryos. In this section we prove this assumption. 980 Here, we examine two quantities: the intra-embryo variability, defined as the variance in a parameter across all single cells in a single embryo, and the inter-embryo variability, defined as the variance across embryos in the single-embryo mean of a parameter. We examined these two quantities for the three primary inferred parameters -the mean initiation rate, elongation rate, and cleavage time. 985 Figure S5 shows the results of this comparison, where the red (blue) lines indicate the intra-(inter-) embryo variability and the red (blue) shaded regions indicate the standard error (bootstrapped standard error) in the intra-(inter-) embryo variability. For all of the parameters, the intra-embryo variability is at least as large as the inter-embryo variability, validating our treatment of all of the single-cell inference results as a single dataset, regardless of embryo. the cell-to-cell variability in transcription initiation. Further, under the right conditions, the variability reported by this method has been shown to be dominated by biological sources of variability and to have a negligible contribution from experimental sources of noise (Zoller et al., 2018). We sought to determine how well our approach could report on biological variability. To do so, 1000 we contrasted the inference results of the transcriptional activity of our hunchback reporter with a snapshot-based analysis inspired by single-molecule FISH (Zoller et al., 2018). Specifically, we calculated the CVs in the raw MS2 and PP7 fluorescence in snapshots taken at 10 minutes after the start of nuclear cycle 14. We reasoned that, since this calculation does not utilize the full timeresolved nature of the data, it provides a baseline measurement of total noise that encompasses 1005 both experimental and biological variability. As a point of comparison, we also calculated the CV in the instantaneous MS2 signal from another work using a similar P2P-MS2-lacZ construct (Eck et al.,  2020). Figure S6A shows the CV as a function of embryo position as reported by these different approaches. For the static measurements (red, green, and blue), the CV lay around 20% to 40%, 1010 reaching a peak near 40% along the length of the embryo. The CV of the inferred mean initiation rate (purple) exhibited similar values, although it was slightly lower in a systematic fashion. This difference was likely due to the fact that the inference relies on time-dependent measurements that can average out certain sources of error such as experimental noise, whereas such time averaging is not possible in the context of single time point measurements. 1015 To test whether the discrepancy in the CV between time-resolved and snapshot-based measurements arose from differences in the experimental error of each technique, we utilized the alternating MS2-PP7 reporter used in the calibration calculation (Section S4) to separate out experimental sources of variability from true biological sources. Specifically, because the MS2 and PP7 fluorescent signals in this reporter construct should, in principle, reflect the same underlying 1020 biological signal, deviations in each signal from each other should report on the magnitude of the experimental error.

S7 Comparison of variability in mean initiation rate reported by our inference with static measurements
Following the formalism introduced by Elowitz et al. (2002), we identify the correlated noise between the MS2 and PP7 signals as true biological variability. In contrast we identify the uncorrelated noise with experimental variability. We then used the two-color formalism (Elowitz et al., 2002) to separate out experimental noise from biological noise using the MS2 and PP7 fluorescent signals at each point in time presented in Figure S4C . First, we defined the deviations 2 and 7 of each instantaneous MS2 and PP7 fluorescent signal from the mean MS2 and PP7 fluorescence signals, averaged across nuclei and time Note that the total noise 2 is simply the squared coefficient of variation. Thus, the squared coefficient of variation (CV 2 ) of our data is equal to 2 and can be separated into the uncorrelated and correlated components. Figure S6B shows this CV 2 (averaged across embryo position) compared with the separated uncorrelated and correlated noise sources. Intriguingly, the uncorrelated and correlated noise 1030 (yellow) each contribute about half to the overall noise, which is quantitatively comparable to the CV 2 of the static snapshot of MS2 and PP7 data used in the main text (red, green), roughly 20%. Furthermore, the CV 2 of the inferred mean initiation rate is roughly half of the CV 2 of the static fluorescence measurements and is quantitatively comparable to the correlated noise, at about 10%.
As a result, the MCMC inference method can quantitatively capture the true biological variability 1035 in the mean initiation rate while separating out the uncorrelated contribution due to experimental noise. Thus our results support the power of model-driven inference approaches in providing clean readouts of variability in transcriptional parameters.

S8 Comparison of distribution of elongation rates with other works
As an additional validation of our inference results, we compared the distribution of single-cell  Figure S7 shows the comparison of distributions of elongation rates. Because the reporter constructs and analysis techniques differed between works, a quantitative comparison is not possible. Nevertheless, all three sets of results report a significant cell-to-cell variability in mean elongation rate, ranging from 1 kb/min to 3 kb/min. 1050 To investigate the molecular mechanisms underling single-cell distributions of elongation rates obtained from the inference, we developed a single-molecule theoretical model. We were interested in how the observed variability in single-cell elongation rates could constrain models of the singlemolecule variability in RNAP elongation rates. To disregard effects due to position-dependent modulations in the transcription initiation rate, we only studied cells anterior of 40% along the 1055 embryo length, where the initiation rate was roughly constant.

S9 Theoretical investigation of single-cell distribution of elongation rates
The model was adapted from the stochastic Monte Carlo simulation used in Klumpp and Hwa (2008), which accounts for the finite size of RNAP molecules (Fig. S8). Here, single RNAP molecules are represented by one-dimensional objects of size that traverse a gene consisting of a one-dimensional lattice with a total number of sites, corresponding to single base pairs, equal to 1060 . The position of the active site of molecule is given by , which takes integer values-each integer corresponds to a single base pair of the gene lattice. Because RNAP molecules have a finite size, given by , an RNAP molecule thus occupies the lattice sites from to + . New RNAP molecules are loaded at the start of the gene located at = 0. Due to the exclusionary interactions between molecules, simultaneously simulating the motion of all molecules is unfeasible, 1065 and a simulation rule dictating the order of events is necessary. At each simulation timestep , a randomized sequence of indices is created from the following sequence where {1, … , } correspond to any RNAP molecules = 1, … , already existing on the gene, and 0 corresponds to the promoter loading site for new RNAP molecules. The process is repeated until a total simulation time has elapsed. Choosing indices from the random sequence  obtained above, the following actions are taken. If the promoter loading site is chosen ( = 0), an RNAP molecule is loaded with probability , only if no already existing RNAP molecules overlap with the footprint of the new RNAP molecule. If such an overlap occurs, then no action is taken. To calculate the probability of loading, a random number is drawn from a Poisson distribution with parameter . Recall that, for a Poisson distribution with 1075 parameter , the resulting random variable corresponds to the number of occurrences in a time frame . Here, if this number is one or higher, then the loading event is considered a success.
If the index indicates that an RNAP molecule was chosen > 0, then that RNAP molecule advances forward with stochastic rate . This probability is simulated by drawing a random number from a Poisson distribution with parameter , thus giving an expected distance traveled of 1080 per timestep. If this movement would cause the RNAP molecule to overlap with another RNAP molecule, then no action is taken. Otherwise, the RNAP molecule moves forward the number of steps given by the generated random variable.
To simulate potential single-molecule variability, each RNAP molecule can possess a different stepping rate . For a given RNAP molecule , its stochastic stepping rate is drawn from a truncated 1085 normal distribution with mean and standard deviation and lower and upper limits 0 and infinity, respectively = ( , , 0, ∞).
Once the position of the active site of an RNAP molecule exceeds that of the total number of sites , i.e. the molecule reaches the end of the gene, it is removed from the simulation. The simulation does not incorporate any cleavage or RNAP termination processes, since it only focuses 1090 on studying elongation dynamics in the body of the gene. Additionally, we do not incorporate sequence-dependent RNAP pausing along the gene.
To calculate a mean elongation rate, we first computed a mean elongation rate for each single RNAP molecule in the simulation. This single-molecule elongation rate was obtained by taking the finite difference in position divided by the timestep for each RNAP molecule loaded during a single simulation rate. Then, we averaged these single-molecule elongation rates to obtain a simulation-wide mean elongation rate.
We treated each simulation as an individual cell such that we interpreted this quantity as

Parameter Description Value
total simulation time 600 sec simulation timestep 0.5 sec size of lattice 6626 bp RNAP footprint (Selby et al., 1997) 40 bp mean loading rate 0.17 sec −1 standard deviation of loading rate 0.05 sec −1 mean elongation rate free parameter standard deviation of elongation rate free parameter Table S1. Parameters used in single-molecule Monte Carlo simulation of elongation rates.
precisely the single-cell elongation rate inferred from the data in the main text. Thus, the distribution of mean elongation rates across a simulated population of cells was compared to the experimentally 1100 inferred distribution of single-cell elongation rates. The simulation was run for 200 cells.
Finally, to account for single-cell variability in the transcription initiation rate, the loading rate was allowed to vary across each simulated cell and was drawn from a Gaussian distribution with parameters reflecting the actual data. Since hunchback is known to load new nascent RNA transcripts at a rate of 1 molecule every 6 seconds in the anterior of the embryo (Garcia et al., 1105 2013), we thus chose the mean of this distribution to be 1 molecule∕6 = 0.17 −1 . The standard deviation was chosen to be this mean multiplied by the CV in the initiation rate in the anterior inferred in the main text, resulting in a value of 0.05 −1 . Thus, for simulated cell where any negative value was replaced with zero. The values of each simulation parameter are summarized in Table S1. 1110 To investigate the nature of molecular variability in elongation rates, we attempted to fit the mean and variance of the simulated distribution of elongation rates with those of the inferred single-cell distribution from the data.
First, we fixed to zero and left as a free parameter (Fig. 2F, brown). Because here each RNAP molecule had the same stepping rate, the exclusionary interactions between molecules did not 1115 appear to substantially alter the single-cell mean elongation rate and the model could not produce the high single-cell variability in elongation rate seen in the data (Fig. 2F, blue).
In contrast, if we also left the single-molecule variability in elongation rate as a free parameter, the model could reasonably produce the distribution observed in the data (Fig. 2F, gold). The model produced a simulated mean (standard deviation) of 1.83(0.75) kb/min compared to the inferred 1120 values of 1.84(0.75) kb/min from the data.
The best fit parameters were = 2.76 kb/min and = 3.72 kb/min, indicating that substantial variability in the single-molecule elongation rate was necessary. In addition, the value of was much larger than the overall mean elongation rate of 1.84 kb/min seen in the data, likely due to the fact that the high densities of RNAP molecules here resulted in traffic jams, effectively slowing 1125 down the overall single-cell elongation rate despite the high single-molecule stepping rates of each individual RNAP molecule.  Figure S8. Cartoon overview of simulation. RNAP molecules with footprint stochastically advance one-dimensional gene represented as a lattice with unique sites, with each site equivalent to a single base pair. Each RNAP molecule possesses an intrinsic stepping rate , and each cell stochastically loads new RNAP molecules at the promoter with rate .