bioRxiv
New Results

Estimating RNA dynamics using one time point for one sample in a single-pulse metabolic experiment

Micha Hersch, Adriano Biasini, Ana Claudia Marques, Sven Bergmann
doi: https://doi.org/10.1101/2020.05.01.071779
Micha Hersch (1, 2), for correspondence: micha.hersch@unil.ch
Adriano Biasini (1)
Ana Claudia Marques (1)
Sven Bergmann (1, 2)
1 Department of Computational Biology, University of Lausanne, CH-1015 Lausanne, Switzerland
2 Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland

Abstract

Over the past decade, experimental procedures such as metabolic labeling for determining RNA turnover rates at the transcriptome-wide scale have been widely adopted. Several computational methods to estimate RNA processing and degradation rates from such experiments have been suggested, but they all require several RNA sequencing samples. Here we present a method that can estimate RNA synthesis, processing and degradation rates from a single sample. To this end, we use the Zeisel model and take advantage of its analytical solution, reducing the problem to solving a univariate non-linear equation on a bounded domain. This makes our method computationally efficient, while enabling inference of rates that correlate well with previously published data sets. Using our approach on a single sample, we were able to reproduce and extend the observation that dynamic biological processes such as transcription or chromatin modifications tend to involve genes with higher metabolic rates, while stable processes such as basic metabolism involve genes with lower rates.

In addition to saving experimental work and computational time, having a sample-based rate estimation has several advantages. It does not require an error-prone normalization across samples and enables the use of replicates to estimate uncertainty and perform quality control. Finally, the method and theoretical results described here are general enough to be useful in other settings such as nucleotide conversion methods.

1 Introduction

Since the advent of molecular biology, a consensus has emerged that the regulation of gene expression underlies most biological processes including development, disease and adaptation [19, 15, 18]. While gene expression regulation has mostly been associated with activating the production of RNA (e.g. through transcription factors), it has become apparent that the regulation of RNA splicing and RNA stability also plays an important role in determining the expression level of a gene [8, 1]. Taking advantage of high throughput RNA quantification protocols, methods designed to distinguish the effects of RNA synthesis, processing and degradation at the transcriptome-wide level have been developed. Among them, RNA metabolic labeling techniques relying on chemically modified ribonucleotides such as 4-thiouridine (4sU) and 5’-Bromouridine (BrU) have been widely adopted, as their impact on cellular function is minimal [9]. Briefly, incubating cells with modified ribonucleotides for a limited period of time (referred to as the pulse) leads to their incorporation in newly synthesized transcripts, which allows distinguishing newly transcribed from preexisting RNA; the two pools can then be biochemically separated and quantified. This quantification, initially performed using microarrays [7] and later using RNA-seq [20, 23], can then be used to estimate RNA decay rates. More recently, methods that rely on nucleotide conversion have been used to the same effect, with the advantage of circumventing the cumbersome biochemical enrichment and separation step.

In the last decade, several methods to estimate RNA dynamics from metabolic labeling experiment data have been developed [26, 21, 2]. Typically, labeled transcript abundances are fitted to an exponential approach to equilibrium, from which the RNA half-life can be estimated [24, 16]. This requires time-course experiments in order to have enough points for fitting, as well as a way to normalize RNA concentrations across samples, either using spike-ins [22], or using internal controls such as intron concentrations [17]. The INSPEcT method [6] goes beyond first order dynamics and takes into account the RNA processing rates, which are estimated along with the degradation and synthesis rates. This increases the number of parameters in the model and thus the number of samples needed for the estimation.

In this work, we build on this approach. However, by considering the intron to exon ratio for each transcript in both the labeled and unlabeled RNA pools, we bypass the need for normalization across samples. Moreover, by using the analytical solution to our RNA model, we can infer processing and degradation rates from a single sample and time point. This has several advantages, such as reducing the experimental load and costs, as well as enabling comparisons across samples and time points. Applying our method to our own experimental data and using a single sample and time point, we obtain mRNA degradation rates that correlate well with previously published rates obtained with three replicates and seven time points [12].

2 Method

2.1 Overview

This paragraph summarizes the general strategy of the method, with references to relevant equations indicated in parentheses. We use the Zeisel model of RNA dynamics [27] to model both the unlabeled and the labeled RNA (1). Using the standard procedure for solving systems of linear differential equations, we find its general solution and its free parameters by setting the initial conditions for both the unlabeled (or pre-existing) and the labeled RNA (3,5), as illustrated in Fig 1. We can then express, for a given gene, the ratios of intron to exon expression level for both unlabeled and labeled RNA as functions of the processing and degradation rates of that gene (8,9). Those two ratios are independent of the RNA synthesis rate. Using the intron to exon ratios as observables, we are left with two non-linear equations and two unknowns, namely the processing and degradation rates. Those equations are then reparametrized with dimensionless parameters and reduced to a single non-linear equation with one unknown (22). This resulting equation is only defined on a bounded domain (24). Our rates can thus be inferred by numerically solving that equation on a bounded domain, which is very fast. In addition, we prove in Appendix 2 that under certain conditions, that equation has a single solution (but in general it can also have two solutions or none).

Fig. 1.

Evolution of unlabeled and labeled, premature and mature RNA during labeling according to the Zeisel model. Dotted horizontal lines correspond to steady-state levels, dashed lines correspond to the unlabeled RNA and solid lines to the labeled RNA. Processing and degradation rates can be estimated from the ratios of the two dashed lines and of the two solid lines at a single time point.

2.2 Model

Like previous work [6], we use the Zeisel model of RNA synthesis, processing and degradation [27]:

$$\frac{dp}{dt} = \alpha - \beta p \tag{1}$$

$$\frac{dm}{dt} = \beta p - \gamma m \tag{2}$$

where p is the premature RNA, m the mature RNA, and α, β, γ are the RNA synthesis, processing and degradation rates. This model can be solved analytically (see Appendix 1). In particular, enforcing the boundary conditions corresponding to the unlabeled RNA, namely that it is at steady state when the pulse starts (t = 0) and that unlabeled premature RNA is no longer produced afterwards, results in

$$p_u(t) = \frac{\alpha}{\beta}\, e^{-\beta t} \tag{3}$$

$$m_u(t) = \frac{\alpha}{\gamma(\gamma - \beta)}\left(\gamma e^{-\beta t} - \beta e^{-\gamma t}\right) \tag{4}$$

where the u subscript indicates that this corresponds to the unlabeled RNA pool.

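Enforcing boundary conditions corresponding to the labeled RNA, namely that it is not expressed at t = 0, leads to

$$p_l(t) = \frac{\alpha}{\beta}\left(1 - e^{-\beta t}\right) \tag{5}$$

$$m_l(t) = \frac{\alpha}{\gamma} - \frac{\alpha}{\gamma(\gamma - \beta)}\left(\gamma e^{-\beta t} - \beta e^{-\gamma t}\right) \tag{6}$$

where the l subscript indicates that this corresponds to the labeled RNA pool.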
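As a sanity check, the closed-form solution can be compared against a direct numerical integration of the model. The following is a minimal sketch, not the authors' R implementation: the function names are hypothetical, the closed forms are written out from the boundary conditions stated above (assuming β ≠ γ), and the labeled pool is obtained as the steady-state level minus the unlabeled pool, as described in Appendix 1.

```python
import math


def unlabeled(alpha, beta, gamma, t):
    """Unlabeled premature and mature RNA at time t (steady state at t = 0,
    no new unlabeled synthesis afterwards). Assumes beta != gamma."""
    e_b, e_g = math.exp(-beta * t), math.exp(-gamma * t)
    p_u = alpha / beta * e_b
    m_u = alpha / (gamma * (gamma - beta)) * (gamma * e_b - beta * e_g)
    return p_u, m_u


def labeled(alpha, beta, gamma, t):
    """Labeled pools: total RNA stays at steady state during the pulse."""
    p_u, m_u = unlabeled(alpha, beta, gamma, t)
    return alpha / beta - p_u, alpha / gamma - m_u


def euler_labeled(alpha, beta, gamma, t_end, steps=100_000):
    """Forward-Euler integration of dp/dt = alpha - beta*p and
    dm/dt = beta*p - gamma*m, starting from p = m = 0."""
    dt = t_end / steps
    p = m = 0.0
    for _ in range(steps):
        p, m = p + dt * (alpha - beta * p), m + dt * (beta * p - gamma * m)
    return p, m


if __name__ == "__main__":
    print(labeled(3.0, 2.0, 0.5, 1.0))
    print(euler_labeled(3.0, 2.0, 0.5, 1.0))
```

The two printed pairs should agree up to the discretization error of the Euler scheme.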

2.3 Inferring processing and degradation rates

We consider that the exonic RNA abundance χ corresponds to the premature and mature RNA, while the intronic RNA abundance 𝒾 corresponds to the premature RNA only. Furthermore, we assume that χ and 𝒾 are suitably normalised for exonic and intronic length so that they are proportional to the number of transcripts. We can then compute:

$$\frac{i(T)}{\chi(T)} = \frac{p(T)}{p(T) + m(T)} \tag{7}$$

where T is the duration of the labeling.

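In the case of the unlabeled fraction, we have

$$\frac{i_u(T)}{\chi_u(T)} = \frac{\gamma(\gamma - \beta)E_\beta}{\gamma^2 E_\beta - \beta^2 E_\gamma} \tag{8}$$

where we define Eβ = exp(−βT) and Eγ = exp(−γT) as abbreviations.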

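For the labeled fraction, we have

$$\frac{i_l(T)}{\chi_l(T)} = \frac{\gamma(\gamma - \beta)\left(1 - E_\beta\right)}{\gamma^2\left(1 - E_\beta\right) - \beta^2\left(1 - E_\gamma\right)} \tag{9}$$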

We notice that this last expression is of the same form as the one for the unlabeled fraction (8), but replacing exponentials by their complement to one. Importantly those two fractions do not depend on α, which (unlike [10]) allows our method to estimate processing and degradation rates independently from the synthesis rate.
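The independence from α can be checked numerically. The sketch below is illustrative only (function names are hypothetical, β ≠ γ is assumed): it computes the intron to exon ratios once from the pool sizes p and m for several values of α, and once from expressions that involve only β, γ and T.

```python
import math


def pools(alpha, beta, gamma, T):
    """Unlabeled and labeled premature/mature pools after a pulse of length T."""
    e_b, e_g = math.exp(-beta * T), math.exp(-gamma * T)
    p_u = alpha / beta * e_b
    m_u = alpha / (gamma * (gamma - beta)) * (gamma * e_b - beta * e_g)
    return p_u, m_u, alpha / beta - p_u, alpha / gamma - m_u


def ratios_from_pools(alpha, beta, gamma, T):
    """Intron/exon ratios p / (p + m) for the unlabeled and labeled pools."""
    p_u, m_u, p_l, m_l = pools(alpha, beta, gamma, T)
    return p_u / (p_u + m_u), p_l / (p_l + m_l)


def ratios_closed_form(beta, gamma, T):
    """Same ratios written without alpha; the labeled one replaces each
    exponential by its complement to one."""
    e_b, e_g = math.exp(-beta * T), math.exp(-gamma * T)
    a = gamma * (gamma - beta) * e_b / (gamma**2 * e_b - beta**2 * e_g)
    b = (gamma * (gamma - beta) * (1 - e_b)
         / (gamma**2 * (1 - e_b) - beta**2 * (1 - e_g)))
    return a, b


if __name__ == "__main__":
    for alpha in (0.1, 1.0, 50.0):
        print(alpha, ratios_from_pools(alpha, 2.0, 0.5, 0.2))
    print(ratios_closed_form(2.0, 0.5, 0.2))
```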

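Denoting a = 𝒾u(T)/χu(T) and b = 𝒾l(T)/χl(T) as the observable unlabeled and labeled fractions of intron abundance, we are left with a system of two equations and two unknowns β and γ, which we now set out to solve. First, we reparametrize our system with β = kγ and define Ekγ = Eβ = exp(−kγT), leading to

$$a = \frac{(1 - k)E_{k\gamma}}{E_{k\gamma} - k^2 E_\gamma} \tag{10}$$

$$b = \frac{(1 - k)\left(1 - E_{k\gamma}\right)}{\left(1 - E_{k\gamma}\right) - k^2\left(1 - E_\gamma\right)} \tag{11}$$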

We thus have Embedded Image Embedded Image

Summing (12) and (13) yields Embedded Image Embedded Image

Dividing (12) by (13) and inserting (15) results in Embedded Image

It follows that Embedded Image Embedded Image and thus Embedded Image Embedded Image

Moreover, from (10), we have that Embedded Image

Multiplying (20) by Embedded Image and subtracting (21) results in Embedded Image

Our system of two equations can thus be reduced to a single equation which does not explicitly depend on T and can be solved numerically. In practice a and b are approximated by ru and rl, defined as the length-normalized intronic to exonic read count ratio (or TPM ratio) for the unlabeled and for the labeled sample, respectively. This equation also provides upper and lower bounds for k as both a + k − 1 and bk + b − 1 must be strictly positive for their logarithm to be defined and Embedded Image for (19) to hold. Developing those three conditions results in the following domain of definition 𝒟 for k:

$$\max\left(1 - a,\ \frac{1 - b}{b}\right) < k < \frac{1 - a}{a} \tag{24}$$

where 0 < a < b < 1. Note that the right-hand side of (22) is in general continuous at k = 1, but not at k = 1 − a. Furthermore, it can be shown (see Appendix 2) that for b > (2 − a)^{-1}, (22) has a single solution in the domain given by (24), which can be found very efficiently. This enables the estimation of the processing and degradation rates for a single sample. Moreover, since the reduced equation is independent of T, uncertainty on its true value does not affect the relative values of the resulting rates. Hence replicates can be used to assess the reliability of the estimates, and time courses make it possible to test whether the rates are constant as assumed by the model.
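The single-sample inference can be sketched as a bounded one-dimensional root search. The code below is an independent reconstruction under stated assumptions, not the published R package: eliminating the exponentials from the two ratio equations (with β = kγ) gives exp(−γT) = (bk + b − 1)(a + k − 1) / ((b − a)k²), and the remaining consistency condition is written here as a residual in k whose exact form may differ from equation (22). The function names, the bisection scheme and the small offsets from the domain boundaries are all choices made for this illustration, and only the unambiguous case b > 1/(2 − a) is handled.

```python
import math


def ratios(beta, gamma, T):
    """Forward model: intron/exon ratios (a, b) for given rates (beta != gamma)."""
    e_b, e_g = math.exp(-beta * T), math.exp(-gamma * T)
    a = gamma * (gamma - beta) * e_b / (gamma**2 * e_b - beta**2 * e_g)
    b = (gamma * (gamma - beta) * (1 - e_b)
         / (gamma**2 * (1 - e_b) - beta**2 * (1 - e_g)))
    return a, b


def log_e_gamma(k, a, b):
    """log(exp(-gamma*T)) implied by (a, b) for a candidate k = beta/gamma."""
    return math.log((b * k + b - 1) * (a + k - 1) / ((b - a) * k * k))


def residual(k, a, b):
    """Vanishes when the candidate k is consistent with both observed ratios."""
    if abs(k - 1.0) < 1e-7:
        second = 1.0 / a - 2.0  # analytic limit of the second term for k -> 1
    else:
        second = math.log(a * k * k / (a + k - 1)) / (1.0 - k)
    return log_e_gamma(k, a, b) + second


def solve_rates(a, b, T, iters=200):
    """Return (beta, gamma) from one sample, by bisection on the bounded domain."""
    if not (0 < a < b < 1) or b <= 1 / (2 - a):
        raise ValueError("outside the region with a single solution")
    k_min, k_max = 1 - a, (1 - a) / a
    width = k_max - k_min
    # Step slightly inside the open domain; k_max itself is a trivial zero.
    lo, hi = k_min + 1e-9 * width, k_max - 1e-6 * width
    if residual(lo, a, b) <= 0 or residual(hi, a, b) >= 0:
        raise ValueError("no sign change found; a finer search is needed")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid, a, b) > 0:
            lo = mid
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    gamma = -log_e_gamma(k, a, b) / T
    return k * gamma, gamma


if __name__ == "__main__":
    a, b = ratios(2.0, 0.5, 0.2)   # simulate one gene
    print(solve_rates(a, b, 0.2))  # should recover rates close to (2.0, 0.5)
```

Because T only enters in the final rescaling, an error on the labeling time changes all rates by a common factor and leaves k unchanged.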

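If (22) does not have a solution, estimates can be obtained by minimizing (in log space) the squared Euclidean distance between the observed (i.e., ru, rl) and derived values of a and b:

$$f(\beta, \gamma) = \left(\log r_u - \log a(\beta, \gamma)\right)^2 + \left(\log r_l - \log b(\beta, \gamma)\right)^2 \tag{25}$$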

The ratios ru, rl must be smaller than one to make sense within our model and genes where this is not the case should be discarded. The log function is used to give exon and intron counts equal standing.

The above bivariate function can be reduced to a univariate function f ∗ using (20): Embedded Image
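The fallback objective can be illustrated as follows. This is a rough sketch under assumptions: the objective is written directly in terms of the dimensionless products βT and γT, the univariate reduction f ∗ is not reproduced, and the coarse log-space grid scan (a hypothetical helper) only provides a starting point that a proper optimizer would refine. The half-step offset on the γT grid is there to avoid the β = γ case, where the closed-form ratios below are undefined.

```python
import math


def ratios(beta_T, gamma_T):
    """Model intron/exon ratios as functions of beta*T and gamma*T."""
    e_b, e_g = math.exp(-beta_T), math.exp(-gamma_T)
    a = gamma_T * (gamma_T - beta_T) * e_b / (gamma_T**2 * e_b - beta_T**2 * e_g)
    b = (gamma_T * (gamma_T - beta_T) * (1 - e_b)
         / (gamma_T**2 * (1 - e_b) - beta_T**2 * (1 - e_g)))
    return a, b


def objective(beta_T, gamma_T, r_u, r_l):
    """Squared Euclidean distance, in log space, between observed and model ratios."""
    a, b = ratios(beta_T, gamma_T)
    return (math.log(r_u) - math.log(a))**2 + (math.log(r_l) - math.log(b))**2


def coarse_scan(r_u, r_l, n=40, span=5.0):
    """Evaluate the objective on a log-spaced grid and return the best point
    as (value, beta_T, gamma_T)."""
    best = None
    for i in range(n + 1):
        log_b = -span + 2 * span * i / n
        for j in range(n + 1):
            log_g = -span + 2 * span * (j + 0.5) / n  # offset keeps beta != gamma
            value = objective(math.exp(log_b), math.exp(log_g), r_u, r_l)
            if best is None or value < best[0]:
                best = (value, math.exp(log_b), math.exp(log_g))
    return best


if __name__ == "__main__":
    r_u, r_l = ratios(0.4, 0.1)
    print(objective(0.4, 0.1, r_u, r_l))  # zero for ratios generated by the model
    print(coarse_scan(r_u, r_l))
```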

Once processing and degradation rates are obtained, the (relative) synthesis rates α can be easily obtained from (4), where mu is approximated by χu (which is likely the most reliably measured species).
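As an illustration of this last step, the unlabeled mature-RNA solution can be inverted for α. The sketch below assumes that solution has the form m_u(T) = α (γ e^{−βT} − β e^{−γT}) / (γ(γ − β)), which follows from the boundary conditions of section 2.2; the function name is hypothetical, and the result is only relative because χu is expressed in arbitrary expression units.

```python
import math


def synthesis_rate(chi_u, beta, gamma, T):
    """Relative synthesis rate alpha from the unlabeled exonic abundance chi_u,
    used as a proxy for unlabeled mature RNA. Assumes beta != gamma."""
    e_b, e_g = math.exp(-beta * T), math.exp(-gamma * T)
    return chi_u * gamma * (gamma - beta) / (gamma * e_b - beta * e_g)


if __name__ == "__main__":
    # Round trip: build m_u from known rates, then recover alpha.
    alpha, beta, gamma, T = 3.0, 2.0, 0.5, 0.2
    m_u = alpha / (gamma * (gamma - beta)) * (
        gamma * math.exp(-beta * T) - beta * math.exp(-gamma * T))
    print(synthesis_rate(m_u, beta, gamma, T))
```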

3 Results

3.1 Simulated data

In order to confirm that our method can be applied in principle, we evaluated it on simulated data generated using the exact model used to develop the method (see equations 3 and following). We generated 50000 random values for α, β, and γ ranging between exp(−5) and exp(5) and computed the corresponding values for 𝒾 and χ. We then computed ru and rl by taking the ratio. Estimates β̂ and γ̂ were then inferred by using ru and rl as input to the method and compared to the original β and γ.

Numerically solving equation (22) yielded either one or two solutions. The results for the unambiguous cases are shown in Fig 2. We see that in virtually all cases, the method yields accurate estimates of the processing and degradation rates. For a few points, the method is less accurate at the upper boundary of the parameter space, probably due to limited floating point precision. Indeed, if the labeling time is too long with respect to the metabolic rates, virtually all unlabeled RNA is degraded and the rates cannot be reliably estimated.
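A simulation of this kind can be sketched in a few lines. The example below makes several assumptions that are not stated in the text: rates are drawn log-uniformly between exp(−5) and exp(5), the labeling time is set to T = 1, and α is omitted because the ratios do not depend on it. It only generates the observables and reports which share of the simulated genes falls in the region b > 1/(2 − a), where a single solution is expected.

```python
import math
import random


def ratios(beta, gamma, T=1.0):
    """Intron/exon ratios (r_u, r_l) predicted by the model (beta != gamma)."""
    e_b, e_g = math.exp(-beta * T), math.exp(-gamma * T)
    r_u = gamma * (gamma - beta) * e_b / (gamma**2 * e_b - beta**2 * e_g)
    r_l = (gamma * (gamma - beta) * (1 - e_b)
           / (gamma**2 * (1 - e_b) - beta**2 * (1 - e_g)))
    return r_u, r_l


def simulate(n, seed=0, T=1.0):
    """Draw n (beta, gamma) pairs log-uniformly and compute their ratios."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        beta = math.exp(rng.uniform(-5.0, 5.0))
        gamma = math.exp(rng.uniform(-5.0, 5.0))
        samples.append((beta, gamma) + ratios(beta, gamma, T))
    return samples


def fraction_unambiguous(samples):
    """Share of samples lying above the line r_l = 1 / (2 - r_u)."""
    hits = sum(1 for _, _, r_u, r_l in samples if r_l > 1.0 / (2.0 - r_u))
    return hits / len(samples)


if __name__ == "__main__":
    samples = simulate(50_000)
    print(fraction_unambiguous(samples))
```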

Fig. 2.

Simulated data. The method correctly estimates processing and degradation rates. Points with ambiguous solutions are not shown. Some points corresponding to high rates cannot be estimated correctly as the system has already reached steady-state during the simulated “pulse”.

As we are considering single-sample estimates, it is possible to chart the observable space given by a and b and see when the method provides unambiguous results. Fig. 3 confirms that for b > (2 − a)^{-1} the method provides a unique (and correct) solution, as proven in Appendix 2. Below this line (displayed in green), the method provides ambiguous results, as two distinct sets of values β and γ can account for the same values of a and b (in blue).

Fig. 3.

Simulated data. The measurement space can be partitioned into ambiguous and unambiguous regions. The green line corresponds to b = (2 − a)^{-1}. Above that line, rates are correctly and unambiguously estimated. Boundary cases are sometimes wrongly estimated, probably due to numerical errors (red dots).

It is also possible to visualize the trajectories of the observables a and b for various values of k, as depicted in Fig. 4. When T = 0, trajectories start from the top of the space at (a, b) = (1/(1 + k), 1). When k < 1, as time passes the system moves down to (a, b) → (1 − k, 1/(1 + k)). For k ≥ 1, trajectories move to (a, b) → (0, 1/(1 + k)). Note that this is the expected case, as the splicing of mRNA occurs in general faster than its degradation. Note also that, in this case, trajectories cross below the green line, explaining why two solutions can be found for a single value of (a, b). The speed at which the system follows those trajectories depends on γ.

Fig. 4.

Observable space of the dynamical system. Trajectories in the phase space are solely determined by the k parameter. They start at time T = 0 at the top (b = 1) and go down. For k < 1 the trajectories (in blue) remain above the green line defined by b = (2 − a)^{-1} and do not cross. For k > 1 (in red), they cross each other below the green line. The speed at which the system follows the trajectory depends on the actual values of β and γ.

3.2 Real data

In order to assess the performance of the method on real data, we applied our method to the 4sU labeling experiment described in [3]. Briefly, mouse embryonic stem cells were plated at a density of 40000 cells/cm² on gelatin-coated 10 cm tissue culture plates and grown for approximately 14 hours. After addition of 4sU to the growth medium, cells were incubated at 37 °C for 10 minutes (10 minutes labeling pulse). RNA was then extracted and processed according to the protocol described in [4]. Reads that did not map to mouse ribosomal RNA sequences were aligned to intronic and exonic sequences using STAR V2.5 and quantified using RSEM V1.1.17, yielding intron and exon expression levels for unlabeled and labeled RNA.

For a single sample, the observable space represented in Figs 3 and 4 is represented (in log coordinates) in Fig 5. We see that, while the points are centered on the expected region of the observable space, many transcripts lie above rl = 1 (in blue) or below the diagonal (in red), which is not compatible with our model. Those transcripts, amounting to 25% of protein-coding genes with an exon TPM higher than 1, are discarded from further analyses.

Fig. 5.

Real data. Each point corresponds to a transcript with its transparency reflecting log expression value. Like in the previous figure, the green line is defined by y = (2 − x)^{-1}. For transcripts lying between the abscissa (in blue) and the green line, estimates of processing and degradation rates can be obtained by solving (22). For transcripts lying between the diagonal (in red) and the green line, estimates can be obtained by minimizing (25). The observed ratios for the remaining transcripts are not coherent with the model and are discarded.

The processing and degradation rates were computed either by solving (22) when rl > (2 − ru)^{-1} or by optimizing (25) otherwise. It took a few seconds to estimate several tens of thousands of rates on a desktop computer. For those cases that had two solutions (6% of the transcripts), we selected the one corresponding to rates most consistent with the other transcripts.

The resulting synthesis and degradation rates are depicted in Fig. 6. They have a correlation of 65%, which is very similar to the 66% reported by [12] for the same system. This is also consistent with the emerging concept of a coupling between RNA transcription and decay [11]. Our data indicate that genes span a large range of dynamics, irrespective of their expression level. Indeed, genes with high synthesis and degradation rates can have the same steady-state expression level as genes with low synthesis and degradation rates. However, the former will reach this steady state faster than the latter. It thus makes sense to consider our RNA metabolic rates in the functional frame of reference indicated in Fig. 6. One axis corresponds to the steady state concentration, given by the log-ratio of synthesis over degradation rates (or equivalently by the difference of the logs of the rates). The second axis corresponds to the responsiveness of the gene, i.e. how fast it reaches steady state (computed as the sum of the logs of the synthesis and degradation rates). It has been observed before that genes involved in more reactive and dynamic biological processes such as chromatin remodeling or transcription regulation tend to have a higher turnover than genes involved in more stable processes such as basic metabolism [23]. We checked that our data confirm this observation by looking at the Gene Ontology (GO, [5]) categories most associated by [23] with high and low turnover, namely “transcription” and “monosaccharide metabolism”. Despite having similar steady-state abundances, transcripts of genes involved in transcription indeed have significantly faster dynamics and the ones involved in monosaccharide metabolism have significantly slower dynamics than the rest of the genes, as illustrated by the squares in Fig. 6. Other categories where our data confirm faster genes include chromatin modifications, cell cycle and transcription regulation.
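The change of frame of reference amounts to a rotation of the (log α, log γ) plane. A minimal sketch, with a hypothetical function name and natural logarithms chosen arbitrarily:

```python
import math


def functional_coordinates(alpha, gamma):
    """Map synthesis and degradation rates to (steady-state abundance,
    responsiveness), both in log units."""
    abundance = math.log(alpha) - math.log(gamma)       # log(alpha / gamma)
    responsiveness = math.log(alpha) + math.log(gamma)  # log(alpha * gamma)
    return abundance, responsiveness


if __name__ == "__main__":
    # Two hypothetical genes with the same steady-state level alpha / gamma = 2.
    print(functional_coordinates(10.0, 5.0))  # fast turnover
    print(functional_coordinates(0.2, 0.1))   # slow turnover
```

Both genes share the first coordinate and differ only in the second, which is the situation described above for genes with equal expression levels but different dynamics.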

Fig. 6.

Estimated mRNA synthesis and degradation rates obtained from a single sample. Those rates can also be considered in a different and maybe functionally more relevant frame of reference defined by the steady state abundance (first axis) and gene responsiveness (second and perpendicular axis), as illustrated by the background grid. Genes involved in fast adapting biological processes (such as transcription) tend to be more responsive than genes involved in stable functions (such as monosaccharide metabolism). The squares on the axes represent the projections of the mean rates for the respective categories (gray representing genes that belong to neither of the two categories) and indicate that mean transcript responsiveness (but not abundance) is strongly affected by the category. Those two GO categories were selected for illustration because they were previously reported to be mostly enriched in high and low turn-over genes respectively [23].

We assessed the precision of our method by comparing the resulting degradation rates to those published for the same cell type by [12]. Those were obtained by using three replicates and seven time points and applying the SLAM-seq nucleotide-conversion method that, unlike metabolic labeling, does not require biochemical separation between the labeled and unlabeled RNA and is thus not affected by noise generated by the imperfect separation process (although that method has its own source of noise). From our data, we obtained gene degradation rates by taking, for each gene, the weighted average degradation rates of the corresponding transcripts. The weights were given by the mean exonic expression levels (unlabeled and labeled). We expect a lower precision for transcripts close to the rl = 1 line, for which the labeling time was likely somewhat too short, so to assess the correlation, we weighted the transcripts by 1 − rl. Fig. 7 compares degradation rates obtained in our experiments with those reported by [12], keeping only genes with an average expression value higher than 100 TPM. We expect a higher precision for highly expressed genes, as this allows for more precise estimates of the intron to exon ratios. This is indeed the case: depending on the expression threshold and the sample, the correlation between our data and the previously published rates ranges between 30% and 67% for a single-sample estimate (see Fig. 8). As those are experiments performed in different labs using different methods, those numbers show that our rates obtained on a single sample and time point are meaningful. For comparison, [13] reports correlations around 70% by using the same data, but changing only the method of analysis. Using three replicates, [4] reports a 26% correlation using the INSPEcT package.
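The two weighting steps can be sketched as follows. The text does not specify the exact estimator, so this example assumes a standard weighted Pearson correlation and a plain expression-weighted mean; function names and input values are hypothetical.

```python
import math


def gene_rate(transcript_rates, expression):
    """Expression-weighted average of transcript degradation rates for one gene."""
    return sum(r * e for r, e in zip(transcript_rates, expression)) / sum(expression)


def weighted_pearson(x, y, w):
    """Weighted Pearson correlation between x and y with non-negative weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    ours = [0.10, 0.25, 0.40, 0.80]       # hypothetical single-sample estimates
    published = [0.12, 0.20, 0.50, 0.70]  # hypothetical reference rates
    r_l = [0.30, 0.50, 0.90, 0.60]        # labeled intron/exon ratios
    weights = [1.0 - r for r in r_l]      # down-weight points close to r_l = 1
    print(weighted_pearson(ours, published, weights))
```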

Fig. 7.

Degradation rates estimated from a single sample plotted against degradation rates published in [12] (obtained using SLAM-seq). The red line is obtained through weighted linear regression. The weights are set to 1 − rl, as indicated by the transparency of the dots. The (weighted) correlation of 55% indicates that the estimated rates are meaningful. Only genes with a mean exon TPM above 100 are taken into account.

Fig. 8.

Correlation between degradation rates obtained by [12] and the ones obtained with our single-sample method as a function of expression level. Each line represents a biological replicate. The dot corresponds to the data shown in Fig 7. As expected, the correlation is higher for highly expressed genes, as the intron to exon ratios can be more reliably estimated. In this experiment, replicate 1 correlates better than the two others, indicating that it is probably of better quality.

4 Discussion

In this paper, we presented a method to estimate splicing and degradation rates of RNA transcripts from a single 4sU labeled sample. We validated our method first in silico and then on real data obtained from mouse embryonic stem cells. Using our method we first replicated, on a different cell type, previous findings about the enrichment in high or low turnover genes of specific cellular processes. Second, we showed that the rates obtained with our method correlate well (between 30% and 67%) with published rates obtained by applying SLAM-seq to the same cell type. Methods for such estimation have been published before, but they usually require a sufficient number of samples (around a dozen). In contrast to these methods, our method explicitly uses the analytical solution to the standard RNA dynamics model given by (26). Moreover, our method is self-normalizing as it only uses the ratio of intron to exon expression levels. It is thus not affected by differences in sequencing depth of the various samples. This approach also makes our method faster than other methods, as it boils down to numerically solving on a bounded domain either a univariate equation or a one-dimensional optimization for each transcript. However, a caveat of our method is that a sizable fraction of mostly lowly expressed transcripts (about 25% in our case) are inconsistent with the model and their dynamics cannot be estimated, while the method provides two solutions for another fraction (6%) of the transcripts. Our theoretical considerations nevertheless indicate that this issue is constitutive of the model, and is likely to also impact other methods (for example through multiple local extrema in the likelihood function), although it may be more difficult to detect there. Using multiple samples or time points is likely to help solve this problem.

While using a single sample reduces costs, this is not the only merit of this approach. In practice most experiments will have biological replicates, in which case our method enables obtaining point estimates of α, β and γ for each of them. This in turn allows for estimating their variance, as well as assessing sample quality (e.g. if one of them systematically gives very different estimates for all genes). Moreover, because cell growth is likely to be limited during the (short) labeling time, it is less likely to interfere in the estimation process than when using time course data, where it can have an effect [17]. In addition, when used in a time-course experiment, our method makes it possible to investigate the evolution of those rates over time and assess whether those rates are stationary. Finally, the theoretical results obtained in this paper could be used to improve other methods. For example, the method could be used to analyze SLAM-seq data, which would reduce the number of samples and also provide estimates for the processing rate. Another possible application is single cell RNA velocity [14], where the Zeisel model of RNA dynamics is also used, but splicing rates β are set to be equal for all transcripts. While it has been documented (and is consistent with our data) that splicing rates are more homogeneous than degradation rates [21], this is potentially an approximation that could be improved with our framework to increase the accuracy of the method.

The method presented in this paper can be adapted for the case when unlabeled RNA is mixed with labeled RNA in a “total” rather than an “unlabeled” RNA pool. In that case, the intron to exon ratio in the total RNA pool is constant during labeling time and is given by γ/(β + γ) = 1/(1 + k), and rates can be easily obtained from (11). This method is however likely to be less precise than separating unlabeled from labeled RNA, as additional information can be gained from the decreasing unlabeled RNA pool.

Our method could be further improved in several ways. For example, unlike in [10], we did not consider the effect of leakage of unlabeled RNA in the labeled RNA pool because of unspecific capture. This leakage has the effect of dragging rl down towards the diagonal, and could potentially be estimated from the data as it is shared across all transcripts.

Another improvement would be to embed this method in a probabilistic framework in order to quantify the estimate uncertainty (as in [13] for a simpler model) or to determine the optimal labeling time (as in [25]).

Author contribution and conflict of interest

M.H. developed and implemented the method, analyzed the expression quantification data, interpreted the results, figured out the proof, generated the figures and wrote the manuscript. A.B. performed the experiments and interpreted the results. A.C.M. initiated the project, designed the study and generated the expression quantification data. S.B. reviewed the math, interpreted the results and supervised the process. All authors contributed to the manuscript.

The authors declare that they have no conflicts of interest.

Data and code availability

An R package implementing our method is available on GitHub, together with the code used to generate the figures as well as the gene expression data used: https://github.com/BergmannLab/SingleSampleRNAdynamics. The raw data files are available on the Gene Expression Omnibus under accession numbers GEO:GSE150286 (main replicate) and GEO:GSE143277 (second and third replicates of Fig. 8).

Acknowledgements

This work was funded by the Swiss National Science Foundation through grant no. FN 310030 152724/1 to S.B and PP00P3 150667 and the NCCR in RNA & Disease to A.C.M.

Appendix

1 Derivation of the model solution

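The model is a first order linear ordinary differential equation in p(t) and m(t) that can be expressed in matrix form as

$$\frac{d}{dt}\begin{pmatrix} p \\ m \end{pmatrix} = \begin{pmatrix} -\beta & 0 \\ \beta & -\gamma \end{pmatrix}\begin{pmatrix} p \\ m \end{pmatrix} + \begin{pmatrix} \alpha \\ 0 \end{pmatrix} \tag{26}$$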

The solution to this equation is given by Embedded Image where k1 and k2 are scalar constants determined by the boundary conditions, λ1, λ2 are eigenvalues of the matrix in (26) and v, w are the corresponding eigenvectors.

The eigenvalues are given by λ1 = −β and λ2 = −γ. The first eigenvector v is obtained by solving Embedded Image

Similarly the second eigenvector is obtained by solving Embedded Image

The solution to (26) is thus given by Embedded Image

Expressed by its component this is equivalent to Embedded Image Embedded Image

We now turn to the boundary conditions to determine k1 and k2. The boundary conditions are different for the unlabeled and the labeled RNA.

Unlabeled RNA

Like in [6], we assume the system to be in steady-state prior to labeling. The steady-state is given by solving (26) with dp/dt = dm/dt = 0, which yields

$$p = \frac{\alpha}{\beta}, \qquad m = \frac{\alpha}{\gamma}$$

During labeling time, we assume that no unlabeled RNA is synthesized such that α = 0. Assuming that we start labeling at time t = 0, we thus have Embedded Image

Moreover we have Embedded Image

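This leads us to the solution for the unlabeled RNA

$$p_u(t) = \frac{\alpha}{\beta}\, e^{-\beta t}$$

$$m_u(t) = \frac{\alpha}{\gamma(\gamma - \beta)}\left(\gamma e^{-\beta t} - \beta e^{-\gamma t}\right)$$

where the u label indicates that this corresponds to the unlabeled RNA pool.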

Labeled RNA

The solution for the labeled RNA could be obtained the same way as for the unlabeled RNA, but setting α ≠ 0 and pl(0) = ml(0) = 0. However, it is simpler to notice that the total RNA (labeled and unlabeled) stays at steady-state during the labeling, such that we have the following solution for labeled RNA:

$$p_l(t) = \frac{\alpha}{\beta} - p_u(t), \qquad m_l(t) = \frac{\alpha}{\gamma} - m_u(t)$$

where the l label indicates that this corresponds to the labeled RNA pool.

2 Proof of unicity of solution

In this appendix, we prove that (22) has a single solution for b > (2 − a)^{-1}. We first note that Embedded Image, so the lower bound for k is k− = 1 − a. We then define the right-hand side of (22) as Embedded Image

We then observe that on that lower bound Embedded Image because a + x − 1 tends to zero and x – 1 is negative.

On the other hand, for the upper bound Embedded Image, we have Embedded Image and Embedded Image we can deduce that the upper bound of x is a zero of g: Embedded Image

Moreover, the derivative of g is given by: Embedded Image

Then Embedded Image

Since g(k+) reaches zero from below, while g(k−) > 0, we can infer that g(x) has a zero between k− and k+ as illustrated on Fig 9.

Fig. 9.

Sketch of the proof that g(x) has a single zero in 𝒟 = (k−, k+). We first show that Embedded Image, that Embedded Image and Embedded Image, so that g must cross the x-axis on 𝒟. To show that it only does so once, we consider a function h(x) that has the same sign as g(x) when g′(x) = 0. We show that h is convex on 𝒟 and thus g cannot have a negative extremum, followed by a positive extremum, followed by a negative extremum. Hence it cannot have more than one zero on 𝒟.

To show that this zero is unique, we look at the sign of g′(x).

We can rewrite g′(x) as Embedded Image where Embedded Image Embedded Image

Let x0 be a zero of g′, i.e., the position of a local extremum of g. We have Embedded Image

The second equality holds because g′(x0) = 0 by definition of x0. By multiplying (40) by (bx0 + b − 1), which is positive, we can then define a new function h(x) whose sign is the same as that of g(x) at x = x0 (see Fig. 9 for an illustration): Embedded Image where Embedded Image Embedded Image
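The positivity of (bx0 + b − 1) used in this step can be verified in one line. The chain below is a sketch that assumes x0 lies in the domain, so that x0 > k− = 1 − a, and that 2 − a > 0, so that the condition b > (2 − a)⁻¹ also implies b > 0; these assumptions are not spelled out in the text.

```latex
b x_0 + b - 1 \;>\; b(1 - a) + b - 1 \;=\; b(2 - a) - 1 \;>\; 0,
\qquad \text{since } b > (2 - a)^{-1} \text{ and } 2 - a > 0 .
```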

We can now compute the second derivatives of C(x) and D(x). Embedded Image

Hence Embedded Image

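This means that h is convex, so there cannot be three points x1 < x2 < x3 such that h(x1) < 0, h(x2) > 0 and h(x3) < 0, since a convex function lies on or below the chord joining (x1, h(x1)) and (x3, h(x3)). Hence g′ cannot have three successive zeros at which g is negative, then positive, then negative, so g(x) cannot have more than one zero on 𝒟. □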
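Because the zero is unique and bracketed by a sign change on the bounded domain, a simple bracketing solver is enough to locate it. The sketch below uses bisection, which is one possible choice and not necessarily the solver used by the authors. The function `toy_g` is a hypothetical stand-in with the same qualitative shape as described above (positive near k−, reaching zero from below at k+); it is not the right-hand side of (22). The bracket is shrunk by a small `eps` so that the trivial zero at k+ is excluded.

```python
def find_interior_zero(g, k_minus, k_plus, eps=1e-9, tol=1e-12, max_iter=200):
    """Bisection on the open interval (k_minus, k_plus).

    Assumes g > 0 just right of k_minus and g < 0 just left of k_plus,
    which is the situation established in the proof above.
    """
    lo, hi = k_minus + eps, k_plus - eps
    if not (g(lo) > 0 > g(hi)):
        raise ValueError("expected g > 0 near k_minus and g < 0 near k_plus")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0 or hi - lo < tol:
            return mid
        if g_mid > 0:
            lo = mid  # zero lies to the right of mid
        else:
            hi = mid  # zero lies to the left of mid
    return 0.5 * (lo + hi)

def toy_g(x):
    """Hypothetical stand-in for g: interior zero at 0.3, boundary zero at 1."""
    return (0.3 - x) * (1.0 - x)

if __name__ == "__main__":
    print(find_interior_zero(toy_g, 0.0, 1.0))
```

Uniqueness matters here in practice: with a single interior zero, the result does not depend on how the bracket is initialized.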

Footnotes

  • Added a figure (6) showing that transcript synthesis and degradation rates can be represented in a frame of reference given by transcript steady-state abundance and responsiveness. The figure also illustrates that more dynamic biological processes involve genes with more responsive transcripts, while more stable biological processes involve genes with less responsive transcripts.

  • https://github.com/BergmannLab/SingleSampleRNAdynamics

References

  [1] Tara Alpert, Lydia Herzel, and Karla M Neugebauer. Perfect timing: splicing and transcription rates in living cells. WIREs: RNA, 8(2):e1401, 2017.
  [2] J David Barrass, Jane EA Reid, Yuanhua Huang, Ralph D Hector, Guido Sanguinetti, Jean D Beggs, and Sander Granneman. Transcriptome-wide RNA processing kinetics revealed using extremely short 4tU labeling. Genome Biology, 16(1):282, 2015.
  [3] Adriano Biasini, Stefano De Pretis, Jennifer Yihong Tan, Baroj Abdulkarim, Harry Wischnewski, Rene Dreos, Mattia Pelizzola, Constance Ciaudo, and Ana Claudia Marques. Translation is required for miRNA-dependent decay of endogenous transcripts. bioRxiv, 2020.
  [4] Adriano Biasini and Ana Claudia Marques. A protocol for transcriptome-wide inference of RNA metabolic rates in mouse embryonic stem cells. Frontiers in Cell and Developmental Biology, 8:97, 2020.
  [5] Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Research, 47(D1):D330–D338, 2019.
  [6] Stefano De Pretis, Theresia Kress, Marco J Morelli, Giorgio EM Melloni, Laura Riva, Bruno Amati, and Mattia Pelizzola. INSPEcT: a computational tool to infer mRNA synthesis, processing and degradation dynamics from RNA- and 4sU-seq time course experiments. Bioinformatics, 31(17):2829–2835, 2015.
  [7] Lars Dölken, Zsolt Ruzsics, Bernd Rädle, Caroline C Friedel, Ralf Zimmer, Jörg Mages, Reinhard Hoffmann, Paul Dickinson, Thorsten Forster, Peter Ghazal, et al. High-resolution gene expression profiling for simultaneous kinetic parameter analysis of RNA synthesis and decay. RNA, 14(9):1959–1972, 2008.
  [8] Ran Elkon, Eitan Zlotorynski, Karen I Zeller, and Reuven Agami. Major role for mRNA stability in shaping the kinetics of gene induction. BMC Genomics, 11(1):259, 2010.
  [9] Caroline C Friedel and Lars Dölken. Metabolic tagging and purification of nascent RNA: implications for transcriptomics. Molecular BioSystems, 5(11):1271–1278, 2009.
  [10] Mattia Furlan, Stefano de Pretis, Eugenia Galeota, Michele Caselle, and Mattia Pelizzola. Dynamics of transcriptional regulation from total RNA-seq experiments. bioRxiv, p. 520155, 2019.
  [11] Ella Hartenian and Britt A Glaunsinger. Feedback to the central dogma: cytoplasmic mRNA decay and transcription are interdependent processes. Critical Reviews in Biochemistry and Molecular Biology, 54(4):385–398, 2019.
  [12] Veronika A Herzog, Brian Reichholf, Tobias Neumann, Philipp Rescheneder, Pooja Bhat, Thomas R Burkard, Wiebke Wlotzka, Arndt von Haeseler, Johannes Zuber, and Stefan L Ameres. Thiol-linked alkylation of RNA to assess expression dynamics. Nature Methods, 14(12):1198, 2017.
  [13] Christopher Jürges, Lars Dölken, and Florian Erhard. Dissecting newly transcribed and old RNA using GRAND-SLAM. Bioinformatics, 34(13):i218–i226, 2018.
  [14] Gioele La Manno, Ruslan Soldatov, Amit Zeisel, Emelie Braun, Hannah Hochgerner, Viktor Petukhov, Katja Lidschreiber, Maria E Kastriti, Peter Lönnerberg, Alessandro Furlan, et al. RNA velocity of single cells. Nature, 560(7719):494, 2018.
  [15] Tong Ihn Lee and Richard A Young. Transcriptional regulation and its misregulation in disease. Cell, 152(6):1237–1251, 2013.
  [16] Andrew Lugowski, Beth Nicholson, and Olivia S Rissland. Determining mRNA half-lives on a transcriptome-wide scale. Methods, 137:90–98, 2018.
  [17] Andrew Lugowski, Beth Nicholson, and Olivia S Rissland. DRUID: A pipeline for transcriptome-wide measurements of mRNA stability. RNA, 24(5):623–632, 2018.
  [18] Katya L Mack, Mallory A Ballinger, Megan Phifer-Rixey, and Michael W Nachman. Gene regulation underlies environmental adaptation in house mice. Genome Research, 28(11):1636–1645, 2018.
  [19] Florence Petit, Karen E Sears, and Nadav Ahituv. Limb development: a paradigm of gene regulation. Nature Reviews Genetics, 18(4):245, 2017.
  [20] Michal Rabani, Joshua Z Levin, Lin Fan, Xian Adiconis, Raktima Raychowdhury, Manuel Garber, Andreas Gnirke, Chad Nusbaum, Nir Hacohen, Nir Friedman, et al. Metabolic labeling of RNA uncovers principles of RNA production and degradation dynamics in mammalian cells. Nature Biotechnology, 29(5):436, 2011.
  [21] Michal Rabani, Raktima Raychowdhury, Marko Jovanovic, Michael Rooney, Deborah J Stumpo, Andrea Pauli, Nir Hacohen, Alexander F Schier, Perry J Blackshear, Nir Friedman, et al. High-resolution sequencing and modeling identifies distinct dynamic RNA regulatory strategies. Cell, 159(7):1698–1710, 2014.
  [22] Joseph Russo, Adam M Heck, Jeffrey Wilusz, and Carol J Wilusz. Metabolic labeling and recovery of nascent RNA to accurately quantify mRNA stability. Methods, 120:39–48, 2017.
  [23] Björn Schwanhäusser, Dorothea Busse, Na Li, Gunnar Dittmar, Johannes Schuchhardt, Jana Wolf, Wei Chen, and Matthias Selbach. Global quantification of mammalian gene expression control. Nature, 473(7347):337, 2011.
  [24] Alexey Uvarovskii and Christoph Dieterich. pulseR: Versatile computational analysis of RNA turnover from metabolic labeling experiments. Bioinformatics, 33(20):3305–3307, 2017.
  [25] Alexey Uvarovskii, Isabel S Naarmann-de Vries, and Christoph Dieterich. On the optimal design of metabolic RNA labeling experiments. PLoS Computational Biology, 15(8):e1007252, 2019.
  [26] Lukas Windhager, Thomas Bonfert, Kaspar Burger, Zsolt Ruzsics, Stefan Krebs, Stefanie Kaufmann, Georg Malterer, Anne L’Hernault, Markus Schilhabel, Stefan Schreiber, et al. Ultrashort and progressive 4sU-tagging reveals key characteristics of RNA processing at nucleotide resolution. Genome Research, 22(10):2031–2042, 2012.
  [27] Amit Zeisel, Wolfgang J Köstler, Natali Molotski, Jonathan M Tsai, Rita Krauthgamer, Jasmine Jacob-Hirsch, Gideon Rechavi, Yoav Soen, Steffen Jung, Yosef Yarden, et al. Coupled pre-mRNA and mRNA dynamics unveil operational strategies underlying transcriptional responses to stimuli. Molecular Systems Biology, 7(1), 2011.
Posted June 10, 2020.

Subject Area

  • Bioinformatics