Abstract
Competition to bind microRNAs induces an effective positive crosstalk between their targets, therefore known as ‘competing endogenous RNAs’ or ceRNAs. While such an effect is known to play a significant role in specific situations, estimating its strength from data and, experimentally, in physiological conditions appears to be far from simple. Here we show that the susceptibility of ceRNAs to different types of perturbations affecting their competitors (and hence their tendency to crosstalk) can be encoded in quantities as intuitive and as simple to measure as correlation functions. We confirm this scenario by extensive numerical simulations and validate it by re-analyzing PTEN’s crosstalk pattern from the TCGA breast cancer database. These results clarify the links between different quantities used to estimate the intensity of ceRNA crosstalk and provide new keys to analyze transcriptional datasets and effectively probe ceRNA networks in silico.
INTRODUCTION
MicroRNAs (miRNAs) are small non-coding RNA (ncRNA) molecules that post-transcriptionally regulate a significant portion of the eukaryotic transcriptome via sequence-specific, protein-mediated binding in the cytoplasm [1]. Their primary effects on coding transcripts consist in inhibiting translation and fostering degradation [2]. Long ncRNAs, instead, can transiently sequester miRNAs, thereby altering their availability and overall repressive potential [3]. Following early observations concerning small regulatory RNAs in plants and bacteria [4, 5], competition to bind miRNAs has been hypothesized to cause an effective positive interaction (‘crosstalk’) between their coding and/or non-coding targets that may directly affect protein levels [6] (see Fig. 1A,B). Several experimental and modeling studies have clarified the conditions under which such a scenario may become biologically relevant, highlighting specifically how molecular levels and kinetic heterogeneities may control it [7–13]. So far, such a ‘ceRNA effect’ (whereby ceRNA stands for ‘competing endogenous RNA’) has been quantitatively validated in cases of differentiation [14], disease [15] or in the presence of unphysiologically large transcriptional inputs [16]. Its significance in standard physiological conditions is therefore subject to scrutiny [17].
A major difficulty in detecting the ceRNA effect unambiguously in experiments or data lies in the fact that it should be disentangled from other mechanisms that may bear a similar impact, i.e. an effective positive coupling, on transcripts. Imagine a network of N ceRNA species interacting with M miRNA species. ceRNA levels mi (i = 1, …, N) fluctuate stochastically in time due to random synthesis and degradation events and to interactions with miRNAs, whose levels are also subject to random fluctuations. Denoting by 〈·〉 the time-average in the steady state, an effective ceRNA-ceRNA dependence can be signaled by a statistical correlation coefficient such as Pearson’s [8], i.e.
$$\rho_{ij} = \frac{\langle m_i m_j\rangle - \langle m_i\rangle\langle m_j\rangle}{\sqrt{\left(\langle m_i^2\rangle - \langle m_i\rangle^2\right)\left(\langle m_j^2\rangle - \langle m_j\rangle^2\right)}} \tag{1}$$
with the idea that, if ρij is large enough, a perturbation altering the level of ceRNA j will cause part of the miRNA population to move from one target to the other, effectively broadcasting the perturbation from ceRNA j to ceRNA i through miRNA-mediated interactions. A more direct description of this mechanism is attained instead via susceptibilities like [7]
$$\chi_{ij} = \frac{\partial \langle m_i\rangle}{\partial b_j} \tag{2}$$
where bj stands for the transcription rate of ceRNA j. χij quantifies the shift in the mean level of ceRNA i caused by a (small) variation in bj, and a large χij (assuming no direct control of ceRNA i by ceRNA j) points to miRNA-mediated crosstalk between ceRNAs i and j (see Fig. 1C).
While both χij and ρij capture aspects of ceRNA crosstalk seen in experiments, their underlying physical meaning is a priori different. Fluctuating miRNA levels naturally correlate co-regulated targets, so that a large ρij is obtained when both ceRNAs respond to the stochastic dynamics of their regulator. This however does not necessarily imply a large χij. In fact, χij can be large even in the absence of fluctuations in miRNA levels, i.e. as a consequence of competition alone. In such conditions, ρij vanishes. Moreover, χij has been found to be asymmetric under exchange of its indices (i.e. χij ≠ χji in general) [7], at odds with ρij which is necessarily symmetric. It would therefore be important to clarify how quantities like (1) and (2) are related in miRNA-ceRNA networks, especially to understand whether responses to perturbations (a central quantity of interest for many potential applications of the ceRNA effect) can be encoded in quantities as intuitive and as simple to measure experimentally or from data as a Pearson correlation coefficient.
Here we show that the information conveyed by χij is indeed captured by a correlation function similar to ρij. On the other hand, ρij is linked to a susceptibility, i.e. to the response of a target to a perturbation altering the level of its competitor, but the perturbation concerns the intrinsic decay rate of the competitor rather than its transcription (as is the case for χij). In the following, we will derive these results and validate them by computer simulations and gene expression data analysis, and explore their consequences.
MATERIALS AND METHODS
Numerical simulations were performed using the Gillespie algorithm [18], an implementation of which, for a miRNA-ceRNA network, is available from https://github.com/araksm/ceRNA/. Gene expression data analysis was performed starting from 1098 breast cancer samples obtained from The Cancer Genome Atlas (TCGA, http://cancergenome.nih.gov/; project ID: TCGA-BRCA).
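To give a feel for the simulation scheme, the following is a minimal sketch of a Gillespie simulation for a reduced titration model. It is not the implementation available from the repository cited above: complexes are not tracked explicitly (binding is assumed to remove one ceRNA and one miRNA molecule at an effective rate `k[i]`), and the function name, argument names and return format are all illustrative assumptions.

```python
import random

def gillespie_means(b, d, k, beta, delta, t_end, seed=0):
    """Time-averaged copy numbers for N ceRNAs and one miRNA.

    Assumed reduced model (complexes not tracked):
      ceRNA i: synthesis at rate b[i], decay at rate d[i] * m[i]
      miRNA:   synthesis at rate beta, decay at rate delta * mu
      binding: ceRNA i and one miRNA are both removed at rate k[i] * m[i] * mu
    Returns (list of mean ceRNA levels, mean miRNA level).
    """
    rng = random.Random(seed)
    n = len(b)
    m, mu, t = [0] * n, 0, 0.0
    acc_m, acc_mu = [0.0] * n, 0.0
    while True:
        # Propensities: three reactions per ceRNA, then two for the miRNA.
        rates = []
        for i in range(n):
            rates.extend((b[i], d[i] * m[i], k[i] * m[i] * mu))
        rates.extend((beta, delta * mu))
        wait = rng.expovariate(sum(rates))
        last = t + wait >= t_end
        dt = (t_end - t) if last else wait
        # Accumulate time-weighted levels before the state changes.
        for i in range(n):
            acc_m[i] += m[i] * dt
        acc_mu += mu * dt
        if last:
            break
        t += wait
        # Pick the reaction that fires, with probability proportional to its rate.
        r = rng.choices(range(len(rates)), weights=rates)[0]
        if r < 3 * n:
            i, kind = divmod(r, 3)
            if kind == 0:
                m[i] += 1
            else:
                m[i] -= 1
                if kind == 2:
                    mu -= 1
        elif r == 3 * n:
            mu += 1
        else:
            mu -= 1
    return [a / t_end for a in acc_m], acc_mu / t_end
```

Setting all `k[i]` to zero decouples the species, so each mean should approach `b[i] / d[i]`; switching the coupling on lowers the ceRNA means through titration.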
RESULTS
Theory
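We start from the dynamics of molecular populations in a miRNA-ceRNA network, denoting by mi the level of ceRNA species i (ranging from 1 to N), by µa the level of miRNA species a (ranging from 1 to M), and by cia the levels of miRNA-ceRNA complexes. In the deterministic limit where stochastic fluctuations are neglected, the time evolution of concentration variables is described by
$$\begin{aligned}
\frac{dm_i}{dt} &= b_i - d_i m_i - \sum_a k^+_{ia} m_i \mu_a + \sum_a k^-_{ia} c_{ia}, \\
\frac{d\mu_a}{dt} &= \beta_a - \delta_a \mu_a - \sum_i k^+_{ia} m_i \mu_a + \sum_i \left(k^-_{ia} + \kappa_{ia}\right) c_{ia}, \\
\frac{dc_{ia}}{dt} &= k^+_{ia} m_i \mu_a - \left(k^-_{ia} + \sigma_{ia} + \kappa_{ia}\right) c_{ia},
\end{aligned} \tag{3}$$
with the different parameters denoting intrinsic synthesis (bi, βa) and degradation rates (di, δa), complex association/dissociation rates ($k^\pm_{ia}$) and complex processing rates (σia and κia for stoichiometric and catalytic processing, respectively), while $\tau_{ia} = (k^-_{ia} + \sigma_{ia} + \kappa_{ia})^{-1}$ represents the mean lifetime of the complex formed by miRNA a and ceRNA i. We note that if the mean lifetime of complexes is much shorter than that of free molecular species, i.e. if τia ≪ 1/di and τia ≪ 1/δa for each i and a, miRNA-ceRNA complexes achieve a steady state much faster than miRNA and ceRNA levels. In such conditions, $dc_{ia}/dt \simeq 0$ and one can eliminate complexes from (3) by replacing cia with its steady state value
$$c_{ia} = \tau_{ia}\, k^+_{ia}\, m_i \mu_a \tag{4}$$
For $\sigma_{ia} \gg k^-_{ia}, \kappa_{ia}$ (i.e. when stoichiometric degradation without miRNA recycling is the dominant channel of complex processing), this allows one to re-cast (3) in the form (see Supplementary Text 1)
$$\frac{dm_i}{dt} = -m_i \frac{\partial L}{\partial m_i}, \qquad \frac{d\mu_a}{dt} = -\mu_a \frac{\partial L}{\partial \mu_a} \tag{5}$$
where L is a function of all miRNA levels µ = {µa} and all ceRNA levels m = {mi} given by
$$L(\mathbf{m}, \boldsymbol{\mu}) = \sum_i \left(d_i m_i - b_i \log m_i\right) + \sum_a \left(\delta_a \mu_a - \beta_a \log \mu_a\right) + \sum_{i,a} k^+_{ia}\, m_i \mu_a \tag{6}$$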
One easily sees (see Supplementary Text 2) that L decreases along trajectories of (5), implying that its minimum describes the physically relevant steady state of (3) with m ≠ 0 and µ ≠ 0.
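The monotonic decrease of L can be checked numerically. The sketch below assumes a reduced model with a single miRNA species in which complexes have already been eliminated, so that each ceRNA obeys dm_i/dt = b_i − d_i m_i − k_i m_i µ, and it assumes L = Σ_i (d_i m_i − b_i log m_i) + (δµ − β log µ) + Σ_i k_i m_i µ. Both the reduced dynamics and this explicit form of L are assumptions made for illustration, as are all names in the code; the integrator is a plain explicit Euler scheme.

```python
import math

def lyapunov(m, mu, b, d, k, beta, delta):
    """Assumed form of L for N ceRNAs and one miRNA (complexes eliminated)."""
    value = delta * mu - beta * math.log(mu)
    for i in range(len(m)):
        value += d[i] * m[i] - b[i] * math.log(m[i]) + k[i] * m[i] * mu
    return value

def relax(b, d, k, beta, delta, m0, mu0, dt=1e-3, steps=20000):
    """Integrate the reduced mass-action dynamics with explicit Euler steps.

    Returns the list of L values along the trajectory and the final state.
    """
    m, mu = list(m0), mu0
    values = [lyapunov(m, mu, b, d, k, beta, delta)]
    for _ in range(steps):
        dm = [b[i] - d[i] * m[i] - k[i] * m[i] * mu for i in range(len(m))]
        dmu = beta - delta * mu - mu * sum(k[i] * m[i] for i in range(len(m)))
        m = [m[i] + dt * dm[i] for i in range(len(m))]
        mu += dt * dmu
        values.append(lyapunov(m, mu, b, d, k, beta, delta))
    return values, m, mu
```

With a sufficiently small time step, the recorded values of L should form a non-increasing sequence, which is the discrete counterpart of the statement above.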
If intrinsic molecular noise (arising from stochastic transcription and degradation events and from titration due to miRNA-ceRNA interactions) is added to (3), after a transient molecular levels will eventually stabilize and fluctuate over time around the steady state described by the minimum of L. We are interested in finding a compact and intuitive mathematical form for the correlations arising between the different components in such conditions. Molecular noise is Poissonian, namely the strength of fluctuations affecting each variable is proportional to the square root of mean molecular levels (see e.g. [13] for an explicit representation in the context of a miRNA-ceRNA network), which makes our goal especially challenging. However, we will see that the effects of molecular noise can be remarkably well approximated by a uniform “effective temperature” T representing the strength of fluctuations affecting all molecular species involved. In this case, one can describe fluctuations around the steady state as thermal fluctuations around a Boltzmann-Gibbs equilibrium state. This allows one to compute averages of generic functions of m and µ as “thermal averages”, i.e.
$$\langle f \rangle = \frac{1}{Z} \int f(\mathbf{m}, \boldsymbol{\mu})\, e^{-L(\mathbf{m}, \boldsymbol{\mu})/T}\, d\mathbf{m}\, d\boldsymbol{\mu} \tag{7}$$
where $Z = \int e^{-L(\mathbf{m}, \boldsymbol{\mu})/T}\, d\mathbf{m}\, d\boldsymbol{\mu}$ is a normalization factor, the deterministic limit being obtained for T → 0. In particular, defining the connected correlation $\langle xy\rangle_c \equiv \langle xy\rangle - \langle x\rangle\langle y\rangle$ and the susceptibility $\omega_{ij} \equiv \partial\langle m_i\rangle/\partial d_j$, by straightforward calculations one finds
$$\omega_{ij} = -\frac{1}{T}\,\langle m_i m_j\rangle_c \tag{11}$$
$$\chi_{ij} = \frac{1}{T}\,\langle m_i \log m_j\rangle_c \tag{12}$$
Therefore, in this approximation, the susceptibility χij [Eq. (2)] is linked to the correlation function [Eq. (12)] which, as χij, is not symmetric under the exchange of i and j, while the ceRNA-ceRNA covariance is tied to the susceptibility ωij quantifying the change in 〈mi〉 induced by a (small) change of the intrinsic degradation rate dj of ceRNA j [Eq. (11)]. (Note that ωij ≤ 0.) The constant linking these quantities is the temperature T quantifying the strength of the uniform “effective noise”.
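The calculation behind these relations can be sketched in a few lines. The derivation below assumes that averages are taken with the Boltzmann weight e^{−L/T} and that L depends on b_j only through a term −b_j log m_j and on d_j only through a term d_j m_j. This form of L is an assumption consistent with mass-action kinetics, and θ is a placeholder for either parameter.

```latex
\begin{aligned}
\frac{\partial \langle m_i \rangle}{\partial \theta}
  &= \frac{\partial}{\partial \theta}\,
     \frac{\int m_i\, e^{-L/T}\, d\mathbf{m}\, d\boldsymbol{\mu}}
          {\int e^{-L/T}\, d\mathbf{m}\, d\boldsymbol{\mu}}
   = -\frac{1}{T}\left[
       \left\langle m_i \frac{\partial L}{\partial \theta} \right\rangle
       - \langle m_i \rangle \left\langle \frac{\partial L}{\partial \theta} \right\rangle
     \right], \\[4pt]
\theta = b_j:\quad
  \frac{\partial L}{\partial b_j} = -\log m_j
  &\;\Longrightarrow\;
  \chi_{ij} = \frac{\partial \langle m_i \rangle}{\partial b_j}
            = \frac{1}{T}\, \langle m_i \log m_j \rangle_c, \\[4pt]
\theta = d_j:\quad
  \frac{\partial L}{\partial d_j} = m_j
  &\;\Longrightarrow\;
  \omega_{ij} = \frac{\partial \langle m_i \rangle}{\partial d_j}
              = -\frac{1}{T}\, \langle m_i m_j \rangle_c .
\end{aligned}
```

The asymmetry of the first relation and the symmetry of the second follow directly from the form of the correlation functions: 〈m_i log m_j〉_c changes when i and j are swapped, whereas 〈m_i m_j〉_c does not.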
Somewhat unexpectedly, the above results suggest that the susceptibility ωij must be symmetric under exchange of i and j, i.e., for instance, if the level of ceRNA i is altered by changing the intrinsic degradation rate of ceRNA j, then the reverse is also true. To check this property, one can calculate ωij explicitly for a system formed by N ceRNA species interacting with a single miRNA species at steady state by following a different route, specifically along the lines of [7]. Considering the repression strength to which ceRNAs i and j are subject at a given (mean) level 〈µ1〉 of miRNA species 1 (M = 1 in this case), one finds, for each ceRNA, a soft “threshold” value of 〈µ1〉, denoted by µ0,i1, such that i is unrepressed (resp. repressed or susceptible to changes in 〈µ1〉) if 〈µ1〉 ≪ µ0,i1 (resp. ≫ µ0,i1 or ≃ µ0,i1). A direct calculation (see Supplementary Text 3) shows that ωij can attain large values only if both ceRNAs are susceptible to 〈µ1〉, in which case one has (i ≠ j) where al = 0, 1, 1/4 if ceRNA l is repressed, unrepressed or susceptible, respectively. Eq. (15) confirms that ωij is indeed symmetric under exchange of i and j.
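The symmetry of ωij (and the asymmetry of χij) can also be probed numerically in the deterministic limit. The sketch below assumes a reduced steady state with one miRNA species and complexes eliminated, m_i = b_i / (d_i + k_i µ), with µ fixed by β − δµ − Σ_i k_i m_i µ = 0, and estimates both susceptibilities by central finite differences. The reduced model, the effective couplings `k[i]` and all function names are illustrative assumptions; this is not the calculation of Supplementary Text 3.

```python
def steady_state(b, d, k, beta, delta):
    """Steady state of the assumed reduced model: returns (ceRNA levels, miRNA level)."""
    def balance(mu):
        # Net miRNA flux; strictly decreasing in mu, positive at 0, negative at beta/delta.
        return beta - delta * mu - sum(
            k[i] * b[i] * mu / (d[i] + k[i] * mu) for i in range(len(b)))
    lo, hi = 0.0, beta / delta
    for _ in range(200):  # bisection down to machine precision
        mid = 0.5 * (lo + hi)
        if balance(mid) > 0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return [b[i] / (d[i] + k[i] * mu) for i in range(len(b))], mu

def response(i, j, which, b, d, k, beta, delta, h=1e-5):
    """Central finite difference of m_i with respect to b_j ("b") or d_j ("d")."""
    def level(shift):
        bb, dd = list(b), list(d)
        if which == "b":
            bb[j] += shift
        else:
            dd[j] += shift
        return steady_state(bb, dd, k, beta, delta)[0][i]
    return (level(h) - level(-h)) / (2.0 * h)
```

In this reduced model, implicit differentiation of the steady-state conditions gives χij/χji = mi/mj, so the transcriptional susceptibility is asymmetric whenever the two ceRNA levels differ, whereas the response to a change in decay rate is the same in both directions.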
Concerning the approximations under which the above results were obtained, we remark that we started by considering (3) in the limit of (i) fast complex equilibration, and (ii) miRNA-ceRNA complex processing dominated by the stoichiometric channel, with the former playing the key role in deriving the function L (see Supplementary Text 1). We note however that the overall scenario just described also holds when complexes evolve over time scales much longer than those of free molecular levels, i.e. for τia ≫ 1/di and τia ≫ 1/δa. In particular, (3) can again be re-cast in the form of (5) with L given by (6), albeit with re-scaled transcription rates (see Supplementary Text 4 for details).
Therefore we conclude that, as long as molecular noise can be approximated by a uniform effective temperature,
(i) the ceRNA-ceRNA covariance Cij = 〈mimj〉c is a proxy for the susceptibility ωij, and
(ii) the correlation function Xij = 〈mi log mj〉c is a proxy for the susceptibility χij.
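Both proxies are straightforward to estimate from a set of measurements. The following sketch computes the connected correlations Cij and Xij from two equally long lists of strictly positive ceRNA levels (time points or samples). Function names are illustrative, and the handling of zero counts, which would make the logarithm undefined, is deliberately left out.

```python
import math

def connected(x, y):
    """Connected correlation <x y> - <x><y> over paired observations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum(a * c for a, c in zip(x, y)) / n - mean_x * mean_y

def covariance_C(m_i, m_j):
    """C_ij = <m_i m_j>_c, proxy for the decay-rate susceptibility omega_ij."""
    return connected(m_i, m_j)

def log_correlation_X(m_i, m_j):
    """X_ij = <m_i log m_j>_c, proxy for the transcriptional susceptibility chi_ij."""
    return connected(m_i, [math.log(v) for v in m_j])
```

Note that `covariance_C(m_i, m_j)` equals `covariance_C(m_j, m_i)` by construction, whereas `log_correlation_X` generally changes when its arguments are swapped, mirroring the asymmetry of χij.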
Validation
We have validated the above scenario by simulating a small network involving 2 ceRNA and a single miRNA species via the Gillespie algorithm [18], where molecular noise is accounted for explicitly (see Supplementary Text 5). Results are summarized in Fig. 2, where we compare ω12, ω21 and C12 ≡ C21 on one hand, and χ12, χ21, X12 and X21 on the other, as computed from simulations (i.e. with the actual molecular noise), against the theoretical predictions. We considered three scenarios for the mean lifetime of miRNA-ceRNA complexes, namely those covered by the theory (i.e. complex equilibration much faster and much slower than the equilibration of miRNA and ceRNA levels) as well as the intermediate case where characteristic timescales are comparable for all variables.
One sees that theoretical predictions obtained in the “thermal noise” approximation agree remarkably well with simulations including the actual molecular noise. In particular, the full correspondence between the susceptibilities ωij and χij and the (re-scaled) correlation functions Cij and, respectively, Xij is evident. Notice that a single global parameter T ≥ 0 has been used to fit all data in each of the conditions. This shows how accurately the assumption of a uniform effective temperature can mimic the effects of intrinsic stochasticity. On the other hand, its limits might be reflected, at least in part, in the discrepancies that occur at high transcription rates.
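The fitting procedure used for T is not detailed here. One simple possibility, shown purely as an assumption, is a least-squares fit through the origin of the measured susceptibilities against the corresponding correlation functions, using the relation response ≈ correlation / T. The function name and interface below are hypothetical.

```python
def fit_temperature(correlations, responses):
    """Least-squares estimate of T in the model response = correlation / T.

    Minimizing sum (r - s * c)^2 over the slope s = 1/T gives
    s = sum(c * r) / sum(c * c), hence T = sum(c * c) / sum(c * r).
    """
    sxy = sum(c * r for c, r in zip(correlations, responses))
    sxx = sum(c * c for c in correlations)
    if sxy <= 0:
        raise ValueError("data are not compatible with a positive temperature")
    return sxx / sxy
```

For the transcriptional susceptibility one would pass the values of Xij and χij directly; for the decay-rate susceptibility, the values of Cij and −ωij, since ωij is non-positive.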
These results confirm that (12) and (14) are indeed good predictors of the response of a ceRNA to a perturbation affecting one of its competitors within a miRNA-ceRNA network. Notably, such correlation functions are easy to estimate from transcription data sets. Our framework therefore has the potential to offer new insight into post-transcriptional regulation, its system-level organization and its impact on cellular functions.
In order to test this idea, we analyzed the ceRNA scenario emerging from 1098 breast cancer samples obtained from TCGA, focusing on the widely studied oncosuppressor PTEN and its immediate competitors (i.e. the ceRNAs sharing at least one miRNA regulator with PTEN). In particular, we computed
$$C_{\mathrm{PTEN},j} = \langle m_{\mathrm{PTEN}}\, m_j \rangle_c \tag{16}$$
$$X_{j,\mathrm{PTEN}} = \langle m_j \log m_{\mathrm{PTEN}} \rangle_c \tag{17}$$
$$X_{\mathrm{PTEN},j} = \langle m_{\mathrm{PTEN}} \log m_j \rangle_c \tag{18}$$
for a set of candidate PTEN ceRNAs (indexed by j) found in [15] by means of Mutually Targeted miRNA-Response Element Enrichment Analysis. Notice that the average appearing in Eqs. (16–18) is over samples and not over time. We expect however that, if the interaction network is conserved across samples, averages over samples should reproduce statistical averages such as (13), as different samples effectively represent different snapshots of the state of the network. Fig. 3A shows that when (16) (whose value is encoded in the color of markers) is large, both (17) and (18) tend to be large. According to (15), a large Cij (or ωij) signals that both PTEN and its competitor are susceptible to changes in the level of at least one of their shared regulators. For such pairs, in addition, it has previously been shown that both χij and χji are expected to be large [7]. This implies a fully bi-directional crosstalk, i.e. any perturbation affecting the level of one species should affect the level of the other via miRNA-mediated regulation. Remarkably, this was experimentally shown to be the case in [15] for some of the ceRNAs we tested (e.g. SERINC1, VAPA), all of which are in this regime according to our analysis. Adding to this, we are also able to point to a number of other PTEN competitors, a perturbation of which should trigger a response by PTEN.
On the other hand, smaller values of (16) (orange markers in Fig. 3A) are associated with strongly asymmetric PTEN-ceRNA pairs for which (18) is much larger than (17). This suggests that PTEN will respond to an increase of its competitor’s bare transcription rate (and not vice-versa), while no response of PTEN should be expected upon perturbing the bare decay rate of the same ceRNA as Cij is small. Within the steady state theory of [7], ceRNA pairs with strongly different values of χij and χji pertain to cases where the responding ceRNA (PTEN here) is susceptible to variations in the miRNA levels while the perturbed one (PTEN’s competitor) is fully repressed. Our data analysis fully confirms both this scenario and the theory presented here in linking such cases to low values of the bare covariance (14).
Finally, note (Fig. 3B) that the above information cannot be retrieved if Cij is replaced by the Pearson coefficient ρij, Eq. (1), which just amounts to normalizing Cij by the product of the standard deviations σmi and σmj of mi and mj. Indeed, using the value of ρij to color-code PTEN’s ceRNAs, one sees that the Pearson coefficient can mislead into expecting (or not expecting) a response to a perturbation when the actual susceptibilities are small (resp. large).
For instance, ρij is rather small for the pair formed by PTEN and SLC1A2, which seems to suggest absence of mutual cross-talk between these two transcripts. However, while both CPTEN,SLC1A2 and XSLC1A2,PTEN are small, XPTEN,SLC1A2 is significant. This suggests that (i) SLC1A2 will not respond to a perturbation affecting the transcription rate of PTEN, and (ii) the pair will be insensitive to changes in each other’s bare decay rate; however, (iii) PTEN will be affected by a change in the bare transcription rate of its competitor despite the small statistical correlation that exists between their levels. Likewise, the large value of the Pearson coefficient between PTEN and DTWD2 can mislead into generically expecting a response when instead the susceptibility is strongly perturbation-dependent. In particular, the level of DTWD2 should not be significantly modified by a change in the level of PTEN (as XDTWD2,PTEN is rather small) in spite of the large Pearson coefficient. Notice that, remarkably, for this pair, Cij and ρij take on very different values.
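The relation between the two quantities can be made concrete with a short sketch: ρij is Cij divided by the product of the standard deviations, so it is insensitive to the overall scale of expression levels, whereas Cij is not. The function names below are illustrative.

```python
import math

def covariance(x, y):
    """Bare covariance C = <x y> - <x><y> over paired samples."""
    n = len(x)
    return sum(a * c for a, c in zip(x, y)) / n - (sum(x) / n) * (sum(y) / n)

def pearson(x, y):
    """Pearson coefficient: covariance normalized by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))
```

Multiplying all levels of one transcript by a constant leaves `pearson` unchanged but rescales `covariance` by the same constant, which is one way in which two pairs with similar ρij can have very different Cij.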
DISCUSSION
To sum up, we have identified [Eqs. (11) and (12)] a set of correlation functions that can serve as proxies for ceRNA susceptibilities to perturbations. Specifically, Cij = 〈mimj〉c is related to the susceptibility ωij quantifying ceRNA i’s response to a change of the bare decay rate of ceRNA j, while Xij = 〈mi log mj〉c is related to the susceptibility χij quantifying ceRNA i’s response to a change of the bare transcription rate of ceRNA j. These relations are valid at steady state and, within the approximations discussed, are fully confirmed by numerical simulations.
Most importantly, quantities like Cij and Xij can be easily estimated from data and possibly measured in experiments. An analysis of PTEN’s emergent crosstalk pattern from the TCGA breast cancer dataset using these functions has indeed shown that a map of ceRNA responses to perturbations affecting competitors can be constructed by combining the information provided by each, while the Pearson coefficient ρij can be inaccurate in this respect. This opens the way to probing the structure and function of ceRNA networks in silico by straightforwardly analyzing transcriptional data, and provides a key to obtain testable transcriptome-scale predictions about ceRNA crosstalk.
Notice that our results apply without any modification to ceRNA pairs that do not share miRNA regulators, i.e. they are capable of identifying long-range crosstalk (i.e. interactions between ceRNAs that are separated by multiple miRNAs along the miRNA-ceRNA network) of the kind discussed in [19]. In this sense, they can provide insight into ceRNA crosstalk both at the local scale and at an extended, network-level scale.
From the viewpoint of physics, results like (11) and (12) are akin to the “fluctuation-response relations” that constitute a cornerstone of statistical mechanics [20]. Their derivation in our context has relied on an equilibrium framework that presupposes stationarity of molecular levels. Since ceRNA crosstalk can be substantially more complex away from the steady state [21], a more refined mathematical study will be required to extend the theory developed here to off-equilibrium dynamical regimes. On the other hand, our results open the way for the application of recently developed inference techniques [22] to the estimation of miRNA levels or kinetic parameters from ceRNA levels. Overall, the approach presented here provides new means to extract information on post-transcriptional regulation from sequencing and/or gene expression data, thereby potentially enhancing our ability to exploit the ceRNA effect for therapeutic purposes by allowing for the identification of better (i.e. more responsive) targets for intervention.
2. L DECREASES ALONG THE DYNAMICS
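By direct differentiation and using the fact that $dm_i/dt = -m_i\,\partial L/\partial m_i$ and $d\mu_a/dt = -\mu_a\,\partial L/\partial \mu_a$ (see Eq. (5) in the Main Text) one finds
$$\frac{dL}{dt} = \sum_i \frac{\partial L}{\partial m_i}\frac{dm_i}{dt} + \sum_a \frac{\partial L}{\partial \mu_a}\frac{d\mu_a}{dt} = -\sum_i m_i \left(\frac{\partial L}{\partial m_i}\right)^2 - \sum_a \mu_a \left(\frac{\partial L}{\partial \mu_a}\right)^2 \le 0 .$$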
In other words, under the approximations discussed above, L decreases along the dynamics of the miRNA-ceRNA network. Therefore the minimum of L (which is unique by virtue of the concavity of L) describes a steady state of the dynamics (1).
3. APPROXIMATE CALCULATION OF ωij FOR A SYSTEM WITH ONE MIRNA AND N CERNA SPECIES
Starting from Eq. (3) taken for N ceRNA species and a single miRNA species (we suppress its index for simplicity) in the limit , the steady-state level of mi reads
Now noting that and that the steady state miRNA level can be approximated by [1] so that we can compute the susceptibility ωij as
One finds where is a 3 × 3 matrix that only depends on the regime R(i) (repressed, susceptible or expressed) to which ceRNA i belongs. By considering the definitions of the different regimes in terms of the value of µ, all elements of are found to be ≪ 1 (for instance, WExpr,Expr = µ/(µ0,iµ0,j) ≪ 1 as µ ≪ µ0,i and µ ≪ µ0,j if ceRNAs i and j are both expressed) except for WSusc,Susc, which is given by leading immediately to Eq. (15) of the Main Text.
4. CASE OF SLOW COMPLEX PROCESSING
Assuming complex levels cia are roughly stationary over time scales for which mi and µa evolve (i.e. τia ≫ 1/di and τia ≫ 1/δa for each i and a), then all terms in (1) that involve the variables cia can be taken to be roughly constant for short enough characteristic times. In such conditions, miRNAs are effectively transcribed at rates while ceRNAs are effectively transcribed at rates
In this limit, (1) can again be cast as with
The main difference from the previous case lies in the fact that the minimum of L should now be computed self-consistently from the asymptotic value of cia: after the (fast) equilibration of mi’s and µa’s following (18), a new steady state value for complexes is computed as , leading in turn to new values for the effective transcription rates and and hence to new values for mi’s and µa’s from (18), and so on until convergence.
5. STOCHASTIC DYNAMICS OF A MIRNA-CERNA NETWORK
The time evolution of our miRNA-ceRNA network with N ceRNA species (labeled i), M miRNA species (labeled a) and intrinsic (molecular) noise is described by the system where while ηa, ξa and ζia represent stochastic variables. As each noise source contributes independently to the overall noise level, one has where ξmi, ηµa, , , and and are mutually independent zero-average random variables representing, respectively, the intrinsic noise in ceRNA levels, in miRNA levels, in the binding/unbinding dynamics of complexes, in the stoichiometric complex degradation channel and in the catalytic complex degradation channel. Correlations are, for each component, described by where denote the mean steady-state molecular levels. To obtain Fig. 2 of the Main Text, we have simulated the above system with M = 1, N = 2 using the Gillespie algorithm [2].
Acknowledgments
We gratefully acknowledge Carla Bosia and Andrea Pagnani for useful insight and suggestions.
Footnotes
* andrea.demartino{at}roma1.infn.it
References
- [1].
- [2].