- Split View
-
Views
-
Cite
Cite
Sandro Bottaro, Francesco Di Palma, Giovanni Bussi, The role of nucleobase interactions in RNA structure and dynamics, Nucleic Acids Research, Volume 42, Issue 21, 1 December 2014, Pages 13306–13314, https://doi.org/10.1093/nar/gku972
- Share Icon Share
Abstract
The intricate network of interactions observed in RNA three-dimensional structures is often described in terms of a multitude of geometrical properties, including helical parameters, base pairing/stacking, hydrogen bonding and backbone conformation. We show that a simple molecular representation consisting in one oriented bead per nucleotide can account for the fundamental structural properties of RNA. In this framework, canonical Watson-Crick, non-Watson-Crick base-pairing and base-stacking interactions can be unambiguously identified within a well-defined interaction shell. We validate this representation by performing two independent, complementary tests. First, we use it to construct a sequence-independent, knowledge-based scoring function for RNA structural prediction, which compares favorably to fully atomistic, state-of-the-art techniques. Second, we define a metric to measure deviation between RNA structures that directly reports on the differences in the base–base interaction network. The effectiveness of this metric is tested with respect to the ability to discriminate between structurally and kinetically distant RNA conformations, performing better compared to standard techniques. Taken together, our results suggest that this minimalist, nucleobase-centric representation captures the main interactions that are relevant for describing RNA structure and dynamics.
INTRODUCTION
Ribonucleic acid (RNA) plays a fundamental role in a large variety of biological processes, such as enzymatic catalysis (1), protein synthesis (2) and gene regulation (3). RNA molecules fold in a well-defined three-dimensional structure dictated by their nucleotide sequence (4), that can be determined by means of X-ray crystallography or nuclear magnetic resonance experiments (5). Despite recent important advances in the field, RNA structure determination is still a complex and expensive procedure. Computational/theoretical models, and in particular structure prediction methods, ideally complement experiments, as they can in principle provide the tertiary, native structure for a given RNA sequence.
Although traditionally considered simpler than the protein folding problem (4), de novo prediction of RNA three-dimensional structure still represents a challenging task (6). Atomistic molecular dynamics methods have been so far limited to the predictive folding of very small systems (7,8), whereas coarse-grained, physics-based models have proven useful in folding larger RNA structures (9,10). However, the most popular and successful RNA structure prediction methods are the so-called knowledge-based techniques, that typically employ fragment libraries for constructing RNA structures. These fragments are either combined using secondary structure information (11,12) or used to generate a large number of plausible, putative structures that are subsequently ranked according to a scoring function (13–17). Knowledge-based methods often rely on two assumptions: first, that the main structural features of RNA molecules can be described in terms of few relevant observables. Second, that the distribution of these observables in a data set of experimental structures displays specific features, that can be used for generating and/or scoring three-dimensional structures (18). A crucial issue is the choice of the aforementioned observables. While backbone atoms and torsional angles are the most natural choice for representing protein structures (19), RNA molecules are more difficult to treat, due to the presence of a larger number of flexible backbone degrees of freedom and because of the important structural role of base–base interactions. For this reason RNA molecules are often described in terms of several structural observables, including backbone dihedrals (20,21), pseudo dihedrals (22), helical parameters (23), hydrogen-bond networks and stacking interactions (24,25).
The proper choice of simplified, insightful descriptors for RNA molecules is not limited to the development of structure prediction methods, but is a general and relevant issue in RNA structural analysis, as in the definition of metrics for comparing three-dimensional structures. The well-known root mean square deviation (RMSD) after optimal superposition (26) is known to be suboptimal in the analysis of RNA structure, as it does not provide reliable information about the differences in the base interaction network (6,27).
This motivated the development of alternative RNA-specific deviation measures based on selected geometrical properties. These measures can be used for comparing observed structures with a priori known patterns, as is done for instance in annotation methods (28) and in motif recognition algorithms (29–31). Other RNA specific similarity measures were introduced with the purpose of assessing the quality of RNA structural predictions (6) or for clustering/identifying recurrent structural motifs in large databases of known structures (21,29). The definition of a proper structural deviation measure is also a central issue in the construction of Markov state models (32), that typically assume the geometric similarity to provide an adequate measure of kinetic proximity.
The analysis of RNA three-dimensional structure, the construction of a knowledge-based structure prediction method and the definition of a structural deviation share a common trait: all of them are based on the apparently arbitrary choice of the structural observables used to represent RNA. Is there a simple representation of an RNA molecule that recapitulates its key structural and dynamical properties?In this Paper we introduce a minimalist representation for RNA molecules consisting in one oriented bead per nucleotide. We first show that this description captures the fundamental base-pair and base-stacking interactions observed in experimental structures. Using this representation we define a simple sequence-independent scoring function for RNA structure prediction, which performs on par or better compared to existing atomistic techniques. Based on the same molecular description, we then introduce a metric for calculating structural deviation. When compared with the standard RMSD measure, this metric correlates better with the physical time required to explore the conformational space in molecular dynamics simulations. As a further proof of the capability of this metric to distinguish relevant structural differences, we show that it can be successfully used for recognizing local structural motifs, reproducing results obtained with state-of-the-art techniques. Finally, in the Discussion section, we summarize the analogies between these applications and we proceed with an extensive comparison with existing techniques. The software for performing the calculations is freely available at https://github.com/srnas/barnaba.
METHODS
This section is structured as follows: first, we introduce a simple representation for RNA three-dimensional structures that has an intuitive interpretation in terms of base-stacking and base-pairing interactions. We then use this molecular description to define i) a scoring function for RNA structure prediction (|$\mathcal {E}$|SCORE) and ii) a metric for calculating deviations between RNA three-dimensional structures (|$\mathcal {E}$|RMSD).
Molecular representation
Number of base–base interactions detected in the crystal structure of the H. marismortui large ribosomal subunit
Interaction Type . | Occurrencesa . | |$\tilde{r}<\sqrt{2.5}^b$| . |
---|---|---|
Stacking | 2328 | 2318 |
Watson-Crick | 723 | 723 |
Non-Watson-Crick (base–base) | 399 | 382 |
Base-sugar/phosphate interactions | 781 | 398 |
Not annotated | — | 1214 |
Interaction Type . | Occurrencesa . | |$\tilde{r}<\sqrt{2.5}^b$| . |
---|---|---|
Stacking | 2328 | 2318 |
Watson-Crick | 723 | 723 |
Non-Watson-Crick (base–base) | 399 | 382 |
Base-sugar/phosphate interactions | 781 | 398 |
Not annotated | — | 1214 |
aNumber of interactions detected by MC-Annotate (25).
bBase pair (j,k) is counted when both |$\tilde{r}_{jk}$| and |$\tilde{r}_{kj}<\sqrt{2.5}$|.
Interaction Type . | Occurrencesa . | |$\tilde{r}<\sqrt{2.5}^b$| . |
---|---|---|
Stacking | 2328 | 2318 |
Watson-Crick | 723 | 723 |
Non-Watson-Crick (base–base) | 399 | 382 |
Base-sugar/phosphate interactions | 781 | 398 |
Not annotated | — | 1214 |
Interaction Type . | Occurrencesa . | |$\tilde{r}<\sqrt{2.5}^b$| . |
---|---|---|
Stacking | 2328 | 2318 |
Watson-Crick | 723 | 723 |
Non-Watson-Crick (base–base) | 399 | 382 |
Base-sugar/phosphate interactions | 781 | 398 |
Not annotated | — | 1214 |
aNumber of interactions detected by MC-Annotate (25).
bBase pair (j,k) is counted when both |$\tilde{r}_{jk}$| and |$\tilde{r}_{kj}<\sqrt{2.5}$|.
It is well known that the strength and nature of pairing and stacking interactions depend on the base–base distance, on the angle θ as well as on other angular parameters (e.g. twist, roll, tilt) in a non-trivial manner (24,33). Such dependence can be observed in Figure 2, where the points belonging to the pairing and stacking zone of Figure 1c are projected on two separate ρ–θ planes. These distributions give an average picture containing contributions from different base pair types (purine–purine, purine–pyrimidine and pyrimidine–pyrimidine) and with weights dictated by the employed data set. Nevertheless, the observations below hold also when considering the 16 possible combinations of base pairs individually and different data sets (see Supporting Data (SD) Figrues S1–S3). In the pairing zone (Figure 2, left panel) we first observe a dominant peak centered around (ρ=5.6 Å, θ=60°), corresponding to the position of canonical Watson-Crick base pairs as well as wobble (GU) base pairs. The two other peaks correspond instead to base pairs interacting through the Hoogsteen or sugar edge (13). One can also appreciate the absence of bases in the region occupied by the sugar (190° < θ < 290°). The probability distribution in the stacking zone (Figure 2, right panel) shows a broad peak in the proximity of the origin and extending up to ρ ≈ 4 Å, which can be compared to the typical radius of the six-membered ring (≈1.4 Å). This means that partial or negligible ring overlap is very frequent in RNA structures, as also observed in a seminal paper by Bugg et al. (34). This feature is more evident in pyrimidine–pyrimidine and purine–purine pairs, for which high overlap is the exception rather than the rule (see Supplementry Figure S3), whereas overlap is systematically observed in pyrimidine–purine pairs. The fact that bases in the stacking zone are very often ‘imbricated,’ similarly to roof tiles, rather than literally stacked one on top of the other, does not imply that they are not interacting. Indeed, base–base interaction is not limited to π–π stacking but also includes electrostatic effects, London dispersion attraction, short range repulsion as well as backbone-induced effects (35).
Scoring function
Structural deviation
|$|\bf {G}(\tilde{\bf {r}}^{\alpha }) - \bf {G}(\tilde{\bf {r}}^{\beta })| \approx |\tilde{\bf {r}}^{\alpha } - \tilde{\bf {r}}^{\beta }|$| if |$\tilde{r}^{\alpha },\tilde{r}^{\beta } \ll \tilde{r}_{\text{cutoff}}$|.
|$|\bf {G}(\tilde{\bf {r}}^{\alpha }) - \bf {G}(\tilde{\bf {r}}^{\beta })| = 0$| if |$\tilde{r}^{\alpha },\tilde{r}^{\beta } \ge \tilde{r}_{\text{cutoff}}$|.
|$\bf {G}(\tilde{\bf {r}})$| is a continuous function.
RESULTS
Scoring function for RNA structure prediction
We assess the quality of the |$\mathcal {E}$|SCORE on its capability of recognizing the folded state among a set of decoys (39). We consider here a total of 39 different decoy sets previously used to benchmark other knowledge-based scoring functions: 20 decoy sets generated de novo using the FARNA algorithm (13) and 19 additional decoy sets from the work of Bernauer et al. (16) obtained by gradual deformation of the native structures. As a measure of performance we use the normalized rank (40), defined as the percentage of decoys scoring better than the native structure. In order to simulate a realistic structure prediction experiment, we additionally report the RMSD from native of the best scoring structure in the decoy set. Because the RMSD has been reported to be a suboptimal indicator of structural proximity for RNA, we also compute the interaction fidelity network (INF), which measures the overlap between annotations in two structures (27), as well as the |$\mathcal {E}$|RMSD introduced here. Equivalent analyses are performed using two state-of-the-art scoring functions, namely the FARFAR (15) and the all-atom knowledge-based scoring function RASP (17). Both FARFAR and RASP are defined at atomistic resolution. FARFAR combines the low-resolution FARNA potential with explicit terms for solvation and hydrogen bonding, while RASP is a statistical potential based on pairwise interatomic distances. These two scoring functions are among the best available (6) and the only ones for which it was possible to perform automatic scoring of a large set of decoys.
Figure 3 summarizes the results obtained on all the 39 decoy sets. For each method we report the number of decoy sets below the value indicated on the horizontal axis. Thus, the better is a scoring function, the faster the curve reaches the maximum number 39. In general, whereas FARFAR performs well in ranking and RASP in finding the best decoys, |$\mathcal {E}$|SCORE performs well in both tasks (See also Supplementry Table S6). This is a remarkable result considering that |$\mathcal {E}$|SCORE employs a coarse-grained representation (one bead per nucleotide) and is not sequence dependent. The scoring results on all RNA decoys, together with a short discussion of the cases for which |$\mathcal {E}$|SCORE fails to identify correctly the native structure can be found in Supplementry Figures S7 and S8.
RNA structural deviation
The definition of an accurate measure of structural deviation for comparing RNA three-dimensional structures is a relevant problem, as the standard RMSD approach does not report on differences of base-stacking and base-pairing patterns, that are the fundamental, key features of RNA molecules (25). As a striking example, the RMSD between an ideal A-form double helix |$^{\texttt {GGGG}}_{\texttt {CCCC}}$| and the mismatched structure |$^{\texttt {\phantom{G}GGGG}}_{\texttt {CCCC \phantom{C}}}$| is 1.9 Å, meaning that the threshold of 4Å that is sometimes employed to consider two structures as similar (8) may not be sufficiently strict. A similar issue arises in kinetic modeling of proteins, where the assumption that geometrically close configurations (in terms of the RMSD distance) are also kinetically close has been shown to be problematic in discretizing the register shift dynamics of β strands (32).
In order to elucidate the difference between the standard RMSD and the |$\mathcal {E}$|RMSD, we compute these two quantities on a series of snapshots taken from an atomistic molecular dynamics simulation of a short stem-loop. These structures have been obtained by repeatedly melting and folding (see SD Text 9), and thus include near-native structures, partially folded, misfolded as well as extended conformations. Figure 4 shows that structures with similar RMSD do not necessarily present the same base–base interaction network of the native state: in fact, the secondary structure can be substantially different. Conversely, the |$\mathcal {E}$|RMSD discriminates structures with near-native base–base contacts (|$\mathcal {E}$|RMSD <0.8) from structures with non-native base–base interactions (|$\mathcal {E}$|RMSD>1). Note that in general low |$\mathcal {E}$|RMSD correspond to high INF values, except for the region around |$\mathcal {E}$|RMSD=1.5, RMSD=2.7. In this region we observe structures with formed stem and distorted loop (high INF) but also structures with formed loop and incorrect stem (low INF). While Figure 4 serves as an illustrative example, the validity of the |$\mathcal {E}$|RMSD is more rigorously assessed by performing two tests that probe the accuracy of the method in the analysis of structural and dynamical properties of RNA molecules. More precisely, in the following two subsections we demonstrate i) that the |$\mathcal {E}$|RMSD is a good measure of kinetic proximity between configurations, performing considerably better compared to standard structural distances and ii) that the |$\mathcal {E}$|RMSD is highly specific in recognizing known structural motifs within a large set of crystallographic structures.
Structural and kinetic proximity
The correspondence between a structural deviation (RMSD, |$\mathcal {E}$|RMSD, etc.) and the kinetic distance (defined as the time needed to evolve from one configuration to the other), can be quantitatively assessed by computing the coefficient of variation (CV).
In Figure 5 we show the CV as a function of time separation in molecular dynamics simulation of the add riboswitch (42,43). The molecular dynamics simulation was initialized in the native state and remained stable throughout the 100 ns run, while experiencing non trivial dynamics such as base-pairing breakage/formation and double-helix breathing. The CV for |$\mathcal {E}$|RMSD is compared with RMSD and with dRMSD. Additionally, the scalar variant of |$\mathcal {E}$|RMSD, based on Eq. 6, is also presented.
We first observe that for short time separations the correspondence between geometric and kinetic distance is high (low CV) in all cases, and it diminishes at large time lag. The CV for the |$\mathcal {E}$|RMSD is consistently lower compared to both RMSD and dRMSD, suggesting this measure to better reflect the temporal separation between configurations. We also note that, in this test, the scalar and vectorial version of the |$\mathcal {E}$|RMSD behave very similarly.
Search of structural motifs
RNA molecules display a great variety of base–base interaction patterns (RNA motifs) that play a key role in RNA folding and that mediate important biological processes (44). A structural deviation should therefore accurately capture such RNA-specific features. This property can be assessed by searching for known RNA structural motifs in the database of experimentally solved structures. Here, we use the |$\mathcal {E}$|RMSD to search for five known internal loop motifs (K-Turn, C-loop, Sarcin Ricin, triple sheared, double sheared) and two hairpin loops (T-loop and GNRA) within the RNA structure atlas database (45) release 1.43, composed by 772 crystal structures. As a reference, we compared the results with ‘find RNA 3D’ (FR3D) (29), the most extensively used and carefully tested method for RNA motif recognition. FR3D employs a reduced representation with one centroid per nucleobase, and computes the geometrical deviation by combining fitting and orientation errors after optimal superposition. Reference and motif structures were obtained from the RNA 3D motif atlas release 1.13. Except for the K-turn motif, we were able to correctly identify almost all FR3D matches, including motifs with bulged bases (Table 2). By visual inspection, it can be seen that most of the false positive (i.e. motifs that were found using the |$\mathcal {E}$|RMSD and not by FR3D) are either bona fide matches or known motif variants, while the residual discrepancies are due to differences in the employed cutoff value (45). We refer the reader to SD Text 10 for a detailed description of the search strategy and to SD Text 11 for the full list of results.
Structural search of known motifs within the RNA motif atlas. The table summarizes the number of found motifs compared with FR3D (http://rna.bgsu.edu/rna3dhub/motifs). See SD Text 11 for a full list.
Motif . | Reference PDB . | Found motifs . | False positive . |
---|---|---|---|
T-LOOP | 3RG5 | 57/66 | 17 |
GNRA | 4A1B | 217/219 | 63 |
C-LOOP | 3V2F | 18 /22 | 7 |
D-SHEAR | 1SAQ | 23/26 | 3 |
T-SHEAR | 3V2F | 24/31 | 18 |
K-TURN | 3GX5 | 10/22 | 15 |
SARCIN | 1Q96 | 13/16 | 3 |
Motif . | Reference PDB . | Found motifs . | False positive . |
---|---|---|---|
T-LOOP | 3RG5 | 57/66 | 17 |
GNRA | 4A1B | 217/219 | 63 |
C-LOOP | 3V2F | 18 /22 | 7 |
D-SHEAR | 1SAQ | 23/26 | 3 |
T-SHEAR | 3V2F | 24/31 | 18 |
K-TURN | 3GX5 | 10/22 | 15 |
SARCIN | 1Q96 | 13/16 | 3 |
Motif . | Reference PDB . | Found motifs . | False positive . |
---|---|---|---|
T-LOOP | 3RG5 | 57/66 | 17 |
GNRA | 4A1B | 217/219 | 63 |
C-LOOP | 3V2F | 18 /22 | 7 |
D-SHEAR | 1SAQ | 23/26 | 3 |
T-SHEAR | 3V2F | 24/31 | 18 |
K-TURN | 3GX5 | 10/22 | 15 |
SARCIN | 1Q96 | 13/16 | 3 |
Motif . | Reference PDB . | Found motifs . | False positive . |
---|---|---|---|
T-LOOP | 3RG5 | 57/66 | 17 |
GNRA | 4A1B | 217/219 | 63 |
C-LOOP | 3V2F | 18 /22 | 7 |
D-SHEAR | 1SAQ | 23/26 | 3 |
T-SHEAR | 3V2F | 24/31 | 18 |
K-TURN | 3GX5 | 10/22 | 15 |
SARCIN | 1Q96 | 13/16 | 3 |
DISCUSSION
A fully atomistic description of an RNA molecule requires on average ≈20 non-hydrogen atoms per nucleotide, and thus ≈60 degrees of freedom per nucleotide. Biologically relevant RNA structures are composed of hundreds or even thousands of nucleotides that display an intricate network of interactions. The complexity of the problem is further increased when considering that many RNA molecules are highly dynamic entities, and multiple chain configurations are needed for describing their mechanism and function (3). In order to provide a simpler representation, RNA structures are usually analysed in terms of sugar pucker conformations, backbone dihedrals (20,21) or pseudo dihedrals (22). Complementary approaches consider instead base-pair parameters (propeller, slide, roll, twist, etc.) and hydrogen bonding to classify base-pair interactions in terms of recurrent geometrical configurations observed in experimentally solved RNA structures (23–25).A key result of the present work is that the main structural and dynamical properties of RNA can be described in terms of the spatial arrangement of the nucleobases only. Since nucleobases are substantially rigid, this implies 6 degrees of freedom per nucleotide. This observation is in full accordance with previous analysis of RNA crystal structures (25,46), where the position and orientation of nucleobases only were considered. This work complements the analysis by showing that nucleobase position and orientation are sufficient to correctly describe the RNA dynamics as obtained from explicit solvent molecular dynamics simulations.
As shown in Figure 1, RNA base–base interactions mainly occur in a restricted neighborhood of the nucleobase that is well approximated by an ellipsoidal shell. Since specific regions of the ellipsoidal shell surrounding nucleobases correspond to different types of pairing and stacking interactions, it is in principle possible to infer secondary structure information directly from the pairing and stacking plots of Figure 2. This procedure is formally similar to the one employed in the analysis of protein structures, where the angular preferences of the backbone approximately correspond to different secondary structures (19). Our approach depends on nucleobase positions only and is thus complementary to that proposed in Ref. (22), where backbone pseudo dihedrals were used to classify RNA secondary structures.
By considering only the relative spatial arrangement of nucleobases, we introduce a knowledge-based scoring function for RNA structure prediction (|$\mathcal {E}$|SCORE). In contrast with several studies underlining the importance of an atomistic description for an accurate scoring (15–17), our work shows that a minimalist description of nucleobase arrangement is sufficient to produce equivalent or better results. Two aspects are worth highlighting. First, while the backbone can be ignored for scoring, one cannot avoid including the chain connectivity in the sampling algorithm used to generate the decoys. Second, the outcome of the scoring benchmarks is strongly dependent on the employed decoy sets. Some of the decoy sets generated using the FARNA algorithm are very challenging, as all scoring functions (|$\mathcal {E}$|SCORE, FARFAR and RASP) are highly degenerate, and assign good scores to structures that are very far from native (see e.g. Supplementry Figure S6, decoy set 1A4D and Ref. (16)). These decoys can be therefore used to improve the current methodologies for de novo predictions of RNA structure.A second aspect intimately connected with the choice of molecular representation is the definition of a proper structural proximity measure. Typically, structural deviations are used either to analyse and compare three-dimensional structures (e.g. RMSD, dRMSD, INF (27)) or to recognize and cluster RNA structural motifs (29–31). The |$\mathcal {E}$|RMSD introduced here is suitable for both tasks. A distinctive feature of the |$\mathcal {E}$|RMSD is the use of the ellipsoidal metric |$\tilde{r}$|, that naturally stems from the empirical data and that takes into account both relative distance and orientation of nucleobases. Interestingly, the intrinsic base anisotropy has been also discussed in the context of the electronic polarizability of nucleobases (35) and anisotropic interactions have been used in coarse-grain models for proteins (47). We note that the |$\mathcal {E}$|RMSD is conceptually reminiscent of the INF measure (27), that also considers the differences in the network of base-pairing and base-stacking interactions. With respect to INF, however, the |$\mathcal {E}$|RMSD presents important advantages: i) it does not depend on the annotation scheme employed and therefore on the quality of the structure, ii) it is a continuous function of the atomic coordinates and iii) it is well defined even when the annotation is not informative (e.g. for unfolded states, unstructured, or single-stranded RNAs). The last two items make the |$\mathcal {E}$|RMSD particularly well-suited for studying dynamical processes, such as RNA folding or binding events. The |$\mathcal {E}$|RMSD also shares a number of similarities with the structural distance used in FR3D for recognizing three-dimensional motifs (29). However, in the |$\mathcal {E}$|RMSD calculation we introduce important simplifications: i) the definition of the local coordinate system is not nucleobase-specific and requires the knowledge of three atoms per nucleobase only (Figure 1a), ii) the |$\mathcal {E}$|RMSD calculation does not require least square fitting and iii) the problem of combining the positional and orientational base–base distance is circumvented by introducing an ellipsoidal metric. The fact that the two methods produce similar results (Table 2) strongly suggests that for most practical uses the |$\mathcal {E}$|RMSD is accurate enough to capture the specific pattern of base–base interactions.
In this study we introduce a minimalist representation of RNA three-dimensional structures that solely considers the relative orientation and position of nucleobases. Based on this representation, we construct a scoring function for RNA structure prediction and we define a metric for calculating deviation between RNA three-dimensional structures. In both cases, the results are on par or better compared with state-of-the-art techniques. Taken together, the presented results suggest that this minimalist representation captures the main interactions that are relevant for describing RNA structure and dynamics. As a final remark, we note that the methodology presented here can be readily applied to DNA molecules as well.
AVAILABILITY
Python scripts to compute |$\mathcal {E}$|SCORE and |$\mathcal {E}$|RMSD from PDB structures are available at https://github.com/srnas/barnaba.
We are grateful to Francesco Colizzi, Richard Andre Cunha, Eli Hershkovits, Alessandro Laio, Gian Gaetano Tartaglia, Eric Westhof and Craig Zirbel for carefully reading the manuscript and providing several enlightening suggestions. Massimiliano Bonomi, Thomas Hamelryck and Andrea Pagnani are also acknowledged for useful discussions.
FUNDING
European Research Council under the European Union's Seventh Framework Programme (FP/2007-2013) [306662, S-RNA-S]. Funding for open access charge: European Research Council [306662, S-RNA-S].
Conflict of interest statement. None declared.
Comments