Abstract
RNA structure prediction has long struggled with long-range base pairs, since prediction accuracy decreases with base pair span. Here we analyze the empirical distribution of base pair spans in a large collection of experimentally known RNA structures. Surprisingly, we find that long-range base pairs are overrepresented in these data. In particular, there is no evidence that long-range base pairs are systematically overpredicted relative to short-range interactions in thermodynamic predictions. This casts doubt on a recent suggestion that kinetic effects cause the length-dependent decrease in predictability. Instead of modifying the energy model, we advocate modifying the expected accuracy model for RNA secondary structures. We demonstrate that including a span-dependent penalty leads to improved maximum expected accuracy (MEA) structure predictions compared to both the standard MEA model and a modified folding algorithm with an energy penalty function. The prevalence of long-range base pairs provides further evidence that RNA structures in general do not have the so-called polymer zeta property. This has consequences for the asymptotic performance of a large class of sparsified RNA folding algorithms.
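The empirical span distribution mentioned in the abstract can be illustrated with a minimal sketch. It assumes structures are given as pseudoknot-free dot-bracket strings and that the span of a base pair (i, j) is j - i; the function names `base_pairs` and `span_histogram` and the toy input are hypothetical and not taken from the paper.

```python
from collections import Counter


def base_pairs(dot_bracket):
    """Return 0-based (i, j) pairs of a pseudoknot-free dot-bracket string."""
    stack = []
    pairs = []
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def span_histogram(structures):
    """Count how often each span j - i occurs over all given structures."""
    counts = Counter()
    for structure in structures:
        for i, j in base_pairs(structure):
            counts[j - i] += 1
    return counts


# Toy input (assumption): three made-up structures, not real data.
example = ["((..))", "(((...)))..", "(..(...)..)"]
hist = span_histogram(example)
for span in sorted(hist):
    print(span, hist[span])
```

On real data, the same histogram would be computed over a structure database and compared with the spans found in predicted structures for the same sequences.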
The Students of the Bioinformatics II Lab Class 2013.
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Amman, F. et al. (2013). The Trouble with Long-Range Base Pairs in RNA Folding. In: Setubal, J.C., Almeida, N.F. (eds) Advances in Bioinformatics and Computational Biology. BSB 2013. Lecture Notes in Computer Science, vol 8213. Springer, Cham. https://doi.org/10.1007/978-3-319-02624-4_1
DOI: https://doi.org/10.1007/978-3-319-02624-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02623-7
Online ISBN: 978-3-319-02624-4
eBook Packages: Computer Science, Computer Science (R0)