Query-dependent banding (QDB) for faster RNA similarity searches

Eric P Nawrocki; Sean R Eddy

doi:10.1371/journal.pcbi.0030056

PLoS Comput Biol. 2007 Mar 30;3(3):e56. doi: 10.1371/journal.pcbi.0030056. Epub 2007 Feb 7.

Authors

Eric P Nawrocki¹, Sean R Eddy

Affiliation

¹ Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia, United States of America.

Abstract

When searching sequence databases for RNAs, it is desirable to score both primary sequence and RNA secondary structure similarity. Covariance models (CMs) are probabilistic models well-suited for RNA similarity search applications. However, the computational complexity of CM dynamic programming alignment algorithms has limited their practical application. Here we describe an acceleration method called query-dependent banding (QDB), which uses the probabilistic query CM to precalculate regions of the dynamic programming lattice that have negligible probability, independently of the target database. We have implemented QDB in the freely available Infernal software package. QDB reduces the average case time complexity of CM alignment from $LN^{2.4}$ to $LN^{1.3}$ for a query RNA of N residues and a target database of L residues, resulting in a 4-fold speedup for typical RNA queries. Combined with other improvements to Infernal, including informative mixture Dirichlet priors on model parameters, benchmarks also show increased sensitivity and specificity resulting from improved parameterization.
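The banding idea in the abstract can be illustrated with a small sketch. It is not Infernal's implementation. It assumes that, for one model state, a probability distribution over the subsequence lengths that state can account for is already available, and it trims the low-probability tails to obtain a band of lengths worth evaluating. The function name `band_from_length_distribution`, the tail threshold `beta`, and the toy distribution are all hypothetical, chosen only for illustration.

```python
def band_from_length_distribution(probs, beta):
    """Return (dmin, dmax), the band of lengths kept for one state.

    probs[d] is an (assumed, precomputed) probability that the state
    accounts for a subsequence of length d. The longest prefix and the
    longest suffix whose total mass is at most `beta` are excluded.
    Hypothetical helper for illustration; not taken from Infernal.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probs must contain positive mass")
    p = [x / total for x in probs]  # normalize defensively

    # Trim the left tail while its accumulated mass stays within beta.
    dmin, mass = 0, 0.0
    while dmin < len(p) - 1 and mass + p[dmin] <= beta:
        mass += p[dmin]
        dmin += 1

    # Trim the right tail the same way, never crossing dmin.
    dmax, mass = len(p) - 1, 0.0
    while dmax > dmin and mass + p[dmax] <= beta:
        mass += p[dmax]
        dmax -= 1

    return dmin, dmax


# Toy distribution over lengths 0..6 (made-up numbers).
toy = [0.0, 0.001, 0.1, 0.798, 0.1, 0.001, 0.0]
dmin, dmax = band_from_length_distribution(toy, beta=0.01)
kept = dmax - dmin + 1
print(f"band: [{dmin}, {dmax}], cells kept: {kept} of {len(toy)}")
```

Because such a band depends only on the query model and not on the target sequence, it could be computed once before the search, and the dynamic programming recursion would then skip lengths outside `[dmin, dmax]` for that state.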

Publication types

Evaluation Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Base Sequence
Database Management Systems*
Databases, Genetic*
Information Storage and Retrieval / methods*
Molecular Sequence Data
RNA / chemistry*
RNA / genetics*
Sequence Alignment / methods*
Sequence Analysis, RNA / methods*
Sequence Homology, Nucleic Acid

Substances

RNA
