Site identification in high-throughput RNA-protein interaction data

Philip J Uren; Emad Bahrami-Samani; Suzanne C Burns; Mei Qiao; Fedor V Karginov; Emily Hodges; Gregory J Hannon; Jeremy R Sanford; Luiz O F Penalva; Andrew D Smith

doi:10.1093/bioinformatics/bts569

Bioinformatics. 2012 Dec 1;28(23):3013-20. doi: 10.1093/bioinformatics/bts569. Epub 2012 Sep 28.

Authors

Philip J Uren¹, Emad Bahrami-Samani, Suzanne C Burns, Mei Qiao, Fedor V Karginov, Emily Hodges, Gregory J Hannon, Jeremy R Sanford, Luiz O F Penalva, Andrew D Smith

Affiliation

¹ Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA.

Abstract

Motivation: Post-transcriptional and co-transcriptional regulation is a crucial link between genotype and phenotype. The central players are the RNA-binding proteins, and experimental technologies for probing their activities, such as cross-linking with immunoprecipitation (CLIP-seq) and RIP-seq, have advanced rapidly over the past decade. However, statistically robust, flexible computational methods for identifying binding sites from high-throughput immunoprecipitation assays are largely lacking.

Results: We introduce a method for site identification that provides four key advantages over previous methods: (i) it can be applied to all variations of CLIP and RIP-seq technologies, (ii) it accurately models the underlying read-count distributions, (iii) it allows external covariates, such as transcript abundance (which we demonstrate is highly correlated with read count), to inform the site identification process and (iv) it allows for direct comparison of site usage across cell types or conditions.

Availability and implementation: We have implemented our method in a software tool called Piranha. Source code and binaries, licensed under the GNU General Public License (version 3), are freely available for download from http://smithlab.usc.edu.

Contact: andrewds@usc.edu

Supplementary information: Supplementary data available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Base Sequence
Binding Sites
Computational Biology / methods
HEK293 Cells
HeLa Cells
High-Throughput Nucleotide Sequencing / methods
Humans
RNA / genetics
RNA-Binding Proteins / genetics
Sequence Analysis, RNA / methods*
Software*

Substances

RNA-Binding Proteins
RNA
