RT Journal Article
SR Electronic
T1 Binding specificities of human RNA binding proteins towards structured and linear RNA sequences
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 317909
DO 10.1101/317909
A1 Arttu Jolma
A1 Jilin Zhang
A1 Estefania Mondragón
A1 Teemu Kivioja
A1 Yimeng Yin
A1 Fangjie Zhu
A1 Quaid Morris
A1 Timothy R. Hughes
A1 Louis James Maher III
A1 Jussi Taipale
YR 2019
UL http://biorxiv.org/content/early/2019/03/01/317909.abstract
AB Sequence-specific RNA-binding proteins (RBPs) control many important processes affecting gene expression. They regulate RNA metabolism at multiple levels, by affecting splicing of nascent transcripts, RNA folding, base modification, transport, localization, translation and stability. Despite their central role in most aspects of RNA metabolism and function, most RBP binding specificities remain unknown or incompletely defined. To address this, we have assembled a genome-scale collection of RBPs and their RNA binding domains (RBDs), and assessed their specificities using high-throughput RNA-SELEX (HTR-SELEX). Approximately 70% of RBPs for which we obtained a motif bound to short linear sequences, whereas ∼30% preferred structured motifs folding into stem-loops. We also found that many RBPs can bind to multiple distinctly different motifs. Analysis of the matches of the motifs on human genomic sequences suggested novel roles for many RBPs in regulation of splicing, and revealed RBPs that are likely to control specific classes of transcripts. Global analysis of the motifs also revealed an enrichment of G and U nucleotides. Masking of G and U by RBPs is expected to increase the specificity of RNA folding, as both G and U can pair with two other RNA bases via canonical Watson-Crick or G-U base pairs.
The collection, containing 145 high-resolution binding specificity models for 86 RBPs, is the largest systematic resource for the analysis of human RBPs, and will greatly facilitate future analysis of the various biological roles of this important class of proteins.