TY  - JOUR
T1  - A Deep Boosting Based Approach for Capturing the Sequence Binding Preferences of RNA-Binding Proteins from High-Throughput CLIP-Seq Data
JF  - bioRxiv
DO  - 10.1101/086421
SP  - 086421
AU  - Shuya Li
AU  - Fanghong Dong
AU  - Yuexin Wu
AU  - Sai Zhang
AU  - Chen Zhang
AU  - Xiao Liu
AU  - Tao Jiang
AU  - Jianyang Zeng
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/11/08/086421.abstract
N2  - Characterizing the binding behaviors of RNA-binding proteins (RBPs) is important for understanding their functional roles in gene expression regulation. However, current high-throughput experimental methods for identifying RBP targets, such as CLIP-seq and RNAcompete, usually suffer from the false positive and false negative issues. Here, we develop a deep boosting based machine learning approach, called DeBooster, to accurately model the binding sequence preferences and identify the corresponding binding targets of RBPs from CLIP-seq data. Comprehensive validation tests have shown that DeBooster can outperform other state-of-the-art approaches in predicting RBP targets and recover false negatives that are common in current CLIP-seq data. In addition, we have demonstrated several new potential applications of DeBooster in understanding the regulatory functions of RBPs, including the binding effects of the RNA helicase MOV10 on mRNA degradation, the influence of different binding behaviors of the ADAR proteins on RNA editing, as well as the antagonizing effect of RBP binding on miRNA repression. Moreover, DeBooster may provide an effective index to investigate the effect of pathogenic mutations in RBP binding sites, especially those related to splicing events. We expect that DeBooster will be widely applied to analyze large-scale CLIP-seq experimental data and can provide a practically useful tool for novel biological discoveries in understanding the regulatory mechanisms of RBPs.
The source code of DeBooster can be downloaded from http://github.com/dongfanghong/deepboost.
ER  - 