PT - JOURNAL ARTICLE
AU - Huijuan Feng
AU - Suying Bao
AU - Sebastien M. Weyn-Vanhentenryck
AU - Aziz Khan
AU - Justin Wong
AU - Ankeeta Shah
AU - Elise D. Flynn
AU - Chaolin Zhang
TI - Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites
AID - 10.1101/428615
DP - 2018 Jan 01
TA - bioRxiv
PG - 428615
4099 - http://biorxiv.org/content/early/2018/09/27/428615.short
4100 - http://biorxiv.org/content/early/2018/09/27/428615.full
AB - RNA-binding proteins (RBPs) regulate post-transcriptional gene expression by recognizing short and degenerate sequence elements in their target transcripts. Despite the expanding list of RBPs with in vivo binding sites mapped genome-wide using crosslinking and immunoprecipitation (CLIP), defining precise RBP binding specificity remains challenging. We previously demonstrated that the exact protein-RNA crosslink sites can be mapped using CLIP data at single-nucleotide resolution and observed that crosslinking frequently occurs at specific positions in RBP motifs. Here we have developed a computational method, named mCross, to jointly model RBP binding specificity while precisely registering the crosslinking position in motif sites. We applied mCross to 112 RBPs using ENCODE eCLIP data and validated the reliability of the resulting motifs by genome-wide analysis of allelic binding sites also detected by CLIP. We found that the prototypical SR protein SRSF1 recognizes GGA clusters to regulate splicing in a much larger repertoire of transcripts than previously appreciated.