RT Journal Article
SR Electronic
T1 Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 428615
DO 10.1101/428615
A1 Huijuan Feng
A1 Suying Bao
A1 Sebastien M. Weyn-Vanhentenryck
A1 Aziz Khan
A1 Justin Wong
A1 Ankeeta Shah
A1 Elise D. Flynn
A1 Chaolin Zhang
YR 2018
UL http://biorxiv.org/content/early/2018/09/27/428615.abstract
AB RNA-binding proteins (RBPs) regulate post-transcriptional gene expression by recognizing short and degenerate sequence elements in their target transcripts. Despite the expanding list of RBPs with in vivo binding sites mapped genomewide using crosslinking and immunoprecipitation (CLIP), defining precise RBP binding specificity remains challenging. We previously demonstrated that the exact protein-RNA crosslink sites can be mapped using CLIP data at single-nucleotide resolution and observed that crosslinking frequently occurs at specific positions in RBP motifs. Here we have developed a computational method, named mCross, to jointly model RBP binding specificity while precisely registering the crosslinking position in motif sites. We applied mCross to 112 RBPs using ENCODE eCLIP data and validated the reliability of the resulting motifs by genome-wide analysis of allelic binding sites also detected by CLIP. We found that the prototypical SR protein SRSF1 recognizes GGA clusters to regulate splicing in a much larger repertoire of transcripts than previously appreciated.