Transcriptome-wide profiling of mammalian spliceosome and branchpoints with iCLIP
=================================================================================

* Michael Briese
* Nejc Haberman
* Christopher R Sibley
* Anob Chakrabarti
* Zhen Wang
* Julian König
* David Perera
* Vihandha O. Wickramasinghe
* Ashok R. Venkitaraman
* Nicholas M Luscombe
* Christopher W. Smith
* Tomaž Curk
* Jernej Ule

## Abstract

Studies of spliceosomal interactions are challenging due to their dynamic nature. Here we employed spliceosome iCLIP, which immunoprecipitates SmB along with snRNPs and auxiliary RNA binding proteins (RBPs), to simultaneously map human spliceosome engagement with snRNAs and pre-mRNAs. We identify nine sites on pre-mRNAs that overlap with position-dependent binding profiles of 15 RBPs. We reveal over 50,000 branchpoints (BPs), indicating that most human introns use a primary BP for spliceosome assembly, whereas alternative BPs are amplified when analysing intron lariats. Notably, we find that the binding patterns of many RBPs, especially the components of SF3 complex, are affected by RNA structure and position of BPs. Moreover, the stability of RNA structures around BPs distinguishes exons regulated by RBPs involved in early exon definition from those regulated by the SF3B components and PRPF8. These insights exemplify spliceosome iCLIP as a broadly applicable method for transcriptomic studies of BPs and splicing mechanisms.

## Introduction

The vast majority of mammalian transcripts undergo splicing, a process through which introns are excised and respective exons adjoined. Splicing is integral to gene expression, whilst alternative splicing broadens the diversity of the transcriptome. It is a multi-step process in which multiple snRNPs and associated splicing factors bind at specific positions around intron boundaries in order to assemble an active spliceosome through a series of remodeling steps.
The splicing reactions are coordinated by dynamic pairings between different snRNAs, between snRNAs and pre-mRNA, and by protein-RNA contacts. Spliceosome assembly begins with ATP-independent binding of U1 snRNP at the 5’ss, as well as binding of many auxiliary RBPs to pre-mRNA, as part of complex E. These include the U2 small nuclear RNA auxiliary factors 1 and 2 (U2AF1 and U2AF2, also known as U2AF35 and U2AF65) that bind at the 3’ss (Wahl et al., 2009). ATP-dependent remodeling then leads to the formation of complex A. This enables U2 snRNP to contact the BP, where it is stabilized through interactions with SF3a, SF3b and U2AF2. Next, U4/U6 and U5 snRNPs are recruited to form complex B. The action of many RNA helicases and pre-mRNA processing factor 8 (PRPF8) then facilitates rearrangements of snRNP interactions and establishment of the catalytically competent Bact and C complexes. These catalyze the two trans-esterification reactions that lead to lariat formation, intron removal and exon ligation (Scheres and Nagai, 2017). While techniques such as NET-seq and ribosome profiling monitor transcription or translation through RNA protection by a single macromolecular machinery (Churchman and Weissman, 2011; Ingolia et al., 2009), studies of the splicing reaction are particularly challenging due to the multi-component and dynamic assembly of the spliceosome on the pre-mRNA substrate (Scheres and Nagai, 2017). Therefore, splicing reactions have been studied mainly *in vitro* with model substrates, or indirectly *in vivo* through monitoring the outcome of splicing with techniques such as RNA-seq. More recently, high-throughput techniques to study RNA interactions of the spliceosomal machinery have been developed (Burke et al., 2018; Chen et al., 2018; Wickramasinghe et al., 2015).
For studies in mammalian cells, we have adapted the individual nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) to develop spliceosome iCLIP, which defines the positions of spliceosomal crosslinks on pre-mRNAs at nucleotide resolution (Wickramasinghe et al., 2015). In yeast, “spliceosome profiling” has been developed through affinity purification of tagged U2·U5·U6·NTC complex from *Schizosaccharomyces pombe* to monitor its interactions with an RNA footprinting-based strategy (Burke et al., 2018; Chen et al., 2018). It is currently unclear if this method can be applied to mammalian cells, which might be more sensitive to the introduction of affinity tags into splicing factors. Notably, none of these studies has examined the full complexity of the interactions of diverse RBPs on pre-mRNAs from the earliest to the latest stages of spliceosomal assembly. A second challenge in understanding splicing mechanisms is the need to assign the position of branchpoints (BPs). The sequence consensus of mammalian BPs is less defined than in yeast, and therefore experimental methods are important to assess computational predictions. High-throughput methods to identify BPs have so far relied on lariat-spanning RNA-seq reads that cross from the 5’ portion of the intron, over the BP, and finally finish in the 3’ portion of the intron upstream of the BP (Mercer et al., 2015; Pineda and Bradley, 2018; Taggart et al., 2017; Vogel et al., 1997). However, both the RNA-seq methods and computational predictions tend to identify multiple potential BPs in most introns, and it therefore remains unclear which BP is of primary functional importance (Corvelo et al., 2010; Kol et al., 2005; Pagii and Bejerano, 2018; Pineda and Bradley, 2018). Moreover, since lariat-spanning RNA-seq reads are very rare, they require substantial sequencing depth.
In yeast, spliceosome profiling successfully assigned the positions of BPs by monitoring the position of cDNAs truncating at BPs (Chen et al., 2018), indicating that a similar approach could also be applied to mammalian cells. Here, we use spliceosome iCLIP to simultaneously monitor interactions of spliceosomal RBPs across all the stages of the splicing cycle in human cell lines and mouse brain. We find that this method can simultaneously map the crosslink profiles of core and accessory spliceosomal factors that are known to participate across the diverse stages of the splicing cycle. In spite of this high diversity of purified factors, the nucleotide precision of iCLIP reveals 7 binding peaks corresponding to distinct RBPs that differ in their requirement for ATP or for the factor PRPF8. Spliceosome iCLIP also purifies intron lariats to identify the positions of BPs. This allowed us to identify BPs in ~65% of introns within expressed human genes, which contain more canonical sequence and structure features when compared to those identified by RNA-seq. We have used the identified BPs to examine the surrounding binding of spliceosomal RBPs, which demonstrates that a single BP tends to be primarily responsible for the assembly of SF3 and the associated spliceosomal complex, even though alternative BPs are detected by lariat-derived reads. Moreover, we identify complementary roles of U2AF and SF3 complexes in BP definition; BPs with weak scores are coupled with stronger binding of the U2AF complex, whilst BPs with higher scores that tend to be presented within stem-loops are coupled with stronger binding and function of the SF3 complex. Taken together, these findings demonstrate the value of spliceosome iCLIP for transcriptomic studies of BP definition and spliceosomal interactions with pre-mRNAs.
## Results

### Spliceosome iCLIP identifies interactions between splicing factors, snRNAs and pre-mRNAs

SmB/B’ proteins are part of the highly stable Sm core common to all spliceosomal snRNPs except U6 (Kambach et al., 1999), making them suitable candidates for enriching snRNPs via immunopurification. In order to adapt iCLIP for the study of a multi-component machine like the spliceosome, we immunopurified the SmB/B’ proteins using the 18F6 or 12F5 monoclonal antibodies (Carissimi et al., 2006) (see online Methods for details) (Figure 1A, Figure S1A-B), using a range of conditions with differing stringency. First, we employed the standard condition used for iCLIP experiments in cultured human cells, containing 0.1% SDS and all other detergents that lead to a stringent, but non-denaturing purification (termed ‘medium’, Table S1). We examined the crosslinked radioactive protein-RNA complexes on the SDS-PAGE gel upon use of a high RNase concentration, which is optimized to create ~10 nt long RNA fragments (Huppertz et al., 2014). Two strong radioactive bands were present, the first of which (at ~25 kDa) corresponds to the molecular weight of the SmB-RNA complex (Figure 1B). In addition, a band at ~20 kDa may correspond to another Sm protein, or another RBP that directly interacts with the Sm ring. Next, we examined the radioactive protein-RNA complexes with the use of a lower RNase concentration, which is optimized to create an average of 200 nt long RNA fragments (with a full range of 50-500 nt) (Huppertz et al., 2014). Low RNase treatment should allow snRNAs to remain generally intact, and therefore the snRNAs could serve as a scaffold to purify a multi-protein complex (Figure 1A). In agreement, low RNase treatment led to appearance of a diffuse signal between 30-200 kDa, indicating that protein-RNA complexes of diverse sizes were present (Figure 1B).
The same result was obtained when the SmB/B’ proteins were immunopurified using three different monoclonal antibodies (Figure S1C).

[Table S1.](http://biorxiv.org/content/early/2018/06/22/353599/T1) Related to Figure 1. iCLIP lysis and wash buffer compositions.

![Figure 1.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F1.medium.gif)

[Figure 1.](http://biorxiv.org/content/early/2018/06/22/353599/F1) Spliceosome iCLIP identifies protein interactions with snRNAs and splicing substrates. (A) Schematic representation of the spliceosome iCLIP method performed under conditions of varying lysis stringency. (B) Autoradiogram of crosslinked RNPs immunopurified with SmB/B’ from HeLa cells following digestion with high (++) or low (+) amounts of RNase I in medium-stringency lysis buffer. The dotted line depicts the region excised from the nitrocellulose membrane for spliceosome iCLIP. (C) Genomic distribution of spliceosome iCLIP cDNAs.

Next, to examine if the signal between 30-200 kDa corresponds to the Sm proteins, or other RBPs that are associated with snRNPs, we used a denaturing condition (termed ‘stringent’) that employed urea to specifically purify SmB via a Flag tag (Table S1). In studies of other RBPs, this condition minimised co-purification of additional proteins apart from that being purified (Huppertz et al., 2014) (Figure 1A). Indeed, we observed a single band corresponding to the molecular weight of SmB-RNA complex, and the signal remained largely unchanged under the low RNase condition (Figure S1D). This indicates that purification of a multi-protein snRNP complex requires low RNase and purification conditions that can preserve protein-protein interactions.
Our standard iCLIP protocol employs a high concentration of detergents in the lysis buffer, followed by washing buffer with 1M NaCl (medium, Table S1), which is not compatible with many protein-protein interactions, except the stable complexes such as snRNPs (Figure 1A). Therefore, we established a modified purification with decreased concentration of detergents in the lysis buffer, and only 0.1M NaCl in the washing buffer (termed ‘mild’, Table S1). Under this mild purification condition, the diffuse signal at 30-200 kDa strongly increased upon low RNase treatment, indicating that the mild purification allows most efficient purification of multi-protein complexes, such as snRNPs and their associated proteins (Figure 1A, Figure S1E). To produce cDNA libraries with spliceosome iCLIP, we immunoprecipitated SmB under conditions of three different levels of stringency from UV-crosslinked cell lysates, and isolated a broad size distribution of protein-RNA complexes in order to recover the greatest possible diversity of spliceosomal protein-RNA interactions (Figure 1B, S1C-E). Lysates were treated with low RNase condition that leads to cDNAs of optimal size for comprehensive crosslink determination (Haberman et al., 2017), and three biological replicate samples of cDNA libraries were prepared (Table S2 and S3) (Huppertz et al., 2014). As in previous iCLIP studies (König et al., 2010), the nucleotide preceding each cDNA was considered as the crosslink site and cDNAs at each crosslink site were summed as cDNA counts. When stringent conditions were used, >75% of spliceosomal crosslinking mapped to snRNAs (Figure 1C), which agrees with its specific purification of Flag-tagged SmB. However, upon mild and medium conditions of spliceosome iCLIP, the proportion of snRNA crosslinking was reduced to approximately 10% and replaced by crosslinking to introns and exons, indicating that these conditions allow mapping of snRNP-associated proteins on pre-mRNAs (Figure 1A and C). 
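The crosslink-site convention described above (the nucleotide preceding each cDNA is taken as the crosslink site, and cDNAs are summed per site) can be illustrated with a minimal sketch. The function names, the BED-like coordinate convention and the toy records are assumptions made for illustration and are not taken from the study's pipeline; the input is also assumed to be already collapsed for PCR duplicates.

```python
from collections import Counter


def crosslink_site(chrom, start, end, strand):
    """Return the position of the nucleotide preceding a mapped cDNA.

    Assumes BED-like coordinates (0-based start, exclusive end), which is
    an illustrative convention rather than one stated in the text.
    """
    if strand == "+":
        # The cDNA begins at `start`; the preceding nucleotide is one upstream.
        return (chrom, start - 1, strand)
    if strand == "-":
        # On the minus strand the cDNA begins at `end - 1`, so the
        # preceding nucleotide (in transcript orientation) is `end`.
        return (chrom, end, strand)
    raise ValueError(f"unknown strand: {strand!r}")


def cdna_counts(cdnas):
    """Sum cDNAs per crosslink site; `cdnas` holds (chrom, start, end, strand)."""
    return Counter(crosslink_site(*cdna) for cdna in cdnas)


# Hypothetical toy input: three mapped cDNAs.
example = [
    ("chr1", 100, 140, "+"),
    ("chr1", 100, 130, "+"),  # same start, different length: same site
    ("chr1", 50, 90, "-"),
]
counts = cdna_counts(example)
```

In this toy input the two plus-strand cDNAs share one crosslink site regardless of their length, which is why cDNA counts rather than read coverage are used as the unit of signal.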
[Table S2.](http://biorxiv.org/content/early/2018/06/22/353599/T2) Related to Figure 1 and 2. Reverse transcription primer sequences.

### Spliceosome iCLIP identifies seven crosslinking peaks on pre-mRNAs

Assembly of the spliceosome on pre-mRNA is guided by three main landmarks: the 5’ss, 3’ss and BP. Therefore, we evaluated if spliceosomal crosslinks locate at specific positions relative to these landmarks. The position of BPs was determined by using published computationally predicted BPs (Pagii and Bejerano, 2018). Notably, when we plotted summarized spliceosomal crosslinking at these landmarks of all introns, 7 peaks of crosslinking were observed (Figure 2A, Figure S2A). In addition, two peaks precisely align with the start of the intron and putative BPs. We term these positions A and B, and these are discussed in detail in due course (Figure 2A, Figure S2A). The center of each peak was seen at 15 nt upstream of the 5’ss (peak 1), the nucleotide at the start of the intron (position A), 10 nt downstream of the 5’ss (peak 2), 31 nt downstream of the 5’ss (peak 3), 26 nt upstream of the BP (peak 4), 20 nt upstream of the BP (peak 5), the nucleotide of the BP (position B), 11 nt upstream of the 3’ss (peak 6) and 3 nt upstream of the 3’ss (peak 7). The medium and mild conditions led to the same positions of the peaks, with the signal of the mild condition generally being stronger, especially at the 3’ss (Figure S2A). This is consistent with the stronger signal of co-purified complexes on the SDS-PAGE gel under the mild condition (Figure S1D).

![Figure 2.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F2.medium.gif)

[Figure 2.](http://biorxiv.org/content/early/2018/06/22/353599/F2) Analysis of spliceosomal interactions with pre-mRNAs in vitro and in vivo.
(A) RNA map of summarised crosslinking at all exon-intron and intron-exon boundaries detected by spliceosome iCLIP from Cal51 cells to identify major binding peaks, and to monitor changes between the WT and PRPF8 KD. Individual lines of comparable colour represent replicate datasets. (B) Normalized spliceosome iCLIP cDNA counts on the C6orf10 in vitro splicing substrate. Exons are marked by grey boxes, intron by a line, and the BP by a green dot. The positions of peak crosslinking are marked by numbers and letters corresponding to the peaks in Figure 2A. (C) Schematic description of the three-way junctions of intron lariats. The three-way junction is produced after limited RNase I digestion of intron lariats, followed by ligation of the L3 adaptor to the two 3’ ends of the three-way junction. This can lead to cDNAs that do not truncate at sites of protein-RNA crosslinking, but rather at the three-way junction of intron lariats. These cDNAs initiate from the end of the intron and truncate at the BP (position B), or initiate downstream of the 5’ splice site and truncate at the first nucleotide of the intron (position A).

We used the mild condition to investigate how PRPF8 knockdown (KD) affects spliceosomal interactions in Cal51 cells (Figure S2B). PRPF8 is an integral U5 snRNP component, and therefore part of complexes B and C, where it contacts residues of U5 and U6 snRNAs, as well as pre-mRNA at both the splice sites and BP (Teigelkamp et al., 1995). We have previously used spliceosome iCLIP to show that PRPF8 KD affects the crosslinking pattern of spliceosomal factors at the 5’ss (Wickramasinghe et al., 2015). Here we additionally examined the crosslinking around BPs to find that PRPF8 KD reproducibly leads to diminished crosslinking at peaks 1 and 3-5 (Figure 2A). Moreover, we also observed a major decrease of reads truncating at the positions A and B, consistent with the expectation that production of lariats should diminish upon PRPF8 KD.
In contrast, crosslinking at peaks 2 and 6 is increased upon PRPF8 KD. This demonstrates that spliceosome iCLIP monitors pre-mRNA interactions of factors that interact with pre-mRNAs at different stages of the splicing reaction.

### *in vitro* spliceosome iCLIP defines the ATP-dependence of crosslinking peaks

Having identified the characteristic peaks of spliceosomal crosslinking *in vivo*, we then tested if these peaks could also be observed under the controlled conditions of an *in vitro* splicing reaction. To this end, we added an exogenous pre-mRNA splicing substrate to HeLa nuclear extract in the presence or absence of ATP. The RNA substrate was produced by *in vitro* transcription of a minigene construct containing a short intron and flanking exons from the human C6orf10 gene. Gel electrophoresis analysis of splicing products confirmed that ATP was required for the formation of intron lariats and other splicing products (Figure S2C). We performed spliceosome iCLIP on the splicing reactions and confirmed the presence of the spliced product with exon-exon junction reads, which were specific to the library produced from extracts with ATP (+ATP) (Figure S2D-E, Table S4). This indicates that spliceosome iCLIP is suitable to monitor progression of spliceosomal assembly on pre-mRNAs during the splicing reaction.

[Table S3.](http://biorxiv.org/content/early/2018/06/22/353599/T3) Related to Figure 1 and 2. Number of Sm iCLIP sequence reads mapping to the genome.

[Table S4.](http://biorxiv.org/content/early/2018/06/22/353599/T4) Related to Figure 2. Number of Sm iCLIP reads mapping to C6orf10 unspliced and spliced substrates.

We then visualized the crosslinking on the substrate RNA, and marked the positions of peaks that corresponded best to those found on endogenous transcripts (Figure 2B).
A similar extent of crosslinking was seen at peaks 1, 2, 6 and 7 in the presence or absence of ATP, indicating that these peaks correspond to ATP-independent contacts of early spliceosomal factors. These crosslinks could originate from RBPs and snRNPs that are capable of binding pre-mRNA in a pre-splicing H complex or the ATP-independent E complex (Jurica et al., 2002), including TIA1/TIAL1 and U2AF2 proteins, which tend to bind at positions corresponding to peaks 2 and 6, respectively (Bennett et al., 1992; Forch et al., 2000). Both proteins bind U-rich sites in an ATP-independent manner, in agreement with the presence of U-tracts at peaks 2 and 6. Peaks 4 and 5, located in the region 19-40 nt upstream of the BP, and positions A and B were most dependent on the presence of ATP.

### Lariat-specific reads are efficiently obtained with spliceosome iCLIP

Following crosslinking, the peptide that remains bound to the RNA after digestion of the RBP can lead to termination of reverse transcription, and production of so-called ‘truncated cDNAs’ (Lee and Ule, 2018). The predominance of truncated cDNAs in iCLIP libraries has been validated by multiple means (Haberman et al., 2017; Sugimoto et al., 2012), and therefore our analysis of iCLIP data refers to the nucleotide preceding the iCLIP read on the reference genome as the ‘crosslink site’, and the same applies to derived methods, such as eCLIP. However, in the case of spliceosome iCLIP, we expect that the presence of intron lariats that interact with splicing factors could lead to additional types of truncated cDNAs. If an RBP is crosslinked in the vicinity of the BP, it could remain bound to the three-way-junction that is formed upon fragmentation by RNase, where the 5’ end of the intron is linked via a 2’-5’ phosphodiester bond to the BP (Figure 2C). Such three-way-junction RNAs present two available 3’ ends for ligation to adapters, and the reads derived from them can truncate at the BP (i.e.
position B) or at the start of the intron (i.e. position A), rather than representing the RBP crosslink site. In agreement with this hypothesis, we find that cDNA truncations at positions A and B are dramatically decreased in conditions where intron lariats are not expected to be present: *in vivo* upon PRPF8 KD (Figure 2A), or *in vitro* in the absence of ATP (Figure 2B). Taken together, we find that *in vivo* depletion of PRPF8 and *in vitro* depletion of ATP have a similar effect on crosslinking of splicing factors to pre-mRNAs, and also lead to loss of intron lariats. In particular, peaks upstream of BPs are diminished under both conditions, indicating that they represent interactions of factors that are involved in the later stages of spliceosomal assembly.

### iCLIP identifies >50,000 human branchpoints (BPs) with canonical features

Our analysis indicates that cDNAs truncating at BPs might result from the presence of three-way junctions that cause termination of the reverse transcriptase at the BP (Figure 2C). We found that the medium purification condition was optimal to identify cDNAs that truncate at position B (Figure S2A), and we therefore performed spliceosome iCLIP under medium purification conditions from UV crosslinked cell lysates (see Methods). Next, we examined the nucleotide composition around the starts of cDNAs that end at the last nucleotide of introns, which are most likely derived from intron lariats (see Figure 2C). Interestingly, the starts of these cDNAs overlap with the YUNAY motif, which is the known consensus sequence of BPs (Sharp and Burge, 1997) (Figure 3A). Further, these cDNAs have higher enrichment of mismatches of adenosines at their first nucleotide (Figure 3B), which is consistent with mismatch, insertion and deletion errors during reverse transcription across the three-way junction of the branch point (Mercer et al., 2015; Vogel et al., 1997).
In contrast, the starts of all remaining spliceosome iCLIP cDNAs overlap with a uridine-rich motif (Figure 3C), in agreement with the increased propensity of protein-RNA crosslinking at uridine-rich motifs (Sugimoto et al., 2012). We conclude that cDNAs overlapping with intron ends are mainly derived from intron lariats and truncate at BPs, while the remaining cDNAs mainly represent RBP crosslink sites.

![Figure 3.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F3.medium.gif)

[Figure 3.](http://biorxiv.org/content/early/2018/06/22/353599/F3) Comparison of BPs identified by spliceosome iCLIP or RNA-seq lariat reads. (A) Weblogo around the nucleotide preceding all spliceosome iCLIP reads that align with ends of introns. (B) Enrichment of mismatches at the first nucleotide of spliceosome iCLIP reads that overlap with ends of introns, compared to remaining iCLIP reads. (C) Weblogo around the nucleotide preceding all remaining spliceosome iCLIP reads (i.e., those that do not align with ends of introns). (D) The number of top BPs identified by published RNA-seq (Pineda and Bradley, 2018) (i.e., the BP with most lariat-spanning reads in each intron) at positions relative to BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (E) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a BP assigned by lariat-spanning reads from RNA-seq (Pineda and Bradley, 2018). They are divided into three categories based on the distance between the iCLIP BP and the top RNA-seq BP. (F) Weblogo of iCLIP BPs that overlap with RNA-seq BPs. (G) Weblogo of iCLIP BPs that are >5 nt away from RNA-seq BP. (H) Weblogo of RNA-seq BPs that are >5 nt away from iCLIP BP.
(I, J) The 80 nucleotide RNA region centered on the BP was used to calculate pairing probability with the RNAfold program with the default parameters (Lorenz et al., 2011), and the average pairing probability of each nucleotide around BPs was calculated for each category of BPs. (K) RNA map of summarised crosslinking of spliceosome iCLIP from Cal51 cells around each category of BPs. (L) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP in the indicated cell lines, around iCLIP BPs that are located upstream (dotted line) or downstream (filled line) of the RNA-seq BPs. (M) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP in the indicated cell lines, around RNA-seq BPs that are located upstream (dotted line) or downstream (filled line) of the iCLIP BPs.

We used the spliceosome iCLIP cDNAs that overlap with intron ends to identify putative BPs. We only used those cDNAs that start at adenines, which identified sites in 35,056 introns that may act as BPs. The more distal BPs would not be identified by this initial approach due to our 41 nt read-length limit, and therefore we proceeded to a second step in introns where the initial approach did not identify any BPs. We analysed all cDNAs, and overlapped their truncation sites with BPs computationally predicted in 2010 (Corvelo et al., 2010). We selected the position with the highest number of truncated cDNAs, which identified candidate BPs in another 15,756 introns. Thus, we collectively identified candidate BPs in 50,812 introns of the most highly expressed genes (FPKM>10 as determined by RNA-seq). Since these genes in total contain 78,894 annotated introns, we were able to identify putative BPs in 64% of introns in expressed genes.
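The two-step assignment described above can be sketched for a single plus-strand intron as follows. This is a simplified illustration and not the pipeline used in the study: the function name, the coordinate convention, the use of the first cDNA nucleotide as the truncation position in both steps, and the choice of the best-supported position in step 1 are all assumptions.

```python
from collections import Counter


def call_branchpoint(intron, cdnas, seq, predicted):
    """Two-step candidate BP call for one plus-strand intron.

    intron    : (start, end), 0-based, end exclusive (assumed convention)
    cdnas     : list of (start, end) for mapped cDNAs
    seq       : string indexable by genomic position (toy stand-in)
    predicted : set of computationally predicted BP positions
    Returns (position, evidence), or (None, None) if nothing is found.
    """
    i_start, i_end = intron

    # Step 1: cDNAs that end at the last intron nucleotide and start at an A.
    lariat = Counter(
        s for s, e in cdnas
        if e == i_end and i_start <= s < i_end and seq[s] == "A"
    )
    if lariat:
        return lariat.most_common(1)[0][0], "lariat_end"

    # Step 2: any cDNA whose start coincides with a predicted BP;
    # keep the predicted position supported by the most cDNAs.
    overlap = Counter(s for s, _ in cdnas if s in predicted)
    if overlap:
        return overlap.most_common(1)[0][0], "predicted_overlap"

    return None, None


seq = "GTAAGTCTCTAACTTTTCAG"  # toy intron; CTAAC places the BP A at index 11
intron = (0, len(seq))

step1 = call_branchpoint(intron, [(11, 20), (11, 20), (13, 20)], seq, set())
step2 = call_branchpoint(intron, [(5, 15), (10, 16), (10, 18)], seq, {10, 11})

# Totals reported in the text: introns from step 1 plus introns from step 2.
total_introns_with_bp = 35_056 + 15_756
fraction = total_introns_with_bp / 78_894
```

The final two lines reproduce the arithmetic in the text: the two steps sum to 50,812 introns, and dividing by the 78,894 annotated introns gives a fraction of about 0.64.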
Next, we compared iCLIP BPs with studies that used lariat-spanning reads from RNA-seq libraries to detect BPs with three different strategies: 59,359 BPs identified by exoribonuclease digestion and targeted RNA-sequencing (Mercer et al., 2015), 36,078 BPs identified by lariat-spanning reads refined by U2 snRNP/pre-mRNA base-pairing models (Taggart et al., 2017), or 130,294 BPs identified by analysis of lariat-spanning reads in 17,164 RNA sequencing data sets (Pineda and Bradley, 2018). In each case, we focused on introns where a BP was defined both by the RNA-seq study and iCLIP, and we used only a single BP per intron for each RNA-seq study (the one with most lariat-spanning reads), to find a nucleotide resolution overlap with all studies (Figures S3A-B, 3D). In total, 9348 (55% of Mercer), 6853 (45% of Taggart) and 11520 (54% of Pineda) BPs from the respective studies precisely overlapped with the BPs identified by spliceosome iCLIP (Figures S3C-D, 3E). As a first step to compare the features of the non-overlapping BPs, where the positions of BPs identified by RNA-seq (‘RNA-seq BPs’) do not overlap with those identified by spliceosome iCLIP (‘iCLIP BPs’), we examined the sequence consensus present at each class of BPs. The overlapping BPs, as well as the iCLIP BPs, were strongly enriched in the canonical YUNAY motif (Figures S3E-F, S3H-I, 3F-G), while the RNA-seq BPs had poor enrichment of the YUNAY motif, indicating that most are non-canonical (Figure S3G, S3J, 3H). As a second step, we examined the local RNA structure around each category of BPs (Figure 3I). Notably, both the overlapping and the iCLIP-specific BPs had decreased intramolecular pairing probability at the position of the BP. In contrast, the RNA-seq-specific BPs had the opposite feature, with increased pairing probability in the region of the putative BP (Figure 3I).
This demonstrates that the RNA-seq-specific BPs contain mainly non-canonical motifs that are in a structural conformation that is poorly accessible for pairing with U2 snRNP. In contrast, spliceosome iCLIP identifies over 50,000 BPs that contain canonical sequence and structural features. As a third step to compare the features of RNA-seq or iCLIP-specific BPs, we examined the crosslinking of spliceosomal RBPs around the non-overlapping BPs. First, we analysed the crosslink peaks identified by spliceosome iCLIP around the non-overlapping BPs (Figure 3J). The BPs identified by iCLIP were flanked by all the peaks at expected positions. This not only included the peak at position A, but also included peaks 4 and 5, even though the cDNAs truncated in these peaks were not used to identify the BPs. In contrast, all the peaks were shifted for BPs identified by RNA-seq: when these BPs were located upstream of iCLIP BPs, all peaks were shifted upstream of expected positions, and vice versa. We consolidated this finding by using independent CLIP data that has not contributed to BP identification: eCLIP and iCLIP data for PRPF8, SF3 and U2AF factors, which are known to crosslink close to BPs (Figure 3K-L). Overlap of SF3 and U2AF crosslinking at the expected positions of peaks 4-7 was good for the iCLIP-specific BPs (Figure 3K), but poor for RNA-seq-specific BPs (Figure 3L). Moreover, the two groups of RNA-seq-specific BPs had their peaks shifted by about 10 nt relative to each other. Thus, the RBP binding profiles defined by eCLIP or iCLIP in the Cal51, HeLa, K562 or HepG2 cells do not align to the RNA-seq-specific BPs, while the iCLIP-specific BPs align well to the binding profiles of these RBPs. In spite of the lack of alignment between the RNA-seq-specific BPs and SF3/U2AF complexes, we noticed that these BPs still partly aligned to the PRPF8 peak (Figure 3M).
### Insights into introns with alternative branch points

To gain further insight into the mechanisms that might contribute to the use of alternative BPs, we next compared the iCLIP BPs with the BPs that have been bioinformatically predicted. For this purpose, we used the most recent study that used a sequence-based deep learning predictor, LaBranchoR, which was reported to predict a correct branchpoint for over 90% of 3’ss (Pagii and Bejerano, 2018). Notably, the overlap between iCLIP and computational BPs was higher than with RNA-seq BPs (Figure 4A). In total, 31167 iCLIP BPs overlapped with the computationally top-scoring BP, while 11858 were over 5 nt apart from each other (Figure 4B). In only 2393 of these non-overlapping cases did the iCLIP-specific BP overlap with a lower ranking computational BP, indicating that most of the iCLIP-specific BPs were not easy to predict computationally. Both the overlapping and non-overlapping BPs have a very strong enrichment of the consensus YUNAY motif (Figure 4C-E). Interestingly, the BPs that were up to 5 nt apart tend to occur within A-rich sequences (Figure S4A-B). Moreover, all groups of BPs had strikingly decreased intramolecular pairing probability at the position of the BP, and high pairing probability on both sides around the BP, indicative of a propensity for the BP to be presented on a loop within a stem-loop structure (Figure 4F-G). We noticed that the non-overlapping BPs have an extended region of decreased pairing probability around the BPs, indicating that they do not form such a clear stem-loop. This indicates that a less defined RNA structure at BPs may be a factor contributing to the use of alternative BPs. Interestingly, the conservation of nucleotides is high immediately next to all types of BPs, while the region surrounding the iCLIP-specific BPs is more highly conserved (Figure 4H).
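The distance-based grouping used in this comparison (BPs at the same position, BPs up to 5 nt apart, and BPs more than 5 nt apart) can be sketched as below. The function name, the dictionary-based input and the intron identifiers are hypothetical and serve only to illustrate the three categories.

```python
def categorise_bps(iclip_bps, reference_bps, window=5):
    """Group introns by distance between iCLIP and reference BP positions.

    Both inputs map a (hypothetical) intron identifier to a single BP
    coordinate; only introns present in both are compared.
    """
    groups = {"overlapping": [], "within_window": [], "distant": []}
    for intron_id in sorted(iclip_bps.keys() & reference_bps.keys()):
        distance = abs(iclip_bps[intron_id] - reference_bps[intron_id])
        if distance == 0:
            groups["overlapping"].append(intron_id)
        elif distance <= window:
            groups["within_window"].append(intron_id)
        else:
            groups["distant"].append(intron_id)
    return groups


# Toy example with made-up coordinates.
iclip = {"intron_1": 1000, "intron_2": 2000, "intron_3": 3000, "intron_4": 4000}
reference = {"intron_1": 1000, "intron_2": 2003, "intron_3": 3012}
groups = categorise_bps(iclip, reference)
```

Here `intron_4` is ignored because it lacks a reference BP, mirroring the restriction of each comparison to introns where both approaches assigned a BP.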
![Figure 4.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F4.medium.gif)

[Figure 4.](http://biorxiv.org/content/early/2018/06/22/353599/F4) Comparison of computationally predicted BPs with those identified with spliceosome iCLIP. (A) The number of top computationally predicted BPs (Pagii and Bejerano, 2018) (i.e., the top scoring BP in each intron) at positions relative to BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (B) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a computationally predicted BP (Pagii and Bejerano, 2018). They are divided into three categories based on the distance between the iCLIP BP and the top computational BP. (C) Weblogo of iCLIP BPs that overlap with comp BP. (D) Weblogo of iCLIP BPs that are >5 nt away from comp BP. (E) Weblogo of comp BPs that are >5 nt away from iCLIP BP. (F, G) The 80 nucleotide RNA region centered on the BP was used to calculate pairing probability with the RNAfold program with the default parameters (Lorenz et al., 2011), and the average pairing probability of each nucleotide around BPs was calculated for each category of BPs, as defined by the legend. (H) Average mammalian conservation score of each nucleotide around BPs was calculated for each category of BPs, as defined by the legend. (I) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP in the indicated cell lines, around iCLIP BPs that are located upstream (dotted line) or downstream (filled line) of the computational BPs. (J) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP in the indicated cell lines, around computational BPs that are located upstream (dotted line) or downstream (filled line) of the iCLIP BPs.

Next, we examined the crosslinking of spliceosomal RBPs around the non-overlapping BPs.
Analysis of spliceosome iCLIP shows that the computation-specific BPs align better to the spliceosomal peaks 4 and 5 than the iCLIP-specific BPs (Figure S4C). Moreover, eCLIP and iCLIP data for SF3 and U2AF factors align better to the computation-specific BPs (Figure 4J) than to the iCLIP-specific BPs (Figure 4I). In contrast, the PRPF8 eCLIP data align with nucleotide precision to both sets of the non-overlapping BPs. As discussed earlier, PRPF8 eCLIP cDNA starts most likely reflect the presence of reads derived from intron lariats, which truncate at BPs (position B in Figure 2C). Thus, the computationally predicted BPs appear to coordinate the assembly of U2 snRNP and the associated SF3 and U2AF complexes, while both classes of BPs contribute to lariat formation. This indicates that most introns primarily use a single BP for spliceosome assembly, which can be computationally predicted, while the lariat-based methods amplify the detection of alternative BPs.

### The position of BPs defines the binding profiles of many spliceosomal RBPs

Transcriptome-wide RNA maps represent valuable tools to study the position-dependent binding profiles of many splicing factors (Ule et al., 2006; Witten and Ule, 2011). These maps showed that the function of RBPs is often determined by the position of their binding relative to splice sites, but less is known about their orientation relative to BPs. We therefore sought to identify the RBPs binding at peaks defined by the spliceosome iCLIP that are aligned to the BP (peaks 4, 5, B), and compare them to those binding at peaks aligned to the 3’ss (peaks 6 and 7). For this purpose, we examined published iCLIP data produced in our lab for the 17 previously studied RBPs (Attig et al., 2018), and eCLIP data from K562 and HepG2 cells for 112 RBPs provided by the ENCODE consortium (Van Nostrand et al., 2017).
For this analysis, we examined the introns containing the 31167 BPs that were identified both computationally and by iCLIP, which are likely the most reliable. We ranked the RBPs by the proportion of crosslink events mapping to each peak, and normalized them by the combined region encompassing 200 nt of intronic sequence upstream of the 3’ss and 50 nt of exonic sequence downstream of the 3’ss. Strikingly, a different set of RBPs was enriched at each peak (Figure 5, Table S5).

![Figure 5.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F5.medium.gif) [Figure 5.](http://biorxiv.org/content/early/2018/06/22/353599/F5)

Figure 5. Identification of RBPs overlapping with spliceosomal peaks at BPs and 3’ss. To systematically identify RBPs with crosslinking peaks that overlap with one of the peaks in spliceosomal iCLIP, we regionally normalized (relative to average crosslinking over −200..50 nt region relative to 3’ss) the crosslinking of each RBP, as shown in Fig. S5 and S6. We then calculated the average normalized crosslinking across the nucleotides within each peak, and ranked the RBPs by this normalized crosslinking enrichment. We analysed peaks 4-7, which are located close to 3’ss, and positions A and B, as marked on the top of each plot. The plot of top-ranking RBPs in each peak is shown on the left, and the full distribution of RBP enrichments is shown on the right.

According to previous studies, the region encompassing peaks 4 and 5 should represent ATP-dependent binding of factors associated with the U2 snRNP complex, especially the U2 snRNP splicing factor 3 (SF3a/SF3b) proteins, which were previously shown to crosslink upstream of the BP (Gozani et al., 1996). Indeed, SF3B4, SF3A3 and SF3B1 are among the factors with the strongest enrichment of crosslinking in these peaks (Figure 5). Surprisingly, we observed strong enrichment of additional RBPs in peaks 4 or 5, including SMNDC1, GPKOW, BUD13, XRN2 and EFTUD2.
Conversely, crosslinking of U2AF1 and U2AF2 factors strongly dominates peaks 6 and 7, in agreement with their expected binding patterns (Kielkopf et al., 2004; Zarnack et al., 2013). We validated that the U2AF complex is co-purified in spliceosome iCLIP with western blot, especially in the absence of ATP, which enriches for the complex E that is known to contain U2AF (Figure S2F). To examine how the position of BPs on endogenous introns determines interactions of spliceosomal RBPs, we divided 3’ss into three categories based on the distance between BP and the 3’ss: 17-23 nt, 24-39 nt and 40-65 nt. We then analysed the position of cDNA starts of the eCLIP and iCLIP data for the 13 RBPs that are most enriched in peaks 4-7, or at position B. For each category of BPs we drew two diagrams; one where crosslinking was summarized to the BP, and the other where crosslinking aligned to 3’ss (Figures S5, S6). This confirmed that crosslinking of U2AF1 and U2AF2 is restricted to the region between the BPs and 3’ss, while the RBPs binding to peaks 4 and 5 are precisely aligned to the position of the BP, regardless of the distance between BP and 3’ss. The crosslinking pattern of SF3B4 most closely reflected spliceosomal iCLIP at peaks 4 and 5 (Figure S5). SF3B4 contains two RRM domains, which might crosslink within the positions of peaks 4 and 5. Interestingly, many additional RBPs have a binding pattern similar to SF3B4, but with lesser enrichment. These again included SMNDC1, SF3B1, EFTUD2 and XRN2 (Figure S5, S6). While it is possible that these RBPs independently bind to pre-mRNA in a pattern similar to SF3B4, it is equally plausible that they are part of a SF3B4-containing U2 snRNP complex that remains stable under the conditions of eCLIP. In that case, the similar crosslinking pattern would reflect the presence of contaminating SF3B4 in the eCLIP data of these other RBPs. 
At present, it is not possible to distinguish between these two options, because the purified protein-RNA complexes need to be visualized after their separation on SDS-PAGE gel to determine whether the expected RBPs have been purified (Lee and Ule, 2018). Several other RBPs at least partly overlapped with the spliceosomal peaks that align to BPs. The crosslinking profile of SF3A3 has a peak 15 nt upstream of the BP (Figure S6). Two RBPs with very similar binding profiles are BUD13 and GPKOW; their crosslinks distribute equally between peaks 4 and 5. This suggests that these two RBPs either co-purify as a complex, or independently crosslink in a very similar way upstream of BPs. Finally, several RBPs had cDNA starts enriched at the intron start (position A) and the BP (position B): PRPF8, RBM22 and SUPV3L1 (Figure 5, S5, S6). PRPF8 and RBM22 are known to associate with the intron lariat as part of the human catalytic step I spliceosome (Rasche et al., 2012; Scheres and Nagai, 2017; Teigelkamp et al., 1995), in agreement with our hypothesis that both positions result largely from cDNAs truncating at the three-way junction of intron lariats (Figure 2C). In conclusion, the RBPs binding in the region of peaks 4 or 5 bind at a fixed distance relative to the position of the BP (Figure S5, S6), indicating that their binding is defined primarily by the assembly of the spliceosome on BPs, which acts as a molecular ruler that positions each associated RBP on pre-mRNAs at a specific distance from BPs.

### RNA structure around BPs affects the binding and function of splicing factors

To understand how the strength of spliceosomal RBP binding to pre-mRNAs is determined by the features of BPs, we next summarised the crosslinking of RBPs by normalising it relative to the complexity of the sequenced cDNA libraries, in order to study the relative intensities of their binding (Figure 6).
This shows that when BPs are located close to the exon (Figure 6A), binding of U2AF factors at the 3’ss is weaker and narrower compared to the exons with more distally located BPs (Figure 6C). This is especially clear for U2AF2 binding to peak 6, which spans the whole region between the 3’ss and the BP. This can be explained by the characteristics of polypyrimidine (polyY) tracts, which are precisely defined by the BP position, and span the region between the BP and the AG dinucleotide of the 3’ss (Figure 6). ![Figure 6.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F6.medium.gif) [Figure 6.](http://biorxiv.org/content/early/2018/06/22/353599/F6) Figure 6. BP position defines the binding patterns of splicing factors at 3’ss. (A) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP or iCLIP in the indicated cell lines. All 3’ss that contain BPs within 17..23 nt upstream of the exon are chosen, and crosslinking is plotted in the region −40..10 nt relative to 3’ss, and −40..10 nt relative to BPs. Below the plot, a weblogo shows nucleotide composition in the same region. (B) Same as (A), but for all 3’ss that contain BPs within 24..39 nt upstream of the exon. (C) Same as (A), but for all 3’ss that contain BPs within 40..65 nt upstream of the exon. In contrast to the decreased binding of U2AF2 at proximal BPs, the binding of SF3 components increases at these BPs (Figure 6A). We were not able to identify any enriched sequence motifs in the region of the peaks 4 or 5 (data not shown), and therefore the intensity of SF3 binding is unlikely to be determined by the sequence properties of the RNA. Thus, while binding of U2AF reflects the properties of polyY tracts, the intensity of SF3 binding more likely reflects the recognition of BPs by the larger spliceosomal complex. Thus, it appears that the stronger BP recognition may compensate for weaker polyY-tract recognition by the U2AF complex. 
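The grouping of 3’ss by BP distance that underlies Figure 6 can be sketched as a simple binning step. This is a minimal illustration rather than the code used for the analysis: the function name and the category labels are hypothetical, while the three distance ranges are those given in the Figure 6 legend.

```python
from typing import Optional

# Distance ranges (nt between the BP and the 3'ss) taken from the Figure 6 legend.
# The labels are hypothetical names chosen for this sketch.
BP_DISTANCE_BINS = [
    ("proximal", 17, 23),
    ("intermediate", 24, 39),
    ("distal", 40, 65),
]


def bp_distance_category(distance_nt: int) -> Optional[str]:
    """Return the category for a BP located distance_nt upstream of the 3'ss.

    Distances outside all three ranges return None, i.e. such 3'ss are not
    assigned to any of the plotted groups.
    """
    for label, low, high in BP_DISTANCE_BINS:
        if low <= distance_nt <= high:
            return label
    return None


if __name__ == "__main__":
    for d in (20, 30, 50, 10):
        print(d, bp_distance_category(d))
```

Crosslinking profiles would then be summarized separately within each category, once aligned to the BP and once aligned to the 3’ss.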
To assess how the features of BPs affect the relative binding of SF3 and U2AF complexes, we focused our further analyses on all the BPs identified by spliceosome iCLIP that are located 23-28 nt upstream of the intron-exon junction, which are the most common positions of BPs (20018 BPs, Table S6). In this way we minimized the effects of variations in BP position in order to focus on other features of BPs. Since we observed an effect of BP position on the relative binding of SF3/U2AF complexes (Figure 6), we examined whether the strength of BPs might also affect the SF3/U2AF binding ratio. For this purpose, we separated BPs into 10 equally sized quantiles based on their computationally predicted score (Pagii and Bejerano, 2018). This revealed a significant trend, with stronger binding of the U2AF complex next to low-scoring BPs, and stronger binding of the SF3 complex next to high-scoring BPs, with an over 4-fold change in the ratio between the two complexes when comparing the extreme quantiles (p<0.001, Wilcoxon Rank Sum test, Figure 7A). Since the BP scores were defined by a sequence-based model, we checked if this change in ratio could reflect a correlation between BP scores and the sequence of the polyY tract, but we found that BP scores are independent of the proportion of Cs and Us within the polyY tract (Figure S7A).

![Figure 7.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F7.medium.gif) [Figure 7.](http://biorxiv.org/content/early/2018/06/22/353599/F7)

Figure 7. RNA structure around BPs affects the binding and function of splicing factors. (A) BPs were divided into 10 quantiles based on their sequence consensus score, as determined in the past study (Pagii and Bejerano, 2018). The median score of each quantile is shown on the x-axis.
The 4410 BPs chosen for this analysis satisfied two criteria: 1) They were located 23-28 nt away from intron-exon junction (the most common locations of BPs), and 2) they contained a total of at least 30 crosslink events of SF3B4-K562-eCLIP, SF3B4-HepG2-eCLIP, SF3A3-HepG2-eCLIP in the region 35-10 nt upstream of BPs and U2AF2-HepG2-eCLIP, U2AF2-K562-eCLIP and U2AF1-K562-eCLIP in the region 5-25 nt downstream of BPs (the peak binding region of these RBPs). The y-axis shows the ratio in binding of SF3 relative to U2AF factors around the BP, which was calculated by dividing the total cDNA count of SF3B4-K562-eCLIP, SF3B4-HepG2-eCLIP, SF3A3-HepG2-eCLIP in the region 35-10 nt upstream of BPs with the total cDNA count of U2AF2-HepG2-eCLIP, U2AF2-K562-eCLIP and U2AF1-K562-eCLIP in the region 5-25 nt downstream of BPs. P-values for the indicated comparisons were calculated by the pairwise Wilcoxon Rank Sum test. (B) BPs were divided into 10 categories based on their sequence consensus score, as determined in the past study (Pagii and Bejerano, 2018). The average pairing probability of each nucleotide around BPs was then calculated for each category of BPs. The 80 nucleotide RNA region centered on the BP was used to calculate pairing probability with RNAfold program using the default parameters (Lorenz et al., 2011). P-values for the indicated comparisons were calculated by the pairwise Wilcoxon Rank Sum test. (C) Exons that become more skipped upon knockdown of the indicated RBPs were identified from data provided by the ENCODE project from RNA-seq data following CRISPR knockdown in K562 and HepG2 cells (Van Nostrand et al., 2017). The differentially spliced exons were detected with rMATS using the junction counts only, a P-value threshold of 0.05, FDR threshold of 0.1 and IncLevelDifference > 0.2 (Shen et al., 2014), and only exons with increased skipping upon knockdown were used. We used all data where at least 45 skipped exons were detected with our threshold. 
The top computationally predicted BPs were used for each set of exons, and the average pairing probability was calculated for each nucleotide around BPs for each category of BPs. The 25 nucleotide RNA region around the BP was used to calculate pairing probability with RNAfold program using the default parameters (Lorenz et al., 2011). (D) The distribution of pairing probability for each group of exons that are skipped upon knockdown of the indicated RBPs. The pairing probability was calculated as in C, and the average pairing probability at positions 12-7 nt upstream of BPs and 3-6 nt downstream of BPs was calculated.

The BP scores were determined with a deep-learning model that is based purely on RNA sequence, and therefore RNA structure was not used as one of the predictors. Nevertheless, we examined the pairing probability of the nucleotides around the BPs using RNAfold (Lorenz et al., 2011). We used only the 25 nt region around the BPs as the input for analysis, in order to evaluate the propensity of this region to form a local stem-loop. Surprisingly, the pairing probability in the nucleotides surrounding the BPs was strongly correlated with the BP scores, such that BPs with the highest scores had the highest probability to form a stem-loop, where the BP is presented within the loop (Figure S7B). As a result, the SF3/U2AF binding ratio is also correlated with the RNA structure around BPs (data not shown). By shifting the position of the 25 nt analysis window, we find that the stronger pairing propensity is partly a feature of the sequence upstream of BPs (Figure S7C-D). By analyzing the 10 quantiles of BP score, we confirmed that the propensity to form a local stem loop is significantly increased for the high-scoring BPs (p<0.001, Wilcoxon Rank Sum test, Figure 7B).
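The quantile-based comparison of SF3 and U2AF binding described for Figure 7A can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used in the study: the record fields (`score`, `sf3_count`, `u2af_count`), the function names, the per-BP calculation of the ratio and the pseudocount that avoids division by zero are all hypothetical choices made for this sketch.

```python
from statistics import median


def split_into_quantiles(records, n_groups=10):
    """Sort BP records by score and split them into n_groups near-equal groups."""
    ordered = sorted(records, key=lambda r: r["score"])
    groups = []
    for i in range(n_groups):
        start = i * len(ordered) // n_groups
        end = (i + 1) * len(ordered) // n_groups
        groups.append(ordered[start:end])
    return groups


def sf3_u2af_ratio(record, pseudocount=1.0):
    """Ratio of SF3 cDNA counts (upstream of the BP) to U2AF counts (downstream).

    The pseudocount is an assumption of this sketch, not part of the study.
    """
    return (record["sf3_count"] + pseudocount) / (record["u2af_count"] + pseudocount)


def median_ratio_per_quantile(records, n_groups=10):
    """Return (median score, median SF3/U2AF ratio) for each non-empty quantile."""
    summary = []
    for group in split_into_quantiles(records, n_groups):
        if group:
            summary.append(
                (median(r["score"] for r in group),
                 median(sf3_u2af_ratio(r) for r in group))
            )
    return summary


if __name__ == "__main__":
    # Toy data: 20 BPs in which SF3 counts rise with the BP score.
    toy = [{"score": i, "sf3_count": 10 * (i + 1), "u2af_count": 10} for i in range(20)]
    for med_score, med_ratio in median_ratio_per_quantile(toy):
        print(med_score, round(med_ratio, 2))
```

The per-quantile distributions of ratios could then be compared with a rank-based test, as was done for the extreme quantiles in Figure 7A.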
Given the link between RNA structure and RBP binding profiles, we wished to understand whether the RNA structure around BPs also affects whether the exons are sensitive to depletion of specific spliceosomal RBPs. For this purpose we used exons that become more skipped upon knockdown of well-known splicing factors, as identified by the ENCODE project from RNA-seq data following knockdown in K562 and HepG2 cells (Van Nostrand et al., 2017). We only examined RBPs that play positive roles in splicing, either by promoting exon definition during early stages of splicing (SR and U2AF proteins) or by participating in spliceosomal assembly (SF3 and PRPF8 proteins). Given that these RBPs play positive roles, we only examined the features of exons that are more skipped upon knockdown. Notably, exons dependent on SF3B components or PRPF8 tend to contain BPs that are presented within a stem loop, whereas exons dependent on factors involved in exon definition (SRSF1 and U2AF components) tend to contain BPs with lower pairing probability of their flanking nucleotides (Figure 7C). This effect is diminished if the analysis window is shifted (Figure S7E-F), indicating that the local stem loop formation is the primary factor. The groups of exons significantly differ in the pairing probability of the nucleotides flanking the BPs (p<0.01, Wilcoxon Rank Sum test, Figure 7D), but not in BP scores or Y-tract coverage (Figure S7G-H). In conclusion, we find that the strength of the RNA structure around BPs affects the relative binding of SF3 and U2AF components, and distinguishes exons regulated by RBPs involved in exon definition from those regulated by the factors involved in spliceosome assembly on BPs, such as SF3B components and PRPF8.

## Discussion

We developed the spliceosome iCLIP method by using Sm core proteins as a bait to purify endogenous snRNPs and their spliceosomal RBPs, as well as accessory splicing factors.
This enabled mapping of their binding profiles along pre-mRNAs, which defined seven primary binding peaks of spliceosomal factors around splice sites and BPs. Moreover, the presence of lariat-derived reads in spliceosome iCLIP revealed >50,000 BPs that showed good agreement with the latest computational predictions. Notably, we find that the computationally predicted BPs are primarily responsible for coordinating the assembly of U2 snRNP and the associated SF3 and U2AF complexes, even though the lariat-based methods often detect alternative BPs. By means of overlap with published iCLIP and eCLIP data, we identified 15 RBPs that overlap with the spliceosomal peaks. The binding of most of these RBPs is dependent on the position of BPs, and the corresponding spliceosomal peaks are sensitive to ATP under *in vitro* conditions, and to PRPF8 knockdown *in vivo*. Finally, we find that the strength of RNA structure around BPs affects the relative binding of SF3 and U2AF complexes, and distinguishes exons regulated by RBPs involved in exon definition (such as U2AF) from those involved in later stages of the splicing reaction, including the SF3B components and PRPF8.

We found that spliceosome iCLIP can also be used to identify BPs on a genome-wide scale since it enriches for cDNAs derived from intron lariats that truncate at BPs. This identified >50,000 BPs (64% of all expressed introns). For comparison, a recent RNA footprinting-based strategy identified BPs in 26% (1420/5533) of all yeast introns using reads spanning the BP three-way junction (Chen et al., 2018). This was improved to 86% when BP characteristics were computationally inferred from both the lariat spanning reads and all reads ending at ends of introns, although the precise number of unique BPs supported by sequence reads remains unclear.
Interestingly, BPs identified by iCLIP have a stronger consensus sequence compared to the BPs previously experimentally identified in mammalian cells through analysis of lariat reads in RNA-seq. The reason for this difference may lie in the fact that lariats formed at non-canonical BPs are less efficiently debranched, and thus they are likely to be more stable (Hartmuth and Barta, 1988). As a result, detection of non-canonical BPs may be amplified by methods that are based on analysis of lariats in RNA-seq due to the differential stability of lariats. Similarly, delayed debranching of the non-canonical BPs could delay the kinetics of release of the late spliceosomal complexes from intron lariats, thus explaining why PRPF8 eCLIP and spliceosome iCLIP aligned better to the position of the non-canonical BPs than the SF3 crosslinking peaks 4 and 5, which correspond to the pre-lariat assembly of the spliceosome on BPs. Spliceosome iCLIP compares favorably to other experimental methods, most likely because it detects lariats that are in complex with the spliceosome at the time of crosslinking, which should be less sensitive to variations in debranching kinetics and lariat stability. We suggest that a combined analysis of the lariat-derived reads in spliceosome iCLIP, the reads that detect crosslinking in peaks 4 and 5, and computational modelling of BPs (Pagii and Bejerano, 2018) could allow the most reliable identification and monitoring of the use of functionally important BPs across development, tissues and species.

Interestingly, we observe that for the great majority of introns, the same BP is identified both computationally and by iCLIP. This class of BPs tends to be surrounded by high pairing probability, indicating that they are often within a stem loop structure, with a short loop that is just long enough to present the consensus BP sequence.
In contrast, the non-overlapping BPs are more often surrounded by regions of extended low pairing probability, indicating that the use of alternative BPs may result from their increased accessibility for pairing with U2 snRNA. This would agree with the findings in yeast, where conditions that change the structure of RNA (e.g. heat shock) allow new BPs to become available to the spliceosome (Meyer et al., 2011). It will be important to test whether RNA structure contributes to mammalian BP selection, and whether the choice of alternative BPs that are identified by iCLIP or computational predictions can be regulated under conditions that change RNA structure.

We found two peaks of spliceosomal crosslinking upstream of BPs. These perfectly overlap with the crosslinking of SF3B4 and SF3B1, and partly with SF3A3, which are required for the ATP-dependent step of spliceosome assembly on the BPs (Brosi et al., 1993). This is in agreement with the ATP-dependence of peak 4 *in vitro* and its disruption by PRPF8 KD. The binding positions of SF3B4 (peak at 26 nt upstream of BPs) and SF3A3 (peak at 15 nt upstream of BPs) are consistent with the structure of the human activated spliceosome, where SF3A3 (also referred to as SF3a60) binds to pre-mRNA at a position closer to the BP compared to SF3B4 (also referred to as SF3b49) (Zhang et al., 2018). The SF3A3 peak does not overlap well with spliceosome iCLIP, indicating that it is not co-purified with U2 snRNP as efficiently as SF3B4 under the conditions of spliceosome iCLIP. Interestingly, while we observe binding peaks in the region 26-19 nt upstream of BPs in humans, the late spliceosomal components in yeast had their peak centred at ~48-49 nt upstream of BPs (Chen et al., 2018). In yeast, these contacts were predicted to correspond to Cwf11, an SF1 RNA helicase (Chen et al., 2018).
However, SF1 is expected to bind at a distance of ~33-40 nucleotides from the BP based on structural studies (De et al., 2015), and the identity of the yeast peak at 48-49 nt thus requires further experimental validation. Irrespective, the contacts located upstream of the BPs are free of a clear motif in both yeast and human. This implies that their binding position is defined by the assembly of spliceosome on BPs, rather than by their individual RNA binding specificities. In addition to the SF3 components, we found several other RBPs crosslinked at peaks 4/5 upstream of BPs. Among these are SMNDC1, GPKOW, BUD13 and EFTUD2, which have been shown to regulate spliceosomal remodeling at the ATP-dependent stages of the splicing cycle. SMNDC1, also known as SPF30, has originally been identified as a component of the spliceosome by proteomic analyses (Neubauer et al., 1998) and has similarity to the Survival Motor Neuron (SMN) protein (Talbot et al., 1998). Like SMN, SMNDC1 contains a Tudor domain that binds to symmetrically dimethylated arginine residues in the C-terminal tails of Sm proteins (Buhler et al., 1999). As part of the 17S U2 snRNP, SMNDC1 is essential for splicing and is involved in the transition of the pre-spliceosomal complex A to the mature spliceosomal complex B (Meister et al., 2001). GPKOW (Spp2 in yeast) has been identified as an interactor of the ATP-dependent RNA helicase DHX16 (Prp2 in yeast) (Hegele et al., 2012; Roy et al., 1995; Silverman et al., 2004). Spp2 and Prp2 are essential for the first splicing reaction and mediate the transition of the pre-catalytic Bact to the catalytic B* complex (Kim and Lin, 1996; Warkocki et al., 2015). Spp2/Prp2-mediated remodeling of the spliceosome leads to destabilization of SF3a and SF3b proteins, which is thought to prepare the BP for the first transesterification reaction (Lardelli et al., 2010; Warkocki et al., 2009). 
BUD13 has been originally identified as a component of the trimeric RES (pre-mRNA REtention and Splicing) complex in yeast, together with the proteins Snu17 and Pml1 (Dziembowski et al., 2004). In the absence of the RES complex, spliceosomal B complexes are prematurely disassembled by Prp2 to suggest that RES is required for an ordered remodeling of Bact into B* complexes (Bao et al., 2017). Given the extremely similar crosslinking pattern of GPKOW and BUD13 upstream of the BP, it will be interesting to explore if their functions in the transition from Bact into B* complexes are related. Finally, EFTUD2 is a component of spliceosomal C complex that interacts with PRPF8 (Achsel et al., 1998). Taken together, the fact that that crosslinking pattern of spliceosome iCLIP overlaps with these diverse RBPs indicates that it can simultaneously monitor pre-mRNA interactions of RBPs that act at different stages of the splicing cycle. Two RBPs, XRN2 and SUPV3L1, have not yet been reported to interact with spliceosome, but their crosslinking profiles indicate their likely interactions. Xrn2 is a 5′→3′ exonuclease that associates with, and co-transcriptionally degrades, aberrantly processed pre-mRNAs when processing is inhibited by Spliceostatin A (Davidson et al., 2012). The crosslinking profile of XRN2 is most similar to SF3B4 and SF3B1, and, interestingly, the SF3B complex is the target of Spliceostatin A (Kaida et al., 2007). It could thus be hypothesized that XRN2 might detect aberrant splicing progression by contacting the pre-mRNA at the same position that would normally be occupied by SF3B, thus degrading pre-mRNAs that lack appropriate SF3B assembly. In the case of SUPV3L1, its strong eCLIP signal at BPs indicates that it interacts with intron lariats. SUPV3L1 has not yet been reported to have any functions in splicing, and has been studied solely as an ATP-dependent RNA helicase within the mitochondrial exoribonuclease complex (Razew et al., 2018). 
However, the protein is also found in cell nuclei, where it appears to regulate cell cycle (Szewczyk et al., 2017), and thus a function in splicing remains plausible. We show that BP position, and the computationally predicted strength of BPs, which is highly correlated with RNA structure, affect the relative binding of U2AF and SF3 complexes upstream and downstream of BPs, respectively. Moreover, the strength of RNA structure around BPs distinguishes exons regulated by PRPF8, SF3B or U2AF components. This indicates complementary roles of these factors in spliceosome assembly to ensure high fidelity of BP and 3’ss recognition. Interestingly, U2AF1, SF3B1 and PRPF8 are all targets for mutations in myeloid neoplasia (Kurtovic-Kozaric et al., 2015; Yoshida et al., 2011). Even though these factors primarily bind at different regions of introns (U2AF1 at 3’ss, SF3B1 upstream of BPs and PRPF8 at 5’ss), our study reveals that the RNA secondary structure may contribute to the increased sensitivity of specific introns to deficient function of these factors. Looking beyond, spliceosome iCLIP could be used in the future to monitor transcriptome-wide impact of disease-causing mutations on spliceosome assembly in human cells. High-throughput analysis of RNA interactions of multi-protein complexes is crucial in order to understand the dynamic assembly of such complexes on target RNAs. To date this has been achieved for the human exon junction complex (EJC) with an RNA footprinting method termed RNA:protein immunoprecipitation in tandem (RIPiT)-seq (Singh et al., 2012), and the late-stage intron lariat spliceosome complex in yeast using triple tagging of the U2.U5.U6.NTC complex (Burke et al., 2018; Chen et al., 2018). Both approaches avoid UV-crosslinking in order to capture RBPs with poor UV crosslinkability (Singh et al., 2014). However, those sites where RNase protection is inefficient may be lost by these methods. 
Spliceosome iCLIP thus represents an alternative approach for the study of multi-protein complexes with nucleotide resolution. Its capacity to monitor concerted pre-mRNA binding of many types of spliceosomal proteins indicates that a similar approach could be readily applied to other complexes composed of multiple RBPs that bind RNA with distinct position-dependent binding patterns. ## Author contributions MB, CRS and JU conceived the project, designed the experiments and wrote the manuscript, with assistance of all co-authors. MB, CRS and ZW performed experiments, with assistance from JU, JK and CWS. NH performed most computational analyses, with assistance from CRS, TC, AC and NML. VOW, DP and ARV provided crosslinked pellets from wild-type and PRPF8-depleted Cal51 cells. ## Declaration of Interests The authors declare no competing interests. ## Supplementary legends ![Figure S1.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F8.medium.gif) [Figure S1.](http://biorxiv.org/content/early/2018/06/22/353599/F8) Figure S1. Related to Figure 1: Quality control of spliceosome iCLIP with the anti-SmB/B’ antibodies (A) Western blot analysis of total HeLa cell extract with 18F6 antibody reveals a single band of 28 kDa. (B) Analysis of HeLa cells by immunostaining with 18F6 and epifluorescence microscopy shows expected localization of SmB/B’ (a speckled nuclear pattern excluding nucleoli). (C) Autoradiograms of crosslinked RNPs after immunopurification with the anti-SmB/B’ monoclonal antibodies 18F6 (as hybridoma supernatant), 12F5 (Carissimi et al., 2006) (sc-130670, Santa Cruz Biotechnology) or Y12 (ab3138, Abcam). HeLa cell pellet was lysed in medium lysis buffer and subjected to high (++, final dilution 1:10,000) or low (+, final dilution 1:100,000) concentrations of RNase I. Lysates were split evenly between beads for immunopurification. 
RNAs of immunopurified RNP complexes were radiolabelled at the 5’ end followed by size-separation on denaturing gels and nitrocellulose transfer. The time below each panel indicates length of exposure during autoradiography. (D) UV-crosslinked HEK FLP-in cells with SmB-3XFLAG stably integrated were lysed under stringent conditions and subjected to partial RNase I digestion (Low - final dilution 1:100,000, High - dilution 1:5000). Spliceosomal RNPs were immunopurified with anti-FLAG M2 antibody, RNA was 5’ end radiolabeled, and RNPs were subjected to denaturing gel electrophoresis and nitrocellulose transfer, an autoradiogram of which is shown. The interrupted line indicates the area on the nitrocellulose membrane cut out for purification of crosslinked RNP complexes. (E) UV-crosslinked mouse postnatal day 7 brains were lysed under medium or mild stringency conditions and subjected to partial RNase I digestion (final dilution 1:100,000). Spliceosomal RNPs were immunopurified with anti-SmB/B’ 18F6 antibody, RNA was 5’ end radiolabeled, and RNPs were subjected to denaturing gel electrophoresis and nitrocellulose transfer, an autoradiogram of which is shown in the upper panel. The interrupted line indicates the area on the nitrocellulose membrane cut out for purification of crosslinked RNP complexes. For Western blotting, the remainder of the supernatant following cell lysis and centrifugation was mixed with 4× loading buffer (Invitrogen) and equal sample volumes were separated by SDS-PAGE and transferred onto nitrocellulose membrane, which was incubated with anti-a-tubulin antibody (1:4,000, clone B-5-1-2, cat. no. T5168, Sigma-Aldrich). ![Figure S2.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F9.medium.gif) [Figure S2.](http://biorxiv.org/content/early/2018/06/22/353599/F9) Figure S2. Related to Figure 2: Analysis spliceosome iCLIP from cell extracts and in vitro splicing reactions. 
(A) RNA map of summarized crosslinking for spliceosome iCLIP performed under medium or mild conditions from mouse brain is shown around the exon-intron and intron-exon junctions and the computationally top-scoring BP in each mouse intron (Corvelo et al., 2010). (B) Immunoblot analysis of Prpf8 knockdown efficiency in Cal51 cells. (C) RNAs transcribed in vitro from the C6orf10 minigene construct were incubated with HeLa nuclear extracts as part of in vitro splicing reactions in the presence or absence of ATP. Resulting splicing products and intermediates were resolved by denaturing gel electrophoresis and visualized by autoradiography. (D) In vitro splicing reactions were diluted in mild lysis buffer, subjected to low RNase I treatment (final dilution 1:200,000) and used for spliceosome iCLIP. The autoradiogram of crosslinked size-separated RNP complexes shows the radiolabelled RNA that is crosslinked to RBPs. The interrupted line indicates the area cut out from the nitrocellulose membrane for extraction of crosslinked RNAs, which were used as a template for the iCLIP cDNA libraries. (E) U2AF65 and U2AF35 are co-purified with Sm proteins in spliceosome iCLIP. HeLa nuclear extracts were incubated for 1 hour with or without ATP and subjected to RNase I treatment (+, final dilution 1:5,000) where indicated, followed by immunoprecipitation using SmB/B’ antibody-conjugated protein-G beads. Indicated proteins were detected by immunoblotting. (F) Normalised spliceosome iCLIP cDNA counts on the C6orf10 in vitro splicing substrate. Exons are marked by grey boxes. As expected, junction reads are present almost exclusively in the +ATP library.

![Figure S3.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F10.medium.gif) [Figure S3.](http://biorxiv.org/content/early/2018/06/22/353599/F10)

Figure S3. Related to Figure 3: Comparison of BPs determined by spliceosome iCLIP or RNA-seq lariat reads.
(A) The number of top BPs identified by published RNA-seq (Mercer et al., 2015) (i.e., the BP with most lariat-spanning reads in each intron) at positions relative to BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (B) The number of top BPs identified by published RNA-seq (Taggart et al., 2017) (i.e., the BP with most lariat-spanning reads in each intron) at positions relative to BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (C) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a BP assigned by lariat-spanning reads from RNA-seq. They are divided into three categories based on the distance between the iCLIP BP and the top RNA-seq BP. (D) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a BP assigned by lariat-spanning reads from RNA-seq (Taggart et al., 2017). They are divided into three categories based on the distance between the iCLIP BP and the top RNA-seq BP. (E) Weblogo of iCLIP BPs that overlap with RNA-seq BPs (Mercer et al., 2015). (F) Weblogo of iCLIP BPs that are >5 nt away from RNA-seq BP (Mercer et al., 2015). (G) Weblogo of RNA-seq BPs (Mercer et al., 2015) that are >5 nt away from iCLIP BP. (H) Weblogo of iCLIP BPs that overlap with RNA-seq BPs (Taggart et al., 2017). (I) Weblogo of iCLIP BPs that are >5 nt away from RNA-seq BP (Taggart et al., 2017). (J) Weblogo of RNA-seq BPs that are >5 nt away from iCLIP BP (Taggart et al., 2017).

![Figure S4.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F11.medium.gif) [Figure S4.](http://biorxiv.org/content/early/2018/06/22/353599/F11)

Figure S4. Related to Figure 4: Comparison of BPs determined by spliceosome iCLIP or computational lariat reads. (A) Weblogo of iCLIP BPs that are >5 nt upstream or downstream of comp BP. (B) Weblogo of comp BPs that are >5 nt upstream or downstream of iCLIP BP.
(C) RNA map of summarised crosslinking of spliceosome iCLIP from Cal51 cells around each category of BPs.

![Figure S5.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F12.medium.gif) [Figure S5.](http://biorxiv.org/content/early/2018/06/22/353599/F12)

Figure S5. Related to Figure 5: Crosslinking of many RBPs overlaps with a peak of spliceosomal crosslinking. (A) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP or iCLIP in the indicated cell lines. All 3’ss that contain BPs within 17..23 nt upstream of the exon are chosen, and crosslinking is plotted in the region −40..10 nt relative to 3’ss, and −40..10 nt relative to BPs. Crosslinking of each RBP is regionally normalized (relative to its average crosslinking over the −200..50 nt region relative to 3’ss). (B) Same as (A), but for all 3’ss that contain BPs within 24..39 nt upstream of the exon. (C) Same as (A), but for all 3’ss that contain BPs within 40..65 nt upstream of the exon.

![Figure S6.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F13.medium.gif) [Figure S6.](http://biorxiv.org/content/early/2018/06/22/353599/F13)

Figure S6. Related to Figure 5: Crosslinking of many RBPs overlaps with a peak of spliceosomal crosslinking. (A) Crosslinking patterns of selected RBPs, as defined by cDNA starts of eCLIP or iCLIP in the indicated cell lines. All 3’ss that contain BPs within 17..23 nt upstream of the exon are chosen, and crosslinking is plotted in the region −40..10 nt relative to 3’ss, and −40..10 nt relative to BPs. Crosslinking of each RBP is regionally normalized (relative to its average crosslinking over the −200..50 nt region relative to 3’ss). (B) Same as (A), but for all 3’ss that contain BPs within 24..39 nt upstream of the exon. (C) Same as (A), but for all 3’ss that contain BPs within 40..65 nt upstream of the exon.
![Figure S7.](http://biorxiv.org/https://www.biorxiv.org/content/biorxiv/early/2018/06/22/353599/F14.medium.gif) [Figure S7.](http://biorxiv.org/content/early/2018/06/22/353599/F14)

Figure S7. Related to Figure 7: RNA structure and sequence characterisation around BPs. (A) BPs were divided into 10 quantiles as described in Figure 7A, with the median score of each quantile shown on the x-axis. The % of Ys (C or T) in the region 1-21 nt downstream of BP is shown on the y-axis. (B) BPs were divided into 10 quantiles as described in Figure 7A, and the 25 nucleotide RNA region centered on the BP was used to calculate pairing probability with the RNAfold program with default parameters (Lorenz et al., 2011). The average pairing probability of each nucleotide around BPs is shown for each quantile. (C-D) Same as B, but the 25 nucleotide RNA region is shifted 10 nt upstream or downstream of BPs. (E-F) Same as Figure 7C, but the 25 nucleotide RNA region is shifted 10 nt upstream or downstream of BPs. (G) The distribution of computational BP score for each group of exons, which are skipped upon knockdown of the indicated RBPs. (H) The distribution of % of Ys (C or T) in the region 1-21 nt downstream of BP for each group of exons, which are skipped upon knockdown of the indicated RBPs.

## Methods

### Data and statistics

The spliceosome iCLIP data have been deposited on EBI ArrayExpress under the accession number E-MTAB-6950. These and published data sets referenced throughout this study are listed for convenience in Table S7, including accession details. All statistical analyses were performed in the R software environment (versions 3.1.3 and 3.3.2, [https://www.r-project.org](https://www.r-project.org)).

### Preparation of Cal51 cells for iCLIP

Cal51 breast adenocarcinoma cells were prepared as described previously (Wickramasinghe et al., 2015).
Briefly, cells were cultured in Dulbecco’s Modified Eagle Medium (DMEM, Thermo Fisher) with 10% fetal calf serum (FCS, Thermo Fisher) and 1× penicillin-streptomycin (P/S, Thermo Fisher). For siRNA-mediated depletion of PRPF8, Cal51 cells were transfected with 25 nM siRNA against human PRPF8 using DharmaFECT1 (Dharmafect). Transfected cells were harvested 54 hrs later, exposed to UV-C light and used for iCLIP as described below. For collection of samples from different stages of the cell cycle, Cal51 cells were synchronized in G1/S by standard double thymidine block. Briefly, cells were treated with 1.5 mM thymidine for 8 hrs, washed and released for 8 hrs, then treated again with thymidine for a further 8 hrs. Cells were also collected 3 hrs (S-phase) and 7 hrs (G2) after release from the thymidine block.

### *In vitro* splicing

For *in vitro* splicing reactions, we produced a C6orf10 minigene construct containing exons 8 and 9 and 150 nt of the intron around both splice sites (Figure 2B). The minigene was linearized and transcribed *in vitro* using T7 polymerase with 32P-UTP (Melton et al., 1984). The transcribed RNA was then subjected to *in vitro* splicing reactions using HeLa nuclear extract as described before (Krainer et al., 1984). Briefly, HeLa nuclear extract was depleted of endogenous ATP by pre-incubation and, for each reaction, 10 ng of RNA was incubated with 60% HeLa nuclear extract at 30°C with or without additional 0.5 mM ATP for 1 h in a 20 μl reaction. Afterwards, the reaction mixture was UV-crosslinked at 100 mJ/cm² and stored at −80°C until further use. To visualise the splicing reaction products, proteinase K was added to the reaction mixture for 30 min at 37°C. The resulting RNA was phenol-extracted, precipitated and visualised on a 5% polyacrylamide-urea gel.
To test the effect of ATP on U2AF protein association with the spliceosome, *in vitro* splicing reactions were set up without minigene substrate RNA in the presence or absence of 0.5 mM ATP. Reactions were incubated at 30°C for 1 h and then diluted to 1 ml using mild lysis buffer. As indicated, reactions were subjected to high RNase I (AM2295, 100 U/μl, Thermo Fisher) treatment (+, final dilution 1:5,000) followed by immunoprecipitation using 12F5-conjugated protein-G beads. Indicated proteins were detected by Western blotting using anti-U2AF2 (sc-48804, Santa Cruz Biotechnology), anti-U2AF35 (sc-19961, Santa Cruz Biotechnology) or Y12 antibody (ab3138, Abcam) and visualized using both horseradish peroxidase (HRP)-conjugated protein A (101123, Thermo Fisher) and protein G (101223, Thermo Fisher).

### Spliceosome iCLIP protocol

The iCLIP method was performed as previously described (König et al., 2010), with the following modifications. Crosslinked cells or tissue were dissociated in the lysis buffer according to the stringency conditions (stringent, medium, mild; Table S1) followed by sonication, limited RNase I digestion and centrifugation. For initial experiments, HEK293 FlpIn cells expressing 3×Flag-SmB were used for high stringency conditions (Huppertz et al., 2014) (Figure S1D) and mouse brain tissue was used for testing mild and medium stringency conditions (Figure S1E). Subsequently, HeLa nuclear extract was used for *in vitro* splicing reactions (Figure S2C), and Cal51 cells were used for medium and mild conditions. Identification of protein crosslink sites around splice sites, in particular peak 4, is most efficient under the mild purification condition (Figure S2A). This condition was therefore used for spliceosome iCLIP from PRPF8 knockdown and control Cal51 cells (Figure 2A), and for *in vitro* splicing in HeLa nuclear extract (Figure 2B).
For the identification of BPs, we found that the medium condition is best suited, since it increases the frequency of cDNAs truncating at peak B (Figure S2A). Therefore, spliceosome iCLIP was performed under medium purification conditions from Cal51 cells synchronized in G1, S and G2 phase, and data from these libraries were pooled with the data from untreated cells produced under mild conditions (which also contain many cDNA starts overlapping with BPs, see Figure 2A) in order to identify BPs.

For SmB protein immunopurification under stringent conditions, M2 anti-Flag antibody (Sigma) was used against the 3×Flag-SmB protein that had been stably integrated into HEK-293 FlpIn cells. Briefly, immunopurification was performed as outlined previously (Huppertz et al., 2014). Here, a 6 M urea buffer was first used to lyse cell pellets, before being diluted down 1:9 with a Tween-20-containing IP buffer to allow for immunopurification without denaturing of the M2 anti-Flag antibody. The iCLIP protocol was then followed as described previously.

For SmB/B’ immunopurification under medium and mild conditions, anti-SmB/B’ antibodies 12F5 and 18F6 were used, which are different clones from the same immunization, as described previously (Carissimi et al., 2006). These antibodies behave identically under immunopurification conditions (Figure S1C). For spliceosome iCLIP from mouse brain and *in vitro* splicing reactions, lysates were incubated with 50 μl monoclonal anti-SmB/B’ antibody 18F6 [as hybridoma supernatant, generated as described previously (Carissimi et al., 2006)], and for experiments from Cal51 cells, 12F5 anti-SmB/B’ antibody (Santa Cruz) was used. The antibody was pre-conjugated to 100 μl protein G Dynabeads (Thermo Fisher) and rotated at 4°C followed by washing.
As described previously, following immunopurification, RNA 3’ end dephosphorylation, ligation of the linker 5’-rAppAGATCGGAAGAGCGGTTCAG/ddC/-3’ to the 3’ end and 5’ end radiolabeling, protein-RNA complexes were size-separated by SDS-PAGE and transferred onto nitrocellulose membrane. The regions corresponding to 28–180 kDa were excised from the membrane in order to isolate the bound RNA by proteinase K treatment. RNAs were reverse-transcribed in all experiments using SuperScript III reverse transcriptase at U/μl (Thermo Fisher, USA) and custom indexed primers (Table S2). Resulting cDNAs were subjected to electrophoresis on a 6% TBE-urea gel (Thermo Fisher) for size selection. Purified cDNAs were circularized, linearized and amplified for high-throughput sequencing. U2AF2 and TIA1 iCLIP data used in RNA maps were produced from HeLa cells in previous studies (Wang et al., 2010; Zarnack et al., 2013).

### Mapping of Sm iCLIP reads

We used sequence information from the mm9/NCBI37 and hg19/GRCh37 genome versions and the Ensembl 75 gene annotation. Experimental and random barcode sequences of iCLIP sequenced reads were removed prior to mapping (Table S2). We mapped the cDNAs to the genome with the Bowtie 0.12.7 program using the parameters (-v 2 -m 1 -a --best --strata). The first 9 nt of each sequenced read contain the experimental barcode, which separates experimental replicates, and the UMIs, which make it possible to avoid artefacts caused by variable PCR amplification of different cDNAs (König et al., 2010). We used these UMIs to quantify the number of unique cDNAs that mapped to each position in the genome by collapsing cDNAs with the same UMI that mapped to the same starting position to a single cDNA. For analysis of crosslinking to snRNAs, we allowed sequences to map at up to 50 locations in the genome, but for all other analyses in the manuscript, we only allowed sequence mapping to a single location in the genome.
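
The UMI-based collapsing described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `collapse_by_umi` and the `(chromosome, strand, start, UMI)` read tuples are assumptions made for illustration only. Reads that share both a UMI and a mapped start position are counted as a single cDNA.

```python
from collections import defaultdict


def collapse_by_umi(reads):
    """Count unique cDNAs per mapped start position.

    `reads` is an iterable of (chrom, strand, start, umi) tuples, one per
    uniquely mapped read (a hypothetical input format). Reads sharing both
    the start position and the UMI are treated as PCR duplicates and
    counted once.
    """
    umis_at_position = defaultdict(set)
    for chrom, strand, start, umi in reads:
        # A set keeps each UMI only once per (chrom, strand, start).
        umis_at_position[(chrom, strand, start)].add(umi)
    return {pos: len(umis) for pos, umis in umis_at_position.items()}
```

Using a set per position means the result does not depend on read order, and reads on opposite strands at the same coordinate are kept apart.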
The nucleotide preceding the iCLIP cDNAs mapped by Bowtie was used to define the crosslink sites.

### Mapping of eCLIP reads

For eCLIP sequencing data for all RBPs, we used the GENCODE (GRCh38.p7) genome assembly and the STAR aligner (version 2.4.2a) with the following parameters from the ENCODE pipeline: STAR --runThreadN 8 --runMode alignReads --genomeDir GRCh38 Gencode v25 --genomeLoad LoadAndKeep --readFilesIn read1, read2, --readFilesCommand zcat --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outSAMattributes All --outSAMtype BAM Unsorted --outFilterType BySJout --outFilterScoreMin 10 --alignEndsType EndToEnd --outFileNamePrefix outfile. For the removal of PCR duplicates, we used the Python script ‘barcode_collapse_pe.py’ available on GitHub ([https://github.com/YeoLab/gscripts/releases/tag/1.0](https://github.com/YeoLab/gscripts/releases/tag/1.0)), which is part of the ENCODE eCLIP pipeline ([https://www.encodeproject.org/pipelines/ENCPL357ADL/](https://www.encodeproject.org/pipelines/ENCPL357ADL/)).

### Visualization of crosslink positions in the form of RNA maps

RNA maps were produced by summarizing the cDNA counts at significant crosslink sites at all exon/intron and intron/exon boundaries and BPs on pre-mRNAs. The definition of intronic start and end positions was based on Ensembl version 75. Only introns longer than 300 nt were used to draw RNA maps, since they enabled the best normalization of data by the intronic abundance with the following procedure:

1. For each intron, calculate the total count of cDNAs that identify crosslink sites in the deep intronic region (from 50 nt downstream of the exon-intron junction to 100 nt upstream of the intron-exon junction). If the count is >10, then proceed with the analysis of this intron. The average cDNA count/nt in the deep intronic region is used as a normalisation constant.
2. Divide the counts at each position in the intron and flanking exons by the normalisation constant of this intron.
3. Sum up the normalised values from step 2 in the examined introns, and divide this value by the number of examined introns.

For all analyses in Cal51 cells, we only assessed protein-coding genes with FPKM>10 in RNA-seq data. For spliceosome iCLIP with the C6orf10 *in vitro* splicing substrate, sequence reads were first mapped to the unspliced substrates and the remaining reads were mapped to the spliced substrate allowing no mismatches.

### Identification of branchpoints (BPs)

To identify BPs, we pooled spliceosome iCLIP data produced under mild and medium purification conditions from Cal51 cells. The first step to identify BPs used the spliceosome iCLIP reads that ended precisely at the ends of introns (we considered only introns that end in an AG dinucleotide) after removal of the 3’ adapter. We noticed that these reads had a 3.5× increased frequency of mismatches on the A as the first nucleotide compared to the remaining iCLIP reads (Figure 3B), indicating that these mismatches may have resulted from truncation at the three-way junction formed at the BP (Figure 2C). We therefore trimmed the first nucleotide from the read if it contained a mismatch at the first position that corresponded to a genomic adenosine. We then used spliceosome iCLIP from synchronized and unsynchronized control Cal51 cells to identify all reads that ended precisely at the ends of introns and defined the position where these reads started. We used the random nucleotides present at the beginning of each iCLIP read to count the number of unique cDNAs at each position. The nucleotide preceding the read start corresponds to the position where cDNAs truncated during reverse transcription, and we selected the genomic A that had the highest number of truncated cDNAs as the candidate BP. If two positions with an equal number of cDNAs were found, we selected the one closer to the 3’ss. This identified 35,056 positions in genes with FPKM>10.
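
The selection rule used in this first step can be sketched in Python as follows. This is a simplified illustration rather than the code used for the analysis: the function name `select_candidate_bp`, the 0-based intron coordinates (running 5’ to 3’) and the input dictionary of unique cDNA counts per truncation position are all assumptions.

```python
def select_candidate_bp(intron_seq, truncation_counts):
    """Pick the candidate BP within one intron.

    intron_seq: intron sequence, 5' to 3' (hypothetical input).
    truncation_counts: dict mapping a 0-based position in intron_seq to the
        number of unique cDNAs truncating at that position.

    Returns the position of the genomic A with the most truncated cDNAs.
    Ties are resolved in favour of the position closer to the 3'ss, i.e.
    the larger index. Returns None if no A carries any truncation.
    """
    best = None
    for pos, count in truncation_counts.items():
        if count <= 0 or intron_seq[pos].upper() != "A":
            continue  # only adenosines with at least one cDNA qualify
        key = (count, pos)  # compare by count first, then by position
        if best is None or key > best:
            best = key
    return None if best is None else best[1]
```

Comparing `(count, position)` tuples keeps the tie-break explicit: with equal counts, the larger index wins, which in these coordinates is the adenosine nearer the 3’ss.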
The sequence composition around these positions corresponded to the previously reported sequence around BPs (Figure 3A).

We then proceeded to the second step of the analysis, where we considered all cDNAs (regardless of where they ended), again trimming the first nucleotide if there was a mismatch with the genomic A. We then overlapped cDNA truncation sites with computationally predicted BPs in the last 100 nt of each intron (Corvelo et al., 2010). If this analysis identified a position with a higher cDNA count than the initial analysis (or if the initial analysis did not identify any BP in the same intron), then the newly identified position was assigned as the BP. For introns where no BP was identified by either the first or the second step of the analysis, we assessed computationally predicted BPs located further than 100 nt from the 3’ss, and if any of these overlapped with a truncating cDNA, we assigned the position closest to the 3’ss as the BP. Together, this identified 50,812 BPs in genes with FPKM>10. These BPs were used for all analyses in the manuscript, and their coordinates were used for BP positioning in RNA maps. We additionally identified 13,496 BPs in introns of lowly expressed genes, but these were not used for any further analyses.

On RNA maps, the position of BPs is marked as peak B. This peak was more efficiently identified with spliceosome iCLIP on the endogenous pre-mRNAs when medium conditions were used, but was greatly diminished when mild conditions were used (Figure S2A). We hypothesize that the Dbr1 enzyme can debranch the lariats when the mild lysis conditions are used, and therefore cDNAs do not truncate at BPs under these conditions. In contrast, position B was detected on the exogenous substrate in spite of using mild conditions for spliceosome iCLIP (Figure 2B). We hypothesize that this was because the *in vitro* splicing extract lacks the Dbr1 enzyme, and therefore the lariats are not debranched during the iCLIP assay.
The bedtools intersect command with the -u option was used to compare BP coordinates from spliceosome iCLIP with the BPs identified in previous studies. We restricted this comparison to introns where BPs were detected by all three datasets.

### Comparison with the computationally predicted branch points

The top-scoring BP positions predicted for each intron in hg19 were obtained for 3’ss (recommended, 8/25/2017) from (Paggi and Bejerano, 2018): [http://bejerano.stanford.edu/labranchor/](http://bejerano.stanford.edu/labranchor/). The iCLIP BP chosen for the comparison in each intron was the one containing the highest number of iCLIP reads truncating at the BP.

### Analysis of pairing probability

Computational predictions of the secondary structure were performed with the RNAfold function from the Vienna Package ([https://www.tbi.univie.ac.at/RNA/](https://www.tbi.univie.ac.at/RNA/)) with default parameters (Lorenz et al., 2011). RNAfold reports its results in a dot-bracket format, where brackets represent the double-stranded regions of the RNA and dots represent unpaired nucleotides. We measured the density of pairing probability by summing the paired positions into a single vector.

### Identification of RBPs overlapping with spliceosomal peaks

For RBP enrichment in Figure 5, we used the eCLIP data from the ENCODE consortium (Van Nostrand et al., 2017), together with available iCLIP experiments from our lab (Attig et al., 2018), to see if any of the proteins are enriched in the region of spliceosomal peaks. In total this included 140 eCLIP samples of 70 RBPs in the HepG2 cell line, 178 eCLIP samples of 89 RBPs in the K562 cell line, and iCLIP samples of 18 RBPs from different cell lines (Table S5).
Next, we intersected cDNA starts from each sample with the −200 to +50 nt region relative to the relevant splice site and used this as a control for each of the following peaks: Peak 4 (−29..−23 nt relative to BP), Peak 5 (−21..−17 nt relative to BP), Peak B (−1..1 nt relative to BP), Peak A (−1..1 nt relative to 5’ss), Peak 6 (−11..−10 nt) and Peak 7 (−3..−2 nt). The positions of these peaks were determined based on crosslink enrichments in spliceosome iCLIP.

## Acknowledgements

We thank Livio Pellizzoni for the 18F6 monoclonal antibody, Miriam Llorian for help with the *in vitro* splicing reactions, Kathi Zarnack and Gregor Rot for help with the data analyses, and Lisa Strittmatter, Ina Huppertz, Kiyoshi Nagai, Yoichiro Sugimoto, Jan Attig and Martina Hallegger for helpful discussions and comments on the manuscript. This work was supported primarily by the European Research Council (206726-CLIP and 617837-Translate) and the Slovenian Research Agency (P2-0209, Z7-3665, J7-5460). CRS is supported by an Edmond and Lily Safra fellowship. NML is a Winton Group Leader in recognition of the Winton Charitable Foundation’s support towards the establishment of the Francis Crick Institute. NML is additionally funded by a Wellcome Trust Joint Investigator Award (103760/Z/14/Z), the MRC eMedLab Medical Bioinformatics Infrastructure Award (MR/L016311/1) and core funding from the Okinawa Institute of Science & Technology Graduate University. DP and VOW were supported by Medical Research Council programme grants MC_UU_12022/1 and MC_UU_12022/8 to ARV. The Francis Crick Institute receives its core funding from Cancer Research UK (FC001002), the UK Medical Research Council (FC001002), and the Wellcome Trust (FC001002).

* Received June 22, 2018.
* Revision received June 22, 2018.
* Accepted June 22, 2018.
* © 2018, Posted by Cold Spring Harbor Laboratory This pre-print is available under a Creative Commons License (Attribution-NoDerivs 4.0 International), CC BY-ND 4.0, as described at [http://creativecommons.org/licenses/by-nd/4.0/](http://creativecommons.org/licenses/by-nd/4.0/) ## References 1. Achsel, T., Ahrens, K., Brahms, H., Teigelkamp, S., and Luhrmann, R. (1998). The human U5-220kD protein (hPrp8) forms a stable RNA-free complex with several U5-specific proteins, including an RNA unwindase, a homologue of ribosomal elongation factor EF-2, and a novel WD-40 protein. Mol Cell Biol 18, 6756–6766. [Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoibWNiIjtzOjU6InJlc2lkIjtzOjEwOiIxOC8xMS82NzU2IjtzOjQ6ImF0b20iO3M6Mzc6Ii9iaW9yeGl2L2Vhcmx5LzIwMTgvMDYvMjIvMzUzNTk5LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 2. Attig, J., Agostini, F., Gooding, C., Chakrabarti, A., Singh, A., Haberman, N., Zagalak, J., Emmett, W., Smith, C.W., Luscombe, N.M. et al. (2018). Heteromeric RNP assembly at LINEs controls lineage-specific RNA processing. Cell in press. 3. Bao, P., Will, C.L., Urlaub, H., Boon, K.L., and Luhrmann, R. (2017). The RES complex is required for efficient transformation of the precatalytic B spliceosome into an activated B(act) complex. Genes Dev 31, 2416–2429. [Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZ2VuZXNkZXYiO3M6NToicmVzaWQiO3M6MTM6IjMxLzIzLTI0LzI0MTYiO3M6NDoiYXRvbSI7czozNzoiL2Jpb3J4aXYvZWFybHkvMjAxOC8wNi8yMi8zNTM1OTkuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 4. Bennett, M., Michaud, S., Kingston, J., and Reed, R. (1992). Protein components specifically associated with prespliceosome and spliceosome complexes. Genes Dev 6, 1986–2000. 
[Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6ODoiZ2VuZXNkZXYiO3M6NToicmVzaWQiO3M6OToiNi8xMC8xOTg2IjtzOjQ6ImF0b20iO3M6Mzc6Ii9iaW9yeGl2L2Vhcmx5LzIwMTgvMDYvMjIvMzUzNTk5LmF0b20iO31zOjg6ImZyYWdtZW50IjtzOjA6IiI7fQ==) 5. Brosi, R., Hauri, H.P., and Kramer, A. (1993). Separation of splicing factor SF3 into two components and purification of SF3a activity. J Biol Chem 268, 17640–17646. [Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamJjIjtzOjU6InJlc2lkIjtzOjEyOiIyNjgvMjMvMTc2NDAiO3M6NDoiYXRvbSI7czozNzoiL2Jpb3J4aXYvZWFybHkvMjAxOC8wNi8yMi8zNTM1OTkuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 6. Buhler, D., Raker, V., Luhrmann, R., and Fischer, U. (1999). Essential role for the tudor domain of SMN in spliceosomal U snRNP assembly: implications for spinal muscular atrophy. Hum Mol Genet 8, 2351–2357. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1093/hmg/8.13.2351&link_type=DOI) [PubMed](http://biorxiv.org/lookup/external-ref?access_num=10556282&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2018%2F06%2F22%2F353599.atom) [Web of Science](http://biorxiv.org/lookup/external-ref?access_num=000084059700002&link_type=ISI) 7. Burke, J.E., Longhurst, A.D., Merkurjev, D., Sales-Lee, J., Rao, B., Moresco, J.J., Yates, J.R., 3rd., Li, J.J., and Madhani, H.D. (2018). Spliceosome Profiling Visualizes Operations of a Dynamic RNP at Nucleotide Resolution. Cell 173, 1014–1030 e1017. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2018.03.020&link_type=DOI) 8. Carissimi, C., Saieva, L., Gabanella, F., and Pellizzoni, L. (2006). Gemin8 is required for the architecture and function of the survival motor neuron complex. J Biol Chem 281, 37009–37016. 
[Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6MzoiamJjIjtzOjU6InJlc2lkIjtzOjEyOiIyODEvNDgvMzcwMDkiO3M6NDoiYXRvbSI7czozNzoiL2Jpb3J4aXYvZWFybHkvMjAxOC8wNi8yMi8zNTM1OTkuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9) 9. Chen, W., Moore, J., Ozadam, H., Shulha, H.P., Rhind, N., Weng, Z., and Moore, M.J. (2018). Transcriptome-wide Interrogation of the Functional Intronome by Spliceosome Profiling. Cell 173, 1031–1044 e1013. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1016/j.cell.2018.03.062&link_type=DOI) 10. Churchman, L.S., and Weissman, J.S. (2011). Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature 469, 368–373. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1038/nature09652&link_type=DOI) [PubMed](http://biorxiv.org/lookup/external-ref?access_num=21248844&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2018%2F06%2F22%2F353599.atom) [Web of Science](http://biorxiv.org/lookup/external-ref?access_num=000286385600045&link_type=ISI) 11. Corvelo, A., Hallegger, M., Smith, C.W., and Eyras, E. (2010). Genome-wide association between branch point properties and alternative splicing. PLoS computational biology 6, e1001016. 12. Davidson, L., Kerr, A., and West, S. (2012). Co-transcriptional degradation of aberrant pre-mRNA by Xrn2. Embo J 31, 2566–2578. [Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiZW1ib2pubCI7czo1OiJyZXNpZCI7czoxMDoiMzEvMTEvMjU2NiI7czo0OiJhdG9tIjtzOjM3OiIvYmlvcnhpdi9lYXJseS8yMDE4LzA2LzIyLzM1MzU5OS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 13. De, I., Bessonov, S., Hofele, R., dos Santos, K., Will, C.L., Urlaub, H., Luhrmann, R., and Pena, V. (2015). 
The RNA helicase Aquarius exhibits structural adaptations mediating its recruitment to spliceosomes. Nat Struct Mol Biol 22, 138–144. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1038/nsmb.2951&link_type=DOI) [PubMed](http://biorxiv.org/lookup/external-ref?access_num=25599396&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2018%2F06%2F22%2F353599.atom) 14. Dziembowski, A., Ventura, A.P., Rutz, B., Caspary, F., Faux, C., Halgand, F., Laprevote, O., and Seraphin, B. (2004). Proteomic analysis identifies a new complex required for nuclear pre-mRNA retention and splicing. Embo J 23, 4847–4856. [Abstract/FREE Full Text](http://biorxiv.org/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiQUJTVCI7czoxMToiam91cm5hbENvZGUiO3M6NzoiZW1ib2pubCI7czo1OiJyZXNpZCI7czoxMDoiMjMvMjQvNDg0NyI7czo0OiJhdG9tIjtzOjM3OiIvYmlvcnhpdi9lYXJseS8yMDE4LzA2LzIyLzM1MzU5OS5hdG9tIjt9czo4OiJmcmFnbWVudCI7czowOiIiO30=) 15. Forch, P., Puig, O., Kedersha, N., Martinez, C., Granneman, S., Seraphin, B., Anderson, P., and Valcarcel, J. (2000). The apoptosis-promoting factor TIA-1 is a regulator of alternative pre-mRNA splicing. Mol Cell 6, 1089–1098. [CrossRef](http://biorxiv.org/lookup/external-ref?access_num=10.1016/S1097-2765(00)00107-6&link_type=DOI) [PubMed](http://biorxiv.org/lookup/external-ref?access_num=11106748&link_type=MED&atom=%2Fbiorxiv%2Fearly%2F2018%2F06%2F22%2F353599.atom) [Web of Science](http://biorxiv.org/lookup/external-ref?access_num=000165472100011&link_type=ISI) 16. Gozani, O., Feld, R., and Reed, R. (1996). Evidence that sequence-independent binding of highly conserved U2 snRNP proteins upstream of the branch site is required for assembly of spliceosomal complex A. Genes Dev 10, 233–243. 
17. Haberman, N., Huppertz, I., Attig, J., Konig, J., Wang, Z., Hauer, C., Hentze, M.W., Kulozik, A.E., Le Hir, H., Curk, T., et al. (2017). Insights into the design and interpretation of iCLIP experiments. Genome Biol 18, 7.
18. Hartmuth, K., and Barta, A. (1988). Unusual branch point selection in processing of human growth hormone pre-mRNA. Mol Cell Biol 8, 2011–2020.
19. Hegele, A., Kamburov, A., Grossmann, A., Sourlis, C., Wowro, S., Weimann, M., Will, C.L., Pena, V., Luhrmann, R., and Stelzl, U. (2012). Dynamic protein-protein interaction wiring of the human spliceosome. Mol Cell 45, 567–580.
20. Huppertz, I., Attig, J., D’Ambrogio, A., Easton, L.E., Sibley, C.R., Sugimoto, Y., Tajnik, M., Konig, J., and Ule, J. (2014). iCLIP: protein-RNA interactions at nucleotide resolution. Methods 65, 274–287.
21. Ingolia, N.T., Ghaemmaghami, S., Newman, J.R., and Weissman, J.S. (2009). Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223.
22. Jurica, M.S., Licklider, L.J., Gygi, S.R., Grigorieff, N., and Moore, M.J. (2002). Purification and characterization of native spliceosomes suitable for three-dimensional structural analysis. RNA 8, 426–439.
23. Kaida, D., Motoyoshi, H., Tashiro, E., Nojima, T., Hagiwara, M., Ishigami, K., Watanabe, H., Kitahara, T., Yoshida, T., Nakajima, H., et al. (2007). Spliceostatin A targets SF3b and inhibits both splicing and nuclear retention of pre-mRNA. Nature chemical biology 3, 576–583.
24. Kambach, C., Walke, S., Young, R., Avis, J.M., de la Fortelle, E., Raker, V.A., Luhrmann, R., Li, J., and Nagai, K. (1999). Crystal structures of two Sm protein complexes and their implications for the assembly of the spliceosomal snRNPs. Cell 96, 375–387.
25. Kielkopf, C.L., Lucke, S., and Green, M.R. (2004). U2AF homology motifs: protein recognition in the RRM world. Genes Dev 18, 1513–1526.
26. Kim, S.H., and Lin, R.J. (1996). Spliceosome activation by PRP2 ATPase prior to the first transesterification reaction of pre-mRNA splicing. Mol Cell Biol 16, 6810–6819.
27. Kol, G., Lev-Maor, G., and Ast, G. (2005). Human-mouse comparative analysis reveals that branch-site plasticity contributes to splicing regulation. Hum Mol Genet 14, 1559–1568.
28. König, J., Zarnack, K., Rot, G., Curk, T., Kayikci, M., Zupan, B., Turner, D.J., Luscombe, N.M., and Ule, J. (2010). iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol 17, 909–915.
29. Krainer, A.R., Maniatis, T., Ruskin, B., and Green, M.R. (1984). Normal and mutant human beta-globin pre-mRNAs are faithfully and efficiently spliced in vitro. Cell 36, 993–1005.
30. Kurtovic-Kozaric, A., Przychodzen, B., Singh, J., Konarska, M.M., Clemente, M.J., Otrock, Z.K., Nakashima, M., Hsi, E.D., Yoshida, K., Shiraishi, Y., et al. (2015). PRPF8 defects cause missplicing in myeloid malignancies. Leukemia 29, 126–136.
31. Lardelli, R.M., Thompson, J.X., Yates, J.R., 3rd, and Stevens, S.W. (2010). Release of SF3 from the intron branchpoint activates the first step of pre-mRNA splicing. RNA 16, 516–528.
32. Lee, F.C.Y., and Ule, J. (2018). Advances in CLIP Technologies for Studies of Protein-RNA Interactions. Mol Cell 69, 354–369.
33. Lorenz, R., Bernhart, S.H., Honer Zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P.F., and Hofacker, I.L. (2011). ViennaRNA Package 2.0. Algorithms for molecular biology: AMB 6, 26.
34. Meister, G., Hannus, S., Plottner, O., Baars, T., Hartmann, E., Fakan, S., Laggerbauer, B., and Fischer, U. (2001). SMNrp is an essential pre-mRNA splicing factor required for the formation of the mature spliceosome. EMBO J 20, 2304–2314.
35. Melton, D.A., Krieg, P.A., Rebagliati, M.R., Maniatis, T., Zinn, K., and Green, M.R. (1984). Efficient in vitro synthesis of biologically active RNA and RNA hybridization probes from plasmids containing a bacteriophage SP6 promoter. Nucleic Acids Res 12, 7035–7056.
36. Mercer, T.R., Clark, M.B., Andersen, S.B., Brunck, M.E., Haerty, W., Crawford, J., Taft, R.J., Nielsen, L.K., Dinger, M.E., and Mattick, J.S. (2015). Genome-wide discovery of human splicing branchpoints. Genome Res 25, 290–303.
37. Meyer, M., Plass, M., Perez-Valle, J., Eyras, E., and Vilardell, J. (2011). Deciphering 3’ss selection in the yeast genome reveals an RNA thermosensor that mediates alternative splicing. Mol Cell 43, 1033–1039.
38. Neubauer, G., King, A., Rappsilber, J., Calvio, C., Watson, M., Ajuh, P., Sleeman, J., Lamond, A., and Mann, M. (1998). Mass spectrometry and EST-database searching allows characterization of the multi-protein spliceosome complex. Nat Genet 20, 46–50.
39. Paggi, J.M., and Bejerano, G. (2018). A sequence-based, deep learning model accurately predicts RNA splicing branchpoints. bioRxiv.
40. Pineda, J.M.B., and Bradley, R.K. (2018). Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev 32, 577–591.
41. Rasche, N., Dybkov, O., Schmitzova, J., Akyildiz, B., Fabrizio, P., and Luhrmann, R. (2012). Cwc2 and its human homologue RBM22 promote an active conformation of the spliceosome catalytic centre. EMBO J 31, 1591–1604.
42. Razew, M., Warkocki, Z., Taube, M., Kolondra, A., Czarnocki-Cieciura, M., Nowak, E., Labedzka-Dmoch, K., Kawinska, A., Piatkowski, J., Golik, P., et al. (2018). Structural analysis of mtEXO mitochondrial RNA degradosome reveals tight coupling of nuclease and helicase components. Nature communications 9, 97.
43. Roy, J., Kim, K., Maddock, J.R., Anthony, J.G., and Woolford, J.L., Jr. (1995). The final stages of spliceosome maturation require Spp2p that can interact with the DEAH box protein Prp2p and promote step 1 of splicing. RNA 1, 375–390.
44. Scheres, S.H., and Nagai, K. (2017). CryoEM structures of spliceosomal complexes reveal the molecular mechanism of pre-mRNA splicing. Curr Opin Struct Biol 46, 130–139.
45. Sharp, P.A., and Burge, C.B. (1997). Classification of introns: U2-type or U12-type. Cell 91, 875–879.
46. Shen, S., Park, J.W., Lu, Z.X., Lin, L., Henry, M.D., Wu, Y.N., Zhou, Q., and Xing, Y. (2014). rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data. Proc Natl Acad Sci U S A 111, E5593–5601.
47. Silverman, E.J., Maeda, A., Wei, J., Smith, P., Beggs, J.D., and Lin, R.J. (2004). Interaction between a G-patch protein and a spliceosomal DEXD/H-box ATPase that is critical for splicing. Mol Cell Biol 24, 10101–10110.
48. Singh, G., Kucukural, A., Cenik, C., Leszyk, J.D., Shaffer, S.A., Weng, Z., and Moore, M.J. (2012). The cellular EJC interactome reveals higher-order mRNP structure and an EJC-SR protein nexus. Cell 151, 750–764.
49. Singh, G., Ricci, E.P., and Moore, M.J. (2014). RIPiT-Seq: a high-throughput approach for footprinting RNA:protein complexes. Methods 65, 320–332.
50. Sugimoto, Y., Konig, J., Hussain, S., Zupan, B., Curk, T., Frye, M., and Ule, J. (2012). Analysis of CLIP and iCLIP methods for nucleotide-resolution studies of protein-RNA interactions. Genome Biol 13, R67.
51. Szewczyk, M., Fedoryszak-Kuska, N., Tkaczuk, K., Dobrucki, J., Waligorska, A., and Stepien, P.P. (2017). Human SUV3 helicase regulates growth rate of the HeLa cells and can localize in the nucleoli. Acta biochimica Polonica 64, 177–181.
52. Taggart, A.J., Lin, C.L., Shrestha, B., Heintzelman, C., Kim, S., and Fairbrother, W.G. (2017). Large-scale analysis of branchpoint usage across species and cell lines. Genome Res 27, 639–649.
53. Talbot, K., Miguel-Aliaga, I., Mohaghegh, P., Ponting, C.P., and Davies, K.E. (1998). Characterization of a gene encoding survival motor neuron (SMN)-related protein, a constituent of the spliceosome complex. Hum Mol Genet 7, 2149–2156.
54. Teigelkamp, S., Newman, A.J., and Beggs, J.D. (1995). Extensive interactions of PRP8 protein with the 5’ and 3’ splice sites during splicing suggest a role in stabilization of exon alignment by U5 snRNA. EMBO J 14, 2602–2612.
55. Ule, J., Stefani, G., Mele, A., Ruggiu, M., Wang, X., Taneri, B., Gaasterland, T., Blencowe, B.J., and Darnell, R.B. (2006). An RNA map predicting Nova-dependent splicing regulation. Nature 444, 580–586.
56. Van Nostrand, E.L., Freese, P., Pratt, G.A., Wang, X., Wei, X., Blue, S.M., Dominguez, D., Cody, N.A.L., Olson, S., Sundararaman, B., et al. (2017). A Large-Scale Binding and Functional Map of Human RNA Binding Proteins. bioRxiv.
57. Vogel, J., Hess, W.R., and Borner, T. (1997). Precise branch point mapping and quantification of splicing intermediates. Nucleic Acids Res 25, 2030–2031.
58. Wahl, M.C., Will, C.L., and Lührmann, R. (2009). The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701–718.
59. Wang, Z., Kayikci, M., Briese, M., Zarnack, K., Luscombe, N.M., Rot, G., Zupan, B., Curk, T., and Ule, J. (2010). iCLIP predicts the dual splicing effects of TIA-RNA interactions. PLoS Biol 8, e1000530.
60. Warkocki, Z., Odenwalder, P., Schmitzova, J., Platzmann, F., Stark, H., Urlaub, H., Ficner, R., Fabrizio, P., and Luhrmann, R. (2009). Reconstitution of both steps of Saccharomyces cerevisiae splicing with purified spliceosomal components. Nat Struct Mol Biol 16, 1237–1243.
61. Warkocki, Z., Schneider, C., Mozaffari-Jovin, S., Schmitzova, J., Hobartner, C., Fabrizio, P., and Luhrmann, R. (2015). The G-patch protein Spp2 couples the spliceosome-stimulated ATPase activity of the DEAH-box protein Prp2 to catalytic activation of the spliceosome. Genes Dev 29, 94–107.
62. Wickramasinghe, V.O., Gonzalez-Porta, M., Perera, D., Bartolozzi, A.R., Sibley, C.R., Hallegger, M., Ule, J., Marioni, J.C., and Venkitaraman, A.R. (2015). Regulation of constitutive and alternative mRNA splicing across the human transcriptome by PRPF8 is determined by 5’ splice site strength. Genome Biol 16, 201.
63. Witten, J.T., and Ule, J. (2011). Understanding splicing regulation through RNA splicing maps. Trends Genet.
64. Yoshida, K., Sanada, M., Shiraishi, Y., Nowak, D., Nagata, Y., Yamamoto, R., Sato, Y., Sato-Otsubo, A., Kon, A., Nagasaki, M., et al. (2011). Frequent pathway mutations of splicing machinery in myelodysplasia. Nature 478, 64–69.
65. Zarnack, K., Konig, J., Tajnik, M., Martincorena, I., Eustermann, S., Stevant, I., Reyes, A., Anders, S., Luscombe, N.M., and Ule, J. (2013). Direct Competition between hnRNP C and U2AF65 Protects the Transcriptome from the Exonization of Alu Elements. Cell 152, 453–466.
66. Zhang, X., Yan, C., Zhan, X., Li, L., Lei, J., and Shi, Y. (2018). Structure of the human activated spliceosome in three conformational states. Cell research 28, 307–322.