A rapid, sensitive, scalable method for Precision Run-On sequencing (PRO-seq)

Tracking active transcription with nuclear run-on (NRO) assays has been instrumental in uncovering mechanisms of gene regulation. The coupling of NROs with high-throughput sequencing has facilitated the discovery of previously unannotated or undetectable RNA classes genome-wide. Precision run-on sequencing (PRO-seq) is a run-on variant that maps polymerase active sites with nucleotide or near-nucleotide resolution. One main drawback to this and many other nascent RNA detection methods is the somewhat intimidating multi-day workflow associated with creating libraries suitable for high-throughput sequencing. Here, we present an improved PRO-seq protocol in which many of the enzymatic steps are carried out while the biotinylated NRO RNA remains bound to streptavidin-coated magnetic beads. These adaptations reduce time, sample loss, and RNA degradation, and we demonstrate that the resulting libraries are of the same quality as libraries generated using the original published protocol. The assay is also more sensitive, which permits reproducible, high-quality libraries from 10^4-10^5 cells instead of 10^6-10^7. Altogether, the improved protocol is more tractable and allows for nascent RNA profiling from small samples, such as rare samples or FACS-sorted cell populations.



Next-generation sequencing technologies are now routinely used to measure gene expression levels in a highly sensitive and comprehensive fashion. RNA-seq facilitates simultaneous detection, identification, and annotation of many classes of cellular RNAs. Traditionally, however, this technology primarily measures steady-state RNA levels, which are a consequence of equilibrium between RNA transcription, processing, and degradation. As a result, many unstable RNAs, particularly eRNAs and some lncRNAs, are not easily detected with these approaches. Alternatively, ChIP-seq allows for identification and quantification of RNA polymerase II (Pol II)-associated DNA. This produces a genome-wide map of both transcriptionally active and inactive Pol II without strand specificity; thus, the direction and transcriptional status of the polymerase are ambiguous. Furthermore, the relatively high background in ChIP-seq as compared to RNA-based methods obscures comprehensive transcript and regulatory element detection. To address these shortcomings, various methods of measuring transcription by genome-wide profiling of nascent RNA have now been developed, including GRO-seq (Core et al., 2008), PRO-seq (Kwak et al., 2013), NET-seq (Churchman and Weissman, 2011), and TT-seq (Schwalb et al., 2016). Characteristics of these assays are reviewed in Wissink et al. (2019).

GRO-seq and PRO-seq are modern, genome-wide improvements of the nuclear run-on assay, which has been in use for approximately 60 years (Weiss and Gladstone, 1959).
Over the years, run-on assays have been instrumental in the study or discovery of various forms of gene regulation, including steady-state transcription levels and mRNA turnover (Derman et al., 1981; Powell et al., 1984), promoter-proximal pausing (Gariglio et al., 1981; Rougvie and Lis, 1988), transcription rates (Bentley and Groudine, 1986; Hirayoshi and Lis, 1999; O'Brien and Lis, 1993), and 3'-end processing and termination (Birse et al., 1997). The nuclear run-on reaction works by adding labelled nucleotides to polymerases that are halted in the act of transcription yet still transcriptionally competent. The polymerases incorporate the exogenous NTPs, and the labelled nascent transcripts can then be detected and quantified. In 2004, the assay was adapted for macro-array by spotting probes of whole yeast genes onto nylon membranes (García-Martínez et al., 2004). In 2008, the nuclear run-on assay was expanded to cover the entire genome in global run-on and sequencing (GRO-seq) (Core et al., 2008). GRO-seq [...]

[...] consuming, technically challenging, and requires significant amounts of starting material (0.5-2 x 10^7 cells). This is largely due to multiple streptavidin bead binding and subsequent elution steps (Fig. 1A), which require technical finesse with phenol:chloroform extraction and ethanol precipitation of nucleic acids and present repeated opportunity for loss of material. This has limited the adoption of PRO-seq by inexperienced laboratories and impeded studies of transcription in experimental systems that utilize rare or precious biological samples.

Figure 2. [...] adapter in hydrolyzed total RNA vs. purified nascent RNA. 32P-labelled nascent RNA was ligated for 1 h using the ligation conditions in this protocol or using the standard PRO-seq ligation conditions in [...]. (B) Efficiency of ligation to synthetic biotinylated RNA in 15% PEG8000. (C) Stringency of biotin-RNA affinity purification with MyOne C1 Streptavidin beads. Excess 32P-labelled non-biotinylated RNA was mixed with biotinylated RNA, and CPM of each fraction was assessed using liquid scintillation counting. (D) PRO-seq and qPRO-seq show similar levels of exonic reads relative to intronic reads genome-wide.

To address these shortcomings, we optimized the PRO-seq protocol to simplify library preparation and facilitate use of scarce input material in an improved protocol termed qPRO-seq (quick Precision Run-On and sequencing; Fig. 1). The original protocol requires 4-5 working days to complete and includes three bead-binding steps and five phenol:chloroform extractions and ethanol precipitations. By performing 3' adapter ligation to hydrolyzed total RNA instead of affinity-purified nascent RNA, we eliminated one bead-binding step. We have validated that this ligation reaction is as efficient as the standard PRO-seq ligation to purified nascent RNA (Fig. 2A). Downstream enzymatic reactions are then performed while nascent RNA is affixed to streptavidin beads (Fig. 2B), which eliminates another bead-binding step. On-bead reactions are performed in 2X volume to aid in handling, and simple bead washing steps replace numerous phenol:chloroform extractions and ethanol precipitations. The resulting single bead-binding protocol is much faster and easier than the original PRO-seq protocol (Fig. 1). It is feasible to start from permeabilized cells and end with adapter-ligated cDNA in a single day (~10 h; Fig. 1B). Furthermore, an option for column-based purification of RNA after the run-on reaction can eliminate another organic extraction step. Reverse transcription can be performed on beads, which completely eliminates organic extraction from the protocol, albeit with reduced efficiency.

In theory, reducing the number of affinity purifications and ligating adapters to bulk RNAs could reduce the specificity of the assay for nascent RNA. However, we have found that MyOne Streptavidin C1 beads, which have higher binding capacity per substrate area and are negatively charged to repel non-specific nucleic acid binding, sufficiently enrich nascent RNA over other cellular RNAs (Fig. 2C). Importantly, if nascent RNA were contaminated with mRNA, the number of reads mapping within exons would increase relative to reads mapping within introns. However, when we compared the length-normalized ratio of exonic reads to intronic reads, we observed no detectable difference between the original protocol and the improved protocol presented here (Fig. 2D).
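The exon/intron comparison described above can be sketched as a small calculation. This is only an illustration of a length-normalized ratio, not the analysis code used for Fig. 2D; the function name, argument names, and the example counts are all hypothetical.

```python
def exon_intron_ratio(exon_reads, exon_bp, intron_reads, intron_bp):
    """Length-normalized ratio of exonic to intronic read density.

    All inputs are hypothetical summary values for one library:
    read counts within exons/introns and the total annotated
    length (in bp) of those exons/introns.
    """
    if exon_bp <= 0 or intron_bp <= 0:
        raise ValueError("feature lengths must be positive")
    if intron_reads <= 0:
        raise ValueError("intronic read count must be positive")
    exon_density = exon_reads / exon_bp        # reads per exonic bp
    intron_density = intron_reads / intron_bp  # reads per intronic bp
    return exon_density / intron_density


# Made-up numbers: a library with equal exonic and intronic density,
# and one in which extra exonic reads (e.g. from mRNA carry-over)
# raise the ratio.
nascent_like = exon_intron_ratio(2000, 10_000, 18_000, 90_000)
mrna_contaminated = exon_intron_ratio(6000, 10_000, 18_000, 90_000)
print(nascent_like, mrna_contaminated)
```

Under these assumptions, a library dominated by nascent RNA gives a ratio near 1, and mature mRNA contamination pushes the ratio above that baseline. Comparing the ratio between two protocols is therefore a simple check on specificity.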

Additional protocol changes have further simplified and improved the protocol. Careful titration of adapters eliminates the need for PAGE purification (Fig. 3A-B), which is difficult and time-intensive and can introduce insert-size bias in the final libraries. Incorporation of a dual-UMI strategy reduces concern about PCR-duplicate reads. Furthermore, this eliminates the need for time-consuming test amplification, except as an initial troubleshooting step when establishing the assay in a new cell type or with a new amount of input material.

We have found that 10^6 cells are sufficient input material for this new protocol, and that it can even be performed with as few as 0.05-0.25 x 10^6 cells (Fig. 3C-F). Data quality typically increases up to 10^6 cells, but much smaller improvements are seen by further increasing cell number. Polymerases are sampled from fewer positions overall in libraries made with 0.25 x 10^6 cells, which causes data to look "spiky" and inflates read counts at highly expressed genes when normalizing per million mapped reads (Fig. 3C-F). Cell numbers required for high-quality library generation will vary with the transcription activity of each cell type, with fast-dividing cultured cells typically showing the greatest activity.
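To illustrate why a dual-UMI strategy reduces concern about PCR duplicates, the sketch below collapses reads that share the same alignment coordinates, strand, and both UMIs. It is a minimal example under stated assumptions: the `Alignment` record, its field names, and the UMI sequences are hypothetical, UMIs are assumed to have been extracted from the reads already, and matching is exact (production tools may also tolerate sequencing errors within UMIs).

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Alignment(NamedTuple):
    """One aligned read; all field names here are hypothetical."""
    chrom: str
    strand: str
    start: int
    end: int
    umi_5p: str  # UMI carried by the 5' adapter
    umi_3p: str  # UMI carried by the 3' adapter


def collapse_pcr_duplicates(
    reads: Iterable[Alignment],
) -> Tuple[List[Alignment], float]:
    """Keep one read per (coordinates, strand, UMI pair).

    Returns the unique reads in first-seen order and the fraction
    of input reads that were exact duplicates.
    """
    counts = Counter(reads)  # identical records collapse onto one key
    total = sum(counts.values())
    unique = list(counts)    # Counter preserves insertion order
    dup_fraction = 0.0 if total == 0 else 1.0 - len(unique) / total
    return unique, dup_fraction


# Made-up reads: the first two are identical (a PCR duplicate); the
# third shares coordinates but differs in its 3' UMI, so it is kept
# as an independent molecule.
first = Alignment("chr1", "+", 100, 140, "ACGTAC", "TTGACA")
example_reads = [
    first,
    first,
    Alignment("chr1", "+", 100, 140, "ACGTAC", "GGCATT"),
    Alignment("chr2", "-", 500, 530, "CCATGG", "TTGACA"),
]
unique_reads, dup_fraction = collapse_pcr_duplicates(example_reads)
print(len(unique_reads), dup_fraction)
```

Using both UMIs in the key matters most at strongly paused or highly expressed positions. There, many distinct molecules share identical coordinates and would otherwise be discarded as duplicates.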

We performed the new qPRO-seq protocol alongside the original protocol in two biological replicates of 10^6 K562 cells (Fig. 3). We observe that the C1 beads decrease the bias against long RNAs seen with M280 beads, resulting in a relatively higher capture rate of gene body reads and therefore lower overall pause indices (Fig. 3G). The pause index is the pause region signal divided by the length-normalized gene body signal, so an increased gene body capture rate with an unchanged pause region capture rate decreases the pause index. In aggregate profiles, we observe no detectable difference in promoter or enhancer profiles across protocols ( [...]

The protocol presented in this manuscript is also available at https://dx.doi.org/10.17504/protocols.io.57dg9i6 and will be updated as further improvements are made.

Acknowledgements
We thank the Cornell BRC Genomics facility and Peter Schweitzer for assistance with Illumina sequencing. We acknowledge many colleagues who have now used PRO-seq for various applications that we did not cite in this manuscript due to space constraints. This work was supported by NIH grants R01-GM025232 (to J.T.L.), R21-HG009021 and R35-GM128857 (to L.J.C.), and R01-HG009309 (to C.G.D [...]

[...] ~60% with the concentration in the 2XROMM as written, which is sufficient for experiments using 10^6 cells or greater, but increasing the concentration improves incorporation to ~77% (data not shown).
9. Use a centrifuge with a swinging-bucket rotor for all centrifuge steps during cell permeabilization. Using a fixed-angle rotor will shear cells, releasing a smear of white chromatin.
10. Centrifuge speed is cell-size dependent. We typically centrifuge HeLa at 800 x g and Drosophila at 1,000 x g.
11. When resuspending cells during permeabilization after centrifugation steps, first gently resuspend the cell pellet with 1 mL solution with a wide-bore P1000 tip.
Then add the remaining volume (usually 9 mL) and mix by gentle inversion.
12. If your cell type is not permeabilized under these conditions, add Triton X-100 to 0.1-0.2%.
13. When processing multiple samples, if counting will cause the cells to sit on ice for longer than 10 min, reserve 10 µL for counting, divide the cells into 100 µL aliquots, and snap freeze. Count the cells and then adjust the concentration with freeze buffer after thawing and prior to the run-on.
14. [...]
[...] of RNA that could occur in the pH 8.0 ThermoPol buffer. However, this is not a major concern except in the most sensitive applications.
37. Reverse transcription can also be performed on-bead, but we find that this significantly reduces library yield while increasing adapter dimer. For this reason, it is not recommended except in cases where material is abundant (10^7 cells) and speed is paramount. To do this, follow steps 1-2 in section 3.12, then follow section 3.13, but resuspend the beads instead of the RNA pellet in RT resuspension mix. After RT, elute cDNA by heating the bead mixture to 95°C, quickly placing the tubes on a magnet stand, and removing and saving the supernatant. Resuspend beads in 20 µL ddH2O and repeat the process for a final volume of 40 µL. Proceed with PreCR but use 20 µL less ddH2O (13.5 µL) in the PreCR mix and use the entire 40 µL eluate instead of the 20 µL RT mix.
38. PreCR is optional if full-scale amplification will be performed within 2 days. [...]
44. Desired amplification characteristics include a sufficient amount of product (smear starting ~150 bp), no evidence of overamplification, and ~50% primer exhaustion. The adapter dimer product is 132 bp, and the smear will start 15-20 bp above this band. RNA degradation will lead to shorter library products.
45. AMPure XP beads will work, as will any commercially available or homemade SPRI bead cleanup reagent based on PEG precipitation. Be sure to allow beads to reach room temperature, or excess primers will also precipitate.
46. Due to advances in streptavidin bead technology and titration of adapters presented in this protocol, PAGE purification is rarely necessary. We prefer to sequence libraries that are 0%-25% adapter dimer rather than risk size bias associated with gel purification. Only perform PAGE purification if absolutely necessary. If needed, multiple libraries can be pooled by molarity as determined by bioanalyzer and extracted from the same gel lane to minimize size bias.
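Pooling libraries by molarity, as mentioned in note 46, requires converting each library's mass concentration and mean fragment size into a molar concentration. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA. That constant, the function names, and the example values are assumptions for illustration and are not taken from the protocol.

```python
# Approximate average mass of one base pair of dsDNA (g/mol).
# This is an assumed, commonly used constant, not a protocol value.
AVG_BP_MASS = 660.0


def molarity_nM(conc_ng_per_ul: float, mean_len_bp: float) -> float:
    """Convert a dsDNA library concentration (ng/uL) to nM."""
    if mean_len_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_len_bp)


def equimolar_volumes(libraries: dict, fmol_each: float) -> dict:
    """Volume (uL) of each library that contributes `fmol_each` fmol.

    `libraries` maps a library name to (ng/uL, mean length in bp).
    Because 1 nM equals 1 fmol/uL, volume = fmol / nM.
    """
    volumes = {}
    for name, (conc, length) in libraries.items():
        nM = molarity_nM(conc, length)
        if nM <= 0:
            raise ValueError(f"library {name!r} has no measurable DNA")
        volumes[name] = fmol_each / nM
    return volumes


# Hypothetical bioanalyzer readings for two libraries.
example_libraries = {
    "lib_A": (1.32, 200.0),
    "lib_B": (2.64, 200.0),
}
example_volumes = equimolar_volumes(example_libraries, fmol_each=20.0)
print(example_volumes)
```

In this made-up example, the more concentrated library contributes half the volume of the other, so both add the same number of molecules to the pool.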