Abstract
RNAPII pausing immediately downstream of the transcription start site (TSS) is a critical rate limiting step at most metazoan genes that allows fine-tuning of gene expression in response to diverse signals1–5. During pause-release, RNA Polymerase II (RNAPII) encounters an H2A.Z.1 nucleosome6–8, yet how this variant contributes to transcription is poorly understood. Here, we use high resolution genomic approaches2,9 (NET-seq and ChIP-nexus) along with live cell super-resolution microscopy (tcPALM)10 to investigate the role of H2A.Z.1 on RNAPII dynamics in embryonic stem cells (ESCs). Using a rapid, inducible protein degron system11 combined with transcriptional initiation and elongation inhibitors, our quantitative analysis shows that H2A.Z.1 slows the release of RNAPII, impacting both RNAPII and NELF dynamics at a single molecule level. We also find that H2A.Z.1 loss has a dramatic impact on nascent transcription at stably paused, signal-dependent genes. Furthermore, we demonstrate that H2A.Z.1 inhibits re-assembly and re-initiation of the PIC to reinforce the paused state and acts as a strong additional pause signal at stably paused genes. Together, our study suggests that H2A.Z.1 fine-tunes gene expression by regulating RNAPII kinetics in mammalian cells.
Introduction
Eukaryotic transcription is a highly controlled process facilitated by a compendium of protein complexes that regulate RNA Polymerase II (RNAPII) recruitment as well as transcription initiation, elongation and termination, ultimately culminating in genic output3–5. In metazoans, RNAPII promoter-proximal pausing is now widely recognized as a key rate-limiting step that occurs between +25 and +50 nucleotides downstream of the transcription start site (TSS) and prior to the first nucleosome1–3. Current models suggest that RNAPII pausing provides a critical window of opportunity to fine-tune gene expression in response to signaling cues12. A diverse set of positive and negative factors regulating RNAPII pausing have been extensively studied both in vitro13–15 and in vivo3,16, yet how promoter nucleosomes affect pause-release remains poorly understood. At most genes in eukaryotes, the highly conserved histone H2A variant H2A.Z marks nucleosomes that flank the nucleosome depleted region (NDR) around the TSS7,17–19. H2A.Z is essential for early metazoan development20,21 and is necessary for proper induction of differentiation programs19,22–24. Conflicting reports suggest both positive and negative functions for H2A.Z in gene regulation. For example, work in Drosophila showed that H2A.Z (H2avD) lowers the energy barrier for RNAPII passage7, whereas recent biophysical studies using optical tweezers on in vitro reconstituted templates demonstrated that mammalian H2A.Z nucleosomes are more stable and pose a greater barrier to RNAPII compared to canonical histone H2A6. Notably, disruption of proper H2A.Z incorporation contributes to a growing list of human diseases including cancer23,25–27, underscoring the need for dissecting its mechanistic link to transcription in a cellular context.
Results
H2A.Z.1 slows the release of RNAPII towards progressive elongation
In vertebrates, H2A.Z is encoded by two independent genes H2afz (H2A.Z.1) and H2afv (H2A.Z.2 and its smaller spliceoform H2A.Z.2.2) that have distinct expression patterns and developmental roles23,24,28. Because H2A.Z.1 is strongly enriched at most promoters and is required for embryonic development as well as differentiation17,19,22,29, we focused on dissecting how this isoform regulates transcription in mouse embryonic stem cells (ESCs). We generated a homozygous ESC line containing a modified FKBP-V degron tag (dTAG) and 2x-HA epitope at the C-terminus of endogenous H2A.Z.1 (H2A.Z.1dTAG). This system allows for rapid and specific degradation of proteins by addition of the small molecule dTAG-13 (Fig. 1a and Extended Data Fig. 1a,b)11. Adding dTAG-13 to ESCs led to near complete loss of H2A.Z.1dTAG within 8 hours as shown by immunoblot (Fig. 1b and Extended Data Fig. 1c). This system circumvents prior limitations using siRNA knock-down including the extensive time required to assess the consequences of depletion and possible off target effects. ChIP-seq with an anti-HA antibody revealed the highest distribution of H2A.Z.1dTAG over promoters and some distal enhancers, correlating strongly with prior data both at a genome-wide (Spearman R = 0.89) and single gene level (Extended Data Fig. 1d-f)19. Moreover, H2A.Z.1dTAG ESCs were able to form EBs (data not shown) indicating that the dTAG cassette does not interfere with variant incorporation or function. As expected, treating ESCs with dTAG-13 led to rapid and near complete loss of H2A.Z.1 ChIP-seq peaks across the genome within hours (Fig. 1c and Extended Data Fig.1g) validating the robustness of our system.
H2A.Z.1 is enriched at the promoters of most genes including both active (H3K4me3 only) and silent, bivalent (H3K4me3 and H3K27me3) genes in ESCs17–19. We observed lower H2A.Z.1 levels at genes with the highest RNAPII density by ChIP-seq (Extended Data Fig. 2a), consistent with more rapid nucleosome turnover during active transcription. To test this idea further, ESCs were treated with the rapid elongation inhibitor DRB (5,6-dichloro-1-beta-D-ribofuranosylbenzimidazole)30, an adenosine analogue that halts mRNA synthesis, followed by RNAPII and H2A.Z.1 ChIP-seq. DRB treatment for 2 hrs led to a decrease in the elongating form of RNAPII S2ph and increase in both RNAPII and H2A.Z.1 density at the promoter-proximal regions of active, but not bivalent genes (Fig. 1d,e and Extended Data Fig. 2b-f). Upon DRB wash off, H2A.Z.1 levels decreased at active promoters and RNAPII showed higher levels in gene bodies (Fig. 1d,e and Extended Data Fig. 2b-d). These data predict that dTAG-13 treatment will lead to increased transcript levels at genes with the highest H2A.Z.1 enrichment.
Using RNA-seq, indeed, we found that the vast majority of differentially expressed genes showed increased expression upon rapid H2A.Z.1 loss (Extended Data Fig. 3a). We observed a striking inverse correlation between H2A.Z.1 loss at promoters and increased expression of the corresponding gene (Spearman R=-0.63) (Fig. 1f). Coupled with our analysis showing an inverse correlation between H2A.Z.1 and RNAPII at promoters, these data suggest that H2A.Z.1 regulates RNAPII dynamics. To test this idea, we next performed Native Elongation Transcript Sequencing (NET-seq) to map the strand-specific location of RNAPII at high resolution2 in ESCs. Analysis of biological duplicates showed that RNAPII density was highly reproducible (Pearson’s R = 0.99 for both control and dTAG data) (Extended Data Fig. 3b). Using available START-seq data in ESCs3, we compiled a list of 4,184 uniquely annotated transcription start sites (TSSs) corresponding to non-tandemly oriented protein-coding genes with detectable RNAPII levels (Extended Data Fig. 3c). Upon dTAG-13 treatment, RNAPII density significantly decreased at promoters and showed a concomitant increase in the gene body compared to controls (Fig. 1g and Extended Data Fig. 3d). We then calculated the RNAPII pausing index (PI) which is defined by the density of RNAPII over the promoter proximal region relative to the gene body31. A significant decrease (p = 1.6 x10−71) in the PI was observed upon H2A.Z.1 loss (Fig. 1h). ChIP-seq of Spt5, a subunit of the DSIF complex involved in promoter-proximal RNAPII pausing, also showed a global and gene-specific decrease upon dTAG-13 treatment (Fig. 1i,j). Together, these data suggest H2A.Z.1 acts as a barrier to transcription by hindering RNAPII pause-release.
H2A.Z.1 controls RNAPII and NELF dynamics at single molecule resolution
The above genomic approaches capture a snapshot of RNAPII at either a paused or elongating state, so we next wanted to quantify RNAPII dynamics at single molecule resolution in live cells. We engineered a photoconvertible Dendra2 tag on the largest subunit of endogenous RNAPII (RPB1) in H2A.Z.1dTAG ESCs. ChIP-seq using an anti-Dendra2 antibody confirmed highest RNAPII enrichment at the TSSs of all protein-coding genes, both at a genome-wide and single gene level, strongly correlating with our NET-seq data (Pearson R = 0.75) (Extended Data Fig. 4a-c). Using single molecule super-resolution microscopy (tcPALM)10,32 in live H2A.Z.1dTAG;RNAPIIDendra ESCs, we found that RNAPII exists both in stable and temporal clusters (∼100 nm resolution) with an average cluster lifetime of 4.03 ± 0.13 sec (mean ± SEM of 360 clusters) for the latter clusters (Fig. 2a). Remarkably, H2A.Z.1 loss increased the average lifetime of temporal RNAPII clusters to 4.47 ± 0.14 sec (p=0.0066) (Fig. 2b). RNAPII cluster lifetime positively correlates with the rate of RNAPII re-initiation and ultimately mRNA output based on single molecule FISH32–34. In support of this correlation, treatment with the transcription initiation inhibitor triptolide (TRI)35 completely abolished RNAPII cluster dynamics in ESCs, whereas DRB treatment showed a less dramatic effect (Extended Data Fig. 4d).
The significant decrease in the RNAPII pause index and Spt5/DSIF promoter levels upon dTAG-13 treatment as well as single molecule RNAPII cluster dynamics point to a direct role of H2A.Z.1 in regulating pause-release. Thus, we next measured the dynamics of other key pausing factors. Negative Elongation Factor (NELF) is a well-characterized pausing complex that interacts dynamically with RNAPII upon promoter-release4,16,36. We therefore analyzed available NELF ChIP-seq data in ESCs31 and found a strong overlap with H2A.Z.1 and RNAPII at active gene promoters (Extended Data Fig. 4e). To test the prediction that H2A.Z.1 affects NELF dynamics, we engineered a Halo-tag at the C-terminus of endogenous NELF-B, a subunit of the NELF complex, in H2A.Z.1dTAG ESCs (Extended Data Fig. 4f). Live cell imaging of NELF-B-Halo showed bright punctate spots in the nucleus as previously described37 (Fig. 2c). To measure NELF cluster dynamics relative to transcription, we treated ESCs for 45 min with either DRB or TRI to inhibit transcription elongation and initiation, respectively. NELF-B-Halo cluster size and intensity were unchanged by blocking initiation (Fig. 2d), consistent with the finding that NELF can be recruited to promoter-proximal regions through TFIID interactions38. In contrast, DRB treatment led to a significant decrease in NELF-Halo intensity (p<0.007) and dynamics by blocking elongation and subsequent re-loading or re-initiation of new RNAPII molecules39 (Fig. 1d,e and Fig. 2d). Thus, NELF clusters appear to mark regions of active transcriptional elongation in live cells.
These observations led us to test the dynamic relationship between H2A.Z.1 and NELF within these clusters. We performed fluorescence recovery after photobleaching (FRAP) in H2A.Z.1dTAG;NELF-BHalo ESCs. FRAP showed that 81% of NELF was exchanged within ∼10 seconds in control ESCs, similar to the very rapid turnover observed for RNAPII cluster dynamics (Fig. 2e,f)10. Remarkably, NELF clusters displayed a significantly higher recovery (89%; p=0.001) following dTAG-13 treatment, suggesting that NELF is more rapidly exchanged at active regions upon H2A.Z.1 loss (Fig. 2f,g). A similar trend was observed when performing FRAP in the presence of TRI (Extended Data Fig. 4g,h), consistent with our imaging data showing that blocking transcription initiation for 45 min does not decrease NELF cluster size due to elongation of engaged RNAPII. Together, measurements of RNAPII and NELF dynamics in live cells suggest that H2A.Z.1 impacts both promoter-proximal pause-release and subsequent re-loading of RNAPII.
H2A.Z.1 regulates promoter proximal RNAPII half-life
To test this idea that H2A.Z.1 prevents re-initiation presumably due to slowing the release of paused RNAPII at the gene level, we treated H2A.Z.1dTAG;RNAPIIDendra2 ESCs with TRI at different timepoints (10, 20, and 40 min) and performed ChIP-nexus9, a protocol that captures both initiating and stalled RNAPII at nucleotide resolution (Fig. 3a). As TRI treatment blocks new initiation35, paused RNAPII is eventually lost either by release of the elongation complex into the gene body or by premature termination of nascent transcripts40. As expected, TRI treatment caused a dramatic decrease of promoter-proximal RNAPII at active genes in both control and dTAG-13 treated ESCs (Extended Data Fig. 5a,b). ChIP-nexus also revealed a lower RNAPII pausing index upon H2A.Z.1 depletion compared to controls consistent with our NET-seq data (Fig. 1h) and with a more rapid RNAPII release into gene bodies (Extended Data Fig. 5c).
RNAPII half-life can be used as a measure of its turnover at gene promoters. Thus, we next fitted an exponential decay model using the time points of TRI treatment considering only those genes that maintain measurable RNAPII levels throughout the time course (see Methods). We calculated a median RNAPII half-life of 8.97 min in control ESCs (n = 2,971 genes), in good agreement with data in both Drosophila39,41 and mammalian cell lines40. In contrast, RNAPII half-life decreased to 7.70 min (p = 1×10−122) upon dTAG-13 treatment (Fig. 3b,c). We then performed k-means clustering and divided genes into three categories based on RNAPII half-life/turnover in control ESCs: Short; 1.5-7.9 min (n = 1,449), Medium; 8-13.5 min (n = 730), and Long; 13.6 – 23.5 min (n = 728) (Fig. 3d). Following dTAG-13 treatment, RNAPII half-life was most dramatically reduced at genes displaying shorter half-lives (Short p = 1×10−69, Medium p = 2.1×10−31) (Fig. 3d). Surprisingly, although RNAPII half-life of more stably paused genes did not change appreciably upon H2A.Z.1 loss (Long p = 0.4), these genes exhibited a dramatic decrease of promoter-proximal RNAPII followed by a concomitant increase in nascent transcripts (Fig. 3e-g). Upon DRB treatment, we observed that genes with a longer RNAPII half-life also displayed a dramatic shift of RNAPII into the gene body upon release from the elongation block (Extended Data Fig. 5d). Moreover, H2A.Z.1 slows pause-release and nascent transcription at stably paused genes, a class of genes that appear most sensitive to signalling cues3,39. Notably, these stably paused genes show lower expression in ESCs and have higher promoter GC content (Extended Data Fig. 5d-g), a sequence feature associated with stable RNAPII pausing in metazoans42,43. Thus, limiting pause release and subsequent RNAPII loading may safeguard proper activation of H2A.Z.1 genes.
H2A.Z.1 controls Pre-initiation complex (PIC) recruitment at promoters
Our observations suggest that stably paused genes experience more rapid reloading of RNAPII upon H2A.Z.1 depletion. Although our tcPALM data are consistent with an overall increase in initiating RNAPII upon dTAG-13 treatment (Fig. 2b), we wanted to measure RNAPII PIC recruitment and initiation at individual promoters. RNAPII ChIP-nexus in combination with TRI treatment showed a dramatic increase in initiating Serine 5 phosphorylated CTD levels (RNAPII S5ph) (Extended Data Fig. 6a) and shift of RNAPII occupancy upstream of the TSS (Fig. 4a) consistent with the position of the PIC39,44. Notably, after 40 min of TRI treatment, we observed similarly higher initiating RNAPII levels in dTAG-13 treated compared to control ESCs (Fig. 4b and Extended Data Fig. 6b,c).
To validate that the increase in initiating RNAPII upon H2A.Z.1 loss is due to new PIC recruitment, we next performed ChIP-nexus for the PIC subunits TFIIB and TBP to map their location at base pair resolution. Both TFIIB and TBP displayed a strong enrichment ∼20-30bp upstream of the TSS with high reproducibility between biological replicates (Fig. 4c and Extended Data Fig. 7a,b). In vitro, PIC dynamics is on the order of seconds14,45–47 and promoter proximal paused RNAPII is refractory to new transcription initiation39. To first measure new TFIIB and TBP enrichment, we treated cells with TRI for 40 min to enable complete release of paused RNAPII. ChIP-nexus revealed an increase of new PIC recruitment at RNAPII-dependent promoters either upon TRI and/or dTAG-13 treatment (Fig. 4d-f and Extended Data Fig. 7c-e) with the most dramatic increase observed at stably paused (Long) promoters (Extended Data Fig. 7f). Collectively, our data support a model whereby H2A.Z.1 enables multistep control of transcription by slowing RNAPII pause-release which subsequently inhibits PIC assembly and re-initiation.
RNAPII pausing is critical to allow genes to appropriately respond to developmental and environmental signals by controlling the timing, rate, and magnitude of the transcriptional response3,44,48. In addition to known pausing factors, our quantitative molecular analysis provides strong evidence that H2A.Z.1 acts as a barrier to transcription, affecting both RNAPII pause-release and re-engagement of new RNAPII at promoters in mammalian cells (Fig. 4g). Thus, H2A.Z.1 likely coordinates synchronous gene expression in response to signaling cues. Consistent with this idea, stably paused genes show more synchronous expression, lower cell to cell variability, and are typically signal-responsive genes49. Thus, H2A.Z.1 may serve as a GO or NO-GO decision point whereby RNAPII can either undergo early transcript termination or progressive elongation depending on cellular signals. Notably, in the absence of NELF, RNAPII experiences a second pause around the +1 nucleosome dyad16. Our data argue that site-specific H2A.Z incorporation surrounding the NDR is critical for this additional pause. Thus, our data also predict that H2A.Z.1 nucleosomes upstream of the TSS play a role in regulating PIC assembly and re-initiation. Together, our work uncovers a mechanistic link between H2A.Z.1 and transcription dynamics and provides an explanation for its promoter enrichment at RNAPII-regulated genes and requirement in metazoan development.
Accession Numbers
Raw and normalized sequencing data have been deposited under GEO accession number GSEXXXXXXX.
Author Contributions
CM and LAB designed the study. CM performed experiments and analyzed data. AA assisted with FRAP image acquisition and analysis. CL and IC assisted with tcPALM. CM and LAB wrote the manuscript with input from all authors.
Materials and Methods
ESC culture
V6.5 murine embryonic stem cells (mESC) were cultured in 2i + LIF conditions50 for all assays. Genome editing was performed in serum + LIF conditions. Cells were cultured at 37°C, 5% CO2, on 0.1% gelatin coated tissue culture plates. The media used for general culturing in serum + LIF conditions is as follows: DMEM-KO (Invitrogen 10829-018) supplemented with 15% fetal bovine serum (Hyclone characterized SH3007103), 1,000 U/mL LIF, 100 mM nonessential amino acids (Invitrogen 11140-050), 2 mM L-glutamine (Invitrogen 25030-081), 100 U/mL penicillin, 100 mg/mL streptomycin (Invitrogen 15140-122), and 8 μL/mL of 2-mercaptoethanol (Sigma M7522). The media used for 2i + LIF media conditions is as follows: 484 mL DMEM/F12 (GIBCO 11320), 2.5 mL N2 supplement (GIBCO 17502048), 5 mL B27 supplement (GIBCO 17504044), 0.5 mM L-glutamine (GIBCO 25030), 1X non-essential amino acids (GIBCO 11140), 100 U/mL Penicillin-Streptomycin (GIBCO 15140), 0.1 mM β-mercaptoethanol (Sigma M6250), 1 μM PD0325901 (Stemgent 04-0006), 3 μM CHIR99021 (Stemgent 04-0004), and 1,000 U/mL recombinant LIF.
Immunoblots
To assay immunoprecipitation results by western blot, 500 ng of each sample was run on a 4%–20% Bis-Tris gel (Bio-rad 3450124) using SDS running buffer (Bio-rad 1610788) at 120V for 10 minutes, followed by 180V until the dye front reached the end of the gel. Protein was then wet transferred to a nitrocellulose membrane using the Trans-blot turbo transfer system (Bio-rad). After transfer, the membrane was blocked with 5% BSA for 1h on a rotor.
The membrane was then incubated for 2h at room temperature with the following primary antibodies diluted in 5% BSA in TBST: 1:5000 Anti-HA (Cell Signaling 3724), 1:1000 Anti-Nelf-B (Cell Signaling 14894), 1:1000 Anti-RNAPII S5ph (Active Motif 39750), 1:1000 Anti-RNAPII S2ph (Active Motif 39564), and 1:1000 Anti-GAPDH (Cell Signaling 8884). Following several washes with TBST, membranes were incubated either with 1:10,000 Anti-mouse HRP (Cell Signaling 7076) or with 1:10,000 Anti-rabbit HRP (R&D Systems FIN1818061) for 1h at room temperature with rotation. After extensive washing with TBST, membranes were developed with ECL substrate (Bio-rad 1705060) and imaged using an Azure 600 or exposed to film.
Chromatin Immunoprecipitation (ChIP)
ChIP was performed using ∼10 million ESCs per assay as previously described51. Briefly, cells were cross-linked with 1% formaldehyde for 10 min followed by 5 min quenching with 125 mM glycine. After washing with PBS buffer, the cells were collected and lysed in cell lysis buffer (5 mM Tris, pH 8.0, 85 mM KCl, and 0.5% NP-40) with 1x Halt Protease Inhibitor cocktail (ThermoFisher 87786) and 1 mM PMSF (Sigma 10837091001). Pellets were spun for 5 min at 6000 rpm at 4°C. Nuclei were lysed in nuclei lysis buffer (1% SDS, 10 mM EDTA, and 50 mM Tris–HCl) and samples were sonicated for 12 min on a Covaris Sonicator. The samples were centrifuged for 20 min at 13,000 rpm at 4°C and the supernatant was diluted in IP buffer (0.01% SDS, 1.1% Triton-X-100, 1.2 mM EDTA, 16.7 mM Tris–HCl, and 167 mM NaCl), and the appropriate antibody (10 μg) was added and incubated overnight at 4°C with rotation. The antibodies used in this study were: Anti-HA (Cell Signaling 3724), Anti-Spt5 (Santa Cruz sc-133217), Anti-Dendra2 (OriGene TA180094), Anti-TFIIB (Cell Signaling 4169), and Anti-TBP (Abcam ab28175). Two biological replicates were prepared for each condition using independent cell cultures and chromatin precipitations. Following overnight incubation, 50 μl Protein G Dynabeads (Life Technologies 10009D) were added for 1 h at room temperature with rotation. Beads were washed once for 1 min with rotation with each of the following buffers: Low salt buffer (0.1% SDS, 1% Triton-X-100, 2 mM EDTA, 20 mM Tris pH 8.0, 150 mM NaCl), High salt buffer (0.1% SDS, 1% Triton-X-100, 2 mM EDTA, 20 mM Tris pH 8.0, 500 mM NaCl), LiCl buffer (0.25 M LiCl, 1% NP-40, 1% deoxycholate, 10 mM Tris pH 8.0, 1 mM EDTA), and TE buffer (50 mM Tris pH 8.0, 10 mM EDTA). DNA was eluted off the beads by rotation at room temperature for 30 min in 200 μL elution buffer (1% SDS, 0.1 M NaHCO3). Cross-links were reversed at 65°C for 4h. RNA was degraded by the addition of 3 μL of 33 mg/mL RNase A (Sigma R4642) and incubation at 37°C for 2 hours. Protein was degraded by the addition of 5 μL of 20 mg/mL proteinase K (Invitrogen 25530049) and incubation at 50°C for 1 hour. Phenol:chloroform:isoamyl alcohol extraction was performed followed by ethanol precipitation. The resulting DNA was resuspended in 20 μL H2O and used for either qPCR or sequencing.
For ChIP-seq experiments, 10-20 ng of purified ChIP DNA was used to prepare Illumina multiplexed sequencing libraries. Libraries for Illumina sequencing were prepared following the NEBNext DNA Library Prep Master Mix kit (NEB E6040). Amplified libraries were size selected using a 2% agarose gel to extract fragments between 200 and 600 bp.
ChIP-nexus
For each ChIP we used ∼20 million crosslinked mouse ESCs spiked-in with 5% human U2OS cells expressing Dendra2-RNAPII to account for loss of RNAPII during triptolide treatment. For TBP and TFIIB ChIPs, mouse ESCs were spiked-in with 5% human iPSCs. ChIP was performed as described above. The ChIP-nexus libraries were constructed as previously described with minor modifications9. Immunoprecipitated, bead-bound material was end-repaired, followed by dA-tailing and adaptor ligation. The ChIP-nexus adaptor mix contained four fixed barcodes (ACTG, CTGA, GACT, TGAC). The barcode was extended with Phi29 polymerase, followed by λ exonuclease digestion, ethanol precipitation and ssDNA circularization. A detailed ChIP-nexus protocol can be found online (ChIP-nexus protocol v.2019). At least two biological replicates were performed for each factor to obtain coverage of at least 80 million reads per condition. Single-end sequencing of 75 bp was performed on an Illumina NextSeq 500 instrument.
ChIP-seq/nexus analysis
For ChIP-seq, the samples were single-end deep-sequenced and reads were aligned to the mm10 genome using Bowtie2 (v 2.3.5)52. Peak calling was performed using PePr (v 1.1)53 with peaks displaying an FDR < 10−5 considered statistically significant. All aligned ChIP-seq BAM files were converted to bigwig (10 bp bin) and normalized to 1× sequencing depth using deepTools (v 3.0)54. Blacklisted mm10 coordinates were further removed from the analysis. Average binding profiles were visualized using R (v 3.6.1). Heatmaps were generated with deepTools.
For ChIP-nexus, fastq files were filtered for the presence of the four fixed barcodes and were deduplicated using Q-Nexus55. Bowtie2 (v 2.3.5) was used for alignment to either the mm10 or hg19 genome. Normalization factors were computed for reads mapping uniquely to the human genome using DESeq256. All aligned mouse ChIP-nexus BAM files were converted to bigwig (1 bp bin), separated by strand, and normalized to human spike-in controls using deepTools (v 3.0) with “--Offset 1” to record the position of the 5’ end of the sequencing read, which corresponds to the TF footprint.
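The spike-in scaling step can be illustrated with a minimal sketch. It assumes a simple ratio-based scale factor (the smallest human read count divided by each sample's human read count) as a stand-in for the DESeq2 size-factor estimation named above; the function name, sample labels and read counts are hypothetical.

```python
def spikein_scale_factors(human_reads):
    """Return per-sample scale factors from uniquely mapped human (spike-in) reads.

    Assumption: every sample received the same proportion of spike-in cells,
    so a sample with more human reads has its mouse signal scaled down.
    """
    if not human_reads:
        raise ValueError("no samples given")
    if any(n <= 0 for n in human_reads.values()):
        raise ValueError("every sample needs at least one spike-in read")
    reference = min(human_reads.values())
    return {sample: reference / n for sample, n in human_reads.items()}


# Hypothetical counts: the TRI-treated sample has twice as many human reads.
counts = {"control_noTRI": 2_000_000, "control_TRI40": 4_000_000}
factors = spikein_scale_factors(counts)
```

Under these assumptions, each mouse coverage track would be multiplied by its sample's factor before comparing conditions.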
To calculate the half-life of paused RNAPII for each gene, RNAPII density was calculated in a 250-bp window around the TSS. RNAPII time-course measurements were fitted to an exponential decay model using the “RNAdecay”57 R package. We selected genes fulfilling the following criteria: 1) detectable RNAPII levels (RPM > 0.1), 2) highest RNAPII density in the “No TRI” condition, and 3) low variance between replicates (sigma < 0.05). Genes fitting the above criteria (n = 2,971) were used to calculate RNAPII half-life. The full list of genes is displayed in Extended Data Table 1.
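The relationship between the decay fit and the reported half-life can be sketched as follows. This is a simplified illustration, not the RNAdecay procedure: it assumes a single-exponential model, signal(t) = A·exp(−k·t), fitted by ordinary least squares on the log-transformed signal, with half-life = ln(2)/k. The function name and the example densities are hypothetical.

```python
import math


def fit_half_life(times_min, signal):
    """Fit signal(t) = A * exp(-k * t) by least squares on log(signal).

    Returns (k, half_life), where half_life = ln(2) / k.
    """
    if len(times_min) != len(signal) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    if any(s <= 0 for s in signal):
        raise ValueError("signal must be positive to take logarithms")
    logs = [math.log(s) for s in signal]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx  # decay rate is the negative slope of log(signal) vs time
    if k <= 0:
        raise ValueError("signal does not decay over the time course")
    return k, math.log(2) / k


# Hypothetical promoter-proximal densities at 0, 10, 20 and 40 min of TRI,
# simulated from a gene with a 9 min half-life.
times = [0, 10, 20, 40]
true_k = math.log(2) / 9.0
density = [math.exp(-true_k * t) for t in times]
k, half_life = fit_half_life(times, density)
```

Requiring the highest density at the “No TRI” time point, as in criterion 2 above, is what keeps the fitted rate positive for the retained genes.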
Genome Editing using CRISPR/Cas9
The CRISPR/Cas9 system was used to genetically engineer ESC lines. Target-specific oligonucleotides were cloned into the pSpCas9-GFP plasmid (Addgene PX458) using BbsI restriction digestion. The oligo sequences used for cloning are;
Plasmids containing H2afz-FKBP.knock-in.BFP and Nelfb-Halo DNA repair templates were synthesized with long homology arms (∼800 bp) by Genewiz FragmentGene and assembled using the NEBuilder HiFi DNA Assembly Cloning kit (NEB E5520S). The Dendra2-RPB1 plasmid and guide were used as previously described10.
For the generation of the endogenously tagged lines, 1 million mES cells were transfected with 1.25 μg Cas9 plasmid and 1.25 μg non-linearized repair template plasmid. Cells were sorted after 30 hours for the presence of Cas9-GFP. Cells were expanded for five days and then sorted again either for BFP (H2afz-FKBP) or GFP (RPB1-Dendra2). Cells that were transfected with the Nelfb-Halo repair template were incubated for 15 min with 500 nM Halo TMR ligand (Promega G8251), washed 3 times with 2i media and further incubated for another 30 min before sorting for Texas Red. Sorted cells were plated at a low density (∼400 cells) on 10 cm plates with irradiated MEFs and grown for approximately one week. Individual colonies were picked using a stereoscope into a 96-well plate. Cells were expanded and genotyped by PCR. Clones with a homozygous knock-in tag were further expanded and used for experiments.
Super-resolution imaging
Live cell PALM imaging was carried out as described before10. Briefly, cells were incubated in L-15 medium and were simultaneously illuminated with 1.3 W/cm2 near UV light (405 nm) for photoconversion of Dendra2 and 3.2 kW/cm2 (561 nm) for fluorescence detection with an exposure time of 50 ms. We acquired images of Dendra2-RNAPII for 50 s (1000 frames) for quantification of transient clusters. Super-resolution images were reconstructed using MTT58 and qSR59. For imaging NELFb-Halo clusters, cells were incubated for 15 min with 500 nM Halo TMR ligand (Promega G8251), washed with 2i media three times followed by 30 min incubation in 2i media without the ligand to remove unbound HaloTag ligands before fluorescence imaging in L-15 medium. We acquired 100 frames (5 s) with 561 nm excitation.
Density based spatial clustering of applications with noise (DBSCAN) analysis
We used a clustering algorithm, density based spatial clustering of applications with noise (DBSCAN), to define the area of clustered regions in super-resolution data, as previously described10. Based on the single molecule localization in a super-resolution image, DBSCAN tests if a localization can be grouped with nearby localizations. Once a localization has a minimum number of nearby localizations (N) within a specific distance (R), it is grouped with other localizations also satisfying N and R criteria in their local neighbourhood. We defined the parameters N = 25-30 (points) and R = 90-95 (nm) for Dendra2-RNAPII clustering. By varying the parameters for an image, we chose a set of parameters (N, R) for which the DBSCAN clustering result agrees with intensified regions in the super-resolution image generated using the qSR software module. For DBSCAN analysis, we used the ‘DBSCAN’ module embedded in the qSR software.
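The grouping rule described above can be sketched in a few lines. This is a generic, brute-force DBSCAN written for illustration only, not the qSR implementation; the function name, the toy coordinates and the small parameter values (N = 3, R = 50 nm) are assumptions chosen to keep the example short, whereas the study used N = 25-30 and R = 90-95 nm.

```python
import math


def dbscan(points, radius, min_points):
    """Label 2D localizations; cluster ids start at 0 and noise is -1.

    A point is a core point if at least `min_points` localizations
    (itself included) lie within `radius` of it.
    """
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= radius]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_points:
            labels[i] = noise  # may later be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster_id  # border point of this cluster
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_points:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


# Toy localizations in nm: two tight groups and one isolated detection.
localizations = [(0, 0), (10, 0), (0, 10), (10, 10),
                 (500, 500), (510, 500), (500, 510),
                 (1000, 0)]
labels = dbscan(localizations, radius=50, min_points=3)
```

Here the first four localizations form one cluster, the next three form a second, and the isolated detection is labelled as noise.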
Fluorescence Recovery After Photobleaching (FRAP)
FRAP experiments were carried out as previously described10. Images were taken at a single confocal plane with an exposure time of 200 ms. The bleach spot was taken at the center of each cluster and images were acquired at 1 s intervals for 1 min. Imaging was performed on an Andor Revolution Spinning Disk Confocal microscope with the FRAPPA module. To quantify FRAP recovery, photobleaching was corrected by normalizing to non-bleached areas using the FIJI plugin FRAP profiler (http://worms.zoology.wisc.edu/research/4d/4d).
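The photobleaching correction and recovery estimate can be sketched as below. This is a minimal illustration and not the FRAP profiler algorithm: it assumes the bleached-region intensity is divided by a non-bleached reference region frame by frame, scaled to the mean prebleach value, and that recovery is taken as (final − first post-bleach) / (1 − first post-bleach), one common definition of the mobile fraction. Function names and intensities are hypothetical.

```python
def frap_curve(bleached, reference, n_prebleach):
    """Return the bleached-ROI trace corrected for acquisition photobleaching
    and normalised so that the mean prebleach intensity equals 1."""
    if len(bleached) != len(reference):
        raise ValueError("traces must have the same number of frames")
    if not 0 < n_prebleach < len(bleached):
        raise ValueError("n_prebleach must leave at least one post-bleach frame")
    corrected = [b / r for b, r in zip(bleached, reference)]
    prebleach_mean = sum(corrected[:n_prebleach]) / n_prebleach
    return [c / prebleach_mean for c in corrected]


def mobile_fraction(curve, n_prebleach):
    """Fraction of the bleached signal recovered by the last frame."""
    first_post = curve[n_prebleach]
    return (curve[-1] - first_post) / (1.0 - first_post)


# Hypothetical traces: two prebleach frames, then four post-bleach frames.
bleached_roi = [100.0, 100.0, 19.0, 45.0, 59.5, 67.2]
reference_roi = [200.0, 200.0, 190.0, 180.0, 170.0, 160.0]
curve = frap_curve(bleached_roi, reference_roi, n_prebleach=2)
recovered = mobile_fraction(curve, n_prebleach=2)
```

With these made-up numbers the corrected curve drops to 0.2 after the bleach and climbs back to 0.84, which corresponds to a recovery of about 80%.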
RNA-isolation and sequencing
RNA was isolated using the RNeasy Plus Mini Kit (QIAGEN 74136) according to the manufacturer’s instructions. ERCC spike-in controls were added to the RNA samples prior to library preparation. Stranded, ribo-depleted libraries were prepared using the TruSeq Stranded mRNA Library Prep Kit (Illumina RS-122-2101) according to the manufacturer’s standard protocol. Libraries were subject to 75 bp paired-end sequencing on an Illumina NextSeq instrument.
RNA-seq analysis
Sequenced reads were aligned to the mm10 genome via STAR (v 2.7.2b)60. Gene counts were obtained from featureCounts of the Rsubread package (R/Bioconductor). Only genes with CPM > 2 were included in subsequent analysis. Normalization factors from ERCC spike-in controls were calculated using edgeR61 and applied to the counts mapping to the mm10 genome. Differential expression was performed using the limma62 package. Significant genes with an absolute fold change > 1.5 and adjusted P-value < 0.05 were considered as differentially expressed (Extended Data Table 2).
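The two thresholds above can be made concrete with a short sketch. It is not the edgeR/limma workflow: it assumes CPM is computed per sample as count / library size × 10^6, that the CPM filter is applied to a single sample, and that fold changes are supplied on a log2 scale. Function names, gene names and counts are hypothetical.

```python
import math


def counts_per_million(counts):
    """CPM for one sample: count / total library size * 1e6."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def is_differential(log2_fc, adj_p, fc_cutoff=1.5, p_cutoff=0.05):
    """Apply the absolute fold change > 1.5 and adjusted P < 0.05 thresholds."""
    return abs(log2_fc) > math.log2(fc_cutoff) and adj_p < p_cutoff


# Hypothetical library of exactly one million reads.
sample_counts = {"GeneA": 30, "GeneB": 1, "GeneC": 999_969}
cpm = counts_per_million(sample_counts)
expressed = sorted(gene for gene, value in cpm.items() if value > 2)
```

Note that a 1.5-fold change corresponds to a log2 fold change of about 0.585, so a gene with log2 fold change 0.5 would not pass even with a small adjusted P-value.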
Cell fractionation, RNA preparation and sequencing library construction for NET-seq
The cell fractionation was performed as described previously63. 10 million ES cells are washed with 500 μl of pre-cooled 1x PBS, resuspended in 150 μl Cytoplasmic lysis buffer (0.15% (v/v) NP-40, 10 mM Tris-HCl (pH 7.0), 150 mM NaCl, 25 μM α-amanitin (MCE HY-19610), 50 U SUPERase.In (Life Technologies AM2694), 1x Halt Protease Inhibitor cocktail mix (ThermoFisher 87786)) and incubated on ice for 5 min. The cell lysate is layered over 400 μl of Sucrose buffer (10 mM Tris-HCl (pH 7.0), 150 mM NaCl, 25% (w/v) sucrose, 25 μM α-amanitin, 50 U SUPERase.In, 1x Halt Protease Inhibitor cocktail mix) and centrifuged at 16,000 g for 10 min at 4°C. The nuclei pellet is resuspended in 500 μl Nuclei wash buffer (0.1% (v/v) Triton X-100, 1 mM EDTA, in 1x PBS, 25 μM α-amanitin, 50 U SUPERase.In, 1x Protease inhibitor mix) and centrifuged at 1,150 g for 1 min at 4°C. Washed nuclei are resuspended in 200 μl Glycerol buffer (20 mM Tris-HCl (pH 8.0), 75 mM NaCl, 0.5 mM EDTA, 50% (v/v) glycerol, 0.85 mM DTT, 25 μM α-amanitin, 50 U SUPERase.In, 1x Protease inhibitor mix). Next, 200 μl of Nuclei lysis buffer (1% (v/v) NP-40, 20 mM HEPES pH 7.5, 300 mM NaCl, 1 M Urea, 0.2 mM EDTA, 1 mM DTT, 25 μM α-amanitin, 50 U SUPERase.In, 1x Protease inhibitor mix) are added, mixed by pulsed vortexing and incubated on ice for 2 min. The lysate is centrifuged at 18,500 g for 2 min at 4°C. The chromatin pellet is resuspended in 50 μl Chromatin resuspension solution (25 μM α-amanitin, 50 U SUPERase.In, 1x Protease inhibitor mix in 1x PBS) before RNA preparation. Biological replicates were obtained from two independent Control and dTAG-13 treated cells. Prior to library preparation, SIRV spike-in controls (1:10,000 dilution) were added to the extracted RNA to account for loss of RNA during library preparation. NET-seq library construction was conducted as originally described50.
More than 90% recovery of ligated RNA and cDNA was achieved from 15% TBE-Urea (Invitrogen EC6885BOX) and 10% TBE-Urea (Invitrogen EC6875BOX) gels, respectively, by adding RNA recovery buffer (R1070-1-10; Zymo Research) to the excised gel slices and further incubating at 70°C (1,500 rpm) for 15 min. Gel slurry was added to a ZymoSpin IV Column (Zymo Research C1007-50) and cDNA was precipitated for subsequent library preparation steps. cDNAs containing the 3’ end sequences of a subset of mature and heavily sequenced snRNAs, snoRNAs, and rRNAs were specifically depleted using biotinylated DNA oligos50. Oligo-depleted circularized cDNA was amplified by PCR and double-stranded DNA was run on a 3% low melt agarose gel. The final NET-seq library running at ∼150 bp was extracted and further purified using the ZymoClean Gel DNA recovery kit (Zymo Research D4007). Sample purity and concentration were assessed on a 2200 TapeStation, and libraries were then deep sequenced on an Illumina HiSeq 2500 platform.
Processing and alignment of NET-seq reads
All NET-seq FASTQ files were processed using custom Python scripts (https://github.com/BradnerLab/netseq). The sequencing reads were aligned to the mouse reference genome (mm10) using the STAR (v 2.7.2b) aligner60. PCR duplicates and reads arising from RT bias were also removed. Reads mapping exactly to the last nucleotide of each intron and exon (splicing intermediates) were further removed from the analysis. The final NET-seq BAM files were converted to bigwig, separated by strand, and normalized to SIRV spike-in controls using deepTools (v 3.0) with “--Offset 1” to record the position of the 5’ end of the sequencing read, which corresponds to the 3’ end of the nascent RNA. NET-seq tags sharing the same or opposite orientation with the TSS were assigned as “sense” and “antisense” tags, respectively. Promoter-proximal regions were carefully selected for analysis to ensure that there is minimal contamination from transcription arising from other transcription units. Genes overlapping within a region of 2 kb upstream of the TSS were removed from the analysis.
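The position and orientation rules above can be illustrated with a small sketch. It is not taken from the custom scripts cited above: it assumes alignments are given as 0-based half-open intervals and that the strand recorded for each read has already been resolved to the strand of the nascent RNA. Both function names are hypothetical.

```python
def five_prime_position(start, end, strand):
    """Return the 0-based genomic position of a read's 5' end.

    `start` and `end` describe the half-open alignment interval [start, end).
    """
    if strand == "+":
        return start
    if strand == "-":
        return end - 1
    raise ValueError("strand must be '+' or '-'")


def tag_orientation(read_strand, gene_strand):
    """Classify a tag relative to the direction of transcription from the TSS."""
    return "sense" if read_strand == gene_strand else "antisense"


# A hypothetical 75 bp read aligned to positions 100-174.
plus_position = five_prime_position(100, 175, "+")
minus_position = five_prime_position(100, 175, "-")
```

Recording a single position per read in this way is what gives the strand-separated tracks their single-nucleotide resolution.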
RNAPII pausing index calculation
The RNAPII pausing index was determined by dividing the coverage in the region −30 to +250 bp around the transcription start site by the coverage in the region from +300 bp downstream of the TSS to the transcription end site. The analysis was performed for non-overlapping protein-coding genes displaying a detectable RNAPII signal (RPM > 0.1) at the promoter-proximal region.
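As a worked illustration of this ratio, the sketch below computes a pausing index from a per-base coverage vector. It assumes that "coverage" means mean signal per base in each window (so that windows of different length are comparable) and that the vector is ordered in the direction of transcription; the function name, index arguments and toy values are hypothetical.

```python
def pausing_index(coverage, tss_index, tes_index,
                  promoter=(-30, 250), body_start=300):
    """Mean promoter-proximal coverage divided by mean gene-body coverage.

    `coverage` is per-base signal in the direction of transcription;
    `tss_index` and `tes_index` are positions of the TSS and TES in that list.
    """
    p_lo = tss_index + promoter[0]
    p_hi = tss_index + promoter[1]
    b_lo = tss_index + body_start
    if p_lo < 0 or b_lo >= tes_index or tes_index > len(coverage):
        raise ValueError("windows fall outside the coverage vector")
    promoter_signal = coverage[p_lo:p_hi]
    body_signal = coverage[b_lo:tes_index]
    promoter_density = sum(promoter_signal) / len(promoter_signal)
    body_density = sum(body_signal) / len(body_signal)
    if body_density == 0:
        return float("inf")  # no detectable gene-body signal
    return promoter_density / body_density


# Toy gene: 4.0 per base over the 280 bp promoter window, a 50 bp gap,
# then 0.5 per base over a 1 kb gene body.
toy_coverage = [4.0] * 280 + [0.0] * 50 + [0.5] * 1000
pi_value = pausing_index(toy_coverage, tss_index=30, tes_index=1330)
```

In this toy case the index is 4.0 / 0.5 = 8, and a shift of RNAPII from the promoter window into the gene body would lower it.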
Acknowledgements
We thank the Boyer lab, Seychelle Vos, Eliezer Calo and Craig Peterson for helpful discussions and insightful comments on the manuscript. This work was supported by NIGMS R01-GM134734 to I.C., NHLBI R01-HL140471 to L.A.B. and the Koch Institute Core Grant P30-CA14051 from the National Cancer Institute.