SUMMARY
Direct determination of RNA structures and interactions in living cells is critical for understanding their functions. Current crosslinking and proximity-ligation approaches are fundamentally limited by inefficient RNA crosslinking and purification and by high levels of photochemical damage. Here we present PARIS2 (psoralen analysis of RNA interactions and structures, second generation), a re-invented method for capturing RNA duplexes in cells with three orders of magnitude higher efficiency. PARIS2 captures ribosome small subunit (SSU) binding sites on mRNAs, reporting translation status on a transcriptome-wide scale, and captures spliceosomal snRNP binding sites on various RNA targets. We determine the RNA genome structure of enterovirus D68, a re-emerging viral pathogen associated with severe neurological symptoms, and discover alternative conformations in the internal ribosome entry site (IRES) that controls translation initiation. Together, these results reveal new aspects of RNA photochemistry and enzymology, and enable highly efficient interrogation of the RNA structurome and interactome in cells.
INTRODUCTION
RNA structures and interactions play important roles in many cellular processes, including the carrying of genetic information, catalysis and regulation of gene expression1. However, the vast majority of RNA molecules are too large and flexible for structure analysis using physical methods such as X-ray crystallography, NMR and cryo-EM2. Base pair stacking is the dominant force in RNA structures and RNA-RNA interactions; therefore, direct determination of base pairs is a critical step towards decoding the structural basis of RNA-mediated regulation in cells.
Recently, we and others developed new approaches to determine RNA base pairs, based on the principle of crosslinking, proximity ligation and high throughput sequencing3–6. These methods, including PARIS, SPLASH, LIGR-seq and COMRADES, allowed direct analysis of RNA duplexes at the transcriptome level, achieving single molecule accuracy and near base pair resolution. Application of these methods has led to new insights into the mechanisms and functions of cellular and viral RNAs, such as modular architectures of long noncoding RNAs, and dynamic structures/interactions of RNA virus genomes3, 4.
Despite over 50 years of research on nucleic acid crosslinking, our understanding of the physical, chemical and enzymatic properties of “crosslink-ligation” methods remains limited. The solubility of the commonly used psoralen AMT (aminomethyl trioxsalen) is low, limiting its crosslinking efficiency. Ultraviolet (UV) crosslinking and reversal induce severe damage to RNA7, impeding reverse transcription. We discovered that crosslinked RNA cannot be recovered efficiently from cells using the classical AGPC (acid guanidine thiocyanate phenol chloroform) aqueous-organic phase separation method (commercially known as TRIzol, QIAzol, etc.) or silica-based solid phase extraction methods8, 9. Given the low efficiency, several methods have been developed to enrich crosslinked fragments, including the native-denatured two-dimensional (ND2D) gel, biotin tagging and RNase R treatment; however, these approaches are often expensive and inefficient2.
We have now systematically investigated the basic physics and chemistry of the major steps in crosslink-ligation methods for RNA duplex discovery, and developed a new generation of the PARIS method (PARIS2). In particular, we identify amotosalen as a significantly more efficient crosslinker than the commonly used psoralen AMT, owing to its higher solubility. We discover that crosslinking increases RNA hydrophobicity, and design a new method, total nucleic acid extraction (TNA), to purify crosslinked RNA, enabling targeted analysis of RNAs using antisense enrichment. We develop a denatured-denatured 2D (DD2D) gel system for isolation of pure crosslinked RNA without the need for tagging the crosslinker. We introduce new chemical and enzymatic approaches to prevent and bypass photochemical damage to RNA. Together, these optimizations in PARIS2 result in >4000-fold increased efficiency, and importantly, the individual improvements will also find broad use in RNA research. Applying PARIS2, we discover that crosslinked RNA fragments can report the translation status of mRNAs, and profile global snRNP binding sites. We also use PARIS2 to determine the genome architecture of enterovirus D68 (EV-D68), an important re-emerging pathogen associated with severe neurological symptoms, and discover novel structural conformations in its genome. The new PARIS2 method will enable more rapid and facile analysis of the structural basis of RNA functions in various biological systems.
RESULTS
Overview of the PARIS2 strategy and major improvements
The crosslinking and proximity ligation-based principle for RNA secondary structure and interaction analysis relies on the successful completion of multiple reaction and extraction steps (Fig. 1a). The process starts with psoralen crosslinking in live cells, followed by RNA extraction and fragmentation, isolation of crosslinked from non-crosslinked fragments, proximity ligation, crosslink reversal, adapter ligation, reverse transcription and finally cDNA amplification. In this study, we performed a systematic analysis of each step and made improvements based on the newly discovered physical, chemical and enzymatic properties of RNA reactions and extractions (summarized in Fig. 1b). The new improvements include (1) the highly soluble and highly efficient crosslinker amotosalen, (2) complete extraction of crosslinked RNA, (3) simplified RNA fragmentation using RNase III, (4) DD2D gel selection of crosslinked RNA, (5) optimized adapter ligation, and (6) prevention and bypass of photochemical RNA damage. Together these changes lead to a >4000-fold improvement in the efficiency of PARIS2 (Supplementary Tables 1,2). Major optimizations are presented below, whereas additional details are in the Supplementary Notes.
Highly soluble psoralen amotosalen increases RNA crosslinking efficiency
The most commonly used psoralen AMT is only soluble at 1mg/ml in aqueous solutions, limiting its crosslinking efficiency10 (Fig. 2a). Amotosalen, a previously reported psoralen derivative, is soluble at above 50mg/ml, and its activity in virus inactivation is similar to AMT at the same concentration11 (Fig. 2a). We designed a new 3-step method to synthesize amotosalen-HCl and found it soluble at 230mg per ml in water and >100mg per ml in phosphate buffered saline (PBS) (Supplementary Fig. 1). AMT and amotosalen crosslink DNA oligo duplexes in vitro with similar efficiency at the same concentration (Supplementary Fig. 2). At higher concentrations, amotosalen is more efficient.
We discovered that crosslinking repartitions large RNA from the aqueous phase to the interphase during standard AGPC (TRIzol) extraction (Fig. 2b). The migration of large RNAs from the aqueous phase to the interphase serves as an indicator of crosslinking efficiency. We crosslinked cells with AMT or amotosalen at various concentrations and extracted RNA from the aqueous phase of the TRIzol-chloroform mixture. Both total RNA yield and the percentage of 18S+28S rRNAs were reduced with higher concentrations of psoralens, suggesting higher crosslinking efficiency (Fig. 2c).
To accurately measure crosslinking efficiency, we recovered crosslinked intact RNA using the newly invented TNA method (Fig. 2d), fragmented RNA using an optimized RNase III protocol, and extracted crosslinked but not monoadduct fragments using the new DD2D gel system (see details later). Total crosslinked fragments (including both the 1D-2D interface and the 2D upper diagonal) increased from 0.82% to 4.65%, roughly 5.7-fold, after a 10-fold increase in psoralen concentration (0.5mg/ml AMT vs. 5mg/ml amotosalen). Even at the same concentration (0.5mg/ml), amotosalen is more efficient than AMT. The RNA duplexes captured by AMT and amotosalen are similar (Supplementary Fig. 2c-e). Therefore, we identified amotosalen as a more efficient nucleic acid crosslinker.
Phase partition and extraction of crosslinked RNA
In the classical AGPC method for RNA extraction, the mixture of guanidine thiocyanate (GuSCN), phenol and chloroform forms two phases8. RNA partitions to the aqueous phase at pH below 5, proteins partition to the interphase and organic phase, while DNA partitions to the interphase (Fig. 3a-b, Supplementary Note 1). In contrast, the standard PCI (phenol, chloroform and isoamyl alcohol) extraction of DNA employs a higher pH (~8.0) to bring both DNA and RNA to the aqueous phase. While applying the AGPC method (commercial name TRIzol, etc.), we noticed that crosslinked cells cannot be completely dissolved, and RNA yield was greatly reduced9 (Supplementary Fig. 3a). Proteinase K (PK) treatment was necessary but insufficient for improving RNA yield (by only ~10%). We previously used PK and RNase digestion in lysate to recover crosslinked RNA9; however, recovery was still incomplete. Furthermore, fragmentation prior to purification makes it difficult to enrich specific RNAs using antisense oligos.
We suspected that crosslinked RNA may be more hydrophobic and therefore partition to the interphase. To test this possibility, we crosslinked pure total RNA with AMT and then extracted RNA using direct ethanol precipitation (Fig. 3c). Alternatively, we used the AGPC method (TRIzol), where RNA from the aqueous phase, interphase and organic phase was precipitated separately. Crosslinking induced a broad smear that spans beyond the largest 28S peak. While direct ethanol precipitation recovered all RNA, the aqueous phase of the TRIzol-chloroform mixture contained only RNA in the 50-300nt range (e.g. tRNAs, snRNAs, snoRNAs) and sharp non-crosslinked 18S and 28S peaks (Fig. 3c-d). These results suggested that crosslinking increased RNA hydrophobicity, making it difficult to extract using the classical AGPC method. In most previous studies, the inefficient recovery of crosslinked RNA has likely resulted in significant bias because larger and heavier crosslinked RNA molecules are lost.
To recover total RNA after crosslinking, we developed a new method termed total nucleic acid extraction (TNA). Briefly, cells are first lysed in 6M GuSCN to completely inhibit nucleases. The lysate is diluted to reduce the GuSCN concentration, buffered, treated with EDTA to chelate divalent cations, and digested with PK to remove proteins, which also inactivates nucleases. Total nucleic acids are then precipitated using phenol and alcohol. Phenol keeps residual proteins in solution so that only nucleic acids precipitate. DNA, which makes up about 50% of all nucleic acids, is removed using DNase, leaving intact RNA (Supplementary Fig. 3c). Carmustine and chlorambucil, two cancer chemotherapy drugs that crosslink nucleic acids, also promoted the partition of RNA to the interphase, and crosslinked RNA was successfully recovered using the TNA method (Supplementary Figs. 4-5). TNA also outperforms both AGPC and solid-phase methods by at least 6-fold (Supplementary Fig. 6, Supplementary Tables 1,2). These results suggest that crosslink-induced hydrophobicity is a general property of crosslinked RNA. Therefore, the TNA method is generally applicable to crosslinking studies and enables targeted antisense enrichment from intact total RNA (see details later).
Efficient isolation of crosslinked RNA using a DD2D gel system
To obtain short crosslinked RNA fragments, we developed a simplified one-step RNase III protocol that takes advantage of the digestion kinetics (Supplementary Fig. 7, Supplementary Note 2). Given the low efficiency of psoralens, crosslinked fragments need to be enriched for sequencing. Biotin-conjugated psoralens have been used to enrich RNA after crosslinking, but these methods also enrich monoadducts, which are more abundant than crosslinks4, 6. Tag-based purification requires custom synthesis and cannot be quickly adapted for other crosslinkers. RNase R depletion of non-crosslinked RNA is also impeded by monoadducts5. We initially used an ND2D gel system to isolate crosslinked RNA without monoadduct contamination9, 12; however, this method suffers from low resolution and low yield (Supplementary Fig. 8). The diagonal in the second dimension is broad, partially masking crosslinked fragments above the diagonal. We developed a new DD2D gel system that takes advantage of the differential migration of crosslinked vs. non-crosslinked RNA at different gel concentrations during electrophoresis (Supplementary Fig. 8 and Supplementary Note 3). The DD2D method has higher resolution and consistency, recovers more crosslinked fragments (>1.5 fold vs. ND2D), and does not rely on the base pairing of the crosslinked RNA for separation. RNA duplexes crosslinked with other compounds can also be separated from non-crosslinked RNA, suggesting that DD2D is a broadly applicable method (Supplementary Figs. 4-5).
Protection of RNA against UVC-induced photochemical damage
Photochemical crosslinking (psoralen + UVA, or 365nm) and reversal (UVC, 254nm) enable in vivo analysis of RNA duplexes, but also cause many types of damage. Together with the low efficiency of proximity ligation, the damage blocks reverse transcription and reduces both the total cDNA yield and the percentage of gapped reads (Fig. 4a). UVC irradiation induces pyrimidine dimers and other damage via the singlet excited state, even after very short exposure7, 13 (Fig. 4b). Earlier studies showed that UVC-induced DNA damage can be prevented by singlet state quenchers, but such approaches did not work well for RNA14, 15. To prevent, repair or bypass UVC-induced RNA damage, we systematically screened a variety of conditions, including intercalating dyes and solvents that act as singlet quenchers (Fig. 4c, Supplementary Figs. 9-11 and Supplementary Note 4). Superscript IV (SSIV) reverse transcriptase outperforms other enzymes on UVC-damaged RNA, increasing yield by 7-fold over SSIII (Supplementary Fig. 9). Acridine orange (AO) and ethidium bromide (EB) at high concentrations can protect both normal and psoralen-crosslinked RNA from UVC irradiation. AO effectively protects non-crosslinked RNA even after 30min of UVC irradiation (at 4mW per cm2), after which 30% of the RNA remains intact, vs. 0.5% in the absence of AO (Fig. 4c, upper panel). For crosslinked RNA, there is simultaneous UVC-induced reversal and damage, yet AO still protects RNA effectively (Fig. 4c, lower panel). Importantly, the singlet quenchers did not block crosslink reversal, making it possible to apply them in PARIS-like experiments (Supplementary Fig. 10i-l). Together these studies demonstrate for the first time that UVC-induced RNA damage can be largely prevented by high concentrations of singlet quenchers. After proximity ligation and UVC reversal of crosslinks, the RNA samples are ligated with adapters for reverse transcription and library preparation. We also optimized the adapter ligation step using synthetic oligos (Supplementary Fig. 12).
Bypass of oxidative damages in reverse transcription
In addition to crosslinking pyrimidines, photosensitized psoralens also induce oxidative damage to RNA, primarily affecting guanines through direct electron transfer and elicitation of oxygen16 (Supplementary Fig. 13). To minimize the adverse effects of this damage, we systematically screened conditions to prevent, repair or bypass it (Supplementary Figs. 13-15, and Supplementary Note 5). Certain oxidant scavengers, such as vitamin C, Tiron and MnTBAP17, 18, reduced PUVA-induced oxidation, but also blocked crosslinking, due to their common energetic precursors (Supplementary Fig. 14). RNA damage impedes reverse transcription by trapping the enzyme in an inactive state19, 20. We reasoned that conditions that promote enzyme conformational dynamics, or longer incubation time, may overcome such barriers. Indeed, several conditions, including SSIV, the cofactor Mn2+ and longer incubation time, dramatically increased cDNA yield, alone or in combination, in both primer extension assays and qRT-PCRs (Fig. 4d-i, Supplementary Fig. 15). SSIV outperforms all other reverse transcriptases (Fig. 4f, h and Supplementary Fig. 15f). Mn2+ is better than Mg2+ for reverse transcription (Fig. 4g, i and Supplementary Fig. 15h-i). The longer reaction time is particularly effective in promoting the bypass of damaged bases (Fig. 4i and Supplementary Fig. 15h-i). Together, these conditions for SSIV improved the bypass of PUVA-induced damage by 8-70 fold over SSIII (Supplementary Tables 1-2).
PARIS2 enables highly efficient and sensitive detection of RNA duplexes in cells
After optimizing all individual steps, we tested their performance in the new PARIS2 workflow. Starting from the same number of cells, the 5mg/ml amotosalen crosslinking, TNA extraction and DD2D gel isolation improved the yield of crosslinked RNA fragments by ~60-fold over the standard AMT-TRIzol-ND2D protocol. Starting from the same amount of crosslinked RNA fragments (after the DD2D gel step), the DNA library yield was improved ~76-fold (Fig. 1, Supplementary Tables 1-2). Together the improvements resulted in a total of >4000-fold increase in efficiency. We applied PARIS2 with oligo(dT) enrichment of cellular RNA from crosslinked HEK293 cells and mouse brain tissues, and with antisense enrichment of viral RNAs (Figs. 5, 6). For oligo(dT)-enriched RNA, we were able to model structures of abundant mRNAs even with only ~1M gapped reads (Supplementary Fig. 16).
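The overall figure follows from treating the two measured gains as independent, multiplicative stages. A minimal Python sketch of that arithmetic, using the approximate fold changes quoted above (the function and variable names are illustrative only and are not part of the PARIS2 pipeline):

```python
# Approximate stage-wise gains quoted in the text.
fragment_yield_gain = 60  # crosslinked fragment yield vs. AMT-TRIzol-ND2D, same cell input
library_yield_gain = 76   # DNA library yield from the same amount of crosslinked fragments


def combined_gain(*stage_gains):
    """Multiply per-stage fold improvements, assuming the stages are independent."""
    total = 1
    for gain in stage_gains:
        total *= gain
    return total


total_gain = combined_gain(fragment_yield_gain, library_yield_gain)
print(f"combined improvement: ~{total_gain}-fold")
```

Because both inputs are rounded estimates, the product should be read as an order-of-magnitude figure, consistent with the >4000-fold stated above.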
PARIS2 enables profiling of ribosome SSU binding across the transcriptome
During translation, mRNAs directly contact the 18S ribosomal RNA (rRNA) in the small subunit (SSU), and some of these contacts can be crosslinked by UV alone21 (Fig. 5a-b). We reasoned that capturing the mRNA-rRNA interactions may allow direct analysis of translation. To analyze interactions with multicopy genes, such as those encoding rRNAs, spliceosomal snRNAs, etc., we designed reference genomes in which multicopy genes are masked and single copies are added back (Supplementary Figs. 17-19). After mapping HEK293 mRNA PARIS2 data to the engineered references, we extracted chimeric reads connecting cytoplasmic rRNAs and mRNAs. The 18S rRNA regions crosslinked to mRNAs are limited to helices 18 and 26 (h18 and h26), with a minor peak on h44 (Fig. 5c). Both h18 and h26 are in the mRNA channel (Fig. 5a,b and Supplementary Fig. 20a,b). Such specific rRNA binding was not observed for other abundant RNAs, like the mitochondrial rRNAs and snRNAs, suggesting that the interaction was captured during translation. We then analyzed 18S h18/h26 binding sites on mRNAs (Fig. 5d). The strongest binding is with the coding sequence (CDS), followed by the 5’UTR, while the 3’UTR has little binding. The highest peak is right next to the start codon. Interestingly, the h26 binding peak precedes that of h18, consistent with their locations in the mRNA channel, with h18 near the entry and h26 near the exit. The distance between the two peaks is around 40-50nt, slightly longer than the ribosome footprint, likely due to the random fragmentation used in PARIS2. The mRNA-rRNA crosslinking could be a result of dynamic flipping of the h18 and h26 bases that transiently pair with mRNAs, which is also necessary for the direct UVC crosslinking reported in earlier studies21. The binding in the 5’UTR but not the 3’UTR may represent the scanning phase of translation initiation, which has been previously captured in translation complex profiling22. This is different from standard ribosome profiling, where ribosome binding to the 5’UTR is limited to uORFs23.
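The logic of this analysis, selecting chimeric reads with one arm on 18S rRNA and one on an mRNA and then positioning the mRNA arm relative to the start codon, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pipeline used here: the `Arm` record, the function names, the binning scheme, and all coordinates in the toy example (including the helix windows) are hypothetical placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Arm:
    """One arm of a gapped (chimeric) read: reference name and 0-based half-open interval."""
    rna: str
    start: int
    end: int


def assign_helix(arm, helix_windows):
    """Return the name of the first helix window the arm overlaps, or None."""
    for name, (lo, hi) in helix_windows.items():
        if arm.start < hi and arm.end > lo:
            return name
    return None


def start_codon_profile(chimeras, rrna_name, helix_windows, cds_start, bin_size=10):
    """Count mRNA arm midpoints, binned by offset from the start codon, per rRNA helix."""
    profile = {name: Counter() for name in helix_windows}
    for left, right in chimeras:
        # Keep only rRNA-mRNA chimeras, whichever arm comes first.
        if left.rna == rrna_name and right.rna in cds_start:
            rrna_arm, mrna_arm = left, right
        elif right.rna == rrna_name and left.rna in cds_start:
            rrna_arm, mrna_arm = right, left
        else:
            continue
        helix = assign_helix(rrna_arm, helix_windows)
        if helix is None:
            continue
        midpoint = (mrna_arm.start + mrna_arm.end) // 2
        offset = midpoint - cds_start[mrna_arm.rna]
        # Label each bin by its lower edge; negative offsets lie upstream of the start codon.
        profile[helix][(offset // bin_size) * bin_size] += 1
    return profile


# Toy data: the helix windows and positions below are placeholders, not real 18S coordinates.
helix_windows = {"h18": (100, 150), "h26": (300, 350)}
cds_start = {"mRNA_A": 200}
chimeras = [
    (Arm("18S", 110, 130), Arm("mRNA_A", 220, 250)),  # h18 arm, mRNA arm downstream of start
    (Arm("mRNA_A", 170, 200), Arm("18S", 310, 330)),  # h26 arm, mRNA arm upstream of start
    (Arm("18S", 500, 520), Arm("mRNA_A", 220, 250)),  # rRNA arm outside both windows: skipped
    (Arm("U1", 0, 20), Arm("mRNA_A", 220, 250)),      # not an rRNA-mRNA chimera: skipped
]
profile = start_codon_profile(chimeras, "18S", helix_windows, cds_start)
for helix, counts in profile.items():
    print(helix, dict(counts))
```

Summing such per-transcript profiles over many mRNAs would give a meta-profile of the kind described above, in which the relative positions of the h18 and h26 peaks can be compared.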
Similar patterns of rRNA-mRNA interactions were observed in individual mRNAs and in mouse brain oligo(dT) enriched RNAs, confirming the specificity of these interactions (Supplementary Fig. 20c-g). We also analyzed our previous total RNA PARIS data in HEK293T cells and mouse ES cells (Supplementary Fig. 20f-g). Similar specific interactions were observed, despite elevated background. Together, we demonstrate PARIS2 as a powerful alternative method that enables direct analysis of mRNA translation.
PARIS2 enables global profiling of snRNP targets
The spliceosomal snRNPs have multiple functions beyond splicing24–28. Accurate analysis of their binding sites across the transcriptome is necessary for mechanistic studies of these RNP complexes. Using the engineered genome references containing only single copies of snRNAs, we determined the interactions between snRNAs and other RNAs in both total and oligo(dT)-enriched RNAs. Analysis of total RNA PARIS data revealed extensive interactions of snRNAs, especially U1 and U2, with other RNAs (Fig. 5e-f, Supplementary Fig. 21). Meta-analysis of snRNA binding sites on intron-containing RNAs revealed major peaks at the expected locations, including the 5’ splice site for U1 and the branch site for U2 (Fig. 5g). Interestingly, such interactions were also captured in the polyA-enriched samples, suggesting that at least some of the interactions persist after polyadenylation (Supplementary Fig. 21b, c). Many U1 binding sites reside in exons, far away from the splice sites, consistent with our earlier studies9, 24 (Supplementary Fig. 21d-i). Surprisingly, we identified strong U1-XIST and U6-MALAT1 interactions, suggesting as yet unknown functions of these snRNAs (Supplementary Fig. 21d, j-k). Together, these studies demonstrate the power of PARIS2 for global analysis of snRNP binding sites.
PARIS2 determines the genome structure of the enterovirus EV-D68
The genomes of RNA viruses carry the genetic information, and at the same time fold into complex structures to regulate multiple steps of their infection life cycles. However, the direct analysis of viral genome structures and interactions in cells remains challenging. We applied PARIS2 to EV-D68, an RNA virus, whose recent global outbreaks have been associated with severe respiratory symptoms and acute flaccid paralysis, which resembles poliomyelitis29 (Fig. 6a). After in vivo amotosalen crosslinking of HeLa cells infected with the EV-D68 strain US/MO/14-18947 (US47), we enriched the ~7300nt viral RNA genome using biotinylated antisense probes for PARIS2 library construction (Fig. 6a). With ~100-fold enrichment of viral RNA, we were able to obtain full coverage of the genome using ~400,000 reads, where a quarter of them map to the virus (Supplementary Fig. 22a-e). The gapped reads revealed a complex global architecture, with extensive long-range structures (Supplementary Fig. 22f).
Using the PARIS2-derived duplex groups, we confirmed the previously predicted 5’UTR secondary structure, including the 5’CL (cloverleaf) and IRES, which play critical roles in replication and translation initiation, respectively30, 31 (Fig. 6b-c, Supplementary Fig. 22g-k). The IRES structure model consists of five domains designated II-VI. All domains are supported by multiple sequence alignments of ~500 complete EV-D genomes (Supplementary Fig. 23). Several motifs within the IRES, two GNRA tetraloops in domains IV and V and a pyrimidine-rich track (Yn) between domains V and VI, are essential for the recruitment of translation initiation factors32. These motifs are clearly identified on the PARIS2-derived structure model (Fig. 6c).
Interestingly, besides the proposed IRES structure, we also identified alternative structures in which domain V and/or VI adopt significantly different conformations (Fig. 6b, c-e). These alternative conformations are supported by numbers of gapped reads similar to those of the known structure domains, suggesting that they are abundant in cells. All alternative structures are also supported by multiple sequence alignments among sequenced viral genomes in the EV-D species (Supplementary Fig. 24), indicating that these dynamic structures may play important roles in EV-D68, probably in translation initiation. Using the PARIS2-derived structure models as guides, we analyzed structure conservation in three additional major enterovirus species, A-C, for which large numbers of whole genome sequences are available (Fig. 6f, Supplementary Figs. 25-28). Alternative conformations V-a1, V-a2 and VI-a showed conservation comparable to some of the previously proposed domains (II, III and V). Together, the combined PARIS2 and phylogenetic analyses revealed a dynamic model of the IRES structure, setting the stage for further functional studies.
DISCUSSION
The systematically reinvented PARIS2 method is highly efficient and sensitive, overcoming many of the fundamental bottlenecks in current crosslink-ligation based methods for RNA structure/interaction analysis. In particular, we report the identification of the highly soluble psoralen derivative amotosalen as a crosslinker superior to the commonly used AMT. We discover abnormally high hydrophobicity in crosslinked RNA that renders the classical AGPC RNA extraction method inefficient, and develop a new method capable of complete RNA recovery that is generally applicable to crosslinking studies. The full recovery of intact crosslinked RNA enables targeted analysis of structures and interactions, as demonstrated in two applications on cellular and viral RNAs. The newly developed DD2D gel method is robust and isolates crosslinked RNA fragments with high purity, outperforming alternative approaches. To our knowledge, the TNA and 2D gels are the only methods to completely and specifically recover total and crosslinked RNA, respectively. Our in-depth analysis of the photochemical damage in RNA during both the crosslinking and reversal steps has led to a new understanding of these processes. Furthermore, we introduced new chemical and enzymatic approaches that significantly improved the prevention and bypass of this damage, solving long-standing problems in the photochemistry field.
In addition to dramatically improving the PARIS method, the newly developed photochemical and enzymatic approaches are generally applicable in molecular biology. For example, the TNA extraction and DD2D gel system are generally applicable to all types of crosslinkers. The prevention and bypass of photochemical damage will be useful in many RNA experiments, since UV irradiation is a commonly used technique. For example, UV crosslinking of RNA-protein interactions also results in damage that reduces cDNA yield and confounds the subsequent analysis of crosslink sites33.
Despite these major improvements, there are still several steps that will benefit from further optimizations. For example, faster reacting crosslinkers will enable the analysis of more dynamic structures in vivo. The proximity ligation only produces ~10% gapped reads; more efficient ligation will greatly increase the percentage of useful reads and the sensitivity. Further improvement of the prevention, repair and/or bypass of photochemical damages may provide additional benefits, including even higher yield and higher percentages of gapped reads.
The surprising discovery of increased RNA hydrophobicity after crosslinking is reminiscent of crosslinked RNA-protein complexes that partition to the interphase34–37; however, it represents a different mechanism. Here the RNA structure itself, coupled with the low pH, seems to be the determinant of hydrophobicity, in contrast to RNA-protein crosslinks where the nonpolar amino acid residues control the hydrophobic behavior. The abnormal in vitro phase partition may be relevant to the role of RNA in phase separation in vivo, even though the two occur in different physicochemical environments38, 39.
Using PARIS2 and an improved analysis pipeline, we found that the gapped reads report ribosome small subunit and spliceosomal snRNP binding sites across the transcriptome. The bias towards uridines in psoralen crosslinking may confound analysis of binding sites; however, it is unlikely to be critical given the near-uniform distribution of uridines in most mRNAs. Alternatively, a uridine-abundance-based correction can be applied to obtain unbiased measurement of SSU/snRNP binding sites. The simultaneous measurement of mRNA secondary structure and translation status in one experiment will make it possible to directly analyze the impact of RNA structures on translation. The identification of snRNP binding sites and mRNA structures on nascent RNAs will also enable analysis of the structural basis of splicing regulation.
Based on the improved PARIS2, we determined the in vivo structure of the EV-D68 RNA genome, revealing a complex global architecture and dynamic conformations, especially in the IRES that is critical for translation initiation. IRES elements in picornaviruses are particularly fascinating given their rapid evolution and great diversity among the different species. Despite over three decades of research on IRES elements, our understanding of their in vivo dynamics remains limited. Others have found that certain host factors can induce minor conformational changes in IRES structures40. The dynamic conformations are limited to domains V and VI, which bind the major translation initiation factors, including eIF3, 4B, 4G and PTBP32. The critical location and evolutionary conservation suggest as yet unknown functions of these conformations. We propose that the alternative conformations may represent different stages in the life cycle, such as translation, replication and packaging, or different stages in translation initiation. Furthermore, the transitions among the conformations may act as structural switches among the stages. The high efficiency and low cost of the re-invented PARIS2 method will enable highly multiplexed analysis of viral RNA structures and interactions in various viral strains, physiological and pharmacological conditions, and stages of life cycles. Together, PARIS2 will enable RNA structurome and interactome analysis in increasingly challenging biological systems and enable functional and mechanistic investigations of RNA-centric regulation.
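One possible form of the uridine-abundance-based correction mentioned above is sketched below in Python. It is an illustration under stated assumptions, not a procedure used in this study: the function names, the fixed window size, and the choice to rescale each window's read count by the ratio of transcript-wide to local uridine fraction are all hypothetical.

```python
def uridine_fraction(seq):
    """Fraction of uridines in a sequence (T is treated as U); 0.0 for an empty sequence."""
    seq = seq.upper().replace("T", "U")
    return seq.count("U") / len(seq) if seq else 0.0


def u_corrected_counts(seq, window_counts, window=20):
    """Rescale per-window read counts by transcript-wide / local uridine fraction.

    window_counts[i] is the raw count for seq[i*window:(i+1)*window].
    Windows without any uridine cannot be crosslinked through uridine,
    so their corrected value is reported as None rather than a number.
    """
    overall = uridine_fraction(seq)
    corrected = []
    for i, count in enumerate(window_counts):
        local = uridine_fraction(seq[i * window:(i + 1) * window])
        if local == 0 or overall == 0:
            corrected.append(None)
        else:
            corrected.append(count * overall / local)
    return corrected


# Toy example: a U-rich window followed by a U-poor window with equal raw counts.
toy_seq = "UUUUUUUUUU" + "ACGUACGAAC"
corrected = u_corrected_counts(toy_seq, [10, 10], window=10)
print(corrected)
```

In this toy case the two windows have equal raw counts, but the correction down-weights the uridine-rich window and up-weights the uridine-poor one, which is the intended direction of the adjustment.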
Author contributions
Z.L. conceived this project and designed the overall PARIS2 strategy. W.A.V. synthesized amotosalen. M.Z., C.Y. and Z.L. developed the method. M.L., R.V.D., W.H.L., M.L.C. and J.-F.C. participated in method optimizations and writing. K.L. and J.B. performed the EV-D68 studies. M.Z., K.L., J.B. and Z.L. performed the analysis. R.V.D. and Z.L. wrote the manuscript with input from all authors.
Competing interests
Z.L., M.Z. and W.A.V. are named inventors on a patent application on the method reported in this paper.
Acknowledgements
This work was supported by NIH grant R00HG009662 and a startup fund from USC to Z.L. Computation for the work described in this paper was supported by the University of Southern California’s Center for High-Performance Computing (https://hpcc.usc.edu). We also acknowledge the USC Research Center for Liver Disease (P30DK48522) and Norris Comprehensive Cancer Center (P30CA014089) for their support of our research.