ABSTRACT
Glycans modify lipids and proteins to mediate inter- and intramolecular interactions across all domains of life. RNA, another multifaceted biopolymer, is not thought to be a major target of glycosylation. Here, we challenge this view with evidence that mammalian cells use RNA as a third scaffold for glycosylation in the secretory pathway. Using a battery of chemical and biochemical approaches, we find that a select group of small noncoding RNAs including Y RNAs are modified with complex, sialylated N-glycans (glycoRNAs). These glycoRNAs are present in multiple cell types and mammalian species, both in cultured cells and in vivo. Finally, we find that RNA glycosylation depends on the canonical N-glycan biosynthetic machinery within the ER/Golgi luminal spaces. Collectively, these findings suggest the existence of a ubiquitous interface of RNA biology and glycobiology and an expanded role for glycosylation beyond canonical lipid and protein scaffolds.
MAIN
Glycans have been shown to regulate a wide array of critical biological processes, ranging from cell-cell contacts to host-pathogen interactions, and even the organization of multicellular organisms(1). In a traditionally adjacent field of study, RNA represents another biopolymer that is central to all known life. While the building blocks of RNA are canonically limited to four bases, post-transcriptional modifications (PTMs) can dramatically elaborate the chemical diversity of RNA, with >100 identified PTMs(2–4). The cellular role for RNA is more complex than that of a simple messenger. For instance, RNAs function as scaffolds, molecular decoys, enzymes, and network regulators across the nucleus and cytosol(5–7). With the exception of a few monosaccharide-based tRNA modifications(8, 9), there has been no evidence of a direct interface between these two fields of biology.
We previously developed a strategy for unbiased discovery of protein-associated glycans based on metabolic labeling and bioorthogonal chemistry(10–12). We metabolically label cells or animals with precursor sugars functionalized with a clickable azide group. Once incorporated into cellular glycans, the azidosugars enable bioorthogonal reaction with a biotin probe for enrichment, identification, and visualization. In the course of performing such experiments using an azide-labeled precursor to sialic acid, peracetylated N-azidoacetylmannosamine (Ac4ManNAz), we made the surprising finding that azide reactivity was present on highly purified RNA preparations from labeled cells. Although there is currently no precedent for a connection between sialoglycans and RNA, either direct or indirect, the fact that RNA is so broadly post-transcriptionally modified in cells motivated us to pursue further investigation.
To explore the possible existence of RNA modified with sialylated glycans (hereafter glycoRNA), we labeled HeLa cells, a human immortalized cell line derived from cervical cancer, with 100 μM Ac4ManNAz for up to 48 hours and then used a rigorous protocol to chemically and enzymatically extract RNA in high purity (Fig. 1A). To visualize azide-modified components, the RNA sample was reacted with dibenzocyclooctyne-biotin (DBCO-biotin) via Cu-free click chemistry(13), separated by denaturing gel electrophoresis, and analyzed by blotting (Fig. 1B). In an Ac4ManNAz- and time-dependent manner, we observed biotinylated species in the very high (>10 kilobases) molecular weight (MW) region. It has recently been reported that high doses of azidosugars can produce non-enzymatic protein labeling(14); however, in vitro incubation of total RNA with up to 20 mM Ac4ManNAz did not produce the previously observed biotinylated species on RNAs in the high MW region (fig. S1A). Minor background in vitro labeling was apparent on the 28S rRNA, which can also be seen more variably in some Ac4ManNAz-labeled cellular RNA experiments (e.g. Fig. 1B), but no such background labeling was observed in the putative glycoRNA species.
Further, treatment of RNA from Ac4ManNAz-labeled HeLa cells with DNase did not affect the glycoRNA signal, while treatment with an RNase cocktail (A and T1) efficiently digested the total RNA as well as the biotinylated glycoRNA (Fig. 1C). This effect required RNase enzymatic activity, as pre-blocking of the RNases with an inhibitor, SUPERaseIn, completely rescued the biotinylated glycoRNA (Fig. 1C). GlycoRNA was also sensitive to exonucleases, such as RNaseR (3’-5’ exo) and Terminator nuclease (5’-3’ exo), in a manner proportional to the amount of total RNA each enzyme was able to degrade (fig. S1B). Thus, cells treated with Ac4ManNAz enzymatically incorporate the azide label into cellular RNA, which migrates as high MW species.
Using the same metabolic labeling approach, we looked for the presence of glycoRNA in other cell types and in animals. Human embryonic stem cells (H9), a human myelogenous leukemia line (K562), a human lymphoblastoid cell line (GM12878), a mouse T-cell acute lymphoblastic leukemia cell line (MYC T-ALL 4188), and a Chinese hamster ovary cell line (CHO) all showed evidence of glycoRNA (figs. S1C and S1D). In particular, H9 and 4188 cells showed significantly more labeling with Ac4ManNAz per mass of total RNA than other cell types (figs. S1C and S1D). Next, we assessed whether this labeling could occur in vivo. To this end, we performed intraperitoneal injections of Ac4ManNAz into mice for 2, 4, or 6 days(15). In the liver and spleen, the organs that yielded enough total RNA for analysis, we observed dose-dependent and RNase-sensitive Ac4ManNAz labeling of RNAs in the same MW region as glycoRNAs from cultured cells (Fig. 1D). These data suggest that glycoRNA is not an artifact of tissue culture and occurs broadly across multiple cell and tissue types, albeit at various abundances.
Across all cell types and organs tested, glycoRNA was found to migrate very slowly by denaturing agarose gel electrophoresis (Fig. 1). We hypothesized that, if glycoRNAs are indeed large RNAs, they would likely be polyadenylated (poly-A). However, we were consistently unable to purify glycoRNA from extracted RNA via poly-A enrichment (Fig. 2A). This was not due to cleavage of the glycoRNA during the poly-A enrichment procedure (fig. S2A). To address this, we used a fractionation strategy that leverages length-dependent RNA precipitation and binding to silica columns to separate out “large” (>200 nts) from “small” (<200 nts) transcripts. To our surprise, the glycoRNA fractionated exclusively with the small RNA population of the total RNA (Fig. 2B). To validate this observation with an independent fractionation strategy, we applied Ac4ManNAz-labeled RNA to a sucrose gradient and analyzed the distribution of total RNA via SybrGold staining and glycoRNA. The sucrose gradient efficiently separated the major visible RNAs such as small RNAs/tRNA, 18S rRNA, and 28S rRNA (Fig. 2C and fig. S2B). The glycoRNA signal was specifically present in the small RNA fractions, but still demonstrated extremely slow migration (high apparent MW) in the agarose gel (Fig. 2C).
To identify the glycosylated transcripts, we leveraged the sucrose gradients to isolate only the small RNA fractions from Ac4ManNAz-labeled H9 and HeLa cells. RNA sequencing libraries were generated from small RNAs (input) as well as glycoRNAs that were enriched after streptavidin pulldown. As expected, the distribution of RNA transcripts was biased for small non-polyadenylated RNAs (Fig. 2D, fig. S3A, and table S1). Quality controls revealed that biological replicates had high concordance, with glycoRNA-enriched and input samples clustering away from each other (fig. S3B and S3C). For each small RNA family, an enrichment was calculated to assess the relative distribution of reads in the glycoRNA-enriched vs input samples (Fig. 2D). Ribosomal RNA and tRNAs were depleted or poorly enriched, except for the 5.8S rRNA, in the glycoRNA pool (Fig. 2D). Messenger RNAs and other noncoding RNA transcripts were poorly enriched (ranging from 1.2–2x over input). However, small nuclear and small nucleolar RNAs (sn/snoRNAs) were the second most enriched class at 3.8-4x over input (Fig. 2D). Finally, the Y RNA family stood out as the single best represented set of small RNAs at approximately 22-fold enriched over input (Fig. 2D). Notably, despite the differences in cell type and culture conditions, HeLa and H9 glycoRNA enrichment results were highly similar (Fig. 2D).
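The per-family enrichment described above can be sketched as a ratio of read fractions between the streptavidin-enriched and input libraries. This is a minimal illustration and not the analysis pipeline used in this study: the function name `family_enrichment`, the normalization of each library to its own total read count, and the toy read counts are all assumptions made for the example.

```python
def family_enrichment(enriched_counts, input_counts):
    """Return fold enrichment (enriched vs. input) for each RNA family.

    Assumption: each library is normalized to its own total read count,
    and enrichment is the ratio of the two resulting read fractions.
    """
    enriched_total = sum(enriched_counts.values())
    input_total = sum(input_counts.values())
    result = {}
    for family, input_reads in input_counts.items():
        if input_reads == 0:
            continue  # the ratio is undefined without input reads
        enriched_fraction = enriched_counts.get(family, 0) / enriched_total
        input_fraction = input_reads / input_total
        result[family] = enriched_fraction / input_fraction
    return result


# Hypothetical read counts, for illustration only (not data from this study).
toy_input = {"Y RNA": 100, "snoRNA": 400, "tRNA": 500}
toy_enriched = {"Y RNA": 500, "snoRNA": 400, "tRNA": 100}
toy_result = family_enrichment(toy_enriched, toy_input)
```

With these toy counts, a family whose share of reads rises in the enriched library receives a value above 1, and a depleted family receives a value below 1.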
Most snRNAs and snoRNAs were enriched to some degree, with U3, U6, and U8 ranking among the most highly enriched in both HeLa and H9 cells (fig. S3D). Among the Y RNAs, Y5 was the most well represented (fig. S3D). Finally, considering all the individual snoRNAs, we found a positive correlation between the enrichment of these transcripts across cell types (fig. S3E). Exceptions to this were the SNORD116 snoRNAs that are not expressed in HeLa but are expressed and enriched in H9 cells (table S2).
Next, we identified putative sites of glycoRNA modification by mapping of reverse transcriptase (RT) stops. RTs are known to frequently stop one nucleobase 3’ of PTMs in RNA transcripts((16, 17), Methods). We chose to focus on Y5 and U3 as they were both highly enriched from HeLa and H9 cells and comprised approximately 3% and 10%, respectively, of all reads in the glycoRNA-enriched libraries (table S1). We found that input RNA RT stops were broadly distributed and partially biased for the 5’ ends, which would represent cDNA synthesis templated by an intact and unmodified RNA (Fig. 2E, fig. S3F, S3G, and table S3). In contrast, glycoRNA RT stops were less biased for the full-length RNA. This pattern was particularly pronounced for the Y5 glycoRNA, which yielded two strong RT stops at single-stranded guanosines 35 (G35) and 64 (G64) (Fig. 2E, 2F). Strikingly, the positions and relative intensities were conserved among glycoRNAs from HeLa and H9 cells (Fig. 2E and fig. S3F). The RT stop pattern for U3 glycoRNA was more complex, but many of the strongest RT stops occurred at guanosine residues (fig. S3G). To understand if guanosine modification is a general feature of glycoRNAs, we calculated the fractional distribution of nucleobases at RT stops from input and enriched libraries along snRNA and Y RNA transcripts. In both cases, guanosine was the only enriched nucleobase (fig. S3H).
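As a rough sketch of how a fractional distribution of nucleobases at RT stops might be tabulated, the example below maps each stop to an inferred modification site and counts the base found there. The function name, the 1-based coordinate convention, the default `offset=-1` (stop recorded one nucleotide 3’ of the modified base), and the toy sequence are assumptions for illustration; they are not taken from the Methods.

```python
from collections import Counter


def base_fractions_at_stops(sequence, stop_positions, offset=-1):
    """Fraction of each nucleobase at inferred modification sites.

    sequence: RNA sequence written 5' to 3'.
    stop_positions: 1-based RT stop coordinates on that sequence.
    offset: shift from the stop to the inferred modified base; -1 assumes
        the stop lies one nucleotide 3' of the modification (an assumed
        coordinate convention).
    """
    bases = []
    for pos in stop_positions:
        site = pos + offset
        if 1 <= site <= len(sequence):  # skip sites that fall off the transcript
            bases.append(sequence[site - 1].upper())
    counts = Counter(bases)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}


# Hypothetical sequence and stop coordinates, for illustration only.
toy_sequence = "AGUCGGAUCG"
toy_stops = [1, 3, 6, 9, 11]
toy_fractions = base_fractions_at_stops(toy_sequence, toy_stops)
```

Comparing such fractions between input and enriched libraries, relative to the base composition of the transcripts, is one way to ask whether a particular nucleobase is over-represented at stops.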
The RNAs we found to be modified have many well-established and critical cellular roles. The Y RNA family stood out as the most strongly enriched family in our dataset and is highly conserved in vertebrates(18). Y RNAs are thought to contribute to cytosolic ribonucleoprotein (RNP) surveillance, particularly for the 5S rRNA(18, 19). Additionally, they are (among other glycoRNAs we identified) known to be antigens associated with autoimmune diseases, such as systemic lupus erythematosus and mixed connective tissue disease(20). Given these features, we sought to validate Y5 as a glycoRNA by gene knockout via CRISPR/Cas9. A 293T Y5 knockout cell line was generated using two single-guide RNAs (sgRNAs) that targeted the 5’ and 3’ regions of the Y5 genomic locus (fig. S4A). Single cell clones were isolated, and a KO was selected for characterization; PCR amplification of the Y5 locus yielded two amplicons corresponding to two different insertion/deletions (fig. S4B and S4C). The KO synthesized no observable Y5 transcript but had no gross growth defects (fig. S4D and S4E). Ac4ManNAz-labeling of the Y5 KO cells resulted in a significant (~30%) decrease in the amount of biotin signal compared to WT cells, without any apparent MW changes (Fig. 2G and 2H). The reduction of glycoRNA signal is consistent with the sequencing data, which identifies Y5 as a major, but not exclusive glycoRNA species.
Next we sought to define the glycan structures on glycoRNAs. The major pathway for Ac4ManNAz metabolism in human cells entails conversion to sialic acid, then to CMP-sialic acid, and finally addition to an underlying glycan(21). To exclude the possibility that Ac4ManNAz is shunted into unexpected metabolic pathways, we used 9-azido sialic acid (9Az-sialic acid), which is directly converted into CMP-sialic acid(22) as a metabolic label. Consistent with Ac4ManNAz labeling, 9Az-sialic acid produced a similar time-dependent labeling of slowly migrating cellular RNA (Fig. 3A). Additionally, treatment of Ac4ManNAz-labeled cellular RNA with Vibrio cholerae sialidase (VC-Sia) completely abolished the biotin signal without impacting the integrity of the RNA sample, while a heat-inactivated (HI) VC-Sia was unable to reduce the signal (Fig. 3B). We assessed the contribution of canonical sialic acid biosynthesis enzymes through the use of P-3FAX-Neu5Ac, a cell-permeable metabolic inhibitor of sialoside biosynthesis(23). Treatment of HeLa cells with P-3FAX-Neu5Ac resulted in a dose-dependent reduction in total glycoRNA signal and a concomitant shift towards higher apparent MW on the blot (Fig. 3C). This reduced mobility (appearing higher in the gel) of the glycoRNA likely results from less sialic acid, and thus less negative charge, per glycoRNA molecule.
To confirm that glycoRNAs are sialylated, we used an independent method not based on metabolic reporters. The fluorogenic 1,2-diamino-4,5-methylenedioxybenzene (DMB) probe is used to derivatize free sialic acids for detection and quantitation by HPLC-fluorescence(24). We subjected native, total RNA from HeLa, H9, and 4188 cells to the DMB labeling procedure (fig. S5A) and observed the presence of two forms of sialic acid commonly found in animals, Neu5Ac and Neu5Gc (Fig. 3D and fig. S5B). These peaks disappeared when the samples were pretreated with VC-Sia or RNase, reinforcing the notion that glycoRNA is modified with sialic acid containing glycans. Notably, we were unable to detect any sialic acid liberated from genomic DNA using the DMB assay (fig. S5C).
Quantitatively, we found that H9, HeLa, and 4188 cells have approximately 40, 20, and 20 picomoles (pmol) of total sialic acid per μg of total RNA, respectively (Fig. 3E). GlycoRNA from 4188 cells contained more Neu5Gc, whereas H9 cells contained mostly Neu5Ac, and HeLa cells had similar levels of Neu5Ac and Neu5Gc (Fig. 3E). The DMB results are in line with the observed difference in Ac4ManNAz labeling intensity (fig. S1C and S1D). Human cells lack a functional CMAH gene, which is responsible for converting Neu5Ac to Neu5Gc, while this pathway exists in mouse cells(25). Correspondingly, we found higher Neu5Gc levels in glycoRNA from mouse 4188 cells as compared to HeLa or H9 cells (Fig. 3E). The presence of Neu5Gc in HeLa glycoRNA likely comes from bovine serum in the growth media; H9 cells were grown in serum-free media.
There are two main classes of glycans on proteins, N- and O-glycans, and both can be sialylated. To determine whether glycoRNA structures were related to glycoprotein-associated glycan structures, we used a combination of genetic, pharmacological and enzymatic methods. The ldlD mutant CHO cell line lacks the ability to interconvert UDP-glucose(Glc)/galactose(Gal) and UDP-GlcNAc/GalNAc ((26) and fig. S6A). Thus, in minimal growth media lacking Gal and GalNAc, glycoproteins from ldlD CHO have stunted N- and O-glycans because the cells cannot produce UDP-Gal (required for N-glycan elongation) and UDP-GalNAc (required to initiate O-glycosylation). We observed very little glycoRNA labeling in Ac4ManNAz-treated ldlD CHO cells (Fig. 4A) grown in minimal media. However, supplementation of the media with galactose, but not GalNAc, restored glycoRNA labeling, and supplementation with both galactose and GalNAc further boosted labeling intensity (Fig. 4A). This result was reproduced using a human K562(27) cell line with a CRISPR-Cas9 targeted knockout of UDP-galactose-4-epimerase (GALE), whose activity is lost in the ldlD CHO cell line (fig. S6B). These results are similar to observations of glycoprotein labeling in these cell types(28), suggesting that glycoRNA glycans are structurally related to those found on proteins.
We next tested the effects of glycosylation inhibitors on glycoRNA biosynthesis. Oligosaccharyltransferase (OST) is the major enzyme complex that transfers a 14-sugar precursor lipid-linked oligosaccharide to nascent peptides during their translocation through the Sec/translocon ((29) and fig. S6C). We tested NGI-1, a specific and potent small molecule inhibitor of OST(30), which caused a dose-dependent loss of glycoRNA labeling (Fig. 4B), suggesting that OST is involved in biosynthesis of glycoRNA-associated glycans. We also perturbed downstream N-glycan processing steps with kifunensine and swainsonine, inhibitors of the N-glycan trimming enzymes α-mannosidase I and II, respectively ((31, 32) and fig. S6C), which resulted in a dose-dependent loss of azidosugar labeling (Fig. 4C and fig. S6D). This was accompanied by an increase in apparent MW of the glycoRNA at higher doses, akin to the results seen with P-3FAX-Neu5Ac (Fig. 3C). We hypothesize that disruption of high-mannose glycan processing produces hyposialylated glycoRNAs with less net negative charge and, therefore, reduced mobility.
To further define the glycan structures on glycoRNA, we employed a panel of endoglycosidases. Purified RNA from Ac4ManNAz-labeled HeLa cells was first exposed to each enzyme and then clicked to biotin for visualization. Treatment of glycoRNA with PNGase F, which cleaves the asparagine side chain amide bond between proteins and N-glycans (33), strongly abrogated signal from Ac4ManNAz labeling. Endo F2 preferentially cleaves biantennary and high mannose structures, while Endo F3 preferentially cleaves fucosylated bi- and triantennary structures, both within the chitobiose core of the glycan(33). Treatment of glycoRNA with either Endo F2 or F3 resulted in a partial loss of Ac4ManNAz labeling. However, Endo Hf, which is more specific for high-mannose structures, did not affect Ac4ManNAz signal (Fig. 4D and fig. S6E). In contrast to these N-glycan digesting enzymes, O-glycosidase (targeting core 1 and core 3 O-glycans(34)) or mucinase (StcE (35)) treatment had no effect on Ac4ManNAz labeling intensity (Fig. 4D and fig. S6E, F). As in previous experiments, VC-Sia completely removed the Ac4ManNAz signal (Fig. 4D and fig. S6E). Together, these data suggest that glycoRNA possesses bi- and tri-antennary N-glycans with at least one terminal sialic acid residue.
Finally, we assessed the subcellular localization of glycoRNA. The biogenesis of sialylated glycans occurs across many subcellular compartments including the cytosol (processing of ManNAc to Neu5Ac), the nucleus (charging of Neu5Ac with CMP), and the secretory pathway (where sialyltransferases append sialic acid to the ends of glycans(36)). Interestingly, the localization of Y RNAs has been reported to be mainly cytoplasmic with a minor fraction in the nucleus (19). To determine specifically where glycoRNAs are distributed inside cells, we used two biochemical strategies: one which isolates pure nuclei away from membranous organelles mixed with the cytosol(37) and a second which separates the soluble cytosolic compartment away from membranous organelles (Methods). Nuclear RNA of Ac4ManNAz-labeled HeLa cells yielded no detectable azide-labeled species, while the ER/membrane fraction quantitatively contained the glycoRNA (Fig. 4E, F). This suggests that the glycoRNAs reside within or are closely associated with membrane organelles. To distinguish between these possibilities, crude membranes were isolated from Ac4ManNAz-labeled 293T cells and subjected to VC-Sia digestion with or without pre-treatment with Triton X-100 to permeabilize membrane organelles (Fig. 4G). If glycoRNAs were fully contained within membranes, VC-Sia would only have access to these species after the addition of Triton X-100. We found that the majority of the glycoRNA signal was sensitive to VC-Sia without Triton X-100, while a small but consistent pool was accessible only after permeabilization (Fig. 4H). Thus, a proportion of glycoRNAs appears to reside within the luminal space of membranous organelles.
DISCUSSION
We have found that sialylated N-glycans produced by the canonical ER/Golgi-lumen biosynthetic machinery are attached to specific mammalian small RNAs. These RNAs are a select group of small noncoding RNAs which are consistently modified across several cell types and organisms. Mapping of RT stops suggests that the glycan modifications occur at discrete guanosine residues. Interestingly, the putative sites of glycosylation are predicted to lie within single-stranded loops or bulges (Fig. 2F). Overall, these findings point to a common strategy for RNA glycosylation in mammals (Fig. 4I).
The glycan-RNA linkage was not sensitive to stringent protocols to separate RNA from lipids and proteins including organic phase separation, proteinase K treatment and silica-based RNA purification. While the precise nature of the glycan-RNA linkage has not yet been determined, we hypothesize that direct glycosylation of native RNA bases is unlikely. The observed sensitivity to PNGase F, which cleaves the glycosidic linkage between Asn and the initiating GlcNAc of N-glycans, implies an amide bond-containing linker that native nucleobases lack. It is possible that a precursor guanosine modification is necessary to establish an asparagine-like functionality capable of modification by OST. Precisely defining the chemical and atomic features of this linkage will be critical for future studies.
Another mystery concerns the mechanism by which small RNA substrates might gain access to luminal compartments in the secretory pathway. There is precedent for transport machineries of intact RNA transcripts across membranes, such as the C. elegans transporter SID-1 that imports dsRNA for RNA interference(38). Interestingly, mammals have two SID-1 orthologs which are thought to transport RNAs across intracellular membranes(39, 40). It is possible that related trafficking systems exist on the ER membrane that would enable N-glycosylation of appropriately functionalized RNA.
Antibodies that cause various autoimmune diseases have been discovered against a striking number of the RNAs, either alone or in complex with proteins, that we identified as glycosylated in this study. A common mechanism by which such RNAs might provoke an immune reaction has been elusive. In light of our findings, the possible access of these RNAs to the secretory pathway and their modification with N-glycan structures may play a role.
The framework in which glycobiology is presently understood excludes RNA as a substrate for N-glycosylation. Our discovery of glycoRNA suggests this is an incomplete view and points to a new axis of RNA glycobiology, including unprecedented enzymology, trafficking, and cell biology.
Funding
This work was supported by grants from Damon Runyon Cancer Research Foundation DRG-2286-17 (R.A.F.), the Howard Hughes Medical Institute (C.R.B.), National Institutes of Health (NIH) R01 AI141970 (J.E.C.), NIH F31 Predoctoral Fellowship 1F30CA232541 (B.A.H.S.), National Institute of General Medical Sciences F32 Postdoctoral Fellowship F32-GM126663-01 (S.A.M.), National Science Foundation Graduate Research Fellowship (NSF GRF) DGE-114747 (A.G.J.), and NSF-GRF/Stanford Graduate Fellowship/Stanford ChEM-H Chemistry/Biology Interface Predoctoral Training Program (K.P.).
Author Contributions
R.A.F. conceived the project. C.R.B. supervised the project. R.A.F., K.P., and C.R.B. developed experimental plans. B.A.H.S., S.A.M., and R.A.F. performed and analyzed DMB experiments. A.G.J. and R.A.F. performed and analyzed sucrose gradients. B.M.G. performed mouse-related experiments. K.M., R.A.F., and J.E.C. performed CRISPR/Cas9 experiments. R.A.F. and C.R.B. wrote the manuscript. All authors discussed results and revised the manuscript.
Competing interests
The authors declare no competing interests.
Data and materials availability
All sequencing data have been deposited in the Gene Expression Omnibus (GSE136967).
MATERIALS AND METHODS
Mammalian cell culture
All cells were grown at 37°C and 5% CO2. HeLa and HEK293T cells were cultured in DMEM media supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin (P/S). GM12878, K562, K562GALE-/-(27), and MYC T-ALL 4188 cells were cultured in RPMI-1640 media with glutamine supplemented with 10% FBS and 1% P/S. CHO and ldlD-CHO cells were cultured in 1:1 DMEM:F12 media with 3% FBS and 1% P/S. The H9 human embryonic stem cell line was cultured on Matrigel matrix (Corning) coated plates with mTeSR 1 (StemCell Technologies) media.
Metabolic chemical reporters and inhibitors
Stocks of azide-labeled sugars N-Acetyl-9-azido-9-deoxy-neuraminic acid (9Az sialic acid, Carbosynth) and N-azidoacetylmannosamine-tetraacylated (Ac4ManNAz, Click Chemistry Tools) were made to 500 mM in sterile dimethyl sulfoxide (DMSO). Stocks of unlabeled sugars N-Acetyl-D-galactosamine (GalNAc, Sigma) and D-(+)-Galactose (Gal, Sigma) were made to 500 mM and 50 mM, respectively, in sterile water. In cell experiments ManNAz was used at a final concentration of 100 μM. In vitro experiments with ManNAz used 0, 2, or 20 mM ManNAz (up to 200x the in-cell concentrations) for 2 hours at 37°C. The in-cell experiments with 9Az sialic acid used a 1.75 mM final concentration for between 6 and 48 hours. Gal and GalNAc were used as media supplements at 10 μM and 100 μM, respectively, and were added simultaneously with ManNAz for labeling.
Working stocks of glycan-biosynthesis inhibitors were all made in DMSO at the following concentrations and stored at −80°C: 10 mM NGI-1 (Sigma), 10 mM Kifunensine (Kif, Sigma), 10 mM Swainsonine (Swain, Sigma), 50 mM P-3FAX-Neu5Ac (Tocris). All compounds were used on cells for 24 hours and added simultaneously with ManNAz for labeling.
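The stock and working concentrations listed in this section are related by the usual dilution identity C1·V1 = C2·V2. The sketch below works through that arithmetic; the helper name `stock_volume_ul` and the 10 mL culture volume are hypothetical and are included only to make the example concrete.

```python
def stock_volume_ul(stock_mM, final_uM, culture_mL):
    """Volume of stock (in µL) that gives the desired final concentration.

    Uses C1 * V1 = C2 * V2 and ignores the small volume contributed by the
    stock itself.
    """
    stock_uM = stock_mM * 1000.0      # mM -> µM
    culture_uL = culture_mL * 1000.0  # mL -> µL
    return final_uM * culture_uL / stock_uM


# 500 mM Ac4ManNAz stock diluted to 100 µM; the 10 mL volume is hypothetical.
mannaz_ul = stock_volume_ul(stock_mM=500, final_uM=100, culture_mL=10)

# 500 mM 9Az sialic acid stock diluted to 1.75 mM in the same hypothetical volume.
sia_ul = stock_volume_ul(stock_mM=500, final_uM=1750, culture_mL=10)
```

For the hypothetical 10 mL culture this corresponds to 2 µL of the Ac4ManNAz stock and 35 µL of the 9Az sialic acid stock.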
Metabolic reporters in mouse models
All experiments were performed according to guidelines established by the Stanford University Administrative Panel on Laboratory Animal Care. C57Bl/6 mice were crossed and bred in house. ManNAz was prepared by dissolving 100 mg ManNAz in 830 μL 70% DMSO in phosphate buffered saline (PBS), warming to 37°C for 5 minutes, and then sterile filtering using 0.22 μm Ultrafree MC Centrifugal Filter units (Fisher Scientific); this solution was stored at −20°C. Male C57Bl/6 mice (8-12 weeks old) were injected once-daily, intraperitoneally with 100 μL of ManNAz (dosed to 300 mg ManNAz/kg/day), while control mice received the vehicle alone. At 2, 4, and 6 days, mice were euthanized, and their livers and spleens were harvested. The organs were pressed through a nylon cell strainer and resuspended with PBS to create a single cell suspension. RNA was collected as described below.
RNA extraction and purification strategies
A specific series of steps was taken to ensure that RNA analyzed throughout this study was as pure as possible. TRIzol reagent (Thermo Fisher Scientific) was used as a first step to lyse and denature cells or tissues. After homogenization in TRIzol by pipetting, samples were incubated at 37°C to further denature non-covalent interactions. Phase separation was initiated by adding 0.2x volumes of 100% chloroform, vortexing to mix, and finally spinning at 12,000x g for 15 minutes at 4°C. The aqueous phase was carefully removed, transferred to a fresh tube and mixed with 2x volumes of 100% ethanol (EtOH). This solution was purified over a Zymo RNA clean and concentrator column (Zymo Research) largely as per the manufacturer’s instructions. Modifications include: (1) each volume of the TRIzol-aqueous phase/EtOH mix was passed over the column twice to fully capture all RNAs and (2) to elute RNAs, two volumes of pure water were used. Next, RNA was subjected to protein digestion by adding 1 μg of Proteinase K (PK, Thermo Fisher Scientific) to 25 μg of purified RNA and incubating it at 37°C for 45 minutes. After PK digestion, RNA was purified again with a Zymo RNA clean and concentrator using the protocol as prescribed by the manufacturer. All RNA samples generated in this study were first purified by at least these two steps, with subsequent enzymatic treatments or RNA fractionations performed in addition to these first two purifications. We found that Zymo-Spin IC and IIICG columns bind up to ~50 and 350 μg of total RNA, respectively; columns in a given experiment were selected based on the amount of RNA to be purified.
For differential-precipitation of small vs large RNAs, the Zymo RNA clean and concentrator protocol was used as described. Briefly, RNA in an aqueous solution was mixed with 1x volumes of 50% RNA Binding Buffer in 100% EtOH. This mix was applied to the Zymo silica column; the flow through contained small RNAs while the column retained large RNAs. The flow through was mixed with 1x volumes of 100% EtOH, bound to a new Zymo column and purified as prescribed by the manufacturer.
To enrich for poly-adenylated RNA species, RNA initially purified as above was used as the input for the Poly(A)Purist MAG Kit (Thermo Fisher Scientific). After following the manufacturer’s protocol to elute polyA-tailed transcripts, RNA was subsequently cleaned up via the Zymo RNA clean and concentrator as described above.
Enzymatic treatment of RNA samples and cells
Various endo- and exonucleases and glycosidases were used to digest RNA, DNA, or glycans. All digestions were performed on 20 μg of total RNA in a 20 μL reaction at 37°C for 60 minutes. To digest RNA, the following was used: 1 μL of RNase cocktail (0.5U/μL RNaseA and 20U/μL RNase T1, Thermo Fisher Scientific) with 20 mM Tris-HCl (pH 8.0), 100 mM KCl and 0.1 mM MgCl2, or 2 μL of RNaseR (20U/μL, Epicentre) with 20 mM Tris-HCl (pH 8.0), 100 mM KCl and 0.1 mM MgCl2, or 2 μL of Terminator nuclease (1U/μL, Epicentre) with 50 mM Tris-HCl (pH 8.0), 2 mM MgCl2, 100 mM NaCl. To block the RNase activity of the RNase Cocktail, 1 μL of RNase Cocktail was premixed with 8 μL of SUPERaseIn (20U/μL, Thermo Fisher Scientific) for 15 minutes at 25°C before adding to the RNA solution. To digest DNA, 2 μL of TURBO DNase (2U/μL, Thermo Fisher Scientific) was used with 1x TURBO DNase buffer (composition not provided by the manufacturer). To digest glycans, the following was used: 2 μL of α2-3,6,8 Neuraminidase (50U/μL, New England Biolabs, NEB) with GlycoBuffer 1 (NEB), or 2 μL of Endo-Hf (1,000U/μL, NEB) with GlycoBuffer 3 (NEB), or 2 μL of PNGase F (500U/μL, NEB) with GlycoBuffer 2 (NEB), or 2 μL of Endo-F2 (8U/μL, NEB) with GlycoBuffer 3 (NEB), or 2 μL of Endo-F3 (8U/μL, NEB) with GlycoBuffer 4 (NEB), or 2 μL of O-Glycosidase (40,000U/μL, NEB) with GlycoBuffer 2 (NEB), or 1 μL of StcE(35) at 0.5 μg/μL with or without 20 mM EDTA. For live cell treatments, VC-Sia was expressed and purified as previously described(41) and added to cells at 150 nM final concentration in complete growth media for 1 hour at 37°C.
Strain-promoted Alkyne-Azide Cycloaddition (SPAAC) conjugation to RNA
Strain-promoted Alkyne-Azide Cycloaddition (SPAAC) was used in all experiments to avoid copper in solution during the conjugation of biotin to the azido sugars (ManNAz and 9Az-Sia). All experiments used dibenzocyclooctyne-PEG4-biotin (DBCO-biotin, Sigma) as the alkyne half of the cycloaddition. To perform the SPAAC, RNA in pure water was mixed with 1x volumes of “dye-free” Gel Loading Buffer II (df-GLBII, 95% Formamide, 18 mM EDTA, and 0.025% SDS) and 500 μM DBCO-biotin. Typically, these reactions were 10 μL df-GLBII, 9 μL RNA, and 1 μL of a 10 mM stock of the DBCO reagent. Samples were conjugated at 55°C for 10 minutes to denature the RNA and any other possible contaminants. Reactions were stopped by adding 80 μL water, then 2x volumes (200 μL) of RNA Binding Buffer (Zymo), vortexing, and finally adding 3x volumes (300 μL) of 100% EtOH and vortexing. This binding reaction was purified over the Zymo column as described above and analyzed by gel electrophoresis as described below.
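The volumes quoted for a typical SPAAC reaction can be checked with a few lines of arithmetic. The sketch below assumes that the "2x" and "3x" volumes refer to the reaction after dilution with water, which is the reading that reproduces the stated 200 μL and 300 μL; the function name `spaac_setup` and its return keys are hypothetical.

```python
def spaac_setup(rna_ul=9.0, buffer_ul=10.0, dbco_ul=1.0,
                dbco_stock_mM=10.0, water_ul=80.0):
    """Work out the final DBCO concentration and the clean-up volumes.

    Assumption: binding buffer (2x) and ethanol (3x) are scaled to the
    volume of the reaction after the water has been added.
    """
    reaction_ul = rna_ul + buffer_ul + dbco_ul
    dbco_final_uM = dbco_stock_mM * 1000.0 * dbco_ul / reaction_ul
    diluted_ul = reaction_ul + water_ul
    return {
        "reaction_ul": reaction_ul,
        "dbco_final_uM": dbco_final_uM,
        "binding_buffer_ul": 2 * diluted_ul,
        "ethanol_ul": 3 * diluted_ul,
    }


# Defaults follow the typical reaction described in the text.
typical = spaac_setup()
```

Under these assumptions the typical reaction is 20 μL with 500 μM DBCO-biotin, consistent with the concentration given in the text.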
RNA gel electrophoresis, blotting, and imaging
Blotting analysis of ManNAz-labeled RNA was conceptually similar to a Northern blot, with the following modifications. RNA that had been purified, enriched, or enzymatically digested and then conjugated to a DBCO-biotin reagent as described above was lyophilized dry and subsequently resuspended in 15 μL df-GLBII with 1x SybrGold (Thermo Fisher Scientific). To denature, RNA was incubated at 55°C for 10 minutes and crashed on ice for 3 minutes. Samples were then loaded into a 1% agarose-formaldehyde denaturing gel (Northern Max Kit, Thermo Fisher Scientific) and electrophoresed at 110V for 45 minutes. Total RNA was then visualized in the gel using a UV gel imager. RNA transfer occurred as per the Northern Max protocol for 2 hours at 25°C, except that a 0.45 μm nitrocellulose membrane (NC, GE Life Sciences) was used. This is critical for downstream imaging, as most positively charged nylon membranes have strong background in the infrared (IR) spectrum. After transfer, RNA was crosslinked to the NC using UV-C light (0.18 J/cm2). NC membranes were then blocked with Odyssey Blocking Buffer, PBS (Li-Cor Biosciences) for 45 minutes at 25°C. Note that the blocking buffers made with TBS or PBS, both sold by Li-Cor Biosciences, work similarly for this step. After blocking, the NC membrane was stained for 30 minutes at 25°C with Streptavidin-IR800 (Li-Cor Biosciences) diluted 1:10,000 in Odyssey Blocking Buffer. Excess Streptavidin-IR800 was washed from the membranes by three serial washes of 0.1% Tween-20 (Sigma) in 1x PBS for 5 minutes each at 25°C. NC membranes were briefly rinsed in 1x PBS to remove the Tween-20 before scanning on an Odyssey LiCor CLx scanner (Li-Cor Biosciences) with the software set to auto-detect the signal intensity for both the 700 and 800 channels. After scanning, images were quantified with the LiCor software (when appropriate) in the 800 channel and exported.
DMB assay for sialic acid detection
Unless otherwise noted, all chemicals were supplied by Sigma. Native sialic acids on RNA or DNA were derivatized with 4,5-methylenedioxy-1,2-phenylenediamine dihydrochloride (DMB) and detected via reverse phase high-performance liquid chromatography (HPLC) according to established methods(24). In brief, RNA samples were lyophilized, and 100 μg (unless otherwise noted in specific figures) of each sample was dissolved in 2 M acetic acid. Sialic acids were hydrolyzed by incubation at 80°C for 2 hours, and then cooled to room temperature before the addition of DMB buffer (7 mM DMB, 0.75 M β-mercaptoethanol, 18 mM Na2SO4, 1.4 M acetic acid). Derivatization was performed at 50°C for 2 hours. Following the addition of 0.2 M NaOH, samples were filtered through 10 kDa MWCO filters (Millipore) by centrifugation and stored in the dark at −20°C until use. Separation was performed via reverse phase HPLC using a Poroshell 120 EC-C18 column (Agilent) with a gradient of acetonitrile in water: T(0 minutes) 2%; T(2 minutes) 2%; T(5 minutes) 5%; T(25 minutes) 10%; T(30 minutes) 50%; T(31 minutes) 100%; T(40 minutes) 100%; T(41 minutes) 2%; T(45 minutes) 2%. DMB-derivatized sialic acids were detected by excitation at 373 nm and monitoring emission at 448 nm. Sialic acid standards included N-acetylneuraminic acid (Neu5Ac; Jülich Fine Chemicals), N-glycolylneuraminic acid (Neu5Gc; Carbosynth), 3-deoxy-D-glycero-D-galacto-2-nonulosonic acid (KDN; Carbosynth), and the Glyko Sialic Acid Reference Panel (Prozyme).
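The acetonitrile gradient above can be written as a list of (time, percent) breakpoints. The sketch below assumes linear ramps between consecutive breakpoints, which is how HPLC gradient programs are commonly executed but is not stated explicitly in the text; the function name is illustrative.

```python
from bisect import bisect_right

# Breakpoints (minutes, % acetonitrile in water) taken from the text.
GRADIENT = [
    (0, 2), (2, 2), (5, 5), (25, 10), (30, 50),
    (31, 100), (40, 100), (41, 2), (45, 2),
]

def percent_acetonitrile(t_min: float) -> float:
    """Return % acetonitrile at time t_min, assuming linear ramps
    between breakpoints (an assumption, not stated in the source)."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-45 minute program")
    times = [t for t, _ in GRADIENT]
    i = bisect_right(times, t_min) - 1  # index of the segment start
    if i == len(GRADIENT) - 1:          # exactly at the final breakpoint
        return float(GRADIENT[-1][1])
    (t0, p0), (t1, p1) = GRADIENT[i], GRADIENT[i + 1]
    return p0 + (p1 - p0) * (t_min - t0) / (t1 - t0)

for t in (0, 15, 30.5, 45):
    print(t, percent_acetonitrile(t))
```

Under this assumption, the shallow 5% to 10% ramp between 5 and 25 minutes is the resolving portion of the run, followed by a wash at 100% and re-equilibration at 2%.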
Subcellular fractionation
Isolation of highly pure nuclei
Nuclei are intricately entwined with the ER, posing a challenge to biochemically separate nuclei cleanly from the ER without mixing. Gagnon et al.(37) describe a protocol which cleanly recovers mammalian nuclei after processing without significant residual ER membrane attached. We performed this protocol on adherent ManNAz-labeled HeLa cells without modification to the step-by-step instructions published. Due to the stringent isolation of the nuclei, some fraction of nuclei themselves lyse during the process, contaminating the non-nuclear fraction. Therefore, when examining the fractionation results of this protocol, we consider only the signal left in the nucleus. Signal in the supernatant will be partially mixed ER, Golgi, cytosol, some nuclei, as well as other cellular compartments. After fractionation as per the protocol, TRIzol was used to extract and process the RNA.
Isolation of cytosol and crude membrane fractions
The ProteoExtract® Native Membrane Protein Extraction Kit (EMD Millipore) was used as per the manufacturer’s protocol on adherent ManNAz-labeled HeLa cells. This kit uses serial lysis steps: first to gently release soluble cytosolic proteins and RNA, and second to rupture membranous organelles such as the plasma membrane, Golgi, and ER. Because the lysis buffers are gentle, residual ER/Golgi are left on the nuclear fraction; analysis of samples generated from this kit was therefore limited to comparing the efficiently separated soluble cytosolic fractions with the membranous fractions. After fractionation as per the kit’s protocol, TRIzol was used to extract and process the RNA as described above.
Membrane protection assay
Large scale crude membranes were isolated using the Plasma Membrane Protein Extraction Kit (ab65400, Abcam) following the manufacturer’s protocol, stopping after producing the membrane pellet but before separating plasma membrane from other membranes. Typically, 10x 15cm plates of 80% confluent 293T cells were used for each biological replicate. Membranes were resuspended in 800 μL KPBS (136 mM KCl, 10 mM KH2PO4, pH 7.25 was adjusted with KOH (42)), 125 mM sucrose, and 2 mM MgCl2, split into 4 reactions, and incubated at 37°C for 1 hour with or without 0.1% Triton X-100 or 150 nM VC-Sia (homemade as per above). RNA was extracted with TRIzol and processed as described above for DMB analysis of sialic acid levels.
Antibodies for western blotting
The following were used for blotting on nitrocellulose membranes at the indicated concentrations: 1:1000 GAPDH (A300-641A, Bethyl), 1:3000 β-tubulin (ab15568, Abcam), 1:5000 H3K4me3 (ab8580, Abcam), 1:1000 RPN1 (A305-026A, Bethyl), 1:1000 Sec63 (A305-084A, Bethyl). Appropriate secondary antibodies conjugated to LiCor IR dyes (Li-Cor Biosciences) were used at a final concentration of 0.1 ng/μL.
Sucrose gradient fractionation of RNA
RNA used as input for sucrose gradient fractionation was previously extracted, PK treated, and clicked to DBCO-biotin as described above. RNA was sedimented through 15-30% sucrose gradients following McConkey’s method(43). Typically, 250-500 μg total RNA was lyophilized and then dissolved in 500 μL buffer containing 50 mM NaCl and 100 mM sodium acetate (pH 5.5). Linear 15-30% sucrose gradients were prepared in 1×3.5 inch polypropylene tubes (Beckman) using a BioComp 107 Gradient Master. Dissolved RNA was layered on top of pre-chilled gradients, which were then centrifuged using a SW32 Ti rotor at 80,000x g (25,000 rpm) for 18 hours in a Beckman Coulter Optima L70-K Ultracentrifuge at 4°C. Gradients were fractionated using a Brandel gradient fractionation system, collecting 0.75 mL fractions. Fractionated RNA was subsequently extracted from the sucrose solution using TRIzol as described above and analyzed by agarose gel electrophoresis or deep sequencing.
Enrichment, deep sequencing, and analysis of ManNAz-labeled RNA
Two rounds of selection were performed on RNA samples before sequence analysis to identify transcripts modified with ManNAz-containing glycans. Total RNA from ManNAz-labeled H9 or HeLa cells was extracted, purified, and conjugated to DBCO-biotin as described above. Biological duplicates, at the cell culture level (different passage number), were generated for the purposes of the sequencing experiments. The first enrichment was achieved by sucrose gradient fractionation; after centrifugation, fractions containing small RNAs were pooled and TRIzol extracted. The second enrichment was achieved by selective affinity to streptavidin beads as previously published(44) with the following specific steps: 10 μL of MyOne C1 Streptavidin beads (Thermo Fisher Scientific) per reaction were blocked with 50 ng/μL glycogen (Thermo Fisher Scientific) in Biotin Wash Buffer (10 mM Tris HCl pH 7.5, 1 mM EDTA, 100 mM NaCl, 0.05% Tween-20) for 1 hour at 25°C. Biotinylated small RNAs from H9 and HeLa cells were thawed and 150 ng of each were saved for input library construction. Next, 25 μg of the biotinylated small RNAs were diluted in 750 μL Biotin Wash Buffer (final concentration of ~33 ng/μL) and mixed with the blocked MyOne C1 beads for 2 hours at 4°C. Beads were washed to remove non-bound RNAs: twice with 1 mL of ChIRP Wash Buffer (2x SSC, 0.5% SDS), twice with 1 mL of Biotin Wash Buffer, and twice with NT2 Buffer (50 mM Tris HCl, pH 7.5, 150 mM NaCl, 1 mM MgCl2, 0.005% NP-40), all at 25°C for 3 minutes each.
To construct deep sequencing libraries two approaches were taken using the same enzymes with different steps for the input(45) vs. bead-enriched(46) samples given that the latter were already conjugated to a bead-support.
Input libraries
The 150 ng of small RNAs isolated before MyOne C1 capture were lyophilized dry and then T4 PNK mix (2 μL 5x buffer (500 mM Tris HCl pH 6.8, 50 mM MgCl2, 50 mM DTT), 1 μL T4 PNK (NEB), 1 μL FastAP (Thermo Fisher Scientific), 0.5 μL SUPERaseIn, and 5.5 μL water) was added for 45 minutes at 37°C. Next, a pre-adenylated 3’ linker was ligated by adding 3’ Ligation Mix (1 μL of 3 μM L3-Bio_Linker(5), 1 μL RNA Ligase I (NEB), 1 μL 100 mM DTT, 1 μL 10x RNA Ligase Buffer (NEB) and 6 μL 50% PEG8000 (NEB)) to the T4 PNK reaction and incubating for 4 hours at 25°C. Unligated L3-Bio_Linker was digested by adding 2 μL of RecJ (NEB), 1.5 μL 5’ Deadenylase (NEB), 3 μL of 10x NEBuffer 1 (NEB) and incubating the reaction at 37°C for 60 minutes. Ligated RNA was purified with Zymo columns as described above and lyophilized dry. cDNA synthesis, enrichment of cDNA:RNA hybrids, cDNA elution, cDNA circularization, cDNA cleanup, first-step PCR, PAGE purification, and second-step PCR took place exactly as previously described(6).
Bead-enriched libraries
Washed MyOne C1 beads bound to ManNAz-labeled small RNAs were processed as described before(45) with the following modifications. For the on-bead ligation step, a non-biotinylated 3’ linker oligo was used (L3-Linker(5)) such that all RNAs captured on the beads would be included in the sequencing library. After completing the second-step PCR for both the input and bead-enriched samples, the dsDNA libraries were quantified on a High Sensitivity DNA Bioanalyzer chip (Agilent) and sequenced on a NextSeq 500 instrument (Illumina).
Data analysis
Sequencing data were processed largely as described previously with a pipeline designed to analyze infrared CLIP data(6). The specific version of the pipeline used in this work can be found at https://github.com/ChangLab/FAST-iCLIP/tree/lite. Specifically, PCR duplicates were removed from the raw reads, adaptor sequences were trimmed, reads were mapped to a human genome reference (GRCh38) and to custom sequence indexes of human repetitive RNAs (such as snRNAs and rRNAs), and finally reverse transcriptase (RT) stops were identified. We hypothesized that cDNA synthesis of the unfragmented small RNAs used for the input samples would generate RT stops largely at the 5’ ends of RNAs, which would represent full-length cDNA synthesis. In contrast, if the ManNAz-enrichment successfully isolated covalently modified RNAs, the RT may stop adjacent to the position where the RNA was modified, akin to the RT stopping adjacent to a UV-crosslinked nucleotide in irCLIP experiments(6). Enrichment of RNA families and of specific RNA transcripts was calculated from the relative proportions of unique RT stop counts in the input and enriched libraries. Fractional nucleobase identity at RT stops in the input and enriched libraries was used to calculate the enrichment of specific nucleobases in the snRNA and Y-RNA families.
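The enrichment calculation described above compares the share of unique RT stops assigned to each RNA family (or transcript) in the enriched library with its share in the input library. The sketch below is one possible way to express that comparison and is not taken from the FAST-iCLIP pipeline; the log2 ratio, the optional pseudocount, the function names, and the example counts are all assumptions made for illustration.

```python
import math

def proportions(counts: dict) -> dict:
    """Convert raw unique RT stop counts into fractions of the library total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("library has no RT stops")
    return {name: value / total for name, value in counts.items()}

def log2_enrichment(input_counts: dict, enriched_counts: dict,
                    pseudocount: float = 1.0) -> dict:
    """log2(enriched fraction / input fraction) per RNA family.
    The pseudocount (an assumption) avoids division by zero for
    families that are absent from one of the libraries."""
    names = set(input_counts) | set(enriched_counts)
    inp = proportions({n: input_counts.get(n, 0) + pseudocount for n in names})
    enr = proportions({n: enriched_counts.get(n, 0) + pseudocount for n in names})
    return {n: math.log2(enr[n] / inp[n]) for n in names}

# Hypothetical counts, for illustration only (not data from this study).
example_input = {"family_A": 50, "family_B": 50}
example_enriched = {"family_A": 80, "family_B": 20}
example_result = log2_enrichment(example_input, example_enriched, pseudocount=0.0)
for family in sorted(example_result):
    print(family, round(example_result[family], 3))
```

The same function applies unchanged to nucleobase identities at RT stops if the dictionary keys are bases rather than RNA families.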
CRISPR/Cas9 knockout of Y5 and characterization
CRISPR gRNA sequences were designed using the CHOPCHOP online webtool (http://chopchop.cbu.uib.no/index.php) (47). Guides that flank the Y5 locus were selected (table S4). Corresponding oligos were ordered from IDT. Oligos were cloned into the Zhang lab-generated, Cas9-expressing pX458 guide RNA plasmid (Addgene) as previously described(48) using a Gibson assembly reaction (NEB). Two sgRNAs flanking the human Y5 locus encoded in the pX458 plasmids were co-transfected using Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s guidelines in a 6-well format. Transfected cells were single cell sorted based on GFP expression into 96-well plates using a BD Influx cell sorter (Stanford FACS facility). Clonal cell lines were allowed to expand, and genomic DNA was isolated for sequence-based genotyping of the targeted allele. For this, a 300–500 base-pair region that encompassed the gRNA-targeted site was amplified and the PCR product was Sanger sequenced. Clones with editing events causing large deletions were selected for subsequent experiments, and loss of expression in the KO was confirmed by Northern blotting (below). To evaluate doubling time, 293 WT and KO cells were cultured as described above, initially seeding 20,000 cells per 12-well plate in triplicate. At 24-hour intervals cells were trypsinized and counted using a Countess II FL Automated Cell Counter (Thermo Fisher Scientific) instrument as per the manufacturer’s instructions.
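The text does not state how doubling time was derived from the daily counts. One common approach, sketched below as an assumption, is to fit a line to the natural log of cell number versus time by least squares and convert the slope (growth rate) to a doubling time via ln(2)/rate. The function name and example counts are hypothetical.

```python
import math

def doubling_time_hours(times_h, counts):
    """Estimate doubling time from cell counts, assuming exponential
    growth: fit ln(count) = a + rate * t by ordinary least squares."""
    if len(times_h) != len(counts) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(c) for c in counts]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    rate = sxy / sxx  # per-hour exponential growth rate
    if rate <= 0:
        raise ValueError("counts do not show net growth")
    return math.log(2) / rate

# Hypothetical counts at 24-hour intervals, starting from 20,000 seeded cells.
example_times = [0, 24, 48, 72]
example_counts = [20000, 40000, 80000, 160000]
print(doubling_time_hours(example_times, example_counts))
```

With triplicate wells, the fit can be run per replicate and the resulting doubling times compared between WT and KO lines.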
Small RNA Northern blotting
Detection of small RNAs was achieved by conventional Northern blotting and detection via radiolabeled locked-nucleic acids (LNAs). LNAs (Qiagen) complementary to the Y5 RNA or 5S rRNA (table S4) were ordered and 5’ end labeled: 200 pmol LNA was added to 3 μL of T4 PNK (NEB), 7 μL 10x T4 PNK buffer, and 1 μL of ATP, [γ-32P]-3000 Ci/mmol 10 mCi/ml (γ-ATP, Perkin Elmer) in a 70 μL reaction. LNAs were incubated at 37°C for 3 hours, after which free γ-ATP was purified away using Micro Bio-Spin 6 (Bio-Rad) columns as per the manufacturer’s instructions. A 12% Urea-PAGE gel (National Diagnostics) was poured and pre-run at 10W for 15 minutes, after which 2 μg of total RNA from various cell types was separated by running the gel at 15W. After electrophoresis, RNA was transferred to HyBond N+ (GE Life Sciences) using a Semi-Dry transfer apparatus (Bio-Rad) with 0.5x Tris/Borate/EDTA (TBE, Thermo Fisher Scientific) buffer at a constant voltage of 18V for 90 minutes at 4°C. Next, RNA was crosslinked to the membrane, and pre-hybridized at 65°C for 60 minutes in 2 mL of PerfectHyb Plus (Sigma) buffer. Labeled LNA probes were then added to the PerfectHyb Plus buffer (typically 25% of the labeled LNA probe was used for any single membrane hybridization) and incubated at 65°C for 3-16 hours (no change in results with longer or shorter hybridizations). Membranes were rinsed twice with 2.5 mL of Low Stringency Northern Buffer (0.1% SDS, 2x SSC (Saline-sodium citrate)) and then washed at 37°C for 2x 5 minutes in 2.5 mL of High Stringency Northern Buffer (0.1% SDS, 0.5x SSC). Washed membranes were exposed to storage phosphor screens and finally imaged with a GE Typhoon 9410 scanner.
ACKNOWLEDGMENTS
We thank Phillip Sharp, Robert Spitale, Eliezer Calo, Steven Banik, and the Bertozzi Group members for critical discussion. We also thank Hannah Long, Ulla Gerling-Driessen, and Cheen Ang for help with human ES cell culture, Caitlyn Miller for synthesis of 9Az sialic acid, Dean Felsher for the MYC T-ALL 4188 cells, and Melissa Gray for expression and purification of VC-Sialidase. Cell sorting was performed on an instrument in the Shared FACS Facility obtained using NIH S10 Shared Instrument Grant S10RR025518-01.