Inhibition of CMP-sialic acid transport by endogenous 5-methyl CMP

Nucleotide-sugar transporters (NSTs) transport nucleotide-sugar conjugates into the Golgi lumen where they are then used in the synthesis of glycans. We previously reported crystal structures of a mammalian NST, the CMP-sialic acid transporter (CST) (Ahuja and Whorton 2019). These structures elucidated many aspects of substrate recognition, selectivity, and transport; however, one fundamental unaddressed question is how the transport activity of NSTs might be physiologically regulated as a means to produce the vast diversity of observed glycan structures. Here, we describe the discovery that an endogenous methylated form of cytidine monophosphate (m5CMP) binds and inhibits CST. The presence of m5CMP in cells results from the degradation of RNA that has had its cytosine bases post-transcriptionally methylated through epigenetic processes. Therefore, this work not only demonstrates that m5CMP represents a novel physiological regulator of CST, but it also establishes a link between epigenetic control of gene expression and regulation of glycosylation.


Introduction
Glycosylation is the most common form of protein and lipid modification (Dwek,Butters et al. 2002, Ohtsubo  what cellular processes may regulate NST activity.

It has been shown that free nucleotide monophosphates (NMPs) can inhibit uptake of nucleotide sugars into the Golgi lumen, ostensibly by competing with the nucleotide sugar for binding to the NST (Chiaramonte, Koviach et al. 2001). While it is known that concentrations of NMPs with canonical bases (A, T, C, G, or U) can vary between cell types and fluctuate depending on the metabolic needs of a cell (Traut 1994), there are currently no established links between these fluctuations and regulation of glycosylation. However, it has recently been appreciated that pools of cellular NMPs comprise more than just the canonical bases (Zeng,Qi et al. mechanism of high-affinity interaction between m5CMP and CST. Considering that m5CMP inhibits CMP-Sia transport and that m5CMP cellular concentrations are primarily related to post-transcriptional methylation of RNA, these results suggest a link between RNA epigenetics and regulation of cellular glycosylation.

Results

Initial characterization of a molecule that co-purifies with CST
After determining the structures of CST in complex with its two primary physiological substrates (CMP-Sia and CMP) (Ahuja and Whorton 2019), one of our next aims was to determine the structure of CST in the absence of any ligand. Our hope was that such a structure would help elucidate some of the conformational transitions that occur within CST upon ligand binding. Our approach to determining the structure of a ligand-free CST was simply to crystallize purified CST without the addition of any ligand. We were able to grow crystals of CST under these conditions. They were very small, but they still allowed us to collect a partially complete dataset with a resolution of 3.3 Å. Unexpectedly, the molecular replacement solution (data not shown) indicated that either CMP or a CMP-like molecule was still bound in the substrate-binding cavity. We hypothesized that a CMP-like molecule could be present in these crystals if it had co-purified with CST. To test this hypothesis, we performed a phenol-chloroform extraction on a sample of purified CST to precipitate the protein and liberate any bound molecule. The aqueous fraction of this extract was run on a C18 HPLC column (Figure 1A). We observed a peak with a retention time of ~3.5 min, which was significantly different from the retention times of CMP, UMP, or CMP-Sia, a selection of candidate co-purifying molecules.

To further characterize this unknown peak, we next took a sample of the CST phenol-chloroform extract and added 2.5 µM CMP. This sample was run on the C18 HPLC column while we monitored the absorbance at both 274 nm and 254 nm (Figure 1B). Like the CMP peak, the peak at 3.5 min had a higher absorbance at 274 nm compared to 254 nm, indicating that the molecule has an aromatic group. Calculation of the A274/A254 ratio across the two peaks further shows that the peak at 3.5 min represents a different molecule than CMP, since its A274/A254 ratio is 1.60 ± 0.02 compared to 1.42 ± 0.02 for CMP.

Considering that the unknown molecule has an aromatic group and a retention time similar to that of CMP, we also wanted to see if it had a phosphate group. To do so, we treated the CST phenol-chloroform extract with a non-selective nucleotide phosphatase, Antarctic phosphatase (AnP). As seen in Figure 1C, AnP treatment yielded a peak with a retention time of ~6.25 min. This shifted retention time indicates that a molecule with a different chemical composition was formed, which is consistent with the generation of a new phosphate-lacking molecule. We also found that we still observe the peak at ~3.5 min even after a sample of purified CST has been dialyzed for 24 hr (Figure 1D) before being subjected to phenol-chloroform extraction. This is consistent with the unknown molecule having a high binding affinity for CST and explains how it stays bound over the course of a two-day purification procedure. However, if we include AnP during the dialysis, we no longer observe a peak at either 3.5 min or 6.25 min (Figure 1D). This indicates that the loss of the phosphate from the unknown molecule significantly reduces its binding affinity, to the point where it can be dialyzed away. This is similar to our previous observation that the removal of phosphate from CMP to form cytidine completely eliminates its ability to bind CST (Ahuja and Whorton 2019).

Identification of the unknown molecule as 5-methyl CMP
To gain further insight into the identity of the molecule, we sent samples to the Northwest Metabolomics Research Center at the University of Washington to be analyzed by HPLC coupled to mass spectrometry (LC-MS), using electrospray ionization (ESI). A comparison of total ion current chromatograms from phenol-chloroform extracts of a buffer-only sample versus a sample containing purified CST protein did not reveal any peaks that were obviously unique to the protein-containing sample (Figure 2, figure supplement 1). Several peaks that were unique to the buffer-only control were observed; however, this may be partially due to the high concentration of denatured protein in the CST sample affecting the partitioning of some buffer components during the phenol-chloroform extraction.

An analysis of extracted ion chromatograms with an m/z range of 338.0701 ± 0.5 revealed a peak that was found only in the protein-containing sample (Figure 2A). Mass spectra that correspond to the retention time of this peak (10.7-10.8 min) are shown for the buffer-only sample (Figure 2B) and the protein-containing sample (Figure 2C). These clearly show that there is an ion with an m/z of 338.0722 (positive ESI mode) that is unique to the protein-containing sample. Similar analyses performed in negative ESI mode showed a unique ion with an m/z of 336.0631 (Figure 2D). Although there was a unique peak in the buffer-only sample in the extracted ion chromatogram (Figure 2A), the mass spectrum corresponding to the retention time for this peak (9.6-9.8 min) showed that the contributing ion has an m/z of 338.0457 (Figure 2E). This likely represents one of several ions that are unique to the buffer-only sample, as described above.

To further characterize the ion that was unique to the protein-containing sample, we subjected it to MS/MS fragmentation in positive ESI mode (Figure 2F). The most abundant product ion had an m/z of 126.0686, which supported an annotation of the precursor ion as 5-methyl-cytidine 5'-monophosphate (m5CMP), according to the scheme shown in Figure 2G. 336.0602 in positive and negative ESI modes, respectively. This gives ppm errors of 7.6 and 8.6, respectively, with regard to the measured m/z's stated above, which is within the 5-10 ppm mass accuracy of the MS instrument that was used (Bristow and Webb 2003). The annotation of the precursor ion as m5CMP does not account for the other prominent ions in the MS/MS spectrum (Figure 2F); however, it is possible that they derive from the ion with an m/z of 338.9012 (Figure 2C), which would have been included in the MS/MS fragmentation since the precursor ion selection filter had an m/z isolation width of 1.3.

To reconcile this finding with our HPLC observations, we prepared a sample containing 5 µM each of CMP and m5CMP. We ran this sample as well as a phenol-chloroform extract of CST on a C18 HPLC column (Figure 3A). We saw that m5CMP eluted with a retention time nearly identical to that of the molecule that co-purifies with CST. We also characterized the A274/A254 ratio for m5CMP and found it to be 1.60 ± 0.01 (Figure 3B), which is essentially identical to the A274/A254 ratio measured for the molecule that co-purifies with CST (Fig. 1B).

m5CMP binds CST with a higher affinity than CMP and inhibits CMP-Sia uptake
We next wanted to characterize the functional properties of m5CMP and how it compares to CMP. The observation that m5CMP co-purifies with CST and remains bound even after overnight dialysis suggests that m5CMP has a slow off-rate, which would be most consistent with a sub-micromolar binding affinity.
However, the assay that we have previously relied on to measure equilibrium binding constants is a scintillation proximity assay that requires 2 µM purified CST protein per assay point in order to achieve an adequate signal-to-noise ratio (Ahuja and Whorton 2019). Therefore, we thought it would be difficult to use this assay to measure m5CMP's binding affinity towards CST, since it would be very challenging to account for the significant ligand depletion that would occur.

We instead developed an alternative assay that measures binding constants by evaluating the effect that a series of ligand concentrations has on the thermal stability of CST. In this assay, aliquots of 40-80 nM GFP-tagged CST are either kept at 4°C or heated to 41°C in the absence or presence of various concentrations of ligand. Some fraction of the CST protein will denature in response to the heating; however, the addition of a ligand will stabilize the protein and reduce the fraction of protein that denatures in a dose-dependent manner. The fraction of protein that remains folded can be determined by running the samples on a size exclusion column connected to a fluorescence detector and noting the peak height of the monodisperse species that elutes at ~5.4 min, as shown in Figure 4A.

When we compare the peak heights of the CST sample heated to 41°C versus the sample kept at 4°C, we can see that approximately 44% of the protein denatures. However, including increasing amounts of m5CMP during the 41°C incubation leads to progressively more protein being protected from denaturation, to the point where saturating amounts of m5CMP are able to prevent any denaturation, as indicated by the peak height for the 100 µM sample being identical to that of the sample that was kept at 4°C. By plotting peak heights against ligand concentration, we can determine a Kd of 1.0 ± 0.1 µM (Figure 4B). A similar experiment performed with a titration of CMP gives a Kd of 16.1 ± 2.4 µM.

We then wanted to compare m5CMP and CMP in their ability to inhibit CMP-Sia uptake. Again, harvested by centrifugation before counting in a scintillation counter. As shown in Figure 4C, the rate of uptake is linear for at least 5 min. To measure inhibition constants, we added various concentrations of either m5CMP or CMP to the cells and incubated for 5 min. The amount of transport activity remaining as a function of ligand concentration is plotted in Figure 4D. Fitting the data with a simple dose-response model gives Ki's of 5.1 ± 1.2 µM and 1.0 ± 1.2 µM for m5CMP and CMP, respectively. The apparent discrepancy between these Ki values and the Kd binding constants will be discussed below.
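The dose-dependent protection described above can be illustrated with a minimal sketch. It assumes a simple single-site model in which the protected fraction scales with fractional occupancy, L / (Kd + L); the text does not state which model was used for the fit, so this form, the function name `fraction_folded`, and the default `unprotected` value (taken from the ~44% denaturation observed without ligand) are illustrative assumptions only.

```python
def fraction_folded(ligand_um, kd_um, unprotected=0.56):
    """Fraction of CST still folded after the 41 °C incubation.

    Assumption (not stated in the text): protection is proportional to
    single-site fractional occupancy, L / (Kd + L). With no ligand the
    folded fraction is `unprotected` (about 44% denatures); at saturation
    it approaches 1.0, i.e. the 4 °C control.
    """
    occupancy = ligand_um / (kd_um + ligand_um)
    return unprotected + (1.0 - unprotected) * occupancy


# Kd values reported in the text (thermal shift assay, 41 °C)
KD_M5CMP_UM = 1.0
KD_CMP_UM = 16.1

# Ratio of the two dissociation constants (the ~16-fold affinity difference)
fold_difference = KD_CMP_UM / KD_M5CMP_UM

if __name__ == "__main__":
    for conc_um in (0.0, 0.1, 1.0, 10.0, 100.0):
        print(
            f"{conc_um:6.1f} µM  m5CMP: {fraction_folded(conc_um, KD_M5CMP_UM):.3f}"
            f"  CMP: {fraction_folded(conc_um, KD_CMP_UM):.3f}"
        )
```

Under this assumed model, 100 µM m5CMP corresponds to more than 99% of the protein remaining folded, in line with the observation that the 100 µM sample is indistinguishable from the 4°C control.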

Structure of CST-m5CMP complex reveals the mechanism of high-affinity binding
In order to understand the molecular details of how m5CMP binds CST with a higher affinity than CMP, we determined the X-ray crystal structure of CST in complex with m5CMP. Crystals of CST were grown using the lipidic cubic phase method, in the presence of 400 µM m5CMP. Compared to crystals that we previously grew of CST in complex with CMP (Ahuja and Whorton 2019), crystals grown in the presence of m5CMP had the same morphology, belonged to the same space group, and had nearly identical unit cell properties (Table 1). However, one key difference is that the CST-m5CMP crystals diffracted X-rays to a much higher resolution of 1.8 Å (compared to 2.6 Å for the CST-CMP crystals). This let us build a highly detailed model, which contained 13 lipid molecules and 162 waters, resulting in an Rwork and Rfree of 18.5% and 19.8%, respectively (Figure 5, figure supplement 1 and Table 1).

Overall, the CST-m5CMP structure is very similar to the CST-CMP structure, with a r.m.s.d. of The burial of m5CMP's C-5 methyl in this hydrophobic pocket is likely the primary contributor to the 16-fold increase in m5CMP's binding affinity compared to CMP, on account of the hydrophobic effect. It has been estimated that burying hydrophobic surfaces contributes approximately 0.03 kcal/mol/Å² to the free energy of ligand binding (Hopkins , Chothia 1974). Therefore, burying a methyl group, which has a surface area of 46 Å², would contribute a total of about 1.4 kcal/mol, which is equivalent to a ~10-fold increase in affinity. This general effect has been termed the magic methyl effect (Schonherr and Cernak 2013), and it is not uncommon to see such large increases in binding affinity under the right circumstances, e.g. where an added methyl group is buried in a hydrophobic pocket (Leung, Leung et al. 2012).
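The free-energy estimate above can be checked with a short calculation. The sketch below converts a buried hydrophobic area into a binding free energy and then into a fold-change in affinity through the standard relation fold = exp(ΔG / RT). The temperature of 298.15 K is an assumption, since the text does not specify one, and the function names are illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def affinity_gain_from_burial(buried_area_A2, energy_per_A2=0.03, temperature_K=298.15):
    """Return (delta_G in kcal/mol, fold-increase in affinity).

    energy_per_A2 is the ~0.03 kcal/mol/Å² estimate quoted in the text;
    temperature_K = 298.15 K is an assumption made for this sketch.
    """
    delta_g = buried_area_A2 * energy_per_A2
    fold = math.exp(delta_g / (R_KCAL_PER_MOL_K * temperature_K))
    return delta_g, fold


def delta_g_from_fold(fold, temperature_K=298.15):
    """Free energy (kcal/mol) corresponding to a given fold-change in affinity."""
    return R_KCAL_PER_MOL_K * temperature_K * math.log(fold)


if __name__ == "__main__":
    dg, fold = affinity_gain_from_burial(46.0)  # methyl group surface area, Å²
    print(f"Burying 46 Å²: {dg:.2f} kcal/mol, ~{fold:.0f}-fold")
    print(f"16-fold difference: {delta_g_from_fold(16.0):.2f} kcal/mol")
```

With these assumptions, burying 46 Å² gives about 1.4 kcal/mol and roughly a 10-fold gain, while the measured 16-fold difference corresponds to about 1.6 kcal/mol, so the two figures are of the same order.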
The residues that line the methyl-interacting hydrophobic pocket, primarily Phe195, Thr260 (Cγ2 atom), and Val264, but also Tyr98, Gly192, Met213, and Ser261 (only Cα and Cβ atoms)

Here we describe the discovery that m5CMP binds CST and can inhibit CMP-Sia transport. Using a thermal shift assay, we determined that m5CMP has a binding constant of 1 µM, which implies a relatively fast off-rate and a short half-life on the order of a second or less. However, in order to co-purify a molecule over the course of a 2-day protein purification protocol, a relatively slow off-rate with a half-life on the order of at least several hours would be required. This would approximately equate to a low-nanomolar dissociation constant. The discrepancy between these two dissociation constants can be reconciled by the fact that the binding experiment was performed at the Tm of CST (41°C), whereas the protein purification was primarily performed at 4°C. This suggests that there is a steep relationship between temperature and m5CMP binding affinity. It is not straightforward to extrapolate a Kd at intermediate temperatures; however, it follows that the Kd for m5CMP at lower temperatures, such as a physiological temperature of 37°C or room temperature where transport assays are performed, will be lower than what was observed at 41°C, perhaps on the order of several hundred nanomolar.

Another seemingly paradoxical finding is that although m5CMP has a lower binding Kd than CMP, the apparent Ki for m5CMP's inhibition of CMP-Sia transport is about 5-fold higher than that of CMP (5.1 µM versus 1.0 µM). The transport assay was performed at room temperature, so it is understandable that the Ki for CMP is lower than the Kd of 16 µM that we measured at 41°C using the thermal shift binding assay. While Ki's of transport inhibition and Kd's of binding are not necessarily identical, they are often quite similar (Van Winkle 1999). In fact, this does seem to be the case for CMP inhibition of CST, considering that we previously determined that the Kd for CMP binding to CST was on the order of 1-6 µM at room temperature using a scintillation proximity binding assay (Ahuja and Whorton 2019). Therefore, we expected that the relationship between the Ki's for m5CMP and CMP would mirror what we observed for their Kd's. In other words, we would have expected the Ki for m5CMP inhibition of CMP-Sia transport to be significantly lower than CMP's Ki.

The reason that we did not observe this is not entirely clear. It may be the case that m5CMP's Ki is indeed higher than CMP's Ki despite m5CMP having a lower equilibrium dissociation constant. Given that CST's transport mechanism likely involves several conformational states (Ahuja and Whorton 2019), it is possible that in steady-state conditions m5CMP's extra methyl adversely affects its interaction with certain conformational states or affects the rates of transitions between states. There could also be a number of technical reasons that underlie the discrepancy between m5CMP's binding Kd and transport Ki. The transport assay relies on intact cells, so it is possible that there are cellular processes that either preferentially degrade or take up m5CMP over CMP, thereby affecting its effective concentration. In addition, m5CMP's additional methyl group imparts significant hydrophobicity, making it at least 100-fold less soluble in water. Therefore, given that we use a large number of cells per assay point (1 x 10^6), this presents a significant amount of lipid bilayer into which the m5CMP may partition, which could lower the effective concentration of soluble m5CMP. Finally, as mentioned above, our goal for this assay was to reduce the total concentration of transporter in the assay; however, it may be the case that the concentration of CST per assay point is still high enough that ligand depletion affects our ability to accurately measure sub-µM Ki's. To address these technical concerns, future work will need to characterize the actual concentration of soluble extracellular m5CMP as well as define the concentration of CST per assay point.

There are two requirements for m5CMP to act as a physiological inhibitor of CMP-Sia transport: what was seen in the renal tissue samples, ranging from 56-486. This indicates that not only is m5CMP widely distributed in the body, but there are likely cell types where m5CMP is much more abundant, compared to CMP levels, than what was seen in renal cells. In addition, the m5CMP levels detected in urine are a conglomeration from all tissue types; therefore, since there are tissue types with high CMP:m5CMP ratios (e.g. renal), it follows that there must be some tissue types with CMP:m5CMP ratios even lower than the averages that were observed in the urine samples, perhaps even approaching levels mirroring the 16-fold affinity difference that we measured between CMP and m5CMP. Using the same estimate for an average cellular CMP concentration of 39 µM, this would be equivalent to m5CMP concentrations ranging from 80-700 nM. Therefore, it is conceivable that there are cell types, perhaps ones that have high RNA turnover and/or are predisposed to high levels of RNA cytosine methylation, where the physiological levels of m5CMP would be adequate to regulate the transport activity of CST.

This analysis has focused on m5CMP since this was the molecule that we identified to co-purify with CST; however, as mentioned above, m5dCMP is also present in cells. In the Zeng et al.
it does not appear that the loss of the 2' hydroxyl significantly affects binding affinity, since it has been previously shown that CMP and dCMP have essentially identical Ki's for inhibition of CMP-Sia transport (Chiaramonte, Koviach et al. 2001). Therefore, it seems that cellular pools of m5dCMP may also be able to contribute to inhibition of CMP-Sia transport.

In conclusion, we have shown that m5CMP co-purifies with CST and most likely represents a novel physiological regulator of CST transport activity. This work has focused on characterizing the interaction between m5CMP and the mouse ortholog of CST. However, considering the very high sequence identity between the mouse and human CST sequences (Figure 5, figure supplement 3), especially with regard to residues that line the substrate-binding pocket, we expect that human CST will have properties nearly identical to those of mouse CST. m5CMP binds CST with a 16-fold higher equilibrium binding affinity than CMP, but m5CMP's Ki for inhibition of CMP-Sia transport is paradoxically approximately 5-fold higher than that of CMP. However, we discuss how there may be several technical reasons for this discrepancy. If m5CMP's Ki is also roughly 16-fold lower than CMP's Ki, mirroring what was observed for the binding Kd's, then there are likely some cell types where the cellular m5CMP concentration is high enough to approach m5CMP's Ki. In these cases, fluctuations in m5CMP and m5dCMP concentrations that are connected to rates of RNA/DNA cytosine methylation and decay would be able to impact the uptake of CMP-Sia into the Golgi lumen and thereby affect glycosylation patterns. Ultimately, experiments that can monitor glycosylation profiles in response to manipulation of rates of cellular RNA/DNA methylation and/or decay will be crucial for establishing a definitive link

Full-length, GFP-tagged mouse CST was expressed in P. pastoris as described above. Milled cells were suspended in lysis buffer at a ratio of 125 mg cells to 1 ml buffer, then gently rotated for 2 hours at 4°C (lysis buffer: 50 mM HEPES pH 7.5, 150 mM NaCl, 1% w/v DDM (Anatrace, solgrade), 1 mM DTT, 1 mM EDTA, 0.01 mg/ml deoxyribonuclease I, 0.7 µg/ml pepstatin, 1 µg/ml leupeptin, 1 µg/ml aprotinin, 1 mM benzamidine, 0.5 mM phenylmethylsulfonyl fluoride). This DDM-solubilized lysate was then clarified by centrifugation at 21,000 x g for 20 minutes at 4°C. It was then diluted 32-fold in Buffer C (50 mM HEPES pH 7.5, 150 mM NaCl, 0.1% w/v solgrade DDM, 1 mM EDTA, 1 mM DTT). This dilution factor was chosen to give a final assay concentration of 40-80 nM GFP-tagged CST and was based on comparing the peak height of the 4°C control with those of previously run samples of purified GFP-tagged CST of known concentration (data not shown). 90 µl aliquots of the diluted lysate were then placed into 250 µl PCR tubes (Fisher Scientific). 10 µl of 10X stocks of either CMP or m5CMP (made up in 25 mM HEPES pH 7.5 and 150 mM NaCl) were then added to the diluted lysate samples. The samples were gently mixed and incubated on ice for 30 minutes. Following this, they were heated to 41°C for 15 minutes using a thermocycler (control samples were kept on ice), transferred to 1.5 ml microcentrifuge tubes, and spun down at 87,000 x g for 20 minutes at 4°C to pellet precipitated protein and cellular debris.

Sf9 insect cells (Expression Systems) were grown in suspension in ESF 921 media (Expression Systems) to a density of 1x10^6 cells per ml and then infected with baculovirus encoding the same full-length mouse CST construct as described above at a ratio of 40 µL virus per ml of media (this ratio was empirically determined from titer trials in order to optimize protein expression). Cells were allowed to express the protein for 48 hours, counted, centrifuged at 1000 x g for 5 minutes at room temperature, and resuspended in fresh media to a density of 2x10^6 cells/ml. Aliquots of 0.5 ml (1x10^6 cells) were made in 1.5 ml microcentrifuge tubes.

uptake is plotted against inhibitor concentration in order to determine inhibition constants. The plotted values represent the mean ± SEM, n=4.

In all panels, key residues are labeled and the atoms for the pyrimidine ring are numbered.