Structure and function of N-acetylglucosamine kinase illuminates the catalytic mechanism of ROK kinases

N-acetyl-D-glucosamine (GlcNAc) is a major component of bacterial cell walls. Many organisms recycle GlcNAc from the cell wall or metabolise environmental GlcNAc. The first step in GlcNAc metabolism is phosphorylation to GlcNAc-6-phosphate. In bacteria, the ROK family kinase NagK performs this activity. Although ROK kinases have been studied extensively, no ternary complex showing the two substrates has yet been observed. Here, we solved the structure of NagK from the human pathogen Plesiomonas shigelloides in complex with GlcNAc and the ATP analogue AMP-PNP. Surprisingly, PsNagK showed two conformational changes associated with the binding of each substrate. Consistent with this, the enzyme showed a sequential random enzyme mechanism. This indicates that the enzyme acts as a coordinated unit responding to each interaction. Molecular dynamics modelling of catalytic ion binding confirmed the location of the essential catalytic metal. Site-directed mutagenesis confirmed the catalytic base, and that the metal coordinating residue is essential. Together, this study provides the most comprehensive insight into the activity of a ROK kinase.


Introduction
N-acetylglucosamine (GlcNAc) is a critical monosaccharide for both prokaryotes and eukaryotes. Eukaryotes widely employ GlcNAc in the N- and O-linked glycans that decorate protein surfaces; in the glycosaminoglycans hyaluronan, heparan sulfate and keratan sulfate that form a major part of the connective tissues (1,2); and in chitin (3). GlcNAc is also used as a reversible modification of proteins (4) that is conserved amongst metazoans, and to decorate some growth factors (5). This modification is particularly common on nuclear proteins, and generally acts to modulate signalling (often in competition with phosphorylation) and transcription in response to stress and nutrient conditions (6-8).
GlcNAc is essential to most prokaryotes, as the cell wall is formed from a polymer of GlcNAc and N-acetylmuramic acid cross-linked with peptides (9). Consequently, the key enzymes required for the biosynthesis of the nucleotide-linked sugar UDP-GlcNAc are essential in all bacteria. Many bacteria also require GlcNAc to form their lipopolysaccharides (with GlcNAc forming the core of lipid A) (10) and capsular polysaccharides (CPS) (11). Many oligosaccharides are initiated by the addition of GlcNAc, its epimer N-acetylgalactosamine (GalNAc), or 6-deoxy versions of these (N-acetyl-D-quinovosamine and N-acetyl-D-fucosamine respectively) to a lipid carrier (10,12,13). The wzx flippase that transfers oligosaccharides from the cytoplasmic leaflet of the inner membrane into the periplasm (14,15) and the wzy O-antigen/CPS polymerase (16) have strong specificity for the membrane-proximal sugar. Furthermore, most oligosaccharide transferases (17,18) are exquisitely specific for the N-acetyl group, making the N-acetylated sugars intimately linked to the surface biology of bacteria.
GlcNAc is generally synthesised by cells from glucose (Figure 1) (19,20). However, many organisms also have pathways for recycling GlcNAc. This is of particular importance for many bacteria that remodel their cell wall, and for intracellular bacteria that have a reduced availability of metabolic precursors in their environmental niches. Loss of the recycling pathway enzymes reduces the capacity of bacteria to remodel their cell walls (21-24). These pathways have been recognised in a wide range of human pathogens (e.g. Escherichia coli (22), Pseudomonas aeruginosa (25), Enterobacteriaceae, Staphylococcus aureus (26,27), Mycobacterium tuberculosis (28)). Many bacteria utilise chitin as a nutrient source, using chitinases to recycle it to GlcNAc (29,30). Such pathways are likely to be of particular importance in pathogens derived from crustaceans and insects (e.g. Serratia (31) and Vibrio species (32)).
An essential step in GlcNAc metabolism is the phosphorylation of GlcNAc to GlcNAc-6-phosphate (GlcNAc-6P). Eukaryotes isomerise this to GlcNAc-1-phosphate (33,34) (Figure 1), as their preferred metabolic route to UDP-GlcNAc. In contrast, bacteria that recycle GlcNAc deacetylate GlcNAc-6P, linking recycled and environmental GlcNAc to their central metabolism (35). Phosphorylation of GlcNAc to GlcNAc-6P is performed by a specific kinase, N-acetylglucosamine kinase (NagK). Both mammalian (36) and bacterial NagK enzymes belong to the ROK kinase family of carbohydrate kinases (37). This family phosphorylates a broad range of sugars, with individual kinases showing tight specificity for their substrates (38-41). ROK kinases have a two-domain fold, with the sugar binding between the two domains, causing a structural re-arrangement that forms the active site (42,43). Other characterised ROK kinases have shown a requirement for either manganese or magnesium for catalysis (40,44,45). Existing crystal structures suggest that ROK kinases use a similar mechanism to other classes of carbohydrate kinases (37) (Figure 1b). A conserved aspartic acid side chain deprotonates the 6'-hydroxyl of GlcNAc. This hydroxyl attacks the ATP γ-phosphate, passing through a pentacoordinate transition state that is stabilised by the catalytic metal. However, current structural information does not include a structure of an ATP analogue with an intact γ-phosphate. There is only one structure (from the human N-acetylmannosamine kinase NanK) that contains a catalytic metal, and the metal binding site has not been confirmed by mutagenesis or in bacterial enzymes (36,46).
Here, we report the activity, structure, and mechanism of NagK from Plesiomonas shigelloides. Surprisingly, the enzyme displays a random sequential mechanism, with both GlcNAc and ATP able to bind to the enzyme first. PsNagK showed activity with magnesium and manganese as divalent cofactors. The structure of PsNagK in complex with GlcNAc and the ATP analogue AMP-PNP demonstrates how the enzyme catalyses phosphorylation of GlcNAc. Molecular dynamics simulations allowed us to confirm the location of the catalytic cation binding site. Comparing the ternary complex to the product complex of NagK bound to GlcNAc-6P highlights a possible catalytic mechanism. This provides, for the first time, a comprehensive kinetic and structural characterisation of a ROK kinase.

NagK activity from divergent species
The enzymatic activity of NagK has previously been described for E. coli (47). We determined the activity for a wider range of enzymes, to highlight the diversity in activity from different species. We particularly focused on human pathogens with diverse NagK sequences. Recombinant NagK was readily purified for a range of human pathogens (Figures S1 and S2). The enzymes showed a range of activities (Table 1), with NagK from Vibrio vulnificus showing the highest activity.

NagK uses a sequential mechanism
We selected NagK from P. shigelloides for a more detailed study of the NagK mechanism. The enzyme kinetics showed a sequential mechanism rather than a ping-pong mechanism (Figure 2A-C; p = 0.0045). The products GlcNAc-6P and ADP showed weak inhibition, with Morrison Ki values one to two orders of magnitude higher than the cognate substrate KM (Figure S3). This prevented determination of whether an ordered or random sequential mechanism is used, as the alternative interpretations would be within error. We therefore examined whether the binding of either substrate affects binding of the other. Past studies of enzyme mechanisms have investigated substrate binding using methods such as differential scanning fluorimetry (48) or fluorescence anisotropy (49). We used differential scanning fluorimetry (DSF) to determine the dissociation constants of GlcNAc and the non-hydrolysable ATP analogue AMP-PNP. Using the isothermal DSF approach (50), we determined that the KD for GlcNAc in the absence and presence of AMP-PNP were 230 ± 20 µM and 270 ± 20 µM respectively (Figure 2D). We chose a temperature of 68 °C for these measurements, as this gave the optimal signal for determining KD. The KD for AMP-PNP in the absence and presence of GlcNAc were 2.2 ± 0.6 mM and 3.2 ± 0.5 mM respectively (Figure 2E). Surprisingly, there is no significant increase in the affinity for either substrate in the presence of the other. This suggests that NagK uses a random sequential mechanism.

NagK prefers magnesium as the catalytic metal
Most carbohydrate kinases require a metal cofactor. ROK kinases in particular have previously shown a strong requirement for metals. Consistent with this, PsNagK showed no activity in the absence of divalent cations (Table 2). The enzyme showed a preference for manganese (K½ = 0.07 ± 0.01 mM) over magnesium (K½ = 0.32 ± 0.05 mM) at low concentrations, but at higher concentrations manganese was inhibitory (Ki = 11 ± 2 mM; maximum rate 55 ± 3 s⁻¹ at 0.87 mM; Figure 3A). Magnesium shows no inhibition and a higher maximum rate (102 ± 3 s⁻¹) and would be strongly preferred at physiological concentrations (Figure 3B). No activity was observed with calcium.
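The manganese profile described above (activation at low concentration, inhibition at high concentration) has the shape of a standard substrate-inhibition curve. The sketch below assumes that form, which the text does not state explicitly. The function names and the back-calculated limiting rate for manganese are illustrative assumptions, not fitted values from the study. Under this assumption the optimum lies at the geometric mean of K½ and Ki, which for the reported constants falls close to the 0.87 mM quoted above.

```python
import math


def metal_dependent_rate(metal_mM, v_max, k_half_mM, k_i_mM=None):
    """Rate as a function of metal concentration.

    Assumed model (not stated in the text):
        v = v_max * [M] / (K_half + [M] + [M]^2 / K_i)
    With k_i_mM=None the inhibition term is dropped (simple hyperbola).
    """
    denominator = k_half_mM + metal_mM
    if k_i_mM is not None:
        denominator += metal_mM ** 2 / k_i_mM
    return v_max * metal_mM / denominator


def optimum_metal_mM(k_half_mM, k_i_mM):
    """Concentration giving the highest rate under the assumed model."""
    return math.sqrt(k_half_mM * k_i_mM)


# Constants reported in the text.
MN_K_HALF, MN_K_I, MN_PEAK_RATE = 0.07, 11.0, 55.0  # mM, mM, s^-1
MG_K_HALF, MG_V_MAX = 0.32, 102.0                   # mM, s^-1

mn_optimum = optimum_metal_mM(MN_K_HALF, MN_K_I)

# Hypothetical limiting rate for manganese, back-calculated so that the
# assumed model passes through the reported peak rate at the optimum.
mn_v_max = MN_PEAK_RATE * (1 + 2 * math.sqrt(MN_K_HALF / MN_K_I))

for conc in (0.05, 0.5, 2.0, 10.0):
    mn_rate = metal_dependent_rate(conc, mn_v_max, MN_K_HALF, MN_K_I)
    mg_rate = metal_dependent_rate(conc, MG_V_MAX, MG_K_HALF)
    print(f"{conc:5.2f} mM  Mn: {mn_rate:5.1f} s^-1  Mg: {mg_rate:5.1f} s^-1")
```

With these numbers, manganese gives the faster rate only at the lowest concentrations, and magnesium overtakes it as the concentration rises.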

The NagK active site is formed by enzyme closure around the GlcNAc and ATP substrates
Although a structure of V. vulnificus NagK has been solved (51), there is no structure of a ligand-bound NagK. We therefore determined the structure of P. shigelloides NagK, as this crystallised readily with and without its substrates (Table S1). As expected, PsNagK forms a two-domain fold with a large domain (including the structural zinc characteristic of ROK kinases (37)) and a small domain (Figure 4A). The enzyme closes around the GlcNAc substrate, with the small domain rotating by 23° (moving up to 15 Å) relative to the large domain (Figure 4B). The GlcNAc is bound specifically by the side chains of residues S78, N104, D105, E154, H157, and D187 (Figure 4C). … (Figure 5A): the best previous ROK kinase ligand structures showed density only to the β-phosphate (42,46). The small domain rotates a further 16° to engage the ATP (Figure S4). ATP is held in place by the side chains of residues T10, D105, T132 and E196, with the phosphates being coordinated by the main chain of G9, T10, and G255 (Figure 5B). Most of these side chains are well conserved amongst NagKs, consistent with a role in substrate binding (Figure S5). We were unable to obtain a structure that contained the catalytic cation. However, our ternary complex with GlcNAc and AMP-PNP is structurally very similar to the previously solved NanK structure that included a catalytic magnesium (46; Figure S6). The cation binding site is adjacent to a water molecule in our structure that is coordinated by D6, the main chain carbonyl of I7, and the γ-phosphate (Figure 5C). To test the hypothesis that this is the metal binding site, we performed molecular dynamics simulations of the active site with divalent cations added in this location, and AMP-PNP replaced by ATP. Molecular dynamics of the solved structure over 5 ns showed no significant changes in the structure, aside from a minor rearrangement of the ATP phosphates (Figure S7A).
When magnesium, manganese or calcium was added to the protein structure, the cation and ATP phosphates re-arranged to form a binding site for the divalent cation. Counterintuitively, in the cases of magnesium and manganese, the rearrangement brings the cation close to the side chain of D105 and the GlcNAc O6 as well as the D6 side chain, I7 main chain carbonyl and the γ-phosphate (Figure S7B, D). These cations show pentahedral coordination as one face is partially blocked by the side chain of I127. In contrast, the calcium ion forms a classical octahedral coordination with the side chains of D105 and D6 (both oxygens), I7 main chain, and two oxygens from the ATP γ-phosphate. In this case GlcNAc O6 is excluded from the coordination. This may reduce the acidity of the GlcNAc O6, consistent with calcium not supporting catalysis. The rapid, reproducible re-arrangement of the active site under molecular dynamics strongly supports the hypothesis that this is the cation binding site. The cation is then positioned to stabilise the pentacoordinate transition state. However, it is likely that a further re-arrangement of the enzyme active site is necessary for catalysis, as the ATP γ-phosphate remains too far away from GlcNAc to support a reaction.

Confirmation of proposed ligand interacting residues by site-directed mutagenesis
Site-directed mutagenesis of proposed ligand binding and catalytic residues supports the role of these amino acids in PsNagK activity. Mutation to either D105N (catalytic base) or D6N/A (metal coordinating negatively charged group) results in a loss of activity below the limit of detection (at least 1000-fold; Table 3). Mutation of the phosphate coordinating T10V and T132V results in a loss of activity, without substantially affecting the KM for either substrate. Mutation of the main ATP binding side chain D187N results in an increase in KM, but little impact on rate. Mutation of some side chains that coordinate GlcNAc (N104D, E154Q or double mutant) results in substantial increases in KM for both substrates, and clearly reduced rates. Mutation of other conserved GlcNAc binding residues S78A and E196Q resulted in clear increases in rate without affecting KM. These two residues are not well conserved (Figure S5), and the residues mutated to are found in other orthologues. We did not mutate H157 as this residue also coordinates to the structural zinc atom, and mutation would likely significantly affect the protein structure.

Discussion
GlcNAc recycling from the cell wall is important for the biology of many human pathogens. These include some of the ESKAPE pathogens (52) of greatest concern for antimicrobial resistance (22,25-27). To efficiently recycle cell wall GlcNAc, bacteria phosphorylate and then de-acetylate GlcNAc to form glucosamine-6-phosphate (35), an intermediate in the essential UDP-GlcNAc biosynthesis pathway (Figure 1; 19,20). Here, we have thoroughly characterised the enzyme that performs the first of these steps, NagK. This enzyme belongs to the ROK kinase family of carbohydrate kinases (37). Key questions arising from previous studies of ROK kinases were the order of binding of substrates; the location of the catalytic metal ion; and the location of the γ-phosphate.
In common with previous ROK kinases, we determined that NagK has an absolute requirement for divalent cations (40,45,53). Both magnesium and manganese, but not calcium, support NagK function. Physiologically, magnesium would likely be preferred as bacterial intracellular magnesium concentrations (~2 mM) exceed K½ (0.3 mM), whilst manganese concentrations (5-15 µM) are below K½ (70 µM) (54,55). Comparison of the crystal structure of NagK bound to GlcNAc and AMP-PNP to the human NanK structure (46) suggested that the metal ion should bind into a pocket adjacent to the γ-phosphate. This pocket would be coordinated by two oxygens from the γ-phosphate, the main chain carbonyl of I7, and the side chain of D6. An alignment of ROK kinases shows that D6 is strongly conserved as an acidic residue (Figure S7). This has previously been proposed (albeit with limited evidence) as a metal ion binding residue (36). To support this proposal, we added a magnesium ion to this site in our structure and performed a molecular dynamics simulation. The ion is stably maintained in this location throughout the simulation, with both magnesium and manganese predicted to coordinate to both substrates. Furthermore, mutation of D6 to either asparagine or alanine completely abolishes the activity of the enzyme. Given that D6 is not close to either substrate in the crystal structure, this severe phenotype supports a role in binding to the catalytic metal ion. Together, these observations strongly support this pocket as the metal binding site for a wide range of ROK kinases.
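The physiological argument above can be checked with simple arithmetic. The sketch below assumes a plain hyperbolic dependence of rate on metal concentration, which is a simplification: it ignores manganese inhibition, which is negligible at micromolar concentrations given the reported Ki of 11 mM. The function name is hypothetical, and the concentrations are the approximate intracellular values quoted in the text.

```python
def fractional_activation(metal_conc, k_half):
    """Fraction of the limiting rate reached at a given metal concentration.

    Assumes a simple hyperbola, conc / (K_half + conc); both arguments
    must be in the same units.
    """
    return metal_conc / (k_half + metal_conc)


# Magnesium: ~2 mM intracellular, K_half = 0.32 mM.
mg_fraction = fractional_activation(2.0, 0.32)

# Manganese: 5-15 uM intracellular, K_half = 70 uM.
mn_fraction_low = fractional_activation(5.0, 70.0)
mn_fraction_high = fractional_activation(15.0, 70.0)

print(f"Mg: {mg_fraction:.0%} of its limiting rate")
print(f"Mn: {mn_fraction_low:.0%} to {mn_fraction_high:.0%} of its limiting rate")
```

On these assumptions magnesium would drive the enzyme at most of its limiting rate, whereas manganese would reach under a fifth of its own, lower, limiting rate.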
The effect of mutations in GlcNAc binding residues is in accordance with previous studies. A detailed phylogenetic study proposed that the 3'-OH is coordinated by asparagine (N104) and glutamic acid (E154) (39). Mutations in either of these residues significantly reduced the activity of NagK. In contrast, two side chains that contact GlcNAc in the crystal structures (S78 and E196) are not evolutionarily conserved (Figure S5). Mutation of these side chains increases the catalytic efficiency of NagK in vitro.
Our structures provide, for the first time, a complex of a ROK kinase poised for activity. The structure shows the ATP γ-phosphate positioned above the 6'-OH group of GlcNAc. The catalytic base, D105, is in position to de-protonate the 6'-O and turn this into a strong nucleophile. The location of the phosphate group allows coordination of two oxygens with the catalytic metal ion. Other carbohydrate kinases generally follow a mechanism of nucleophilic substitution with a pentacoordinate intermediate stabilised by a metal ion (37,56-58). Based on our and others' structures, it seems highly likely that ROK kinases follow a similar mechanism.
In conclusion, our study provides for the first time a detailed explanation for the catalytic power of ROK kinases. Our data show how this family of enzymes supports the pentacoordinate intermediate required for phosphate transfer from ATP to GlcNAc. We demonstrate that a metal ion is required for NagK enzymes, and that the conserved ROK kinase metal coordinating acid is essential for enzyme activity. Our data confirm the critical side chains that support NagK substrate selectivity for GlcNAc. The availability of a detailed structure of the catalytic state of ROK kinases will enable the engineering of these enzymes to phosphorylate alternative substrates to support synthetic biology. This enzyme would also be an attractive target for the development of small molecule inhibitors to target bacteria that rely on cell wall remodelling as part of their pathogenic processes.
Expression and purification of NagK: NagK was expressed in 1 litre of high salt LB broth supplemented with 100 µg/mL ampicillin or 50 µg/mL kanamycin as appropriate. Each flask was inoculated with 10 mL of an overnight culture and grown at 37 °C with shaking at 200 rpm until OD600 reached 0.6. NagK expression was induced with 200 µM isopropyl thio-β-D-galactoside (IPTG), and cultures were grown at 20 °C for 18 h. Cells were harvested by centrifugation at 4500 × g for 30 min at 4 °C. The pellet was resuspended in binding buffer (20 mM Tris-HCl, 500 mM NaCl, 10 mM imidazole, pH 8.0) and lysed by sonication (SONIC Vibra cell™ VCX130). The lysed sample was clarified by centrifugation (24 000 × g for 30 min at 4 °C). The soluble fraction was purified using an ÄKTAxpress chromatography system (GE Healthcare). The sample was purified firstly using a 1 mL HisTrap crude column (GE Scientific). After loading the sample, the column was washed with binding buffer, and the protein eluted into binding buffer with imidazole at 250 mM. The product was purified over a Superdex 200 16/60 size-exclusion column (GE Healthcare) and eluted isocratically into 10 mM HEPES, 500 mM NaCl, pH 7.5. The eluted protein was concentrated using a Vivaspin centrifugal concentrator (Generon) to 1 mg/mL and stored at -20 °C with 20% (v/v) glycerol for enzymatic assays; or concentrated to 11.5 mg/mL and stored at -80 °C in small aliquots without any glycerol for crystallization.
Kinetic analysis: NagK activity was assayed using the previously described coupling reaction with pyruvate kinase (PK) and lactate dehydrogenase (LD) (62). For P. shigelloides, the His-tagged protein was used. Reactions contained 90-6000 ng/mL NagK, 40 … Kinetic parameters (KM and kcat) for ATP and GlcNAc were determined by varying either ATP or GlcNAc concentrations between 0.02-2 mM and 0.03-2 mM respectively, keeping all other parameters constant. The data were fitted to the Michaelis-Menten equation in Prism 7.05 (GraphPad). To determine the substrate mechanism, the initial reaction rates were measured with a two-fold dilution series of GlcNAc from 2 mM in eight steps, and with a two-fold dilution series of ATP from 2180 µM in five steps. Two experimental replicates were taken for each data point. Data were fitted to the sequential bi-bi and ping-pong equations in Prism 9.01 (GraphPad) (62-64). To determine the effect of divalent cations, initial reaction rates were determined with the MgCl2 in the mixture above substituted with 10 mM MgCl2, MnCl2, CaCl2, CuCl2, or CoCl2, and normalized to the rate with MgCl2. KM and Vmax were determined for MgCl2 or MnCl2 by varying the concentration between 0-10 mM, with GlcNAc and ATP at the determined KM. Three experimental replicates were performed for all reactions.
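To illustrate why the two models named above can be told apart from a grid of initial rates, the sketch below writes out the textbook rate equations for a sequential bi-bi and a ping-pong mechanism. These are the standard forms and are assumed, not confirmed, to match the equations fitted in Prism. All parameter values are hypothetical and chosen only for illustration. In a double-reciprocal plot of 1/v against 1/[A], the slope changes with [B] for a sequential mechanism but stays constant (parallel lines) for ping-pong.

```python
def sequential_bi_bi_rate(a, b, v_max, k_a, k_b, k_ia):
    """Textbook sequential bi-bi rate equation (ternary complex forms)."""
    return v_max * a * b / (k_ia * k_b + k_b * a + k_a * b + a * b)


def ping_pong_rate(a, b, v_max, k_a, k_b):
    """Textbook ping-pong rate equation (no constant term in the denominator)."""
    return v_max * a * b / (k_b * a + k_a * b + a * b)


def double_reciprocal_slope(rate_fn, b, a_low=0.05, a_high=2.0, **params):
    """Slope of 1/v against 1/[A] at fixed [B].

    Both models are exactly linear in 1/[A], so two points suffice.
    """
    inv_v_low = 1.0 / rate_fn(a_low, b, **params)
    inv_v_high = 1.0 / rate_fn(a_high, b, **params)
    return (inv_v_low - inv_v_high) / (1.0 / a_low - 1.0 / a_high)


# Hypothetical parameters (mM and s^-1), for illustration only.
shared = dict(v_max=100.0, k_a=0.2, k_b=0.3)

for b in (0.1, 1.0):
    seq = double_reciprocal_slope(sequential_bi_bi_rate, b, k_ia=0.5, **shared)
    pp = double_reciprocal_slope(ping_pong_rate, b, **shared)
    print(f"[B] = {b} mM  sequential slope: {seq:.4f}  ping-pong slope: {pp:.4f}")
```

Fitting both equations to the full grid and comparing the fits, as described above, is the statistical counterpart of inspecting these slopes by eye.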
Differential scanning fluorimetry: The dissociation constants (KD) for NagK with its substrates were determined using differential scanning fluorimetry (50). Each sample contained 0.1 mg/mL NagK, 8X SYPRO Orange dye (Fisher Scientific #10338542), 10 mM HEPES pH 7.5, 100 mM KCl and varying concentrations of either GlcNAc, AMP-PNP or the combination of these in a total volume of 10 µL. Data were collected on a Rotorgene Q (Qiagen) using the ROX channel. The melt curves showed a monotonic melt. Raw data were converted to a percentage unfolded using the fluorescence readings at the start and end of the melt to define 0 and 100% unfolded. 68 °C was selected as the temperature giving an optimal range of unfolding percentages. Data were fitted to equation 1 using GraphPad Prism v. 9.0.1.
Where fu is the fraction unfolded, Top and Bottom are the maximum and minimum unfolded fractions, [S] is the varied substrate concentration, and EC50 is the substrate concentration that reduces the unfolded fraction by half. Equations fixing Bottom as zero and including a Hill slope were rejected as inferior to equation 1 for these data based on Akaike's information criterion.
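As a rough illustration of the analysis described above, the sketch below shows one way to normalise a fluorescence reading to a percentage unfolded and to evaluate a three-parameter curve built from the symbols just defined. The functional form is an assumption (the standard three-parameter inhibitory dose-response curve), chosen because it matches the definitions of Top, Bottom and EC50 and has no Hill slope. The function names and example numbers are hypothetical.

```python
def percent_unfolded(fluorescence, start_reading, end_reading):
    """Linear rescaling so the start of the melt is 0% and the end is 100%."""
    return 100.0 * (fluorescence - start_reading) / (end_reading - start_reading)


def fraction_unfolded(substrate, top, bottom, ec50):
    """Assumed three-parameter model:

        fu = Bottom + (Top - Bottom) / (1 + [S] / EC50)

    At [S] = 0 this returns Top; at [S] = EC50 it is halfway between
    Top and Bottom; at very high [S] it approaches Bottom.
    """
    return bottom + (top - bottom) / (1.0 + substrate / ec50)


# Hypothetical example: readings in arbitrary fluorescence units,
# substrate concentrations and EC50 in uM.
print(percent_unfolded(fluorescence=640.0, start_reading=200.0, end_reading=1000.0))
for conc in (0.0, 250.0, 2500.0):
    print(conc, fraction_unfolded(conc, top=0.8, bottom=0.1, ec50=250.0))
```

In practice Top, Bottom and EC50 would be obtained by non-linear regression against the measured fractions rather than set by hand.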
The fitted EC50 values were converted to KD using equation 2 (50).
Where fu0 is the fraction unfolded at zero substrate concentration, and [P]T is the total protein concentration.

Crystallization:
For crystallization, the His-SUMO tagged P. shigelloides NagK was used. Crystals were grown using the microbatch method using an Oryx8 crystallization robot (Douglas Instruments). Initial crystals grew in well E6 of the Morpheus I screen (Molecular Dimensions), mixed 1:1 with 5 mg/ml NagK. Seed stocks were prepared from these crystals in 0.1 M MOPS pH 7.5, 30% (v/v) ethylene glycol, 10% (w/v) PEG 8000. The successful crystallisation conditions, soaking conditions, and cryoprotectants used are detailed in Table S2.
X-ray data collection and structure determination: Data were collected at Diamond Light Source (Didcot, UK) at 100 K using Pilatus 6M-F detectors and wavelengths of 0.92-0.98 Å. All data were processed using XDS (65). Further data processing and structural studies were carried out using the CCP4 program package (66,67). The apo structure of NagK was solved by molecular replacement (MR) using the MR pipeline MORDA (68), with the best solution found for the model (PDB ID: 4DB3). The model was refined using REFMAC5 (69) and PHENIX (70) and rebuilt using COOT (71). The refined apo NagK model was used as an MR search model in MOLREP (72) for the NagK-GlcNAc-ADP data, which crystallised in a different space group. The MR solution was refined using Buccaneer (73), following which further refinement was performed as above. The crystals of NagK-GlcNAc-AMP, NagK-GlcNAc, NagK-GlcNAc-6'-phosphate and NagK-GlcNAc-AMP-PNP were in the same space group as the NagK-GlcNAc-ADP complex; however, phased MR (74) was used to reposition the small domain in the NagK-GlcNAc structure. All structures were subjected to phased refinement in REFMAC5 (75) with input DM phases (76) from NCS averaging. The models were validated using MOLPROBITY (77) implemented in the CCP4i2 interface (78).

Molecular Dynamics:
Molecular dynamics was performed in YASARA v.20.12.24 (79). The structure of NagK complexed with GlcNAc and AMP-PNP was cleaned to remove water and PEG molecules. Molecular dynamics was run using the md_runfast macro for 5 ns using the AMBER15FB force field (80). Simulations including divalent cations were performed by replacing water molecule 97 with the relevant cation.

Table 3: Effect of mutants of ATP and GlcNAc binding residues. Site-directed mutants were prepared for P. shigelloides NagK at key side chains that coordinate ATP, GlcNAc or magnesium. kcat and apparent KM (KM app) for both substrates were determined as for the wild-type enzyme. D6A, D6N and D105N mutants caused a loss of activity below the limit of detection of the assay (kcat < 0.01 s⁻¹). The E154Q and N104D/E154Q mutants caused the apparent KM for GlcNAc to increase to above 50 mM (i.e., the plot of rate against substrate concentration was a straight line).