Delving into human α 1,4-galactosyltransferase acceptor specificity: the role of enzyme dimerization

Glycosyltransferases (GTs) exhibit precise donor and acceptor specificities, governed by intricate mechanisms,

In silico analysis of the human A4galt AlphaFold model revealed that the enzyme has a GT-A fold and belongs to family 32 in the CAZy glycosyltransferase database (Carbohydrate Active Enzymes Database, CAZy, http://www.cazy.org/). For catalytic activity, the enzyme requires a divalent metal ion (typically Mn2+) coordinated by two aspartic acid residues that form a DXD motif (D192TD in human A4galt) [Mikolajczyk K, et al. 2022; Mikolajczyk K, et al. 2021]. Human A4galt occurs as a high-frequency enzyme, containing a glutamine (Q) at position 211 (p.Q211), and a rare enzyme variant (referred to as mutein) with a p.Q211E substitution (rs397514502), found in only two families worldwide [Suchanowska A, et al. 2012]. This amino acid substitution affects the enzyme acceptor specificity towards GalNAc-capped oligosaccharides, enabling the synthesis of NOR1 and NOR2 antigens (which terminate with a Galα1→4GalNAc disaccharide) [Kaczmarek R, et al. 2016; Suchanowska A, et al. 2012; Kaczmarek R, et al. 2014].
For many years, human A4galt was considered a glycosphingolipid (GSL)-specific enzyme, responsible for transferring Gal residues to GSL acceptors, such as lactosylceramide, resulting in the formation of globotriaosylceramide (Gb3, Galα1→4Galβ1→4Glc-Cer). However, recent studies have shown that A4galt has a broader specificity that extends to glycoprotein (GP)-related acceptors, particularly N-glycans. It may produce a unique oligosaccharide motif, named the P1 glycotope (consisting of a Galα1→4Galβ1→4GlcNAc carbohydrate sequence), which contains a Galα1→4Galβ disaccharide, identical to the terminal disaccharide of Gb3. Both products generated by A4galt, Gb3 on GSLs and the P1 glycotope on glycoproteins, may serve as receptors for Shiga toxins (Stxs) [Szymczak-Kulus K, et al. 2021; Bereznicka A, et al. 2021; Morimoto K, et al. 2020]. Shiga toxin-producing Escherichia coli (STEC) strains, similar to Shigella dysenteriae serotype 1, pose an increasing threat to the human population. STEC infections lead to hemorrhagic colitis, which can progress to hemolytic-uremic syndrome (HUS), a severe complication that usually causes acute kidney failure [Cody EM and Dixon BP, 2019; Johannes L and Römer W, 2010; Bruyand M, et al. 2018; Michael M, et al. 2022].
Stxs produced by STEC are of two types: Stx1, which is identical to the toxin secreted by S. dysenteriae serotype 1, and the more distinct Stx2. Structurally, Stxs belong to the AB5 toxin family and comprise a catalytic A subunit responsible for cytotoxic effects and a pentameric B subunit that specifically binds to cell receptors (Gb3 and P1 glycotope) [Cooling L, 2015; Johannes L and Römer W, 2010; Kim JS, et al. 2020; Szymczak-Kulus K, et al. 2021]. Stx1 can use not only the GSL receptor but also the P1 glycotope on N-glycoproteins for cell entry and cytotoxicity, whereas Stx2 relies exclusively on the GSL-related receptor [Szymczak-Kulus K, et al. 2021; Morimoto K, et al. 2020].
Herein, we propose that human A4galt forms complexes within the Golgi apparatus, thereby orchestrating sequential glycan biosynthesis pathways that utilize GSL- and GP-based acceptors [Kellokumpu S, et al. 2016]. Preliminary data have shown that human A4galt forms homodimers [Kaczmarek, et al. 2012] and heterodimers with β1,4-galactosyltransferase 6 (B4galt6) [Takematsu H, et al. 2011]. In this study, we used the relatively new NanoBiT (NanoLuc® Binary Technology) Protein:Protein Interaction System to assess dimer formation by human A4galt in CHO-Lec2 and HEK293T cells. To complement these data, we investigated the molecular underpinnings of A4galt homo- and heterodimerization using AlphaFold enzyme models.

Analysis of human A4galt dimerization using NanoBiT technology
The specificity of human A4galt towards GP- and GSL-based acceptors may be mediated by its interaction with other GTs, which vary in acceptor specificities. As potential dimerization partners for human A4galt, we considered representatives of β1,4-galactosyltransferases (B4galt1-4 and B4galt5-6). B4galt1 is a GT located in the trans-Golgi that synthesizes β-galactosylated N-glycans (GP-based substrate for human A4galt), whereas B4galt2-4 also exhibit glycoprotein-utilizing activity, albeit to a lesser extent [Lee et al. 2001; Bydlinski et al. 2018]. B4galt5 and B4galt6 catalyze the synthesis of lactosylceramide (LacCer), a major GSL-based substrate for human A4galt. In addition, both enzymes are located in the Golgi cisternae and the trans-Golgi network (TGN) [Rizzo et al. 2021]. The co-localization of these enzymes in the trans-Golgi and their involvement in the synthesis of A4galt substrates (Gal-terminated N-glycans and LacCer) suggest the existence of heterodimers containing A4galt and B4galt1-4 and/or B4galt5-6. Their spatial proximity in glycosylation pathways increases the likelihood of interactions, potentially contributing to the formation of supramolecular complexes involving these GTs. To analyze heterodimer formation, we selected B4galt1, a major isoenzyme of GP-specific β1,4-galactosyltransferases, and B4galt5 and B4galt6, two major GSL-specific isoenzymes. The presence of such heterocomplexes could explain the dual specificity of human A4galt towards Gal-terminated GSL- and GP-based acceptors.
The NanoBiT technology was used to assess the formation of homo- and heterodimers by high-frequency and mutein human A4galt. Although it is a relatively new method, NanoBiT has already been used to identify interaction partners for other GTs, such as human B4galt4 [Shauchuk et al. 2020] and B4galt1 [Wiertelak et al. 2020]. In our study, we used two cell lines, CHO-Lec2 and HEK293T. Previously, CHO-Lec2 cells transfected with the human A4GALT gene were used to examine the ability of human A4galt to recognize both GSL- and GP-based acceptors [Szymczak-Kulus K, et al. 2021]. Since CHO-Lec2 cells do not contain a homolog of the A4GALT gene, which encodes A4galt, these cells are not natively sensitive to Stxs. Therefore, we decided to validate the proposed experiments in HEK293T cells, which express human A4galt and can naturally bind Stxs.
Using NanoBiT, we tested various combinations of proteins fused with the LgBiT and SmBiT subunits (located on either the N- or C-terminus of the analyzed protein) to determine the optimal orientation for the interactions (Table SI). We found that both forms of human A4galt (high-frequency and mutein) formed homodimers in CHO-Lec2 and HEK293T cells (Figures 1 and 2). Notably, the highest average luminescence value (compared to the respective controls) was observed for the mutein in CHO-Lec2 cells (Figure 2), in contrast to HEK293T cells (Figure 1), in which the average luminescence of A4galt homodimers was similar for both the high-frequency and mutein forms. We also found that N-terminally tagged constructs were preferred for heterodimer formation by human A4galt, while in the case of homodimers, the favored localization of the fused tag depended on the cell line used. Additionally, we observed the formation of A4galt-B4galt1 and A4galt-B4galt5 heterodimers in both CHO-Lec2 and HEK293T cells, with no detectable luminescence for the A4galt-B4galt6 pairs (Figures 1 and 2). The average luminescence measured for A4galt-B4galt1 pairs of high-frequency and mutein enzymes in CHO-Lec2 cells was 90 and 360 times higher than that for controls, respectively (Figure 2). Similarly, A4galt-B4galt1 heterodimers analyzed in HEK293T cells exhibited robust luminescence signals, approximately 60 times higher for the high-frequency enzyme and approximately 30 times higher for the mutein, compared to the control samples (Figure 1). The greatest luminescence signals were detected for A4galt-B4galt5 pairs in CHO-Lec2 cells, with luminescence approximately 1000 times higher than controls for the mutein and 300 times higher for the high-frequency enzyme (Figure 2). These findings confirmed the ability of human A4galt to form homodimers, as well as heterodimers with B4galt1 and B4galt5, consistent with its varied GSL- and GP-based acceptor specificity.

Molecular basis of human A4galt dimerization
Our in vitro investigations showed that human A4galt can form homodimers as well as heterodimers with B4galt1 and B4galt5; however, the mechanism of these protein-protein interactions (PPIs) remains elusive. Given the unresolved spatial structure of human A4galt, we employed the AlphaFold-Multimer tool [Evans R, et al. 2021] to predict the structures of these complexes. Models for all monomers, including A4galt, B4galt1, and B4galt5, were generated using AlphaFold [Jumper J, et al. 2021] and are available in the AlphaFold database [Varadi M, et al. 2022].
The monomer models obtained from the AlphaFold database exhibited high pLDDT scores, indicating a high level of structural reliability, with the exception of the flexible N-terminal fragment. For our modeling, we considered predictions based on both the full enzyme sequences and those with the N-terminus removed. However, predictions of PPIs using full sequences did not yield reliable outcomes because the N-terminal protein fragment interfered with the formation of intermolecular interfaces in the dimers. Consequently, we focused solely on models with the N-terminus omitted, targeting regions with predictably stable structures for our analysis. Among the predicted structures of the protein complexes, the A4galt-B4galt5 heterodimer model emerged as the most accurate prediction (Fig. 3A). This model exhibited the highest pLDDT score, with an average of 92.24. High pLDDT values, particularly those exceeding 90, indicate precise predictions. Furthermore, the high quality of the model was supported by the Predicted Aligned Error (PAE) scores, where the average PAE for residue pairs within a 6 Å contact distance was 3.76 Å (see Fig. SI). In addition to the A4galt-B4galt5 model, we also generated models for the A4galt-A4galt homodimer and the A4galt-B4galt1 heterodimer. However, these complexes exhibited lower pLDDT scores compared to the A4galt-B4galt5 model, particularly within regions where the two monomers interact. Moreover, the high PAE scores for contacting residue pairs in these models suggested diminished confidence in the predicted dimer interface and overall structural configuration (Fig. 3B and C).
Standard AlphaFold-Multimer scores, such as pLDDT and PAE, provide initial insights into the quality of protein structure predictions, but they alone do not guarantee the biological relevance of the interactions. To address this, we incorporated a machine learning method that enhances prediction accuracy by integrating these scores with a comprehensive set of omics and structural data. This integrated approach is encapsulated in the Structure Prediction and Omics-based Classifier (SPOC), specifically designed to evaluate the validity of predicted protein interactions more effectively [Schmid EW et al. 2024]. A SPOC score of 0.3 or higher is considered indicative of a high likelihood that the predicted interaction is biologically meaningful rather than spurious. This threshold was chosen based on its high effectiveness in distinguishing true interactions from computational artifacts, making it a valuable tool for assessing the biological plausibility of our predicted dimer models. In our current study, all three predicted dimer models achieved SPOC scores of 0.3 or above (see Table SII), suggesting that these models are likely accurate and potentially significant representations of the protein interactions.
To further explore these complexes, we performed coevolution sequence analysis using EVcouplings [Hopf TA, et al. 2019] on the A4galt-B4galt5 heterodimer to identify correlated mutations. The analysis was challenging because of the low number of effective sequences, resulting in a ratio of effective sequences to protein length of 0.47 (a value below 1 suggests limited reliability of this analysis). We identified only a few residue pairs outside the anticipated interface area, showing only a slight increase in the probability of correlated mutations. A similar coevolution analysis for the A4galt-B4galt1 heterodimer revealed an even smaller number of sequences in the multiple sequence alignment (MSA), highlighting the difficulties in obtaining sufficient confidence in structural predictions for this complex via AlphaFold (Fig. 3B). AlphaFold typically relies on a robust MSA for precision, and the limited number of sequences in the MSA poses challenges for accurate predictions.
In addition, we explored the potential active sites of the monomers using the COACH meta-server. We included B4galt1, for which the active site is known [Harrus D, et al. 2018], to test the reliability of the results. This method proved to be reliable, as it provided consistent predictions across various methods for binding pockets and known substrates associated with these proteins. Details regarding the potential interaction sites are provided in the Supplementary Data (Fig. SV-SVII).
Analysis of the mutual arrangement of these active sites, supported by the known structure of the B4galt1 homodimer (PDB ID: 6FWU), suggested that the active sites in the A4galt-B4galt1 heterodimer are proximal enough to facilitate substrate exchange. The AlphaFold-predicted structure showed that most of the interface between the two monomers was composed of residues that formed their respective active centers (Fig. 4A). A similar observation was made for the A4galt-B4galt5 heterodimer; however, only the active site residues of A4galt contributed to the interface (Fig. 4B). The proximity between the catalytic centers of A4galt and B4galt1 indicated the potential for substrate transfer.

Discussion
In this study, we aimed to elucidate the ability of high-frequency and mutein human A4galt to engage in homo- and heterodimerization. Our findings indicate the propensity of human A4galt for homodimerization, but the functional implications of this phenomenon require further investigation. Homodimerization is a well-documented phenomenon in various GTs with distinct acceptor specificities, such as glycoprotein-specific α1,6-fucosyltransferase 8 (Fut8) [Ihara H, et al. 2007] and glycolipid-specific β1,4-N-acetylgalactosaminyltransferase 1 (B4galnt1) [Giraudo CG, et al. 2001]. In addition to homodimerization, we showed that both high-frequency and mutein forms of human A4galt can form heterodimers with B4galt1 and B4galt5, which exhibit distinct acceptor specificities towards GPs and GSLs, respectively. Notably, previous research conducted by Takematsu et al. (2011) suggested a potential interaction between human A4galt and B4galt6, which was supported by the shared presence of the GOLPH3 recognition polypeptide motif in these enzymes. This underpins the possibility of heterodimerization within the trans-Golgi apparatus [Takematsu et al., 2011; Rizzo R, et al. 2021; Lujan P and Campelo F, 2021].
By utilizing AlphaFold models for A4galt, B4galt1, and B4galt5, we attempted to elucidate the molecular details underlying their interactions. Analysis of the A4galt-B4galt1 heterodimer revealed spatial proximity of the active sites of both enzymes, potentially facilitating mutual substrate exchange. This observation resembles the mechanism demonstrated in human α2,8-sialyltransferase 3, which forms a homodimer close to the Golgi apparatus membrane [Volkers G, et al. 2015]. For the A4galt-B4galt5 heterodimer, we propose an enzyme interaction mediated by electrostatic bonds, similar to the mechanism previously described for homo- and heterodimer formation by β1,4-galactosyltransferase 1 and α2,6-sialyltransferase 1 [Khoder-Agha F, et al. 2019]. Both the A4galt-B4galt1 and A4galt-B4galt5 heterocomplexes may facilitate the transfer of products from B4galt1 and B4galt5 (Gal-terminated N-glycoproteins and glycosphingolipids, respectively) to A4galt. Subsequently, A4galt synthesizes Gal-capped oligosaccharides using two different types of acceptor molecules (GSLs and GPs).
Understanding the impact of GT dimerization on acceptor specificity is essential to unravel the molecular basis of the glycosylation processes in cells. The mechanistic details underlying the changes in acceptor specificity related to heterodimer formation are crucial for comprehensively deciphering the regulatory role of these GT complexes in shaping the cell glycome. Future studies should elucidate why A4galt interacts preferentially with B4galt5 but not with the B4galt6 isoenzyme. Confirming the formation of hydrogen bonds between specific amino acids in human A4galt and B4galt5 is also essential; however, this requires experimentally resolved spatial structures of these GTs. Furthermore, because the B4galt5 and B4galt6 isoenzymes share the same substrate specificity, they may exhibit inhibitory effects on each other, which may affect dimerization with other proteins, including A4galt.

Cell cultures
CHO-Lec2 and HEK293T cells were obtained from the American Type Culture Collection (Rockville, MD, USA). The cells were grown and maintained in a humidified incubator with 5% CO2 at 37°C in DMEM/F12 medium (Thermo Fisher Scientific, Inc., Waltham, MA, USA) supplemented with 10% fetal bovine serum (Gibco, Inc., Waltham, MA, USA) and Pen-Strep (Gibco, Inc., Waltham, MA, USA). The culture medium was changed every second or third day, and after reaching 85-90% confluence, the cells were subcultured by treatment with trypsin (0.25% trypsin, 137 mM NaCl, 4.3 mM NaHCO3, 5.4 mM KCl, 5.6 mM glucose, 0.014 mM phenol red, 0.7 mM EDTA), harvested, centrifuged at 800 × g for 5 min, resuspended in fresh medium and seeded onto new tissue culture plates.

Construction of expression vectors
The ORF sequences of human B4GALT1 (GenBank accession number NM_001378495.1), B4GALT5 (GenBank accession number NM_004776.4), and B4GALT6 (GenBank accession number NM_004775.5) were customized by Thermo Fisher Scientific and cloned into pET-TOPO plasmids. The templates for human A4GALT (GenBank accession number NG_007495.2) amplification were the pCAG-A4GALT expression plasmids, containing the A4GALT gene encoding full-length human A4galt (high-frequency and mutein with p.Q211E substitution). The expression constructs for the split luciferase complementation assay were prepared according to the manufacturer's protocol (NanoLuc® Binary Technology NanoBiT™, Promega, Madison, WI, USA). Briefly, the genes encoding the protein candidates for interaction with human A4galt were cloned into pBiT1.1 and pBiT2.1 vectors to produce fusion constructs with LgBiT and SmBiT subunits, respectively. The luciferase LgBiT and SmBiT subunits were localized at the N- or C-terminus of the analyzed protein to establish the optimal orientation for PPIs. PCR amplification, DNA sequencing, and preparation of plasmid constructs were carried out as previously described [Mikolajczyk K, et al. 2021]. Cloning into pBiT vectors was performed using XhoI and EcoRI restriction sites added to the forward and reverse primers. The sequences of all primers used in this study are listed in Table SI.

Split luciferase complementation assay
CHO-Lec2 and HEK293T cells (2×10^4) were seeded in a complete growth medium onto a 96-well plate with white polystyrene wells and a flat transparent bottom (Corning Inc., New York, USA). At 20-24 h after plating, the cells were transfected with appropriate combinations of plasmids (25 ng/well of each plasmid, 50 ng/well in total) using linear PEI 25 kDa (1 mg/mL, Polysciences Inc.) transfection reagent. At 2-3 h before the measurement, the conditioned medium was replaced with serum-free Opti-MEM medium (Life Technologies, CA, USA). Immediately before measurement, Nano-Glo® Live Cell Substrate (Promega) was added to all wells, according to the manufacturer's instructions. Cell-derived luminescence was recorded using a CLARIOstar Luminescence Microplate Reader (BMG LABTECH). Each tested combination was accompanied by the respective negative control comprising the analyzed GT fused with the large NanoLuc subunit combined with the HaloTag (a recombinant protein not synthesized by mammalian cells, thus not interacting with proteins of mammalian origin) that was fused with the small NanoLuc subunit (Promega).

Data analysis
For statistical analysis, one-way ANOVA with Bonferroni post-hoc test was used. All analyses were performed with GraphPad Prism (GraphPad Software, CA, USA). Statistical significance was assigned to p-value < 0.05. Only results exceeding the respective negative controls by at least 10-fold were considered indicative of an interaction, as suggested by the manufacturer.
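The fold-change criterion described above can be illustrated with a minimal sketch. It assumes replicate luminescence readings are available as plain lists of RLU values; the function names and the example readings are hypothetical and are not data from this study. The ANOVA step performed in GraphPad Prism is not reproduced here.

```python
from statistics import mean

FOLD_THRESHOLD = 10.0  # cutoff quoted in the text (manufacturer's suggestion)


def fold_change(sample_rlu, control_rlu):
    """Mean sample luminescence divided by mean negative-control luminescence."""
    control_mean = mean(control_rlu)
    if control_mean <= 0:
        raise ValueError("control luminescence must be positive")
    return mean(sample_rlu) / control_mean


def is_interaction(sample_rlu, control_rlu, threshold=FOLD_THRESHOLD):
    """True when the tested pair exceeds its control by at least `threshold`-fold."""
    return fold_change(sample_rlu, control_rlu) >= threshold


# Hypothetical replicate readings (RLU), for illustration only.
example_sample = [9000.0, 9300.0, 8700.0]
example_control = [100.0, 90.0, 110.0]
```

With these illustrative readings, the pair would pass the 10-fold criterion, whereas a pair that is only 5-fold above its control would not.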

Dimer structure prediction
Structure prediction was carried out using AlphaFold-Multimer [Jumper J, et al. 2021; Evans R, et al. 2021] implemented in ColabFold [Mirdita M, et al. 2022]. Multiple parameters were tested, namely template modes (including different templates), dropout, and the number of recycles, which produced an array of structure predictions. The final dimer models for structural analysis were selected using two criteria: (1) the highest average pLDDT score and (2) the lowest average PAE score for the residue pairs that were in contact (< 6 Å apart).
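The two selection criteria can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: each model is assumed to be a dictionary holding per-residue pLDDT values, a PAE matrix, one representative coordinate per residue, and chain labels; contacts are taken between residues of different chains; and the way the two criteria are combined (pLDDT first, contact PAE as tie-breaker) is an assumption, since the text does not specify it.

```python
from math import dist
from statistics import mean

CONTACT_CUTOFF = 6.0  # Å, contact distance given in the text


def interchain_contacts(coords, chains, cutoff=CONTACT_CUTOFF):
    """Index pairs (i, j) from different chains closer than `cutoff`."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if chains[i] != chains[j] and dist(coords[i], coords[j]) < cutoff:
                pairs.append((i, j))
    return pairs


def score_model(model):
    """Return (mean pLDDT, mean PAE over inter-chain contact pairs)."""
    pairs = interchain_contacts(model["coords"], model["chains"])
    if not pairs:
        return mean(model["plddt"]), float("inf")
    pae = model["pae"]
    # PAE is not symmetric, so both directions of each pair are averaged.
    contact_pae = mean((pae[i][j] + pae[j][i]) / 2 for i, j in pairs)
    return mean(model["plddt"]), contact_pae


def pick_best(models):
    """Highest mean pLDDT first; lower contact PAE breaks ties (assumed ordering)."""
    def key(model):
        plddt, contact_pae = score_model(model)
        return (-plddt, contact_pae)
    return min(models, key=key)
```

A model without any inter-chain contact receives an infinite contact PAE, so it can never win a tie against a model with a real interface.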

Dimer structure analysis
Structure Prediction and Omics-based Classifier (SPOC) is a machine learning tool that integrates structural predictions with omics data to assess the reliability of predicted protein-protein interactions [Schmid EW et al. 2024]. It is valuable for distinguishing true interactions from false positives, offering a robust metric that enhances confidence in computational biology studies. This tool takes 3 user-uploaded AlphaFold multimer (AF-M) predictions and calculates compact SPOC scores and other confidence metrics (avg_models, ipTM, pDOCKQ). For each dimer, the three highest-scoring models of the five models generated by AlphaFold-Multimer were used for this analysis.

Coevolution analysis
The coevolution analysis was performed using EVcouplings [Hopf TA, et al. 2019]. The method generates a multiple sequence alignment (MSA) for each of the two sequences using jackhmmer (with a bitscore cutoff of 0.3) and then combines them into a single MSA. Whether the MSA is sufficiently deep is measured by the number of effective sequences divided by the length of the protein, which should be above 1.
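The depth measure can be illustrated with a small sketch. It assumes a common reweighting scheme in which each sequence is down-weighted by the number of alignment members that are at least 80% identical to it; the 0.8 threshold, the function names, and the simplified handling of gaps (treated as ordinary characters) are assumptions made for illustration and are not taken from the EVcouplings code.

```python
def identity(a, b):
    """Fraction of aligned columns in which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def effective_sequences(msa, theta=0.8):
    """Sum of 1/n_i, where n_i counts sequences at least `theta` identical to i."""
    total = 0.0
    for seq in msa:
        neighbours = sum(identity(seq, other) >= theta for other in msa)
        total += 1.0 / neighbours  # neighbours >= 1 because seq matches itself
    return total


def depth_ratio(msa, theta=0.8):
    """Effective sequences divided by alignment length; above 1 is the target."""
    return effective_sequences(msa, theta) / len(msa[0])
```

Redundant sequences therefore add little to the count: two identical sequences contribute one effective sequence between them.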

Active site prediction
The potential active sites were predicted using the COACH meta-server [Yang J, et al. 2013]. It is a consensus method that uses multiple different methods to predict potential active sites. Only methods that correctly predicted substrates known for these proteins were taken into consideration. The residues were considered as a part of the active site if the majority of methods predicted them to form the active center.
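The majority rule described above can be sketched as a simple vote over residue numbers. The sketch assumes that each retained method returns a list of residue positions; the method labels and residue numbers in the usage note are hypothetical and do not reproduce the server output.

```python
from collections import Counter


def consensus_site(predictions):
    """Residues predicted by a strict majority of the retained methods.

    `predictions` maps a method label to the residue numbers it reported.
    """
    if not predictions:
        return set()
    votes = Counter(
        residue
        for residues in predictions.values()
        for residue in set(residues)  # each method votes once per residue
    )
    needed = len(predictions) / 2
    return {residue for residue, count in votes.items() if count > needed}
```

For example, with three hypothetical methods reporting `[192, 194, 250]`, `[192, 194]`, and `[192, 300]`, only residues 192 and 194 would be retained.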
In HEK293T cells, the high-frequency enzyme preferred the tag at the C-terminus for homodimerization (Figure 1A, B), while the mutein enzyme favored N-terminally tagged constructs (Figure 1C, D). The high-frequency human A4galt in CHO-Lec2 cells formed homodimers using both N- and C-tagged constructs (Figure 2A, B). The roles of A4galt N- and C-termini in PPIs in different cell types remain unclear and require further investigation.
Low PAE values (see Fig. SI), notably below 5 Å for specific residue pairs, are indicative of a reliable prediction of their spatial relationship, emphasizing the importance of PAE as a key metric for evaluating the structural integrity and precision of protein complexes. Insights into predicted interactions between individual chains were provided by the Mapiya web server [Badaczewska-Dawid AE, et al. 2022] and are elaborated in the Supplementary Data (Fig. SII-SIV).

Fig. 1. NanoBiT assay performed for high-frequency human A4galt (A) and its mutein form (C) with the potential heterologous protein partners (B4galt1, B4galt5, and B4galt6 isoenzymes) using HEK293T cells. The negative controls comprise HaloTag fused with the small NanoLuc subunit. The fold changes of luminescence for the interaction pairs were calculated by dividing the average luminescence measured for the tested combinations (Sample RLU) by the average luminescence obtained for the corresponding negative controls (Control RLU) for high-frequency human A4galt (B) and its

Fig. 2. NanoBiT assay performed for high-frequency human A4galt (A) and its mutein form (C) with the potential heterologous protein partners (B4galt1, B4galt5, and B4galt6 isoenzymes) using CHO-Lec2 cells. The negative controls comprise HaloTag fused with the small NanoLuc subunit. The fold changes of luminescence were calculated by dividing the average luminescence measured for the tested combinations (Sample RLU) by the average luminescence obtained for the corresponding negative controls (Control RLU).