Identification of the Clostridial cellulose synthase and characterization of the cognate glycosyl hydrolase, CcsZ

Biofilms are community structures of bacteria enmeshed in a self-produced matrix of exopolysaccharides. The biofilm matrix serves numerous roles, including resilience and persistence, making biofilms a subject of research interest among persistent clinical pathogens of global health importance. Our current understanding of the underlying biochemical pathways responsible for biosynthesis of these exopolysaccharides is largely limited to Gram-negative bacteria. Clostridia are a class of Gram-positive, anaerobic and spore-forming bacteria and include the important human pathogens Clostridium perfringens, Clostridium botulinum and Clostridioides difficile, among numerous others. Several species of Clostridia have been reported to produce a biofilm matrix that contains an acetylated glucan linked to a series of hypothetical genes. Here, we propose a model for the function of these hypothetical genes, which, using homology modelling, we show plausibly encode a synthase complex responsible for polymerization, modification and export of an O-acetylated cellulose exopolysaccharide. Specifically, the cellulose synthase is homologous to that of the known exopolysaccharide synthases in Gram-negative bacteria. The remaining proteins represent a mosaic of evolutionary lineages that differ from the described Gram-negative cellulose exopolysaccharide synthases, but their predicted functions satisfy all criteria required for a functional cellulose synthase operon. Accordingly, we named these hypothetical genes ccsZABHI, for the Clostridial cellulose synthase (Ccs), in keeping with naming conventions for exopolysaccharide synthase subunits and to distinguish it from the Gram-negative Bcs locus with which it shares only a single one-to-one ortholog. 
To test our model and assess the identity of the exopolysaccharide, we subcloned the putative glycoside hydrolase encoded by ccsZ and solved the X-ray crystal structure of both apo- and product-bound CcsZ, which belongs to glycoside hydrolase family 5 (GH-5). Although not homologous to the Gram-negative cellulose synthase, which instead encodes the structurally distinct BcsZ belonging to GH-8, we show CcsZ displays specificity for cellulosic materials. This specificity of the synthase-associated glycosyl hydrolase validates our proposal that these hypothetical genes are responsible for biosynthesis of a cellulose exopolysaccharide. The data we present here allow us to propose a model for Clostridial cellulose synthesis and serve as an entry point to understanding cellulose biofilm formation among class Clostridia.

made of cellulose, although the molecular mechanisms governing their production have not before been

Biofilms are communities of microorganisms that reside in an extracellular matrix. This extracellular matrix is produced by the community itself and is composed largely of secreted exopolysaccharides. Biofilms are among the most successful and widely distributed forms of life on Earth, enabling bacteria to adhere to, colonize, and persist on a wide variety of surfaces or interfaces. The production of the biofilm extracellular matrix represents a significant resource cost to the organisms producing it, rationalized by the many

At present, an understanding of bacterial cellulose biosynthesis has been largely limited to the Gram-negative

PNAG synthase, encoded by the pgaABCD locus [29]. For example, the lack of both a

an E. coli host as a His-tagged fusion protein and purified it to apparent homogeneity. Structure

available structures of the Gram-negative cellulose synthase component BcsZ from E. coli and P. putida, both GH-8 enzymes with an (α/α)₆ barrel fold. Structural comparison and analysis showed CcsZ has an unusual architecture for substrate accommodation among GH-5, which was further supported by its

Healthcare). Protein eluted from Ni-NTA resin was dialyzed against anion buffer A (50 mM sodium phosphate pH 7.5) for 12-16 h and passed over the column three times. CcsZ-His₆ was eluted from the column using a gradient of 0-100% anion buffer B (50 mM sodium phosphate pH 7.5, 1 M NaCl). The purity of protein obtained from both nickel affinity chromatography and anion exchange chromatography was routinely assessed using SDS-PAGE. Where needed, purified protein was concentrated in a centrifugal filter unit (Pall Corporation) with a nominal MWCO of 10,000 Da.

of 5 mM following crystal growth and to cryoprotectants (crystallization buffer supplemented to 30% (v/v) PEG 8000) just prior to harvesting and vitrification in liquid nitrogen. Data were collected on beamline 08

images of 0.2° Δφ oscillations were collected using incident radiation with a wavelength of 1.0 Å for the product-bound crystal. Collected data were processed with XDS [37]. The space group was determined to be P2₁, with a single copy of CcsZ in the asymmetric unit, using POINTLESS; the data were then scaled using SCALA and reduced using CTRUNCATE [38]. The structure was solved by molecular replacement with the Phaser tool in PHENIX [39] using TmCel5A (PDB ID 3AMD) as a search model. Structural refinement was performed using iterative rounds of automated refinement using the

was performed by LigandFit in PHENIX, followed by calculation of an omit map by Polder in PHENIX.

Ligand real-space refinement was performed in Coot using the Polder map, and validation was performed using MolProbity in PHENIX. Structure interface analysis was performed using the PDBePISA server

Glycoside hydrolase activity assays. Activity was initially assessed qualitatively by spotting proteins on an agar overlay containing dissolved carbohydrates. Agar plates were prepared by heating a solution of 1.5% (w/v) agar and 0.8% (w/v) carbohydrate to 100°C for 10 min. Carbohydrates tested with the agar overlay include CMC, xylan, and hydroxyethylcellulose (Sigma). Spotted on the agar plates were 0.5 mg

The activity of CcsZ was quantitatively monitored using the dinitrosalicylic acid (DNS) method.
substrates were tested at 8.3 g/L. CcsZ (20 μM) was mixed with substrate in assay buffer (

We subsequently examined the loci adjacent to CcsA in each organism listed above to determine whether other cellulose synthase subunits were present in a similar orientation to the bcsABZC operon found in Gram-negative bacteria (Fig. 1). Although the orientation of specific genes varied between the genomes we surveyed, in all cases we also identified a putative glycoside hydrolase, annotated as an endo-glucanase precursor protein, which we presumed to be the functional equivalent of BcsZ and which we denoted CcsZ (Fig. 1; yellow). In addition, we identified a highly conserved protein of unknown function, which we denote herein as CcsB (Fig. 1; gray). Although it is tempting to speculate that these CcsB proteins may serve as the functional equivalent of BcsB, a BlastP search of these sequences returned only other proteins of unknown function found in members of class Clostridia and provided no evidence of homology to BcsB. Homology modelling predicted these CcsB sequences to be related to exo-β-agarases, although we interpreted this with skepticism due to the very limited coverage and identity of the CcsB sequence against the agarase domain fold (i.e., < 40% coverage with < 20% identity in all cases).

To understand the localization of CcsB, we analyzed the sequence with the TMHMM bioinformatics tool, which predicted that the sequence contains two transmembrane helices of 19 and 22 residues in length, with these helices very near to each terminus and connected by an approximately 300 amino acid extracellular domain. As expected, no protein was identified with predicted homology to BcsC, or that contained a predicted TPR or β-barrel domain that would be required for export across an outer membrane, a function mandated only in Gram-negative bacteria (Fig. 1; green).
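The topology described above (two terminal transmembrane helices flanking a large soluble domain) can be illustrated with a much simpler heuristic than TMHMM. The sketch below is not TMHMM, which uses a hidden Markov model; it swaps in a Kyte-Doolittle sliding-window hydropathy scan, using the classical 19-residue window and a 1.6 cutoff as assumed parameters. The toy sequence is hypothetical and is not the CcsB sequence; it only mimics a 19-residue and a 22-residue hydrophobic stretch near each terminus.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    values = [KD[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge overlapping windows whose mean hydropathy meets the cutoff.

    Returns 1-based inclusive (start, end) spans. Because whole windows
    are merged, spans are usually wider than the hydrophobic core itself.
    """
    segments = []
    for i, score in enumerate(hydropathy_profile(seq, window)):
        if score < threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or touches the previous span: extend it.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments


# Hypothetical toy sequence (NOT CcsB): short tail, 19-residue hydrophobic
# stretch, hydrophilic linker, 22-residue hydrophobic stretch, short tail.
TOY_SEQUENCE = (
    "MKK"
    + "LIVALLVAILGLLAVFLIA"
    + "DEKSGNTDQRE" * 5
    + "ILVAALLGIVFLLAAVILGLVA"
    + "KR"
)

if __name__ == "__main__":
    for start, end in candidate_tm_segments(TOY_SEQUENCE):
        print(f"candidate membrane-spanning region: residues {start}-{end}")
```

A scan like this only flags hydrophobic stretches; it does not predict which side of the membrane the connecting domain faces, which is why the extracellular assignment in the text rests on the TMHMM prediction itself.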

We also located two conserved loci that were always found adjacent to ccsABZ, which we named

fluorescens SBW25 is proposed to be carried out by the wssABCDEFGHIJ operon [11]. In this system, wssBCDE are proposed functional equivalents to bcsABZC, with wssAJ predicted to serve in cellular localization of the synthase complex, and wssFGHI predicted to serve in cellulose O-acetylation [11] in an analogous fashion to the alginate acetyltransferases algXFIJ [42]. The ccsH and ccsI genes are

The role of such MBOAT and O-acetyltransferase proteins in similar pathways for the O-acetylation of secondary cell wall polysaccharides (SCWPs) has been demonstrated previously [43]. In

responsible for the production of the PNAG polymer [45]. Interestingly, IcaA contains a canonical GT-2 domain, while IcaD is a smaller protein of unknown function, predicted to be a membrane protein with two TM helices, yet its expression was still necessary for maximal IcaA activity and correct PNAG synthesis [45]. Although CcsB and IcaD share a poor sequence alignment (> 16%), low sequence identity (31%), and no predicted homology, both of these proteins appear conserved in their respective gene clusters and contain two predicted terminal TM helices linked by a single extracellular domain, although CcsB is much larger than IcaD (i.e. 358 residues versus 101) [29,45]. Thus, our model would predict that ccsAB is necessary and sufficient for cellulose polymerization at the cytoplasmic membrane as reported for icaAD.

respectively. Rfree is the sum extended over a subset of reflections excluded from all stages of the refinement. ‡ As calculated using MolProbity [60].
In agreement with typical GH-5 enzymes, CcsZ folds into an overall structure adopting a distorted TIM barrel fold, in which an (α/β)₈ barrel is formed at the core of the fold by eight parallel β-strands (Fig. 3, A and B). This β-barrel motif is flanked by a series of eight partially distorted α-helices packed against the core β-strands, which are connected by extended loops along the C-terminal face of the β-barrel motif. The extended C-terminal loops shape a deep cleft that is typical of GH-5, where both the active site and the cleft for substrate accommodation are located. Interestingly, we observed this groove to be strongly negatively charged in CcsZ, which is not a typical feature of GH-5 (Fig. 3C). The GH-5 consensus

difficile were also classified in GH-5_25 [49]. It was not surprising that based upon the molecular phylogenetics of GH-5_25, the TmCel5A enzyme we used as the search model for molecular

GH-5_25 is reported as a polyspecific subfamily of GH-5 that possesses multiple activities [49].

TmCel5A was also reported to be highly thermostable [53], a feature rationalized by a larger fraction of buried atoms, a smaller accessible surface area, and the presence of shorter unstructured loops as compared to the structure of a mesophilic GH-5 cellulase from Clostridium cellulolyticum (CcCel5A; PDB ID 1EDG) [50]. However, for the purpose of comparison, it is worth noting that CcCel5A does not belong to GH-5_25 but instead to GH-5_4, which includes endo-β-(1,4)

glucan structures. We also did not anticipate CcsZ to possess features associated with extended thermal stability given that C. difficile is a mesophilic bacterium and that the biofilm phenotype has been reported at temperatures of 25-37°C [8,32]. Accordingly, we set out to biochemically characterize CcsZ to further explore these properties.

The product-bound structure of CcsZ. Next, in order to experimentally resolve the subsite architecture and mechanism of substrate accommodation by CcsZ, we attempted to solve the structure of CcsZ in complex with cello-oligosaccharides. We were able to grow crystals in a distinct but chemically similar condition to our apo-CcsZ crystals and successfully introduced cellotriose following crystal growth but prior to crystal harvesting. These crystals diffracted X-rays to 1.65 Å resolution and also grew in the space group P2₁ containing a single polypeptide in the asymmetric unit (Table 1). We solved the structure of our ligand complex with molecular replacement using the structure of the apo-form as the search model.

The ligand-bound structure was in complete agreement with the apo-form, and although the experimental

Table 1). The final model of the CcsZ-ligand complex also covered almost the entirety of the protein, from residues 39-340, and included the missing S91 from the apo-form.

Following refinement of the complex structure, we observed a Fourier electron density peak,

Thus, structural plasticity of the substrate-binding groove and induced-fit distortion of longer saccharide substrates are certainly possible in these enzymes and are also conceivable in CcsZ, given the structure we observed.

CcsZ is an endo-β-glucanase. To test if CcsZ was in fact capable of hydrolytic activity on cellulose, we first assessed its activity on the soluble substrate analogue carboxymethylcellulose (CMC) using the CMC-agar overlay method reported by Mazur and Zimmer for BcsZ [27]. Following staining with Congo Red, a zone of clearing was observed, suggesting CcsZ was capable of cleaving the glucosidic bonds of CMC after 1 h incubation (Fig. 6A). We also performed this same agar overlay experiment using equal concentrations (0.8% w/v each) of CMC, hydroxyethyl cellulose (HEC) and beechwood xylan with a 24 h incubation. As expected, we observed that CcsZ was active on CMC and HEC but was only capable of partially cleaving xylan (Fig. S1).

Subsequently, we sought to measure CcsZ activity on CMC quantitatively using the dinitrosalicylic acid (DNS) reducing sugar assay, where enzyme activity was calculated from our raw data using a standard curve of glucose prepared using the DNS method. We observed an increase in reducing sugar concentration following incubation with CcsZ in a concentration-dependent manner, further indicating CcsZ is capable of using CMC as a substrate (Fig. 6B). We also tested CcsZ with CMC under a range of pH buffer conditions to measure pH stability. We found CcsZ to have a pH optimum of

buffer conditions and still demonstrated detectable hydrolase activity on CMC across all pH values we tested (pH 4-10). CcsZ activity was reduced two-fold between pH 4.5 and 7.5, with a particular loss in activity under alkaline conditions, showing a six-fold reduction in activity at pH 9.5.
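The conversion of raw DNS readings into reducing sugar concentrations through a glucose standard curve can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the standard concentrations, absorbance readings, and the enzyme-free control value are all hypothetical numbers chosen to lie on a straight line, and the curve is assumed to be linear over the range used.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def reducing_sugar_mM(absorbance, slope, intercept):
    """Invert the standard curve: absorbance -> glucose equivalents (mM)."""
    return (absorbance - intercept) / slope


# Hypothetical glucose standards (mM) and their DNS absorbance readings.
glucose_standards_mM = [0.0, 1.0, 2.0, 4.0, 8.0]
standard_absorbance = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(glucose_standards_mM, standard_absorbance)

# Hypothetical readings for an enzyme reaction and its enzyme-free control.
sample_mM = reducing_sugar_mM(0.35, slope, intercept)
control_mM = reducing_sugar_mM(0.10, slope, intercept)

# Reducing sugar released by the enzyme, in glucose equivalents.
released_mM = sample_mM - control_mM

if __name__ == "__main__":
    print(f"slope = {slope:.3f} A per mM, intercept = {intercept:.3f} A")
    print(f"released reducing sugar = {released_mM:.2f} mM glucose equivalents")
```

Subtracting the enzyme-free control removes background reducing ends already present in the substrate, so the result reflects only bonds cleaved during the incubation.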

To assess substrate specificity, we also tested the common GH-5 polysaccharide substrates arabinoxylan, xylan, lichenin and β-glucan using the same DNS reducing sugar assay. We did not observe a significant increase in reducing sugar concentration following prolonged CcsZ incubation (i.e.

Table 2). Our enzyme-free control contained only the m/z species cellobiose disaccharide (G2) and cellotriose trisaccharide (G3) species. In addition, we also observed unreacted and intact starting material (G5) among the CcsZ products, as well as m/z species that corresponded to the glucose monosaccharide (G1) and cellotetraose oligosaccharide (G4). Although not truly quantitative, we noticed the relative abundance of these m/z species was markedly lower than that of the di- and trisaccharide species, and that their intensity on the liquid chromatograph matched this low ionization potential under constant conditions. These data suggested the preferred regioselectivity of CcsZ as an endo-glucanase and also accounted for the presence of small quantities of glucose, likely an enzymatic side product, which we were able to structurally resolve bound to CcsZ.

The pH profile of CcsZ displays a clear preference for acidic conditions with a pH optimum of 4.5. Activity was calculated using the DNS method with solubilized CMC substrate. Buffer solutions used (50 mM) are listed above data points. (B) Substrate utilization profile of CcsZ using common GH-5 substrates. CcsZ exhibited five-fold greater activity on mixed-linkage β-glucan as compared to CMC when assayed using the DNS method. CcsZ activity on arabinoxylan, lichenin, and xylan was not significantly different from an enzyme-free control under our assay conditions.

Cleavage of cellopentaose (G5) by CcsZ was observed to occur in an endo-acting fashion, resulting in products G2 and G3. Additionally, enzymatic products G1 and G4 were observed in lower relative abundance, corresponding to exo-acting hydrolytic activity at terminal saccharides.
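The relationship between cleavage position and the observed products can be made explicit by enumerating every single hydrolytic cut of cellopentaose. The sketch below is an illustration only: it assumes one cleavage event per G5 molecule, and it reports neutral monoisotopic masses because the ionization adduct is not stated in the text, so observed m/z values would be shifted by whichever adduct applied.

```python
GLUCOSE_MONOISOTOPIC = 180.06339  # C6H12O6
WATER_MONOISOTOPIC = 18.01056     # H2O lost per glycosidic bond formed


def oligomer_mass(dp):
    """Neutral monoisotopic mass of a linear glucose oligomer of length dp."""
    return dp * GLUCOSE_MONOISOTOPIC - (dp - 1) * WATER_MONOISOTOPIC


def single_cleavage_products(dp):
    """Enumerate hydrolysis of each of the dp - 1 glycosidic bonds.

    Returns (bond, left_dp, right_dp, kind) tuples, where kind is
    'terminal' if a single glucose is released and 'internal' otherwise.
    """
    products = []
    for bond in range(1, dp):
        left, right = bond, dp - bond
        kind = "terminal" if min(left, right) == 1 else "internal"
        products.append((bond, left, right, kind))
    return products


cuts = single_cleavage_products(5)
internal_products = {dp for _, l, r, kind in cuts if kind == "internal" for dp in (l, r)}
terminal_products = {dp for _, l, r, kind in cuts if kind == "terminal" for dp in (l, r)}

if __name__ == "__main__":
    for bond, left, right, kind in cuts:
        print(f"bond {bond}: G{left} + G{right} ({kind})")
    for dp in range(1, 6):
        print(f"G{dp}: {oligomer_mass(dp):.4f} Da (neutral, monoisotopic)")
```

Under these assumptions, cuts at the two internal bonds yield only G2 and G3, while cuts at either terminal bond yield G1 and G4, which matches the product classes described above.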