Hidden potential of the supporting scaffold as a structural module for plant cystatin design

Protein engineering approaches have been proposed to improve the inhibitory properties of plant cystatins towards herbivorous pest digestive Cys proteases, typically involving sequence alterations in the inhibitory loops and/or N-terminal trunk of the protein interacting with specific amino acid residues of the target protease. In this study, we assessed whether the loops-supporting frame, or scaffold, would represent a valuable structural module for cystatin function improvement. Twenty hybrid cystatins were designed in silico, consisting of the N-terminal trunk and two inhibitory loops of a given donor cystatin grafted onto the scaffold of an alternative, recipient cystatin. Synthetic genes for the hybrids were expressed in E. coli, and the resulting proteins assessed for their potency to inhibit model Cys protease papain and the digestive Cys proteases of Colorado potato beetle (Leptinotarsa decemlineata) used as an insect pest model. In line with the occurrence of positively selected amino acids presumably influencing inhibitory activity in the scaffold region of plant cystatins, grafting the N-terminal trunk and inhibitory loops of a given cystatin onto the scaffold of an alternative cystatin generally had an effect on the inhibitory potency of these function-related elements against Cys proteases. For instance, hybrid cystatins including the three structural elements of model tomato cystatin SlCYS8 grafted on the scaffold of cystatins from other plant families showed Ki values altered by up to 3-fold for papain, and inhibitory efficiencies increased by up to 8-fold against L. decemlineata cathepsin L-like proteases, compared to wild-type SlCYS8 bearing the original scaffold. Our data point to a significant influence of the cystatin scaffold on the inhibitory activity of the N-terminal trunk and protease inhibitory loops. They also suggest the potential of this structural element as a module for plant cystatin design to generate functional variability against Cys proteases, including the digestive proteases of herbivorous pests.


| INTRODUCTION
Numerous studies have discussed the protective effects of plant cystatins against herbivorous arthropods. 1,2 These proteins act as pseudosubstrate inhibitors to obstruct the active site cleft of Cys proteases and block their catalytic activity on protein substrates. 3 Upon feeding, cystatins inhibit digestive proteases in the arthropod's digestive tract, to cause amino acid shortage leading to growth delays, developmental defects, and eventual death of the herbivore. 4 From a physiological standpoint, the plant protective effect of a given cystatin is determined by its relative abundance compared to target Cys proteases in the arthropod midgut, by its basic inhibitory potency against these enzymes, and by eventual compensatory responses triggered in the herbivore following plant tissue intake. 5 Herbivorous arthropods have developed effective strategies to cope with dietary protease inhibitors, notably involving the production of protease isoforms weakly sensitive to inhibition and the overexpression of protease-sensitive proteases to compensate for the loss of protein digestive functions. 6 A key challenge to improve the protective effects of plant cystatins is to identify, or to develop, cystatin variants that show improved potency and/or a broadened inhibitory range against the target arthropod digestive proteases. 2,7 Different strategies have been proposed for the molecular improvement of cystatins towards Cys proteases, usually involving primary sequence mutations in function-related regions of the protein. The inhibitory function of plant cystatins relies on two hairpin loops that establish physical interactions with specific amino acid residues in the active site cleft of the target enzyme. 8 The N-terminal region, or Nterminal trunk, which interacts with surface residues on the target enzyme, also has a strong influence on the inhibitory potency and specificity of the cystatin towards different protease isoforms, owing to a steric impact on the spatial orientation of the inhibitory loops. 9,10 Accordingly, protein engineering schemes involving site-directed mutagenesis or random mutations of functionally relevant amino acids in the N-terminal trunk or inhibitory loops of plant cystatins allowed for the design of improved cystatin variants potentially useful in plant protection against herbivorous pests or pathogens. 2,11 Likewise, a newly described strategy for protein design that involves the substitution of function-related structural elements by the corresponding elements of an alternative cystatin allowed for the design of cystatin variants with highly improved inhibitory activity against digestive Cys proteases of the herbivorous insect model Colorado potato beetle, Leptinotarsa decemlineata. 12 Our goal in this study was to assess whether the non-functional elements of a cystatin, referred to collectively as the 'scaffold', would represent a relevant structural module for plant cystatin improvement. Cystatin scaffolds have been used to different ends in recent years, notably to develop accessory proteins useful in stabilizing and purifying recombinant proteins in heterologous protein expression systems, 13 to design Affimer binding proteins for a variety of imaging, diagnostic and therapeutic purposes, 14,15 or as an inspiration for de novo protein design involving customized protein folds. 16 Here we used twelve cystatin scaffolds as supporting templates for the N-terminal trunk and inhibitory loops of alternative cystatins, to assess their usefulness in generating functional diversity among a finite collection of hybrid cystatin variants. The a-helix and inter-loop region of plant cystatins include positively selected amino acids presumably indicative of an impact on the protease inhibitory range or potency of the cystatin. 8,17 The central fold of cystatins, although not directly involved in the inhibitory process, might represent a valuable target for plant cystatin engineering given its possible impact on the spatial orientation and stability of the protein's functional elements.

| A generic scheme for cystatin scaffold substitution
A scaffold substitution strategy was designed (1) to assess whether grafting the N-terminal trunk and two inhibitory loops of a given cystatin on the scaffold of a second cystatin could be a viable option for the design of stable and active hybrid inhibitors, and (2) to determine whether natural cystatin scaffolds among the plant cystatin protein family could represent a pool of structural modules useful to generate cystatin variants with improved inhibitory potency against insect Cys proteases. A first set of hybrids were designed with the functional elements of tomato multicystatin domain SlCYS8, chosen based on the well described suitability of this inhibitor for protein engineering work 12,13,18,19 and its moderate activity against arthropod Cys proteases, including L. decemlineata digestive Cys proteases, compared to other plant cystatins. 12 A second set of hybrids were designed with the functional elements of Physcomitrella patens C-tailed cystatin domain PpCYS and potato multicystatin domain StCYS5, two cystatins highly active against herbivorous arthropod Cys proteases, 12 to assess whether the inhibitory efficiency of naturally potent cystatins could be explained, at least in part, by their supporting scaffold. A third set of hybrids were designed with the N-terminal and inhibitory loops of cucumber cystatin CsCYS, a weak inhibitor of arthropod Cys proteases, 12 to determine the relative influence of these functional elements and their supporting scaffold on the resulting potency of the cystatin.
Cystatin hybrids were designed in silico by 'grafting' the N-terminal trunk and two inhibitory loops of 'donor' cystatins onto the supporting scaffold of alternative, 'recipient' cystatins deemed representative of the plant cystatin family ( Table 1). The three functional elements were defined based on their position relative to conserved amino acid motifs essential for protease inhibitory activity, in such a way as to include amino acids assumed to physically interact with amino acid residues in the active site or at the surface of the target enzyme (Figure 1). 9 The scaffold corresponded to the polypeptide sequences remaining, including the a-helix and 1 st elbow between the N-terminal trunk and first inhibitory loop, the 2 nd elbow between the two inhibitory loops, and the amino acid string downstream of the second inhibitory loop at the C-terminal end. More specifically, the N-terminal trunks were defined based on the Gly-Gly (GG) motif conserved in the N-terminal trunk of plant cystatins, the first inhibitory loops based on the conserved pentapeptide motif Gln-X-Val-X-Gly (QxVxG) known to penetrate the active site of the target protease, and the second inhibitory loops based on the conserved Trp (W) residue also entering the active site cleft of the protease. 8 Twenty hybrids were designed overall, with the N-terminal trunk and inhibitory loops of four donors (SlCYS8, PpCYS, StCYS5, CsCYS) and the supporting scaffolds of 12 recipients including the loop donors ( Table 2 and Supplementary Table 1). Synthetic DNA's were generated for the 20 hybrids and used as templates for heterologous expression in E.
coli and affinity purification using the GST gene fusion system. 20 Eighteen hybrids, out of 20, were purified in a form suitable for protease inhibitory assays with papain and L. decemlineata Cys proteases (Table 2 and Figure 2). The other two hybrids -with the functional elements of potato StCYS5-could not be properly expressed under our experimental conditions, likely due to unsuccessful folding and/or poor stability of the resulting protein products expressed in a foreign cellular environment.

| The inhibitory potency of SlCYS8 functional elements is scaffold-dependent
Protease inhibitory assays were conducted to measure the impact of scaffold substitutions on the anti-papain activity of SlCYS8 N-terminal trunk and inhibitory loops ( HaCYS3, that showed relative inhibitory rates increased by >4 to 10-fold compared to wildtype SlCYS8 (P < 0.05; post-ANOVA Tukey's test). The less potent hybrids were those with the scaffolds of PpCYS, SlCYS9, ZmCYSB and StCYS5, which had little effect on the inhibitory potency of SlCYS8 functional elements. Unlike for papain, no positive relation could be established between the basic efficiency of the original cystatins against L. decemlineata cathepsin L-like enzymes and the inhibitory potency of their resulting SlCYS8 hybrids against the same enzymes. For instance, the less potent hybrids included the scaffolds of cystatins, such as PpCYS, SlCYS9 or StCYS5, exhibiting strong inhibitory activity against the insect proteases. 12 In sharp contrast, the most potent inhibitors, with the scaffolds of CsCYS, HaCYS3, JcCYS or lotus cystatin LjCYSB, were derived from cystatins exhibiting low to very weak activity against these enzymes. 12 Complementary assays were performed to assess the inhibitory potency of SlCYS8 hybrids against L. decemlineata cathepsin B-like enzymes (Figure 4, right panel). In accordance with previous studies reporting the poor susceptibility of these enzymes to plant cystatin inhibition, 10,20 most scaffold substitutions had no effect on the effectiveness of SlCYS8 functional elements to inhibit their activity. Unexpectedly, the scaffolds of PpCYS, a strong inhibitor of herbivorous arthropod Cys proteases, 12 and CsCYS, a weak inhibitor of these enzymes, had a positive impact on the SlCYS8 elements against these 'naturally insensitive' proteases, as inferred from inhibitory rates increased by 2-to 5-fold compared to wild-type SlCYS8. These observations, along with those above about cathepsin L-like enzymes, suggested a role for the supporting scaffold on the inhibitory potency of plant cystatin functional elements against Cys proteases. In practice, they suggested the potential of plant cystatin scaffolds as structural modules to generate functional diversity among small collections of recombinant cystatin variants, in complement to current protein engineering approaches relying on primary sequence modifications in the N-terminal trunk and two inhibitory loops.

| The scaffold is a strong determinant of plant cystatins inhibitory potency
Cystatin hybrids designed with the N-terminal trunk and inhibitory loops of PpCYS, StCYS5 and CsCYS were used to assess the relative contributions of their scaffold to the inhibition of L. decemlineata proteases (Figure 5). In brief, cystatin hybrids with the functional elements of PpCYS or StCYS5 grafted onto alternative scaffolds showed decreased potency against the insect cathepsin L-like enzymes compared to their wild-type counterpart ( Distribution maps were inferred from the L. decemlineata cathepsin L inhibition data to compare the potential of our new scaffold substitution strategy in generating functional variability with our recently described strategy based on the substitution of N-terminal trunks or inhibitory loops (Figure 6). An overall coefficient of variation (CV) of 62% was calculated for inhibitory data generated with the scaffold hybrids, similar to a CV value of 61% calculated for the cathepsin L inhibition data recently produced with SlCYS8 structural element (SE) hybrids. 12 Much interestingly, the average cathepsin L inhibitory rate for the SlCYS8 scaffold hybrids was almost three times the inhibitory rate measured for wild-type SlCYS8 (see Figure   4), similar to the improved inhibitory rates observed for the best variants of a previously described collection of SlCYS8 single mutants bearing alternative residues at positively selected sites. 20 Together, these observations confirmed the potential of our new scaffold substitution approach to implement functional diversity among a small set of recombinant cystatin variants considered for plant protection, in complement to the usual mutational approaches involving structural changes in the functional elements.

| CONCLUSION
Protein engineering approaches have been proposed to improve the inhibitory properties of plant cystatins against herbivorous arthropod digestive Cys proteases, typically involving rationally inferred or randomly generated mutations at functionally relevant amino acid sites of the protein.

| Recombinant cystatins
All cystatins were produced in E. coli, strain BL21 as described previously, 20 using the GST gene fusion system for heterologous expression and affinity purification (GE Healthcare). DNA templates for the cystatins were synthesized as g-blocks (Integrated DNA Technologies) including GoldenGate BsaI cloning sites on both sides of the cystatin coding region. DNA coding sequences for the original cystatins corresponded to those reported in GenBank (as listed in Table 1). DNA sequences for the scaffold hybrids were designed as described on

| Test proteases
Papain (E.C.3.4.22.2, from papaya latex) was from Millipore Sigma. Insect proteases were extracted from the midgut of L. decemlineata fourth instars collected on potato plants, as described previously. 20 Papain and soluble proteins in the arthropod crude extracts were assayed according to Bradford, 31 with bovine serum albumin as a protein standard.

| Arthropod protease assays
Cys cathepsin activities in L. decemlineata protein extracts were assayed in 0.1 M NaH2PO4/0.1 M Na2HPO4 phosphate buffer, pH 6.5, using the synthetic peptide substrates Z-Phe-Arg-MCA for cathepsin L-like and Z-Arg-Arg-MCA for cathepsin B-like activities. 12 Hydrolysis was allowed to proceed in reducing conditions (10 mM L-cysteine) for 10 min at 25°C, with ~5 ng of arthropod protein per µl in the reaction mixture and the peptide substrate in large excess. Cystatins diluted in a minimal volume of reaction buffer were added to the reaction mixture for the inhibitory assays. Proteolytic activity was monitored using a Synergy      Details on the functional element donors and scaffold recipients are given in Table 1, Table 2 and  CsCYS N-ter + loops CsCYS N-ter + loops