Abstract
TGF-β is a secreted signaling protein involved in many physiological processes: organ development, production and maintenance of the extracellular matrix, as well as regulation of the adaptive immune system. As a cytokine, TGF-β stimulates the differentiation of CD4+ T-cells into regulatory T-cells (Tregs) that act to promote peripheral immune tolerance. The murine parasite Heligmosomoides polygyrus takes advantage of this pathway, inducing Foxp3+ Tregs in a similar manner using a TGF-β mimic (TGM), comprised of five tandem complement control protein (CCP) domains, designated D1-D5. Despite having no structural homology to TGF-β or to TGF-β family proteins, TGM binds directly to the TGF-β type I and type II receptors, TβRI and TβRII. To investigate this further, NMR titrations and SPR and ITC binding experiments were performed, showing that TGM-D2, with the aid of D1, binds TβRI and TGM-D3 binds TβRII. Competition ITC experiments showed that TGM-D3 competes with TGF-β for binding to TβRII, consistent with TGM-D3-induced NMR chemical shift perturbations of TβRII, which aligned with the areas of TβRII that become solvent inaccessible upon binding TGF-β. Thus, TGM-D3 binds to the same edge β-strand of TβRII that is used to bind TGF-β. Competition ITC experiments demonstrated that TGM-D1D2 and TGF-β3:TβRII compete for binding to TβRI, while TGM-D2-induced NMR chemical shift perturbations of TβRI showed that TGM-D2 binds to the same pre-helix extension of TβRI as does the TGF-β/TβRII binary complex. The solution structure of TGM-D3 revealed that while it has the overall structure of a CCP domain, TGM-D3 has an insertion in the hypervariable loop uncommon to CCP domains. These findings suggest that parasitic TGM, despite its lack of structural similarity to TGF-β, evolved to take advantage of the binding regions of the mammalian TGF-β type I and type II receptors. The structure of this TGM domain, along with the predicted structure of other H. polygyrus secreted proteins reported in the literature, suggests that TGM is part of a larger family of evolutionarily-adapted immunomodulatory CCP-containing proteins.
Introduction
Helminth parasites are extraordinarily prevalent in all host organisms and remain major human health burdens in tropical regions of the world1,2; their longevity reflects an evolutionarily-refined ability to evade the immune system through multiple molecular strategies which are only now becoming understood3-5. A number of helminth infections are associated with activation of the regulatory T-cell (Treg) compartment which dampens inflammation and restricts anti-parasite immunity, either through expansion of the host’s pre-existing Tregs or through inducing de novo differentiation of peripheral T-cells into the Treg subset6. Notably, infection with the mouse model helminth Heligmosomoides polygyrus expands Treg activity, and worm clearance can be induced by antibody-mediated depletion of Tregs7. Tregs express a defining transcription factor, Foxp3, which can be induced in naive T-cells by the pleiotropic cytokine TGF-β8-10. Moreover, H. polygyrus secretory-excretory products (HES) can induce Tregs by signaling through the TGF-β receptors, TβRI and TβRII11. Fractionation and analysis by a TGF-β response bioassay isolated the protein in HES responsible for stimulating the TGF-β signaling pathway and inducing Tregs, as a five-domain ca. 420 amino acid protein designated TGF-β mimic (TGM), which is fully active as a full-length protein requiring no post-translational processing, and bearing no sequence similarity to any member of the TGF-β family12.
Canonical TGF-β proteins control a multitude of pathways in cellular differentiation13-15 and immune homeostasis9,13,16, and are tightly regulated by pro-domain processing to yield ∼110-amino acid cystine-knotted monomers tethered together into active 25kDa homodimers by a single inter-chain disulfide bond. They signal by assembling a heterotetrameric complex with two pairs of two serine/threonine kinase type I and type II receptors, TβRI and TβRII respectively17-19. TGF-β-dependent differentiation of naïve CD4+ cells into CD4+ CD25+ Foxp3+ Tregs, is essential for peripheral immune tolerance8,9 as mice lacking TGF-β1 exhibit perinatal mortality and develop multi-organ inflammatory disease and die after maternal TGF-β is depleted upon weaning13. Dysregulation of the TGF-β signaling pathway has been implicated in the pathogenesis of several human diseases, ranging from inflammatory bowel disease20, renal and cardiac fibrosis21,22, and many soft tissue cancers21,23,24. In the lattermost setting, TGF-β dysregulation prevents effective checkpoint immunotherapy25,26 and thus offers a therapeutic target in its own right27.
In contrast to the single-domain structure of mature TGF-β, the parasite-encoded TGM molecule is composed of five modular domains, all with distinct similarity to the complement control protein (CCP) family; CCP domains are ca. 60 amino acids in length and are comprised of multiple short β-strands tethered together by two highly conserved disulfide bonds (CysI-CysIII, CysII-CysIV)28. CCP domains are not usually found alone, but are instead usually found as arrays, in some cases with as many as 30 repeated modules28. These domains are present in numerous proteins, but they are most prevalent in the family of proteins that act to regulate complement (RCA), which among others include decay accelerating factor (DAF), Factor H (FH), and Complement C3b/C4b Receptor 1 (CR1)28. In H. polygyrus, more than 30 CCP-containing proteins have been detected29,30, including HpARI (H. polygyrus Alarmin Release Inhibitor) and HpBARI (H. polygyrus Binds Alarmin Receptor and Inhibits) which suppress IL-33 signaling by binding IL-33 and its receptor ST-231-33; these cytokines prime innate and adaptive type 2 immune responses and thus suppression via HpARI and HpBARI acts as an alternate form of immunosuppression. Similar to TGM, HpARI and HpBARI contain multiple CCP domains (3 and 2, respectively) and have significant insertions not present in canonical CCP domains12,31,32.
Previous studies have shown that of the five domains of TGM, only the first three, D1-D3, are required for TGF-β signaling activity29. Previous studies also showed that TGM, in contrast to TGF-β, binds TβRI with high affinity (KD 52.1 nM) and TβRII with lower affinity (KD 0.55 μM) and that it does so without the pronounced inter-dependence of TβRI and TβRII binding characteristic of the TGF-βs17,34. Here we biophysically characterized the individual domains of TGM to show that binding of TGM to human TβRI and TβRII is modular in nature, with D1-D2 and D3 binding to TβRI and TβRII with 1:1 stoichiometry, respectively. We characterized the binding sites that TβRI and TβRII use to bind TGM using ITC, SPR, and NMR, finding that they utilize structural motifs similar to those used to bind TGF-β. Finally, we determined the solution structure of TGM-D3, which showed that TGM-D3 takes on the overall fold of a CCP domain with two key differences: 1) loops replacing two β-strands, and 2) an atypical insertion at the hypervariable loop (HVL). Using NMR chemical shift perturbations, the binding site of TβRII on TGM-D3 was mapped, with large targeted shifts across the four β-strands, but particularly focused on the C-terminal β-strands. The atypical insertion at the HVL dramatically increases its length, which allows it to block both faces of the N-terminal region of TGM-D3 from binding TβRII. Thus, TβRII is able to insert the edge β-strand it uses to bind TGF-β alongside the face of TGM-D3, interacting most with the C-terminal β-strands of TGM-D3. This highlights that the parasite has evolved to take advantage of host TGF-β receptors, and that more broadly H. polygyrus has adapted its own CCP domain-containing proteins for the purpose of protein mimicry and host immunomodulation.
Results
Isolation and folding of TGM-D1, -D2, and -D3
Previously, in vitro TGF-β bioassays demonstrated that only TGM domains 1-3 were required for full TGF-β Smad reporter activity and induction of CD4+ CD25+ Foxp3+ Tregs29. Truncation of domains 4 and 5 (TGM-D1-3) showed minimal reduction in TGF-β activity, while loss of domain 1 (TGM-D2-5), domains 3-5 (TGM-D1-2), or domains 2-5 (TGM-D1) completely abolished TGF-β activity29. TGM was furthermore shown to require both TβRI and TβRII to elicit TGF-β signaling, as TGM activity was inhibited by SB43154235, a TβRI kinase inhibitor, and by ITD-1, which stimulates ubiquitin-dependent degradation of TβRII12.
The engagement of the TGF-β receptors by TGM was thus investigated by assessing the receptor binding properties of TGM domains 1, 2, and 3, each as an isolated protein. To enable isotopic labeling, the individual domains were produced in E. coli. TGM-D1, -D2, and -D3 each contain four cysteines and are expected to form two disulfides by homology to CCP domains. Though steps were taken to produce each domain as a soluble protein by fusion to thioredoxin and expression at reduced temperature (18 °C), this was not successful. Thus, the fusion proteins were expressed at 37 °C in the form of insoluble inclusion bodies, and refolded in the presence of a glutathione redox couple.
The isolated proteins were validated by mass spectrometry, which showed that their masses matched, to within 0.5 Da or less of the mass calculated assuming the formation of two disulfide bonds (i.e. loss of two protons for each disulfide bond). To further validate the refolded proteins, they were labeled with 15N and analyzed by recording 1H-15N HSQC spectra in phosphate buffer at pH 6.0 at 37 °C. The CCP family of proteins, to which TGM belongs, are characterized by four or more β-strands36,37, and thus in addition to having the expected number of peaks indicative of a homogenous pairing of the cysteines, the labeled proteins would also be expected to have well-dispersed spectra, with minimal clustering in the random coil region (between 7.8 and 8.3 ppm in the 1H dimension). This was observed for TGM-D3, which had very close to the expected number of backbone amide resonances (80 observed, 81 expected) and excellent signal dispersion (Fig. S1A).
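The mass check described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis: the function names and the example masses are hypothetical, and it assumes that each disulfide removes two hydrogen atoms from the fully reduced protein.

```python
# Average atomic mass of hydrogen in Da (standard atomic weight, rounded).
HYDROGEN_MASS_DA = 1.008


def disulfide_mass_shift(n_disulfides: int) -> float:
    """Mass lost (Da) when n disulfides form; two H atoms are lost per bond."""
    return 2 * n_disulfides * HYDROGEN_MASS_DA


def matches_expected(observed_da: float, reduced_mass_da: float,
                     n_disulfides: int, tol_da: float = 0.5) -> bool:
    """True if the observed mass agrees with the oxidized mass within tol_da.

    reduced_mass_da is the mass calculated for the fully reduced protein
    (a hypothetical input here); tol_da mirrors the 0.5 Da criterion in the text.
    """
    expected = reduced_mass_da - disulfide_mass_shift(n_disulfides)
    return abs(observed_da - expected) <= tol_da


if __name__ == "__main__":
    # Hypothetical masses chosen only to show the arithmetic.
    print(disulfide_mass_shift(2))
    print(matches_expected(9296.1, 9300.0, n_disulfides=2))
```

With two disulfides the expected shift is roughly 4 Da, so a 0.5 Da tolerance is tight enough to separate the fully oxidized protein from a form with one or no disulfides.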
TGM-D2, although also exhibiting excellent signal dispersion, had more backbone amide resonances than expected (106 observed, 76 expected), suggesting sample heterogeneity (Fig. S1B). This could be due to heterogenous pairing of the disulfides, or may result from conformational dynamics. To distinguish between these, HSQC ZZ-exchange spectra were recorded with mixing times ranging between 0 – 250 ms 38. These experiments identified at least 12 pairs of peaks undergoing exchange on this timescale, indicating that the protein is likely homogenous with respect to the pairing of its cysteines, but undergoes a slow conformational transition that leads to two conformations in solution. The process responsible for the doubling was not investigated, but would be consistent with proline cis:trans isomerization owing to the slow (ca. 100 ms) timescale by which the exchange cross peaks build up and the fact that TGM-D2 has four additional proline residues relative to TGM-D1 (Fig. S1C, Table S1).
TGM-D1, in contrast to TGM-D2 and TGM-D3, had poor signal dispersion, with most peaks clustered in the random coil region of the spectrum (Fig. S2A). This narrow dispersion might reflect the presence of non-native protein due to mispairing of cysteines, or it might be due to other reasons, such as aggregation. To investigate the latter, increasing concentrations of CHAPS were added to the NMR buffer and the protein concentration was decreased. This led to the appearance of a large number of peaks outside of the random coil region (Fig. S2B-D). The spectrum with 20 μM TGM-D1 and 10 mM CHAPS in the buffer had roughly the expected number of peaks (85), but also a few intense peaks in the random coil region of the spectrum (Fig. S2D). Thus, TGM-D1 appears to be natively folded, but is not entirely disaggregated under these conditions.
NMR-detected binding of TGM-D1, -D2, and -D3 by TβRI and TβRII
To assess binding of TβRII by TGM-D1, TGM-D2, and TGM-D3, unlabeled TβRII was titrated into samples of 15N-labeled TGM-D1, TGM-D2, or TGM-D3. The addition of TβRII resulted in significant perturbations in more than half of the backbone amide signals of TGM-D3 (Fig. 1A), but little to no perturbations in the signals of either TGM-D1 or TGM-D2 (Fig. S3A-B). Through the course of the titration of TGM-D3 with TβRII, peaks corresponding to both the bound and unbound forms of TGM-D3 were observed at intermediate titration points (1:0.35 and 1:0.70), indicative of slow-exchange binding (Fig. 1C). This suggested that TβRII binds TGM-D3 with relatively high affinity, but binds weakly or not at all to TGM-D1 or TGM-D2. The presence of 10 mM CHAPS in the TGM-D1 sample may, however, impede binding and thus a role of TGM-D1 in binding TβRII cannot at this point be excluded. The NMR titrations of TGM-D3 with TβRII further suggest that TGM-D3 binds TβRII with a stoichiometry of 1:1 (Fig. 1C).
The addition of TβRI resulted in significant perturbations in more than half the backbone amide signals of TGM-D2 (Fig. 1B), but little to no perturbations in the signals of either TGM-D1 or TGM-D3 (Fig. S3C-D). However, similar to TGM-D1:TβRII, the presence of 10 mM CHAPS in the TGM-D1 sample may impede binding and thus a role of TGM-D1 in binding TβRI cannot be excluded. Through titrations in which increasing amounts of TβRI were added to TGM-D2, peaks corresponding to both the bound and unbound forms were observed under sub-stoichiometric conditions (Fig. 1D), indicative of relatively high affinity binding. The stoichiometry of binding was confirmed by titrating the amount of TβRI needed to fully convert the unbound form of 15N TGM-D2 to the bound form using NMR (Fig. S4A-D), with samples from each titration analyzed by size-exclusion chromatography (SEC) (Fig. S4E-I). Furthermore, a sample of TβRI with excess TGM-D2 was analyzed by SEC as monitored by multiangle light scattering (SEC-MALS) (Fig. S4J). The peak corresponding to the complex had a molecular mass of 17.1 ± 1.5 kDa, while the later eluting peak, corresponding to TGM-D2, had a mass of 10.1 ± 0.4 kDa. These experimentally measured masses are close to those expected for a 1:1 TβRI:TGM-D2 complex (18.8 kDa) or TGM-D2 alone (9.3 kDa); thus, TGM-D2 binds TβRI with 1:1 stoichiometry. The titration of 15N TGM-D2 with TβRI leads to resolution of the conformational doubling in the spectrum of TGM-D2 that was previously noted (Fig. S5A-B). Thus, TGM-D2 appears to be primarily responsible for binding TβRI, and binding stabilizes TGM-D2 in one of its two native conformations.
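The stoichiometry argument from the SEC-MALS masses can be made explicit with a small comparison against candidate complexes. This is an illustrative sketch only: the function name is hypothetical, and the TβRI mass of 9.5 kDa is inferred here as the difference between the expected 1:1 complex mass (18.8 kDa) and the expected TGM-D2 mass (9.3 kDa) quoted above.

```python
def closest_stoichiometry(measured_kda: float, mass_a_kda: float,
                          mass_b_kda: float, max_copies: int = 2):
    """Return (n_a, n_b) whose summed mass lies closest to the measured mass.

    All combinations with 1..max_copies copies of each partner are considered.
    """
    best = None
    for n_a in range(1, max_copies + 1):
        for n_b in range(1, max_copies + 1):
            expected = n_a * mass_a_kda + n_b * mass_b_kda
            error = abs(measured_kda - expected)
            if best is None or error < best[2]:
                best = (n_a, n_b, error)
    return best[0], best[1]


if __name__ == "__main__":
    tbri_kda = 18.8 - 9.3   # inferred TβRI mass (assumption, see lead-in)
    tgm_d2_kda = 9.3        # expected TGM-D2 mass from the text
    # Measured complex mass from SEC-MALS: 17.1 kDa
    print(closest_stoichiometry(17.1, tbri_kda, tgm_d2_kda))
```

The next-smallest candidate complexes (2:1 or 1:2) would be expected near 28 kDa, well outside the reported uncertainty of the measured complex mass.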
To confirm primary binding of TGM-D2 to TβRI and binding of TGM-D3 to TβRII, and to assess the potential role for TGM-D1, the converse NMR titration experiments were performed, with samples of 15N-labeled TβRI and TβRII being titrated with unlabeled TGM-D1, -D2, or -D3, all in buffers lacking CHAPS. The titration of 15N TβRII with unlabeled TGM-D3 resulted in significant perturbations in its backbone amide signals (Fig. S6A), but titration with TGM-D1 or TGM-D2 did not (Fig. S7A,B). The titration of 15N TβRII with TGM-D3 further revealed the simultaneous appearance of peaks corresponding to the unbound and bound forms at sub-stoichiometric ratios, confirming that TGM-D3 is the main binding partner for TβRII and that it binds with high affinity (Fig. S6C). The absence of any shifts upon titration of 15N TβRII with TGM-D1 suggests the lack of binding previously observed was not due to interference by CHAPS.
The titration of 15N TβRI with unlabeled TGM-D2 supports the prior results, with significant perturbations in over half of the backbone amide signals (Fig. S6B). There are peaks corresponding to both the unbound and bound forms at intermediate titration points, confirming that binding is high-affinity (Fig. S6D). The addition of unlabeled TGM-D3 into 15N TβRI resulted in little to no perturbation of the backbone amide signals, confirming that TGM-D3 does not bind TβRI (Fig. S8A). Titration of 15N TβRI with unlabeled TGM-D1 in the absence of CHAPS resulted in the weakening of many TβRI backbone signals and the complete disappearance of nearly a third of them, along with weak perturbations of other residues (Fig. S8B). The disappearance of some of the TβRI signals and the small shifts of other residues are likely because TβRI is binding to TGM-D1 and being incorporated into a TGM-D1 aggregate. Thus, although the nature of the binding precludes detailed analysis of the interaction, TGM-D1 does appear to bind TβRI in the absence of CHAPS and may contribute to the binding of TβRI by TGM.
ITC and SPR quantification of TβRII binding
To determine whether TGM-D3 is the only domain responsible for binding TβRII, or whether TGM-D1 or -D2 might also play a role, ITC binding experiments were performed in which either full-length TGM (TGM-FL) or individual domains of TGM were titrated into TβRII in the calorimetry cell. The strong exothermic responses for TGM-FL and TGM-D3 with TβRII indicate binding (Fig. 2C,D). The fitting of the integrated heat to a standard binding isotherm for both binding reactions yielded KDs and enthalpies of 0.55 μM (0.26 – 1.08, 68.3% CI) / −7.14 kcal mol-1 (−7.80 – −6.55, 68.3% CI) and 1.15 μM (0.87 – 1.52, 68.3% CI) / −10.65 kcal mol-1 (−11.22 – −10.14, 68.3% CI) for TGM-FL and TGM-D3, respectively (Fig. 2E-F, Table 1). Titration of TβRII with TGM-D1 or TGM-D2, in contrast, led to weak endothermic and exothermic responses, which were also observed in buffer-only titrations, indicating that neither TGM-D1 nor TGM-D2 binds TβRII (Fig. 2A-B, right inset). These observations, along with the similarity of the binding affinities of TGM-FL and TGM-D3 to TβRII, indicate that TGM-D3 is responsible for nearly all or all of the binding capacity of TGM-FL for TβRII.
The results above were further validated by SPR measurements in which TGM-FL, or individual domains of TGM, were injected over biotinylated avitagged TβRII captured on a streptavidin-coated sensor chip. These injections yielded robust concentration-dependent responses when TGM-FL or TGM-D3 were injected over the immobilized TβRII, but not when TGM-D1, TGM-D2, or a construct that included both TGM domains 1 and 2, TGM-D1D2, were injected (Fig. 3). The KD values derived by fitting the TGM-FL and TGM-D3 sensorgrams to a simple (1:1) kinetic model were comparable, 0.61 ± 0.01 μM and 0.91 ± 0.02 μM, respectively, though as with the ITC data, TGM-FL demonstrated slightly greater affinity than TGM-D3 for TβRII (Table S3, Fig. 3D-E). The SPR results are therefore consistent with the overall conclusion derived from both the NMR and ITC experiments, that TGM-D3 was responsible for all or almost all of the binding capacity of TGM-FL for TβRII.
ITC and SPR quantification of TβRI binding
Similarly, to determine whether TGM-D2 is the only domain responsible for binding TβRI, or whether TGM-D1 or -D3 might also play a role, ITC binding experiments were performed in which either TGM-FL or individual domains of TGM were titrated into TβRI in the calorimetry cell. The strong exothermic responses for TGM-FL and TGM-D2 with TβRI indicate binding (Fig. 4C, E, F,H). The fitting of the integrated heat to a standard binding isotherm for both binding reactions yielded KDs and enthalpies of 52.1 nM (29.3 – 90.2, 68.3% CI) and −16.69 kcal mol-1 (−18.26 – −15.35, 68.3% CI) and 1.47 μM (0.45 – 4.58, 68.3% CI) and −17.66 kcal mol-1 (−26.95 – −13.18, 68.3% CI) for TGM-FL and TGM-D2, respectively (Fig. 4F,H, Table 2). Titration of TβRI with TGM-D1 or TGM-D3, in contrast, led to a weak endothermic response (TGM-D1), similar to that of an injection into buffer, or no response (TGM-D3), indicative of no binding or weak binding. The finding that TGM-D2 bound with an affinity about thirty-fold weaker than TGM-FL complements the previous NMR titration data suggesting that TGM-D1 might also contribute to binding TβRI. To test this, a construct that included both TGM-D1 and TGM-D2 (TGM-D1D2) was produced in bacteria, refolded, and purified to homogeneity. This construct was titrated into TβRI and yielded a robust concentration-dependent response (Fig. 4D, G). The KD and enthalpy were 25.3 nM (10.7 – 48.3, 68.3% CI) and −18.70 kcal mol-1 (−19.59 – −17.85, 68.3% CI), respectively, which were comparable to those of TGM-FL. These observations indicate that both domains TGM-D1 and TGM-D2 are required to recapitulate the binding capacity of TGM-FL for TβRI.
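The fitted KD and enthalpy values can be related to the binding free energy and entropic contribution through the standard relations ΔG = RT ln(KD) and −TΔS = ΔG − ΔH. The sketch below works through this arithmetic for the TGM-FL:TβRI values quoted above. It is only illustrative: the temperature of 298.15 K is an assumption (the measurement temperature is not stated in this section), KD is referenced to a 1 M standard state, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal mol^-1 K^-1


def delta_g(kd_molar: float, temp_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in M."""
    return R_KCAL * temp_k * math.log(kd_molar)


def minus_t_delta_s(kd_molar: float, delta_h_kcal: float,
                    temp_k: float = 298.15) -> float:
    """Entropic term -T*dS (kcal/mol) from dG = dH - T*dS."""
    return delta_g(kd_molar, temp_k) - delta_h_kcal


def fold_difference(kd_weak: float, kd_strong: float) -> float:
    """Ratio of two dissociation constants given in the same units."""
    return kd_weak / kd_strong


if __name__ == "__main__":
    # TGM-FL:TβRI values from the text: KD 52.1 nM, dH -16.69 kcal/mol
    print(delta_g(52.1e-9))
    print(minus_t_delta_s(52.1e-9, -16.69))
    # TGM-D2 (1.47 μM) relative to TGM-FL (52.1 nM), both in nM
    print(fold_difference(1470.0, 52.1))
```

Under the assumed temperature, a large negative enthalpy paired with a smaller negative free energy implies an unfavorable entropic term, and the KD ratio reproduces the roughly thirty-fold difference noted in the text.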
SPR experiments were also performed with TGM-FL, TGM-D1D2, or individual domains of TGM, injected over biotinylated avi-tagged TβRI captured on a streptavidin-coated sensor chip. These injections yielded robust concentration-dependent responses when TGM-FL, TGM-D2, or TGM-D1D2 were injected over the immobilized TβRI, but not when TGM-D1 or TGM-D3 were injected (Fig. 5). The KD values derived by fitting the TGM-FL and TGM-D1D2 sensorgrams to a simple (1:1) kinetic model were comparable, 13.1 ± 0.4 nM and 24.1 ± 0.1 nM, respectively, though TGM-FL demonstrated slightly greater affinity than TGM-D1D2 for TβRI (Table S4, Fig. 5D-E). The KD derived from kinetic analysis of the TGM-D2 sensorgram was 309 ± 4 nM (Fig. 5B, Table S4), indicating that the affinity of TGM-D2 for TβRI is approximately 25 times weaker than that of TGM-FL, consistent with the previous ITC results. The SPR results are therefore consistent with the overall conclusion derived from both the NMR and ITC experiments that both TGM-D1 and -D2 are required for the full binding capacity of TGM for TβRI.
TβRII utilizes a similar set of residues to bind TGM-D3 and TGF-β
To determine if TβRII might bind to TGM-D3 with some of the same residues that it uses to bind TGF-β3, ITC competition binding experiments were performed. The TGF-β3/TGM-D3 competition experiments with TβRII were performed using an engineered TGF-β monomer, known as mmTGF-β2-7M2R, rather than TGF-β3. mmTGF-β2-7M2R has an intact finger region and binds TβRII with the same affinity as TGF-β1 and TGF-β339, but unlike TGF-β1 or TGF-β3, is highly soluble at neutral pH. The ITC competition measurements were performed by titrating mmTGF-β2-7M2R into the sample cell loaded with TβRII in either the absence or presence of increasing concentrations of TGM-D3 (Fig. 6A-C). The addition of TGM-D3 both increased the extent of curvature in the binding isotherms and reduced the overall enthalpy, consistent with the behavior expected for competitive binding. To quantify this, the integrated heats from the three experiments, together with the fitted KD and enthalpy for the TGM-D3:TβRII interaction, were globally fit to a simple competitive binding model to derive the binding constant for the high affinity mmTGF-β2-7M2R:TβRII interaction in the absence of competitor (Fig. 6D-F, Table 3). The KD for the high affinity mmTGF-β2-7M2R:TβRII interaction was found to be 35.20 nM (17.16 – 64.42, 68.3% CI), in accord with previous SPR measurements for the TβRII:TGF-β interaction with immobilized TGF-β1 or TGF-β339. The clear evidence of competition demonstrated in this experiment shows that TβRII uses some or all of the same residues to bind TGF-β and TGM-D3.
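The effect of a competitor on an observed affinity can be illustrated with the textbook relation for simple competitive binding, KD,app = KD,A (1 + [B]/KD,B), where B is the competitor. The sketch below is not the global fitting procedure used for the ITC data; it is a simplified illustration that assumes the free competitor concentration is approximately equal to its total concentration. The function name and the 5 μM competitor concentration are hypothetical.

```python
def apparent_kd(kd_ligand: float, kd_competitor: float,
                competitor_conc: float) -> float:
    """Apparent KD of a ligand in the presence of a competitive binder.

    All three arguments must share the same concentration units.
    competitor_conc is treated as the free competitor concentration.
    """
    return kd_ligand * (1.0 + competitor_conc / kd_competitor)


if __name__ == "__main__":
    kd_tgf = 35.2       # nM, mmTGF-β2-7M2R:TβRII (from the text)
    kd_tgm_d3 = 1150.0  # nM, TGM-D3:TβRII by ITC (from the text)
    for conc in (0.0, 5000.0):  # nM; 5 μM is a hypothetical concentration
        print(conc, apparent_kd(kd_tgf, kd_tgm_d3, conc))
```

In this picture, the apparent affinity of the tighter ligand weakens in proportion to how far the competitor concentration exceeds the competitor's own KD, which is why titrations at several competitor concentrations constrain the fit.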
To investigate this further, the backbone of 15N, 13C TβRII was fully assigned as bound to unlabeled TGM-D3. The assigned chemical shifts for the bound form were then compared to those previously reported for the unbound form40 (Fig. S9). The largest shifts, as assessed from a composite of the backbone N, Cα, CO, and sidechain Cβ chemical shift perturbations (CSPs), fell within a narrow region from residues 52-54 (Fig. 6G). This pattern of CSPs was compared to the change in solvent accessible surface area (SAS) of TβRII upon binding TGF-β. The regions of TβRII most shielded from solvent upon binding TGF-β fell within a similar area from residues 50-56 (Fig. 6H). This commonality between the change in SAS of TβRII bound by TGF-β and the CSPs of TβRII bound by TGM-D3 confirms that TβRII uses the same primary motif, an edge β-strand that binds deeply in the cleft between the fingers 1-2 and 3-440,41, to bind both ligands. TGM-D3 leads to only minor shift perturbations outside of the edge β-strand noted above, whereas TGF-β3 decreases solvent accessibility in regions that flank this strand, suggesting that TGF-β3 has more extensive contacts with TβRII than TGM-D3, an observation consistent with TβRII’s considerably higher affinity for TGF-β1/-β3 compared to TGM-D3 (KDs ca. 30-50 nM and 500-1000 nM, respectively).
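A composite CSP of the kind referred to above is commonly computed as a weighted root-sum-of-squares of the per-nucleus shift differences between bound and free forms. The sketch below shows one such calculation. The exact formula and weights used in this work are not given in this section, so the default weights, the function name, and the example shifts are all hypothetical and serve only to illustrate the approach.

```python
import math

# Illustrative per-nucleus scaling factors (assumptions, not from the paper).
DEFAULT_WEIGHTS = {"N": 0.2, "CA": 0.3, "CO": 0.3, "CB": 0.3}


def composite_csp(bound: dict, free: dict, weights: dict = None) -> float:
    """Weighted root-sum-of-squares shift change for one residue.

    bound and free map nucleus names (e.g. "N", "CA") to chemical shifts in ppm.
    Only nuclei present in both dictionaries and in weights contribute.
    """
    if weights is None:
        weights = DEFAULT_WEIGHTS
    total = 0.0
    for nucleus, weight in weights.items():
        if nucleus in bound and nucleus in free:
            total += (weight * (bound[nucleus] - free[nucleus])) ** 2
    return math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical shifts (ppm) for a single residue.
    free = {"N": 120.0, "CA": 56.0, "CO": 175.0, "CB": 30.0}
    bound = {"N": 122.5, "CA": 56.8, "CO": 175.4, "CB": 30.2}
    print(composite_csp(bound, free))
```

Plotting this value per residue and looking for contiguous stretches of large values is one way to arrive at a mapped region such as the one described above.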
TβRI utilizes a similar set of residues to bind TGM-D2 and TGF-β:TβRII
To determine if TβRI might bind to TGM-D1D2 with some of the same residues that it uses to bind the TGF-β3/TβRII complex, ITC competition binding experiments were performed by titrating TGM-D1D2 into the sample cell loaded with TβRI in either the absence or presence of saturating concentrations of TGF-β3/TβRII. The control titration of TGF-β:TβRII into TβRI alone demonstrated a fitted KD of 61.7nM (35.7 – 97.2nM, 68.3% CI) (Fig. 7A-B), which is similar to that for TGM-D1D2 binding TβRI alone (KD: 25.3nM (10.7 – 48.3, 68.3% CI)).
However, unlike the TGM-D1D2:TβRI interaction, which had a large enthalpy, −18.70 kcal mol-1, the TGF-β:TβRII:TβRI interaction had a much smaller enthalpy, −4.21 (−4.49 – −3.95) kcal mol-1, even at increased temperature (Table 4), indicating that the binding reaction is more entropically driven. Owing to the similar KDs but significantly greater enthalpy for the TGM-D1D2:TβRI interaction, the competition experiment was performed by saturating TβRI in the cell with TGF-β:TβRII binary complex and then titrating TGM-D1D2 using the syringe (Fig. 7C-D). The resultant titration yielded minimal heat relative to the TGM-D1D2:TβRI interaction (Fig. 4D, G) and could not be quantitatively fit, indicating that TGF-β:TβRII binds to some or all of the same residues of TβRI as TGM-D1D2.
To investigate this further, the backbone of 15N, 13C TβRI was fully assigned as bound to unlabeled TGM-D2 (Fig. S10). TGM-D1D2 was not used due to its higher molecular weight and reduced solubility in TβRI-compatible NMR buffers. The assigned chemical shifts for the bound form were then compared to those previously reported for the unbound form17. The largest shifts, as assessed from a composite of the backbone N, Cα, CO, and sidechain Cβ CSPs, fell within four distinct regions: 1) residues 13-16, 2) residues 53-65, 3) residues 73-77, and 4) residues 82-86 (Fig. 7E). This pattern of shifts was compared to the change in SAS of TβRI upon binding the TGF-β/TβRII complex. The regions of TβRI protected from solvent upon binding TGF-β and TβRII fell within similar areas from residues 54-60 and 75-78 (Fig. 7F). The commonality between the change in SAS of TβRI upon binding TGF-β:TβRII and the CSPs of TβRI upon binding TGM-D2 confirms that TβRI uses the same primary motifs, the pre-helix extension (residues 53-65) that bridges β-strands 4 and 5 and the C-terminal end of β-strand 5 (residues 73-77), to bind both ligands. TβRI has two regions of TGM-D2-induced CSPs outside these regions, residues 13-16 and 82-85; these residues are positioned adjacent to the C-terminal end of β-strand 5, suggesting that TGM-D2 has a larger, more distributed contact surface with TβRI as compared to TGF-β:TβRII.
TGM-D3 structure and dynamics
The structure of TGM-D3 was determined based on 1H-1H NOE distance restraints, 1H-15N, 13Cα-1Hα, and 13CO-15N RDCs, and 3JHN-Hα, 3JHα-Hβ, 3JHN-Hβ J-couplings and essentially complete chemical shift assignments for both the backbone and sidechains (Table 5). The backbone root-mean-square deviation (RMSD) for the ten lowest-energy structures relative to the lowest energy structure was 0.41 Å when aligned according to the regions of regular secondary structure, or 2.31 Å when aligned over the length of the ordered core (Fig. 8A, Table 5). TGM-D3 is comprised of four β-strands (Val15-Gly21, Thr45-Cys51, Glu62-Lys69, Ser76-Tyr80) arranged into a highly twisted antiparallel β-sheet with a β1:β2:β3:β4 topology (Fig. 8A). There is also a 310 helix (Gln56-Ala58) connecting β2 and β3 in some, but not all, of the lowest-energy structures (Fig. 8A). The structures are consistent with the secondary structure derived from an analysis of secondary shifts42, with four high probability extended regions predicted between residues 12-19, 44-50, 62-69, and 76-80, and a low probability helical region from residues 54-56 (Fig. S11C). The secondary shifts also predict, with lower probability, extended regions between residues 5-7 and 29-34. The former corresponds to the N-terminal region, while the latter corresponds to the middle section of the 23-residue hypervariable loop (HVL) that connects β1 and β2 (Fig. 8A). This section of the HVL extends perpendicularly across the C-terminal end of β1 and is mostly converged among the ten lowest energy structures, with an average pairwise RMSD of 0.85 Å. The segments from residues 5-7 and 29-34, although highly extended, do not form hydrogen bonds that define a β-strand and thus are not classified as such in the calculated structures.
The Cys6-Cys67 disulfide pins the N-terminus to one end of the concave surface of the β-sheet, while the C-terminus is pinned to the other end of the sheet by the Cys51-Cys87 disulfide (Fig. 8B). This creates a large broad cavity that is bordered on one edge by the extended N-terminal segment from Cys6-Gly13 and on the other by β4 and the extended segment that follows, which spans from Ser76-Cys87 (Fig. 8B). The core of the protein is formed by hydrophobic residues in the cavity, and includes Leu9 and Ile14 from the extended N-terminal segment, Val15 and Tyr17 from β1, Ala47 and the hydrophobic portion of the sidechain of Arg49 from β2, Val64 and Ala65 from β3, and Trp78, Tyr80, and Tyr81 from β4 (Fig. 8B). Though the majority of these residues are completely buried, including Leu9, Ala47, Val64, Ala65, and Trp78, there are a few that are partly exposed, particularly Ile14 and Val15, both of which are found near the end of the cavity closest to the Cys51-Cys87 disulfide bond, where the cavity is at its widest.
The backbone 15N T2 relaxation times, which are sensitive both to fast (ns-ps) timescale motions that result from low amplitude fluctuations of the backbone and to larger amplitude rearrangements that occur on slower (μs – ms) timescales, are significantly increased in the N-terminal tail and modestly increased near the C-terminal end of the HVL and in the shorter loops connecting β2-β3 and β3-β4 (Fig. 8C). The increases in 15N T2 indicate increased flexibility in these regions, especially the N-terminal tail, which does not converge in the final ensemble of structures. The other loop regions converge reasonably well, consistent with their more modest increases in 15N T2 (Fig. 8A), although one exception is the HVL, which adopts two conformations, in which the C-terminal portion of the HVL either ascends or descends as it contacts the extended N-terminus (Fig. 8A, green and pink respectively).
TGM-D3 compared to other CCP domains
CCP domain structures with the closest fold to TGM-D3, as identified by a DALI43,44 search of the protein data bank, have close correspondence in the four β-strands that form the core of the CCP fold, but have two additional β-strands, one in the loop connecting β2 and β3, designated β’, and another at the C-terminus, designated β’’ (Fig. 9A). The β’ and β’’ strands are present in all of the top-scoring CCP domains and pair with one another (Fig. 9B). This serves to draw the C-terminal segment toward the loop connecting β2-β3 and essentially eliminates the large broad cavity that is formed between the extended N-terminus and β4 (Fig. 9C). Thus, unlike TGM-D3, the hydrophobic core of other CCP domains, which also includes a conserved tryptophan, is entirely buried and there are no partly exposed hydrophobic residues. Additionally, none of these CCP domains, nor any other known CCP domain, contains a long HVL extension similar to the one in TGM-D3, which as noted is largely ordered and wraps laterally around the CCP domain on the convex surface of the sheet. Instead, the HVL in other CCP domains simply extends into solvent to connect β1 and β2 and has no contact with the convex surface of the sheet.
TGM-D3 engages its binding partner TβRII in a manner distinct from that of the canonical CCP domain
To determine the TGM-D3:TβRII binding interface, the backbone of 15N, 13C TGM-D3 was fully assigned as bound to unlabeled TβRII (Fig. S11A-B). The two most strongly perturbed regions are residues 62-71 and 77-85, which correspond to most of β3 and β4, as well as a few residues that extend beyond the end of β4 (Fig. 10A). The regions that are perturbed to a lesser extent are residues 42-47 and 21-28, which correspond to the N-terminal end of β2 and the N-terminal end of the HVL as it emerges out of β1 and makes a sharp turn before extending laterally across the C-terminal end of β1. The residues that are most strongly perturbed, including Phe63, Ile66, Tyr80, Tyr81, and Ile84, lie within the large cavity on the concave face of TGM-D3 (Fig. 8B, 10A). The largest shift perturbations in TβRII, as previously noted, are at residues 52-54 (Fig. 6G), which correspond to an edge β-strand. Previous crystal structures of TβRII bound to TGF-β17,45,46 have shown that TβRII inserts this edge β-strand into the hydrophobic pocket formed between fingers 1-2 and 3-4 of TGF-β. The region of TGM-D3 that is perturbed upon binding TβRII is similarly hydrophobic (Fig. 8B, 10A), and thus could conceivably accommodate the edge β-strand of TβRII.
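As an illustration of how residues can be ranked by shift perturbation between the free and bound assignments, the following minimal sketch computes a combined 1H/15N perturbation per residue. The weighting factor, the cutoff, and all shift values are assumptions chosen for illustration; the text does not state which formula or threshold was applied.

```python
import math

# Assumed weighting of 15N shifts relative to 1H shifts. Values near
# 0.14-0.2 are commonly used; the source does not state which was applied.
N_SCALE = 0.14


def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=N_SCALE):
    """Combined 1H/15N chemical shift perturbation in ppm."""
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def perturbed_residues(free, bound, cutoff_ppm, n_scale=N_SCALE):
    """Return {residue: perturbation} for residues above cutoff_ppm.

    free, bound: dicts mapping residue number -> (1H ppm, 15N ppm).
    Only residues assigned in both states are compared.
    """
    hits = {}
    for res in sorted(set(free) & set(bound)):
        d_h = bound[res][0] - free[res][0]
        d_n = bound[res][1] - free[res][1]
        csp = combined_csp(d_h, d_n, n_scale)
        if csp > cutoff_ppm:
            hits[res] = csp
    return hits


# Hypothetical shifts for illustration only (not measured values).
free = {30: (8.40, 122.0), 63: (8.10, 120.0), 66: (7.90, 118.5)}
bound = {30: (8.41, 122.0), 63: (8.40, 121.5), 66: (8.05, 119.5)}
hits = perturbed_residues(free, bound, cutoff_ppm=0.1)
```

With these made-up inputs, residues 63 and 66 exceed the cutoff and residue 30 does not. In practice the cutoff is often set relative to the mean or standard deviation of all perturbations rather than fixed in advance.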
The binding surface that TGM-D3 uses to bind TβRII is distinct from that used by other CCP domains to bind their partners. Though there is no single interface that defines how CCP domains bind their partners, due to both their ubiquity and the range of partners they bind, an analysis of those CCP domains most closely related to TGM nonetheless shows that they generally use their convex surface, specifically the extended N-terminal segment, β1, β2, and the β2-β3 loop, to bind their ligands (Fig. 10B). Thus, TGM-D3 has evidently acquired two large structural modifications that endow it with an alternate manner of binding its partner TβRII: (1) the long, laterally protruding, and structurally ordered HVL of TGM-D3 likely acts to block binding to the convex face of TGM-D3, thus preventing interactions with other partners through the interface that is commonly employed, and (2) the large, broad cavity between the extended N-terminus and β4, which is formed due to the elimination of the β’ and β’’ strands and replacement of the former with a 3₁₀ helix, accommodates the edge β-strand of TβRII in a manner that mimics binding between the fingers of TGF-β in the TGF-β:TβRII complex.
Discussion
The genome of the mouse helminth H. polygyrus encodes a highly expanded family of CCP-containing proteins, several of which have been identified in its secretome as regulators of the host immune response. This distinguishes H. polygyrus from closely related helminths, such as the sheep hookworm Teladorsagia circumcincta and the human hookworm Necator americanus, which do not have an expanded CCP family and attenuate the host immune response through other mechanisms47-50. TGM and its five adult (TGM-2 through TGM-6) and four larval (TGM-7 through TGM-10) homologues are amongst the proteins in this expanded family, and as noted, at least four of these, TGM, TGM-2, TGM-3, and TGM-4, regulate immunosuppressive signaling through the Treg pathway. Though the potency of signaling through TGM is similar to that of TGF-β12, the protein-protein binding kinetics and amplitude of signaling in reporter cell lines are distinct, as is the overall gene expression profile, with increased Treg potency and decreased fibrotic gene response12. HpARI and HpβARI, which impact innate immune responses, are also among the CCP-containing proteins encoded by H. polygyrus, although the sequences of the HpARI and HpβARI CCPs are highly divergent from those of TGM and its homologs.
The results presented demonstrate that TGM binds the TGF-β receptors, TβRI and TβRII, in a modular manner, with TGM-D2 and TGM-D3 the main partners, respectively. The binding of TβRI is, however, significantly potentiated by TGM-D1. The underlying basis for this potentiation is likely binding of TβRI through a composite TGM-D1:TGM-D2 interface, as the NMR titration data presented in Fig. S8B show that TGM-D1 directly, albeit weakly, binds TβRI. This manner of binding is unsurprising, given that it is common for CCP-containing proteins to bind partners through arrays of CCPs, with avidity playing an important role, and with the CCP domains generally connected by short linkers, as is the case for TGM domains 1 and 2.
The overall manner by which TGM binds TβRI and TβRII and assembles them into a signaling heterodimer is distinct from that of the TGF-β homodimers, which assemble a (TβRI:TβRII)2 heterotetramer, first by binding TβRII with moderate to high affinity (KD ca. 50 nM), and in turn recruiting and binding TβRI through a composite TGF-β:TβRII interface with high affinity (KD ca. 30 nM). Though further studies are required, the differences in stoichiometry of the TGM vs. TGF-β signaling complex, the differences in the manner of assembly, as well as potential differences in the overall architecture of the signaling complex, specifically how the type I and type II kinases are arrayed relative to one another in the cytoplasm, likely explain the differences in the amplitude and kinetics of signaling and the overall shift of the gene expression profile away from extracellular matrix accumulation and towards immunosuppression.
The ITC competition binding and NMR assignments of the free and bound forms of TβRI and TβRII clearly demonstrate that TGM-D1D2 and TGM-D3 truly mimic the mammalian cytokine by engaging the same primary motifs of the receptors, the -PRDRP-pre-helix extension in TβRI and the β4 edge β-strand in TβRII. The structure of TGM-D3, together with identification of the binding site for TβRII based on NMR assignments of the free and TβRII-bound forms, provides a remarkable demonstration of how TGM-D3 has adapted, relative to other CCP domains, to uniquely and specifically bind TβRII through the cavity formed on its concave surface.
Though all of the domains of TGM are predicted to have the overall fold of a CCP domain, only TGM-D3 binds to TβRII. Sequence comparisons of TGM-D3 with the other domains of TGM (Fig. S12A) demonstrate that all of the domains contain the HVL insertion and two disulfide bonds. TGM-D3 is, however, unique in that the β3-β4 loop is 5-6 residues shorter in the other domains than in domain 3; thus, in the other domains this loop is likely a tight β-turn rather than a more extended turn as in domain 3. This may alter the overall shape and dimensions of the cavity on the concave surface to accommodate other receptors of the TGF-β family, all of which, including TβRI, have been shown to engage their cognate ligands through structural motifs distinct from the β4 edge strand of TβRII.
Of the TGM family members in the adult parasitic secretome, TGM, TGM-2, -3, -4, and -6 have each been shown to bind TβRII29 (TGM-4 and TGM-6 data unpublished), consistent with sequence alignments which show high conservation amongst the four β-strands, the loop connecting β2 and β3, and the extended HVL, with minor amino acid differences that likely do not correlate to larger structural differences (Fig. S12B). Thus, it is likely that domain 3 of each of these homologs has a large cavity bordered by the extended N-terminal segment and β4 that binds TβRII, while binding of other potential partners through the convex face is prevented by the extended HVL.
This specific CCP domain modification likely does not extend to other CCP proteins in the H. polygyrus excretory-secretory products (HES). HpARI and HpβARI, which are involved in binding IL-33 receptor ST-2, are predicted to be somewhat structurally dissimilar to TGM-D3. HpARI has three CCP domains and HpβARI has two CCP domains31-33, and analysis of their secondary structure using the PHYRE2 server51 reveals significant differences: HpARI-CCP1 is predicted to have 5 β-strands, CCP2 is predicted to have 8 β-strands together with two predicted helices, and CCP3 is predicted to have 4 β-strand regions and a large helical region (Fig. S13A). HpβARI-CCP1 and CCP2 are both predicted to have six β-strands, which likely means that these two domains have the β’ and β’’ strands that TGM-D3 lacks, and likely lack a cavity on the concave surface (Fig. S13B). Thus, the unique modifications of TGM-D3, and likely of other TGM domains as well, are not likely shared by other CCP-containing proteins in the H. polygyrus secretome, indicating that these modifications have arisen to endow these domains with the remarkable ability to specifically bind the type I and type II receptors of the TGF-β family.
These findings highlight the unique nature of H. polygyrus-mediated immunomodulation. Unlike TGF-β, which binds TβRII and then recruits TβRI to the TGF-β:TβRII binding interface, TGM binds TβRI and TβRII in a modular manner, with TGM-D3 alone binding TβRII and TGM-D1D2 binding TβRI, with higher affinity binding to TβRI than to TβRII. Though the parasite product responsible is structurally dissimilar to TGF-β, TGM has convergently evolved the CCP domain protein family to take advantage of the same type I and type II receptor binding sites as human TGF-β. This structural knowledge can be used both to develop TGM and individual domains of TGM as therapeutic TGF-β mimics and to further explore the role of CCP domain-heavy proteins in parasitic host immunomodulation.
Materials and Methods
Expression and purification of TGM domains
DNA fragments corresponding to individual domains of H. polygyrus TGM, TGM-D1, TGM-D2, TGM-D3, and TGM-D1D2, were inserted between the KpnI and HindIII sites in a form of pET32a (EMD-Millipore, Danvers, MA) modified to include a KpnI site immediately following the coding sequence for the thrombin recognition sequence. The resulting constructs, which included a thioredoxin-hexahistidine tag-thrombin cleavage site-TGM domain coding cassette (Table S1), were overexpressed in BL21(DE3) cells (EMD-Millipore, Danvers, MA) cultured at 37 °C. Unlabeled samples for binding studies were produced in rich medium (LB), while 15N and 15N,13C samples for NMR studies were produced using minimal medium (M9) containing 0.1% 15NH4Cl (Cambridge Isotope Laboratories, Tewksbury, MA) or 0.1% 15NH4Cl and 0.3% U-13C-D-glucose (Cambridge Isotope Laboratories, Tewksbury, MA). Carbenicillin was also included in the media at 50 μg mL-1 to select for cells bearing the expression plasmid. Protein expression was induced by adding 0.8 mM IPTG when the light scattering at 600 nm reached 0.75.
Cell pellets from 3 L of culture were resuspended in 100 mL lysis buffer (50 mM Na2HPO4, 100 mM NaCl, 5 mM imidazole, 10 μM leupeptin, 10 μM pepstatin, 1 mM benzamide, pH 8.0) and sonicated. Following centrifugation (20 min, 15000g), the pellet was washed with 50 mL water, resuspended in 50 mM Na2HPO4, 100 mM NaCl, 5 mM imidazole, 10 μM leupeptin, 10 μM pepstatin, 1 mM benzamide, 8 M urea, pH 8.0, and stirred overnight at 25 °C. The remaining insoluble material was removed by centrifugation and the supernatant was loaded onto a 50 mL metal affinity column (Ni++ loaded chelating sepharose, GE Lifesciences, Piscataway, NJ) pre-equilibrated with 125 mL of resuspension buffer. The column was washed with 100 mL resuspension buffer and the bound protein was eluted by applying a linear gradient of resuspension buffer containing 0.5 M imidazole.
Protein from the eluted peak was treated with reduced glutathione (GSH), such that the final concentration of GSH once the protein was diluted into folding buffer was 2 mM. After a 30 min incubation at 25 °C, the protein was slowly diluted into folding buffer (0.1 M Tris, 1 mM EDTA, 0.5 mM oxidized glutathione (GSSG), pH 8.0) to a final concentration of 0.1 mg mL-1 and stirred for 12-16 h at 4 °C. The folding mixture was concentrated and dialyzed into 25 mM Tris, pH 8.7. Solid thrombin was added to a final concentration of 4 U per milligram of TGM domain and incubated overnight at 25 °C. Cleavage was stopped by the addition of 10 μM leupeptin, 10 μM pepstatin, and 100 μM PMSF, and after re-adjusting the pH to 8.7, the cleavage mixture was passed over a Ni++ chelating sepharose column equilibrated with water. The column flow-through and a subsequent water wash, which contained primarily the TGM domain, were collected. For the TGM-D1 and TGM-D1D2 domains, the flow-through was bound to a Source Q column (GE Lifesciences, Piscataway, NJ) equilibrated in 25 mM CHES, pH 9.0 and eluted with a 0-0.5 M NaCl gradient. For the TGM-D2 and TGM-D3 domains, the flow-through was adjusted to pH 5.0 by the addition of acetic acid, bound to a Source S column (GE Lifesciences, Piscataway, NJ) equilibrated in 5 mM sodium acetate, 2 M urea, pH 5.0, and eluted with a 0-0.5 M NaCl gradient. Masses of the TGM domains were measured by liquid chromatography electrospray ionization time-of-flight mass spectrometry (LC-ESI-TOF-MS, Bruker Micro TOF, Billerica, MA). TGM-FL was expressed in expi293 cells (Promega, USA) and purified by metal affinity chromatography as previously described12.
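The refolding and cleavage quantities above follow from simple proportional arithmetic, sketched below. The protein amount and stock volume in the example are hypothetical and serve only to show how the stated targets (0.1 mg mL-1 protein, 2 mM GSH after dilution, 4 U thrombin per milligram) translate into working volumes and amounts.

```python
def final_folding_volume_ml(protein_mg, final_mg_per_ml=0.1):
    """Total volume needed so the protein ends up at final_mg_per_ml."""
    return protein_mg / final_mg_per_ml


def gsh_in_stock_mm(final_gsh_mm, stock_volume_ml, final_volume_ml):
    """GSH concentration required in the protein stock so that it is
    final_gsh_mm once the stock is diluted to final_volume_ml."""
    return final_gsh_mm * final_volume_ml / stock_volume_ml


def thrombin_units(protein_mg, units_per_mg=4.0):
    """Thrombin units for a given mass of TGM domain."""
    return protein_mg * units_per_mg


# Hypothetical batch: 50 mg of eluted protein in a 25 mL stock.
protein_mg = 50.0
stock_ml = 25.0
total_ml = final_folding_volume_ml(protein_mg)
stock_gsh = gsh_in_stock_mm(2.0, stock_ml, total_ml)
units = thrombin_units(protein_mg)
```

For this hypothetical batch the folding mixture would be about 500 mL, the protein stock would need about 40 mM GSH before dilution, and 200 U of thrombin would be added. The thrombin figure assumes no protein loss during folding, concentration, and dialysis, which is unlikely in practice.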
Expression and purification of TGF-β receptor and growth factor constructs
The TGF-β2 mini monomer (mm-TGF-β2-7M2R), TβRII ectodomain, and TβRI ectodomain were expressed in E. coli at 37 °C in the form of insoluble inclusion bodies, refolded, and purified as previously described39,52,53. Details of the constructs used are provided in Table S1. Masses were verified by LC-ESI-TOF-MS.
NMR data collection and signal assignments
NMR samples of TGM-D1, -D2, and -D3, and of the corresponding complexes with TβRI and TβRII, were prepared at a concentration of 0.03 to 0.2 mM in 25 mM Na2HPO4, 50 mM NaCl, pH 6.0 and transferred to 5 mm susceptibility-matched microtubes (Shigemi, Allison Park, PA) for data collection. NMR data were collected at 30 °C using a Bruker 600, 700, or 800 MHz spectrometer equipped with a 5 mm 1H (13C,15N) z-gradient “TCI” cryogenically cooled probe (Bruker Biospin, Billerica, MA). Routine 1H-15N HSQC spectra were acquired with a sequence with sensitivity enhancement54, water flipback pulses55, and WATERGATE water suppression pulses56. Backbone resonances were assigned by recording and analyzing 1H-15N HSQC and HNCACB, CBCA(CO)NH, HNCA, HN(CO)CA, HNCO, and HN(CA)CO triple resonance data sets. Proton and sidechain resonances were assigned by recording and analyzing 1H-13C CT-HSQC, CC(CO)NH, HBHACONH, HCCH-TOCSY, H(CC, CO)NH, HNHA, and HNHB data sets. NMR data were processed using nmrPipe57 and analyzed using a combination of NMRFAM-SPARKY and the Bayesian-based PINE software packages58,59. T2 relaxation experiments were performed with 15N TGM-D3 using ZZ-exchange experiments60.
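The text does not describe how per-residue T2 values were extracted from the relaxation data. One common approach, assumed here purely for illustration, is to fit peak intensities recorded at several relaxation delays to a single exponential decay, I(t) = I0 exp(-t/T2). The sketch below does this by linear regression on the logarithm of the intensities; the delays and intensities are synthetic.

```python
import math


def fit_t2(delays_s, intensities):
    """Fit I(t) = I0 * exp(-t / T2) by least squares on ln(I) versus t.

    Returns (T2 in seconds, I0). Intensities must be positive.
    """
    n = len(delays_s)
    logs = [math.log(i) for i in intensities]
    mean_t = sum(delays_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in delays_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays_s, logs))
    slope = sxy / sxx                       # slope equals -1 / T2
    i0 = math.exp(mean_y - slope * mean_t)  # intercept equals ln(I0)
    return -1.0 / slope, i0


# Synthetic, noise-free decay with T2 = 80 ms (hypothetical values).
delays = [0.01, 0.03, 0.05, 0.09, 0.13]
peaks = [1000.0 * math.exp(-t / 0.08) for t in delays]
t2, i0 = fit_t2(delays, peaks)
```

A log-linear fit weights low-intensity points more heavily than a direct nonlinear fit would, so with noisy data a nonlinear least-squares fit with error estimates is usually preferred.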
NMR structure determination of TGM-D3
The structure of TGM-D3 was initially calculated using the assigned chemical shifts and measured 1H-15N residual dipolar couplings (RDCs) as input. Residual dipolar couplings for the backbone amides were measured using an IPAP-HSQC sequence61 and an oriented sample containing 10 mg mL-1 Pf1 phage. The refined structure of TGM-D3 was determined using the program NIH-XPLOR62, with manually peak-picked 3D 15N-edited and 13C-edited NOESY data, backbone 1H-15N, 1Hα-13Cα, and 13Cα-13C RDCs, TALOS-derived phi and psi restraints, hydrogen bonding restraints, and 3JHN-Hα, 3JHα-Hβ, and 3JHN-Hβ J-couplings. Energy minimization was performed using NIH-XPLOR, with each run generating 100 structures.
SPR measurements
SPR datasets were generated using a BIAcore X100 instrument (GE Lifesciences, Piscataway, NJ) with biotinylated avi-tagged TβRI or biotinylated avi-tagged TβRII captured onto neutravidin-coated CM-5 sensor chips (GE Lifesciences, Piscataway, NJ) at a density of 50 – 150 RU. Neutravidin-coated sensor chips for capture of biotinylated avi-tagged receptors were made by activating the surface of a CM-5 chip with EDC and NHS, followed by injection of neutravidin (Pierce, Rockford, IL) diluted into sodium acetate at pH 4.5 until the surface density reached 6000 – 15000 RU. Kinetic binding assays were performed by single injections of the analytes in 25 mM HEPES, pH 7.4, 50 mM NaCl, 0.005% surfactant P20 (Pierce, Rockford, IL) at 100 μL min-1. Regeneration of the surface was achieved by a 30 sec injection of 1 – 4 M guanidine hydrochloride. Baseline correction was performed by subtracting the response both from the reference surface with no immobilized ligand and from 5 – 10 blank buffer injections. Kinetic analyses were performed by fitting the results to a simple 1:1 model using the program Scrubber (Biologic Software, Canberra, Australia).
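For readers unfamiliar with the simple 1:1 model, the following sketch shows the standard closed-form expressions for the association and dissociation phases of a 1:1 (Langmuir) interaction. It is not the Scrubber implementation, and the rate constants, analyte concentration, and Rmax are hypothetical values chosen only to make the example concrete.

```python
import math


def equilibrium_response(conc_m, ka, kd, rmax):
    """Steady-state response for analyte at conc_m (molar).

    ka: association rate constant (M^-1 s^-1)
    kd: dissociation rate constant (s^-1); KD = kd / ka
    """
    kd_eq = kd / ka
    return rmax * conc_m / (conc_m + kd_eq)


def association(t_s, conc_m, ka, kd, rmax):
    """Response at time t_s after the start of analyte injection."""
    k_obs = ka * conc_m + kd
    r_eq = equilibrium_response(conc_m, ka, kd, rmax)
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


def dissociation(t_s, r0, kd):
    """Response at time t_s after switching to buffer, starting from r0."""
    return r0 * math.exp(-kd * t_s)


# Hypothetical parameters: KD = kd / ka = 100 nM, analyte at 100 nM.
ka, kd, rmax, conc = 1.0e5, 1.0e-2, 100.0, 1.0e-7
r_eq = equilibrium_response(conc, ka, kd, rmax)
curve = [association(t, conc, ka, kd, rmax) for t in range(0, 301, 30)]
```

Because the analyte concentration equals KD in this example, the steady-state response is half of Rmax. Fitting software adjusts ka, kd, and Rmax so that curves of this form match the reference-subtracted sensorgrams across all injected concentrations.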
ITC measurements
ITC datasets were generated using a Microcal PEAQ-ITC instrument (Malvern Instruments, Westborough, MA). All experiments with TβRII were performed in ITC buffer, consisting of 25 mM sodium phosphate, 50 mM NaCl, pH 6.0 at 35 °C. Experiments with TβRI were performed in ITC buffer, consisting of 25 mM HEPES, 50 mM NaCl, 0.05% NaN3, pH 7.5 at 25 °C, with the exception of the TβRI/TGF-β:TβRII titration, which was performed at 30 °C. A listing of the proteins included in the syringe and sample cell is provided in Tables 1-4.
Proteins included in the syringe and sample cell were dialyzed against ITC buffer and concentrated as necessary prior to being loaded into either the syringe or sample cell. For the TβRII experiments, fifteen 2.5 µL injections were performed with an injection duration of 5 s, a spacing of 150 s, and a reference power of 10. For the TβRI experiments, with the exception of the TβRI/TGF-β:TβRII titration, nineteen 2.0 µL injections were performed with an injection duration of 4 s, a spacing of 150 s, and a reference power of 10. The TβRI/TGF-β:TβRII titration was performed with thirteen 3.0 µL injections with an injection duration of 5 s, a spacing of 150 s, and a reference power of 10. Integration and data fitting were performed using the programs Nitpic63, Sedphat64,65, and GUSSI66. No more than 2 outlier data points were removed from any one ITC data set for analysis.
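To show how an injection schedule of this kind maps onto a binding curve, the sketch below computes the equilibrium concentration of a 1:1 complex after each injection. It is a deliberately simplified model and not the analysis performed in Nitpic or Sedphat: it ignores dilution and displacement of the cell contents, and the cell volume, concentrations, and KD are hypothetical.

```python
import math


def complex_conc(p_tot, l_tot, kd):
    """Equilibrium concentration of a 1:1 complex.

    All arguments share one concentration unit (micromolar here).
    Solves (P - x)(L - x) = KD * x for the physically valid root.
    """
    b = p_tot + l_tot + kd
    return (b - math.sqrt(b * b - 4.0 * p_tot * l_tot)) / 2.0


def titration(cell_um, syringe_um, kd_um, n_inj, inj_ul, cell_ul):
    """Return [(molar ratio, complex concentration)] after each injection.

    Simplification: the cell volume and cell protein concentration are
    treated as constant, so only the titrant total changes.
    """
    points = []
    for k in range(1, n_inj + 1):
        l_tot = syringe_um * k * inj_ul / cell_ul
        points.append((l_tot / cell_um, complex_conc(cell_um, l_tot, kd_um)))
    return points


# Fifteen 2.5 uL injections as in the text; all other values hypothetical.
points = titration(cell_um=20.0, syringe_um=200.0, kd_um=1.0,
                   n_inj=15, inj_ul=2.5, cell_ul=200.0)
```

The heat of each injection is proportional to the increase in complex between successive points, so the steepness of the curve around a molar ratio of one reflects KD relative to the cell concentration.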
Acknowledgements
The authors would like to thank Mike Delk for assistance with the NMR instrumentation. This research was supported by the NIH (GM58670) and the U.S. Department of Defense (DoD W81XWH-17-1-0429). Molecular graphics and analyses were performed with UCSF Chimera, which is developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco and supported by NIGMS P41-GM103311.