An integrative approach unveils a distal encounter site for rPTPε and phospho-Src complex formation

Determining the structures of protein tyrosine phosphatase (PTP): phospho-protein complexes, which is required to understand how specificity is achieved at the protein level, remains a significant challenge for protein crystallography and cryoEM owing to the transient nature of the binding interactions. Using rPTPεD1 and phospho-SrcKD as a model system, we established an integrative workflow involving protein crystallography, SAXS and pTyr-tailored MD simulations to determine the complex formed between rPTPεD1 and phospho-SrcKD, revealing transient protein-protein interactions distal to the active site. To support our finding, we determined the association rate between rPTPεD1 and phospho-SrcKD and showed that a single mutation on rPTPεD1 disrupts this transient interaction, reducing both the association rate and the activity. Our simulations suggest that rPTPεD1 employs a binding mechanism involving conformational change prior to the engagement of cSrcKD. This integrative approach is applicable to the determination of other PTP: phospho-protein complexes and is a general approach for elucidating transient protein surface interactions.



Introduction
Protein-tyrosine phosphorylation is a reversible post-translational modification that regulates cellular signaling in eukaryotes. Protein-tyrosine phosphorylation levels in the cell are balanced by the counteracting activities of protein-tyrosine kinases (PTKs) and protein-tyrosine phosphatases (PTPs) 1. Aberrations in the regulation of protein-tyrosine phosphorylation are often associated with disease states such as arthritis, diabetes and cancer 1-6. Crystallographic and peptide-binding studies of various PTPs such as PTP1B, SHP-1, SHP-2, rPTPε and rPTPα have revealed detailed mechanisms of substrate specificity and recognition at the active site 7-12. The hallmark of previous structural studies is that the cysteine-dependent active site typically features a small, deep pocket to accommodate the phosphorylated tyrosine (pTyr) side chain and a relatively flat outer surface for the adjacent residues 13. The interactions between the pTyr side chain and the active-site pocket provide most of the binding energy and drive the binding event. However, previous studies of the rPTPα phosphatase domain (rPTPαD1) and pTyr peptides with sequences derived from its physiological substrate, Src, displayed an unexpectedly weak affinity, with Michaelis constants (KM) in the millimolar range, much higher than the physiological concentration 10. Although the D2 domain of rPTPα and the SH2 domain of Src also play crucial roles in rPTPα: Src complex formation 9,14,15, studies of ERK kinase and metalloproteinase have shown that additional protein-protein interactions far from the active site (also known as encounter interfaces or exosites) can facilitate substrate recognition 16. Currently, there is only one PTP: phospho-protein complex structure in the Protein Data Bank (PDB), but it represents a noncatalytic mode of interaction and cannot reveal additional protein-protein interactions 17.
Therefore, the corresponding encounter interface in PTPs has remained largely unexplored, as no functional PTP: phospho-protein complex structure has yet been determined.

Herein, we report the first rPTPεD1: phospho-SrcKD complex structures, obtained by integrating experimental and computational approaches in a workflow that is applicable to other PTP complexes. In brief, the experimental SAXS data guide rigid-body docking to form the initial complex, which provides a defined spatial orientation between rPTPεD1 and phospho-SrcKD. This approach effectively reduces the computational time and resources required by multiscale MD simulations in searching for protein-protein binding ensemble structures 18-21. The subsequent pTyr-tailored MD simulation optimized the spatial arrangement of the two protein molecules and the encounter interface. The key residues and trajectory snapshots of protein complex formation are further revealed by steered MD simulations and umbrella sampling.
Our complex structure revealed an encounter interface, which greatly enhances the formation of a catalytically competent complex. A single residue on the encounter interface was replaced to partially disrupt charge-charge interactions, resulting in a seven-fold reduction of the association rate kon and a 30% reduction of PTPε phosphatase activity towards phospho-SrcKD but not towards pNPP, a pTyr substrate analog. Our structural analyses further suggest that a conformational selection mechanism plays an initial role in molecular recognition between rPTPεD1 and SrcKD.

Production of stable rPTPεD1: phospho-SrcKD complex
In the present study, we focused on the interaction between the D1 domain of rPTPε (rPTPεD1), which possesses phosphatase activity, and its target, the kinase domain of Src (SrcKD), in which the C-terminal pTyr527 is dephosphorylated. A known catalytically inactive, substrate-trapping mutant, rPTPεD1-C335S, was used to obtain a stable rPTPεD1: phospho-SrcKD complex. CSK, a kinase known to phosphorylate Src family members, was used to phosphorylate SrcKD in vitro 22. The SrcKD double variant K295M/Y416F, substituted at the ATP-binding and activation residues, respectively, was produced to prevent additional auto-phosphorylation of Src 23,24. The CSK-treated SrcKD was pooled with rPTPεD1 for complex formation and was further purified by size-exclusion chromatography (SEC). The purified rPTPεD1: phospho-SrcKD complex co-eluted at a volume distinct from those of the uncomplexed rPTPεD1 and SrcKD (Fig. 1a and 1b). Co-elution in SEC suggests that phospho-SrcKD forms a stable heterodimeric complex with rPTPεD1. Our analytical ultracentrifugation (AUC) results show that uncomplexed rPTPεD1 and SrcKD each give a single peak with an S20 value of ~3, whereas the rPTPεD1: phospho-SrcKD complex gives an additional peak with an S20 value > 4, indicating stable heterodimeric complex formation, consistent with the SEC experiment (Fig. 1c).

Distinct binding behavior of rPTPεD1 towards SrcKD and peptide
Previous findings revealed that rPTPα has only weak binding affinity (in the low mM range) toward the pTyr Src peptide 10. As rPTPε is a homolog of rPTPα, the SEC result showing that rPTPεD1 does not co-elute with the pTyr Src peptide is similarly a sign of weak or transient binding between rPTPεD1 and the Src peptide (Fig. 1d). As we observed stable rPTPεD1: phospho-SrcKD complex formation (Fig. 1a), we hypothesized a non-peptide-mediated binding regime and the existence of additional encounter interfaces (exosites) between rPTPεD1 and SrcKD.

Docking model by SAXS and MD simulation
The combination of multiscale MD simulations with solution SAXS is advantageous because MD simulations allow conformational rearrangement while SAXS experiments provide information about the overall shape, which can effectively shorten the time-consuming simulation process of searching for protein-protein binding ensembles. SAXS (with q values ranging from 0.009 to 0.2 Å⁻¹) was used to determine a molecular envelope for the rPTPεD1: phospho-SrcKD complex, indicating an elongated particle in solution with a radius of gyration (Rg) of 29.4 Å and a maximum intramolecular distance (Dmax) of 91.1 Å (Fig. 1e, 1f, 2a and 2b and Table S1). The calculated low-resolution envelope had adequate space to fit the complex molecules (Fig. 1f). In addition, a rigid-body docking complex was generated by CORAL using the crystal structures of rPTPεD1 (PDB ID: 2JJD) and SrcKD (PDB ID: 2SRC). One of the best-fitting CORAL docking models, with a χ² value of 4.1, showed a docking complex with a tail-to-tail relative orientation (Fig. 2a). In this complex, no lysine or arginine residues were found proximal to the encounter (intermolecular) interface. Further examination found that the complex cannot be cross-linked by amine-to-amine crosslinkers, such as glutaraldehyde and bis-sulfosuccinimidyl suberate (BS3), supporting this docking model.

To create a pTyr-bound docking model, we manually moved the flexible pTyr region (Asp518-Gln534) toward rPTPεD1 based on geometry restraints and positioned pTyr527 in the active site of rPTPεD1 based on the pTyr-peptide-bound PTP1B crystal structure (PDB ID: 1G1H) 25. The ability of the C-terminal pTyr527 to reach the rPTPεD1 active site by moving only the flexible C-terminus implies that our tail-to-tail docking model is in a functionally competent state, allowing rPTPεD1 to dephosphorylate pTyr527 of SrcKD.
The missing N-terminal residues (four residues in rPTPεD1 and 27 residues in SrcKD) were added to the CORAL docking model by RosettaCommons. By keeping pTyr527 bound in the active site and the N-terminus flexible, the docking model was optimized by BILBOMD with an improved χ² value of 2.5 (Fig. 2d and S1). However, close inspection of the BILBOMD model revealed that Glu486 and Glu489 of SrcKD were surrounded by a negatively charged surface on rPTPεD1, indicating an unfavorable repulsive contact in the encounter interface (Fig. S2). Application of MD relaxation allowed these unfavorable repulsive contacts to be resolved into favorable attractive interactions in the encounter interface (Fig. 2d, S1 and S2). Compared to the crystal structure of uncomplexed rPTPεD1, the most apparent change is that the side chain of Arg220, present in the encounter interface, rotates to the more solvent-exposed side, providing a favorable attractive contact in the interface (Fig. S1). In the case of SrcKD, a minor rearrangement of the backbone of one helix (residues 469-477) is observed. The major change is that the loop including Glu486 flips a distance of 4.5 Å toward the encounter interface, contributing to a favorable attraction in the encounter complex interface. Overall, the MD-relaxed complex forms additional rPTPεD1-R220: SrcKD-E486 and rPTPεD1-K237: SrcKD-D518 interactions, with the χ² value improved from 2.5 to 1.6, indicating a better fit to the experimental SAXS data (Fig. 2c and Table S2).
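
The χ² values quoted above (4.1, 2.5 and 1.6) measure the discrepancy between the scattering curve computed from a model and the experimental SAXS profile. As a minimal sketch, assuming the commonly used reduced χ² with a least-squares scale factor (the text does not state the exact definition applied by the docking and refinement programs), the comparison could be computed as follows. The function names and the toy intensities are illustrative and are not taken from the study.

```python
def scale_factor(i_exp, i_model, sigma):
    """Least-squares scale c that minimises sum(((I_exp - c*I_model)/sigma)^2)."""
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_model, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_model, sigma))
    return num / den


def reduced_chi2(i_exp, i_model, sigma):
    """Reduced chi-squared between experimental and model intensities.

    Assumed definition: chi2 = 1/(N-1) * sum(((I_exp - c*I_model)/sigma)^2),
    with c taken from scale_factor(). Lower values indicate a better fit.
    """
    c = scale_factor(i_exp, i_model, sigma)
    n = len(i_exp)
    residuals = (((e - c * m) / s) ** 2 for e, m, s in zip(i_exp, i_model, sigma))
    return sum(residuals) / (n - 1)


if __name__ == "__main__":
    # Toy data (hypothetical): intensities sampled at four q values.
    model = [10.0, 5.0, 2.0, 1.0]
    experiment = [21.0, 10.0, 4.0, 2.0]
    errors = [1.0, 1.0, 1.0, 1.0]
    print(reduced_chi2(experiment, model, errors))
```

Under this definition, a model whose curve differs from the data only by a constant factor scores zero, so a drop in χ² reflects a change in the shape of the computed curve rather than its overall scale.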

Mapping the interactions during complex formation with a free-energy approach
Typically, searching for a protein dissociation or association pathway requires long-timescale MD simulations combined with an additional modeling approach 26; however, convergence is not easy to reach. In contrast, our SAXS-based complex structure can quickly provide a reasonable initial complex model for further MD optimization.
Initially, MD simulation with umbrella sampling failed to trace the pathway trajectory owing to the strong attractive interactions between pTyr527 of SrcKD and the rPTPεD1 active site. Consequently, the complex remained in the bound form, which resulted in significant rotation of the protein molecules (Fig. S3). Hence, the unphosphorylated form of SrcKD was purposely used for the following MD simulations.

The reaction coordinate was defined as the center-of-mass (COM) distance between rPTPεD1 and SrcKD. In the dissociation process, the encounter interface starts to break at a COM distance of 49 Å and vanishes at 55 Å (Fig. 3a), suggesting that the complex is completely dissociated at 55 Å. To understand and evaluate the contribution of key residues, we decomposed the free energy of two charge-charge residue pairs, R220:E486 and K237:D518. The energy decomposition results suggested that both residue pairs play roles in binding, which is consistent with the optimized MD model (Fig. 3b). In addition, our results illustrate that the rPTPεD1-R220: SrcKD-E486 pair shows a larger difference between the bound state and the unbound state, whereas the rPTPεD1-K237: SrcKD-D518 pair displays a minor change, indicating that the R220:E486 interaction plays a crucial role in complex formation.

Next, to demonstrate the corresponding intermediate structural changes from the unbound to the bound state, MD simulation in the association direction was performed. The starting model was derived from the previous MD pathway trajectory at a COM distance of 56.5 Å. To simulate the process of complex formation, the proteins were slowly moved toward each other in the bulk solvent environment. In the unbound state, the proteins were very dynamic without any close contact (Fig. 3c).
At the intermediate state, the complex gains electrostatic attraction between rPTPεD1-R220 and SrcKD-E486 (Fig. 3d). At the next stage of complex formation, an additional interaction is formed between rPTPεD1-K237 and SrcKD-D518 (Fig. 3e). SrcKD-D518 is only eight amino acid residues away from pTyr527 in the primary structure, so this intermolecular arrangement brings SrcKD-pTyr527 close to the rPTPεD1 active site (Fig. 3f). The interactions identified along the association pathway of the rPTPεD1: SrcKD complex are highly consistent with the results of the dissociation process, in which the R220:E486 and K237:D518 pairs play roles in complex formation.
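
The reaction coordinate used in this section is the distance between the centers of mass of the two proteins. The following is a minimal sketch of how such a coordinate could be evaluated for one trajectory frame, assuming that atomic coordinates (in Å) and masses are already available as plain Python lists. The function names and the toy atoms are hypothetical. The 49 Å and 55 Å thresholds are the values reported above for the dissociation simulation, and using them as a simple three-state classifier is an illustrative simplification, not part of the published analysis.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms (coords as (x, y, z) tuples)."""
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for c, m in zip(coords, masses)) / total
        for axis in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance between the centers of mass of two atom groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a),
        center_of_mass(coords_b, masses_b),
    )


def interface_state(distance, onset=49.0, lost=55.0):
    """Label a frame using the reported COM distances (Å) of the dissociation run."""
    if distance < onset:
        return "bound"
    if distance < lost:
        return "disrupting"
    return "dissociated"


if __name__ == "__main__":
    # Toy example (hypothetical atoms): two carbons versus one nitrogen.
    group_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    group_b = [(50.0, 0.0, 0.0)]
    d = com_distance(group_a, [12.0, 12.0], group_b, [14.0])
    print(d, interface_state(d))
```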

In vitro validation of the role of rPTPε-D1-R220 in complex formation
To further validate the role of the rPTPεD1-R220: SrcKD-E486 interaction, a repulsive […] indicates that the binding event is limited by conformational rearrangement 28.

Finally, we compared the phosphatase activity of the rPTPεD1 wild type and the R220E variant using pNPP and phospho-SrcKD as substrates. As expected, the pNPP assay showed that the rPTPεD1 wild type and the R220E mutant possess similar phosphatase activity (Fig. 4e), suggesting that the R220E mutation does not affect the catalytic site. A previous study of PTP1B activity toward a phospho-peptide showed that disruption of charge-charge interactions has little effect on kcat 27. However, our assay of phosphatase activity toward phospho-SrcKD revealed a ∼30% reduction in activity, as measured by kapp, for the rPTPεD1-R220E variant (Fig. 4f). Furthermore, sequence alignment shows that R220 is highly conserved in rPTPεD1 as well as rPTPαD1 (Fig. 5a and 5b). The rPTPαD1-R317E mutation (corresponding to R220E in rPTPε) also exhibited a ~30% decrease in activity compared to wild-type rPTPαD1 (Fig. 4g and 4h), […]

Our structure reveals a key charge-charge interaction between rPTPεD1-R220 and phospho-SrcKD-E486, far from the active site, for complex formation (Fig. 3f). A systematic analysis of 131 protein-protein hetero-complexes in the PDB also shows that transient charge-charge interactions are predominant in signaling complexes, which is consistent with our finding 29. Electrostatic interactions remain effective over distances of 10-20 Å 30. We postulate that a long-range electrostatic interaction between R220 and E486 brings rPTPεD1 and SrcKD into proximity at the beginning of complex formation. Once the R220:E486 encounter interface is established, the second interaction, between K237 and D518, is formed.
The conformation adopted by D518 in the K237:D518 interaction orients the dynamic C-terminal pTyr527 (connected through the main chain) into the rPTPεD1 active site for dephosphorylation. This proposed association pathway involves a 7.5-fold wider charge-charge interface, which increases the probability of rPTPεD1: phospho-SrcKD complex formation compared to the interface between rPTPεD1 and pTyr (Fig. 5e). […]

Similarly, a comparison of rPTPεD1/rPTPεD1-R220E towards cSrc was performed using a Superdex 75 HR 10/300 column. […] (Table S1) […]

The force weight was set on the x, y and z components, with a restraint force constant of 5 kcal mol⁻¹ Å⁻². For the dissociation process, the initial model was taken from the MD-optimized structure. Simulations were carried out over 36 windows of 1 ns runs, with a constant pulling velocity during sampling relaxation (pulling force: 5 kcal mol⁻¹ Å⁻², velocity: 0.0005 Å ps⁻¹), corresponding to a total simulation time of 0.36 μs. For the association process, the starting model was taken from the trajectory of the dissociation process, at a COM distance of ~57 Å. Simulations were carried out over 50 windows of 1 ns runs, with a slower pulling velocity during sampling relaxation than that of the dissociation process (pulling force: 5 kcal mol⁻¹ Å⁻², velocity: 0.00025 Å ps⁻¹), corresponding to a total simulation time of 0.5 μs.
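
To illustrate how the stated pulling velocity and window length translate into restraint positions along the COM coordinate, the sketch below lays out window centres for the association run from the values given above (start at ~57 Å, 0.00025 Å ps⁻¹, 50 windows of 1 ns). It rests on several assumptions that the text does not confirm: that each window centre is the pulled position at the start of that window, that the restraint is harmonic of the form k(x − x0)² without a 1/2 prefactor, and that the function names are purely illustrative.

```python
def window_centres(start, velocity, window_ps, n_windows):
    """COM restraint centres (Å), one per window.

    start      -- COM distance at the first window (Å)
    velocity   -- signed pulling velocity (Å/ps); negative moves the proteins together
    window_ps  -- duration of one window (ps)
    n_windows  -- number of windows
    Assumption: the centre of window i is the pulled position at the start of window i.
    """
    step = velocity * window_ps  # displacement covered during one window (Å)
    return [start + i * step for i in range(n_windows)]


def harmonic_restraint(x, x0, k=5.0):
    """Restraint energy (kcal/mol) for a force constant k in kcal mol^-1 Å^-2.

    Assumed form: V = k * (x - x0)^2, with no 1/2 prefactor.
    """
    return k * (x - x0) ** 2


if __name__ == "__main__":
    # Association run as described in the text: approach from ~57 Å.
    centres = window_centres(start=57.0, velocity=-0.00025, window_ps=1000.0, n_windows=50)
    print(centres[0], centres[-1])
    # Energy penalty for sitting 1 Å away from a window centre.
    print(harmonic_restraint(58.0, 57.0))
```

With these inputs each window advances the restraint by 0.25 Å, so consecutive windows overlap strongly for a force constant of this size, which is generally what umbrella sampling needs for the histograms of adjacent windows to be combined.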

Pathway (FEP)
In umbrella sampling (US-PMF) 43-45, a harmonic restraint is placed at successive points along the reaction coordinate, with a restraining potential of the form $V(t) = k\,(x_t - x_0)^2$ […]

The rPTPεD1/rPTPαD1 phosphatase assay using pNPP as the substrate was performed as previously described 48,49. In brief, purified rPTPεD1, rPTPαD1, rPTPεD1-R220E or rPTPαD1-R317E protein was added to a solution containing 20 mM Tris-HCl (pH 7.5), 50 mM NaCl and 20 mM pNPP. The reactions were incubated for 30 min, and the level of dephosphorylation was measured at 405 nm using a UV spectrometer. All measurements were performed in triplicate.