MLL1 minimal catalytic complex is a dynamic conformational ensemble susceptible to pharmacological allosteric disruption

Histone H3K4 methylation is an epigenetic mark associated with actively transcribed genes. This modification is catalyzed by the mixed lineage leukaemia (MLL) family of histone methyltransferases including MLL1, MLL2, MLL3, MLL4, SET1A and SET1B. Catalytic activity of MLL proteins is dependent on interactions with additional conserved proteins, but the structural basis for subunit assembly and the mechanism of regulation are not well understood. We used a hybrid methods approach to study the assembly and biochemical function of the minimally active MLL1 complex (MLL1, WDR5 and RbBP5). A combination of small angle X-ray scattering (SAXS), cross-linking mass spectrometry (XL-MS), NMR spectroscopy, and computational modeling was used to generate a dynamic ensemble model in which subunits are assembled via multiple weak interaction sites. We identified a new interaction site between the MLL1 SET domain and the WD40 repeat domain of RbBP5, and demonstrate the susceptibility of the catalytic function of the complex to disruption of individual interaction sites.


INTRODUCTION
Post-translational modifications on histone tails are key epigenetic signals for regulation of chromatin structure and gene expression. Histone H3 lysine 4 (H3K4) methylation is the epigenetic mark exclusively associated with transcriptionally active chromatin (1, 2). This modification is mostly catalyzed by the MLL/SET1 family of histone methyltransferases (3, 4), through their evolutionarily conserved SET domain (5, 6). The founding member of this family of H3K4 methyltransferases is the yeast SET1 protein (7, 8). In mammals, methylation of H3K4 is carried out by a family of six proteins: MLL (mixed lineage leukemia protein) 1 to MLL4, SET1A and SET1B (9-15). The MLL proteins play crucial roles in embryonic development and hematopoiesis through transcriptional regulation of the clustered homeobox (Hox) genes and other genes important for developmental regulation (10, 16-19). Deletion of MLL1 and MLL2 can lead to severe defects in embryonic development in mice (18, 20). The MLL1 gene is frequently rearranged in human acute leukemia in both adults and children (21-23). MLL3 and MLL4 have also been linked to other human malignancies. Recent studies have identified inactivating mutations in MLL3 and MLL4 in different types of human tumors (24-27), as well as in Kabuki syndrome (28).
The catalytic activity of MLL family members is dependent to varying degrees on the presence of additional conserved protein subunits, RbBP5, WDR5 and ASH2L, and a minimal core enzyme can be reconstituted with the conserved C-terminal SET domain of MLLs and at least two of the other subunits. Interestingly, in studies of these reconstituted core enzymes, MLL1 appears to be unique among the family members in its requirements for, and interactions with, other subunits. For example, compared to other family members, the catalytic activity of MLL1 is the most dependent on WDR5 (29-31). Similarly, MLL1 binds with the least affinity to the RbBP5-ASH2L heterodimer, and its catalytic activity is only weakly stimulated by RbBP5-ASH2L compared to WDR5 (32).
Recent crystallographic studies of MLL3 support a model in which the RbBP5-ASH2L heterodimer stabilizes the catalytically active conformations of MLL2,3,4 through interactions with conserved surfaces on their SET domain (32). However, it was suggested that two key variant residues on this surface of MLL1 dramatically weakened the interaction between MLL1 and RbBP5-ASH2L relative to that of other MLL members, thereby increasing the dependence of MLL1 on the WDR5 subunit. The unique dependence of MLL1 activity on WDR5 may be of therapeutic relevance, as we and others have shown that pharmacological targeting of the MLL interaction site on WDR5 can functionally antagonize MLL1 in cancers that are dependent on MLL1 activity (33-35). While there are several structures of WDR5 bound to MLL and RbBP5 peptides (31, 36-39), as well as a crystal structure of the apo-SET domain (40) of MLL1 and a 24 Å resolution cryo-EM model of the homologous yeast COMPASS complex (41), an atomic level picture of a functional MLL1 catalytic core complex is still lacking. Here, we report a hybrid methods study of MLL1 and its catalytic core components in solution. Using small angle X-ray scattering (SAXS), cross-linking mass spectrometry (XL-MS), NMR spectroscopy, and computational modeling we derived a dynamic ensemble model for MLL1/WDR5/RbBP5 and identify a new interaction site between MLL1-SET domain and the N-terminal WD40 repeat domain of RbBP5. Our data support the notion that the functional MLL1 enzyme comprises a collection of weak but specific interactions, and that the disruption of individual interactions can have significant destabilizing effects on the entire complex. These results highlight the dynamic nature of an important protein complex and the strategy of targeting a weak but druggable protein-protein interaction site to antagonize the function of a larger macromolecular assembly that is dependent on a collection of weak interactions.

Protein preparation
The individual components of the MLL1 complex used in this study were expressed in E. coli and purified using an N-terminal GST-tag (for MLL1) or His-tag (for WDR5 and RbBP5). The dimeric and trimeric complexes of MLL1 used for SAXS and cross-linking studies were expressed in Sf9 cells. The dimeric complexes WDR5-MLL1-WIN and WDR5-RbBP5 were purified using TALON affinity resin (Clontech) followed by gel filtration chromatography. The purified dimeric complexes were incubated together on ice for 2 hours to reconstitute the trimeric complex, which was purified and recovered by gel filtration chromatography. Detailed procedures are described in the Supplementary Data section.

SAXS data collection, analysis and modelling
SAXS measurements were carried out at the beamline 12-ID-C of the Advanced Photon Source, Argonne National Laboratory. The energy of the X-ray beam was 18 keV (wavelength λ = 0.6888 Å), and two setups (small- and wide-angle X-ray scattering, SAXS and WAXS) were used in which the sample to charge-coupled device (CCD) detector (MAR research, Hamburg) distances were adjusted to achieve scattering q values of 0.006 < q < 2.3 Å⁻¹, where q = (4π/λ)sinθ, and 2θ is the scattering angle. Data were analyzed using the program PRIMUS (ATSAS package, EMBL (42)). Detailed descriptions of SAXS data collection and analysis, and modelling protocols, are provided in the Supplementary Data.
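The relation between beam energy, wavelength and momentum transfer quoted above can be checked with a few lines of Python. This is a minimal sketch for illustration only; the function names and the example scattering angle are assumptions, not part of the beamline software.

```python
import math

# h*c expressed in keV * Angstrom (standard physical constant)
HC_KEV_ANGSTROM = 12.398419843


def wavelength_angstrom(energy_kev):
    """X-ray wavelength in Angstrom for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_kev


def momentum_transfer(two_theta_deg, wavelength):
    """q = (4*pi/lambda) * sin(theta), where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength


lam = wavelength_angstrom(18.0)       # beam energy quoted in the text
q = momentum_transfer(1.0, lam)       # hypothetical scattering angle of 1 degree
print(f"lambda = {lam:.4f} A, q = {q:.4f} 1/A")
```

An 18 keV beam gives a wavelength that rounds to the 0.6888 Å stated in the text.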

Chemical cross-linking mass spectrometry
The reconstituted trimer complex of WDR5, RbBP5 and MLL1-SET was cross-linked at a concentration between 12 and 16 μM, with 1 mM of isotopically coded disuccinimidyl suberate (DSS-d0, DSS-d12) as described previously (43

GST Pull-down experiments
Recombinant purified MLL1-GST proteins were incubated with RbBP5 constructs in an assay buffer containing 20 mM Tris pH 7.7, 150 mM NaCl, 10 μM ZnCl2, 5 mM β-mercaptoethanol, 5 mM DTT, 1 mM PMSF in a 1:2 molar ratio at 4 °C for 1 h. Proteins were then incubated with 100 μl of glutathione-Sepharose beads (GE Healthcare) for an additional 1 h. The mixture was transferred to a micro-column and was extensively washed with assay buffer. Bound proteins were eluted with 30 mM reduced glutathione and detected by SDS-PAGE and Coomassie staining.
All reactions were incubated for 90 minutes at room temperature and the SPA method was used to determine activities. Experiments were performed in triplicate. To test the effect of OICR-9429 on the MLL1 complex, increasing concentrations of the compound were incubated with 200 nM MLL1-WDR5 complex for 20 min before adding 400 nM RbBP5. The activity of the complex was measured as above.

SAXS data reveal solution ensembles for WDR5, RbBP5 and MLL-SET
To model catalytically active MLL1 complexes, we first collected reference solution data for the individual subunits including the SET domain of MLL1, the WD40 repeat region of WDR5 (ΔN-WDR5), the N-terminus of RbBP5 (RbBP5-NTD) and full-length RbBP5, followed by characterization of dimeric and trimeric complexes. Fig 1A shows the protein constructs used in this study. Normalized Kratky plots of ΔN-WDR5 and RbBP5-NTD exhibit a typical bell-shape expected for a globular protein and are nearly superimposable in the q range 0<qRg<3 (Fig 1B). Also, the experimental values of Rg predicted for ΔN-WDR5 and RbBP5-NTD are in agreement with the theoretical values expected for globular proteins ( Table 1 and Fig S2). The normalized Kratky plot of MLL1-SET also exhibits a bell-shape, but its maximum is shifted with respect to the globular protein position, with poor convergence at high q-values, indicating that MLL1-SET is flexible. The observed flexibility of the MLL1-SET could be attributed to known inherent dynamics of the SET domain in the absence of cofactor (32), and to the disordered N-terminal tail. The calculated solution ensembles for each protein taking into account known or predicted disordered regions (see SI for details) establish good correspondence between our SAXS measurements and the crystal structures of WDR5, the SET domain of MLL1, and our homology model of RbBP5-NTD predicted using ROSETTA (48) (Fig S1).
One of the main challenges in modeling the MLL1 complex is the lack of structural information on RbBP5. To better understand its structural arrangement, we collected [¹H-¹⁵N]-TROSY NMR spectra of a full-length construct, as well as constructs corresponding to its C-terminal and N-terminal regions (Fig 2A). The data confirm that RbBP5-NTD is a globular, folded domain, consistent with our SAXS analysis and WD40 homology model.
The C-terminal region of RbBP5 (RbBP5-CTD) is substantially disordered as evidenced by the lack of spectral dispersion (Fig 2A). Both the gel filtration profile (Fig S3B) and the radius of gyration estimated from SAXS data (Table 1 and Fig S2) indicate a high degree of disorder in the RbBP5 full-length protein. This is further supported by sequence-based secondary structure prediction and order parameters, which predict a rigid globular N-terminus and a flexible coil-like C-terminus (Fig S3). Interestingly, the [¹H-¹⁵N]-TROSY spectrum of full-length RbBP5 is not the superposition of the individual NTD and CTD spectra, and reflects features of both folded and unfolded regions with some apparent conformational broadening, possibly reflecting weak intramolecular interactions (Fig S3A).
The NMR data are consistent with the general shape of the normalized Rg-based Kratky plot and the pair distance distribution function P(r) for full-length RbBP5 (Fig 1B, C). In particular, the P(r) function of RbBP5 has an asymmetric shape with a long smooth tail at large r values, and the position of its maximum is shifted only slightly (~4 Å) with respect to that of RbBP5-NTD. The latter features indicate that full-length RbBP5 has no additional globular content compared to RbBP5-NTD.
Based on the above data we used the Sparse Ensemble Selection (SES) approach (49) to calculate a solution ensemble of RbBP5 that would satisfy the SAXS data. An initial ensemble consisting of 20,000 models of RbBP5 with random conformations of its flexible regions (residues 1-23 and 326-538) did not fit the SAXS data well (the goodness-of-fit [...]). We next generated a solution ensemble that better fits the SAXS data by calculating an optimal weight for each model in the initial ensemble using a multi-orthogonal matching pursuit algorithm (49) (see SI for details). The resulting optimal ensemble fits the SAXS data very well with χ = 0.38 (Figs 2C, S1). The most populated models in the optimal ensemble are shown in Figs 2B, S1F, G. The optimal ensemble displays a much narrower distribution of radius of gyration values than the initial random ensemble, with a major peak at Rg = 37 Å (Fig 2C). This indicates that RbBP5 adopts a more compact conformation than would be predicted for a fully random CTD, consistent with our NMR data for the full-length protein.
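To make the comparison of Rg distributions concrete, the sketch below computes the radius of gyration of a single model from point coordinates and the apparent Rg of a weighted ensemble. It assumes equal-mass beads (for example Cα positions) and conformers of identical mass, so that the ensemble value follows from the weighted mean of Rg squared. The function names and coordinates are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Rg of equal-mass points: root mean square distance from the centroid."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    msd = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
              for x, y, z in coords) / n
    return math.sqrt(msd)


def ensemble_rg(rg_values, weights):
    """Apparent Rg of a weighted ensemble: sqrt(sum(w_k * Rg_k^2) / sum(w_k))."""
    total = sum(weights)
    return math.sqrt(sum(w * rg ** 2 for w, rg in zip(weights, rg_values)) / total)


# Toy model: two beads 2 A apart have Rg = 1 A.
print(radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]))
# Toy ensemble of two equally populated conformers.
print(ensemble_rg([3.0, 4.0], [0.5, 0.5]))
```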

Binary subcomplexes have dynamic non-random solution conformations mediated by WD40 repeat domains
Our SAXS data for the binary complexes of WDR5/MLL1-WIN and WDR5/RbBP5 both suggest the presence of significant disorder, especially for WDR5/MLL1-WIN (Fig 3A, S3A). The P(r) functions of WDR5/MLL1-WIN and WDR5/RbBP5 are typical for proteins containing globular domains tethered by long disordered regions (Fig 3A). The position of the P(r) major peak for the aforementioned complexes is close to the positions of the major peaks of P(r) of their individual components (Fig 1C), indicating that in both complexes the globular domains are not in close contact and may not adopt a unique arrangement in solution. WDR5 is known to interact with RbBP5 and MLL1 through small peptide segments designated as the WDR5 binding motif (WBM) (38) and WDR5 interacting (WIN) (36) motif, respectively (Fig 1A). Both interactions with WDR5 have reported dissociation constants on the order of 1-2 μM (30, 36, 38, 39). To calculate solution ensembles of the WDR5 binary complexes, we first used NMR to verify that WDR5's mode of interaction with these two motifs, as observed in the crystal structures, is maintained in solution. We expressed a triply-labeled (¹⁵N/¹³C/²H) ΔN-WDR5 construct, and were able to assign 254 amides (Fig S6). We then used chemical shift perturbation (CSP) analysis in [¹H-¹⁵N]-TROSY titration experiments to localize the WDR5 binding site for peptides corresponding to the two motifs. As seen in Fig 3B, there is excellent agreement between the WDR5 CSP profiles and the WDR5/WIN (PDB:4ESG) and WDR5/RbBP5 (PDB:2XL2) crystal structures.
Next, using these two structures to fix each WDR5-peptide interface, we modeled the ensemble of solution conformations for the binary complexes of WDR5 with MLL1-WIN and full-length RbBP5 using the SES method. The arrangement of the globular domains in the most populated models of the optimal ensembles for both WDR5/MLL1-WIN and WDR5/RbBP5 complexes does not support the existence of additional interactions of WDR5 with MLL1-WIN or with RbBP5 other than those described above (Fig S4).
There is currently no atomic resolution structural data for the interaction of MLL1 with RbBP5. Recently, the activation segment (AS) of RbBP5 was shown to bind to the SET domain of other MLL family members, but only very weakly to MLL1 (32). In order to determine whether there is a direct interaction between RbBP5 and MLL1, we performed GST pull-down studies of full-length RbBP5, RbBP5-NTD and RbBP5 (320-410) with GST-MLL1-SET and GST-MLL1-WIN (Fig 3C-E). Both MLL1 constructs interacted with RbBP5 constructs containing the N-terminal WD40 repeat but did not interact with the C-terminal residues (aa 320-410) of RbBP5 containing the AS region. These results agree with the lack of conservation of the RbBP5 AS-binding surface on MLL1 (32), and suggest that MLL1-SET may interact with the WD40 domain of RbBP5.

SAXS and cross-linking data suggest a dynamic triangulated ensemble for WDR5/RbBP5/MLL1-WIN
Our SAXS data for the catalytically active WDR5/RbBP5/MLL1-WIN complex showed a substantial amount of flexibility. The shape of the experimental Kratky plots of the complex is typical of partially disordered proteins (Figs 4A, S5A). In particular, the Rg-based Kratky plot is a bell-shaped curve with a maximum at (2.26, 1.27), shifted to higher values of the coordinates with respect to the position expected for a globular protein. Also, the presence of a high degree of flexibility is evidenced by the poor convergence of the Kratky plots at high q values. The low maximum value of 0.48 in the Vc-based Kratky plot (Fig S5A), as well as the asymmetric shape of the P(r) function (Fig S5B), suggests an elongated shape of the complex. This is in agreement with the averaged ab initio SAXS-derived molecular envelope, which showed an extended shape with approximate dimensions of 220 × 105 × 70 Å (Fig 4D).
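For reference, the position of the Rg-based Kratky maximum expected for a compact particle can be reproduced numerically. The sketch below assumes an ideal Guinier-type intensity, I(q) = I(0)·exp(−(qRg)²/3), as a stand-in for a globular protein; its dimensionless Kratky curve peaks at qRg = √3 with height 3/e, against which the maximum reported for the complex can be compared. The function names and the Rg value are illustrative assumptions.

```python
import math


def dimensionless_kratky(q_values, intensities, rg, i0):
    """Return (q*Rg, (q*Rg)^2 * I(q)/I(0)) pairs for an Rg-based Kratky plot."""
    return [(q * rg, (q * rg) ** 2 * i / i0)
            for q, i in zip(q_values, intensities)]


def guinier_intensity(q, rg, i0=1.0):
    """Idealised scattering of a compact particle (Guinier approximation)."""
    return i0 * math.exp(-((q * rg) ** 2) / 3.0)


rg = 20.0                                      # hypothetical Rg in Angstrom
qs = [0.0005 * k for k in range(1, 601)]       # q grid up to 0.3 1/A
curve = dimensionless_kratky(qs, [guinier_intensity(q, rg) for q in qs], rg, 1.0)
peak_x, peak_y = max(curve, key=lambda point: point[1])
print(f"peak at qRg = {peak_x:.2f}, height = {peak_y:.3f}")
```

A maximum displaced to larger qRg and larger height than this reference, as reported for the trimeric complex, is the signature of flexibility or elongation.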
We note that pair-distance distribution functions of proteins containing several globular domains connected by long disordered regions are characterized by peaks at low r-values, corresponding to the intra-domain distances. Therefore, if the three globular domains of WDR5, MLL1-SET and RbBP5-NTD are not interacting directly with each other within the trimeric MLL1 complex, we would expect the P(r) function of the complex to have peaks at 26-32 Å, reflecting the inter-atomic distances prevailing within these domains (Figs 1C, S5B). However, the experimental P(r) function of the trimeric MLL1 complex has its maximum at a much larger distance of ~ 47 Å (Fig S5B), suggesting the existence of possible inter-domain contacts in the complex.
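The argument above rests on reading P(r) as a histogram of interatomic distances. A minimal geometric sketch follows, assuming point scatterers of equal weight; an experimental P(r) is obtained from the scattering curve by indirect Fourier transformation and is weighted by electron density, which this toy function does not attempt. The function name and coordinates are hypothetical.

```python
import math
from itertools import combinations


def pair_distance_histogram(coords, bin_width):
    """Count pairwise distances per bin; bin i covers [i*w, (i+1)*w)."""
    counts = {}
    for a, b in combinations(coords, 2):
        index = int(math.dist(a, b) // bin_width)
        counts[index] = counts.get(index, 0) + 1
    n_bins = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_bins)]


# Three collinear beads spaced 10 A apart: two pairs at 10 A, one pair at 20 A.
beads = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(pair_distance_histogram(beads, 5.0))
```

Applied to a multi-domain model, intra-domain pairs populate the short-distance bins, while inter-domain pairs appear at distances set by the domain separation.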
To aid our modeling of the solution conformations of the trimeric complex we performed cross-linking mass spectrometry studies. We observed many intramolecular cross-links within each of the three proteins. These were highly consistent with the available WDR5 and MLL1-SET crystal structures and, importantly, with our RbBP5-NTD homology model, establishing that these structural models are reliable representations of the domains within the trimeric complex in solution. We also observed a number of intermolecular cross-links, with the largest number being between MLL1 and RbBP5, suggesting association of these two subunits in solution. Fig 4B shows the sequence mapping of both intra- and inter-molecular DSS cross-links observed for the trimeric complex. For the purposes of modeling we used only intermolecular cross-links between lysine residues within the globular subunits (Table S1) [...] domains and their WDR5 interacting sequences (WBM and WIN, respectively) (Fig 4E, S5C).
There are only two inter-domain cross-links that involve WDR5, and they can only be simultaneously satisfied in the more compact subpopulation corresponding to Rg ~ 37 Å.
The ensemble distribution of Cα-Cα distances corresponding to these cross-links showed a large fraction of ensemble members for which the cross-links cannot be simultaneously formed (Fig 4F).
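The kind of bookkeeping behind such a statement can be sketched as follows: for each ensemble member, test whether all selected cross-links are within a Cα-Cα distance cutoff at the same time, then report the fraction of members that pass. The 30 Å default is an assumed cutoff, and the residue labels and coordinates are hypothetical placeholders rather than data from this study.

```python
import math


def all_crosslinks_satisfied(model, crosslinks, max_dist=30.0):
    """True if every cross-linked residue pair is within max_dist (Angstrom)."""
    return all(math.dist(model[a], model[b]) <= max_dist for a, b in crosslinks)


def fraction_satisfying(models, crosslinks, max_dist=30.0):
    """Fraction of ensemble members in which all cross-links hold together."""
    if not models:
        return 0.0
    hits = sum(1 for m in models if all_crosslinks_satisfied(m, crosslinks, max_dist))
    return hits / len(models)


# Hypothetical residue labels mapped to Calpha coordinates (Angstrom).
compact = {"A1": (0.0, 0.0, 0.0), "B1": (20.0, 0.0, 0.0), "C1": (0.0, 25.0, 0.0)}
extended = {"A1": (0.0, 0.0, 0.0), "B1": (60.0, 0.0, 0.0), "C1": (0.0, 25.0, 0.0)}
links = [("A1", "B1"), ("A1", "C1")]
print(fraction_satisfying([compact, extended], links))
```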

RbBP5-NTD has a unique interaction mode with MLL1
A recent crystallographic study revealed an important role for the AS+ABM region of RbBP5 in binding to the SET domain of MLL family proteins, thereby stimulating the latter's catalytic activity (32). This work showed that the catalytic activity of MLL2, 3, 4 and SET1A/B was highly dependent on the RbBP5AS+ABM/ASH2LSPRY dimer, but not WDR5. In contrast, the catalytic activity of MLL1 SET domain was only weakly stimulated by the RbBP5AS+ABM/ASH2LSPRY dimer and instead, its optimal activity was more dependent on WDR5. Our solution model suggests an explanation for these observations.
Highly populated models of the trimeric complex in the optimal ensemble feature a direct interaction between the WD40 domain of RbBP5 and a short peptide sequence of MLL1 located between the WIN motif and the SET domain (Fig 4E). We refer to this RbBP5 binding sequence as the RBS region of MLL1 (Fig 1A). The RBS binding surface of RbBP5-NTD consists of a number of hydrophobic residues (V249, I283, L286, V287, and I289), and residues Q273 and P253 (Fig S5D). [...]
Pharmacological disruption of the WDR5-MLL interaction compromised the assembly of the trimeric complex (Fig 6A,B). OICR-9429 also inhibited the catalytic activity of the recombinant trimeric complex (Fig 6C). These results are consistent with our previous [...]
Peaks are labeled with resonance assignments; 254 backbone amides were assigned in the construct.
The fractions containing the trimeric complex were collected and used for SAXS data collection and cross-linking experiments.

SAXS data collection and analysis
SAXS measurements were carried out at the beamline 12-ID-B of the Advanced Photon Source, Argonne National Laboratory. The energy of the X-ray beam was 14 keV (wavelength λ = 0.8856 Å), and two setups (small- and wide-angle X-ray scattering, SAXS and WAXS) were used simultaneously in which the sample to Pilatus 2M detector distances were adjusted to achieve scattering q values of 0.006 < q < 2.6 Å⁻¹, where q = (4π/λ)sinθ, and 2θ is the scattering angle. Thirty two-dimensional images were recorded for each buffer or sample solution using a flow cell, with an accumulated exposure time of 0.8-2 seconds to reduce radiation damage and obtain good statistics. No radiation damage was observed, as confirmed by the absence of systematic signal changes in sequentially collected X-ray scattering images. The 2D images were corrected and reduced to 1D scattering profiles using the Matlab software package at the beamlines. The 1D SAXS profiles were grouped by sample and averaged. The scattering profile of the protein was calculated by subtracting the background buffer contribution from the sample-buffer profile using the program PRIMUS (ATSAS package, EMBL) (1). Concentration series measurements for each sample were carried out to remove the scattering contribution due to inter-particle interactions and to extrapolate the data to infinite dilution.

Structural characterization using SAXS data
The SAXS data indicate that the trimeric complex and its sub-complexes, as well as the individual molecules MLL1-SET and RbBP5, are flexible molecular systems in solution. Thus we take an ensemble approach for structural characterization of these systems by utilizing the SES protocol (13). The strategy on which the SES method is based consists of two main steps: 1) generate the initial ensemble of conformations in order to approximate the conformational space available for a system in solution; 2) find the optimal weight $w_k$ for each conformation k from the initial ensemble that minimizes the discrepancy
$$\chi^2 = \frac{1}{N_q}\sum_{i=1}^{N_q}\left[\frac{I_{\mathrm{exp}}(q_i) - \sum_{k=1}^{N_{\mathrm{ens}}} w_k I_k(q_i)}{\sigma(q_i)}\right]^2$$
where $I_{\mathrm{exp}}(q_i)$ is the experimental scattering intensity, $N_q$ is the number of experimental points, $\sigma(q_i)$ is the experimental error, $I_k(q_i)$ is the scattering intensity predicted for the kth conformation, and $N_{\mathrm{ens}}$ is the number of conformations in the initial ensemble.
Multi-orthogonal matching pursuit (13) is used to find possible ensembles in step 2, and the optimal ensemble size was selected using the L-curve. The optimal weights were then obtained by averaging over the top solutions with similar χ.
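A minimal sketch of step 2, assuming the discrepancy is a reduced chi-squared between the experimental profile and the weighted sum of conformer profiles. The exhaustive weight scan over two conformers is a stand-in chosen for brevity; it is not the multi-orthogonal matching pursuit used by the SES method, and all profile values are made up for illustration.

```python
def chi_squared(i_exp, sigma, profiles, weights):
    """Reduced chi-squared between data and a weighted sum of model profiles."""
    n_q = len(i_exp)
    total = 0.0
    for i in range(n_q):
        model = sum(w * profile[i] for w, profile in zip(weights, profiles))
        total += ((i_exp[i] - model) / sigma[i]) ** 2
    return total / n_q


def best_two_state_weight(i_exp, sigma, profile_a, profile_b, steps=100):
    """Scan w in [0, 1] for the mixture w*A + (1-w)*B with the lowest chi-squared."""
    best = None
    for s in range(steps + 1):
        w = s / steps
        value = chi_squared(i_exp, sigma, [profile_a, profile_b], [w, 1.0 - w])
        if best is None or value < best[1]:
            best = (w, value)
    return best


# Synthetic example: the "experimental" curve is 25% conformer A, 75% conformer B.
profile_a = [1.0, 0.8, 0.5]
profile_b = [1.0, 0.6, 0.2]
i_exp = [0.25 * a + 0.75 * b for a, b in zip(profile_a, profile_b)]
sigma = [0.01, 0.01, 0.01]
weight, chi2 = best_two_state_weight(i_exp, sigma, profile_a, profile_b)
print(weight, chi2)
```

For a realistic initial ensemble of thousands of conformers, a sparse solver is needed because the weight space cannot be scanned exhaustively.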

Generation of the structural ensembles
MLL1-SET. The high degree of flexibility observed for the MLL1-SET sample originates from the inherent flexibility of the SET domain and the 28 aa long disordered N-terminal tail. We used all-atom molecular dynamics (MD) simulations to generate the initial ensemble of conformers, with a trajectory started from the known crystal structure of the MLL1 SET domain with the cofactor product AdoHcy (14) (PDB id: 2W5Y). After minimization and equilibration, a production run was continued for 70 ns. Theoretical scattering profiles in the q range 0 < q < 0.3 for 7,000 frames taken from the trajectory were calculated using CRYSOL.
[...] trajectory of 20 ns was generated and theoretical scattering profiles in the q range 0 < q < 0.3 for 2000 frames taken from the trajectory were calculated using FoXS (8). The domain keeps its structure along the trajectory within 3.5 Å of backbone r.m.s.d. to the initial homology model. The calculated scattering curves were averaged over the entire ensemble of structures using the optimal weights for each ensemble member obtained with the SES method, and this average profile was compared with the experimental scattering data.
RbBP5. The initial ensemble for SES analysis of full-length RbBP5 was generated using RANCH (16) [...] dummy residues. The experimental inter-domain cross-link data were taken into account in the CORAL calculations by introducing six Cα-Cα distance restraints (see Table S1) with an upper bound of 30 Å. CORAL tries to build a single conformation of the complex that fits the SAXS data under the imposed constraints. By performing multiple CORAL runs we generated a number of different conformations of the complex that fit the SAXS data with χ ~ 0.9. Although the obtained conformations have different inter-domain arrangements, the relative position of the SET domain and the WD40 domain of RbBP5 is well defined and suggests the interaction of these domains in the complex. (ii) In the second step we "refined" the best CORAL models by carrying out all-atom molecular dynamics simulations. The initial conformation for the MD simulations was constructed from the CORAL model by building an all-atom reconstruction using PULCHRA (21). A 20 ns MD trajectory was generated at T = 300 K. (iii) In the third step, we used coarse-grained MD simulations to generate a pool of possible conformations of the trimeric complex that are consistent with known inter-molecular binary interactions and cross-links. The structures from step 2 were used to derive the native contact map of the quasi-rigid regions of the complex, which determines the nonbonded part of the Gō-like potential. The quasi-rigid regions include residues WDR5 [...] were saved and used as the initial ensemble for fitting to the SAXS data by the SES method. Theoretical scattering profiles for each conformation in the ensemble were calculated in the q range 0 < q < 0.23 Å⁻¹ using FoXS.

All-atom molecular dynamics simulations
A modified Generalized Born implicit solvent model (22) was used in the MD simulations in order to accelerate sampling of the conformational space for each of the systems. All simulations used an integration step of 2 fs with fixed bonds between hydrogen atoms and heavy atoms. Temperature was controlled by carrying out Langevin dynamics with the damping coefficient set to 2 ps⁻¹. The cut-off for non-bonded Lennard-Jones and electrostatic interactions was set to 18 Å. Ionic strength was set to 0.15 M. All simulations were performed using the NAMD 2.9 code (23) with the AMBER Parm99SB parameter set (24). For residues that coordinate Zn ions a Zinc AMBER Force Field (25) was used.

Coarse-grained molecular dynamics simulations
We used a coarse-grained model of the RbBP5/MLL1-WIN/WDR5 protein complex in order to enhance the sampling efficiency in the conformational space of the complex. In this model, amino acid residues in the proteins are represented as single beads located at their Cα positions and interacting via appropriate bonding, bending, torsion-angle, and non-bonding potentials. A Gō-like model of Clementi and Onuchic (26) was employed to maintain the structured, globular domains as quasi-rigid in the simulation. For flexible regions, we adopt a simple model in which adjacent amino acid beads are joined together into a polymer chain by means of virtual bond and angle interactions with a quadratic potential,
$$U_{\mathrm{bond}} = K_b (b - b_0)^2; \qquad U_{\mathrm{angle}} = K_\theta (\theta - \theta_0)^2$$
with the constants $K_b$ = 50 kcal/mol and $K_\theta$ = 1.75 kcal/mol and the equilibrium values $b_0$ = 3.8 Å and $\theta_0$ = 112° for bonds and angles, respectively. The excluded volume between nonbonded beads was treated with a purely repulsive potential, where $r$ is the inter-bead distance, $\sigma$ = 4 Å, and $\varepsilon$ = 2.0 kcal/mol.
The interaction between quasi-rigid domains is modeled with residue-specific pair interaction potentials that combine short-range interactions with long-range electrostatics as described (27, 28). The short-range interaction is given by a Lennard-Jones 12-10-6-type potential and a simple Debye-Hückel-type potential is used for the electrostatic interaction (27). In this study we used a dielectric constant of 80 and a Debye screening length of 10 Å, which corresponds to a salt concentration of about 100 mM.
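The correspondence between a 10 Å screening length and roughly 100 mM salt can be checked with the common approximation for a 1:1 electrolyte in water at 25 °C, λ_D ≈ 3.04 Å / √I with I in mol/L. The sketch below also evaluates a generic screened Coulomb (Debye-Hückel) pair energy. It illustrates the functional form only and is not the parameterisation of the cited force field; function names and the example charges are assumptions.

```python
import math

COULOMB_KCAL_ANGSTROM = 332.0637  # e^2/(4*pi*eps0) in kcal*Angstrom/mol


def debye_length_angstrom(ionic_strength_molar):
    """Approximate Debye length for a 1:1 electrolyte in water at 25 C."""
    return 3.04 / math.sqrt(ionic_strength_molar)


def debye_huckel_energy(q1, q2, r, dielectric=80.0, debye_length=10.0):
    """Screened Coulomb energy (kcal/mol); charges in units of e, r in Angstrom."""
    screening = math.exp(-r / debye_length)
    return COULOMB_KCAL_ANGSTROM * q1 * q2 * screening / (dielectric * r)


print(f"Debye length at 0.1 M: {debye_length_angstrom(0.1):.2f} A")
# Hypothetical pair of opposite unit charges 10 A apart.
print(f"pair energy: {debye_huckel_energy(1.0, -1.0, 10.0):.4f} kcal/mol")
```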
To account for the experimentally observed cross-links we introduced into the force field an additional distance restraint term given by a potential summed over all cross-links. Here $N_{XL}$ is the number of cross-links, $d_k$ is the Cα-Cα distance for the residues involved in the k-th cross-link, $d_0$ = 32 Å is the upper bound, $K_{XL}$ = 10 kcal/mol is the force constant, $\delta_{k,m}$ is the Kronecker delta, and $m$ is a random integer selected from the interval [1, $N_{XL}$]. The value of $m$ is changed at random every $\tau$ = 10 ns during the MD simulation.
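One way to picture such a restraint is sketched below: at any time only one randomly chosen cross-link is active, and it contributes energy only when its Cα-Cα distance exceeds the upper bound. The one-sided harmonic form, the per-Å² interpretation of the force constant, and all names are assumptions made for illustration; the exact functional form used in the simulations is not reproduced here.

```python
import random


def crosslink_restraint_energy(distances, active_index,
                               upper_bound=32.0, force_constant=10.0):
    """Energy of the single active cross-link (assumed one-sided harmonic form)."""
    excess = distances[active_index] - upper_bound
    if excess <= 0.0:
        return 0.0          # within the upper bound: no penalty
    return force_constant * excess ** 2


def pick_active_crosslink(n_crosslinks, rng):
    """Choose which cross-link is restrained during the next interval."""
    return rng.randrange(n_crosslinks)


rng = random.Random(0)                      # seeded for reproducibility
distances = [20.0, 35.0, 50.0]              # hypothetical Calpha-Calpha distances
for interval in range(3):                   # e.g. three successive 10 ns windows
    k = pick_active_crosslink(len(distances), rng)
    print(interval, k, crosslink_restraint_energy(distances, k))
```

Switching the active restraint at intervals lets the simulation sample conformations that satisfy different cross-links without forcing all of them at once.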