Induced fit with replica exchange improves protein complex structure prediction

Despite the progress in prediction of protein complexes over the last decade, recent blind protein complex structure prediction challenges revealed limited success rates (less than 20% models with DockQ score > 0.4) on targets that exhibit significant conformational change upon binding. To overcome limitations in capturing backbone motions, we developed a new, aggressive sampling method that incorporates temperature replica exchange Monte Carlo (T-REMC) and conformational sampling techniques within docking protocols in Rosetta. Our method, ReplicaDock 2.0, mimics induced-fit mechanism of protein binding to sample backbone motions across putative interface residues on-the-fly, thereby recapitulating binding-partner induced conformational changes. Furthermore, ReplicaDock 2.0 clocks in at 150-500 CPU hours per target (protein-size dependent); a runtime that is significantly faster than Molecular Dynamics based approaches. For a benchmark set of 88 proteins with moderate to high flexibility (unbound-to-bound iRMSD over 1.2 Å), ReplicaDock 2.0 successfully docks 61% of moderately flexible complexes and 35% of highly flexible complexes. Additionally, we demonstrate that by biasing backbone sampling particularly towards residues comprising flexible loops or hinge domains, highly flexible targets can be predicted to under 2 Å accuracy. This indicates that additional gains are possible when mobile protein segments are known.


Introduction
, and (2) induced-fit (IF) (12-14). In CS, unbound protein monomers exist in an ensemble of diverse 23 conformations, and the monomer conformations corresponding to the thermodynamically stable minima 24 are selected upon binding (13). This mechanism motivated our prior method, RosettaDock 4.0 (11), a  simulations follow the IF-approach for all atoms, however, they are bound by time and length scales (18,19). 37 Thus, expensive molecular dynamics (MD) simulations are accelerated with alternative sampling techniques 38 such as steered MD (20), replica-exchange (21), or metadynamics (22) to refine rigid-body poses of docked 39 proteins or dock small, rigid proteins (23, 24). 40 relevant degree of freedom of the system (24,26,27). To date, however, none of these methods incorporate 46 larger conformational rearrangements between protein partners upon docking. Moreover, most of the 47 modeling examples have been limited to rigid-proteins with little flexibility (RMSD BU < 1.2 Å). 48 Here, we couple the sampling prowess of replica exchange algorithms with the induced-fit binding mecha-49 nism to develop a new, aggressive, flexible backbone protein docking method. Our method, ReplicaDock 50 2.0, builds on Zhang et al.'s prior work on replica-exchange MC-based rigid-docking (ReplicaDock (21, 26)) 51 and adds backbone motions to tackle moderate and highly flexible targets. We test our method on a diverse 52 set of protein targets from the Dockground benchmark (28) that spans rigid, moderately flexible, and highly 53 flexible targets. Despite the power of REMC, it is still unfeasible to explore all backbone conformational 54 degrees of freedom, therefore we test the efficacy of choosing different flexible subsets. Finally, we examine 55 whether biasing the sampling choices can generate sub-angstrom quality predictions.

57
Protein-protein docking studies with T-REMC by Zhang et al. (21,26) demonstrated significant improvement 58 in sampling docking orientations, albeit with two important limitations. (1) No backbone degrees of freedom 59 were sampled, restricting the search to rigid-body moves and thus precluding success on medium and 60 highly-flexible docking targets. (2) The low-resolution energy function was inaccurate (21), so the improved 61 sampling often led to incorrect complex structures. In this work, we address these limitations and improve 62 protein-protein docking for previously intractable flexible targets.

63
ReplicaDock 2.0 protocol selectively samples backbone degrees of freedom while docking. 64 To address the backbone sampling limitation, we created ReplicaDock 2.0, an induced-fit (IF) inspired, 65 T-REMC plus minimization algorithm that samples backbone conformations on-the-fly while docking. 66 ReplicaDock 2.0 consists of two stages, low-resolution sampling and high-resolution refinement (Fig. 1). 67 To capture backbone degrees of freedom, the low-resolution stage performs replica-exchange and samples 68 both backbone conformations and rigid-body orientations. For each docking pose sampled, backbone moves 69 are sampled via Rosetta Backrub (29) over the interface residues. Our hypothesis is that by narrowing our 70 search to the putative interface, the protocol will capture realistic conformational changes while maintaining 71 feasible compute times. The low-resolution IF-based method samples the six rigid-body degrees of freedom 72 along with the 3 N backbone degrees of freedom (φ, ψ, ω) for N interface residues. By extending this sampling 73 procedure over three replicas with inverse temperatures, β, of 1.5 −1 kcal −1 · mol, 3 −1 kcal −1 · mol and 5 −1 74 kcal −1 · mol, a range of backbone conformations sampled. We chose the number of replicas and replica-75 temperatures such that the energy distribution at any replica overlaps sufficiently with adjacent replicas, 76 allowing efficient exchanges (Supplementary Fig. S1). After every 1, 000 MC trials of rigid-body and 77 backbone motions, an MC-swap is attempted between neighboring replicas as per the Metropolis criterion 78 (30) (Methods). Higher temperature replicas accept backbone moves that would be otherwise rejected at 79 lower temperatures. To expand the diversity of sampled structures, up to 8 independent trajectories are 80 initiated from the starting docking pose. After generation of candidate docking poses in the low-resolution 81 stage, the high-resolution stage performs an all-atom refinement of side-chain orientations without any 82 rigid-body motion, thereby resolving any side-chain clashes to form a compact, low-energy, high-resolution 83 interface. After evaluating the refined structures' all-atom scores, the lowest scoring structure is the complex 84 prediction. Starting from an initial docking pose i.e. a structural model with randomly oriented protein partners, the protocol perturbs the protein partners and slides them into contact. This creates an initial docking pose for the low-resolution stage. Here, the pose object is copied to three parallel replicas per trajectory, and each replica performs rigid body moves (rotation-translation) and backbone moves for each MC trial, followed by exchange between replicas after every 1,000 th trial. Each exchange obeys the Metropolis acceptance criterion and if accepted, the low resolution structure is output. Each trajectory completes 2.5×10 5 MC trial steps, and produces ∼5,000 candidate structures. Lastly, all produced structures undergo an all-atom refinement comprising of side-chain packing, small rigid-body motions, and energy minimization to output final docked structural models. ReplicaDock 2.0 uses a residue-transform based scorefunction. 86 As ReplicaDock 2.0 performs backbone sampling and generates docking poses during the low-resolution 87 stage, it is crucial to have a score function that favors native-like interfaces. Thus, our next step in improving 88 docking performance was to tackle the limitation of the inaccurate low-resolution centroid score function 89 as observed by Zhang et al. (21). In their recent CS-based approach, RosettaDock 4.0, Marze et al. (11) 90 created the Motif Dock Score (MDS), a pre-tabulated score based on the residue-pair transforms approach 91 (31) where energy of the interacting residues is defined by the 6-dimensional translation and rotation 92 coordinates specifying their relative backbone locations. This simple scorefunction accurately estimated 93 the well-tested all-atom score-function with a faster compute time (31). MDS is restricted to inter-chain 94 energies, which worked well for pre-generated monomer ensembles with fixed backbones in RosettaDock 95 4.0. For IF-based ReplicaDock 2.0, however, intra-chain energies must be included as the backbone moves, 96 especially clashes. Therefore, we incorporated knowledge-based backbone torsion statistics terms and Van 97 der Waals interaction terms (32) to create the Motif Updated Dock Score (MUDS). We optimized the 98 relative weights of the MUDS energy terms based on the number of CAPRI acceptable quality structures in 99 the top-scoring 10% of sampled structures (enrichment) (Supplementary Fig. S6). 100 With this updated scoring and sampling schemes, we tested the performance of our ReplicaDock 2.0 101 protocol for global and local docking tasks.

102
Rigid global docking with ReplicaDock2.0 can identify local binding patches. 103 Docking challenges can be categorized as either global (without any prior knowledge of binding interface) 104 or local (using knowledge of putative binding patches). Conventionally, predictors search with a rigid 105 protein backbone to identify putative binding interfaces (e.g., with ClusPro (33) or ZDOCK (34)), and 106 then each binding interface is refined, often with backbone conformational change. This strategy breaks 107 down docking hierarchically. Global docking has been performed with a T-REMC approach (ReplicaDock 108 (21)), but low-scoring structures were often far from the experimental structure owing to the inaccurate 109 centroid score function. With our updated score function MUDS, we hypothesized that its discriminative 110 power would enable a rigid-body global docking simulation to better identify native-like interfaces. To 111 test this hypothesis, we ran ReplicaDock 2.0 without backbone conformational sampling (only rigid-body 112 rotational and translational moves) on 10 protein targets starting from random orientations of the protein 113 partners. To illustrate the rigid global docking performance, Fig. 2A plots the low-resolution score (MUDS) 114 versus the RMSD from the native structure for all generated candidate structures for two representative, 115 medium-flexibility protein targets (2CFH, trafficking protein particle complex subunits, 1.55 Å RMSD BU 116 (35) and 1XQS, HspB1 core domain complexed with Hsp70 ATPase domain, 1.77 Å RMSD BU ) (36). As a 117 reference, we relaxed the experimental bound structure with relatively small rigid-body moves (rotations 118 and translations of 0.5 • and 0.1 Å, respectively) to generate near-native structures (blue in Fig. 2A). 119 ReplicaDock 2.0 generates low-scoring near-native orientations (under 5 Å RMSD) for 2CFH, however, 120 for 1XQS, sampling is limited to RMSD values above 6 Å, with the lowest scoring structures about 20 Å 121 away from the experimental structure. On 10 protein targets (Supplementary Fig. S2), ReplicaDock 2.0 122 produced models within 5 Å of the native-bound structure for 8 of 10 targets. For comparison, ClusPro (37) 123 successfully predicts 6 of 10 targets. Thus, ReplicaDock 2.0 can perform exhaustive global sampling on the 124 protein energy landscape with better near-native discrimination. One limitation is that global docking with 125 ReplicaDock 2.0 requires 600-800 CPU hours, compared with 35 CPU hours (as reported by Varela et al.  Flexible local docking with ReplicaDock2.0 samples deeper energy funnels. 132 When given a putative, broadly-defined binding patch, local docking approaches strive to obtain the 133 biological complex structure by capturing conformational changes in protein partners. ReplicaDock 2.0 134 explores conformational changes by restrictively sampling backbone moves at putative interfaces. To 135 evaluate the extent of flexibility that can be incorporated while docking for optimum performance, we tested 136 different selections of residues for backbone sampling (Supplementary Fig. S4). First, we performed 137 backbone moves conservatively over only the set of residues with atoms lying within 5.5 Å of the binding 138 partner (Set 1: 5.5 Å interface patch). Then, we expanded the selection to residues with atoms lying within 139 8 Å of the binding partner (Set 2: 8 Å interface patch). As loops are the most flexible secondary structural 140 element in a protein structure, we incorporated residues belonging to all the loop regions from the unbound 141 protein monomers, and added them to prior residue sets to obtain Set 3 (5.5 Å interface patch + loop 142 residues) and Set 4 (8 Å interface patch + loops residues) respectively. For local docking on 12 test targets, 143 we generated ∼5,000 structures and sub-sampled sets of 1,000 structures to calculate the expected number of 144 near-native structures (defined as CAPRI acceptable quality or better) in the 5 top-scoring structures ( N5 ). 145 Higher N5 indicates that in blind predictions, top-scoring structures are more likely to be correct. of the four flexibility scopes. Extending the backbone moves to 8 Å interface patch increased N5 across all 148 targets, and offered enough flexibility to capture the binding-induced conformational changes. Incorporating 149 loops reduced performance for medium-flexible and rigid targets (average performance for medium-flexible 150 targets dropped from N5 =5 in Case 2 to N5 =2.5 in Case 4), possibly due to over-sampling of backbone 151 moves in relatively rigid regions of the protein structure. Adding flexibility to all loops, the scorefunction 152 misdirects sampling in non-native, spurious minimas, resulting in alternate binding modes with large 153 buried surface area or distorted protein tertiary structures (as shown by the false positive minimas in 154 Supplementary Fig. S5). 155 With the interface patch chosen as the mobile residue-set, we next evaluated the local docking performance 156 of ReplicaDock 2.0 against RosettaDock 4.0. This also served as a head-to-head comparison between two 157 kinetic mechanisms of binding i.e. IF versus CS. As an example, Fig. 2C shows the generated candidate 158 structures for two representative protein targets 2CFH and 1XQS with the two docking methods. The 159 low-resolution score (MUDS) versus C α RMSD plots for the targets 2CFH and 1XQS show that ReplicaDock 160 2.0 samples structures that score lower than RosettaDock 4.0. Further, in contrast with RosettaDock 4.0 161 funnels, ReplicaDock 2.0 produces deeper funnels, suggesting that as induced-fit enables the protocol to 162 capture better backbone conformations, replica exchange improves the docking orientations of the encounter 163 complexes generated, thereby allowing us to reach lower, native-like energies (bound-derived funnel in blue). 164 Induced-fit recapitulates native contacts but fails to push backbone sampling towards bound confor-165 mations. 166 With candidate structures producing deeper energy minima in low-resolution, we then evaluated the 167 performance of ReplicaDock 2.0 in capturing native-like protein complexes after all-atom refinement. We 168 refined the candidate structures generated in low-resolution stage with the Rosetta all-atom ref2015 energy 169 function (32). Fig. 3A and D show the results for the same two protein targets (2CFH and 1XQS) by 170 comparing the interface energies (equivalent to thermodynamic binding energies) versus the interface-RMSD. 171 For both protein targets, ReplicaDock 2.0 retains the better-scoring structures from the low-resolution 172 stage (Fig. 2C). Relative to RosettaDock 4.0, ReplicaDock 2.0 structures have better all-atom scores and 173 an improved CAPRI quality as evident by the greater number of medium-quality decoys. Despite this 174 improvement, there remains a gap in interface-RMSD between the lowest scoring docked structures and the 175 refined native structures (blue in Fig. 3A and D). we calculated the monomer component backbone RMSDs from the bound backbone conformations (Fig. 177  3B and E). Although ReplicaDock 2.0 generates a much more diverse set of backbone conformations than 178 RosettaDock 4.0, the best RMSDs attained by both the methods are comparable. Note that RosettaDock 179 4.0 uses pre-generated ensembles resulting in all candidate docking structures being limited in the backbone 180 conformation space (all RMSDs within a rectangular region), whereas ReplicaDock 2.0 generates more 181 diversity. These docking metrics for the entire benchmark set are illustrated in Supplementary Fig. 182 S13-S15. 183 Further, we calculated the native-like interactions made by the interface residues with the fraction of 184 native residue-residue contacts, f nat (Fig. 3C and F). With the induced-fit strategy, ReplicaDock 2.0 185 increases the f nat over RosettaDock 4.0 by ∼0.2. By sampling protein conformations in the vicinity of its 186 binding partner, ReplicaDock2.0 is able to orient more interface residues to a native-like state, thereby 187 recapitulating a larger fraction of bound contacts (Supplementary Fig. S13-S15, on left).

195
To compare the performance, we measured N5 after the low-resolution and high-resolution stage for 196 the full benchmark set of 88 targets. We define a structure as near-native if the C α RMSD ≤ 5 Å for the 197 low-resolution stage, and if the CAPRI rank is acceptable or better for the high resolution stage.  To better illustrate the trend, we plotted the probability density of N5 across all targets (Fig. 4B). 205 ReplicaDock 2.0 shifts the curve towards higher N5 , particularly for moderately-flexible targets (Fig. 4C). 206 For 37 out of 44 moderately-flexible targets, ReplicaDock 2.0 performance is either equivalent or better 207 than RosettaDock 4.0. However, for highly flexible targets, the improvement is modest; docking proteins 208 with higher conformational changes (RMSD BU > 2.2 Å) is still a challenge. On an absolute basis with N5 209 ≥ 3 as a success criteria, ReplicaDock 2.0 correctly predicts near-native docked structures in 80% of rigid, 210 61% of medium-flexible and 35% of highly-flexible docking targets.

Sampling of known mobile residues captures near-bound conformations of highly flexible protein tar-212
gets. 213 While ReplicaDock 2.0 generates better quality structures, it fails to reach sub-angstrom interface accuracy 214 for many flexible targets, as shown in Fig. 3A and C for 2CFH and 1XQS. Upon inspection of the bound 215 and unbound structures of medium and difficult targets, we observed that the backbone conformational 216 changes were diverse, ranging from motion of loops and changes in the secondary structure to hinge-like 217 motion between intra-protein domains. The residue sets for backbone sampling in ReplicaDock 2.0 were not 218 broad enough to capture these conformational changes. To push towards these larger backbone motions, we 219 wondered whether ReplicaDock 2.0 might attain native-like backbones if it used the information of the 220 residue set that results in the conformational change. To test this claim, we identified the mobile residues 221 on the unbound protein partners of Ras:RALGDS domain complex (1LFD, 1.79 Å RMSD BU (44)) that 222 showed more than 0.5 Å RMSD when superimposed over the bound structure. Next, instead of automating 223 the selection of interface residues on-the-fly in the baseline protocol, we fed the ReplicaDock 2.0 protocol 224 the identity of these mobile regions. In this version, we restricted the replica-exchange backbone sampling 225 strategy towards pre-selected mobile residues, thereby implementing a directed induced-fit mechanism for 226 protein docking. Comparing with the Ras' unbound structure (grey) superimposed over the bound (green), the docked structure loop (blue) has moved closer to the bound state (green) for the two cases respectively. With directed sampling, it is able to capture the backbone structure to sub-angstrom accuracy.
To investigate whether directed induced-fit improves the docking performance, we evaluated the interface 228 scores, native-like contacts and near-bound backbone conformations. Fig. 5 compares the directed IF 229 approach (bottom) with the vanilla version (top), which performs unbiased backbone sampling over putative 230 interface residues. The results from Fig. 5A and E suggest that with directed IF, the protocol is now able 231 to generate sub-angstrom structures with high-quality CAPRI ranks. In addition, Fig. 5B and F show 232 that it also increases the fraction of native-like contacts at the interface from an f nat score of roughly 0.6 to 233 Harmalkar et al.
December 8, 2021 | 11 0.8. The most significant difference is illustrated by the backbone RMSDs of the ligand and receptor chains 234 relative to the bound structure. With directed sampling, the backbone RMSD values do not go higher than 235 ∼0.5 Å away from the bound, starting from the unbound (Fig. 5G), whereas the unbiased case samples 236 extensive conformation space away from both bound and unbound (Fig. 5C). Finally, to give a structural 237 perspective, Fig. 5D and H   with lower precision of predicted LDDT from AlphaFold (39,40) correspond to the flexible regions. We 274 anticipate that by improving our ability to predict intrinsic flexibility of residues, T-REMC docking with 275 ReplicaDock 2.0 has potential to make even larger strides in flexible-backbone protein docking.

276
Binding-induced conformational changes, and backbone flexibility at large, has long confounded protein-277 docking algorithms (47). By improving our understanding of protein interactions and the molecular 278 recognition process, we could determine structures that are yet to be experimentally validated e.g., SERCA-279 PLB transmembrane complex critical for cardiac function (48), and explore potential association pathways, 280 such as the translocation of protein antibiotics (e.g., colicins) through cellular nutrient transporters (2). 281 Insights into protein docking and binding interfaces have enabled successful computational designs such as 282 symmetrical oligomers for self-assembling nanocages (49, 50) and orthogonal designs of cytokine-receptor 283 complexes (51). Capturing larger conformational changes will eventually impact our ability to design 284 proteins with complex functions. Looking ahead, we anticipate that capturing the dynamic behaviour of 285 proteins in docking will guide molecular engineering and de novo interface modelling to develop functional 286 protein interfaces for biology, medicine and engineering.  for the residues in between, thereby providing a restrictive IF-like motion. 325 We scale temperature across three replicas with inverse temperatures set to 1.5 -1 kcal −1 .mol, 3 −1 326 kcal −1 .mol and 5 −1 kcal −1 .mol, respectively. Replica exchange swaps are attempted every 1,000 MC steps 327 and candidate structures are stored after every successful swap. An exchange attempt is successful if the 328 Metropolis criterion is obeyed as stated below: Here, i and j are the replica-levels across which the swap is performed, E is the MUDS energy, k B is the 331 Boltzmann's constant, T is temperature and P i is the probability set in Metropolis criterion that needs to 332 be obeyed for acceptance (generally set to 0.5). Thus, ReplicaDock 2.0 simulations scale the temperatures 333 to modulate the acceptance of backbone and docking moves, so motions that are penalized heavily at 334 lower replicas can be accepted at higher replicas, thereby allowing more diversity in capturing backbone 335 conformations as well as docking orientations. The generated candidate structures are further passed to the 336 high-resolution stage for all-atom refinement. For each local docking simulation, we initiate 8 trajectories, 337 each trajectory spanning over 3 replicas, run for 2.5 x 10 5 MC steps generating ∼5,000 candidate structures. 338 For global docking, we run 10 6 -10 8 MC steps, generating roughly 24,000 candidate structures. 339 Benchmarking, evaluation and success metrics. 340 Four interface residue selection tests were performed on 12 unbound targets from the DockGround Benchmark 341 Set (28) to optimize the flexibility scope over interface residues, number of trajectories and MC trials. 342 ReplicaDock 2.0 docking runs were performed on the entire Dockground benchmark set of 44 medium and 343 34 difficult targets. We added 10 rigid targets for a final set with 88 targets. As defined in CAPRI (52), 344 we calculated the interface RMSD (I-rms), ligand RMSD (L-rms), all-atom RMSD(RMSD), C α RMSD 345 and fraction of native-like contacts (f nat ) against the bound complex. Further, the results of the docking 346 simulations were evaluated with the expected N5 metric. N5 denotes the number of near-native decoys 347 in the five top-scoring structures. A structure is deemed as near-native if the C α RMSD ≤ 5 Å for the 348 low-resolution stage, and if CAPRI rank ≥ 1 for the high resolution stage (6). First, we bootstrapped 349 1,000 structures, i.e. randomly selected 1,000 structures with replacement from the generated candidate 350 structures. Then, by evaluating whether the five top-scoring structures were near-native, we determined 351 the N5 value. This procedure was repeated 1,000 times for robustness to obtain the expected value ( N5 ). 352 Successful docking for a target is defined as N5 ≥ 3.