Abstract
Although we understand many aspects of how small proteins (fewer than about one hundred residues) fold, it is a major challenge to understand how large proteins self-assemble. To partially overcome this challenge, we performed simulations using the Self-Organized Polymer model with Side Chains (SOP-SC) in guanidinium chloride (GdmCl), using the Molecular Transfer Model (MTM), to describe the folding of the 110-residue PDZ3 domain. The simulations reproduce the folding thermodynamics accurately, including the melting temperature (Tm) and the stability of the folded state with respect to the unfolded state. We show that the calculated dependence of ln kobs (kobs is the relaxation rate) on the GdmCl concentration has the characteristic Chevron shape. The slopes of the Chevron plots are in good agreement with experiments. We show that PDZ3 folds by four major pathways populating two metastable intermediates, in accord with the kinetic partitioning mechanism. One of the intermediates, populated after polypeptide chain collapse, is structurally similar to an equilibrium intermediate. Surprisingly, the connectivities between the intermediates, and hence the fluxes through the pathways, depend on the concentration of GdmCl. The results are used to predict possible outcomes for unfolding of the PDZ domain subject to mechanical forces. Our study demonstrates that, irrespective of the size or topology, simulations based on MTM and SOP-SC offer a framework for describing the folding of proteins, mimicking precisely the conditions used in experiments.
Introduction
The most common way of initiating folding (unfolding) of proteins in ensemble and single molecule experiments is by decreasing (increasing) the concentration of denaturants. Thus, direct comparison with experiments is only possible if simulations are done using models that take the effects of denaturants into account.1 Although atomic detailed simulations hold the promise of quantitative description of denaturant-induced folding or unfolding,2-6 currently the only available method for obtaining the thermodynamics and folding kinetics of proteins, even for proteins as large as GFP,7 is the Molecular Transfer Model (MTM) in combination with coarse-grained SOP-SC representation of polypeptide chain.8,9 The theoretical basis for the success of the MTM has been explained elsewhere.10 Applications of MTM to probe folding of a variety of proteins have yielded quantitative agreement with experiments 7,8,11-13 which attests to the efficacy of the MTM.
One of the early applications of MTM was the demonstration that the Chevron plot of the 56-residue srcSH3 domain could be reproduced nearly quantitatively.9 However, the extension of these calculations to proteins with more than a hundred residues has been difficult even using the simplified coarse-grained SOP-SC models. One of the goals of this study is to overcome this challenge. The second problem that we address is to establish the folding mechanism as a function of denaturant concentration for a large single domain protein. If the folding mechanism involves parallel pathways, as theoretical and computational studies have firmly established,14-17 then are the fluxes through the pathways modulated by changing the external conditions such as denaturant concentration or mechanical forces? Understanding the origin of parallel folding and unfolding pathways, and how they are altered by environmental changes, is important in establishing the generality of the protein folding mechanisms. Here, we investigate the denaturant-dependent folding and unfolding of PDZ3, a protein with 110 residues.
PDZ domains, found in many cell junction-associated proteins, are a large family of globular proteins that mediate protein-protein interactions and play an important role in molecular recognition.18-21 The folding of the PDZ3 domain, a member of this family, has been studied both by experiments and computations.22-26 The constructs used in these experiments differ. For example, Bai and coworkers22 used the construct with two additional β-strands at the C terminal, which are not found in the native PDZ3 domain. Two recent experiments27,28 have shown that many of the folding properties, such as the existence of an intermediate or the nature of the transition states, are not greatly affected by the presence or absence of non-native structural elements. In GdmCl-induced equilibrium unfolding experiments of PDZ3 or its variants, the folding transition appears to be highly cooperative, seemingly displaying a simple two-state behavior. In the Chevron plot of the PDZ3 domain in GdmCl solution at pH = 6.3, both the folding and unfolding arms are linear functions of GdmCl concentration [C], indicating no detectable intermediate states at the ensemble level. However, native-state hydrogen exchange experiments reveal hidden intermediate states under native conditions.22 Interestingly, the addition of potassium formate at pH = 2.85 induces a rollover in the unfolding arm in the Chevron plot, suggestive of an intermediate.24,25 This finding is reminiscent of the salt-induced detour found in the folding of the protein S6.29 In the presence of potassium phosphate at pH = 7.5, there are two thermal unfolding transitions in DSC experiments, further demonstrating the existence of intermediate states, at least in the presence of salt.26 Based on these experiments, we surmise that the PDZ3 domain folding must occur by multiple pathways, even if they are hard to detect in generic ensemble experiments.
If there are multiple pathways, is it possible that the fluxes through these pathways could change by altering the concentration of denaturant in this large single domain as reported for a two-domain protein30? Here, we answer this question in the affirmative using MTM simulations in the folding of PDZ3 domain as a function of GdmCl concentration, after establishing that the SOP-SC simulations capture the key findings in ensemble experiments.
We combine simulations of the coarse-grained off-lattice SOP-SC model31-33 and the molecular transfer model8-10 to decipher the folding mechanism of the PDZ3 domain. The calculated fractions of molecules in the native basin of attraction (NBA), fNBA, and the unfolded basin of attraction (UBA), fUBA, as a function of the denaturant concentration [C] are in excellent agreement with experiments. In addition, we find that a small fraction, fIBA, of an intermediate, not resolved in ensemble experiments, is populated in equilibrium. The Tanford β parameters for the two transition state ensembles obtained in simulations are in quantitative agreement with those inferred from experiments.24,25 The Chevron plot, calculated for the first time for a protein with over 100 amino acids, allows us to extract the locations of the transition state ensembles (TSEs) in terms of the Tanford β parameters. The calculated free energy profiles suggest that at low (high) [GdmCl] a less (more) structured TSE is rate limiting. Folding trajectories both in aqueous and in denaturant solutions demonstrate directly the existence of the thermodynamic intermediate states as well as kinetic intermediate states. Our simulations vividly illustrate four parallel folding pathways in molecular detail. We show that the fluxes between the assembly pathways can be modulated by varying the denaturant concentration. Many of our predictions are amenable to experimental test.
Methods
SOP-SC model
Our simulations were carried out using the SOP-SC (Self-Organized Polymer-Side Chain) model for the protein.9,10 Each residue is represented by two interaction centers, with one centered at the Cα position, and the other located at the center of mass of the side chain. The energy function of a conformation in the SOP-SC representation of the polypeptide chain is,
The detailed functional form and the values of the parameters are described elsewhere.9
Molecular Transfer Model
In the MTM the effective free energy function for a protein in aqueous denaturant solution is

$$G_P(\{r_i\},[C]) = E_P(\{r_i\}) + \Delta G(\{r_i\},[C]), \qquad \Delta G(\{r_i\},[C]) = \sum_i \delta g(i,[C])\,\frac{\alpha_i}{\alpha_{\mathrm{Gly}-i-\mathrm{Gly}}} \tag{2}$$

where ΔG({ri}, [C]) is the free energy of transferring a given protein conformation from water to aqueous denaturant solution, the sum is over all the interaction centers (i), δg(i, [C]) is the transfer free energy of interaction center i, αi is the solvent accessible surface area (SASA) of interaction center i, and αGly-i-Gly is the SASA of the ith interaction center in the tripeptide Gly – i – Gly.
We used the procedure described in detail previously9,10 to calculate the thermodynamic properties of proteins in the presence of denaturants.
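The transfer free energy sum described above can be sketched in a few lines. This is only an illustration, assuming the standard MTM form ΔG = Σ δg(i,[C]) αi/αGly-i-Gly; the function names and the numerical inputs are hypothetical and are not taken from the SOP-SC parameter set.

```python
from typing import Sequence


def transfer_free_energy(
    delta_g: Sequence[float],
    sasa: Sequence[float],
    sasa_gly_i_gly: Sequence[float],
) -> float:
    """Sum over interaction centers of delta_g(i,[C]) * alpha_i / alpha_Gly-i-Gly.

    delta_g        : transfer free energy of each interaction center (kcal/mol)
    sasa           : SASA of each center in the given conformation
    sasa_gly_i_gly : SASA of the same center in the tripeptide Gly-i-Gly
    """
    if not (len(delta_g) == len(sasa) == len(sasa_gly_i_gly)):
        raise ValueError("all inputs must have one entry per interaction center")
    return sum(dg * a / a_ref for dg, a, a_ref in zip(delta_g, sasa, sasa_gly_i_gly))


def effective_free_energy(
    e_p: float,
    delta_g: Sequence[float],
    sasa: Sequence[float],
    sasa_gly_i_gly: Sequence[float],
) -> float:
    """G_P = E_P + Delta G for one conformation at one denaturant concentration."""
    return e_p + transfer_free_energy(delta_g, sasa, sasa_gly_i_gly)


# Hypothetical two-bead example: both beads are half exposed.
example_dg = transfer_free_energy([-0.1, -0.2], [50.0, 30.0], [100.0, 60.0])
print(f"Delta G = {example_dg:.2f} kcal/mol")
```

Because ΔG depends on the conformation only through the SASA ratios, the same routine can be evaluated on conformations sampled at [C] = 0 to reweight them to a nonzero [C].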
Langevin Dynamics
We assume that the dynamics of the protein is governed by the Langevin equation,

$$m\ddot{r}_i = -\zeta \dot{r}_i + F_c + \Gamma \tag{3}$$

where m is the mass of a bead, ζ is the friction coefficient, Fc = –∂EP({ri})/∂ri is the conformational force calculated using Eq. (1), and Γ is the random force with a white noise spectrum. To enhance sampling, we used Replica-Exchange Molecular Dynamics (REMD)34-36 to carry out thermodynamic sampling at low friction coefficient ζL = 0.05m/τL.37 Here, τL is the unit of time (see below) for use in the computation of thermodynamic quantities, and m is the average mass of the beads. The precise values of these two quantities do not play a role in the determination of the equilibrium properties of interest. In the underdamped limit, we employ the Verlet leap-frog algorithm to integrate the equations of motion.
Brownian Dynamics
To obtain a realistic description of the kinetics of folding or unfolding, we set ζH = 50m/τL, which approximately corresponds to the value of the friction coefficient in water.38 At the high ζ value, where the inertial forces are negligible, we use the Brownian dynamics algorithm39 to integrate the equations of motion using

$$r_i(t+h) = r_i(t) + \frac{h}{\zeta_H}\left(F_c(t) + \Gamma(t)\right) \tag{4}$$
It is important to note that, for obtaining kinetic properties, the systematic force Fc in Eq. (4) is

$$F_c = -\frac{\partial G_P(\{r_i\},[C])}{\partial r_i} \tag{5}$$

where GP({ri}, [C]) is given in Eq. (2).
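A single overdamped update of the kind used in Brownian dynamics can be sketched as follows. This is a minimal illustration, assuming the standard first-order scheme r(t+h) = r(t) + (h/ζ)(Fc + Γ) with a Gaussian random force of variance 2kBTζ/h per component; the function name `brownian_step` and the toy harmonic force are hypothetical and stand in for the gradient of GP.

```python
import numpy as np


def brownian_step(r, force, zeta, kT, h, rng):
    """Advance positions r (shape: n_beads x 3) by one time step h.

    force : callable returning the systematic force for positions r
    zeta  : friction coefficient
    kT    : thermal energy in the same units as force * length
    rng   : numpy random Generator supplying the random force
    """
    # Assumed white-noise amplitude: <Gamma^2> = 2 kT zeta / h per component,
    # so the random displacement has variance 2 (kT / zeta) h.
    gamma = rng.normal(0.0, np.sqrt(2.0 * kT * zeta / h), size=r.shape)
    return r + (h / zeta) * (force(r) + gamma)


# Toy usage: one bead in a harmonic well (hypothetical force, reduced units).
rng = np.random.default_rng(0)
positions = np.array([[1.0, 0.0, 0.0]])
for _ in range(100):
    positions = brownian_step(positions, lambda r: -r, zeta=50.0, kT=0.6, h=0.1, rng=rng)
print(positions)
```

In an actual MTM run the `force` callable would return the negative gradient of the effective free energy at the chosen [C], which is what makes the kinetics denaturant dependent.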
Time scales
In the high ζ limit the unit of time is $\tau_H = \zeta_H a^2 / (k_B T_s)$. Following Veitshans,38 we chose εl = 1 kcal/mol, average mass m = 1.8 × 10−22 g, and a = 4 Å, which makes $\tau_L = (m a^2/\epsilon_l)^{1/2}$ = 2 ps. For ζH = 50 m/τL, we obtain τH = 164 ps. These values are used to estimate the folding times from our Brownian dynamics simulations.
In Langevin dynamics simulations, the integration time step, h = 0.05τL, whereas in the Brownian dynamics simulations, h = 0.1τH (Eq. (4)).
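The time units quoted above can be checked numerically. The sketch below assumes the usual definitions τL = (m a²/εl)^(1/2) and τH = ζH a²/(kB T), and it assumes T = 306 K (the simulation temperature used later in the text) when evaluating τH; the function names are illustrative only.

```python
import math

AVOGADRO = 6.02214076e23      # 1/mol
K_B = 1.380649e-23            # J/K
KCAL_TO_J = 4184.0            # J per kcal


def tau_low(mass_g: float, a_angstrom: float, eps_kcal_per_mol: float) -> float:
    """tau_L = sqrt(m a^2 / eps_l), returned in seconds."""
    m = mass_g * 1e-3                                # kg
    a = a_angstrom * 1e-10                           # m
    eps = eps_kcal_per_mol * KCAL_TO_J / AVOGADRO    # J per particle
    return math.sqrt(m * a * a / eps)


def tau_high(zeta_reduced: float, tau_l: float, eps_kcal_per_mol: float,
             temperature: float) -> float:
    """tau_H = zeta_H a^2 / (k_B T), with zeta_H = zeta_reduced * m / tau_L.

    Using m a^2 = eps_l * tau_L^2 this becomes
    tau_H = zeta_reduced * (eps_l / k_B T) * tau_L.
    """
    eps = eps_kcal_per_mol * KCAL_TO_J / AVOGADRO
    return zeta_reduced * eps / (K_B * temperature) * tau_l


tau_l = tau_low(1.8e-22, 4.0, 1.0)
# The rounded value tau_L = 2 ps is used here, as in the text.
tau_h = tau_high(50.0, 2.0e-12, 1.0, 306.0)
print(f"tau_L = {tau_l * 1e12:.2f} ps, tau_H = {tau_h * 1e12:.1f} ps")
```

With these inputs τL comes out close to 2 ps and τH close to 164 ps, consistent with the values in the text.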
Results
The structure of the N = 110 residue PDZ3 domain from PSD-95 is shown in Fig-1A (PDB ID 1BFE). In contrast to the typical structure of a PDZ domain, it has one additional helix (α3) and a two-stranded β sheet formed between two short strands, β7 and β8, at the C terminal. They are colored in red in Fig-1A.
Melting temperatures
The melting temperature, identified with the peak in the heat capacity Cv(T) at [C] = 0 (black line in Fig-1B) is Tm = 323K, which is in reasonable agreement with the experimentally measured Tm = 344K.26 Similarly, by associating the melting temperatures with the peaks in the heat capacity at different values of [C] (Fig-1 B) we determined the dependence of Tm[C] on [C]. It is clear from Fig-1 C that Tm[C] is linearly dependent on [C] (see figure caption for the parameters of the fit).
Boundaries between the distinct thermodynamic states
To define the Native Basin of Attraction (NBA) and the basins corresponding to the unfolded (UBA) and intermediate states (IBA), we obtained the free energy profile G(χ) as a function of the structural overlap function, χ, which serves as an order parameter. The structural overlap function is χ = 1 − Nk/NT, where

$$N_k = \sum_{(i,j)} \Theta\left(\delta - \left|r_{ij} - r_{ij}^{0}\right|\right) \tag{6}$$

In Eq. (6), Θ(x) is the Heaviside function, rij is the distance between interaction centers i and j, and r0ij is the corresponding distance in the native state. If |rij − r0ij| ≤ δ, there is a contact. Nk is the number of contacts in the kth conformation and NT is the total number in the folded state. The profile G(χ) at Tm[0] (Fig-2A) shows that the conformations can be classified into three groups separated by the black vertical lines, which mark the two cutoff values of χ separating the NBA, IBA, and UBA. If χ is below the lower cutoff, the corresponding conformation belongs to the NBA; if χ is above the upper cutoff, the conformation belongs to the UBA; all the other conformations are grouped into the IBA.
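The classification of conformations by the structural overlap function can be sketched as below. This is an illustration only: it assumes χ = 1 − Nk/NT with a contact counted when a pair distance is within a tolerance δ of its native value, and the tolerance and the two χ cutoffs passed in the example are hypothetical placeholders, not the values used in the simulations.

```python
import numpy as np


def structural_overlap(dist, dist_native, delta):
    """chi = 1 - N_k / N_T for one conformation.

    dist, dist_native : pair distances for the native contact pairs, in the
                        conformation and in the folded state (same ordering)
    delta             : tolerance for counting a pair as a native contact
    """
    dist = np.asarray(dist, dtype=float)
    dist_native = np.asarray(dist_native, dtype=float)
    n_k = np.count_nonzero(np.abs(dist - dist_native) <= delta)
    return 1.0 - n_k / dist_native.size


def assign_basin(chi, chi_nba, chi_uba):
    """Label a conformation using two cutoffs read off the G(chi) profile."""
    if chi <= chi_nba:
        return "NBA"
    if chi >= chi_uba:
        return "UBA"
    return "IBA"


# Hypothetical example with four native pairs; two are within tolerance.
chi = structural_overlap([5.1, 6.0, 9.5, 12.0], [5.0, 6.0, 7.0, 8.0], delta=0.5)
print(chi, assign_basin(chi, chi_nba=0.3, chi_uba=0.7))
```

Counting the labels over an equilibrium ensemble at a given [C] gives the basin fractions fNBA, fIBA and fUBA.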
GdmCl dependence of thermodynamic stability and m-value
In order to compare with experiments, we simulated the effects of GdmCl using the Molecular Transfer Model (MTM).8 Following our previous studies,8-10 we choose a simulation temperature, Ts, at which the calculated free energy of stability of the native state (N) with respect to the unfolded state (U), ΔGNU(Ts) (= GN(Ts) – GU(Ts)), and the measured free energy at TE (= 298K), ΔGNU(TE), coincide. For PDZ3, ΔGNU(TE = 298K) = –7.4kcal/mol at [C] = 0,22 which results in Ts = 306K, which is close to TE. It is worth emphasizing that besides the choice of Ts no other parameter is adjusted to fit any experimental data.
With Ts = 306K fixed, we calculated the dependence of the fraction of molecules in the NBA, fNBA([C],Ts), in the UBA, fUBA([C],Ts), and in the IBA, fIBA([C],Ts), on [C] (Fig-2B). The midpoint concentration, Cm, obtained using fUBA([Cm], Ts) = 0.5 is [C]=3.05M, which is very close to 2.90M, measured in experiments.22 For comparison, the experimentally monitored maximum wavelength of the fluorescence at different concentrations of GdmCl is also shown (blue, left scale in Fig.2B). Although it is not a direct measure of fUBA, it is correlated to fUBA, which in turn is in good agreement with the result based on simulations. The ability to reproduce reasonably accurately experimental measurements further establishes the efficacy of the MTM and SOP-SC simulations in capturing the folding thermodynamics of single domain proteins in general, and PDZ3 in particular.
The small value of fIBA([C],Ts), compared to fNBA([C],Ts) and fUBA([C],Ts), explains why the intermediate state is hard to detect in the equilibrium denaturation experiments. 22 Our finding that fIBA([C],Ts) is small is consistent with the observed protein-concentration dependent thermal unfolding in DSC experiments, where at low PDZ3 concentration only one transition is observed as the associated intermediate is not substantially populated. 26
The native state stability with respect to U, ΔGNU([C]) (= GN([C]) – GU([C])), is calculated using $\Delta G_{NU}([C]) = -k_B T_s \ln\left[f_{NBA}([C],T_s)/f_{UBA}([C],T_s)\right]$. The linear fit, ΔGNU([C]) = ΔGNU(0) + m[C], yields ΔGNU(0) = –6.18kcal/mol and m = 2.03kcal/mol · M (Fig. 2C), which is in reasonable agreement with the experimental estimate m = 2.50kcal/mol · M.22 In light of a recent experiment showing that the truncation of the α3 helix only modestly destabilizes the native state,27 we surmise that the addition of the two extra β strands at the C terminal probably does not significantly affect the stability of PDZ3, and thus the value of ΔGNU(0).
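The relation between the basin populations, the stability, and the linear fit can be illustrated with a short sketch. It assumes the two-state expression ΔGNU = −kB Ts ln(fNBA/fUBA) and uses the fitted values quoted above; the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_nu(f_nba: float, f_uba: float, temperature: float) -> float:
    """Stability of N relative to U from basin populations, in kcal/mol."""
    return -R_KCAL * temperature * math.log(f_nba / f_uba)


def linear_stability(c: float, dg0: float, m: float) -> float:
    """Linear extrapolation: Delta G_NU([C]) = Delta G_NU(0) + m [C]."""
    return dg0 + m * c


def midpoint_from_fit(dg0: float, m: float) -> float:
    """Concentration at which the linear fit crosses Delta G_NU = 0."""
    return -dg0 / m


c_m = midpoint_from_fit(-6.18, 2.03)
print(f"C_m from the linear fit = {c_m:.2f} M")
print(f"Delta G_NU at 1 M = {linear_stability(1.0, -6.18, 2.03):.2f} kcal/mol")
```

The midpoint implied by the fit, about 3.04 M, is consistent with Cm = 3.05 M obtained directly from fUBA = 0.5.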
Free energy profiles as function of the order parameter χ
To illustrate how GdmCl changes the free energy landscape, we plotted the free energy profiles as functions of χ at different [C] at Tm[C] in Fig-3A and at a fixed temperature, Ts = 306K, in Fig-3B. We use χ in Eq. (6), the microscopic order parameter of the protein, to distinguish between the native, the unfolded and high-energy intermediate states.40 Fig-3B shows that, at low [GdmCl], the first transition state ensemble, TSE1, is rate limiting for folding. However, at high [GdmCl] the second transition state ensemble, TSE2, is rate limiting. These findings agree qualitatively with the energy diagram for the folding reactions proposed elsewhere.24 The movement of the transition states with changing concentration is in accord with the Hammond postulate.
Structures of the transition state ensembles (TSEs)
The free energy profile as a function of χ in Fig. 2A suggests that there are two barriers. The ensembles of conformations at their locations, grouped as TSE1 and TSE2, are shown by the shaded areas. The global characteristic of the TSE in ensemble experiments is usually described using the Tanford parameter, β. From the observed chevron plot, 24 or (0.56, 0.90) for TSE1, TSE2 respectively.25 It is generally assumed that β is related to the buried solvent accessible surface area (SASA) in the TSE. For the TSE obtained in our simulations, we calculated the distribution P(ΔR) (Fig-4A), where ΔR = (ΔU – ΔTSE)/(ΔU – ΔN), and ΔU, ΔTSE, and ΔN are the SASA in the DSE ([C] = 8.0M), the TSE, and the NBA ([C] = 0.0M), respectively. We found that the average for TSE1 and for TSE2, which are in qualitative agreement with the experimentally measured values. The small deviations between simulations and experiments may be due to the following reasons. (1) The PDZ3 domain in the simulations has one additional helix and one β sheet, which are absent in the construct used in the experiments. These extra structural elements are highly flexible even under native conditions (upper left in Fig-4B), which lowers ⟨ΔR⟩. (2) When fitting the chevron plot to obtain β in experiments, both and (0.3, 0.8) give reasonable fits, indicating a range of β can describe the experimental data.24 Given these observations, we surmise that the Tanford β calculated using the simulation data is in reasonable agreement with experimental values.
Fig-4B and Fig-4C show the contact maps obtained from TSE1 and TSE2, respectively. It is clear that, relative to the native state (upper left), the TSE1 structures, populated at low [C], are moderately structured. In contrast, the structures are ordered to a greater extent in TSE2. The contacts between β1 – β6 have very low probabilities of formation, indicated by the major blue region in Fig-4B, and have moderate formation probabilities, indicated by the major green region in Fig-4C. One representative structure for TSE1 is shown on the left of Fig-4A, where we can see that β2, β3, β4, β5, β6, α1, α2 and α3 are packed loosely, with β1, β7, and β8 forming no contact with the core of the protein. Note that β7, β8 and α3 are the extra regions in our simulations compared to the typical structure of a PDZ domain. A representative structure for TSE2 is shown on the right of Fig-4A, where the core of the native topology is well established except for β1, β7 and β8. We should point out a minor discrepancy between our results and a previous study.25 We find that β1 is unstructured in TSE2, but it is found to be structured by Jemth.25
Folding kinetics and the Chevron Plot
We calculated the [C]-dependent folding (unfolding) rates from folding (unfolding) trajectories, which were generated from Brownian dynamics using the effective energy function, GP({ri}) (see Methods). From sixty (one hundred for [C] = 0) folding trajectories, the fraction of unfolded molecules at time t is computed using $P_u(t) = 1 - \int_0^t P_{fp}(s)\,ds$, where Pfp(s) is the distribution of first passage times. We fit $P_u(t) \sim e^{-t k_f}$ under folding conditions ([C] < Cm), from which kf[C] can be extracted. Similarly, a single exponential fit for unfolding conditions ([C] > Cm) yields ku[C]. At high (low) [C], we can approximate kobs = kf([C]) + ku([C]) as ku([C]) (kf([C])). We globally fit the relaxation rate, kobs, using $\ln k_{obs} = \ln\left[k_f(0)\,e^{-m_f[C]/RT} + k_u(0)\,e^{m_u[C]/RT}\right]$, where mf (mu) is the slope of the folding (unfolding) arm, with ln kf = ln kf(0) – mf[C]/RT and ln ku = ln ku(0) + mu[C]/RT.
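The two ingredients of the kinetic analysis, the unfolded fraction from first passage times and the two-arm Chevron expression, can be sketched as follows. This is a minimal illustration, assuming Pu(t) is estimated as the fraction of trajectories that have not yet folded by time t and that both arms are linear in [C]; the function names and the first passage times in the example are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def unfolded_fraction(first_passage_times, t):
    """Empirical P_u(t) = 1 - integral_0^t P_fp(s) ds from a set of trajectories."""
    n = len(first_passage_times)
    return sum(1 for s in first_passage_times if s > t) / n


def ln_kobs(c, kf0, ku0, m_f, m_u, temperature):
    """ln k_obs = ln[kf(0) exp(-m_f C / RT) + ku(0) exp(+m_u C / RT)].

    m_f and m_u are in kcal/(mol M), c is in M, rates are in 1/s.
    """
    rt = R_KCAL * temperature
    k_fold = kf0 * math.exp(-m_f * c / rt)
    k_unfold = ku0 * math.exp(m_u * c / rt)
    return math.log(k_fold + k_unfold)


# Hypothetical first passage times (arbitrary units) for four trajectories.
print(unfolded_fraction([1.0, 2.0, 3.0, 4.0], t=2.5))

# Chevron curve using the simulated parameters quoted in the text.
for conc in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(conc, round(ln_kobs(conc, 278.0, 0.91, 0.91, 0.63, 306.0), 2))
```

At low [C] the first exponential dominates and the curve follows the folding arm, while at high [C] the second term takes over, which produces the V shape.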
A plot of ln kobs as a function of [C] over a wide concentration range (0M ≤ [C] ≤ 8.0M) shows the classic Chevron shape (Fig-5) observed in experiments for a number of proteins. In the range [C] ≤ 1.5M, ku ≪ kf, so that kobs ~ kf, and similarly for [C] above 4.5M, kobs ~ ku. In the transition region (2.0M ≤ [C] ≤ 4.0M), the folding and unfolding rates are too small to be calculated reliably: because the PDZ3 domain is relatively large (110 amino acids), we could not obtain reliable folding and unfolding rates around the midpoint even using the SOP-SC model. Comparison of the simulation and experimental results shows that the slopes (from the folding and unfolding arms) of the simulated chevron plot are qualitatively similar to the experimental values.
From the slope of the folding arm (simulation results in Fig-5), we obtain mf = 0.91kcal/mol · M, and from the unfolding arm mu = 0.63kcal/mol · M. The corresponding experimental values are and .22 The agreement between experiments and simulations for the slope of the folding arm is reasonable and the agreement for the unfolding arm slope is fair. Since the fraction of molecules in the IBA is negligibly small, both thermodynamics and kinetics simulations can be approximately described by a two-state model and hence we expect m ≈ mf + mu. From the simulated Chevron plot, we obtain m ≈ 1.54kcal/mol · M, which differs by ~ 24% from m = 2.03kcal/mol · M obtained from equilibrium ΔGNU[C] calculations (Fig. 2C). In contrast, the experimental slopes give mf + mu = 2.46kcal/mol · M, which is close to m = 2.50kcal/mol · M found in equilibrium titration experiments. We conclude that our simulations capture only the qualitative features of the denaturant-dependent folding kinetics of the PDZ3 domain.
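The two-state consistency check m ≈ mf + mu used above is simple arithmetic, sketched here with the simulated values quoted in the text; the helper names are illustrative.

```python
def kinetic_m_value(m_f: float, m_u: float) -> float:
    """Two-state expectation: the equilibrium m-value equals m_f + m_u."""
    return m_f + m_u


def relative_deviation(m_kinetic: float, m_equilibrium: float) -> float:
    """Fractional difference of the kinetic estimate from the equilibrium m-value."""
    return abs(m_equilibrium - m_kinetic) / m_equilibrium


m_kin = kinetic_m_value(0.91, 0.63)        # slopes of the simulated Chevron arms
deviation = relative_deviation(m_kin, 2.03)  # equilibrium m from the linear fit
print(f"m_f + m_u = {m_kin:.2f} kcal/(mol M), deviation = {100 * deviation:.0f}%")
```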
Although MTM simulations reproduce the Chevron shape well, the dependence of ln kobs on [C] does not agree quantitatively with experiments. For instance, kf[0] from simulations is 278 s−1, which is only ≈ 1.6 times larger than the extrapolated value 170 s−1 for [C] = 0 from experiment. However, the unfolding rate at [C] = 0, ku[0], from simulations is 0.91 s−1 while ku[0] from experiment is 0.0022 s−1. The simulations overestimate the unfolding rate by about 414 fold compared to experiments, even though the values of the slopes of the unfolding arms from simulations and experiments are reasonably close. It is not easy to theoretically establish the reasons for the large difference between the predicted and experimental values of the unfolding rate, especially considering that the folding rate is accurate. We expect, on general grounds, that both the folding and unfolding rates should differ from measurements because we use coarse-grained models. In several previous studies9,10,12 we had argued that the difference between simulations and experiments could be of one or two orders of magnitude. The effective diffusion in our model is greater than would be the case had the solvent been modeled explicitly, which alas is impossible to do using current atomic detailed simulations. The larger predicted value of ku[0] compared to experiments suggests that the unfolding energy landscape is rugged, which is not accurately captured by the simulations. Assuming that $D \approx D_0 \exp[-(\epsilon/k_B T)^2]$ (D is the actual diffusion coefficient upon unfolding, D0 is the diffusion coefficient in the absence of roughness, and ε is the scale of roughness), the discrepancy between the predicted and experimental ku[0] implies that ε ≈ 2.5kBT. The absence of non-native interactions, which apparently are important for unfolding of PDZ3, could explain the overestimation of ku[0]. Clearly, the extent of deviation is likely to depend on the protein and the sequence.
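The roughness estimate follows from the ratio of the simulated and measured unfolding rates. The sketch below assumes that the rate scales with an effective diffusion coefficient of the form D = D0 exp[−(ε/kBT)²], so that ε/kBT = sqrt(ln(k_sim/k_exp)); the function name is illustrative.

```python
import math


def roughness_in_kT(k_simulated: float, k_experimental: float) -> float:
    """Roughness scale epsilon / k_B T implied by k_sim / k_exp = exp[(eps / kT)^2]."""
    ratio = k_simulated / k_experimental
    if ratio < 1.0:
        raise ValueError("the estimate assumes the simulated rate is the larger one")
    return math.sqrt(math.log(ratio))


fold_ratio = 278.0 / 170.0      # folding rates at [C] = 0
unfold_ratio = 0.91 / 0.0022    # unfolding rates at [C] = 0
eps = roughness_in_kT(0.91, 0.0022)
print(f"kf ratio = {fold_ratio:.1f}, ku ratio = {unfold_ratio:.0f}, eps = {eps:.2f} kT")
```

The unfolding rate ratio of about 414 gives ε close to 2.5 kBT, the value quoted in the text.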
Fluxes through parallel pathways depend on the denaturant concentration
By analyzing the folding trajectories using χ as the progress variable for the folding reaction, we find that PDZ3 folds along four distinct pathways. One representative trajectory for each pathway is shown in Fig-6, where χ is displayed as a function of t. In each pathway folding occurs in stages. In addition to the conformations in the UBA and the NBA, we identified two intermediate states (KIN1 and KIN2), whose lifetimes vary greatly depending on the pathway. Arrows of each color in Fig-7 represent one folding pathway and the thickness of the arrows represents the probability of the pathway. At [C] = 0 (Fig-7A), the dominant pathway P1 is D → KIN1 → KIN2 → N (black arrows), through which ~ 52% of the flux to the native state is channeled. In this pathway, β sheets between strands 1, 6, 4 form transiently in the KIN1 state, followed by the consolidation of core β sheets between strands 2, 3, 4 and α2 in the KIN2 state. The less probable alternative pathway P2 is D → KIN2 → N (red arrows), representing ~ 38% of the trajectories, where folding occurs only through the KIN2 state. Similarly, in the third pathway P3, D → KIN1 → N (green arrows), through which ~ 10% of the flux to the native state flows, folding occurs only through the KIN1 state.
The flux through P1, identified at [C] = 0, remains dominant at [C] = 0.5M (Fig-7B, ~ 52%) and [C] = 1.0M (Fig-7C, ~ 61%). The P2 and P3 pathways have lesser probabilities: ~ 36% and ~ 6% for [C] = 0.5M, and ~ 28% and ~ 9% for [C] = 1.0M, respectively. A direct pathway P4, D → N (blue arrows), is observed with small probabilities (~ 6% for [C] = 0.5M, ~ 2% for [C] = 1.0M). Thus, the PDZ3 domain folds through heterogeneous pathways. Most importantly, the populations of the folding pathways are sensitive to denaturant concentrations. Denaturant-modulated parallel pathways were also observed for adenylate kinase.30 Interestingly, such parallel folding has been observed in the folding of both small proteins41 and larger proteins.30,42,43
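The pathway fluxes quoted above can be collected into a small table to make the concentration dependence explicit. The sketch uses the approximate percentages from the text and assumes the flux through P4 is zero at [C] = 0, since no direct D → N trajectories are reported there; the dictionary and function names are illustrative.

```python
# Approximate flux (percent of trajectories) through each pathway, keyed by [C] in M.
# P1: D -> KIN1 -> KIN2 -> N, P2: D -> KIN2 -> N, P3: D -> KIN1 -> N, P4: D -> N.
FLUX_PERCENT = {
    0.0: {"P1": 52, "P2": 38, "P3": 10, "P4": 0},
    0.5: {"P1": 52, "P2": 36, "P3": 6, "P4": 6},
    1.0: {"P1": 61, "P2": 28, "P3": 9, "P4": 2},
}


def dominant_pathway(fluxes):
    """Return the label of the pathway carrying the largest flux."""
    return max(fluxes, key=fluxes.get)


def flux_through_state(fluxes, pathways_visiting_state):
    """Total flux passing through an intermediate, given the pathways that visit it."""
    return sum(fluxes[p] for p in pathways_visiting_state)


for conc, fluxes in FLUX_PERCENT.items():
    via_kin2 = flux_through_state(fluxes, ("P1", "P2"))
    print(f"[C] = {conc} M: dominant = {dominant_pathway(fluxes)}, "
          f"flux via KIN2 = {via_kin2}%, total = {sum(fluxes.values())}%")
```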
Discussion
Post-Collapse kinetic intermediate is structurally similar to an equilibrium intermediate
To illustrate the relationship between the thermodynamically observed intermediate state (IEQ) and kinetically observed intermediate states (KIN1, KIN2), we calculated the average fraction of native contacts at every residue fQs. The correlations between the fQs for the three states are shown in Fig-8A and Fig-8B. The correlation between IEQ and KIN1 is very low (correlation coefficient, R=0.3), indicating that at the early stages of folding a variety of compact but structurally diverse states are explored. The observation that in the initial stages of organization a heterogeneous mixture of states with small thermodynamic states are sampled is consistent with early atomic detailed simulations on cytochrome c. 44
In contrast, the correlation between the fQs of IEQ and KIN2 is high (Fig-8B), with the correlation coefficient R = 0.98. A linear fit of the line in Fig-8B gives y = A + Bx with A = 0.008±0.007, B = 1.175±0.021. Since the intercept, A, is close to zero, and the slope, B, is close to unity, we conclude that KIN2 and IEQ are the same intermediate species. Fig-8C shows, for KIN1, IEQ and KIN2, the average fraction of native contacts at every residue fQs by color, vividly demonstrating the similarity between KIN2 and IEQ. In addition, KIN2 and IEQ are very similar to the native PDZ3 structure except that the β1 strand and α3 helix are not structured if the β7 and β8 strands are not included. These results show that in the later stages of folding the equilibrium and kinetic intermediates coincide, which is expected to hold for most, if not all, foldable proteins. In the later stages of folding, which correspond to the stage after chain compaction, native-like interactions dominate, which was first shown using lattice models45,46 for which precise computations can be performed. Because native-like structures, which have considerable Boltzmann weight, dominate, it follows that the structural features of KIN2 and IEQ should coincide.
Despite the simplicity of the theory-based approach used here, it is worth emphasizing again the validity of the SOP model from the following perspectives. First, the SOP model was not parameterized for this protein but is transferable, as we have demonstrated in a number of applications (see for example refs. 9, 12, 43). Second, the SOP model predictions for the thermodynamics are in excellent agreement with experiments not only for this protein but also for about ten proteins for which detailed comparisons have been made. Third, previous simulations using lattice models45,46 show that after the polypeptide chain collapses the transition to the native state is dominated by native-like interactions, which is consistent with our finding that the structural features of KIN2 and IEQ coincide and thus justifies the SOP model. For these reasons we believe that the major prediction that the fluxes through the pathways can be altered by changing the denaturant concentration is valid, and certainly amenable to experimental test as was done for adenylate kinase.30
Denaturants alter the connectivity between the metastable intermediates
The most interesting finding that denaturants alter the connectivity between the intermediates, and hence the fluxes through the distinct pathways, was already demonstrated in a most beautiful single-molecule fluorescence energy transfer (smFRET) experiment.30 Using adenylate kinase (ADK), a 214-residue two (or three) domain protein, and by collecting a large number of trajectories Haran and coworkers showed that during the folding process six metastable intermediates are populated. Most importantly, they showed using Hidden Markov Model analysis of the smFRET trajectories that the pathways traversed by ADK depend on the concentration of GdmCl. Both sequential (state i is connected to its neighbor i± 1) as well as non-sequential transitions, leading to parallel folding routes, are found during the folding process. The findings for ADK in experiments are qualitatively reflected in the simulated folding pathways of the smaller single domain PDZ domain. In particular, at the three concentrations of GdmCl we find both sequential and non-sequential connectivities. Just like the ADK study, we also find that a minor population of unfolded states directly reach the NBA, as predicted by the kinetic partitioning mechanism (KPM).47 Surprisingly, our simulations for PDZ show that KPM occurs only at 0.5M and 1.0M GdmCl but not in the absence of the denaturant. Overall, our simulations provide support to the discovery by Haran and coworkers 30 that fluxes through parallel folding pathways could be altered by changing the denaturant concentration.
It is interesting to compare this key finding30 to the reports of parallel unfolding routes in I27 induced by denaturants48 and SH3 domain by mechanical force (F).49 In both these studies the [C] or F dependence of ln ku (ku is the unfolding rate) exhibited upward curvature. Recently, we showed using theory that as long as the perturbation of the protein is linear in the external field (GdmCl or F) then upward curvature in the [ln ku, [C] or F] plot implies parallel unfolding pathways.50 In light of these observations and the present study, it would be most instructive to use F to probe the folding and unfolding of the PDZ domain. Such experiments would clarify if denaturant-induced intermediates coincide with those found under tension, which should be the case if the perturbation is linear in [C] and F.
Single molecule pulling experiments would provide insights into the nature of the TSEs and the modulation of fluxes through distinct pathways. The calculated free energy profiles in Fig. 3 suggest two possible scenarios for forced-unfolding of PDZ3 in the presence of denaturants. In the first case, we expect that the free energy profile could be described by an effective one-dimensional reaction coordinate with an outer barrier dominating at low forces and an inner barrier becoming important at high forces.51,52 This scenario could hold for force-induced unfolding of PDZ3, which would be consistent with the energy landscape inferred from ensemble experiments.27 In this case there would be a change in the transition state position from a large (small) value at low (high) force. The more interesting scenario is that the location of the transition state in terms of the molecular extension, conjugate to the applied force, is an increasing function of force just as found for the unfolding of the SH3 domain.49 In this case we would predict based on the free energy profiles in Fig. 3 that TSE1 would be dominant at low F and TSE2 at high F. Distinguishing between the two scenarios awaits single molecule pulling experiments.
Conclusions
Using PDZ3 as another case study we have showcased the power of the SOP-MTM simulations in capturing accurately the thermodynamics of folding in the presence of denaturants. Although the unfolding rate in the absence of denaturants deviates substantially from experiments, the predicted folding rate in water is in excellent agreement with experiments. The major finding is that PDZ3 folds by parallel pathways with the crucial prediction that fluxes through the major pathways depend on the denaturant concentration. Single-molecule fluorescence energy transfer experiments could be used to validate our predictions. The two scenarios for parallel folding pathways, which lead to different predictions for the variation in the position of the transition states with changes in the mechanical force, can be distinguished using single molecule pulling experiments. Finally, the present work shows that the most practical, reasonably accurate, and currently the only way of taking the effects of denaturants into account is by using the SOP-MTM simulations. The transferability of this method has been established through numerous applications.
Acknowledgement
ZL acknowledges financial support from the National Natural Science Foundation of China (11104015,11675017,11735005) and the Fundamental Research Funds for the Central Universities (2012LYB08). DT is grateful to the National Science Foundation (CHE 16-36424), the National Institutes of Health (R01 GM089685), and the Collie-Welch Regents chair (F0019) for supporting this work.
Footnotes
* E-mail: zxliu{at}bnu.edu.cn