Abstract
In this study, we provide a time-dependent (td-) mechanical model that combines molecular dynamics (MD) simulations, quasiharmonic analysis of MD trajectories and td-linear response theories (td-LRT) to describe vibrational energy redistribution within the protein matrix. The theoretical description explains the observed biphasic responses of specific residues in myoglobin to CO photolysis and to photoexcitation of the heme. The fast responses are found to be triggered by impulsive forces and propagated mainly by principal modes <40 cm−1. The predicted fast responses of individual atoms are then used to study signal propagation within the protein matrix; signals are found to propagate ∼8 times faster across helices (4076 m/s) than within helices (529 m/s), suggesting the importance of tertiary packing in proteins’ sensitivity to external perturbations. We further develop a method to integrate multiple intramolecular signal pathways and discover frequent “communicators”. These communicators are found to be evolutionarily conserved, including those distant from the heme.
Introduction
Understanding the allosteric control of molecular functions through investigations of energy flow within biomolecules has attracted continuous scientific and medicinal interest over the past 20 years, encouraged by ever-increasing computing power and implementations of appropriate physical theories. Exemplary systems include the relaxation dynamics of photolyzed heme proteins,1-8 such as the photodissociation of carbon monoxide (CO) originally captured in carbonmonoxy myoglobin (MbCO). Such relaxation has been studied by ultrafast spectroscopy, which enables scientists to characterize relaxation dynamics on the femto- to picosecond time scale. The relaxation proceeds in multiple stages: energy starts from the electronic state of heme photoexcited by ultrafast lasers, flows through the vibrational excitation of heme to the protein environment, and finally dissipates to the solvent.1 Using time-resolved near-IR absorbance spectra, it has been reported that the photoexcited electronic state of heme relaxes to its ground state with a time constant of 3.4 ± 0.4 ps, and that the ensuing thermal relaxation is characterized by a temperature that cools exponentially with a time constant of 6.2 ± 0.5 ps.1 In the crystalline environment, such a relaxation was reported to approach equilibrium at a faster pace through damped oscillations with a 3.6 ps time period.9 Further, the excited heme communicates with distant sites through vibrational modes within a few picoseconds, where the delocalized “doming modes” of heme are identified around 40 cm−1 using nuclear resonance vibrational spectroscopy.7 Two in-plane heme modes, v4 and v7, are reported to couple with motions along the doming coordinate (40-50 cm−1) and with spatially extended modes (centered at 25 cm−1), as observed using high-frequency laser pulses.8
Recently, ultraviolet resonance Raman (UVRR) spectroscopy has been used to investigate vibrational energy redistribution in the protein matrix, aided by its enhanced Raman scattering intensity.2, 3, 10, 11 Mizutani’s group used time-resolved UVRR (UV-TR3) to measure the time constants of relaxation dynamics for several tryptophan mutants in two different experiments on the photodissociation of MbCO.2, 3 In 2007, the CO photodissociation coupled with a conformational change of Mb was studied. The band intensity changes as a function of time owing to the change in hydrophobicity that accompanies the structural change in the E helix and its interaction with the A helix, in which Trp7 and Trp14 are situated.3 In 2014, the same group irradiated and excited the heme, and the vibrational relaxation of Mb was monitored by UVRR under a condition in which conformational change is suppressed, with Fe in the 3+ state.2 In these studies,2, 3 a biphasic decay of the relaxation motions was observed and the time constants of the fast and slow responses of several residues were identified, which provides an excellent model system for tracing energy flow and investigating the underlying mechanism of vibrational energy transfer within Mb.
On the theoretical side, many theories and algorithms have been developed to describe vibrational energy relaxation and mechanical signal propagation.12-16 The lifetime of CO vibrations estimated by the Landau–Teller formula is found to agree well with time-resolved mid-IR absorbance experiments.4 To further understand the inherent inhomogeneity in the spatially dependent relaxation rate of the solvated protein, the Langevin model has been used to estimate the inherent friction in protein motion.17-19 In addition, the propagation of heat flow, kinetic energy or simply ‘signals’ through vibrational energy redistribution in proteins has been investigated with mode diffusion (or mode coupling) induced by anharmonicity,20-23 MD simulations24-26 and linear response theory (LRT).19, 27-29
In our previous work, we succeeded in using the normal-mode-based linear response theory (NMA-tdLRT)19 to describe the relaxation dynamics of ligand photodissociation of MbCO in the UV-TR3 experiment.3 A general assumption of the normal-mode-based theory is that the conformational space of protein motions lies around the global minimum of the potential energy surface (PES). However, the conformational space sampled by a protein in a more realistic system, as described by the PC modes of essential dynamics or the principal component analysis (PCA) of molecular dynamics snapshots, can extend over a rougher and wider range of the PES.30, 31 Although this raises the concern that the external perturbation is no longer small, which challenges the validity of the tdLRT, we have earlier proved that, for systems evolving on a harmonic energy surface, the assumption of small perturbations is no longer needed31a (see also SI).
In this study, we formulate the principal component analysis based linear response theory (PCA-tdLRT) to investigate signal propagation in myoglobin. We prove that td-LRT holds for systems that evolve on harmonic potentials without assuming that the perturbations, and thereby the induced conformational changes, are small (see SI). We examine the PCA-tdLRT by comparing the estimated time constants of several residues for short-time (corresponding to the “fast” response in the UVRR experiment) and long-time (corresponding to the “slow” response in the UVRR experiment) relaxation with two UV-TR3 experiments.2, 3 We also show a large difference in energy-transfer speed between inter- and intra-helical signals, suggesting the importance of protein tertiary packing in mediating vibrational energy. We further explore the mode dependence of the signal propagation to understand the range of modes heavily involved in transmitting fast-response signals. Finally, we introduce a communication matrix to record the counts of mechanical signals, launched from sites nearest to the FE atom, that propagate through donor-acceptor pairs; each residue is assigned a communication score (CS), which quantifies how frequently mechanical signals go through this residue should perturbations be introduced at specific sites, such as substrate binding sites. The sites with high CS are termed “intramolecular communication centers (CCs)”, which are found to be evolutionarily conserved. Also, such CCs were previously found to be able to allosterically regulate enzyme activity.53
Methods
MD protocols and Principal Component Analysis (PCA)
Unligated (1A6N) and CO-ligated (1A6G) myoglobin (Mb) structures are used in MD simulations with the CHARMM36 force field implemented in the NAMD package.32 Proteins are immersed in a box of size 78Å×71Å×71Å containing 33276 TIP3P water molecules and neutralizing ions (one sodium and nine chloride ions). Structures are energy-minimized, heated and equilibrated at 300K and 1 bar, maintained by a Langevin thermostat and barostat. Particle Mesh Ewald and SHAKE are applied in the simulations. The production run at 300K and 1 bar lasts 80 ns, with snapshots saved every 100 fs. We take the non-hydrogen atoms of the last 50 ns, totaling 500,000 snapshots (frames), for the subsequent principal component analysis (PCA33). The structural change under perturbations (see below) is taken as the difference between the averaged ligated and unligated frames. The time-dependent and time-independent covariance matrices are calculated from PCA-derived quantities using the averaged unligated structure, to which the “0” of “< >0” in eqs. (1) and (2) refers.
We performed (atom-)mass-weighted superimposition of each snapshot34 of the unligated myoglobin onto the averaged structure in an iterative fashion.33,35 Following our earlier protocols,33 the mass-weighted protein atom coordinates in the ensemble of superimposed frames were used to build the residue-residue covariance matrix of positional deviations, to which PCA was applied. PCA provided a set of principal component modes (PC modes) and their corresponding mass-weighted variances σ2 (the eigenvalues). According to the equipartition theorem for harmonic oscillators,33 at a given temperature T, the energy of a harmonic PC mode satisfies σ2 × ω2 = kBT, so the effective frequency, ω, of the PC mode can be defined as $\omega = \sqrt{k_B T/\sigma^2}$.
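As an illustration of the relation σ2 × ω2 = kBT, the following minimal Python sketch converts mass-weighted PC variances into effective wavenumbers. The function names, the toy data, and the choice of units (variances in amu·Å2, conversion to cm−1 through ω/2πc) are assumptions made for this example and are not taken from the authors' code.

```python
import numpy as np

KB_KCAL = 0.0019872041        # Boltzmann constant, kcal/(mol K)
KCAL_TO_AMU_A2_PS2 = 418.4    # 1 kcal/mol = 418.4 amu Å^2 / ps^2
C_CM_PER_PS = 2.99792458e-2   # speed of light, cm/ps

def effective_wavenumbers(variances, temperature=300.0):
    """Map mass-weighted variances (amu Å^2) to wavenumbers (cm^-1)
    using sigma^2 * omega^2 = kB*T (assumed unit convention)."""
    kbt = KB_KCAL * temperature * KCAL_TO_AMU_A2_PS2      # amu Å^2 / ps^2
    omega = np.sqrt(kbt / np.asarray(variances, dtype=float))  # rad/ps
    return omega / (2.0 * np.pi * C_CM_PER_PS)

def pca_modes(mw_coords):
    """PCA of superimposed, mass-weighted coordinates (frames x 3N).
    Returns variances (largest first) and the matching mode vectors."""
    dev = mw_coords - mw_coords.mean(axis=0)
    cov = dev.T @ dev / dev.shape[0]
    variances, modes = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(variances)[::-1]
    return variances[order], modes[:, order]

# Toy trajectory: 2000 frames of a 6-coordinate system (synthetic data).
rng = np.random.default_rng(0)
toy = rng.normal(size=(2000, 6)) * np.array([3.0, 2.0, 1.0, 0.5, 0.3, 0.1])
variances, modes = pca_modes(toy)
wavenumbers = effective_wavenumbers(variances)
```

Because the variances are sorted from largest to smallest, the resulting wavenumbers run from the slowest to the fastest PC mode.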
PCA-based time-dependent linear response theory
In the studies of the photodissociation of carbonmonoxy myoglobin (MbCO),2,3 a biphasic relaxation, a fast response followed by a long-time relaxation, was observed for UV-TR3-detectable residues. To model the slow and fast (biphasic) relaxation dynamics corresponding to the residue responses with and without conformational changes,2,3 we developed two PCA-based time-dependent linear response theories (tdLRT):19 the constant force tdLRT (CF-tdLRT) and the impulse force tdLRT (IF-tdLRT), respectively. The CF-tdLRT reads as $\langle \Delta\mathbf{r}_i(t)\rangle_f = \frac{1}{k_B T}\sum_j \left( \langle \Delta\mathbf{r}_i(0)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0 - \langle \Delta\mathbf{r}_i(t)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0 \right)\mathbf{f}_j$ (1), where kB and T are the Boltzmann constant and temperature, respectively, and $\mathbf{f}_j$ is the constant force acting on atom j. The time-dependent covariance matrix, $\langle \Delta\mathbf{r}_i(t)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0$, can be expressed as a sum of solvent-damped harmonic oscillators treated by the Langevin equation (see supporting information, SI). When time goes to infinity, $\langle \Delta\mathbf{r}_i(t)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0$ vanishes and eq. (1) reduces to the time-independent form, $\langle \Delta\mathbf{r}_i\rangle_f = \frac{1}{k_B T}\sum_j \langle \Delta\mathbf{r}_i\,\Delta\mathbf{r}_j^{T}\rangle_0\,\mathbf{f}_j$.29 The constant forces that drive Mb to evolve from the unbound to the bound conformation upon CO binding can be derived from the time-independent LRT in the form $\mathbf{f} = k_B T\,\langle \Delta\mathbf{r}\,\Delta\mathbf{r}^{T}\rangle_0^{-1}\,\langle \Delta\mathbf{r}\rangle_f$ in kcal/mol/Å, where $\langle \Delta\mathbf{r}\rangle_f$ in this study is the structural difference between the bound and unbound Mb structures, each averaged over MD snapshots. Here the initial structures for the MD simulations were taken from the x-ray-resolved unligated (PDB:1A6N) and ligated Mb (PDB:1A6G) structures.19
To model the relaxation dynamics for the UV-TR3 experiment without conformational changes, we use the IF-tdLRT19 in the form $\langle \Delta\mathbf{r}_i(t)\rangle_f = \frac{1}{k_B T}\sum_j \langle \Delta\mathbf{r}_i(t)\,\Delta\dot{\mathbf{r}}_j^{T}(0)\rangle_0\,\mathbf{k}_j$ (2), where $\mathbf{k}_j$ is an impulse force applied on atom j. In this study, we model the laser pump pulse, which deposits excess energy on the heme, by a force pointing from CO (if bound) to the FE atom plus a set of forces that model the “heme breathing” (v7) mode.8 In both tdLRTs, we express the time-dependent covariance matrix with the PC modes subject to Langevin damping,36 where the solvent friction, β, of 27 cm−1 is taken from our earlier calculation19 based on Hayward’s approach.18 Under this friction constant, PC modes with frequencies higher than 13.5 cm−1 are underdamped, which agrees with the observation from orientation-sensitive terahertz near-field microscopy that vibrational modes with frequencies above 10 cm−1 are underdamped in chicken egg white lysozyme, a protein of 129 residues.37
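The inversion of the time-independent LRT for the constant forces can be sketched as follows. This is a minimal illustration on a synthetic covariance matrix; the use of a pseudo-inverse (to tolerate near-zero eigenvalues such as rigid-body modes), the function names and the toy data are assumptions of this example rather than the authors' implementation.

```python
import numpy as np

KBT = 0.0019872041 * 300.0   # kB*T in kcal/mol at 300 K

def forces_from_displacement(cov, delta_r, kbt=KBT, rcond=1e-10):
    """Invert the time-independent LRT, f = kB*T * C^+ * <dr>,
    using a pseudo-inverse of the covariance matrix (assumption)."""
    cov_pinv = np.linalg.pinv(cov, rcond=rcond, hermitian=True)
    return kbt * cov_pinv @ delta_r

def static_response(cov, forces, kbt=KBT):
    """Forward time-independent LRT: <dr> = C f / (kB*T)."""
    return cov @ forces / kbt

# Synthetic, full-rank covariance (Å^2) and target displacement (Å).
rng = np.random.default_rng(1)
a = rng.normal(size=(9, 9))
cov = a @ a.T / 9.0 + 0.1 * np.eye(9)
delta_r = rng.normal(scale=0.5, size=9)

forces = forces_from_displacement(cov, delta_r)   # kcal/mol/Å
recovered = static_response(cov, forces)
```

Feeding the derived forces back into the forward relation recovers the input displacement, which is the long-time limit expected of the constant-force response.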
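To make the damping regimes concrete, the sketch below evaluates the normalized position autocorrelation of a single Langevin-damped harmonic mode and classifies a mode as under- or overdamped by the 2ω versus β criterion. It uses the textbook damped-oscillator solution; converting both ω and β from cm−1 to rad/ps with the same factor 2πc is an assumption of this example, as are the function names.

```python
import numpy as np

TWO_PI_C = 2.0 * np.pi * 2.99792458e-2   # converts cm^-1 to rad/ps

def is_underdamped(wavenumber, friction=27.0):
    """A mode is underdamped when 2*omega exceeds the friction beta."""
    return 2.0 * wavenumber > friction

def normalized_autocorr(t_ps, wavenumber, friction=27.0):
    """<x(t)x(0)>/<x^2> of one Langevin-damped harmonic mode."""
    w = wavenumber * TWO_PI_C
    b = friction * TWO_PI_C
    t = np.asarray(t_ps, dtype=float)
    disc = w * w - 0.25 * b * b
    decay = np.exp(-0.5 * b * t)
    if disc > 0:                                   # underdamped branch
        wp = np.sqrt(disc)
        return decay * (np.cos(wp * t) + 0.5 * b / wp * np.sin(wp * t))
    if disc < 0:                                   # overdamped branch
        wp = np.sqrt(-disc)
        return decay * (np.cosh(wp * t) + 0.5 * b / wp * np.sinh(wp * t))
    return decay * (1.0 + 0.5 * b * t)             # critically damped limit

t = np.linspace(0.0, 10.0, 1001)                   # ps
c_fast = normalized_autocorr(t, 40.0)              # oscillatory decay
c_slow = normalized_autocorr(t, 5.0)               # monotonic decay
```

With β = 27 cm−1, the crossover between the two branches falls at 13.5 cm−1, matching the threshold quoted in the text.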
The mean characteristic time of a residue
Using the IF-tdLRT in eq. (2), after an impulse force is applied to a Cα atom, the atomic displacements evolve as a function of time. After the perturbation is introduced to the system at time 0, atom i reaches its maximal displacement from its initial position at a later time; this time, at which the magnitude of the displacement is at its maximum over all time, is defined as the characteristic time of atom i, tC. Our previous study19 showed that the characteristic time is a function of the direction of the applied impulse force. When a single point force is applied to one atom in the system, the characteristic time does not change with the absolute magnitude of the force and remains unchanged if the force points in the opposite direction. To average out the effects of the applied force direction, we define a set of 21 impulse forces, with unit magnitude and pointing toward 21 evenly distributed directions in a hemisphere at spherical angles Ω, acting on a heavy atom of a residue of interest. Consequently, a set of characteristic times can be derived. We then average over the 21 directions as well as over all heavy atoms i of a residue R to obtain the averaged characteristic time of residue R, where i is the index of heavy atoms. To estimate the signal propagation speed, the averaged characteristic time can be plotted as a function of the distance d between the perturbed Cα atom and the Cα atom that senses the incoming signal. By taking the linear regression of a set of averaged characteristic times as a function of d, we obtain the propagation speed as the reciprocal of the slope (see Figure 3).
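The two operations described above, locating the characteristic time of a response curve and converting the regression slope into a speed, can be sketched as follows. The function names and the synthetic data are hypothetical; the only unit fact used is that 1 Å/ps equals 100 m/s.

```python
import numpy as np

def characteristic_time(times_ps, displacement):
    """Time at which the magnitude of one atom's displacement peaks."""
    idx = int(np.argmax(np.abs(displacement)))
    return float(times_ps[idx])

def propagation_speed(distances_A, char_times_ps):
    """Speed in m/s from the reciprocal slope of characteristic time
    versus distance, plus the Pearson correlation coefficient."""
    slope, _intercept = np.polyfit(distances_A, char_times_ps, 1)  # ps/Å
    r = np.corrcoef(distances_A, char_times_ps)[0, 1]
    return (1.0 / slope) * 100.0, float(r)     # 1 Å/ps = 100 m/s

# Synthetic response curve that peaks at t = 1 ps.
t = np.linspace(0.0, 10.0, 1001)
tc = characteristic_time(t, t * np.exp(-t))

# Synthetic arrival times consistent with a 5.29 Å/ps signal.
d = np.linspace(5.0, 30.0, 20)
speed, r = propagation_speed(d, d / 5.29 + 0.3)
```

A constant offset in the arrival times (here 0.3 ps) does not affect the estimated speed, since only the slope enters.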
Communication matrix/map (CM) and communication score (CS)
With the aid of the IF-tdLRT, which provides the characteristic time to characterize dynamical signal propagation, we further develop a method that traces the signal propagation pathways and identifies which residues are essential for mediating the signals when all the pathways are considered. Here, we introduce “the communication map (CM)” to record the signal propagation between residues, from which we derive “the communication centers (CCs)”, the residues that are frequently used to communicate signals in most of the pathways.
The characteristic time tC can provide the causality of signal propagation. We can trace the signal transmission pathways consisting of the donor-acceptor pairs illustrated in Fig. 1. Two communication criteria determine the communication between a donor atom, ad, and an acceptor atom, ar. First, given an impulse force exerted on an atom in the system, two atoms are considered “communicating” if the characteristic time of ar is later than that of ad by Δt, where the fixed Δt is chosen to be 0.2 ps in this study. Second, to maintain the causality of signal propagation, the candidates that satisfy the first criterion are required to meet the criterion that the angle between the vector pointing from the perturbed atom to ar and that pointing from the perturbed atom to ad be less than 90 degrees. It is possible that a donor atom connects to multiple acceptor atoms, or that multiple donors connect to a single acceptor atom. Consequently, to quantify how frequently a residue participates in signal propagation, we define an all-atom communication matrix F(ad, ar), which counts the number of communications in which the donor, ad, connects to the acceptor, ar. The count is accumulated through δ(x), a Kronecker delta function of x, over j, which runs over all the selected Cα atoms being perturbed, and over Ωj, the force orientation angles defined in the previous section. The lower part of Figure 1 illustrates how the communication matrix F is constructed by summing all the pairwise communications when signals are initiated from different sites in the system, one at a time. With the all-atom F matrix, we further define the residue-level communication matrix (CM), C(Rd, Rr), by summing the counts belonging to each residue pair, where Rd and Rr are the donor and acceptor residues, respectively, and Nd and Nr are the numbers of atoms in the donor and acceptor residues, respectively.
To model the signal propagation process, we perturbed the residues within 6 Å of the ferrous ion (FE): F43, H64, V68, L89, H93, H97 and I99, represented as yellow spheres in the middle panel of Figure 5a. The 21 evenly distributed impulse forces were applied to each perturbation site, and all the resulting signal communications were summed in a CM, C(Rd, Rr). In the CM, the diagonal elements correspond to intra-residue communication and have high C(Rd, Rr) scores, which is intuitive but less meaningful in terms of allostery. The off-diagonal elements satisfying |Rd − Rr| > 2 provide information on signal propagation between long-range contacts (say, within or between secondary structures). Several hot spots with high CM scores indicate strategic locations that frequently communicate mechanical signals in multiple pathways. A unique “communication score (CS)” can then be assigned to each residue, defined as the highest CM score between this residue and any other non-neighboring residue in the protein; in other words, residue i’s CS is the highest score in either the i-th row or the i-th column (corresponding to donors or acceptors, respectively) of the CM.
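The two communication criteria and the accumulation into F and the residue-level CM can be sketched as below. Several choices here are assumptions of this example and not details confirmed by the text: the time criterion is applied with a tolerance `tol` (characteristic times live on a discrete time grid), the perturbed atom is allowed to act as a donor regardless of the angle criterion, and the residue-level sum is normalized by Nd × Nr. All function and variable names are hypothetical.

```python
import numpy as np

def communication_counts(char_times, perturbed_atoms, coords, dt=0.2, tol=0.05):
    """Accumulate F[donor, acceptor] over perturbations.
    char_times[k, a]: characteristic time (ps) of atom a in perturbation k
    (one row per perturbed site and force direction);
    perturbed_atoms[k]: index of the atom perturbed in row k."""
    n_atoms = coords.shape[0]
    F = np.zeros((n_atoms, n_atoms), dtype=int)
    for tc, p in zip(char_times, perturbed_atoms):
        lag = tc[None, :] - tc[:, None]        # lag[d, r] = t_r - t_d
        timed = np.abs(lag - dt) <= tol        # criterion 1 (with tolerance)
        vec = coords - coords[p]               # vectors from perturbed atom
        same_side = (vec @ vec.T) > 0.0        # criterion 2: angle < 90 deg
        same_side[p, :] = True                 # assumption: source may donate
        F += (timed & same_side).astype(int)
    return F

def residue_matrix(F, residue_of_atom, n_residues):
    """Sum atom-pair counts per residue pair, divided by Nd*Nr (assumption).
    Every residue is assumed to contain at least one atom."""
    n = len(residue_of_atom)
    onehot = np.zeros((n, n_residues))
    onehot[np.arange(n), residue_of_atom] = 1.0
    sums = onehot.T @ F @ onehot
    n_per_res = onehot.sum(axis=0)
    return sums / np.outer(n_per_res, n_per_res)

# Toy system: three collinear atoms, atom 0 perturbed once.
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
char_times = np.array([[0.0, 0.2, 0.4]])
F = communication_counts(char_times, [0], coords)
CM = residue_matrix(F, [0, 0, 1], n_residues=2)
```

In the toy case the signal hops from atom 0 to atom 1 and from atom 1 to atom 2, so only those two donor-acceptor pairs are counted.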
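Given a residue-level CM, the communication score and the ranking of communication centers can be sketched as follows. Reading “non-neighboring” as |Rd − Rr| > 2 is an assumption of this example, and the function names are hypothetical.

```python
import numpy as np

def communication_scores(cm, min_separation=3):
    """CS of residue i: the largest CM entry in row i or column i,
    restricted to partners with |Rd - Rr| >= min_separation."""
    n = cm.shape[0]
    idx = np.arange(n)
    far = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    masked = np.where(far, cm, -np.inf)
    best = np.maximum(masked.max(axis=1), masked.max(axis=0))
    return np.where(np.isfinite(best), best, 0.0)   # no far partner -> 0

def communication_centers(cs, top_n=20):
    """Indices of the top_n residues ranked by communication score."""
    return np.argsort(cs)[::-1][:top_n]

# Toy 5-residue CM: the large near-diagonal entry must be ignored.
cm = np.zeros((5, 5))
cm[0, 4] = 3.0     # long-range pair, counted for residues 0 and 4
cm[1, 2] = 10.0    # neighboring pair, excluded
cm[3, 0] = 2.0     # long-range pair, counted for residues 3 and 0
cs = communication_scores(cm)
top2 = communication_centers(cs, top_n=2)
```

Masking the near-diagonal band before taking the row and column maxima keeps the intra-residue and sequence-neighbor counts from dominating the score.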
Results
Modeling the biphasic relaxation of residues of time-resolved ultrafast spectroscopy
It has been reported that the time constants measured by the resonance Raman spectra of tryptophan residues are sensitive to changes in their hydrophobic environment.3,38,39 For the four residues measured in the UV-TR3 experiments, W14 on helix A,2,3 Y146 on helix H,3 V68W on helix E,2 and I28W on helix B,2 the changes in hydrophobic environment are monitored by the distance fluctuations between them and their closest hydrophobic residues (herein, the distances for the Trp14-Leu69, Ile28-Ile107, Val68-Ile107 and Tyr146-Ile99 pairs). The residues’ spatial distributions relative to the heme can be found in Figure 2. To quantify the dynamics of the corresponding residues, we define the reaction coordinate Ψab(t) from the time-evolved distance between the side-chain centers of residues a and b, measured from the moment the force perturbations are introduced.
Considering the conformational changes initiated by the photodissociation of MbCO,3 we use the CF-tdLRT, which has been shown to be particularly suitable for describing conformational changes.19 The time correlation functions (used interchangeably with the time-dependent covariance) in eq. (1) are composed of 350 PC modes up to ∼40 cm−1 (see Figure 4). As described in the Methods, the constant forces that result from CO photodissociation can be derived from the known conformational changes and the inversion of the covariance matrix through time-independent linear response theory.19 With the derived forces and the time-dependent covariance in eq. (1), the time-dependent conformational changes of Mb can be tracked. The normalized response curves Ψab(t) for the two pairs, W14-L69 and Y146-I99, can then be drawn (Figure 2b). Fitting the data corresponding to the structural-change-associated slow relaxation (after the peak for W14 and after the minimum for Y146) to the exponential function A exp(−t/τ) or B − A exp(−t/τ) reveals that the time constants τ obtained for the two pairs are 39.0 and 3.5 ps, respectively, which are comparable to the time constants of 49.0 and 7.4 ps observed in the time-resolved UVRR experiments.3 It is worth noting that there is a fast rise or drop in Ψab(t) in the first few picoseconds, followed by a slower relaxation (Figure 2c). As shown below, these fast responses can be described by td-LRT using impulse forces.
The relaxation time constants of two further tryptophan mutants, V68W and I28W, have been reported by the UV-TR3 experiment without significant conformational changes in Mb,2 where the anti-Stokes intensities of the tryptophan band give time constants of 3.0 ± 0.4 and 4.0 ± 0.6 ps for V68W and I28W, respectively.2 We use the IF-tdLRT in eq. (2) to model the relaxation dynamics; the impulse force used to mimic the laser pump pulse is a force pointing from CO (if bound) to the FE atom, perpendicular to the heme, plus the forces that model the “heme breathing” (v7) mode.8 Figure 2d shows the reaction coordinates, Ψab(t), for residues V68, I28 and W14, labeled by green triangles, red squares and black circles, respectively. We define the characteristic time tC as the time when Ψab(t) reaches its maximal amplitude. The tC for V68 is 1.6 ps, shorter than that for I28, 2.0 ps. The estimated fast responses are slightly faster than, but still close to, the UV-TR3-observed time constants of 3.0 ± 0.4 and 4.0 ± 0.6 ps, respectively, for the two residues.2 W14 has a tC of 3.4 ps according to our IF-tdLRT, while its signal is too weak to observe in the UV-TR3 experiment.2
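The extraction of a time constant τ from the slow-relaxation segment of a response curve can be sketched with a log-linear fit. This is only an illustration on synthetic data; treating the plateau B of the B − A exp(−t/τ) form as known beforehand is an assumption of this example, and the function names are hypothetical.

```python
import numpy as np

def fit_decay(t_ps, y):
    """Fit y = A*exp(-t/tau) by linear regression on log(y).
    Requires y > 0 over the fitted window. Returns (A, tau)."""
    slope, intercept = np.polyfit(t_ps, np.log(y), 1)
    return float(np.exp(intercept)), float(-1.0 / slope)

def fit_recovery(t_ps, y, plateau):
    """Fit y = B - A*exp(-t/tau) with a known plateau B (assumption)
    by fitting the decay of B - y. Returns (A, tau)."""
    return fit_decay(t_ps, plateau - np.asarray(y, dtype=float))

# Synthetic curves built with known time constants.
t_slow = np.linspace(5.0, 100.0, 60)
amp, tau = fit_decay(t_slow, 0.8 * np.exp(-t_slow / 39.0))

t_fast = np.linspace(0.0, 15.0, 40)
amp2, tau2 = fit_recovery(t_fast, 1.0 - 0.5 * np.exp(-t_fast / 3.5), 1.0)
```

For noisy data or an unknown plateau, a nonlinear least-squares fit of all three parameters would be the more robust choice.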
Inter-helix signals propagate faster than intra-helix ones
The characteristic time tC that indicates the arrival time of mechanical signals to individual atoms or residues enables us to examine how fast such signals propagate through secondary and tertiary protein structures. We quantify the signal propagation between (inter-) or within (intra-) α-helices, corresponding to perturbed sites and sensing sites located at different or the same helices, respectively.
It can be seen in Figure 3a that the intra-helical characteristic time (see symbol definition in Figure 3) increases linearly with dP with a correlation coefficient of 0.71, where the linear regression gives a speed of 529 m/s, within a factor of two of Leitner’s earlier estimate, 10 Å/ps (=1000 m/s). [Leitner’s 2001 PRL] In comparison with the intra-helix signals, the characteristic time for signal propagation across different helices is much shorter than its intra-helical counterpart. The linear regression gives a speed of 4076 m/s, falling between that in water (1482 m/s)40 and that in steel (5930 m/s),41 for the inter-helical communication with a correlation coefficient of 0.16. The low linearity suggests a more topology-dependent nature of the inter-helical signal propagation than that of the intra-helical case. These results imply that the packing of secondary structures can significantly accelerate signal propagation, as exemplified by the rigid molecule myoglobin.
The signal propagation mediated by selected modes
As a property derived from the time-correlation functions used in IF-tdLRT, the characteristic time should be a function of the constituent PC modes. To understand the mode dependence of the characteristic time of the residues of interest, and therefore of the inferred propagation, we calculate the mean characteristic time from the time-dependent covariance matrix that comprises all the PC modes slower than a “cutoff frequency”, ωmax, which ranges from 12 to 1100 cm−1. Figure 4 shows the mean characteristic times of the four residues examined in the UV-TR3 experiments2, 3 as a function of ωmax when the perturbation is introduced at the FE atom of the heme from 21 evenly distributed forces. When taking all the modes ≤1100 cm−1, we obtain mean characteristic times for I28, V68 and Y146 of 3.0 ± 0.4, 2.4 ± 0.2 and 2.4 ± 0.3 ps, respectively, which are compatible with the UV-TR3 results of 4.0 ± 0.6 and 3.0 ± 0.4 ps (ref. 2) and 2.0 ± 0.8 ps (ref. 3), respectively. However, we should note again that the experimental data for I28 and V68 were obtained from photoexcited Mb involving no conformational changes, while that for Y146 was measured from CO-photolyzed Mb with accompanying conformational changes, which involve multiple force perturbations on and near the heme (instead of only on the heme). It can be seen in Figure 4a that the mean characteristic time does not vary much until ωmax drops below ∼40 cm−1, where monotonic increases of the mean characteristic time become apparent in all four residues. On the other hand, PC modes above 100 cm−1 barely influence the mean characteristic time. This could be because these modes are spatially localized and do not propagate the signals.
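The dependence of the characteristic time on the cutoff frequency can be illustrated with a mode-truncated impulse response. The sketch below assumes mass-weighted coordinates, uses the textbook impulse response of a Langevin-damped oscillator for each mode (with σ2ω2 = kBT absorbed), and converts cm−1 to rad/ps with 2πc for both ω and β; these choices and all names are assumptions of this example, not the authors' implementation.

```python
import numpy as np

TWO_PI_C = 2.0 * np.pi * 2.99792458e-2   # converts cm^-1 to rad/ps

def mode_impulse_kernel(t, wavenumber, friction=27.0):
    """Response of one damped mode to a unit impulse at t = 0."""
    w, b = wavenumber * TWO_PI_C, friction * TWO_PI_C
    disc = w * w - 0.25 * b * b
    if disc > 0:                                    # underdamped
        wp = np.sqrt(disc)
        return np.exp(-0.5 * b * t) * np.sin(wp * t) / wp
    if disc < 0:                                    # overdamped, stable form
        wp = np.sqrt(-disc)
        return (np.exp((wp - 0.5 * b) * t)
                - np.exp(-(wp + 0.5 * b) * t)) / (2.0 * wp)
    return t * np.exp(-0.5 * b * t)                 # critically damped

def impulse_response(t, modes, wavenumbers, i, j, w_max, friction=27.0):
    """Response of coordinate i to a unit impulse on coordinate j using
    only the modes with wavenumber <= w_max."""
    resp = np.zeros_like(t, dtype=float)
    for m, nu in enumerate(wavenumbers):
        if nu <= w_max:
            resp += modes[i, m] * modes[j, m] * mode_impulse_kernel(t, nu, friction)
    return resp

def characteristic_time(t, resp):
    """Time at which the magnitude of the response peaks."""
    return float(t[np.argmax(np.abs(resp))])

# Toy system: two coordinates coupled through a single 40 cm^-1 mode.
t = np.linspace(0.0, 5.0, 50001)
modes = np.array([[0.6], [0.8]])
tc_all = characteristic_time(t, impulse_response(t, modes, [40.0], 0, 1, 100.0))
tc_cut = characteristic_time(t, impulse_response(t, modes, [40.0], 0, 1, 30.0))
```

Scanning `w_max` over a range of cutoffs and recording the characteristic time at each value gives a curve of the kind described for Figure 4.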
The intramolecular communication centers are evolutionarily conserved
To investigate whether the residues that frequently mediate the vibrational signals have any functional or evolutionary importance, we define a set of residues as communication centers (CCs) (see Methods), which frequently communicate signals in multiple pathways. The CCs are functions of the perturbed sites. Perturbing the heme-binding residues located within 6 Å of the ferrous atom, F43, H64, V68, L89, H93, H97 and I99 (yellow balls in Fig. 5b), results in a communication map (CM) (Fig. 5a; see Methods). The CCs are the 20 residues with the highest communication scores (CS; see Methods), rank-ordered as V68, A134, A84, A130, G124, T39, G25, A71, A94, A90, A127, S58, H24, A143, G5, Y146, D141, K102, L115 and V114 (Figure 4b). On the other hand, we identify the evolutionarily conserved residues in Mb based on the multiple sequence alignment results in the ConSurf database,42, 43 where residues are grouped into nine conservation levels by their “ConSurf scores” from 1 (most diverse) to 9 (most conserved), color-coded from blue to red in Figure 5b, respectively.
We then examine the distribution of the ConSurf scores for all the residues as well as for the CCs. The normalized histograms of ConSurf scores (Figure 5c) show that the average ConSurf score of the CCs is 6.4, larger than the average ConSurf score of all the residues, 5.3. Also, 60% of the CCs have a ConSurf score greater than or equal to 7, while only 38% of the residues in Mb meet the same criterion. Similarly, we found that the CCs are generally more conserved than average residues in dihydrofolate reductase (DHFR), which catalyzes the reduction of dihydrofolate (DHF) into tetrahydrofolate (THF) in the presence of the cofactor NADPH, producing NADP+.44, 45 Here, the CCs for DHFR are D122, V13, G121, A29, A26, T123, G15, Y111, I155, P31, L4, L110, I14, V40, T35, G95, P39, I94, I61, M92 and E17, rank-ordered by their CS. The average ConSurf score of the CCs is 7.3, in contrast with 5.5 for all the residues in DHFR. Interestingly, D122 and G121 are the CCs with the highest CS, and these two residues are known allosteric sites (>16Å from the active site) whose mutations significantly impair the hydride transfer rate.46-48 These results suggest that CCs, bearing mechanical/communication importance, are evolutionarily conserved. CCs in DHFR were also found earlier to be evolutionarily conserved.53
Communication centers are not co-localized with functional mechanical hinges and folding cores with high local packing density
We further ask whether these conserved communication centers are a natural consequence of their important structural and mechanical roles as folding cores49 or as mechanical hinges.49, 50 Among the 20 CCs in Mb, we found that seven are folding cores (35% of CCs), namely G25, A90, A94, V114, A127, A130 and Y146, which are identified by the peaks of the fastest GNM modes,49 and that residue A71 is identified as a mechanical hinge by the slowest GNM mode (see Figure 6).50, 51 The other 12 CCs, including the top three communicators V68, A134 and A84 (highlighted in Figure 6), are neither folding cores nor mechanical hinges, suggesting that these CCs are not properties readily derivable from proteins’ structural topology and low-frequency mechanics. Among these 12 CCs, T39, S58, V68, A134 and D141, which have a ConSurf score ≥ 8, are highly conserved, while S58 (ConSurf score = 9), A134 and D141 are not anywhere close to the heme.
Discussion
We have developed two td-LRTs, CF-tdLRT and IF-tdLRT, to model the long-time relaxation (involving conformational changes) and the short-time relaxation (without conformational changes), respectively. Although the initial stage of the electronic excitation within the heme group involves quantum effects,5 the subsequent relaxation process in the protein environment seems to be qualitatively captured by our classical approach, which describes the time characteristics of the energy transfer measured by the UVRR experiment. In this work, we also show that the PC modes can serve as a well-defined basis set to mediate the vibrational energy using the PCA-based tdLRTs. Three substantial issues are investigated in this study: 1. How do we model the quantum external perturbation using a classical approach? 2. What kinds (frequency ranges of modes) of relaxation motions of the measured residues are captured by the time-dependent UVRR spectra? 3. Are there essential residues in charge of the energy flow pathways?
In this study, the external perturbations are modeled by two kinds of classical forces: a constant force induced by the conformational change, and a point impulse force at the FE atom along the direction perpendicular to the heme or along evenly distributed directions. First, according to Reference 5, the laser pulse excites the electronic state of the FE atom, and the excess heat then relaxes first to the vibrational states of the heme. This process is effectively modeled by our IF-tdLRT using a point impulse force at the FE atom along the direction perpendicular to the heme. A reasonable speculation is that a good “effective classical force” can induce vibrational excitations of the heme similar to those excited by the laser pulse (Figure 1c and Figure 3). Second, the time constants of the measured residues could be attributed to the change in their hydrophobic environment.3 In this work, we represent the hydrophobic environment of a measured residue using the displacement between the center of mass of the measured residue and that of its closest hydrophobic residue. As shown in Figures 1a and 1b, this approach estimates the time constants appropriately. Alternatively, because the corresponding band in the UVRR spectrum of a measured residue could also be dominated by a specific vibrational mode,2 another possibility is to monitor the measured residue itself. In this respect, we also measure the mean characteristic time of a residue, shown in Figure 3. For the full-mode case, the mean characteristic times of I28, V68 and Y146 are 3.0 ± 0.4, 2.4 ± 0.2 and 2.4 ± 0.3 ps, respectively, which are compatible with the fast-response results of the UVRR experiment, 4.7 ± 1.3, 3.0 ± 0.7 and 2.0 ± 0.8 ps, respectively. Furthermore, we address the third issue by building the time-dependent covariance matrix in eq. (1) and eq. (2) using selected modes.
It has been shown in Figure 3 that the deletion of fast PC modes above 200 cm−1 affects the time constants only slightly, while the deletion of PC modes above around 40 cm−1 causes the time constants to fluctuate significantly. The results suggest that the medium-frequency modes between 40 and 200 cm−1 play an important role in propagating signals. However, this does not rule out other possible mechanisms mediating the vibrational energy flow, such as mode resonance or mode coupling.
Supporting Materials
Supporting Methods
A general time-dependent linear response theory
The dynamics of the unperturbed system at equilibrium is governed by the Hamiltonian H0. The perturbed system is subject to a force $\mathbf{f}_j(t)$ applied on atom j, and its Hamiltonian is $H = H_0 - \sum_j \mathbf{f}_j(t)\cdot\Delta\mathbf{r}_j$, where $\Delta\mathbf{r}_j$ is the deviation of atom j from its mean position due to the external force $\mathbf{f}_j$. The time progression of the positional changes of atom i under external forces is of the form19 $\langle \Delta\mathbf{r}_i(t)\rangle_f = \frac{1}{k_B T}\sum_j \int_0^t \langle \Delta\mathbf{r}_i(t-t')\,\Delta\dot{\mathbf{r}}_j^{T}(0)\rangle_0\,\mathbf{f}_j(t')\,dt'$ (S1), where kB and T are the Boltzmann constant and temperature, respectively, and $\Delta\dot{\mathbf{r}}_j(0)$ is the velocity of atom j at the moment when the external forces, $\mathbf{f}_j$, are applied. $\langle \Delta\mathbf{r}_i(t)\,\Delta\dot{\mathbf{r}}_j^{T}(0)\rangle_0$ is the velocity-position time-correlation function sampled in the absence of perturbations, denoted by the subscript 0, which can be expressed in the normal-mode space, where modes are treated as independent 1-D harmonic oscillators under solvent damping using the Langevin equation.36 The detailed derivation can be found in our previous work.19
The constant force time-dependent linear response theory (CF-tdLRT)
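Substituting the force term in eq. (S1) with a time-invariant constant force $\mathbf{f}_j$, one can re-formulate eq. (S1) as $\langle \Delta\mathbf{r}_i(t)\rangle_f = \frac{1}{k_B T}\sum_j \left( \langle \Delta\mathbf{r}_i(0)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0 - \langle \Delta\mathbf{r}_i(t)\,\Delta\mathbf{r}_j^{T}(0)\rangle_0 \right)\mathbf{f}_j$ (S2). Let $\tilde{\omega}_m = \sqrt{\beta^2/4 - \omega_m^2}$, where β is the solvent friction and ωm is the frequency of the PC mode m; for overdamped modes, 2ωm < β, one can derive the contribution of mode m to the time-dependent covariance as $\alpha_{im}\alpha_{jm}\,\sigma_m^2\,e^{-\beta t/2}\left[\cosh(\tilde{\omega}_m t) + \frac{\beta}{2\tilde{\omega}_m}\sinh(\tilde{\omega}_m t)\right]$, where αjm and αim are the j-th and i-th components of the m-th PC mode and $\sigma_m^2$ is the variance, $\sigma_m^2 = k_B T/\omega_m^2$; for an underdamped PC mode (2ωm > β), the contribution is $\alpha_{im}\alpha_{jm}\,\sigma_m^2\,e^{-\beta t/2}\left[\cos(\tilde{\omega}_m t) + \frac{\beta}{2\tilde{\omega}_m}\sin(\tilde{\omega}_m t)\right]$ (with $\tilde{\omega}_m = \sqrt{\omega_m^2 - \beta^2/4}$ here). When time goes to infinity, the time-dependent covariance term vanishes and eq. (S2) returns to the time-independent form.29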
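The step from the general response to the constant-force form can be written out explicitly. The derivation below is a sketch that assumes the general response is a convolution of the force with the equilibrium velocity-position correlation function, and that the unperturbed dynamics is stationary so that $\langle \Delta r_i(u)\,\Delta\dot r_j(0)\rangle_0 = -\frac{d}{du}\langle \Delta r_i(u)\,\Delta r_j(0)\rangle_0$; vector and transpose notation is dropped for brevity.

```latex
\begin{align}
\langle \Delta r_i(t) \rangle_f
  &= \frac{1}{k_B T}\sum_j f_j \int_0^t
     \langle \Delta r_i(t-t')\,\Delta \dot r_j(0)\rangle_0 \, dt' \\
  &= \frac{1}{k_B T}\sum_j f_j \int_0^t
     \left(-\frac{d}{du}\langle \Delta r_i(u)\,\Delta r_j(0)\rangle_0\right) du
     \qquad (u = t - t') \\
  &= \frac{1}{k_B T}\sum_j
     \left(\langle \Delta r_i(0)\,\Delta r_j(0)\rangle_0
         - \langle \Delta r_i(t)\,\Delta r_j(0)\rangle_0\right) f_j .
\end{align}
```

The last line has the static covariance as its long-time limit, because the time-dependent covariance of a damped system decays to zero.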
The impulse force time-dependent linear response theory (IF-tdLRT)
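Let the force in eq. (S1) be a delta function (the “impulse force”), $\mathbf{f}_j(t) = \mathbf{k}_j\,\delta(t)$; the IF-tdLRT then takes the form $\langle \Delta\mathbf{r}_i(t)\rangle_f = \frac{1}{k_B T}\sum_j \langle \Delta\mathbf{r}_i(t)\,\Delta\dot{\mathbf{r}}_j^{T}(0)\rangle_0\,\mathbf{k}_j$, in which the contribution of mode m to the velocity-position correlation function is $\alpha_{im}\alpha_{jm}\,\frac{k_B T}{\tilde{\omega}_m}\,e^{-\beta t/2}\sin(\tilde{\omega}_m t)$ when 2ωm > β (with $\tilde{\omega}_m = \sqrt{\omega_m^2 - \beta^2/4}$). When 2ωm < β (let $\tilde{\omega}_m = \sqrt{\beta^2/4 - \omega_m^2}$; note that sinh θ = −i sin iθ), the contribution becomes $\alpha_{im}\alpha_{jm}\,\frac{k_B T}{\tilde{\omega}_m}\,e^{-\beta t/2}\sinh(\tilde{\omega}_m t)$.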
LRT for systems evolving on a “harmonic energy surface” can be directly derived without the need to assume small perturbations
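The validity of the aforementioned time-independent linear response theory (ti-LRT) equality29 rests on the perturbation term, f·Δr, being sufficiently small compared with kBT. In general, this is valid if the structural changes are small, as in the myoglobin case, for which the NMA treatment of the unperturbed covariance matrix is considered suitable.19 When the induced structural changes are large, however, this condition may not hold. We show below that the theory still holds when the protein conformational changes are not negligibly small. This serves as the foundation for applying the theory to MD-derived trajectories subject to quasi-harmonic analyses.
The unperturbed Hamiltonian Ξ0, though not necessarily at a local minimum, can be approximated harmonically about an energy minimum defined by a classical force field or an elastic potential, such that $\Xi_0 \approx \Xi_0' + \frac{1}{2}\sum_{i,j}\sum_{\gamma,\delta} \Delta r_{i\gamma}\,H_{ij\gamma\delta}\,\Delta r_{j\delta}$. Let $\beta = 1/k_B T$ (in this section β denotes the inverse temperature rather than the solvent friction); Ξ0’ is the energy minimum about which the harmonic approximation of the potential takes place. Hijγδ are the components of the Hessian (force constant matrix) H; γ and δ denote the Cartesian components x, y and z. The constant term vanishes, as it factors out of both the numerator and the denominator of the ensemble average. Assuming fj is readily available (possibly from the standard version of ti-LRT), to avoid the restriction that the perturbation has to be small, as in the standard LRT case, we directly integrate the ensemble average (now rewritten in its matrix-vector form) such that $\langle \Delta R\rangle_f = \frac{\int \Delta R\,\exp\left(-\frac{\beta}{2}\Delta R^{T} H \Delta R + \beta f^{T}\Delta R\right) d\Delta R}{\int \exp\left(-\frac{\beta}{2}\Delta R^{T} H \Delta R + \beta f^{T}\Delta R\right) d\Delta R}$ (S9), where the conformational changes ΔR and the forces f are 3N-dimensional column vectors. For the definite integrals in the numerator and the denominator, the equalities $\int x\,e^{-\frac{1}{2}x^{T}Ax + B^{T}x}\,d^{n}x = \sqrt{\frac{(2\pi)^{n}}{\det A}}\;e^{\frac{1}{2}B^{T}A^{-1}B}\,A^{-1}B$ (S10) and $\int e^{-\frac{1}{2}x^{T}Ax + B^{T}x}\,d^{n}x = \sqrt{\frac{(2\pi)^{n}}{\det A}}\;e^{\frac{1}{2}B^{T}A^{-1}B}$ (S11) can be used, respectively.
Using Eq. S10, the numerator of (S9) becomes $\sqrt{\frac{(2\pi)^{3N}}{\det(\beta H)}}\;e^{\frac{\beta}{2}f^{T}H^{-1}f}\,H^{-1}f$ (S12). Using Eq. S11, the denominator becomes $\sqrt{\frac{(2\pi)^{3N}}{\det(\beta H)}}\;e^{\frac{\beta}{2}f^{T}H^{-1}f}$ (S13). Eq. (S9) is the ratio of (S12) to (S13), which is $A^{-1}B = H^{-1}f$, where x can be viewed as ΔR, A as βH, B as βf and $e^{B^{T}x}$ as exp(+βfT ΔR). According to the NMA theories,49, 50 $C = k_B T\,H^{-1}$ (where the elements of C are Cijγδ = <Δriγ Δrjδ >) if the protein’s Hamiltonian is at the energy minimum and its adjacent potential surface is harmonically approximated.
Hence, $\langle \Delta R\rangle_f = H^{-1}f = \frac{1}{k_B T}\,C f$, which arrives at the same formula as shown before.19, 27-29
Therefore, when the perturbation is not necessarily small (e.g. the ligand- or protein-induced conformational changes that can be seen in long MD simulations or coarse-grained models), the same equation can be used if the potential is harmonically approximated.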
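The claim that the linear-response result is exact on a harmonic surface, however large the force, can be checked numerically in one dimension. The sketch below evaluates the Boltzmann-weighted mean displacement by direct summation on a grid and compares it with f/k, which equals C f / kBT for C = kBT/k. The grid parameters, force constant and function name are arbitrary choices made for this illustration.

```python
import numpy as np

def mean_shift_harmonic(k, f, kbt=0.596, half_width=60.0, n=400001):
    """Mean displacement <x> under U(x) = k*x^2/2 - f*x at thermal
    energy kbt, from a Riemann sum of the Boltzmann weight on a grid."""
    x = np.linspace(-half_width, half_width, n)
    log_w = -(0.5 * k * x**2 - f * x) / kbt
    w = np.exp(log_w - log_w.max())      # subtract max to avoid overflow
    return float(np.sum(x * w) / np.sum(w))

k = 2.0                                  # force constant, kcal/mol/Å^2
small = mean_shift_harmonic(k, 0.1)      # weak force
large = mean_shift_harmonic(k, 20.0)     # shift of about 18 standard deviations
```

Both the weak and the strong force give f/k, so the predicted shift does not rely on the perturbation being small as long as the potential stays harmonic.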