Abstract
The free energy of transferring amino acid side–chains from an aqueous environment into lipid bilayers, known as the transfer free energy (TFE), provides important information on the thermodynamic stability of membrane proteins. In this study, we derived a TFE profile named General Transfer Free Energy Profile (GeTFEP) based on computation of the TFEs of 58 β–barrel membrane proteins (βMPs). The GeTFEP agrees well with experimentally measured and computationally derived TFEs. Analysis based on the GeTFEP shows that residues in different regions of the transmembrane (TM) segments of βMPs have different roles during the membrane insertion process. Results further reveal the importance of the sequence pattern of TM strands in stabilizing βMPs in the membrane environment. In addition, we show that GeTFEP can be used to predict the positioning and the orientation of βMPs in the membrane. We also show that GeTFEP can be used to identify structurally or functionally important amino acid residue sites of βMPs. Furthermore, the TM segments of α–helical membrane proteins can be accurately predicted with GeTFEP, suggesting that the GeTFEP captures fundamental thermodynamic properties of amino acid residues inside the membrane, and is of general applicability in studying membrane proteins.
1 Introduction
Membrane proteins play important roles in cellular metabolism, signaling regulation, and intercellular interactions.1 Knowledge of the thermodynamic stability of membrane proteins is essential for understanding their folding behavior and their structure–function relationship.2–5 A widely used measure for estimating the stability of membrane proteins is the transfer free energy (TFE), which quantifies the free energy of transferring an amino acid residue from an aqueous environment into a lipid bilayer.6–11
Often called hydrophobicity scales, transfer free energies have been measured experimentally based on several model systems. The Wimley–White whole residue scale (WW–scale) measures TFEs of residue partitioning between water and octanol using a set of peptides as the host of amino acids.8 The biological scale (Bio–scale) of Hessa et al. measures the free energies required to transfer residues in polypeptides into the ER membrane through the translocon machinery.9 The Moon–Fleming whole protein scale (MF–scale) measures TFEs of residues from water to the membrane core in the context of a whole β–barrel membrane protein (βMP).10 These experimentally obtained hydrophobicity scales have led to improved understanding of the structures and functions of membrane proteins12 and have been used in the prediction of transmembrane (TM) segments of membrane proteins.13
However, experimental measurement of TFEs is technically challenging, cumbersome, and costly.14,15 Complementing experimentally measured transfer free energies, several hydrophobicity scales have been derived computationally, which can aid in our understanding of the governing principles of membrane protein folding.2,4,16 The EZα and EZβ empirical potentials are knowledge–based hydrophobicity scales. They have been successfully applied in predicting the positioning of membrane proteins in the lipid bilayer, in discriminating side–chain decoys, and in identifying protein–lipid interfaces.17,18 However, these scales obtained from statistical analysis do not consider the physical interactions either between residues from neighboring helices/strands or within the same helix/strand, which are known to be important for membrane protein folding.19,20 There have also been studies based on molecular dynamics (MD) simulations to calculate TFEs,21–23 although the choice of the reference state before membrane insertion remains a challenging task.22
Another method of deriving TFEs computationally was developed for βMPs recently.11 This method is based on a statistical mechanical model in a discretized conformational space. It incorporates both intra– and inter–strand interactions in the TM segments of the proteins. It can be used to calculate TFEs of any lipid–facing residue in the TM segment of a βMP, as long as the number of TM strands of the protein is no more than 12. The computed TFE scale (OmpLA scale) is in excellent agreement with the MF–scale with a Pearson correlation coefficient r = 0.90. This scale has been applied successfully to explain how the functional fold and topology of the βMP are determined by the asymmetry of both the Gram–negative bacterial outer membrane and the TM residues.11 A further algorithmic extension of this method has greatly reduced the computational cost, enabling the calculation of TFEs on all βMPs known so far, regardless of their size, with little loss of accuracy.24
In this study, we use the new algorithm24 to compute the depth–dependent TFE profile of each βMP in a non–redundant set of 58 βMPs. After examining their overall patterns, we found that there exists a general TFE profile applicable to all βMPs, which we call the General Transfer Free Energy Profile (GeTFEP). The GeTFEP agrees well with previously measured and computed TFEs. Analysis based on GeTFEP shows that residues in different regions of the TM segment have different roles during the membrane insertion process. Our results further reveal the importance of the sequence pattern of TM segments in stabilizing βMPs in the membrane environment. In addition, we show that GeTFEP can be used to predict the positioning and orientation of βMPs when embedded in the membrane, with overall results in good agreement with experimental data. Furthermore, we show that the GeTFEP can be used to locate structurally or functionally important sites of βMPs. Finally, TM segments of α–helical membrane proteins can also be accurately predicted using the GeTFEP, suggesting that the GeTFEP captures fundamental thermodynamic properties of amino acid residues inside the membrane, and has general applicability in studying membrane proteins.
2 Results
GeTFEP: General Transfer Free Energy Profile
Computation of TFE profiles of βMPs
Using the methods described in Ref [24], we calculated the depth–dependent TFE profile of each βMP in a non–redundant set of 58 βMPs. The proteins in this set have ≤ 30% pairwise sequence similarity. Briefly, for each βMP, we substituted each lipid–facing residue in the TM region with each of the other 19 amino acids. We calculated the TFE of each amino acid substitution using Ala as the reference. The TFE profile of the protein was then obtained by averaging the TFE values of the same amino acid type at the same depth position in the membrane. As an example, Fig. 1 shows the computed TFE profile of the protein LptD, the largest βMP with known structure (PDB ID: 4q35).
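The averaging step can be illustrated with a minimal Python sketch. It assumes that each computed substitution is stored as an (amino acid, depth, TFE) record; the function name `tfe_profile` and all numerical values are hypothetical and only illustrate the bookkeeping, not the calculation of the TFEs themselves.

```python
from collections import defaultdict


def tfe_profile(records):
    """Average TFE values that share the same amino acid type and depth.

    records: iterable of (amino_acid, depth, tfe) tuples, one per
    substitution at a lipid-facing TM site (TFE relative to Ala).
    Returns a dict mapping (amino_acid, depth) to the mean TFE.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for aa, depth, tfe in records:
        sums[(aa, depth)] += tfe
        counts[(aa, depth)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical values, for illustration only.
example_records = [
    ("L", 0, -1.5),
    ("L", 0, -2.5),
    ("R", 0, 3.0),
    ("L", -3, -1.0),
]
profile = tfe_profile(example_records)
```

Here the two Leu substitutions at depth 0 are merged into a single profile entry, while entries at other depths or for other amino acids are kept separate.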
Derivation of GeTFEP
Although the 58 βMPs are in different oligomerization states, have different sizes (strand numbers) of TM segments, and come from different organisms, their TFE profiles are remarkably similar. Results of clustering their profiles show that the 58 βMPs form a single cluster (with 56 βMPs) and two outliers: α– and γ–hemolysins (PDB IDs: 7ahl and 3b07). Details of the clustering method can be found in the supporting information.
Unlike the other βMPs, the TM regions of both α– and γ–hemolysins are formed by repeated β–hairpins (Fig. S2B), which makes their TFE profiles highly sensitive to the composition of the β–hairpin and the local interactions of residues within the hairpin (Fig. S2 C and D). Accordingly, we further investigated whether α– and γ–hemolysins have truly different thermodynamic properties from other βMPs, or whether their outlier status is due to the special architecture of repeated β–hairpins.
We first computed the TFE profiles of artificially generated hemolysin–like βMPs constructed by repeating each β–hairpin in our βMP set. Altogether, we computed TFE profiles for 778 artificial hemolysin–like βMPs. We then sampled from these profiles with replacement, and computed the distribution of the distance from each sampled profile to the average profile of all sampled artificial βMPs. The distances from the TFE profiles of both α– and γ–hemolysins to the average profile are at the 80th percentile in the distance distribution (Fig. 2B), indicating that α– and γ–hemolysins are not fundamentally different in their thermodynamic properties from other βMPs. Therefore, we conclude that a general transfer free energy profile exists and is applicable to all βMPs, including α– and γ–hemolysins. We derive the General Transfer Free Energy Profile (GeTFEP) by averaging the TFEs of a specific amino acid at the same lipid bilayer depth position for all 58 βMPs (Fig. 2C).
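The resampling test described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: profiles are flattened to numeric vectors, the Euclidean distance is assumed because the metric is not named in the text, and all function names are hypothetical.

```python
import math
import random


def mean_profile(profiles):
    """Element-wise mean of equally long profile vectors."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def distance(p, q):
    # Euclidean distance is an assumption; the text does not name the metric.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def bootstrap_distances(profiles, n_draws, rng):
    """Sample profiles with replacement and return the average sampled
    profile together with the distance of each sample to that average."""
    sampled = [rng.choice(profiles) for _ in range(n_draws)]
    center = mean_profile(sampled)
    return center, [distance(p, center) for p in sampled]


def percentile_rank(value, sample):
    """Percentage of sample values that are less than or equal to value."""
    return 100.0 * sum(1 for s in sample if s <= value) / len(sample)
```

Given the vector of a query protein, `percentile_rank(distance(query, center), distances)` places its distance within the resampled distribution; a rank well inside the distribution argues against the protein being a genuine outlier.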
Comparison with other hydrophobicity scales
We then examine how GeTFEP compares with other hydrophobicity scales. Since most experimentally measured scales are not depth–dependent, we first compare the scale of the TFEs at the hydrocarbon core position of depth 0 in the GeTFEP with other hydrophobicity scales. We refer to this hydrophobicity scale as the mid–GeTFEP scale. The mid–GeTFEP scale correlates well with the experimentally measured hydrophobicity scales, having Pearson correlation coefficients r = 0.83 with the WW–scale, and r = 0.92 with the Bio–scale. It also correlates well with the computational βMP OmpLA scale,11,24 with r = 0.90 (Fig. S3). When compared with the experimentally measured MF–scale of the βMP OmpLA, the mid–GeTFEP scale has a correlation of r = 0.87. One noticeable difference between mid–GeTFEP and the MF–scale is that the TFE value of His is less unfavorable in mid–GeTFEP (Fig. 3A). This is expected since the MF–scale was measured under acidic conditions at pH 3.8, where His was fully protonated.10 The different value in mid–GeTFEP likely reflects the property of His under physiological conditions of the outer membrane.
Another notable difference is Pro. Pro is unfavorable in the membrane environment according to the mid–GeTFEP scale, but favorable according to the MF–scale (Fig. 3A). Pro tends to disrupt the structures of both α–helix and β–sheet, and is thermodynamically unfavorable in the non–polar core of the membrane.25 The value of Pro in the mid–GeTFEP scale is consistent with this general behavior.
We then examined the depth–dependency of the GeTFEP of Arg and Leu, for which experimental results are available.10 Their TFEs at different depth positions of the membrane are in good agreement with the experimentally measured values, with r = 0.87 for Arg and r = 0.75 for Leu (Fig. 3B), suggesting that the GeTFEP captures the depth–dependency of TFEs of amino acids.
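The scale comparisons in this section rest on the Pearson correlation coefficient computed over the amino acids (or depth positions) shared by two scales. A minimal sketch is given below; the dictionary representation of a scale and the function names are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_scales(scale_a, scale_b):
    """Correlate two scales, given as dicts, over their shared keys only."""
    shared = sorted(set(scale_a) & set(scale_b))
    return pearson_r([scale_a[k] for k in shared],
                     [scale_b[k] for k in shared])
```

Restricting the comparison to shared keys matters in practice, because not every scale reports a value for all 20 amino acids.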
Insertion of βMP into membrane
βMP insertion as a thermodynamically driven spontaneous process
Upon synthesis in the cytoplasm, βMPs need to be transported across the periplasm and then folded into the outer membrane. As there is no energy source such as ATP in the periplasm, it was suggested that the free energies of βMP folding provide an adequate source to ensure successful periplasm translocation.26 A computational study showed that the TFEs of lipid–facing residues of the hydrophobic core region are indeed the main driving force for membrane insertion.11 Analysis also showed that lipid–facing residues in the TM regions of βMPs have clear patterns of amino acid composition.27 However, it is still unclear whether the insertion of βMPs into the membrane is primarily due to the extensive property of the hydrophobicity of lipid–facing residues, or whether the specific pattern of amino acid composition also plays an important role.
To investigate this question, we employed a simplified βMP insertion model based on the concerted folding mechanism proposed in Ref [28]. We ignore the effects of non–TM loops and discretize the insertion process into 17 steps (Fig. 4A). We take the position recorded in the widely–used Orientations of Proteins in Membranes (OPM) database29 as the fully inserted position of each βMP. This position is denoted as the reference position 0, and the other positions are indexed accordingly from -8 to +8. βMPs start the insertion process at position -8 from the periplasmic side and become fully inserted into the membrane at position 0. From position 0 to +8, βMPs would translocate across the membrane. We assume that the stability of the TM region of a βMP can be approximated by summing the TFEs of all lipid–facing residues in the membrane region. The stability of the βMP at each position was then calculated using the GeTFEP following this additive model. As an example, Fig. 4B shows the stability of the protein OmpA (PDB ID: 1bxw) at different insertion positions. Overall, results of all βMPs show a funnel–like pattern of insertion energy (Fig. 4C). Most βMPs (52 of 58) have minimum free energy when they are fully inserted into membranes (position 0, Tables S1 and S2). The funnel–like pattern indicates that the insertion of βMPs into outer membranes is indeed a spontaneous process. βMPs become energetically trapped after being fully inserted.
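The additive insertion model can be sketched in a few lines of Python. The sketch assumes that each lipid–facing residue is described by its amino acid type and its depth index in the fully inserted state, that moving the protein to position p shifts every depth index by p, and that residues whose shifted depth falls outside the profile contribute zero. The profile values and residue list below are hypothetical toy data, not GeTFEP values.

```python
def insertion_energy(residues, profile, position):
    """Sum the depth-dependent TFEs of lipid-facing residues at one position.

    residues: list of (amino_acid, depth_when_fully_inserted) pairs.
    profile:  dict mapping (amino_acid, depth) to a TFE value.
    Residues shifted outside the depths covered by the profile are treated
    as being outside the membrane and contribute 0 (an assumption).
    """
    return sum(profile.get((aa, depth0 + position), 0.0)
               for aa, depth0 in residues)


def insertion_curve(residues, profile, positions=range(-8, 9)):
    """Insertion energy at each of the 17 discrete positions."""
    return {p: insertion_energy(residues, profile, p) for p in positions}


# Hypothetical toy profile: "L" favorable in the core (depths -2..2),
# "Y" unfavorable in the core and favorable at depths -4, -3, 3, and 4.
toy_profile = {}
for d in range(-2, 3):
    toy_profile[("L", d)] = -1.0
    toy_profile[("Y", d)] = 0.5
for d in (-4, -3, 3, 4):
    toy_profile[("Y", d)] = -0.5

toy_residues = [("Y", -3), ("L", -1), ("L", 0), ("L", 1), ("Y", 3)]
curve = insertion_curve(toy_residues, toy_profile)
most_stable = min(curve, key=curve.get)
```

For this toy input the curve has its minimum at position 0; with real data, the same loop would be run with the GeTFEP as `profile` and the lipid–facing residues of a given βMP as `residues`.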
Importance of patterns of TM lipid–facing residues in membrane insertion
We then examine whether the funnel–like insertion energy pattern arises from the extensive property of the TFEs of the hydrophobic residues alone. We considered only the 52 βMPs whose minimum free energies are at the fully inserted position. We first shuffled the sequences of the β–strands within the TM segment of each βMP. While the side–chain direction as well as the interstrand hydrogen bond pairing at each residue position in β–strands are maintained, all TM residues are permuted. Each βMP is shuffled 2,000 times. We found that it is highly unfavorable to insert the shuffled βMPs into the membrane. This is expected, since the shuffling changes the hydrophobicity of the TM segments of βMPs. Before the shuffling, the ionizable/polar residues were enriched among lumen–facing residues of βMPs, while lipid–facing residues were mostly apolar. After the shuffling, they were much more evenly distributed.
We then investigate how insertion energy is affected if only the lipid–facing residues are shuffled. While the insertion of the shuffled βMPs remains energetically favorable (see Fig. S4 for an example), shuffled βMPs are less stable compared to the original βMPs at the fully inserted position for 50 out of 52 βMPs: The insertion energy for the shuffled βMPs is on average 6.36 kcal/mol higher (Table S1). In addition, the fully inserted position (position 0) is no longer the most stable position for 17.4% of the shuffled βMPs (Table S1). These results indicate that the locational patterns of lipid–facing residues30 in the TM region are optimized for βMPs to gain stability in the membrane environment.
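The lipid–facing shuffling experiment can be sketched as follows. The sketch assumes that all lipid–facing residues of a protein are pooled and permuted together while their depth positions stay fixed (the text does not state whether the permutation is restricted to individual strands), and it reuses a hypothetical toy profile rather than GeTFEP values.

```python
import random


def additive_energy(residues, profile):
    """Sum of profile TFEs over (amino_acid, depth) pairs; missing entries
    are treated as 0."""
    return sum(profile.get((aa, depth), 0.0) for aa, depth in residues)


def shuffle_lipid_facing(residues, rng):
    """Permute amino acid identities among sites, keeping each site's depth."""
    amino_acids = [aa for aa, _ in residues]
    rng.shuffle(amino_acids)
    return [(aa, depth) for aa, (_, depth) in zip(amino_acids, residues)]


def mean_shuffle_penalty(residues, profile, n_shuffles, rng):
    """Mean energy of shuffled sequences minus the native energy."""
    native = additive_energy(residues, profile)
    shuffled = [additive_energy(shuffle_lipid_facing(residues, rng), profile)
                for _ in range(n_shuffles)]
    return sum(shuffled) / n_shuffles - native


# Hypothetical toy data: "L" favorable in the core, "Y" favorable at the rims.
toy_profile = {("L", d): -1.0 for d in range(-2, 3)}
toy_profile.update({("Y", d): 0.5 for d in range(-2, 3)})
toy_profile.update({("Y", d): -0.5 for d in (-4, -3, 3, 4)})
toy_residues = [("Y", -3), ("L", -1), ("L", 0), ("L", 1), ("Y", 3)]

penalty = mean_shuffle_penalty(toy_residues, toy_profile, 200, random.Random(0))
```

A positive `penalty` means that the shuffled sequences are on average less stable than the native arrangement, which is the quantity reported per protein in the analysis above.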
Roles of residues in different TM regions during membrane insertion
The TM segment of a βMP can be divided into three regions, namely, the periplasmic headgroup region, the hydrophobic core region, and the extracellular headgroup region.27 We investigate how these regions contribute to the insertion energy of the βMP. We found that residues in the same regions across all 52 βMPs shared similar patterns in their insertion free energy profile (Fig. 4C), indicating that they play similar roles in the insertion process. Among these, lipid–facing residues of the extracellular headgroup region facilitate the initiation of the insertion process, as they are energetically favorable in the interfacial region on the periplasmic side (positions −8 and −7). As insertion proceeds, these residues become less favorable and occasionally unfavorable when they become more embedded in the membrane. At this time, lipid–facing residues of the hydrophobic core region start to be inserted in the membrane, and strongly drive the insertion process (positions −6 to −2). When lipid–facing residues of the extracellular headgroup region approach the interfacial region of the extracellular side, they become energetically favorable again. At the same time, lipid–facing residues of the periplasmic headgroup region become inserted (positions −1 and 0), and the TFE of the whole βMP reaches its minimum at position 0.
Although lipid–facing residues of the hydrophobic core region are known to provide the main driving force for membrane insertion of βMPs,11 we found that the TFEs of hydrophobic core region do not reach their minimum when βMPs are fully inserted at position 0 for all 52 βMPs. Upon incorporation of contributions from other regions, the overall TFEs of the whole βMPs indeed reach the minimum at the fully inserted position. The “W” shape of the free energy curves of the two headgroup regions (the red and green curves in Fig. 4C) suggests that lipid–facing residues in these regions act like “energetic latches” to lock βMPs into their fully inserted position.
Prediction of βMP positioning and orientation in the membrane
GeTFEP can be used to predict positioning and orientation of βMPs in the membrane, similarly to previous studies.17,18 Here, the membrane is idealized as an infinite slab with a thickness of h. Each βMP is initially positioned in the membrane with its center of mass of the barrel domain at the midplane of the membrane and its barrel axis aligned with the normal direction (z–axis) of the membrane (Fig. 5A). The protein can be rotated around the x– and y–axes with angles θx and θy, respectively. The two rotation angles together determine the tilt angle of the protein. The protein can also be translated with a displacement dz. This displacement and the membrane thickness determine the TM segment of the protein. When embedded in the membrane, the lipid–facing residues of the TM region and the loop residues are used to calculate the total energy of the βMP using the GeTFEP. As an example, Fig. 5B shows how rotation angles θx and θy affect the stability of the protein BtuB (PDB ID: 1nqe) when the displacement dz and the membrane thickness h are fixed.
We systematically examine the combinations of the parameters θx, θy, dz, and h. A βMP is predicted to take the position and the orientation at which the lowest free energy is reached. The predicted protein tilt angles of all 58 βMPs correlate well (r = 0.76) with OPM records.29 The average protein tilt angle of 7.3° is consistent with that of 6.2±1.8° recorded in the OPM. The strand tilt angles and the membrane thickness predicted are again in good agreement with experimentally determined results (Table 1).
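The parameter scan can be sketched as an exhaustive grid search. The sketch makes several simplifying assumptions that are not taken from the text: residues are represented by a single coordinate each, only residues inside the slab contribute to the energy (loop residues are ignored), and the energy lookup is passed in as a user–supplied function `energy_at(aa, z, h)`. All names are hypothetical.

```python
import itertools
import math


def rotate_xy(point, theta_x, theta_y):
    """Rotate a point about the x-axis by theta_x, then about the y-axis
    by theta_y (angles in radians)."""
    x, y, z = point
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    y, z = y * cx - z * sx, y * sx + z * cx
    cy, sy = math.cos(theta_y), math.sin(theta_y)
    x, z = x * cy + z * sy, -x * sy + z * cy
    return x, y, z


def tilt_angle(theta_x, theta_y):
    """Angle between the rotated barrel axis and the membrane normal."""
    cos_tilt = math.cos(theta_x) * math.cos(theta_y)
    return math.acos(max(-1.0, min(1.0, cos_tilt)))


def placement_energy(residues, energy_at, theta_x, theta_y, dz, h):
    """Total energy of residues lying inside a slab of thickness h
    centered at z = 0 after rotation and translation along z."""
    total = 0.0
    for aa, coord in residues:
        z = rotate_xy(coord, theta_x, theta_y)[2] + dz
        if abs(z) <= h / 2:
            total += energy_at(aa, z, h)
    return total


def best_placement(residues, energy_at, angles, shifts, thicknesses):
    """Return the (theta_x, theta_y, dz, h) combination of lowest energy."""
    grid = itertools.product(angles, angles, shifts, thicknesses)
    return min(grid, key=lambda g: placement_energy(residues, energy_at, *g))
```

In an application, `energy_at` would map the z coordinate of a residue to a depth bin of the GeTFEP for the given thickness; how that binning is done is left open here because it is not specified in this section.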
Prediction of structurally and functionally important sites of βMPs
While overall the computed TFEs of lipid–facing residues of βMPs follow the general pattern of the GeTFEP, the TFE values of a specific residue in a particular βMP can deviate significantly from values in the general profile (see SI for details). Among all 3,500 lipid–facing residues in the TM segments of all 58 βMPs, we find that 305 or 8.7% of the residues have TFE values that deviate significantly from the GeTFEP. Since lipid–facing residues are overall the major contributors to the stability of βMPs as discussed above, a deviation from the general profile indicates that the residue is likely to have important roles other than providing stability. To understand the origin of these deviations, we examined three proteins in detail, namely, OmpLA, PagP, and PagL, which have sufficient experimental information. We found that most deviant residues either have functional roles or have local structures quite different from residues in the canonical model of β–barrels (Table 2).
Among the deviant residues in OmpLA, 142H and 156N are both in the catalytic triad,32,33 which is essential for its phospholipase activity; 40L and 92Y are the sites where substrates bind.34 Furthermore, the deviant residue 116P interacts with 92Y and 142H through hydrogen bonds. Among the deviant residues in PagP, 69L interacts with the out–clamp α–helix of PagP;35 27I and 125L are both at the lateral routes where β–hydrogen bonding is absent (Fig. 6), which ensures that substrates can access the protein interior so that PagP can carry out its enzymatic functions.36 In PagL, the deviant residue 108I is in the ligand binding site,37 and 126H is part of the catalytic triad of its enzymatic site.38
As the calculation of TFEs does not require knowledge of 3D structures of βMPs, our results suggest that deviation analysis can help to discover functional sites and/or structurally anomalous sites using sequence information only. While our analysis is restricted to three proteins due to the limited experimental data available, we believe that deviant residues overall play special roles either in performing biological function or in maintaining the unique structural form of βMPs.
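The idea of flagging deviant residues can be illustrated with a simple outlier test. The actual significance criterion is given in the SI and is not reproduced here; the sketch below assumes, purely for illustration, that a site is flagged when the difference between its protein–specific TFE and the general profile lies more than `z_cutoff` standard deviations from the mean difference. The function name, the cutoff, and the data layout are hypothetical.

```python
from statistics import mean, stdev


def deviant_sites(site_tfes, general_profile, z_cutoff=2.0):
    """Flag sites whose TFE departs strongly from the general profile.

    site_tfes:       dict mapping site_id to (amino_acid, depth, tfe).
    general_profile: dict mapping (amino_acid, depth) to the general TFE.
    The z-score rule is an illustrative assumption, not the SI criterion.
    """
    residuals = {site: tfe - general_profile[(aa, depth)]
                 for site, (aa, depth, tfe) in site_tfes.items()}
    mu = mean(residuals.values())
    sd = stdev(residuals.values())
    return sorted(site for site, r in residuals.items()
                  if abs(r - mu) > z_cutoff * sd)
```

Sites returned by such a test would then be inspected individually, as was done above for OmpLA, PagP, and PagL.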
GeTFEP can predict TM region of α–helical membrane proteins
Although the MF–scale was measured in the βMP system, it was suggested that the scale is also applicable to the TM region of α–helical membrane proteins (αMPs), since the MF–scale has a strong correlation with the nonpolar solvent accessible surface areas of the residues.10 We hypothesize that the GeTFEP may also reflect fundamental thermodynamic properties of transferring side–chains of amino acids to the membrane environment, regardless of whether the residue is in a β–barrel or an α–helical membrane protein. We carried out the standard hydropathy analysis39 using the Membrane Protein Explorer (MPEx) program40 on 131 αMPs obtained from the MPTopo database.41 Since MPEx uses depth–independent hydrophobicity scales, we used the mid–GeTFEP scale for our calculation.
The results show that this simple analysis using the mid–GeTFEP scale correctly predicts both the TM regions and the numbers of the TM segments for 90 or ∼69% of the 131 αMPs in the dataset. This compares favorably to other hydrophobicity scales, including those measured or derived from αMPs (Table 3). For most of the remaining 41 proteins, GeTFEP correctly predicted the TM regions, but predicted the numbers of the TM segments incorrectly due to the ambiguity in assigning whether two consecutive TM segments should be considered as one TM segment (see Fig. S5B for an example). Examination of the number of TM residues correctly predicted by the mid–GeTFEP scale shows that it achieves a precision of ∼85% and a recall of ∼71%, which compares favorably to other hydrophobicity scales (Table 3). These results suggest that the GeTFEP reflects fundamental thermodynamic properties of amino acid residues inside the membrane, and can be used to study the general stability of both α–helical and β–barrel membrane proteins.
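A sliding–window hydropathy analysis and the residue–level precision and recall can be sketched as follows. This is a simplified stand–in and not the MPEx implementation: the window length, the threshold, and the rule that every residue of a favorable window is predicted as TM are assumptions, and the two–letter toy scale is hypothetical.

```python
def window_sums(sequence, scale, window=19):
    """Sum of per-residue free energies in each sliding window."""
    values = [scale[aa] for aa in sequence]
    return [sum(values[i:i + window])
            for i in range(len(values) - window + 1)]


def predict_tm_residues(sequence, scale, window=19, threshold=0.0):
    """Indices of residues covered by at least one window whose summed
    free energy is below the threshold (i.e., favorable for insertion)."""
    predicted = set()
    for start, total in enumerate(window_sums(sequence, scale, window)):
        if total < threshold:
            predicted.update(range(start, start + window))
    return predicted


def precision_recall(predicted, actual):
    """Residue-level precision and recall of a TM prediction."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical two-letter scale and sequence, for illustration only.
toy_scale = {"L": -1.0, "K": 2.0}
toy_sequence = "KKKK" + "LLLLLL" + "KKKK"
toy_prediction = predict_tm_residues(toy_sequence, toy_scale, window=4)
toy_precision, toy_recall = precision_recall(toy_prediction, set(range(4, 10)))
```

In the toy case the prediction overshoots the true segment by one residue on each side, which lowers the precision while leaving the recall at its maximum; this is the same trade–off summarized by the two values reported above.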
The validity of transfer free energy value of Pro in the GeTFEP
We further examine the TFE value of Pro in the mid–GeTFEP scale, which is qualitatively different from that in the MF–scale. We swapped the value of Pro from MF–scale into the mid–GeTFEP scale, and used this Pro–swapped scale in the hydropathy analysis. This is reasonable as the mid–GeTFEP scale is strongly correlated with the MF–scale, and has comparable values. However, we found that the precision of predicting TM residues deteriorates significantly from 85% using the mid–GeTFEP scale to 72% using the Pro–swapped scale (Table 3). This result suggests that Pro is more likely to be membrane unfavorable as characterized by the mid–GeTFEP scale rather than membrane favorable as characterized by the MF–scale.
3 Conclusions and discussion
In this study, we derived the General Transfer Free Energy Profile (GeTFEP) from a non–redundant set of 58 βMPs. We showed that the GeTFEP agrees well with previously measured and computationally derived TFEs. The GeTFEP reveals fundamental thermodynamic properties of amino acid residues inside the membrane environment, and it is useful in the analysis of the stability and function of membrane proteins.5
As the lipid membrane bilayer is anisotropic along the bilayer normal,42 residues at different depths of the membrane interact differently with the surrounding lipid molecules, resulting in the depth–dependency of transfer free energies. However, there are few experimental measurements of TFEs at depth positions other than the hydrophobic core, except for Arg and Leu.10 Comparison between the GeTFEP and the experimentally measured values of Arg and Leu shows that the GeTFEP captures this depth–dependency well.
In addition, the GeTFEP exhibits asymmetric values between TFEs of residues in the membrane inner leaflet (depth −4 to 0) and in the outer leaflet (depth 0 to +4, Fig. 2C). Most βMPs in our dataset reside in the bacterial outer membrane, whose outer leaflet contains additional complex lipopolysaccharides in contrast to its inner leaflet of phospholipids. This asymmetry in membrane composition results in the asymmetry of the transfer free energies in the GeTFEP. To understand membrane proteins in an environment of symmetric membrane leaflets, we also derived a symmetric TFE profile, named sym–GeTFEP, by mirroring the TFE values of the inner leaflet side of the GeTFEP (Fig. S6). In this study, the sym–GeTFEP was used to analyze the non–outer–membrane βMPs, e.g. α– and γ–hemolysins and Vibrio cholerae cytolysin.
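The mirroring used to obtain a symmetric profile can be written compactly. The sketch assumes that the profile is stored as a dictionary keyed by (amino acid, depth), with non–positive depths denoting the inner leaflet side; the function name and the example values are hypothetical.

```python
def symmetrize(profile):
    """Build a symmetric profile by mirroring inner-leaflet values.

    profile: dict mapping (amino_acid, depth) to a TFE value, where
    depth <= 0 denotes the inner leaflet side (an assumed convention).
    Values at positive depths in the input are discarded and replaced
    by the value at the mirrored negative depth.
    """
    symmetric = {}
    for (aa, depth), tfe in profile.items():
        if depth <= 0:
            symmetric[(aa, depth)] = tfe
            symmetric[(aa, -depth)] = tfe
    return symmetric


# Hypothetical values, for illustration only.
asymmetric = {("L", -1): -1.0, ("L", 0): -2.0, ("L", 1): -0.5}
sym = symmetrize(asymmetric)
```

After the call, the value at depth +1 equals the value at depth −1, so the outer–leaflet contribution specific to the asymmetric outer membrane is removed.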
We explored the energetic contribution of different regions of βMPs during the membrane insertion process. Our analysis showed that the stability of βMPs does not come solely from the extensive property of the hydrophobicity of lipid–facing residues in the TM segment. Rather, the pattern of the amino acid residues in the TM segment also plays a significant role. Results from the analysis of sequence shuffling show that the patterns and locations of amino acid residues are optimized to stabilize βMPs in the membrane environment. Using the GeTFEP, we are also able to predict the membrane positioning and orientation of βMPs.
The GeTFEP can also be used to detect structurally or functionally important residues in βMPs. This can be achieved by examining residues whose TFEs deviate significantly from the GeTFEP. The calculation of TFEs of residues of a specific βMP requires only a rough estimation of the relative positions between adjacent β–strands, which can be reliably predicted from the protein sequence;43,44 computing the TFE deviation therefore requires only sequence information. The GeTFEP–deviation analysis can aid in the discovery of functional sites or structurally important sites in novel βMPs, without requiring knowledge of their 3D structures. In addition, GeTFEP–based analysis can aid in the design and engineering of novel βMPs.
Furthermore, we demonstrated that GeTFEP can be used to predict TM residues of α–helical membrane proteins. Results showed that GeTFEP performs better than the hydrophobicity scales measured/calculated in αMP systems, suggesting that the GeTFEP reflects fundamental thermodynamic properties of amino acid residues inside the membrane, and can be used to study the general stability of both α–helical and β–barrel membrane proteins.
Acknowledgement
This work is supported by NIH R01GM079804, R01CA204962-01A1, R21AI126308, and R35GM127084.
Footnotes
* E-mail: jliang{at}uic.edu