Abstract
The plasticity of naturally occurring protein structures, which can change shape considerably in response to changes in environmental conditions, is critical to biological function. While computational methods have been used to de novo design proteins that fold to a single state with a deep free energy minima (Huang et al., 2016), and to reengineer natural proteins to alter their dynamics (Davey et al., 2017) or fold (Alexander et al., 2009), the de novo design of closely related sequences which adopt well-defined, but structurally divergent structures remains an outstanding challenge. Here, we design closely related sequences (over 94% identity) that can adopt two very different homotrimeric helical bundle conformations -- one short (∼66 Å height) and the other long (∼100 Å height) -- reminiscent of the conformational transition of viral fusion proteins (Ivanovic et al., 2013; Podbilewicz, 2014; Skehel and Wiley, 2000). Crystallographic and NMR spectroscopic characterization show that both the short and long state sequences fold as designed. We sought to design bistable sequences for which both states are accessible, and obtained a single designed protein sequence that populates either the short state or the long state depending on the measurement conditions. The design of sequences which are poised to adopt two very different conformations sets the stage for creating large scale conformational switches between structurally divergent forms.
Introduction
There are many examples of de novo designed amino acid sequences that fold to a single designed target structure (Huang et al., 2016; Lu et al., 2018), but designing new amino acid sequences capable of adopting divergent structural conformations is challenging since the free energy difference between the two states must be small enough that a few amino acid changes can shift the global energy minimum from one state to the other, and both states must be stable relative to the unfolded state. Redesigns of natural protein backbones have yielded large conformational changes for systems such as the Zn antennafinger, Zinc-binding/Coiled coil (ZiCo), and designed peptide Sw2, that involve changes in oligomerization state from a homotrimeric 3-helix bundle to a monomeric zinc-finger fold (Ambroggio and Kuhlman, 2006; Cerasoli et al., 2005; Hori and Sugiura, 2004), and the pHios de novo design also involves a change in oligomerization state (pentamer to hexamer; (Lizatović et al., 2016)). In other cases, the engineered proteins have two well-defined states that are quite similar, for example the dynamic switching of DANCER proteins primarily involves a single tryptophan residue (Davey et al., 2017), and the Rocker channel has two symmetrically related states that are structurally identical (Joh et al., 2017). Finally, designed protein pH (Boyken et al., 2019) and peptide (Langan et al., 2019) dependent switches involve transitions between a single well-defined state and less well ordered states. Overall, in most previous work, the differences in conformation have generally either been relatively small, involved changes in oligomerization state, or only one of the two states has been well-defined. An exception is the engineering of sequences with 98% identity that switch folds between the naturally occuring GA and GB domains (Alexander et al., 2009).
We sought to design sets of closely related de novo protein sequences with two structurally divergent ground states having the same oligomerization state. Inspired by class I viral fusion proteins (i.e. influenza HA, parainfluenza F, HIV Env, and Ebola GP) (Ivanovic et al., 2013; Podbilewicz, 2014; Skehel and Wiley, 2000), we chose homotrimeric helical bundles as the fundamental architecture. We decided on a design scheme with two divergent conformations containing a constant six helix bundle “base”, and a variable portion with three inner helices and three “flipping” helices. In the compact “short” state (∼66 Å height) the flipping helices fold back to interact with the inner helices, and in the extended “long” state (∼100 Å height) they extend to form interactions with each other (Fig. 1a). Due to the difficulty of computing the relative stabilities of very different conformations accurately, we first designed related sequences which adopt one or the other of the two states. Then, guided by experiments, we identified structural features that affect the conformational preference between states and attempted to design single sequences which adopt both states.
Results
We started from a variant of a previously Rosetta-designed six-helix bundle (2L6HC3_13; PDB ID 5j0h) (Boyken et al., 2016) and introduced a cut in the outer helix to produce a disconnected segment to serve as the flipping helix. This flipping helix was connected to the C-terminus either via a turn to generate the “short” state, or a contiguous helix to generate the “long” state (Fig. 1b). To determine the appropriate hinge length, connectors of 2-7 residues were tested in silico and it was found that a hinge of length five provided good packing of the long state (Fig. S1b). Rosetta design and folding calculations were then used to find sequences that were strongly predicted to fold into one of the two states (Fig. S1d), genes encoding such designs were obtained, and the proteins were purified and characterized. All initial designs were solubly expressed, had circular dichroism (CD) spectra indicative of well-folded, alpha-helical structures, and three of four were trimeric by size exclusion chromatography with multi-angle light scattering (SEC-MALS) (Fig. S2). However, by small angle x-ray scattering (SAXS), it was difficult to determine which state (short, long, or a combination) the proteins adopted (Fig. S2).
To facilitate discrimination between the two states by SAXS, we carried out a second round of design with 1) longer flipping helices (which results in a greater difference in length between the states), and 2) helical interfaces redesigned to favor the short state (with sequence optimized for the turn and packing of the flipping helices on the inner helices) or the long state (with sequence optimized for packing between the flipping helices across the symmetric interface) — the previous design round enforced the same sequence at the interface regions for all designs (Fig. S3). The length of the flipping helix was increased from 15 to 21 residues and the optimal hinge length was recalculated to be three residues. The proteins in this second round of designs expressed solubly, and three of the four were helical by CD spectroscopy and ran as trimers by SEC-MALS (Fig. S4). However, by SAXS, all three appeared to be in the long state (Fig. S4), suggesting that the short state in this design scheme has higher free energy than the long state.
Our third round of designs aimed to generate short and long states with more equal stability. We used a larger portion of 2L6HC3_13 as the base to provide more stability, and fused it with a second Rosetta-designed six-helix bundle (2L6HC3_23) (Boyken et al., 2016), through aligning the matching superhelical parameters of the inner three helices (Fig. 1c). Compared to the second generation designs, in the short state, the flipping helix is positioned closer to the inner helices and the termini of the hinge region are closer together, allowing closer packing in the short state (Fig. S5). The outer helices were trimmed to avoid clashes and the required hinge length was recalculated to be three residues. Four designs with different fusion sites, placement of hydrogen bond networks, and loop flexibilities were tested; all expressed solubly and were helical by CD, but the two larger designs were not trimeric by SEC-MALS (Fig. S6). One design, AAA, was confirmed to be in the long state by crystallography, again indicating the short state has higher energy than the long state (Fig. 2f). We next focus on tuning the sequence of AAA to favor the short state.
Tuning the short/long state difference with hydrogen bond networks
To tune the relative accessibility of the short and long states, we used modular hydrogen bond networks that would be buried in the short state, but are partially exposed in the long state. The interface strength between the inner and flipping helices is expected to depend on both the number (more networks results in weaker interfaces) and position (solvent exposed locations are more likely to be disrupted by competing water interactions and result in weaker interfaces) of the networks. This strategy has the added advantage of creating more soluble proteins because residues necessarily exposed in the long state (when the flipping helix extends, parts of the inner helices it previously packed against are solvent exposed) are not entirely hydrophobic (as they would be if the stabilizing interactions were entirely non-polar, as in most design calculations). The interaction energy between the flipping helix and the inner helices in the short state was tuned by combining hydrogen bond (“A”) or hydrophobic layers (“X”) in different orders in each of three possible positions (Boyken et al., 2016) (Fig. 1d). Designs are given three letter names corresponding to the layers in order, starting at the hinge proximal position. Structure prediction calculations suggested that some layer combinations (such as XAX) are more favorable for the short state than others (such as AAA) and suggest that the short and long backbones states are among the lowest energy states of these systems (Fig. 1e).
We constructed the family of all eight possible permutations of two different layers in three positions, dubbed SERPNTs (in Silico Engineered Rearrangeable ProteiN Topologies). We set the three hinge residues to glycine to reduce favorable contributions to the long state free energy from the helical form of the hinge. All eight constructs expressed well, and by FPLC, most samples were monodisperse or had a small aggregation peak (only AXA formed a large soluble aggregate) (Fig. S7c). SEC-MALS measurements indicate that in solution, XAX, XAA, and AAA are trimeric as expected and CD wavelength and temperature measurements for XAX, XAA, and AAA show that the designs are helical and highly temperature stable (Fig. S7d,e).
The number and position of the hydrogen bond networks correlates with which state is lower energy -- designs with one network fold into the short state, two networks fold into a more dynamic short state, and three networks fold into the long state (details in supplementary text). Crystal structures of designs XXA (PDB ID 6nyi) and XAX (PDB ID 6nye) match the modeled short state (Fig. 2a,b). For both, most of the deviation, as expected, is in the flexible hinge: the crystal structures show that each monomer adopts a different loop. By NMR, the average solution structure adopted by XAX is locked in the compact state, as shown by the presence of multiple NOEs from a 3D CM-CMHM NOESY spectrum (Fig. 2c). Analysis of an MAI(LV)proS methyl-labeled XAA sample also suggests a short state, with the flipping helix shifted up (towards the hinge) compared to the design model (Fig. 2d,e, Fig. S8). The flipping helix appears to present a more flexible interface (relative to XAX), as evidenced by increased linewidths both in the amide TROSY and methyl HMQC spectra (Fig. S9 left). The crystal structure of design AAA (PDB ID 6nx2) is in the long state with some deviation from the predicted model because the asparagines of two of the hydrogen bond networks coordinate bromide ions and shift the helical register (ions and waters are not explicitly accounted for during modeling) (Fig. 2e, S10). These results suggested potential for a single sequence that can interconvert between the two designed states since a change in three residues, corresponding to the location of a hydrogen bond network, can switch the observed state (i.e., XAA vs AAA).
Tuning the short/long state difference in the hinge region
To bring the amino acid sequence of the short and long states still closer together, we next tuned the hinge region, which transitions from a loop to fully helical in the conformational change (Fig. 2a,b,e vs f). In the long state, “key” hinge residue (G75) and the “backup” hinge residue (T76) have the potential to form an ion coordination site, but not in the short state (Fig. 3a). Similar natural sites include the calcium coordination site in human cytomegalovirus glycoprotein B (PDB ID 5cxf) or the nickel coordinating site in methylmalonyl CoA decarboxylase (PDB ID 1ef8) (Fig. 3b), among others (Fig. S11-S13) (Putignano et al., 2018). We sought to design calcium or nickel binding sites in the long state by introducing aspartic acid or histidine at the key hinge residue position. We also designed for structural ambiguity in the turn -- with the result that secondary structure prediction for residue 75 is sometimes helical and sometimes coil, but with low confidence (Fig. S14).
Crystal structures show that designs XAX_GGDQ (PDB ID 6nyk) and XXA_GVDQ (PDB ID 6nz1), which introduces a potential calcium coordination site, adopt the short state like their respective base designs without the hinge residue modifications (i.e., XAX and XXA) (Fig. 3c,d). In contrast, the crystal structure of XAA_GGHN (PDB ID 6nz3), which introduces a potential nickel coordination site (occupied by a water molecule in the structure), is in the extended state, unlike the parent XAA design, with RMSD 1.44 to the extended model (Fig. 3e). Together, these results show that the hinge region is amenable to mutation without causing the protein to misfold into undesigned conformations.
A bistable designed sequence that adopts both the short and long states
Having found that both the short and long state backbones are compatible with modular replacement of hydrogen bonding layers between the helices and with mutations in the hinge region, we carried out more detailed structural characterization to determine if any of the designs can adopt both conformations. Detailed NMR and crystallographic characterization of design XAA_GVDQ, described in the following paragraphs, suggest that this sequence may indeed populate both the short and long states.
To characterize the structure in solution, we prepared a selective MAI(LV)proS methyl-labeled XAA_GVDQ sample (Fig. S9 right). In the absence of calcium, the 3D CM-CMHM NOESY and residual dipolar couplings (RDCs), are consistent with the short state. We solved the NMR structure using the chemical shifts, long-range NOEs (both methyl to methyl and amide to methyl), and RDCs, and found it to be very close to the short state design model, RMSD 1.19 Å (Fig. 4a,c, Fig. S15, Table S6, PDB ID 6O0C).
We next sought to determine the structure of the design in the presence of calcium. Overlays of 1D methyl proton spectra show that the protein goes into a non-observable state (several 100s of kDa) in a calcium concentration-dependent manner, even at dilute concentrations and low temperature conditions (Fig. S15). RDCs, amide TROSY spectrum, and 2D 1H-13C HMQC for XAA_GVDQ samples with and without calcium were also recorded (Fig. S15). SAXS analysis of XAA_GVDQ and the control sequence with no aspartate, XAA (linker GGGT), shows calcium dependent changes in the normalized Kratky plot indicating in increase in size in solution (Fig. 4b, Table S4).
Since the calcium-bound state was not observable by NMR, despite efforts to optimize pH, temperature, concentration, calcium, and buffer components, we turned to X-ray crystallography, and sought to determine crystal structures in both the absence and presence of calcium. After extensive screening, we were able to solve a structure in the absence of calcium (Fig. S17, PDB ID 6nxm) in which the flipping helices interact with the inner helices in the same manner as they would in the designed short state, but do so in a domain swapped manner with a symmetry mate to form a hexamer. This is consistent with the observation of the short state by NMR in the absence of calcium; the domain swapping almost certainly arises during crystallization.
We next sought to solve the structure of the design in the presence of calcium. We succeeded in solving the structure in 50 mM calcium acetate by molecular replacement to 2.3 Å resolution (Table S3). The protein is in the designed long state with a metal ion, likely calcium, coordinated by the aspartates in the hinge region (Fig. 4a, PDB ID 6ny8). The ion is most likely calcium given the coordination and the crystallography conditions, which consisted of 0.05 M calcium acetate, 0.1 M sodium acetate pH 4.5, 40% (w/v) 1,2-propanediol at 18°C.
The differences in the crystal and NMR structures of XAA_GVDQ are striking. It is possible that the long state is driven by crystal interaction forces rather than ion coordination, but these crystal specific interactions are likely to be quite small as the flipping helix makes relatively few crystal-lattice interactions (Fig. 4d). On the other hand, since at least >95% of the molecules in solution in the presence of calcium are in the short state by solution NMR, the short state appears to be more favorable in solution. We cannot rule out the possibility that both states are indeed populated in solution, but the barrier to interconversion is high and the long state is removed from solution under NMR conditions: the calcium dependent sample precipitation noted above could reflect the long state being more prone to aggregation, in which case it could be lost in the NMR experiment which is carried out at high protein concentrations.
Discussion
We have designed a family of de novo proteins in which small changes (3 mutations) in the interface or hinge region can cause the proteins to fold into two distinct and structurally divergent conformations. In contrast to most previous work, the differences in conformation of our designs are relatively large, do not rely on changes in oligomerization state, and both states are well-defined. Most notably, the protein XAA_GVDQ adopts the designed short state by NMR in the absence of calcium, and the designed long state by crystallography in the presence of calcium. Given that changes in three residues in the hinge region can change the conformation of the design (hinge GGHN is long vs. GGGT is short), future work focused on these residues could yield designs that occupy the short and long states in equilibrium and are switchable in solution conditions. Our designed systems, poised as they are between two different states, are excellent starting points for design of triggered conformational changes -- only small or binding free energies would be required to switch conformational states; a particularly interesting possibility would be synthetic environmentally sensitive membrane fusion proteins analogous to those of the viral proteins that inspired the design scheme.
Materials and Methods
Synthetic gene construction
Synthetic genes were ordered from Genscript Inc. (Piscataway, NJ, USA) and delivered in pET28b+ E. coli expression vectors, inserted at the NdeI and XhoI sites. Genes are encoded in frame with the N-terminal hexahistidine tag and thrombin cleavage site. A stop codon was added at the C-terminus just before XhoI.
Bacterial protein expression and purification
Plasmids were transformed into chemically competent E. coli BL21(DE3)Star (Invitrogen) or Rosetta(DE3)pLysS (QB3 MacroLabs). Starter cultures were grown in Lysogeny Broth (LB) media with 50 μg/mL kanamycin overnight with shaking at 37°C. 5 mL of starter culture was used to inoculate 500 mL of 2xTY with 50-100 μg/mL kanamycin at 37°C until OD600 = 0.6. Cultures were cooled to 18°C and induced with 0.1mM IPTG at OD600 = 0.8 and grown overnight. Cells were harvested by centrifugation at 3500 rcf for 15 minutes at 4°C and resuspended in lysis buffer (25 mM Tris pH 8.0, 300 mM NaCl, 20 mM imidazole), then lysed by sonication. Lysates were cleared by centrifugation at 14,500 rcf for at least 45 minutes at 4°C. Supernatant was applied to 0.5-1 mL Ni-NTA resin (Qiagen) in gravity columns pre-equilibrated in lysis buffer. The column was washed thrice with 10 column volumes (CV) of wash buffer (25 mM Tris pH 8.0, 300 mM NaCl, 20 mM imidazole). Protein was eluted with 25 mM Tris pH 8.0, 300 mM NaCl, 250 mM imidazole, 0.5 mM TCEP. Proteins were buffer exchanged using FPLC (Akta Pure, GE) and a HiPrep 26/10 Desalting column (GE) into 25 mM Tris pH 8.0, 300 mM NaCl. The hexahistidine tag was removed by thrombin cleavage by incubating with 1:5000 thrombin (EMD Millipore) overnight at room temperature. Proteins were further purified by gel filtration using FPLC and a Superdex 75 10/300 GL (GE) size exclusion column.
Circular dichroism (CD) measurements
CD wavelength scans (260 to 195 nm) and temperature melts (25°C to 95°C) were measured using an AVIV model 410 CD spectrometer. Temperature melts monitored absorption signal at 222 nm and were carried out at a heating rate of 4°C/min. Protein samples were prepared at 0.25 mg/mL in PBS in a 1 mm cuvette.
Small angle X-ray scattering (SAXS) measurements
Proteins were purified by gel filtration using FPLC (Akta Pure, GE) and a Superdex 75 10/300 GL (GE) into SAXS buffer (20 mM Tris pH 8.0, 150 mM NaCl, and 2% glycerol). Fractions containing trimers were collected and concentrated using Spin-X UF 500 10 MWCO spin columns (Corning) and the flow through from the spin columns was used as buffer blanks. Samples were prepared at concentrations of 3-8 mg/ml. SAXS measurements were made at SIBYLS 12.3.1 beamline at the Advanced Light Source (ALS). The light path is generated by a super-bend magnet to provide a 1012 photons/sec flux (1 Å wavelength) and detected on a Pilatus3 2M pixel array detector. Each sample is collected multiple times with the same exposure length, generally every 0.3 seconds for a total of 10 seconds resulting in 30 frames per sample.
Protein crystallography, x-ray data collection, and structure determination
Proteins were purified by gel filtration using FPLC (Akta Pure, GE) and a Superdex 75 10/300 GL (GE) into 20 mM Tris pH 8.0, 150 mM NaCl and concentrated to 3-10 mg/ml, most commonly 8 mg/ml. Samples were screened using a Mosquito (TTP Labtech) with JCSG I-IV (Qiagen), Index (Hampton Research), and Morpheus (Molecular Dimensions) screens. Sitting drops of 1:1 mixture of 150-200 nL protein and 150-200 nL reservoir solution at 4°C or 18°C. Crystallization conditions are listed in Table S2. The X-ray data sets were collected at the Berkeley Center for Structural Biology beamlines 8.2.1 and 8.3.1 of the Advanced Light Source at the Lawrence Berkeley National Laboratory (LBNL). Data sets were indexed and scaled using HKL2000 (Otwinowski and Minor, 1997) or XDS (Kabsch, 2010). Structures were determined by the molecular-replacement method with the program PHASER (McCoy et al., 2007) within the Phenix suite(Adams et al., 2010) using the Rosetta design models as for the initial search model. The molecular replacement results were used to build the design structures and initiate crystallographic refinement and model rebuilding. Structure refinement was performed using the phenix.refine program (Adams et al., 2010; Afonine et al., 2010). Manual rebuilding using COOT (Emsley and Cowtan, 2004) and the addition of water molecules allowed construction of the final models. Summaries of statistics are provided in Table S3.
NMR spectroscopy and structure determination
All experiments were recorded at 37 °C on an 800 MHz Bruker Avance III spectrometer equipped with a TCI cryoprobe. NMR samples contained 20 mM Sodium Phosphate pH 6.2, 100mM NaCl, 0.01% NaN3, 1 U Roche protease inhibitor cocktail in 95% H2O/5% D2O. For backbone and methyl assignments, labeled samples were prepared using well-established protocols (Tugarinov et al., 2006; Tugarinov and Kay, 2003). Briefly, backbone assignments were performed on uniformly 13C/15N, perdeuterated protein samples using TROSY-based 3D experiments, HNCO, HNCA and HN(CA)CB. Side-chain methyl assignments were performed starting from the backbone assignments and using unambiguous NOE connectivities observed in 3D CM-CMHM, 3D CM-NHN and 3D HNHα-CMHM SOFAST NOESY experiments, recorded on a U-[15N, 12C, 2H] AI(LV)proS (Ala 13Cβ, Ile 13Cδ1, Leu 12Cδ1/13Cδ2, Val 12Cγ1/13Cγ2) selective methyl-labeled sample. The calcium titrations for XAA_GVDQ were performed in 20 mM Tris-HCl pH 7.2, 100 mM NaCl and 0, 0.5, 1, 2, 5, 10, 15, 20, 25, 30 and 50 mM CaCl2 buffer conditions.
Backbone amide 1DNH residual dipolar couplings (RDCs) were obtained using 2D 1H-15N transverse relaxation-optimized spectroscopy (ARTSY) (Fitzkee and Bax, 2010). For XAA_GVDQ sample, the alignment media was 10 mg/mL Pf1 phage (ASLA Biotech) for experiments performed without calcium and 15 mg/mL for experiments performed in the presence of 5 mM CaCl2. The 2H quadrupolar splittings were 8.7 Hz and 9.3 Hz in the absence and presence of CaCl2, respectively. For XAA sample, the alignment media was 10 mg/mL Pf1 phage (ASLA Biotech) and the 2H quadrupolar splitting was equal to 6.2 Hz. All NMR data were processed with NMRPipe(Delaglio et al., 1995) and analyzed using NMRFAM-SPARKY(Lee et al., 2015). The Chemical Shift Deviations (CSD) of 13C methyls (in p.p.m.) were calculated using the equation and of 15N amides (in p.p.m.) using the equation .
Structure modeling from our NMR data was performed using Rosetta’s fold and dock protocol(Das et al., 2009; DiMaio et al., 2011; Sgourakis et al., 2011; Tyka et al., 2011). Two sets of calculations were performed, using different input data (i) backbone chemical shifts and RDCs (to model the structure of XAA_GVDQ in the presence and absence of calcium, Fig 4a bottom) and (ii) backbone chemical shifts, long-range NOEs and RDCs data (to calculate the de novo NMR structures for XAA (Fig. 2e) and XAA_GVDQ (Fig. 4d))(Nerli et al., 2018). In (ii), we used a uniform upper distance bound of 10 Å for all 92 NOE restraints for XAA and for all 139 NOE restraints for XAA_GVDQ (Table S5). The top 10 lowest-energy structures from each calculation showing good agreement with RDCs and chemical shifts and a minimum number of NOE violations were deposited in the PDB, under accession code 6O0I for XAA (BMRB ID 30574) and 6O0C for XAA_GVDQ mutant M4L (BMRB ID 30573).
Funding
SAXS data collection was conducted at the Advanced Light Source (ALS), a national user facility operated by Lawrence Berkeley National Laboratory on behalf of the Department of Energy, Office of Basic Energy Sciences, through the Integrated Diffraction Analysis Technologies (IDAT) program, supported by DOE Office of Biological and Environmental Research. Additional support comes from the National Institute of Health project ALS-ENABLE (P30 GM124169) and a High-End Instrumentation Grant S10OD018483.
Crystallography data collection was conducted at Beamline 8.3.1 and 8.21 at the Advanced Light Source is operated by the University of California Office of the President, Multicampus Research Programs and Initiatives grant MR-15-328599 the National Institutes of Health (R01 GM124149 and P30 GM124169), Plexxikon Inc. and the Integrated Diffraction Analysis Technologies program of the US Department of Energy Office of Biological and Environmental Research. The Advanced Light Source (Berkeley, CA) is a national user facility operated by Lawrence Berkeley National Laboratory on behalf of the US Department of Energy under contract number DE-AC02-05CH11231, Office of Basic Energy Sciences.
All NMR data were recorded at the 800 MHz NMR spectrometer at UCSC, funded by the Office of the Director, NIH, under High End Instrumentation (HIE) Grant S10OD018455. We acknowledge additional support from the NIH through Grants R35GM125034 (NIGMS) and R01AI143997(NIAID), and through Career Development Award AI2573-01(NIAID), which funded the baker computational cluster at UCSC.
Chan Zuckerberg Biohub.
Author contributions
K.Y.W., S.E.B., P.-S.H., N.G.S and D.B., designed the research.
K.Y.W. designed, performed, and analyzed data for CD, SEC-MALS, SAXS, and X-ray crystallography experiments.
D.M., A.C.M. and N.G.S. performed NMR experiments and data analysis.
M.J.B. designed and analyzed data for X-ray crystallography experiments.
S.N. performed NMR structure calculations.
L.C. and D.M. performed SEC-MALS experiments and NMR sample preparation.
All authors contributed to experimental design, discussed results, and commented on the manuscript.
Competing Interests
The authors declare no competing financial interests.
Supplementary Materials
Supplemental Figures
Supplemental Tables
Supplemental Text
Tuning of the free energy difference between short and long states
Crystal structures of designs XXA (PDB ID 6nyi) and XAX (PDB ID 6nye) match the modeled short state with Cα root-mean-square deviations (RMSDs) of 1.37 and 0.97, respectively at 1.9-2.3 Å resolution (Fig. 2a,b). For both, most of the deviation, as expected, is in the flexible hinge: the crystal structures show that each monomer adopts a different loop. For XXA, the crystal structure confirms the short state, but there are deviations from the model at the hydrogen bond network, which is not surprisingly since it is in a solvent exposed position. For XAX, the designed hydrogen bond network formed as expected (Fig 2b. “network A” panel vs. Fig. 2a “network A” panel). To further characterize the solution structure, we prepared a selective AI(LV)proS methyl-labeled XAX sample. The 3D CM-CMHM NOESY spectrum for XAX shows NOE patterns consistent with the compact structure, with many unambiguously assigned close-range NOEs identified between the 13Cδ1 methyl of Ile66 and 13Cδ2 of Leu86 (Fig. 2c).
Attempts to solve a crystal structure for XAA were unsuccessful, but a 15N MAI(LV)proS methyl-labeled sample prepared for the protein showed highly dispersed NMR spectra. The fully assigned methyl HMQC spectra show resonances of increased linewidth, relative to our previously assigned XAX construct, indicating the presence of dynamics at the μsec-msec timescale. The fully assigned backbone amide TROSY-HSQC spectra confirm this result, and further reveal the presence of a minor component (∼10-15% population, not exchanging with the major component at the NMR timescale, as confirmed by TROSY ZZ exchange experiments) for residues distributed at the interface between the inner and outer helices, (Fig. S14), likely due to local breaking of the regular C3 symmetry. Comparison of the shared methyl NOEs between the major state of XAA and XAA_GVDQ (which by NMR is locked in the short state) (Fig. S8a) and analysis of Residual Dipolar Couplings (RDCs) (Fig. S8b) further suggest that the flipping helix is in the short state, adopting an alternative packing against the inner helices. Finally, a comparison with the de novo NMR structure (solved using chemical shifts, long-range NOEs and RDCs; Table S5) clearly shows that the flipping helix is shifted by about a turn up towards the hinge (Fig. 2d,e). The C-terminal hydrogen bond network in XAA is solvent exposed and more likely to be disrupted by competing water interactions. This disrupted network likely destabilizes the interface between the flipping helix and the inner helices, allowing alternative packing. In contrast, the C-terminal region of XAX is hydrophobic, very well packed in the crystal structure and effectively stabilizes the flipping helix.
Design AAA (PDB ID 6nx2) adopts the long state according to crystallographic evidence with RMSD = 1.40 at about 2.3 Å resolution (Fig. 2e). While the helical propensity of the residues at and surrounding the hinge region was predicted to be low, in the crystal structure this region is clearly helical and the most flexible residues are at the C-terminus (Fig. S14). The structure of AAA deviates from the predicted model because the asparagines of two of the hydrogen bond networks coordinate bromide ions (ions and waters are not explicitly accounted for during modeling). In particular, N61 shifts toward the center of the helix (relative to the model to coordinate the ion (Fig. S10). Taken together, there is potential for a single sequence that can interconvert between the two designed states since a change in three residues, corresponding to the location of a hydrogen bond network, can switch the observed state (i.e., XAA vs AAA).
Acknowledgements
We thank X. Ren, J. Hurley, G. Meigs, P. Adams, B. Sankaran, and A. Kang for help with X-ray crystallography. We thank K. Burnettt, G. Hura, and J. Tanamachi for help with SAXS. We thank C. Nixon and S. Marqusee for help with CD. We thank B. Belardi, M. Vahey, S. Son, S. Ovchinnikov, and the rest of the Fletcher and Baker laboratories for discussion. We thank UW Hyak supercomputer and Rosetta@Home volunteers (https://boinc.bakerlab.org) for computing resources.
Footnotes
Updated text and Figure 4.
References
Supplemental References
- 1.
- 2.
- 3.