Structures and function of locked conformations of SARS-CoV-2 spike

The spike protein (S) of SARS-CoV-2 has been observed in three distinct pre-fusion conformations: locked, closed and open. Of these, the locked conformation was not previously observed for SARS-CoV-1 S and its function remains poorly understood. Here we engineered a SARS-CoV-2 S protein construct “S-R/x3” to arrest SARS-CoV-2 spikes in the locked conformation by a disulfide bond. Using this construct we determined high-resolution structures revealing two distinct locked states, with or without the D614G substitution that has become fixed in the globally circulating SARS-CoV-2 strains. The D614G mutation induces a structural change in domain D from locked-1 to locked-2 conformation to alter spike dynamics, promoting transition into the closed conformation from which opening of the receptor binding domain is permitted. The transition from locked to closed conformations is additionally promoted by a change from low to neutral pH. We propose that the locked conformations of S are present in the acidic cellular compartments where virus is assembled and egresses. In this model, release of the virion into the neutral pH extracellular space would favour transition to the closed form which itself can stochastically transition into the open form. The S-R/x3 construct provides a tool for the further structural and functional characterization of the locked conformations of S, as well as how sequence changes might alter S assembly and regulation of receptor binding domain dynamics.


Introduction
The spike (S) protein of coronaviruses is responsible for interaction with cellular receptors and fusion with the target cell membrane (Li, 2016). It is the main target of the immune system and is therefore the focus of vaccine and therapeutic development. Most candidate severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) vaccines utilise the S protein or its derivatives as the immunogen (Krammer, 2020), and a number of antibodies targeting the SARS-CoV-2 S protein are under development for COVID-19 treatment (Klasse and Moore, 2020).
In the locked and closed conformations, the three copies of RBD in each spike trimer lie down on the top of the spike such that each RBD interacts in trans with the NTD of the neighbouring S protomer, hiding the receptor binding site. In the locked conformation, the three RBDs are in very close proximity at the 3-fold axis, and a disulfide-bond stabilised helix-turn-helix motif (FPPR motif) is formed below the RBD that sterically prevents RBD opening (Bangaru et al., 2020; Cai et al., 2020; Toelzer et al., 2020; Xiong et al., 2020). Additionally, a linoleic acid is bound within the RBD and this ligand has been proposed to stabilise the locked conformation (Toelzer et al., 2020). The locked conformation is adopted in solution by the Novavax NVX-CoV2373 vaccine candidate (Bangaru et al., 2020). In structures of the locked conformation, both NTD and RBD are well resolved, suggesting that the locked spike has a rigid overall structure. In the closed conformation, the RBDs are slightly further apart than in the locked conformation and the FPPR motif is not folded (Bangaru et al., 2020; Cai et al., 2020; Toelzer et al., 2020; Xiong et al., 2020). As a result, the NTD and RBD exhibit considerable dynamics and are less well resolved than the core part of the protein in cryo-EM structures. Due to such dynamics, the RBD in the closed conformation can readily lift up to transition into the open conformation (Walls et al., 2020; Wrapp et al., 2020). In the open conformation, one or more RBDs are raised up to expose the receptor binding loops. ACE2 or antibody binding to these loops appears to maintain the spike in an open conformation (Xu et al., 2021). It is believed that opening of the cleaved spike ultimately leads to structural transition into the postfusion conformation. The virus uses the dramatic structural refolding from the prefusion conformation to the lowest-energy postfusion conformation to drive membrane fusion (Bosch et al., 2003).
Among the observed conformational states of SARS-CoV-2 spike, the locked conformation is unique in that it has not been observed for the SARS-CoV-1 spike. Other coronavirus spikes do have locked-like conformations featuring motifs that prevent RBD opening, but for many of them, stochastic formation of an open conformation has not been observed (Yuan et al., 2017). The early cryoEM studies of purified SARS-CoV-2 spike ectodomain did not identify the locked conformation, and we previously found that only a small fraction of trimeric ectodomains adopt the locked conformation (Xiong et al., 2020), suggesting this conformation is transient. However, other cryoEM studies showed that the spike protein can predominantly adopt the locked conformation under certain conditions: purified full-length spike proteins, including the transmembrane region, primarily adopt locked conformations (Bangaru et al., 2020; Cai et al., 2020); insect-cell-expressed spike ectodomains also primarily adopt the locked conformation, perhaps due to stabilizing lipid in the insect cell media (Toelzer et al., 2020). The locked conformation was not observed in the high-resolution structures of spike proteins on virions.
Considering the above observations, the role of the locked conformation in the SARS-CoV-2 life cycle is unclear. Here, based on our previous locked spike structure (Xiong et al., 2020), we engineered a disulfide pair which stabilises the SARS-CoV-2 spike in the locked conformation. This construct provides a tool to characterize the structure and function of the locked form of S and to assess how mutations and other environmental factors could affect its function. Newly obtained information regarding the "locked" spike allowed us to consider its functional role in the context of SARS-CoV-2 life cycle.

Design of a spike protein ectodomain stabilized in the locked conformation
We previously did not observe a locked conformation for S-R (S with a deletion at the furin cleavage site leaving only a single arginine residue), while for S-R/PP (S-R additionally containing the widely used, stabilizing double-proline mutation) the locked conformation was rare, accounting for <10% of S trimers when imaged by cryoEM (Xiong et al., 2020). The scarcity of the locked conformation for the expressed ectodomain makes it difficult to study. To overcome this challenge, based on the locked SARS-CoV-2 S-R/PP and closed S-R/PP/x1 and S-R/x2 spike structures determined previously (Xiong et al., 2020), we engineered cysteine residues at positions D427 and V987 to generate a new disulfide-stabilised S protein, S-R/x3. We predicted that S-R/x3 should be predominantly in the locked conformation (Fig. 1a). Under non-reducing conditions, S-R/x3 ran exclusively as a trimer on SDS-PAGE, and the trimer was converted to monomer under reducing conditions (Fig. 1b). This behaviour is similar to that of our previously engineered S-R/x2, suggesting efficient formation of a disulfide bond between the two engineered cysteines. Negative stain electron microscopy (EM) images of purified S-R/x3 show well-formed S trimers (Fig. 1c). We additionally introduced a D614G substitution into the S-R/x3 spike to obtain S-R/x3/D614G. Similar to S-R/x3, purified S-R/x3/D614G ran as disulfide-linked trimers on SDS-PAGE under non-reducing conditions, and negative stain EM images show well-formed trimers (Fig. 1b,c).

Cryo-EM structure of S-R/x3
We performed cryo-EM of S-R/x3 trimers as described previously for S-R/x2 (Xiong et al., 2020). Consistent with the predicted effect of the engineered disulfide bond, 3D classification of S-R/x3 particles showed that the majority of the particles (77%) are in the locked conformation, while 23% of the particles are in the closed conformation (Fig. S1). We obtained a reconstruction of the locked conformation at 2.6 Å with well resolved NTD and RBD densities (Fig. S2). S trimers in the locked conformation are characterised by structural rigidity, showing minimal local dynamics in NTD and RBD (Fig. S2). The engineered disulfide bond can be identified in the cryoEM density (Fig. 2a). Consistent with our previous observation of the locked conformation (Xiong et al., 2020), the RBDs are tightly clustered around the three-fold symmetry axis (Fig. S3), linoleic acid is bound in the described pocket in the RBD (Fig. 2a), and residues 833-855 are folded into a helix-loop-helix fusion peptide proximal region (FPPR) motif (Fig. 2a). We suggest that folding of the FPPR motif sterically prevents RBD opening.
The improved resolution relative to our previous reconstruction of the S-R/PP locked structure allowed us to build an atomic model including Domain D loop 633-641 (Fig. 2), where side chains were previously poorly resolved. Residues between 617 and 633 form a large disordered loop, which is not resolved in the cryoEM density. In structures of an unusual dimer of locked spikes, this loop has been seen to make contacts across the dimer interface (Bangaru et al., 2020).

Cryo-EM Structure of S-R/x3/D614G
We also determined the structure of the S-R/x3/D614G trimers. The proportion of locked conformations is greatly reduced compared to S-R/x3, constituting only ~20% of total particles (Fig. S1). Despite the presence of the x3 disulfide bond, 80% of the particles are in a closed conformation. The closed conformation accommodates the disulfide bond by motion of the RBD towards S2 (Fig. S3). In the S-R/x3/D614G locked structure, features observed in the S-R/x3 locked structure are present, including the more tightly clustered RBDs around the 3-fold axis, linoleic acid in the RBD and a folded FPPR motif. However, the S-R/x3/D614G locked structure differs from the S-R/x3 locked structure at domain D (Fig. 2, 3). Within this domain, residues 617-641, which adopt a disordered loop structure in locked S-R/x3, refold into two small α-helices connected by a loop in locked S-R/x3/D614G. This change also affects structural features interacting with domain D, which are described below.

Comparison of the two different locked conformations
By comparing the locked conformations in S-R/x3 and S-R/x3/D614G, we observed that SARS-CoV-2 spike protein can adopt 2 different locked conformations (Fig. 3). We will refer to the conformation observed for 614D (S-R/x3) as "locked-1" while the one observed for 614G (S-R/x3/D614G) as "locked-2". The two conformations differ primarily at residues 617-641 within domain D, and due to this structural perturbation, there are differences in the FPPR motif and in residues 316-325 which are located in the hinge region connecting domain C and D. In the 614D locked-1 structure, the long sidechains of R634 and Y636 interdigitate with the sidechains of F318 and Y837 (Fig. 3a). In this interaction, R634 in domain D is sandwiched between F318 of the hinge region, and Y837 of the FPPR motif, bridging these three structural features. It appears likely that this zip-locking interaction maintains residues 617-641 in a loop structure. This arrangement positions loop residue R634 approximately 9 Å above D614.
In the 614D locked-1 structure, we and others observed a salt bridge between D614 and K854 within the FPPR motif (Toelzer et al., 2020; Xiong et al., 2020) (Fig. 3a). The D614G mutation abolishes this salt bridge in the locked-2 structure (Fig. 3b). More significantly, the 617-641 loop refolds into two small α-helices (Fig. 3b). We suggest that the mutation of negatively charged D614 to neutral G and loss of the salt bridge alters the local electrostatic interactions, preventing R634 binding between F318 and Y837, and triggering refolding of the 617-641 loop. A similar helical structure has been observed for this region in domain D of other beta-coronavirus spike proteins, including Murine hepatitis virus (MHV) (Walls et al., 2016), Severe acute respiratory syndrome coronavirus-1 (SARS-CoV-1) (Yuan et al., 2017), OC43-CoV, and Middle-East Respiratory Syndrome coronavirus (MERS-CoV) (Walls et al., 2019) spikes (Fig. S4). Transition from the locked-1 to the locked-2 conformation breaks the interactions that link residues in domain D, the hinge region between domain C and domain D, and the FPPR motif. As a result, Y837 in the FPPR motif moves away from the RBD by approximately 4 Å while F318 rotates towards the inside of the spike, resulting in a movement of approximately 7 Å in the domain C-D hinge region (Fig. 3c). Further movement is required at the hinge region for the spike to transition from locked-2 to the closed conformation; this change results in a twisting motion of domain C and the RBD, loosening RBD packing, and thereby presumably readying the RBD for opening. Therefore, free movement of F318 and the surrounding hinge region appears to be important for structural transition from locked-1 to locked-2 and from locked-2 to closed/open conformations. The absence of the interlocking interactions involving R634, Y636, F318 and Y837 in the S-R/x3/D614G structures provides an explanation for the observation that D614G promotes RBD opening.

Comparison to other Locked spike structures
Most SARS-CoV-2 spike structures deposited in the PDB are in the open or closed conformations and have a disordered FPPR motif and unresolved 617-641 region (https://www.ebi.ac.uk/pdbe/pdbe-kb/protein/P0DTC2). Only a few studies have captured S trimers in locked conformations. Two studies imaged purified full-length spike protein by cryoEM, with or without the PP mutation: these full-length structures are in the locked-1 conformation (Bangaru et al., 2020; Cai et al., 2020) (Fig. S5a,h). Soluble S-GSAS/PP trimers (where the furin cleavage site is replaced with a GSAS sequence) purified from insect cells are predominantly in the locked-1 conformation (Toelzer et al., 2020) (Fig. S5c,d), while there are two studies describing S-GSAS/PP trimers purified from mammalian cells that are also predominantly in the locked-1 conformation (Xu et al., 2021) (Fig. S5e-g). We have previously described that a small proportion of S-R/PP and S-R/PP/x1 trimers are in the locked-1 conformation (Xiong et al., 2020). In all these studies where the locked-1 conformation has been observed, the spike features a more constricted closed RBD and binding of lipid within the RBD, as well as a folded FPPR motif. In all these structures, residues 617-641 form a loop with a disordered central region (618-632, dashed line in Fig. 3), and residues 634 and 636 interact with F318, as we observed here for the locked-1 conformation. Ward and colleagues have observed an unusual spike dimer in which the otherwise disordered 617-641 loop forms a long, extended loop structure that contacts the dimeric partner (Bangaru et al., 2020) (Fig. S5a). The reason why locked conformations are observed under some conditions but not others is not completely clear. Several factors may all have played a part: the purification procedure; the presence of the membrane anchor in the full-length spike (Bangaru et al., 2020; Cai et al., 2020); differences in expression conditions, such as added lipid and the low pH of insect cell media (Toelzer et al., 2020); EM grid preparation methods, such as use of a high octyl-glucoside concentration; and differences in reagents, as concluded by Xu et al. (Xu et al., 2021).
Zhou and colleagues investigated the effect of low pH on ACE2 and antibody binding to the SARS-CoV-2 spike protein (Zhou et al., 2020). Their structure at pH 4 is most similar to the locked-2 conformation: the RBD adopts the more constricted position, the FPPR motif is ordered, and the 617-641 loop adopts an α-helical structure (Fig. S5i-l). Low pH can therefore convert the "closed" conformation to the locked-2 conformation. In contrast to the locked-2 conformation that we have observed, the low pH forms do not contain lipid bound within the RBD.

The locked-1 conformation can transition to the closed conformation over time.
We observed that after storing the S-R/x3 spike at 4°C for 40 days, the number of well-formed trimeric spikes in negative stain EM was reduced by 30-40%. CryoEM imaging revealed that the remaining trimers had transitioned into the "closed" conformation (Fig. S1), and lipid was no longer bound in the RBD. This suggests that the locked-1 conformation is metastable at pH 7.4 and spontaneously transitions into the closed conformation. It is consistent with the general observation that most published ectodomain constructs (not stabilised by the x3 crosslink) adopt the closed/open conformations rather than the locked conformation. We incubated the now "closed" spike at pH 5.0 overnight before determining its structure by cryoEM. We found that low pH treatment had converted most of the closed spike to the locked-2 conformation (Fig. S1), without lipid bound in the RBD, consistent with the observations of Zhou et al. (Zhou et al., 2020). A comparison of the closed and locked-2 conformations suggests to us that H519 within the RBD and H49 within domain A, which are located close to where the RBD and NTD, respectively, contact the trimer core, may play a role in mediating pH-dependent structural changes (Fig. S6).
These observations demonstrate that although low pH can maintain the RBD in the locked conformation, low pH alone is not sufficient to revert the closed conformation back to the locked-1 conformation. They further demonstrate that the 614D spike, like the 614G spike, is able to adopt the locked-2 conformation. We speculate that breaking the interdigitation of residues in the 617-641 loop results in a locked-1 to closed transition and that this is a largely irreversible process. Under conditions, such as low pH, which favour transition from the closed back to the locked conformation, the complex interdigitation cannot refold, resulting in a trimer in the locked-2 conformation.

The functional role of the locked-1 and locked-2 structures.
The disulfide-bond engineered into the S-R/x3 construct stabilizes the S trimer in the locked-1 conformation. In this conformation F318, R634, Y636 and Y837 interdigitate to link domain D, the hinge connecting domain C and D, and the FPPR motif. This interaction bridges the three structural features and we hypothesise that it locks the trimer, preventing RBD opening. At neutral pH, this structure is metastable and transitions over time into the closed conformation, from which stochastic RBD opening can take place. Lowering the pH of the closed spike, which promotes the close-apposition of RBDs in the locked conformation, leads to reversion into the locked-2 conformation, perhaps because the complex interdigitation of residues in domain D in the locked-1 conformation cannot easily be reformed.
Removal of an electrostatic interaction between D614 and R634 in the D614G spike hinders formation of the interdigitating stabilising interaction in domain D. Therefore, the D614G spike adopts the locked-2 conformation, where domain D residues 617-641 fold into a helix-turn-helix motif similar to that observed in other beta-CoV spikes. This motif is located between the NTD and the hinge region leading to domain C and RBD. We speculate that the alternate conformations observed for domain D are part of a control mechanism regulating RBD opening and modulating receptor binding.
The absence of the interdigitating interaction network favours the transition to the closed conformation, as illustrated by the observation that the majority of D614G spike particles adopt the "closed" conformation. This is consistent with the observation that constructs bearing the D614G substitution are more susceptible to RBD opening, and provides a structural explanation for the change in conformational dynamics (Gobeil et al., 2021; Yurkovetskiy et al., 2020). A similar observation has also been made for full-length SARS-CoV-2 spike protein, where the D614G substitution also induces disordering of the FPPR in domain D and an increased number of open spikes.
Beta-coronaviruses have recently been shown to assemble in the low pH, high lipid environment of the ERGIC, and egress through the acidic environment of endosomes (Ghosh et al., 2020) (Fig. 4a). Given that low pH and lipid binding both favour locked conformations, we speculate that during virus assembly, the 614D and D614G spikes are in their respective locked-1 and locked-2 conformations, and that this provides a mechanism to prevent premature transition into open or post-fusion conformations during virus assembly. Once viruses are released into the neutral pH environment outside of the cell, over time the spike will transition to the closed conformation where the RBD can open and mediate binding to target cells (Fig. 4b). The D614G spike, which lacks the interactions in domain D that stabilise the RBD in the locked-1 conformation, is likely to exhibit different structural dynamics that may lead to more rapid transition to the closed and subsequently the open form. The resulting change in receptor binding, cell entry and immune presentation characteristics has presumably provided a transmission advantage leading to global dominance of the mutant virus (Korber et al., 2020).

Expression Constructs
Expression constructs were generated essentially as described in Xiong et al. (2020). Starting from the construct S-R (Xiong et al., 2020), introduction of cysteines to form the x3 disulfide and introduction of the D614G substitution were carried out using Q5 polymerase PCR with primers containing the desired substitutions, followed by In-Fusion HD (Takara Bio) assembly.

Protein Production and Purification
Proteins were expressed in Expi293 cells and purified by metal exchange chromatography exactly as described in Xiong et al. (2020). Except where otherwise indicated, proteins were flash frozen in liquid nitrogen and stored at -80°C.

Negative staining EM
3 µl of protein (~0.05 mg/ml) diluted in PBS buffer was deposited onto carbon-coated grids (EMS CF200-Cu) glow-discharged for 15 seconds at 25 mA in air. After 60 s incubation, excess protein was wicked away with filter paper. Grids were washed once in buffer, and stained twice in Nano-W stain (Nanoprobes) with blotting in between. The grids were air dried on filter paper and imaged using a Tecnai T12 Spirit operated at 120 kV. Micrographs were taken in Digital Micrograph (Gatan).

Cryo-EM
Grid preparation and image collection were performed for S-R/x3 and S-R/x3/D614G spike proteins essentially as described in Xiong et al. (2020). C-Flat 2/2 3C grids (Protochips) were glow-discharged for 45 seconds at 25 mA. 3 µl of freshly purified protein at 0.6 mg/ml supplemented with 0.01% octyl-glucoside (OG) was applied to the grids, which were plunge-frozen in liquid ethane using a Vitrobot (Thermo Fisher Scientific).
An aliquot of S-R/x3 freshly purified spike at 1.0 mg/ml was stored at 4˚C for 40 days. 10 µl of the stored protein was subjected to plunge-freezing at 1.0 mg/ml following the same procedure as for the freshly purified S trimers. Another 10 µl of the stored protein was incubated with 1 µl of citrate acid (pH 4.8) overnight and then plunge-frozen.
Grids were stored in liquid nitrogen and loaded into a Titan Krios electron microscope (Thermo Fisher Scientific) operated at 300 kV. Movies with 48 frames were collected with a Gatan K3 BioQuantum direct electron detector with the slit retracted. Three shots per hole were achieved with beam-image shift controlled in SerialEM-3.7.0 (Mastronarde, 2005). An accumulated dose of 50 electrons/Å² was acquired in counting mode at a magnification of 81,000×, corresponding to a calibrated pixel size of 1.061 Å/pixel. Detailed data acquisition parameters are summarized in Extended Data Table 1.
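As a rough sanity check on the sampling implied by these acquisition settings, the following minimal sketch computes the Nyquist limit for the calibrated pixel size and for 4x-binned data (the binning used for initial 3D classification). The function names are illustrative only and are not part of SerialEM or RELION.

```python
# Illustrative helpers (hypothetical names, not from any EM software package).

def nyquist_limit(pixel_size_angstrom: float) -> float:
    """Finest resolvable spacing (in Å) for a given pixel size: two pixels."""
    if pixel_size_angstrom <= 0:
        raise ValueError("pixel size must be positive")
    return 2.0 * pixel_size_angstrom


def binned_pixel_size(pixel_size_angstrom: float, bin_factor: int) -> float:
    """Effective pixel size (in Å) after binning by an integer factor."""
    if bin_factor < 1:
        raise ValueError("bin factor must be >= 1")
    return pixel_size_angstrom * bin_factor


if __name__ == "__main__":
    pixel_size = 1.061  # Å/pixel, calibrated value reported in the text
    print(f"Nyquist (unbinned): {nyquist_limit(pixel_size):.3f} Å")
    bin4 = binned_pixel_size(pixel_size, 4)
    print(f"bin4 pixel size:    {bin4:.3f} Å")
    print(f"Nyquist (bin4):     {nyquist_limit(bin4):.3f} Å")
```

The unbinned Nyquist limit of about 2.12 Å lies beyond the 2.6 Å reported for the locked S-R/x3 reconstruction, so that resolution is not limited by pixel sampling.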

CryoEM image processing
Real-time data processing was performed in RELION-3.1's Scheduler as described previously (Xiong, 2020). Motion correction, contrast transfer function (CTF) estimation, template particle picking and initial 3D classification were carried out while micrographs were being collected. An EM structure of the SARS-CoV-2 S protein in the open form was filtered to 20 Å resolution and used as a 3D reference for template picking. Initial 3D classification with an open S model was carried out at bin4 in batches of 500,000 particles to identify S protein trimers. Subsets of S trimers in the 3D classes that displayed clear secondary structures were pooled and subjected to one round of 2D classification cleaning. Subsequently, a second round of 3D classification was used to assess the ratio of closed and locked states (Fig. S1). Auto-refinement, Bayesian polishing and CTF refinement were performed iteratively on classified closed and locked subsets, respectively (Zivanov et al., 2019). Following the final round of 3D auto-refinement, map resolutions were estimated at the 0.143 criterion of the phase-randomization-corrected Fourier shell correlation (FSC) curve calculated between two independently refined half-maps multiplied by a soft-edged solvent mask. Final reconstructions were sharpened and locally filtered in RELION post-processing (Fig. S2). The estimated B-factors of each map are listed in Extended Data Table 1.
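To illustrate how a resolution is read off an FSC curve at the 0.143 criterion, the sketch below finds the first point at which a curve drops below the threshold and linearly interpolates the crossing frequency. This is a simplified stand-in under stated assumptions, not RELION's implementation: the function name and the input curve are hypothetical, and mask and phase-randomization corrections are not modelled.

```python
from typing import Optional, Sequence


def resolution_at_threshold(
    spatial_freqs: Sequence[float],  # in 1/Å, strictly increasing
    fsc_values: Sequence[float],     # FSC value in each shell
    threshold: float = 0.143,
) -> Optional[float]:
    """Return the resolution (Å) where the FSC first falls below `threshold`.

    The crossing is linearly interpolated between the two neighbouring
    shells. Returns None if the curve never drops below the threshold.
    """
    if len(spatial_freqs) != len(fsc_values):
        raise ValueError("inputs must have the same length")
    shells = list(zip(spatial_freqs, fsc_values))
    for (f0, c0), (f1, c1) in zip(shells, shells[1:]):
        if c0 >= threshold > c1:
            # Fraction of the way from shell 0 to shell 1 at the crossing.
            frac = (c0 - threshold) / (c0 - c1)
            f_cross = f0 + frac * (f1 - f0)
            if f_cross <= 0:
                return None  # crossing at zero frequency: no finite resolution
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Synthetic, made-up curve for demonstration only.
    freqs = [0.1, 0.2, 0.3, 0.4, 0.5]
    fsc = [1.0, 0.9, 0.5, 0.1, 0.0]
    print(f"Estimated resolution: {resolution_at_threshold(freqs, fsc):.2f} Å")
```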
The radiation damage caused fading of the disulphide bond between the two engineered cysteines in S-R/x3 and S-R/x3/D614G. In order to recover the bond density, EM maps of closed and locked S trimers were reconstructed from the first 4 frames in the movies of S-R/x3 and the first 5 frames of S-R/x3/D614G, respectively. Distinct densities of the disulphide bond were observed in the EM structures reconstructed from early exposed frames and are shown in Fig. 2.
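For orientation, the dose accumulated in the early frames can be estimated from the acquisition settings. The sketch below assumes the 50 electrons/Å² total dose is spread evenly over the 48 movie frames; that uniform-dose assumption, and the function name, are ours and are not stated in the text.

```python
def cumulative_dose(total_dose: float, n_frames: int, first_n: int) -> float:
    """Dose (e/Å²) accumulated over the first `first_n` frames.

    Assumes the total dose is distributed uniformly across all frames.
    """
    if n_frames <= 0 or not 0 < first_n <= n_frames:
        raise ValueError("require 0 < first_n <= n_frames")
    return total_dose * first_n / n_frames


if __name__ == "__main__":
    TOTAL_DOSE = 50.0  # e/Å², from the acquisition parameters
    N_FRAMES = 48      # frames per movie
    for label, n in (("S-R/x3", 4), ("S-R/x3/D614G", 5)):
        dose = cumulative_dose(TOTAL_DOSE, N_FRAMES, n)
        print(f"{label}: first {n} frames ~ {dose:.1f} e/Å²")
```

Under this assumption, the early-frame reconstructions correspond to roughly 4 to 5 electrons/Å², about a tenth of the full exposure.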

Model building and refinement
For the closed conformation structures, the SARS-CoV-2 S protein ectodomain structure (PDBID: 6ZOX (Xiong et al., 2020)) was fitted into the EM density as a starting model. For the locked conformations, S structures from our previous study (PDBID: 6ZP2, 6ZOZ (Xiong et al., 2020)) were used as starting models. Model building and adjustment were performed manually in Coot-0.9 (Emsley and Cowtan, 2004). Steric clashes and sidechain rotamer conformations were improved using the Namdinator web server (Kidmose et al., 2019). After further manual adjustment, the locked and closed structures were refined and validated in PHENIX-1.18.261 (Afonine et al., 2018) to good geometry. Refinement statistics are given in Extended Data Table 1.

[Figure legend, image processing pipeline] The pipeline is illustrated for S-R/x3, S-R/x3/D614G, S-R/x3 after 40 days, and the latter after transfer to pH 5.0 buffer. After automated picking, 3D and 2D classification steps were used to remove contaminating objects. 3D classification was then used to sort the data into locked and closed conformations.