A novel conformational state for SARS-CoV-2 main protease

The SARS-CoV-2 main protease (Mpro) has a pivotal role in mediating replication and transcription of the coronavirus genome, making it a promising target for drugs against Covid-19. Here we present a crystal structure of Mpro disclosing new structural features of key regions of the enzyme. We show that the oxyanion loop, involved in substrate recognition and enzymatic activity, can adopt a new conformation, which is stable and significantly different from the known ones. In this new state the S1 subsite of the substrate binding region is completely reshaped and a new cavity near the S2′ subsite is created. This new structural information expands the knowledge of the conformational space available to Mpro, paving the way for the design of novel classes of inhibitors that specifically target this unprecedented binding site conformation, thus enlarging the chemical space for urgently needed antiviral drugs against Covid-19.


Introduction
Mpro is a cysteine peptidase essential for the replication of SARS-CoV-2, with 96% sequence identity and a very similar 3D structure to SARS-CoV Mpro (Douangamath et al., 2020; Jin et al., 2020a, 2020b; Kneller et al., 2020c; Yang et al., 2003; Zhang et al., 2020). Mpro is involved in the proteolytic processing of the large polyproteins pp1a and pp1ab, which releases the individual non-structural proteins (Snijder et al., 2016). Mpro forms a homodimer that is required for proper catalytic activity (Anand et al., 2002). A key role is played by the "N-finger": the N-terminal tail of one protomer interacts with and stabilizes the binding site (S1 subsite) of the other protomer (Verschueren et al., 2008). The N-finger and the C-terminus are the result of the autoproteolytic processing of Mpro. In the mature enzyme both termini of one protomer face the active site of the other. The enzymatic cleavage of the substrate occurs at the C-terminal side of a conserved glutamine in position P1 of the consensus sequence, with His41 and Cys145 as the catalytic dyad (Anand et al., 2002). An important structural element for catalysis is the oxyanion loop, residues 138-145, which lines the binding site for glutamine P1 and is thought to be involved in the stabilization of the tetrahedral acyl transition state (Anand et al., 2002; Lee et al., 2020; Verschueren et al., 2008). In the vast majority of structures, the oxyanion loop adopts the same conformation (here called "canonical") (Douangamath et al., 2020; Jin et al., 2020a, 2020b; Zhang et al., 2020), whose mobility has been described by room-temperature X-ray crystallography (Kneller et al., 2020a, 2020b). In a few cases it can also exist in a different "collapsed" conformation, considered catalytically incompetent, often associated with modifications in the N- or C-termini or with acidic pH values (Verschueren et al., 2008; Yang et al., 2003).

Results and Discussion
In a campaign to gain structural insights into SARS-CoV-2 Mpro, we analyzed 27 different datasets to determine the crystal structure of Mpro in complex with inhibitors, namely masitinib, manidipine and bedaquiline (Ghahremanpour et al., 2020). Among these, as "positive" controls (i.e. structures already known), there were ligand-free Mpro and Mpro in complex with the inhibitor boceprevir (Fu et al., 2020). Almost all tested crystals were monoclinic (space group C2, Table SI), isomorphous to the crystals of the free enzyme 6Y2E and to most of the deposited Mpro structures, indicating the same crystal contacts. After successful molecular replacement (MR) and a first round of refinement, in most cases the electron density was clearly visible along the entire sequence, indicating a protein matrix with a structure very similar to the search models 6Y2E and 5REL (including the complex with boceprevir). However, in a significant number of cases, around 10, the electron density was of much lower quality or even absent in particular portions of the protein, namely residues 139-144 of the oxyanion loop, residues 1-3 of the N-finger and the side chain of His163 in the S1 specificity subsite, all of which are part of the active site. To cope with the known problem of molecular replacement bias and to correctly rebuild the ambiguous parts, we performed new MR runs using as search model structure 6Y2E deprived of residues 139-144 and 1-3, and with an alanine instead of a histidine at position 163 (to remove the His side chain). This allowed us to confirm perturbations in the conformation of the selected areas for 10 structures, while in the remaining cases clear electron densities were visible, with the oxyanion loop unambiguously in the canonical conformation (Fig. S1 and S2). In some cases, the electron density was so poor that tracing the chain was very problematic, and it was not possible to reliably rebuild the mobile zones in their entirety (Fig. S1b).
For four structures it was possible to efficiently model residues 139-144, 1-3 and the side chain of His163 in "new" conformations, different from the "canonical" and "collapsed" ones (Fig. S1c). In summary, we found three different conformational states for the oxyanion loop: canonical (Fig. S1a), flexible (i.e. with poor electron density, Fig. S1b) and, strikingly, a new state (Fig. S1c), clearly different from the canonical and the collapsed ones (Fig. S3). All structures, including the one with the oxyanion loop in the new state (hereinafter called "new-Mpro"), refer to a correctly autoprocessed and functional protein, produced and crystallized with procedures similar to those of canonical 6Y2E.
Figure caption (fragment): [...] yellow) can still interact with Glu166-Oε and Phe140-CO, even if with a different geometry. His163 is no longer available for binding but can be replaced by His172, which moves towards the S1 subsite. The new cavity near the S2′ subsite is indicated by a red asterisk.
The most striking property of new-Mpro is the different conformational state of the oxyanion loop, characterized by two consecutive β-turns with hydrogen bonds between Ser139-CO and Asn142-NH and between Leu141-CO and Ser144-NH (Fig. 2a). The loop is stabilized by other hydrogen bonds. In the new conformation Asn142-Cα and the side chain of Phe140 move away from their canonical positions by 9.8 Å and around 7.5 Å, respectively (Fig. 2b). Notably, Gly143-NH, assumed to stabilize the tetrahedral oxyanion intermediate during catalysis (Anand et al., 2002; Lee et al., 2020; Verschueren et al., 2008), is displaced by 8.8 Å, raising the possibility that this new conformation is catalytically incompetent. However, the position of the catalytic dyad is not altered (Fig. S4), with the Cys145 side chain in a double conformation.
Figure caption (fragment): [...] the His163 side chain is at 1.2 Å from the new position of Gly143-CO. Note also the movement of His172. b, the new oxyanion loop of one protomer pushes away residues 1´-3´ of the other protomer; however, the key salt bridge between Arg4´ and Glu290 is conserved. c, overall superposition of canonical and new-Mpro shows that, besides the oxyanion loop (red ellipsoid), major differences are located in the N-finger and the C-terminal tail (not visible in new-Mpro).
The superposition of the new conformation with the canonical one in the complex with the acyl-intermediate 7KHP (Lee et al., 2020) does not show evident steric clashes for the substrate, indicating that new-Mpro could bind the P1 glutamine through Glu166-Oε and Phe140-CO, even if with a different geometry (Fig. 2c). Although His163 is no longer available for binding, as it rotates away to avoid steric clashes with Gly143-CO (Fig. 3a), it can be replaced by His172, which moves towards the S1 subsite (Fig. 3a and 2c). As a consequence of the new oxyanion loop conformation of one protomer, residues 1-3 of the N-finger of the other protomer (protomer´) move away (Fig. 3b and 3c), with Gly2´-CO at 3.2 Å from Ser139-NH. Remarkably, Arg4´ does not move and the inter-protomer salt bridge with Glu290 (important for catalytic activity and dimer formation) is still present (Fig. 3b). Another characteristic of the new structure is the destabilization of the C-terminal tail, whose electron density is no longer visible from residue 301 onwards, indicating high flexibility. This is due to the rearrangement of the interactions between the oxyanion loop of one protomer and the N-finger and C-terminal portion of the other (Fig. 4).
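The displacements and contact distances quoted above (for example 9.8 Å for Asn142-Cα or 3.2 Å between Gly2´-CO and Ser139-NH) are Euclidean distances between atoms once the two models share a common frame. The following is a minimal sketch of that bookkeeping, assuming the canonical and new models have already been superposed; the function names and the coordinates are hypothetical placeholders, not values taken from the deposited structures.

```python
import math

def distance(a, b):
    """Euclidean distance (in Å) between two (x, y, z) points."""
    return math.dist(a, b)

def displacement_table(canonical, new):
    """Per-atom displacement for atom labels present in both models.

    Both inputs map an atom label to Cartesian coordinates in Å and are
    assumed to be expressed in the same (already superposed) frame.
    """
    return {
        label: round(distance(canonical[label], new[label]), 1)
        for label in canonical
        if label in new
    }

# Hypothetical coordinates for illustration only.
canonical = {"Asn142-CA": (0.0, 0.0, 0.0), "Gly143-N": (1.0, 2.0, 2.0)}
new_state = {"Asn142-CA": (3.0, 4.0, 0.0), "Gly143-N": (1.0, 2.0, 2.0)}

shifts = displacement_table(canonical, new_state)
print(shifts)
```

With real models, the coordinate dictionaries would be filled from the two superposed coordinate files; the choice of superposition (all Cα atoms or a subset) affects the numbers and is not specified here.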
Notably, the new conformation of the oxyanion loop generates a new cavity near position S2′, as evident from the comparison of the new structure and the SARS-CoV-2 acyl-enzyme 7KHP (Lee et al., 2020).
This new structure was derived from crystals obtained with Mpro pre-incubated with the inhibitors masitinib, manidipine or bedaquiline; however, in no case was electron density indicating the presence of the inhibitors detected. This can be explained by the reported moderate-to-low potency (IC50 in the range 2.5-19 µM) (Drayman et al., 2020; Ghahremanpour et al., 2020) and by the very low aqueous solubility of the molecules (when inhibitors in 100% DMSO were added to the protein solution [...]
[...] whether Mpro proteolytic recognition is based on structural selection or on substrate-induced subsite cooperativity (Behnam, 2021). Also intriguing is the possibility that the remodeling of the S2´ subsite is correlated with the high amino acid variation in position P2´ of the non-structural protein (nsp) cleavage sites of SARS coronaviruses, Mpro autoprocessing included (Behnam, 2021).

Protein expression and purification
The plasmid PGEX- [...] in buffer A to remove the GST-tagged PreScission protease, the His-tag, and the uncleaved protein.
Fractions containing the target protein at high purity were pooled, concentrated to 25 mg/ml and flash-frozen in liquid nitrogen for storage in small aliquots at -80 °C.

Protein characterization and enzymatic kinetics
Correctness of the Mpro DNA sequence was verified by sequencing the expression plasmid. The [...] Crystals appeared overnight and finished growing in less than 48 h after the crystallization drops were prepared. In the case of co-crystallization, Mpro was incubated for 16 h at 8 °C with a 13-fold molar excess of inhibitor (final DMSO concentration 5%). After incubation a white precipitate appeared and the solutions were cleared by centrifugation at 16,000 × g; the protein was then crystallized under the same conditions described for the apo form. For data collection, crystals were fished from the drops, cryo-protected with a quick dip into 30% PEG 400 (with 5 mM inhibitor in the case of co-crystals) and flash-cooled in liquid nitrogen. Crystals were monoclinic (space group C2, Table SI), isomorphous to the crystals of the free enzyme 6Y2E, with one monomer in the asymmetric unit, the functional dimer being formed by the crystallographic twofold axis.

Structure determination and refinement
Data collection was performed at ESRF, beamlines ID23-2 and ID23-1. Diffraction data integration and scaling were performed with XDS (Kabsch, 2010), data reduction and analysis with Aimless (Evans and Murshudov, 2013). Initially, structures were solved by Molecular Replacement (MR) with Phaser (McCoy et al., 2007) from Phenix (Liebschner et al., 2019), using as search models structures 6Y2E and 5REL (Mpro in complex with PCM-0102340) (Douangamath et al., 2020). To limit MR model bias in critical zones (namely residues 139-144, 1-3 and the side chain of His163) we then performed new MR runs using as search model structure 6Y2E without residues 139-144 and 1-3, and with an alanine instead of a histidine at position 163. Only in the co-crystallization experiments with boceprevir was electron density for the ligand clearly visible from the beginning of the refinement (Fig. S2), and the 3 final structures, modelled from residue 1 to 306 (for comparison with the "new" structure, modelled up to residue 301), are virtually identical to those deposited in the PDB (Fu et al., 2020). In all other cases, no electron density indicating the presence of the inhibitors masitinib, manidipine or bedaquiline in the active site (or elsewhere) was detectable. For four structures it was possible to efficiently model residues 139-144, 1-3 and the side chain of His163 in "new" conformations. The final structures were obtained by alternating cycles of manual refinement with Coot (Emsley and Cowtan, 2004) and automatic refinement with phenix.refine. The final electron density for the new oxyanion loop conformation and the N-finger is reported in Fig. 1. At the end, the model was submitted to ensemble refinement (Burnley et al., 2012) in Phenix with default parameters. Statistics on data collection and refinement are reported in Table SI.
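The bias-limiting search model described above (6Y2E without residues 139-144 and 1-3, and with alanine at position 163) can be illustrated with a short script. This is only a sketch under stated assumptions: the source does not say which tool was used for this step, the function names `trim_search_model` and `atom_line` are hypothetical, and the example operates on a tiny synthetic PDB fragment rather than on the real 6Y2E file. It relies on the fixed-column layout of PDB `ATOM` records.

```python
# Atoms retained when a residue is truncated to alanine.
ALA_ATOMS = {"N", "CA", "C", "O", "CB"}

def trim_search_model(pdb_text, omit_ranges=((1, 3), (139, 144)), to_ala=(163,)):
    """Drop selected residue ranges and truncate selected residues to Ala.

    Only ATOM records are edited; every other record (HETATM waters,
    CRYST1, ...) is passed through unchanged. Serial numbers are not
    renumbered.
    """
    out = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            resseq = int(line[22:26])      # residue number, columns 23-26
            atom = line[12:16].strip()     # atom name, columns 13-16
            if any(lo <= resseq <= hi for lo, hi in omit_ranges):
                continue
            if resseq in to_ala:
                if atom not in ALA_ATOMS:
                    continue               # remove side-chain atoms beyond CB
                line = line[:17] + "ALA" + line[20:]  # rename residue
        out.append(line)
    return "\n".join(out) + "\n"

def atom_line(serial, name, resname, resseq):
    """Build a synthetic ATOM record (coordinates are dummy zeros)."""
    return "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, " " + name, resname, resseq, 0.0, 0.0, 0.0, 1.0, 20.0)

# Synthetic fragment for illustration only.
model = "\n".join([
    atom_line(1, "N", "GLY", 2),
    atom_line(2, "CA", "GLY", 2),
    atom_line(3, "CA", "SER", 139),
    atom_line(4, "N", "HIS", 163),
    atom_line(5, "CA", "HIS", 163),
    atom_line(6, "CB", "HIS", 163),
    atom_line(7, "CG", "HIS", 163),
    atom_line(8, "ND1", "HIS", 163),
    atom_line(9, "CA", "HIS", 164),
    atom_line(10, "CG", "HIS", 164),
])
trimmed = trim_search_model(model)
kept = trimmed.splitlines()
print(trimmed)
```

Omitting the ambiguous atoms from the search model means that any density appearing for them after MR cannot be an echo of the model, which is the rationale given in the text.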

Molecular Modeling
Molecular Dynamics trajectories were collected on a heterogeneous NVIDIA GPU cluster composed of 20 GPUs whose models range from GTX1080 to RTX2080Ti. [...] Research," n.d.) structure preparation tool. At first, the functional unit of the protease (the dimeric form) was restored by applying a crystallographic symmetry transformation to each asymmetric unit.
Residues with alternate conformations were assigned to the highest-occupancy alternative. The last 6 residues of the non-canonical structures were added using the MOE Loop Modeler tool. The MOE Protonate3D tool was used to assign the most probable protonation state of each residue (pH 7.4, T = 310 K, i.f. = 0.154). Finally, ions and every co-crystallized molecule except water were removed. The system setup for the MD simulations was carried out using the tleap software implemented in the AmberTools14 (Case et al., 2005) suite. AMBER ff14SB (Maier et al., 2015) was adopted for system parametrization and partial charge assignment. Protein structures were explicitly solvated in a rectangular prism TIP3P (Jorgensen et al., 1983) periodic water box whose borders were placed at a distance of 15 Å from any protein atom. Na+ and Cl- ions were added to neutralize the system and to reach a salt concentration of 0.154 M. Molecular Dynamics simulations were then performed using the ACEMD3 (Harvey et al., 2009) software, which is based on the OpenMM 7.4.2 (Eastman et al., 2017) engine. At first, 1000 steps of energy minimization were executed using the conjugate-gradient algorithm. Then, a two-step equilibration procedure was carried out: the first step consisted of 1 ns of canonical ensemble (NVT) simulation with 5 kcal mol⁻¹ Å⁻² harmonic positional constraints applied to each protein atom, while the second consisted of 1 ns of isothermal-isobaric (NPT) simulation with 5 kcal mol⁻¹ Å⁻² harmonic positional constraints applied only to protein alpha carbons. The production phase consisted of three independent MD replicas for each protein conformation. Each simulation had a duration of 1 µs and was performed using the NVT ensemble at a constant temperature of 310 K with a timestep of 2 fs.
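As a rough check on the ion setup described above, the number of Na+/Cl- pairs needed for a given salt concentration follows from N = c · N_A · V. The sketch below is an estimate only: the box dimensions and the protein net charge are hypothetical (neither is given in the text), the full box volume is used instead of the solvent-accessible volume, and the function names are illustrative rather than part of tleap or ACEMD3.

```python
AVOGADRO = 6.02214076e23           # particles per mole
LITRES_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L

def salt_pairs(conc_molar, box_dims_angstrom):
    """Number of cation/anion pairs giving conc_molar in the whole box."""
    x, y, z = box_dims_angstrom
    volume_litres = x * y * z * LITRES_PER_CUBIC_ANGSTROM
    return round(conc_molar * AVOGADRO * volume_litres)

def ion_counts(conc_molar, box_dims_angstrom, net_charge):
    """Return (n_Na, n_Cl): salt pairs plus counter-ions for neutrality."""
    pairs = salt_pairs(conc_molar, box_dims_angstrom)
    n_na = pairs + max(-net_charge, 0)  # extra Na+ if the solute is negative
    n_cl = pairs + max(net_charge, 0)   # extra Cl- if the solute is positive
    return n_na, n_cl

# Hypothetical 100 x 100 x 100 Å box and a hypothetical net charge of -8.
n_na, n_cl = ion_counts(0.154, (100.0, 100.0, 100.0), net_charge=-8)
print(n_na, n_cl)
```

The actual counts depend on the real box size produced by the 15 Å padding and on the protonation states assigned during preparation.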
For both the equilibration and the production stage, the temperature was kept constant by a Langevin thermostat. During the second step of the equilibration stage, the pressure was maintained at a fixed value of 1 bar. For the analysis, MD trajectories were aligned using the protein α-carbon atoms of the first trajectory frame as a reference, wrapped into an image of the system under periodic boundary conditions (PBC), stripped of ions and water molecules, and saved with a 200 ps interval between frames using Visual Molecular Dynamics 1.9.2 (Humphrey et al., 1996).
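The production settings translate into simple step and frame counts. The arithmetic below is a sketch using the values quoted in the text (1 µs per replica, 2 fs timestep, 200 ps between saved frames, three replicas per conformation); the function name is hypothetical, and whether the initial frame is counted is a convention that may shift the totals by one.

```python
FS_PER_PS = 1_000
FS_PER_US = 1_000_000_000

def production_plan(length_us=1, timestep_fs=2, save_interval_ps=200, replicas=3):
    """Integration steps and saved frames implied by the production settings."""
    steps = length_us * FS_PER_US // timestep_fs            # steps per replica
    stride = save_interval_ps * FS_PER_PS // timestep_fs    # steps between saved frames
    frames = steps // stride                                # frames per replica
    return {
        "steps_per_replica": steps,
        "steps_between_frames": stride,
        "frames_per_replica": frames,
        "frames_per_conformation": frames * replicas,
    }

plan = production_plan()
print(plan)
```

Each replica therefore corresponds to 500,000,000 integration steps and 5,000 analysis frames, or 15,000 frames per conformation across the three replicas.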