Abstract
In December 2019, the first cases of a novel coronavirus infection were diagnosed in Wuhan, China. Due to international travel and human-to-human transmission, the virus spread rapidly inside and outside of China. Currently, there is no effective antiviral treatment for COVID-19; research efforts are therefore focused on the rapid development of vaccines and antiviral drugs. The SARS-CoV-2 Mpro protease constitutes one of the most attractive antiviral drug targets. To address this emerging problem, we have synthesized a combinatorial library of fluorogenic substrates with glutamine in the P1 position. We used it to determine the substrate preferences of the SARS-CoV and SARS-CoV-2 proteases, using natural amino acids and a large panel of unnatural amino acids. The results of our work provide a structural framework for the design of inhibitors as antiviral agents or diagnostic tests.
Introduction
In December 2019, a severe respiratory disease of unknown origin emerged in Wuhan, Hubei province, China.[1] Symptoms of the first patients were flu-like and included fever, cough and myalgia, but with a tendency to develop a potentially fatal dyspnea and acute respiratory distress syndrome.[1b] Genetic analysis confirmed a betacoronavirus as the causative agent. The virus was initially named 2019 novel coronavirus (2019-nCoV),[1-2] but shortly thereafter, it was renamed SARS-CoV-2.[3] By March 7, 2020, the WHO had registered >100,000 cumulative cases of coronavirus disease 2019 (COVID-19) in 65 countries, with >3400 deaths.[4]
Currently, there is no approved vaccine or treatment for COVID-19. Efforts are being made to characterize molecular targets that are pivotal for the development of anti-coronaviral therapies.[5] The main protease (Mpro, also known as 3CLpro) is one of the coronaviral non-structural proteins (Nsp5) designated as a potential target for drug development.[6] Mpro cleaves the viral polyproteins, generating twelve non-structural proteins (Nsp4-Nsp16), including the RNA-dependent RNA polymerase (RdRp, Nsp12) and helicase (Nsp13). Inhibition of Mpro would prevent viral replication and therefore constitutes one of the potential anti-coronaviral strategies.[6-7]
Due to the close phylogenetic relationship between SARS-CoV-2 and SARS-CoV,[2, 8] their main proteases share many structural and functional features. From the perspective of the design and synthesis of new Mpro inhibitors, a key feature of both enzymes is their ability to cleave the peptide bond following Gln. The SARS-CoV Mpro cleaves polyproteins within the Leu-Gln↓(Ser, Ala, Gly) sequence (↓ indicates the cleavage site), which appears to be a conserved pattern of this protease.[6a, 7, 9] The ability to hydrolyze peptide bonds after Gln residues is also observed for the main proteases of other coronaviruses[10] but is unknown for human enzymes. This observation, along with further studies on Mpro, can potentially lead to new broad-spectrum anti-coronaviral inhibitors with minimal side effects.[11]
In the present study, we applied the HyCoSuL (Hybrid Combinatorial Substrate Library) approach to determine the full substrate specificity profile of SARS-CoV Mpro and SARS-CoV-2 Mpro proteases. The use of natural and a large number of unnatural amino acids with diverse chemical structures allowed an in-depth characterization of the residue preference of the binding pockets within the active sites of the proteases. The results from library screening enabled us to design and synthesize ACC-labeled substrates with improved catalytic efficiency in comparison to a substrate containing only natural amino acids. Moreover, results from our studies clearly indicate that SARS-CoV Mpro and SARS-CoV-2 Mpro proteases exhibit overlapping substrate specificity. This knowledge can be applied in the design of chemical compounds for effective therapy of COVID-19.
Results and Discussion
To determine the SARS-CoV Mpro and SARS-CoV-2 Mpro substrate preferences, we applied a hybrid combinatorial substrate library (HyCoSuL) approach. The library consists of three sublibraries, each of which contains a fluorescent tag, ACC (7-amino-4-carbamoylmethylcoumarin), two fixed positions, and two varied positions containing an equimolar mixture of 19 amino acids (Mix) (P2 sublibrary: Ac-Mix-Mix-X-Gln-ACC, P3 sublibrary: Ac-Mix-X-Mix-Gln-ACC, P4 sublibrary: Ac-X-Mix-Mix-Gln-ACC, X = 19 natural and over 100 unnatural amino acids, Figure 1). We incorporated glutamine at the P1 position because the available crystal structures of SARS-CoV Mpro revealed that only a glutamine residue (and, at a very small number of cleavage sites, histidine) can occupy the S1 pocket of this enzyme.[9, 12] The imidazole of His163, located at the very bottom of the S1 pocket, is suitably positioned to interact with the Gln side chain. The Gln residue also forms two further interactions, with the main chain of Phe140 and the side chain of Glu166. The library screen revealed that SARS-CoV and SARS-CoV-2 Mpro display very similar substrate specificity; however, SARS-CoV-2 Mpro possesses broader substrate preferences at the P2 position (Figure 2). The most preferred amino acid at the P2 position is leucine for both proteases. SARS-CoV Mpro exhibits lower activity toward the other tested amino acids at this position (<30%), whereas the S2 pocket of SARS-CoV-2 Mpro can accommodate other hydrophobic residues, such as 2-Abz (54%), Phe(4-NO2) (50%), 3-Abz (50%), β-Ala (49%), Dht (46%), hLeu (43%), Met (41%), and Ile (37%) (amino acid structures are presented in Table S1, SI). Both enzymes prefer hydrophobic D and L amino acids and also positively charged residues at the P3 position; the best are Tle, D-Phe, D-Tyr, Orn, hArg, Dab, Dht, Lys, D-Phg, D-Trp, Arg, and Met(O)2. SARS-CoV and SARS-CoV-2 Mpro possess broad substrate specificity at the P4 position.
The most preferred are small aliphatic residues such as Abu, Val, Ala, and Tle, but other hydrophobic amino acids are also accepted. These findings can be partly explained by the available crystal structures of SARS-CoV Mpro in complex with inhibitors.[6b, 9, 12] The hydrophobic S2 subsite of SARS-CoV Mpro is larger than that of other coronavirus main proteases, which explains its less stringent specificity.[11] The S2 pocket can form hydrophobic interactions with P2 residues other than leucine. The S3 pocket of SARS-CoV Mpro is not very well defined, which is also reflected in our P3 substrate specificity profile. The S4 pocket can be occupied by small residues because of the crowded cavity formed by Leu167 and Pro168 at the bottom and Thr190 and Ala191 at the top wall.
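The sublibrary layout described above (one defined position X, an equimolar mixture at the other two variable positions, and Gln fixed at P1) can be sketched in a few lines of Python. This is only an illustration of the naming scheme; the function and constant names are hypothetical and are not part of the original work.

```python
# Illustrative sketch of the HyCoSuL tetrapeptide layout (names are hypothetical).
POSITIONS = ("P4", "P3", "P2")  # N- to C-terminal order; P1 is always Gln


def sublibrary_member(fixed_position: str, residue: str) -> str:
    """Return the substrate name with `residue` at `fixed_position` and Mix elsewhere."""
    if fixed_position not in POSITIONS:
        raise ValueError(f"unknown position: {fixed_position}")
    slots = [residue if pos == fixed_position else "Mix" for pos in POSITIONS]
    # Acetylated N-terminus, three variable slots, fixed P1 Gln, ACC reporter.
    return "-".join(["Ac", *slots, "Gln", "ACC"])


for position in POSITIONS:
    print(position, sublibrary_member(position, "X"))
```

Replacing "X" with a concrete residue code (for example "Leu" at P2) gives the name of a single well in the corresponding sublibrary.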
To validate the results from library screening, we designed and synthesized ACC-labeled substrates containing the most preferred amino acids in each position. Then, we measured the rate of substrate hydrolysis by each protease (Figure 3). The data clearly demonstrate that SARS-CoV Mpro and SARS-CoV-2 Mpro exhibit the same activity toward the tested substrates. The results are consistent with the HyCoSuL screening data. The most preferred substrate, Ac-Abu-Tle-Leu-Gln-ACC, is composed of the best amino acids in each position. Kinetic parameters toward SARS-CoV-2 Mpro were determined for the two best substrates (Ac-Abu-Tle-Leu-Gln-ACC, Ac-Thz-Tle-Leu-Gln-ACC) and for one containing the best recognized natural amino acids (Ac-Val-Lys-Leu-Gln-ACC) (Table 1). Because the substrates precipitated at the high concentrations required for the assay, kinetic parameters toward SARS-CoV Mpro could not be determined. Analysis of the kinetic parameters revealed that these three substrates differ in their kcat values, while their KM values are comparable.
In summary, we established substrate specificity profiles at the P4-P2 positions of the SARS-CoV Mpro and SARS-CoV-2 Mpro proteases using a combinatorial approach. Our data clearly demonstrate that these two enzymes display very similar substrate preferences. The information provided here can be used for the design of inhibitors and activity-based probes against SARS-CoV-2.
Materials and Methods
Reagents
The reagents used for solid-phase peptide synthesis were as follows: Rink Amide (RA) resin (particle size 100-200 mesh, loading 0.74 mmol/g), all Fmoc-amino acids, O-benzotriazole-N,N,N’,N’-tetramethyl-uronium-hexafluoro-phosphate (HBTU), 2-(1-H-7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium hexafluorophosphate (HATU), piperidine, diisopropylcarbodiimide (DICI) and trifluoroacetic acid (TFA), purchased from Iris Biotech GmbH (Marktredwitz, Germany); anhydrous N-hydroxybenzotriazole (HOBt) from Creosalus (Louisville, KY, USA); 2,4,6-collidine (2,4,6-trimethylpyridine), HPLC-grade acetonitrile, and triisopropylsilane (TIPS) from Sigma-Aldrich (Poznan, Poland); and N,N-diisopropylethylamine (DIPEA) from VWR International (Gdansk, Poland). N,N-dimethylformamide (DMF), dichloromethane (DCM), methanol (MeOH), diethyl ether (Et2O), acetic acid (AcOH), and phosphorus pentoxide (P2O5) were obtained from Avantor (Gliwice, Poland). Designed substrates were purified by HPLC on a Waters M600 solvent delivery module with a Waters M2489 detector system using a semipreparative Wide Pore C8 Discovery column. The solvent composition was as follows: phase A (water/0.1% TFA) and phase B (acetonitrile/0.1% TFA). The purity of each compound was confirmed with an analytical HPLC system using a Jupiter 10 μm C4 300 Å column (250 × 4.6 mm). The solvent composition was as follows: phase A (water/0.1% TFA) and phase B (acetonitrile/0.1% TFA); gradient, from 5% B to 95% B over a period of 15 min. The molecular weight of each substrate was confirmed by high-resolution mass spectrometry on a Waters LCT Premier XE with electrospray ionization (ESI) and a time-of-flight (TOF) module.
Enzyme preparation
Gene cloning and recombinant production of the SARS-CoV and SARS-CoV-2 Mpro are described elsewhere.[7, 13]
Combinatorial library synthesis
Synthesis of H2N-ACC-resin
ACC synthesis was carried out according to Maly et al.[14] To a glass reaction vessel, 1 eq (9.62 mmol, 13 g) of Rink AM resin was added and stirred gently once per 10 min in DCM for 1 h, then filtered and washed 3 times with DMF. Fmoc-group deprotection was performed using 20% piperidine in DMF (three cycles: 5, 5, and 25 min), and the resin was filtered and washed with DMF each time (six times). Next, 2.5 eq of Fmoc-ACC-OH (24.05 mmol, 10.64 g) was preactivated with 2.5 eq HOBt monohydrate (24.05 mmol, 3.61 g) and 2.5 eq DICI (24.05 mmol, 3.75 mL) in DMF and the slurry was added to the resin. The reaction was shaken gently for 24 hours at room temperature. After this time, the resin was washed four times with DMF and the reaction was repeated using 1.5 eq of the above reagents in order to improve the yield of ACC coupling to the resin. After 24 hours, the resin was washed with DMF and the Fmoc protecting group was removed using 20% piperidine in DMF (5, 5, and 25 min), after which the resin was filtered and washed with DMF (six times).
Synthesis of H2N-Gln(Trt)-ACC-resin
2.5 eq Fmoc-Gln(Trt)-OH (24.05 mmol, 14.69 g), 2.5 eq HATU (24.05 mmol, 9.15 g), and 2.5 eq collidine (24.05 mmol, 3.18 mL) in DMF were activated for 2 min and added to a filter cannula with 1 eq (9.62 mmol) H2N-ACC-resin, and the reaction was carried out for 24 h. Next, the resin was washed four times with DMF and the same reaction was performed again using 1.5 eq of the above reagents. After four DMF washes, the Fmoc protecting group was removed using 20% piperidine in DMF (5, 5, and 25 min). Subsequently, the resin was washed with DCM (3 times) and MeOH (3 times) and dried over P2O5.
The synthesis of P2, P3, and P4 sublibraries is exemplified in detail with the P2 sublibrary. The P2 sublibrary consisted of 137 compounds in which all of the natural amino acids (omitting cysteine) and a pool of unnatural amino acids were used at a defined position (in this case, the P2 position) and an isokinetic mixture of 19 amino acids (without cysteine; plus norleucine mimicking methionine) was coupled in the remaining positions (in the case of the P2 sublibrary, positions P3 and P4 were occupied by the isokinetic mixture). Equivalent ratios of amino acids in the isokinetic mixture were defined based on their reported coupling rates. A fivefold excess (over the resin load) of the mixture was used. For fixed positions, 2.5 eq of a single amino acid was used. All reactions were performed with the use of the coupling reagents DICI and HOBt. For P2 coupling, the synthesis of the library was performed using a MultiChem 48-well synthesis apparatus (FlexChem from SciGene, Sunnyvale, CA, USA). To each well of the reaction apparatus, 1 eq of dry H2N-Gln(Trt)-ACC-resin (0.059 mmol, 80 mg) was added and stirred gently for 30 minutes in DCM, and then washed four times with DMF. In separate Eppendorf tubes, 2.5 eq (0.15 mmol) Fmoc-P2-OH was preactivated with 2.5 eq HOBt (0.15 mmol, 22.5 mg) and 2.5 eq DICI (0.15 mmol, 23.55 μL) in DMF.
Next, preactivated amino acids were added to wells of the apparatus containing H2N-Gln(Trt)-ACC-resin, followed by 3 h of agitation at room temperature. Then, the reaction mixture was filtered, washed with DMF (4 times), and the ninhydrin test was carried out in order to confirm P2-amino acid coupling. Subsequently, Fmoc protecting groups were removed with the use of 20% piperidine in DMF (5, 5, and 25 min). For P3 and P4 position coupling, an isokinetic mixture for 48 portions was prepared from 18 Fmoc-protected natural amino acids (omitting cysteine; plus norleucine mimicking methionine; 19 amino acids in total). Next, 5 eq of isokinetic mixture, 5 eq HOBt (14.16 mmol, 2.13 g), and 5 eq DICI (14.16 mmol, 2.22 mL) were diluted in DMF and preactivated for 3 min. The activated isokinetic mixture was added to each of 48 wells containing 1 eq of H2N-P2-Gln(Trt)-ACC-resin. After 3 h of gentle agitation, the slurry was filtered off and washed with DMF (4 times). A ninhydrin test was carried out and the Fmoc protecting group was removed using 20% piperidine in DMF (5, 5, and 25 min). The same procedure was applied for the remaining compounds. The isokinetic mixture was added to prepare the P4 position in the same manner as for the P3 position. In the last step of the synthesis, N-terminus acetylation was performed; to prepare the mixture for 48 compounds, 5 eq of AcOH (14.16 mmol, 807 μL), 5 eq HBTU (14.16 mmol, 5.37 g), and 5 eq DIPEA (14.16 mmol, 2.44 mL) in ∼45 mL of DMF were added to a 50-mL falcon tube. After gentle stirring for 1 min, the mixture (∼800 μL) was added to each well in the reaction apparatus, containing the H2N-Mix-Mix-P2-Gln(Trt)-ACC-resin, followed by gentle agitation for 30 min. Next, the resin was washed six times with DMF, three times with DCM, three times with MeOH, and dried over P2O5. 
After completing the synthesis, peptides were cleaved from the resin with a mixture of cold TFA:TIPS:H2O (%, v/v/v 95:2.5:2.5; 2 mL/well; 2 hours, shaking once per 15 min). The solution from each well was collected separately and the resin was washed once with a portion of fresh cleavage solution (1 mL), followed by the addition of diethyl ether (Et2O, 14 mL) to the falcon tubes containing the peptide solutions. After precipitation (30 min at −20°C), the mixture was centrifuged and washed again with Et2O (5 mL). After centrifugation, the supernatant was removed and the remaining white precipitate was dissolved in ACN/H2O (v/v, 3/1) and lyophilized. The products were dissolved in DMSO to a final concentration of 10 mM and used without further purification. The P3 and P4 sublibraries were synthesized in the same manner as described above, by coupling fixed amino-acid residues to the P3 position (isokinetic mixture coupled to P2 and P4) or to the P4 position (isokinetic mixture coupled to P2 and P3).
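The reagent amounts quoted in the synthesis protocol follow from the resin loading (0.74 mmol/g) and the stated number of equivalents. The short Python sketch below reproduces that arithmetic for the bulk resin batch and for the 48-well plate; the function names are hypothetical, and rounding the per-well scale to 0.059 mmol is an assumption made here to mirror the values given in the text.

```python
# Stoichiometry sketch; helper names are hypothetical, numbers come from the protocol.
RESIN_LOADING_MMOL_PER_G = 0.74  # Rink Amide resin loading stated under Reagents


def resin_mmol(resin_mass_g: float) -> float:
    """Reactive sites (mmol) in a given mass of resin."""
    return resin_mass_g * RESIN_LOADING_MMOL_PER_G


def reagent_mmol(scale_mmol: float, equivalents: float, portions: int = 1) -> float:
    """Amount of reagent (mmol) for `portions` reactions at `scale_mmol` each."""
    return scale_mmol * equivalents * portions


bulk_scale = resin_mmol(13.0)                   # about 9.62 mmol for 13 g of resin
bulk_coupling = reagent_mmol(bulk_scale, 2.5)   # about 24.05 mmol per reagent
well_scale = round(resin_mmol(0.080), 3)        # 80 mg per well, rounded to 0.059 mmol
well_coupling = reagent_mmol(well_scale, 2.5)   # 0.1475 mmol, reported as 0.15 mmol
plate_mixture = reagent_mmol(well_scale, 5, portions=48)  # about 14.16 mmol

print(f"bulk scale: {bulk_scale:.2f} mmol, 2.5 eq: {bulk_coupling:.2f} mmol")
print(f"per well: {well_scale:.3f} mmol, 2.5 eq: {well_coupling:.4f} mmol")
print(f"5 eq for 48 wells: {plate_mixture:.2f} mmol")
```

Converting these amounts to masses or volumes additionally requires the molecular weight and density of each reagent, which are not restated here.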
Library screening
Hybrid combinatorial substrate library screening was performed using a spectrofluorometer (Molecular Devices Spectramax Gemini XPS) in 384-well plates (Corning). The assay conditions were as follows: 1 μL of substrate and 49 μL of enzyme, which was incubated at 37°C for 10 min in assay buffer (20 mM Tris, 150 mM NaCl, 1 mM EDTA, 1 mM DTT, pH 7.3). The final substrate concentration was 100 μM and the final enzyme concentrations were 1 μM for SARS-CoV Mpro and 0.6 μM for SARS-CoV-2 Mpro. The release of ACC was measured for 45 min (λex = 355 nm, λem = 460 nm) and the linear part of each progress curve was used to determine the substrate hydrolysis rate. Substrate specificity profiles were established by setting the highest hydrolysis rate (in relative fluorescence units per second, RFU/s) at each position to 100% and scaling the other values accordingly.
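The two data-reduction steps described above, taking the slope of the linear part of a progress curve and scaling each position so that the best residue equals 100%, can be sketched as follows. This is a minimal illustration, not the analysis software used in the study: the function names are hypothetical, the caller is assumed to pass only the points from the linear region, and the numbers in the usage lines are placeholders rather than measured data.

```python
# Data-reduction sketch; function names and example numbers are hypothetical.
def initial_rate(times_s, rfu):
    """Least-squares slope (RFU/s) of fluorescence versus time.

    The caller is expected to supply only points from the linear region.
    """
    n = len(times_s)
    if n < 2 or n != len(rfu):
        raise ValueError("need at least two matching time/fluorescence points")
    mean_t = sum(times_s) / n
    mean_f = sum(rfu) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, rfu))
    return sxy / sxx


def normalize_profile(rates):
    """Scale rates for one position so that the highest rate is 100%."""
    top = max(rates.values())
    if top <= 0:
        raise ValueError("no substrate was hydrolyzed at this position")
    return {residue: 100.0 * rate / top for residue, rate in rates.items()}


# Placeholder values only, chosen to make the arithmetic easy to follow.
slope = initial_rate([0, 10, 20, 30], [5, 25, 45, 65])
profile = normalize_profile({"res_a": 4.0, "res_b": 1.0, "res_c": 2.0})
print(slope, profile)
```

Normalizing within each position keeps the P2, P3, and P4 profiles comparable in shape even though the absolute rates of the three sublibraries differ.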
Individual substrate synthesis
ACC-labeled substrates were synthesized on a solid support according to the solid-phase peptide synthesis method described elsewhere.[15] In brief, Fmoc-ACC-OH (2.5 eq) was attached to a Rink-amide resin using HOBt (2.5 eq) and DICI (2.5 eq) in DMF as coupling reagents. Then, the Fmoc protecting group was removed using 20% piperidine in DMF (three cycles: 5, 5, and 25 min). Fmoc-Gln(Trt)-OH (2.5 eq) was coupled to the H2N-ACC-resin using HATU (2.5 eq) and 2,4,6-collidine (2.5 eq) in DMF. After Fmoc group removal, the Fmoc-P2-OH amino acid (2.5 eq) was attached (HOBt and DICI (2.5 eq) in DMF). Amino acids in the P3 and P4 positions were coupled in the same manner. The free N-terminus was acetylated using HBTU, AcOH and DIPEA in DMF (5 eq of each reagent). Then, the resin was washed five times with DMF, three times with DCM and three times with MeOH, and dried over P2O5. Substrates were removed from the resin with a mixture of TFA/TIPS/H2O (% v/v/v, 95:2.5:2.5), precipitated in Et2O, purified by HPLC and lyophilized. The purity of each substrate was confirmed using analytical HPLC. Each substrate was dissolved in DMSO at a final concentration of 10 mM and stored at −80°C until use.
Kinetic analysis of substrates
Substrate screening was carried out in the same manner as the library assay. The substrate concentration was 5 μM, and SARS-CoV Mpro and SARS-CoV-2 Mpro were each used at 0.3 μM. Substrate hydrolysis was measured for 30 min using the following wavelengths: λex = 355 nm, λem = 460 nm. The experiment was repeated three times. Results are presented as mean values with standard deviations. Kinetic parameters were assayed in 96-well plates (Corning). Wells contained 80 μL of enzyme in assay buffer (0.074-0.1 μM SARS-CoV-2 Mpro) and 20 μL of substrate at eight different concentrations ranging from 58.5 μM to 1200 μM. ACC liberation was monitored for 30 min (λex = 355 nm, λem = 460 nm). Each experiment was repeated at least three times. Kinetic parameters were determined using the Michaelis-Menten equation and GraphPad Prism software.
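For readers who want to see how kinetic parameters follow from rate data, the sketch below fits the Michaelis-Menten equation, v = Vmax[S]/(KM + [S]), by the Hanes-Woolf linearization ([S]/v = [S]/Vmax + KM/Vmax). This is a deliberately simple substitute for illustration and is not the nonlinear regression performed in GraphPad Prism. The function names are hypothetical, and the example uses noise-free synthetic rates generated from arbitrary parameters, not experimental values.

```python
# Hanes-Woolf sketch; names, concentrations, and parameters are hypothetical.
def michaelis_menten(s, vmax, km):
    """Reaction rate at substrate concentration `s`."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (vmax, km) from the straight line S/v = S/vmax + km/vmax."""
    if len(substrate) != len(rates) or len(substrate) < 2:
        raise ValueError("need at least two matching concentration/rate points")
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                      # equals 1 / vmax
    intercept = mean_y - slope * mean_x    # equals km / vmax
    return 1.0 / slope, intercept / slope


def kcat(vmax, enzyme_conc):
    """Turnover number; vmax and enzyme_conc must share concentration units."""
    return vmax / enzyme_conc


# Synthetic, noise-free rates from arbitrary parameters (not experimental data).
true_vmax, true_km = 2.0, 300.0
conc = [75.0, 150.0, 300.0, 600.0, 1200.0]
synthetic_rates = [michaelis_menten(s, true_vmax, true_km) for s in conc]
fit_vmax, fit_km = fit_hanes_woolf(conc, synthetic_rates)
print(fit_vmax, fit_km)
```

With real fluorescence data, rates in RFU/s would first have to be converted to molar units with an ACC calibration before kcat can be calculated, and linearized fits weight experimental error differently from direct nonlinear regression.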
Author contributions
M.D. and W.R. designed the research; W.R., K.G. and M.Z. performed the research and collected data; R.H., X.S. and L.Z. contributed enzymes; M.D. and W.R. analyzed and interpreted the data and wrote the manuscript; and all authors critically revised the manuscript.
Competing interest
The authors declare no competing financial interest.
Acknowledgments
The Drag laboratory is supported by the National Science Centre in Poland and the “TEAM/2017-4/32” project, which is conducted within the TEAM programme of the Foundation for Polish Science cofinanced by the European Union under the European Regional Development Fund. Work in the Hilgenfeld laboratory was supported by the German Center for Infection Research (DZIF), TTU 01, grant # 8011801806. W.R. is a beneficiary of a START scholarship from the Foundation for Polish Science.