Biophysical characterization of the SARS-CoV-2 spike protein binding with the ACE2 receptor and implications for infectivity

SARS-CoV-2 is a novel, highly virulent pathogen which gains entry to human cells by binding with the cell surface receptor angiotensin-converting enzyme 2 (ACE2). We computationally contrasted the binding interactions between human ACE2 and the coronavirus spike protein receptor binding domain (RBD) of the 2002 epidemic-causing SARS-CoV-1, SARS-CoV-2, and bat coronavirus RaTG13 using the Rosetta energy function. We find that the RBD of the spike protein of SARS-CoV-2 is highly optimized to achieve very strong binding with human ACE2 (hACE2), which is consistent with its enhanced infectivity. SARS-CoV-2 forms the most stable complex with hACE2 compared to SARS-CoV-1 (23% less stable) or RaTG13 (11% less stable) while occupying the greatest number of residues in the ATR1 binding site. Notably, the SARS-CoV-2 RBD out-competes the angiotensin 2 receptor type I (ATR1), which is the native binding partner of ACE2, by 35% in terms of the calculated binding affinity. Strong binding is mediated through strong electrostatic attachments with every fourth residue on the N-terminus alpha-helix (starting from Ser19 to Asn53), as the turn of the helix makes these residues solvent accessible. By contrasting the Rosetta binding energy of the SARS-CoV-2 spike protein with ACE2 from different livestock and pet species, we find the strongest binding with bat ACE2, followed by human, feline, equine, canine and finally chicken. This is consistent with the hypothesis that bats are the viral origin and reservoir species. These results offer a computational explanation for the increased infectivity of SARS-CoV-2 and point to therapeutic modalities by identifying and rank-ordering the ACE2 residues involved in binding with the virus.

[...] hydrophobic interaction whereas the SARS-CoV-1 RBD forms neither (see Figure 2). Consequently, a computational alanine scan (see Figure 3) reveals that alanine mutation of this position leads to significant loss of hACE2 binding in both SARS-CoV-2 (~61% reduction) and RaTG13 (~59% reduction) but not in SARS-CoV-1 (only ~12% reduction). The spike protein RBDs of SARS-CoV-1 and RaTG13 are only able to form eight and eleven strong electrostatic contacts using seven and ten RBD residues, respectively. This does not imply that SARS-CoV-1 and RaTG13 only use these residues to bind to hACE2. More than fifteen additional interface residues either form weak electrostatic contacts or are simply non-interacting. Table 1 lists the hydrogen-bonded interactions between the RBDs and hACE2 [...] RaTG13. In SARS-CoV-2, Phe456 simultaneously interacts with hACE2 residues Thr27 and Asp30 whereas only the hydrophobic contact is observed in RaTG13. In SARS-CoV-1, Leu443 is unable to establish either the backbone [...]

Each one of the hACE2 binding residues from the three viral spike RBDs was computationally mutated to alanine (one at a time) and the resultant hACE2-RBD complexes were energy minimized and scored using the Rosetta energy function. This procedure assesses how important the identity of the native residues is by defaulting them to alanine and observing whether this significantly affects binding. The percent loss of hACE2 binding upon an alanine mutation was used as a proxy score for assessing the importance of each RBD residue in binding and subsequent pathogenesis. The results from the alanine scan study (see Figure 3) reveal that ~90% (19 out of 21) of the hACE2-binding residues of SARS-CoV-2 are important for complex formation. Even a single mutation to alanine of any of these residues lowers the binding score by more than 60%.
These results imply that the RBD of the SARS-CoV-2 spike protein is highly optimized for binding with hACE2. We note that positions Lys417 and Gly502 have among the strongest impacts on binding (78% and 79% reduction upon mutation to Ala, respectively). This is because they help establish one strong electrostatic contact with Asp30, and three with Gln325, Lys353, and Gly354, respectively (as listed in Table 1). The computational alanine scanning results identify the same three residues, Phe486, Gln493, and Asn501, as important for hACE2 binding as proposed by Wan et al. [9]. We find that Phe486, Gln493, and Asn501 each establish three new contacts; consequently, mutating even one of them to Ala leads to a loss of ACE2 binding of more than ~62.5%.

[...] normalized with respect to binding score prior to mutation. SARS-CoV-2 spike RBD appears to be highly optimized for binding hACE2, as a single mutation to alanine of more than 90% of the residues forming the RBD causes a significant reduction in binding energy.
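The percent-loss metric behind the alanine scan can be illustrated with a minimal sketch. It assumes that binding scores are negative Rosetta-style energies (more negative means stronger binding) and that the loss is normalized by the score prior to mutation. The function names and all numeric scores below are hypothetical placeholders chosen for illustration; they are not values from this study.

```python
from typing import Dict, List, Tuple


def percent_loss(wild_type_score: float, mutant_score: float) -> float:
    """Percent loss of binding, normalized by the pre-mutation score.

    Assumes favorable binding scores are negative, so a mutant score
    closer to zero than the wild type gives a positive loss.
    """
    if wild_type_score >= 0:
        raise ValueError("expected a negative (favorable) wild-type score")
    return 100.0 * (wild_type_score - mutant_score) / wild_type_score


def rank_by_loss(
    wild_type_score: float, mutant_scores: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (mutation, percent loss) pairs, largest loss first."""
    losses = {
        name: percent_loss(wild_type_score, score)
        for name, score in mutant_scores.items()
    }
    return sorted(losses.items(), key=lambda item: item[1], reverse=True)


def important_residues(
    ranked: List[Tuple[str, float]], threshold: float
) -> List[str]:
    """Mutations whose percent loss exceeds the chosen threshold."""
    return [name for name, loss in ranked if loss > threshold]


if __name__ == "__main__":
    # Hypothetical scores in arbitrary units, not data from the paper.
    wild_type = -100.0
    mutants = {"K417A": -22.0, "G502A": -21.0, "idle_residue": -95.0}
    ranked = rank_by_loss(wild_type, mutants)
    for name, loss in ranked:
        print(f"{name}: {loss:.1f}% loss of binding")
    print("above 60% threshold:", important_residues(ranked, 60.0))
```

Ranking the mutations by this normalized loss makes it straightforward to separate strongly contributing residues from "idle" ones with a single threshold.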

Alanine scanning results for the spike protein RBD of SARS-CoV-1 show a less significant penalty to the binding score upon mutation to alanine. Only twelve residues are involved in strong electrostatic coupling with hACE2 residues, out of which six are hydrogen bonded (indicated in Table 1). In summary, alanine scans indicate that SARS-CoV-2 has the highest number of "effectively" interacting residues at the ACE2 binding interface, whereas the SARS-CoV-1 spike forms only a few strong hACE2 connectors, with a large number of "idle" interface residues (43%, 9 out of 21) which do not affect hACE2 binding upon mutation to alanine. RaTG13 appears to be between the two, with 13 strong electrostatic interactors (61%, 13 out of 21), out of which seven are hydrogen bonded, and only four idle residues at the interface (i.e., residues Thr484, Leu486, Gly496, and Tyr505).

Presence of tyrosine and glycine residues in the hACE2 binding domains of these spike proteins

All three viral RBDs are enriched in tyrosine residues. As many as 26.3% (5 out of 19 residues) of the SARS-CoV-1 RBD residues, 25% (4 out of 16 residues) for SARS-CoV-2, and 29% (5 out of 17 residues) for RaTG13 are tyrosine residues. We have not explored the phylogenetic basis for the presence of tyrosine residues, but they do seem to be important for conferring high binding affinity between spike and hACE2 for both SARS-CoV-2 and RaTG13, as alluded to by the alanine scan results (see Figure 3). In [...] interaction between the phenyl rings. This enables both of these tyrosine side-chains to form a strong electrostatic contact with the Thr27 side-chain of hACE2. It is thus unsurprising that mutation of either Tyr473 or Tyr489 (in both SARS-CoV-2 and RaTG13) to alanine results in a similar reduction (>58%, as shown in Figure 3) in binding with hACE2.
In contrast, in the energy-minimized complex of the SARS-CoV-1 RBD with hACE2, both Tyr442 and Tyr475 (see Figure 4a) only contribute to internal stability of the spike by forming strong electrostatic contacts with RBD residues Trp476 and Asn473.
They are therefore unavailable (or too far away, > 6.0 Å) for binding with the neighboring hACE2 residues.

Next, we focus on the role of glycine residues (see Figure 4b) in all three spike RBDs which form important electrostatic contacts with hACE2, as they lead to more than 55% loss of binding (on average) upon mutation to alanine. We chose to study in detail one such representative glycine from all three spike [...] Interestingly, for all three variants the interaction of the hACE2 residue Lys353 with glycine residues in the spike protein is the same. Atomic coordinates of both these complexes were independently, and [...] between Tyr491, Tyr505, and His505 residues, respectively (see Figure 4b). Mutation Y491A for SARS-CoV-1 has no effect on hACE2 binding but Y505A (and H505A) in SARS-CoV-2 (and RaTG13) reduces binding by more than 40%. However, alanine mutation of any of the hinge glycine residues leads to >70% loss of hACE2 binding in all three RBD-hACE2 complexes. Thus, we recover the strong functional motif xGzGx in the spike RBD which is conserved between all three SARS-CoV strains.

[...] ten unique Rosetta energy minimization trajectories. Interestingly, RaTG13 spike residues occupy the largest number of hACE2 residues, resulting in the highest reduction (~14% more than SARS-CoV-2) of solvent accessible surface area (SASA) (see Figure 5e). Nevertheless, the associated Rosetta binding energy is 11.2% less than the one for SARS-CoV-2, which forms overall stronger hydrogen-bonded contacts.

[...] for the hACE2-ATR1 complex, we used protein-protein docking and Rosetta binding energy screening to identify the most stable configuration of the complex. Analysis of the hACE2-ATR1 binding interface reveals 41 hACE2 residues and 26 ATR1 residues at the interface connected by five strong electrostatic contacts and several long range weak electrostatic contacts.
We find that eleven SARS-CoV-2 RBD binding residues of hACE2 are shared by the ATR1 binding region. Moreover, the SARS-CoV-2 spike protein binds hACE2 with a ~35% better binding score than ATR1 binds hACE2. RaTG13 and SARS-CoV-1 exhibit ~21% and ~5% better Rosetta binding energies, respectively, with hACE2 compared to the hACE2-ATR1 complex. They also share only nine and eight residues, respectively, with the ATR1 [...]

We computationally explored the potentially available margin of improvement for the binding affinity of SARS-CoV-2 with hACE2 using the IPRO [13] protein design software. We allowed all 21 contacting residues of the RBD of the spike protein to mutate simultaneously. We ran two separate design trajectories and, in both cases, the best design achieved an approximately 23% improvement in binding affinity using the Rosetta scoring function. This improvement is less than the difference between the calculated binding scores of SARS-CoV-1 and SARS-CoV-2, implying that SARS-CoV-2 has already achieved most of the theoretically possible binding affinity gain with hACE2 compared to SARS-CoV-1.

Interestingly, the network of glycine residues in SARS-CoV-2 is conserved in all redesigned RBDs.
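The two comparisons made against the hACE2-ATR1 complex (a relative binding score and the number of shared hACE2 interface residues) can be sketched as follows. This is only an illustration under stated assumptions: scores are taken to be negative energies where more negative means stronger binding, residue labels are assumed to be a three-letter code followed by a position (e.g. "Lys353"), and the function names, scores, and residue lists in the example are hypothetical placeholders rather than data from this study.

```python
from typing import Iterable, List


def percent_better(reference_score: float, candidate_score: float) -> float:
    """Percent by which the candidate complex outscores the reference.

    Assumes favorable scores are negative; a positive result means the
    candidate binds more strongly than the reference.
    """
    if reference_score >= 0:
        raise ValueError("expected a negative (favorable) reference score")
    return 100.0 * (candidate_score - reference_score) / reference_score


def shared_residues(
    interface_a: Iterable[str], interface_b: Iterable[str]
) -> List[str]:
    """Residues present in both interfaces, ordered by sequence position.

    Assumes labels of the form three-letter code + number, e.g. 'Asp30'.
    """
    common = set(interface_a) & set(interface_b)
    return sorted(common, key=lambda label: int(label[3:]))


if __name__ == "__main__":
    # Hypothetical scores in arbitrary units, not values from the paper.
    atr1_score = -100.0
    spike_score = -135.0
    print(f"{percent_better(atr1_score, spike_score):.0f}% better than ATR1")

    # Hypothetical partial interfaces, for illustration only.
    rbd_site = ["Thr27", "Asp30", "Lys353"]
    atr1_site = ["Lys353", "Asp30", "Gln325"]
    print("shared hACE2 residues:", shared_residues(rbd_site, atr1_site))
```

Treating each interface as a set keeps the overlap count independent of the order in which contacts were detected.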