SARS-CoV-2 Variants are Selecting for Spike Protein Mutations that Increase Protein Stability

The emergence of SARS-CoV-2 in 2019 has caused severe disruption and a huge number of human deaths across the globe. As the pandemic spreads, a natural result is the emergence of variants with a variety of amino acid mutations. Variants of SARS-CoV-2 with mutations in their spike protein may result in increased infectivity, increased lethality, or immune escape. Whilst many of these properties can be explained through changes to binding affinity or to post-translational modification, many mutations have no known biophysical impact on the structure of the protein. The Gibbs free energy of a protein provides a measure of its stability: a more stable protein is less prone to unfolding and more robust to changes in its external environment. Here we show that mutations in the spike protein of SARS-CoV-2 are selecting for amino acid changes that result in a more stable protein than expected by chance. We calculate all possible mutations in the SARS-CoV-2 spike protein, and show that many variants are more stable than expected when compared to the background, indicating that protein stability is an important consideration for the understanding of SARS-CoV-2 evolution. Variants exhibit a range of stabilities, and we further suggest that some stabilising mutations may be acting as a "counterbalance" to destabilising mutations that have other properties, such as increasing binding site affinity for the human ACE2 receptor. We suggest that protein folding calculations offer a useful tool for early identification of advantageous mutations.

Since the emergence of SARS-CoV-2 in late 2019, over 2 million people have died as a result of infection 1. As the global pandemic continues, the emergence of viral variants with RNA mutations is an expected phenomenon, caused by random errors in RNA copying and selected for by evolutionary pressure 2. Many variants contain mutations in the spike protein that confer an advantage to the virus, such as increased ACE2 receptor binding 3, glycosylation/cleavage site alterations 4, and immune evasion 5, as well as protein stability 6.
Understanding these properties helps infer how one variant may differ from another with a different mutational profile, and provides insights into the mechanisms by which variants differ, such as increased infectivity or vaccine resistance [7][8][9]. The WHO classifies SARS-CoV-2 variants into major categories; the two most important, "Variants of Concern" and "Variants of Interest", are assigned to emerging variants likely to have a different phenotype and mutational profile to the original SARS-CoV-2 10. [...] 11,12, and we have recently shown that they can be predictive of mutations that destabilise or damage a protein in a cancer context [13][14][15]. Whilst the stability of mutations has been assessed in the SARS-CoV-2 spike protein 16,17, variant analysis has not yet been performed.
Mutations that stabilise the SARS-CoV-2 spike protein are likely to lead to better binding to other molecules and a longer lifespan of the protein before thermal unfolding. Calculating predicted ΔΔG values requires protein structural information, which was recently published for the SARS-CoV-2 spike protein 18.
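For reference, the predicted stability change of a mutation is conventionally written as the difference in folding free energy between the mutant and the wild-type protein. The sign convention below, in which negative values are stabilising, is an assumption: the text here does not state which convention is used, and some tools report the opposite sign.

```latex
\[
  \Delta G_{\mathrm{fold}} = G_{\mathrm{folded}} - G_{\mathrm{unfolded}},
  \qquad
  \Delta\Delta G = \Delta G_{\mathrm{fold}}^{\mathrm{mutant}}
                 - \Delta G_{\mathrm{fold}}^{\mathrm{wild\text{-}type}}
\]
% Under this (assumed) convention:
%   \Delta\Delta G < 0  : the mutant is predicted to be more stable than wild-type
%   \Delta\Delta G > 0  : the mutant is predicted to be less stable than wild-type
```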
Here we calculate the [...]
This work highlights that mutations with a stabilising effect on the SARS-CoV-2 spike protein are one of the key drivers of the evolution of the virus, and contribute to the increased transmissibility of emerging variants. That variants are more stable than expected by chance shows that evolution is favouring mutations with a stabilising effect, and it may be that mutations 11 that destabilise a protein but have other influences on protein structure, such as K417N 21, which alters ACE2 binding affinity, are offset or preceded by mutations that stabilise the structure. We note, however, that not all mutations in all variants can be considered, due to missing regions of the cryo-EM structure, and as such this study does not necessarily represent the true ΔΔG for each variant. Furthermore, we study only the structure in its "closed" conformation, as we feel this is the most physiologically relevant of the existing structures, but further work will need to address the impact of the dynamic range of the structure on mutational stability. We highlight that stability of the SARS-CoV-2 spike protein is an important consideration for future study of variants, and is likely one of a number of driving forces in the evolution of new variants. Finally, we suggest that folding simulations of newly sequenced variants may offer a computationally inexpensive method to highlight advantageous mutations, and that prospective simulation of further mutations to these samples may predict future variants to surveil for.
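The comparison of a variant against the background of all possible mutations can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the toy sequence and the ΔΔG values are all hypothetical placeholders, and the sketch assumes that (i) per-mutation ΔΔG values are already available from some structure-based predictor, (ii) the effects of mutations in a variant are additive, and (iii) negative ΔΔG means stabilising. Random mutation sets are drawn without checking that they hit distinct positions, which is a further simplification.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def all_single_substitutions(sequence, start=1):
    """List every single-residue substitution, named like 'K417N'
    (wild-type residue, position, mutant residue)."""
    return [
        f"{wt}{pos}{mut}"
        for pos, wt in enumerate(sequence, start=start)
        for mut in AMINO_ACIDS
        if mut != wt
    ]


def empirical_p_value(variant_mutations, ddg_table, n_draws=10_000, seed=0):
    """Estimate how often a random set of mutations of the same size is
    at least as stabilising (summed ddG <= observed) as the variant.

    Assumes additive ddG and that negative ddG is stabilising.
    """
    rng = random.Random(seed)
    background = list(ddg_table)
    observed = sum(ddg_table[m] for m in variant_mutations)
    k = len(variant_mutations)
    hits = 0
    for _ in range(n_draws):
        drawn = rng.sample(background, k)
        if sum(ddg_table[m] for m in drawn) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)


# Toy demonstration with placeholder data (NOT real spike values).
toy_sequence = "MKTAYIAK"
substitutions = all_single_substitutions(toy_sequence)
value_rng = random.Random(42)
toy_ddg = {m: value_rng.gauss(0.5, 1.0) for m in substitutions}

toy_variant = ["M1A", "K2N", "T3S"]  # hypothetical variant
p = empirical_p_value(toy_variant, toy_ddg, n_draws=2_000)
print(f"{len(substitutions)} possible substitutions; empirical p = {p:.3f}")
```

A small p-value here would indicate that the variant's mutation set is more stabilising than most random sets of the same size. In practice the ΔΔG table would be computed from the spike structure, and mutations falling in unresolved regions of that structure would have to be excluded.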

ASSOCIATED CONTENT
Supporting Information.
The following files are available free of charge.