Abstract
The SARS-CoV-2 pandemic recently entered an alarming new phase with the emergence of variants of concern (VOC), and understanding their biology is paramount to predicting future ones. Current efforts mainly focus on mutations in the spike glycoprotein (S), but changes in other regions of the viral proteome are likely key. We analyzed more than 900,000 SARS-CoV-2 genomes with a computational systems biology approach, including a haplotype network and protein structural analyses, to reveal lineage-defining mutations and their critical functional attributes. Our results indicate that increased transmission is promoted by epistasis, i.e., combinations of mutations in S and other viral proteins. Mutations in the non-S proteins involve immune antagonism and replication performance, suggesting convergent evolution. Furthermore, adaptive mutations appear in geographically disparate locations, suggesting that either independent, repeated mutation events or recombination among different strains is generating VOC. We demonstrate that recombination is the stronger hypothesis and may be accelerating the emergence of VOC by bringing together cooperative mutations. This emphasizes the importance of a global response to stop the COVID-19 pandemic.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
“Nothing in biology makes sense except in the light of evolution” -Theodosius Dobzhansky
We incorporated information from a recent analysis that identified recombination in North America and revised Figure 3.