ABSTRACT
RNA viruses, including SARS-CoV-2, evolve by acquiring mutations or by hybridization between viral genomes. The SARS-CoV-2 pandemic provided an exceptional opportunity to analyze the mutations that appeared over a three-year period.
In this study, we analyzed the types of mutations, and their epidemic consequences, in the thousands of genomes produced in our laboratory. These genomes were obtained by next-generation sequencing of respiratory samples collected for genomic surveillance. Mutation frequencies were calculated using Nextclade, Microsoft Excel, and an in-house Python script. In total, 61,397 genomes matching 483 Pangolin lineages were analyzed; 22,225 nucleotide mutations were identified, of which 220 (1.0%) were each at the root of at least 836 genomes, the frequency threshold used to classify a mutation as “hyperfertile”. Two of these seeded the pandemic in Europe, namely a mutation in the RNA-dependent RNA polymerase associated with an increased mutation rate (P323L) and one in the spike protein (D614G), which plays a particular role in virus fitness. Most of these 220 “hyperfertile” mutations occurred in areas not predicted to be associated with increased virulence. On average per gene, their number was 8±6 (0-22) per 1,000 nucleotides. They were 3.7 times more frequent in accessory genes than in informational genes (14 versus 4; p = 0.0037). In particular, they were 4.1 times more frequent in ORF8 than in the gene encoding the RNA polymerase. Interestingly, stop codons were present at 97 positions, almost exclusively in six accessory genes, including ORF7a (25 per 100 codons) and ORF8 (21). Furthermore, 1,661 mutations (16.3%) were associated with a lower number of “offspring” (50-835) and were classified as “fertile”.
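The classification described above can be illustrated with a minimal Python sketch. It is not the in-house script used in the study; it assumes that the number of “offspring” of a mutation can be approximated by the number of genomes carrying it, and the function names, the input layout (one list of mutation labels per genome), and the “other” label are hypothetical. Only the thresholds (at least 836 genomes for “hyperfertile”, 50-835 for “fertile”) and the per-1,000-nucleotide normalization come from the text.

```python
from collections import Counter

HYPERFERTILE_MIN = 836  # threshold stated in the abstract
FERTILE_MIN = 50        # lower bound of the "fertile" range (50-835)


def count_mutations(genomes):
    """Count the number of genomes in which each mutation occurs.

    genomes: iterable of iterables of mutation labels (assumed input layout).
    """
    counts = Counter()
    for mutations in genomes:
        # Use a set so a mutation is counted at most once per genome.
        counts.update(set(mutations))
    return counts


def classify(n_genomes):
    """Assign a class from the number of genomes carrying the mutation."""
    if n_genomes >= HYPERFERTILE_MIN:
        return "hyperfertile"
    if n_genomes >= FERTILE_MIN:
        return "fertile"
    return "other"  # hypothetical label for mutations below both thresholds


def density_per_kb(n_mutations, gene_length_nt):
    """Number of mutations per 1,000 nucleotides of a gene."""
    return 1000 * n_mutations / gene_length_nt
```

Applied to a table of mutations per genome (for example, one exported from Nextclade), `count_mutations` followed by `classify` would yield the class of each mutation, and `density_per_kb` would normalize the per-gene counts by gene length.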
In conclusion, except for two initial mutations that could predict a change in the dynamics of the epidemic (an increased mutation rate and a change in the virus attachment site), most of the “hyperfertile” mutations did not predict the emergence of a new epidemic form. Notably, some mutations were in non-coding areas and some consisted of stop codons, indicating that some genes (particularly ORF7a and ORF8) were rather “non-virulence genes” at a given stage of the epidemic, which is an unusual concept for viruses.
Competing Interest Statement
D.R. declares grants or contracts and royalties or licenses from Hitachi High-Technologies Corporation, Tokyo, Japan. He is a scientific board member of the Eurofins company, and a founder and shareholder of a microbial culture company (Culture Top), two biotechnology companies (Techno-Jouvence, and Gene and Green TK), and an infectious diseases rapid diagnosis company (Pocrame). The other authors have no conflicts of interest to declare in relation to the present study. Funding sources played no role in the design and conduct of the study, the collection, management, analysis, and interpretation of the data, or the preparation, review, or approval of the manuscript.