Abstract
During the spread of the COVID-19 pandemic, the SARS-CoV-2 coronavirus underwent mutation and recombination events that altered its genome compositional structure, thus providing an unprecedented opportunity to check an evolutionary process in real time. The mutation rate is known to be lower than expected for neutral evolution, suggesting purifying selection and convergent evolution. We begin by summarizing the compositional heterogeneity of each viral genome by computing its Sequence Compositional Complexity (SCC). To analyze the full range of SCC diversity, we select random samples of high-quality coronavirus genomes covering the full span of the pandemic. We then search for evolutionary trends that could inform us on the adaptive process of the virus to its human host by computing the phylogenetic ridge regression of SCC against time (i.e., the collection date of each viral isolate). In early samples, we find no statistical support for any trend in SCC values, although the viral genome appears to evolve faster than Brownian Motion (BM) expectation. However, in samples taken after the emergence of high fitness variants, and despite the brief time span elapsed, a driven decreasing trend for SCC and an increasing one for its absolute evolutionary rate are detected, pointing to a role for purifying selection in the evolution of SCC in the coronavirus. The higher fitness of variant genomes leads to adaptive trends of SCC over pandemic time in the coronavirus.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The title and the abstract have been reworded. References have been updated.