Abstract
Since the outbreak of the COVID-19 pandemic, the SARS-CoV-2 coronavirus has accumulated a substantial amount of genome compositional heterogeneity through mutation and recombination, which can be summarized by a measure of Sequence Compositional Complexity (SCC). To test for evolutionary trends that could inform us about the adaptation of the virus to its human host, we compute SCC in high-quality coronavirus genomes from across the globe, covering the full span of the pandemic. Using phylogenetic ridge regression, we find trends in SCC over the short time span of the SARS-CoV-2 pandemic expansion. In early samples, we find no statistical support for any trend in SCC values over time, although the virus genome appears to evolve faster than expected under Brownian motion. However, in samples taken after the emergence of Variants of Concern with higher transmissibility, and after controlling for phylogenetic and sampling effects, we detect a declining trend in SCC and an increasing trend in its absolute evolutionary rate. This means that the decay in SCC itself accelerated over time, and that the increasing fitness of variant genomes led to a reduction in their genome sequence heterogeneity. Our work therefore shows that phylogenetic trends, typical of macroevolutionary time scales, can also be revealed on the shorter time spans typical of viral genomes.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
E-mails: José L. Oliver, oliver{at}ugr.es
Pedro Bernaola-Galván, rick{at}uma.es
Francisco Perfectti, fperfect{at}ugr.es
Cristina Gómez Martín, c.a.gomezmartin{at}amsterdamumc.nl
Silvia Castiglione, silviacastiglione2{at}gmail.com
Pasquale Raia, pasquale.raia{at}unina.it
Miguel Verdú, Miguel.Verdu{at}ext.uv.es
Andrés Moya, Andres.Moya{at}uv.es
Updated authors list