PT - JOURNAL ARTICLE
AU - Gómez-Carballa, Alberto
AU - Bello, Xabier
AU - Pardo-Seco, Jacobo
AU - Martinón-Torres, Federico
AU - Salas, Antonio
TI - The impact of super-spreaders in COVID-19: mapping genome variation worldwide
AID - 10.1101/2020.05.19.097410
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.05.19.097410
4099 - http://biorxiv.org/content/early/2020/05/21/2020.05.19.097410.short
4100 - http://biorxiv.org/content/early/2020/05/21/2020.05.19.097410.full
AB - The human pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the major pandemic of the 21st century. We analyzed >4,700 SARS-CoV-2 genomes and associated meta-data retrieved from public repositories. SARS-CoV-2 sequences have a high sequence identity (>99.9%), which drops to >96% when compared to bat coronavirus. We built a mutation-annotated reference SARS-CoV-2 phylogeny with two main macro-haplogroups, A and B, both of Asian origin, and >160 sub-branches representing virus strains of variable geographical origins worldwide, revealing a uniform mutation occurrence along branches that could complicate the design of future vaccines. The root of SARS-CoV-2 genomes locates at the Chinese haplogroup B1, with a TMRCA dating to 12 November 2019 - thus matching epidemiological records. Sub-haplogroup A2a originates in China and represents the major non-Asian outbreak. Multiple bottleneck episodes, most likely associated with super-spreader hosts, explain COVID-19 pandemic to a large extent. Competing Interest Statement: The authors have declared no competing interest.