bioRxiv
New Results

Evolutionary origins of the SARS-CoV-2 sarbecovirus lineage responsible for the COVID-19 pandemic

Maciej F Boni, Philippe Lemey, Xiaowei Jiang, Tommy Tsan-Yuk Lam, Blair Perry, Todd Castoe, Andrew Rambaut, David L Robertson
doi: https://doi.org/10.1101/2020.03.30.015008
Maciej F Boni
1 Center for Infectious Disease Dynamics, Department of Biology, Pennsylvania State University, University Park, PA, USA
Philippe Lemey
2 Department of Microbiology, Immunology and Transplantation, KU Leuven, Leuven, Belgium
Xiaowei Jiang
3 Xi’an Jiaotong-Liverpool University (XJTLU), Suzhou, Jiangsu, China
Tommy Tsan-Yuk Lam
4 State Key Laboratory of Emerging Infectious Diseases, School of Public Health, The University of Hong Kong, Hong Kong SAR, China
Blair Perry
5 Department of Biology, University of Texas Arlington, Arlington, TX, USA
Todd Castoe
5 Department of Biology, University of Texas Arlington, Arlington, TX, USA
Andrew Rambaut
6 Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
David L Robertson
7 MRC-University of Glasgow Centre for Virus Research (CVR), Glasgow, UK
For correspondence: mfb9@psu.edu, philippe.lemey@kuleuven.be, a.rambaut@ed.ac.uk, David.L.Robertson@glasgow.ac.uk

Abstract

There are outstanding evolutionary questions on the recent emergence of coronavirus SARS-CoV-2/hCoV-19 in Hubei province that caused the COVID-19 pandemic, including (1) the relationship of the new virus to the SARS-related coronaviruses, (2) the role of bats as a reservoir species, (3) the potential role of other mammals in the emergence event, and (4) the role of recombination in viral emergence. Here, we address these questions and find that the sarbecoviruses – the viral subgenus responsible for the emergence of SARS-CoV and SARS-CoV-2 – exhibit frequent recombination, but the SARS-CoV-2 lineage itself is not a recombinant of any viruses detected to date. In order to employ phylogenetic methods to date the divergence events between SARS-CoV-2 and the bat sarbecovirus reservoir, recombinant regions of a 68-genome sarbecovirus alignment were removed with three independent methods. Bayesian evolutionary rate and divergence date estimates were consistent for all three recombination-free alignments and robust to two different prior specifications based on HCoV-OC43 and MERS-CoV evolutionary rates. Divergence dates between SARS-CoV-2 and the bat sarbecovirus reservoir were estimated as 1948 (95% HPD: 1879-1999), 1969 (95% HPD: 1930-2000), and 1982 (95% HPD: 1948-2009). Despite intensified characterization of sarbecoviruses since SARS, the lineage giving rise to SARS-CoV-2 has been circulating unnoticed for decades in bats and been transmitted to other hosts such as pangolins. The occurrence of a third significant coronavirus emergence in 17 years together with the high prevalence and virus diversity in bats implies that these viruses are likely to cross species boundaries again.

In Brief

The Betacoronavirus SARS-CoV-2 is a member of the sarbecovirus subgenus, which shows frequent recombination in its evolutionary history. We characterize the extent of this genetic exchange and identify non-recombining regions of the sarbecovirus genome using three independent methods to remove the effects of recombination. Using these non-recombining genome regions and prior information on coronavirus evolutionary rates, we obtain estimates from three approaches that the most likely divergence date of SARS-CoV-2 from its most closely related available bat sequences ranges from 1948 to 1982.
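As a quick arithmetic check, the three point estimates of the divergence date can be converted into "years before present". The sketch below is illustrative only: it assumes a reference year of 2020 (the year the preprint was posted), and the names `divergence_years`, `REFERENCE_YEAR`, and `years_before` are hypothetical and not taken from the authors' analysis.

```python
# Point estimates of the divergence year quoted in the abstract,
# one per recombination-free alignment.
divergence_years = [1948, 1969, 1982]

# Assumption: "present" is taken to be 2020, the year the preprint was posted.
REFERENCE_YEAR = 2020


def years_before(reference_year, years):
    """Return how many years before reference_year each estimate lies."""
    return [reference_year - year for year in years]


ages = years_before(REFERENCE_YEAR, divergence_years)
print(ages)  # [72, 51, 38]
```

Under this assumption the point estimates correspond to roughly 40 to 70 years before 2020. The 95% HPD intervals reported in the abstract are considerably wider than this range of point estimates.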

Key Points

  • RaTG13 is the closest available bat virus to SARS-CoV-2; a sub-lineage of these bat viruses is able to infect humans. Two sister lineages of the RaTG13/SARS-CoV-2 lineage infect Malayan pangolins.

  • The sarbecoviruses show a pattern of deep recombination events, indicating that there are high levels of co-infection in horseshoe bats and that the viral pool can generate novel allele combinations and substantial genetic diversity; the sarbecoviruses are efficient ‘explorers’ of phenotype space.

  • The SARS-CoV-2 lineage is not a recent recombinant, at least not involving any of the bat or pangolin viruses sampled to date.

  • Non-recombinant regions of the sarbecoviruses can be identified, allowing for phylogenetic inference and dating to be performed. We constructed three such regions using different methods.

  • We estimate that RaTG13 and SARS-CoV-2 diverged 40 to 70 years ago. There is a diverse unsampled reservoir of generalist viruses established in horseshoe bats.

  • While an intermediate host responsible for the zoonotic event cannot be ruled out, the relevant evolution for spillover to humans very likely occurred in horseshoe bats.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted March 31, 2020.

Subject Area

  • Evolutionary Biology