bioRxiv
New Results

The genomic variation landscape of globally-circulating clades of SARS-CoV-2 defines a genetic barcoding scheme

Qingtian Guan, Mukhtar Sadykov, Raushan Nugmanova, Michael J. Carr, Stefan T. Arold, Arnab Pain
doi: https://doi.org/10.1101/2020.04.21.054221
Qingtian Guan
1King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah, 23955-6900, Saudi Arabia
Mukhtar Sadykov
1King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah, 23955-6900, Saudi Arabia
Raushan Nugmanova
1King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah, 23955-6900, Saudi Arabia
Michael J. Carr
2National Virus Reference Laboratory (NVRL), School of Medicine, University College Dublin, Belfield, Dublin 4, Ireland
3Research Center for Zoonosis Control, Global Institution for Collaborative Research and Education (GI-CoRE); Hokkaido University, N20 W10 Kita-ku, Sapporo, 001-0020 Japan
Stefan T. Arold
4King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah, 23955-6900, Saudi Arabia
5Centre de Biochimie Structurale, CNRS, INSERM, Université de Montpellier, 34090 Montpellier, France
Arnab Pain
1King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah, 23955-6900, Saudi Arabia
3Research Center for Zoonosis Control, Global Institution for Collaborative Research and Education (GI-CoRE); Hokkaido University, N20 W10 Kita-ku, Sapporo, 001-0020 Japan
6Nuffield Division of Clinical Laboratory Sciences (NDCLS), The John Radcliffe Hospital, University of Oxford, Headington, Oxford, OX3 9DU, United Kingdom
  • For correspondence: arnab.pain@kaust.edu.sa

ABSTRACT

We describe fifteen major mutation events from 2,058 high-quality SARS-CoV-2 genomes deposited up to March 31st, 2020. These events define five major clades (G, I, S, D and V) of globally-circulating viral populations, representing 85.7% of all sequenced cases, which we can identify using a 10-nucleotide genetic classifier, or barcode. We applied this barcode to 4,000 additional genomes deposited between March 31st and April 15th and successfully classified 95.6% of the clades, demonstrating the utility of this approach. An analysis of amino acid variation in SARS-CoV-2 ORFs provided evidence of substitution events in the viral proteins involved in both host entry and genome replication. Systematic monitoring of dynamic changes in the SARS-CoV-2 genomes of circulating virus populations over time can guide therapeutic and prophylactic strategies to manage and contain the virus and, with available efficacious antivirals and vaccines, can also aid in monitoring circulating genetic diversity as we proceed towards elimination of the agent. The barcode will add the genetic resolution necessary to track and monitor infection clusters and to distinguish imported from indigenous cases, thereby aiding public health measures that seek to interrupt transmission chains without requiring real-time complete genome sequencing.
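
As an illustration of how a nucleotide barcode of this kind could be applied, the following minimal Python sketch reads the bases at a fixed set of positions in an aligned genome and matches them against per-clade patterns. The positions, the three-base patterns, and the function names are hypothetical placeholders chosen for a toy example; they are not the sites or the 10-nucleotide barcodes reported in the preprint, and the exact-match rule is an assumption rather than the authors' method.

```python
from typing import Dict, Optional, Sequence

# Hypothetical 0-based positions in a toy alignment (NOT the preprint's sites).
BARCODE_POSITIONS: Sequence[int] = (1, 4, 7)

# Hypothetical clade patterns for those positions (NOT the preprint's barcodes).
CLADE_BARCODES: Dict[str, str] = {
    "G": "TGA",
    "S": "CTA",
    "V": "CGT",
}


def extract_barcode(aligned_genome: str,
                    positions: Sequence[int] = BARCODE_POSITIONS) -> str:
    """Concatenate the bases found at the barcode positions."""
    return "".join(aligned_genome[p].upper() for p in positions)


def classify(aligned_genome: str) -> Optional[str]:
    """Return the clade whose pattern matches exactly, or None if none does."""
    barcode = extract_barcode(aligned_genome)
    for clade, pattern in CLADE_BARCODES.items():
        if barcode == pattern:
            return clade
    return None


def classified_fraction(genomes: Sequence[str]) -> float:
    """Fraction of genomes that were assigned to some clade."""
    if not genomes:
        return 0.0
    hits = sum(1 for g in genomes if classify(g) is not None)
    return hits / len(genomes)


# Toy aligned sequences, invented for this example only.
toy_genomes = ["ATCCGAAAT", "ACCCTAAAT", "ACCCCAACT"]
assignments = [classify(g) for g in toy_genomes]
```

In this toy setup, a genome whose bases at the chosen positions match no pattern is left unassigned, which is one simple way to account for genomes falling outside the defined clades.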

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted April 23, 2020.

Subject Area

  • Genomics