bioRxiv
New Results

A complete reference genome improves analysis of human genetic variation

Sergey Aganezov, Stephanie M. Yan, Daniela C. Soto, Melanie Kirsche, Samantha Zarate, Pavel Avdeyev, Dylan J. Taylor, Kishwar Shafin, Alaina Shumate, Chunlin Xiao, Justin Wagner, Jennifer McDaniel, Nathan D. Olson, Michael E.G. Sauria, Mitchell R. Vollger, Arang Rhie, Melissa Meredith, Skylar Martin, Joyce Lee, Sergey Koren, Jeffrey A. Rosenfeld, Benedict Paten, Ryan Layer, Chen-Shan Chin, Fritz J. Sedlazeck, Nancy F. Hansen, Danny E. Miller, Adam M. Phillippy, Karen H. Miga, Rajiv C. McCoy, Megan Y. Dennis, Justin M. Zook, Michael C. Schatz
doi: https://doi.org/10.1101/2021.07.12.452063
Sergey Aganezov
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Stephanie M. Yan
2 Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Daniela C. Soto
3 Biochemistry & Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, Davis, CA, USA
Melanie Kirsche
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Samantha Zarate
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
Pavel Avdeyev
4 Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Dylan J. Taylor
2 Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Kishwar Shafin
5 UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Alaina Shumate
6 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
Chunlin Xiao
7 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
Justin Wagner
8 National Institute of Standards and Technology, Gaithersburg, MD, USA
Jennifer McDaniel
8 National Institute of Standards and Technology, Gaithersburg, MD, USA
Nathan D. Olson
8 National Institute of Standards and Technology, Gaithersburg, MD, USA
Michael E.G. Sauria
2 Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Mitchell R. Vollger
9 Department of Genome Sciences, University of Washington, Seattle, WA, USA
Arang Rhie
4 Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Melissa Meredith
5 UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Skylar Martin
10 Department of Computer Science and the Biofrontiers Institute, University of Colorado, Boulder, CO, USA
Joyce Lee
11 Bionano Genomics, San Diego, CA, USA
Sergey Koren
4 Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Jeffrey A. Rosenfeld
12 Cancer Institute of New Jersey, New Brunswick, NJ, USA
Benedict Paten
5 UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Ryan Layer
10 Department of Computer Science and the Biofrontiers Institute, University of Colorado, Boulder, CO, USA
Chen-Shan Chin
13 DNAnexus, Mountain View, CA, USA
Fritz J. Sedlazeck
14 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
Nancy F. Hansen
15 Comparative Genomics Analysis Unit, National Human Genome Research Institute, Rockville, MD, USA
Danny E. Miller
9 Department of Genome Sciences, University of Washington, Seattle, WA, USA
16 Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital, Seattle, WA, USA
Adam M. Phillippy
4 Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Karen H. Miga
5 UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
Rajiv C. McCoy
2 Department of Biology, Johns Hopkins University, Baltimore, MD, USA
Megan Y. Dennis
3 Biochemistry & Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, Davis, CA, USA
Justin M. Zook
8 National Institute of Standards and Technology, Gaithersburg, MD, USA
Michael C. Schatz
1 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
2 Department of Biology, Johns Hopkins University, Baltimore, MD, USA
17 Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
For correspondence: rajiv.mccoy@jhu.edu mydennis@ucdavis.edu justin.zook@nist.gov mschatz@cs.jhu.edu

Abstract

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 Mbp of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. Here we demonstrate how the new reference universally improves read mapping and variant calling for 3,202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of novel variants per sample, a new frontier for evolutionary and biomedical discovery. Simultaneously, the new reference eliminates tens of thousands of spurious variants per sample, including an up to 12-fold reduction of false positives in 269 medically relevant genes. The vast improvement in variant discovery, coupled with population and functional genomic resources, positions T2T-CHM13 to replace GRCh38 as the prevailing reference for human genetics.

One Sentence Summary: The T2T-CHM13 reference genome universally improves the analysis of human genetic variation.

Competing Interest Statement

C.S.C. is an employee of DNAnexus. J.L. is an employee of Bionano Genomics. S.A. is an employee of Oxford Nanopore Technologies. F.J.S. has received travel funds and spoken at PacBio and Oxford Nanopore Technologies events. S.K. has received travel funds to speak at symposia organized by Oxford Nanopore Technologies.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted July 13, 2021.
Subject Area

  • Genomics