bioRxiv
New Results

Telomere-to-telomere assembly of a complete human X chromosome

Karen H. Miga, Sergey Koren, Arang Rhie, Mitchell R. Vollger, Ariel Gershman, Andrey Bzikadze, Shelise Brooks, Edmund Howe, David Porubsky, Glennis A. Logsdon, Valerie A. Schneider, Tamara Potapova, Jonathan Wood, William Chow, Joel Armstrong, Jeanne Fredrickson, Evgenia Pak, Kristof Tigyi, Milinn Kremitzki, Christopher Markovic, Valerie Maduro, Amalia Dutra, Gerard G. Bouffard, Alexander M. Chang, Nancy F. Hansen, Françoise Thibaud-Nissen, Anthony D. Schmitt, Jon-Matthew Belton, Siddarth Selvaraj, Megan Y. Dennis, Daniela C. Soto, Ruta Sahasrabudhe, Gulhan Kaya, Josh Quick, Nicholas J. Loman, Nadine Holmes, Matthew Loose, Urvashi Surti, Rosa Ana Risques, Tina A. Graves Lindsay, Robert Fulton, Ira Hall, Benedict Paten, Kerstin Howe, Winston Timp, Alice Young, James C. Mullikin, Pavel A. Pevzner, Jennifer L. Gerton, Beth A. Sullivan, Evan E. Eichler, Adam M. Phillippy
doi: https://doi.org/10.1101/735928
Karen H. Miga
1UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
  • For correspondence: khmiga@soe.ucsc.edu adam.phillippy@nih.gov
Sergey Koren
2Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Arang Rhie
2Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Mitchell R. Vollger
3Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
Ariel Gershman
4Department of Molecular Biology & Genetics, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
Andrey Bzikadze
5Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA USA
Shelise Brooks
6NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD USA
Edmund Howe
7Stowers Institute for Medical Research, Kansas City, MO USA
David Porubsky
3Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
Glennis A. Logsdon
3Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
Valerie A. Schneider
8National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD USA
Tamara Potapova
7Stowers Institute for Medical Research, Kansas City, MO USA
Jonathan Wood
9Wellcome Sanger Institute, Cambridge, UK
William Chow
9Wellcome Sanger Institute, Cambridge, UK
Joel Armstrong
1UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
Jeanne Fredrickson
10University of Washington, Department of Pathology, Seattle WA USA
Evgenia Pak
11Cytogenetic and Microscopy Core, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Kristof Tigyi
1UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
Milinn Kremitzki
12McDonnell Genome Institute at Washington University, St. Louis, MO USA
Christopher Markovic
12McDonnell Genome Institute at Washington University, St. Louis, MO USA
Valerie Maduro
13Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Amalia Dutra
11Cytogenetic and Microscopy Core, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Gerard G. Bouffard
6NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD USA
Alexander M. Chang
2Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Nancy F. Hansen
14Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
Françoise Thibaud-Nissen
8National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD USA
Anthony D. Schmitt
15Arima Genomics, San Diego, CA USA
Jon-Matthew Belton
15Arima Genomics, San Diego, CA USA
Siddarth Selvaraj
15Arima Genomics, San Diego, CA USA
Megan Y. Dennis
16Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA USA
Daniela C. Soto
16Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA USA
Ruta Sahasrabudhe
17DNA Technologies Core, Genome Center, University of California, Davis, CA, USA
Gulhan Kaya
16Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA USA
Josh Quick
18Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
Nicholas J. Loman
18Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
Nadine Holmes
19DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
Matthew Loose
19DeepSeq, School of Life Sciences, University of Nottingham, Nottingham, UK
Urvashi Surti
20Department of Pathology, University of Pittsburgh, Pittsburgh, PA USA
Rosa Ana Risques
10University of Washington, Department of Pathology, Seattle WA USA
Tina A. Graves Lindsay
12McDonnell Genome Institute at Washington University, St. Louis, MO USA
Robert Fulton
12McDonnell Genome Institute at Washington University, St. Louis, MO USA
Ira Hall
12McDonnell Genome Institute at Washington University, St. Louis, MO USA
Benedict Paten
1UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA USA
Kerstin Howe
9Wellcome Sanger Institute, Cambridge, UK
Winston Timp
4Department of Molecular Biology & Genetics, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD USA
Alice Young
6NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD USA
James C. Mullikin
6NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Rockville, MD USA
Pavel A. Pevzner
21Department of Computer Science and Engineering, University of California, San Diego, CA USA
Jennifer L. Gerton
7Stowers Institute for Medical Research, Kansas City, MO USA
Beth A. Sullivan
22Department of Molecular Genetics and Microbiology, Division of Human Genetics, Duke University Medical Center, Durham, NC USA
Evan E. Eichler
3Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA USA
23Howard Hughes Medical Institute, University of Washington, Seattle, WA USA
Adam M. Phillippy
2Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD USA
  • For correspondence: khmiga@soe.ucsc.edu adam.phillippy@nih.gov

Abstract

After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. The remaining gaps include ribosomal DNA (rDNA) arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes. Here we present a de novo human genome assembly that surpasses the continuity of GRCh38 [2], along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the ∼2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.

Footnotes

  • https://github.com/nanopore-wgs-consortium/CHM13

  • http://www.stowers.org/research/publications/libpb-1453

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.
Posted August 16, 2019.
Subject Area

  • Bioinformatics