TY  - JOUR
T1  - Telomere-to-telomere assembly of a complete human X chromosome
JF  - bioRxiv
DO  - 10.1101/735928
SP  - 735928
AU  - Karen H. Miga
AU  - Sergey Koren
AU  - Arang Rhie
AU  - Mitchell R. Vollger
AU  - Ariel Gershman
AU  - Andrey Bzikadze
AU  - Shelise Brooks
AU  - Edmund Howe
AU  - David Porubsky
AU  - Glennis A. Logsdon
AU  - Valerie A. Schneider
AU  - Tamara Potapova
AU  - Jonathan Wood
AU  - William Chow
AU  - Joel Armstrong
AU  - Jeanne Fredrickson
AU  - Evgenia Pak
AU  - Kristof Tigyi
AU  - Milinn Kremitzki
AU  - Christopher Markovic
AU  - Valerie Maduro
AU  - Amalia Dutra
AU  - Gerard G. Bouffard
AU  - Alexander M. Chang
AU  - Nancy F. Hansen
AU  - Françoise Thibaud-Nissen
AU  - Anthony D. Schmitt
AU  - Jon-Matthew Belton
AU  - Siddarth Selvaraj
AU  - Megan Y. Dennis
AU  - Daniela C. Soto
AU  - Ruta Sahasrabudhe
AU  - Gulhan Kaya
AU  - Josh Quick
AU  - Nicholas J. Loman
AU  - Nadine Holmes
AU  - Matthew Loose
AU  - Urvashi Surti
AU  - Rosa Ana Risques
AU  - Tina A. Graves-Lindsay
AU  - Robert Fulton
AU  - Ira Hall
AU  - Benedict Paten
AU  - Kerstin Howe
AU  - Winston Timp
AU  - Alice Young
AU  - James C. Mullikin
AU  - Pavel A. Pevzner
AU  - Jennifer L. Gerton
AU  - Beth A. Sullivan
AU  - Evan E. Eichler
AU  - Adam M. Phillippy
Y1  - 2019/01/01
UR  - http://biorxiv.org/content/early/2019/08/16/735928.2.abstract
N2  - After nearly two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no one chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. The remaining gaps include ribosomal rDNA arrays, large near-identical segmental duplications, and satellite DNA arrays. These regions harbor largely unexplored variation of unknown consequence, and their absence from the current reference genome can lead to experimental artifacts and hide true variants when re-sequencing additional human genomes.
Here we present a de novo human genome assembly that surpasses the continuity of GRCh38 [2], along with the first gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the ∼2.8 megabase centromeric satellite DNA array and closed all 29 remaining gaps in the current reference, including new sequence from the human pseudoautosomal regions and cancer-testis ampliconic gene families (CT-X and GAGE). This complete chromosome X, combined with the ultra-long nanopore data, also allowed us to map methylation patterns across complex tandem repeats and satellite arrays for the first time. These results demonstrate that finishing the human genome is now within reach and will enable ongoing efforts to complete the remaining human chromosomes.
ER  - 