RT Journal Article
SR Electronic
T1 Highly-accurate long-read sequencing improves variant detection and assembly of a human genome
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 519025
DO 10.1101/519025
A1 Aaron M. Wenger
A1 Paul Peluso
A1 William J. Rowell
A1 Pi-Chuan Chang
A1 Richard J. Hall
A1 Gregory T. Concepcion
A1 Jana Ebler
A1 Arkarachai Fungtammasan
A1 Alexey Kolesnikov
A1 Nathan D. Olson
A1 Armin Töpfer
A1 Chen-Shan Chin
A1 Michael Alonge
A1 Medhat Mahmoud
A1 Yufeng Qian
A1 Adam M. Phillippy
A1 Michael C. Schatz
A1 Gene Myers
A1 Mark A. DePristo
A1 Jue Ruan
A1 Tobias Marschall
A1 Fritz J. Sedlazeck
A1 Justin M. Zook
A1 Heng Li
A1 Sergey Koren
A1 Andrew Carroll
A1 David R. Rank
A1 Michael W. Hunkapiller
YR 2019
UL http://biorxiv.org/content/early/2019/01/13/519025.abstract
AB The major DNA sequencing technologies in use today produce either highly-accurate short reads or noisy long reads. We developed a protocol based on single-molecule, circular consensus sequencing (CCS) to generate highly-accurate (99.8%) long reads averaging 13.5 kb and applied it to sequence the well-characterized human HG002/NA24385. We optimized existing tools to comprehensively detect variants, achieving precision and recall above 99.91% for SNVs, 95.98% for indels, and 95.99% for structural variants. We estimate that 2,434 discordances are correctable mistakes in the high-quality Genome in a Bottle benchmark. Nearly all (99.64%) variants are phased into haplotypes, which further improves variant detection. De novo assembly produces a highly contiguous and accurate genome with contig N50 above 15 Mb and concordance of 99.998%. CCS reads match short reads for small variant detection, while enabling structural variant detection and de novo assembly at similar contiguity and markedly higher concordance than noisy long reads.