Assembly and diploid architecture of an individual human genome via single-molecule technologies

Nat Methods. 2015 Aug;12(8):780-6. doi: 10.1038/nmeth.3454. Epub 2015 Jun 29.

Abstract

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
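The scaffold N50 quoted above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembled length. The following is a minimal sketch of that conventional definition, not the authors' pipeline. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Conventional definition: sort lengths from longest to shortest and
    walk down the list until the running total reaches at least half of
    the summed length; the length at which that happens is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    if any(length <= 0 for length in ordered):
        raise ValueError("lengths must be positive")

    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Unreachable for non-empty positive input; kept for type checkers.
    return ordered[-1]


# Toy scaffold lengths in arbitrary units (illustrative only, not real data).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_scaffolds)
```

With these made-up lengths the total is 290, and the two largest scaffolds (80 + 70 = 150) already cover at least half of it, so the function returns 70. A higher N50 means that fewer, longer scaffolds account for most of the assembly, which is the sense in which the abstract describes improved contiguity.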

MeSH terms

  • Algorithms
  • Chromosome Mapping
  • Computational Biology / methods*
  • Diploidy
  • Gene Library
  • Genetic Variation
  • Genome
  • Genome, Human*
  • Haplotypes
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Nucleotides / genetics
  • Polymorphism, Single Nucleotide*
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Tandem Repeat Sequences

Substances

  • Nucleotides