Haplotyping germline and cancer genomes with high-throughput linked-read sequencing

Nat Biotechnol. 2016 Mar;34(3):303-11. doi: 10.1038/nbt.3432. Epub 2016 Feb 1.

Abstract

Haplotyping of human chromosomes is a prerequisite for cataloguing the full repertoire of genetic variation. We present a microfluidics-based, linked-read sequencing technology that can phase and haplotype germline and cancer genomes using nanograms of input DNA. This high-throughput platform prepares barcoded libraries for short-read sequencing and computationally reconstructs long-range haplotype and structural variant information. We generate haplotype blocks in a nuclear trio that are concordant with expected inheritance patterns and phase a set of structural variants. We also resolve the structure of the EML4-ALK gene fusion in the NCI-H2228 cancer cell line using phased exome sequencing. Finally, we assign genetic aberrations to specific megabase-scale haplotypes generated from whole-genome sequencing of a primary colorectal adenocarcinoma. This approach resolves haplotype information using up to 100 times less genomic DNA than some methods and enables the accurate detection of structural variants.
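The abstract does not describe how the long-range haplotypes are reconstructed from barcoded short reads, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes that reads sharing a barcode at a locus derive from one long DNA molecule and therefore one haplotype, so alleles seen under the same barcode at neighbouring heterozygous sites can be linked. The function name `phase_by_barcode`, the `(barcode, position, allele)` input tuples, and the `min_support` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def phase_by_barcode(observations, min_support=1):
    """Toy phasing of heterozygous sites from barcoded allele observations.

    observations: iterable of (barcode, position, allele), allele in {0, 1}.
    Returns a list of phase blocks; each block maps position -> the allele
    placed on an arbitrarily chosen "haplotype A" of that block.
    Assumption (not from the source): one barcode == one haplotype per locus.
    """
    # Collect the allele each barcode carries at each heterozygous site.
    by_barcode = defaultdict(dict)
    for barcode, pos, allele in observations:
        by_barcode[barcode][pos] = allele

    positions = sorted({p for calls in by_barcode.values() for p in calls})
    if not positions:
        return []

    # Anchor the first site: haplotype A carries allele 0 there.
    blocks = [{positions[0]: 0}]
    for prev, cur in zip(positions, positions[1:]):
        cis = trans = 0
        # Count barcodes that cover both neighbouring sites.
        for calls in by_barcode.values():
            if prev in calls and cur in calls:
                if calls[prev] == calls[cur]:
                    cis += 1    # same allele code on one molecule
                else:
                    trans += 1  # opposite allele codes on one molecule
        if max(cis, trans) < min_support or cis == trans:
            # No usable linkage between the two sites: start a new block.
            blocks.append({cur: 0})
        else:
            block = blocks[-1]
            block[cur] = block[prev] if cis > trans else 1 - block[prev]
    return blocks


# Made-up observations for illustration only.
observations = [
    ("BX1", 100, 0), ("BX1", 200, 1),
    ("BX2", 100, 1), ("BX2", 200, 0),
    ("BX3", 200, 1), ("BX3", 300, 1),
    ("BX4", 900, 0),
]
blocks = phase_by_barcode(observations)
print(blocks)
```

In this toy input, the sites at 100, 200 and 300 are joined into one block because barcodes span each neighbouring pair, while the site at 900 shares no barcode with the others and starts its own block. The sketch links only adjacent sites and ignores sequencing errors and barcode collisions, which a real phasing method would have to handle.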

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA / genetics
  • Genome, Human
  • Genomic Structural Variation
  • Germ Cells
  • Haplotypes / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasms / genetics*
  • Nucleic Acid Conformation
  • Oncogene Proteins, Fusion / genetics
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA / methods*

Substances

  • EML4-ALK fusion protein, human
  • Oncogene Proteins, Fusion
  • DNA