BAsE-Seq: a method for obtaining long viral haplotypes from short sequence reads

Genome Biol. 2014;15(11):517. doi: 10.1186/PREACCEPT-6768001251451949.

Abstract

We present Barcode-directed Assembly for Extra-long Sequences (BAsE-Seq), a method for obtaining long haplotypes of over 3 kb using a short-read sequencer. BAsE-Seq relies on transposing a template-specific barcode onto random segments of the template molecule and assembling the barcoded short reads into complete haplotypes. We applied BAsE-Seq to mixed clones of hepatitis B virus and accurately identified haplotypes occurring at frequencies of 0.4% or greater, with >99.9% specificity. Applying BAsE-Seq to a clinical sample, we obtained over 9,000 viral haplotypes, which provided an unprecedented view of hepatitis B virus population structure during chronic infection. BAsE-Seq is readily applicable to monitoring quasispecies evolution in viral diseases.
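
The core idea of barcode-directed assembly can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes reads have already been aligned to a reference and tagged with their template barcode, and the function names (`assemble_haplotypes`, `haplotype_frequencies`), the `(barcode, start, sequence)` read format, and the example data are all hypothetical. Reads sharing a barcode are piled up, a majority base is called at each variant position, and barcodes that cover every variant position yield one haplotype each.

```python
from collections import Counter, defaultdict


def assemble_haplotypes(reads, snv_positions, min_coverage=1):
    """Group aligned reads by barcode and call one haplotype per barcode.

    reads: iterable of (barcode, start, sequence), where start is the
           0-based reference coordinate of the first base (assumed format,
           gapless alignment only).
    snv_positions: reference coordinates that define the haplotype.
    Barcodes lacking min_coverage at any SNV position are skipped.
    """
    # pileup[barcode][position] counts the bases observed at that position
    pileup = defaultdict(lambda: defaultdict(Counter))
    for barcode, start, seq in reads:
        for offset, base in enumerate(seq):
            pileup[barcode][start + offset][base] += 1

    haplotypes = {}
    for barcode, columns in pileup.items():
        calls = []
        for pos in snv_positions:
            counts = columns.get(pos)
            if counts is None or sum(counts.values()) < min_coverage:
                break  # incomplete haplotype: discard this barcode
            calls.append(counts.most_common(1)[0][0])  # majority base
        else:
            haplotypes[barcode] = "".join(calls)
    return haplotypes


def haplotype_frequencies(haplotypes):
    """Return the fraction of barcodes (templates) carrying each haplotype."""
    counts = Counter(haplotypes.values())
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Made-up reads from four barcoded templates; SNVs at positions 0 and 4.
reads = [
    ("BC1", 0, "ACGT"), ("BC1", 2, "GTTA"),
    ("BC2", 0, "ACGT"), ("BC2", 2, "GTCA"),
    ("BC3", 0, "TCGT"), ("BC3", 3, "TTA"),
    ("BC4", 0, "ACG"),  # never covers position 4, so it is dropped
]
haplotypes = assemble_haplotypes(reads, snv_positions=[0, 4])
freqs = haplotype_frequencies(haplotypes)
```

Because each barcode marks a single template molecule, counting barcodes per haplotype gives a frequency estimate at the level of templates rather than reads. A real analysis would also need to handle sequencing errors, barcode collisions, indels, and chimeric molecules, none of which this sketch addresses.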

Trial registration: ClinicalTrials.gov NCT00962871.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genetic Variation
  • Haplotypes / genetics*
  • Hepatitis B / genetics
  • Hepatitis B / virology
  • Hepatitis B virus / genetics*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Sequence Analysis, DNA / methods*
  • Software

Associated data

  • ClinicalTrials.gov/NCT00962871