Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes

Nat Methods. 2009 Apr;6(4):291-5. doi: 10.1038/nmeth.1311. Epub 2009 Mar 15.

Abstract

Amplification artifacts introduced during library preparation for the Illumina Genome Analyzer increase the likelihood that an appreciable proportion of the resulting sequences will be duplicates, and they cause an uneven distribution of read coverage across the targeted sequencing regions. These features make genome assembly and variation analysis from the short reads difficult, particularly when the sequences come from genomes with extremely high or low G+C content. Here we present an amplification-free method of library preparation in which the cluster amplification step, rather than PCR, enriches for fully ligated template strands. This reduces the incidence of duplicate sequences, improves read mapping and single nucleotide polymorphism calling, and aids de novo assembly. We illustrate this by generating and analyzing DNA sequences from extremely (G+C)-poor (Plasmodium falciparum), (G+C)-neutral (Escherichia coli) and (G+C)-rich (Bordetella pertussis) genomes.
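
The two quantities the abstract turns on, G+C content and the proportion of duplicate reads, can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. The function names `gc_fraction` and `duplicate_fraction` are hypothetical, and treating two reads as duplicates when their sequences are identical is a simplifying assumption; duplicates are often identified instead from the mapped positions of reads.

```python
from collections import Counter


def gc_fraction(seq: str) -> float:
    """Return the fraction of G or C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Ambiguous bases such as N are ignored; an empty sequence yields 0.0.
    return gc / total if total else 0.0


def duplicate_fraction(reads) -> float:
    """Return the fraction of reads that repeat an already-seen sequence.

    Assumption: duplicates are defined by exact sequence identity.
    """
    counts = Counter(read.upper() for read in reads)
    total = sum(counts.values())
    # Every distinct sequence contributes one original; the rest are duplicates.
    return (total - len(counts)) / total if total else 0.0


if __name__ == "__main__":
    # Toy reads, invented for illustration only.
    reads = ["ATATAT", "ATATAT", "GGCCGG", "ATGCAT"]
    print(duplicate_fraction(reads))
    print([gc_fraction(r) for r in reads])
```

On real data, the same two numbers would be computed per library to compare an amplified preparation with an amplification-free one.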

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition*
  • Base Sequence
  • Chromosome Mapping / methods*
  • Gene Library*
  • Molecular Sequence Data
  • Nucleic Acid Amplification Techniques
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Analysis, DNA / methods*