Rapid multiplexed genotyping of simple tandem repeats using capture and high-throughput sequencing

Hum Mutat. 2013 Sep;34(9):1304-11. doi: 10.1002/humu.22359. Epub 2013 Jun 17.

Abstract

Although simple tandem repeats (STRs) comprise ~2% of the human genome and represent an important source of polymorphism, this class of variation remains understudied. We have developed a cost-effective strategy for targeted enrichment of STR regions that uses capture probes targeting the flanking sequences of STR loci, enabling specific capture of DNA fragments containing STRs for subsequent high-throughput sequencing. Using a capture design targeting 6,243 STR loci <94 bp and multiplexing eight individuals in a single Illumina HiSeq2000 sequencing lane, we were able to call genotypes in at least one individual for 67.5% of the targeted STRs. We observed a strong relationship between (G+C) content and genotyping rate: STRs with moderate (G+C) content were recovered with a >90% success rate, whereas only 12% of STRs with ≥80% (G+C) were genotyped in our assay. Analysis of a parent-offspring trio, complete hydatidiform mole samples, repeat analyses of the same individual, and Sanger sequencing-based validation indicated genotyping error rates between 7.6% and 12.4%. Most of these errors were differences of a single repeat unit at mono- or dinucleotide repeats. Altogether, our STR capture assay represents a cost-effective method that enables multiplexed genotyping of thousands of STR loci and is suitable for large-scale population studies.
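
The relationship between (G+C) content and genotyping rate described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the choice of five equal-width (G+C) bins, and the toy sequences are all assumptions made for illustration, and the example data are invented rather than taken from the study.

```python
from collections import defaultdict


def gc_count(seq):
    """Count G and C bases in a DNA sequence (case-insensitive)."""
    return sum(base in "GC" for base in seq.upper())


def genotyping_rate_by_gc(loci, n_bins=5):
    """Return the fraction of genotyped loci within each (G+C) bin.

    loci   : iterable of (sequence, was_genotyped) pairs (hypothetical input format)
    n_bins : number of equal-width (G+C) bins; with 5 bins, bin 0 covers
             0-20% and bin 4 covers 80-100% (G+C)
    """
    totals = defaultdict(int)
    called = defaultdict(int)
    for seq, was_genotyped in loci:
        if not seq:
            raise ValueError("empty sequence")
        # Integer arithmetic avoids floating-point errors at bin edges;
        # min() places a 100% (G+C) sequence in the top bin.
        idx = min(gc_count(seq) * n_bins // len(seq), n_bins - 1)
        totals[idx] += 1
        called[idx] += bool(was_genotyped)
    return {idx: called[idx] / totals[idx] for idx in sorted(totals)}


# Invented toy loci, for illustration only (not data from the study).
toy_loci = [
    ("CACACACACA", True),     # 50% (G+C)
    ("ATATATATAT", True),     # 0% (G+C)
    ("CAGCAGCAGCAG", True),   # ~67% (G+C)
    ("GCGCGCGCGC", False),    # 100% (G+C)
    ("CGGCGGCGGCGG", False),  # 100% (G+C)
]
rates = genotyping_rate_by_gc(toy_loci)
```

In a real analysis the (G+C) fraction would more plausibly be computed over the captured fragment, flanking sequence included, rather than over the repeat tract alone. The abstract does not say which definition the reported figures use.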

Keywords: genome instability; high-throughput sequencing; microsatellite; repeat variation; sequence capture.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition
  • Genetic Variation
  • Genome, Human
  • Genomics / methods*
  • Genotype
  • HapMap Project
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Reproducibility of Results
  • Tandem Repeat Sequences*