Population-scale analysis of human microsatellites reveals novel sources of exonic variation

Gene. 2013 Mar 10;516(2):328-34. doi: 10.1016/j.gene.2012.12.068. Epub 2012 Dec 26.

Abstract

Using our microsatellite-specific genotyping method, we analyzed tandem repeats, which are known to be highly variable and some of which are recognized as biomarkers causative of disease, in over 500 individuals whose exons were sequenced in a 1000 Genomes Project pilot study. We were able to genotype over 97% of the microsatellite loci in the targeted regions. A total of 25,115 variations were observed, including repeat length and single nucleotide polymorphisms, corresponding to an average of 45.6 variations per individual and a density of 1.1 variations per kilobase. Standard variant detection did not report 94.2% of the exonic repeat length variations, in part because the alignment techniques are not ideal for repetitive regions. Additionally, some standard variation detection tools rely on a database of known variations, making them less likely to call repeat length variations, as only a small percentage of these loci (~6000) have been accurately characterized. A subset of the hundreds of non-synonymous variations we identified was experimentally validated, indicating an accuracy of 96.5% for our microsatellite-based genotyping method, with some novel variants identified in genes associated with cancer. We propose that microsatellite-based genotyping be used as part of large-scale sequencing studies to identify novel variants.
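
To make the notion of a repeat length variation concrete, the following is a minimal illustrative sketch and not the genotyping method described in the abstract. It assumes a read fully spans the repeat and contains exact copies of both flanking sequences. The function name `repeat_count`, the flank sequences, and the CAG repeat unit are all hypothetical choices made for the example.

```python
import re


def repeat_count(read, left_flank, right_flank, unit):
    """Count tandem copies of `unit` lying between two exact flanks.

    Returns None when the read does not contain both flanks around an
    uninterrupted run of the repeat unit (an assumption of this sketch).
    """
    pattern = (
        re.escape(left_flank)
        + "((?:" + re.escape(unit) + ")+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(unit)


# Hypothetical sequences: the reference carries 5 CAG units, the sample 7.
reference = "GATTACA" + "CAG" * 5 + "TTGACC"
sample = "GATTACA" + "CAG" * 7 + "TTGACC"

ref_units = repeat_count(reference, "GATTACA", "TTGACC", "CAG")
sample_units = repeat_count(sample, "GATTACA", "TTGACC", "CAG")

# A non-zero difference is a repeat length variation relative to the reference.
unit_difference = sample_units - ref_units
```

Anchoring on the flanks rather than on an alignment of the repeat itself reflects, in simplified form, why repetitive regions are hard for standard alignment: the repeat units are interchangeable, so only the unique flanking sequence fixes where the repeat begins and ends.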

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Exons / genetics*
  • Genetic Variation* / physiology
  • Genetics, Population
  • Genome, Human / genetics
  • Genotype
  • Humans
  • Microsatellite Repeats / genetics*
  • Molecular Sequence Data
  • Pilot Projects
  • Polymorphism, Single Nucleotide / physiology
  • Sequence Analysis, DNA
  • Validation Studies as Topic