Natural genetic variation caused by small insertions and deletions in the human genome

  1. Scott E. Devine1,3,4,6,7,9
  1. Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA;
  2. Bimcore, Emory University, Atlanta, Georgia 30322, USA;
  3. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA;
  4. Division of Endocrinology, Diabetes, and Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA;
  5. MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom;
  6. Winship Cancer Institute, Emory University, Atlanta, Georgia 30322, USA;
  7. Greenebaum Cancer Center, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
    • 8 Present address: Department of Pathology, Brigham and Women's Hospital, Boston, MA 02115, USA.

    Abstract

    Human genetic variation is expected to play a central role in personalized medicine. Yet only a fraction of the natural genetic variation harbored by humans has been discovered to date. Here we report almost 2 million small insertions and deletions (INDELs), ranging from 1 bp to 10,000 bp in length, in the genomes of 79 diverse humans. These variants include 819,363 small INDELs that map to human genes. Small INDELs were frequently found in the coding exons of these genes, and several lines of evidence indicate that such variation is a major determinant of human biological diversity. Microarray-based genotyping experiments revealed several interesting observations regarding the population genetics of small INDEL variation. For example, we found that many of our INDELs had high levels of linkage disequilibrium (LD) with both HapMap SNPs and high-scoring SNPs from genome-wide association studies. Overall, our study indicates that small INDEL variation is likely to be a key factor underlying inherited traits and diseases in humans.

    Footnotes

    • 9 Corresponding author.

      E-mail sdevine{at}som.umaryland.edu.

    • [Supplemental material is available for this article. The microarray data from this study have been submitted to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession no. GSE27889. The INDEL variants reported in this study have been deposited in the NCBI dbSNP database (http://www.ncbi.nlm.nih.gov/projects/SNP/) (a complete listing of the accession numbers can be found in Supplemental Table 17).]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.115907.110.

    • Received September 27, 2010.
    • Accepted March 18, 2011.