Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation

Am J Hum Genet. 2005 Mar;76(3):449-62. doi: 10.1086/428594. Epub 2005 Jan 31.

Abstract

Although many algorithms exist for estimating haplotypes from genotype data, none of them take full account of both the decay of linkage disequilibrium (LD) with distance and the order and spacing of genotyped markers. Here, we describe an algorithm that does take these factors into account, using a flexible model for the decay of LD with distance that can handle both "blocklike" and "nonblocklike" patterns of LD. We compare the accuracy of this approach with a range of other available algorithms in three ways: for reconstruction of randomly paired, molecularly determined male X chromosome haplotypes; for reconstruction of haplotypes obtained from trios in an autosomal region; and for estimation of missing genotypes in 50 autosomal genes that have been completely resequenced in 24 African Americans and 23 individuals of European descent. For the autosomal data sets, our new approach clearly outperforms the best available methods, whereas its accuracy in inferring the X chromosome haplotypes is only slightly superior. For estimation of missing genotypes, our method performed slightly better when the two subsamples were combined than when they were analyzed separately, which illustrates its robustness to population stratification. Our method is implemented in the software package PHASE (v2.1.1), available from the Stephens Lab Web site.
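
As a rough illustration of what LD between a pair of markers means in practice, the sketch below computes the standard pairwise r² statistic from phased haplotypes. It is not the model implemented in PHASE, which the abstract does not specify; the function name `pairwise_r2` and the 0/1 allele coding are assumptions made for this example only.

```python
from typing import Sequence


def pairwise_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Return r^2 between two biallelic markers.

    hap_a[i] and hap_b[i] are the alleles (coded 0 or 1, an assumed
    encoding) carried by haplotype i at marker A and marker B.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("both markers need the same non-zero number of haplotypes")

    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # frequency of allele 1 at A
    p_b = sum(hap_b) / n                                  # frequency of allele 1 at B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                                  # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a marker is monomorphic")
    return d * d / denom


# Two markers whose alleles always co-occur are in complete LD.
linked = pairwise_r2([0, 0, 1, 1], [0, 0, 1, 1])
# Two markers whose alleles are independent show no LD.
unlinked = pairwise_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

Computed over marker pairs at increasing physical distance, values of this statistic typically fall toward 0, which is the decay the title refers to.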

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Alleles
  • Biometry
  • Chromosomes, Human, X / genetics
  • Data Interpretation, Statistical
  • Genomics / statistics & numerical data*
  • Genotype
  • Haplotypes / genetics*
  • Humans
  • Linkage Disequilibrium*
  • Male
  • Models, Genetic*
  • Software