LDx: estimation of linkage disequilibrium from high-throughput pooled resequencing data

PLoS One. 2012;7(11):e48588. doi: 10.1371/journal.pone.0048588. Epub 2012 Nov 9.

Abstract

High-throughput pooled resequencing offers significant potential for whole-genome population sequencing. Its main drawback, however, is the loss of haplotype information. To regain some of this information, we present LDx, a computational tool for estimating linkage disequilibrium (LD) from pooled resequencing data. LDx uses an approximate maximum likelihood approach to estimate LD (r²) between pairs of SNPs that can be observed within and among single reads. LDx also reports r² estimates derived solely from observed genotype counts. We demonstrate that the LDx estimates are highly correlated with r² estimated from individually resequenced strains. We discuss the performance of LDx under more stringent quality conditions and infer via simulation the degree to which performance can improve with read depth. Finally, we demonstrate two possible uses of LDx with real and simulated pooled resequencing data. First, we use LDx to infer genome-wide patterns of decay of LD with physical distance in D. melanogaster population resequencing data. Second, we demonstrate that r² estimates from LDx can distinguish alternative demographic models representing plausible demographic histories of D. melanogaster.
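
As an illustration of the count-based estimate mentioned above, the following minimal sketch computes the standard r² statistic, D² / (pA(1 − pA) pB(1 − pB)), from the number of reads that carry each of the four allele combinations at a pair of SNPs. It is not the LDx implementation and does not reproduce its approximate maximum likelihood estimator; the function name `r_squared` and the four count arguments are hypothetical names chosen for this example.

```python
import math


def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Direct r^2 from counts of reads spanning two biallelic SNPs.

    Hypothetical helper for illustration only, not part of LDx.
    n_AB, n_Ab, n_aB, n_ab are the numbers of reads (or read pairs)
    carrying each combination of alleles at the two sites.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no reads cover both SNPs")

    # Allele frequencies at each site, estimated from the same reads.
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n

    # r^2 is undefined when either site is monomorphic in these reads.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return math.nan

    # D is the excess of the AB haplotype over its expectation
    # under independence of the two sites.
    d = n_AB / n - p_A * p_B
    return d * d / denom
```

Calling `r_squared(10, 0, 0, 10)` describes two SNPs whose alleles always co-occur on the same read (complete LD), while `r_squared(5, 5, 5, 5)` describes alleles that are independent across reads. Because the estimate uses only reads covering both sites, it is limited to SNP pairs within a read or read-pair span and becomes noisy at low depth.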

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Animals
  • Drosophila melanogaster / genetics
  • Genetic Loci
  • Haplotypes
  • High-Throughput Nucleotide Sequencing*
  • Internet
  • Linkage Disequilibrium*
  • Polymorphism, Single Nucleotide
  • Reproducibility of Results
  • Sequence Analysis, DNA*
  • Software*