Abstract
High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. One side-effect of these methods, however, is that one or more alleles at a particular locus may not be sequenced, particularly when the sequencing depth is low, resulting in some heterozygous genotypes being called as homozygous. Under-called heterozygous genotypes have a profound effect on the estimation of linkage disequilibrium and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, that estimates pairwise linkage disequilibrium from low-coverage sequencing data while accounting for under-called heterozygous genotypes. Our findings show that GUS-LD gave accurate estimates on low-coverage sequencing data, whereas linkage disequilibrium was underestimated when no adjustment was made for under-called heterozygotes.
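
To illustrate why low depth produces under-called heterozygotes, the following minimal sketch computes the chance that a truly heterozygous genotype shows reads from only one allele. It is a simplified illustration and not the GUS-LD likelihood itself: it assumes each read samples either allele with probability 1/2, reads are independent, and there is no sequencing error. The function name `prob_het_called_hom` is hypothetical.

```python
def prob_het_called_hom(depth: int) -> float:
    """Probability that a heterozygote yields reads from only one allele.

    Assumptions (illustrative only): each read samples either allele with
    probability 1/2, reads are independent, and there is no sequencing error.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # All reads carry the first allele (0.5 ** depth) or all carry the
    # second allele (0.5 ** depth); the two events are mutually exclusive.
    return 2 * 0.5 ** depth


if __name__ == "__main__":
    for depth in (1, 2, 4, 8):
        print(depth, prob_het_called_hom(depth))
```

Under these assumptions, a heterozygote is always seen as homozygous at depth 1, half of the time at depth 2, and 12.5% of the time at depth 4, which is why unadjusted genotype calls at low coverage can distort linkage disequilibrium estimates.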