Abstract
Many tasks in statistical genetics involve pairwise estimation of linkage disequilibrium (LD). The study of LD in diploids is mature. However, in polyploids, the field lacks a comprehensive characterization of LD. Polyploids also exhibit greater levels of genotype uncertainty than diploids, and yet no methods currently exist to estimate LD in polyploids in the presence of such genotype uncertainty. Furthermore, most LD estimation methods do not quantify the level of uncertainty in their LD estimates. Our paper contains three major contributions. (i) We characterize haplotypic and composite measures of LD in polyploids. These composite measures of LD turn out to be functions of common statistical measures of association. (ii) We derive procedures to estimate haplotypic and composite LD in polyploids in the presence of genotype uncertainty. We do this by estimating LD directly from genotype likelihoods, which may be obtained from many genotyping platforms. (iii) We derive standard errors of all LD estimators that we discuss. We validate our methods on both real and simulated data. Our methods are implemented in the R package ldsep, available on the Comprehensive R Archive Network (https://cran.r-project.org/package=ldsep).
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
(i) The terminology "gametic LD" has been changed to "haplotypic LD". A discussion of this choice is presented at the start of Section 2.1. (ii) The Discussion has been improved to include more applied aspects of the methodology. Specifically, we now discuss the applications, assumptions, and interpretations of the various LD estimators introduced in this manuscript. (iii) We have included new simulations to explore the effects of interpretable deviations from Hardy-Weinberg equilibrium, namely preferential pairing and double reduction. These are provided in Section S12 of the Supplementary Material.