TY  - JOUR
T1  - Approaches to estimating inbreeding coefficients in clinical isolates of Plasmodium falciparum from genomic sequence data
JF  - bioRxiv
DO  - 10.1101/021519
SP  - 021519
AU  - Lucas Amenga-Etego
AU  - Ruiqi Li
AU  - John D. O’Brien
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/05/03/021519.abstract
N2  - The advent of whole-genome sequencing has generated increased interest in modeling the structure of strain mixture within clinical infections of Plasmodium falciparum (Pf). The life cycle of the parasite implies that the mixture of multiple strains within an infected individual is related to the out-crossing rate across populations, making methods for measuring this process in situ central to understanding the genetic epidemiology of the disease. In this paper, we show how to estimate inbreeding coefficients using genomic data from Pf clinical samples, providing a simple metric for assessing within-sample mixture that connects to an extensive literature in population genetics and conservation ecology. Features of the P. falciparum genome mean that some standard methods for inbreeding coefficients and related F-statistics cannot be used directly. Here, we review an initial effort to estimate the inbreeding coefficient within clinical isolates of P. falciparum and provide several generalizations using both frequentist and Bayesian approaches. The Bayesian approach connects these estimates to the Balding-Nichols model, a mainstay within genetic epidemiology. We provide simulation results on the performance of the estimators and show their use on ~1500 samples from the PF3K data set. We also compare the results to output from a recent mixture model for within-sample strain mixture, showing that inbreeding coefficients provide a strong proxy for the results of these more complex models. We provide the methods described within an open-source R package, pfmix.
ER  -