Abstract
Individuals infected with the Plasmodium falciparum malaria parasite can carry multiple strains with varying levels of relatedness. Yet, how parameters of local epidemiology and the biology of transmission affect the rate and relatedness of such mixed infections remains unclear. Here, we develop an enhanced method for strain deconvolution from genome sequencing data, which estimates the number of strains, their proportions, identity-by-descent (IBD) profiles and individual haplotypes. We validate the method through experimental and in silico simulations and apply it to the Pf3k data set, consisting of 2,344 field samples from 13 countries. We find that the rate of mixed infection varies from 18% to 63% across countries and that 51% of all mixed infections involve more than two strains. By modelling the structure of IBD resulting from different infection mechanisms, we estimate that 55% of dual infections contain sibling strains likely to have been co-transmitted from a single mosquito, and find evidence of mixed infections propagated over successive infection cycles. By combining genetic data with epidemiological estimates of prevalence from the Malaria Atlas Project, we find that, at the country level, prevalence correlates with both the rate of mixed infection (Pearson r = 0.65, P = 3.7 × 10⁻⁶) and the level of IBD (r = −0.51, P = 6.0 × 10⁻⁴). Genomics is becoming a standard tool in pathogen surveillance. We conclude that monitoring fine-scale patterns of mixed infections and within-sample relatedness will be highly informative for assessing the impact of interventions and informing malaria control programs.