RT Journal Article
SR Electronic
T1 Mapping the drivers of within-host pathogen evolution using massive data sets
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 155242
DO 10.1101/155242
A1 Duncan S. Palmer
A1 Isaac Turner
A1 Sarah Fidler
A1 John Frater
A1 Philip Goulder
A1 Dominique Goedhals
A1 Kuan-Hsiang Gary Huang
A1 Annette Oxenius
A1 Rodney Phillips
A1 Roger Shapiro
A1 Cloete van Vuuren
A1 Angela R. McLean
A1 Gil McVean
YR 2017
UL http://biorxiv.org/content/early/2017/06/25/155242.abstract
AB Differences among hosts, resulting from genetic variation in the immune system or heterogeneity in drug treatment, can impact within-host pathogen evolution. Identifying such interactions can potentially be achieved through genetic association studies. However, extensive and correlated genetic population structure in hosts and pathogens presents a substantial risk of confounding analyses. Moreover, the multiple testing burden of interaction scanning can potentially limit power. To address these problems, we have developed a Bayesian approach for detecting host influences on pathogen evolution that makes use of vast existing data sets of pathogen diversity to improve power and control for stratification. The approach models key processes, including recombination and selection, and identifies regions of the pathogen genome affected by host factors. Using simulations and empirical analysis of drug-induced selection on the HIV-1 genome we demonstrate the power of the method to recover known associations and show greatly improved precision-recall characteristics compared to other approaches. We build a high-resolution map of HLA-induced selection in the HIV-1 genome, identifying novel epitope-allele combinations.