PT - JOURNAL ARTICLE
AU - Duncan S. Palmer
AU - Isaac Turner
AU - Sarah Fidler
AU - John Frater
AU - Philip Goulder
AU - Dominique Goedhals
AU - Kuan-Hsiang Gary Huang
AU - Annette Oxenius
AU - Rodney Phillips
AU - Roger Shapiro
AU - Cloete van Vuuren
AU - Angela R. McLean
AU - Gil McVean
TI - Mapping the drivers of within-host pathogen evolution using massive data sets
AID - 10.1101/155242
DP - 2017 Jan 01
TA - bioRxiv
PG - 155242
4099 - http://biorxiv.org/content/early/2017/06/25/155242.short
4100 - http://biorxiv.org/content/early/2017/06/25/155242.full
AB - Differences among hosts, resulting from genetic variation in the immune system or heterogeneity in drug treatment, can impact within-host pathogen evolution. Identifying such interactions can potentially be achieved through genetic association studies. However, extensive and correlated genetic population structure in hosts and pathogens presents a substantial risk of confounding analyses. Moreover, the multiple testing burden of interaction scanning can potentially limit power. To address these problems, we have developed a Bayesian approach for detecting host influences on pathogen evolution that makes use of vast existing data sets of pathogen diversity to improve power and control for stratification. The approach models key processes, including recombination and selection, and identifies regions of the pathogen genome affected by host factors. Using simulations and empirical analysis of drug-induced selection on the HIV-1 genome we demonstrate the power of the method to recover known associations and show greatly improved precision-recall characteristics compared to other approaches. We build a high-resolution map of HLA-induced selection in the HIV-1 genome, identifying novel epitope-allele combinations.