RT Journal Article
SR Electronic
T1 Bayesian analysis of genetic association across tree-structured routine healthcare data in the UK Biobank
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 105122
DO 10.1101/105122
A1 Adrian Cortes
A1 Calliope A. Dendrou
A1 Allan Motyer
A1 Luke Jostins
A1 Damjan Vukcevic
A1 Alexander Dilthey
A1 Peter Donnelly
A1 Stephen Leslie
A1 Lars Fugger
A1 Gil McVean
YR 2017
UL http://biorxiv.org/content/early/2017/02/01/105122.abstract
AB Genetic discovery from the multitude of phenotypes extractable from routine healthcare data has the ability to radically transform our understanding of the human phenome, thereby accelerating progress towards precision medicine. However, a critical question when analysing high-dimensional and heterogeneous data is how to interrogate increasingly specific subphenotypes whilst retaining statistical power to detect genetic associations. Here we develop and employ a novel Bayesian analysis framework that exploits the hierarchical structure of diagnosis classifications to jointly analyse genetic variants against UK Biobank healthcare phenotypes. Our method displays a more than 20% increase in power to detect genetic effects over other approaches, such that we uncover the broader burden of genetic variation: we identify associations with over 2,000 diagnostic terms. We find novel associations with common immune-mediated diseases (IMD), we reveal the extent of genetic sharing between specific IMDs, and we expose differences in disease perception or diagnosis with potential clinical implications.