A unifying statistical framework to discover disease genes from GWASs

Cell Genom. 2023 Mar 8;3(3):100264. doi: 10.1016/j.xgen.2023.100264.

Abstract

Genome-wide association studies (GWASs) identify genomic loci associated with complex traits, but it remains a challenge to identify the genes affected by causal genetic variants in these loci. Attempts to solve this challenge are frustrated by a number of compounding problems. Here, we show how to combine solutions to these problems into a unified mathematical framework. From this synthesis, it becomes possible to compute the probability that each gene in the genome is affected by a causal variant, given a particular trait, without making assumptions about the relevant cell types or tissues. We validate each component of the framework individually and in combination. When applied to large GWASs of human disease, the resulting paradigm can rediscover the majority of well-known disease genes. Moreover, it establishes human genetics support for many genes previously implicated only by clinical or preclinical evidence, and it uncovers a plethora of novel disease genes with compelling biological rationale.

Keywords: DNase I hypersensitive sites; causal genes; drug discovery; enhancers; functional genomics; genome-wide association study; linkage disequilibrium; statistical fine-mapping.