Efficient Bayesian mixed-model analysis increases association power in large cohorts

Po-Ru Loh; George Tucker; Brendan K Bulik-Sullivan; Bjarni J Vilhjálmsson; Hilary K Finucane; Rany M Salem; Daniel I Chasman; Paul M Ridker; Benjamin M Neale; Bonnie Berger; Nick Patterson; Alkes L Price

doi:10.1038/ng.3190

Nat Genet. 2015 Mar;47(3):284-90. doi: 10.1038/ng.3190. Epub 2015 Feb 2.

Affiliations

¹ [1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA. [2] Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
² [1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA. [2] Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [3] Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts, USA.
³ [1] Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA. [2] Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA.
⁴ Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
⁵ [1] Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA. [2] Department of Endocrinology, Children's Hospital Boston, Boston, Massachusetts, USA.
⁶ Division of Preventive Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA.
⁷ [1] Department of Mathematics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [2] Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts, USA.
⁸ Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
⁹ [1] Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA. [2] Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA. [3] Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.

PMID: 25642633
PMCID: PMC4342297
DOI: 10.1038/ng.3190

Abstract

Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts and may not optimize power. All existing methods require time cost O(MN^2) (where N is the number of samples and M is the number of SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here we present a far more efficient mixed-model association method, BOLT-LMM, which requires only a small number of O(MN) time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to 9 quantitative traits in 23,294 samples from the Women's Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for genome-wide association studies in large cohorts.
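
The contrast between O(MN^2) and O(MN) costs can be illustrated with a small sketch. It is not the BOLT-LMM implementation; it only shows, under stated assumptions, one standard way an iteration can cost O(MN): the N x N relationship matrix X X^T / M is never formed, and each conjugate-gradient step instead multiplies by X^T and then by X. The function names `grm_matvec` and `solve_mixed_model_system` are hypothetical, X is assumed to be an N x M matrix of standardized genotypes, and the variance parameters `sigma2_g` and `sigma2_e` are assumed to be known.

```python
import numpy as np


def grm_matvec(X, v):
    """Return (X X^T / M) v without forming the N x N matrix.

    Two matrix-vector products, each O(MN), replace the O(MN^2)
    cost of building X X^T explicitly.
    """
    M = X.shape[1]
    return X @ (X.T @ v) / M


def solve_mixed_model_system(X, y, sigma2_g, sigma2_e, tol=1e-8, max_iter=200):
    """Solve (sigma2_g * X X^T / M + sigma2_e * I) a = y by conjugate gradient.

    Illustrative sketch only: variance parameters are assumed known,
    and every iteration costs O(MN) through grm_matvec.
    """
    def apply_V(v):
        return sigma2_g * grm_matvec(X, v) + sigma2_e * v

    a = np.zeros_like(y, dtype=float)
    r = y - apply_V(a)          # initial residual
    p = r.copy()                # initial search direction
    rs_old = r @ r
    y_norm = np.linalg.norm(y)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        Vp = apply_V(p)
        alpha = rs_old / (p @ Vp)
        a += alpha * p
        r -= alpha * Vp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * y_norm:   # relative residual test
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return a, n_iter
```

When the system is well conditioned, the number of iterations stays small relative to N, so the total cost grows roughly linearly in both M and N rather than quadratically in N.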

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Bayes Theorem*
Female
Genetic Association Studies / methods*
Genome, Human*
Genotyping Techniques
Humans
Linear Models
Polymorphism, Single Nucleotide
Quantitative Trait Loci
