bioRxiv
New Results

Mixed model association for biobank-scale data sets

Po-Ru Loh, Gleb Kichaev, Steven Gazal, Armin P Schoech, Alkes L Price
doi: https://doi.org/10.1101/194944
Po-Ru Loh
1 Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, USA
2 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Gleb Kichaev
3 Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, California, USA
Steven Gazal
2 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
4 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
Armin P Schoech
2 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
4 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
5 Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
Alkes L Price
2 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
4 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
6 Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA

Abstract

Biobank-based genome-wide association studies (GWAS) are enabling exciting insights in complex trait genetics, but much uncertainty remains over best practices for optimizing statistical power and computational efficiency in GWAS while controlling confounders. Here, we introduce a much faster version of our BOLT-LMM Bayesian mixed model association method, capable of running analyses of the full UK Biobank cohort in a few days on a single compute node, and show that it produces highly powered, robust test statistics when run on all 459K European samples (retaining related individuals). When used to conduct a GWAS for height in UK Biobank, BOLT-LMM achieved power equivalent to linear regression on 650K samples, a 93% increase in effective sample size versus the common practice of analyzing unrelated British samples using linear regression (UK Biobank documentation; Bycroft et al. bioRxiv). Across a broader set of 23 highly heritable traits, the total number of independent GWAS loci detected increased from 5,839 to 10,759, an 84% increase. We recommend the use of BOLT-LMM (retaining related individuals) for biobank-scale analyses, and we have publicly released BOLT-LMM summary association statistics for the 23 traits analyzed as a resource for all researchers.
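
The two percentages quoted in the abstract are relative increases, and the arithmetic can be checked directly from the reported figures. The sketch below is only an illustration, not part of the authors' analysis: the helper `percent_increase` and the variable names are hypothetical, and the baseline sample size is not stated in the abstract, so it is back-calculated here from the reported 650K and 93% figures.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, expressed in percent."""
    return 100.0 * (after - before) / before


# Independent GWAS loci across the 23 traits, as reported in the abstract.
loci_before = 5_839
loci_after = 10_759
loci_gain_pct = percent_increase(loci_before, loci_after)

# Effective sample size for height, as reported in the abstract.
# The baseline is NOT given in the abstract; it is implied by the 93% figure.
effective_n = 650_000
reported_gain_pct = 93.0
implied_baseline_n = effective_n / (1.0 + reported_gain_pct / 100.0)

print(f"Loci increase: about {loci_gain_pct:.0f}%")
print(f"Implied baseline sample size: about {implied_baseline_n / 1000:.0f}K")
```

The loci figures reproduce the stated 84% increase, and the implied baseline of roughly 337K is a derived value that depends on how the reported 650K and 93% were rounded.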

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted January 04, 2018.
Subject Area

  • Genetics