New Results

A scalable Bayesian method for integrating functional information in genome-wide association studies

Jingjing Yang, Lars G. Fritsche, Xiang Zhou, Gonçalo Abecasis, International Age-related Macular Degeneration Genomics Consortium (IAMDGC)
doi: https://doi.org/10.1101/101691
Jingjing Yang
1Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
Lars G. Fritsche
1Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
2K.G. Jebsen Center for Genetic Epidemiology, Department of Public Health, NTNU, Norwegian University of Science and Technology, Trondheim, Norway.
Xiang Zhou
1Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
  • For correspondence: xzhousph@umich.edu goncalo@umich.edu
Gonçalo Abecasis
1Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, 1415 Washington Heights, Ann Arbor, MI 48109, USA.
  • For correspondence: xzhousph@umich.edu goncalo@umich.edu

Abstract

Although genome-wide association studies (GWASs) have identified many risk loci for complex traits and common diseases, most of the identified associations reside in noncoding regions and have unknown biological functions. Recent genomic sequencing studies have produced a rich resource of annotations that help characterize the function of genetic variants. Integrative analysis that incorporates these functional annotations into GWASs can help elucidate the biological mechanisms underlying the identified associations and help prioritize causal variants. Here, we develop a novel, flexible Bayesian variable selection model with efficient computational techniques for such integrative analysis. Unlike previous approaches, our method models the effect-size distribution and probability of causality for variants with different annotations and jointly models genome-wide variants to account for linkage disequilibrium (LD), thus prioritizing associations based on the quantification of the annotations and allowing for multiple causal variants per locus. Our efficient computational algorithm dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-wise LD structure of human genomes. With simulations, we show that our method accurately quantifies the functional enrichment and has greater power to identify true causal variants than several competing methods. The power gain provided by our method is especially apparent when multiple causal variants in LD reside in the same locus. We also apply our method to an in-depth GWAS of age-related macular degeneration (AMD) with 33,976 individuals and 9,857,286 variants. We find the strongest enrichment for causality among non-synonymous variants (54x more likely to be causal, 1.4x larger effect-sizes) and variants in active promoters (7.8x more likely, 1.4x larger effect-sizes), and we identify 5 potentially novel loci in addition to the 32 known AMD risk loci. In conclusion, our method efficiently integrates functional information in GWASs, helping identify causal variants and the underlying biology.
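
To make the idea of annotation-specific causal probabilities and effect-size distributions concrete, the following is a minimal sketch of a spike-and-slab prior in which each annotation category has its own causal probability and effect-size variance. It is an illustration under stated assumptions, not the model as specified in the full text: the function names (`sample_effects`, `fold_enrichment`), the normal slab, and all numeric values are hypothetical, and the choice of the first category as the enrichment reference is arbitrary.

```python
import numpy as np


def sample_effects(annotation, pi, sigma2, rng):
    """Draw effect sizes from an annotation-specific spike-and-slab prior.

    annotation : integer category index for each variant
    pi         : causal probability for each category (hypothetical values)
    sigma2     : effect-size variance for each category (hypothetical values)
    """
    annotation = np.asarray(annotation)
    pi = np.asarray(pi, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    # Each variant is causal with the probability of its own category.
    causal = rng.random(annotation.size) < pi[annotation]
    # Causal variants get a normal effect; all others are exactly zero.
    slab = rng.normal(0.0, np.sqrt(sigma2[annotation]))
    return np.where(causal, slab, 0.0)


def fold_enrichment(values, reference=0):
    """Ratio of each category's parameter to that of a reference category."""
    values = np.asarray(values, dtype=float)
    return values / values[reference]


# Made-up parameters for two categories (illustration only).
pi = [1e-4, 2e-3]
sigma2 = [0.01, 0.02]
rng = np.random.default_rng(0)
annotation = rng.integers(0, 2, size=100_000)
beta = sample_effects(annotation, pi, sigma2, rng)
print("causal variants drawn:", int((beta != 0).sum()))
print("fold enrichment of causal probability:", fold_enrichment(pi))
```

Under such a prior, statements like "x times more likely to be causal" and "x times larger effect-sizes" correspond to ratios of the category-level parameters, which is what `fold_enrichment` computes here.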

Author summary

We propose a novel Bayesian hierarchical model to account for linkage disequilibrium (LD) and multiple functional annotations in GWAS, paired with an expectation-maximization Markov chain Monte Carlo (EM-MCMC) computational algorithm to jointly analyze genome-wide variants. Our method improves MCMC convergence to ensure accurate Bayesian inference of the functional enrichment pattern and of the fine-mapped association results. By applying our method to a real GWAS of age-related macular degeneration (AMD) with various functional annotations (i.e., gene-based, regulatory, and chromatin states), we find that variants with non-synonymous, coding, and active promoter annotations have the highest causal probability and the largest effect-sizes. In addition, our method produces fine-mapped association results in the identified risk loci, two of which (C2/CFB/SKIV2L and C3) are shown as examples and supported by haplotype analysis, model comparison, and conditional analysis. We therefore believe our integrative method will be useful for quantifying the enrichment pattern of functional annotations in GWAS and for prioritizing associations with respect to the learned enrichment pattern.
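
As a rough illustration of how an EM-MCMC scheme can learn an enrichment pattern, the sketch below shows a generic maximization step that updates category-level parameters from per-variant posterior summaries. It assumes that a preceding MCMC step has produced a posterior inclusion probability (PIP) and a posterior second moment of the effect for each variant, and it uses a plain maximum-likelihood update with no prior on the category parameters. The function name `m_step`, its inputs, and the toy numbers are hypothetical; the update rules actually used by the method may differ.

```python
import numpy as np


def m_step(pip, beta2_given_causal, annotation, n_categories):
    """Generic M-step for an annotation-specific spike-and-slab model.

    pip                : posterior inclusion probability per variant
    beta2_given_causal : posterior E[beta^2 | variant is causal] per variant
    annotation         : integer category index per variant
    Returns (pi, sigma2), one value per category; NaN where undefined.
    """
    pip = np.asarray(pip, dtype=float)
    beta2 = np.asarray(beta2_given_causal, dtype=float)
    annotation = np.asarray(annotation)
    pi = np.full(n_categories, np.nan)
    sigma2 = np.full(n_categories, np.nan)
    for q in range(n_categories):
        mask = annotation == q
        if not mask.any():
            continue  # no variants carry this annotation
        expected_causal = pip[mask].sum()
        # Causal probability: expected fraction of causal variants.
        pi[q] = expected_causal / mask.sum()
        if expected_causal > 0:
            # Effect-size variance: PIP-weighted mean of squared effects.
            sigma2[q] = (pip[mask] * beta2[mask]).sum() / expected_causal
    return pi, sigma2


# Toy posterior summaries for four variants in two categories.
pi, sigma2 = m_step(
    pip=[0.5, 0.5, 1.0, 0.0],
    beta2_given_causal=[2.0, 4.0, 3.0, 9.0],
    annotation=[0, 0, 1, 1],
    n_categories=2,
)
print(pi, sigma2)
```

In a full EM-MCMC loop, an update of this kind would alternate with MCMC runs that recompute the per-variant summaries under the current category parameters, for example within each LD block.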

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted February 03, 2017.