Automating Mendelian randomization through machine learning to construct a putative causal map of the human phenome

Gibran Hemani, Jack Bowden, Philip Haycock, Jie Zheng, Oliver Davis, Peter Flach, Tom Gaunt, George Davey Smith
doi: https://doi.org/10.1101/173682
Gibran Hemani, Jack Bowden, Philip Haycock, Jie Zheng, Oliver Davis, Tom Gaunt, George Davey Smith
1 MRC Integrative Epidemiology Unit (IEU) at the University of Bristol, Population Health Sciences, Bristol, UK
Peter Flach
2 Intelligent Systems Laboratory, Department of Computer Science, University of Bristol

Abstract

A major application of genome-wide association studies (GWAS) has been the emerging field of causal inference using Mendelian randomization (MR), where the causal effect of one trait on another can be estimated using only summary-level data. MR depends on SNPs exhibiting vertical pleiotropy, where the SNP influences an outcome phenotype only through an exposure phenotype. Issues arise when this assumption is violated because SNPs exhibit horizontal pleiotropy, influencing the outcome through pathways that bypass the exposure. We demonstrate that, across a range of pleiotropy models, instrument selection becomes increasingly liable to select invalid instruments as GWAS sample sizes continue to grow. Methods have been developed to protect MR from different patterns of horizontal pleiotropy, and here we have designed a mixture-of-experts machine learning framework (MR-MoE 1.0) that predicts the most appropriate model to use for any specific causal analysis, improving on both power and false discovery rates. Using this approach, we systematically estimated the causal effects amongst 2407 phenotypes. Almost 90% of causal estimates indicated some level of horizontal pleiotropy. The causal estimates are organised into a publicly available graph database (http://eve.mrbase.org), and we use it here to highlight the numerous challenges that remain in automated causal inference.
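To make the idea of estimating a causal effect from summary-level data concrete, the following is a minimal sketch of the standard fixed-effect inverse-variance weighted (IVW) estimator, one of the simpler MR models. It is not the MR-MoE 1.0 framework described in the abstract, and the function names and the numbers in the example are hypothetical, chosen only for illustration.

```python
import math


def wald_ratios(beta_exposure, beta_outcome):
    """Per-SNP causal estimates: SNP-outcome effect divided by SNP-exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome effects on
    SNP-exposure effects through the origin, with weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical summary statistics for three SNPs (illustrative values only).
beta_exposure = [0.1, 0.2, 0.3]       # SNP effects on the exposure
se_outcome = [0.01, 0.01, 0.01]       # standard errors of SNP-outcome effects

# Valid instruments: every SNP acts on the outcome only through the exposure.
valid_outcome = [0.05, 0.10, 0.15]
valid_estimate, valid_se = ivw_estimate(beta_exposure, valid_outcome, se_outcome)

# Same data, but the third SNP also has a direct (horizontally pleiotropic)
# effect of 0.10 on the outcome, which pulls the pooled estimate away.
pleiotropic_outcome = [0.05, 0.10, 0.25]
biased_estimate, _ = ivw_estimate(beta_exposure, pleiotropic_outcome, se_outcome)

print(valid_estimate, valid_se, biased_estimate)
```

In this toy setting a single invalid instrument is enough to shift the pooled estimate, which is the kind of bias that pleiotropy-robust MR models are designed to limit.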

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 10, 2017.
Subject Area

  • Epidemiology