New Results

Multi-scale Inference of Genetic Trait Architecture using Biologically Annotated Neural Networks

Pinar Demetci, Wei Cheng, Gregory Darnell, Xiang Zhou, Sohini Ramachandran, Lorin Crawford
doi: https://doi.org/10.1101/2020.07.02.184465
Author affiliations
  • Pinar Demetci (1, 2)
  • Wei Cheng (2, 3)
  • Gregory Darnell (2)
  • Xiang Zhou (4, 5)
  • Sohini Ramachandran (1, 2, 3)
  • Lorin Crawford (2, 6)

1 Department of Computer Science, Brown University, Providence, RI, USA
2 Center for Computational Molecular Biology, Brown University, Providence, RI, USA
3 Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
4 Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
5 Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
6 Microsoft Research New England, Cambridge, MA, USA

For correspondence: lcrawford@microsoft.com

Abstract

In this article, we present Biologically Annotated Neural Networks (BANNs), a nonlinear probabilistic framework for association mapping in genome-wide association (GWA) studies. BANNs are feedforward models with partially connected architectures that are based on biological annotations. This setup yields a fully interpretable neural network where the input layer encodes SNP-level effects and the hidden layer models the aggregated effects among SNP-sets. We treat the weights and connections of the network as random variables with prior distributions that reflect how genetic effects manifest at different genomic scales. The BANNs software uses variational inference to provide posterior summaries that allow researchers to simultaneously perform (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets on complex traits. Through simulations, we show that our method improves upon state-of-the-art association mapping and enrichment approaches across a wide range of genetic architectures. We then further illustrate the benefits of BANNs by analyzing real GWA data assayed in approximately 2,000 heterogeneous stock mice from the Wellcome Trust Centre for Human Genetics and approximately 7,000 individuals from the Framingham Heart Study. Lastly, using a random subset of individuals of European ancestry from the UK Biobank, we show that BANNs can replicate known associations in high- and low-density lipoprotein cholesterol content.
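
The partially connected architecture described above can be illustrated with a small NumPy sketch. It is not the BANNs software: every function name, the toy annotation, and the Leaky ReLU activation are assumptions made for this example, and the prior distributions and variational inference are left out. The sketch only shows how an annotation mask restricts each hidden unit (a SNP-set) to its own SNPs.

```python
import numpy as np

def build_annotation_mask(snp_to_set, n_sets):
    """Return an (n_snps, n_sets) 0/1 matrix; entry [j, g] is 1 when
    SNP j is annotated to SNP-set g (hypothetical toy annotation)."""
    mask = np.zeros((len(snp_to_set), n_sets))
    mask[np.arange(len(snp_to_set)), snp_to_set] = 1.0
    return mask

def leaky_relu(z, slope=0.01):
    # Activation chosen for illustration only (an assumption).
    return np.where(z > 0, z, slope * z)

def hidden_layer(X, theta, mask, b_hidden):
    # Input layer: one weight per SNP. The mask zeroes every connection
    # between a SNP and a SNP-set it is not annotated to.
    return leaky_relu(X @ (theta[:, None] * mask) + b_hidden)

def predict(X, theta, mask, w, b_hidden, b_out):
    # Hidden layer: one weight per SNP-set, aggregated into the trait.
    return hidden_layer(X, theta, mask, b_hidden) @ w + b_out

rng = np.random.default_rng(0)
snp_to_set = np.array([0, 0, 1, 2, 2])              # 5 SNPs in 3 SNP-sets
mask = build_annotation_mask(snp_to_set, n_sets=3)
X = rng.integers(0, 3, size=(4, 5)).astype(float)   # toy genotypes coded 0/1/2
theta = rng.normal(size=5)                          # SNP-level weights
w = rng.normal(size=3)                              # SNP-set-level weights
b_hidden = np.zeros(3)
y_hat = predict(X, theta, mask, w, b_hidden, b_out=0.0)
```

Because of the mask, changing the genotype at a SNP alters only the hidden unit of the SNP-set that contains it, which is what lets the two layers be read as SNP-level and SNP-set-level effects.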

Author Summary

A common goal in genome-wide association (GWA) studies is to characterize the relationship between genotypic and phenotypic variation. Linear models are widely used tools in GWA analyses, in part, because they provide significance measures which detail how individual single nucleotide polymorphisms (SNPs) are statistically associated with a trait or disease of interest. However, traditional linear regression largely ignores non-additive genetic variation, and the univariate SNP-level mapping approach has been shown to be underpowered and challenging to interpret for certain trait architectures. While nonlinear methods such as neural networks are well known to account for complex data structures, these same algorithms have also been criticized as “black box” since they do not naturally carry out statistical hypothesis testing like classic linear models. This limitation has prevented nonlinear regression approaches from being used for association mapping tasks in GWA applications. Here, we present Biologically Annotated Neural Networks (BANNs): a flexible class of feedforward models with partially connected architectures that are based on biological annotations. The BANN framework uses approximate Bayesian inference to provide interpretable probabilistic summaries which can be used for simultaneous (i) mapping with SNPs and (ii) enrichment analyses with SNP-sets (e.g., genes or signaling pathways). We illustrate the benefits of our method over state-of-the-art approaches using extensive simulations. We also demonstrate the ability of BANNs to recover novel and previously discovered genomic associations using quantitative traits from the Wellcome Trust Centre for Human Genetics, the Framingham Heart Study, and the UK Biobank.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/lcrawlab/BANNs

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted May 06, 2021.

Subject Area

  • Genetics