New Results

An Association Test of the Spatial Distribution of Rare Missense Variants within Protein Structures Improves Statistical Power of Sequencing Studies

Bowen Jin, John A. Capra, Penelope Benchek, Nicholas Wheeler, Adam C. Naj, Kara L. Hamilton-Nelson, John J. Farrell, Yuk Yee Leung, Brian Kunkle, Badri Vadarajan, Gerard D. Schellenberg, Richard Mayeux, Li-san Wang, Lindsay A. Farrer, Margaret A. Pericak-Vance, Eden R. Martin, Jonathan L. Haines, Dana C. Crawford, William S. Bush
doi: https://doi.org/10.1101/2021.08.09.455695
Bowen Jin
1Graduate Program in Systems Biology and Bioinformatics, Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
John A. Capra
2The Bakar Computational Health Sciences Institute, Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA 94143, USA
Penelope Benchek
3Cleveland Institute for Computational Biology, Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
Nicholas Wheeler
3Cleveland Institute for Computational Biology, Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
Adam C. Naj
4Department of Pathology and Laboratory Medicine, Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Kara L. Hamilton-Nelson
5The John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
John J. Farrell
6Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, MA 02118, USA
Yuk Yee Leung
4Department of Pathology and Laboratory Medicine, Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Brian Kunkle
5The John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
7Dr. John T. Macdonald Foundation, Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
Badri Vadarajan
8Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Gertrude H. Sergievsky Center, Department of Neurology, Columbia University, New York, NY 10032, USA
Gerard D. Schellenberg
4Department of Pathology and Laboratory Medicine, Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Richard Mayeux
8Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Gertrude H. Sergievsky Center, Department of Neurology, Columbia University, New York, NY 10032, USA
Li-san Wang
4Department of Pathology and Laboratory Medicine, Penn Neurodegeneration Genomics Center, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
Lindsay A. Farrer
6Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, MA 02118, USA
Margaret A. Pericak-Vance
5The John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
7Dr. John T. Macdonald Foundation, Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
Eden R. Martin
5The John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
7Dr. John T. Macdonald Foundation, Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
Jonathan L. Haines
3Cleveland Institute for Computational Biology, Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
Dana C. Crawford
3Cleveland Institute for Computational Biology, Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
William S. Bush
3Cleveland Institute for Computational Biology, Department for Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA
  • For correspondence: wsb36@case.edu

ABSTRACT

In the Alzheimer’s Disease Sequencing Project Whole Exome Sequencing (ADSP WES) data, over 90% of variants are rare, and 50% of them are singletons. However, both single-variant tests and unit-based tests have limited statistical power to detect associations between rare variants and phenotypes. To make the best use of rare variants and investigate their biological effects, we examine their association with phenotypes in the context of protein structure. We developed a protein structure-based approach, POKEMON (Protein Optimized Kernel Evaluation of Missense Nucleotides), which evaluates rare missense variants based on their spatial distribution within the protein rather than their allele frequency. The underlying hypothesis is that the three-dimensional spatial distribution of variants within a protein structure provides functional context and improves the power of association tests. POKEMON identified four candidate genes from the ADSP WES data: two known Alzheimer’s disease (AD) genes (TREM2 and SORL1) and two novel genes (DUSP18 and CSF1R). For the known AD genes, the spatial-cluster signal remains stable even when known AD risk variants are excluded, indicating the presence of additional low-frequency risk variants within these genes. DUSP18 has a cluster of variants, primarily shared by case subjects, around the ligand-binding domain, and this cluster is further validated in a replication dataset with a larger sample size. POKEMON is an open-source tool available at https://github.com/bushlab-genomics/POKEMON.
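
To make the idea of a structure-weighted association test concrete, the following is a minimal sketch and not the POKEMON implementation. It assumes a Gaussian weight on the 3D distance between variant residues, a quadratic score statistic on the mean-centred phenotype, and a permutation p-value; the function names, the `scale` parameter, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def spatial_kernel(coords, scale=10.0):
    """Variant-by-variant weights from pairwise 3D distances.

    coords: (n_variants, 3) residue coordinates, e.g. in angstroms.
    scale:  assumed distance scale; nearby residues get weights near 1.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-(dist / scale) ** 2)


def kernel_score(genotypes, phenotype, weights):
    """Score Q = r^T (G W G^T) r, where r is the mean-centred phenotype.

    genotypes: (n_subjects, n_variants) minor-allele counts.
    phenotype: (n_subjects,) case/control labels or a quantitative trait.
    """
    resid = phenotype - phenotype.mean()
    per_variant = genotypes.T @ resid  # phenotype signal carried by each variant
    return float(per_variant @ weights @ per_variant)


def permutation_pvalue(genotypes, phenotype, weights, n_perm=999, seed=0):
    """Empirical p-value obtained by shuffling phenotype labels."""
    rng = np.random.default_rng(seed)
    observed = kernel_score(genotypes, phenotype, weights)
    exceed = sum(
        kernel_score(genotypes, rng.permutation(phenotype), weights) >= observed
        for _ in range(n_perm)
    )
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: two variants close together in space, one far away.
    coords = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [30.0, 0.0, 0.0]])
    genotypes = np.array(
        [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float
    )
    phenotype = np.array([1.0, 1.0, 0.0, 0.0])  # first two subjects are cases
    weights = spatial_kernel(coords)
    q, p = permutation_pvalue(genotypes, phenotype, weights, n_perm=99)
    print(f"Q = {q:.3f}, permutation p = {p:.3f}")
```

In this toy setting the two case-only variants sit 3 units apart, so the off-diagonal weight between them raises the score above what an identity (position-blind) weight matrix would give. A real analysis would also need covariate adjustment and a mapping from variants to structure coordinates, neither of which is shown here.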

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted August 10, 2021.
An Association Test of the Spatial Distribution of Rare Missense Variants within Protein Structures Improves Statistical Power of Sequencing Studies
Bowen Jin, John A. Capra, Penelope Benchek, Nicholas Wheeler, Adam C. Naj, Kara L. Hamilton-Nelson, John J. Farrell, Yuk Yee Leung, Brian Kunkle, Badri Vadarajan, Gerard D. Schellenberg, Richard Mayeux, Li-san Wang, Lindsay A. Farrer, Margaret A. Pericak-Vance, Eden R. Martin, Jonathan L. Haines, Dana C. Crawford, William S. Bush
bioRxiv 2021.08.09.455695; doi: https://doi.org/10.1101/2021.08.09.455695
Subject Area

  • Bioinformatics