A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies

Abstract
Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis and can also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits (low-density lipoprotein cholesterol, high-density lipoprotein cholesterol and triglycerides) in 61,861 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered new associations with lipid traits that were missed by single-trait analysis, including rare variants within an enhancer of NIPSNAP3A and an intergenic region on chromosome 1.
Competing Interest Statement
Z.R.M. is an employee of Insitro. M.E.M. receives research funding from Regeneron Pharmaceuticals Inc., unrelated to this project. B.M.P. serves on the Steering Committee of the Yale Open Data Access Project funded by Johnson & Johnson. L.M.R. is a consultant for the TOPMed Administrative Coordinating Center (via Westat). X. Lin is a consultant for AbbVie Pharmaceuticals and Verily Life Sciences. The remaining authors declare no competing interests.
Footnotes
List of consortium members and their affiliations appears at the end of the paper.