New Results

Polygenic prediction across populations is influenced by ancestry, genetic architecture, and methodology

Ying Wang, Masahiro Kanai, Taotao Tan, Mireille Kamariza, Kristin Tsuo, Kai Yuan, Wei Zhou, Yukinori Okada, the BioBank Japan Project, Hailiang Huang, Patrick Turley, Elizabeth G. Atkinson, Alicia R. Martin
doi: https://doi.org/10.1101/2022.12.29.522270
Ying Wang
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  • For correspondence: yiwang@broadinstitute.org armartin@broadinstitute.org
Masahiro Kanai
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
4Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
5Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
Taotao Tan
6Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Mireille Kamariza
7Society of Fellows, Harvard University, Cambridge, MA, 02138 USA
Kristin Tsuo
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Kai Yuan
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Wei Zhou
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Yukinori Okada
5Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
8Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
9Laboratory of Statistical Immunology, Immunology Frontier Research Center (WPI-IFReC), Osaka University, Suita 565-0871, Japan
10Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita 565-0871, Japan
11Center for Infectious Disease Education and Research (CiDER), Osaka University, Suita 565-0871, Japan
Hailiang Huang
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Patrick Turley
12Department of Economics, University of Southern California, Los Angeles, CA, USA
13Center for Economic and Social Research, University of Southern California, Los Angeles, CA, USA
Elizabeth G. Atkinson
6Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
Alicia R. Martin
1Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
2Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
3Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  • For correspondence: yiwang@broadinstitute.org armartin@broadinstitute.org

Abstract

Polygenic risk scores (PRS) built from multi-ancestry genome-wide association studies (GWAS), denoted PRSmulti, have the potential to improve PRS accuracy and generalizability across populations. To establish best practices for leveraging the increasing diversity of genomic studies, we used large-scale simulated and empirical data to investigate how ancestry composition, trait-specific genetic architecture, and PRS methodology affect the performance of PRSmulti compared with PRS constructed from single-ancestry GWAS (PRSsingle). In both simulations across 6 scenarios and empirical analyses of 17 anthropometric and blood panel traits, PRSmulti was overall more accurate than PRSsingle in the understudied target populations, except for a few comparisons in which the understudied population accounted for only a very small proportion of the multi-ancestry GWAS. Further, for traits such as height and mean corpuscular volume, using substantially fewer samples from BioBank Japan (BBJ) may achieve accuracies comparable to those obtained using 320,000 European (EUR) individuals from UK Biobank (UKBB). Finally, we found that combining PRS based on local ancestry-informed GWAS with large-scale EUR-based PRS improved predictive performance over using EUR-based PRS alone in the understudied African (AFR) population, especially for less polygenic traits with variants that have large ancestry-specific effects. Overall, our study provides insights into how ancestry composition and genetic architecture impact polygenic prediction across populations, particularly when sample sizes are imbalanced. Our work also highlights the need to increase diversity in genetic studies to achieve equitable PRS performance across ancestral populations, and it provides practical guidance on developing PRS from multiple resources.
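
As a rough illustration of the idea of combining two PRS, the sketch below scores individuals as a weighted sum of allele dosages and compares the variance explained by one score alone with that of a linear combination of two scores. It is not the authors' pipeline: the simulated genotypes, the noisy weight vectors standing in for estimates from two GWAS, and the helper names `polygenic_score` and `r_squared` are all assumptions made for illustration, and the R² values are computed in-sample rather than in a held-out target cohort.

```python
import numpy as np


def polygenic_score(dosages, weights):
    """PRS per individual: sum over variants of allele dosage times effect weight."""
    return dosages @ weights


def r_squared(y, predictors):
    """In-sample R^2 from an ordinary least-squares fit of y on predictors plus intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - residuals.var() / y.var()


# Hypothetical simulated data (not from the study).
rng = np.random.default_rng(0)
n_individuals, n_variants = 2000, 200
allele_freqs = rng.uniform(0.05, 0.5, size=n_variants)
dosages = rng.binomial(2, allele_freqs, size=(n_individuals, n_variants)).astype(float)
true_effects = rng.normal(0.0, 0.1, size=n_variants)
phenotype = dosages @ true_effects + rng.normal(0.0, 1.0, size=n_individuals)

# Two noisy weight sets stand in for effect estimates from two different GWAS.
weights_a = true_effects + rng.normal(0.0, 0.1, size=n_variants)
weights_b = true_effects + rng.normal(0.0, 0.1, size=n_variants)
prs_a = polygenic_score(dosages, weights_a)
prs_b = polygenic_score(dosages, weights_b)

# Variance explained by one score alone versus a linear combination of both.
r2_single = r_squared(phenotype, prs_a)
r2_combined = r_squared(phenotype, np.column_stack([prs_a, prs_b]))
print(f"R^2 single: {r2_single:.3f}, R^2 combined: {r2_combined:.3f}")
```

Because the two regressions are nested, the in-sample R² of the combined model cannot fall below that of the single score; a realistic evaluation would instead fit the mixing weights in a tuning sample and report accuracy in an independent target sample.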

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted December 30, 2022.

Subject Area

  • Genomics