bioRxiv
New Results

Automatic inference of demographic parameters using Generative Adversarial Networks

Zhanpeng Wang, Jiaping Wang, Michael Kourakos, Nhung Hoang, Hyong Hark Lee, Iain Mathieson, Sara Mathieson
doi: https://doi.org/10.1101/2020.08.05.237834
1 Department of Computer Science, Haverford College, Haverford, PA (Zhanpeng Wang, Jiaping Wang, Sara Mathieson)
2 Department of Computer Science, Swarthmore College, Swarthmore, PA (Michael Kourakos, Nhung Hoang, Hyong Hark Lee)
3 Department of Genetics, University of Pennsylvania, Philadelphia, PA (Iain Mathieson)
For correspondence: smathieson@haverford.edu (Sara Mathieson)

Abstract

Population genetics relies heavily on simulated data for validation, inference, and intuition. In particular, since the evolutionary “ground truth” for real data is always limited, simulated data is crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes, but requires many hand-selected input parameters. As a result, simulated data often fails to mirror the properties of real genetic data, which limits the scope of methods that rely on it. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg-gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation-with-migration model. We then apply our method to human data from the 1000 Genomes Project, and show that we can accurately recapitulate the features of real data.
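
The adversarial idea in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from pg-gan: the "simulator" is a Normal(theta, 1) sampler standing in for an evolutionary simulator, the discriminator is a three-weight logistic model, and the parameter update is a simple accept-if-better random proposal. All function names are hypothetical; the authors' actual implementation is in the repository listed under Footnotes.

```python
import math
import random


def simulate(theta, n, rng):
    """Toy stand-in for an evolutionary simulator: n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def prob_real(w, x):
    """Logistic discriminator on features (1, x, x^2): estimated P(x is real)."""
    z = w[0] + w[1] * x + w[2] * x * x
    z = max(-30.0, min(30.0, z))  # clip so exp() stays well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_discriminator(w, real, fake, steps=20, lr=0.01):
    """A few gradient-descent steps on binary cross-entropy (real=1, fake=0)."""
    data = [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x, y in data:
            err = prob_real(w, x) - y
            grad[0] += err
            grad[1] += err * x
            grad[2] += err * x * x
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w


def generator_loss(w, fake):
    """Low when the discriminator mistakes simulated samples for real ones."""
    return -sum(math.log(prob_real(w, x)) for x in fake) / len(fake)


def fit_parameter(real, theta0=0.0, iters=100, seed=0):
    """Alternate discriminator training with random proposals for theta."""
    rng = random.Random(seed)
    theta, w = theta0, [0.0, 0.0, 0.0]
    n = len(real)
    for i in range(iters):
        # Discriminator step: separate real data from data simulated at theta.
        w = train_discriminator(w, real, simulate(theta, n, rng))
        # Generator step: propose a nearby theta with a shrinking step size,
        # and keep it only if its simulations fool the discriminator better.
        step = 0.5 * (1.0 - i / iters) + 0.05
        proposal = theta + rng.gauss(0.0, step)
        loss_new = generator_loss(w, simulate(proposal, n, rng))
        loss_old = generator_loss(w, simulate(theta, n, rng))
        if loss_new < loss_old:
            theta = proposal
    return theta


# "Real" data generated with a hidden parameter value of 3.0.
real_data = simulate(3.0, 200, random.Random(42))
estimate = fit_parameter(real_data)
print(estimate)
```

In this toy setting the estimate is expected to drift from the starting value toward the hidden one as the discriminator improves, but adversarial training can oscillate, so no particular output value is claimed here. The point is only the structure: the simulator's input parameter is adjusted until simulated data becomes hard to tell apart from the real sample.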

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/mathiesonlab/pg-gan

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted February 09, 2021.

Subject Area

  • Genomics