bioRxiv
New Results

pong: fast analysis and visualization of latent clusters in population genetic data

Aaron A. Behr, Katherine Z. Liu, Gracie Liu-Fang, Priyanka Nakka, Sohini Ramachandran
doi: https://doi.org/10.1101/031815
Aaron A. Behr (1, 2)
Katherine Z. Liu (2)
Gracie Liu-Fang (3)
Priyanka Nakka (1, 4)
Sohini Ramachandran (1, 4)

1. Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA
2. Department of Computer Science, Brown University, Providence, RI, USA
3. Computer Science Department, Wellesley College, Wellesley, MA, USA
4. Center for Computational Molecular Biology, Brown University, Providence, RI, USA

Abstract

Motivation: A series of methods in population genetics use multilocus genotype data to assign individuals membership in latent clusters. These methods belong to a broad class of mixed-membership models, which also includes latent Dirichlet allocation, a model used to analyze text corpora. Inference from mixed-membership models can produce different output matrices when repeatedly applied to the same inputs, and the number of latent clusters is a parameter that is varied in the analysis pipeline. For these reasons, quantifying, visualizing, and annotating the output from mixed-membership models are bottlenecks for investigators.
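
To make concrete why repeated runs are hard to compare, the sketch below builds two small membership matrices that encode the same assignments with the cluster labels permuted, then recovers the matching column order by brute force. It is only an illustration under stated assumptions: the function names (`align_columns`, `reorder`) and the toy matrices are hypothetical, and the exhaustive search over permutations is a stand-in chosen for clarity, not the network-graphical method that pong uses.

```python
from itertools import permutations


def align_columns(q_ref, q_run):
    """Return the column order of q_run that best matches q_ref.

    Both arguments are lists of rows (one row per individual, one column
    per cluster). The search is exhaustive, so it is only practical for a
    small number of clusters.
    """
    k = len(q_ref[0])

    def cost(perm):
        # Sum of squared differences after reordering q_run's columns.
        return sum(
            (row_ref[j] - row_run[perm[j]]) ** 2
            for row_ref, row_run in zip(q_ref, q_run)
            for j in range(k)
        )

    return min(permutations(range(k)), key=cost)


def reorder(q_run, perm):
    """Reorder the columns of q_run according to perm."""
    return [[row[j] for j in perm] for row in q_run]


# Hypothetical output of one run: 3 individuals, 3 clusters.
run_a = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.1, 0.9],
]

# A second run with identical memberships but switched cluster labels.
run_b = [
    [0.1, 0.0, 0.9],
    [0.7, 0.1, 0.2],
    [0.1, 0.9, 0.0],
]

perm = align_columns(run_a, run_b)
aligned_b = reorder(run_b, perm)
```

After alignment, `aligned_b` can be compared with `run_a` column by column. The cost of the exhaustive search grows factorially with the number of clusters, which is one reason a faster matching strategy is needed in practice.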

Results: Here, we introduce pong, a network-graphical approach for analyzing and visualizing membership in latent clusters with a D3.js interactive visualization. We apply this new method to 225,705 unlinked genome-wide single-nucleotide variants from 2,426 unrelated individuals in the 1000 Genomes Project, and show that pong outpaces current solutions by more than an order of magnitude in runtime while providing a customizable and interactive visualization of population structure that is more accurate than those produced by current tools.

Availability: pong is freely available and can be installed using the Python package management system pip.

Contact: aaron_behr{at}alumni.brown.edu, sramachandran{at}brown.edu

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
Posted November 14, 2015.

Subject Area

  • Genomics