SNAPPY: Single Nucleotide Assignment of Phylogenetic Parameters on the Y chromosome

Alissa L. Severson; Jonathan A. Shortt; Fernando L. Mendez; Genevieve L. Wojcik; Carlos D. Bustamante; Christopher R. Gignoux

doi:10.1101/454736

Abstract

Summary: Assigning Y-chromosome data to related clusters, or haplogroups, is a common task in human population genetics. To enable this at scale, we developed SNAPPY, a software program that assigns phylogeny-informed Y-chromosome haplotypes from dense genotype data. The program efficiently tests all haplotypes in a provided Y-chromosome database to find the haplogroup that is best supported by the input genotypes. Importantly, the method uses parsimony to weigh the support for both the candidate haplogroup and its ancestral haplogroups. This accounts for the underlying genealogy that the haplotypes represent, improving the accuracy of the assignments. SNAPPY is fast, scalable, and uses standard file formats, making it easy to integrate into analytical pipelines.
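The idea of scoring a haplogroup together with its ancestors can be illustrated with a minimal sketch. This is not SNAPPY's actual code or scoring rule: the toy tree (PARENT), the marker table (MARKERS), the function names, and the min_frac threshold are all hypothetical and chosen only to show how ancestral support might enter the decision.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy phylogeny: haplogroup -> parent haplogroup (root has None).
PARENT: Dict[str, Optional[str]] = {"A": None, "B": "A", "C": "A", "C1": "C"}

# Hypothetical defining markers: haplogroup -> {position: derived allele}.
MARKERS: Dict[str, Dict[int, str]] = {
    "A": {100: "T"},
    "B": {200: "G", 210: "C"},
    "C": {300: "A", 310: "T"},
    "C1": {400: "G"},
}


def lineage(hap: str) -> List[str]:
    """Return the haplogroup followed by its ancestors, up to the root."""
    path: List[str] = []
    node: Optional[str] = hap
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def support(hap: str, genotypes: Dict[int, str]) -> Tuple[int, int]:
    """Count (derived calls, genotyped markers) for one haplogroup."""
    derived = 0
    called = 0
    for pos, allele in MARKERS[hap].items():
        observed = genotypes.get(pos)
        if observed is None:  # marker not genotyped in this sample
            continue
        called += 1
        if observed == allele:
            derived += 1
    return derived, called


def assign(genotypes: Dict[int, str], min_frac: float = 0.5) -> Optional[str]:
    """Pick the haplogroup whose whole lineage is best supported.

    A candidate is rejected if any genotyped node on its path to the root
    has a derived fraction below min_frac (an assumed threshold).
    """
    best: Optional[str] = None
    best_key = (0, 0)
    for hap in PARENT:
        counts = [support(h, genotypes) for h in lineage(hap)]
        if any(c > 0 and d / c < min_frac for d, c in counts):
            continue
        total_derived = sum(d for d, _ in counts)
        if total_derived == 0:  # no positive evidence at all
            continue
        key = (total_derived, len(counts))  # prefer more support, then depth
        if key > best_key:
            best, best_key = hap, key
    return best


sample = {100: "T", 200: "A", 300: "A", 310: "T", 400: "G"}
print(assign(sample))
```

In this toy setup, a sample that carries the derived allele for a deep haplogroup but lacks support along its ancestral path is not assigned to that haplogroup, which is the behaviour the ancestor-aware scoring is meant to capture.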

Availability and Implementation: SNAPPY is implemented in Python. The program, a user manual, haplotype databases, and test datasets are available for download at github.com/chrisgene/snappy.

Contact: Jonathan.shortt{at}ucdenver.edu, Chris.gignoux{at}ucdenver.edu

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.