TY  - JOUR
T1  - Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men
JF  - bioRxiv
DO  - 10.1101/088716
SP  - 088716
AU  - G. David Poznik
Y1  - 2016/01/01
UR  - http://biorxiv.org/content/early/2016/11/19/088716.abstract
N2  - We have developed an algorithm to rapidly and accurately identify the Y-chromosome haplogroup of each male in a sample of one to millions. The algorithm, implemented in the yHaplo* software package (yHaplo), does not rely on any particular genotyping modality or platform. Full sequences yield the most granular haplogroup classifications, but genotyping arrays can yield reliable calls, provided a reasonable number of phylogenetically informative variants has been assayed. The algorithm is robust to missing data, genotype errors, mutation recurrence, and other complications. We have tested the software on full sequences from phase 3 of the 1000 Genomes Project and on subsets thereof constructed by downsampling to SNPs present on each of four genotyping arrays. We have also run the software on array data from more than 600,000 males.
ER  -