RT Journal Article
SR Electronic
T1 Chromosome-wide characterization of Y-STR mutation rates using ultra-deep genealogies
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 036590
DO 10.1101/036590
A1 Thomas Willems
A1 Melissa Gymrek
A1 G. David Poznik
A1 Chris Tyler-Smith
A1 The 1000 Genomes Project Y-Chromosome Working Group
A1 Yaniv Erlich
YR 2016
UL http://biorxiv.org/content/early/2016/01/15/036590.abstract
AB Although the utility of short tandem repeats on the Y-chromosome (Y-STRs) has long been recognized and leveraged in forensics, genealogy and paternity testing, the bulk of these applications have relied on only a few dozen loci identified as having remarkably high mutation rates. Recent efforts have expanded the set of Y-STRs with known mutation rates to two hundred markers, but the limited throughput of the capillary method for estimating mutation rates has left the mutability of most Y-STRs uncharacterized, particularly those with dinucleotide repeat units. To address this limitation, we developed a novel method capable of concurrently estimating the mutation rates of all Y-STRs by leveraging population-scale whole-genome sequencing data. Extensive simulations confirmed that our method robustly accounts for PCR stutter artifacts and obtains unbiased mutation rate estimates. Application of the method to orthogonal datasets from the 1000 Genomes Project and Simons Genome Diversity Project utilized evolutionary data from over 250,000 meioses to estimate the mutation rates of more than 700 Y-STRs with 2–6 base pair repeat units, yielding the largest such set to date. Comparison of these estimates with those from father-son studies indicated a high degree of concordance for loci that have been previously characterized. In addition, we identified nearly 100 previously uncharacterized Y-STRs with per-generation mutation rates greater than 1 in 3000.
Altogether, our study provides a broadly applicable method for estimating Y-STR mutation rates from whole-genome sequencing cohorts, outlines a framework for imputing Y-STRs, vastly expands the number of identified loci with high discriminative power and provides the first chromosome-wide characterization of the mutation rates of dinucleotide short tandem repeats.