PT - JOURNAL ARTICLE
AU - Shubham Saini
AU - Ileena Mitra
AU - Melissa Gymrek
TI - A reference haplotype panel for genome-wide imputation of short tandem repeats
AID - 10.1101/277673
DP - 2018 Jan 01
TA - bioRxiv
PG - 277673
4099 - http://biorxiv.org/content/early/2018/03/06/277673.short
4100 - http://biorxiv.org/content/early/2018/03/06/277673.full
AB - Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in a variety of complex traits in humans. However, existing technologies have not allowed for systematic STR association studies. Genotype array data is available for hundreds of thousands of samples, but is limited to variation in common single nucleotide polymorphisms (SNPs) and does not adequately capture more complex variants like STRs. Here, we leverage next-generation sequencing from 479 families along with existing bioinformatics tools to phase STRs onto SNP haplotypes and create a genome-wide reference haplotype panel. Imputation using our panel achieved an average of 97% concordance between true and imputed STR genotypes in an external dataset and could accurately recover repeat lengths at known pathogenic loci. Imputed STRs capture on average 20% more variation in STR allele length with increased power to detect underlying STR associations compared to individual common SNPs, highlighting a limitation of standard genome-wide association studies. Our framework will enable testing for STR associations with hundreds of traits across massive sample sizes without the need to generate additional data.