PT - JOURNAL ARTICLE
AU - Paul D. Blischak
AU - Laura S. Kubatko
AU - Andrea D. Wolfe
TI - Accounting for genotype uncertainty in the estimation of allele frequencies in autopolyploids
AID - 10.1101/021907
DP - 2015 Jan 01
TA - bioRxiv
PG - 021907
4099 - http://biorxiv.org/content/early/2015/07/02/021907.short
4100 - http://biorxiv.org/content/early/2015/07/02/021907.full
AB - Despite the ever-increasing opportunity to collect large-scale datasets for population genomic analyses, high-throughput sequencing has seen little application to the study of polyploid populations. This is due in large part to problems associated with determining allele copy number in the genotypes of polyploid individuals (allelic dosage uncertainty, or ADU), which complicates the calculation of important quantities such as allele frequencies. This well-known problem has hindered population genetic studies in polyploids even though various solutions to circumvent the difficulty of estimating polyploid genotypes have been proposed. Additional complications arise because of the mixed inheritance patterns and variable reproductive modes that are characteristic of many polyploid taxa, making the development of population genetic models for polyploids especially difficult. Here we describe a statistical model to estimate biallelic SNP frequencies in a population of autopolyploids using high-throughput sequencing data in the form of read counts. Uncertainty in the number of copies of an allele in an individual’s genotype is accounted for by treating genotypes as a latent variable in a hierarchical Bayesian model. In this way, we bridge the gap from data collection (using techniques such as restriction-site associated DNA sequencing) to allele frequency estimation in a unified inferential framework by summing over genotype uncertainty.
Simulated datasets were generated under various conditions for both tetraploid and hexaploid populations to evaluate the model’s performance and to help guide the collection of empirical data. We also discuss potential extensions to generalize our model and its application to polyploids.
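
The idea of summing over genotype uncertainty can be illustrated with a minimal single-locus sketch. The abstract does not specify the distributions, so everything below is an assumption made for illustration: genotypes (allele dosages) are drawn from a binomial with the population allele frequency, reference-read counts are binomial given the dosage and a fixed sequencing error rate `eps`, the prior on the allele frequency is uniform, and the posterior is approximated on a grid rather than by whatever sampler the authors use. The function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def marginal_log_lik(p, ref_reads, total_reads, ploidy, eps=0.01):
    """Log-likelihood of allele frequency p with each genotype summed out.

    ASSUMPTION: dosage g ~ Binomial(ploidy, p); reference reads
    r ~ Binomial(t, q_g), where q_g mixes the true dosage fraction
    with a fixed per-read error rate eps.
    """
    log_lik = 0.0
    for r, t in zip(ref_reads, total_reads):
        site = 0.0
        for g in range(ploidy + 1):
            frac = g / ploidy
            # Probability that a single read shows the reference allele.
            q = frac * (1 - eps) + (1 - frac) * eps
            site += binom_pmf(g, ploidy, p) * binom_pmf(r, t, q)
        log_lik += math.log(site)
    return log_lik


def grid_posterior_mean(ref_reads, total_reads, ploidy, eps=0.01, n_grid=999):
    """Posterior mean of p under a uniform prior, approximated on a grid."""
    grid = [(i + 1) / (n_grid + 1) for i in range(n_grid)]
    logs = [marginal_log_lik(p, ref_reads, total_reads, ploidy, eps) for p in grid]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]  # rescaled to avoid underflow
    return sum(p * w for p, w in zip(grid, weights)) / sum(weights)


# Hypothetical read counts for three tetraploid individuals at one SNP.
estimate = grid_posterior_mean([20, 11, 4], [20, 20, 20], ploidy=4)
```

Because the genotype is marginalized rather than called, individuals with few reads contribute less information about the allele frequency instead of contributing a possibly wrong hard genotype call.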