PT  - JOURNAL ARTICLE
AU  - Suyash S. Shringarpure
AU  - Carlos D. Bustamante
AU  - Kenneth Lange
AU  - David H. Alexander
TI  - Efficient analysis of large datasets and sex bias with ADMIXTURE
AID  - 10.1101/039347
DP  - 2016 Jan 01
TA  - bioRxiv
PG  - 039347
4099  - http://biorxiv.org/content/early/2016/02/10/039347.short
4100  - http://biorxiv.org/content/early/2016/02/10/039347.full
AB  - Background: A number of large genomic datasets are being generated for studies of human ancestry and diseases. The ADMIXTURE program is commonly used to infer individual ancestry from genomic data. Results: We describe two improvements to the ADMIXTURE software. The first enables ADMIXTURE to infer ancestry for a new set of individuals using cluster allele frequencies from a reference set of individuals. Using data from the 1000 Genomes Project, we show that this allows ADMIXTURE to infer ancestry for 10,920 individuals in a few hours (a 5x speedup). This mode also allows ADMIXTURE to correctly estimate individual ancestry and allele frequencies from a set of related individuals. The second modification allows ADMIXTURE to correctly handle X-chromosome (and other haploid) data from both males and females. We demonstrate increased power to detect sex-biased admixture in African-American individuals from the 1000 Genomes Project using this extension. Conclusions: These modifications make ADMIXTURE more efficient and versatile, allowing users to extract more information from large genomic datasets.