TY  - JOUR
T1  - EigenGWAS: finding loci under selection through genome-wide association studies of eigenvectors in structured populations
JF  - bioRxiv
DO  - 10.1101/023457
SP  - 023457
AU  - Guo-Bo Chen
AU  - Sang Hong Lee
AU  - Zhi-Xiang Zhu
AU  - Beben Benyamin
AU  - Matthew R. Robinson
Y1  - 2015/01/01
UR  - http://biorxiv.org/content/early/2015/11/17/023457.abstract
N2  - We apply the statistical framework for genome-wide association studies (GWAS) to eigenvector decomposition (EigenGWAS), which is commonly used in population genetics to characterise the structure of genetic data. The approach does not require discrete sub-populations and thus it can be utilized in any genetic data where the underlying population structure is unknown, or where the interest is assessing divergence along a gradient. Through theory and simulation study we show that our approach can identify regions under selection along gradients of ancestry. In real data, we confirm this by demonstrating LCT to be under selection between HapMap CEU-TSI cohorts, and validated this selection signal across European countries in the POPRES samples. HERC2 was also found to be differentiated between both the CEU-TSI cohort and within the POPRES sample, reflecting the likely anthropological differences in skin and hair colour between northern and southern European populations. Controlling for population stratification is of great importance in any quantitative genetic study and our approach also provides a simple, fast, and accurate way of predicting principal components in independent samples. With ever increasing sample sizes across many fields, this approach is likely to be greatly utilized to gain individual-level eigenvectors avoiding the computational challenges associated with conducting singular value decomposition in large datasets. We have developed freely available software to facilitate the application of the methods.
ER  - 