TY - JOUR
T1 - Local PCA Shows How the Effect of Population Structure Differs Along the Genome
JF - bioRxiv
DO - 10.1101/070615
SP - 070615
AU - Han Li
AU - Peter Ralph
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/08/21/070615.abstract
N2 - Population structure leads to systematic patterns in measures of mean relatedness between individuals in large genomic datasets, which are often discovered and visualized using dimension reduction techniques such as principal component analysis (PCA). Mean relatedness is an average of the relationships across locus-specific genealogical trees, which can be strongly affected on intermediate genomic scales by linked selection and other factors. We show how to use local principal components analysis to describe this meso-scale heterogeneity in patterns of relatedness, and apply the method to genomic data from three species, finding in each that the effect of population structure can vary substantially across only a few megabases. In a global human dataset, localized heterogeneity is likely explained by polymorphic chromosomal inversions. In a range-wide dataset of Medicago truncatula, factors that produce heterogeneity are shared between chromosomes, correlate with local gene density, and may be caused by background selection or local adaptation. In a dataset of primarily African Drosophila melanogaster, large-scale heterogeneity across each chromosome arm is explained by known chromosomal inversions thought to be under recent selection, and after removing samples carrying inversions, remaining heterogeneity is correlated with recombination rate and gene density, again suggesting a role for linked selection. The visualization method provides a flexible new way to discover biological drivers of genetic variation, and its application to data highlights the strong effects that linked selection and chromosomal inversions can have on observed patterns of genetic variation.
ER -