Genome scans for detecting footprints of local adaptation using a Bayesian factor model

Mol Biol Evol. 2014 Sep;31(9):2483-95. doi: 10.1093/molbev/msu182. Epub 2014 Jun 3.

Abstract

There is a considerable impetus in population genomics to pinpoint loci involved in local adaptation. A powerful approach to find genomic regions subject to local adaptation is to genotype numerous molecular markers and look for outlier loci. One of the most common approaches for selection scans is based on statistics that measure population differentiation such as FST. However, there are important caveats with approaches related to FST because they require grouping individuals into populations and they additionally assume a particular model of population structure. Here, we implement a more flexible individual-based approach based on Bayesian factor models. Factor models capture population structure with latent variables called factors, which can describe clustering of individuals into populations or isolation-by-distance patterns. Using hierarchical Bayesian modeling, we both infer population structure and identify outlier loci that are candidates for local adaptation. To identify outlier loci, the hierarchical factor model searches for loci that are atypically related to population structure as measured by the latent factors. In a model of population divergence, we show that the factor model can achieve a 2-fold or greater reduction in false discovery rate compared with the software BayeScan or with an FST approach. We show that our software can handle large data sets by analyzing the single nucleotide polymorphisms of the Human Genome Diversity Project. The Bayesian factor model is implemented in the open-source PCAdapt software.
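The idea of scoring loci by how strongly they relate to latent factors can be illustrated with a minimal sketch. It is not the hierarchical Bayesian model of the paper and does not use the PCAdapt interface: it assumes a plain singular value decomposition as a non-Bayesian stand-in for the factor model, and it ranks loci by the share of their variance explained by the first K factors. The function name `factor_communality`, the simulation parameters, and the simulated two-population data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def factor_communality(genotypes, n_factors):
    """Score loci by the variance explained by the leading latent factors.

    genotypes: (n_individuals, n_loci) array of 0/1/2 allele counts.
    Assumption: a truncated SVD stands in for the Bayesian factor model.
    """
    G = np.asarray(genotypes, dtype=float)
    G = G - G.mean(axis=0)            # center each locus
    sd = G.std(axis=0)
    sd[sd == 0] = 1.0                 # leave monomorphic loci at zero
    G /= sd                           # scale each locus to unit variance
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    factors = U[:, :n_factors]        # one row of factor scores per individual
    loadings = (s[:n_factors, None] * Vt[:n_factors]).T  # (n_loci, n_factors)
    # Fraction of each locus's variance captured by the retained factors.
    communality = (loadings ** 2).sum(axis=1) / G.shape[0]
    return factors, communality

# Hypothetical simulation: two diverged populations, a few strongly
# differentiated loci standing in for targets of local adaptation.
rng = np.random.default_rng(0)
n_per_pop, n_loci, n_selected = 100, 300, 5
p_anc = rng.uniform(0.2, 0.8, n_loci)        # ancestral allele frequencies
drift = rng.normal(0.0, 0.05, n_loci)        # mild neutral divergence
p1 = np.clip(p_anc + drift, 0.01, 0.99)
p2 = np.clip(p_anc - drift, 0.01, 0.99)
p1[:n_selected], p2[:n_selected] = 0.1, 0.9  # strongly differentiated loci

genotypes = np.vstack([
    rng.binomial(2, p1, (n_per_pop, n_loci)),
    rng.binomial(2, p2, (n_per_pop, n_loci)),
])

# No population labels are passed in: structure is inferred from individuals.
factors, communality = factor_communality(genotypes, n_factors=1)
candidates = np.argsort(communality)[::-1][:n_selected]
print(sorted(candidates.tolist()))
```

In this sketch the loci with the highest scores are the candidates. Unlike an FST scan, the function receives no grouping of individuals into populations; turning the scores into calibrated outlier calls with a controlled false discovery rate is what the hierarchical Bayesian model adds and is not attempted here.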

Keywords: FST; landscape genetics; population genomics; population structure; selection scans.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adaptation, Biological
  • Bayes Theorem
  • Genetic Variation
  • Genome, Human
  • Genomics / methods*
  • Humans
  • Polymorphism, Single Nucleotide*
  • Population / genetics*
  • Software*