Bayesian spatial modeling of genetic population structure

  • Original Paper
  • Computational Statistics

Abstract

Natural populations of living organisms often have complex histories consisting of phases of expansion and decline, and the migratory patterns within them may fluctuate over space and time. When parts of a population become relatively isolated, e.g., due to geographical barriers, stochastic forces reshape certain DNA characteristics of the individuals over generations so that they reflect the restricted migration and mating/reproduction patterns. Such populations are typically termed genetically structured, and they may be represented statistically as several clusters whose DNA variation differs clearly from one cluster to another. When detailed knowledge of the ancestry of a natural population is lacking, the DNA characteristics of a sample of current-generation individuals often provide a wealth of information about it. Several statistical approaches to model-based clustering of such data have been introduced, and the Bayesian approach to modeling the genetic structure of a population in particular has attracted keen interest among biologists. However, the possibility of using spatial information from the sampled individuals in the inference about genetic clusters has been incorporated into such analyses only very recently. While standard Bayesian hierarchical modeling techniques based on Markov chain Monte Carlo simulation provide flexible means for describing even subtle patterns in data, they may also lead to computationally challenging procedures in practical data analysis. Here we develop a method for modeling the spatial genetic structure using a combination of analytical and stochastic methods. We achieve this by extending a novel theory of Bayesian predictive classification with the available spatial information, described here in terms of a colored Voronoi tessellation over the sample domain. Our results for real and simulated data sets illustrate the benefits of incorporating spatial information into such an analysis.
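
As a rough illustration of what a colored Voronoi tessellation means, the sketch below assigns every location in the domain the cluster label ("color") of its nearest sampling site, which is the defining property of a Voronoi cell. It is not the authors' model or software: the function name `voronoi_color`, the coordinates, and the labels are all hypothetical and chosen only for illustration.

```python
import math

def voronoi_color(point, sites, labels):
    """Return the label ("color") of the Voronoi cell that contains `point`.

    A point belongs to the cell of the site it is closest to, so the lookup
    reduces to a nearest-neighbor search over the sampling sites.
    """
    nearest = min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))
    return labels[nearest]

# Hypothetical sampling locations (x, y) and hypothetical cluster labels.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1]

print(voronoi_color((0.4, 0.3), sites, labels))  # closest site is (0, 0)
print(voronoi_color((5.2, 4.1), sites, labels))  # closest site is (5, 5)
```

In this toy setting, a spatially smooth clustering corresponds to neighboring cells tending to share a color; how the paper weights such configurations is not stated in the abstract and is not modeled here.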



Author information

Corresponding author

Correspondence to Jukka Corander.

About this article

Cite this article

Corander, J., Sirén, J. & Arjas, E. Bayesian spatial modeling of genetic population structure. Computational Statistics 23, 111–129 (2008). https://doi.org/10.1007/s00180-007-0072-x
