Review
Integrating statistical genetic and geospatial methods brings new power to phylogeography
Research highlights
► Genetic and geospatial data complement one another in phylogeography.
► Practical examples to illustrate how these data can be integrated.
► Variety of tools and approaches can be applied to many biological systems.
Introduction
Phylogeography continues to grow as a discipline, making rapid advances that have been fueled by new methodologies in statistical and population genetics (e.g., Buckley, 2009, Carstens and Richards, 2007, Hickerson et al., 2010, Knowles, 2009, Kozak et al., 2008, Riddle et al., 2008). Originally conceived as a means for bridging the gap between phylogenetics and population genetics, phylogeography continues to explore the processes underlying the geographic distribution of genetic diversity within and among species (Avise et al., 1987, Avise, 2000, Avise, 2009).
The field has moved considerably beyond the use of bifurcating ‘species’ trees as the sole source of primary data. Coalescent theory (Kingman, 1982) and the development of statistically rigorous methods for inferring historical demographic processes and testing among alternative hypotheses of population differentiation have revolutionized the field (Hickerson et al., 2010, Knowles, 2004, Knowles, 2009, Nielsen and Beaumont, 2009). Methods capitalizing on known properties of the coalescent have been used to address a diversity of questions in evolutionary biology (see review in Knowles (2009)) including estimating species trees from gene trees (e.g., Carstens and Knowles, 2007, Heled and Drummond, 2010, Yang and Rannala, 2010), reconstructing changes in population size through time from ancient DNA (e.g., Chan et al., 2006, Shapiro et al., 2004; see Ramakrishnan and Hadly, 2009), and characterizing the demographic signatures associated with colonization events (e.g., Rosenblum et al., 2007). Recent reviews of coalescent-based methods underscore advances in the field and highlight some of the software programs that implement these approaches (Hickerson et al., 2010, Knowles, 2009, Kuhner, 2008, Nielsen and Beaumont, 2009, Riddle et al., 2008).
The availability of geospatial data (e.g., vegetation, climate, paleoclimate, geology) and the development of predictive modeling approaches (e.g., species distribution models, Phillips et al., 2006; mechanistic models, Buckley et al., 2010) have progressed in parallel with these innovations in population genetics, and we are now on the verge of the next generation of phylogeographic analyses. Species distribution models (SDMs; also known as ecological niche models or ENMs) are one geospatial technique with tremendous potential for use in phylogeographic studies; they have already been applied widely in evolutionary and ecological studies. SDMs predict the distribution of a species using various climatic and geographic variables (e.g., temperature, rainfall, aspect; Phillips et al., 2006). The resulting model generates a map indicating areas of high and low habitat suitability based on a species’ ecological tolerance. SDMs have been used in conjunction with genetic methods to estimate ancestral distributions, to assess the ecological interchangeability or divergence of sister taxa (and subsequently to identify and delineate cryptic species), and as proxies for a species’ dispersal potential (Graham et al., 2004, Knowles et al., 2007, Rissler et al., 2006, Stockman and Bond, 2007). While researchers must be mindful of the assumptions underlying SDMs (e.g., niche conservatism, habitat saturation, Gleasonian biotic communities), the associated uncertainties (see Elith and Leathwick, 2009, Pearson et al., 2006, Wiens et al., 2009), and the strengths and weaknesses of particular methodologies (e.g., Elith and Graham, 2009, Hernandez et al., 2006), these advances create considerable opportunity for merging genetic and geospatial data for the purpose of constructing and testing among temporally and spatially explicit phylogeographic hypotheses.
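To make the idea of a suitability map concrete, the following is a minimal sketch of a climate-envelope score in the spirit of BIOCLIM-type models. It is not the maximum-entropy approach of Phillips et al. (2006) and is not taken from this review; the function name `envelope_suitability`, the two variables, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def envelope_suitability(occ_env, grid_env):
    """Score grid cells in [0, 1] with a simple climate-envelope rule.

    occ_env:  (n_occurrences, n_vars) environmental values at presence records
    grid_env: (n_cells, n_vars) environmental values at the cells to score
    Illustrative only: real SDM software also handles background points,
    correlated predictors, and model evaluation.
    """
    occ_env = np.asarray(occ_env, dtype=float)
    grid_env = np.asarray(grid_env, dtype=float)
    # Compare every cell with every occurrence record, variable by variable.
    below = (occ_env[None, :, :] < grid_env[:, None, :]).mean(axis=1)
    at_or_below = (occ_env[None, :, :] <= grid_env[:, None, :]).mean(axis=1)
    # Mid-rank percentile of each cell within the occurrence values.
    pct = 0.5 * (below + at_or_below)
    # Values near the median of the occurrences score 1; values outside
    # the observed range score 0.
    per_variable = 2.0 * np.minimum(pct, 1.0 - pct)
    # A cell is only as suitable as its most limiting variable.
    return per_variable.min(axis=1)

# Hypothetical presence records: mean temperature (deg C), annual rainfall (mm).
occurrences = np.array([[20.0, 1000.0], [22.0, 1200.0],
                        [24.0, 1400.0], [26.0, 1600.0]])
# Hypothetical grid cells to score.
cells = np.array([[23.0, 1300.0], [23.0, 3000.0], [30.0, 1300.0]])
scores = envelope_suitability(occurrences, cells)
```

Applied to every cell of a climate raster, scores of this kind give the high-to-low suitability surface described above; substituting paleoclimate layers for current climate is what allows a model to be projected onto past conditions, subject to the assumptions noted in the text.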
Geographic Information Systems (GIS) provide a variety of integrative approaches that have proven useful for illuminating phylogeographic patterns and processes (see Kidd and Ritchie, 2006, Kozak et al., 2008, Richards et al., 2007). Recent empirical examples have illustrated the power of merging these data (e.g., Buckley et al., 2009, Carnaval et al., 2009, Carstens et al., 2005, Graham et al., 2004, Hugall et al., 2002, Knowles and Alvarado-Serrano, 2010, Knowles et al., 2007, Rodríguez-Robles et al., 2010, Shepard and Burbrink, 2009), but in general, relatively few phylogeographic studies have explicitly incorporated geospatial information.
Without rigorously incorporating the “geographic” component of phylogeography, there is a tendency to rely on anecdotal biogeographic inferences or simplistic classifications of biogeographic barriers. This can undervalue the influence of geography and climate on organismal distribution, and oversimplify the varying impacts geographic barriers may have (Crawford et al., 2007). Moreover, genetic patterns analyzed without consideration of spatial complexity can underestimate the effects of environmental history on organismal dispersal through time (Kozak et al., 2008).
Despite GIS technology becoming more broadly available and user friendly within the last decade, it remains underutilized in the field of phylogeography. This stems, in part, from the fact that only a handful of programs were created explicitly for phylogeographic studies. However, in reality, the abundance of currently available geospatial tools offers a rich resource for incorporating GIS into phylogeography (Table 1). More powerful and insightful phylogeographic inferences are attainable with available GIS data and tools; inventive and creative approaches to problems in phylogeography can emerge by drawing from existing methods and incorporating approaches from fields such as landscape ecology, population genetics, phylogenetics, and GIS. For example, like phylogeography, landscape genetics is primarily concerned with spatial patterns of genetic diversity with respect to habitat features but at smaller temporal and spatial scales. Within this field there is a rich set of methodologies for examining the correspondence between contemporary patterns of diversity and divergence among georeferenced genetic data, and quantitative information about landscape features (see Gaggiotti, 2010 and associated papers; Manel et al., 2003). Commonly used concepts and approaches in landscape genetics integrating GIS data with spatially explicit measures of genetic diversity, differentiation, and effective population size can be applied to biogeographic questions. This can help to identify barriers to gene flow and better understand how particular characteristics constrain or facilitate population connectivity in an evolutionary context.
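One widely used way to examine the correspondence between genetic divergence and a landscape-derived distance is a Mantel permutation test. The sketch below is an assumption-laden illustration rather than a method prescribed in this review: the function name `mantel_test`, the six populations, and the distance values are hypothetical, and in practice the landscape matrix would come from GIS layers (e.g., least-cost or resistance distances).

```python
import numpy as np

def mantel_test(gen_dist, land_dist, n_perm=999, seed=0):
    """Correlate two square distance matrices and test by permutation.

    gen_dist:  (n, n) pairwise genetic distances among populations
    land_dist: (n, n) pairwise landscape distances among the same populations
    Returns the observed Pearson r and a one-sided permutation p-value.
    """
    gen = np.asarray(gen_dist, dtype=float)
    land = np.asarray(land_dist, dtype=float)
    n = gen.shape[0]
    upper = np.triu_indices(n, k=1)  # each population pair counted once
    r_obs = np.corrcoef(gen[upper], land[upper])[0, 1]
    rng = np.random.default_rng(seed)
    as_extreme = 0
    for _ in range(n_perm):
        order = rng.permutation(n)
        # Shuffle population labels in one matrix (rows and columns together).
        shuffled = gen[np.ix_(order, order)]
        r_perm = np.corrcoef(shuffled[upper], land[upper])[0, 1]
        if r_perm >= r_obs:
            as_extreme += 1
    p_value = (as_extreme + 1) / (n_perm + 1)
    return r_obs, p_value

# Hypothetical populations along a transect; positions in arbitrary units.
positions = np.array([0.0, 1.0, 3.0, 6.0, 10.0, 15.0])
landscape = np.abs(positions[:, None] - positions[None, :])
genetic = 0.02 * landscape  # toy case: divergence proportional to distance
r, p = mantel_test(genetic, landscape)
```

Mantel-type tests have known limitations with spatially autocorrelated data, so a result of this kind is better treated as an exploratory check than as a formal test of a phylogeographic hypothesis.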
Integrative approaches will ultimately allow us to more thoroughly consider and examine the range of potential histories underlying divergence patterns within and among species. Hypotheses generated under the exploration of one type of data are testable by the other, and jointly considering both types of information will aid in the refinement of hypotheses and the recognition of potential mechanisms previously not considered (Buckley, 2009, Knowles, 2009). While geospatial data must be used with caution (see Box 1 for discussion of species distribution modeling), they are a practical and informative tool that can place inferred demographic events in an historical and spatial context, guide genetic sampling, and point to areas for further investigation. Identifying how demographic events coincide with changes in landscape and environmental histories, such as climatic variables and the distribution of suitable habitat over time, can reveal the ecological and evolutionary mechanisms that may underlie population differentiation.
The use of ecological information and historical climatic and environmental data to guide the construction of appropriate phylogenetic and demographic models has added to our understanding of the role of particular geological barriers and climatic changes in intraspecific divergence. Thus, approaching phylogeographic studies from multiple independent perspectives can help to highlight some of the potential mechanisms underlying diversification so that we more thoroughly consider relevant and testable alternative hypotheses that might not otherwise be apparent.
Practical applications
Phylogeography aims to understand how patterns of divergence within species and species complexes coincide with current and historical geologic, geographic, and landscape features. By evaluating phylogeographic hypotheses within a statistical framework that unites phylogenetic and population genetic perspectives, we can infer the processes underlying differentiation and select among alternative evolutionary histories (Knowles and Maddison, 2002). Such integrative approaches will benefit even
Example 1 – population connectivity: visualizing putative dispersal corridors
For this first example, we use concepts from landscape genetics to explore patterns of genetic connectivity among populations, developing hypotheses of directionality and strength of gene flow among populations and across the landscape. This approach is of particular interest to conservation biologists who wish to identify those regions of the landscape that are crucial for maintaining gene flow among populations of interest. We illustrate the approach with an iguanid lizard (Oplurus cuvieri)
Example 2 – the influence of biogeographic barriers: constructing and testing among alternative hypotheses
In the second example we focus on a single species and demonstrate how two differing types of spatially explicit information can be used as a foundation for constructing alternative phylogeographic hypotheses. We use genetic data, the coalescent, species distribution models, and reconstructed ancestral distributions to examine the phylogeography of an amphibian within the Central American isthmus. The Hourglass Treefrog (Dendropsophus ebraccatus) occurs in the lowlands of Costa Rica and Panama
Example 3 – comparative phylogeography: detecting underlying mechanisms of diversification
In the third example we focus on comparisons of phylogeographic divergence across distantly related taxa. The field that is loosely defined as “comparative phylogeography” is of particular interest to investigators who are more interested in underlying climatic and/or geological mechanisms than in the ecological or evolutionary history of a single taxon. Here, we illustrate the approach by investigating montane animals in southwestern North America. We demonstrate the use of GIS and SDMs to
Conclusion
There are many elegant ways to integrate geospatial information into phylogeographic studies to elucidate both patterns of divergence and the associated processes. Phylogeography is iterative in nature, looking constantly among phylogenetic, population genetic, and geospatial patterns of differentiation (Buckley, 2009). Merging geospatial and genetic data is important from the early stages of data exploration all the way to complex analyses including multiple taxa, loci, and heterogeneous
Acknowledgments
We thank P. Barber, A. Carnaval, D. Lytle, J. McCormack, and J. Robertson for sharing data. T. Milledge and J. Pormann of the Duke Shared Cluster Resource, M. Hickerson, and D. Wegmann assisted with compiling, installing, and troubleshooting software. We thank members of the Yoder Lab, D. Swofford, D. Beamer, J. Robertson, and the instructors and participants of the 2009 Statistical Phylogeography course for fruitful discussions. This manuscript benefited from comments by members of the Yoder
References (113)
RAMAS Metapop: viability analysis for stage-structured metapopulations (version 5.0). Applied Biomathematics
(2002)
Serial SimCoal: a population genetics model for data from multiple populations and points in time
Bioinformatics
(2005)
Comparative phylogeography as an integrative approach to historical biogeography
Journal of Biogeography
(2001)
Estimating divergence times from molecular data on phylogenetic and population genetic timescales
Annual Review of Ecology and Systematics
(2002)
Phylogeography: The History and Formation of Species
(2000)
Phylogeography: retrospect and prospect
Journal of Biogeography
(2009)
Intraspecific phylogeography: the mitochondrial DNA bridge between population genetics and systematics
Annual Review of Ecology and Systematics
(1987)
Phylogeography of the canyon treefrog, Hyla arenicolor (Cope) based on mitochondrial DNA sequence data
Molecular Ecology
(1999)
Patterns of gene flow and population genetic structure in the canyon treefrog, Hyla arenicolor (Cope)
Molecular Ecology
(1999)
Approximate Bayesian computation in population genetics
Genetics
(2002)
In defence of model-based inference in phylogeography REPLY
Molecular Ecology
Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach
Proceedings of the National Academy of Sciences
Comparative mtDNA phylogeography of neotropical freshwater fishes: testing shared history to infer the evolutionary landscape of lower Central America
Molecular Ecology
Toward an organismal, integrative, and iterative phylogeography
Bioessays
Identifying glacial refugia in a geographic parthenogen using palaeoclimate modelling and phylogeography: the New Zealand stick insect Argosarchus horridus (White)
Molecular Ecology
Can mechanism inform species’ distribution models?
Ecology Letters
Historical climate modelling predicts patterns of current biodiversity in the Brazilian Atlantic forest
Journal of Biogeography
Stability predicts genetic diversity in the Brazilian Atlantic forest hotspot
Science
Estimating species phylogeny from gene-tree probabilities despite incomplete lineage sorting: an example from Melanoplus grasshoppers
Systematic Biology
Integrating coalescent and ecological niche modeling in comparative phylogeography
Evolution
Investigating the evolutionary history of the Pacific Northwest mesic forest ecosystem: hypothesis testing within a comparative phylogeographic framework
Evolution
Bayesian estimation of the timing and severity of a population bottleneck from ancient DNA
PLoS Biology
Bayesian clustering algorithms ascertaining spatial population structure: a new computer program and a comparison study
Molecular Ecology Notes
Integrating individual behaviour and landscape genetics: the population structure of timber rattlesnake hibernacula
Molecular Ecology
Earliest evolution associated with closure of the Tropical American Seaway
Proceedings of the National Academy of Sciences
The role of tropical dry forest as a long-term barrier to dispersal: a comparative phylogeographic analysis of dry forest tolerant and intolerant frogs
Molecular Ecology
Approximate Bayesian Computation (ABC) in practice
Trends in Ecology and Evolution
The ade4 package: implementing the duality diagram for ecologists
Journal of Statistical Software
BEAST: Bayesian evolutionary analysis by sampling trees
BMC Evolutionary Biology
Bayesian coalescent inference of past population dynamics from molecular sequences
Molecular Biology and Evolution
Do they? How do they? Why do they differ? On finding reasons for differing performances of species distribution models
Ecography
Species distribution models: ecological explanation and prediction across space and time
Annual Review of Ecology, Evolution, and Systematics
A statistical explanation of MaxEnt for ecologists
Diversity and Distributions
Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows
Molecular Ecology Resources
SIMCOAL: a general coalescent program for the simulation of molecular data in interconnected populations with arbitrary demography
Journal of Heredity
Population genetic structure reveals terrestrial affinities for a headwater stream insect
Freshwater Biology
Spatially explicit Bayesian clustering models in population genetics
Molecular Ecology Resources
Preface to the special issue: advances in the analysis of spatial genetic data
Molecular Ecology Resources
Integrating phylogenetics and environmental niche models to explore speciation mechanisms in dendrobatid frogs
Evolution
Geneland: a computer package for landscape genetics
Molecular Ecology Notes
Bayesian inference of species trees from multilocus data
Molecular Biology and Evolution
The effect of sample size and species characteristics on performance of different species distribution modeling methods
Ecography
Integration within the Felsenstein equation for improved Markov chain Monte Carlo methods in population genetics
Proceedings of the National Academy of Sciences
Testing comparative phylogeographic models of marine vicariance and dispersal using a hierarchical Bayesian approach
BMC Evolutionary Biology
Calibrating a molecular clock from phylogeographic data: moments and likelihood estimators
Evolution
Test for simultaneous divergence using approximate Bayesian computation
Evolution
msBayes: pipeline for testing comparative phylogeographic histories using hierarchical approximate Bayesian computation
BMC Bioinformatics
Phylogeography’s past, present, and future: 10 years after Avise, 2000
Molecular Phylogenetics and Evolution