The genomic landscape at a late stage of stickleback speciation: high genomic divergence interspersed by small localized regions of introgression

Mark Ravinet; Kohta Yoshida; Shuji Shigenobu; Atsushi Toyoda; Asao Fujiyama; Jun Kitano

doi:10.1101/190629

Abstract

Speciation is a continuous process and analysis of species pairs at different stages of divergence provides insight into how it unfolds. Genomic studies on young species pairs have often revealed peaks of divergence and heterogeneous genomic differentiation. Yet it remains unclear how localised peaks of differentiation progress to genome-wide divergence during the later stages of speciation with gene flow. Spanning the speciation continuum, stickleback species pairs are ideal for investigating how genomic divergence builds up during speciation. However, attention has largely focused on young postglacial species pairs, with little known of the genomic signatures of divergence and introgression in older systems. The Japanese stickleback species pair, composed of the Pacific Ocean three-spined stickleback (Gasterosteus aculeatus) and the Japan Sea stickleback (G. nipponicus), which co-occur in the Japanese islands, is at a late stage of speciation. Divergence likely started well before the end of the last glacial period and crosses between Japan Sea females and Pacific Ocean males result in hybrid male sterility. Here we use coalescent analyses and Approximate Bayesian computation to show that the two species split approximately 0.68-1 million years ago but that they have continued to hybridise at a low rate throughout divergence. Population genomic data revealed that high levels of genomic differentiation are maintained across the majority of the genome when gene flow occurs. However despite this, we identified multiple, small regions of introgression, strongly correlated with recombination rate. Our results demonstrate that a high level of genome-wide divergence can establish in the face of persistent introgression and that gene flow can be localized to small genomic regions at the later stages of speciation with gene flow.

Author summary

When species evolve, reproductive isolation leads to a build-up of differentiation in the genome where genes involved in the process occur. Much of our understanding of this comes from early stage speciation, with relatively few examples from more divergent species pairs that still exchange genes. To address this, we focused on Pacific Ocean and Japan Sea sticklebacks, which co-occur in the Japanese islands. We established that they are the oldest and most divergent known stickleback species pair, that they evolved in the face of gene flow and that this gene flow is still on-going. We found introgression is confined to small, localised genomic regions where recombination rate is high. Our results show high divergence can be maintained between species, despite extensive gene flow.

Introduction

Speciation is a continuous process through which reproductive isolation is established [1–3]. According to the genic view of speciation [4], when populations are in contact, gene flow is initially restricted at barrier loci (i.e. loci underlying reproductive isolation), leading to the emergence of peaks of genetic differentiation surrounding such barriers; i.e. heterogeneous genomic differentiation [5,6]. As speciation progresses, this localised build-up of reproductive isolation spreads to nearby regions due to linkage [4,5,7]. Once a critical amount of differentiation at multiple barrier loci has accumulated, reduction of the genome-wide effective migration rate will eventually lead to divergence across the entire genome [5,7]. This final step of genome-wide congealing may be a rapid and non-linear phase transition under certain conditions, such as when isolating barriers have a polygenic basis or a few strong barrier loci arise [8–10].

Recent empirical genomic studies have revealed regions of high and low differentiation dispersed throughout the genome at early stages of speciation [7,11,12]. This empirical data has lent strong support to the genic perspective of the speciation process [4]. To-date however, the majority of speciation genomic studies demonstrating heterogeneous genetic differentiation have come from young species or population pairs with low divergence [7,11,12] with some exceptions [9,13]. Even for these cases [13,14], it is unclear whether gene flow has occurred throughout divergence or whether the species pair in question experienced periods of geographical isolation (see below). A bias towards early stage divergence has undoubtedly been informative to understand the onset of the speciation process [15–18]. However, it is limited in its scope for understanding the factors that eventually lead to the completion of the speciation process – i.e. the evolution of genome-wide differentiation. When divergence with gene flow is studied across the speciation continuum in Timema stick insects (including the later stage divergence) the range of genome-wide differentiation appears to be disjointed with a gap in F_ST between 0.3 and 0.7 [8]. Furthermore, literature surveys have also indicated that many species pairs exhibit a mean F_ST between either 0.03-0.27 or 0.70-0.90 when gene flow is occurring [8,9]. This disjointed distribution of differentiation is consistent with the idea that progression towards speciation may be non-linear, with a phase transition due to genome wide congealing [10,19].

In addition to the need for a more balanced understanding of the extent of genomic divergence and introgression at later stages of the speciation continuum, there is a need for studies which account for factors such as demographic history [12,14]. This is because high genome-wide differentiation in a single species pair may have evolved via genetic drift and local adaptation during allopatric isolation, rather than due to divergence with gene flow. In other cases, heterogeneous genomic differentiation may result from the erosion of genetic differentiation by introgression following secondary contact after geographical isolation. Without a picture of the demographic history, this scenario may be indistinguishable from primary divergence [20] and will therefore introduce bias if multiple species pairs are compared. Despite the fact that the expected pattern of genomic differentiation during speciation is influenced by the timing and duration of geographical isolation [7], testing different demographic histories has been somewhat neglected by the field [7,20].

Other factors besides demographic history of a species pair can also confound patterns of heterogeneous genomic differentiation. For example, local adaptation or background selection in genomic regions where recombination is reduced can elevate differentiation measures and be mistaken for barrier loci [13,21,22]. Similarly, regions of low differentiation may be caused by not only on-going gene flow but also incomplete lineage sorting (ILS) [21,23] and mutation rate variation [24]. Distinction between gene flow and ILS is likely easier in more divergent species pairs [24–26]. Furthermore, the use of multiple classical and recently developed methods, such as detection of recent hybrid progeny, ABBA-BABA tests [27,28], model-based inference [29], and comparison between allopatric and sympatric pairs [22,28], provides a means to distinguish signatures of gene flow from alternative explanations. Thus, there is a need for studies which account for factors such as demographic history, recombination rate variation, and ILS that can confound the interpretation of genome scan data [7,12].

Three-spined stickleback species pairs (genus Gasterosteus) span the speciation continuum at varying stages of divergence, making them a model system for speciation research [30,31]. To-date genomic research on speciation with gene flow in the stickleback complex has largely focused on weakly divergent species pairs, such as lake-stream ecotypes, with mean genome-wide F_ST values of less than 0.3 [32–34]. Such studies have shown that the genomic landscape of differentiation between these recently diverged sympatric or parapatric species pairs is heterogeneous and interspersed with multiple peaks of high differentiation [16,32,34]. The emerging pattern is consistent with predictions under the genic concept of speciation – i.e. that reproductive isolation is localized in the genome at early stages of speciation [4,35]. However, it remains unclear whether such localized differentiation will eventually progress toward genome-wide differentiation in the face of gene flow.

Toward the end of the stickleback speciation continuum is a marine species pair in Japan [36,37]. The Japan Sea stickleback (G. nipponicus) is sympatric with the Pacific Ocean lineage of three-spined stickleback (G. aculeatus) (Fig 1A) in the waters surrounding the Japanese archipelago (Fig 1C) [36,38]. Divergence time between the two marine species has been estimated to be 1.5-2 million years based on allozyme and microsatellite data [37,39], making it much older than postglacial stickleback species pairs. Speciation has been hypothesized to have occurred as a result of the repeated isolation of the Sea of Japan during the Pleistocene, but this divergence scenario remains to be explicitly tested [37,39]. A unique feature of the G. nipponicus and G. aculeatus system, relative to postglacial stickleback species pairs, is that a neo-sex chromosome has arisen due to a fusion between a Y chromosome and a previously autosomal chromosome IX (chrIX) in the G. nipponicus lineage [36,40]. Furthermore, crosses between Japan Sea females and Pacific Ocean males show hybrid male sterility [37]. Previous quantitative trait locus (QTL) mapping identified QTL for courtship behaviour on the neo-X and hybrid male sterility on the ancestral-X. However, there are other isolating barriers, such as eco-geographical isolation, temporal isolation, and ecological selection against migrants [37,41,42]. The combination of these multiple barriers most likely contributes to the strong reproductive isolation in this system [36,43]. However, despite such strong divergence, hybrids have been observed where the two species co-occur in Northern Japan [36] and mitochondrial discordance between the species suggests some history of introgression during speciation [44,45]. Although the Japanese species pair therefore represents the furthest point of divergence along the stickleback speciation continuum, speciation remains incomplete. 
The evolutionary history and genome-wide patterns of genetic differentiation and introgression of this strongly divergent species pair therefore remain an open question.

Fig 1. The Japan Sea stickleback is a separate species. (A) Rooted nuclear consensus tree for Japan Sea, Pacific Ocean and Atlantic Ocean stickleback lineages from 10 kb non-overlapping sliding windows across the autosome. Red trees indicate species clustering; blue trees indicate geographical clustering and green trees reflect ancestral polymorphism. NB: Only 1,000 subsampled species trees are shown here to aid illustration. (B) Mitogenome Bayesian consensus tree shows divergence between two mitochondrial clades – all Japanese sticklebacks (G. nipponicus and G. aculeatus) and G. aculeatus occurring in Europe and North America. (C) Present day distribution of G. aculeatus (blue) and G. nipponicus (red) around the Japanese archipelago. The two species overlap in Hokkaido, Northern Japan and samples for this study were collected in Bekanbeushi River in Akkeshi unless noted. (D) PSMC plot of 26 resequenced genomes shows a steady population size in the Pacific Ocean lineage (blue) but a bottleneck around 0.15-0.3 million years before present and a subsequent increase in the Japan Sea lineage (orange).

The aim of our study was to address this gap in our knowledge; i.e. to quantify the patterns of genomic differentiation and introgression at a later stage of the speciation continuum. To this end, we used whole-genome resequencing data from the Japanese stickleback species pair to determine their evolutionary history and characterise patterns of gene flow between them. Our first aim was to establish how and when divergence took place between G. nipponicus and G. aculeatus. Using thousands of genomic loci and a coalescent modelling approach, we tested a range of divergence scenarios and estimated the timing and duration of isolation, the extent of gene flow and fluctuations in population size. After identifying that the two species have indeed diverged in the face of gene flow, we then used a comparative genome scan approach with an additional G. aculeatus lineage from the Atlantic Ocean [46] as an allopatric control (Fig 1A, Fig S1). After establishing that gene flow has occurred but that a high level of genomic differentiation has remained, we used two independent measures of gene flow to identify where in the genome introgression has left its mark. We tested whether introgression occurs more frequently in regions of high recombination and whether it occurs in regions with functionally important genes. Our findings suggest a high level of genome-wide divergence can be maintained, as introgression is restricted to small, localized genomic regions.

Results

Ancestral demography and population genomic analyses support divergence with gene flow

Phylogenetic analysis on 35,666 10 kb non-overlapping genome windows on autosomes (i.e., excluding chrIX and chrXIX) supports a deep split between G. aculeatus (both Pacific and Atlantic Ocean lineages) and G. nipponicus (Japan Sea stickleback) (Fig 1A). Of all windows, 98.8% support the split between species, while only 0.51% indicate clustering of fish occurring in Japan (the Japanese Pacific Ocean G. aculeatus and the Japan Sea G. nipponicus; Table S1 and Fig 1A).

We calculated genealogical sorting index (gsi) [47] on maximum likelihood phylogenies estimated from non-overlapping sliding windows of 20 kb across the autosome. High gsi indicates monophyly, while low gsi indicates mixed ancestry [47]. Genome-wide averages (± SD) of gsi were high, but not complete, for all three Gasterosteus lineages with that of the Japan Sea stickleback being the highest (Atlantic gsi = 0.45 ± 0.10, Pacific gsi = 0.57 ± 0.09, Japan Sea gsi = 0.72 ± 0.06).

This is in stark contrast to the mitogenome phylogeny where sticklebacks from both species occurring in Japan fall into a single clade separate from the clade occurring in the Eastern Pacific and Atlantic (Fig 1B, Fig S2). A lack of mitogenome divergence between G. aculeatus and G. nipponicus from the Japanese archipelago suggests mitochondrial introgression has occurred where these lineages overlap (Fig 1C).

Since the consensus autosomal phylogeny suggests a more recent split between the Pacific and Atlantic G. aculeatus lineages, a deeper mtDNA split based on spatial distribution suggests mitochondrial introgression has likely occurred from the Japan Sea G. nipponicus into the Pacific Ocean G. aculeatus. Divergence time estimates between the mitogenome clades are thus informative for dating speciation. Bayesian coalescent analysis using a strict clock model in Bayesian Evolutionary Analysis by Sampling Trees (BEAST) suggests a median split date of 1.30 million years (0.15-2.41; 95% Highest Posterior Density [HPD] intervals; Table S2) for the two major mitogenome clades (Fig S2), consistent with previous estimates [44]. Divergence between Eastern Pacific and Atlantic haplotypes is more recent at 0.39 million years (0.03-0.74; 95% HPD) but is older than the Most Recent Common Ancestor (MRCA) of all haplotypes occurring in Japan (Fig 1B, S2A), suggesting mitochondrial gene flow from G. nipponicus to G. aculeatus has occurred in the recent past (i.e. <0.39 million years BP).

Mitochondrial introgression can be driven by large demographic disparities between populations with gene flow and is more likely to occur from a larger to a smaller population [48]. To investigate the demographic disparities between G. aculeatus and G. nipponicus, we used pairwise sequential Markov coalescent (PSMC) on all 26 Atlantic Ocean, Japan Sea and Pacific Ocean resequenced stickleback genomes. Strikingly, G. nipponicus experienced a severe bottleneck around 0.15-0.3 million years before present (BP) (Fig 1D); mean N_e fell to 26,422 ± 1,191 at its lowest point. Subsequently, after 0.1 million years BP, G. nipponicus underwent a dramatic population size expansion (Fig 1D): mean N_e rose to 195,974 ± 28,832 (i.e. ∼7.5 times increase from the bottleneck) during the late Pleistocene. In contrast, the Japanese Pacific Ocean G. aculeatus population has remained relatively stable throughout its history (mean N_e ± SD = 118,150 ± 4,330; Fig 1D, see Fig S3 for bootstrap support). Although the Atlantic (Fig 1D) and Western Pacific lineages of G. aculeatus (Fig S4) also experienced some population growth during the late Pleistocene, their effective population sizes remained smaller than that of G. nipponicus. Genome-wide averages of Tajima’s D also support a recent demographic expansion for G. nipponicus (mean ± SD of Tajima’s D = -0.82 ± 0.45) and stable population size in the Pacific Ocean (mean ± SD of Tajima’s D = -0.04 ± 0.63). Taken together, these data indicate that mitochondrial introgression likely occurred during the late Pleistocene, when G. nipponicus N_e was substantially larger than G. aculeatus N_e. This is consistent with the hypothesis that mitochondrial gene flow from G. nipponicus to G. aculeatus has occurred <0.39 million years BP (see above).
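To make the Tajima's D comparison concrete, the following is a minimal sketch of the standard estimator for a single genomic window. It assumes phased, bi-allelic data coded as a 0/1 haplotype matrix; the function name and input layout are illustrative assumptions and do not describe the pipeline used for the genome-wide averages reported here.

```python
import numpy as np

def tajimas_d(haplotypes):
    """Tajima's D for a 0/1 haplotype matrix (rows = sequences, columns = sites).

    Illustrative helper: the name and input layout are assumptions.
    """
    hap = np.asarray(haplotypes)
    n = hap.shape[0]                      # number of sampled sequences
    derived = hap.sum(axis=0)             # allele count per site
    seg = (derived > 0) & (derived < n)   # segregating sites only
    S = int(seg.sum())
    if S == 0:
        return float("nan")               # D is undefined without variation

    # Mean number of pairwise differences (pi), from allele counts.
    k = derived[seg]
    pi = float((k * (n - k)).sum()) / (n * (n - 1) / 2)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between pi and Watterson's estimator, scaled by its SD.
    return (pi - S / a1) / np.sqrt(e1 * S + e2 * S * (S - 1))
```

An excess of rare variants, as expected after a population expansion, pulls the statistic below zero, whereas an excess of intermediate-frequency variants pushes it above zero.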

To explicitly test whether divergence between G. aculeatus and G. nipponicus occurred in the presence of gene flow, we used an Approximate Bayesian Computation (ABC) approach with 1,874 2 kb loci randomly sampled from across the autosome. We tested five divergence scenarios – isolation (I), isolation with migration (IM), isolation-with-ancient-migration (IAM), isolation-with-recent-migration (IRM) and isolation-with-ancient-and-recent-migration (IARM) – i.e. two discrete periods of contact. Since the results of our PSMC analyses clearly indicate N_e has varied throughout divergence (Fig 1D), we performed a hierarchical ABC analysis, first selecting the most appropriate population growth model (i.e. constant size, population growth and a Japan Sea bottleneck) within each divergence scenario and then performing final model selection amongst the best supported divergence/growth model scenarios (see Supplementary Methods for full specification of models, priors, parameters and extensive sensitivity testing).

Using 20 summary statistics (see Supplementary Methods for a full list of statistics used) and a neural-network rejection method with 1% tolerance of simulated datasets, the best-supported divergence scenario was a model of IM (Fig 2A; Table 1). Parameter estimates from the IM model suggest divergence between G. aculeatus and G. nipponicus occurred 0.68 million years ago (median estimate, 0.18-4.17 million years, lower & upper 95% HPD; Fig 2B). A Japan Sea bottleneck occurred 0.3 million years ago (0.03-2.21 million years 95% HPD), reducing N_e to about 20% of the contemporary estimate (Fig 2C, Table S3). Mean migration rates between the two species were low, and migration from the Japan Sea into the Pacific Ocean lineage was slightly greater (Fig 2D, Table S3). Contemporary N_e of the Japan Sea lineage is larger than that of the Pacific Ocean, although the N_e estimates differed in magnitude from those estimated by PSMC (Figs 1D and 2C, Table S3).
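The logic of tolerance-based ABC can be sketched with a plain rejection sampler. This is a deliberately simplified stand-in: the analysis above used a neural-network method and coalescent simulations with 20 summary statistics, whereas the sketch below uses simple rejection, a toy one-parameter simulator and two statistics. All function names, the prior and the simulator are hypothetical.

```python
import numpy as np

def abc_rejection(observed, sample_prior, simulate, n_sims=20_000,
                  tolerance=0.01, seed=0):
    """Return the prior draws whose simulated statistics lie closest to `observed`."""
    rng = np.random.default_rng(seed)
    observed = np.asarray(observed, dtype=float)

    # 1. Draw parameters from the prior and simulate summary statistics.
    params = np.array([sample_prior(rng) for _ in range(n_sims)])
    stats = np.array([simulate(p, rng) for p in params], dtype=float)

    # 2. Standardise each statistic so that none dominates the distance.
    scale = stats.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.sqrt((((stats - observed) / scale) ** 2).sum(axis=1))

    # 3. Keep the closest fraction of simulations (1% tolerance by default).
    n_keep = max(1, int(round(tolerance * n_sims)))
    return params[np.argsort(dist)[:n_keep]]

# Toy stand-ins (hypothetical): one parameter, two summary statistics.
def toy_prior(rng):
    return rng.uniform(0.0, 10.0)

def toy_simulator(mu, rng):
    draws = rng.normal(mu, 1.0, size=50)
    return draws.mean(), draws.std()

posterior = abc_rejection([3.0, 1.0], toy_prior, toy_simulator)
print(np.median(posterior))  # posterior median of the toy parameter
```

The retained draws approximate the posterior distribution; model choice works the same way, with the model index treated as one more parameter and posterior probabilities estimated from the share of accepted simulations per model.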

Fig 2. ABC analysis supports isolation with gene flow. (A) A model of isolation with migration and a bottleneck in the Japan Sea lineage is best supported by ABC analysis using ∼2,000 nuclear loci (see Table 1). Posterior probability densities for model parameters estimated using neural network analysis, a 1% tolerance rate and 20 summary statistics. Parameters are: T = time of split, m₁₂ = migration from Japan Sea to Pacific Ocean, m₂₁ = migration from Pacific Ocean to Japan Sea; T_G = timing of bottleneck, N_PO = Pacific Ocean effective population size, N_JS = Japan Sea effective population size and N_JSB = Japan Sea bottleneck effective population size. Posterior probability density curves for (B) Japan Sea and Pacific Ocean divergence time and timing of bottleneck in the Japan Sea lineage, (C) Japan Sea, Pacific Ocean and Japan Sea bottleneck effective population sizes, and (D) mean migration rates. Figures on each panel are median parameter estimates.

Table 1. Posterior probability values for models for final ABC model selection using neural network and standard rejection methods. All estimates produced using a tolerance of 1% and 20 summary statistics. Bold text indicates the model where posterior probability provides the highest support. Models are I = isolation, IM = isolation with migration, IAM = isolation and ancient migration, IRM = isolation and recent migration, IARM = isolation with ancient and recent migration.

Identifying admixture between species where they co-occur provides strong evidence of on-going introgression [7,12]. To address this, we used a RAD-sequencing dataset with a larger sample size of 245 individuals from the Atlantic, Pacific and Japan Sea lineages, including previously published data from Pacific-derived populations in North America [49]. Principal component analysis (PCA) of allele frequencies at 3,744 high-quality bi-allelic SNPs showed that, consistent with our whole genome data, the main axis explaining 20% of the variance was between G. aculeatus and G. nipponicus (Fig S5). The secondary axis explaining 9.49% of the variance was mainly between the Atlantic and Pacific populations (Fig S5). Importantly, PCA shows a single individual is intermediate between the Pacific and Japan Sea populations occurring in Akkeshi, the sympatric site in Hokkaido, Japan where our whole genome-sequenced samples were collected (Fig 1C). A separate Bayesian analysis for admixture using STRUCTURE [50,51] found greatest support for K = 2 in the Japanese populations and also identified the putative F₁ hybrid plus individuals with possible recent admixture at this sympatric site (Fig S6).

Taken together, these data indicate that divergence between the Japanese G. aculeatus and G. nipponicus is much older and greater compared to commonly studied postglacial stickleback species pairs. Despite the great extent of divergence between Japanese stickleback species, parameter estimates and observational data suggest that gene flow between them is on-going.

High levels of genome-wide divergence with highly localized signatures of introgression

Genome-wide differentiation was strikingly high between G. nipponicus and G. aculeatus regardless of their geographical overlap: both relative (F_ST) and absolute divergence (d_XY) were high (Fig 3A & B, Fig 4, and Figs S7 and S8). The genome-wide average of F_ST between the sympatric species was 0.628; this is higher than all other studied stickleback species pairs [32–34,52] (see Fig 3C). Despite consistently high divergence, both F_ST and d_XY values were significantly lower where the two species occur in contact (Table 2, Figs 3A & B, 4, S7 and S8; 10,000 replicate permutation tests on 10 kb windows: P < 2.2 × 10^-16 for both statistics), consistent with the presence of gene flow in sympatry.
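A comparison of windowed statistics between the sympatric and allopatric pairs can be illustrated with a generic two-sample permutation test on the difference in means. The sketch below is an assumption about the general form of such a test; the exact permutation scheme applied to the 10 kb windows is not specified in this section, and the function name is hypothetical.

```python
import numpy as np

def permutation_test_mean_diff(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in means between x and y.

    x and y hold a per-window statistic (e.g. F_ST) for two comparisons.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])

    extreme = 0
    for _ in range(n_perm):
        # Shuffle the comparison labels and recompute the difference.
        perm = rng.permutation(pooled)
        diff = perm[:x.size].mean() - perm[x.size:].mean()
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one estimator: the smallest value it can return is 1 / (n_perm + 1).
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value
```

For example, `permutation_test_mean_diff(fst_allopatric, fst_sympatric)` would return the observed difference in mean F_ST and its permutation p-value, where both arrays are hypothetical inputs holding one value per window.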

Fig 3. Genomic divergence is lower in sympatry than in allopatry between species. Histograms of (A) relative (F_ST) and (B) absolute (d_XY) differentiation measures for each of the species comparisons. (C) Mean genome-wide F_ST of the Japanese species pair compared with those of other stickleback systems taken from previously published studies [32–34,52].

Fig 4. Genome-wide distribution of divergence and introgression. Divergence was measured using F_ST and d_XY, while introgression was measured using G_MIN and f_d. Data plotted here is from 50 kb non-overlapping genome windows. Blue and yellow lines indicate allopatric (Japan Sea vs Atlantic) and sympatric (Japan Sea vs Pacific Ocean) comparisons, respectively.

Table 2. Genome-wide averages for measures of divergence and introgression. F_ST, d_XY, G_MIN, and f_d for all pairwise comparisons of Japan Sea (JS), Pacific Ocean (PO) and Atlantic Ocean sticklebacks (AT) are shown. Mean ± SD and lower and upper limits of the 95% confidence interval (in parentheses) are shown. NA, not analysed.

A more fine-scale analysis of genome-wide divergence based on 10 kb non-overlapping windows revealed that the high baseline divergence between G. nipponicus and G. aculeatus is interspersed by regions of low differentiation in both F_ST and d_XY genome scans (Figs 4 top two panels, S7 and S8). Processes such as background selection that alter within-population diversity can bias relative measures of differentiation such as F_ST [7,21,22,53]. Furthermore, absolute measures like d_XY can take a long time to reach equilibrium between populations following divergence, meaning a lower power for detecting recent introgression events [7,26]. Therefore, to better identify genomic regions of recent introgression, we calculated two independent measures of introgression. The first of these was G_MIN, the ratio of the minimum d_XY to the average d_XY [26]. Under strict isolation, G_MIN is the lower bound of divergence time between two populations, whereas when introgression occurs, G_MIN reflects the timing of the most recent migration [26]. The second measure was f_d, an estimate of the proportion of introgressed sites in a genome window, calculated using a four population ABBA-BABA test [54]. G_MIN is more effective at identifying recent, low level gene flow than either F_ST or d_XY but by definition it is unable to detect genomic regions where complete introgression has occurred [26], which can however be detected using f_d. Importantly, both measures are robust to variation in recombination rate [26,54]. Combining these two statistics therefore allows us to identify both low-level (G_MIN) and strong introgression (f_d).
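The two introgression statistics can be written down compactly. The sketch below follows the definitions given above: G_MIN as the minimum over the mean between-population pairwise divergence [26], and f_d as the ratio of observed to maximal ABBA-BABA excess under the published definition [54]. It assumes phased 0/1 haplotypes for G_MIN and per-site derived allele frequencies for f_d; function names, input layouts and the assignment of populations to P1, P2, P3 and the outgroup are illustrative assumptions.

```python
import numpy as np

def pairwise_dxy(hap_a, hap_b, window_len):
    """Per-site divergence for every between-population haplotype pair."""
    a = np.asarray(hap_a)[:, None, :]
    b = np.asarray(hap_b)[None, :, :]
    return (a != b).sum(axis=2) / window_len   # shape (n_a, n_b)

def g_min(hap_a, hap_b, window_len):
    """Ratio of the minimum to the mean between-population divergence."""
    d = pairwise_dxy(hap_a, hap_b, window_len)
    mean_dxy = d.mean()                        # this mean is d_XY
    return d.min() / mean_dxy if mean_dxy > 0 else float("nan")

def f_d(p1, p2, p3, p_out):
    """f_d from derived allele frequencies at each site of a window."""
    p1, p2, p3, p_out = (np.asarray(p, dtype=float) for p in (p1, p2, p3, p_out))

    def abba_minus_baba(a, b, c, o):
        abba = (1 - a) * b * c * (1 - o)
        baba = a * (1 - b) * c * (1 - o)
        return (abba - baba).sum()

    # Denominator: the excess expected if P2 and P3 were fully homogenised,
    # using whichever of the two has the higher derived frequency per site.
    p_donor = np.maximum(p2, p3)
    denom = abba_minus_baba(p1, p_donor, p_donor, p_out)
    if denom == 0:
        return float("nan")
    return abba_minus_baba(p1, p2, p3, p_out) / denom
```

A single introgressed haplotype lowers the minimum pairwise divergence far more than the mean, which is why G_MIN responds to rare, recent migrants, whereas it stays near one once introgressed haplotypes have become common, the case that f_d captures.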

Focusing on between-species comparisons, mean (± SD) G_MIN measured from 10 kb non-overlapping windows was greater in allopatry than sympatry (Japan Sea vs. Atlantic: 0.876 ± 0.071; Japan Sea vs. Pacific: 0.857 ± 0.103; randomization test P < 2.2 × 10^-16; Fig 4). Mean f_d was greater when the species overlapped (JS vs. AT: -0.0031 ± 0.0540; JS vs. PO: 0.0039 ± 0.0328; P < 2.2 × 10^-16; Fig 4), and the two statistics are more strongly negatively correlated in sympatry (Fig S9), supporting gene flow between G. nipponicus and Japanese populations of G. aculeatus.

Genomic regions of low G_MIN (i.e. G_MIN valleys) may indicate recent introgression. We identified genome windows with low G_MIN values using a Hidden-Markov classification model [55] and an outlier approach based on permutations of variant sites (Fig S10). We then clustered 10 kb outlier windows occurring within 30 kb of one another into putative G_MIN valleys. G_MIN in particular may be susceptible to false positives as a result of ILS. However, lower d_XY and higher f_d in sympatric G_MIN valley windows compared to the genomic background suggest that ILS alone does not explain the patterns observed here (Fig S11; randomization test, P < 2.2 × 10^-16 in both cases). These regions of introgression were more common in the genome when the two species overlapped, with 637 valleys in sympatry (JS-PO comparison) compared to 337 in allopatry (JS-AT comparison) (randomization test, t = 5.35, P < 2.2 × 10^-16) and a greater number of valleys per chromosome (Fig 5A), although mean valley size did not differ significantly (77.6 kb and 75.4 kb in sympatry and allopatry respectively, P = 0.82). Interestingly, 225 valleys were shared between JS-PO and JS-AT comparisons (Fig. 4). These shared valleys may indicate ILS but they may also reflect introgression from Pacific Ocean to Japan Sea, where one or a few Japan Sea individuals carry haplotypes derived from Pacific Ocean and are therefore also similar to Atlantic Ocean haplotypes. However, a larger number of valleys (412 valleys) were unique to the JS-PO comparison, where introgression might occur from Japan Sea to Pacific Ocean.
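The clustering of outlier windows into valleys can be sketched as a single pass over sorted windows. The sketch assumes that "within 30 kb" refers to the gap between the end of one outlier window and the start of the next on the same chromosome; that interpretation, the function name and the tuple layout are assumptions made for illustration.

```python
def cluster_windows(windows, max_gap=30_000):
    """Merge outlier windows separated by at most `max_gap` bp into valleys.

    windows: iterable of (chrom, start, end) tuples for outlier windows.
    Returns a list of (chrom, start, end) valleys.
    """
    valleys = []
    for chrom, start, end in sorted(windows):
        last = valleys[-1] if valleys else None
        if last is not None and last[0] == chrom and start - last[2] <= max_gap:
            # Close enough to the previous valley: extend it.
            last[2] = max(last[2], end)
        else:
            # New chromosome or gap too large: start a new valley.
            valleys.append([chrom, start, end])
    return [tuple(v) for v in valleys]
```

Valley counts and sizes of the kind reported above then follow from `len(valleys)` and from `end - start` for each merged interval.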

Fig 5. Fewer introgression valleys occur on the neo-X chromosome. (A) A greater number of G_MIN valleys occur in sympatry than in allopatry between species. (B) G_MIN valleys and f_d peaks also occur in regions of the genome with a higher recombination rate. Fewer valleys occur on the neo-X chromosome (chrIX) compared to autosomes (C), even when chromosome length is taken into consideration (D); N.B – data for (C) and (D) were measured using females only.

A similar geographical comparison of peaks of f_d between species was not possible, due to the much lower genome-wide distribution of f_d between G. nipponicus and the Atlantic G. aculeatus (Fig 4). Nonetheless, Hidden-Markov classification identified 823 f_d peaks occurring between G. nipponicus and Pacific G. aculeatus (Fig S12). If the f_d peaks mainly indicate introgression from Pacific Ocean to Japan Sea, d_XY between Japan Sea and Atlantic Ocean is expected to be lower in these regions compared to the genome background, as Japan Sea fish carry haplotypes derived from the Pacific Ocean, which in turn are similar to the Atlantic Ocean haplotypes. While JS-AT d_XY was lower in f_d peaks compared to the genome background (one-tailed permutation test, P < 2.2 × 10^-16), this difference was not very clear (Fig. S13). In contrast, if introgression occurred mainly from Japan Sea to Pacific Ocean, d_XY in the PO-AT comparison should increase in f_d peaks relative to the genome background, as Pacific Ocean fish carry Japan Sea-derived haplotypes, which are divergent from the Atlantic Ocean haplotypes. We clearly observed this pattern (P < 2.2 × 10^-16; Fig. S13), suggesting that introgression from Japan Sea to Pacific Ocean may be more predominant than the opposite direction.

To further investigate the direction of gene flow, we used partitioned D statistics (an extension of the four population test – see Fig. S14), which test for an excess of shared derived alleles using five, rather than four, populations [56]. To this end, we added an allopatric Japan Sea population (collected from Lake Shinji, a brackish lake on the Japan Sea coast of southern Honshu). A positive D₁₂ statistic is proposed to indicate the predominance of introgression from P3 to P2 (Fig. S14) [56]. When P3 was set to Japan Sea (where P3₁ is sympatric and P3₂ is allopatric with the Pacific Ocean) and P2 to Pacific Ocean (see Fig. S14B), D₁₂ was significantly positive in f_d peaks (one-tailed permutation test, P < 2.2 × 10^-16). In contrast, when we rotated the populations at the tips – i.e. setting P2 to sympatric Japan Sea, P3₁ to Pacific Ocean, and P3₂ to Atlantic Ocean (see Fig. S14C) – D₁₂ was not positive, consistent with the suggestion that introgression is occurring mainly from Japan Sea to Pacific Ocean. However, the resolution of partitioned D statistics has been criticized [57]; positive D₁₂ can also be caused by introgression from the Pacific Ocean (P2) to the common ancestor of the sympatric and allopatric Japan Sea populations (P3₁ & P3₂). To overcome this issue, we calculated D_FOIL, which also uses a five-population test but accounts for all possible introgression events [57]. When P₁ = sympatric Japan Sea, P₂ = allopatric Japan Sea, P₃ = Pacific Ocean, and P₄ = Atlantic Ocean (Fig. S15), D_FOIL clearly indicated the presence of ancestral introgression (239 out of 4,236 100 kb-windows) between the Japan Sea ancestor (P₁₂) and the Pacific Ocean (P₃) (see Fig. S15). However, we found only a few windows showing unidirectional introgression (6 loci in total), precluding any conclusion about the predominant direction of introgression using this analysis (Fig. S15). This low sensitivity may be due to the fact that structuring in the Japan Sea lineage is low [58] – i.e. due to a recent divergence time or high intraspecific gene flow.
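
Both partitioned D and D_FOIL build on the four-population test. As a minimal sketch of that underlying test only (not of the five-population statistics themselves), the Python below computes Patterson's D from per-site derived allele frequencies for the tree (((P1, P2), P3), O). The function name and the toy frequencies are illustrative assumptions and are not taken from our data.

```python
def patterson_d(p1, p2, p3):
    """Four-population D for the tree (((P1, P2), P3), O).

    p1, p2 and p3 are per-site derived allele frequencies; the outgroup is
    assumed to be fixed for the ancestral allele at every site.
    """
    # ABBA: P2 and P3 share the derived allele; BABA: P1 and P3 share it.
    abba = sum((1 - a) * b * c for a, b, c in zip(p1, p2, p3))
    baba = sum(a * (1 - b) * c for a, b, c in zip(p1, p2, p3))
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Hypothetical frequencies at four sites (illustration only).
p1 = [0.0, 0.1, 0.0, 0.5]
p2 = [0.8, 0.6, 0.0, 0.5]
p3 = [1.0, 0.9, 1.0, 0.5]
d_value = patterson_d(p1, p2, p3)  # positive: excess allele sharing between P2 and P3
```

Swapping P1 and P2 reverses the sign of D, which is why the assignment of populations to the tips matters when interpreting the direction of an excess.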

Characterization of genomic regions of introgression

To investigate whether introgression co-varies with recombination rate, we used a previously published recombination map from an Atlantic G. aculeatus cross [59] to interpolate genome-wide recombination rate variation (see Methods). We detected a negative correlation between recombination rate and G_MIN and a positive correlation with f_d (Pearson’s correlation, G_MIN: r = -0.17, P < 2.2 × 10^-16; f_d: r = 0.08, P < 2.2 × 10^-16, Fig S16). Accordingly, mean recombination rate for putatively introgressed regions was more than twice as high as the genome background (G_MIN: valley = 8.98 cM/Mb, non-valley = 3.99 cM/Mb; f_d: peak = 9.64 cM/Mb, non-peak = 4.16 cM/Mb; randomization test P < 2.2 × 10^-16 in both cases; Fig 5B).

Sex chromosomes likely played an important role in speciation between G. aculeatus and G. nipponicus. A fusion between Y and chrIX means that chrIX segregates as a neo-sex chromosome in G. nipponicus but not in G. aculeatus, which only carries the ancestral and shared sex chromosome, chrXIX [36,40]. The divergent XY (G. aculeatus) and X₁X₂Y (G. nipponicus) systems mean that recombination is reduced for chrIX and chrXIX in hybrids carrying the neo-Y [40]. Given this recombination rate reduction and previously identified QTL for traits involved in reproductive isolation that map to chrIX and chrXIX [36,40], we tested whether recent introgression (i.e. measured using G_MIN) was reduced in this part of the genome relative to the autosomes. For this, we repeated our analyses using females only (5 Japan Sea and 6 Pacific Ocean). The number and density of valleys were lowest on the neo-sex chromosome, chrIX (16 valleys or 0.8 valleys per Mb), but not on the ancestral sex chromosome (chrXIX, see Table S4).

Finally, we investigated the nature of introgression between the two species. We first asked whether introgression occurs more frequently in genic or non-genic regions. We identified 3,261 genes occurring in G_MIN valleys and 2,958 genes from f_d peaks between sympatric G. aculeatus and G. nipponicus; 60% of genes identified were found in both types of introgressed window, whereas 23% occurred only in G_MIN valleys and 15% only in f_d peaks (see Fig S17). Irrespective of the method used to detect putatively introgressed regions, the number of genes identified was greater than the number expected by chance (P < 0.0001 based on a null distribution generated from 1,000 random samples of the genome). This suggests that introgression may be more likely in genic regions of the genome than non-genic regions.

To further investigate the functional enrichment of the genes occurring in regions of introgression, we performed gene ontology (GO) analysis on 2,310 G_MIN valley and 2,217 f_d peak genes with orthologs in the human genome. Enriched GO terms for f_d peaks included immune response, metabolic processes and chromatin assembly, while enriched GO terms for G_MIN valleys included major histocompatibility complex (MHC) protein and metabolic processes (Table S5 & S6).

Discussion

Japanese stickleback speciation has occurred in the face of on-going gene flow

Determining the demographic and evolutionary history of species pairs is an important first step for understanding how speciation has unfolded in any system [7,12]. Our present study has produced several lines of evidence indicating that divergence between the Japanese sticklebacks has occurred in the presence of gene flow.

Firstly, contrasting mitochondrial and nuclear genome phylogenies show that mitochondrial introgression has occurred from G. nipponicus into G. aculeatus at some point in the last 0.39 million years. Our mitogenome phylogeny confirmed previous findings that there is no mitochondrial divergence between G. nipponicus and Japanese populations of G. aculeatus [44,45]. This is in contrast to our nuclear autosomal phylogeny, which showed that the majority of the genome supports a clear split between G. nipponicus and G. aculeatus occurring in Japan and that the latter shares a more recent common ancestor with Atlantic European G. aculeatus populations. In short, mitogenome data cluster the Gasterosteus lineages by geography, while the nuclear data cluster them by species. Introgression is the most likely explanation for this mitonuclear discordance, as the lower effective population size of the mitochondrial genome relative to the autosomes makes ILS unlikely, particularly over a 1 million year time-scale of divergence.

Disparities in effective population size between lineages are a common cause of unidirectional mitonuclear introgression, with introgression likely occurring from a larger to a smaller population [48]. Our reconstruction of temporal variation in effective population size using PSMC showed a rapid population expansion of G. nipponicus during the late Pleistocene that created a large demographic disparity with the G. aculeatus Pacific Ocean lineage, although it should be noted that admixture can increase effective population size estimates when using PSMC [60,61]. Reasons for this population growth remain unclear, but it is surprising, particularly since G. nipponicus is unable to colonise freshwater environments [42,58]; freshwater colonisation might otherwise be expected to increase effective population size due to meta-population dynamics. A possible explanation is that a lack of dependence on freshwater for spawning [41,62,63] and greater foraging efficacy on marine prey [42] mean that G. nipponicus is better adapted for exploiting an abundant marine environment. Unidirectional mitochondrial introgression might also be caused by female mate choice [64]. Our previous behavioural studies indicate that Japan Sea females often mate with Pacific Ocean males, while Pacific Ocean females rarely mate with Japan Sea males [36,37]. Hybrid females from Japan Sea female and Pacific Ocean male crosses are fertile [37] and will carry Japan Sea mitochondrial DNA. Backcrossing of these hybrids to Pacific Ocean males would result in unidirectional mitochondrial introgression from the Japan Sea to the Pacific Ocean.

Secondly, our ABC analysis supported a model of isolation with migration. Previously, it has been speculated that the Japan Sea stickleback diverged largely as a result of geographical isolation in the Sea of Japan caused by sea level fluctuation during the early Pleistocene [37,39]. Using ABC, we were able to explicitly test several divergence hypotheses in a statistical framework [65]; our findings suggest that gene flow has likely occurred throughout the majority of divergence history. It should be noted that ABC and most established demographic inference methods perform poorly when resolving the timing of gene flow between lineages [66,67]. Therefore, one caveat to the interpretation of our ABC results is that we cannot rule out the possibility that the two species diverged in repeated cycles of contact (i.e. akin to our IARM model, which had the second highest level of support; Table 1), but that these periods of contact were simply too close in time to be resolved. Nonetheless, the pooled posterior probabilities from the analysis overwhelmingly support a model of divergence with gene flow irrespective of the timing or nature of the actual speciation event. The presence of extant recent hybrids in sympatry also strongly indicates that hybridization is still on-going. We observed a probable F₁ hybrid in the wild and several other individuals with evidence of hybrid ancestry in our RAD-seq dataset, consistent with previous studies that observed wild caught hybrids [36,68]. This provides direct observation of admixture in the wild.

Finally, lower levels of genome-wide divergence (both F_ST and d_XY) between sympatric pairs compared to allopatric pairs also indicate the presence of gene flow. Our G_MIN and f_d genome scans showed a higher number of putatively introgressed regions between G. nipponicus and Japanese Pacific G. aculeatus than between G. nipponicus and Atlantic G. aculeatus, suggesting that introgression has been occurring even after the Atlantic and Pacific stickleback populations diverged approximately 400,000 years BP. Our partitioned D statistics suggested that, in sympatry, gene flow from G. nipponicus into Japanese Pacific G. aculeatus may predominate over gene flow in the opposite direction.

High genomic divergence at a late stage of speciation with gene flow

Compared to young species pairs, less is known about the patterns of genomic differentiation at more advanced stages of speciation with gene flow. Our study provides insight on this under-represented stage of divergence. Our ABC analyses placed the estimated divergence time of G. aculeatus and G. nipponicus at 0.68 million years BP. Similarly, our Bayesian coalescent analysis of mitogenome divergence revealed a 1.3 million year split between the Japanese and Atlantic-Pacific Gasterosteus mitochondrial clades. An older divergence time is somewhat expected from the mitochondrial genome, given its fourfold lower effective population size [48]. Because splitting of mitochondrial lineages can occur by geographical structuring without speciation [69,70], our estimate of mitochondrial divergence may reflect more ancient geographical structuring that occurred prior to species divergence. Nonetheless, both mitochondrial and nuclear split estimates suggest that divergence between G. aculeatus and G. nipponicus occurred well before the end of the last glacial period. Therefore the Japanese stickleback system is older than all other previously examined postglacial sympatric or parapatric species pairs, which have typically diverged within the last 20,000 years [30].

The Japanese stickleback system also has a mean genome-wide F_ST value of 0.628, higher than any other sympatric or parapatric stickleback species pair studied so far (Fig 3C), placing this pair at the far end of the speciation continuum. A recent meta-analysis quantifying genetic differentiation pointed to a lack of evidence of sympatric or parapatric species pairs with F_ST between 0.3 and 0.7 [8]. The Japanese species pair is therefore an example of an under-represented sympatric species pair that maintains high divergence in the face of gene flow. Within the stickleback species complex, this high divergence is in stark contrast to the typical low baseline divergence interspersed with regions of high differentiation observed in more commonly studied species pairs such as lake-stream or freshwater-anadromous pairs [16,33,71].

The primary explanation for the observed elevated divergence is most likely the more ancient divergence time of the Japan Sea-Pacific Ocean species pair compared to postglacial species pairs [16,72]. However, the results of our demographic analyses indicate that high divergence is not due to a long period of allopatric isolation without gene flow, contrary to what has previously been suggested [37,39]. This is important, as failing to account for variation in evolutionary history among species pairs placed on a continuum will obscure the processes leading to higher differentiation as speciation progresses. A further explanation for the high genomic divergence is the presence of strong isolating barriers between the Japan Sea and Pacific Ocean sticklebacks. Total reproductive isolation (0.970) is greater than in all postglacial species pairs (0.716-0.895) [43] and arises from a combination of habitat [41,42], temporal [62] and sexual isolation, and hybrid sterility [36,37]. Recent theoretical studies have shown that selection on many barrier loci in the face of gene flow may result in a transition from low to high differentiation as a result of ‘genome-wide congealing’ [10,19]. It is important to note that we lack evidence that such a transition might explain the high differentiation we see here relative to the rest of the stickleback continuum (Fig 3C). However, pervasive selection on multiple isolating barriers in the Japanese stickleback system does suggest that this process could have contributed to the high genomic differentiation we observe [5].

Localized introgression at a late stage of speciation with gene flow

Our study has also demonstrated two important signatures of introgression in the Japanese sympatric stickleback pair. Firstly, levels of background genome differentiation between G. aculeatus and G. nipponicus estimated by F_ST were lower in sympatry compared to allopatry. The higher overall genetic differentiation between G. nipponicus and Atlantic G. aculeatus is likely due to genetic drift and local adaptation and the fact that these two lineages have never overlapped geographically. Secondly and strikingly, we identified small regions of localised introgression dispersed throughout the genome when G. nipponicus and G. aculeatus co-occur in sympatry. These introgression regions were measured using G_MIN, the ratio of minimum d_XY to mean d_XY, where low values indicate introgression [26] and f_d, the proportion of introgressed sites in a genome window [54].
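
The G_MIN definition given above can be illustrated with a short Python sketch. It assumes phased haplotypes represented as equal-length strings and takes d_XY as the mean proportion of differing sites over all between-population haplotype pairs; the function names and toy sequences are hypothetical and are not the script used for the genome scans.

```python
from itertools import product


def pairwise_diff(hap_a, hap_b):
    """Proportion of sites that differ between two equal-length haplotypes."""
    if len(hap_a) != len(hap_b):
        raise ValueError("haplotypes must have the same length")
    return sum(a != b for a, b in zip(hap_a, hap_b)) / len(hap_a)


def gmin(pop_x, pop_y):
    """Return (G_MIN, d_XY) for one window from two lists of haplotypes."""
    dists = [pairwise_diff(x, y) for x, y in product(pop_x, pop_y)]
    d_xy = sum(dists) / len(dists)  # mean between-population distance
    if d_xy == 0:
        raise ValueError("no between-population differences in this window")
    return min(dists) / d_xy, d_xy  # ratio of minimum to mean distance


# Toy window: the last haplotype in pop_y is identical to one in pop_x,
# mimicking a recently introgressed haplotype (illustration only).
pop_x = ["AAAAAAAAAA", "AAAAAAAAAC"]
pop_y = ["CCCCAAAAAA", "CCCCAAAAAA", "AAAAAAAAAC"]
g_introgressed, dxy_introgressed = gmin(pop_x, pop_y)
g_background, dxy_background = gmin(pop_x, pop_y[:2])
```

In this toy window a single shared haplotype drives G_MIN to zero while d_XY remains well above zero, which is the property that makes the ratio sensitive to rare, recent introgression.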

Several methodological issues might influence these measures of introgression. Firstly, both G_MIN and f_d are sensitive to sample size; fewer individuals will mean rare haplotypes have a lower sampling probability. However, by repeating our analyses using only females, a much smaller sample than in our main analysis, we still identified clear signals of introgression. Secondly, G_MIN will be biased downwards if a recently backcrossed individual is included in the dataset. All Japanese G. aculeatus and G. nipponicus used in the study were identified as ‘pure’ individuals by genotyping at multiple microsatellite loci prior to resequencing [36,40]. Furthermore, we examined the identity of the two haplotypes producing the lowest value of d_XY in G_MIN valleys and confirmed that these came from different individuals in each case (data not shown). Nonetheless, even a single haplotype segregating in a population due to ILS can lower the value of G_MIN; i.e. ILS could produce small, interspersed G_MIN valleys. However, ILS cannot explain why more introgressed regions occur in sympatry (Fig 5A) or why there is a clear association between an increased proportion of introgressed sites (i.e. f_d) and G_MIN (Fig S9 & S13).

What then underlies the localised pattern of introgression we observe? One possible explanation is the fact that many isolating barriers are involved in reproductive isolation [36,43]. Although the genomic basis of these isolating barriers remains unknown, it is likely that barrier loci occur throughout the genome; pervasive selection at multiple loci is expected to limit the extent of introgression at this scale [73]. The strength and extent of selection at a barrier locus and the genomic regions linked to it depend on recombination rate [73]. Recombination determines effective migration rate [74]; when recombination is high, neutral and adaptive loci linked to the target of negative selection in the recipient population have a greater probability of escaping removal and so their probability of introgression is greater [3]. Selection is also more efficient in these high recombination rate regions because of their increased effective population size, so deleterious introgressed variants are more likely to be removed. The expectation then is that signatures of introgressed neutral or adaptive alleles are most likely to persist in regions of the genome where recombination rate is sufficiently high, and indeed, the positive association between introgressed regions and recombination rate we observed supports this (Fig 5B, Fig S16). Introgression is typically lower on sex chromosomes relative to autosomes in multiple taxa due to the effects of reduced recombination and greater exposure to selection in the hemizygous sex [75]. The sex chromosomes play an important role in the Japanese stickleback system, harbouring QTL for hybrid sterility and behavioural isolation [36]. Consistent with this, we observed lower introgression on the neo-sex chromosome (Fig 5E & F), although we cannot exclude the possibility that the fusion occurred more recently than the speciation event, so that the opportunity for introgression on the neo-sex chromosomes was simply low relative to the rest of the genome. Taken together, our findings suggest that strong divergent selection and recombination rate variation may determine the localised signature of introgression in the genome.

The nature of gene flow in the Japanese stickleback system may also give some clues as to why we observe such highly localised introgression. One possibility is that a proportion of the introgression we detected is adaptive; i.e. it is maintained because of either directional or balancing selection. Adaptive introgression has been detected in a wide range of taxa [76], including humans [77]. However, the expected signatures of the process remain unclear – especially when introgression is widespread in the genome, as is possibly the case here. Our GO analyses suggest an enrichment of immune response genes, including MHC genes, and metabolism genes in introgressed regions. Immune genes have been identified as being under balancing selection between hybridising taxa, particularly plants [78] and birds [79]. Several genes involved in metabolism are also reported to be under balancing selection in humans [80]. Furthermore, recent analysis suggests that negative frequency dependent selection might result in introgression of rare MHC alleles between divergent stickleback ecotypes [81]. Further research is necessary to directly test whether this process might explain introgression in the Japanese stickleback system.

Conclusion

Much of our knowledge of how genomic differentiation builds along the speciation continuum is drawn from studies focusing on young, allopatric or completely reproductively isolated species pairs. Very few examples of species pairs at a later stage of divergence with on-going gene flow have been investigated. Here, we have shown that the Japan Sea and Pacific Ocean species pair exemplifies this under-represented stage of speciation and is situated at the far end of the stickleback species continuum. The high genomic differentiation between the species may be due to a more ancient divergence time than previously studied postglacial species pairs, selection on multiple isolating barriers or a combination of the two. Despite high differentiation, gene flow is on-going between the species and we identified localized signatures of introgression throughout the genome. Although the cause of the localized nature of the introgression remains unclear, selection – either directional or balancing – may play some role in promoting it. Overall, our study demonstrates that high levels of genomic divergence can be established and maintained in the presence of gene flow. Further genomic studies on more species pairs at late stages of speciation with gene flow will help to understand the generality of the patterns seen here.

Materials and Methods

Ethics Statement

All animal experiments were approved by the institutional animal care and use committee of the National Institute of Genetics (23-15, 24-15, 25-18).

Sample collection, whole genome resequencing and RAD sequencing

Collection and sequencing of all Japanese individuals used for whole genome resequencing, except the allopatric Japan Sea fish, have been described previously [40]. Briefly, sympatric populations were captured from the Akkeshi system in Hokkaido, Japan in 2006 (Fig 1C). The allopatric Japan Sea female was collected in Lake Shinji in March 2014. The outgroup species, G. wheatlandi, was captured from Demarest Lloyd State Park, MA, USA in 2007, as described previously [40]. Libraries were constructed with the TruSeq DNA Sample Preparation Kit (Illumina) and whole-genome 100 bp paired-end sequencing was performed on an Illumina HiSeq2000 at the National Institute of Genetics (sympatric JS and PO) and the Functional Genomics Facility, NIBB Core Research Facilities (allopatric JS) [40]. Whole genome sequencing of North American marine and stream populations collected from Little Campbell River, BC, Canada was reported previously [52,82]. For the six Atlantic G. aculeatus individuals (North Sea) included in the study, we used previously published sequences [46].

Japanese individuals used for RAD sequencing have been described previously [58]. Samples used for RAD sequencing from the Atlantic lineage were collected from across Ireland in 2009-2011 [70,83]. DNA was extracted using a Qiagen DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA, USA). Single digest RAD-sequencing was performed using SbfI following a standard protocol [84]. RAD library preparation and 100 bp single-end sequencing on an Illumina HiSeq were conducted by Floragenex (Oregon, USA).

Accession numbers, sample names and locations for all genome and RAD-seq samples are listed in Table S6.

Whole genome alignment and variant calling

Sequence reads were mapped to the stickleback reference genome using CLC Genome 8.0 as described previously [40]. Alignments were exported as bam files and were sorted and indexed using samtools 1.2 [85]. Variant calling was carried out in two different phases. The first phase was used to call consensus bases at all sites (i.e. variant and invariant) across the genome for all 27 resequenced individuals and the outgroup (G. wheatlandi). Mapped reads from all individuals were piled-up using samtools mpileup and called against the stickleback BROAD reference genome using the bcftools 1.2 consensus caller without any filters. This consensus call produced a vcf file with a base call for every position in the genome for all samples (27 + 1 outgroup). Consensus calls from this phase were used in later demographic inference using PSMC and ABC. The aim of the second, more stringent variant calling phase was to produce a subset of high-quality polymorphic SNPs with which to examine genome-wide differentiation between the Japan Sea, Pacific and Atlantic Ocean lineages. We used bcftools to filter the consensus-call vcf for these three lineages, only retaining sites with a Phred Quality score >10, and with a maximum individual read depth of 200. Additional downstream filters were applied prior to estimating population genomic parameters from this subset vcf (see below).

Mitochondrial genome divergence

To estimate divergence times based on mitochondrial DNA, we performed Bayesian coalescent analysis using BEAST v2.2.1 [86]. From our resequencing data, we extracted the whole mitochondrial genome from the 26 Japan Sea, Pacific and Atlantic Ocean individuals. We also downloaded two G. wheatlandi whole mitogenomes as outgroups (NCBI accession numbers: AB445129 & NC011570). Note that due to poor sequence coverage across the mitogenome we excluded our own resequenced G. wheatlandi individual here. Mitogenomes were aligned using MUSCLE v3.8.31 [87], resulting in a 16,549 bp final alignment.

Although there is a considerable three-spined stickleback fossil record, it is unfortunately of little use for providing fossil calibration dates for splits within the Gasterosteus genus [88,89]. However, biogeographical events can also be used to calibrate node estimates and as such we used a normal prior (mean = 1.5 million years, SD = 0.75 million years) on the split between the Japan Sea and Pacific Ocean G. aculeatus lineages. We provided a further normal prior on the split date between the Pacific and Atlantic Ocean mitochondrial lineages (mean = 0.5 million years, SD = 0.25 million years). The latter prior distribution was intentionally made wide to reflect uncertainty surrounding this estimate. Initial analyses with BEAST indicated that marginal prior distributions for node ages did not behave as specified in the model and instead returned extremely recent divergence times with low likelihood support. This is a common bias in coalescent divergence time dating and use of a calibrated prior removed this issue [90,91]. As a result, we performed all further analyses with a calibrated Yule prior. Incorrect choice of molecular clock model can seriously bias coalescent estimates of lineage divergence times and so care must be taken to ensure the appropriate model is chosen [92,93]. We used path-sampling analysis in BEAST to estimate model marginal likelihoods for three different clock models – strict, relaxed lognormal and relaxed exponential. For each model, Markov chain Monte Carlo (MCMC) was run for 5 × 10⁷ iterations with 60 steps and marginal likelihoods were calculated using BEAST. We then ran the final model using two independent MCMC runs of 10⁸ iterations each. Runs were assessed in TRACER [94] to ensure convergence and that ESS values were > 200 – i.e. that the posterior was adequately sampled. Independent runs were then combined to produce posterior estimates of divergence times and substitution rates.

Nuclear phylogenetic analysis and genealogical sorting index (gsi)

To investigate nuclear phylogenetic discordance, we constructed maximum likelihood trees from consensus sequences for non-overlapping 10, 50 and 100 kb sliding windows following Martin et al [28]. The best-fit tree was estimated for each window using RAxML with a ‘GTRGAMMA’ model and a random number seed [95]. Trees were classified using a custom R script (https://github.com/markravinet) that binned trees based on whether they matched three different topologies; species, geography, ancestral – or were unresolved. For the species category, all Atlantic, Pacific and Japan Sea individuals form separate monophyletic groups; for the geography category, Japan Sea and Pacific Ocean form a monophyletic group separate to the Atlantic Ocean; trees where the Atlantic Ocean grouped monophyletically with the Japan Sea were classed as ancestral. Trees that did not fit any of these categories were classified as unresolved. Following categorisation, trees were then standardised to ensure equal branch lengths using the compute.brlen function from the Phytools R package [96] and were finally visualised for each gene tree class using the densiTree function in the R package Phangorn [97].

We additionally used the non-overlapping Maximum Likelihood phylogenies to calculate genealogical sorting index (gsi) [47]. We used a custom R script to estimate gsi across the autosome of 26 resequenced individuals. This allowed us to compare autosomal signals of introgression with a reduction in gsi.

Population size change over time

We used PSMC to estimate fluctuations in effective population size over time [98]. PSMC uses the density of heterozygous sites across a single diploid genome to estimate blocks of constant TMRCA that are split by recombination and then uses these to infer ancestral effective population sizes (N_e) over time [61,98]. Since PSMC can only analyse a single genome at a time, we ran the program separately on each of the 26 resequenced genomes from the Japan Sea, Pacific and Atlantic Ocean lineages. We additionally ran the analyses for a resequenced genome of a marine ecotype fish from Little Campbell River, Canada as a representative of the Eastern Pacific. Consensus sequences for each genome were converted to PSMC format - a binary format indicating the presence/absence of heterozygous sites within a specified window. We used 100 bp windows, requiring a minimum of 10,000 ‘good’ sites to be present on a genome scaffold in order for it to be included; heterozygous sites with a Phred Quality score <20 were removed. We then ran PSMC for 30 iterations with a maximum coalescent time of 15 (measured in units of 2N_0, where N_0 is ancestral population size). Due to the difficulty of inferring past effective population sizes across this time, PSMC requires the user to provide intervals which are combined to produce the same effective population size [98]. Since this method is least accurate for recent (i.e. < 20 kyr BP) and more ancient periods [98], we estimated N_e for 47 intervals, combining the first four and the last three using the command “4+19*2+4”. To scale our results from coalescent units, we assumed a generation time of 1 year [99] and used an autosomal substitution rate of 7.1 × 10^-9 [100]. Finally, to provide confidence intervals for our N_e estimates, we performed 100 bootstraps on 500 kb segments for each analysis.
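
The scaling from coalescent units to years and individuals can be sketched as follows. This is a minimal illustration that assumes the usual PSMC convention, in which the program reports a per-window scaled mutation rate θ₀, interval boundaries in units of 2N_0 and relative population sizes λ, and in which the substitution rate above is treated as a per-generation rate. The function name and the example values of θ₀, time and λ are hypothetical.

```python
def scale_psmc(theta0, t_k, lambda_k, mu=7.1e-9, bin_size=100, gen_time=1.0):
    """Convert PSMC output from coalescent units to years and individuals.

    theta0   : scaled mutation rate per window (4 * N_0 * mu * bin_size)
    t_k      : interval boundaries in units of 2 * N_0 generations
    lambda_k : population size in each interval relative to N_0
    """
    n0 = theta0 / (4.0 * mu * bin_size)          # N_0 implied by theta0
    years = [2.0 * n0 * t * gen_time for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return n0, years, ne


# Hypothetical values chosen only to give round numbers.
n0, years, ne = scale_psmc(theta0=0.284, t_k=[0.0, 0.1, 1.0], lambda_k=[1.0, 2.0, 0.5])
```

With these illustrative inputs, N_0 is approximately 100,000, so one coalescent time unit corresponds to roughly 200,000 years at a generation time of 1 year.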

Approximate Bayesian Computation (ABC)

We used ABC to test different scenarios of divergence between the Japan Sea and Pacific Ocean lineage and to estimate demographic parameters, such as divergence time and migration rate, under these scenarios.

To obtain loci suitable for our ABC analysis, we randomly sampled nuclear loci from the 20 resequenced genomes (sympatric Japan Sea and Pacific Ocean) using a similar approach to Nadachowska-Brzyska et al [14]. Using a custom R script, we produced a bed file of reference genome coordinates for 2 kb loci randomly sampled at 125 kb intervals, resulting in 2,378 potential loci per individual. Using a custom python script, we called consensus sequences for each locus from the consensus vcf. This script created two haplotype sequences for each of the 2 kb loci, randomly assigning heterozygous variants to one of the two called haplotypes; this step allowed us to use unphased data for demographic analyses [66,101]. We then further filtered these loci to include only those that occur on autosomes, with >1,000 bp of sequence and no more than 30% missing data (i.e. > 70% of bases were called) for all 20 individuals. This resulted in a final dataset of 1,874 loci. Functions and scripts for generating coordinates and extracting and filtering consensus sequences are available on GitHub (https://github.com/markravinet/genome_sampler).
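
The random assignment of heterozygous variants to two haplotypes can be sketched in Python as below. This is an illustration of the idea rather than the script in the repository: it assumes, for simplicity, that the diploid consensus for a locus is a string in which heterozygous sites are encoded with two-allele IUPAC ambiguity codes, whereas the actual pipeline works from the consensus vcf. The function name and the toy sequence are hypothetical.

```python
import random

# Two-allele IUPAC ambiguity codes and the bases they represent.
IUPAC_HET = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def split_haplotypes(consensus, rng):
    """Split an unphased diploid consensus into two pseudo-haplotypes.

    At each heterozygous site the two alleles are assigned to the two
    haplotypes in random order; all other characters (including N) are
    copied unchanged to both haplotypes.
    """
    hap_a, hap_b = [], []
    for base in consensus.upper():
        alleles = IUPAC_HET.get(base)
        if alleles is None:
            hap_a.append(base)
            hap_b.append(base)
        else:
            first, second = alleles[0], alleles[1]
            if rng.random() < 0.5:  # random phase for this site
                first, second = second, first
            hap_a.append(first)
            hap_b.append(second)
    return "".join(hap_a), "".join(hap_b)


# Toy locus with two heterozygous sites (R = A/G, Y = C/T) and one missing base.
hap_a, hap_b = split_haplotypes("ACRTNY", random.Random(1))
```

Because phase is assigned independently at each site, the resulting sequences preserve per-site diversity but not linkage between heterozygous sites, so they are only suitable for summary statistics that do not rely on phase.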

Following Robinson et al [66] we used a custom R-based control script and msABC [102] to perform simulations, calculate summary statistics and quantify their distribution across the genome in a single step. This approach offers considerable flexibility in establishing prior probability distributions for each of the estimated parameters. Furthermore, given the large size of our dataset (i.e. approximately 2,000 loci for 20 individuals) each simulation produces a large amount of data, making storage a challenge. Using R to interface with msABC allowed us to greatly reduce the required data storage.

For each of the 15 models we performed 10⁶ simulations. We used a combination of GNU Parallel [103] and independent runs across multiple computing cores to reduce analysis time to approximately 1 day per model (scripts and additional instructions available on GitHub: https://github.com/markravinet).

We initially ran our simulations to produce all the available summary statistics that msABC calculates. However, since summary statistic choice can greatly alter the outcomes of ABC analyses [104,105], all post-simulation ABC analyses were conducted using subsets of 29, 20 and 12 summary statistics. Following completion of the simulation step, we performed a neural-network rejection step on log-transformed parameter estimates with a tolerance of 0.01 using the abc function in the R package abc [106]. Posterior probability was estimated for each model using the postpr function in the same package with multinomial logistic regression for a range of tolerance values representing 0.1%, 0.5%, 1% and 3% of the simulated data (i.e. 1,000, 5,000, 10,000 and 30,000 datasets respectively). In keeping with a hierarchical analysis [14], we performed two rounds of model selection. We first chose the growth model with the highest posterior probability within each divergence scenario. Following this, we performed model selection on the five models with the highest support within each divergence category.
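
To make the role of the tolerance concrete, the sketch below implements a plain rejection step in Python: simulations are ranked by their distance to the observed summary statistics and the closest fraction is retained. It is deliberately simpler than the analysis described above, as it omits the neural-network regression adjustment and the log transformation, and it scales each statistic by its standard deviation, which is an assumption of this example. All names and the toy model are hypothetical.

```python
import math
import random


def abc_reject(observed, sim_stats, sim_params, tolerance=0.01):
    """Return the parameters of the simulations closest to the observed data.

    observed   : list of observed summary statistics
    sim_stats  : list of lists, one row of summary statistics per simulation
    sim_params : list of parameter values, one per simulation
    tolerance  : proportion of simulations to retain
    """
    n_stats = len(observed)
    scales = []
    for j in range(n_stats):
        column = [row[j] for row in sim_stats]
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
        scales.append(sd if sd > 0 else 1.0)  # avoid division by zero
    # Scaled Euclidean distance between each simulation and the observation.
    dists = [
        math.sqrt(sum(((row[j] - observed[j]) / scales[j]) ** 2 for j in range(n_stats)))
        for row in sim_stats
    ]
    n_keep = max(1, int(round(tolerance * len(sim_stats))))
    closest = sorted(range(len(dists)), key=dists.__getitem__)[:n_keep]
    return [sim_params[i] for i in closest]


# Toy model: one parameter with a uniform prior and one noisy statistic.
rng = random.Random(42)
true_params = [rng.uniform(0.0, 10.0) for _ in range(10_000)]
sim_stats = [[p + rng.gauss(0.0, 0.5)] for p in true_params]
accepted = abc_reject([5.0], sim_stats, true_params, tolerance=0.01)
posterior_mean = sum(accepted) / len(accepted)
```

A tolerance of 0.01 retains 100 of the 10,000 toy simulations here; applied to 10⁶ simulations, the same tolerance retains 10,000 datasets.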

In order to ensure our ABC approach was reliable, we used pseudo observed datasets (PODs) to assess how well we could discriminate between different divergence scenarios. Essentially, this involves randomly selecting a series of simulated datasets from a known model (hence pseudo-observed) and then rerunning the model selection procedure to see whether the true model can be recovered. For further details of our POD-based sensitivity analysis and ABC approach, see the Supplementary methods section.

Detecting genome-wide divergence and recent introgression

Weir and Cockerham’s F_ST [107] was calculated using 10 and 50 kb non-overlapping windows with VCFtools 0.1.13 [108]. To calculate haplotype-based statistics such as d_XY, G_MIN and f_d, we used a modified version of a Python script used by Martin et al. [28]. In addition to our main filters on the dataset (see Genome Alignment and Variant calling), we only calculated these haplotype-based statistics for windows with >50% of useable bases, i.e. >5,000 sites within a 10 kb sliding window. For autosomal statistics, all individuals were included in the analyses. To compare the ancestral (chrXIX) and neo-sex chromosomes (chrIX) with autosomes (Fig. 5E and 5F), we re-ran the analyses of all chromosomes using only females. In addition to 10 kb windows, we also performed analyses for non-overlapping 50 kb windows to aid visualisation; the results from all analyses were then combined into a single dataset using custom R scripts.
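As a minimal sketch of the useable-site filter described above, the following computes d_XY in non-overlapping windows and skips windows where too few sites are callable. It is not the modified script of Martin et al.; the function name, the use of "N" to mark unusable sites and the toy haplotypes are assumptions made for illustration.

```python
def dxy_windows(pop_x, pop_y, window=10000, min_usable=0.5):
    """Windowed d_XY between two lists of equal-length haplotype strings.

    A site is treated as useable only if no haplotype carries 'N' there.
    Windows with useable sites <= min_usable * window get d_XY = None.
    Returns a list of (start, end, n_usable, d_xy) tuples.
    """
    length = len(pop_x[0])
    results = []
    for start in range(0, length, window):
        end = min(start + window, length)
        usable = 0
        diff_sum = 0.0
        for pos in range(start, end):
            col_x = [h[pos] for h in pop_x]
            col_y = [h[pos] for h in pop_y]
            if "N" in col_x or "N" in col_y:
                continue  # masked or missing site
            usable += 1
            # Mean number of differences over all between-population pairs.
            mismatches = sum(a != b for a in col_x for b in col_y)
            diff_sum += mismatches / (len(col_x) * len(col_y))
        if usable > min_usable * window:
            results.append((start, end, usable, diff_sum / usable))
        else:
            results.append((start, end, usable, None))
    return results


# Toy example with 4 bp windows; the second window is mostly masked.
toy_x = ["AAAANNNA", "AAAANNNA"]
toy_y = ["AATTNNNT", "AAATNNNT"]
for row in dxy_windows(toy_x, toy_y, window=4):
    print(row)
```

Dividing by the number of useable sites rather than the window length keeps d_XY comparable between windows with different amounts of missing data.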

We calculated two recently established statistics, G_MIN and f_d, for detecting introgression between divergent lineages [26,54]. G_MIN is particularly suited for identifying recent, low frequency introgression [26] whereas f_d can also identify stronger, high frequency introgression events [54]. Importantly, both methods are robust to variation in recombination rate. Initial genome scans conducted using G_MIN revealed a series of ‘valleys’ present across the genome. Detection of such valleys, like genomic islands of divergence, presents a variety of methodological issues. Firstly, how do we determine that G_MIN valleys are not due to stochastic variation in genealogy amongst loci? Secondly, how do we measure the size and distribution of valleys of introgression? Finally, how can we determine a null or expected distribution of valleys across the genome to test for the under- or overrepresentation of valleys?

To deal with each of these issues in turn, we first performed chromosome-specific permutations to identify the null distribution of the value of G_MIN. Specifically, we shuffled the nucleotide sequence of each chromosome 100 times and estimated G_MIN for 11 different sliding window sizes (5,000, 5,500, 6,000, 6,500, 7,000, 7,500, 8,000, 8,500, 9,000, 9,500 and 10,000 bp), representing the distribution of useable sites from the empirical dataset. We then used the lower 99th percentile of the permutations to determine the mean value of G_MIN below which a window could be classified as a valley.

Identifying the boundaries of divergent genome regions is somewhat subjective and open to potential bias [109]. To account for this, we used a hidden Markov model (HMM) approach to classify windows into two states, i.e. valleys or non-valleys, and to estimate the probability of state transition. Following Soria-Carrasco et al. [55], we used the R package HiddenMarkov [110] on a logit-transformed G_MIN distribution. Transition probabilities between the two states were symmetrical, with an emphasis on it being difficult to transition between states (p = 0.1) but relatively easy to remain within a state (p = 0.9). Since valleys are relatively rare in the genome, we set our models to start in the non-valley state and we provided estimated parameter values for the states based on the empirical distribution. Our permutation test to find the 99th percentile therefore also provided an empirical basis for identifying valleys. HMM estimates were run for both the sympatric and allopatric comparisons using the baumwelch function to estimate parameters using the Baum-Welch algorithm and the viterbi function to estimate the sequence of states using the Viterbi algorithm. We used a similar approach to identify f_d peaks but we instead performed the analysis using untransformed f_d values only in the sympatric Japanese G. aculeatus and G. nipponicus comparison.
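The state-decoding step can be illustrated with a small two-state Viterbi decoder. This sketch does not use the HiddenMarkov package and does not perform Baum-Welch estimation: emission means and standard deviations are fixed by hand, Gaussian emissions on the logit scale are an assumption, and all names and toy values are hypothetical.

```python
import math


def logit(p, eps=1e-6):
    """Logit transform, clamping p away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_two_state(values, means, sds, stay=0.9, start_state=0):
    """Most likely state path for a two-state HMM with Gaussian emissions.

    State 0 = non-valley, state 1 = valley. Transitions are symmetric:
    probability `stay` of remaining, 1 - stay of switching. The chain is
    forced to start in `start_state`.
    """
    states = range(2)
    log_trans = [[math.log(stay if i == j else 1 - stay) for j in states]
                 for i in states]
    score = [log_normal_pdf(values[0], means[s], sds[s])
             if s == start_state else float("-inf") for s in states]
    back = []
    for x in values[1:]:
        new_score, pointers = [], []
        for j in states:
            best_prev = max(states, key=lambda i: score[i] + log_trans[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][j]
                             + log_normal_pdf(x, means[j], sds[j]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy G_MIN scan with a three-window valley in the middle.
toy_gmin = [0.9, 0.88, 0.91, 0.3, 0.25, 0.28, 0.9, 0.92]
toy_path = viterbi_two_state([logit(g) for g in toy_gmin],
                             means=(logit(0.9), logit(0.3)), sds=(0.5, 0.5))
print(toy_path)
```

Runs of consecutive windows assigned to state 1 then give the boundaries and sizes of candidate valleys.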

RAD-seq data processing and admixture analysis

RAD sequence reads were demultiplexed and processed using the process_radtags module of Stacks 1.30 [111]. All reads were trimmed to 90 bp and any read where the average Phred quality score dropped below 10 in a 9 bp sliding window was discarded. Following filtering, reads were mapped to the Roesti et al. [59] build of the G. aculeatus genome using GSNAP [112], allowing a maximum of two indels to be present in an alignment, reporting no suboptimal hits, allowing a maximum of 8 mismatches and printing only the best alignment. SNPs were then called using the samtools and bcftools pipeline [113]. Called variants were then filtered using vcftools to remove all sites with greater than 25% missing data, to retain only genotypes with an individual depth between 15X and 100X, and to remove all sites with a Phred quality score below 20 or a minor allele frequency below 0.05. Since common admixture analyses assume independence among sites (i.e. the absence of linkage disequilibrium) [114], we additionally pruned our RAD-derived SNP dataset using plink [115], removing all sites where pairwise linkage disequilibrium was greater than 0.4.

PCA on allele frequencies from all individuals was conducted using the glPca function from the R package adegenet [116]. Admixture analysis was carried out on a subset of samples from the Japanese archipelago using STRUCTURE [50,51]. For each value of K from 1 to 8, the program was run for 10 iterations with a burn-in of 10,000 steps followed by 20,000 MCMC steps. The most likely value of K was assessed using STRUCTURE HARVESTER [117].

Detecting the direction of introgression

We investigated the direction of gene flow between the Japan Sea and Pacific Ocean lineages using partitioned D statistics [56]. This is conceptually similar to standard four population ABBA-BABA tests for gene flow but includes a fifth population, an allopatric lineage of the Japan Sea. This balances the assumed phylogeny (i.e. ((P1, P2), (P3₁, P3₂), O)) and therefore allows us to rotate the populations used in the analysis, i.e. to test for an enrichment of gene flow in both directions. We therefore tested two topologies, ((AT, PO), (JS_S, JS_A), O) and ((JS_A, JS_S), (PO, AT), O) (see Fig S14). For either test topology, an excess of ABBAA (compared to BABAA) or ABBBA (compared to BABBA) site patterns in a genome window inflates partitioned D statistics above zero, indicating gene flow from P3 into P2.

Given that the partitioned D approach has attracted some criticism, we also calculated D_FOIL statistics [57]. D_FOIL is an additional extension of the four population test but one that incorporates all possible introgression events for a symmetric four population tree (excluding the outgroup). We used the same test phylogeny as with the partitioned D statistics (see Figs S14 & S15).

Both partitioned D and D_FOIL are based on ABBA/BABA methods, i.e. where only a single individual is present at the tips of the phylogeny. To account for this, we extended both methods to use allele frequency data, so that our site pattern counts are weighted by allele frequencies [54]. To calculate both D and D_FOIL statistics, we used a modified version of a Python script used by Martin et al. [28].
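The idea of weighting site patterns by allele frequencies can be shown on the standard four-population D statistic. The sketch below is not partitioned D or D_FOIL, and it is not the script used here; it only illustrates how derived allele frequencies replace single-individual pattern counts. The function name and the input layout (one tuple of derived allele frequencies per site) are assumptions.

```python
def frequency_weighted_d(sites):
    """Four-population D from derived allele frequencies.

    sites : iterable of (p1, p2, p3, p_out) tuples, where each value is the
            derived allele frequency in P1, P2, P3 and the outgroup.
    Returns D, or None if no site contributes to either pattern.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        # Expected ABBA and BABA contributions of this site.
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)


# Toy window: two sites supporting ABBA and one supporting BABA.
toy_sites = [(0.0, 1.0, 1.0, 0.0), (1.0, 0.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0)]
print(frequency_weighted_d(toy_sites))
```

With fixed differences (frequencies of 0 or 1) the weights reduce to ordinary ABBA and BABA counts, so the single-individual test is a special case of the frequency-based one.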

Characterization of introgression sites

In order to characterize regions of introgression, we identified candidate regions showing a strong signature of introgression (i.e. G_MIN valleys and f_d peaks) from our genome scan approach. We then counted the number of unique genes falling within our candidate valleys/peaks and compared this to a null distribution generated by 1,000 random samples of 10 kb non-valley/non-peak genome windows for the same number and size range as the valleys or peaks.
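A resampling test of this kind can be sketched as follows. This is a simplified illustration: it draws the same number of background windows as there are candidates but ignores the size-range matching described above, and the function name, the window-to-gene mapping and the toy data are hypothetical.

```python
import random


def gene_count_null(window_genes, candidate_ids, n_samples=1000, seed=0):
    """Compare unique gene counts in candidate windows with random draws.

    window_genes  : dict mapping window id -> set of gene ids in that window
    candidate_ids : ids of candidate windows (valleys or peaks)
    Returns (observed count, list of null counts, upper-tail empirical p).
    """
    rng = random.Random(seed)
    candidates = set(candidate_ids)
    background = [w for w in window_genes if w not in candidates]
    observed = len(set().union(*(window_genes[w] for w in candidates)))
    null_counts = []
    for _ in range(n_samples):
        # Draw as many background windows as there are candidates.
        draw = rng.sample(background, len(candidates))
        null_counts.append(len(set().union(*(window_genes[w] for w in draw))))
    # Add-one correction so the empirical p-value is never exactly zero.
    p_upper = (1 + sum(c >= observed for c in null_counts)) / (n_samples + 1)
    return observed, null_counts, p_upper


# Toy genome: two gene-rich candidate windows, 18 single-gene background windows.
toy_windows = {0: {"a", "b", "c"}, 1: {"d", "e"}}
toy_windows.update({i: {"g%d" % i} for i in range(2, 20)})
toy_obs, toy_null, toy_p = gene_count_null(toy_windows, [0, 1])
print(toy_obs, toy_p)
```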

We then tested whether genes in introgressed regions were more likely to have any specific functions. To achieve this, we used gene ontology (GO) analysis on genes in valleys and on 1,000 genes randomly chosen from across the genome. GO analysis was performed with the ClueGO plugin [118] for Cytoscape 3.4.0 [119]. Since functional annotations for this analysis were drawn from the human genome, we first generated a list of human-stickleback orthologous gene IDs (Ensembl Biomart 86). We then subset our candidate and random gene sets to include only orthologous genes. Several human genes have multiple stickleback orthologs; we therefore allowed only a single, randomly chosen occurrence of each human gene in both sets to prevent pseudo-replication. A hypergeometric test was conducted for testing enrichment with Benjamini & Hochberg FDR correction [120].
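For readers unfamiliar with the enrichment statistics, the two steps can be written out directly. This is a generic sketch of an upper-tail hypergeometric test and the Benjamini & Hochberg adjustment, not the ClueGO implementation; function names and the toy numbers are illustrative assumptions.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from a
    population of `population` genes, `annotated` of which carry the GO term."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(comb(annotated, i) * comb(population - annotated, drawn - i)
               for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(p_values):
    """Benjamini & Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example: 3 candidate genes, 2 of them annotated, from 10 genes of which 4 are annotated.
print(hypergeom_upper_tail(2, population=10, annotated=4, drawn=3))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```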

Acknowledgements

We are grateful to Manabu Kume, Seiichi Mori, and staff at the Aquarium Gobius for providing samples and Katsushi Yamaguchi for technical assistance. Keisuke Honda, the Institute of Statistical Mathematics and the DNA Data Bank of Japan are thanked for their help with running analyses on supercomputers. We are additionally grateful to Simon Martin, John Robinson and David Marques for sharing scripts and their advice on analyses. Freddy Chain and Philine Feulner are also thanked for their assistance with their sequence data. We thank Cassandra Trier for her assistance and advice with GO analyses. All members of the Kitano Lab provided invaluable advice throughout the project. We would also like to thank Mark Kirkpatrick and members of his lab for comments on the manuscript.

Footnotes

* jkitano{at}nig.ac.jp

References

1. Mayr E. Animal species and evolution. Cambridge, Massachusetts: Harvard University Press; 1963.
2. Coyne JA, Orr HA. Speciation. New York: Sinauer; 2004.
3. Nosil P. Ecological Speciation. Oxford, UK: Oxford University Press; 2012.
4. Wu C-I. The genic view of the process of speciation. J Evol Biol. 2001;14: 851–865.
5. Feder JL, Egan SP, Nosil P. The genomics of speciation-with-gene-flow. Trends Genet. Elsevier Ltd; 2012;28: 342–350. doi:10.1016/j.tig.2012.03.009
6. Nosil P, Feder JL. Genomic divergence during speciation: causes and consequences. Philos Trans R Soc London Ser B. 2012;367: 332–342. doi:10.1098/rstb.2011.0263
7. Ravinet M, Faria R, Butlin RK, Galindo J, Bierne N, Rafajlovic M, et al. Interpreting the genomic landscape of speciation: finding barriers to gene flow. J Evol Biol. 2017;in press.
8. Nosil P, Feder JL, Flaxman SM, Gompert Z. Tipping points in the dynamics of speciation. Nat Ecol Evol. Macmillan Publishers Limited; 2017;1: 1–8. doi:10.1038/s41559-016-0001
9. Riesch R, Muschick M, Lindtke D, Villoutreix R, Comeault AA, Farkas TE, et al. Transitions between phases of genomic differentiation during stick-insect speciation. Nat Ecol Evol. Macmillan Publishers Limited, part of Springer Nature; 2017;1: 82. doi:10.1038/s41559-017-0082
10. Feder JL, Nosil P, Wacholder AC, Egan SP, Berlocher SH, Flaxman SM. Genome-wide congealing and rapid transitions across the speciation continuum during speciation with gene flow. J Hered. 2014;105: 810–820. doi:10.1093/jhered/esu038
11. Seehausen O, Butlin RK, Keller I, Wagner CE, Boughman JW, Hohenlohe PA, et al. Genomics and the origin of species. Nat Rev Genet. Nature Publishing Group; 2014;15: 176–92. doi:10.1038/nrg3644
12. Wolf JB, Ellegren H. Making sense of genomic islands of differentiation in light of speciation. Nat Rev Genet. Nature Publishing Group; 2017;18: 87–100. doi:10.1038/nrg.2016.133
13. Burri R, Nater A, Kawakami T, Mugal CF, Olason PI, Smeds L, et al. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Res. 2015;25: 1656–1665. doi:10.1101/gr.196485.115
14. Nadachowska-Brzyska K, Burri R, Olason PI, Kawakami T, Smeds L, Ellegren H. Demographic Divergence History of Pied Flycatcher and Collared Flycatcher Inferred from Whole-Genome Re-sequencing Data. Payseur BA, editor. Plos Genet. 2013;9: e1003942. doi:10.1371/journal.pgen.1003942
15. Via S. Natural selection in action during speciation. Proc Natl Acad Sci. 2009;106: 9939–9946.
16. Marques DA, Lucek K, Meier JI, Mwaiko S, Wagner CE, Excoffier L, et al. Genomics of Rapid Incipient Speciation in Sympatric Threespine Stickleback. Plos Genet. 2016;12: e1005887. doi:10.1371/journal.pgen.1005887
17. Lawniczak MKN, Emrich SJ, Holloway AK, Regier AP, Olson M, White B, et al. Widespread divergence between incipient Anopheles gambiae species revealed by whole genome sequences. Science. 2010;330: 512–4. doi:10.1126/science.1195755
18. Andrew RL, Rieseberg LH. Divergence is focused on few genomic regions early in speciation: incipient speciation of sunflower ecotypes. Press. 2013; doi:10.1111/evo.12106.2013
19. Flaxman SM, Wacholder AC, Feder JL, Nosil P. Theoretical models of the influence of genomic architecture on the dynamics of speciation. Mol Ecol. 2014;23: 4074–4088. doi:10.1111/mec.12750
20. Bierne N, Gagnaire PA, David P. The geography of introgression in a patchy environment and the thorn in the side of ecological speciation. Curr Zool. 2013;59: 72–86.
21. Cruickshank TE, Hahn MW. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol Ecol. 2014;23: 3133–3157. doi:10.1111/mec.12796
22. Noor MAF, Bennett SM. Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species. Heredity (Edinb). 2009;103: 439–444.
23. Nater A, Burri R, Kawakami T, Smeds L, Ellegren H. Resolving evolutionary relationships in closely related species with whole-genome sequencing data. Syst Biol. 2015;64: 1000–1017. doi:10.1093/sysbio/syv045
24. Rosenzweig BK, Pease JB, Besansky NJ, Hahn MW. Powerful methods for detecting introgressed regions from population genomic data. Mol Ecol. 2016;25: 2387–2397. doi:10.1111/mec.13610
25. Geneva A, Garrigan D. Population Genomics of Secondary Contact. Genes (Basel). 2010;1: 124–142. doi:10.3390/genes1010124
26. Geneva AJ, Muirhead CA, Kingan SB, Garrigan D. A new method to scan genomes for introgression in a secondary contact model. Plos One. 2015;10: e0118621. doi:10.1371/journal.pone.0118621
27. Meier JI, Marques DA, Mwaiko S, Wagner CE, Excoffier L, Seehausen O. Ancient hybridization fuels rapid cichlid fish adaptive radiations. Nat Commun. Nature Publishing Group; 2017;8: 14363. doi:10.1038/ncomms14363
28. Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, et al. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res. 2013;23: 1817–28. doi:10.1101/gr.159426.113
29. Martin CH, Crawford JE, Turner BJ, Simons LH. Diabolical survival in Death Valley: recent pupfish colonization, gene flow and genetic assimilation in the smallest species range on earth. Proc R Soc B Biol Sci. 2016;283: 20152334. doi:10.1098/rspb.2015.2334
30. Hendry AP, Bolnick DI, Berner D, Peichel CL. Along the speciation continuum in sticklebacks. J Fish Biol. 2009;75: 2000–2036.
31. McKinnon JS, Rundle HD. Speciation in nature: the threespine stickleback model systems. Trends Ecol Evol. 2002;17: 480–481. doi:10.1016/S0169-5347(02)02579-X
32. Feulner PGD, Chain FJJ, Panchal M, Huang Y, Eizaguirre C, Kalbe M, et al. Genomics of Divergence along a Continuum of Parapatric Population Differentiation. Plos Genet. 2015;11: e1004966. doi:10.1371/journal.pgen.1004966
33. Roesti M, Hendry AP, Salzburger W, Berner D. Genome divergence during evolutionary diversification as revealed in replicate lake-stream stickleback population pairs. Mol Ecol. 2012;21: 2852–2862. doi:10.1111/j.1365-294X.2012.05509.x
34. Roesti M, Kueng B, Moser D, Berner D. The genomics of ecological vicariance in threespine stickleback fish. Nat Commun. Nature Publishing Group; 2015;6: 8767. doi:10.1038/ncomms9767
35. Wu C-I, Ting C-T. Genes and speciation. Nat Rev Genet. 2004;5: 114–22. doi:10.1038/nrg1269
36. Kitano J, Ross JA, Mori S, Kume M, Jones FC, Chan YF, et al. A role for neo-sex chromosomes in stickleback speciation. Nature. 2009;461: 1079–1083. doi:10.1002/ece3.234
37. Kitano J, Mori S, Peichel CL. Phenotypic divergence and reproductive isolation between sympatric forms of Japanese threespine sticklebacks. Biol J Linn Soc. 2007;91: 671–685. doi:10.1111/j.1095-8312.2007.00824.x
38. Higuchi M, Sakai H, Goto A. A new threespine stickleback, Gasterosteus nipponicus sp. nov. (Teleostei: Gasterosteidae), from the Japan Sea region. Ichthyol Res. 2014; 1–2. doi:10.1007/s10228-014-0403-1
39. Higuchi M, Goto A. Genetic evidence supporting the existence of two distinct species in the genus Gasterosteus around Japan. Environ Biol Fishes. 1996;47: 1–16. doi:10.1007/BF00002375
40. Yoshida K, Makino T, Yamaguchi K, Shigenobu S, Hasebe M, Kawata M, et al. Sex Chromosome Turnover Contributes to Genomic Divergence between Incipient Stickleback Species. Zhang J, editor. Plos Genet. 2014;10: e1004223. doi:10.1371/journal.pgen.1004223
41. Kume M, Kitano J, Mori S, Shibuya T. Ecological divergence and habitat isolation between two migratory forms of Japanese threespine stickleback (Gasterosteus aculeatus). J Evol Biol. 2010;23: 1436–1446. doi:10.1111/j.1420-9101.2010.02009.x
42. Ravinet M, Takeuchi N, Kume M, Mori S, Kitano J. Comparative Analysis of Japanese Three-Spined Stickleback Clades Reveals the Pacific Ocean Lineage Has Adapted to Freshwater Environments while the Japan Sea Has Not. Craft JA, editor. Plos One. 2014;9: e112404. doi:10.1371/journal.pone.0112404
43. Lackey AC, Boughman JW. Evolution of reproductive isolation in stickleback fish. Evolution (N Y). 2017;71: 357–371. doi:10.1038/hdy.2008.69
44. Ortí G, Bell MA, Reimchen TE, Meyer A. Global survey of mitochondrial DNA sequences in the threespine stickleback: evidence for recent migrations. Evolution (N Y). 1994;48: 608–622.
45. Yamada M, Higuchi M, Goto A. Extensive introgression of mitochondrial DNA found between two genetically divergent forms of threespine stickleback, Gasterosteus aculeatus, around Japan. Environ Biol Fishes. 2001;61: 269–284.
46. Feulner PGD, Chain FJJ, Panchal M, Eizaguirre C, Kalbe M, Lenz TL, et al. Genome-wide patterns of standing genetic variation in a marine population of three-spined sticklebacks. Mol Ecol. 2012; no-no. doi:10.1111/j.1365-294X.2012.05680.x
47. Cummings MP, Neel MC, Shaw KL. A genealogical approach to quantifying lineage divergence. Evolution. 2008;62: 2411–22. doi:10.1111/j.1558-5646.2008.00442.x
48. Toews DPL, Brelsford A. The biogeography of mitochondrial and nuclear discordance in animals. Mol Ecol. 2012;21: 3907–30. doi:10.1111/j.1365-294X.2012.05664.x
49. Catchen J, Bassham S, Wilson T, Currey M, O’Brien C, Yeates Q, et al. The population structure and recent colonization history of Oregon threespine stickleback determined using restriction-site associated DNA-sequencing. Mol Ecol. 2013;22: 2864–2883. doi:10.1111/mec.12330
50. Falush D, Stephens M, Pritchard JK. Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics. 2003;164: 1567–1587.
51. Pritchard JK, Stephens M, Rosenberg NA, Donnelly P. Association mapping in structured populations. Am J Hum Genet. 2000;67.
52. Kusakabe M, Ishikawa A, Ravinet M, Yoshida K, Makino T, Toyoda A, et al. Genetic basis for variation in salinity tolerance between stickleback ecotypes. Mol Ecol. 2016; doi:10.1111/mec.13875
53. Charlesworth B, Morgan MT, Charlesworth D. The effect of deleterious mutations on neutral molecular variation. Genetics. 1993;134: 1289–303. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1205596&tool=pmcentrez&rendertype=abstract
54. Martin SH, Davey JW, Jiggins CD. Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Mol Biol Evol. 2014;32: 244–257. doi:10.1101/001347
55. Soria-Carrasco V, Gompert Z, Comeault AA, Farkas TE, Parchman TL, Johnston JS, et al. Stick Insect Genomes Reveal Natural Selection’s Role in Parallel Speciation. Science. 2014;344: 738–742. doi:10.1126/science.1252136
56. Eaton DAR, Ree RH. Inferring phylogeny and introgression using RADseq data: An example from flowering plants (Pedicularis: Orobanchaceae). Syst Biol. 2013;62: 689–706. doi:10.1093/sysbio/syt032
57. Pease JB, Hahn MW. Detection and Polarization of Introgression in a Five-Taxon Phylogeny. Syst Biol. 2015;64: 651–662. doi:10.1093/sysbio/syv023
58. Cassidy L, Ravinet M, Mori S, Kitano J. Are Japanese freshwater populations of threespine stickleback derived from the Pacific Ocean lineage? Evol Ecol Res. 2013;15: 295–311.
59. Roesti M, Moser D, Berner D. Recombination in the threespine stickleback genome - Patterns and consequences. Mol Ecol. 2013;22: 3014–3027. doi:10.1111/mec.12322
60. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. Nature Publishing Group; 2011;475: 493–6. doi:10.1038/nature10231
61. Schiffels S, Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat Genet. Nature Publishing Group; 2014;46: 919–925. doi:10.1038/ng.3015
62. Kume M, Kitamura T, Takahashi H, Goto A. Distinct spawning migration patterns in sympatric Japan Sea and Pacific Ocean forms of threespine stickleback Gasterosteus aculeatus. Ichthyol Res. 2005;52: 189–193. doi:10.1007/s10228-005-0269-3
63. Kume M, Kuwahara T, Arai T, Okamoto M, Goto A. A part of the Japan Sea form of the threespine stickleback, Gasterosteus aculeatus, spawns in the seawater tidal pools of western Hokkaido Island, Japan. Environ Biol Fishes. 2006;77: 169–175. doi:10.1007/s10641-006-9068-6
64. Wirtz P. Mother species–father species: unidirectional hybridization in animals with female choice. Anim Behav. 1999;58: 1–12. doi:10.1006/anbe.1999.1144
65. Knowles LL. Statistical Phylogeography. Annu Rev Ecol Syst. 2009;40: 593–612.
66. Robinson JD, Bunnefeld L, Hearn J, Stone GN, Hickerson MJ. ABC inference of multi-population divergence with admixture from un-phased population genomic data. Mol Ecol. 2014;23: 4458–4471. doi:10.1111/mec.12881
67. Sousa V, Hey J. Understanding the origin of species with genome-scale data: modelling gene flow. Nat Rev Genet. Nature Publishing Group; 2013;14: 404–14. doi:10.1038/nrg3446
68. Yamada M, Higuchi M, Goto A. Long-term occurrence of hybrids between Japan Sea and Pacific Ocean forms of threespine stickleback, Gasterosteus aculeatus, in Hokkaido Island, Japan. Environ Biol Fishes. 2007;80: 435–443.
69. Mäkinen HS, Merilä J. Mitochondrial DNA phylogeography of the three-spined stickleback (Gasterosteus aculeatus) in Europe - Evidence for multiple glacial refugia. Mol Phylogenet Evol. 2008;46: 167–182. doi:10.1016/j.ympev.2007.06.011
70. Ravinet M, Harrod C, Eizaguirre C, Prodöhl PA. Unique mitochondrial DNA lineages in Irish stickleback populations: cryptic refugium or rapid recolonization? Ecol Evol. 2013; n/a–n/a. doi:10.1002/ece3.853
71. Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. Plos Genet. 2010;6: e1000862.
72. Lescak EA, Bassham SL, Catchen J, Gelmond O, Sherbick ML, von Hippel FA, et al. Evolution of stickleback in 50 years on earthquake-uplifted islands. Proc Natl Acad Sci U S A. 2015;112: E7204–E7212. doi:10.1073/pnas.1512020112
73. Barton NH, de Cara MAR. The evolution of strong reproductive isolation. Evolution. 2009;63: 1171–90. doi:10.1111/j.1558-5646.2009.00622.x
74. Barton N, Bengtsson BO. The barrier to genetic exchange between hybridising populations. Heredity (Edinb). 1986;57: 357–376. doi:10.1038/hdy.1986.135
75. Muirhead CA, Presgraves DC. Hybrid Incompatibilities, Local Adaptation, and the Genomic Distribution of Natural Introgression between Species. Am Nat. 2016;187: 249–261. doi:10.1086/684583
76. Hedrick PW. Adaptive introgression in animals: examples and comparison to new mutation and standing variation as sources of adaptive variation. Mol Ecol. 2013; doi:10.1111/mec.12415
77. Racimo F, Sankararaman S, Nielsen R, Huerta-Sánchez E. Evidence for archaic adaptive introgression in humans. Nat Rev Genet. Nature Publishing Group; 2015;16: 359–371. doi:10.1038/nrg3936
78. Castric V, Bechsgaard J, Schierup MH, Vekemans X. Repeated adaptive introgression at a gene under multiallelic balancing selection. Plos Genet. 2008;4. doi:10.1371/journal.pgen.1000168
79. Elgvin TO, Trier CN, Tørresen OK, Hagen IJ, Lien S, Nederbragt AJ, et al. The genomic mosaicism of hybrid speciation. Sci Adv. 2017;3.
80. Sun C, Huo D, Southard C, Nemesure B, Hennis A, Cristina Leske M, et al. A signature of balancing selection in the region upstream to the human UGT2B4 gene and implications for breast cancer risk. Hum Genet. 2011;130: 767–775. doi:10.1007/s00439-011-1025-6
81. Bolnick DI, Stutz WE. Frequency dependence limits divergent evolution by favouring rare immigrants over residents. Nature. Nature Publishing Group; 2017;546: 285–288. doi:10.1038/nature22351
82. Ishikawa A, Kusakabe M, Yoshida K, Ravinet M, Makino T, Toyoda A, et al. Different contributions of local- and distant-regulatory changes to transcriptome divergence between stickleback ecotypes. 2017; 1–17. doi:10.1111/evo.13175
83. Ravinet M, Prodöhl PA, Harrod C. On Irish stickleback: morphological diversification in a secondary contact zone. Evol Ecol Res. 2013;15: 271–294.
84. Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. Plos One. 2008;3: e3376. doi:10.1371/journal.pone.0003376
85. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–9. doi:10.1093/bioinformatics/btp352
86. Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7: 214–222.
87. Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32: 1792–7. doi:10.1093/nar/gkh340
88. Bell MA. Paleobiology and evolution of threespine stickleback. In: Bell MA, Foster SA, editors. The evolutionary biology of the threespine stickleback. Oxford: Oxford University Press; 1994. pp. 438–471.
89. Bell MA, Stewart JD, Park PJ. The world’s oldest fossil threespine stickleback fish. Copeia. 2009;2009: 256–265.
90. Heled J, Drummond AJ. Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol. 2012;61: 138–49. doi:10.1093/sysbio/syr087
91. Wheat CW, Wahlberg N. Critiquing blind dating: the dangers of over-confident date estimates in comparative genomics. Trends Ecol Evol. Elsevier Ltd; 2013;28: 636–42. doi:10.1016/j.tree.2013.07.007
92. Baele G, Lemey P, Bedford T, Rambaut A, Suchard MA, Alekseyenko AV. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 2012;29: 2157–67. doi:10.1093/molbev/mss084
93. Baele G, Li WLS, Drummond AJ, Suchard MA, Lemey P. Accurate model selection of relaxed molecular clocks in bayesian phylogenetics. Mol Biol Evol. 2013;30: 239–43. doi:10.1093/molbev/mss243
94. Rambaut A, Drummond AJ. Tracer v1.5. 2009.
95. Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22: 2688–2690. doi:10.1093/bioinformatics/btl446
96. Revell LJ. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol. 2012;3: 217–223. doi:10.1111/j.2041-210X.2011.00169.x
97. Schliep KP. phangorn: Phylogenetic analysis in R. Bioinformatics. 2011;27: 592–593. doi:10.1093/bioinformatics/btq706
98. Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475: 493–6. doi:10.1038/nature10231
99. Bell MA, Foster SA. Introduction to the evolutionary biology of the threespine stickleback. In: Bell MA, Foster SA, editors. The Evolutionary Biology of the Threespine Stickleback. Oxford: Oxford University Press; 1994. pp. 1–27.
100. Guo B, Chain FJ, Bornberg-Bauer E, Leder EH, Merilä J. Genomic divergence between nine- and three-spined sticklebacks. BMC Genomics. 2013;14: 756. doi:10.1186/1471-2164-14-756
101.↵
Mailund T, Halager AE, Westergaard M, Dutheil JY, Munch K, Andersen LN, et al. A new isolation with migration model along complete genomes infers very different divergence processes among closely related great ape species. Plos Genet. 2012;8: e1003125. doi:10.1371/journal.pgen.1003125
OpenUrl CrossRef PubMed
102.↵
Pavlidis P, Laurent S, Stephan W. msABC: a modification of Hudson’s ms to facilitate multi-locus ABC analysis. Mol Ecol Resour. 2010;10: 723–7. doi:10.1111/j.1755-0998.2010.02832.x
OpenUrl CrossRef PubMed
103.↵
Tange O. GNU Parallel - The Command-Line Power Tool.; login USENIX Mag. Frederiksberg, Denmark; 2011;36: 42–47. Available: http://www.gnu.org/s/parallel
OpenUrl
104.↵
Csilléry K, Blum MGB, Gaggiotti OE, François O, Csillery K, Francois O. Approximate Bayesian Computation (ABC) in practice. Trends Ecol Evol. 2010;25: 410–418. doi:10.1016/j.tree.2010.04.001
OpenUrl CrossRef PubMed Web of Science
105.↵
Bertorelle G, Benazzo A, Mona S. ABC as a flexible framework to estimate demography over space and time: some cons, many pros. Mol Ecol. 2010;19: 2609–2625. doi:10.1111/j.1365-294X.2010.04690.x
OpenUrl CrossRef PubMed Web of Science
106.↵
Csilléry K, François O, Blum MGB. ABC: an R package for approximate Bayesian computation (ABC). Methods Ecol Evol. 2012;3: no-no. doi:10.1111/j.2041-210X.2011.00179.x
OpenUrl CrossRef
107.
Weir B, Cockerham C. Estimating F-Statistics for the Analysis of Population Structure. Evolution (N Y). 1984;38: 1358–1370. Available: http://www.jstor.org/stable/2408641
108.
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27: 2156–8. doi:10.1093/bioinformatics/btr330
109.
Smadja CM, Canbäck B, Vitalis R, Gautier M, Ferrari J, Zhou J-J, et al. Large-scale candidate gene scan reveals the role of chemoreceptor genes in host plant specialization and speciation in the pea aphid. Evolution. 2012;66: 2723–38. doi:10.1111/j.1558-5646.2012.01612.x
110.
Harte D. HiddenMarkov: Hidden Markov Models. R package version 1.8-7. Wellington: Statistics Research Associates; 2016.
111.
Catchen J, Hohenlohe PA, Bassham S, Amores A, Cresko WA. Stacks: an analysis tool set for population genomics. Mol Ecol. 2013;22: 3124–3140. doi:10.1111/mec.12354
112.
Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics. 2010;26: 873–81. doi:10.1093/bioinformatics/btq057
113.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–9. doi:10.1093/bioinformatics/btp352
114.
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155: 945–959.
115.
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81: 559–575. doi:10.1086/519795
116.
Jombart T, Ahmed I. adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. 2011;27: 3070–3071. doi:10.1093/bioinformatics/btr521
117.
Earl DA, vonHoldt BM. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour. 2011;4: 359–361. doi:10.1007/s12686-011-9548-7
118.
Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: A Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25: 1091–1093. doi:10.1093/bioinformatics/btp101
119.
Christmas R, Avila-Campillo I, Bolouri H, Schwikowski B, Anderson M, Kelley R, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Am Assoc Cancer Res Educ B. 2005; 12–16. doi:10.1101/gr.1239303.metabolite
120.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B …. 1995;57: 289–300. Available: http://www.jstor.org/stable/10.2307/2346101

Posted September 22, 2017.

Subject Area

Evolutionary Biology

[1] 1.↵
Mayr E. Animal species and evolution. Cambridge, Massachusetts: Havard University Press; 1963.

[2] 2.
Coyne JA, Orr HA. Speciation. New York: Sinaeur; 2004.

[3] 3.↵
Nosil P. Ecological Speciation. Oxford, UK: Oxford University Press; 2012.

[4] 4.↵
Wu C-I. The genic view of the process of speciation. J Evol Biol. 2001;14: 851–865.
OpenUrl CrossRef Web of Science

[5] 5.↵
Feder JL, Egan SP, Nosil P. The genomics of speciation-with-gene-flow. Trends Genet. Elsevier Ltd; 2012;28: 342–350. doi:10.1016/j.tig.2012.03.009
OpenUrl CrossRef PubMed Web of Science

[6] 6.↵
Nosil P, Feder JL. Genomic divergence during speciation: causes and consequences. Philos Trans R Soc London Ser B. 2012;367: 332–342. doi:10.1098/rstb.2011.0263
OpenUrl CrossRef PubMed

[7] 7.↵
Ravinet M, Faria R, Butlin RK, Galindo J, Bierne N, Rafajlovic M, et al. Interpreting the genomic landscape of speciation: finding barriers to gene flow. J Evol Biol. 2017;in press.

[8] 8.↵
Nosil P, Feder JL, Flaxman SM, Gompert Z. Tipping points in the dynamics of speciation. Nat Ecol Evol. Macmillan Publishers Limited; 2017;1: 1–8. doi:10.1038/s41559-016-0001
OpenUrl CrossRef

[9] 9.↵
Riesch R, Muschick M, Lindtke D, Villoutreix R, Comeault AA, Farkas TE, et al. Transitions between phases of genomic differentiation during stick-insect speciation. Nat Ecol Evol. Macmillan Publishers Limited, part of Springer Nature.; 2017;1: 82. doi:10.1038/s41559-017-0082
OpenUrl CrossRef

[10] 10.↵
Feder JL, Nosil P, Wacholder AC, Egan SP, Berlocher SH, Flaxman SM. Genome-wide congealing and rapid transitions across the speciation continuum during speciation with gene flow. J Hered. 2014;105: 810–820. doi:10.1093/jhered/esu038
OpenUrl CrossRef PubMed

[11] 11.↵
Seehausen O, Butlin RK, Keller I, Wagner CE, Boughman JW, Hohenlohe P a, et al. Genomics and the origin of species. Nat Rev Genet. Nature Publishing Group; 2014;15: 176–92. doi:10.1038/nrg3644
OpenUrl CrossRef PubMed

[12] 12.↵
Wolf JB, Ellegren H. Making sense of genomic islands of differentiation in light of speciation. Nat Rev Genet. Nature Publishing Group; 2017;18: 87– 100. doi:10.1038/nrg.2016.133
OpenUrl CrossRef

[13] 13.↵
Burri R, Nater A, Kawakami T, Mugal CF, Olason PI, Smeds L, et al. Linked selection and recombination rate variation drive the evolution of the genomic landscape of differentiation across the speciation continuum of Ficedula flycatchers. Genome Res. 2015;25: 1656–1665. doi:10.1101/gr.196485.115
OpenUrl Abstract/FREE Full Text

[14] 14.↵
Payseur BA
Nadachowska-Brzyska K, Burri R, Olason PI, Kawakami T, Smeds L, Ellegren H. Demographic Divergence History of Pied Flycatcher and Collared Flycatcher Inferred from Whole-Genome Re-sequencing Data. Payseur BA, editor. Plos Genet. 2013;9: e1003942. doi:10.1371/journal.pgen.1003942
OpenUrl CrossRef PubMed

[15] Payseur BA

[16] 15.↵
Via S. Natural selection in action during speciation. Proc Natl Acad Sci. 2009;106: 9939–9946.
OpenUrl CrossRef PubMed

[17] 16.↵
Marques DA, Lucek K, Meier JI, Mwaiko S, Wagner CE, Excoffier L, et al. Genomics of Rapid Incipient Speciation in Sympatric Threespine Stickleback. Plos Genet. 2016;12: e1005887. doi:10.1371/journal.pgen.1005887
OpenUrl CrossRef PubMed

[18] 17.
Lawniczak MKN, Emrich SJ, Holloway a K, Regier a P, Olson M, White B, et al. Widespread divergence between incipient Anopheles gambiae species revealed by whole genome sequences. Science. 2010;330: 512–4. doi:10.1126/science.1195755
OpenUrl Abstract/FREE Full Text

[19] 18.↵
Andrew RL, Rieseberg LH. Divergence is focused on few genomic regions early in speciation: incipient speciation of sunflower ecotypes. Press. 2013; doi:10.1111/evo.12106.2013
OpenUrl CrossRef

[20] 19.↵
Flaxman SM, Wacholder AC, Feder JL, Nosil P. Theoretical models of the influence of genomic architecture on the dynamics of speciation. Mol Ecol. 2014;23: 4074–4088. doi:10.1111/mec.12750
OpenUrl CrossRef

[21] 20.↵
Bierne N, Gagnaire PA, David P. The geography of introgression in a patchy environment and the thorn in the side of ecological speciation. Curr Zool. 2013;59: 72–86.
OpenUrl

[22] 21.↵
Cruickshank TE, Hahn MW. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol Ecol. 2014;23: 3133–3157. doi:10.1111/mec.12796
OpenUrl CrossRef PubMed Web of Science

[23] 22.↵
Noor MAF, Bennett SM. Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species. Heredity (Edinb). 2009;103: 439–444.
OpenUrl

[24] 23.↵
Nater A, Burri R, Kawakami T, Smeds L, Ellegren H. Resolving evolutionary relationships in closely related species with whole-genome sequencing data. Syst Biol. 2015;64: 1000–1017. doi:10.1093/sysbio/syv045
OpenUrl CrossRef PubMed

[25] 24.↵
Rosenzweig BK, Pease JB, Besansky NJ, Hahn MW. Powerful methods for detecting introgressed regions from population genomic data. Mol Ecol. 2016;25: 2387–2397. doi:10.1111/mec.13610
OpenUrl CrossRef

[26] 25.
Geneva A, Garrigan D. Population Genomics of Secondary Contact. Genes (Basel). 2010;1: 124–142. doi:10.3390/genes1010124
OpenUrl CrossRef

[27] 26.↵
Geneva AJ, Muirhead CA, Kingan SB, Garrigan D. A new method to scan genomes for introgression in a secondary contact model. Plos One. 2015;10: e0118621. doi:10.1371/journal.pone.0118621
OpenUrl CrossRef PubMed

[28] 27.↵
Meier JI, Marques DA, Mwaiko S, Wagner CE, Excoffier L, Seehausen O. Ancient hybridization fuels rapid cichlid fish adaptive radiations. Nat Commun. Nature Publishing Group; 2017;8: 14363. doi:10.1038/ncomms14363
OpenUrl CrossRef PubMed

[29] 28.↵
Martin SH, Dasmahapatra KK, Nadeau NJ, Salazar C, Walters JR, Simpson F, et al. Genome-wide evidence for speciation with gene flow in Heliconius butterflies. Genome Res. 2013;23: 1817–28. doi:10.1101/gr.159426.113
OpenUrl Abstract/FREE Full Text

[30] 29.↵
Martin CH, Crawford JE, Turner BJ, Simons LH. Diabolical survival in Death Valley: recent pupfish colonization, gene flow and genetic assimilation in the smallest species range on earth. Proc R Soc B Biol Sci. 2016;283: 20152334. doi:10.1098/rspb.2015.2334
OpenUrl CrossRef PubMed

[31] 30.↵
Hendry AP, Bolnick DI, Berner D, Peichel CL. Along the speciation continuum in sticklebacks. J Fish Biol. 2009;75: 2000–2036.
OpenUrl CrossRef PubMed Web of Science

[32] 31.↵
McKinnon JS, Rundle HD. Speciation in nature: the threespine stickleback model systems. Trends Ecol Evol. 2002;17: 480–481. doi:10.1016/S0169-5347(02)02579-X
OpenUrl CrossRef Web of Science

[33] 32.↵
Feulner PGD, Chain FJJ, Panchal M, Huang Y, Eizaguirre C, Kalbe M, et al. Genomics of Divergence along a Continuum of Parapatric Population Differentiation. Plos Genet. 2015;11: e1004966. doi:10.1371/journal.pgen.1004966
OpenUrl CrossRef PubMed

[34] 33.↵
Roesti M, Hendry AP, Salzburger W, Berner D. Genome divergence during evolutionary diversification as revealed in replicate lake-stream stickleback population pairs. Mol Ecol. 2012;21: 2852–2862. doi:10.1111/j.1365-294X.2012.05509.x
OpenUrl CrossRef PubMed Web of Science

[35] 34.↵
Roesti M, Kueng B, Moser D, Berner D. The genomics of ecological vicariance in threespine stickleback fish. Nat Commun. Nature Publishing Group; 2015;6: 8767. doi:10.1038/ncomms9767
OpenUrl CrossRef PubMed

[36] 35.↵
Wu C-I, Ting C-T. Genes and speciation. Nat Rev Genet. 2004;5: 114–22. doi:10.1038/nrg1269
OpenUrl CrossRef PubMed Web of Science

[37] 36.↵
Kitano J, Ross JA, Mori S, Kume M, Jones FC, Chan YF, et al. A role for neo-sex chromosomes in stickleback speciation. Nature. 2009;461: 1079–1083. doi:10.1002/ece3.234
OpenUrl CrossRef PubMed Web of Science

[38] 37.↵
Kitano J, Mori S, Peichel CL. Phenotypic divergence and reproductive isolation between sympatric forms of Japanese threespine sticklebacks. Biol J Linn Soc. 2007;91: 671–685. doi:10.1111/j.1095-8312.2007.00824.x
OpenUrl CrossRef

[39] 38.↵
Higuchi M, Sakai H, Goto A. A new threespine stickleback, Gasterosteus nipponicus sp. nov. (Teleostei: Gasterosteidae), from the Japan Sea region. Ichthyol Res. 2014; 1–2. doi:10.1007/s10228-014-0403-1
OpenUrl CrossRef

[40] 39.↵
Higuchi M, Goto A. Genetic evidence supporting the existence of two distinct species in the genus Gasterosteus around Japan. Environ Biol Fishes. 1996;47: 1–16. doi:10.1007/BF00002375
OpenUrl CrossRef

[41] 40.↵
Zhang J
Yoshida K, Makino T, Yamaguchi K, Shigenobu S, Hasebe M, Kawata M, et al. Sex Chromosome Turnover Contributes to Genomic Divergence between Incipient Stickleback Species. Zhang J, editor. Plos Genet. 2014;10: e1004223. doi:10.1371/journal.pgen.1004223
OpenUrl CrossRef PubMed

[42] Zhang J

[43] 41.↵
Kume M, Kitano J, Mori S, Shibuya T. Ecological divergence and habitat isolation between two migratory forms of Japanese threespine stickleback (Gasterosteus aculeatus). J Evol Biol. 2010;23: 1436–1446. doi:10.1111/j.1420-9101.2010.02009.x
OpenUrl CrossRef PubMed

[44] 42.↵
Craft JA
Ravinet M, Takeuchi N, Kume M, Mori S, Kitano J. Comparative Analysis of Japanese Three-Spined Stickleback Clades Reveals the Pacific Ocean Lineage Has Adapted to Freshwater Environments while the Japan Sea Has Not. Craft JA, editor. Plos One. 2014;9: e112404. doi:10.1371/journal.pone.0112404
OpenUrl CrossRef

[45] Craft JA

[46] 43.↵
Lackey AC, Boughman JW. Evolution of reproductive isolation in stickleback fish. Evolution (N Y). 2017;71: 357–371. doi:10.1038/hdy.2008.69
OpenUrl CrossRef

[47] 44.↵
Ortí G, Bell MA, Reimchen TE, Meyer A. Global survey of mitochondrial DNA sequences in the threespine stickleback: evidence for recent migrations. Evolution (N Y). 1994;48: 608–622.
OpenUrl

[48] 45.↵
Yamada M, Higuchi M, Goto A. Extensive introgression of mitochondrial DNA found between two genetically divergent forms of threespine stickleback, Gasterosteus aculeatus, around Japan. Environ Biol Fishes. 2001;61: 269–284.
OpenUrl

[49] 46.↵
Feulner PGD, Chain FJJ, Panchal M, Eizaguirre C, Kalbe M, Lenz TL, et al. Genome-wide patterns of standing genetic variation in a marine population of three-spined sticklebacks. Mol Ecol. 2012; no-no. doi:10.1111/j.1365-294X.2012.05680.x
OpenUrl CrossRef PubMed Web of Science

[50] 47.↵
Cummings MP, Neel MC, Shaw KL. A genealogical approach to quantifying lineage divergence. Evolution. 2008;62: 2411–22. doi:10.1111/j.1558-5646.2008.00442.x
OpenUrl CrossRef PubMed Web of Science

[51] 48.↵
Toews DPL, Brelsford A. The biogeography of mitochondrial and nuclear discordance in animals. Mol Ecol. 2012;21: 3907–30. doi:10.1111/j.1365-294X.2012.05664.x
OpenUrl CrossRef PubMed Web of Science

[52] 49.↵
Catchen J, Bassham S, Wilson T, Currey M, O’Brien C, Yeates Q, et al. The population structure and recent colonization history of Oregon threespine stickleback determined using restriction-site associated DNA-sequencing. Mol Ecol. 2013;22: 2864–2883. doi:10.1111/mec.12330
OpenUrl CrossRef Web of Science

[53] 50.↵
Falush D, Stephens M, Pritchard JK. Inference of population structure: Extensions to linked loci and correlated allele frequencies. Genetics. 2003;164: 1567–1587.
OpenUrl Abstract/FREE Full Text

[54] 51.↵
Pritchard JK, Stephens M, Rosenberg NA, Donnelly P. Association mapping in structured populations. Am J Hum Genet. 2000;67.

[55] 52.↵
Kusakabe M, Ishikawa A, Ravinet M, Yoshida K, Makino T, Toyoda A, et al. Genetic basis for variation in salinity tolerance between stickleback ecotypes. Mol Ecol. 2016; doi:10.1111/mec.13875
OpenUrl CrossRef

[56] 53.↵
Charlesworth B, Morgan MT, Charlesworth D. The effect of deleterious mutations on neutral molecular variation. Genetics. 1993;134: 1289–303. Available: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1205596&tool=pmcentrez&rendertype=abstract
OpenUrl Abstract/FREE Full Text

[57] 54.↵
Martin SH, Davey JW, Jiggins CD. Evaluating the use of ABBA-BABA statistics to locate introgressed loci. Mol Biol Evol. 2014;32: 244–257. doi:10.1101/001347
OpenUrl CrossRef PubMed

[58] 55.↵
Soria-Carrasco V, Gompert Z, Comeault AA, Farkas TE, Parchman TL, Johnston JS, et al. Stick Insect Genomes Reveal Natural Selection’s Role in Parallel Speciation. Science (80-). 2014;344: 738–742. doi:10.1126/science.1252136
OpenUrl Abstract/FREE Full Text

[59] 56.↵
Eaton DAR, Ree RH. Inferring phylogeny and introgression using RADseq data: An example from flowering plants (Pedicularis: Orobanchaceae). Syst Biol. 2013;62: 689–706. doi:10.1093/sysbio/syt032
OpenUrl CrossRef PubMed

[60] 57.↵
Pease JB, Hahn MW. Detection and Polarization of Introgression in a Five-Taxon Phylogeny. Syst Biol. 2015;64: 651–662. doi:10.1093/sysbio/syv023
OpenUrl CrossRef PubMed

[61] 58.↵
Cassidy L, Ravinet M, Mori S, Kitano J. Are Japanese freshwater populations of threespine stickleback derived from the Pacific Ocean lineage? Evol Ecol Res. 2013;15: 295–311.
OpenUrl

[62] 59.↵
Roesti M, Moser D, Berner D. Recombination in the threespine stickleback genome - Patterns and consequences. Mol Ecol. 2013;22: 3014–3027. doi:10.1111/mec.12322
OpenUrl CrossRef PubMed Web of Science

[63] 60.↵
Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. Nature Publishing Group; 2011;475: 493–6. doi:10.1038/nature10231
OpenUrl CrossRef PubMed Web of Science

[64] 61.↵
Schiffels S, Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat Genet. Nature Publishing Group; 2014;46: 919–925. doi:10.1038/ng.3015
OpenUrl CrossRef PubMed

[65] 62.↵
Kume M, Kitamura T, Takahashi H, Goto A. Distinct spawning migration patterns in sympatric Japan Sea and Pacific Ocean forms of threespine stickleback Gasterosteus aculeatus. Icthyological Res. 2005;52: 189–193. doi:10.1007/s10228-005-0269-3
OpenUrl CrossRef

[66] 63.↵
Kume M, Kuwahara T, Arai T, Okamoto M, Goto A. A part of the Japan Sea form of the threespine stickleback, Gasterosteus aculeatus, spawns in the seawater tidal pools of western Hokkaido Island, Japan. Environ Biol Fishes. 2006;77: 169–175. doi:10.1007/s10641-006-9068-6
OpenUrl CrossRef

[67] 64.↵
Wirtz P. Mother species–father species: unidirectional hybridization in animals with female choice. Anim Behav. 1999;58: 1–12. doi:10.1006/anbe.1999.1144
OpenUrl CrossRef PubMed Web of Science

[68] 65.↵
Knowles LL. Statistical Phylogeography. Annu Rev Ecol Syst. 2009;40: 593–612.
OpenUrl CrossRef

[69] 66.↵
Robinson JD, Bunnefeld L, Hearn J, Stone GN, Hickerson MJ. ABC inference of multi-population divergence with admixture from un-phased population genomic data. Mol Ecol. 2014;23: 4458–4471. doi:10.1111/mec.12881
OpenUrl CrossRef

[70] 67.↵
Sousa V, Hey J. Understanding the origin of species with genome-scale data: modelling gene flow. Nat Rev Genet. Nature Publishing Group; 2013;14: 404–14. doi:10.1038/nrg3446
OpenUrl CrossRef PubMed

[71] 68.↵
Yamada M, Higuchi M, Goto A. Long-term occurrence of hybrids between Japan Sea and Pacific Ocean forms of threespine stickleback, Gasterosteus aculeatus, in Hokkaido Island, Japan. Environ Biol Fishes. 2007;80: 435– 443.
OpenUrl

[72] 69.↵
MÄkinen HS, MerilÄ J. Mitochondrial DNA phylogeography of the three-spined stickleback (Gasterosteus aculeatus) in Europe - Evidence for multiple glacial refugia. Mol Phylogenet Evol. 2008;46: 167–182. doi:10.1016/j.ympev.2007.06.011
OpenUrl CrossRef PubMed Web of Science

[73] 70.↵
Ravinet M, Harrod C, Eizaguirre C, Prodöhl P a. Unique mitochondrial DNA lineages in Irish stickleback populations: cryptic refugium or rapid recolonization? Ecol Evol. 2013; n/a–n/a. doi:10.1002/ece3.853
OpenUrl CrossRef

[74] 71.↵
Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD tags. Plos Genet. 2010;6: e1000862.
OpenUrl CrossRef PubMed

[75] 72.↵
Lescak EA, Bassham SL, Catchen J, Gelmond O, Sherbick ML, von Hippel FA, et al. Evolution of stickleback in 50 years on earthquake-uplifted islands. Proc Natl Acad Sci U S A. 2015;112: E7204–E7212. doi:10.1073/pnas.1512020112
OpenUrl Abstract/FREE Full Text

[76] 73.↵
Barton NH, de Cara MAR. The evolution of strong reproductive isolation. Evolution. 2009;63: 1171–90. doi:10.1111/j.1558-5646.2009.00622.x
OpenUrl CrossRef PubMed Web of Science

[77] 74.↵
Barton N, Bengtsson BO. The barrier to genetic exchange between hybridising populations. Heredity (Edinb). 1986;57: 357–376. doi:10.1038/hdy.1986.135
OpenUrl CrossRef

[78] 75.↵
Muirhead CA, Presgraves DC. Hybrid Incompatibilities, Local Adaptation, and the Genomic Distribution of Natural Introgression between Species. Am Nat. 2016;187: 249–261. doi:10.1086/684583
OpenUrl CrossRef

[79] 76.↵
Hedrick PW. Adaptive introgression in animals: examples and comparison to new mutation and standing variation as sources of adaptive variation. Mol Ecol. 2013; doi:10.1111/mec.12415
OpenUrl CrossRef PubMed Web of Science

[80] 77.↵
Racimo F, Sankararaman S, Nielsen R, Huerta-Sánchez E. Evidence for archaic adaptive introgression in humans. Nat Rev Genet. Nature Publishing Group; 2015;16: 359–371. doi:10.1038/nrg3936
OpenUrl CrossRef PubMed

[81] 78.↵
Castric V, Bechsgaard J, Schierup MH, Vekemans X. Repeated adaptive introgression at a gene under multiallelic balancing selection. Plos Genet. 2008;4. doi:10.1371/journal.pgen.1000168
OpenUrl CrossRef PubMed

[82] 79.↵
Elgvin TO, Trier CN, Tørresen OK, Hagen, Ingerid J, Lien S, Nederbragt AJ, et al. The genomic mosaicism of hybrid speciation. Sci Adv. 2017;3.

[83] 80.↵
Sun C, Huo D, Southard C, Nemesure B, Hennis A, Cristina Leske M, et al. A signature of balancing selection in the region upstream to the human UGT2B4 gene and implications for breast cancer risk. Hum Genet. 2011;130: 767–775. doi:10.1007/s00439-011-1025-6
OpenUrl CrossRef PubMed

[84] 81.↵
Bolnick DI, Stutz WE. Frequency dependence limits divergent evolution by favouring rare immigrants over residents. Nature. Nature Publishing Group; 2017;546: 285–288. doi:10.1038/nature22351
OpenUrl CrossRef PubMed

[85] 82.↵
Ishikawa A, Kusakabe M, Yoshida K, Ravinet M, Makino T, Toyoda A, et al. Different contributions of local- and distant-regulatory changes to transcriptome divergence between stickleback ecotypes. 2017; 1–17. doi:10.1111/evo.13175
OpenUrl CrossRef

[86] 83.↵
Ravinet M, Prodöhl PA, Harrod C. On Irish sticklebacklZ: morphological diversification in a secondary contact zone. Evol Ecol Res. 2013;15: 271–294.
OpenUrl

[87] 84.↵
Baird N a, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis Z a, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. Plos One. 2008;3: e3376. doi:10.1371/journal.pone.0003376
OpenUrl CrossRef PubMed

[88] 85.↵
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–9. doi:10.1093/bioinformatics/btp352
OpenUrl CrossRef PubMed Web of Science

[89] 86.↵
Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7: 214–222.
OpenUrl CrossRef PubMed

[90] 87.↵
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32: 1792–7. doi:10.1093/nar/gkh340
OpenUrl CrossRef PubMed Web of Science

[91] 88.↵
Bell MA,
Foster SA
Bell MA. Paleobiology and evolution of threespine stickleback. In: Bell MA, Foster SA, editors. The evolutionary biology of the threespine stickleback. Oxford: Oxford University Press; 1994. pp. 438–471.

[92] Bell MA,

[93] Foster SA

[94] 89.↵
Bell MA, Stewart JD, Park PJ. The world’s oldest fossil threespine stickleback fish. Copeia. 2009;2009: 256–265.
OpenUrl CrossRef

[95] 90.↵
Heled J, Drummond AJ. Calibrated tree priors for relaxed phylogenetics and divergence time estimation. Syst Biol. 2012;61: 138–49. doi:10.1093/sysbio/syr087
OpenUrl CrossRef PubMed

[96] 91.↵
Wheat CW, Wahlberg N. Critiquing blind dating: the dangers of over-confident date estimates in comparative genomics. Trends Ecol Evol. Elsevier Ltd; 2013;28: 636–42. doi:10.1016/j.tree.2013.07.007
OpenUrl CrossRef PubMed Web of Science

[97] 92.↵
Baele G, Lemey P, Bedford T, Rambaut A, Suchard M a, Alekseyenko A V. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol Biol Evol. 2012;29: 2157–67. doi:10.1093/molbev/mss084
OpenUrl CrossRef PubMed Web of Science

[98] 93.↵
Baele G, Li WLS, Drummond AJ, Suchard M a, Lemey P. Accurate model selection of relaxed molecular clocks in bayesian phylogenetics. Mol Biol Evol. 2013;30: 239–43. doi:10.1093/molbev/mss243
OpenUrl CrossRef PubMed Web of Science

[99] 94.↵
Rambaut A, Drummond AJ. Tracer v1.5. 2009.

[100] 95.↵
Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22: 2688–2690. doi:10.1093/bioinformatics/btl446
OpenUrl CrossRef PubMed Web of Science

[101] 96.↵
Revell LJ. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol. 2012;3: 217–223. doi:10.1111/j.2041-210X.2011.00169.x
OpenUrl CrossRef

[102] 97.↵
Schliep KP. phangorn: Phylogenetic analysis in R. Bioinformatics. 2011;27: 592–593. doi:10.1093/bioinformatics/btq706
OpenUrl CrossRef PubMed Web of Science

[103] 98.↵
Li H, Durbin R. Inference of human population history from individual whole-genome sequences. Nature. 2011;475: 493–6. doi:10.1038/nature10231
OpenUrl CrossRef PubMed Web of Science

[104] 99.↵
Bell MA,
Foster SA
Bell MA, Foster SA. Introduction to the evolutionary biology of the threespine stickleback. In: Bell MA, Foster SA, editors. The Evolutionary Biology of the Threespine Stickleback. Oxford: Oxford University Press; 1994. pp. 1–27.

[105] Bell MA,

[106] Foster SA

[107] 100.↵
Guo B, Chain FJ, Bornberg-Bauer E, Leder EH, MerilÄ J. Genomic divergence between nine- and three-spined sticklebacks. BMC Genomics. 2013;14: 756. doi:10.1186/1471-2164-14-756
OpenUrl CrossRef PubMed

[108] 101.↵
Mailund T, Halager AE, Westergaard M, Dutheil JY, Munch K, Andersen LN, et al. A new isolation with migration model along complete genomes infers very different divergence processes among closely related great ape species. Plos Genet. 2012;8: e1003125. doi:10.1371/journal.pgen.1003125
OpenUrl CrossRef PubMed

[109] 102.↵
Pavlidis P, Laurent S, Stephan W. msABC: a modification of Hudson’s ms to facilitate multi-locus ABC analysis. Mol Ecol Resour. 2010;10: 723–7. doi:10.1111/j.1755-0998.2010.02832.x
OpenUrl CrossRef PubMed

[110] 103.↵
Tange O. GNU Parallel - The Command-Line Power Tool.; login USENIX Mag. Frederiksberg, Denmark; 2011;36: 42–47. Available: http://www.gnu.org/s/parallel
OpenUrl

[111] 104.↵
Csilléry K, Blum MGB, Gaggiotti OE, François O, Csillery K, Francois O. Approximate Bayesian Computation (ABC) in practice. Trends Ecol Evol. 2010;25: 410–418. doi:10.1016/j.tree.2010.04.001
OpenUrl CrossRef PubMed Web of Science

[112] 105.↵
Bertorelle G, Benazzo A, Mona S. ABC as a flexible framework to estimate demography over space and time: some cons, many pros. Mol Ecol. 2010;19: 2609–2625. doi:10.1111/j.1365-294X.2010.04690.x
OpenUrl CrossRef PubMed Web of Science

[113] 106.↵
Csilléry K, François O, Blum MGB. ABC: an R package for approximate Bayesian computation (ABC). Methods Ecol Evol. 2012;3: no-no. doi:10.1111/j.2041-210X.2011.00179.x
OpenUrl CrossRef

[114] 107.↵
Weir B, Cockerham C. Estimating F-Statistics for the Analysis of Population Structure. Evolution (N Y). 1984;38: 1358–1370. Available: http://www.jstor.org/stable/2408641
OpenUrl CrossRef

[115] 108.↵
Danecek P, Auton A, Abecasis G, Albers C a, Banks E, DePristo M a, et al. The variant call format and VCFtools. Bioinformatics. 2011;27: 2156–8. doi:10.1093/bioinformatics/btr330
OpenUrl CrossRef PubMed Web of Science

[116] 109.↵
Smadja CM, CanbÄck B, Vitalis R, Gautier M, Ferrari J, Zhou J-J, et al. Large-scale candidate gene scan reveals the role of chemoreceptor genes in host plant specialization and speciation in the pea aphid. Evolution. 2012;66: 2723–38. doi:10.1111/j.1558-5646.2012.01612.x
OpenUrl CrossRef PubMed Web of Science

[117] 110.↵
Harte D. HiddenMarkov: Hidden Markov Models. R package version 1.8-7. Wellington: Statistics Research Associates; 2016.

[118] 111.↵
Catchen J, Hohenlohe P a., Bassham S, Amores A, Cresko W a. Stacks: an analysis tool set for population genomics. Mol Ecol. 2013;22: 3124–3140. doi:10.1111/mec.12354
OpenUrl CrossRef PubMed Web of Science

[119] 112.↵
Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics. 2010;26: 873–81. doi:10.1093/bioinformatics/btq057
OpenUrl CrossRef PubMed Web of Science

[120] 113.↵
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25: 2078–9. doi:10.1093/bioinformatics/btp352
OpenUrl CrossRef PubMed Web of Science

[121] 114.↵
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155: 945–959.
OpenUrl Abstract/FREE Full Text

[122] 115.↵
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81: 559–575. doi:10.1086/519795
OpenUrl CrossRef PubMed

[123] 116.↵
Jombart T, Ahmed I. adegenet 1. 3-1lZ: new tools for the analysis of genome-wide SNP data. Bioinformatics. 2011;27: 3070–3071. doi:10.1093/bioinformatics/btr521
OpenUrl CrossRef PubMed Web of Science

[124] 117.↵
Earl D a., vonHoldt BM. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conserv Genet Resour. 2011;4: 359–361. doi:10.1007/s12686-011-9548-7
OpenUrl CrossRef

[125] 118.↵
Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: A Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25: 1091–1093. doi:10.1093/bioinformatics/btp101
OpenUrl CrossRef PubMed Web of Science

[126] 119.↵
Christmas, Rowan; Avila-Campillo, Iliana; Bolouri, Hamid; Schwikowski, Benno; Anderson, Mark; Kelley, Ryan; Landys, Nerius; Workman, Chris; Ideker, Trey; Cerami, Ethan; Sheridan, Rob; Bader, Gary D.; Sander C. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Am Assoc Cancer Res Educ B. 2005; 12–16. doi:10.1101/gr.1239303.metabolite
OpenUrl CrossRef

[127] 120.↵
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B …. 1995;57: 289–300. Available: http://www.jstor.org/stable/10.2307/2346101
OpenUrl CrossRef PubMed