Supergene origin and maintenance in Atlantic cod

Supergenes are sets of genes that are inherited as a single marker and encode complex phenotypes through their joint action. They are identified in an increasing number of organisms, yet their origins and evolution remain enigmatic. In Atlantic cod, four megabase-scale supergenes have been identified and linked to migratory lifestyle and environmental adaptations. Here we investigate the origin and maintenance of these four supergenes through analysis of whole-genome-sequencing data, including a new long-read-based genome assembly for a non-migratory Atlantic cod individual. We corroborate the finding that chromosomal inversions underlie all four supergenes, and we show that they originated at different times between 0.40 and 1.66 million years ago. We reveal gene flux between supergene haplotypes where migratory and stationary Atlantic cod co-occur and conclude that this gene flux is driven by gene conversion, on the basis of an increase in GC content in exchanged sites. Additionally, we find evidence for double crossover between supergene haplotypes, leading to the exchange of an ~275 kilobase fragment with genes potentially involved in adaptation to low salinity in the Baltic Sea. Our results suggest that supergenes can be maintained over long timescales in the same way as hybridizing species, through the selective purging of introduced genetic variation.


Introduction
Many spectacular examples of phenotypical variation within species, such as mimicry patterns in butterflies [1], social organization in ants [2], plumage morphs in birds [3,4], and floral types in plants [5], are encoded by supergenes -tightly linked sets of genes that control a stable polymorphism in a Mendelian manner [6][7][8][9]. Even though supergenes have been known for nearly a century [10], their origin remains a challenging question [9]. The emergence of supergenes requires beneficially interacting mutations in at least two genes and a coinciding reduction of recombination between these genes [7]. As a scenario in which these requirements can be met, recent research has pointed to chromosomal inversions arising in incompletely separated groups, such as interbreeding species or locally adapted populations that exchange migrants [11][12][13]. In these systems, the beneficial interaction between mutations in different genes can come from their joint adaptation to the same environment, and these mutations can become linked if they are captured by the same inversion [6][7][8][9]14]. This linkage between mutations within inversions is the result of a loop formation that occurs when chromosomes with the derived inverted haplotype pair with chromosomes with the ancestral haplotype arrangement during meiosis. If a single crossover occurs between the two chromosomes within the loop region, the recombinant chromosomes are affected by both duplications and deletions and are therefore unbalanced. The gametes carrying these unbalanced chromosomes are usually lethal, and thus do not contribute to the next generation [15,16]. When viewed backwards in time, the recombination rate between haplotypes with derived and ancestral arrangements therefore appears reduced. On the other hand, crossovers among two haplotypes that both have the derived arrangement should not affect the viability of gametes. Since most inversions originate just once, in a single individual, the number of individuals in which haplotypes with the derived arrangement can successfully recombine is initially very low, increasing only as the derived arrangement becomes more frequent in the species. The origin of a supergene is therefore expected to be equivalent to a severe bottleneck (down to a single sequence) that affects part of the genome (the inversion region) in a part of the species (the carriers of the derived arrangement) [17].
Once established, the maintenance of supergenes depends on the interaction of selection, drift, gene flow, mutation, and recombination. The derived arrangement can be prevented from fixation -and the supergene can thus remain polymorphic -by frequency-dependent selection, by heterogeneous selection regimes in different populations, or by recessive deleterious mutations that accumulate in the inversion region [7,11,13,18]. As mutations are added over time, the haplotypes with the ancestral and derived arrangements diverge from each other, due to the suppression of recombination between them [18]. Owing to the reduced opportunity for recombination, mildly deleterious mutations are more likely to be fixed inside the inversion region compared to outside and can result in the accumulation of mutational load [18]. When the mutational load becomes high within a supergene, it can lead to fitness decay for individuals carrying two copies of the same arrangement (i.e. homokaryotypes) [19]. The accumulation of deleterious mutations, however, can be counteracted by two processes that allow genetic exchange between the two arrangements [2,9,19,20]: gene conversion and double crossover. Gene conversion is a process in which a homologous sequence is used as template during the repair of a double-strand break, without requiring crossover with that homologous sequence [21,22]. The fragments copied through gene conversion are short, with lengths on the order of 50-1,000 bp [23,24]. In vertebrates including fishes, gene conversion is known to increase the GC-content of the involved sequences due to biased repair of A-C and G-T mismatches [22,25]. Double crossovers occurring within the loop formed by the two chromosomes can lead to the exchange of longer fragments. Either alone or in tandem, the two processes could have the potential to erode differences between supergene haplotypes if their per-site rates are high relative to the mutation rate [26]. However, outside of model systems like Drosophila that allow the genetic analysis of crosses produced in the laboratory, the rates of gene conversion and double crossovers are largely unknown.
In Atlantic cod (Gadus morhua), genomic regions with tight linkage over 4-17 Megabasepairs (Mbp) and strong differentiation between alternative haplotypes have been identified on linkage groups (LGs) 1, 2, 7, and 12 of the gadMor2 reference genome assembly [27][28][29][30][31]. The alternative haplotypes are associated with different life history strategies [29,32,33] and environments [28,[34][35][36][37]. One of the strongest of these associations is found between the alternative haplotypes on LG 1 and migratory and stationary Atlantic cod ecotypes [29,35]. In the Northeast Atlantic, these ecotypes co-occur during the spawning season in March and April along the Norwegian coast, but are separated throughout the rest of the year, with the migratory ecotype -the Northeast Arctic cod (NEAC) -returning to the Barents Sea. [38]. With few exceptions, individuals carrying two copies of one of the haplotypes and heterozygous individuals are migratory, while individuals with two copies of the other haplotype are stationary [29,39,40]. Despite the suggestion of a lekking mating system [41], mating between the two ecotypes appears to occur and thus explains the presence of individuals that are heterozygous for the LG 1 haplotypes as well as the very weak separation outside of the four differentiated regions [29]. The differentiated region on LG 1 thus matches the definition of a supergene [6] and is commonly referred to by this term [33-35, 42, 43]. A number of genes from within this supergene have been proposed as candidate genes under selection, including the Ca6 gene that plays a role in pH-level reduction in the swim bladder and might thus be important for feeding at greater depths in the migratory ecotype. However, the targets of selection among the nearly 800 genes within the supergene remain difficult to identify reliably due to tight linkage between them [32]. Similar to the different frequencies of LG 1 haplotypes between migratory and stationary ecotypes, one of the two alternative haplotypes on LG 2 is far more frequent in Atlantic cod from the Baltic sea compared to the nearest North Atlantic populations and has been suggested to carry genes adapted to low salinity [28,37]. The alternative haplotypes on LGs 7 and 12 also differ in their frequencies among Atlantic cod populations, with one of the two haplotypes in each case being nearly absent in the Irish and Celtic Sea, which are among the southernmost populations in the Northeast Atlantic, possibly in relation to adaptation to the higher temperatures experienced by these populations [42,44,45]. While the "supergene" status of the haplotypes on LGs 2, 7, and 12 will depend on further investigations of their ecological role, they, too, are commonly referred to by this term in the literature [33][34][35]42], and we will do so hereafter.
Chromosomal inversions have long been suspected to be the cause of recombination suppression in the four supergenes in Atlantic cod [46], but this has been confirmed only recently, first for the supergene on LG 1 with the help of detailed linkage maps for that linkage group [32], and then for the three other supergenes through comparison of long-read-based genome assemblies [42]. The age of the supergene on LG 1 was estimated to be around 1.6 million years (Myr) based on the sequence divergence between the two alternative haplotypes and an outgroup, in combination with an assumed divergence time between Atlantic cod and that outgroup [32]. This assumed divergence time, however, may be unreliable, as it was based on a mitochondrial substitution rate [47,48] that was originally calculated for Caribbean fishes [49]. For the supergenes on LGs 2, 7, and 12, no age estimates have yet been reported. Due to these uncertainties, conclusions about a possible joint origin of all four supergenes have necessarily remained highly speculative. The role of introgression -genetic exchange through hybridisation -in the origin of the Atlantic cod's supergenes has so far also been uncertain. While introgression among codfishes (subfamily Gadinae) has been supported by one former study based on genome-scale sequence data [50], these results were affected by the use of an incorrectly labelled specimen, susceptibility to reference bias, and possibly incorrect outgroup choice in the application of D-statistics (Supplementary Note 1), and thus remain inconclusive regarding the occurrence of introgression.
Here, we investigate the origin and the maintenance of supergenes in Atlantic cod as follows: We generate a new long-read-based genome assembly for a stationary Atlantic cod individual from northern Norway as a complement to the existing genome assembly for a migratory individual belonging to the NEAC population (gadMor2 [31]). Importantly, these two assemblies carry alternative haplotypes at each of the four supergenes. Through comparison of the two assemblies with each other and with an outgroup assembly, we corroborate that inversions are the cause of recombination suppression for each supergene, pinpoint the chromosomal boundaries of supergenes, and identify ancestral and derived arrangements. Using Bayesian time-calibrated phylogenetic analyses of newly generated and previously available genomic data, we identify traces of introgression among closely related codfishes and show that at least some of the supergenes originated at different times. Through demographic analyses, we find signatures of past bottlenecks associated with the origin of supergenes, and by applying D-statistics and sliding-window phylogenetic inference, we detect the occurrence of genetic exchange between haplotypes, both through gene conversion and double crossovers. Our results suggest that the long-term existence of supergenes may depend on genetic exchange between haplotypes to counter the accumulation of mutation load, and on selection acting on the exchanged sequences to maintain the separation of the two haplotypes.

Results
A genome assembly for a stationary Gadus morhua individual. To allow a comparison of genome architecture between migratory and stationary Gadus morhua, we performed PacBio and Illumina sequencing for a stationary Gadus morhua indivdual sampled in northern Norway, at the Lofoten islands (Fig. 1a). The resulting genome assembly (gadMor Stat) consisted of 6,961 contigs with a contig N50 length of 121,508 bp and had a size of 565,431,517 bp, corresponding to approximately 87% of the estimated size of the Gadus morhua genome [31]. The assembly contained 3,061 (84.1%) complete and 3,020 (83.0%) complete and single-copy genes out of 3,640 conserved genes in BUSCO's [51] Actinopterygii gene set (see Supplementary Table 1 for further assembly statistics). When aligned to the gadMor2 reference genome assembly [31] (representing a migratory individual), the gadMor Stat assembly was highly similar on almost all gadMor2 linkage groups, with a pairwise sequence divergence of 0.0040-0.0053 substitutions per site between the two assemblies. The exception to this were the four supergenes on LGs 1, 2, 7, and 12, that all showed an elevated sequence divergence of 0.0066-0.0129 substitutions per site between the two assemblies. This confirmed that the gadMor Stat and the gadMor2 assemblies carried alternative supergene haplotypes on all four linkage groups (Supplementary Table 2). To determine the chromosomal boundaries of the regions of tight linkage associated with the supergenes, we investigated linkage disequilibrium (LD) on LGs 1, 2, 7, and 12 with a dataset of single-nucleotide polymorphisms (SNPs) for 100 Gadus morhua individuals. By quantifying the strength of linkage per SNP as the sum of the distances (in bp) with which the SNP is strongly linked (R 2 > 0.8), we identified sharp declines of linkage marking the boundaries of all four supergenes (Fig. 1b, Table 1), as expected under the assumption of large-scale chromosomal inversions [52].
The presence of megabase-scale inversions on each of the four linkage groups was further supported by alignments of contigs from the gadMor Stat assembly to the gadMor2 assembly, as we identified several contigs with split alignments, of which one part mapped unambiguously near the beginning and another mapped near the end of a supergene (Supplementary Table 3). The positions of split contig alignments allowed us to pinpoint the inversion breakpoints on the four linkage groups with varying precision ( Table 1). The most informative alignments were those near the beginnings of the supergenes on LGs 1 and 7, which in both cases placed the breakpoints within a window of approximately 2 kbp. As also reported for inversions in Drosophila [53], this precise placement of the inversion breakpoints revealed that they do not match the positions of LD onset exactly, but that they were located up to 45 kbp inside of the region of tight linkage (Fig. 1b Fig. 1 Four supergenes associated with megabase-scale chromosomal inversions in Gadus morhua. a Migratory and stationary Gadus morhua seasonally co-occur along the coast of northern Norway and differ in total length and otolith measurements [38,55]. The distribution of stationary Gadus morhua is shaded in grey whereas the seasonal movements of migratory Gadus morhua are indicated with dark grey arrows. b Pairwise genetic distance (in substitutions per site) between the gadMor2 and gadMor Stat assemblies, relative to the genetic distance to an assembly for Melanogrammus aeglefinus (melAeg) [56] in a three-way whole-genome alignment. The alignment coordinates are according to the gadMor2 assembly. The four LGs 1, 2, 7, and 12 are shown as rounded horizontal bars, on which circles indicate the approximate centromere positions [42]. Supergene regions are shaded in grey, and the beginning and end of each of these regions are shown in more detail in the insets below each linkage group. Each of these insets focuses on a section of 100 kbp around a supergene's beginning or end. Shown in black above the bar representing that section is a per-SNP measure of linkage disequilibrium (LD), calculated as the sum of the distances between SNPs in high linkage (R 2 > 0.8). Based on this measure, the grey shading on the bar illustrates the beginning or the end of high LD. Drawn below the scale bar are contigs of the gadMor Stat and melAeg assemblies, in light grey and dark grey, respectively, that align well to the shown sections. The arrows indicate the alignment orientation of contigs (forward or reverse complement), and contigs are labelled with numbers as in Supplementary the four inversion regions included centromeres [42]; thus, all of them seemed to contain paracentric inversions, which, in contrast to pericentric inversions, are not expected to decrease the fitness of heterozygotes [54].
To determine which of the two genomes carries the derived arrangement in each case, we also aligned contigs from the long-read-based genome assembly of Melanogrammus aeglefinus (melAeg) [56], a closely related outgroup within the subfamily Gadinae, to the gadMor2 assembly. We again identified split contig alignments mapping near the boundaries of the supergenes on LGs 1 and 7, indicating that for these supergenes, it is the gadMor2 genome that carries the derived arrangement (Fig. 1b, Table 1, Supplementary Table 3). In contrast, a single contig of the melAeg assembly was clearly colinear to the gadMor2 assembly in a region that extended about 150 kbp in both directions from one of the ends of the supergene on LG 2, indicating that the derived arrangement on LG 2 All coordinates refer to the gadMor2 assembly [31]. Unless otherwise specified, boundaries of inversion regions were determined based on contig alignments (Supplementary Table 3). 1 Comparison with the gadMor3 assembly [42] (Supplementary Figure 2) suggests that the actual end of the inversion region region is misplaced in the gadMor2 assembly, between positions 18,890,477 and 18,900,044, and that the region from position ∼16,800,000 and ∼18,900,000 in the gadMor2 assembly is instead located after the current position ∼26,100,000. 3 This is the end of the linkage group in the gadMor2 assembly; however, comparison with the gadMor3 assembly suggests that the region from position ∼22,600,000 to ∼23,700,000 in the gadMor2 assembly is incorrectly placed and instead located at the end of the linkage group.
is carried not by the gadMor2 genome but by the gadMor Stat genome instead. For the supergene on LG 12, on the other hand, no informative alignments between the melAeg assembly and the gadMor2 assembly were found; thus, our contig-mapping approach did not allow us to determine which of the two Gadus morhua genomes carries the derived arrangement on this linkage group (however, our subsequent demographic analyses suggested that it is the gadMor Stat genome that carries the derived arrangement on LG 12; see below). Repeat content and mutation load were not increased in supergene regions compared to the genome-wide background (Supplementary Figure  1).
Rapid divergence and introgression among codfishes. To establish the phylogenetic context within which the inversions arose in Gadus morhua and to test whether introgression played a role in the origin of the four supergenes [4,57], we performed Bayesian analysis with the isolation-withmigration model as implemented in the AIM add-on package for BEAST 2 [58,59]. The dataset used in this analysis consisted of 109 alignments of sequences for the four species of the genus Gadus (G. morhua, G. chalcogrammus, G. ogac, and G. macrocephalus), their most closely related relatives Boreogadus saida and Arctogadus glacialis, and three outgroups ( Fig. 2a; Supplementary Table 5). The 109 alignments (total length: 383,727 bp) were sampled from across the genome, excluding the four supergene regions and selecting the regions with the most reliable orthology (i.e., those unambiguouly aligned in the three-way whole-genome alignment) and the weakest signals of recombination (i.e., those with low numbers of hemiplasies). We found strong support (Bayesian posterior probability, BPP: 0.997) for a topology in which Arctogadus is either placed as the sister taxon to Gadus (BPP: 0.763) or to Boreogadus (BPP: 0.234). Introgression was supported (by Bayes factors greater than 10) between Boreogadus and Arctogadus in both cases and additionally from the common ancestor of the genus Gadus to Arctogadus in the latter case ( Introgression among the three genera was further supported by Patterson's D-statistic [61,62]. Using Dsuite [63], we calculated for all possible species trios two versions of this statistic from a set of 19,035,318 SNPs: D fix , for which taxa in the trio are arranged according to a provided input tree, and D BBAA , for which taxa are arranged so that the number of sites with the "BBAA" pattern is maximised. The strongest signals of introgression were found once again between Boreogadus and Arctogadus, in a trio together with Gadus morhua (D fix = 0.360, p < 10 −10 ; Fig. 2c, Supplementary Tables 6 and 7). Within the genus Gadus, the D-statistic additionally provided strong support for introgression between the geographically co-occurring G. ogac and G. morhua in a trio with G. macrocephalus (D fix = D BBAA = 0.283, p < 10 −10 ; Supplementary Tables 6 and 7). In both cases, signals of introgression quantified with the f dM -statistic [60] (an alternative to the D-statistic that is unbiased in small genomic regions [64] and symmetrically distributed around zero in the absence of introgression [63]) were found on all chromosomes, and were not affected by the choice of genome representing Gadus morhua (Fig. 2d). The introgression signal was not elevated in supergene regions, suggesting that they did not arise in G. morhua due to introgression from G. ogac. The occurrence of introgression between Boreogadus and Arctogadus was corroborated by a tree-based equivalent to Patterson's D-statistic that does not rely on the molecular-clock assumption (the D tree -statistic of Ronco et al. [65]) and by genealogy interrogation [66,67], but introgression between G. ogac and G. morhua did not receive this additional support (Supplementary Figure 5).
Recent divergence among Gadus morhua populations. We performed phylogenomic analyses for individuals from eight Gadus morhua populations covering the species' distribution in the North Atlantic ( Fig. 3a; Supplementary Table 8). In addition to the individuals used for the gad-Mor2 and gadMor Stat assemblies, we selected from these populations 22 individuals for which preliminary analyses had shown that each of them carried, at each of the four supergene regions, two copies of the same haplotypes (i.e., they were homokaryotypic). For the four sampling localities Newfoundland, Iceland, Lofoten, and Møre, we discriminated between "migratory" and "stationary" individuals based on whether they carried the same supergene haplotype on LG 1 as the gadMor2 genome or the same as the gadMor Stat genome. At other localities, all individuals were considered stationary based on the well-known migration patterns of Gadus morhua [68]. For the individuals from Lofoten and Møre, this classification could be confirmed by an analysis of their otoliths [55], but otolith data was not available for the individuals from the other sampling localities. Based on a dataset of 20,402,423 genome-wide biallelic SNPs, we estimated relationships and divergence times among Gadus morhua populations under the multi-species coalescent model with SNAPP [69,70], first only with data from outside of the supergene regions. In line with previous studies based on SNP arrays [27,35,71], we found the primary divergence within Gadus morhua to separate the populations of the Northwest Atlantic from those of the Northeast Atlantic (including Iceland). We estimated these groups to have diverged around 65.4 ka (95% HPD: 84.3-46.2 ka) but acknowledge that these results may underestimate the true divergence time because the applied model does not account for possible gene flow after divergence (Fig. 3b, Supplementary Figure 6).

Ma
LGs 1-23 LGs  The genetic diversity, quantified by π [72], was comparable among the populations of both groups, ranging from 8.82 × 10 −4 to 1.084 × 10 −3 (Supplementary Table 9). Applied to the set of 20,402,423 SNPs, Patterson's D-statistic corroborated the occurrence of introgression between Gadus ogac and Gadus morhua and showed that this signal of introgression is similar in all populations (D fix = D BBAA = 0.249, p < 10 −10 ; Fig. 3c, Supplementary Tables 10 and 11). Estimating changes in the population size (N e ) over time for Gadus morhua with Relate [73] revealed a Pleistocene bottleneck, lasting from around 0.7 Ma to 0.3 Ma, during which the population size of the common ancestor of all populations decreased from around 50,000 to 7,000 diploid individuals. The subsequent increase in population sizes occurred in parallel with the diversification of Gadus morhua populations and was experienced by all of them to a similar degree but less so by the Bornholm Basin population of the Baltic Sea (Fig. 3d).  Different age estimates for supergenes in Gadus morhua. To infer the ages of the supergenes on LGs 1, 2, 7, and 12, we applied SNAPP analyses to SNPs from each supergene separately, extracted from the dataset of 20,402,423 biallelic SNPs. For each of the four supergenes, we recovered a deep divergence separating the haplotypes with ancestral and derived arrangements; however, the age estimates for this divergence differed widely among the four supergenes, with mean age estimates of 0.61 Ma (95% HPD: 0.77-0.46 Ma) for the supergene on LG 1 (Fig. 4a), 0.88 Ma (95% HPD: 1.10-0.67 Ma) for the supergene on LG 2 (Fig. 4d), 1.66 Ma (95% HPD: 2.05-1.29 Ma) for the supergene on LG 7 (Fig. 4g), and 0.40 Ma (95% HPD: 0.50-0.30 Ma) for the supergene on LG 12 ( Fig. 4j; Supplementary Figure 7). The migratory individuals from Newfoundland, Iceland, Lofoten, and Møre shared the same arrangement on all four linkage groups, and so did the stationary individuals from Lofoten, Suffolk, and Kiel Bight. The genetic diversity of the haplotype with the derived arrangement was on average lower on LGs 1 and 12, but higher on LGs 2 and 7 (Figs. 4a,d,g,j, Supplementary Table 9).
Gene flux between haplotypes via gene conversion. Just like gene flow between populations can lead to underestimates of their divergence time, the ages of the separation between the two alternative arrangements per supergene could be underestimated if gene flux (defined as exchange of alleles during meiosis) has occurred after their divergence, through gene conversion or double crossover [74]. To assess the frequency of these two processes within the four supergene regions, we calculated the D-statistic for the sets of supergene-specific SNPs. The D-statistic supported gene flux (and thus the occurrence of either gene conversion or double crossover) between haplotypes with derived and ancestral arrangements particularly for the supergene on LG 1 and the geographically co-occurring migratory or stationary Newfoundland populations (D fix = D BBAA = 0.383, p < 10 −10 ; Fig. 4b, Supplementary Tables 12 and 13). To test whether gene conversion could be the cause of this gene flux occurring between the Newfoundland populations, we tested for the GC bias expected from gene conversion [24,25], comparing GC-content of sites shared between the two Newfoundland populations ("ABBA" sites) to that of sites shared between the migratory Newfoundland population and other migratory populations ("BBAA" sites). The mean GC-content of the former, 0.482, is indeed significantly higher than that of the latter, 0.472 (one-sided t-test; t = −4.74, df = 27833, p < 10 −5 ), supporting gene conversion as an agent of gene flux between haplotypes with derived and ancestral arrangements in the Newfoundland populations. For the supergenes on LGs 2, 7,  Tables 14-19).
Demographic analyses recover signatures of bottlenecks following inversions. It is likely that each of the inversions associated with the four supergenes in Gadus morhua originated just once and that all of the current carriers of the inversion descended from the individual in which it originated. The supergene origin should therefore have been equivalent to an extreme bottleneck event during which the population size was reduced to a single sequence, but which affected only the inversion region, and only the carriers of the derived arrangement. We might thus expect to see signatures of this extreme bottleneck event in analyses of population size over time, in the form of differences in ancestral population sizes between the haplotypes with derived and ancestral arrangements that date to the time of supergene origin.
To verify that signatures of extreme bottlenecks are detectable in descending genomes even after long periods of time, we first performed a series of power analyses based on coalescent simulations (Supplementary Note 2). After confirming that the program Relate is in principle able to pick up such signals, we tested for the presence of bottleneck signatures associated with inversion events by performing demographic analyses with Relate separately for sets of SNPs from each of the four supergene regions. As expected due to the smaller amount of input data (1-3% compared to the genome-wide SNP data), these analyses produced estimates that were less clear (Fig. 4c,f,i,l) than those obtained with genome-wide SNP data (Fig. 3d). Nevertheless, the supergene-specific demographic analyses supported a difference between the population sizes of haplotypes with derived and ancestral arrangements, with a temporary reduction of population sizes for the haplotype with the derived arrangement that coincided with the estimated ages of the supergenes on LGs 2 and 7 (Fig. 4f,i). For the supergenes on LGs 1 and 12, the difference in the population sizes did not coincide with the supergene ages. However, for the supergene on LG 1, the population size of the haplotype with the derived arrangement was reduced compared to the ancestral arrangement in several time intervals following the supergene origin (Fig. 4c). While our contig-mapping approach had not allowed us to infer which of the two haplotypes of the LG 12 supergene had the derived arrangement, the reduced inferred population size of the haplotype carried by the sampled individuals from the Suffolk, Kiel Bight, and stationary Lofoten populations, in a time interval following the supergene origin, suggested that this haplotype might have the derived arrangement on that linkage group. The ancestral population sizes estimated from the supergene regions on LGs 2 and 7 further highlighted that estimates of the current genetic diversity may not always be useful for the identification of derived arrangements (Fig. 4i), as the population-size estimates for the haplotype with the ancestral arrangement were in both cases higher than for the derived arrangement at the time of supergene origin, but lower for most of the subsequent time towards the present (Fig. 4i).
Divergence-time profiles reveal double crossovers. To explore whether divergence times between the two arrangements per supergene are homogeneous across the supergene region, we repeated divergence-time inference with SNAPP in sliding windows of 250 kbp along all linkage groups. We expected that if any gene flux between haplotypes with derived and ancestral arrange- LG 1 LG 2 LG 7 LG 12 LG 1 LG 2 LG 7 LG 12  [20,74,75]. Contrary to this expectation, the divergence-time profiles were relatively homogeneous from beginning to end, particularly for the supergenes on LGs 1, 7, and 12 (Fig. 5a,c,d; Supplementary Figure 8), suggesting either that double crossovers are rare within these supergenes, or that sequences exchanged through double crossovers are frequently purged from the recipient haplotypes. As the supergene on LG 1 is known to include not one but two adjacent inversions of roughly similar size [32], our results also suggested a similar age and possibly a joint origin for both of these inversions. The divergence-time profile for the supergene on LG 2 appeared consistent with the expectation of a U-shaped pattern; however, comparison of the gadMor2 assembly with the recently released gadMor3 assembly [42] showed that the end of this linkage group may be misassembled in gadMor2, and the region of  (Fig. 5d). To investigate the genotypes of the two sampled Bornholm Basin individuals within this region in more detail, we identified 219 haplotype-informative sites between positions 7-8 Mbp on LG 12, and found that these individuals were both heterozygous at these sites, for a region of ∼275 kbp between positions 7,478,537 bp and 7,752,994 bp (Fig. 6). The two individuals from the Bornholm Basin population thus carried a long sequence from the haplotype with the derived arrangement even though they were otherwise clearly associated with the ancestral arrangement.
As the length of this introduced sequence was far longer than the 50-1,000 bp expected to be copied per gene-conversion event [23,24], it strongly supports double crossover between the two haplotypes of the LG 12 supergene. The region covered by the introduced sequence contains 24 predicted genes (Supplementary Table 20), including a cluster of three vitellogenin genes, out of a total of four vitellogenin genes found in the gadMor2 genome. These genes are known to influence the buoyancy of fish eggs [76][77][78][79], and could thus be a target of selection in Gadus morhua populations in the brackish Baltic Sea [28].

Discussion
Through comparison of long-read-based genome assemblies for migratory and stationary Atlantic cod individuals, we corroborated earlier conclusions that chromosomal inversions underlie all of the four supergenes in Atlantic cod [42]. The inversion breakpoints do not coincide exactly with the boundaries of supergenes, but lie up to 45 kbp inside of them, in agreement with findings reported for Drosophila that suggested that recombination suppression can extend beyond inversion breakpoints [53]. By also comparing the genome assemblies for Atlantic cod with an assembly of the LOW1503Z06  LOW1504Z07  KIE1103Z20  KIE1102Z06  LOF1106Z11  BOR1205Z07  BOR1205Z03  AVE1409Z09  AVE1409Z10  AVE1403Z11  AVE1403Z10  BAT1309Z03  BAT1309Z10  ICC0304Z11  ICO0304Z23  ICO0304Z20  LOF1103Z11  ERR1551885  TWI1307Z07  TWI1307Z15  TWI1307Z12  TWI1307Z17 7.00M LG 12 8.00M 7.75M 7.50M 7.25M closely related haddock (Melanogrammus aeglefinus) [56], we were further able to identify the gad-Mor2 assembly -representing a migratory Atlantic cod individual -as the carrier of the derived arrangement of the supergenes on LG 1 and 7, but not of that on LG 2. In addition, our demographic analyses indicated that the gadMor2 assembly might also carry the ancestral arrangement on LG 12. The haplotypes with derived arrangements were not consistently characterised by lower genetic diversity than ancestral arrangements, contrary to assumptions that were used in previous studies to distinguish between them [29,30]. As suggested by our demographic analyses (Fig. 4i) and simulations (Supplementary Table 21), this contrast can be explained by an ability of haplotypes with derived arrangements to recover from the initial bottleneck. While this recovery may require substantial frequencies of the derived arrangement (comparable to those of the ancestral one) within the species and sufficient time since the inversion origin to allow the accumulation of new mutations [17], both of these requirements may be met for the Atlantic cod supergenes. Additionally, haplotypes with the derived arrangement can have higher genetic diversity than those with the ancestral arrangement if the latter were subject to a selective sweep after the divergence.
According to our time-calibrated phylogenomic analyses, the four supergenes in Atlantic cod originated between ∼0.40 and ∼1. 66 Ma. These dates could potentially be underestimates due to gene flux between the haplotypes with derived and ancestral arrangement that could not be accounted for in our age estimation. Nevertheless, we consider these age estimates as evidence that at least some of the four supergenes had separate origins, as the age estimate for the supergene on LG 7 is more than four times as old as that for the supergene on LG 12 and their confidence intervals are clearly non-overlapping. This conclusion is further supported by the homogeneity in sliding-window age estimates from beginning to end of each supergene and the support for demographic bottlenecks coinciding with the inferred age estimates (Figs. 4,5). On the other hand, a joint origin can not be excluded for the supergenes on linkage groups 1, 2, and 12. Our age estimates thus indicate that some, but not necessarily all of the four supergenes evolved separately, after the Atlantic cod's divergence from the walleye pollock (Gadus chalcogrammus), and before the divergence of all extant Atlantic cod populations.
Introgression from other codfish species does not seem to have played a role in the origin of the four supergenes. However, we found strong signals of introgression between Atlantic cod and Greenland cod (Gadus ogac), as Atlantic cod unambigously shares more alleles with Greenland cod than with the Greenland cod's sister species, the Pacific cod (Gadus macrocephalus), despite the very recent divergence between Greenland and Pacific cod (∼40 ka). This genetic similarity between Atlantic cod and Greenland cod does not appear to be an artifact resulting from reference bias and is best explained by gene flow into Greenland cod, from an Atlantic cod population cooccurring in the Northwest Atlantic (Supplementary Note 3). Besides the signals for introgression into Greenland cod, our results strongly support gene flux between the two haplotypes of the supergene on LG 1 (Fig. 4b): between all carriers of the ancestral arrangement and the migratory individuals from Newfoundland (which carry the derived arrangement) and between all carriers of the derived arrangement and the stationary individuals from Newfoundland (which carry the ancestral arrangement). Due to elevated GC-content of sites shared between the haplotypes, we interpret these signals of gene flux as evidence for gene conversion occurring at Newfoundland, but note that double crossovers could potentially also influence GC-content through crossover-associated gene conversion [81]. It remains unclear why gene flux through gene conversion does not seem to occur at other locations where the two arrangements co-occur (e.g. in northern Norway).
As a second mechanism allowing gene flux between haplotypes with derived and ancestral arrangements, our results demonstrated the occurrence of double crossover, between the two haplotypes of the LG 12 supergene. The sequence introduced through double crossover included the vitellogenin gene cluster [82], which is assumed to contribute to the proper hydration of fish eggs and thus to the maintenance of neutral buoyancy [76][77][78][79]. However, in contrast to the fully marine environments of the open North Atlantic, the almost land-locked Baltic Sea has a severely reduced salinity, and thus requires adaptations in the hydration of eggs, so that they remain neutrally buoyant at a salinity of ∼12.5 ppt [83] (compared to ∼35 ppt in the North Atlantic and ∼18 ppt at Kiel Bight [84]) and do not sink to anoxic layers [85][86][87]. Thus, the three vitellogenin genes may be under selection in Gadus morhua from the Baltic Sea [28], increasing the frequency of the introduced sequence within the Bornholm Basin population.
The presence of four long (4-17 Mbp) and old (0.40-1.66 Ma) inversion-based supergenes in Atlantic cod adds to recent findings of inversions of similar size and/or age in butterflies [12], ants [2], birds [4], lampreys [88], and Drosophila [75]. For non-model organisms, these findings are largely owed to improvements in sequencing technology within the last decade, including long-read sequencing and chromosome conformation capture techniques, and may become more common as these techniques are applied to an increasing number of species. These findings are, however, in contrast to earlier expectations based on theoretical work and empirical studies on selected model organisms: Only about 20 years ago, the available evidence indicated that inversions (and thus inversion-based supergenes) would be "generally not ancient" [20], with maximum ages on the order of N e generations (which would correspond to ∼100 ka in Atlantic cod), because they would either degenerate through the accumulation of mutation load, or erode if gene flux occurs through gene conversion and double crossover [20,74,89]. However, we observed neither mutation-load accumulation nor erosion of supergenes in Atlantic cod: Mutation load was not increased within the four supergenes compared to the genome-wide background. And supergene erosion -at least when resulting from double crossovers -would be expected to produce U-shaped divergence profiles [20,74,75], but no such profiles were found for the Atlantic cod supergenes (Fig. 5). These observations mirror those made recently by Yan et al. [2] for a supergene in fire ants, and we thus concur with their conclusion that "low levels of recombination and/or gene conversion may play an underappreciated role in preventing rapid degeneration of supergenes". But, since our results also indicated that selection may have acted on sequences exchanged between supergene haplotypes, we further suggest thatjust like in interbreeding species that maintain stable species boundaries despite frequent hybridisation [67] -selective purging of introduced sequences may also be important for the maintenance of supergenes, by maintaining the rate of gene flux between haplotypes at exactly the right balance, between too little of it and the consequential mutation-load accumulation, and too much of it and the resulting supergene erosion.

Methods
Construction of the gadMor Stat genome assembly. We performed high-coverage genome sequencing for a stationary Gadus morhua individual (LOF1106Z11) sampled at the Lofoten islands in northern Norway. The specimen used for this was selected based on a preliminary investigation that had suggested that it carried, homozygously on each of the four LGs 1, 2, 7, and 12, a supergene haplotype that was complementary to the one in the gadMor2 genome [31], which represents a migratory individual from northern Norway. We used the Pacific Biosciences RS II platform, operated by the Norwegian Sequencing Centre (NSC; www.sequencing.uio.no), to generate 2.4 million PacBio SMRT reads with a total volume of 12.5 Gbp. This is approximately equivalent to an 19× coverage of the Gadus morhua genome, the size of which has been estimated at 650 Mbp [31]. The PacBio SMRT reads were assembled with Celera Assembler v.8.3rc2 [90], adjusting the following settings according to the nature of the PacBio reads (all others were left at their defaults): merSize =16, merThreshold=0, merDistinct=0.9995, merTotal=0.995, ovlErrorRate=0.40, ovlMinLen=500, utgGraphErrorRate=0.300, utgGraphErrorLimit=32.5, utgMergeErrorRate=0.35, utgMergeError-Limit=40, utgBubblePopping=1, utgErrorRate=0.40, utgErrorLimit=25, cgwErrorRate=0.40, cns-ErrorRate=0. 40. The consensus sequence of the assembly was polished with Quiver v.0.9.0 [91] and refined with Illumina reads sequenced for the same individual (see below). A total volume of 6.2 Gbp of Illumina reads were mapped to the assembly with BWA MEM v.0.7.12-r1039 [92] and sorted and indexed with SAMtools v.1.10 [93,94]. Subsequently, Pilon v.1.16 [95] was applied to recall consensus and assembly completeness was assessed with BUSCO v.5.0 [51], using the Actinopterygii dataset of conserved gene sequences.
Delimiting high-LD regions associated with inversions. As chromosomal inversions locally suppress recombination between individuals carrying the inversion and those that do not, we used patterns of linkage disequilibrium (LD) to guide the delimitation of inversion regions for each of the four supergenes [28,32,42]. To maximize the signal of LD generated by the inversions, we selected 100 Gadus morhua individuals from a separate dataset (Supplementary Table 22) so that for each of the four supergenes, 50 individuals carried two copies of one of the two alternative supergene haplotypes, and the other 50 individuals carried two copies of the other. Variant calls of the 100 individuals were filtered with BCFtools, excluding all indels and multi-nucleotide polymorphisms and setting all genotypes with a Phred-scaled quality below 20, a read depth below 3, or a read depth above 80 to missing. Sites with more than 80% missing data or a minor allele count below 20 were then removed from the dataset with VCFtools v.0.1.14 [98]. Linkage among single-nucleotide polymorphisms (SNPs) spaced less than 250,000 bp from each other was calculated with PLINK v.1.90b3b [99]. The strength of short-to mid-range linkage for each SNP was then quantified as the sum of the distances (in bp) between that SNP and all other SNPs with which it was found to be linked with R 2 > 0.8. We found this measure to illustrate well the sharp decline of linkage at the boundaries of the four supergenes (Fig. 1).

Contig mapping. To confirm the presence of chromosomal inversions within the four supergenes on
LGs 1, 2, 7, and 12 of the Gadus morhua genome, we aligned contigs of the gadMor Stat assembly to the gadMor2 assembly by using BLASTN v.2.2.29 [100] searches with an e-value threshold of 10 −10 , a match reward of 1, and mismatch, gap opening, and gap extension penalties of 2, 2, and 1, respectively. Matches were plotted and visually analysed for contigs of the gadMor Stat assembly that either span the boundaries of the four supergene regions or map partially close to both boundaries of one such region. We considered the latter to support the presence of a chromosomal inversion if one of two parts of a contig mapped just inside of one boundary and the other part mapped just outside of the other boundary, and if the two parts had opposite orientation; in contrast, an observation of contigs clearly spanning one of the boundaries would reject the assumption of an inversion. To further assess which of the two Gadus morhua genomes (gadMor2 or gadMor Stat) carries the haplotype with the derived arrangement at each of the four regions, we also aligned contigs of the genome assembly for Melanogrammus aeglefinus (melAeg) [56] to the gadMor2 assembly.
Threeway whole-genome alignment. To identify the regions that are most reliably orthologous among the gadMor2, gadMor Stat, and melAeg assemblies, we generated whole-genome alignments using three different approaches. First, we visually inspected the plots of BLASTN matches (see above), determined the order and orientation of all gadMor Stat and melAeg contigs unambiguously mapping to the gadMor2 assembly, and then combined these contigs into a single FASTA file per species and gadMor2 linkage group. For each linkage group, pairwise alignments were then produced with the program MASA-CUDAlign v.3.9.1.1024 [101]. Second, we used the program LASTZ v.1.0.4 [102] to align both the gadMor Stat assembly and the melAeg assembly to the gad-Mor2 assembly, after masking repetitive regions in all three assemblies with RepeatMasker v.1.0.8 (http://www.repeatmasker.org). Third, Illumina sequencing reads of the individuals used for the three assemblies were mapped to the gadMor2 assembly with BWA MEM, followed by sorting and indexing with SAMtools and conversion of the result files to FASTA format. Finally, we generated a conservative threeway whole-genome alignment by comparing the three different types of alignments and setting all sites to missing at which one or more of the three alignment types differed. Alignment sites that opened gaps in the gadMor2 sequence were deleted so that the resulting strict consensus alignment retained the coordinate system of the gadMor2 assembly.
Based on the threeway whole-genome alignment, we calculated the sequence divergence between the gadMor Stat and gadMor2 assemblies, relative to the sequence divergence between the melAeg and gadMor2 assemblies, in sliding windows of 100,000 bp. Sequence divergence was calculated in pairwise sequence comparisons as uncorrected p-distances. We also used the threeway wholegenome alignment to generate a mask of unreliable alignment sites, including all sites that had been set to missing in the alignment.
Estimating divergence times of Gadinae. To estimate the ages of supergene origins in Gadus morhua based on the carefully calibrated timeline of Musilova et al. [103], we performed two nested phylogenetic analyses. The first one used constraints specified according to the results of Musilova et al. [103] to estimate the divergence times of species within the subfamily Gadinae. The second analysis was constrained according to the results of the first one to refine the divergence-time estimates among species of the genera Gadus, Arctogadus, and Boreogadus with a larger dataset and while co-estimating introgression (see below).
The phylogenomic dataset used for the first phylogenetic analysis comprised genome assemblies for eight Gadinae species published by Malmstrøm et al. [104], a genome assembly for the most closely related outgroup Brosme brosme [104], the gadMor2 assembly for Gadus morhua, and sets of unassembled Illumina reads for Gadus macrocephalus and Gadus ogac [50] (Supplementary Table  4). Aiming to identify sequences orthologous to 3,061 exon markers used in a recent phylogenomic analysis of teleost relationships by Roth et al. [105], we first performed targeted assembly of these markers from the sets of Illumina reads for Gadus macrocephalus and Gadus ogac. Targeted assembly was conducted with Kollector v.1.0.1 [106], using marker sequences of Gadus morhua from Roth et al. [105] as queries. From the set of whole-genome and targeted assemblies, candidate orthologs to the 3,061 exon markers used by Roth et al. [105] were then identified through TBLASTN searches, using sequences of Danio rerio as queries as in the earlier study. The identified sequences were aligned with MAFFT and filtered to exclude potentially remaining paralogous sequences and misaligned regions: We removed all sequences with TBLASTN bitscore values below 0.9× the highest bitscore value and all sequences that had dN/dS values greater than 0.3 in comparison to the Danio rerio queries, we removed codons from the alignment for which BMGE v.1.1 [107] determined a gap rate greater than 0.2 or an entropy-like score greater than 0.5, and we excluded exon alignments with a length shorter than 150 bp, more than two missing sequences, or a GC-content standard deviation greater than 0.04. We then grouped exon alignments by gene and excluded all genes that 1) were represented by less than three exons, 2) had one or more completely missing sequences, 3) were supported by a mean RAxML v.8.2.4 [108] bootstrap value lower than 0.65, 4) were located within the four supergene regions, 5) exhibited significant exon tree discordance according to an analysis with Concaterpillar v.1.7.2 [109], or 6) had a gene tree with non-clock-like evolution (mean estimate for coefficient of variation greater than 0.5 or 95% highest-posterior-density interval including 1.0) according to a relaxed-clock analysis with BEAST 2 [59]. Finally, concatenated exon alignments per gene were inspected by eye, and six genes were removed due to remaining possible misalignment. The filtered dataset included alignments for 91 genes with a total alignment length of 106,566 bp and a completeness of 92.8%.
We inferred the species tree of Gadinae with StarBEAST2 [59,110] under the multi-species coalescent model, assuming a strict clock, constant population sizes, and the birth-death tree model [111], and averaging over substitution models with the bModelTest package [112] for BEAST 2. For time calibration, we placed lognormal prior distributions on the age of the divergence of the outgroup Brosme brosme from Gadinae (mean in real space: 32.325; standard deviation: 0.10) and on the crown age of Gadinae (mean in real space: 18.1358; standard deviation: 0.28); in both cases, the distribution parameters were chosen to approximate the age estimates for these two divergence events obtained by Musilova et al. [103]. We performed five replicate StarBEAST2 analyses, each with a length of one billion Markov-chain Monte Carlo (MCMC) iterations. After merging replicate posterior distributions, effective sample sizes (ESS) for all model parameters were greater than 1,000, indicating full stationarity and convergence of MCMC chains. We then used TreeAnnotator from the BEAST 2 package to summarize the posterior tree distribution in the form of a maximum-clade-credibility (MCC) consensus tree with Bayesian posterior probabilities as node support [113].
Estimating divergence times and introgression among species of the genera Gadus, Arctogadus, and Boreogadus. To further investigate divergence times and introgression among species of the closely related genera Gadus, Arctogadus, and Boreogadus, we used a second phylogenomic dataset based on read mapping to the gadMor2 assembly. This dataset included Illumina read data for all four species of the genus Gadus (G. morhua, G. chalcogrammus, G. macrocephalus, and G. ogac) [50,104], Arctogadus glacialis [104], and Boreogadus saida [104], as well as Merlangius merlangius, Melanogrammus aeglefinus, and Pollachius virens [56,104], which we here considered outgroups (Supplementary Table 5). Read data from a stationary and a migratory indivual (both sampled at the Lofoten islands) were used to represent Gadus morhua. Mapping, read sorting, and indexing were again performed with BWA MEM and SAMtools, and variant calling was again performed with GATK's HaplotypeCaller and GenotypeGVCFs tools as described above except that we now also exported invariant sites to the output file. To limit the dataset to the most reliably mapping genomic regions, we applied the mask of unreliable sites generated from the threeway whole-genome alignment (see above), resulting in set of 19,035,318 SNPs. We then extracted alignments from GATK's output files for each non-overlapping window of 5,000 bp for which no more than 4,000 sites were masked, setting all genotypes with a Phred-scaled likelihood below 20 to missing. Alignments were not extracted from the four supergene regions and those windows with less than 100 variable sites were ignored. As we did not model recombination within alignments in our phylogenomic inference, the most suitable alignments for the inference were those with weak signals of recombination. Therefore, we calculated the number of hemiplasies per alignment by comparing the number of variable sites with the parsimony score, estimated with PAUP* [114], and excluded all alignments that had more than ten hemiplasies. Finally, we again removed all alignment sites for which BMGE determined a gap rate greater than 0.2 or an entropy-like score greater than 0.5. The resulting filtered dataset was composed of 109 alignments with a total length of 383,727 bp and a completeness of 91.0%.
We estimated the species tree and introgression among Gadus, Arctogadus, and Boreogadus under the isolation-with-migration model implemented in the AIM package [58] for BEAST 2. The inference assumed a strict clock, constant population sizes, the pure-birth tree model [115], and the HKY [116] substitution model with gamma-distributed rate variation among sites [117]. We timecalibrated the species tree with a single lognormal prior distribution on the divergence of Pollachius virens from all other taxa of the dataset (mean in real space: 8.56; standard deviation: 0.08), constraining the age of this divergence event according to the results of the analysis of divergence times of Gadinae (see above; Supplementary Figure 10). We performed ten replicate analyses that each had a length of five billion MCMC iterations, resulting in ESS values greater than 400 for all model parameters. The posterior tree distribution was subdivided according to tree topology and inferred gene flow, and we produced separate MCC consensus trees for each of the tree subsets.
To further test for introgression among Gadus, Arctogadus, and Boreogadus, we calculated Patterson's D-statistic from the masked dataset for all possible species trios (with Pollachius virens fixed as outgroup) using the "Dtrios" function of Dsuite v.0.1.r3 [63]. For the calculation of the D-statistic, species trios were sorted in two ways; with a topology fixed according to the species tree inferred under the isolation-with-migration model (D fix ), and so that the number of "BBAA" patterns was greater than those of "ABBA" and "BABA" patterns (D BBAA ). The significance of the statistic was assessed through block-jackknifing with 20 blocks of equal size. For the trios with the most significant signals of introgression, we further used the "Dinvestigate" function of Dsuite to calculate the f dM -statistic [60] within sliding windows of 50 SNPs, overlapping by 25 SNPs.
To corroborate the introgression patterns inferred with Dsuite, we performed two analyses based on comparisons of the frequencies of trio topologies in maximum-likelihood phylogenies. Alignments for these analyses were selected as for the species-tree inference under the isolation-with-migration model, except that up to 20 hemiplasies were allowed per alignment. The resulting set of 851 alignments had a total length of 3,052,697 bp and a completeness of 91.0%. From each of these alignments, a maximum-likelihood phylogeny was inferred with IQ-TREE v.1.6.8 [118] with a substitution model selected through IQ-TREE's standard model selection. Branches with a length below 0.001 were collapsed into polytomies. Based on the inferred maximum-likelihood trees, we calculated, for all possible species trios, the D tree -statistic of Ronco et al. [65], a tree-based equivalent to Patterson's D-statistic in which the frequencies of pairs of sister taxa are counted in a set of trees instead of the frequencies of shared sites in a genome (a related measure was proposed by Huson et al. [119]): where for a given trio, f 2nd is the frequency of the second-most frequent pair of sisters and f 3rd is the frequency of the third-most frequent (thus, the least frequent) pair of sisters. As a second tree-based analysis of introgression, we applied genealogy interrogation [66], comparing the likelihoods of trees with alternative topological constraints for the same alignment, as in Barth et al. [67]. We tested two hypotheses of introgression with this method: 1) Introgression between Arctogadus glacialis and either Boreogadus saida or the group of the four species of the genus Gadus; and 2) introgression between Gadus ogac and the sister species Gadus chalcogrammus and Gadus morhua.
Estimating divergence times, demography, and gene flow among Gadus morhua populations. To investigate divergence times among Gadus morhua populations, we applied phylogenetic analyses to the dataset based on whole-genome sequencing and variant calling for 24 Gadus morhua individuals (Supplementary Table 8). This dataset included, now considered as outgroups, the same representatives of Gadus chalcogrammus, G. macrocephalus, G. ogac, Arctogadus glacialis, and Boreogadus saida as our analyses of divergence times and introgression among Gadus, Arctogadus, and Boreogadus (see above). "Migratory" and "stationary" Gadus morhua individuals from Newfoundland, Iceland, Lofoten, and Møre were used as separate groups in these analyses. Subsequent to mapping with BWA MEM and variant calling with GATK's HaplotypeCaller and GenotypeGVCFs tools, we filtered the called variants with BCFtools to include only sites for which the Phred-scaled p value for Fisher's exact test was smaller than 20, the quality score normalised by read depth was greater than 2, the root-mean-square mapping quality was greater than 20, the overall read depth across all individuals was between the 10 and 90% quantiles, and the inbreeding coefficient was greater than -0.5. We further excluded sites if their Mann-Whitney-Wilcoxon ranksum test statistic was smaller than -0.5 either for site position bias within reads or for mapping quality bias between reference and alternative alleles. After normalizing indels with BCFtools, SNPs in proximity to indels were discarded with a filter that took into account the length of the indel: SNPs were removed within 10 bp of indels that were 5 bp or longer, but only within 5, 3, or 2 bp if the indel was 3-4, 2, or 1 bp long, respectively. After applying this filter, all indels were removed from the dataset. For the remaining SNPs, genotypes with a read depth below 4 or a genotype quality below 20 were set to missing. Finally, we excluded all sites that were no longer variable or had more than two different alleles; the filtered dataset then contained 20,402,423 biallelic SNPs.
We inferred the divergence times among Gadus morhua populations from the SNP data under the multi-species coalescent model with the SNAPP add-on package for BEAST 2. Due to the high computational demand of SNAPP, we performed this analysis only with a further reduced set of 1,000 SNPs, randomly selected from all biallelic SNPs that were without missing genotypes and located outside of the supergene regions. The input files for SNAPP were prepared with the script snapp prep.rb [70], implementing a strict-clock model and a pure-birth tree model. The tree of Gadus morhua populations and outgroup species was time-calibrated with a single lognormal prior distribution (mean in real space: 3.83; standard deviation: 0.093) that constrained the root age of the tree according to the results of the analysis of divergence times and introgression among Gadus, Arctogadus, and Boreogadus (see above; Fig. 2b, Supplementary Figure 3). We performed three replicate SNAPP analyses, each with a length of 400,000 MCMC iterations, resulting in ESS values that were all greater than 400. The posterior tree distribution was again summarised as a MCC consensus tree.
Gene flow among Gadus morhua populations and outgroup species was investigated with Dsuite from all biallelic SNPs that were without missing genotypes and located outside of the four supergene regions; there were 408,574 of these. The gene flow analyses were performed with Dsuite's "Dtrios" function as described above.
Population sizes over time were estimated for all sampled Gadus morhua populations with Relate v.1.1.2 [73]. To maximize the number of suitable SNPs for this analysis, we excluded all outgroups except the sister species, Gadus chalcogrammus, and repeated variant calling and filtering with the same settings as before. After applying a mask to exclude all variants from repetitive regions in the gadMor2 assembly (784,488 bp in total) [31], 10,872,496 biallelic SNPs remained and were phased with BEAGLE v.5.1 [120], setting the population size assumed by BEAGLE to 10,000. We excluded all sites that were heterozygous in the Gadus chalcogrammus individual and then reconstructed an "ancestral" genome sequence from the gadMor2 assembly and the called variants for G. chalcogrammus. Following this reconstruction, we removed G. chalcogrammus from the set of SNPs and excluded all sites that had become monomorphic after this removal, leaving 7,101,144 SNPs that were biallelic among the sampled Gadus morhua individuals. In addition to the "ancestral" genome sequence and the set of biallelic SNPs, we prepared a mask for the Relate analysis, covering all sites that were also included in the mask for repetitive regions, all sites that would have been excluded from variant calling due to proximity to indels (see above), and all sites that were ignored in the reconstruction of the "ancestral" sequence due to heterozygous genotype calls for the G. chalcogrammus individual.
As Relate further requires an estimate of the mutation rate, we calculated this rate for the filtered set of SNPs as the mean number of substitutions between Gadus morhua individuals from the Northwest Atlantic (thus, from the populations Newfoundland and Labrador) and those from the Northeast Atlantic (thus, from all other populations), divided by 2 times the expected coalescence time between the two groups and the genome size. We excluded the four linkage groups carrying supergenes from this calculation. The expected coalescence time was calculated as the divergence time between the two groups, which was estimated in the analysis with SNAPP as 65,400 years ( Fig. 3), plus the expected time to coalescence within the common ancestor, which is the product of the generation time and the diploid population size under the assumption of a panmictic ancestral population. With an assumed generation time of 10 years [121] and a population size of 57,400, as estimated in the SNAPP analysis, the expected time to coalescence within the common ancestor is 574,000 years, and the total expected coalescence time was thus set to 65, 400 + 574, 000 = 639, 400 years. As the mean number of substitutions between the individuals of the two groups was 878,704.31 and the size of the gadMor2 assembly without LGs 1, 2, 7, and 12, and excluding masked sites, is 419,183,531 bp, the calculated mutation rate was µ = 878, 704.31/(2 × 639, 400 × 419, 183, 531) = 1.64 × 10 −9 per bp and per year, or 1.64 × 10 −8 per bp and per generation. Because the number of substitutions was calculated from the filtered set of SNPs, this rate is likely to underestimate the true mutation rate of Gadus morhua; however, because the same filtered set of SNPs was used as input for Relate, this rate is applicable in our inference of population sizes over time. The input file was converted from variant call format to haplotype format using RelateFileFormats with the flag "-mode ConvertFromVcf". The script PrepareInputFiles.sh was used to flip genotypes according to the reconstructed "ancestral" genome sequence and to adjust distances between SNPs using the mask prepared for this analysis. Relate was first run to infer genome-wide genealogies and mutations assuming the above calculated mutation rate of 1.64 × 10 −8 per bp and per generation and a diploid effective population size of 50,000. This was followed by an estimation of population-size changes over time by running the script EstimatePopulationSize.sh for five iterations, applying the same mutation rate and setting the threshold to remove uninformative trees to 0.5. The tools and scripts RelateFileFormats, PrepareInputFiles.sh, and EstimatePopulationSize.sh are all distributed with Relate.
Estimating divergence times, demography, and gene flow specific to supergenes. The analyses of divergence times, demography, and gene flow among Gadus morhua populations were repeated separately with SNPs from each of the four supergene regions on LGs 1, 2, 7, and 12. While the SNAPP analyses for these regions were again performed with reduced subsets of 1,000 SNPs per region, the data subsets used in analyses of gene flow with Dsuite comprised 11,474, 3,123, 10,4121, and 10,339 biallelic SNPs, and those used in the analyses of demography with Relate comprised 211,057, 71,046, 130,918, and 130,620 biallelic SNPs, respectively. The mutation rate used as input for these Relate analyses was identical to the one used for the analysis with genome-wide SNPs.
Estimating population divergence times across the genome. In addition to the genome-wide and supergene-specific SNAPP analyses that used biallelic SNPs from the entire genome or the entire length of supergene regions, we also performed sliding-window SNAPP analyses across all linkage groups to quantify differences in population divergence times across the genome. Our motivation for these analyses was primarily to assess whether or not divergence times were homogeneous over the lengths of supergenes, as differences in these divergence times within supergenes could be informative both about the presence of separate inversion within these regions and about their erosion processes. Additionally, we expected that these analyses could reveal further putative inversions elsewhere in the genome if they should exist.
From the set of 20,402,423 biallelic SNPs, we extracted subsets of SNPs for each non-overlapping window of a length of 250,000 bp, with a minimum distance between SNPs of 50 bp. We discarded windows with less than 500 remaining biallelic SNPs and used a maximum of 1,000 biallelic SNPs per window; these were selected at random if more biallelic SNPs were available per window. Input files for SNAPP were then prepared as for the genome-wide and supergene-specific SNAPP analyses. Per window, we performed two replicate SNAPP analyses with an initial length of 100,000 MCMC iterations, and these analyses were resumed up to a maximum of 500,000 MCMC iterations as long as the lowest ESS value was below 100. Windows with less than 300 sufficiently complete SNPs for SNAPP analyses, with an ESS value below 100 after the maximum number of MCMC iterations, or with a mean BPP node support value below 0.5 were discarded after the analysis. Per remaining window, posterior tree distributions from the two replicate analyses were combined and summarised in the form of MCC consensus trees. Additionally, a random sample of 100 trees was drawn from each combined posterior distribution.
Instead of showing all resulting trees, we developed a type of plot that shows, without loss of phylogenetic information, the divergence times stacked upon each other on a single axis, which allowed us to illustrate these divergence times efficiently across linkage groups. For this plot, all trees were first ladderised, outgroups were pruned, and the divergence times between each pair of populations adjacent to each other on the ladderised trees were extracted. Per window, the order of populations on the ladderised tree, together with the extracted divergence times between them, was used to define the positions of points on the vertical axis of the plot, so that each point represents a population and their vertical distances indicate the divergence times between populations that are next to each other on the ladderised tree. The positions of window on the linkage group were used to place these dots on the horizontal axis of the plot, and all dots representing the same population were connected by lines to produce the complete plot of divergence times across linkage groups.

Data availability
The gadMor Stat assembly (ENA accession GCA 905250895) and read data for all Gadus morhua specimens listed in Supplementary Table 8 are deposited on ENA with project number PRJEB43149. Alignment files, SNP datasets in PED and VCF format, and input and output of phylogenetic analyses are available from Zenodo (doi: 10.5281/zenodo.4560275).