Mutation accumulation in chromosomal inversions maintains wing pattern polymorphism in a butterfly

Paul Jay; Mathieu Chouteau; Annabel Whibley; Héloïse Bastide; Violaine Llaurens; Hugues Parrinello; Mathieu Joron

doi:10.1101/736504

Abstract

While natural selection favours the fittest genotype, polymorphisms are maintained over evolutionary timescales in numerous species. Why these long-lived polymorphisms are often associated with chromosomal rearrangements remains obscure. Combining genome assemblies, population genomic analyses, and fitness assays, we studied the factors maintaining multiple mimetic morphs in the butterfly Heliconius numata. We show that the polymorphism is maintained because three chromosomal inversions controlling wing patterns express a recessive mutational load, which prevents their fixation despite their ecological advantage. Since inversions suppress recombination and hamper genetic purging, their formation fostered the capture and accumulation of deleterious variants. This suggests that many complex polymorphisms, instead of representing adaptations to the existence of alternative ecological optima, could be maintained primarily because chromosomal rearrangements are prone to carrying recessive harmful mutations.

Polymorphic complex traits, which implicate the coordination of multiple elements of phenotype, are often controlled by special genetic architectures involving chromosomal rearrangements. Examples include dimorphic social organization in several ant species (1), coloration and behavioral polymorphisms in many birds and butterflies (2–6), dimorphic flower morphology in plants (7), as well as the extreme cases provided by sexual dimorphism encoded by the extensively re-arranged sex chromosomes. Why these polymorphisms arise is a long-standing question in biology (8–12).

The so-called supergenes controlling these striking polymorphisms are characterized by the suppression of recombination between linked loci, often through polymorphic chromosomal rearrangements which are thought to preserve alternative combinations of co-adapted alleles (1, 4, 5, 7, 12). The encoded phenotypes are often assumed to reflect the existence of multiple, distinct adaptive optima, and are frequently associated with antagonistic ecological factors such as differential survival or mating success (3, 13–15). Yet why and how alternative chromosomal forms become associated with complex life-history variation and ecological trade-offs is not understood.

The Amazonian butterfly Heliconius numata displays wing pattern polymorphism with up to seven morphs coexisting within a single locality, each one engaged in warning color mimicry with distinct groups of toxic species. Adult morphs vary in mimicry protection against predators and in mating success via disassortative mate preferences (13, 16). Polymorphic inversions at the mimicry locus on chromosome 15 (supergene P) form three distinct haplotypes (5). The standard, ancestral haplotype constitutes the class of recessive P alleles and is found, for example, in the widespread morph silvana. Two classes of derived haplotypes are known, both associated with a chromosomal inversion called P₁ (∼400kb, 21 genes), each conferring increased protection against predatory attacks via mimicry. The first derived haplotype, encoding the morph bicoloratus, carries P₁ alone; the second class of derived haplotypes carries P₁ linked with additional yet still uncharacterized rearrangements (called BP2 in (5)) and occurs in morphs which typically exhibit intermediate levels of dominance, such as tarapotensis and arcuella. Inversion polymorphism and supergene formation originated via the introgression of P₁ from the H. pardalinus lineage (17). The series of chromosomal rearrangements initiated by introgression allows us to unravel the stepwise process by which structural variation has become associated with directional and balancing selection.

Comparative analysis of de novo genome assemblies of 12 H. numata individuals revealed a history of supergene formation characterized by the sequential accretion of three adjacent inversions with breakpoint reuse. Pairwise alignment of assemblies shows that all derived haplotypes belonging to the intermediate dominant allelic class display two newly-described inversions: P₂ (200kb, 15 genes), adjacent to P₁, and the longer P₃ (1150 kb, 71 genes), adjacent to P₂ (Fig 1A, Sup. Fig. S1, Sup. Fig. S2). Sliding-window PCA along the supergene region confirmed the dominance of derived arrangements (denoted Hn1 and Hn123) over the ancestral arrangement (denoted Hn0) and their prevalence across all populations of the Amazon (Fig 1B, Fig 1C, Sup. Fig. S3, Sup. Fig. S4). Multiple genes in the inverted regions showed significant differential expression compared to ancestral segments, but this likely reflects divergence rather than direct breakpoint effects (Sup. Fig. S5). Indeed, none of the breakpoints of P₁, P₂ or P₃ fell within a gene, and no transcript found in Hn0 specimens was missing, disrupted, or differentially spliced in specimens with inversions (Hn1 and Hn123).

Fig. S1. Alignment of genome assemblies of H. numata silvana (Hn0, genome 38) and the H. melpomene reference genome (Hmel2.5) focused on the region of the supergene on chromosome 15

No major chromosomal rearrangements are observed between Hn0 and Heliconius melpomene on chromosome 15.

Fig. S2. Alignment of the supergene region of genome 38 (H. n. silvana) against other H. numata genome assemblies.

Fig. S3. Sliding window PCA computed along the supergene for all specimens.

Computed on 5kb sliding windows. Each line represents the position of a specimen on the first axis of the PCA along chromosome 15. See Sup. Fig. S4 for summary PCAs, computed not on sliding windows but on whole regions.

Fig. S4. PCA computed on SNPs, on the inversion segments and outside the supergene, for all specimens.

Each dot represents the position of a specimen on the first two PCA axes. A. PCA computed on SNPs on chromosome 15 but not within the supergene region. The PCA reflects the geographic structure of the dataset. B. PCA computed on SNPs on the P₁ segment. The first axis of the PCA reflects individual genotypes for the inversion: homozygote for the ancestral gene order (P0/P0), homozygote for the inversion (P1/P1), or heterozygote (P0/P1). The second axis of the PCA reflects the geographic structure of the dataset. C. PCA computed on SNPs on the P₂+P₃ segment. The first axis of the PCA reflects individual genotypes for the two inversions: homozygote for the ancestral gene order (P0/P0), homozygote for the two inversions (P23/P23), or heterozygote (P0/P23); the second axis of the PCA reflects the geographic structure of the dataset.

Fig. S5. Differential gene expression across the chromosome 15

Expression difference in early pupal (24h) wing discs between Hn0 and Hn1/Hn123. RNAseq data from (1) were reanalysed using the EdgeR R package (2). The -log10 of the false discovery rate is plotted along chromosome 15, with each dot representing a different transcript, and reveals that genes within the inversion segments are differentially expressed between Hn0 and Hn1-Hn123.

Fig. 1. Genomic architecture of the H. numata wing pattern polymorphism

A. Alignment of the genome assemblies from 4 H. numata morphs across the supergene region on chromosome 15. B. Sliding window Principal Component Analysis (PCA) computed along the supergene (non-overlapping 5kb windows). For clarity, only a subset of morphs is shown here (full dataset presented in Sup. Fig. S3). Each colored line represents the variation in the position of a specimen on the first PCA axis along chromosome 15. Within the inversions, individual genomes are characterized by one of three genotypes: homozygous for the inversion (down), heterozygous (middle), homozygous for the standard arrangement (top). The gene annotation track is shown under the plot, with the forward strand in the lower panel and the reverse strand in the upper panel. Each gene is represented by a different colour. C. Structure of the H. numata supergene P. Three chromosome types are found in H. numata populations, carrying the ancestral gene order (Hn0), inversion P₁ (Hn1), or inversions P₁, P₂ and P₃ (Hn123). D. Analysis of divergence times between Hn123 and Hn0 at inversion segments. The TMRCA between Hn123 and Hn0 and the most ancient common ancestor of Hn123 provide respectively the upper and lower bounds of the inversion formation times. Boxplots display the distribution of estimated times computed on 5kb sliding windows across the supergene (estimates plotted along the supergene presented in Sup. Fig. S7). Time intervals are consistent with the stepwise accretion of P₁, P₂ and P₃, but the simultaneous origin of P₂ and P₃ cannot be formally rejected.

In contrast to the introgressive origin of P₁ (Sup. Fig. S6, (17)), inversions P₂ and P₃ are younger and originated within the H. numata lineage. Upper and lower estimates of inversion ages, obtained by determining the most recent coalescence events between Hn0+Hn1 and Hn123, and within Hn123, respectively, suggest that the P supergene has evolved in three steps, involving the introgression of P₁ followed by the successive occurrence of P₂ and P₃ between ca. 1.8 and 3.0 Mya (Fig. 1D, Sup. Fig. S7). Haplotypes show sizeable peaks of differentiation (Fst) across inversion blocks (Sup. Fig. S8), reflecting their distinct histories of recombination suppression and confirming the stepwise accretion of these inversions. The three adjacent inversions underlying the mimicry polymorphism of H. numata are therefore of distinct ages and originated in distinct lineages, which provides an opportunity to partition their mutational history and distinguish the consequences of their formation from those resulting from their maintenance in a polymorphism.

Fig. S6. Phylogenies of the silvaniform clade with H. cydno and H. melpomene as outgroups, using the genomic segments orthologous to P₁, P₂ and P₃ in H. numata.

Phylogenies computed with RAxML (3) using the GTRCAT model and only individuals homozygous for the inversions or the standard arrangement. A. Phylogeny of segments orthologous to P₂ and P₃. This shows the unique origin of the P₂ and P₃ inversions within H. numata. B. Phylogeny of segments orthologous to P₁. This shows the introgression of P₁ from H. pardalinus into H. numata. The incongruent positions of H. elevatus, H. hecale and H. ethilla result from incomplete lineage sorting at the clade level around the gene cortex and from gene flow among species of the clade (especially an introgression between H. elevatus and H. melpomene) (4).

Fig. S7. Analysis of divergence times between Hn123 and Hn0 along the chromosome 15.

Divergence time estimates computed with Phylobayes on 5kb sliding windows. Bold red and blue lines represent the LOESS smoothing (span = 0.05) of the raw data (thin lines) and give the upper and lower bounds of the times at which inversions P₂ and P₃ occurred. This supports the formation of the P supergene by the stepwise accretion of P₁, P₂ and P₃.

Fig. S8. Fst analysis between the three main supergene alleles: without inversion (Hn0), with the P₁ inversion (Hn1) and with all three inversions P₁, P₂ and P₃ (Hn123)

A “suspension bridge” pattern of differentiation can be observed at P₂-P₃ by comparing Hn123 to Hn0 and Hn1 haplotypes, suggesting the rare occurrence of recombination around the center of the inversion, as predicted by (5). A peak of differentiation can be seen between Hn1 and Hn123 around the gene cortex, which controls melanic variations of the wing pattern in Heliconius butterflies (6). This peak was unexpected since these two classes of haplotypes have the same genomic orientation (P₁ inversion) in this region. Moreover, this region also shows the highest differential gene expression when comparing Hn1 to Hn123 (Sup. Fig. S5). Analyses of assemblies as well as of read coverage (data not shown) do not support the presence of major rearrangements between Hn1 and Hn123 at this position, suggesting that this peak of differentiation on cortex is caused by selection on wing pattern divergence rather than recombination suppression via structural variation.

Since chromosomal regions carrying inversions rarely form chiasmata during meiosis, recombination is strongly reduced among haplotypes with opposite orientations (18). Recombination suppression between structural alleles is predicted to lead to inefficient purging of deleterious variants and therefore to the accumulation of deleterious mutations and transposable elements (TEs) (19). Consistent with this prediction, estimation of the TE dynamics obtained by computing whole genome TE divergence supports a recent burst of TE insertion within the inversions, notably involving TEs belonging to the RC, DNA and LINE classes (Fig. 2A, Fig. 2B). Inverted haplotypes show a significant size increase (mean=+9.47%) compared to their corresponding non-inverted region in Hn0 (Fig. 2C) and this expansion was caused primarily (71.8%, Fig. 2A) by recent TE insertions from these classes (Fig. 2B, Sup. Fig. S9).

Fig. S9. Proportion of TE classes in whole genome, inversions, and insertions in inversions

Fig. 2. Variation in inversion size due to accumulation of transposable elements.

A. Proportion of transposable elements in the whole genome, in the 3 inversions, and in the region present uniquely in inversion P₁, P₂ or P₃ and not in ancestral non-inverted haplotype -i.e. sequences that were inserted in P₁, P₂, or P₃. Insertions in inversions are mostly transposable elements. B. Timing of insertion, in units of nucleotide divergence, for the distinct classes of transposable elements found in inversions or only in sequences that were inserted in P₁, P₂, or P₃. Recently active TEs (RC, DNA and LINE) are those that have accumulated within inversions. C. Size comparisons of orthologous standard and inverted chromosomal segments. Inverted haplotypes are longer than haplotypes with the ancestral gene order.

To investigate the impact of polymorphic inversions on the accumulation of deleterious mutations, we calculated, independently on inverted and non-inverted segments, the ratio of non-synonymous to synonymous polymorphism (pN/pS), the ratio of non-synonymous to synonymous substitution (dN/dS) and the direction of selection (DoS, (20)). Consistent with a low efficiency of selection in eliminating deleterious variants, P₁, P₂, and P₃ were all found to be enriched in non-synonymous relative to synonymous polymorphisms compared to the whole genome and to non-inverted ancestral segments (pN/pS_P1=0.83, pN/pS_P2=0.54, pN/pS_P3=0.49, Fig. 3A, Sup. Tab. S12). The inversions were also found to be under negative selection (DoS_P1=−0.136, DoS_P2=−0.087, DoS_P3=−0.079), with values reflecting their sequential origin (Fig. 3A, Sup. Tab. S12). Because P₁ was introgressed from the H. pardalinus lineage (Sup. Fig. S6, (17)), mutations that accumulated in P₁ before the introgression (i.e. shared with H. pardalinus) could be distinguished from those arising after supergene formation in H. numata (i.e. unique to Hn1 and Hn123). This revealed that non-synonymous mutations which existed in the P₁ segment before the introgression underwent a high rate of fixation in H. pardalinus (dN/dS=0.78, Sup. Fig. S10), and in H. numata (dN/dS=1.33, Fig 3B), suggesting that both the formation of P₁ and its introgression led to the fixation of deleterious mutations. By contrast, 99.9 % of the mutations that accumulated in coding regions of P₁ after its introgression (i.e. after supergene formation) remain polymorphic in Hn1/Hn123 and a high proportion of them are non-synonymous (dN/dS=0.00, pN/pS=0.978, DoS=−0.49, Fig 3B, Sup. Tab. S12). Taken together, these results suggest that the inversions have captured and accumulated deleterious mutations during their evolution, presumably owing to bottlenecks generated by their formation and to recombination suppression with their ancestral, coexisting counterparts.

Fig. S10. Mutation accumulation analysis on H. pardalinus.

Density curve representing the whole genome distribution computed on 500kb windows across 12 H. pardalinus specimens. P₁ shows an increase in non-synonymous polymorphisms and substitutions compared to the whole genome.

Fig. S11. Experimental crosses designed to assess the survival of the larvae of the distinct genotypes at the supergene P.

Fig. S12. Accumulation of deleterious variants in inversions.

dN/dS, pN/pS and Direction of Selection computed on the whole genome or only on segments P₁, P₂, or P₃. Only samples homozygous for the ancestral or the inverted gene order are used for the analysis. Hn0 display the ancestral gene order at P₁, P₂, and P₃. Hn1 are inverted at P₁ and non-inverted at P₂ and P₃. Hn123 are inverted at P₁, P₂, and P₃. Because P₁ was introgressed from H. pardalinus (Hpa), we were able to estimate parameters on mutations that are unique to Hn1-Hn123, which occurred after the inversion formation, and on mutations that are common to Hn1-Hn123 and Hpa, which occurred before the introgression. Inverted segments consistently show a more negative direction of selection compared to non-inverted segments and a higher pN/pS ratio, suggesting a lower efficiency of selection to purge deleterious variants in inversions. In contrast, dN/dS ratios are slightly lower in inverted compared to non-inverted segments. The P₁ segments help to understand this pattern. Non-synonymous SNPs that occurred in the coding regions of P₁ in Hpa before the introgression (“Hn1-Hn123 common Hpa”) underwent a very high rate of fixation in Hn1-Hn123 (dN/dS=1.33), but none of the SNPs that occurred in Hn1-Hn123 after the introgression is fixed (dN/dS=0.000). This suggests that the intermediate dN/dS values observed at inversions may result from the balance between the very high rate of fixation during inversion formation (and introgression) and the reduction of fixation rate during their subsequent evolution, likely because of recombination suppression.

Fig. S13. Summary of genome assemblies quality

Fig. 3. Accumulation of deleterious variants in inversions

A. Direction of selection and ratio of non-synonymous to synonymous polymorphisms (pN/pS), computed on 500 kb windows genome-wide and in the inversion segments, for both inverted and non-inverted haplotypes. Only genes with coding sequences >5kb (n=6364) were retained in this analysis. Inversions tend to be under negative selection and to accumulate non-synonymous polymorphism. B. Ratios of non-synonymous to synonymous substitutions (dN/dS) and polymorphisms (pN/pS) on the different mutation partitions observed in the P₁ segment: all mutations observed in Hn0 (purple), all mutations observed in Hn1/Hn123 (red), all mutations shared by H. pardalinus and Hn1/Hn123 and not observed in Hn0 (blue) and all mutations present uniquely in Hn1/Hn123 (yellow).

Inversions with an accumulated mutational load are expected to incur a fitness cost. Indeed, H. numata inversions were found to have detrimental effects on larval survival in homozygotes. When comparing survival among P genotypes from 1016 genotyped F2 progeny, and controlling for genome-wide inbreeding depression, homozygotes for a derived haplotype showed a far lower survival than other genotypes, with only 6.2% of Hn1/Hn1 larvae and 31.3% of the Hn123/Hn123 larvae surviving to the adult stage (GLMM within-family and genotype analyses, Fig. 4A). By contrast, ancestral homozygotes Hn0/Hn0 had a good survival rate (77.6%), and all heterozygous haplotype combinations (Hn0/Hn1; Hn1/Hn123; Hn0/Hn123) displayed similar survival. Inversions therefore harbor fully recessive variants with a strong impact on individual survival in homozygotes. Interestingly, individuals with the Hn1/Hn123 genotype do not experience the deleterious effects of the P₁ inversion (83.8% survival), despite being effectively homozygous for this rearrangement (Fig. 4A). This may indicate that Hn1 and Hn123 harbor different deleterious variants within P₁, for instance in the region surrounding the gene cortex which shows peaks of differentiation between those two haplotypes (Sup. Fig. S8), or that variants in P₂ or P₃ compensate for the deleterious effects of P₁.

Fig. 4. Fitness variation associated with chromosomal inversions at the supergene in H. numata.

A. Larval survival rate for the different supergene genotypes. GLMM analysis confirmed that genotype was a significant predictor of survival (χ² = 459.776; df = 5; p<0.001) while experimental cross design was unimportant (χ² = 0.8117; df = 2; p = 0.666), validating the joint analysis of all families and crosses. B. Variation in fitness components associated with supergene genotypes. Adult survival estimates are based on protection against predators. Selection coefficients were calculated relative to the population mean, and estimated in the H. numata population of Tarapoto, Peru. Predation and mating success data come from (16) and (13).

Inversions have largely been considered for their value in preserving combinations of co-adapted alleles through suppressed recombination with ancestral chromatids, yet this also makes them prone to capturing deleterious mutations (19). Our results bring key insights into how the ecological and genetic components of balancing selection allow inversion polymorphisms to establish. Inversions in H. numata show strongly positive dominant effects on adult survival through protection against predators via wing-pattern mimicry, which should lead to their rapid fixation (Fig. 4B, (16)). Yet we found that these inversions are also enriched in deleterious variation from their very formation, as well as from an accumulation of mutations owing to the reduction in recombination-driven purging. The expression of a recessive genetic load associated with inversions inevitably translates into negative frequency-dependent selection (21). The balancing selection acting on these inversions in H. numata thus results from their antagonistic ecological and genetic effects: positive selection and dominant effects on adult mimicry but negative frequency-dependent selection through recessive effects on viability (Fig 4B). The initial mutation load associated with the formation and introgression of inversion P₁ likely initiated the balancing selection as soon as P₁ rose in frequency, and was further reinforced by the accumulation of deleterious mutations under low recombination. This led to the formation of haplotypes expressing net beneficial effects only when heterozygous.
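The way a dominant benefit combined with a recessive cost can hold an inversion at intermediate frequency may be illustrated with a textbook one-locus viability model. The sketch below is not the analysis performed in this study: it assumes random mating, constant genotype fitnesses, and two purely hypothetical coefficients, s (the disadvantage of ancestral homozygotes, which lack the mimicry benefit) and t (the net disadvantage of inversion homozygotes, which express the recessive load). Under these assumptions the inversion frequency converges to s/(s+t) rather than to fixation.

```python
def next_freq(p, s, t):
    """One generation of viability selection on the inversion frequency p.

    Assumed (hypothetical) fitnesses:
      ancestral homozygote  = 1 - s   (no mimicry benefit)
      heterozygote          = 1       (dominant benefit, load masked)
      inversion homozygote  = 1 - t   (benefit offset by recessive load)
    """
    q = 1.0 - p
    w_bar = q * q * (1.0 - s) + 2.0 * p * q + p * p * (1.0 - t)
    w_inv = q + p * (1.0 - t)  # marginal fitness of the inverted haplotype
    return p * w_inv / w_bar


def equilibrium(s, t, p0=0.01, generations=5000):
    """Iterate the recurrence from a rare inversion (frequency p0)."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, s, t)
    return p


if __name__ == "__main__":
    s, t = 0.2, 0.6  # illustrative values only, not estimates from the data
    print("simulated equilibrium:", round(equilibrium(s, t), 4))
    print("analytical s/(s+t):   ", round(s / (s + t), 4))
```

The model deliberately ignores disassortative mating and predator-driven frequency dependence, both of which are also reported to act on the supergene, so it captures only the genetic component of the balance.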

Individuals carrying inversion P₁ express disassortative mate preferences, which also balance inversion frequencies in the population (Fig 4B, (13)). Disassortative mating is likely to have evolved in response to the fitness costs associated with homozygous inversions, as selection may have favoured mate preferences minimizing the proportion of homozygous offspring (4). Disassortative mating further hampers the purging of deleterious variation located within the inversions. The initial capture of genetic load in the inversions thus triggered cascading ecological effects and led to the long-term persistence of polymorphism. The low recombination regime associated with inversions also favoured the insertion of transposons, increasing the size of the inverted haplotype. A similar pattern has also been observed in the Papaya neo sex-chromosomes (22) and in the fire ant supergene (23), indicating that this initial increase in size due to accumulation of TEs may be a general pattern in the early evolution of polymorphic chromosomes.

Our findings shed new light on the origin and evolution of complex polymorphisms controlled by supergenes and related architectures, such as sex-chromosomes. The build-up of antagonistic fitness effects found here is likely to be a general feature of the formation of inversion polymorphisms and their evolution through time. Therefore, the benefits of structural variants in terms of recombination suppression between ecologically adaptive traits may only explain why they are initially favoured, whereas their maintenance as polymorphisms may be driven by their initial and gradually accumulating mutation load. In summary, balancing selection may not be generated by extrinsic ecological factors, but by intrinsic features of the genetic architecture selected during the evolution of complex phenotypes. Taken together, these novel insights into the consequence of chromosomal rearrangements may explain why inversions are often found polymorphic and linked with complex phenotypes in nature. In a broader context, dissecting the opposing effects of suppressed recombination and how they determine the fate of chromosomal rearrangements may bring new light to our understanding of the variation in genome architecture across the tree of life.

Methods

Sampling and sequencing

To investigate the structure of the P supergene allele, we intercrossed wild-caught individuals in cages in order to obtain F2 (or later generation) autozygous individuals (i.e. with two identical copies of the supergene allele). Samples were either conserved in NaCl saturated DMSO solution at 20°C or snap frozen alive in liquid nitrogen and conserved at −80°C (Sup. Tab. S13). DNA was extracted from the whole butterfly bodies except the head with a protocol adapted from (24), with the following modifications. Butterflies were ground in a frozen mortar with liquid nitrogen, and 150 mg of tissue powder was mixed with 900µl of preheated buffer and 6µl of RNaseA. Tubes were incubated for 120 minutes at 50°C for lysis, and then at −10°C for 10 minutes, with the addition of 300µl of potassium acetate for the precipitation. One volume of binding buffer was added with 100µl of Serapure beads solution. Three washing cycles were used and DNA was resuspended in 100µl of EB buffer. Samples 35 and 36 were prepared using the NEBNext FFPE DNA Repair Mix (NEB). DNA fragments shorter than 20kb were removed for samples 35 and 36, and shorter than 40kb for samples 26 and 28. 10x Chromium linked-read libraries of 10 autozygous individuals corresponding to 8 different morphs, as well as 2 wild-caught homozygous individuals, were prepared and 2×150bp paired-end reads were sequenced using Illumina HiSeq 2500. Draft genomes were assembled using Supernova (v2.1.1, (25)) (Sup. Tab. S13).

Whole genome assemblies analysis

The assembled genomes were compared to the H. melpomene reference genome (Version 2.5) and to each other using BLAST (26) and LAST (27). Because the supergene was dispersed across multiple scaffolds for some specimens, we used Ragout2 (28) to re-scaffold their supergene assembly, using as reference the four individual assemblies with the highest quality assembly statistics (n°38, 29, 40, and 26). Genome quality was assessed with BUSCO using the insecta_odb9 database. MAKER (29) was used to annotate the genomes, using protein sequences obtained from the H. melpomene genome (v2.5, http://lepbase.org/) in combination with an H. numata transcriptome dataset (30). RepeatModeler (31) was used to identify unannotated TEs in the 12 H. numata genomes. Unknown repeat elements detected by RepeatModeler were compared by BLAST (26) (-evalue cut-off 1e^-10) to a transposase database (Tpases080212) from (32). The TEs identified were merged with the Heliconius repeat database (Lavoie et al. 2013) and redundancy was filtered using CDHIT (REF) with an 80 % identity threshold. RepeatMasker (31) was then used to annotate transposable elements and repeats using this combined database and results were parsed with scripts from https://github.com/4ureliek/Parsing-RepeatMasker-Outputs.git.

Population Genomic Analysis

Whole genome re-sequence data from H. numata and other Heliconius species from (17) were used, as well as 37 new wild-caught H. numata specimens. For the latter samples, butterfly bodies were conserved in NaCl saturated DMSO solution at −20°C and DNA was extracted using QIAGEN DNeasy blood and tissue kits according to the manufacturer’s instructions with RNase treatment. Illumina Truseq paired-end whole genome libraries were prepared and 2×100bp reads were sequenced on the Illumina HiSeq 2000 platform. Reads were mapped to the H. melpomene Hmel2 reference genome (33) using Stampy (version 1.0.28; (34)) with default settings except for the substitution rate which was set to 0.05 to allow for expected divergence from the reference. Alignment file manipulations were performed using SAMtools v0.1.3 (35). After mapping, duplicate reads were excluded using the MarkDuplicates tool in Picard (v1.1125; http://broadinstitute.github.io/picard) and local indel realignment using IndelRealigner was performed with GATK (v3.5; (36)). Invariant and polymorphic sites were called with GATK HaplotypeCaller, with options --min_base_quality_score 25 --min_mapping_quality_score 25 -stand_emit_conf 20 --heterozygosity 0.015. VCF data were processed using bcftools (37). PCA analyses were computed with the SNPRelate R package (38), using 5kb windows. Using Phylobayes (39), on 5kb sliding windows, we estimated 1) the most recent coalescence event between Hn0+Hn1 and Hn123, which corresponds to the age of the last recombination between Hn0+Hn1 and Hn123, and 2) the time to the most recent common ancestor (TMRCA) of all Hn123 haplotypes. These provide respectively the upper (1) and the lower (2) bounds of the date of the inversion event (Sup. Fig. S7). In order to compute the Fst and standard population genetic analyses, we manually curated the phasing of heterozygous individuals since computational phasing packages such as SHAPEIT or BEAGLE were found to introduce frequent phase switch errors. For each heterozygous SNP in inversion regions, if one and only one of the two alleles is observed in more than 80 % of individuals without inversions (Hn0), this allele is considered as being on haplotype 1, the other being on haplotype 2. For SNPs which did not fit this criterion, each allele was placed randomly on one of the two haplotypes.
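The phasing rule described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the script used in the study: the function name, the input layout (a mapping from each allele to the fraction of Hn0 individuals in which it is observed) and the example values are all hypothetical.

```python
import random


def phase_het_snp(alleles, hn0_fraction, threshold=0.8, rng=random):
    """Assign the two alleles of a heterozygous SNP to (haplotype 1, haplotype 2).

    alleles      : pair of alleles at the SNP, e.g. ("A", "G")
    hn0_fraction : hypothetical input mapping each allele to the fraction of
                   Hn0 individuals (no inversion) in which it is observed
    threshold    : the 80 % criterion from the text
    """
    a, b = alleles
    a_common = hn0_fraction.get(a, 0.0) > threshold
    b_common = hn0_fraction.get(b, 0.0) > threshold
    if a_common and not b_common:
        return a, b  # a is the Hn0-like allele, placed on haplotype 1
    if b_common and not a_common:
        return b, a
    # Neither or both alleles pass the criterion: random assignment.
    return (a, b) if rng.random() < 0.5 else (b, a)


if __name__ == "__main__":
    # Illustrative values only.
    print(phase_het_snp(("A", "G"), {"A": 0.95, "G": 0.10}))  # informative site
    print(phase_het_snp(("C", "T"), {"C": 0.50, "T": 0.55}))  # random assignment
```

Because a site where both alleles exceed the threshold is as uninformative as one where neither does, the sketch treats both cases as random, which is how the "one and only one" condition is read here.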

Deleterious mutation accumulation

SnpEff (40) with default settings was used to annotate the H. numata SNPs using the H. melpomene reference genome annotation. We computed the ratio of non-synonymous to synonymous polymorphisms (pN/pS), the ratio of non-synonymous to synonymous substitutions (dN/dS) compared to H. melpomene, and the direction of selection, DoS = Dn/(Dn + Ds) − Pn/(Pn + Ps) (20), using all individuals, or only those homozygous for a given inversion type, for every gene larger than 5kb (to ensure that each gene contains several SNPs). The whole genome distribution was computed on 500kb non-overlapping windows.
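As a worked illustration of the direction of selection statistic defined in (20), the sketch below computes DoS = Dn/(Dn + Ds) − Pn/(Pn + Ps) from raw counts. The function names and the example counts are hypothetical, and the simple count ratio used for pN/pS and dN/dS is a simplifying assumption (no normalization by the number of synonymous and non-synonymous sites); it is not the pipeline used in this study.

```python
def direction_of_selection(dn, ds, pn, ps):
    """DoS = Dn/(Dn + Ds) - Pn/(Pn + Ps).

    dn, ds : counts of non-synonymous / synonymous substitutions
    pn, ps : counts of non-synonymous / synonymous polymorphisms
    Returns None when either denominator is zero (statistic undefined).
    """
    if dn + ds == 0 or pn + ps == 0:
        return None
    return dn / (dn + ds) - pn / (pn + ps)


def count_ratio(nonsyn, syn):
    """Plain count ratio, used here as a stand-in for pN/pS or dN/dS."""
    return nonsyn / syn if syn else None


if __name__ == "__main__":
    # Hypothetical counts for one genomic segment.
    dn, ds, pn, ps = 2, 8, 10, 10
    print("DoS   =", direction_of_selection(dn, ds, pn, ps))
    print("pN/pS =", count_ratio(pn, ps))
    print("dN/dS =", count_ratio(dn, ds))
```

A negative DoS, as in this toy example, indicates an excess of non-synonymous polymorphism relative to divergence, the signature interpreted in the text as segregating deleterious variants.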

Fitness Assay

H. numata specimens used for the fitness analyses originated from the Tarapoto valley, San Martin, Peru. Brood designs are illustrated in Sup. Fig. 10. First, F1 butterflies heterozygous at the supergene P were generated by crossing wild F0 males with captive-bred virgin females. Unrelated F1 male-female pairs were then selected for their P genotype and hand-paired to generate F2 progeny. We specifically designed these crosses to generate F2 progeny containing both homozygotes and heterozygotes within a single family. Larvae were monitored twice a day to assess survival or mortality. Upon death or butterfly emergence, individuals were stored in 96% ethanol until genotyping. We generated a total of 486 F2 progeny from 6 independent replicate broods of the F1 Hn0/Hn1 × Hn0/Hn1 cross, 504 F2 progeny from 6 broods of the F1 Hn1/Hn123 × Hn1/Hn123 cross, and 454 F2 progeny from 7 broods of the F1 Hn1/Hn123 × Hn0/Hn1 cross.

Supergene genotypes were assessed following the methodology of (13). Briefly, amplification of the Heliconius numata orthologue of HM00025 (cortex) (GenBank accession FP236845.2), which is included in the supergene P, discriminates between the distinct supergene haplotypes by PCR product size: Hn1 (∼1200 bp), Hn123 (∼800 bp) and Hn0 (∼600 bp). In total, 1,016 F2 progeny could be genotyped.

For each of the 19 broods, we used a Chi-squared test of independence to assess variation in survival between the different genotypes of the F2 progeny. When the test was significant, Freeman-Tukey deviates (FT) were compared to an alpha = 0.05 criterion, corrected for multiple comparisons using the Bonferroni correction. To compare genotype survival between families and crosses, we fitted generalized linear mixed models followed by Tukey’s HSD post-hoc tests (package “lme4” (41) in R version 3.1.3 (42)), with the survival of an individual with a given genotype as the response variable (binomial response with logit link). The significance of the predictors was tested using likelihood ratio tests. Genotype was included as a covariate, cross as a fixed effect, and family identity as a random effect to control for non-independence of measures. Plots were created with ggplot2 (43).
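The per-brood test described above can be sketched as follows. This is a minimal Python illustration, assuming a genotype × outcome contingency table (rows are genotypes, columns are survived and died) and the usual Freeman-Tukey deviate, sqrt(O) + sqrt(O + 1) − sqrt(4E + 1), for observed count O and expected count E. The function names and the counts are hypothetical and are not data from the study. The sketch stops at the test statistic and the deviates: the p-value, the significance criterion for the deviates and the Bonferroni correction are not reproduced here.

```python
import math
from typing import List, Tuple


def chi2_independence(table: List[List[int]]) -> Tuple[float, int, List[List[float]]]:
    """Return (chi-squared statistic, degrees of freedom, expected counts).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof, expected


def freeman_tukey_deviates(
    table: List[List[int]], expected: List[List[float]]
) -> List[List[float]]:
    """Per-cell deviate: sqrt(O) + sqrt(O + 1) - sqrt(4E + 1)."""
    return [
        [math.sqrt(o) + math.sqrt(o + 1) - math.sqrt(4 * e + 1)
         for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]


# Hypothetical brood with three genotypes; columns are (survived, died).
counts = [[30, 10], [20, 20], [10, 30]]
stat, dof, expected = chi2_independence(counts)
deviates = freeman_tukey_deviates(counts, expected)
print(stat, dof)
print(deviates)
```

Cells with large positive or negative deviates indicate genotypes that survived more or less often than expected under independence.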

AUTHOR CONTRIBUTIONS

P.J., M.C., and M.J. designed the study. P.J., M.C., A.W., and M.J. wrote the paper. P.J., A.W., and M.J. generated the genomic data. M.C., H.B., and V.L. generated the RNAseq data. P.J. performed the genomic analyses with input from A.W. M.C. managed butterfly rearing and performed fitness assays. H.P. performed whole-genome sequencing. All authors contributed to editing the manuscript.

COMPETING FINANCIAL INTERESTS

The authors declare no competing interests.

ACKNOWLEDGEMENTS

We thank Emmanuelle d’Alençon and Marie-Pierre Dubois for their help in the lab, Thomas Aubier for being a DNA extraction wizard, Melanie McClure, Mario Tuatama and Ronald Mori-Pezo for their help during field work, Patrice David for his careful and critical reading of the manuscript, and Konrad Lhose, Dominik Laetsch, Benoit Nabolz, Pierre-Alexandre Gagnaire, Mathieu Gauthier, Claire Lemaitre, Fabrice Legeai and Anna-Sophie Fiston-Lavier for insightful discussions. We thank the Peruvian government for providing the necessary research permits (236-2012-AG-DGFFS-DGEFFS, 201-2013-MINAGRI-DGFFS/DGEFFS and 002-2015-SERFOR-DGGSPFFS). This research was supported by Agence Nationale de la Recherche (ANR) grants ANR-12-JSV7-0005 and ANR-18-CE02-0019-01 and European Research Council grant ERC-StG-243179 to MJ, and by fellowships from the Natural Sciences and Engineering Research Council of Canada and a Marie Sklodowska-Curie fellowship (FITINV, N 655857) to MC. This project benefited from the Montpellier Bioinformatics Biodiversity platform supported by the LabEx CeMEB, ANR “Investissements d’avenir” program ANR-10-LABX-04-01. MGX acknowledges financial support from France Génomique National infrastructure, funded as part of ANR “Investissement d’avenir” program ANR-10-INBS-09.

Bibliography

1.
John Wang, Yannick Wurm, Mingkwan Nipitwattanaphon, Oksana Riba-Grognuz, Yu-Ching Huang, DeWayne Shoemaker, and Laurent Keller. A Y-like social chromosome causes alternative colony organization in fire ants. Nature, 493(7434):664–668, January 2013. issn 1476-4687. doi: 10.1038/nature11832.
2.
Sangeet Lamichhaney, Guangyi Fan, Fredrik Widemo, Ulrika Gunnarsson, Doreen Schwochow Thalmann, Marc P. Hoeppner, Susanne Kerje, Ulla Gustafson, Chengcheng Shi, He Zhang, Wenbin Chen, Xinming Liang, Leihuan Huang, Jiahao Wang, Enjing Liang, Qiong Wu, Simon Ming-Yuen Lee, Xun Xu, Jacob Höglund, Xin Liu, and Leif Andersson. Structural genomic changes underlie alternative reproductive strategies in the ruff (Philomachus pugnax). Nature Genetics, 48(1):84–88, January 2016. issn 1546-1718. doi: 10.1038/ng.3430.
3.
Clemens Küpper, Michael Stocks, Judith E. Risse, Natalie dos Remedios, Lindsay L. Farrell, Susan B. McRae, Tawna C. Morgan, Natalia Karlionova, Pavel Pinchuk, Yvonne I. Verkuil, Alexander S. Kitaysky, John C. Wingfield, Theunis Piersma, Kai Zeng, Jon Slate, Mark Blaxter, David B. Lank, and Terry Burke. A supergene determines highly divergent male reproductive morphs in the ruff. Nature Genetics, 48(1):79–83, January 2016. issn 1546-1718. doi: 10.1038/ng.3443.
4.
Elaina M. Tuttle, Alan O. Bergland, Marisa L. Korody, Michael S. Brewer, Daniel J. Newhouse, Patrick Minx, Maria Stager, Adam Betuel, Zachary A. Cheviron, Wesley C. Warren, Rusty A. Gonser, and Christopher N. Balakrishnan. Divergence and Functional Degradation of a Sex Chromosome-like Supergene. Current Biology, 26(3):344–350, February 2016. issn 0960-9822. doi: 10.1016/j.cub.2015.11.069.
5.
Mathieu Joron, Lise Frezal, Robert T. Jones, Nicola L. Chamberlain, Siu F. Lee, Christoph R. Haag, Annabel Whibley, Michel Becuwe, Simon W. Baxter, Laura Ferguson, Paul A. Wilkinson, Camilo Salazar, Claire Davidson, Richard Clark, Michael A. Quail, Helen Beasley, Rebecca Glithero, Christine Lloyd, Sarah Sims, Matthew C. Jones, Jane Rogers, Chris D. Jiggins, and Richard H. ffrench-Constant. Chromosomal rearrangements maintain a polymorphic supergene controlling butterfly mimicry. Nature, 477(7363):203–206, September 2011. issn 1476-4687. doi: 10.1038/nature10341.
6.
K. Kunte, W. Zhang, A. Tenger-Trolander, D. H. Palmer, A. Martin, R. D. Reed, S. P. Mullen, and M. R. Kronforst. doublesex is a mimicry supergene. Nature, 507(7491):229–232, March 2014. issn 1476-4687. doi: 10.1038/nature13112.
7.
Jinhong Li, Jonathan M. Cocker, Jonathan Wright, Margaret A. Webster, Mark McMullan, Sarah Dyer, David Swarbreck, Mario Caccamo, Cock van Oosterhout, and Philip M. Gilmartin. Genetic architecture and evolution of the S locus supergene in Primula vulgaris. Nature Plants, 2(12):16188, December 2016. issn 2055-0278. doi: 10.1038/nplants.2016.188.
8.
R. A. Fisher. The genetical theory of natural selection. Clarendon Press, Oxford, England, 1930. doi: 10.5962/bhl.title.27468.
9.
E.B. Ford. Genetic Polymorphism. Faber & Faber: London, 1965.
10.
D. Charlesworth and B. Charlesworth. Theoretical genetics of Batesian mimicry II. Evolution of supergenes. Journal of Theoretical Biology, 55(2):305–324, December 1975. issn 0022-5193. doi: 10.1016/s0022-5193(75)80082-8.
11.
Michael Kopp and Joachim Hermisson. The evolution of genetic architecture under frequency-dependent disruptive selection. Evolution; International Journal of Organic Evolution, 60(8):1537–1550, August 2006. issn 0014-3820.
12.
Jessica K. Abbott, Anna K. Nordén, and Bengt Hansson. Sex chromosome evolution: historical insights and future perspectives. Proceedings. Biological Sciences, 284(1854), May 2017. issn 1471-2954. doi: 10.1098/rspb.2016.2806.
13.
Mathieu Chouteau, Violaine Llaurens, Florence Piron-Prunier, and Mathieu Joron. Polymorphism at a mimicry supergene maintained by opposing frequency-dependent selection pressures. Proceedings of the National Academy of Sciences, page 201702482, June 2017. issn 0027-8424, 1091-6490. doi: 10.1073/pnas.1702482114.
14.
B. Sinervo and C. M. Lively. The rock–paper–scissors game and the evolution of alternative male strategies. Nature, 380(6571):240–243, March 1996. issn 1476-4687. doi: 10.1038/380240a0.
15.
Mark R Christie, Gordon G McNickle, Rod A French, and Michael S Blouin. Life history variation is maintained by fitness trade-offs and negative frequency-dependent selection. Proceedings of the National Academy of Sciences, 115(17):4441–4446, 2018.
16.
Mathieu Chouteau, Mónica Arias, and Mathieu Joron. Warning signals are under positive frequency-dependent selection in nature. Proceedings of the National Academy of Sciences, 113(8):2164–2169, February 2016. issn 0027-8424, 1091-6490. doi: 10.1073/pnas.1519216113.
17.
Paul Jay, Annabel Whibley, Lise Frézal, María Ángeles Rodríguez de Cara, Reuben W. Nowell, James Mallet, Kanchon K. Dasmahapatra, and Mathieu Joron. Supergene Evolution Triggered by the Introgression of a Chromosomal Inversion. Current Biology, 28(11):1839–1845.e3, June 2018. issn 0960-9822. doi: 10.1016/j.cub.2018.04.072.
18.
Mark Kirkpatrick. How and Why Chromosome Inversions Evolve. PLoS Biology, 8(9), September 2010. issn 1544-9173. doi: 10.1371/journal.pbio.1000501.
19.
Rui Faria, Kerstin Johannesson, Roger K. Butlin, and Anja M. Westram. Evolving Inversions. Trends in Ecology & Evolution, 34(3):239–248, March 2019. issn 0169-5347. doi: 10.1016/j.tree.2018.12.005.
20.
Nina Stoletzki and Adam Eyre-Walker. Estimation of the neutrality index. Molecular Biology and Evolution, 28(1):63–70, January 2011. issn 1537-1719. doi: 10.1093/molbev/msq249.
21.
Violaine Llaurens, Annabel Whibley, and Mathieu Joron. Genetic architecture and balancing selection: the life and death of differentiated variants. Molecular Ecology, 26(9):2430–2448, May 2017. issn 0962-1083. doi: 10.1111/mec.14051.
22.
Jianping Wang, Jong-Kuk Na, Qingyi Yu, Andrea R. Gschwend, Jennifer Han, Fanchang Zeng, Rishi Aryal, Robert VanBuren, Jan E. Murray, Wenli Zhang, Rafael Navajas-Pérez, F. Alex Feltus, Cornelia Lemke, Eric J. Tong, Cuixia Chen, Ching Man Wai, Ratnesh Singh, Ming-Li Wang, Xiang Jia Min, Maqsudul Alam, Deborah Charlesworth, Paul H. Moore, Jiming Jiang, Andrew H. Paterson, and Ray Ming. Sequencing papaya X and Yh chromosomes reveals molecular basis of incipient sex chromosome evolution. Proceedings of the National Academy of Sciences, 109(34):13710–13715, August 2012. issn 0027-8424, 1091-6490. doi: 10.1073/pnas.1207833109.
23.
Eckart Stolle, Rodrigo Pracana, Philip Howard, Carolina I. Paris, Susan J. Brown, Claudia Castillo-Carrillo, Stephen J. Rossiter, and Yannick Wurm. Degenerative Expansion of a Young Supergene. Molecular Biology and Evolution, 36(3):553–561, March 2019. issn 0737-4038. doi: 10.1093/molbev/msy236.
24.
Baptiste Mayjonade, Jérôme Gouzy, Cécile Donnadieu, Nicolas Pouilly, William Marande, Caroline Callot, Nicolas Langlade, and Stéphane Muños. Extraction of high-molecular-weight genomic DNA for long-read sequencing of single molecules. BioTechniques, 61(4):203–205, 2016. issn 1940-9818. doi: 10.2144/000114460.
25.
Neil I. Weisenfeld, Vijay Kumar, Preyas Shah, Deanna M. Church, and David B. Jaffe. Direct determination of diploid genome sequences. Genome Research, 27(5):757–767, 2017. issn 1549-5469. doi: 10.1101/gr.214874.116.
26.
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215(3):403–410, October 1990. issn 0022-2836. doi: 10.1016/S0022-2836(05)80360-2.
27.
Szymon M. Kielbasa, Raymond Wan, Kengo Sato, Paul Horton, and Martin C. Frith. Adaptive seeds tame genomic sequence comparison. Genome Research, 21(3):487–493, January 2011. issn 1088-9051, 1549-5469. doi: 10.1101/gr.113985.110.
28.
Mikhail Kolmogorov, Joel Armstrong, Brian J. Raney, Ian Streeter, Matthew Dunn, Fengtang Yang, Duncan Odom, Paul Flicek, Thomas M. Keane, David Thybert, Benedict Paten, and Son Pham. Chromosome assembly of large and complex genomes using multiple references. Genome Research, October 2018. issn 1088-9051, 1549-5469. doi: 10.1101/gr.236273.118.
29.
Brandi L. Cantarel, Ian Korf, Sofia M.C. Robb, Genis Parra, Eric Ross, Barry Moore, Carson Holt, Alejandro Sánchez Alvarado, and Mark Yandell. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research, 18(1):188–196, January 2008. issn 1088-9051. doi: 10.1101/gr.6743907.
30.
Suzanne V Saenko, Mathieu Chouteau, Florence Piron-Prunier, Corinne Blugeon, Mathieu Joron, and Violaine Llaurens. Unravelling the genes forming the wing pattern supergene in the polymorphic butterfly Heliconius numata. EvoDevo, 10(1):1–12, 2019.
31.
A.F.A. Smit, R. Hubley, and P. Green. RepeatMasker Open-4.0. 2013–2015.
32.
Michael S. Campbell, MeiYee Law, Carson Holt, Joshua C. Stein, Gaurav D. Moghe, David E. Hufnagel, Jikai Lei, Rujira Achawanantakun, Dian Jiao, Carolyn J. Lawrence, Doreen Ware, Shin-Han Shiu, Kevin L. Childs, Yanni Sun, Ning Jiang, and Mark Yandell. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations. Plant Physiology, 164(2):513–524, February 2014. issn 1532-2548. doi: 10.1104/pp.113.230144.
33.
John W. Davey, Mathieu Chouteau, Sarah L. Barker, Luana Maroja, Simon W. Baxter, Fraser Simpson, Richard M. Merrill, Mathieu Joron, James Mallet, Kanchon K. Dasmahapatra, and Chris D. Jiggins. Major Improvements to the Heliconius melpomene Genome Assembly Used to Confirm 10 Chromosome Fusion Events in 6 Million Years of Butterfly Evolution. G3: Genes, Genomes, Genetics, 6(3):695–708, March 2016. issn 2160-1836. doi: 10.1534/g3.115.023655.
34.
Gerton Lunter and Martin Goodson. Stampy: a statistical algorithm for sensitive and fast mapping of Illumina sequence reads. Genome Research, 21(6):936–939, June 2011. issn 1549-5469. doi: 10.1101/gr.111120.110.
35.
Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell, Jue Ruan, Nils Homer, Gabor Marth, Goncalo Abecasis, and Richard Durbin. The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16):2078–2079, August 2009. issn 1367-4803. doi: 10.1093/bioinformatics/btp352.
36.
M.A. DePristo, E. Banks, R.E. Poplin, K.V. Garimella, J.R. Maguire, C. Hartl, A.A. Philippakis, G. del Angel, M.A. Rivas, M. Hanna, A. McKenna, T.J. Fennell, A.M. Kernytsky, A.Y. Sivachenko, K. Cibulskis, S.B. Gabriel, D. Altshuler, and M.J. Daly. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5):491–498, May 2011. issn 1061-4036. doi: 10.1038/ng.806.
37.
Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A. Albers, Eric Banks, Mark A. DePristo, Robert E. Handsaker, Gerton Lunter, Gabor T. Marth, Stephen T. Sherry, Gilean McVean, and Richard Durbin. The variant call format and VCFtools. Bioinformatics, 27(15):2156–2158, August 2011. issn 1367-4803. doi: 10.1093/bioinformatics/btr330.
38.
Xiuwen Zheng, David Levine, Jess Shen, Stephanie M. Gogarten, Cathy Laurie, and Bruce S. Weir. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics, 28(24):3326–3328, December 2012. issn 1367-4803. doi: 10.1093/bioinformatics/bts606.
39.
Nicolas Lartillot and Hervé Philippe. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular Biology and Evolution, 21(6):1095–1109, June 2004. issn 0737-4038. doi: 10.1093/molbev/msh112.
40.
P. Cingolani, A. Platts, M. Coon, T. Nguyen, L. Wang, S.J. Land, X. Lu, and D.M. Ruden. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly, 6(2):80–92, 2012.
41.
Douglas Bates, Martin Mächler, Ben Bolker, and Steve Walker. Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1):1–48, October 2015. issn 1548-7660. doi: 10.18637/jss.v067.i01.
42.
R Core Team. R: A language and environment for statistical computing. 2013.
43.
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Bibliography

1.
Suzanne V Saenko, Mathieu Chouteau, Florence Piron-Prunier, Corinne Blugeon, Mathieu Joron, and Violaine Llaurens. Unravelling the genes forming the wing pattern supergene in the polymorphic butterfly Heliconius numata. EvoDevo, 10(1):1–12, 2019.
2.
Mark D Robinson, Davis J McCarthy, and Gordon K Smyth. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1):139–140, 2010.
3.
Alexandros Stamatakis. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313, 2014.
4.
Paul Jay, Annabel Whibley, Lise Frézal, María Ángeles Rodríguez de Cara, Reuben W. Nowell, James Mallet, Kanchon K. Dasmahapatra, and Mathieu Joron. Supergene Evolution Triggered by the Introgression of a Chromosomal Inversion. Current Biology, 28(11):1839–1845.e3, June 2018. issn 0960-9822. doi: 10.1016/j.cub.2018.04.072.
5.
Mark Kirkpatrick. The evolution of genome structure by natural and sexual selection. Journal of Heredity, 108(1):3–11, 2016.
6.
Nicola J Nadeau, Carolina Pardo-Diaz, Annabel Whibley, Megan A Supple, Suzanne V Saenko, Richard WR Wallbank, Grace C Wu, Luana Maroja, Laura Ferguson, Joseph J Hanly, et al. The gene cortex controls mimicry and crypsis in butterflies and moths. Nature, 534(7605):106, 2016.

Posted August 15, 2019.

Subject Area

Evolutionary Biology
