Abstract
Siamese fighting fish, commonly known as betta, are among the world’s most popular and morphologically diverse pet fish, but the genetic processes leading to their domestication and phenotypic diversification are largely unknown. We assembled de novo the genome of a wild Betta splendens and whole-genome sequenced multiple individuals across five species within the B. splendens species complex, including wild populations and domesticated ornamental betta. Given our estimate of the mutation rate from pedigrees, our analyses suggest that betta were domesticated at least 1,000 years ago, centuries earlier than previously thought. Ornamental betta individuals have variable contributions from other Betta species and have also introgressed into wild populations of those species. We identify dmrt1 as the main sex determination gene in ornamental betta but not in wild B. splendens, and find evidence for recent directional selection at the X-allele of the locus. Furthermore, we find genes with signatures of recent, strong selection that have large effects on color in specific parts of the body, or the shape of individual fins, and are almost all unlinked. Our results demonstrate how simple genetic architectures paired with anatomical modularity can lead to vast phenotypic diversity generated during animal domestication, and set the stage for using betta as a modern system for evolutionary genetics.
One-Sentence Summary: Genomic analyses reveal that betta fish were domesticated more than 1,000 years ago and identify the genes that changed in the process.
Main Text
Domesticated animals have provided important insights into the genetic bases of a wide range of morphological, physiological, and behavioral traits. Because of their intimate relationship with people, domesticates have also furthered our understanding of human history and culture, and of our interactions with other species (1). Genetic studies of animal domestication, however, have largely focused on mammals and birds (1, 2), and only a few genome-wide analyses of fish domestication have been performed (3–5).
Siamese fighting fish have been selectively bred for fighting in Southeast Asia for centuries, with reports dating back to as early as the 14th century A.D. in Thailand, making them one of the oldest fish domestications (6). Starting in the early 20th century, Siamese fighting fish also began to be bred for ornamental purposes, becoming one of the world’s most popular pet fish, commonly known as betta (7). Although it is generally presumed, based on morphology and a few genetic markers (8, 9), that domesticated fighting fish derive mainly from Betta splendens, it has been suggested that other closely related species (collectively called the Betta splendens species complex) may have contributed to modern varieties (10). Ornamental betta have been diversified from their short-finned ancestors into an astonishing array of fin morphologies, colors and pigmentation patterns, providing a rich phenotypic repertoire for genetic analysis. This remarkable and long history of domestication for fighting, followed by breeding for ornamental purposes, combined with one of the smallest vertebrate genomes at only ~450 megabase pairs (Mbp) (11–13), makes betta an appealing subject for evolutionary genetic studies of domestication.
Here, we use a synergistic combination of population and quantitative genetic approaches to investigate the historical processes and molecular changes that led to the domestication and phenotypic diversification of betta fish.
A wild Betta splendens reference genome
We generated a high-quality reference genome assembly of wild B. splendens using long-read PacBio technology, optical mapping with BioNano, scaffolding with 10X Genomics linked reads, and polishing with Illumina short reads. We obtained a reference genome comprising 441 Mb, of which 98.6% is assigned to the 21 chromosomes expected from its karyotype (14), with a contig N50 of 2.50 Mb and scaffold N50 of 20.13 Mb, meeting the standards set forth by the Vertebrate Genomes Project (15). To annotate the genome, we performed RNA sequencing from male and female brain, fin, liver, spleen, and gonad. This annotated reference genome is now the representative B. splendens reference in NCBI (fBetSpl5.3, GCA_900634795.3).
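For readers unfamiliar with the contiguity metric quoted above, the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that definition; the function name and the toy contig lengths are illustrative assumptions and are not taken from the betta assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (in bp), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

In the toy case the total is 300 bp, and the two longest contigs (80 + 70) already reach half of it, so the N50 is 70.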
To discover structural chromosomal rearrangements that may have arisen during domestication, we performed whole genome alignments using three ornamental betta references (11–13) and our wild B. splendens reference, with Anabas testudineus (climbing perch) as an outgroup (8, 15). Except for a large intrachromosomal rearrangement of chromosome 16 in ornamental betta, the genome was largely syntenic between wild B. splendens and ornamental betta (Suppl. Fig. 1, Note 1).
Complex evolutionary relationships between Betta species
To determine the genetic origin of ornamental betta and understand its relationships with species of the Betta splendens complex, we sequenced to ~15× coverage the whole genomes of (i) 37 ornamental betta from different sources, representing a diversity of ornamental traits (Fig. 1A,B; Suppl. Table 1); (ii) 58 wild individuals, including representatives of all species of the B. splendens complex (except for Betta stiktos), and four populations of B. splendens from different parts of its natural range (Fig. 1A); and (iii) an outgroup (Betta compuncta). We aligned the sequencing reads to our B. splendens reference genome, then called and filtered variants to generate a final set of 27.8 million phased biallelic SNPs.
We first assessed relationships across the wild species of the B. splendens complex by constructing neighbor-joining (NJ) and maximum-likelihood (ML) based phylogenies (Fig. 1B; Suppl. Fig. 2A,B). We observed strong bootstrap support for B. smaragdina as the outgroup to the other species of the B. splendens complex with B. mahachaiensis as the outgroup to the remaining species. B. imbellis and B. siamorientalis together form a sister clade to all wild B. splendens populations.
We then tested for evidence of evolutionary processes that violate tree-like species relationships such as hybridization, by computing ABBA-BABA statistics (Patterson’s D and f4 admixture ratio (16)) for all triplets of individuals organized according to the phylogeny. This analysis revealed widespread patterns of excess allele sharing between non-sister species, suggesting that the speciation history of these groups was complex, involving either structured ancestral populations, cross-species gene flow, or both (Fig. 1B; Suppl. Fig. 3A; Suppl. Note 2). Interestingly, two out of three B. mahachaiensis samples and one of the two B. imbellis samples showed highly significant excess allele sharing with B. splendens populations compared to their conspecifics sampled from different locations, consistent with gene flow from B. splendens into particular populations of these species (Suppl. Fig. 3A,B; Suppl. Note 2).
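The logic of the ABBA-BABA test can be illustrated with the commonly used frequency-based form of Patterson's D for a quartet (P1, P2; P3, outgroup). The sketch below is a generic illustration under the assumption that derived-allele frequencies are available per site; it is not the pipeline used in this study, the function name is hypothetical, and significance testing (for example by block jackknife) is omitted.

```python
def patterson_d(sites):
    """Patterson's D from per-site derived-allele frequencies.

    Each element of `sites` is a tuple (p1, p2, p3, p_out) of derived-allele
    frequencies in P1, P2, P3 and the outgroup.
    """
    abba = 0.0
    baba = 0.0
    for p1, p2, p3, p_out in sites:
        # ABBA: derived allele shared by P2 and P3 but not P1.
        abba += (1 - p1) * p2 * p3 * (1 - p_out)
        # BABA: derived allele shared by P1 and P3 but not P2.
        baba += p1 * (1 - p2) * p3 * (1 - p_out)
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Toy data: two ABBA sites and one BABA site (hypothetical).
toy_sites = [(0, 1, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0)]
print(patterson_d(toy_sites))
```

Under a strictly tree-like history the two patterns are expected to be equally frequent and D is close to zero; a positive value indicates excess allele sharing between P2 and P3.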
Ornamental betta derive from B. splendens but have variable contributions from other species
Adding the ornamental betta samples to the phylogeny, we found that they cluster with B. splendens (Fig. 1B; Suppl. Fig. 2). This result was also observed through principal component analysis (PCA), where ornamental betta showed no apparent loading on axes representing non-splendens species (Suppl. Fig. 7). In both phylogenies and PCA, ornamentals form a clearly defined group distinct from all wild B. splendens populations (Fig. 1B,C). These results indicate that ornamental betta are genetically most similar to B. splendens.
To test whether ornamental betta carry non-splendens ancestry, we computed all ABBA-BABA tests of the form D(ornamental except focal, focal ornamental; non-splendens species, outgroup) (Suppl. Fig. 4G-J). These tests revealed that 76% (28 out of 37) of ornamental betta carry significant ancestry from non-splendens species. To examine the chromosomal distribution of non-splendens ancestry in these individuals, we computed regional ABBA-BABA statistics (fdM) along their genomes and confirmed non-splendens ancestry in high-fdM regions by constructing local gene trees (Fig. 1D,E; Suppl. Figs. 8,12). The analyses revealed that signals of excess allele sharing are driven by genomic tracts where one or, more rarely, both haplotypes of the focal sample clustered with B. imbellis or B. mahachaiensis (Suppl. Fig. 8). The genomic locations of these tracts, which encompass between 0 and 6% of the genomes of ornamental betta (Fig. 1D), are generally different among individuals (Suppl. Fig. 12). The ornamental sample with the second highest levels of introgression from other species is particularly interesting, since some of its chromosomes are a mosaic of alternating regions of B. imbellis and B. mahachaiensis ancestry, consistent with a natural or man-made hybrid of those species having been backcrossed into ornamental betta (Fig. 1E). Altogether, our analyses indicate that ornamental betta are clearly derived from B. splendens, yet most individuals have relatively recent contributions from B. mahachaiensis and B. imbellis.
Ornamental betta introgression is widespread among wild Betta
Interestingly, the topology of relationships between wild B. splendens populations in NJ-based phylogenies changed after including ornamental bettas (Suppl. Fig. 2A,C; Suppl. Note 3,4). To further investigate this, we computed ABBA-BABA statistics within the framework of the phylogeny including ornamentals (Suppl. Fig. 2C), and assessed each individual’s relationship with respect to the other species of the Betta splendens species complex, as well as ornamentals (Suppl. Fig. 5A,7). Together, these analyses revealed strong evidence for ornamental betta ancestry in two out of three wild B. mahachaiensis samples and in individuals from three out of four populations of wild B. splendens (Suppl. Note 4). Investigating the signals along the genome, we found that for the two B. mahachaiensis samples, mahachaiensis-like and ornamental-like haplotypes alternate at near-chromosome scale, suggesting an ornamental ancestor only a few generations back (Suppl. Fig. 6B). Conversely, for wild B. splendens individuals with ornamental betta ancestry, the genome-wide signals of excess allele sharing with ornamentals were diffusely distributed along the chromosomes with only a few relatively short, clearly distinguishable ornamental haplotypes (Suppl. Fig. 6A), suggesting that there was enough time for introgressed haplotypes to be broken down by recombination. In summary, ornamental introgression into wild Betta seems to be geographically diffuse and to have happened both long ago and very recently. This finding is perhaps related to the practice by breeders of releasing excess domesticated betta into the wild and may constitute a conservation threat to wild Betta populations.
Timing the domestication of B. splendens
To determine when ornamental betta initially diverged from wild populations, we performed coalescence-based demographic analysis. In order to date events in the domestication of B. splendens, we needed to know the germline mutation rate. To determine this, we sequenced an ornamental trio and a quartet to >30× coverage and found the mutation rate to be 3.75×10⁻⁹ per bp per generation (95% CI: 9.05×10⁻¹⁰ to 9.39×10⁻⁹). This rate is similar to the rate previously inferred for cichlids (17) and approximately 3-fold lower than that of humans (18). Assuming a generation time of six months (7), our demographic analyses suggest that ornamental and wild populations began to split around 4,000 years ago (~1,000 to ~7,000 years based on mutation rate CI). This divergence was coupled to a reduction in population size in ornamental betta as would be expected if a subset of wild individuals began to be bred in captivity (Fig. 1F; Suppl. Fig. 9). Low nucleotide diversity (0.00137 per bp in wild fish and 0.00113 per bp in ornamental betta) and elevated linkage disequilibrium relative to the wild populations further support a decrease in population size throughout domestication that has not fully recovered (Suppl. Figs. 10,12). Even the lower bound (~1,000 years ago) for divergence of the ornamental betta population from wild is earlier than the origin of domestication in the 14th century previously suggested by historical documents (6).
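Nucleotide diversity, as quoted above per bp, is the average number of pairwise differences per site among sampled haplotypes. As a minimal sketch of that standard definition (not the software used in this study), it can be computed from derived-allele counts at variant sites; the function name and the toy numbers are assumptions for illustration.

```python
def nucleotide_diversity(derived_counts, n_haplotypes, n_sites):
    """Average pairwise differences per site.

    derived_counts: derived-allele count at each variant site
    n_haplotypes:   number of sampled haplotypes (2 per diploid fish)
    n_sites:        total callable sites, variant and invariant
    """
    if n_haplotypes < 2 or n_sites <= 0:
        raise ValueError("need at least two haplotypes and one site")
    n = n_haplotypes
    total = 0.0
    for k in derived_counts:
        # Probability that two distinct sampled haplotypes differ at this site.
        total += 2 * k * (n - k) / (n * (n - 1))
    return total / n_sites


# Toy example: 4 haplotypes, 2 variant sites among 10 callable sites.
print(nucleotide_diversity([1, 2], n_haplotypes=4, n_sites=10))
```

Invariant sites contribute zero, which is why the denominator must count all callable sites rather than only the variant ones.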
Genetic signals of selection in ornamental betta
Genetic variants that increase fitness in captivity or that are associated with phenotypic traits actively selected by breeders are expected to increase in frequency during domestication. To discover such loci with signatures of selective sweeps in ornamental betta, we searched for extended homozygosity tracts using H-scan (19) and for high-frequency haplotypes using G12 (20) across 37 ornamental betta (Fig. 2A). Both tests identified concordant loci with strong evidence of selective sweeps in 11 of the 21 B. splendens chromosomes, and peaks remained when run on a downsampled set of 24 ornamentals (Suppl. Fig. 11). Equivalent selection scans using whole-genome sequencing of 24 wild B. splendens did not reveal clear signals (Fig. 2A). These results are consistent with footprints of selection in ornamental betta being related to the domestication process.
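To make the G12 statistic concrete: within a genomic window, the frequencies of the distinct multilocus genotypes are ranked, the two most common are pooled, and the squared frequencies are summed. The sketch below implements that definition for a single window, assuming genotypes are already encoded as hashable strings; the encoding, the function name, and the toy data are illustrative assumptions rather than the published tool.

```python
from collections import Counter


def g12(window_genotypes):
    """G12 for one window: (q1 + q2)^2 + sum of q_i^2 for i >= 3."""
    if not window_genotypes:
        raise ValueError("need at least one individual")
    n = len(window_genotypes)
    counts = Counter(window_genotypes)
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    pooled = sum(freqs[:2])  # pool the two most frequent genotypes
    return pooled ** 2 + sum(q * q for q in freqs[2:])


# Toy window of 10 fish (hypothetical multilocus genotype strings).
toy_window = ["AAB"] * 6 + ["ABB"] * 2 + ["BBA", "BAB"]
print(g12(toy_window))  # approximately 0.66
```

Values near 1 indicate that one or two genotypes dominate the window, as expected after a sweep, whereas a window in which every individual is distinct gives a low value.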
The most prominent selection peak shared across ornamentals but absent in wild B. splendens falls on chromosome 9 and is centered on zinc and ring finger 3 (znrf3). In zebrafish, znrf3 is required for the formation of fin rays, and in mammals it is required for limb formation and testis development (21–23). All the ornamental fish we sequenced have large fins compared to wild B. splendens. Therefore, we hypothesize that znrf3 has contributed to either sexual development or the expansion of fins during betta domestication.
The majority (34/37) of the ornamental betta we sequenced represented four of the most popular varieties along two phenotypic dimensions: color and fin morphology. The fish were royal blue (n=17), solid red (n=17), veiltail (n=18) and crowntail (n=16), represented by males (n=20) and females (n=17). Veiltails are characterized by large, flowing caudal fins, and crowntails have fins that are webbed between the rays (Fig. 1B). To determine whether the footprints of selection we detected were driven by fish of a particular variety or sex, we compared H-scan and haplotype frequencies across subsets of fish representing these traits (Fig. 2B-E).
A peak close to znrf3, centered on double-sex and mab-3 related transcription factor 1 (dmrt1), became apparent when comparing males to females (Fig. 2C). dmrt1 is critical for gonad development in vertebrates, and functions as the sex determination gene in several fish species (24–26), in Xenopus laevis frogs (27), and in birds (28), suggesting dmrt1 has a role in sex determination in betta.
A strong sweep in blue fish on chromosome 2 harbors multiple genes involved in pigmentation (Fig. 2B): proopiomelanocortin (pomc), which encodes alpha and beta melanocyte stimulating hormones (29); T-box transcription factor 19 (tbx19), which encodes a transcription factor expressed specifically in pituitary cells that will express pomc (30); xanthine dehydrogenase (xdh), which encodes an enzyme whose homologs synthesize yellow-red pteridine pigments (31, 32); ALK and LTK-ligand 2-like (alkal2l), which encodes a cell-signaling molecule important for the development of iridophores (33–36); and beta-carotene oxygenase like-1 (bco1l), which encodes an enzyme whose homologs metabolize orange-red carotenoid pigments (37, 38). These results suggest that one or more of these pigmentation genes were a target of selection by betta breeders.
Two selection peaks, one on chromosome 22 and another on chromosome 24, were not detected when all ornamental fish were combined or in an analysis including only veiltail fish, but were significant in the subset of crowntail fish (Fig. 2D,E), suggesting their importance to crowntail fin morphology.
The evolution of sex determination
To test whether the loci containing znrf3 and dmrt1, which had evidence of a selective sweep in ornamental betta, are involved in sex determination, we performed a genome-wide association study (GWAS) using sex as the phenotype. We focused on ornamental betta, since we had a large enough sample size (20 males and 17 females) to detect variants with large effect on sex. A ~30-kb region overlapping dmrt1 but not znrf3 was strongly associated with sex, with 16/17 females being homozygous at the most strongly associated SNPs, while 16/20 males were heterozygous (Fig. 3A,C). We call “Y” the male-specific allele of dmrt1 and “X” the allele present in both males and females. These results strongly implicate dmrt1 as the sex determination gene in ornamental betta and indicate that males are the heterogametic sex.
The reference genome was generated from a wild male B. splendens, so genomic sequences present only in females or only in ornamental betta would not be represented in the SNPs that we used for GWAS. Only 0.5% of sequencing reads of individuals from both sexes could not be mapped to the reference genome (male vs female P=0.64), indicating there are no major sex-specific regions that are absent from the reference (Suppl. Fig. 13A). To test if smaller-scale sequence differences were associated with sex, we performed a GWAS independent of the reference genome using k-mers from the sequencing reads. We found k-mers significantly associated with sex, and when we assembled those k-mers into contigs, they corresponded to dmrt1, consistent with the results from SNP-based GWAS (Suppl. Fig. 13B). Smaller copy number variations (CNVs) not captured by genome size estimation or by k-mers could be associated with sex but not be tagged by linked SNPs. To test for this possibility, we compared the frequency of individual CNVs genome-wide between the sexes, but none were significantly associated (Suppl. Fig. 13C). Although sex chromosomes often carry chromosomal rearrangements, we found no evidence of an inversion in X or Y (Fig. 3D and Methods). These results indicate that, at this level of detection, only a small genomic region (<30 kb) within otherwise non-sexually differentiated chromosomes (autosomes) distinguishes female and male ornamental betta.
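The strength of the genotype-sex association at the top SNPs can be appreciated with a single-site contingency test. The sketch below applies a two-sided Fisher's exact test to a 2×2 table built from the counts quoted above, under the assumption (not stated in the text) that the one remaining female is heterozygous and the four remaining males are homozygous. This is an illustration only and is not the association model used for the GWAS.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: females, males. Columns: homozygous, heterozygous.
# The off-diagonal counts (1 and 4) are assumed, as noted above.
p_value = fisher_exact_two_sided(16, 1, 4, 16)
print(p_value)
```

With only 37 fish, an association this skewed is already very unlikely under independence, which is consistent with a single large-effect locus.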
Because dmrt1 had a strong signal of a selective sweep in ornamental betta, we hypothesized that dmrt1’s role in sex determination evolved rapidly during domestication. To explore the relationship between dmrt1 and sex in wild and ornamental betta, we first built a phylogenetic tree of the dmrt1 locus defined as a ~30 kb linkage-disequilibrium block (Fig. 3E). Consistent with the selective sweep, all ornamental females, but only one wild female, had a particular haplotype we call X1. In wild B. splendens, 50% (6/12) of XX individuals were female and 91% (10/11; binomial P=0.00048) of XY individuals were male (Fig. 3F,G). While this evidence suggests dmrt1_Y promotes maleness in wild B. splendens, it is possible that multiple sex determination systems segregate in the wild, similar to what is seen in African cichlids (39). In contrast, in ornamental betta, 87% (94/108; the 17 fish in the GWAS plus 91 independent samples; binomial P<10⁻¹²) of XX individuals were female and 93% (83/89; binomial P<10⁻¹²) of XY individuals were male (Fig. 3F,G). These results are consistent with a higher penetrance of XX in promoting female development in ornamental betta than in wild B. splendens (Fisher’s exact test two-tailed P=0.005) and suggest this effect contributed to the selective sweep around dmrt1. In line with selection at the dmrt1 locus occurring preferentially on the X, the ornamental X1 haplotype had 33% lower nucleotide diversity than the ornamental Y haplotype. Assuming no sex differences in mutation rates, X has ~3× more opportunity to accumulate mutations than Y, since it is present as two alleles in most females (XX) but only as one in most males (XY). Adjusting by this 3:1 ratio of X to Y, X1 has 78% lower diversity than Y. The more marked decrease in diversity on X1 supports the hypothesis that selection in ornamental betta has preferentially occurred in the X1 haplotype.
Since dmrt1 XX-XY status was not perfectly related to gonadal sex, we searched for additional sex-linked loci that may have been missed by GWAS. To do so, we performed two quantitative trait locus (QTL) mapping experiments, one in a cross between an XX female and an XY male, and another between an XX female and an XX male. In the XX × XY cross, 52% of the offspring were female and we detected a single sex-linked locus encompassing dmrt1 (Fig. 3B,H). In the XX × XX cross, 90% of the offspring were female and no locus was linked to sex (Fig. 3B,H). In the XX × XY cross, 85% of the XX offspring were female and 90% of the XY offspring were male. However, in the XX × XX cross all offspring were XX yet 10% of these fish developed as males, confirming the incomplete penetrance of the XX-XY locus in sex determination, as has been observed in Oryzias latipes (medaka fish) that also bear a dmrt1 XX-XY sex determination system (40). In sum, these results confirm that the dmrt1 locus is strongly linked to sex in ornamental betta, but that XX and XY are neither necessary nor sufficient to determine a particular sex.
To determine whether the X and Y transcripts of dmrt1 are differentially expressed during sex determination, we performed allele-specific expression analyses in XY ornamental larvae at several time points after fertilization. The results indicated that the dmrt1 Y allele constitutes 65% of the dmrt1 mRNA molecules at 4 days post fertilization (dpf) and that this allelic bias progressively decreases at 8 and 12 dpf, until it reverses in adult testis, where only 45% of the dmrt1 transcripts originate from the Y allele (Fig. 3I). This timing of dmrt1 XY allele-specific expression is consistent with that of sex determination, since we found that by 4 dpf, XX and XY larvae have started the process of sex differentiation: XY larvae express higher levels of gonadal soma derived factor (gsdf), a teleost-specific gene essential for testis development (41, 42), and higher levels of anti-Müllerian hormone (amh), a gene that promotes vertebrate male development (Fig. 3J). Each of these genes is the sex determination locus in other fish species (43, 44) and is on a separate chromosome from dmrt1 in betta, indicating that their sex-specific expression is a response in trans to dmrt1. Thus, the variants that distinguish dmrt1 X from Y are associated with higher expression of the dmrt1 Y allele in a manner that is temporally linked to sex differentiation, further implicating dmrt1 as the major sex determination gene in ornamental betta.
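The adjustment from a 33% to a 78% reduction follows from simple arithmetic, sketched here. With equal numbers of XX females and XY males, each pair carries three X copies and one Y copy, giving the 3:1 ratio; dividing the observed X1-to-Y diversity ratio by this factor yields the adjusted value. The function name is a hypothetical label for this calculation.

```python
def adjusted_diversity_reduction(raw_reduction, opportunity_ratio):
    """Reduction in diversity after correcting for mutational opportunity.

    raw_reduction:     observed fractional reduction of X1 relative to Y
    opportunity_ratio: copies of X per copy of Y in the population
    """
    observed_ratio = 1 - raw_reduction          # X1 diversity / Y diversity
    adjusted_ratio = observed_ratio / opportunity_ratio
    return 1 - adjusted_ratio


# One XX female plus one XY male carry 2 + 1 = 3 X copies and 1 Y copy.
x_to_y = (2 + 1) / 1
print(round(100 * adjusted_diversity_reduction(0.33, x_to_y)))  # prints 78
```

That is, (1 − 0.33) / 3 ≈ 0.22, so the adjusted X1 diversity is about 22% of that of Y, a reduction of roughly 78%.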
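The allelic percentages above are fractions of dmrt1 transcripts that carry Y-specific variants. As a minimal sketch of how such a fraction and its departure from balanced (50:50) expression could be evaluated from read counts, the example below uses an exact binomial test; the read counts are hypothetical, and the test shown is a generic choice rather than the procedure used in this study.

```python
from math import comb


def y_allele_fraction(y_reads, x_reads):
    """Fraction of allele-informative reads that carry the Y allele."""
    total = y_reads + x_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    return y_reads / total


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test of k successes in n trials."""
    probs = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    p_obs = probs[k]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(q for q in probs if q <= p_obs * (1 + 1e-9)))


# Hypothetical counts chosen to mirror the 65% Y-allele share at 4 dpf.
y_reads, x_reads = 65, 35
print(y_allele_fraction(y_reads, x_reads))               # 0.65
print(binomial_two_sided_p(y_reads, y_reads + x_reads))
```

A fraction of 0.5 corresponds to balanced expression of the two alleles; values above 0.5 indicate a Y bias and values below 0.5 an X bias, as in adult testis.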
Genetic bases of coloration in ornamental betta
Ornamental betta breeders have generated a vast array of fish varieties (e.g. “royal blue”) that differ along multiple axes of coloration: hue, brightness, saturation, and the anatomical distribution of these features. To determine if any of the genes we found to be under strong selection, as well as any others, contribute to coloration in ornamental betta, we performed a GWAS of the red (n=17) and blue (n=17) fish that were used for the selection scans (Fig. 4A; Suppl. Fig. 14A). Red and blue fish lie at opposite ends of the betta hue spectrum and also differ in their brightness and saturation (Fig. 4A,C; Suppl. Fig. 15). However, association mapping alone between pure red and pure blue fish, which are largely fixed for all these color features, cannot establish which of these features are affected by significant loci. Therefore, we also performed a QTL mapping experiment by generating a second-generation (F2) hybrid population of red-blue fish in which individual coloration components could segregate (Fig. 4B). In these 211 F2 hybrids, we measured the proportion of the anal, caudal and dorsal fins, of the side of the body, and of the head, that was red, blue, or very dark (which we refer to as black). We also measured the hue, brightness, and saturation of the red and blue areas on each body part and used these phenotypes for QTL mapping.
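To illustrate the kind of phenotypes described above, the sketch below converts pixel colors to hue, saturation, and brightness (value) and computes the proportion of red, blue, and black pixels in a body part. The hue boundaries and the darkness threshold are arbitrary assumptions chosen for the example, as are the function names; they do not reproduce the image-analysis procedure used in this study.

```python
import colorsys


def classify_pixel(r, g, b, dark_value=0.2):
    """Label an 8-bit RGB pixel as 'black', 'red', 'blue', or 'other'."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if v < dark_value:           # very dark pixels are called black
        return "black"
    hue_deg = h * 360
    if hue_deg < 30 or hue_deg >= 330:
        return "red"
    if 180 <= hue_deg < 270:
        return "blue"
    return "other"


def color_proportions(pixels):
    """Proportion of pixels in each color class for one body part."""
    if not pixels:
        raise ValueError("need at least one pixel")
    counts = {"red": 0, "blue": 0, "black": 0, "other": 0}
    for r, g, b in pixels:
        counts[classify_pixel(r, g, b)] += 1
    return {label: c / len(pixels) for label, c in counts.items()}


# Four hypothetical pixels from one fin: two red, one blue, one very dark.
toy_fin = [(200, 30, 30), (210, 20, 40), (30, 60, 200), (10, 10, 10)]
print(color_proportions(toy_fin))
```

Once pixels are assigned to a class, the mean hue, saturation, and value within the red or blue class of each body part give the remaining per-part phenotypes.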
The strongest GWAS signal occurred between augmentator-α2 (alkal2l) and beta-carotene oxygenase 1-like (bco1l) on chromosome 2 (Fig. 4A,K), a region with a large difference in selective sweep signal between blue and red fish (Fig. 2B). This GWAS peak was aligned with a QTL at which the allele swept in blue fish increased the proportion of blue and decreased the proportion of red on fins and body in the hybrids (Fig. 4E,G). Interestingly, this locus modulates blue saturation only on the body and not on the fins or the head (Fig. 4E). alkal2l encodes a ligand of Leukocyte Tyrosine Kinase (33, 36) which is expressed in the precursors of iridophores, the chromatophores that generate refractive colors such as blue (34, 35). In zebrafish, alkal2l is necessary for iridophore development (33). Altogether, this suggests that the large number of iridophores in blue ornamental betta, compared to red fish, is caused by genetic variation affecting this developmental cell-signaling ligand. alkal2l likely corresponds to the gene referred to by betta breeders as the spread iridocyte gene, hypothesized to increase the prevalence of iridescence throughout the body (45).
Notably, the alkal2l–bco1l locus also modulated the red hue of the red parts of the body (Fig. 4E,G), suggesting that bco1l, which encodes a protein predicted to metabolize orange-red carotenoids, could also be involved in differences between red and blue fish. Through biochemical assays, we found that, as predicted by its sequence homology to other BCO1 proteins, BCO1L has 15,15′-dioxygenase activity that cleaves β-carotene into two molecules of all-trans retinal (Suppl. Fig. 16A-D). Consistent with the QTL effect on red hue and BCO1L biochemical activity, we found that red fish have more β-carotene and echinenone in their skin than blue fish (Fig. 4D). One of the bco1l variants most strongly associated with red and blue coloration results in a change from threonine in red fish to isoleucine in blue fish (Suppl. Fig. 14B and Suppl. Fig. 16E,F). We did not detect differential biochemical activity of the two alleles in vitro, but it is possible that their activity, stability, or gene expression differs in vivo (Suppl. Fig. 14C). Therefore, variation in the locus containing alkal2l and bco1l likely affects both blue and red coloration through these two genes located only ~50 kb apart. The tight linkage might explain why breeders struggle to make the “perfect” red fish without any iridescence.
The second strongest GWAS peak, on chromosome 8, mapped to adenylosuccinate lyase (adsl), and the strongest QTL at this locus was for the brightness of blue areas on the body (Fig. 4E,I,M). adsl encodes an enzyme involved in the de novo synthesis of purines (46). Purines are the major components of the reflective platelets in fish skin iridophores that underlie iridescence (47), and these platelets differ in structure between blue and red betta fish (48). While the homologs of adsl have not been previously implicated in animal coloration, mutations in other genes in the de novo purine synthesis pathway cause iridophore defects in zebrafish (49). adsl likely corresponds to the gene betta breeders refer to as blue (48, 50–52).
The third strongest GWAS peak, on chromosome 1, mapped to solute carrier family 2, member 15b (slc2a15b), a gene necessary for the development of larval yellow xanthophores in medaka (53), but whose role in adult pigmentation was previously not described (Fig. 4A,J). We found a QTL that overlaps slc2a15b that strongly affected the saturation of red areas in the fins, but not of the body or the head (Fig. 4E,F). Intense coloration on the fins relative to the body is a phenotype referred to by breeders as the “Cambodian” variety, and our results suggest slc2a15b contributes to this phenotype.
The fourth strongest GWAS peak, on chromosome 6, mapped to kit ligand (kitlga), whose orthologues affect melanin pigmentation in other fish and in mammals (54, 55) (Fig. 4A,L). A QTL overlapping kitlga strongly modulated the proportion of black, blue, and red on the head and fins, but less so on the body (Fig. 4E,H). A black head, a phenotype we found is linked to kitlga, is referred to by breeders as the mask trait (7). This QTL also modified the saturation of blue areas on the fins but not on the body, and had minor effects on red saturation outside the head. Its comparatively stronger impact on blue saturation may be related to the tight histological association of iridophores and melanophores as a unit in betta skin (48).
Altogether, we discovered that red-blue variation in ornamental betta is linked to genetic polymorphisms near two genes encoding cell-signaling ligands (alkal2l and kitlga), two enzymes (bco1l, which metabolizes pigments, and adsl, which produces material for reflective structures), and a membrane solute transporter (slc2a15b). Genes we identified likely correspond to those inferred, but not molecularly identified, by betta geneticists beginning in the 1930s (52, 56). Notably, all of these genes had anatomical specificity, and all but two were on separate chromosomes (Fig. 4A,D).
Genetic bases of tail morphology in ornamental betta
We found strong signals of selective sweeps in crowntail fish on chromosomes 22 and 24, suggesting these regions could harbor variants associated with crown morphology (Fig. 2D,E). To identify such variants within selective peaks, and elsewhere throughout the genome, we performed a GWAS with the 18 veiltail and 16 crowntail fish used for the selection scans. We found two significant peaks, one on chromosome 22 and another on chromosome 24, overlapping the selection peaks (Fig. 5A), indicating that these regions are not only under selection but are the main loci contributing to differences between veiltail and crowntail fish.
To confirm the involvement of the GWAS loci in fin morphology, we performed a QTL mapping experiment in an F2-hybrid population from a cross between veiltail and crowntail fish (Fig. 5B). In agreement with the GWAS results, we found two significant QTLs, one on chromosome 22 and another on chromosome 24, that overlap the GWAS peaks. Surprisingly, we found that the chromosome 22 locus is significantly linked only to anal fin webbing and not caudal fin webbing, whereas the chromosome 24 locus is linked to caudal fin webbing but not significantly linked to anal fin webbing (Fig. 5B). These complementary association and quantitative mapping experiments demonstrate that two loci are the primary determinants of veil–crown morphology and that webbing of different fins is under separate genetic control.
Examining the genes at the crown-veil GWAS peaks identified promising causal genes. The strongest association signal on chromosome 22 maps to frmd6, which encodes the protein willin that regulates tissue growth as part of the hippo pathway (57) (Fig. 5C,D; Suppl. Fig. 17). The region of strongest association on chromosome 24 is larger and encompasses 22 genes. Of these, tfap2b and tfap2d, which have evolutionarily ancient roles in ectodermal development (58), are prominent candidate genes (Fig. 5C,D; Suppl. Fig. 17). Interestingly, as with the variants that affect coloration, we also find evidence of anatomical modularity for the variants that affect fin morphology. These results also demonstrate that there is no single “crowntail gene”, as had been speculated by ornamental betta breeders (7).
Discussion
Using whole genome sequencing of multiple Betta species, populations, and individuals, we take an important first step in unraveling the domestication history of betta fish. Our results suggest that betta were domesticated more than 1,000 years ago —at least three centuries earlier than previously suggested. While domesticated betta are largely derived from Betta splendens, they carry genetic contributions from two other species that are also endemic to the Malay Peninsula: B. imbellis and B. mahachaiensis. None of the alleles derived from these other Betta species are present in all ornamental individuals, nor do they contribute to the regions under selection driving sex determination, coloration, or fin morphologies (Supp. Fig 12). These introgressed alleles, however, might contribute to other traits of domesticated betta or represent historical attempts by breeders to introduce new phenotypes into ornamental fish through hybridization.
The strongest genetic evidence of selection during domestication involves dmrt1, which we discover is the sex determination gene in ornamental betta. Most females are XX and most males are XY, settling a long-standing question in the field (59). A dmrt1_X allele with increased penetrance may have been selected by breeders and swept through the population, since it would lead to more predictable sex ratios in spawns. The lower penetrance of dmrt1 in sex determination in wild B. splendens suggests that additional sex determination loci that operate in the wild are not present in domesticated betta, similar to what is seen in zebrafish (60). In contrast to domesticated zebrafish, where sex is not determined by a single locus, in ornamental betta sex is predominantly determined by a single large-effect locus that maps to dmrt1.
In poeciliid fishes such as guppies and swordtails, the sex determination locus is linked to multiple color genes that contribute to sexually dimorphic coloration and shape the genetics of female preference for male color traits (61). In contrast, the betta sex determination locus is only ~30 kb in size and is not linked to genes known to affect color. These results are consistent with coloration not being particularly sexually dimorphic in betta. Instead, we find that color and fin morphology in betta have a Lego-like logic, in which major-effect genes located on different chromosomes modulate color and fin morphology with surprising anatomical specificity (Suppl. Table 2). Betta breeders are keenly aware of the mix-and-match possibilities of betta, and leverage this feature to breed new fish varieties by combining different body, head, and fin colors with various fin morphologies.
Our results provide molecular entry points for further study of the developmental and evolutionary bases of change in morphology and sex determination. The genomic resources we generated will also enable genetic studies into how centuries of artificial selection of betta for fighting purposes have shaped their aggression and other fighting-related traits. Altogether, our work elucidates the genomic consequences of the domestication of ornamental betta and helps establish this fish as a modern system for evolutionary genetic interrogation.
Funding
Searle Scholarship and Sloan Foundation Fellowship (AB). Wellcome grant WT206194 (IB, JW, SM, WC, KH, RD). Wellcome grant WT207492 (SM, RD). Flemish University Research Fund (JC-G, HS). FWO Research Foundation Flanders Ph.D. fellowship (NV). National Institutes of Health grant EY020551 (JvL). National Institutes of Health grant EY028121 (JvL).
Author contributions
Conceptualization: AB, RD, YMK, HS. Formal analysis: AB, IB, WC, JC-G, KH, YMK, SM, HS, NV, JW. Funding acquisition: AB, RD, JvL, HS. Investigation: AB, SB, KXF, CH, YMK, MRL, HS. Project administration: AB, RD, HS. Resources: HHT, LR. Software: JC-G, YMK, HS, NV. Supervision: AB, RD, JvL, HS. Visualization: SB, YMK, HS, NV. Writing – original draft: AB, YMK, HS. Writing – review & editing: AB, RD, CH, KH, YMK, JvL, MRL, LR, HS.
Competing interests
Authors declare that they have no competing interests.
Data and materials availability
Data used in the analysis are available in NCBI GenBank GCA_900634795.3 and BioProject PRJNA486171.
Acknowledgments
The DNA pipelines staff at the Wellcome Sanger Institute generated sequencing data. Debbie Leung and Hiroki Tomida photographed fish. Ronny Kyller as well as members of the International Betta Congress including Liz Hahn, Sieg Illig, Karen MacAuley, and Holly Rutan provided samples. Leo Buss, Darcy Kelley, Carol Mason, and Molly Przeworski provided comments on the manuscript.