Abstract
Accurate species phylogenies are a prerequisite for evolutionary research. Teleosts are by far the largest and the most diversified group of extant vertebrates, but relationships among the three oldest lineages of extant teleosts remain unresolved. Based on seven high-quality new genome assemblies in Elopomorpha (tarpons, eels), we revisited the topology of the deepest branches of the teleost phylogeny using independent gene sequence and chromosomal rearrangement phylogenomic approaches. These analyses converged on a single scenario that unambiguously places the Elopomorpha and Osteoglossomorpha (bony-tongues) in a monophyletic group sister to all other teleosts, i.e., the Clupeocephala lineage. This finding resolves over 50 years of controversy on the evolutionary relationships of these lineages and highlights the power of combining different levels of genome-wide information to solve complex phylogenies.
One-Sentence Summary Whole-genome analyses place Elopomorpha (tarpons, eels) and Osteoglossomorpha (bony-tongues) as sister groups at the deepest branching of crown teleosts.
Main Text
Species phylogenies retrace sister-group relationships resulting from evolutionary histories and pathways from common ancestors to descendant species (1). Accurate species phylogenies are important for our understanding and representation of the evolution of life on earth, but they are also a fundamental prerequisite for evolutionary analyses at the developmental, anatomical, genetic, and species levels.
With more than 30,000 species, teleost fishes are by far the largest and the most diversified clade of extant vertebrates (2). Understanding their phylogeny has been and is still subject to many disputes at different taxonomic levels (2, 3). Among these debates, a long-standing and unresolved question concerns the topology of the earliest-branching clades of crown teleosts, i.e., the Elopomorpha (named after “Elops-like” and including tarpon, bonefish and eels) and the Osteoglossomorpha (named after “bony-tongues” and including goldeye, arapaima, and elephantnose fish) relative to all the other extant teleosts in the Clupeocephala lineage (including for instance zebrafish, a major biomedical species) (3, 4).
Based on anatomical and morphological characters, Elopiformes (tarpons and ladyfishes) were first suggested to be the most “primitive living teleosts” nearly 100 years ago ((5) cited in (4)). Since then, Elopomorpha and Osteoglossomorpha have been alternatively placed as the earliest branching clade of teleosts. A first scenario, proposed in 1977 (6), placed the Osteoglossomorpha as the earliest teleost crown group and outgroup to the Elopocephala, consisting of Elopomorpha and Clupeocephala (Fig. 1A). This scenario was later challenged in 1997 (7, 8) by the placement of Elopomorpha as the earliest branching clade of teleosts, with Osteoglossomorpha and Clupeocephala composing the Osteoglossocephala clade (Fig. 1B). These early controversies were based on morphological evidence and remained largely unsolved, but the most recent authoritative view still considers the Elopomorpha as the earliest branching clade of crown teleosts (2).
With the emergence of molecular phylogenetic approaches in the nineties, this question was extensively revisited using gene sequence phylogeny reconstructions (reviewed in (3, 9)). Despite extensive efforts, including several large-scale multi-locus approaches (10–12), no consensus has been reached in favor of either the Elopocephala or the Osteoglossocephala hypothesis. In addition, a third topology placing Elopomorpha and Osteoglossomorpha as sister groups (Fig. 1C) was suggested in the early nineties (13) and has since been supported by a few more recent studies (11, 14–18). This clade, which we tentatively named the Eloposteoglossocephala (Fig. 1C), was never formally retained, probably because this topology was not supported by any morphological evidence (3). The prevailing hypothesis, confirmed by a recent meta-analysis of gene sequence phylogeny studies (9), thus remains the Osteoglossocephala hypothesis that places Elopomorpha as the earliest branching clade of extant teleosts (12). However, the precise phylogenetic relationships of these major teleost lineages are still debated and were recently reviewed by Dornburg and Near (3) as one of the major unresolved questions of the twenty-first century regarding the evolution of actinopterygian fishes. To promote a reexamination of this problem, they provocatively proposed to retain “the unconventional and intriguing possibility of an osteoglossomorph and elopomorph sister group relationship” (3).
To resolve the phylogenetic relationships of these early-branching teleost clades, we first sequenced, assembled, and annotated high-quality reference genome sequences of seven species that represent major Elopomorpha orders or families (Fig. 2 and table S1) for which chromosome-level whole-genome resources were lacking. To perform phylogenomic analyses, we combined genome information from these seven Elopomorpha species with 18 additional publicly available genome assemblies: four Osteoglossomorpha, 10 Clupeocephala, and four vertebrate outgroups, among them two non-teleost fishes, the spotted gar and the bowfin.
A major challenge for achieving accurate phylogenetic analysis of teleost genomes is their high number of duplicate (paralogous) gene copies. Many of these paralogs are inherited from a whole genome duplication (WGD) in their last common ancestor (20), and are known to mislead phylogenetic reconstructions (11). To mitigate the effect of paralog inclusion, we applied a WGD-tailored pipeline leveraging gene sequences and synteny conservation (supplementary materials, Methods section) to select 955 high-confidence 1-to-1 orthologous genes across all 25 genomes we analyzed. This list represents by far the largest molecular dataset considered for teleost phylogeny reconstruction, both in terms of included Elopomorpha genomes and of total alignment size (see fig. S1). We then reconstructed the species phylogeny from the 955 individual gene trees using summary analyses with ASTRAL (Fig. 3A, and fig. S2 for protein trees), as well as Maximum Likelihood analyses of their concatenated sequences at both the nucleotide and amino-acid levels (figs. S3 and S4). These analyses all provided highly significant support for the Eloposteoglossocephala hypothesis that places Osteoglossomorpha and Elopomorpha as sister groups. Additionally, this Eloposteoglossocephala clade was further supported by gene-genealogy interrogation, which directly compares the likelihood of each of the three evolutionary scenarios based on individual gene sequence alignments (Fig. 3B).
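The per-gene comparison of the three scenarios can be illustrated with a minimal sketch. It assumes that a log-likelihood has already been computed for each gene alignment under each of the three constrained topologies; the gene names, the values, the function names, and the `min_delta` cutoff are all hypothetical and are not taken from the analyses reported here. A real analysis would typically rely on a statistical topology test rather than on a raw likelihood gap.

```python
# Labels for the three competing topologies (Fig. 1A-C).
TOPOLOGIES = ("Elopocephala", "Osteoglossocephala", "Eloposteoglossocephala")


def best_topology(log_likelihoods, min_delta=0.0):
    """Return the topology with the highest log-likelihood for one gene.

    Returns None when the gap to the runner-up does not exceed min_delta,
    i.e. when the gene is treated as uninformative.
    """
    ranked = sorted(log_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_lnl), (_, second_lnl) = ranked[0], ranked[1]
    return best if best_lnl - second_lnl > min_delta else None


def tally_support(per_gene, min_delta=0.0):
    """Count how many genes favor each of the three topologies."""
    counts = {name: 0 for name in TOPOLOGIES}
    for lnl in per_gene.values():
        winner = best_topology(lnl, min_delta)
        if winner is not None:
            counts[winner] += 1
    return counts


# Toy, made-up log-likelihoods for three hypothetical genes.
toy_genes = {
    "geneA": {"Elopocephala": -1050.2, "Osteoglossocephala": -1049.8,
              "Eloposteoglossocephala": -1041.3},
    "geneB": {"Elopocephala": -2300.0, "Osteoglossocephala": -2290.5,
              "Eloposteoglossocephala": -2295.0},
    "geneC": {"Elopocephala": -880.1, "Osteoglossocephala": -880.4,
              "Eloposteoglossocephala": -879.9},
}

if __name__ == "__main__":
    print(tally_support(toy_genes))
    print(tally_support(toy_genes, min_delta=2.0))
```

Raising `min_delta` discards genes whose preferred topology is only marginally better than the runner-up, which is one simple way to limit the influence of weakly informative loci.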
However, because previous sequence-based studies have yielded conflicting resolutions of the three early-diverging teleost branches (9, 11), we also used two novel genome-wide methods to infer species trees based on conservation of genome structures. First, we analyzed the conservation of gene adjacencies between 3,041 orthologous marker genes covering 57-98% of each teleost genome, and inferred a Neighbor-Joining species tree from local microsyntenic conservation (21). Second, we analyzed macrosyntenic evolution by measuring the fraction of shared chromosomal breakpoints between species with PhyChro (22). These two complementary approaches (Figs. 3C-3D, fig. S5) also provided convergent and robust support for the Eloposteoglossocephala scenario, confirming the results from gene sequence phylogenies.
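The idea behind an adjacency-based distance can be sketched as follows. This is only an illustration under stated assumptions: each genome is represented as a list of chromosomes, each chromosome as an ordered list of orthologous marker identifiers, and the distance is taken as one minus the Jaccard similarity of the two adjacency sets. The method cited above (21) may define and weight adjacencies differently, and the species and marker names below are invented. The resulting matrix could then be passed to any Neighbor-Joining implementation, a step omitted here.

```python
from itertools import combinations


def adjacencies(chromosomes):
    """Return the set of unordered neighboring marker pairs in a genome."""
    pairs = set()
    for genes in chromosomes:
        for left, right in zip(genes, genes[1:]):
            pairs.add(frozenset((left, right)))
    return pairs


def adjacency_distance(genome_a, genome_b):
    """Distance = 1 - Jaccard similarity of the two adjacency sets."""
    a, b = adjacencies(genome_a), adjacencies(genome_b)
    union = a | b
    if not union:
        return 0.0  # no adjacencies at all: treat genomes as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(genomes):
    """Pairwise distances as a dict keyed by (species_x, species_y)."""
    names = sorted(genomes)
    dist = {(n, n): 0.0 for n in names}
    for x, y in combinations(names, 2):
        dist[(x, y)] = dist[(y, x)] = adjacency_distance(genomes[x], genomes[y])
    return dist


# Toy genomes for three hypothetical species sharing six marker genes.
toy_genomes = {
    "sp1": [["g1", "g2", "g3"], ["g4", "g5", "g6"]],
    "sp2": [["g1", "g2", "g3", "g4"], ["g5", "g6"]],
    "sp3": [["g3", "g1", "g2"], ["g6", "g4", "g5"]],
}

if __name__ == "__main__":
    for pair, d in sorted(distance_matrix(toy_genomes).items()):
        print(pair, round(d, 3))
```

Adjacencies are stored as unordered pairs so that a segment read in the opposite orientation still counts as conserved.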
Finally, by looking at all potential chromosomal macro-rearrangements, we identified a single chromosomal fusion exclusively shared between karyotypes of Osteoglossomorpha and Elopomorpha species (Fig. 4, fig. S6). Together with the absence of other rearrangements that would be consistent with alternative groupings, this identical chromosomal macro-rearrangement further supports the view that the two groups descend from a common ancestor, strengthening the phylogenomic evidence for the Eloposteoglossocephala clade.
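The reasoning applied to a shared rearrangement can be expressed as a simple check: a derived character is informative for a grouping when it is found in exactly two of the three teleost lineages and is absent from the outgroups. The sketch below assumes a presence/absence table per species and a species-to-lineage map; the species labels, character values, and function names are hypothetical placeholders, not data from this study.

```python
def lineages_with_state(character, lineage_of):
    """Lineages containing at least one species with the derived state."""
    return {lineage_of[sp] for sp, present in character.items() if present}


def supported_grouping(character, lineage_of, outgroup="outgroup"):
    """Return the pair of lineages exclusively sharing a derived character.

    Returns None if the state also occurs in the outgroup, or if it is
    not found in exactly two ingroup lineages.
    """
    carriers = lineages_with_state(character, lineage_of)
    if outgroup in carriers:
        return None  # present in the outgroup: not a shared derived state
    return tuple(sorted(carriers)) if len(carriers) == 2 else None


# Hypothetical species labels mapped to their lineage.
toy_lineages = {
    "elo1": "Elopomorpha", "elo2": "Elopomorpha",
    "ost1": "Osteoglossomorpha",
    "clu1": "Clupeocephala",
    "out1": "outgroup",
}

# Made-up presence/absence of two rearrangements.
toy_fusion = {"elo1": True, "elo2": True, "ost1": True, "clu1": False, "out1": False}
toy_private = {"elo1": True, "elo2": False, "ost1": False, "clu1": False, "out1": False}

if __name__ == "__main__":
    print(supported_grouping(toy_fusion, toy_lineages))
    print(supported_grouping(toy_private, toy_lineages))
```

Requiring only one carrier species per lineage leaves room for secondary loss within a lineage; a stricter variant could require the state in every sampled species.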
Using a combination of new whole-genome resources for Elopomorpha and an array of complementary phylogenomic reconstruction methods, we unambiguously resolved the long-standing question of the topology of the deepest branches in the phylogeny of extant teleost fishes. This achievement highlights the power of genome-wide methods to resolve complex and ancient phylogenies, especially when these methods consider a variety of informative evolutionary characters in complement to sequence information. Chromosome rearrangements, in particular, are fixed at a low rate and thus are less prone to mutational saturation and character reversal, which can occur in sequence-based phylogenies (24).
Our results resolve over 50 years of controversy and demonstrate that Elopomorpha and Osteoglossomorpha constitute a clade for which we propose the name Eloposteoglossocephala (supplementary materials, section 1). This conclusion raises questions about the paucity of anatomical evidence in favor of this hypothesis, despite more than 70 years of extensive research (3, 4). We carefully reexamined the available literature on these anatomical characters in light of our results and we were not able to find a morphological character exclusively and unambiguously shared by Elopomorpha and Osteoglossomorpha (supplementary materials, section 2). However, the fusion of the retroarticular with the angular and/or the articular, a derived character previously considered a synapomorphy of the Elopomorpha (25, 26), has been shown to be shared with at least mormyrids among bony-tongues (26, 27). Even if this character is described as either present (27) or ambiguous in goldeye Hiodon alosoides, and absent in two other Osteoglossomorpha (26), we propose this derived state as a morphological synapomorphy of the Eloposteoglossocephala, which was secondarily lost in some Osteoglossomorpha. We anticipate that based on our results, more character mapping and new targeted anatomical and morphological searches will soon provide novel and non-ambiguous synapomorphies shared by the Eloposteoglossocephala.
Funding
This work was supported by the Agence Nationale de la Recherche, France (ANR) on the GenoFish project, 2016-2021 (grant No. ANR-16-CE12-003) to HRC, CB, JB, JHP, M.R.C and YG, and by the France Génomique national infrastructure, funded as part of the “Investissement d’avenir” program managed by ANR (grant No. ANR-10-INBS-09) to CD. Part of the fellowship to E.P. was supported by funds from the European Union Horizon 2020 research and innovation program under Grant Agreement No 817923 (AQUA-FAANG). M.R.-R. was supported by the Swiss National Science Foundation grant 31003A_173048. J.H.P. was supported by the National Institutes of Health under grant agreement No R01OD011116.
Author contributions
Conceptualization: CB, JB, IB, YG, CK, AL, JHP, MRR, HRC
Software: CB, AL, EP, HRC
Formal analysis: CB, AL, EP, HRC
Investigation: CB, OB, CC, AC, CD, RD, CFB, YG, CI, HJ, EJ, CK, GL, JL, AL, JM, EP, HRC, CR, MW, MZ
Resources: JB, IB, WJC, RD, YG, CH, HJ, SM, JHP, AT, MW
Data curation: CB, CC, YG, CK, AL, JM, MZ
Visualization: CB, YG, AL, EP
Funding acquisition: CB, JB, CD, YG, JHP, MRR, HRC
Project administration: YG
Supervision: CB, YG, HRC
Writing – original draft: CB, YG, EP, HRC
Writing – review & editing: CB, JB, IB, CC, TD, YG, GL, EP, HRC, CR, MRR
Competing interests
The authors declare that they have no competing interests.
Data and materials availability
The Whole Genome Shotgun projects for the seven Elopomorpha species are available in the Sequence Read Archive (SRA) under the following BioProject references: PRJNA702045 (Conger conger), PRJNA692825 (Albula goreensis), PRJNA743502 (Aldrovandia affinis), PRJNA690086 (Megalops atlanticus), PRJNA693699 (Anguilla anguilla), PRJNA743503 (Synaphobranchus kaupii), and PRJNA702255 (Gymnothorax javanicus). All genome assemblies and their annotations are also available in the omics Dataverse (open-source research data repository) server (https://doi.org/10.15454/GWL0GP). All input data (sets of orthologous marker genes, CDS codon alignments, gene coordinate files) and the reconstructed species phylogenies have been deposited in Zenodo (doi: 10.5281/zenodo.6414307), along with all scripts and environments needed to reproduce the analyses.
Acknowledgments
We thank Yoann Guilloux, Fabien Quendo and Aaron J. Adams for their help in providing fish samples. We also thank the leaders of the oceanography cruises and the crews of the RV Atalante, France and ORI, Taiwan for organizing the survey and helping to collect the deep-sea fish samples under the TDSB-TFDeepEvo joint Program.