Abstract
Adult body sizes in Austrolebias annual killifish vary widely between species. The current hypothesis is that large sizes and specialized piscivory have evolved in successive vicariance speciation events with gradual evolution. We revisit this hypothesis using size measures combined with range data and new phylogenetic trees based on mitochondrial and nuclear molecular markers. Our analysis repeats biogeographic and Ornstein-Uhlenbeck trait evolution modeling across the posterior distributions of phylogenetic trees. We identify three events where large species evolved from small ones. Our mtDNA phylogenetic trees reject the null hypothesis that all speciations were allopatric, but we note that some sympatric speciations might mark introgression events. Vicariance is unlikely to have played an important role: this type of speciation event can be simplified out of the biogeographic models. We propose a new scenario for the emergence of large piscivorous Austrolebias species: giant-dwarf speciation. In this eco-evolutionary scenario a large morph evolves in a population due to disruptive selection driven by character displacement and cannibalism. It can lead to the emergence of a sympatric large and small species pair. The clade containing A. elongatus seems a plausible case of giant-dwarf speciation. The species in it experience stabilizing selection with an optimum shifted towards larger bodies and longer jaws. The branch leading to this clade has the fastest evolving jaw lengths across the phylogeny. Our analysis suggests a potential alternative giant-dwarf scenario where large body size evolved in sympatry first, followed by a fast increase in body and jaw length for one daughter species in a subsequent event. A. wolterstorffi is selected towards a larger body and longer jaw, but evolves at slower rates than the elongatus clade. 
Discordance between mitochondrial and nuclear gene trees suggests introgression and the timing of trait divergence cannot be determined well at this point. For the clade with the remaining large species, we can reject the giant-dwarf hypothesis. The absence of substantial and rapid jaw length evolution suggests that they evolved due to other selective processes.
Introduction
Allopatric speciation is generally seen as the default mode of speciation, or at least as the most frequent one (Coyne and Orr 2004). Biogeographic models fitted to phylogenetic trees are essential to construct and test hypotheses on speciation modes (Crisp et al. 2011) and to infer their frequencies. They are an improvement over previous approaches (Lynch 1989; Barraclough and Vogler 2000). In comparison to the estimation of frequencies of speciation modes or testing the null hypothesis that all speciation events are allopatric, inferring the mode and mechanism of a specific speciation event, in particular when not recent and not between a pair of sister taxa, seems particularly challenging (Coyne and Orr 2004). It might in many cases never lead to the falsification of specific scenarios. However, the scope for that might improve when a proposed scenario involves a particular pattern of phenotypic trait evolution that can be modeled using phylogenetic comparative methods (Garamszegi 2014).
Austrolebias (Costa 2006) is a genus of South American annual killifish mostly occurring in temperate zones. A biogeographic reconstruction based on large areas of endemism (Costa 2010) found many cladogenetic events with neither vicariance nor dispersal. Nevertheless, it is believed that the dominant mode of speciation in the genus is allopatric with some potential cases of sympatric speciation between sister taxa (Loureiro et al 2015). A null hypothesis where all speciations are allopatric has not been falsified using statistical inference. Speciation events within the genus where large species have evolved from small ones have contributed to a major axis of life history variation in the genus and we aim to understand the mechanisms that drove this divergence. Different elements of the evolution of large species from small and of piscivory within Austrolebias have been studied separately and often with weak conclusions. Based on morphological data alone, Costa (2009, 2011) concluded that there has been a single evolutionary change from small to large size within Austrolebias, plus a single event leading to its hypothesized sister genus with large species, Cynolebias. Large Austrolebias generally coexist with small Austrolebias, while large Cynolebias species co-occur with small-bodied Hypsolebias annuals (Costa 2010). For Austrolebias, Costa (2010) found the piscivores in an apex position in his cladogram and concluded that large sizes and piscivory evolved gradually within the genus in a series of vicariant speciation events where first body size increased and then sporadic and subsequently specialized piscivory evolved. In contrast with a single origin of large size within Austrolebias, the first phylogenetic trees of Austrolebias based on mitochondrial DNA sequences (mtDNA, Garcia et al 2000, 2002) suggest repeated evolution of large from small. 
However, these mtDNA trees nearly exclusively contained Uruguayan species and had nodes with bootstrap support values below 50%. A simple biogeographic reconstruction involving no modelling (Garcia et al 2000) assigned two origins of large species to sympatric speciation events, but this was without taking phylogenetic uncertainty into account. A more recent mtDNA-based study (Garcia et al 2014) suggested that there are three clades with large Austrolebias species, a first one containing A. elongatus, a second including A. robustus and with A. wolterstorffi as a third clade branching off at the origin of Austrolebias. This would falsify Costa’s (2010) hypothesis, but the posterior probability of the latter clade was 0.67 and the placements of the two other clades had below 50% support, which we consider insufficient support to reject the gradual scenario within a single clade proposed by Costa (2010).
As there seems a possibility that the scenario of vicariance and gradual specialization of large Austrolebias will be rejected, we already propose an alternative hypothesis for the emergence of these species. This gives us the opportunity to check whether our results are in agreement with the new scenario. If not, we will need to propose further alternative scenarios.
Giant-dwarf sympatric speciation by competition and cannibalism, an eco-evolutionary scenario proposed in theoretical studies by Claessen et al (2000) and Dercole and Rinaldi (2002), can be an alternative scenario for the evolution of large species from small. Coexisting dwarf and giant morphs can emerge within populations as a result of size-dependent cannibalism and plastic growth rate variation between cohorts (Claessen et al 2000). Complementary to this, an evolutionary model demonstrated that sympatric species pairs can be an outcome of the evolution of cannibalism (Dercole and Rinaldi 2002). The possibility of sympatric speciation by giant-dwarf speciation has been little investigated using comparative data. Leijs et al (2012) carried out a partial assessment by estimating that subterranean beetle assemblages in isolated aquifers must have evolved locally in a number of cases, but did not assess the evolution of traits involved in giant-dwarf speciation along the phylogeny. We are aware that sympatric speciation remains a contentious issue and that the presence of sympatry strongly depends on the scale and detail of the ranges considered. In plants, examples of in situ speciation have turned out more likely to represent parapatric processes (Papadopulos et al 2014). Similarly, in two examples of potential sympatric speciation in lacustrine fish (Schliewen et al 2001; Barluenga et al 2006), emerging species tend to specialize on different resources that can be unevenly distributed across habitats within a lake. In contrast, range overlap is required for cannibalism to be possible, naturally limiting the scope for parapatry in giant-dwarf speciation. In this first assessment of this scenario, we will therefore focus on results from phylogenetic comparative methods, and will assess sympatry using the ranges considered by Costa (2010).
Species trees based on many markers are referred to as the true bifurcation history of species (Mallet et al 2015) and seem to be preferably used for any comparative analysis. However, all discordance between gene trees is regularly assumed to be due to lineage sorting (Liu and Pearl 2007), facilitating reconstructions while neglecting part of the true history, namely introgression and hybridization between species. As explained above, the existing mtDNA phylogenetic trees for Austrolebias have important nodes with poor support. In a comparative or biogeographic analysis taking phylogenetic uncertainty into account, the consequence will be that the phylogenetic uncertainty likely dominates the inference with an extreme loss of statistical power. There is no molecular phylogenetic tree based on nuclear markers that includes large Austrolebias. Therefore no assessment of introgression or hybridization based on mito-nuclear discordance has been possible so far. Lab data demonstrate that at least some species pairs in Austrolebias can be hybridized (Oviedo et al 2016) so that we should not opt too quickly for methods assuming the absence of introgression and hybridization. If exchange has happened between established species, there is a high likelihood of at least some spatial overlap during the exchange, and the amount of trait divergence must still have allowed mating between the species. Even while current biogeographic reconstruction methods allow for example constraining ranges at nodes or the allowed dispersal jumps (Matzke 2014), nearly all comparative methods and all biogeographic reconstructions assume a single true bifurcation history. There are currently no reconstruction methods that arrive at predictions for ancestors that are compatible with the historical patterns of several discordant gene trees in a single fit. 
It is therefore reasonable neither to consider a priori just a single species tree based on concatenated markers or coalescent methods, nor to try to obtain a species phylogenetic network using next-generation sequencing, which cannot readily be used for further analysis. Instead we will consider first of all a set of potentially discordant bifurcation histories, those found in separate gene trees. We will assess the hypotheses for each of these while checking consistency and discordance, and respecting phylogenetic uncertainty.
In our analysis, we will propose that giant-dwarf speciation has occurred at some nodes in the phylogeny. If the evolution of traits related to cannibalism and piscivory is not in agreement with the divergence scenario, we will immediately falsify that hypothesis at a node and similarly so when we reconstruct a non-sympatric speciation with sufficiently high probability. If giant and dwarf species emerge in a sympatric process of evolutionary branching by negative frequency-dependent selection (Geritz et al 1998), disruptive selection causes large phenotypic differences to emerge from an ancestral population with much less phenotypic variation. Since a minimum size difference is often necessary for cannibalism to be possible (e.g., Persson et al 2000), one can assume that a relatively rapid and large size and gape change is necessary for evolutionary branching by cannibalism. Fitness landscapes for species pairs that have arisen after branching are bimodal, with each species sitting at a different optimum. In order to sustain the hypothesis, phylogenetic comparative methods should detect these differences in optimal trait values as a selection regime shift at nodes where the giant-dwarf speciation occurred. What is also apparent from Claessen et al (2000) is that the emerging dwarfs do not become smaller when a giant evolves, thus a single regime shift at the node is expected, for the cannibal.
We will test whether all speciation events in Austrolebias were allopatric and whether Costa’s (2011) proposed scenario for the evolution of large piscivores is true, using (1) two sets of posterior distributions of molecular phylogenetic gene trees, one obtained by modeling nuclear ribosomal DNA and one based on three mitochondrial markers (mtDNA); (2) a phylogenetic comparative analysis of trait variation assessing selection regime shifts and rates of trait change; and (3) parametric biogeographic analysis. We use morphological and geographic data available from previous studies on Austrolebias and focus on the Argentinian and Uruguayan Austrolebias species because reliable community composition data are available for them and permanent assemblages in many sites consist of Austrolebias species only. Our analysis will take phylogenetic uncertainty into account throughout.
Materials and Methods
We used individuals from laboratory populations of South-American annual killifish species in the Animal Ecology Lab in Leiden, the Netherlands (Supplementary Table S1) complemented with two species from field samples (A. monstrosus and A. vandenbergi). Lab populations were obtained from field trips or expert amateur breeders. They are systematically maintained by crossing individuals from the same clutch, so individuals from the same population of origin are kin. The sample consists of 112 individuals of species in the Austrolebias, Hypsolebias, Ophthalmolebias, Nematolebias and Spectrolebias genera (Costa 2010) and of Cynolebias albipunctatus, in order to confirm the monophyly of Austrolebias and to determine whether Cynolebias is its sister taxon. Individuals of Aphyolebias schleseri (Costa 2003), Pterolebias longipinnis (Garman 1895) and Leptolebias citrinnipinnis (Costa et al 1988) are included as more distant outgroups.
DNA extraction and amplification
In the African annual Nothobranchius killifish, phylogenetic trees using nuclear markers had poor resolution at the gene level (Dorn et al 2011, 2014) and used concatenated sequences (Dorn et al 2014) or assumed a multispecies coalescent without introgression to obtain a species tree where half the nodes have nodal support above 80% posterior probability (Dorn et al 2011). Remarkably, a study on South-American annual Simpsonichthys subgenera using 842bp of mtDNA markers (Ponzetto et al 2016) obtained nodal support values above 80% for most nodes at the interspecific level, suggesting that the lack of support in previous trees for Austrolebias could be due to limited sequence length. In this first assessment of hypotheses regarding size divergence in Austrolebias, where it was impossible to resolve the true phylogenetic history, we constructed two gene bifurcation histories, one from concatenated mitochondrial markers and the other with concatenated markers in different segments of a single nuclear gene. We have investigated an additional set of nuclear DNA markers, but did not manage to obtain separate gene trees with sufficient support for further analysis. Analyses based on the multispecies coalescent or concatenation of nuclear markers in different genes will be presented elsewhere.
DNA was extracted from fin clips, muscle and liver tissue using the Qiagen DNeasy Blood and Tissue kit following the manufacturer’s protocol and then used for direct amplification by Polymerase Chain Reaction (PCR) of mitochondrial sequences of cytochrome-b (cytB), 12S ribosomal DNA (rDNA), 16S rDNA and nuclear 28S rDNA (3 regions). Primers used for PCR and sequencing are listed in Table S2 and PCR reaction conditions in Table S3. PCR products were cleaned using the Promega Wizard SV Gel and PCR Clean-Up System (Promega, Madison, Wisconsin, USA). Cleaned PCR products were sent to a commercial sequencing facility (Macrogen Inc., Korea, www.macrogen.com). Sequencing reactions were carried out using our supplied primers and the sequence products were run on an ABI3730XL genetic analyser.
Sequence analysis
Electropherograms were edited in Sequencher 4.1.4 (GeneCodes, Madison, Wisconsin), and sequences were aligned in ClustalX 1.83 (Thompson et al 1997; Jeanmougin et al 1998) with default parameters (gap opening = 10.00, gap extension = 0.20, delay divergent sequences = 30%, DNA transition weight = 0.50). Alignments were straightforward for cytB, but contained indels in the rDNA sequences. We identified stems and loops in the mitochondrial rDNA sequences based on an alignment in RNAalifold (Bernhart et al 2008) resulting in improved alignments containing few ambiguous bases except for some regions within indels. Ambiguous regions were removed from the alignments used for further analysis of the 12S and 16S sequences. Indels were coded as presence-absence using simple indel coding in SeqState 1.37 (Müller 2005). The final alignments used for the phylogenetic analyses contained 1590 (mitochondrial) and 727 (nuclear) base pairs and an additional 82 (mtDNA) and 27 (28S rDNA) binary indel-coding characters.
Bayesian phylogenetic inference was carried out in MrBayes 3.2.5 (Huelsenbeck and Ronquist 2001; http://mrbayes.csit.fsu.edu). Datasets for the mitochondrial and nuclear markers were analysed separately but with each subset concatenated. Each dataset was partitioned into separate character sets for the DNA sequence and indels per marker. Character sets for 12S and 16S mtDNA sequences were further partitioned into stems and loops. In a first assessment, we fitted different substitution models, with parameters shared between partitions or not, and estimated marginal likelihoods of each model using the stepping stone approximation (Xie et al 2011) to determine which model fitted each dataset best. Based on these comparisons, General Time Reversible models with a proportion of invariant sites and gamma distributed rate variation (GTR+I+G) were implemented, with different parameter estimates per partition of DNA sequence data, except for the stems in the 12S and 16S alignments, where we used the doublet model (Schöniger and von Haeseler 1994). Changes in indel state were fitted with binary models. A doublet model would also be appropriate for 28S stems. We aligned our partial 28S rDNA sequences to the best-matching complete sequence of the gene we could find (from Oreochromis auratus), whose secondary structure we determined using RNAfold (Hofacker et al 1994). Assuming that our aligned sequences would have stems and loops as in the complete sequence, we observed that stem distributions differed when the structure was determined using minimal free energy or ensemble prediction, that stems often consisted of a sequence in our data and another outside of it, and that some stems consisted of sequences we obtained for different markers. This would cripple an implementation of the doublet model for 28S stems and we used the GTR+I+G model instead. 
Per run, base frequency parameters were estimated assuming a Dirichlet distribution and prior settings were the defaults in MrBayes 3.2.5 except for the prior on branch lengths, which was exponential with parameter 20. Per model, two separate runs of four Markov Chain Monte Carlo (MCMC) chains were run until convergence to a stationary regime allowed sampling at least 4000 trees spaced by 5000 generations. The selection of phylogenetic hypotheses continued on the basis of 50% majority consensus trees for these samples, by removing individuals from the dataset and refitting the model. Our sample contained 24 pairs or groups of sibs (same species/population). We preferred consensus trees with each sib group as sister branches and removed some individuals to achieve such a configuration. We checked for long-branch attraction by removing individuals with long branches (Bergsten 2005), and removed subsets of species from the dataset when long-branch attraction was identified. We also ran models for data subsets where only those individuals with data for all mitochondrial or nuclear markers were included.
The biogeographic and phenotypic analyses require species trees. We constructed species trees from mtDNA and 28S gene trees separately using the GLASS approach (Mossel and Roch 2010; Liu et al 2010). Gene trees were forced to be ultrametric using the penalized likelihood method of Sanderson (2002) with parameter lambda fixed at value zero (variable rates). Morphological and biogeographical analysis focused on the individuals of the genus Austrolebias and we pruned the trees for that. We first carried out phenotypic and biogeographic modeling on the mtDNA and 28S consensus trees, for illustration and to understand potential pitfalls in the analysis. Most modeling was then repeated on random samples of 500 trees from the posterior distributions and results summarized through averaging parameter estimates and predictions across the sampled posterior distributions.
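The repeat-and-average step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `summarize_across_trees` and the callable `fit_model` are hypothetical, and sampling with replacement is an assumption (the text only states that random samples of 500 trees were drawn).

```python
import random
from statistics import fmean


def summarize_across_trees(posterior_trees, fit_model, n_trees=500, seed=1):
    """Average parameter estimates over a random sample of posterior trees.

    posterior_trees: list of tree objects (any representation).
    fit_model: hypothetical callable, tree -> dict of parameter name -> float.
    """
    rng = random.Random(seed)
    # Assumption: draw with replacement, so the same tree can recur in the sample.
    sampled = rng.choices(posterior_trees, k=n_trees)
    fits = [fit_model(tree) for tree in sampled]
    # Average each parameter over all fitted trees.
    return {name: fmean(fit[name] for fit in fits) for name in fits[0]}
```

Any model fit that returns named estimates per tree (OU optima, dispersal rates, event probabilities) could be passed in as `fit_model`.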
Phenotypic Trait Analysis
For comparability and to maximise the number of species with data in the analysis, we used the morphological field data obtained from the tables of Costa’s (2006) revision of the Austrolebias genus, which include standard lengths. Other measures are given as fractions of standard length, and size measures of the head given as fractions of head length. Maximum and minimum values were reported per trait, making allometry analyses requiring individual data or calculation of size-independent principal components impossible. We did not use the minimum standard lengths reported. Fish have indeterminate growth and information on individual age is lacking. Traits pertaining to our hypothesis are sex-specific maximum standard lengths and maximum and minimum lower jaw lengths. Before further analysis, we calculated scores for the most important principal components (PC) of (1) the two size traits and (2) the four measures of jaw length. We did not compute phylogenetic PCs because it was undetermined which evolutionary model should be used to derive the correlations between species. The scores were used as body size and relative jaw size traits in further analysis. The Austrolebias species in our analysis that are commonly denoted as large and that form a single monophyletic group in Costa (2006, 2010) are A. vazferreirai, cinereus, robustus, wolterstorffi, monstrosus, elongatus, prognathus and cheradophilus (we will often omit the genus abbreviation from now on). There is a single large species missing from our analysis, which is similar to robustus, cinereus and vazferreirai (Costa 2006). We were unable to obtain recent samples from it.
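As an illustration of how scores on a first principal component can be obtained from a small species-by-trait table, the sketch below uses power iteration on the covariance matrix. It is written under assumptions not stated in the text: traits are centred but not rescaled, and the function name `first_pc_scores` is hypothetical.

```python
import math


def first_pc_scores(rows, n_iter=200):
    """Scores of each row (species) on the first principal component.

    rows: list of equal-length lists of trait values, one list per species.
    Assumption: covariance-based PCA (centred, unscaled traits).
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication converges to the leading
    # eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector orthogonal to the leading axis")
        v = [x / norm for x in w]
    # Project the centred data onto the leading axis.
    return [sum(c[j] * v[j] for j in range(p)) for c in centred]
```

With positively correlated size or jaw measures, the all-ones start vector is a reasonable choice and larger species receive larger scores.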
We did not a priori assign these eight species as large in our comparative analysis of trait evolution. Per tree, we fitted Ornstein-Uhlenbeck (OU) models to species trait values with a single or multiple optima for stabilizing selection (Butler and King 2004). We determined the number of selection regimes in the OU model by applying an automated routine with a modified AICc to the size and jaw scores combined (surface R library; Ingram and Mahler 2013), for each species tree in the samples from the posterior distributions of gene trees. Only the 500 AICc-selected models are included in the averaging of parameters and predictions. To calculate average differences between estimated selection optima, we carried out a parametric bootstrap on each selected model and averaged across the replicates and trees. In the samples of 500 trees, trees recur. We therefore limited bootstrap replication to ten per tree.
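For reference, an OU model with regime-specific optima in the sense of Butler and King (2004) can be written for a single trait as below. The symbols are notational assumptions introduced here for illustration: X(t) is the trait value along a lineage, \alpha the strength of the pull towards the optimum, \theta_{r(t)} the optimum of the selection regime r(t) assigned to the branch at time t, \sigma the intensity of random fluctuations and W(t) a standard Brownian motion.

```latex
dX(t) = \alpha \left( \theta_{r(t)} - X(t) \right) dt + \sigma \, dW(t)
```

In this notation, a single-optimum model has \theta_{r(t)} = \theta on every branch, and a regime shift at a node corresponds to a change of \theta for the branches descending from that node.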
Evolutionary models fitted to phenotypic and phylogenetic data should take measurement error into account (Silvestro et al 2015). We carried out an approach similar to SIMEX extrapolation (Cook and Stefanski 1994). Pseudo-data were generated by adding random error terms of increasing variance to the length and jaw scores and we ran the automated model selection procedure using a random tree from the posterior for each pseudo-data set. Low sensitivity of the results to increasing values of the added measurement error would suggest that measurement error in our data has limited effects on inference.
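A minimal sketch of the pseudo-data generation step, assuming independent normally distributed errors and a grid of error standard deviations chosen by the analyst (both assumptions; the function name `make_pseudo_data` is hypothetical):

```python
import random


def make_pseudo_data(scores, error_sds, seed=0):
    """Add random error of increasing magnitude to species trait scores.

    scores: dict mapping species name -> trait score.
    error_sds: iterable of error standard deviations (assumed grid).
    Returns a dict mapping each standard deviation to a perturbed copy of scores.
    """
    rng = random.Random(seed)
    return {
        sd: {species: value + rng.gauss(0.0, sd)
             for species, value in scores.items()}
        for sd in error_sds
    }
```

Each perturbed data set would then go through the same model selection routine, and the selected number of regimes would be compared across error magnitudes.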
To investigate the relative magnitudes of trait changes, we fitted stable distributions as the model for trait changes (Elliot and Mooers 2014) to each of the species trees in the samples from the posteriors and the size and jaw scores separately. These distributions can be heavy-tailed and thus enriched for large changes. In the results, we inspected whether relatively large changes occurred on branches towards large species by calculating ranks of the estimated median rates of trait change on each branch. Reported are the averages of these ranks for particular branches.
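The rank summary can be sketched as follows. The direction of ranking (rank 1 for the fastest branch), the arbitrary handling of ties, and the idea of identifying a branch across trees by a shared label are assumptions made for this illustration; `rate_ranks` and `mean_rank` are hypothetical names.

```python
from statistics import fmean


def rate_ranks(branch_rates):
    """Rank branches of one tree by estimated median rate (1 = fastest).

    branch_rates: dict mapping a branch label to its median rate.
    Ties are broken by dictionary order, an arbitrary choice.
    """
    ordered = sorted(branch_rates, key=branch_rates.get, reverse=True)
    return {branch: i + 1 for i, branch in enumerate(ordered)}


def mean_rank(per_tree_rates, branch):
    """Average the rank of one branch over the trees that contain it.

    Raises statistics.StatisticsError if no tree contains the branch.
    """
    ranks = [rate_ranks(rates)[branch]
             for rates in per_tree_rates if branch in rates]
    return fmean(ranks)
```

A low average rank for the branch leading to a clade of large species would then indicate that it is consistently among the fastest evolving branches across the posterior sample.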
Biogeographical analysis
Four areas of endemism occur among the species used in this analysis (Costa 2010, Table 1), which we used as discrete single-area species ranges. Each extant species occurs in a single area, but the analysis allows for multi-area ranges in ancestors and extant species. Two areas in which small Austrolebias occur are lacking in our data, namely the Iguaçu River basin inhabited by A. carvalhoi and araucarianus (Costa 2006, 2014) and the upper Uruguay River basin with varzeae (Costa 2006).
We used an extension of the likelihood framework and modeling approach of Ree (Ree et al 2005, Ree and Smith 2008) as implemented in the BioGeoBEARS package (Matzke 2014) which allows for anagenetic migration and extinction along branches and different cladogenetic scenarios at each tree node: a jump dispersal to a different area in one descendant, vicariance of a multi-area range, sympatric speciation within a single-area range, and subset speciation when a multi-area range occupied by the ancestor is also occupied by one descendant and the other descendant becomes limited to a single area. Weaknesses pointed out by Losos and Glor (2003) are reduced in these newer methods which allow range changes in between speciation events. In order not to a priori bias the overall plausibility of certain scenarios of speciation, we first implemented a model with the largest possible number of parameters that could be estimated and applied it to the samples of species trees. Simplified models without vicariance or without sympatric and subset speciation were fitted too and the average change in AICc reported. If the overall rate of sympatric speciation is non-zero and models allowing for sympatric speciation have lower AICc than the simplification without, we reject the null hypothesis that all speciation events were allopatric for the ranges considered. If models without vicariance have lower AICc, then the hypothesis where piscivores have evolved in a series of vicariance events has insufficient support. To assess the probability of vicariance and sympatric speciation at specific nodes in the full model, we used stochastic mapping (100 replications) given the tip data and model parameter estimates (Matzke 2014) and then averaged probabilities of events across the gene species trees.
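The model comparison across trees can be illustrated with the standard small-sample corrected AIC. This is a sketch under assumptions: the formula is the textbook AICc (not a modified variant), the number of observations is taken to be the number of tips, and the function names and the (log-likelihood, parameter count) input format are hypothetical.

```python
from statistics import fmean


def aicc(log_lik, n_params, n_obs):
    """Standard AICc: AIC plus a small-sample correction term."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = -2.0 * log_lik + 2.0 * n_params
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


def mean_delta_aicc(full_fits, reduced_fits, n_obs):
    """Mean of AICc(reduced) - AICc(full) over paired fits, one pair per tree.

    Each fit is a (log_likelihood, number_of_parameters) tuple.
    A negative mean favours the reduced model.
    """
    deltas = [aicc(r_ll, r_k, n_obs) - aicc(f_ll, f_k, n_obs)
              for (f_ll, f_k), (r_ll, r_k) in zip(full_fits, reduced_fits)]
    return fmean(deltas)
```

For example, if dropping vicariance leaves the log-likelihood unchanged on a tree, the reduced model gains the full parameter penalty and the difference is negative.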
Results
Mitochondrial and nuclear gene trees
In the analyses of the mitochondrial data, 5 or 7 individuals were removed from the dataset (Supplementary Material Figures S4-S6) depending on whether the model was fitted to individuals with full or partial data. The resulting phylogenetic hypothesis for the mitochondrial markers resolved all species (consensus species tree Fig. 2, gene tree Figure S6) and grouped the Austrolebias species traditionally seen as large in three well delineated and separated clades: in only 4% of the trees in the posterior, there are two clades with large species. Two of the three clades with large species are not recognized as species complexes by Garcia et al (2014).
Overall, partition posterior probabilities at nodes are above 0.8 for 80% of nodes within the genus Austrolebias, when within-species nodes are excluded. The consensus gene tree for the nuclear 28S data showed long branch attraction between A. vandenbergi and Pterolebias and Aphyolebias outgroups (Figures S1-2). We removed these outgroups. Additionally, five more individuals were removed that did not group with kin or conspecifics (Fig. S3). We obtained a phylogenetic hypothesis for this gene with the species traditionally seen as large in three clades in each of the trees in the posterior. These are again well separated, but robustus is paraphyletic to vandenbergi. The mitochondrial and nuclear phylogenetic trees reject Costa’s (2010) hypothesis with a single origin of large Austrolebias.
The large polytomy involving the placement of 34 of the 72 Austrolebias individuals in the consensus phylogenetic trees in Figs. 2 and S3 corresponds to very short branch lengths in the separate trees of the posterior distribution. Note that contrary to the results of Ponzetto et al (2015), we found Ophthalmolebias as a basal split in the mtDNA tree, and therefore rooted the nuclear tree on O. suzarti. Our two gene trees confirm that there is no reason to consider A. apaii and bellotti as different species, as already suggested by the results of Garcia et al (2012). The large Cynolebias albipunctatus apparently originated within Hypsolebias (Figure S3) and is not a sister taxon to Austrolebias.
We observed discordance between nuclear and mitochondrial trees (Figs 2, S3 and S6). This was the case for S. chacoensis, A. nigripinnis and for the placement of A. cheradophilus within one clade of large Austrolebias (Fig. 2). More importantly, some taxa with discordances show a geographic association: A. wolterstorffi, gymnoventris and luteoflammulatus are sister species in the mitochondrial tree and co-occur in the Patos Lagoon system and associated plains. In the nuclear 28S tree A. wolterstorffi is placed at the root of Austrolebias, and luteoflammulatus is the sister taxon of the group of large species that includes elongatus. A. periodicus, juanlangi and affinis are grouped together in the mitochondrial tree and all three occur in the “Negro” area of endemism. In the 28S tree affinis is the sister taxon of alexandri, occurring in the “La Plata” area of endemism. These last two sets of species and their geographic associations suggest that introgression (and potentially mitochondrial capture) or hybridization might have taken place among the species of each clade (Toews and Brelsford 2012) such that the true phylogeny will probably be a reticulate network.
Species Trees
We determined species trees for the genus Austrolebias using the posterior distribution for the mitochondrial gene tree with full data on all included individuals (Fig. S6). Removing individuals with partial mtDNA data improved the properties of the consensus phylogenetic tree considerably. We used the posterior nuclear trees that include individuals with partial data (Fig. S3).
Morphological Traits
The large and piscivorous Austrolebias do not belong to a single apical taxon, invalidating Costa’s (2010) scenario. We can therefore start considering our new hypothesis, that large and piscivorous species evolved according to a giant-dwarf scenario.
For the mitochondrial trees, fitting multivariate OU models to the size and jaw scores selected a model with three selection regimes as the model with lowest AICc in 95% of the cases (Fig. 2). The model selection routine applied to the consensus mtDNA species tree prefers three regimes and two shifts (Fig. 2a). In comparison to an OU model with a single optimum, adding selection regime shifts decreased the AICc by 45 on average (s.d. 1.7). When optimal trait values of different regime shifts were allowed to be equal, the AICc decreased further by 9.6 on average (2.5). Multivariate OU models fitted to the size and jaw scores and the nuclear phylogenetic trees also selected a model with three selection regimes most often (95%, Fig. 2). Adding selection regime shifts with selection to different optima decreased the AICc by 42.6 (4.2). When optimal trait values of different regime shifts were allowed to be equal, the AICc decreased by a further 5.6 (1.4) on average. In the selected models, optimal trait values for the smallest species are always well separated from the others (Figs. 2c and 2d). The less reliable 28S trees classify intermediate optima as clearly different (Fig. 2d), whereas the better resolved mtDNA trees do not. There, the largest and second largest species have overlapping 95% posterior density regions (Fig. 2c). Jaw length scores for the intermediate optima often overlap with those for the smallest species. Adding measurement errors of gradually increasing magnitude to both traits leads to two selection optima being more often preferred than three (Figure S7). For the 28S trees, this already occurs for small added errors. The large elongatus, monstrosus and prognathus are generally selected towards a different optimum than all other species irrespective of measurement error. 
Small shifts in estimated values of the selection optima depend more strongly on the trees sampled from the posteriors than on the magnitude of measurement error (Fig. S7). The general pattern across the assessment of measurement error is that it blurs the detection of some selective optima, not that it generates additional spurious selection regimes.
From inspection of the posterior distributions, it seems appropriate to assess the plausibility of giant-dwarf speciation at the nodes leading towards the most recent common ancestor (mrca) of two clades with large species (elongatus, prognathus, monstrosus, cheradophilus; vazferreirai, cinereus, robustus) and at the node leading to wolterstorffi. The results on the 28S species trees below demonstrate that changes at the mrca of elongatus, prognathus and monstrosus need to be assessed as well. Table 2 lists how often regime shifts were fitted at these nodes, and what the changes in selection optima were. For the mtDNA trees, the most recent common ancestor of the clade containing elongatus shows a regime shift with changes in size and jaw length in nearly all trees (Table 2). Regime shifts towards wolterstorffi and the clade with robustus are fitted in a large percentage of trees, but the jaw length changes there are much smaller. We cannot reject the possibility of giant-dwarf speciation here. Remarkably, the 28S trees all have a regime shift with large changes in body size and jaw length towards the subclade containing elongatus, prognathus and monstrosus. In half of the trees this shift is preceded by a different one that mainly affects body length. We thus recover an element of the previously hypothesized scenario, with evolution of more specialized piscivory within a clade of large Austrolebias in two successive trait regime shifts. Regime shifts towards wolterstorffi are preferred for more than half of the 28S trees, again with moderate jaw length changes. The clade with robustus, vazferreirai and cinereus often has a regime shift for body length and jaw length on the 28S trees, but rarely on the mtDNA trees, for which the consensus resolved these species well.
Figure 3 shows the estimated median evolutionary rate parameters (speeds) for all branch segments (per arrival node) and sampled trees, i.e., score changes according to the stable distribution model divided by branch lengths. Trait changes towards the clade containing elongatus systematically rank among the fastest for size and jaw length (Table 2). The rate on the branch towards wolterstorffi shows rapid change for jaw length, even on the 28S trees, and less extreme rates for body length. Evolutionary rates leading towards the clade with robustus are never very fast. This rejects the giant-dwarf scenario for that clade.
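The rate definition above (score change divided by branch length, summarised per arrival node across sampled trees) can be illustrated with a short sketch. It is not the code used for Figure 3. The data layout, the node labels and the use of the absolute value of the score change are assumptions made for illustration only.

```python
from statistics import median


def branch_rate(score_change: float, branch_length: float) -> float:
    """Rate on one branch: magnitude of the score change per unit branch length.

    Taking the absolute value is an assumption of this sketch.
    """
    if branch_length <= 0:
        raise ValueError("branch length must be positive")
    return abs(score_change) / branch_length


def median_rates(per_tree_changes: list) -> dict:
    """Median rate per arrival node across sampled trees.

    per_tree_changes: one dict per sampled tree, mapping a (hypothetical)
    arrival-node label to a (score_change, branch_length) pair. Nodes that
    are absent from a tree are simply skipped for that tree.
    """
    rates = {}
    for tree in per_tree_changes:
        for node, (change, length) in tree.items():
            rates.setdefault(node, []).append(branch_rate(change, length))
    return {node: median(values) for node, values in rates.items()}


if __name__ == "__main__":
    # Hypothetical score changes and branch lengths for two nodes in three trees.
    sampled = [
        {"node_A": (1.0, 2.0), "node_B": (-3.0, 1.0)},
        {"node_A": (2.0, 2.0), "node_B": (1.0, 1.0)},
        {"node_A": (3.0, 2.0)},
    ]
    print(median_rates(sampled))
```

Ranking nodes by these medians gives the kind of comparison made in the text, where a branch is called fast when its rate is among the largest across the phylogeny.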
Biogeography
The rates of anagenetic dispersal and extinction are consistently estimated to be zero. Comparison of models with and without vicariance revealed that removing vicariance reduced the AICc (mtDNA reduction 0.47 (s.d. 0.21); 28S rDNA 0.92 (s.d. 0.49)). We conclude that scenarios including vicariance are not supported. Allowing sympatric speciation decreases the AICc by 23.84 (1.7) on average across the mtDNA trees. For the nuclear 28S trees, it does not improve the model substantially: the AICc of a model without sympatric speciation is on average 3.78 (6.1) smaller, and in 60% of the sampled trees, the model does not fit a positive weight of sympatric speciation (Table 2). We refitted the models to sub-trees from which all species that were not well resolved were removed, and found that all model fits then assign a positive weight to sympatric speciation. However, given that there are only 13 species in these trees, the AICc remains the smallest for a model without sympatric speciation.
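Summaries of the form "mean AICc difference (s.d.)" across a posterior sample of trees can be obtained from paired per-tree fits. The sketch below assumes one AICc value per tree for a simpler and a fuller model; the function name, the input lists and the use of the sample standard deviation are illustrative assumptions, not the procedure used here.

```python
from statistics import mean, stdev


def summarize_delta_aicc(aicc_simple: list, aicc_full: list) -> dict:
    """Summarise per-tree AICc differences (simple minus full).

    A positive difference means the fuller model has the lower AICc on that
    tree. Both lists must be in the same tree order and hold at least two trees.
    """
    if len(aicc_simple) != len(aicc_full):
        raise ValueError("both models must be fitted to the same trees")
    deltas = [s - f for s, f in zip(aicc_simple, aicc_full)]
    return {
        "mean": mean(deltas),
        "sd": stdev(deltas),  # sample standard deviation (assumption)
        "fraction_full_preferred": sum(d > 0 for d in deltas) / len(deltas),
    }


if __name__ == "__main__":
    # Hypothetical AICc values for three sampled trees.
    without_sympatry = [10.0, 12.0, 14.0]
    with_sympatry = [8.0, 13.0, 10.0]
    print(summarize_delta_aicc(without_sympatry, with_sympatry))
```

Reporting the fraction of trees on which the fuller model is preferred alongside the mean helps to separate a consistent improvement from one that is driven by a few trees.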
Figure 4 shows ancestral ranges of largest marginal likelihood at each node for the consensus trees, for the full biogeographic model. The mtDNA tree has a single multi-area range, namely the La Plata – Negro range at the root of Austrolebias (Figure 4a). According to that reconstruction, area Negro has been invaded four times, La Plata and Western Paraguay each three times, and the Patos area of endemism has been invaded once. Of the three events leading to species that were traditionally denoted as large, two occurred in Patos and one in La Plata. Most ancestral ranges are single-area ranges. Across the posterior tree distributions, stochastic mapping placed a vicariance event at a node where large species evolved in less than 0.1% of the simulations, except for wolterstorffi in the 28S tree, where vicariance occurred in 49.5% of the simulations. Therefore we can reject that vicariance was involved in the origin of the large species of the elongatus and robustus clades.
We report further results concerning the giant-dwarf scenario in Table 2. For the 28S trees, we present the results for trees in the posterior with a non-zero estimate of sympatric speciation (199/500) separately. When stochastic mapping is used to determine probabilities of event types given the mtDNA posterior tree distribution, tip ranges and model parameters, the split towards wolterstorffi is given probability 0.97 of being a sympatric speciation. At the node towards the clade containing vazferreirai and bellottii this probability is 0.82. At the node towards the clade with elongatus it is 0.61. When inspecting the models fitted to the 28S nuclear data that do not estimate a probability of sympatric speciation (Fig. 4b), it becomes clear that the predictions are not parsimonious if jump dispersals are seen as events. In the clade with elongatus in the consensus tree, the model favours four dispersal events and not two (Figure 4b). For the clade containing robustus, the overall probability that the split was a sympatric speciation is 0.07 given our 28S phylogenetic tree. Taking all evidence for this clade together, there appears to be scope to reject the giant-dwarf scenario with a better resolved nDNA tree. Among the trees in the 28S posterior with non-zero weights for sympatric speciation, the probability of sympatric speciation is also small for the clade containing robustus, intermediate for wolterstorffi, 0.75 for the split where large species diverge from luteoflammulatus and one half for the subsequent speciation event where cheradophilus originates.
When inspecting stochastic mapping results, it became clear that probabilities differ between nodes directly connected to a tip and nodes two edges away from a tree tip (Figure 5). We inspected them separately to assess whether probabilities found at specific nodes were extreme. We observe that simulated probabilities of speciation for each taxon with large species are consistently among the largest probabilities on the mtDNA trees, except for the clade with robustus (Fig. 5a). On the 28S rDNA trees and when a non-zero probability of sympatric speciation is estimated, probabilities are much more dispersed and smaller (Fig. 5b). This confirms that the 28S trees provide generally weaker support for the occurrence of sympatric speciation.
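The node-wise probabilities discussed above are, in essence, proportions of stochastic maps in which a given event type was simulated at a node. A minimal sketch of that tally follows. The event labels and the input list are hypothetical and do not reproduce the output format of any particular biogeography package.

```python
from collections import Counter


def event_probabilities(simulated_events: list) -> dict:
    """Proportion of stochastic maps assigning each event type to one node.

    simulated_events: one (hypothetical) event label per simulation, for
    example "sympatric", "jump" or "vicariance".
    """
    if not simulated_events:
        raise ValueError("at least one simulation is required")
    counts = Counter(simulated_events)
    total = len(simulated_events)
    return {event: count / total for event, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical outcomes of four stochastic maps at a single node.
    maps_at_node = ["sympatric", "sympatric", "jump", "sympatric"]
    print(event_probabilities(maps_at_node))
```

Comparing such proportions only among nodes at the same distance from the tips, as done in Figure 5, avoids confounding the probability of an event type with node depth.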
Discussion
The large Austrolebias are not monophyletic in either gene tree we constructed. A scenario for the evolution of piscivory in Austrolebias with gradual evolution and vicariance in a single clade of large species (Costa 2010) can be rejected. Our results on the mtDNA trees reject the null hypothesis that all speciation events were allopatric for ranges consisting of areas of endemism (Costa 2009). The clade with the most specialized piscivores evolved without vicariance events and with significant selection regime shifts and the fastest evolving trait changes.
At the three nodes where species traditionally seen as large originate, OU model comparisons detect selection regime changes for size and jaw length traits in at least half and up to nearly all of the sampled trees, except for the clade containing robustus. If we assess whether trait changes at these nodes were large, then the clade containing elongatus, prognathus and monstrosus in particular originated with rapid trait changes for body size and relative lower jaw length. The changes for the clade with robustus are small. The biogeographical analysis finds a model-based probability of sympatric speciation which is above zero for all mtDNA trees and for 40% of the 28S rDNA trees.
A plausible case of giant-dwarf speciation
For the clade with A. elongatus, monstrosus and prognathus, we propose an origin by giant-dwarf speciation. Their piscivory is uncontested: entire fish have been found in the guts of species from this clade (Costa 2009). We do need to consider a new scenario suggested by the 28S trees, in which body length would show evolutionary branching in a first speciation, followed by body and jaw length divergence at a subsequent one. It should be investigated whether this scenario can occur completely in sympatry or imposes allopatry/parapatry between the different large species originating. The probability that the speciation events were sympatric is not extreme for this clade, but it is high relative to the probabilities at other comparable nodes and requires further investigation.
For A. wolterstorffi, trait changes were smaller and slightly less rapid. The placement in the mtDNA tree leads to a high predicted probability of sympatric speciation at the range scale considered, but the event might mark an introgression. The placement of wolterstorffi near the root of the 28S tree makes a trait shift less likely. We need to assess introgression in more detail so that constraints on trait and range divergence can be determined. Costa (2009) identified tooth characteristics indicative of molluscivory in this species. Gut content data will need to show whether it is indeed such a specialist or whether piscivory does occur. Costa’s (2010) analysis based on morphological characteristics did place it among the species of the previous clade.
For the clade containing A. vazferreirai, cinereus and robustus, the analysis indicates that giant-dwarf speciation is unlikely to have been involved. Rates of trait change are not high. On the mitochondrial gene tree, this clade is selected towards an optimum with intermediate jaw length. The nuclear gene tree is not well resolved for this clade, but the absence of substantial jaw length evolution suggests that these large species evolved due to other selective processes than the ones occurring in giant-dwarf speciation. The probability of sympatric speciation for the node at the base of this clade is intermediate for a node connected to a descendant tip on the mtDNA trees, and low otherwise. The 28S trees nearly reject sympatric speciation as a hypothesis for this clade.
Austrolebias phylogenetic trees and biogeography
Our results indicate that the purely mitochondrial trees used so far to generate phylogenetic hypotheses for this genus (e.g. Garcia et al 2014) should be systematically complemented with nuclear ones, now that mito-nuclear discordance is suggested. The species complexes described in Garcia et al (2014) are only partially supported. The clade of large species with long jaws containing elongatus is confirmed by our analysis. C. albipunctatus is a sister taxon to some of the Hypsolebias species in our analysis, and may have originated within a clade of Hypsolebias species. Contrary to what Costa (2010) concluded, Cynolebias is most likely not paraphyletic to Austrolebias. It is clearly worthwhile to repeat the assessment carried out here for as many Cynolebias species as possible, to assess whether all speciation events in the genus were allopatric and, if not, whether giant-dwarf speciation could have occurred.
Our biogeographical analysis reconstructs ancestral ranges differently from Costa (2010), who hypothesized a three-area range at the root node of Austrolebias. For the mitochondrial species tree, a root range consisting of the La Plata – Negro areas is predicted. However, numerous small species are still missing from our tree, and including non-Austrolebias sister taxa might also lead to different inference on the geographic origin of the genus. Our nuclear 28S tree did not resolve all species well, so we should not interpret the predicted range at its root node. It is striking that the evolution of two clades with large species is predicted to have occurred in the region of the Patos Lagoons (mtDNA). Previous studies also hypothesized sympatric speciation in this region (Loureiro et al 2015). It has apparently been less affected by marine incursions in the Miocene (Hubert and Renno 2006). Combined with the lack of secondary immigration of Austrolebias species from elsewhere, this could have increased the scope for local diversification. The third taxon with large species (which includes vazferreirai) is more associated with the areas where marine incursions have taken place, such that one can now hypothesize that the large sizes in these species may correlate with increased dispersal capabilities into newly available and more rapidly changing habitats.
Determining the odds of “mechanistic” speciation scenarios
Whether sympatric speciation occurs easily and with appreciable frequency has been repeatedly debated (Coyne and Orr 2004; Dieckmann et al 2004; Bolnick and Fitzpatrick 2007; Fitzpatrick et al 2009). It has been noted (Coyne and Orr 2004) that overall probabilities of sympatric speciation across groups are easier to obtain than conclusive proof for individual cladogenetic events. Studies that combine an empirical example with a specific model tend to focus on a single exemplary case (Gavrilets et al 2007; Gavrilets and Vose 2007; Duenez-Guzman et al 2009; Sadedin et al 2009), while it might be more fruitful to tailor models such as the ones for giant-dwarf speciation to entire taxonomic groups and use them to predict events across a group. Advantageous side conditions can make it easier to demonstrate sympatric speciation (Bolnick and Fitzpatrick 2007). Geographic isolation on islands, or isolation in habitat systems with similar properties such as isolated subterranean aquifers in the case of blind beetles (Leijs et al 2012), prevents regular dispersal. This allows constraining dispersal parameters at zero so that they do not have to be estimated. Current methods allow assigning probabilities of different biogeographical scenarios at particular nodes in a phylogeny (Nylander et al 2008; Ree and Smith 2008; Buerki et al 2011; Ronquist and Sanmartin 2011; Matzke 2014), and we have used such a parametric model to predict probabilities of sympatric speciation at specific nodes, after checking that the model predicts an overall non-zero probability of sympatric speciation events. A thorough inspection of the results and fitting models to pruned trees (see below) revealed that these models have their own advantageous side condition for predicting sympatric speciation at deeper nodes in a tree: sister species with identical ranges.
Limitations and prospects
When we pruned the mtDNA trees to remove four species where secondary contact might have been mistaken for sympatric speciation (A. wolterstorffi, gymnoventris, periodicus, affinis), zero probabilities for sympatric speciation were estimated for all trees in the posterior. If range changes and dispersal jumps are seen as events and range copying with sympatric speciation as non-events, unparsimonious reconstructions were made here with more events than necessary, just as in Fig. 4b. This seems to be a side effect occurring when all pairs of sister species in the tree occupy pairwise different ranges. It seems likely that this type of analysis is sensitive to the range patterns of missing species, as predictions at deeper nodes can strongly depend on the occurrence of events near tip nodes. Fortunately, stratification of this type of analysis is already possible (Matzke 2014), with different parameter values in different parts of a tree. To avoid model selection bias and cherry picking, it should be embedded in an automated routine such as surface (Ingram and Mahler 2013), which we did not develop here. We believe that such an approach would reduce the effects of missing data.
Our analysis and results note that introgression between sympatric species could lead to overestimates of sympatric speciation on mtDNA phylogenetic trees while incomplete lineage sorting in nuclear genes might blur geographic associations between species and underestimate sympatric speciation. Moreover, species trees based on nuclear genes alone are consistent in the statistical sense when the number of loci goes to infinity and with accurate trees per gene (e.g., Liu et al 2010), such that the now commonly used species trees based on nuclear markers with low gene tree resolution might lead to aggravated underestimation of the probability of sympatric speciation. Our results based on the 28S gene trees suggest such underestimation, while a comparison of our mtDNA and 28S trees suggests introgression. The true probability of sympatric speciation would then be in between the estimates obtained from mtDNA and nuclear DNA trees and we expect that only a robust phylogenetic network will resolve the probabilities reliably.
At several points in our analysis, we had to scrutinize results and reconsider data and models. We believe that this effort has paid off, given the limitations of the data used. Our data indicate that a reticulate phylogeny is probable for Austrolebias, but for the moment we based our analysis on a pair of species trees which we interpreted as alternative bifurcation histories suggested by each gene tree. It would be advantageous if it were generally easy to constrain trait and range evolution along certain branches and at nodes in phylogenetic trees, even for exploratory purposes. Phylogenetic comparative modeling is now starting to allow for exchange between branches, by modeling hybridization effects on traits (Jhwueng and O’Meara 2015), but the models will inevitably become more parameter-rich and hence sacrifice statistical power. We stated in the Methods section that biogeographic models already allow different types of constraints. It is not obvious how to construct these in agreement with introgressions suggested by different gene trees.
Measurement error in the morphological species traits could not be estimated jointly with multi-modal OU phylogenetic comparative models, motivating our simulation of measurement error effects. In future studies, replacing field measurements by lab measurements in standardized environments might allow for a more precise assessment of variability in species traits. The areas of endemism we used to delineate species ranges might be too crude as well, and more detailed range assessments would be useful. This study motivates more ecological life history assessments. In the field, annual killifish occur in different local assemblages with the large species sometimes absent, which might facilitate investigations of current selective pressures on body size and jaw length. The demonstration of multi-peaked adaptive landscapes or local fitness minima in the absence of piscivores would lend further indirect support to the plausibility of giant-dwarf speciation. However, since the main traits involved will be size-related, we will need to discriminate between this scenario and other types of character displacement (Schluter 2000; Pfennig and Pfennig 2010).
Given that our phylogenetic hypotheses were based on limited data, much scope seems left to reject our new hypotheses with better trees. It will be useful to investigate whether introgression, hybridization and the associated genomic consequences are common in Austrolebias, also from a life history perspective. Hybridization events might be an alternative avenue towards diversification, leading to sudden jumps in trait values for a lineage.
Alternative scenarios
The giant-dwarf diversification scenario we propose for the evolution of large piscivorous Austrolebias may not necessarily lead to sympatric speciation. In general, disruptive frequency dependent selection can lead to multiple alternative evolutionary outcomes (Rueffler et al 2006) of which speciation is one, and growth plasticity another (Claessen et al 2000). It is likely that the ecological and evolutionary processes involved in size divergence are affected by growth plasticity, but so far the necessary data to test this are lacking. We note that polymorphisms with cannibal morphs were predicted to be possible by Dercole and Rinaldi (2002). The fact that killifish life histories are annual combined with a large size divergence between incipient species may result in assortative mating and thus a smaller probability of hybridization. This makes diversification without speciation less likely. On the other hand, species that are both under selection towards a shared size optimum might interbreed more easily and allow the introgressions causing mito-nuclear discordance.
Giant-dwarf diversification could also be temporary and lead to extinction of a species or the disappearance of a morph. In addition, the result of divergence within a population is often identical to that of repeated invasions from elsewhere (Rundle and Schluter 2004), calling again for a combined phenotypic-biogeographical comparative analysis. In this study, the Patos area, where the largest jaw length and size divergences occurred, was invaded only once by Austrolebias (mtDNA). The La Plata and Chaco areas, where we did not detect giant-dwarf speciation, were invaded several times. We therefore did not find an association between potential cases of giant-dwarf speciation and repeated invasions. In the near future, however, effects of climate change in the regions where Austrolebias occur might cause new invasions, changes in local population composition and changed selection pressures. Using eco-evolutionary modeling, we might be able to predict whether giant-dwarf speciation might be favoured or take place again, and where it would be most or least likely.
In conclusion, we reject the hypothesis that large piscivorous Austrolebias annual killifish evolved in a series of vicariance events with gradual evolution. We have demonstrated selection regime shifts for size and jaw length in the annual killifish genus Austrolebias, and estimated a non-zero probability of sympatric speciation for the genus. We have identified a particular speciation event (the node in the phylogeny where the clade containing elongatus originates) where, given all limitations of the methods and data used, giant-dwarf speciation is plausible. Concerning the clade with robustus, we propose that large sizes might have been selected for their effects on dispersal capabilities.
Acknowledgements
We thank Leonie Doorduin and Azahara Barra who collected part of the DNA sequence data while the first three authors worked at the Institute of Biology in Leiden. TVD was supported by an NWO Veni grant, the Amsterdam Treub foundation and received support from Killidata.org. VS and AJH were supported by NERC. Heber Salvia, Martin Fourcade, Lucho Lobo and Fabiana Cancino are thanked for help with obtaining samples.