Phylogenomics supports a Cenozoic rediversification of the “living fossil” Isoetes

The fossil record provides an invaluable insight into the temporal origins of extant lineages of organisms. However, establishing the relationships between fossils and extant lineages can be difficult in groups with low rates of morphological change over time. Molecular dating can potentially circumvent this issue by allowing distant fossils to act as calibration points, but rate variation across large evolutionary scales can bias such analyses. In this study, we apply multiple dating methods to genome-wide datasets to infer the origin of extant species of Isoetes, a group of mostly aquatic and semi-aquatic isoetalean lycopsids, which closely resemble fossil forms dating back to the Triassic. Rate variation observed in chloroplast genomes hampers accurate dating, but genome-wide nuclear markers place the origin of extant diversity within this group in the mid-Paleogene, 45-60 million years ago. Our genomic analyses coupled with a careful evaluation of the fossil record indicate that despite resembling forms from the Triassic, extant Isoetes species do not represent the remnants of an ancient and widespread group, but instead have spread around the globe in the relatively recent past.


Introduction
Determining the evolutionary relationships and divergence times between lineages is crucial for understanding the processes that generate diversity and evolutionary novelty [1][2][3]. Fossils provide a glimpse of the past, by preserving the anatomical features of organisms that existed millions or hundreds of millions of years ago. The fossil record is however very incomplete and often needs to be combined with analyses of extant diversity to infer periods of 40 diversification and extinction of different lineages [4]. A fossil assigned to a lineage of organisms based on shared morphology provides a minimal age for the group, and can therefore help date evolutionary events. Putative causal factors can then be inferred for these events, such as the Chicxulub meteorite impact and the disappearance of the non-avian dinosaurs [5], or the co-incident radiation of angiosperm and insect lineages [6]. In many 45 cases, however, fossils assignable to particular extant lineages of organisms are unavailable, because of a lack of readily fossilisable tissues (e.g. jellyfish) or because the organisms live in environments that do not favour fossilization (e.g. cacti). In addition, morphological traits preserved in fossils may not vary sufficiently to distinguish multiple extant lineages, preventing a precise assignment of the fossils [7][8][9]. When this pattern of conservation 50 concerns a large number of morphological traits, the extant species are referred to as "living fossils", a category that includes the coelacanth, cycads, sturgeons, platypus and lungfish that closely resemble fossils from the Mesozoic [10][11][12][13][14]. Clearly a lack of morphological change does not preclude changes in traits poorly represented in the fossil record, such as biochemical or behavioural changes -nevertheless, their unusually conserved morphology through time 55 has long attracted the interest of biologists [15,16]. This morphological stasis is often associated with decline -with current distributions of "living fossil" taxa interpreted as the remnants of larger ancestral ranges [17,18], and extant species being the last members of ancient lineages diverging long in the past [19,20]. However, the morphological uniformity and the resulting difficulties in fossil assignment mean that these hypotheses are difficult to 60 test from the fossil record alone.
Analyses of DNA sequences over the last few decades have resolved the phylogenetic relationships between many extant lineages, and large numbers of selectively neutral changes in the genomes allowed inferring accurate phylogenies even for the most morphologically uniform organisms [21,22]. Molecular divergence in parts of the tree with informative fossils 65 can then be used to time-calibrate molecular changes in the rest of the tree, allowing inference of divergence times of groups of organisms lacking an appropriate fossil record [23][24][25][26]. This molecular dating technique has been used to investigate the evolutionary dynamics of some "living fossil" groups, which in some cases indeed showed ancient within-group divergence and extant distributions resulting from continental drift tens of millions of years ago [17]. In 70 groups such as cycads [13], bryophytes [27] and Ginkgo [26], however, extant diversity originated more recently than their conserved morphology would suggest, indicating complex evolutionary dynamics for some "living fossil" taxa. Age estimates from molecular dating techniques remain however sensitive to the treatment of fossil data, variability in the rates of nucleotide substitutions between molecular markers and species, and the correct alignment of 75 nucleotide markers [24,[28][29][30][31][32][33]. These problems are exacerbated when the only available calibration points are distant from the group of interest, as is by definition the case for "living fossils" [28,[34][35][36][37]. Each possible source of error therefore needs to be isolated and carefully considered.
The lycopod genus Isoetes exemplifies many of the problems of "living fossil" taxa. 80 The genus has long been of interest due to its status as the last lineage of the isoetalean lycopods. This group, known from at least the late Devonian, dominated terrestrial floras in the Carboniferous [38]. The extant Isoetes genus is a small herbaceous aquatic or semiaquatic plant, generally lacking a stem and consisting of a number of stiff leaves atop a woody corm [39]. It demonstrates a number of unusual features such as roots comparable to fossil 85 stigmarian rootlets [40] and aquatic Crassulacean Acid Metabolism (CAM) [41]. Fossils resembling the Isoetes growth form are found in the Triassic onwards, although their exact affinities and relationships to Isoetes are unclear [42][43][44]. A variety of morphological features (such as sunken sporangia, an elaboration of the basal part of the ligule into a glossopodium, and a velum or labium covering the sporangium) that characterise extant Isoetes appear at this 90 time, although no single fossil displays all of these features [44]. The appearance of Isoetites rolandii in the Jurassic represents the earliest clear example of a isoetalean lycopsid containing all the major features uniting modern Isoetes, including the loss of both vegetative leaves and an elongating stem [44,45], although elongated-stem forms such as Nathorstiana persisted until the Early Cretaceous [46]. Fossils of the modern Isoetes growth form such as I. 95 horridus are subsequently found from the Early Cretaceous and into the Tertiary [42,44]. The morphology of Isoetes appears to have persisted virtually unchanged since at least the Jurassic, and the general growth habit in the lineage is potentially as old as the Triassic.
The close resemblance of fossil taxa to modern Isoetes suggests the extant species could be the remnants of a very ancient genus. However, establishing the relationship between 100 these fossils and modern Isoetes has proven difficult due to the highly conserved morphology of the genus [18,38,44,47]. The extant 250 Isoetes species have a global distribution, yet display very little morphological variation -spore morphology is currently used to distinguish extant species, but reliance on a single character trait makes a classification system vulnerable to homoplasy, convergent evolution and is complicated by widespread hybridisation within 105 the genus [39, 47,48]. Morphology and the fossil record alone therefore provide limited insights into the relationships among extant and fossil species of Isoetes, restricting our ability to understand the temporal origins of extant Isoetes species diversity.
The relationships between extant species of Isoetes have been inferred using molecular phylogenetics [18,49,50], but attempts at linking fossils and extant species have not always 110 been successful. Taylor and Hickey [39] hypothesised based on shared leaf morphology that a small group of South American species and fossil Isoetes represented a basal split within the genus, but molecular phylogenetics falsified this hypothesis for extant species [18,49].
Recent molecular dating studies suggest an origin of extant Isoetes species in the Triassic to Jurassic, with species distributions consistent with the breakup of the Gondwana 115 supercontinent [18,51,52]. These studies were, however, based on a limited number of markers, mainly from the chloroplast genomes, where high rate variation can make dating estimates strongly dependent on molecular clock model assumptions [24], a potentially significant source of error given the ancient divergence between Isoetes and its sister group Selaginella [44,53]. We therefore decided to re-evaluate the divergence times within Isoetes 120 using a combination of phylogenomic methods capturing markers spread across the genomes of numerous land plants.
In this study, we generate transcriptomes and genomic datasets for multiple Isoetes species and apply multiple molecular dating approaches to estimate the time to the most recent common ancestor of extant Isoetes based on nuclear and plastid genomes. Our results 125 shed new light on the age and evolutionary dynamics of this "living fossil" lineage, and show how careful integration of large genomic datasets can help analyses of groups with a poorly informative fossil record.

General approach
In this study, we selected six Isoetes for generating genome-wide DNA datasets. These were selected to capture the deeper divergence events within this group based on previous molecular studies [18]. Analyses of nuclear ribosomal DNA available for a large number of Isoetes confirmed that the last common ancestor of the selected species likely corresponds to 135 the last common ancestor of extant Isoetes, and the low branch length variability throughout the genus suggests the sequenced species represent a good sample of evolutionary rates within the genus (Fig 1). Sparse taxon sampling has been shown to significantly affect estimated dates in some molecular dating studies [54,55], although not in every case [56,57]. Rate heterogeneity likely plays an important role in the effect of sparse taxon sampling on the 140 accuracy of molecular dating, with high levels of rate heterogeneity demanding more sampling [55]. The relatively low levels of rate heterogeneity within Isoetes (Fig. 1) suggest it is a suitable group to perform molecular dating with a relatively small number of taxa. To further investigate the impact of our sampling scheme, we reanalysed the dataset of Larsén and Rydin [18], which contains 45 Isoetes species including all the major clades identified by 145 previous studies of Isoetes [50,58]. This dataset was reanalysed using the same constraints and BEAST settings as previously, but with the 45 Isoetes species used reduced to the 6  Rydin [18].

160
DNA from these species was then used to generate genome-wide datasets, and different genome partitions were analysed in isolation to get accurate estimates of divergence times. Herbarium specimens represent a useful source of DNA, particularly for globally distributed, hard-to-access groups such as Isoetes [59,60]. Low-coverage whole-genome scans can be applied to these samples, and will yield high coverage for genomic fractions 165 present as multiple copies, such as the organellar genomes [61]. However, highly variable evolutionary rates in chloroplast markers have been reported from seed plants [62,63], which potentially affect the results of dating methods that differ in their assumptions of rate heterogeneity [24].
Previous studies of the chloroplast marker rbcL indicate much higher rates of 170 sequence evolution in Selaginella than in Isoetes [18,63,64], urging for a consideration of nuclear markers, which can be more useful for molecular dating if they show less variation in rates among branches [24]. Genome skimming can provide nuclear sequences, but low coverage makes de novo assembly difficult. However, the sequencing reads can be mapped to a reference dataset, providing phylogenetically informative characters [65,66]. A reference 175 genome is available for Selaginella, but it is too distant to allow accurate mapping of reads from Isoetes. Transcriptomes provide high coverage of expressed protein-encoding genes, which represent regions of the genome allowing read mapping across distinct species [65,66].
We consequently decided to generate and assemble a transcriptome for a single Isoetes species, which was used as a reference to map reads from low-coverage whole-genome

Sequence acquisition
Live Isoetes lacustris were sampled from Cwm Idwal, Wales and maintained at the University of Sheffield in 40 x 30 x 25 cm transparent plastic containers, with a substrate of sand to a 190 depth of 5 cm, and the containers filled to the top with deionised water. These were placed in a Conviron growth chamber with a 12-h day/night cycle, 495 μmol m 2 s -1 light, temperature at 20°C during the day and 18°C at night, and CO 2 at 400 ppm for six days. To maximise the number of transcripts retrieved, leaves from three individuals were sampled 3 hours after dark and 3 hours after light and stored immediately in liquid nitrogen. We also generated a 195 transcriptome for Littorella uniflora (Plantaginaceae), another species of aquatic plant that shares aquatic CAM photosynthesis [67]. Individuals from this species were also sampled from Cwm Idwal and were grown under a variety of conditions before sampling their leaves as described above. Dried specimens were deposited in the Sheffield University Herbarium (I.

210
DNA from herbarium specimens of five Isoetes species were acquired from the DNA Bank from the Royal Botanical Gardens, Kew (Table S2). This was supplemented with one silica gel dried leaf each of I. lacustris and L. uniflora collected from the field (Cwm Idwal, Wales). Whole genome sequencing of these seven samples was performed at the Genotoul from the University of Toulouse, using previously described protocols [65,68]. Each sample 215 was sequenced on a 24 th of a lane of a flow cell, with other samples from various projects.
Raw sequencing reads were cleaned using NGS QC toolkit v2.3.3 [69] by removing adapter sequences, reads with ambiguous bases and reads with less than 80% of positions with a quality score above 20. Low quality bases (q<20) were removed from the 3' end of remaining reads. Species identity and branch length variability within the genus were assessed by 220 assembling the nuclear ribosomal internal transcribed spacer (nrITS) using NOVOPlasty 2.5.9 [70]. The assembled sequences were aligned to nrITS sequences used in Larsén and Rydin (2015) [18] using MAFFT v7.164 [71]. A phylogeny for this marker was then produced using

Chloroplast data matrix
Cleaned reads from Isoetes and Littorella corresponding to the chloroplast genomes were assembled using NOVOPlasty, with a 39-bp kmer and a seed sequence of the I. flaccida chloroplast genome [64]. In cases where a circular chloroplast genome was not produced, 230 contigs were aligned to the I. flaccida chloroplast genome using blastalign [73], and reads corresponding to regions of the reference chloroplast genome not covered by the contigs were used as seed sequences to assemble new contigs. All contigs were subsequently realigned to the reference genome, and overlapping contigs were merged. Chloroplast genome assemblies from 24 additional species representing the major embryophyte taxa, including two 235 Selaginella species, were downloaded from NCBI database (Table S3).
Chloroplast protein-coding genes were identified from all chloroplast genomes using DOGMA [74] and coding sequences were extracted using TransDecoder v2.1.0 [75]. A total of 64 genes were identified and aligned by predicted amino acids using t-coffee [76] and MAFFT. Gene alignments were manually inspected and trimmed using AliView [77]. 270 uniflora and the additional embryophyte species to homolog families, with a minimum match length of 50 amino acids and e-value of 10 -7 . The expanded homolog families were then aligned according to their protein sequences using MAFFT, and phylogenies were constructed using RAxML and the GTR + G + I model, which fits most genes and is therefore appropriate for constructing large numbers of gene trees [88][89][90]. presence within "Clade A" identifies these features as derived [18,49]. No morphological features reliably distinguishing clade A and the rest of the Isoetes appear to exist [18,49].
Therefore, no fossil will contain features distinguishing these two groups, so fossils cannot provide a hard minimum age for this node. Within the Isoetes crown group, no morphological 305 features clearly divide different clades [18,38,52], therefore using fossil calibrations within the Isoetes crown group to produce a hard minimum age is also not possible.  Table S5.

Molecular Dating Software
Molecular dating was performed using r8s [110] and BEAST [111], two commonly used relaxed-clock methods that differ in their general approach and the strategy used to assign rates to internal branches of the phylogeny. r8s implements a semiparametric method 345 that uses a penalised likelihood approach to assign rates among branches [112]. The smoothing parameter, which determines the extent to which rates vary among branches, is determined for each dataset using an empirical approach [110]. The method takes a phylogram as input, assumes no uncertainty in topology, and uses a simplified model of nucleotide substitution. BEAST implements a highly parametrised Bayesian method that 350 samples trees generated from nucleotide data using an explicit model of sequence evolution [113]. When using the relaxed molecular clocks implemented in BEAST, rates are uncorrelated across the tree, but an overall distribution of rates is assumed.
For r8s, version 1.81 was used, with the "TN" algorithm and additive penalty function.
Cross validation was performed for a range of smoothing parameters from 10 -2 to 10 6 , 355 increasing by a power of 10 0.5 each time, and the best smoothing parameter was used for molecular dating. Confidence intervals were obtained by generating 100 bootstrap pseudoreplicates using seqboot [114] and obtaining branch lengths for each of these using RaxML while constraining the trees to the topology generated by the original dataset. These trees were then individually dated using r8s, providing a distribution of ages across the pseudo-360 replicates. This approach was used to date the chloroplast dataset, the concatenated nuclear dataset, as well as individual nuclear markers.
For BEAST, version v1.8.4 was used. A lognormal relaxed clock was adopted with a GTR + G + I model of nucleotide substitution with four rate categories and a birth-death speciation prior. For the concatenated chloroplast markers, four independent analyses were 365 run for at least 20,000,000 generations and appropriate burn-in periods (at least 10%) were assigned by inspection of the traces using Tracer v1.6 [111]. For individual nuclear genes, BEAST was run for 3,000,000 generations (based on observing convergence times with a subset of genes) with a burn-in period of 50%. Dating the concatenated nuclear dataset was computationally too intensive with this approach. We therefore randomly subsampled 370 55,743bp (the length of the chloroplast alignment) from the 694,437bp nuclear alignment eight times, and analysed these subsamples with BEAST. The same parameters as the individual nuclear genes were used, with the exception that BEAST was run twice for 10,000,000 generations for each alignment, with a burn-in period of 10%, with convergence verified using Tracer. For comparison we performed r8s on these alignments as described 375 previously.

Phylogenetic reconstruction using Isoetes nrITS sequences
The maximum likelihood phylogeny based on nrITS broadly agrees with phylogenies 380 previously inferred for the group (Fig 1) [18], and confirms that the species sampled for genomic analyses encapsulate the first divergence within the Isoetes species for which molecular data is available (Fig 1). These six samples are moreover representative of the group in terms of branch lengths, and therefore evolutionary rates (Fig 1). The samples sequenced in this study generally cluster with other individuals from the same species 385 sequenced previously, with the exception of Isoetes nuttallii. The newly sequenced sample of I. nuttallii groups with I. asiatica and I. echinospora, disagreeing with the topologies found in other studies [18,49]. The sample in this study was collected in Alaska, USA, where the ranges of I. nuttallii and I. echinospora overlap [115], making misidentification or a hybrid origin of this individual the likely cause of this discrepancy. We therefore refer to this sample 390 as I. sp. throughout the rest of this manuscript and in supplementary data files.

Phylogenetic reconstruction and dating based on the chloroplast genome
The maximum likelihood phylogeny based on chloroplast markers recapitulated major land 395 plant relationships and expected relationships within the Isoetes clade, with I. coromandelina being sister to the rest of samples (Fig 2a). The tree was well resolved, with only the Ceratophyllum/eudicot split receiving less than 95% bootstrap support. Branch lengths were highly variable, particularly between Isoetes and Selaginella, with the latter having accumulated approximately 4.5 times more substitutions than Isoetes since their most recent 400 common ancestor (Fig 2a).   Fig 3). Whilst low smoothing values result in over-fitted models 415 that perform poorly in cross validation, high levels of smoothing may produce rates that are nevertheless poor predictors of branch lengths in particular parts of the tree. For high smoothing values, the ratio of the effective rate (the branch length divided by the estimated time elapsed) to the rate assigned by the model was 0.33 for the stem branch of Isoetes (Fig   4), showing that the branch is significantly shorter than would be expected for the assigned 420 rate and divergence time. On the other hand, the average ratio for the crown branch lengths was 1.33, indicating that the crown branches are longer than would be expected for the assigned rates and divergence times (Fig 4).

Fig 3. Effect of different smoothing factors on Isoetes crown date estimation in r8s.
425 Estimated crown dates for Isoetes produced by r8s for concatenated chloroplast (green) and nuclear (blue) datasets are shown for a range of smoothing factors. The best fitting smoothing factor, as identified by cross validation, is highlighted for each dataset by a filled circle.  For the same chloroplast markers, BEAST estimated the crown age of Isoetes at 28.5 Ma (95% HPD = 5.5-51.7; middle Oligocene; Table 1), similar to the value obtained with the 440 optimum level of smoothing in r8s. Unlike in r8s, rates in BEAST can vary throughout the tree, but their distribution is assigned a priori -in this case a lognormal distribution. Rates in the maximum clade credibility tree accordingly follow a lognormal distribution, with the Isoetes stem branch being assigned the lowest rate in the tree and the crown branches assigned rates closer to the average rates in the rest of the tree (Fig 5). For both r8s and BEAST, a date 445 of 23-29 Ma (Oligocene) is obtained via the implicit or explicit inference of a decrease in the rate of evolution along the stem branch, with rates in the crown branches being more similar to those in the rest of the tree. This assumption results from the model, and is not necessarily correct, urging for independent evidence.

Phylogenetic reconstruction and dating based on nuclear markers
The concatenated nuclear phylogram also recapitulated major land plant relationships (Fig   2b). The topology of the Isoetes clade was consistent with that of the chloroplast phylogeny, with I. coromandelina again being sister to all other species. Despite overall longer branch lengths in the concatenated nuclear phylogeny, variation among groups was reduced.
460 Particularly, the total branch lengths from the common ancestor of Isoetes and Selaginella were much more similar than in the chloroplast phylogeny, with Selaginella having accumulated approximately 1.25 times more mutations than Isoetes since their last common ancestor (Fig 2b). However, the ratio of the average crown branch length to stem length in the Isoetes lineage was very similar between the nuclear and chloroplast markers; approximately 465 5.8 for the chloroplast dataset and 5.6 for the nuclear dataset (Fig 2).
Dating of the concatenated matrix of nuclear markers in r8s gave an estimated crown node age of Isoetes at 58.9 Ma (Table 1), with an estimated stem node age of 358 Ma, at an optimum smoothing value of 0.1. Unlike with the chloroplast markers, the date of the Isoetes crown node was similar across all smoothing values tested (Fig 3). Increased smoothing 470 values led to increases in the disparity between effective and assigned rates (Fig. 4), although this was low compared with the concatenated chloroplast alignment (0.82 vs 0.33 for the stem branch and 1.25 vs. 1.33 for the crown branch for a smoothing value of 10 6 ). The conservation of the effective rates in the stem and crown branches of Isoetes across a range of smoothing parameters indicates that the average rates predicted across the entire nuclear tree 475 are a relatively good fit to both stem and crown branches of Isoetes (Fig 4). This suggests that stem and crown branches of Isoetes have similar rates, which is consistent with their highly similar length ratios between the chloroplast and nuclear trees (Fig 2).
Dating individual nuclear genes in r8s resulted in a wide range of optimum smoothing values (Fig S1). Low smoothing values frequently resulted in gradient check failures, 480 indicating a single optimum solution is not reached (Fig S1). For genes reaching a single optimum, the median estimated crown date for Isoetes was 46.4 Ma with 95% of estimates between 16.1 and 85.8 Ma and 50% of results between 31.9 and 58.3 Ma (Table 1). Overall, the estimated dates form a unimodal distribution (Fig S2). While low values of the smoothing parameter increased the age estimates, all values above 10 yielded estimates centred around 485 50 Ma, similar to those based on the optimum smoothing values ( Fig S2). As with the chloroplast datasets, increasing smoothing values resulted in a decreased effective/assigned stem rate and increased effective/assigned crown rate (Fig S3). The disparities for the optimum smoothing values were again reduced compared to the chloroplast data (Fig S3), indicating the globally optimum smoothing values for the individual nuclear markers fit the 490 stem and crown branches of the Isoetes better than in the chloroplast dataset.
Dating individual genes using BEAST gave a median estimate of 47.6 Ma for the crown of Isoetes, with 95% of estimates between 24.1 and 90.1 Ma, and 50% between 39.2 and 58.6 Ma. The ages obtained for individual genes were highly correlated between r8s and BEAST (linear model, slope = 0.94, p-value < 0.001; R 2 = 0.64; Fig 6). Linear modelling 495 suggested a significant but small effect of the percent completeness of the alignments on the estimate for the crown age of Isoetes, with a larger effect from the average completeness of Isoetes sequences (Table S6)

Nuclear analysis supports a recent origin of extant Isoetes
In this study, we used phylogenomics to estimate the age of Isoetes, a group of lycopods often 515 interpreted as "living fossils". Using molecular dating with calibration points on deep branches of the land plant phylogeny, we found very different dates for the crown of Isoetes using the chloroplast and nuclear datasets, at 23-24 Ma (Oligocene) and 45-60 Ma (Paleocene and Eocene), respectively (Table 1). These differences are unlikely to be caused by the methods, since BEAST and r8s produced almost identical dates (Table 1; Fig 6), despite the 520 very distinct ways in which these two programs deal with rate variation among branches.
Subsets of 55,743bp (the same size as the chloroplast alignment) of the concatenated nuclear alignment gave dates consistent with the other nuclear datasets, indicating alignment size was not the cause of this disparity (Table 1). Instead, the incompatibilities between estimates based on nuclear and chloroplast datasets probably arise from differences in rate variation 525 among branches. Branch lengths varied widely between Selaginella and Isoetes chloroplast markers (Fig 2a), as previously reported [18,[62][63][64]. These high levels of variability make low levels of smoothing in r8s relatively poor fits to the data, effectively forcing a uniform rate on the tree that is determined by the average branch length. That, in turn, results in a rate fitted to Isoetes that is a poor match to its relatively short branch lengths (Fig 4). As a result, 530 the fitted models predict more changes along the stem branch and fewer changes along the crown branches than occur in the data. Similarly, in BEAST, the lognormal prior distribution results in a relatively low rate assignment on the stem branch compared to the crown branches, which leads to a better fit to the lognormal distribution across branches than if all crown branches had a low rate (Fig 5). We conclude that the high rate variability hampers 535 accurate molecular dating using the chloroplast data. By contrast, the individual and concatenated nuclear datasets have a small disparity between estimated and effective rates, and the crown age estimate for Isoetes is consistent across genes (Figs 4 and S2), both in the concatenated versus individual datasets (Table 1) and between BEAST and r8s (Fig 6). The more consistent rates make the nuclear dataset more appropriate for estimating divergence 540 times.
We conclude, based on our nuclear genome-wide analyses, that the diversity of extant In both these studies, fossils are used to provide strong prior constraints for an 555 ancient date for the crown node of Isoetes. However, these fossils do not provide evidence that the split between "clade A" and the rest of Isoetes had occurred, as no synapomorphies are known from the extant members of these clades that could be preserved in fossils.
The number of taxa sampled in this study is however lower than in these previous studies. Reduced taxon sampling has been shown to have an impact in some [54,55], but not 560 all [56,57], cases, with high levels of rate heterogeneity likely requiring increased taxon sampling [55]. The relatively low levels of rate heterogeneity in Isoetes (Fig. 1) indicate this is unlikely to affect our age estimates, and reduction of the taxon sampling in the comprehensive Larsén and Rydin [18] dataset by 87% only resulted in a 7% change in the estimated age of the Isoetes crown date (see Materials and Methods). Reanalysis of the entire 565 Larsén and Rydin [18] dataset only with markers alignable outside Isoetes resulted in a similar age estimate to the present study, despite the significant differences in taxon sampling (Table S1). These considerations suggest that rather than taxon sampling, the distribution of nucleotide data among groups and the use of fossil evidence are responsible for the differences between this and previous age estimates of Isoetes.

570
Our approach generated nucleotide data that are homogeneously distributed among taxonomic groups, and the fossil evidence is used in a very conservative way, relying solely on external fossils to estimate the age of the crown of Isoetes, our group of interest. These considerations allow disentangling the effects of methodological variation, rates of molecular evolution, and treatment of fossils on the molecular dating of a group of "living fossils".

575
Despite morphological stasis, Isoetes recently expanded Our results are consistent with an increasing number of studies in groups such as cycads [13], bryophytes [27] and Ginkgo [26] showing relatively recent origins of extant species of these 585 groups, despite long periods of morphological stasis.