Population genomic analysis of the speckled dace species complex (Rhinichthys osculus) identifies three species-level lineages in California

The speckled dace (Rhinichthys osculus) is a small cyprinid fish that is widespread in the western USA. Currently treated as a single species, speckled dace consists of multiple evolutionary lineages that can be recognized as species and subspecies throughout its range. Recognition of taxonomic distinctiveness of speckled dace populations is important for developing conservation strategies. In this study, we collected samples of speckled dace from 38 locations in the American West, with a focus on California. We used RAD sequencing to extract thousands of SNPs across the genome from samples to identify genetic differences among seven California populations informally recognized as speckled dace subspecies: Amargosa, Owens, Long Valley, Lahontan, Klamath, Sacramento, and Santa Ana speckled dace. We performed principal component and admixture analyses, estimated pairwise Fst, and constructed a phylogeny to explore taxonomic relationships among these groups and test whether these subspecies warrant formal recognition. Our analyses show that the seven subspecies fit into three major lineages equivalent to species: western (Sacramento-Klamath), Santa Ana, and Lahontan speckled dace. Death Valley speckled dace were determined to be two lineages (Amargosa and Long Valley) within Lahontan speckled dace. Western and Lahontan speckled dace lineages had branches that can be designated as subspecies. These designations fit well with the geologic history of the region, which has promoted long isolation of populations. This study highlights the importance of genetic analysis for conservation and management of freshwater fishes.

We investigated the following questions using a genome-wide data set: (1) Is the SDC a single species or multiple species throughout its native range, especially in California? (2) Are the subspecies of speckled dace found in California, as listed in , supported by genomic

whole fish stored in ethanol and dried on Whatman qualitative filter paper and stored at room temperature. DNA was extracted from fin clips with a magnetic bead-based protocol [18] and quantified using Quant-iT PicoGreen dsDNA Reagent (Thermo Fisher Scientific) with an FLx800 Fluorescence Reader (BioTek Instruments). Genomic DNA was used to generate SbfI RAD libraries [18] and sequenced with paired-end 100-bp reads on an Illumina HiSeq 2500. Demultiplexing was performed requiring an exact match with well and plate barcodes [18].
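
As a rough illustration of the exact-match rule described above, the following Python sketch assigns a read to a sample only when both its plate barcode and its well barcode match a sample-sheet entry exactly. The read layout (plate barcode supplied separately, well barcode at the start of the read), the barcode sequences, the sample names, and the function name are all assumptions made for illustration; they are not taken from the protocol in [18].

```python
from collections import defaultdict


def demultiplex_exact(reads, sample_sheet, well_len):
    """Assign each read to a sample only if both barcodes match exactly.

    reads: iterable of (plate_barcode, sequence) pairs (hypothetical layout).
    sample_sheet: dict mapping (plate_barcode, well_barcode) -> sample name.
    well_len: length of the inline well barcode at the start of each read.
    """
    assigned = defaultdict(list)
    discarded = 0
    for plate_barcode, sequence in reads:
        well_barcode = sequence[:well_len]
        sample = sample_sheet.get((plate_barcode, well_barcode))
        if sample is None:
            discarded += 1  # no mismatches are tolerated
            continue
        assigned[sample].append(sequence[well_len:])  # trim the well barcode
    return dict(assigned), discarded


# Hypothetical barcodes and reads, for illustration only.
sample_sheet = {
    ("ACGTAC", "GGTTCCAA"): "dace_01",
    ("ACGTAC", "TTAACCGG"): "dace_02",
}
reads = [
    ("ACGTAC", "GGTTCCAATGCAGGACGT"),  # exact match
    ("ACGTAC", "TTAACCGGTGCAGGTTTT"),  # exact match
    ("ACGTAC", "GGTTCCATTGCAGGACGT"),  # one mismatch in the well barcode
    ("ACGTAA", "GGTTCCAATGCAGGACGT"),  # one mismatch in the plate barcode
]
assigned, discarded = demultiplex_exact(reads, sample_sheet, well_len=8)
print(assigned, discarded)
```

Requiring an exact match discards reads with sequencing errors in either barcode, trading some yield for a lower risk of assigning a read to the wrong individual.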

RAD De Novo Assembly and Alignments
because both species are western cyprinids (material S2). The tips were assigned to the subspecies described in . Undesignated speckled dace were represented by the location where they were collected. If a significant genetic difference was shown between the locations in one region or between subspecies in PCA or admixture analysis, the group was separated into two tips based on genetic differences shown in the other analysis.

We used ANGSD to perform genotype calling with the same parameters as mentioned above, except that we generated a VCF file (-dovcf 1). BCFtools was used to prune SNPs with r² greater than 0.9 within each RAD contig [24]. The pruned VCF file was converted to NEXUS format with vcf2phylip [25]. The pruned NEXUS file was analyzed with SVDQuartets as implemented in PAUP* 4.0 [26]. We selected the multispecies coalescent model to construct the phylogeny with 1,000,000 random quartets and 100 bootstraps.

1.7 Designation of species and subspecies
The genomic methods described above were used to determine the evolutionary relationships among the sampled populations. Our assumption is that evolutionary distances among populations provide support for designation of species and subspecies within the SDC.
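
The LD-pruning step in the phylogenetic methods above (removing SNPs with r² greater than 0.9 within each RAD contig) can be illustrated with a short Python sketch. This is not the BCFtools implementation: it assumes genotypes coded as allele dosages (0, 1, 2), computes r² as the squared Pearson correlation between dosage vectors, and uses a simple greedy rule that keeps a SNP only if it is not in high LD with any SNP already kept on the same contig. The function names and the toy genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic site: correlation undefined, treat as unlinked
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def prune_within_contigs(snps, max_r2=0.9):
    """Greedy LD pruning applied separately to each contig.

    snps: list of (contig, position, dosages), ordered by contig and position.
    Returns the (contig, position) pairs of the SNPs that are kept.
    """
    kept = []
    kept_by_contig = {}
    for contig, position, dosages in snps:
        previous = kept_by_contig.setdefault(contig, [])
        if all(r_squared(dosages, other) <= max_r2 for other in previous):
            previous.append(dosages)
            kept.append((contig, position))
    return kept


# Toy data for six individuals (hypothetical).
snps = [
    ("contig1", 10, [0, 1, 2, 0, 1, 2]),
    ("contig1", 25, [0, 1, 2, 0, 1, 2]),  # identical to position 10, r2 = 1
    ("contig1", 60, [2, 0, 1, 1, 0, 2]),  # uncorrelated with position 10
    ("contig2", 5, [0, 1, 2, 0, 1, 2]),   # different contig, never compared
]
kept = prune_within_contigs(snps)
print(kept)
```

Restricting comparisons to SNPs on the same RAD contig reflects the absence of positional information between contigs in a de novo assembly; only physically adjacent sites can be identified as redundant.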

For this study, we started with the accepted designation of speckled dace as a single species throughout its wide, geographically diverse range [3]. We then used the Unified Species Concept to evaluate evidence that there are multiple lineages within the accepted speckled dace subspecies that might be distinct enough to qualify as species [27]. We selected the Unified Species Concept because it provides flexibility in determining species, given that speckled dace hybridize readily with other cyprinid species, a common phenomenon among cyprinids.

Evidence needed to support likely species using genomics included (a) previous designation as a species based on conventional taxonomy, using morphological and meristic traits, (b) co-occurrence with other fishes endemic to a particular region, and (c) distribution limited to a geographically defined area with an underlying geology that indicates a high likelihood of long reproductive isolation. Sample sites were selected based on these criteria before the project started. Subspecies determination used the same criteria, although we do not expect subspecies to be as differentiated from one another as species.

Table). Taken together, these results revealed that the subspecies in [1] have highly variable levels of genetic divergence and taxonomic revision may be warranted.

(2002), respectively. In total, 16.71% of the genetic variation is explained (PC1 explains 8.64% and PC2 explains 8.07%). Three groups are distinguishable. Group One includes Sacramento speckled dace, Klamath speckled dace, and speckled dace collected from Butte Lake and Warner Basin. Group Two includes the Amargosa, Long Valley, Owens, and Lahontan speckled dace subspecies. Group Three includes Santa Ana speckled dace and speckled dace from outside California, which were collected from the Washington Coast, Columbia River, Bonneville Basin, and Colorado River Basin. B. Admixture analysis of all samples with K = 3, which assumes that the current populations derive from three ancestral populations. The upper label represents the locations, and the lower label represents the subspecies designated in . Washington, Colorado, and Bonneville are abbreviated as WA, CO, and B. The PCA results are supported by the admixture results; the colors in the three graphs therefore correspond. C. SVDQuartets results for the range-wide samples. Relict dace and tui chub were used as the outgroup. Speckled dace taxa designated in are split into three groups. Group One and Group Two are each monophyletic and are sister groups of each other, while Santa Ana speckled dace clustered with Colorado speckled dace and were the sister group of all the other speckled dace included in this study.

Table). However,

Our genomic data analyses suggest that the speckled dace is not one species but rather a species complex with hierarchical evolutionary lineages, some of which may be designated as species and subspecies. In California, these lineages coincide with zoogeographic regions that are largely isolated from one another and that contain other endemic fishes, suggesting long isolation [1]. While some morphological and meristic differences exist among the lineages within the speckled dace, as discussed in the introduction, they may reflect local adaptations to diverse conditions rather than traits that allow species to be defined. Smith et al. (2017) [3] indicated that the lack of clear morphological differences was the result of frequent hybridization events that allowed gene flow among populations over wide areas. Because hybridization is common in cyprinid fish, we rely on pre-mating isolation as the basis for designating species and subspecies. We used a combination of genetic and ecological differences to designate species and subspecies. Ecological differences and allopatric isolation ensure that genetic differences will accumulate. Thus, we hypothesize that a hybrid between

In our study, we clarify the genetic distinctness of Santa Ana speckled dace. All analyses show that Santa Ana speckled dace have remarkably high genetic differences from the other subspecies in [1] and the other speckled dace in Group Three (S2A and S2B

California speckled dace in addition to those from the Columbia and Colorado river basins, Santa Ana speckled dace clearly merit full species recognition. This same basic conclusion was reached by Cornelius (1969) [32], who conducted a detailed study of the morphometrics and meristics of Santa Ana speckled dace, as well as of dace from neighboring streams (Sacramento basin), the Virgin River (Lower Colorado basin), and Lake Tahoe (Lahontan basin). His study was the first to link the origins of Santa Ana dace to the lower Colorado River basin. Using