Surprising spatiotemporal stability and frequency-independence across multiple fitness peaks driving adaptive radiation in the wild

The effect of the environment on fitness in natural populations is a fundamental question in evolutionary biology. However, most empirical field studies of fitness do not experimentally manipulate phenotypes or environmental conditions and rarely investigate more than a single species or population. Thus, the relative importance of the competitive environment versus intrinsic organismal performance in shaping the location, height, and fluidity of fitness peaks on the adaptive landscape remains largely unknown. We experimentally tested the effect of competitive environment on a multi-peak fitness landscape driving a microendemic adaptive radiation of generalist and trophic specialist pupfishes on San Salvador Island, Bahamas. We manipulated phenotypes, by generating lab-reared hybrid crosses, and competitive environment, by altering the frequency of rare phenotypes within field enclosures in their natural hypersaline lake environments on San Salvador. We tracked the growth and survival of 2,611 F4/F5 hybrids for 3 to 11 months in high- and low-frequency treatments replicated across two lake populations. We found strong evidence for frequency-dependent growth rates within and between enclosures, but no evidence for frequency-dependent survival differences. However, both fitness proxies supported a complex fitness landscape isolating generalist phenotypes on a local fitness peak separated by a small fitness valley from a higher fitness peak for molluscivores and a large fitness valley isolating the scale-eating phenotype in all trait dimensions. The striking consistency of this multi-peak fitness landscape across competitive environments, multivariate trait axes, and a previous field experiment provides experimental evidence for stasis, possibly due to fixed biomechanical constraints on organismal performance. These results challenge existing theory and highlight the interplay of organism and environment underlying the static and dynamic features of fitness landscapes.


Introduction
Results 121

Field and laboratory survival of transgressive hybrid populations 122
We individually tagged and photographed 2,611 F4-F5 outbred juvenile hybrids resulting from 123 crosses of all three species, generalist (C. variegatus), molluscivore (C. brontotheroides), and 124 scale-eater (C. desquamator), from two different isolated lake populations on San Salvador 125 Island, Bahamas before release into a high-and low-frequency field enclosure in each lake (Fig. 126 1; Table S1). The frequency-manipulation increased the frequency of transgressive hybrid 127 phenotypes in the high-frequency treatment and generalist-type hybrids in the low-frequency 128 treatment, resulting in a significant reduction in phenotypic variance on discriminant axis 2 129 (predominantly nasal protrusion) in lake 1 (Levene's test, P < 0.0001) and discriminant axis 1 130 (predominantly oral jaw size) in lake 2 (Levene's test, P < 0.0001) within the bivariate 131 discriminant morphospace separating all three parental species (Fig. S1). The total density of 132 hybrids was held approximately constant between high and low-frequency treatments (lake 1: 133 high/low: 923/823 individuals; lake 2: high/low: 842/819 individuals; Table S1). 134 To sample from a broader range of environmental variability, we measured hybrid 135 survival after 3 months in lake 1 (high-frequency: 77.1% survival; low-frequency: 75% survival) 136 and after 11 months in lake 2 (high-frequency: 1.4% survival; low-frequency: 1.2% survival; 137 Table S1), in the latter case spanning half the pupfish reproductive lifespan (approximately 2 138 years) but avoiding mortality due to senescence. There were no differences in survival 139 probability between treatments in each lake (two-way logistic regression, treatment effect: P = 140 0.237). For a control comparison, additional hybrids from each lake population (N = 199 total individuals) were simultaneously tagged, raised in laboratory aquaria, and their deaths and 142 growth rates were tracked over one year. 143 144 Hybrid phenotypic diversity 145 Phenotypic similarity of each hybrid to the three parental species in each lake (n = 236) was 146 calculated from 30 linear traits and angles measured from three pre-release photographs of each 147 fish (Fig. S3). These traits were used to estimate two linear discriminant (LD) axes with major 148 loadings (Table S2)  We fit thin-plate splines using generalized cross-validation or restricted maximum likelihood to 157 survival and growth rate data to visualize fitness landscapes across the discriminant morphospace 158 [59]. We found evidence for directional selection on the survival of hybrid phenotypes most 159 similar to generalist and molluscivore phenotypes in lake 1 (Fig. 3). We also found evidence of 160 stabilizing selection on the growth rates of generalist and molluscivore phenotypes in this lake 161 (Fig. S4). Despite low survival rates after 11 months in lake 2, we found strong evidence of 162 nonlinear selection for hybrid phenotypes resembling the molluscivore. These landscapes were 163 estimated from very few survivors (n = 22); however, the density of non-survivors in the surrounding regions of morphospace and similar patterns in both field enclosures provides robust 165 support for nonlinear divergent selection for the molluscivore phenotype in this lake. Patterns of 166 survival in the wild were contrasted by strong directional selection for hybrids resembling the 167 scale-eater in laboratory control populations from both lakes (Fig. 3). 168 We also used generalized projection pursuit regression to estimate the two multivariate 169 linear phenotypic axes most strongly associated with survival across the 30-trait morphospace 170 (Tables S3-S4)  to parental phenotypes) suffered the lowest survival probability across treatments in both lakes 174 (Fig. 4). In contrast, survival in laboratory control populations was shifted or opposite to the 175 direction of selection in field enclosures along these two dominant axes of selection (Fig. 4). 176 We estimated the strength of multivariate selection gradients along these two major axes 177 of selection and found significant evidence of directional selection (P < 0.00001) on ridge axis 1 178 in both lakes and marginal evidence of directional selection on ridge axis 2 in lake 2 (Table 1). 179 The traits with the highest loadings on ridge axis 1 were a) lower jaw length and b) distance from 180 the jaw joint to the orbit and on ridge axis 2 were a) angle between the premaxilla and orbit and 181 b) distance from the premaxilla to the pectoral girdle (Tables S2-S3) We found no evidence of significant treatment effects on either survival probability or the overall 186 topography of the survival fitness landscape within the discriminant morphospace ( Table 2). The 187 fixed effect of treatment did not improve the fit to the survival data in any of the generalized 188 additive models examined. Instead, models without the effect of frequency treatment were 189 strongly favored (Table 2; Δ AIC = 11). Similarly, we found no evidence for frequency-190 dependent effects on survival between treatments on the two major axes of selection in either 191 lake estimated from generalized projection pursuit regression (Table 3). Models without the 192 effect of treatment provided a marginally better fit to the survival data on the two major axes of 193 selection in lake 1 (ΔAIC = 1) and were supported in lake 2 (Table 3;

Strong evidence of frequency-dependent growth rates between treatments 196
In contrast to survival, we found strong support for treatment effects on fitness landscapes 197 estimated from the independent fitness proxy of growth rate in lake 1 (Table 2; lake 2 was  198 excluded from all growth rate analyses due to the low number of survivors). Models including 199 the effect of frequency treatment on log-transformed growth rates in lake 1 were strongly favored 200 (ΔAIC = 68.6). This effect was robust to models including univariate splines and thin-plate 201 splines within the discriminant morphospace (Table 2). 202 Similarly, we found strong support for models including the effect of treatment for the 203 two major axes of selection across the entire morphospace in each lake estimated from 204 generalized projection pursuit regression (Table 3). The best supported model included the 205 effects of frequency treatment and univariate splines on the two major ridge axes of selection and 206 was strongly favored over a model without the effect of treatment (Table 3; Δ AIC = 46). 207 208 Phenotypic scale of frequency-dependence for growth rate but not survival 209   Within each enclosure we estimated the Mahalanobis distance for each hybrid individual, the  210   distance to the mean hybrid phenotype in the 30-dimensional morphospace correcting for trait  211 covariances, as an estimate of the rarity of each individual phenotype. We also calculated the 212 Euclidean nearest neighbor distance to the ten most similar hybrid phenotypes in 30-dimensional 213 morphospace for each hybrid as an estimate of the local frequency of competing phenotypes. 214 Overall, these measures estimate the frequency of similar hybrid phenotypes relative to each 215 hybrid to examine the phenotypic scale of frequency-dependence within each enclosure [51]. 216 The frequency of similar phenotypes was not significantly associated with residual variation in 217 survival not explained by hybrid phenotype along the two discriminant axes ( Fig. 5; Table 2). 218 Generalized additive models including the fixed effect of competitor frequency (distance to mean 219 phenotype) only marginally improved the fit to the survival data (Table 2; Δ AIC = 1; similar 220 results were found when substituting nearest neighbor Euclidean distance for Mahalanobis 221 distance: Fig. S1). 222 In contrast, generalized additive models including the fixed effect of competitor 223 frequency (both Mahalanobis and nearest neighbor distance) strongly improved the fit to the 224 growth data in lake 1 within the discriminant morphospace, even after accounting for the 225 treatment effect (Table 2; Δ AIC = 12). Similarly, for the two major axes of selection estimated 226 from generalized projection pursuit progression, models including the fixed effect of competitor 227 frequency substantially improved the fit to the growth data, even after accounting for the 228 treatment effect (Table 3; Δ AIC = 3.85). 229 230 Both fitness proxies support multiple fitness peaks on the adaptive landscape Generalized additive modeling enables estimation and visualization of a joint fitness landscape 232 after controlling for the effects of lake and treatment. The best supported models for survival 233 included a fixed effect of lake and no effect of treatment, with either two univariate smoothing 234 splines or a thin-plate plate and two smoothing splines modeling selection within the 235 discriminant morphospace (Table 2). Models including spline terms were strongly supported 236 over models including fixed linear effects of the discriminant axes (Table 2; Δ AIC = 18.7). The 237 best supported models for growth rate in lake 1 included a fixed effect of treatment (lake 2 was 238 omitted from these analyses due to the low number of survivors available for growth rate 239 estimates) and two univariate smoothing splines within the discriminant morphospace (Table 2). 240 Strikingly, the best combined models for survival (across both treatments in both lakes) 241 and growth rate (both lake 1 treatments) each independently supported an isolated fitness peak 242 for hybrids resembling the generalist separated by a fitness valley from a region of higher fitness 243 corresponding to the molluscivore phenotype (Fig. 6). This multi-peak landscape, consistent 244 across both survival and growth rate fitness proxies and different exposure periods in each lake, 245 was also striking similar to a previous independent field experiment using F2 hybrids in these 246 same lakes [56]. This supports the surprising spatiotemporal stability of a complex fitness 247 landscape topography spanning a recent adaptive radiation of trophic specialists across years, 248 seasons, divergent lake environments, and manipulated frequencies of hybrid transgressive 249 phenotypes. 250 Combined estimates of the fitness landscape for both survival and growth rate indicated a 293 surprisingly consistent topography across space and time comprised of an isolated fitness peak 294 corresponding to the generalist phenotype separated by a small fitness valley from a higher 295 fitness peak corresponding to the molluscivore phenotype (Fig. 6). For survival, multiple peaks 296 emerged from evidence of higher survival of generalist and molluscivore phenotypes after three 297 months in lake 1 combined with evidence of much higher survival of the molluscivore phenotype 298 after 11 months in lake 2 (Fig. 3). For growth rate in lake 1, multiple peaks emerged from higher 299 growth rates of generalist phenotypes in both enclosures combined with moderate molluscivore growth rates and very low scale-eater growth rates in the low-frequency enclosure (Fig. S4). 301 These joint landscapes are admittedly reflective of combining different frequency treatments, 302 exposure periods, lake environments, and potentially different selective regimes; however, 303 comparison of general additive models provided no evidence of different selective regimes 304 between treatments and only a fixed effect of lake environment, rather than a change in fitness 305 landscape topography between lake environments (e.g. Table 2: very low support for models Overall, while there was strong selection against hybrid phenotypes driving species divergence 319 in this radiation, hybrid survival showed minimal sensitivity to the frequency of competitors, 320 supporting the classic view of static fitness peaks and valleys on the adaptive landscape. Indeed, 321 the combined estimate of the fitness landscape across all four field enclosures in this study was 322 strikingly similar to the original fitness landscape estimated for the high-density field enclosure in lake 1 in our preview work [56]. This provides robust support for the spatiotemporal stability 324 of an isolated generalist fitness peak separated by a large fitness valley from scale-eating and a 325 smaller fitness valley and higher fitness peak for snail-eating. Our study also provides empirical 326 field experimental data supporting a role for the stable fitness optima and minima frequently 327 We tested this prediction by looking for fitness ridges connecting scale-eater phenotypes to 432 generalist and molluscivore phenotypes across 500 random cross-sections in the 30-trait 433 morphospace, rather than examining only a single two-dimensional cross-section. We found no 434 evidence of a fitness ridge in any linear combination of these 30 traits connecting scale-eaters to 435 other species in the radiation, instead nearly all survival curves appeared to decline in the region 436 of the morphospace containing scale-eaters (Fig. 7). In one of the low-frequency treatments, 437 survival curves showed no decline, but this treatment also included few hybrids closely 438 resembling the scale-eater phenotype and showed high survival overall (Fig. 7b). Thus, the entire 439 volume of high-dimensional morphospace within our trait dataset containing the scale-eater 440 phenotype appears to lead to a fitness minimum, or 'hole', isolating scale-eating phenotypes 441 from the rest of the radiation. However, we failed to find evidence of any fitness ridges 442 connecting these phenotypes to other species, one of the major predictions of the holey adaptive 443 landscape hypothesis. Nonetheless, fitness ridges may still exist in genotype space or across 444 were allowed to recover for at least 4 days in flow-through holding tanks at the Gerace Research 506 Centre to fully regrow their caudal fins while fed on a diet of newly hatched brine shrimp and 507 commercial pellet foods. 508 Field enclosures were 3.6 m x 4.9 m rectangular fully-enclosed bags with a mesh size of 509 0.318 cm (Christiansen's Net Company, Inc.) secured to either PVC pipe set in concrete (lake 1) 510 or iron rebar hammered into the substrate (lake 2; Fig. 1). Two enclosures were deployed in the 511 littoral zone of each lake after removing any debris, then the bottom mesh wall was weighted 512 down with rocks and logs covered in macroalgae from the surrounding area and filled with 513 benthic substrate, macroalgae, and wigeon grass from surrounding areas. Care was taken to avoid 514 introduction of any adult fishes, but smaller pupfish and mosquitofish could still enter the 515 enclosure through the mesh. 516 One enclosure in each lake was randomly selected as the high frequency treatment and 517 the second enclosure was the low frequency treatment. Hybrids were individually selected for 518 each treatment by eye, selecting more divergent phenotypes for the high-frequency treatments 519 and selecting the most generalist-like hybrids for the low-frequency treatments. This resulted in 520 reduced phenotypic variance and morphospace occupation within the low-frequency treatment 521 within each lake. This also effectively reduced the frequency of scale-eater hybrids falling within 522 the 95% confidence interval of parental scale-eater phenotypes. Hybrid densities within 523 enclosures approximated the natural densities of 0.9% and 3% scale-eaters and 6 and 5% 524 molluscivores in Crescent Pond and Little Lake, respectively [65]. 525 In Crescent Pond (lake 1), tagged hybrids were released in large batches into the high-526 frequency enclosure on May 15 th, 19 th , and 28 th and low-frequency enclosure on May 21 st , 26 th , 527 and 28 th , 2014, respectively. In Little Lake (lake 2), all tagged hybrids were released into the 528 high-frequency enclosure on May 18 th and into the low-frequency enclosure on May 25 th and 529 27 th , 2014. Surviving hybrids were recovered from lake 1 on August 26 th and 27 th , 2014 after 3 530 months by carefully removing the substrate and sequentially lifting the entire mesh bottom, then 531 photographed laterally and stored in 100% ethanol. To sample from a wider range of seasonal 532 environments and recover the full time to reproductive maturity, surviving hybrids were 533 recovered from lake 2 on April 28 th and 29 th , 2015 after 11 months in field enclosures. Tags were 534 dissected from all survivors after preservation, read using a 100x tag-reading scope from 535 Northwest Marine Technologies, Inc, and matched with archival tags to identify the survival 536 status (0 or 1) of each tagged hybrid. 537 538

Laboratory control 539
Additional hybrids from each population (n = 199, Table S1) were raised in two 151-liter 540 laboratory aquaria concurrent with the field experiment for 11 months. Control hybrids were 541 raised on a diet of only commercial pellet foods to provide a uniform resource offering no 542 advantages for specialized trophic morphology. Hybrids were fed once daily an amount of food 543 that could be consumed in five minutes but not ad libitum and raised at 26 -27° C in 5-10 ppt 544 salinity (Instant Ocean) with weekly 95% water changes. Laboratory hybrids grew faster than 545 fish placed in field enclosures and high densities within each aquarium population led to intense 546 competition for food and high mortality rates, whereas survivors collected from field enclosures 547 never reached maximum adult sizes or approached senescence. The day of each laboratory death 548 was recorded, followed by removal of the tag to identify the pre-release photograph of that to overestimate the slope of the regression line, particularly for highly variable traits among 569 parental and hybrid populations such as oral jaw length. However, our results were robust to 570 OLS size-correction. No allometric scaling was observed among different species except for 571 nasal protrusion distance and nasal protrusion angle, which exhibited no association with log-572 transformed SL and were not size-corrected. All size-corrected trait residuals and uncorrected 573 nasal protrusion distance and angle were standardized to a standard deviation of one and mean of 574 zero for comparisons across traits. There was no effect of standard length at introduction on 575 survival of hybrids (GLM logistic regression with effects of field enclosure and log-transformed 576 SL: P = 0.709) 577 578

Visualization of fitness landscapes 579
Fitness landscapes were visualized in each enclosure by fitting thin-plate splines to the survival 580 (binomial) or growth rate (normal) data for each hybrid using generalized cross-validation 581 (GCV), which minimizes residual prediction error of the spline surface. Splines were estimated 582 using the Fields package [59] in R. When over-fitting was apparent, restricted estimation of 583 maximum likelihood (REML) was used to estimate the curvature of the spline instead of GCV 584 (used for both survival landscapes in lake 2; REML estimation of splines was identical to GCV 585 surfaces in lake 1). Growth rate was only examined in lake 1 due to the low number of survivors 586 with growth rate data from lake 2 (Table S1). 587 We focused on two different cross-sections of the 30-dimensional hybrid morphospace. 588 nonlinear data such as ours, and enables visualization of the strongest axes of nonlinear selection 598 within the dataset. We calculated the first two ridge axes using the gppr function with a binomial 599 family of response distributions and then projected hybrid phenotypes onto each ridge axis by 600 matrix multiplication. We performed generalized projection pursuit regression separately for lake 601 1 and lake 2 hybrid populations due to the large differences in survival (Table S1). 602 603

Generalized additive modeling 604
We formally tested for experimental treatment effects on fitness landscapes using generalized 605 additive modeling in the mgcv package [109] in R. This modeling framework enables 606 incorporation of spline terms into generalized linear models and comparisons of models 607 containing spline, fixed, and random effect terms using AIC. For the survival data, we compared 608 models with the fixed effects of treatment and lake and all combinations of univariate smoothing 609 splines and thin-plate splines on both discriminant axes (Table 2) or both major ridge axes of 610 selection estimated from generalized projection pursuit regression (Table 3). We also explored 611 models allowing the thin-plate spline surface (i.e. the fitness landscape) to vary between lake 612 environments using the 'by' term within the thin-plate function. Finally, we included models 613 with a covariates including log-transformed standard length and distance measures of competitor 614 frequency within each enclosure based on either Mahalanobis distance or nearest-neighbor 615 Euclidean distance (see below). Models were compared using AIC. We examined a similar range 616 of models for the growth rate data, calculated from the difference in log-transformed SL between 617 pre-release photographs and surviving fish. However, we excluded lake 2 from all growth rate 618 analyses due to the low number of survivors in this lake. 619 620

Analyses of frequency-dependent selection within enclosures
We used two approaches to measure the frequency of competitors within each enclosure. First, 622 we calculated the Mahalanobis distance from each hybrid phenotype to the mean hybrid 623 phenotype in the full 30-trait morphospace using the mahalanobis function in R. This distance 624 estimates the disparity of each hybrid relative to the most abundant hybrid phenotypes while 625 accounting for trait correlations. We also measured the frequency of competitors in the local 626 region of morphospace surrounding each hybrid by calculating the sum of the Euclidean distance 627 to the ten nearest neighbors in the full 30-trait morphospace, following the approach in