Contemporary risk of extinction in an extreme environment

The increased frequency and intensity of extreme events are recognized among the most worrisome aspects of climate change. However, despite increased attention from scientists and conservationists, developing and testing general theories and hypotheses on the effects of extreme events on natural populations remains intrinsically challenging. Using numerical simulations with general—but realistic for moderately fast-leaving species—parameter values, I tested some of the hypotheses on risk of extinction and population and genetic dynamics in an environment in which both climate (e.g., temperature, rainfall) and point (e.g., fires, floods) extremes occur. In the simulations, a quantitative trait is selected for by a climate variable, but point extremes cause trait-independent massive mortalities. I found additive effects between age at first reproduction and fecundity on risk of extinction. The extent of population bottlenecks (operationally, the number of years in which a population was at low numbers) was a good predictor of allelic richness for the quantitative trait selected for by the climate. Simple models including basic demographic and vital rates information of the species, along with climate/environmental measures, provided excellent predictions of contemporary risk of population extinction. Mean and minimum population size measured in a 10-year “observation window” were largely the most important predictors of risk of population extinction in the following 10-year “prediction window”.

The expected optimum phenotype µQ(t) moves at a constant rate over time (i.e., 168 trend), fluctuating randomly around its expected value . The optimum phenotype 169 Q(t) is randomly drawn at each time step t from a normal distribution Q(t) ~ N (µQ(t),sQ(t)). It is equivalent to consider Q(t) as both the optimum phenotype and the 171 value of a continuous climate variable (e.g., mean summer temperature or yearly rainfall), 172 and I will use the two terms interchangeably throughout this work.  With the model formulation of Eq. (1), both the mean and variance of the distribution 192 of the climate variable change over time so as to make the occurrence of events more 193 likely after climate change (i.e., t > tch) than before, since after climate change, the 194 realized (i.e., random draw from the statistical distribution of climate) climate is 195 increasingly likely to be in the region of extremes (say, in the right 5% or 2.5% of the 196 Gaussian distribution of climate) of the statistical distribution of climate before climate 197 change. 198 Point extreme events E leading to trait-independent high mortalities occur with annual 199 probability p(Eb) when t < tch (i.e. b -before climate change) and p(Ea) when t > tch.
The narrow sense heritability h 2 = / is the proportion of the phenotypic variance 205 present in the population that is explained by the additive genetic variance (i.e. 206 the variance of a in the population). 207 For an individual i, the genetic value ai is determined by nl freely recombining diploid 208 variance equal to s 2 a. For simplicity , I did not model either dominance or epistatic   212   variation or other complicating factors such as genotype-environment interaction and  213 linkages. Likewise, I did not model mutation, since previous work has shown that 214 mutation does not appear to have any effect short-term on extinction risk and the 215 evolution of traits on contemporary temporal scales . 216 Stabilizing selection is modeled with a Gaussian function (Bürger and Lynch 1995, 217 Zhang 2012), with fitness W (Endler 1986) for an individual with phenotypic trait zi equal 218 to: 219 (3) 220 and equivalent in this model to the annual survival probability of individual i. The 221 curvature of the fitness function near its optimum increases with decreasing ω 2 ; it follows 222 that that the smaller ω 2 , the stronger is selection. Stabilizing selection is usually measured 223 by the standardized quadratic selection gradient g, which is defined as the regression of 224 fitness W on the squared deviation of trait value from the mean (Lynch and Walsh 1998). 225 An optimum phenotype in the tails of the distribution is likely to cause a large drop in 226 population size and can be considered an extreme climate event (Fig. 1). 227 The median g = -0.1 for stabilizing selection found by Kingsolver where . With g = -0.

Simulations 239
As this study focuses on the more immediate effects of climate change, the 240 simulations last 250 years. Parents mate at time t-1, offspring are born at time t and 241 become of age 1 at t+1. The sequence of operations is mortality of adults, mating and 242 reproduction, mutation, mortality of offspring. At the start of each simulation, for each 243 individual a value of a and e (Eq. 2) is randomly drawn from their initial distribution. A 244 population is considered extinct if at any time during the simulation there are fewer than 245 2 individuals in the population. Parents form mating pairs starting at age af and produce a 246 number of offspring randomly drawn from a Poisson distribution with intensity lo .

247
Offspring receive for the same locus one allele from each parent.
s values that are both realistic for natural populations with moderately fast life histories, 254 like salmonids (e.g., age distribution skewed toward individuals younger than 10 years 255 old, sexual maturity reached when individuals are between 1 and 4 years old), and 256 instrumental for the main goal of the study, e.g., testing hypotheses on the effects of 257 extreme events on population and genetic dynamics and on risk of extinction. 258 I performed simulations with selection strength s equal to either 8 10 -2 (average 259 selection strength) or 1.1 10 -2 (moderately strong selection). For the rate of increase in the 260 mean of the climate variable, I used = 0 (base scenario) and 1.5 10 -2 . I used rates of 261 the increase in the standard deviation of the climate variable from 0 (base scenario) 262 to 1.5 10 -2 . According to Bürger and Lynch (1995), when the standard deviation of the 263 distribution of the optimum reaches the same order of magnitude as the width of the 264 fitness function, the population is at risk of going suddenly extinct, with little role played 265 by genetics. Therefore, I chose values of that strongly increase the probability of 266 climate extremes, but did not inevitably make the population go extinct. 267 I used frequency of point extreme events p(Ea) of either 5 10 -2 (no variation before and 268 after climate change, corresponding to a recurrence interval of 20 years), 10 10 -2 (i.e., 269 recurrence interval is 10 years) or 15 10 -2 (Table 1) Table 1. 277

Initialization 278
To reach mutation-selection-drift balance, I first let the population evolve for tch years 279 in an environment in which mean and variance of the distribution of the optimal 280 phenotype Q are constant. In preliminary simulations it was found that after tch ~ 100 281 years both phenotypic mean and variance did not noticeably change. Then, the mean of 282 Q increases for 150 years and the variance of Q for 25 years, which was then kept 283 constant up to the end of simulation time. 284 I started every simulation replicate with 500 individuals. I modeled 10 alleles present 285 in the population for each locus, which value was randomly drawn from a normal 286 distribution N(0, ). Since I set = 1 and = 0.2, the narrow sense heritability h 2 287 was around 0.2 at t = 1, close to what commonly observed for life-history traits (Lynch 288 and Walsh 1998) and consistent with the Gaussian allelic approximation including only 289 quasi-neutral and adaptive mutations, for which (Lande 1995). distribution bounded between 20 and 250 for the replicates that survived; u is a random 343 deviate from a uniform distribution bounded between 1 and 10 years. This way, I am 344 trying to model extinction or persistence not at a specific time, but in a "prediction 345 window" of 10 years, whose beginning lays between year 10 and year 240 of simulation 346 time. I used as candidate predictors, as measured in the 10 years ("observation window") 347 before the "prediction window", minimum and mean population size N, the maximum 348 value of the optimum phenotype, the expected yearly number of offspring per mating 349 pair, age at first reproduction, and maximum and mean distance over the observation 350 window between the mean phenotype and the optimum (i.e., maximum and mean 351 population-level maladaptation, or maximum and mean "extremeness" of the climate). For the GAMs and GLMs, I estimated the optimal cutoff given equal weight to extinct) and specificity (the probability that the model predicts persistence when the 366 replicate persisted). Then, I tested the model by predicting population extinction and 367 persistence on the test dataset using the computed optimal cutoff. For the classification 368 RF models, I directly used the binary prediction (population going extinct or surviving) 369 from the models. I used different modeling approaches because I did not explicitly model 370 any mechanism or process that can be hypothesized to lead to extinction (i.e., models are 371 correlative and not mechanistic), and different modeling approaches can give different 372 insights on how contemporary extinctions are predicted (e.g., tree-based models like RFs 373 provide a measure of variable importance for predicting the target variable (Breiman 374 2001a) and GAMs can model non-linear relationships between predictors and target 375 variable using semi-parametric estimation). For the GAMs and GLMs, I centered and 376 scaled the predictors in order to compare their importance (Schielzeth 2010). As I use 377 realistic variable ranges representing the variability that may be observed in nature, some 378 of the estimated parameters can be compared in terms of effects on a standardized scale. I 379 also fitted the same models using non-standardized predictors to test for possible data 380 leakage between training and testing data sets (results were directionally the same). I did 381 not include interactions among predictors in the models in order to improve the 382 interpretability of results. I visually checked residuals for violation of model assumptions. 383 3 Results 384 increased occurrence and severity of point extremes led to noticeable fluctuations in 388 population size over time (Fig. 1). Twenty-one per cent of the 34 560 simulation 389 replicates went extinct. As expected, risk of extinction increased with higher frequency 390 and severity of point extreme events, older age at first reproduction, and fewer offspring 391 produced per mating pair (Fig. 2). Given an expected yearly number of offspring per 392 mating pair, the proportion of replicates that went extinct increased approximately 393 linearly with age at first reproduction (Fig. 2a). When considering all replicates or only 394 the most extreme scenario, for a fixed expected number of offspring produced per mating 395 pair, increasing age at first reproduction by one year would increase the probability of 396 going extinct by approximately 10% (Fig. 2a). The combination of relative high 397 frequency and high severity of extreme events led to a noticeably higher proportion of 398 replicates that went extinct (Fig. 2b). Among the replicates that went extinct, 16% of 399 them were not affected by a point extreme in the 10 years before extinction and 34% in 400 the 5 years preceding extinction. 401 In populations that persisted until the end of simulation time, the GLM and GAM models 402 that included n_low provided a good prediction of the total number of alleles at the end of 403 simulation time ( Table 2). The total number of years with population size smaller than 404 150 individuals had a strong, negative non-linear effect on the total number of alleles at 405 the end of simulation time (Fig. 3). Models that did not include n_low had much lower 406 predictive performance (Table 2). 407 Models for differences in allelic frequency of more or less theoretically advantageous 408 alleles were able to explain less than 5% of the variance of the target variable (Fig. 4).

20
The average value of population-mean phenotype in the last ten years of simulation 410 time was negatively correlated with the difference in frequency of top and bottom 10% of 411 alleles according to their allelic value at the end of simulation time (r = 0.71, p < 0.01). 412 The GLM, GAM, and RF models fitted on the training data sets had similar high 413 predictive accuracy and low false positive and negative rates when predicting extinction 414 or survival in the 10-year "prediction window" (Table 3). For more than 98% of 415 replicates included in the test data set, the GLM, GAM, and RF models provided the 416 same prediction of either extinction or persistence (Fig. 5). In the GAM model, only 417 minimum population size in the observation window had a strong non-linear (negative) 418 effect on the log-odds of population extinction. Likewise, minimum population size was 419 the most important predictor in the RF model (Fig. 6a). The models without minimum 420 and mean population size as predictors had fairly low accuracy (Table 3). The RF model 421 without minimum and mean population size as predictors found age at reproduction and 422 expected yearly number of offspring per mating pair as the most important predictors of 423 contemporary extinction (Fig. 6b). 424

Discussion 425
Understanding and predicting the effects of extreme events on risk of extinction and 426 population and genetic dynamics of natural populations is critical for both population 427 forecasting and for managing human intervention in an increasingly more extreme world. 428 The results of the numerical simulations showed additive effects-with largely no 429 interaction effects-between age at first reproduction and fecundity on risk of extinction 430 for the range of values I simulated. In the replicates that survived up to the end of simulation time, the total number of years in which the population was at a small size was 432 a good predictor of allelic richness for the quantitative trait under selection. The 433 population frequency of theoretically advantageous alleles was strongly correlated with 434 the mean value of the phenotype under selection but was otherwise largely unpredictable. 435 Last, simple models including basic demographic and vital rates information, along with 436 climate and environmental data, provided excellent predictions of contemporary risk of 437 population extinction. alleles is difficult to predict from just the coarse-grain description of the environment and 495 of the species, even in an extreme environment that also causes trait-independent mass 496 mortalities, a shift in the phenotype is likely to be caused by the increased prevalence of 497 the most advantageous alleles. 498 One of the foundational tenets of conservation biology is that small, fragmented 499 populations should be considered locally vulnerable to extinction-even more critically 500 so when affected by highly variable climatic conditions and other environmental 501 disturbances. The conditions that led to the extinction of a population or species can 502 almost always be understood retrospectively, but forecasting extinction, especially over 503 contemporary time horizons, is much more challenging. In this work, I found that the 504 most important predictors of contemporary extinctions were mean and minimum 505 population size measured in the few years before the "prediction window". However, it 506 was not uncommon for the simulated populations to swiftly rebound after collapses in 507 numbers, and age at reproduction and yearly fecundity were the most important 508 predictors of extinction when measures of population size were not included in the model. 509 This result highlights the importance of age at first reproduction and yearly fecundity for 510 population persistence in highly stochastic environments, as also suggested by theoretical 511 (Bürger and Lynch 1995) and experimental (Griffen and Drake 2008) studies. However, 512 other genetic challenges not accounted for in my simulation model are likely to be 513 encountered by populations that decline to very small numbers, such as a reduction of 514 viability and/or fecundity due to either inbreeding or the expression of deleterious alleles 515 (Willi et al. 2006). 516 Although it may appear from the simulation results that vital rates and environmental 517 conditions play a small role in models that predict contemporary risk of extinction, those 518 rates and conditions heavily contribute to determining population size in the "observation 519 window"; for example, populations with higher fecundity and younger age at first 520 reproduction are less likely to remain at low population sizes than populations with lower 521 fecundity and delayed sexual maturity. Likewise, a more extreme environment (e.g., 522 greater selection strength and more frequent and/or severe climate and point extremes) 523 tends to decrease average population size, either due to acute events that kill individuals 524 or constant recovery from population crashes. 525 Luck also plays a major role in determining whether a population will recover after a 526 population crash. Vincenzi et al. (2017) found that the almost-complete recovery of a 527 trout population that was reduced to a handful of individuals after a flash flood was due 528 to the large production of offspring by a single mating pair. Considering the high 529 variance in adult reproductive success in trout populations-which is at least partially due 530 to differences in individual "quality" ( However, models of population and genetic dynamics are limited in their scope of 539 prediction by both the understanding of the biology and ecology of the species and the 540 availability of data to parametrize, train, and test models, along with modeling choices 541 (e.g., explicit or implicit modeling of alleles, clustered or "well-mixed" individuals or 542 populations). Although often intuitive, it is nevertheless important to remind ourselves 543 that the simulation results of modeling exercises depend on, first, our biological and 544 ecological understanding, and, second, the simplified modeling of the species and the to competing physiological functions are not only often intrinsically challenging to 547 model, but they may also vary over time and space for the same species or population. 548 That is, even when there is a qualitative understanding of biological or ecological 549 processes, the parameterization and choice of parameter values for the model may be too 550 uncertain to provide actionable predictions. 551 Trade-offs between model accuracy and interpretability also need to be taken into 552 account when developing models of population and genetic dynamics. Accuracy 553 describes the ability of a model to explain observed data and make correct predictions, 554 while interpretability concerns to what degree the model allows for understanding 555 processes. Often, there is a trade-off between accuracy and interpretability: more 556 complex models are usually opaque, while more interpretable models often do not 557 provide the same accuracy or predictive power of more complex models (Breiman 558 2001b). Then, although intuitively more complex models are expected to provide more 559 accurate predictions of risk of extinction or population and genetic dynamics, this is not 560 always the case. For instance, Ward et al. (2014) tested the predictive performance of 561 short-term forecasting models of population abundance of varying complexity. They 562 found that more complex models often performed worse than simpler models, which 563 simply treated the most recent observation as the forecast. In their case, the estimation of 564 even a small number of parameters imposed a high cost while providing little benefit for 565 short-term forecasting of population abundance. However, when there was a clear signal 566 of cyclic dynamics, more complex models were able to more accurately predict future 567 population sizes.      Table 2. 760 The GAM algorithm found k = 3 as the optimal number of degrees of freedom for the 761 spline (i.e., a cubic). Model