## Abstract

Genetic assimilation results from selection on phenotypic plasticity, but quantitative genetics models of linear reaction norms considering intercept and slope as traits do not fully incorporate the process of genetic assimilation. We argue that intercept-slope reaction norm models are insufficient representations of genetic effects on linear reaction norms, and that defining the intercept as a trait is unfortunate. Instead we suggest a model with three traits representing genetic effects that respectively (1) are independent of the environment, (2) alter the sensitivity of the phenotype to the environment, and (3) determine how the organism perceives the environment. The model predicts that, given sufficient additive genetic variation in environmental perception, the environmental value at which reaction norms tend to cross will respond rapidly to selection, and eventually become equal to the mean environment. Hence, in this model, genetic assimilation in a new environment becomes complete without changes in genetic correlations, genetic drift or imposing any fitness costs on maintaining plasticity. The asymptotic evolutionary outcome of this three-trait linear reaction norm generally entails a lower degree of phenotypic plasticity than the two-trait model, and maximum expected fitness does not occur at the mean trait values in the population.

## Introduction

All natural populations evolve in environments that are to some degree variable. Biologists have long realized that the phenotypic expression of different genotypes may respond differently to the same environmental change, and that such phenotypic plasticity may be heritable (DeWitt and Scheiner 2004, Pigliucci 2005). Depending on the effect this phenotypic plasticity has on selection (fitness), evolution may thus bring about mechanisms that either buffer the phenotypic expression against environmental variation (i.e., environmental canalization) or modify the responses to some environmental influence in an adaptive manner (Nijhout 2003). Phenotypic plasticity involves developmental, physiological and/or behavioral phenotypic responses to some component(s) of the environment (DeWitt and Scheiner 2004, Pigliucci 2005, Pigliucci et al. 2006). These environmental components, often referred to as environmental ‘cues’ (DeWitt and Scheiner 2004), are often just correlated with, but not identical to, the environmental variables affecting fitness (e.g. McNamara et al. 2011, Svennungsen et al. 2011, Gienapp et al. 2014). Hence, cues do not provide perfect information about the optimal phenotypic expression, and it is usually adaptive to respond more conservatively towards information-poor cues than more informative ones (Yoccoz et al. 1993, Ergon 2007, McNamara et al. 2011). The phenotypic expression of a particular genotype as a function of environmental cues is called a reaction norm (Woltereck 1909, Pigliucci 2005). There has been considerable interest in evolutionary processes governing reaction norms as this is crucial for our understanding of how populations may respond to environmental change (e.g. Lande 2009, McNamara et al. 2011, Gienapp et al. 2014).

Waddington (1953, 1961) originally used the term ‘genetic assimilation’ to describe experimental selection results where qualitative phenotypes (such as lack of cross-veins in *Drosophila* wings) that are initially only expressed in response to a particular environmental stimulus (such as heat shock during a particular stage of development) become constitutively produced (i.e., become expressed independently of the environmental stimulus) after continued selection. However, ‘genetic assimilation’ is also used to describe similar phenomena in evolution of the mean of quantitative phenotypes that may remain plastic at equilibrium in a stochastic environment after an environmental change (Pigliucci and Murren 2003; Lande 2009). In such cases, the new equilibrium phenotypes will not be independent of the environment unless the reaction norm slope is zero. We adopt Lande’s (2009) definition of genetic assimilation in an altered environment as the reduction in the plastic component of the phenotype with a concomitant genetic evolution, while maintaining the phenotype initially produced by plasticity in the altered environment. By ‘plastic component’ we here mean the difference between the phenotypic value and the mean phenotype in the environment where phenotypic variance is minimized (i.e., where individual reaction norms tend to cross; see Fig. 1 in Pigliucci et al. 2006). We consider this process of genetic assimilation as complete when the expected plastic component of the phenotype is zero and the phenotypic variance is minimized in the new mean environment (but both mean reaction norm slope and phenotypic variance in the mean environment may remain non-zero; see Fig. 1 in Pigliucci and Murren 2003 and Fig. 2 in Pigliucci et al. 2006).

Several experiments, mainly on the fruit fly *Drosophila melanogaster*, have demonstrated genetic assimilation in laboratory populations (Braendle and Flatt 2006), and there is substantial evidence for genetic assimilation, also for quantitative traits, from field observations and experiments (reviewed by Pigliucci and Murren 2003). The most commonly proposed mechanism for genetic assimilation is that extraordinary environments may expose hidden genetic variation in reaction norms for subsequent selection, followed by eventual canalization of the phenotypic expression in the new environment (Braendle and Flatt 2006, Pigliucci et al. 2006, Lande 2009). The latter stage of this process is perhaps the least understood; it has been suggested that genetic drift or fitness costs of maintaining plasticity play a part (West-Eberhard 2003, Pigliucci et al. 2006, Lande 2009, Bateson and Gluckman 2011), and changes in the genetic variances, covariances and genetic architecture of reaction norm components may be involved (Wagner et al. 1997, Steppan et al. 2002, Le Rouzic et al. 2013).

One approach to quantitative genetic analysis of phenotypic plasticity (Via et al. 1995, Rice 2004) is to consider the intercept and slope of linear reaction norms as two quantitative traits in their own right (de Jong 1990, Gavrilets and Scheiner 1993a, de Jong and Gavrilets 2000, Tufto 2000, Lande 2009). More generally, reaction norms have been modeled by considering polynomial coefficients as traits (Gavrilets and Scheiner 1993b, Scheiner 1993). In these models, the intercept trait is the phenotypic value at a reference cue designated as zero. Lande (2009) analyzed the evolution of such a linear reaction norm, assuming a stochastic environment undergoing a sudden extreme change (relative to the background fluctuations) in both the mean environmental cue and the phenotypic value where fitness is maximum. In his model the population responded by a rapid increase in mean reaction norm slope (plasticity), followed by a slow increase in reaction norm elevation at the reference cue with a concomitant decrease in plasticity. However, the genetic assimilation was not completed, as the cue value at which the phenotypic variance was at its minimum could never move away from the reference cue because the covariance between reaction norm slope and intercept was assumed to remain constant. Lande (2009) argued that further canalization would take place (e.g., due to fitness costs of maintaining plasticity), but did not include any such mechanisms in his modeling.

In this paper, we argue that the two-trait model is an insufficient representation of genetic effects on linear reaction norms, and hence fails to predict critical aspects of the evolution of phenotypic plasticity and genetic assimilation. Instead we suggest modeling linear reaction norms as being composed of three traits based on fundamental ways that gene products may alter linear reaction norms in such a way that they remain linear. These three traits are (a) effects of gene products that are independent of the cue (variation in this trait will shift the reaction norm along the phenotype axis), (b) effects of gene products that alter the sensitivity of the phenotype to the cue (variation in this trait will alter the slope of the reaction norm), and (c) effects of gene products that affect how the organism perceives the cue (variation in this trait will shift the reaction norm along the cue axis). Reanalyzing the scenarios for extreme environmental change considered by Lande (2009), we show that, under the three-trait reaction norm model, genetic assimilation in the new stochastic environment becomes complete (as defined above) without changes in genetic correlations among the defined traits, genetic drift or imposing any fitness costs on maintaining plasticity. Further, we show that the evolutionary equilibrium of this three-trait linear reaction norm under random mating entails (with certain exceptions) a shallower mean reaction norm slope than the slope of the optimal individual reaction norm and the equilibrium slope of the two-trait model. Hence, maximum fitness does not occur at the mean trait values in the population.

We start by deriving an expression for optimal linear reaction norms as a function of environmental cues in stationary stochastic environments. We then derive our three-trait linear reaction norm model, and finally we analyze the evolutionary dynamics of this model in a quantitative genetics framework, and compare it to the dynamics of the two-trait reaction norm model analyzed by Lande (2009).

## Models

### Optimal linear reaction norms in temporally variable environments

Models for optimal adaptations in variable environments have traditionally assumed either that individuals have no information about the relevant environmental variables, or that individuals have exact information about the state of the environment (Yoshimura and Clark 1991, Roff 2002). Whenever the phenotype yielding highest fitness is not known exactly (i.e., the individuals do not have full information about the present and future environment), the long term success of a genotype depends not only on the expectation of fitness, but it is also adaptive to reduce the variance in mean fitness across generations (Yoshimura and Clark 1991, Starrfelt and Kokko 2012). Models that assume that individuals have no information about the environment have been used to explain risk-avoidance and bet-hedging strategies (den Boer 1968, Hopper et al. 2003, Starrfelt and Kokko 2012). On the other side of the spectrum, models that predict optimal trait values as a function of environmental variables, often assume that these variables are known to the individuals without error (e.g. Stearns 1992, Roff 2002).

The concept that phenotypic expressions are functions of more or less informative environmental cues is well established in evolutionary ecology (Tollrian and Harvell 1999, DeWitt and Scheiner 2004, Stephens et al. 2007, McNamara et al. 2011, Gienapp et al. 2014). For example, seasonal reproduction in many organisms must take place within a rather narrow time-window which often varies largely between years (Durant et al. 2007, Gienapp et al. 2014). Since such phenological events must often be prepared a long time in advance (due to acquiring resources, physiological developments and migration), seasonal reproduction may be influenced by rather information-poor cues such as temperature and food constituents weeks before reproductive success is determined (Berger et al. 1981, Korn and Taitt 1987, Lindstrom 1988, Negus and Berger 1998, Nussey et al. 2005). Examples of such obviously adaptive phenotypic plasticity to more or less informative environmental cues are ubiquitous in nature (Pigliucci 2005, Sultan 2010, Landry and Aubin-Horth 2014).

To derive an optimal norm of reaction to an imperfect cue, we may view the cue ($U$) and the phenotypic expression that maximizes fitness ($\Theta$) as having a joint distribution with given means, $\mu_U$ and $\mu_\Theta$, variances, $\sigma_U^2$ and $\sigma_\Theta^2$, and a correlation, $\rho$ (Figure 1). Note that we here define the cue ($U$) in a general sense as the *environmental component* that affects the phenotype, *not* how this component is perceived by the individuals (as in e.g. Tufto (2000)). Also note that $U$ need not be interpreted as a proxy for another environmental component that affects fitness (e.g. Miehls et al. 2013), although this may be the case (see caption of Figure 1). Hence, following McNamara et al. (2011) we focus on the information content in the cue ($U$) about the optimal phenotypic expression ($\Theta$) in the given environment.

Under the assumption of no density or frequency dependence, the optimal phenotypic trait values are those that maximize the geometric mean of fitness across generations (Dempster 1955, Caswell 2001). This is equivalent to maximizing the expected logarithm of fitness. Hence, if fitness, $W$, is a Gaussian function (with constant width and peak values) of the phenotype value, $y$, such that $\ln(W(y))$ is a quadratic function, the optimal *linear* reaction norm as a function of cue values $u$ is

$$y^{*}(u) = \mu_\Theta + \rho\,\frac{\sigma_\Theta}{\sigma_U}\,(u - \mu_U) \qquad (1)$$

(Appendix A). Note that, due to the quadratic fitness function $\ln(W(y))$, this is the same as the least squares prediction line of $\Theta$ as a function of cue values $u$ (Bhattacharyya and Johnson 1977).

This optimal reaction norm under imperfect information (eq. (1)) may be seen as a weighted average of the optimal phenotype under no information ($\mu_\Theta$) and the optimal phenotype under perfect information, $\mu_\Theta + \operatorname{sign}(\rho)\,\frac{\sigma_\Theta}{\sigma_U}(u - \mu_U)$, with the weight being $|\rho|$ (Figure 1). Given that $W$ is a Gaussian function of $y$, this linear reaction norm is the optimal reaction norm (i.e., a non-linear reaction norm would not perform better) as long as $E[\Theta \mid U = u]$ is a linear function of $u$, which is the case when $U$ and $\Theta$ are bi-normally distributed (Johnson and Wichern 2007, chap. 7.8).
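As a numerical illustration, the following minimal Python sketch draws cue and optimum pairs from a bivariate normal distribution and compares the mean squared deviation between phenotype and optimum for three linear reaction norms. Under a Gaussian fitness function this quantity determines the expected log fitness up to a constant. The function names and all parameter values are hypothetical choices for this example, and the line returned by `optimal_linear_reaction_norm` is simply the least squares prediction line described in the text.

```python
import math
import random

def optimal_linear_reaction_norm(mu_u, mu_theta, sigma_u, sigma_theta, rho):
    """Intercept and slope of the least squares line of Theta on U (equation (1))."""
    slope = rho * sigma_theta / sigma_u
    intercept = mu_theta - slope * mu_u
    return intercept, slope

def simulate_cue_and_optimum(n, mu_u, mu_theta, sigma_u, sigma_theta, rho, seed=1):
    """Draw n pairs (u, theta) from a bivariate normal distribution."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = mu_u + sigma_u * e1
        theta = mu_theta + sigma_theta * (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        pairs.append((u, theta))
    return pairs

def mean_squared_deviation(pairs, intercept, slope):
    """Mean of (y(u) - theta)^2; expected ln(W) equals a constant minus
    this value divided by 2 * omega^2 under a Gaussian fitness function."""
    return sum((intercept + slope * u - th) ** 2 for u, th in pairs) / len(pairs)

# Hypothetical parameter values, chosen only for illustration.
pairs = simulate_cue_and_optimum(20000, mu_u=0.0, mu_theta=10.0,
                                 sigma_u=2.0, sigma_theta=4.0, rho=0.5)
a_opt, b_opt = optimal_linear_reaction_norm(0.0, 10.0, 2.0, 4.0, 0.5)
msd_opt = mean_squared_deviation(pairs, a_opt, b_opt)
msd_full = mean_squared_deviation(pairs, 10.0, 4.0 / 2.0)  # slope for a perfect cue
msd_flat = mean_squared_deviation(pairs, 10.0, 0.0)        # no plasticity
print(b_opt, msd_opt, msd_full, msd_flat)
```

With these values the slope of equation (1) is half the slope that would be optimal under perfect information, and its mean squared deviation should be the smallest of the three.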

Optimality models of this kind have been central in the development of evolutionary ecology (Parker and Maynard Smith 1990, Sutherland 2005, Roff 2010). McNamara et al. (2011) analyzed the general optimal linear reaction norm given by equation (1) in terms of optimal phenology under environmental change. Ergon (2007) used a similar approach to analyze optimal trade-offs between pre-breeding survival, onset of seasonal reproduction and reproductive success in fluctuating multivoltine species.

### Quantitative genetics models for linear reaction norms - two vs. three traits

The optimal linear reaction norm given by equation (1) says nothing about the selection process and does not consider genetic constraints. In the following we will consider a quantitative genetic model for linear reaction norms, assuming phenotypic responses to an interval-scaled cue with an arbitrary zero point (Houle et al. 2011).

In quantitative genetic models for the evolution of phenotypic plasticity, it is common to model linear reaction norms by two traits, representing the intercept ($\alpha$) and slope ($\beta$) of the reaction norm (e.g. de Jong 1990, Gavrilets and Scheiner 1993a, de Jong and Gavrilets 2000, Tufto 2000, Lande 2009, Scheiner 2013). That is, the plastic phenotype is modeled as a function of an environmental cue $u$ of the form

$$y(u) = \alpha + \beta u \qquad (2)$$

In this two-trait model, the intercept trait $\alpha$ is the phenotypic expression for the cue value designated as zero. Lande (2009) assumed that minimum phenotypic variation occurred in the mean environment that the population had been adapted to, and hence defined the cue to have its zero point in this reference environment. He then used this reaction norm model (eq. (2)) in a quantitative genetics analysis of adaptations to a sudden extreme change in the mean environment when the reference environment remained unchanged.

We will here analyze a more general linear reaction norm model based on three fundamental ways that genetic effects can alter a linear reaction norm in such a way that it remains linear: a change along the plastic phenotype axis, a change in slope (cue sensitivity), and a change in the reaction norm along the cue axis. This leads us to consider a linear reaction norm model of the form

$$y(u) = z_a + z_b\,(u - z_c) \qquad (3)$$

where $z_a$, $z_b$ and $z_c$ are considered as (latent) traits. A particular genetic effect may of course affect more than one of these traits, but any genetic effect on a linear reaction norm can be decomposed into these three components. Obviously, shifting the reaction norm along the cue axis (a change in $z_c$) may have exactly the same effect on the reaction norm as shifting it along the $y$-axis (a change in $z_a$). By rearranging the reaction norm model (3) as $y(u) = \alpha + z_b u$ where $\alpha = z_a - z_b z_c$, we see that increasing $z_a$ by one unit has the same effect on $y(u)$ as decreasing $z_c$ by $1/z_b$ units. However, traits $z_a$ and $z_c$ still represent very different genetic effects within the organisms. Trait $z_c$ may be thought of as representing genetic effects on “perception” of the environmental cue in a general sense. For example, variation in $z_c$ may represent genetic effects affecting the sensory apparatus in such a way that different genotypes perceive the same environmental cue as different, but cue perception may not necessarily involve the sensory apparatus or a nervous system (see Discussion). The component of the reaction norm that is independent of the environment is the intercept ($z_a - z_b z_c$), although this component depends on the chosen zero-point of the interval scaled cue. However, trait $z_a$ represents genetic effects that are invariant to which environment has been designated (by the researcher) to have cue value zero; $z_b z_c$ is the component of the intercept that depends on the chosen zero-point of the cue variable. Variation in trait $z_a$ may thus represent variation in gene products for which both the production of these gene products and their effect on $y(u)$ are independent of the cue. Finally, trait $z_b$ (reaction norm slope) represents variation in gene products that affect the sensitivity of the plastic phenotype $y(u)$ to the cue. With this parameterization of the reaction norm (eq. (3)), $z_c$ may be referred to as a “cue reference trait”, although we do not suggest that there is necessarily a “template” of a specific environment that is stored genetically in the organisms; what is essential is the types of genetic variation that are represented by the three traits in the model.

Note that the two-trait model (eq. (2)) is a special case of the more general three-trait model (eq. (3)) where $z_c$ is fixed to zero. Reaction norm slope is considered as a trait in both models (i.e., $\beta = z_b$), but we have used a separate notation in the two models for clarity.

## Analysis

### Basic properties of the reaction norm models

As already noted, an obvious difference between the two-trait (eq. (2)) and the three-trait (eq. (3)) reaction norm models is that the two-trait model implies a one-to-one correspondence between genotypes and reaction norms, whereas the three-trait model implies that one reaction norm can represent many genotypes. Nevertheless, as we will see below, linear reaction norms in a population will evolve very differently and reach a different equilibrium when we consider the reaction norm to result from three traits rather than two traits.
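The many-to-one mapping from three-trait genotypes to reaction norms can be made concrete with a short Python sketch. It uses only the relations stated in the Models section, $y(u) = z_a + z_b(u - z_c)$ and $\alpha = z_a - z_b z_c$; the function names and trait values are hypothetical.

```python
def reaction_norm_three_trait(z_a, z_b, z_c):
    """Return y(u) = z_a + z_b * (u - z_c) as a function of the cue u."""
    return lambda u: z_a + z_b * (u - z_c)

def to_intercept_slope(z_a, z_b, z_c):
    """Map the three traits to the intercept and slope of the same line."""
    return z_a - z_b * z_c, z_b

# Two hypothetical genotypes: the second has z_a raised by one unit and
# z_c raised by 1/z_b units, so the two changes cancel exactly.
z_a, z_b, z_c = 2.0, 0.5, 1.0
genotype_1 = (z_a, z_b, z_c)
genotype_2 = (z_a + 1.0, z_b, z_c + 1.0 / z_b)

y_1 = reaction_norm_three_trait(*genotype_1)
y_2 = reaction_norm_three_trait(*genotype_2)
print(to_intercept_slope(*genotype_1), to_intercept_slope(*genotype_2))
print([y_1(u) - y_2(u) for u in (-2.0, 0.0, 3.5)])
```

The two genotypes differ in $z_a$ and $z_c$ but express identical phenotypes at every cue value, which is why the traits can only be told apart through their variances and covariances in a population.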

An essential difference between the two-trait and the three-trait reaction norm models relates to constraints in the evolution of the covariance between reaction norm intercept and slope in the population. To see this, it is instructive to consider a particular rescaling of this covariance, $u_0$, defined as the cue value for which phenotypic variance is at a minimum and where the covariance between the plastic phenotypic value $y(u)$ and reaction norm slope is zero. Given a phenotypic covariance between intercept and slope ($P_{\alpha\beta}$) and a variance in reaction norm slope ($P_{\beta\beta}$), this cue value is

$$u_0 = -\frac{P_{\alpha\beta}}{P_{\beta\beta}} \qquad (4)$$

(Appendix B).

From equation (4) we see that, in the two-trait model, where reaction norm intercept (*α*) and slope (*β*) are considered as traits, *u*_{0} is independent of the trait means, and directional selection on any of the traits will not affect *u*_{0} unless the selection also changes the variance of the slope or covariance of the traits.
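The definition of $u_0$ in the two-trait model can be verified directly. For $y(u) = \alpha + \beta u$, the phenotypic variance at cue value $u$ is $P_{\alpha\alpha} + 2uP_{\alpha\beta} + u^2P_{\beta\beta}$ and the covariance between $y(u)$ and the slope is $P_{\alpha\beta} + uP_{\beta\beta}$. The Python sketch below, with hypothetical (co)variance values, locates the minimum on a grid of cue values; note that no trait means enter the calculation.

```python
def phenotypic_variance(u, p_aa, p_ab, p_bb):
    """Variance of y(u) = alpha + beta * u across individuals."""
    return p_aa + 2.0 * u * p_ab + u * u * p_bb

def cov_phenotype_slope(u, p_ab, p_bb):
    """Covariance between y(u) and the slope beta."""
    return p_ab + u * p_bb

def u0_two_trait(p_ab, p_bb):
    """Cue value of minimum phenotypic variance, u0 = -P_ab / P_bb."""
    return -p_ab / p_bb

# Hypothetical phenotypic (co)variances of intercept and slope.
p_aa, p_ab, p_bb = 1.0, -0.3, 0.2
u0 = u0_two_trait(p_ab, p_bb)

# Brute-force check on a grid of cue values from -5 to 5.
grid = [i / 10.0 for i in range(-50, 51)]
best_u = min(grid, key=lambda u: phenotypic_variance(u, p_aa, p_ab, p_bb))
print(u0, best_u, cov_phenotype_slope(u0, p_ab, p_bb))
```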

On the other hand, in the three-trait model, the covariance between intercept and slope depends on the mean traits $\bar{z}_b$ and $\bar{z}_c$. Under the assumption of normal traits, $u_0$ then becomes

$$u_0 = \bar{z}_c + \frac{\bar{z}_b P_{bc} - P_{ab}}{P_{bb}} \qquad (5)$$

where $P_{bc}$, $P_{ab}$ and $P_{bb}$ are the elements of the phenotypic variance-covariance matrix indicated by the subscripts (Appendix B). Thus, in this quantitative genetic model, $u_0$ may respond directly to directional selection on both trait $z_b$ (if $P_{bc} \neq 0$) and trait $z_c$. If trait $z_b$ is independent of trait $z_a$ and $z_c$ (i.e., $P_{bc} = P_{ab} = 0$), $u_0$ becomes $\bar{z}_c$. Note also that $u_0$ is independent of $P_{ac}$.

Lande (2009) defined the cue $u$ ($\varepsilon_{t-\tau}$ in his model) to have its zero-point at $u_0$ as a “reference environment”. Hence, one could define the two-trait model analyzed by Lande (2009) for any arbitrary interval scaled cue variable as $y(u) = \alpha' + \beta(u - u_0)$, where the genetic correlation between the traits $\alpha'$ and $\beta$ is by necessity zero since $u_0$ is defined by $\operatorname{cov}(y(u_0), \beta) = \operatorname{cov}(\alpha', \beta) = 0$ (Appendix B; see also last paragraph on page 1438 in Lande (2009)). This model is structurally similar to our three-trait model except that the “reference environment” in our model is considered as an individual trait, $z_c$ (reflecting individual variation in cue “perception”), which is exposed to selection. Unlike in Lande’s (2009) model, where the definition of trait $\alpha'$ depends on $u_0$, there are no constraints on the phenotypic or genotypic covariances in our three-trait model (other than that the covariance matrix must be positive-definite). The two-trait model of Lande (2009) can only evolve in the same way as the three-trait model if $u_0$ is treated as the mean of an individual trait with variance different from zero. Hence, the three-trait quantitative genetics model and Lande’s (2009) two-trait model are not alternative parameterizations of the same model. Lande’s (2009) two-trait model is a constrained (nested) version of our more general three-trait model with the trait $z_c$ fixed to $u_0$, which requires that $P_{cc} = P_{ac} = P_{bc} = 0$ as well as $P_{ab} = 0$ ($P_{ab} = 0$ is only required to maintain the same definition of $z_a$ and $\alpha'$ and to give $u_0 = \bar{z}_c$).
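The dependence of $u_0$ on the trait means in the three-trait model can be checked by simulation. The Python sketch below assumes the form $u_0 = \bar{z}_c + (\bar{z}_b P_{bc} - P_{ab})/P_{bb}$, which follows from computing the covariance between the intercept $z_a - z_b z_c$ and the slope $z_b$ under multi-normal traits. It compares this value with $-\operatorname{cov}(\alpha, \beta)/\operatorname{var}(\beta)$ estimated from sampled individuals. The trait means, the matrix `p` and all function names are hypothetical.

```python
import math
import random

def cholesky3(p):
    """Lower-triangular Cholesky factor of a 3x3 positive-definite matrix."""
    l = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(p[i][i] - s)
            else:
                l[i][j] = (p[i][j] - s) / l[j][j]
    return l

def draw_traits(n, mean, p, seed=1):
    """Draw n trait vectors (z_a, z_b, z_c) from a multi-normal distribution."""
    rng = random.Random(seed)
    l = cholesky3(p)
    traits = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        traits.append(tuple(mean[i] + sum(l[i][k] * e[k] for k in range(i + 1))
                            for i in range(3)))
    return traits

def u0_three_trait(mean, p):
    """Assumed form of equation (5); trait order is (a, b, c). P_ac is not used."""
    return mean[2] + (mean[1] * p[1][2] - p[0][1]) / p[1][1]

def u0_empirical(traits):
    """-cov(intercept, slope) / var(slope) from sampled individuals."""
    alpha = [z_a - z_b * z_c for z_a, z_b, z_c in traits]
    beta = [z_b for _, z_b, _ in traits]
    n = len(traits)
    m_a, m_b = sum(alpha) / n, sum(beta) / n
    cov = sum((a - m_a) * (b - m_b) for a, b in zip(alpha, beta)) / (n - 1)
    var = sum((b - m_b) ** 2 for b in beta) / (n - 1)
    return -cov / var

# Hypothetical trait means and phenotypic covariance matrix.
mean = (10.0, 1.0, 2.0)
p = [[1.0, 0.1, 0.2],
     [0.1, 0.25, 0.05],
     [0.2, 0.05, 0.5]]
u0_theory = u0_three_trait(mean, p)
u0_mc = u0_empirical(draw_traits(100000, mean, p))
print(u0_theory, u0_mc)
```

Changing `mean[1]` or `mean[2]` moves both values, whereas in the two-trait model $u_0$ is fixed by the (co)variances alone.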

For further analysis, we define the ‘plastic component’ of the phenotype as the difference between the phenotypic value and the expected phenotype at cue value $u_0$, $y(u) - E[y(u_0)]$. This definition is not dependent on any particular reaction norm model or genetic architecture of the phenotypic plasticity, and the ‘plastic component’ can be estimated for any phenotypic observation when it is possible to estimate $E[y(u_0)]$. The expectation of the plastic component in a random environment becomes $\bar{\beta}(\mu_U - u_0)$, where the mean reaction norm slope $\bar{\beta}$ is interchangeable with $\bar{z}_b$ in the three-trait model. Note that we obtain the same expected value if we instead define the plastic component as $\beta(u - u_0)$. We will later show that expected $u_0$ at equilibrium in the three-trait model always becomes $\mu_U$, and hence the expected plastic component at equilibrium will always be zero.

### Evolution of linear reaction norms

Environmental change may lead to changes in any of the parameters of the joint distribution of cue ($U$) and the best possible phenotype ($\Theta$) (cf. eq. (1) and Figure 1). Any such change will impose directional selection on the individual traits defining the reaction norm, and the evolutionary response to this selection will depend on the additive genetic variances and covariances of these traits. We will here compare the evolution of linear reaction norms based on the three-trait model (eq. (3)) and the more constrained two-trait model (eq. (2)) analyzed in detail by Lande (2009). Specifically, we will analyze the transient and asymptotic evolution of the reaction norm distribution after a sudden and extreme concomitant change in both $\mu_U$ and $\mu_\Theta$, while $\sigma_U^2$, $\sigma_\Theta^2$ and $\rho$ remain unchanged. We assume that all individuals in each generation experience the same environment, and that the environments in subsequent generations are independent (as also in Lande’s (2009) analysis). Following Lande (2009) we also assume that trait variances and covariances remain constant under selection. Although this may be a particularly unrealistic assumption (Steppan et al. 2002), it serves the purpose of examining how reaction norms can evolve through changes in trait means only.

### Quantitative genetics - modeling

Assuming that the individual traits of the reaction norm (3) have a multi-normal distribution with a constant variance-covariance matrix in a population with discrete generations, the fundamental equation describing the change in the population mean of the traits from a generation $t$ to the next is

$$\Delta\bar{\mathbf{z}}_t = \bar{\mathbf{z}}_{t+1} - \bar{\mathbf{z}}_t = \mathbf{G}\boldsymbol{\beta}_t \qquad (6)$$

i.e., the product of the additive genetic variance-covariance matrix for the traits ($\mathbf{G}$) and the selection gradient $\boldsymbol{\beta}_t$, defined as the sensitivity of the logarithm of population mean fitness to changes in each of the mean trait values (Lande 1979, Lande and Arnold 1983),

$$\boldsymbol{\beta}_t = \frac{\partial \ln \bar{W}_t}{\partial \bar{\mathbf{z}}_t} \qquad (7)$$

We will assume a Gaussian fitness function with width $\omega$ and peak value $W_{max}$, and that all individuals experience the same environment in any generation.

A random individual in generation $t$ has phenotype $y_t(u_t) = z_{a,t} + z_{b,t}(u_t - z_{c,t})$, where the traits $[z_{a,t}, z_{b,t}, z_{c,t}]$ are drawn from a multi-normal distribution with mean $\bar{\mathbf{z}}_t$ and phenotypic covariance matrix $\mathbf{P}$. When the phenotypic expression that maximizes fitness in that generation is $\theta_t$, this individual will have fitness

$$W(y_t(u_t), \theta_t) = W_{max} \exp\!\left(-\frac{(y_t(u_t) - \theta_t)^2}{2\omega^2}\right) \qquad (8)$$

To find an analytical expression of the selection gradient (7), the standard approach (Lande and Arnold 1983, Lande 2009) would be to first find the population mean fitness by integrating over the phenotype distribution, $p(y_t(u_t))$,

$$\bar{W}_t = \int p(y_t(u_t))\, W(y_t(u_t), \theta_t)\, dy_t \qquad (9)$$

However, because $p(y_t(u_t))$ is not normal as it involves the product of the two normally distributed traits $z_{b,t}$ and $z_{c,t}$, it is not straightforward to solve this integral analytically. Indeed, it seems that an exact analytical expression for the selection gradient (7) does not exist. We therefore initially based our analysis on simulations of the evolutionary process (6) where the selection gradient (7) is computed numerically by simulating a population of 10,000 individuals at each generation (see Supporting Information S1 for R code). These simulations are accompanied by (and compared to) a mathematical analysis presented in Supporting Information S2.

In the simulation results presented in Figure 2, we used the same parameter values as in Lande’s (2009) analysis of the two-trait model except that we, for convenience, used a somewhat less extreme sudden change in the environment, with a change in $\mu_U$ and $\mu_\Theta$ of 3 (instead of 5) standard deviations of the background fluctuations ($\sigma_U$ and $\sigma_\Theta$ of equation (1)). Like Lande (2009), we used a diagonal $\mathbf{G}$-matrix and set $G_{cc}$ to half the cue variance (three-trait model) or zero (two-trait model). For simplicity, in the simulations we also assumed that only trait $z_a$ had a non-additive residual component with variance $\sigma_e^2$, such that $P_{aa} = G_{aa} + \sigma_e^2$, $P_{bb} = G_{bb}$ and $P_{cc} = G_{cc}$. The two-trait model is obtained simply by also setting $P_{cc} = 0$ and $\bar{z}_c = 0$.

### Quantitative genetics - results

Selection according to equations (6) to (9) will find equilibrium values of the mean traits that maximize $E[\ln(\bar{W})]$. Since it is not possible to find a general analytical solution, we present simulation results, in addition to approximate theoretical results and an analytical solution in the case of a constant environment (Supporting Information S2).
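The recursion described by equations (6) to (9) can be sketched in a few lines of Python. This is not the R code of Supporting Information S1, only a minimal illustration under stated assumptions: diagonal $\mathbf{G}$ and $\mathbf{P}$, a Gaussian fitness function with $W_{max} = 1$, a smaller population than the 10,000 individuals used in the paper, and hypothetical parameter values throughout. The selection gradient is estimated as the fitness-weighted average of the individual derivatives of $\ln W$, which equals the derivative of log mean fitness when a change in a mean trait shifts all individuals equally.

```python
import math
import random

def selection_gradient(z_bar, p_sd, u, theta, omega2, n, rng):
    """Estimate d ln(mean fitness) / d z_bar from n simulated individuals
    with independent normal traits (diagonal P)."""
    w_sum = 0.0
    g_sum = [0.0, 0.0, 0.0]
    for _ in range(n):
        z_a = rng.gauss(z_bar[0], p_sd[0])
        z_b = rng.gauss(z_bar[1], p_sd[1])
        z_c = rng.gauss(z_bar[2], p_sd[2])
        dev = z_a + z_b * (u - z_c) - theta        # y - theta
        w = math.exp(-dev * dev / (2.0 * omega2))  # Gaussian fitness, W_max = 1
        w_sum += w
        g_sum[0] += w * (-dev / omega2)              # d ln W / d z_a
        g_sum[1] += w * (-dev * (u - z_c) / omega2)  # d ln W / d z_b
        g_sum[2] += w * (dev * z_b / omega2)         # d ln W / d z_c
    return [g / w_sum for g in g_sum]

def simulate(generations, z_bar0, g_diag, p_diag, mu_u, mu_theta,
             sigma_u, sigma_theta, rho, omega2, n=1000, seed=1):
    """Iterate the mean traits with one random (u, theta) pair per generation."""
    rng = random.Random(seed)
    p_sd = [math.sqrt(v) for v in p_diag]
    z_bar = list(z_bar0)
    trajectory = [tuple(z_bar)]
    for _ in range(generations):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = mu_u + sigma_u * e1
        theta = mu_theta + sigma_theta * (rho * e1 + math.sqrt(1.0 - rho * rho) * e2)
        beta = selection_gradient(z_bar, p_sd, u, theta, omega2, n, rng)
        z_bar = [z + g * b for z, g, b in zip(z_bar, g_diag, beta)]  # diagonal G
        trajectory.append(tuple(z_bar))
    return trajectory

def mean_env_mismatch(z_bar, mu_u, mu_theta):
    """Mean phenotype in the mean environment minus the mean optimum."""
    return z_bar[0] + z_bar[1] * (mu_u - z_bar[2]) - mu_theta

# Hypothetical scenario: a population adapted to mu_U = mu_Theta = 0 experiences
# a shift of 3 standard deviations in both means.
MU_U, MU_THETA = 6.0, 6.0
trajectory = simulate(300, z_bar0=(0.0, 0.5, 0.0), g_diag=(0.5, 0.05, 2.0),
                      p_diag=(1.0, 0.05, 2.0), mu_u=MU_U, mu_theta=MU_THETA,
                      sigma_u=2.0, sigma_theta=2.0, rho=0.5, omega2=10.0)
print(trajectory[0], trajectory[-1])
```

Setting the third elements of `g_diag` and `p_diag` to zero gives the corresponding two-trait sketch, since $\bar{z}_c$ then stays at its initial value.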

The simulations show that immediately after the sudden environmental change, there is a rapid increase in reaction norm slope (Figure 2B), while $\bar{z}_c$ (Figure 2C) swings back in the opposite direction of the change in mean cue $\mu_U$ (i.e., away from the new optimum). This phase of the adaptation may be characterized as a “state of alarm” where it becomes adaptive to exaggerate the perception of the environmental change. As $\bar{z}_a$ moves towards the new optimum (Figure 2A), the reaction norm slope is reduced and $\bar{z}_c$ turns towards the new optimum. Eventually, $\bar{z}_c$ stabilizes around $\mu_U$ and $\bar{z}_a$ stabilizes around $\mu_\Theta$ (Figure 2D).

Since we used a diagonal phenotypic variance-covariance matrix ($\mathbf{P}$) in the simulations, the cue value $u_0$ that yields minimum phenotypic variance (eq. (5)) equals $\bar{z}_c$, which stabilizes around $\mu_U$ in the simulations (Figure 2C). Hence, equilibrium $u_0$ appears to be $\mu_U$. As shown both by simulations (Supporting Figures S1-S3) and theoretical considerations (Supporting Information S2), this property of the three-trait model holds also when $\mathbf{P}$ is not diagonal, i.e., at equilibrium, phenotypic variance is *always* at its minimum in the mean environment. As a result, the three-trait model leads to complete genetic assimilation in the sense that the expected plastic component (defined above) at equilibrium becomes zero, $\bar{z}_b(\mu_U - u_0) = 0$. In contrast, in the two-trait model, the expected plastic component is zero only when $\mu_U = -P_{\alpha\beta}/P_{\beta\beta}$, and the phenotypic variance can only be minimized in this mean environment (see equation (4)). This contrast in the asymptotic state of the systems obtained from the two alternative reaction norm models is illustrated in Figure 3. Figure 4 shows the trajectories of phenotypic variation and the expected plastic component in the simulated scenario presented in Figure 2. Supporting Figures S4 and S5 show simulation results for a scenario where there is no environmental variation before and after the sudden environmental change (more similar to classic examples of genetic assimilation).

Interestingly, as seen in Figure 2B, the mean reaction norm slope in the three-trait model, $\bar{z}_b$, stabilizes at a lower level than the optimal slope yielding the highest expected fitness of an individual, $\rho\sigma_\Theta/\sigma_U$ (see eq. (1)), which is also the equilibrium mean slope in the two-trait model (Gavrilets and Scheiner 1993a, Lande 2009). Intuitively, this is because the optimal value of trait $z_b$ of an individual depends on the value of trait $z_c$ that this individual possesses, which is stochastic. As it is not straightforward to calculate equilibrium mean traits in the three-trait model (see Supporting Information S2), we first investigated this by calculating the mean trait values that yield maximum expected logarithm of fitness, $E[\ln(W)]$, for a *random* individual in the population (Appendix C).
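The effect of variation in cue perception on the slope that maximizes $E[\ln(W)]$ for a random individual can be illustrated with a small Monte Carlo sketch in Python. It rests on simplifying assumptions that are ours for this example: uncorrelated traits, mean $z_a$ equal to the mean optimum and mean $z_c$ equal to the mean cue (both set to zero), and hypothetical parameter values. With a Gaussian fitness function, maximizing $E[\ln(W)]$ is the same as minimizing the mean of $(y - \theta)^2$, which is a least squares problem in the mean slope with solution $\operatorname{cov}(U - z_c, \Theta)/E[(U - z_c)^2]$.

```python
import math
import random

def best_mean_slope(n, sigma_u, sigma_theta, rho, p_aa, p_bb, p_cc, seed=1):
    """Mean slope minimizing the average of (y - theta)^2 over n random
    individuals, each experiencing its own draw of (u, theta)."""
    rng = random.Random(seed)
    sd_a, sd_b, sd_c = math.sqrt(p_aa), math.sqrt(p_bb), math.sqrt(p_cc)
    num = 0.0
    den = 0.0
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = sigma_u * e1
        theta = sigma_theta * (rho * e1 + math.sqrt(1.0 - rho * rho) * e2)
        z_a = rng.gauss(0.0, sd_a)   # deviation of z_a from the mean optimum
        e_b = rng.gauss(0.0, sd_b)   # individual deviation from the mean slope
        z_c = rng.gauss(0.0, sd_c)   # deviation of z_c from the mean cue
        d = u - z_c                  # cue as perceived by this individual
        # y = z_a + (m + e_b) * d; least squares solution for the mean slope m
        num += d * (theta - z_a - e_b * d)
        den += d * d
    return num / den

# Hypothetical values: without (P_cc = 0) and with (P_cc = 2) perception variance.
slope_two_trait = best_mean_slope(100000, 2.0, 2.0, 0.5, 1.0, 0.05, 0.0)
slope_three_trait = best_mean_slope(100000, 2.0, 2.0, 0.5, 1.0, 0.05, 2.0)
print(slope_two_trait, slope_three_trait)
```

With these values the first slope should be close to $\rho\sigma_\Theta/\sigma_U = 0.5$, and the second should be clearly shallower because the variance of the perceived cue is larger than the variance of the cue itself.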

Under the assumption that *P*_{ac} = *P*_{bc} = 0, we obtain expressions for the three mean traits (Appendix C) that are close to the stationary means in the simulations (Figure 2). For comparison, the equilibrium mean traits in the two-trait model become the optimal intercept and slope of equation (1) (Gavrilets and Scheiner 1993a, Lande 2009). Note that the denominator in the expression for the mean slope in the three-trait model is the variance of (*U* – *z*_{c}) and not the variance of the cue *U* alone as in the corresponding expression in the two-trait model; i.e., genetic variance in the perception trait *z*_{c} inflates the variance of the perceived cue (*U* – *z*_{c}). Hence, if *P*_{ac} = 0, this mean slope is always lower than the optimal slope in equation (1) unless *P*_{cc} = 0 (which gives the two-trait reaction norm model). This is indicated by a stippled reaction norm in Figure 1. Note again, however, that selection will maximize mean fitness in the population and not E[ln(*W*)] for a random individual. Nevertheless, as shown in Figure 2 and Supporting Information S2, maximizing mean fitness when *P*_{ac} = *P*_{bc} = 0 also leads to the same mean values of *z*_{a} and *z*_{c} as those that maximize E[ln(*W*)], but the equilibrium mean slope equals the mean slope above only in a constant environment. Note that these mean values of *z*_{a} and *z*_{c} are independent of the variances and covariance of *U* and Θ when *P*_{ac} = *P*_{bc} = 0. In Supporting Information S2 we conjecture that the equilibrium means of *z*_{a} and *z*_{c} are affected by the variances and covariance of *U* and Θ only indirectly through the mean slope (but when *P*_{ac} = *P*_{bc} = 0, they are independent of the mean slope, and hence also of these variances and covariance).

As seen in Figure 2B, the asymptotic mean slope in the simulations (where *P*_{ac} = 0) is close to but somewhat larger than the mean slope that maximizes E[ln(*W*)]. This discrepancy is to be expected for two reasons. First, as pointed out above, the mean trait that maximizes E[ln(*W*)] for a random individual is not identical to the equilibrium mean trait where the selection gradient (7) is zero, i.e. where mean fitness is maximized (Supporting Information S2). As shown in Supporting Information S2, the equilibrium mean reaction norm slope can be approximated analytically if we assume that the plastic phenotype *y*(*u*) has a normal distribution, which is very nearly the case with the parameter values in our simulations in Figure 2. The integral (9) then has an analytical solution, and as a result an approximate equilibrium slope can be found numerically from equation (10) (assuming *P*_{ac} = *P*_{bc} = 0), where the large parameter values used in the simulations, especially that of *ω*^{2}, make the second term positive but small compared to the first term, which is equal to the mean slope that maximizes E[ln(*W*)]. Note that the first and dominant term in equation (10) is found by maximization of E[ln(*W*)], without the assumption of a normal plastic phenotype (Appendix C).
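The attenuation of the mean slope described above has the same structure as regression dilution: if the optimum Θ is regressed on the perceived cue (*U* – *z*_{c}) rather than on *U*, variance in *z*_{c} adds to the denominator. The following is a minimal sketch of that reading, assuming that *z*_{c} is independent of *U* and Θ and that all variables are normal. The slope formula cov(*U*, Θ)/(var(*U*) + *P*_{cc}) is our interpretation of the verbal description above, not a quotation of the omitted expression, and the function and argument names are hypothetical.

```python
import random


def perceived_cue_slope(cov_u_theta, var_u, p_cc):
    """Least-squares slope of the optimum (theta) on the perceived cue (U - z_c).

    Assumes z_c is independent of U and theta, so that
    var(U - z_c) = var_u + p_cc and cov(U - z_c, theta) = cov_u_theta.
    """
    return cov_u_theta / (var_u + p_cc)


def simulated_slope(cov_u_theta, var_u, var_theta, p_cc, n=100_000, seed=1):
    """Monte Carlo estimate of the same slope from simulated normal draws."""
    rng = random.Random(seed)
    b = cov_u_theta / var_u                       # slope of theta on the true cue
    resid_sd = (var_theta - b * cov_u_theta) ** 0.5
    sx = sy = sxx = sxy = 0.0
    for _ in range(n):
        u = rng.gauss(0.0, var_u ** 0.5)
        theta = b * u + rng.gauss(0.0, resid_sd)
        x = u - rng.gauss(0.0, p_cc ** 0.5)       # cue as perceived by a random individual
        sx += x
        sy += theta
        sxx += x * x
        sxy += x * theta
    mean_x, mean_y = sx / n, sy / n
    return (sxy / n - mean_x * mean_y) / (sxx / n - mean_x ** 2)


if __name__ == "__main__":
    print(perceived_cue_slope(0.5, 1.0, 0.0))   # no variation in perception
    print(perceived_cue_slope(0.5, 1.0, 1.0))   # perception variance halves the slope
    print(simulated_slope(0.5, 1.0, 1.0, 1.0))  # simulation, close to the line above
```

With *P*_{cc} = 0 the function returns the slope of equation (1), and any positive *P*_{cc} lowers it, which is the qualitative pattern reported for Figure 2B.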

The second reason for the discrepancy between the asymptotic mean slope in the simulations and the mean slope that maximizes E[ln(*W*)] is that when the population under directional selection based on equation (6) evolves towards a stationary state, the mean traits will fluctuate around the equilibrium because of the influence from the random inputs *u*_{t} and *θ*_{t} (as seen in Figure 2). In stationarity, each mean trait can thus be written as a stationary mean plus a random deviation (*v*_{a}, *v*_{b} and *v*_{c}, with *E*[*v*_{a}] = 0 etc.), and, as shown in Supporting Information S2, the variances and covariances of *v*_{a}, *v*_{b} and *v*_{c} then enter into equation (10). Note that we assume that *u*_{t} and *θ*_{t} have zero autocorrelation, such that the covariances between the mean reaction norm parameters and the environment caused by adaptive tracking (Tufto 2015) can be neglected.

Because the reaction norm slope is influenced by the phenotypic variance of the cue reference trait *z*_{c} (and its covariance with the other traits; eq. (10)), and hence deviates from the slope that maximizes fitness (eq. (1)), the expected fitness at equilibrium will be lower than the expected fitness of the optimal individual reaction norm in equation (1) (Figure 5, lower right panel). As a consequence, a proportion of the population will have a higher expected fitness than an individual with mean trait values. Nevertheless, mean fitness in the population after the environmental change stabilizes around a higher level in the three-trait model than in the two-trait model (Figure 5, left panels), despite a lower expected fitness at mean trait values (right panels). The reason for this is that the three-trait model gives a lower phenotypic variance in the new environment (Figure 4A). Mean fitness in the two-trait model thus stabilizes around the optimum *only* when the mean cue is zero because phenotypic variance will not be minimized in other environments (Figure 5, left panels).

## Discussion

Quantitative genetics models are theoretical models for the joint evolution of population means of quantitative individual phenotypic traits, where the researchers define the traits that they find most meaningful in the context in which they are studied. In quantitative genetics models of reaction norms where plastic phenotypes are modeled as a linear function of an interval-scaled environmental cue, the reaction norm intercept and slope are often considered as individual traits subjected to selection (Gavrilets and Scheiner 1993b, Scheiner 1993, de Jong and Gavrilets 2000, Tufto 2000, Lande 2009, Scheiner 2013, Tufto 2015). The intercept of such a reaction norm (i.e., the reaction norm value at cue value zero) is often not very biologically meaningful since this trait, as well as its variance and covariance with other traits, depends on the defined zero-point, or “reference cue”, of the (arbitrary) interval-scaled cue variable. One may, however, as in Lande (2009), define the zero-point of the cue to be the mean cue value to which the population is adapted. This ensures that the variance of the plastic phenotype is minimized in the mean environment, which is theoretically plausible (Bürger 2000, Lande 2009, Le Rouzic et al. 2013), but it is not clear how this “reference cue” may evolve (in Lande’s (2009) analysis it is assumed to remain constant; see however de Jong and Gavrilets 2000).
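The dependence of the intercept on the zero-point of the cue can be made explicit with a short worked example. As a sketch, suppose the cue is re-expressed as u' = u − s for an arbitrary shift s (a symbol introduced here for illustration), and write the variances and covariance of intercept and slope as P_{αα}, P_{ββ} and P_{αβ}, the notation of Appendix B.

```latex
\begin{align}
y &= \alpha + \beta u = (\alpha + \beta s) + \beta u' \equiv \alpha' + \beta u', \\
\operatorname{var}(\alpha') &= P_{\alpha\alpha} + 2 s P_{\alpha\beta} + s^{2} P_{\beta\beta}, \\
\operatorname{cov}(\alpha', \beta) &= P_{\alpha\beta} + s P_{\beta\beta}.
\end{align}
```

Both the variance of the intercept and its covariance with the slope therefore change with s, while the slope and its variance do not. The covariance vanishes for the particular choice s = −P_{αβ}/P_{ββ}, which is the cue value at which the phenotypic variance is smallest.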

We have here suggested that the “reference cue” can be considered as an individual trait that reflects genetic variation in cue “perception” in a general sense, and hence considered a linear reaction norm of the form *y*(*u*) = *z*_{a} + *z*_{b}(*u* – *z*_{c}). In this model, the biological meaning of all the traits, and their variances and covariances, is not modified when redefining the zero-point of the cue variable *u* (which is not the case for the intercept *α* = *z*_{a} − *z*_{b}*z*_{c}, var(*α*) and cov(*α*, *z*_{b})). The three traits in this model reflect three fundamentally different genetic effects on linear reaction norms. While *z*_{b} represents genetic effects on cue sensitivity, *z*_{c} reflects genetic effects on cue “perception” (in the general sense discussed below) and has the same scale as the environmental cue, and *z*_{a} represents genetic effects that are both independent of the cue value and invariant to its defined zero-point (the latter is not the case for the intercept). These structural differences in the reaction norm models matter for the equilibrium mean reaction norms (and distributions) because the traits do not have independent effects on the plastic phenotype (*y*(*u*)) (note the product *z*_{b}*z*_{c} in the three-trait model).

In our analysis, we have shown that the cue value where variance of the plastic phenotype is minimized (where reaction norms “tend to cross”; *u*_{0}) always evolves to equal the mean environment at equilibrium. This means that the expected plastic component of the phenotype, defined as the difference between the value of the plastic phenotype and the phenotype at *u*_{0}, always becomes zero at equilibrium in a stationary stochastic environment (i.e., *E*[*y*(*u*) – *y*(*u*_{0})] = 0 at equilibrium). Hence, genetic assimilation, as defined in the Introduction, will always be completed at equilibrium in our three-trait model regardless of the mean environment, and without assuming any cost of maintaining plasticity (DeWitt et al. 1998, West-Eberhard 2003, Pigliucci et al. 2006, Lande 2009, Bateson and Gluckman 2011, Svennungsen et al. 2011) or any change in the variances or covariances of our defined traits (de Jong and Gavrilets 2000). Even though *u*_{0} may be interpreted as ‘–*cov*(intercept, slope)/*var*(slope)’, we find *u*_{0} biologically more meaningful than the covariance between reaction norm slope and a somewhat arbitrarily defined intercept trait. Note that *u*_{0} is a population level parameter that does not depend on any quantitative genetic model for the linear reaction norm, and which can easily be estimated (as discussed below). Further, our analysis also demonstrates that the equilibrium mean reaction norm slope in the three-trait model will deviate from the optimal slope yielding the highest expected fitness of a hypothetical individual that can tune reaction norm intercept and slope accurately and independently (eq. (1)), which is also the equilibrium mean slope of the two-trait model (Gavrilets and Scheiner 1993a, Lande 2009). At least when there is weak correlation between *z*_{a} and *z*_{c} (i.e., *P*_{ac} is sufficiently small), the mean slope should be lower than the optimal individual slope. Intuitively, this is because the optimal slope is lower when the cue reference trait of a random individual, in addition to the environmental cue, is stochastic (see Appendix C). As a consequence, maximum expected fitness does not occur at the mean trait values in the population.

In the three-trait model, phenotypic variance in a given environment increases with both the mean slope and the distance between the mean cue reference trait and the environmental cue (*u*), at least when the traits are independent (see equation S2-3 in Supporting Information S2), whereas in the two-trait model, phenotypic variance is independent of the trait means. In our simulations, after the sudden environmental change, there is a rapid initial increase in both the mean slope and the distance between the mean cue reference trait and the new mean cue value (i.e., the mean cue reference trait initially evolves rapidly in the *opposite* direction of the change in the environmental cue, such that the perception of the environmental change is exaggerated). Hence, due to the positively interacting (epistatic) effects of *z*_{b} and *z*_{c} on the plastic phenotype *y*(*u*), this efficiently increases phenotypic variance in the new environment, which enhances the evolvability of the plastic phenotypic character and acts to restore population mean fitness (see Figure 2 and Figure 5). The subsequent process of assimilation, whereby the reaction norm slope is reduced, the mean cue reference trait moves towards the mean cue value, and the mean of *z*_{a} evolves towards mean Θ, is a much slower process.
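To make the dependence of phenotypic variance on the trait means concrete, the sketch below evaluates the variance of *y*(*u*) = *z*_{a} + *z*_{b}(*u* – *z*_{c}) under the assumption that the three traits are mutually independent. The closed form is derived here from the moments of a product of independent variables; it is not copied from equation S2-3, and the function and argument names are hypothetical.

```python
def three_trait_variance(u, mean_b, mean_c, p_aa, p_bb, p_cc):
    """Variance of y(u) = z_a + z_b * (u - z_c) for mutually independent traits.

    var(z_b * (u - z_c)) = E[z_b^2] * E[(u - z_c)^2] - (E[z_b] * E[u - z_c])^2,
    which expands to the three trait-dependent terms below.
    """
    d = u - mean_c  # mean perceived cue
    return p_aa + mean_b ** 2 * p_cc + d ** 2 * p_bb + p_bb * p_cc


def two_trait_variance(u, p_alpha, p_beta, p_alpha_beta):
    """Variance of y(u) = alpha + beta * u; it does not involve the trait means."""
    return p_alpha + 2.0 * u * p_alpha_beta + u ** 2 * p_beta


if __name__ == "__main__":
    base = three_trait_variance(0.0, 1.0, 0.0, 1.0, 0.5, 2.0)
    farther = three_trait_variance(2.0, 1.0, 0.0, 1.0, 0.5, 2.0)  # cue farther from mean z_c
    steeper = three_trait_variance(0.0, 2.0, 0.0, 1.0, 0.5, 2.0)  # larger mean slope
    print(base, farther, steeper)
```

Under these assumptions the variance grows with the squared mean slope (through *P*_{cc}) and with the squared distance between the cue and the mean of *z*_{c} (through *P*_{bb}), which is the pattern described in the paragraph above.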

### Genetic effects on linear reaction norms

Although a shift in the reaction norm along the cue-axis (through trait *z*_{c}) can have exactly the same effect on the individual reaction norm as a shift along the phenotype-axis (through trait *z*_{a}), the genetic bases for these effects are fundamentally different, and, as explained above, changes in the means of these two traits have different effects on the population. It also seems obvious that there will often be genetic variation in both these traits.

Phenotypic plasticity involves complex pathways, at both organismal and cell levels, from perception of environmental cues and physiological transduction to phenotypic expression (reviewed in Sultan and Stearns 2005). Depending on the type of organism and the nature of the phenotypic characters and the environmental cues, these pathways may, to varying degrees, involve sensory systems, neuroendocrine and metabolic systems, cellular reception, gene regulation networks, and other developmental, physiological and behavioral processes. Environmental conditions may directly affect any of these systems and processes, not just the sensory systems (e.g., temperature may directly affect metabolism and gene regulation in ectothermic organisms (Gillooly et al. 2002, Ellers et al. 2008), and various processes may be affected by food constituents (Sanders et al. 1981, Meek et al. 1995, Krol et al. 2012) and nutritional state (Lõmus and Sundström 2004, Rui 2013, Mueller et al. 2015)). Genetic variation in upstream (i.e., close to the cue perception) regulatory processes, which may involve cue activation thresholds for transduction elements, may affect the way the environment is “perceived” (in a general sense) by the organism, and hence the cue reference trait (trait *z*_{c}) in our model. Genetic variation in downstream processes close to the phenotypic expression of quantitative characters, on the other hand, may affect the degree of up/down regulation in response to given levels (and types) of transduction elements and hence the slope of linear reaction norms (trait *z*_{b} in our model). Finally, some genetic variation may have the same additive effect on the phenotype irrespective of the environmental cue (trait *z*_{a} in our model). The importance of differentiating between these three traits may be better appreciated when considering the effects of the mean traits on the population: a change in the mean of *z*_{c} will change the cue value at which different genotypic reaction norms tend to cross (*u*_{0}), whereas a change in the mean of *z*_{a} will not.

While there is ample evidence for widespread genetic variation for reaction norms in natural populations (Falconer and Mackay 1996, Sultan and Stearns 2005, Sengupta et al. 2015), there are not many examples where the full pathway of phenotypic plasticity from cue perception to phenotype expression is known in great detail (Sultan 2010, Morris and Rogers 2014), and even less is known about the genetic variation of the different elements of these pathways. It seems, however, obvious that there may be substantial genotypic variation in perception of (and not just responses to) environmental cues (i.e., variation in trait *z*_{c} in our model). Examples indicating genetic variation in environmental perception include substantial among-population variation in the signal transduction pathway of induced plant defense in *Arabidopsis thaliana* (Kliebenstein et al. 2002). Individual variation in systemic stress responses also likely has components of individual variation in what is perceived as stressful (Hoffmann and Parsons 1991, Badyaev 2005, Dingemanse et al. 2010). There is also considerable variation and “fine tuning” in light (and shading) perception systems involving phytochromes that are sensitive to different wavelengths in plants (Smith 1990, 1995, Schlichting and Smith 2002).

### Predictions and empirical evaluations

Parameters in a reaction norm function considered as quantitative traits are always latent in the sense that one cannot measure their phenotypic value by a single measurement of an individual (except for traits that are defined for a particular environment, such as an intercept). While one may estimate reaction norm intercept and slope from multiple measurements of the same genotype or related individuals with known genealogy (Nussey et al. 2007; Martin et al. 2011), such data alone do not provide enough information to separate the traits *z*_{a} and *z*_{c} (from a statistical point of view, the three-trait model fitted to such data is over-parameterized, which may be one of the reasons it has not previously been considered; note however that the three-trait model predicts a different phenotypic distribution than the two-trait model due to the product *z*_{b}*z*_{c}). Nevertheless, if one has a detailed understanding of the physiological (or developmental) mechanisms of the plastic response, one may still be able to estimate meaningful reaction norm traits beyond a phenomenological ‘intercept’ and ‘slope’, including traits associated with cue perception (trait *z*_{c}) in the sense discussed above. Time-series data from selection experiments may also provide information about the genetic architecture of the reaction norms (Fuller et al. 2005).

The cue value that gives minimum phenotypic variation in the population (*u*_{0}) may be estimated by fitting data on genotype specific phenotypic measurements to mixed-effects linear models with random individual slopes and intercepts (Martin et al. 2011, Bates et al. 2014), or from a random regression “animal model” building on a known relatedness among individuals (Nussey et al. 2007). Our three-trait quantitative genetics model gives certain predictions about the evolution of *u*_{0} under environmental change. Our analysis shows that the mean cue reference trait, and hence *u*_{0} (eq. (5)), will respond rapidly to changes in the mean environment (provided sufficient additive genetic variation). Whenever there is selection for increased plasticity (i.e., selection for a higher mean slope), it also becomes adaptive to exaggerate the perception of the environmental change, and *u*_{0} will swing away in the *opposite* direction of the change in the mean cue during a “phase of alarm” (see Figure 2). Later, *u*_{0} will move towards, and eventually fluctuate around, the new cue value. In contrast, under the two-trait model *u*_{0} will not change in response to changes in the mean cue values.
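As an illustration of how *u*_{0} could be computed once intercepts and slopes are available, the sketch below applies *u*_{0} = –cov(intercept, slope)/var(slope) to a set of genotype-level reaction norms. It assumes that the intercepts and slopes are known without error; with real data the variance and covariance would instead be taken from the random-effects covariance matrix of a fitted mixed model, since raw per-genotype estimates also contain sampling error. The function names and the example numbers are hypothetical.

```python
def crossing_cue(intercepts, slopes):
    """Return u0 = -cov(intercept, slope) / var(slope) across genotypes."""
    n = len(slopes)
    mean_a = sum(intercepts) / n
    mean_b = sum(slopes) / n
    cov_ab = sum((a - mean_a) * (b - mean_b)
                 for a, b in zip(intercepts, slopes)) / (n - 1)
    var_b = sum((b - mean_b) ** 2 for b in slopes) / (n - 1)
    return -cov_ab / var_b


def variance_at(u, intercepts, slopes):
    """Among-genotype variance of the plastic phenotype y(u) = intercept + slope * u."""
    values = [a + b * u for a, b in zip(intercepts, slopes)]
    mean_y = sum(values) / len(values)
    return sum((v - mean_y) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    # Four illustrative genotypes whose reaction norms all pass through (u, y) = (3, 5).
    slopes = [0.5, 1.0, 1.5, 2.0]
    intercepts = [5.0 - 3.0 * b for b in slopes]
    u0 = crossing_cue(intercepts, slopes)
    print(u0, variance_at(u0, intercepts, slopes))
```

In this constructed example the reaction norms cross exactly, so the among-genotype variance is zero at *u*_{0}; with real data the reaction norms only “tend to cross” and the variance at *u*_{0} is a minimum rather than zero.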

### Future directions

In this paper we have made a number of simplistic, but quite standard, assumptions, including interval scaled cues and phenotypes, Gaussian fitness with constant width and peak, lack of density and frequency dependence, random mating, discrete generations where all individuals are exposed to the same environment (e.g. no spatial heterogeneity), and uncorrelated environments from one generation to the next. These assumptions may be modified or relaxed in future developments. In particular, the two-trait model has been used in theoretical studies involving within-generation heterogeneity (de Jong and Gavrilets 2000, Tufto 2000, Scheiner 2013, Tufto 2015). We suggest that these studies may be developed by including a cue reference trait in the linear reaction norms (our three-trait model). The models may also be modified by incorporating different reaction norm shapes. Notably, de Jong and Gavrilets (2000) allowed the genetic covariance between reaction norm intercept and slope, as well as their variances, to evolve through selection on allelic pleiotropy. It would be interesting to repeat their approach on our three-trait model to investigate the relative contributions (and synergies) of the evolution of trait means and trait variances and covariances.

Several authors have assumed flexible polynomial reaction norms with the polynomial coefficients considered as traits (Gavrilets and Scheiner 1993a, b, Scheiner 1993, Via et al. 1995). We suggest that such rather phenomenological reaction norm models may be modified by basing the polynomial expressions on how the environmental cues are perceived by the individuals (*u* − *z*_{c}) rather than on the environmental cue variable itself (*u*), although the developmental or behavioral mechanistic basis for higher order terms may not be clear.

Regardless of the reaction norm shape, we argue that it is essential to distinguish genetic variation in how the environmental cues are perceived from other genetic variation affecting the reaction norm distribution in the population. We suggest that future developmental and behavioral studies pay more attention to genetic variation in environment perception and transduction, and that the contributions of such genetic variation to phenotypic variation in natural environments are evaluated.

## Acknowledgements

We thank Arnaud Le Rouzic, Samuel M. Scheiner, Thomas F. Hansen and Øistein H. Holen for comments on an earlier version of the manuscript; we greatly appreciate their input and advice but may still disagree on some issues. This work made use of the Abel computing cluster, owned by the University of Oslo and the Norwegian meta-centre for High Performance Computing (NOTUR). We are thankful for technical support from the Research Computing Services group at USIT, University of Oslo.

## Appendix A: Optimal reaction norms

The aim here is to find the optimal reaction norm that maximizes E[ln(*W*)], irrespective of any genetic model. We assume that fitness *W* for a given phenotype *y* is Gaussian with constant width *ω* and peak value *W*_{max}, such that

*W* = *W*_{max} exp(−(*y*(*U*) − Θ)^{2}/(2*ω*^{2})),

where *U* is the environmental cue and Θ is the best possible phenotype under perfect information. We thus maximize E[ln(*W*)] by minimizing the criterion function *J* = *E*[(*y*(*U*) − Θ)^{2}], where both *U* and Θ are random variables with a joint distribution (not necessarily normal).

A linear reaction norm may be described as *y*(*U*) = *α* + *βU*. To find the intercept (*α*) and slope (*β*) that minimize *J* we first expand *J* and then solve for *α* and *β*. Using *E*[*y*(*U*)^{2}] = var(*y*(*U*)) + E[*y*(*U*)]^{2}, etc., *J* can be written in terms of the means, variances and covariance of *y*(*U*) and Θ.

Substituting the mean and variance of *y*(*U*) implied by the linear reaction norm, and then setting the partial derivatives of *J* with respect to *α* and *β* to zero and solving for *α* and *β*, we find the optimal intercept and slope, which correspond to equation (1) in the main text.
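The steps of this appendix can be written out as a short worked derivation. The symbols σ_U^2, σ_Θ^2 and σ_{UΘ} for the variances and covariance of *U* and Θ, and the subscript "opt", are notation assumed here for illustration; μ_U and μ_Θ denote the means.

```latex
\begin{align}
J &= E\big[(\alpha + \beta U - \Theta)^2\big]
   = \operatorname{var}(\alpha + \beta U - \Theta)
     + \big(E[\alpha + \beta U - \Theta]\big)^2 \\
  &= \beta^2 \sigma_U^2 - 2\beta\,\sigma_{U\Theta} + \sigma_\Theta^2
     + (\alpha + \beta\mu_U - \mu_\Theta)^2 , \\
\frac{\partial J}{\partial \alpha}
  &= 2(\alpha + \beta\mu_U - \mu_\Theta) = 0
  \;\Rightarrow\; \alpha_{\mathrm{opt}} = \mu_\Theta - \beta_{\mathrm{opt}}\,\mu_U , \\
\frac{\partial J}{\partial \beta}
  &= 2\beta\sigma_U^2 - 2\sigma_{U\Theta}
     + 2\mu_U(\alpha + \beta\mu_U - \mu_\Theta) = 0
  \;\Rightarrow\; \beta_{\mathrm{opt}} = \frac{\sigma_{U\Theta}}{\sigma_U^2} .
\end{align}
```

This is the ordinary least-squares regression of Θ on *U*: the optimal slope is the covariance of the optimum and the cue divided by the variance of the cue, and the optimal reaction norm passes through the point (μ_U, μ_Θ).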

## Appendix B: Cue value *u*_{0} where phenotypic variance is minimum and covariance between reaction norm slope and phenotype is zero

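Assume a population of linear reaction norms of the form *y*(*u*) = *α* + *βu*, where *α* and *β* are traits with phenotypic variances *P*_{αα}, *P*_{ββ} and covariance *P*_{αβ}. The variance in the plastic phenotype at a given cue value, *u*, is then

var(*y*(*u*)) = *P*_{αα} + 2*uP*_{αβ} + *u*^{2}*P*_{ββ}.

Minimization by setting the derivative with respect to *u* to zero gives the result

*u*_{0} = −*P*_{αβ}/*P*_{ββ}.

We find the same from cov(*β*, *y*(*u*_{0})) = *P*_{αβ} + *u*_{0}*P*_{ββ} = 0.

In the three-trait genetic model (3), on the other hand, the reaction norm intercept and slope are *α* = *z*_{a} − *z*_{b}*z*_{c} and *β* = *z*_{b}, respectively. The covariance *P*_{αβ} then involves the covariance between *z*_{b} and the product *z*_{b}*z*_{c}, which for normally distributed traits follows from Isserlis’ (1918) theorem, so that *P*_{αβ} can be expressed in terms of the trait means and the elements of the phenotypic covariance matrix.

Since *β* = *z*_{b} and thus *P*_{ββ} = *P*_{bb}, the result above applied to the three-trait model thus gives *u*_{0} as a function of the mean traits and these variances and covariances.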
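The three-trait step can be sketched explicitly as follows. The bars denoting trait means are notation assumed here, and the traits are taken to be jointly normal, so that third central moments vanish; the final line is our own derivation from the definitions in this appendix rather than a quotation of the omitted expression.

```latex
\begin{align}
P_{\alpha\beta} &= \operatorname{cov}(z_a - z_b z_c,\; z_b)
   = P_{ab} - \operatorname{cov}(z_b z_c,\; z_b), \\
\operatorname{cov}(z_b z_c,\; z_b) &= \bar{z}_c P_{bb} + \bar{z}_b P_{bc}, \\
u_0 &= -\frac{P_{\alpha\beta}}{P_{\beta\beta}}
   = \bar{z}_c + \frac{\bar{z}_b P_{bc} - P_{ab}}{P_{bb}} .
\end{align}
```

In this form a change in the mean of *z*_{c} shifts *u*_{0} by the same amount, whereas the mean of *z*_{a} does not enter at all.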

## Appendix C: Mean traits that maximize E[ln(*W*)] for a random individual

We here apply the same procedure as in Appendix A to find the mean trait values in the population that maximize E[ln(*W*)] for a random individual for which the traits are stochastic quantities. Starting with our linear three-trait model, *y* = *z*_{a} + *z*_{b}(*U* − *z*_{c}), where the traits *z*_{a}, *z*_{b} and *z*_{c} are normally distributed with phenotypic covariance matrix ***P***, we expand the criterion function *J* using the second moments of the traits (their means, variances and covariances). The third- and fourth-order moments of the traits that appear in the expansion follow from Isserlis’ (1918) theorem (given normal traits). Further assuming that the traits are uncorrelated with both Θ and *U*, we obtain *J* as a function of the mean traits.

In order to find the mean traits that maximize E[ln(*W*)], we now minimize *J* by setting its partial derivatives with respect to the three mean traits to zero.

With *P*_{bb} > 0 these equations can be solved for all three mean traits.

With *P*_{bb} = *P*_{ab} = *P*_{bc} = 0 and hence constant *z*_{b} (i.e., “Baldwin effect” sensu Lande (2009)) we can only show that the point defined by the means of *z*_{c} and *z*_{a} is located on the straight line with slope *z*_{b} going through the point [*μ*_{U}, *μ*_{Θ}]. The point is thus forced towards the solution above only when *P*_{bb} > 0.

The traditional two-trait model is a special case of the three-trait model in which *z*_{c} is constant (*P*_{cc} = 0, *P*_{ac} = 0 and *P*_{bc} = 0). Solving the corresponding equations for the two remaining mean traits gives a solution that is identical to the optimal intercept and slope derived in Appendix A, as well as the asymptotic trait means found by Lande (2009).
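As a worked illustration of the procedure in this appendix, consider the simplest case in which the three traits are mutually independent (*P*_{ab} = *P*_{ac} = *P*_{bc} = 0) and independent of *U* and Θ. This is a stronger assumption than the one used in the main text, and the symbols σ_U^2, σ_Θ^2 and σ_{UΘ} and the bars denoting trait means are notation assumed here. The expressions are our own derivation under these assumptions and are not quoted from the omitted equations.

```latex
\begin{align}
J &= E\big[(z_a + z_b(U - z_c) - \Theta)^2\big] \\
  &= \big(\bar{z}_a + \bar{z}_b(\mu_U - \bar{z}_c) - \mu_\Theta\big)^2
     + \bar{z}_b^{\,2}\,(\sigma_U^2 + P_{cc})
     - 2\bar{z}_b\,\sigma_{U\Theta}
     + P_{bb}\,(\mu_U - \bar{z}_c)^2 \\
  &\quad + P_{aa} + \sigma_\Theta^2 + P_{bb}\,(\sigma_U^2 + P_{cc}), \\
\frac{\partial J}{\partial \bar{z}_a} = 0,\;
\frac{\partial J}{\partial \bar{z}_c} = 0
  &\;\Rightarrow\; \bar{z}_c = \mu_U, \quad \bar{z}_a = \mu_\Theta
  \qquad (P_{bb} > 0), \\
\frac{\partial J}{\partial \bar{z}_b} = 0
  &\;\Rightarrow\; \bar{z}_b = \frac{\sigma_{U\Theta}}{\sigma_U^2 + P_{cc}} .
\end{align}
```

The denominator of the last expression is the variance of the perceived cue (*U* − *z*_{c}), so the slope is lower than the optimal individual slope of Appendix A whenever *P*_{cc} > 0, and the two coincide when *P*_{cc} = 0. When *P*_{bb} = 0 the term in (μ_U − z̄_c)^2 drops out and only the combination in the first squared term is determined, which corresponds to the straight-line constraint described above.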