## Abstract

Natural selection acts on phenotypes constructed over development, which raises the question of how development affects evolution. Classic evolutionary theory indicates that development affects evolution by modulating the genetic covariation upon which selection acts, thus affecting genetic constraints. However, whether genetic constraints are relative, thus diverting adaptation from the direction of steepest fitness ascent, or absolute, thus blocking adaptation in certain directions, remains uncertain. This limits understanding of long-term evolution of developmentally constructed phenotypes. Here we formulate a general tractable mathematical framework that integrates age progression, explicit development (i.e., the construction of the phenotype across life subject to developmental constraints), and evolutionary dynamics, thus describing the evolutionary developmental (evo-devo) dynamics. The framework yields simple equations that can be arranged in a layered structure that we call the evo-devo process, whereby five core elementary components generate all equations including those mechanistically describing genetic covariation and the evo-devo dynamics. The framework recovers evolutionary dynamic equations in gradient form and describes the evolution of genetic covariation from the evolution of genotype, phenotype, environment, and mutational covariation. This shows that genotypic and phenotypic evolution must be followed simultaneously to yield a dynamically sufficient description of long-term phenotypic evolution in gradient form, such that evolution described as the climbing of a fitness landscape occurs in “geno-phenotype” space. Genetic constraints in geno-phenotype space are necessarily absolute because the phenotype is related to the genotype by development. 
Thus, the long-term evolutionary dynamics of developed phenotypes is strongly non-standard: (1) evolutionary equilibria are either absent or infinite in number and depend on genetic covariation and hence on development; (2) developmental constraints determine the admissible evolutionary path and hence which evolutionary equilibria are admissible; and (3) evolutionary outcomes occur at admissible evolutionary equilibria, which do not generally occur at fitness landscape peaks in geno-phenotype space, but at peaks in the admissible evolutionary path where “total genotypic selection” vanishes if exogenous plastic response vanishes and mutational variation exists in all directions of genotype space. Hence, selection and development jointly define the evolutionary outcomes if absolute mutational constraints and exogenous plastic response are absent, rather than the outcomes being defined only by selection. Moreover, our framework provides formulas for the sensitivities of a recurrence and an alternative method to dynamic optimization (i.e., dynamic programming or optimal control) to identify evolutionary outcomes in models with developmentally dynamic traits. These results show that development has major evolutionary effects.

**Highlights**

We formulate a framework integrating evolutionary and developmental dynamics.

We derive equations describing the evolutionary dynamics of traits considering their developmental process.

This yields a description of the evo-devo process in terms of closed-form formulas that are simple and insightful, including for genetic covariance matrices.

## 1. Introduction

Development may be defined as the process that constructs the phenotype over life (Barresi and Gilbert, 2020). In particular, development includes “the process by which genotypes are transformed into phenotypes” (Wolf et al., 2001). As natural selection screens phenotypes produced by development, a fundamental evolutionary problem concerns how development affects evolution. Interest in this problem is long-standing (Baldwin 1896, Waddington 1959 p. 399, and Gould and Lewontin 1979) and has steadily increased in recent decades. It has been proposed that developmental constraints (Gould and Lewontin, 1979; Maynard Smith et al., 1985; Brakefield, 2006; Klingenberg, 2010), causal feedbacks over development occurring among genes, the organism, and environment (Lewontin, 1983; Rice, 2011; Hansen, 2013; Laland et al., 2015), and various development-mediated factors (Laland et al., 2014, 2015), namely plasticity (Pigliucci, 2001; West-Eberhard, 2003), niche construction (Odling-Smee et al., 1996, 2003), extragenetic inheritance (Baldwin, 1896; Cavalli-Sforza and Feldman, 1981; Boyd and Richerson, 1985; Jablonka and Lamb, 2014; Bonduriansky and Day, 2018), and developmental bias (Arthur, 2004; Uller et al., 2018), may all have important evolutionary roles. Understanding how development — including these elements acting individually and together — affects the evolutionary process remains an outstanding challenge (Baldwin, 1896; Waddington, 1959; Müller, 2007; Pigliucci, 2007; Laland et al., 2014, 2015; Galis et al., 2018).

Classic evolutionary theory indicates that development affects evolution by modulating the genetic covariation upon which selection acts. This can be seen as follows. In quantitative genetics, an individual’s *i*-th trait value *x*_{i} is written as $x_i = \bar{x}_i + \sum_j \alpha_{ij}(y_j - \bar{y}_j) + e_i$, where the overbar denotes population average, *y*_{j} is the individual’s gene content at the *j*-th locus, *α*_{ij} is the partial regression coefficient of the *i*-th trait deviation from the average on the deviation from the average of the *j*-th locus content, and *e*_{i} is the residual error (Fisher, 1918; Crow and Kimura, 1970; Falconer and Mackay, 1996; Lynch and Walsh, 1998; Walsh and Lynch, 2018). The quantity *α*_{ij} is Fisher’s additive effect of allelic substitution (his *α*; see Eq. I of Fisher 1918 and p. 72 of Lynch and Walsh 1998) and is a description of some of the linear effects of development, specifically of how genotypes are transformed into phenotypes. In matrix notation, the vector of an individual’s trait values is $\mathbf{x} = \bar{\mathbf{x}} + \boldsymbol{\alpha}(\mathbf{y} - \bar{\mathbf{y}}) + \mathbf{e}$, where the matrix *α* corresponds to what Wagner (1984) calls the developmental matrix (his **B**). The breeding value of the multivariate phenotype **x** is defined as $\mathbf{a}_x \equiv \bar{\mathbf{x}} + \boldsymbol{\alpha}(\mathbf{y} - \bar{\mathbf{y}})$, which does not consider the error term that includes non-linear effects of genes on phenotype. Breeding value thus depends on development via the developmental matrix *α*. The Lande (1979) equation describes the evolutionary change due to selection in the mean multivariate phenotype as $\Delta\bar{\mathbf{x}} = \mathbf{G}\,\partial \ln \bar{W}/\partial \bar{\mathbf{x}}$, where the additive genetic covariance matrix is **G** ≡ cov[**a**_{x}, **a**_{x}] = *α*cov[**y, y**]*α*^{⊺} (e.g., Wagner 1984), mean absolute fitness is $\bar{W}$, and the selection gradient is $\partial \ln \bar{W}/\partial \bar{\mathbf{x}}$, which points in the direction of steepest increase in mean fitness (here and throughout we use matrix calculus notation described in Appendix A).
An important feature of the Lande equation is that it is in gradient form, so the equation shows that, within the assumptions made, phenotypic evolution by natural selection proceeds as the climbing of a fitness landscape, as first shown by Wright (1937) for change in allele frequencies in a two-allele single-locus model. Moreover, the Lande equation shows that additive genetic covariation, described by **G**, may divert evolutionary change from the direction of steepest fitness ascent, and may prevent evolutionary change in some directions if genetic variation in those directions is absent (in which case **G** is singular). Since additive genetic covariation depends on development via the developmental matrix *α*, the Lande equation shows that development affects evolution by modulating genetic covariation via *α* (Charlesworth et al., 1982; Cheverud, 1984; Maynard Smith et al., 1985).
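As a minimal numerical sketch of these two effects of **G**, the following example multiplies a selection gradient by two covariance matrices. All numbers are hypothetical and chosen only for illustration; they are not taken from any data set or from the framework developed here.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

# Hypothetical selection gradient: fitness increases equally with both traits.
selection_gradient = [1.0, 1.0]

# Case 1: nonsingular G with covariation between traits.
# The response is diverted from the direction of steepest ascent.
G_nonsingular = [[1.0, 0.5],
                 [0.5, 0.5]]

# Case 2: singular G with no genetic variation in trait 2.
# The response in trait 2 is blocked entirely.
G_singular = [[1.0, 0.0],
              [0.0, 0.0]]

response_diverted = mat_vec(G_nonsingular, selection_gradient)  # [1.5, 1.0]
response_blocked = mat_vec(G_singular, selection_gradient)      # [1.0, 0.0]
```

In the first case the response is not parallel to the selection gradient, whereas in the second case trait 2 does not respond at all despite being under directional selection.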

However, this mathematical description might have limited further insight into the evolutionary effects of development, particularly because it lacks two key pieces of information. First, the above description yields a limited understanding of the form of the developmental matrix *α*. The definition of *α* as a matrix of regression coefficients provides neither a developmentally explicit nor an evolutionarily dynamic understanding of *α*, which hinders understanding of how development affects evolution. Although the developmental matrix *α* has been modelled (Pavlicev and Hansen, 2011) or analysed as unknowable (Martin, 2014), a general theory with an explicit description of the developmental process that unveils the general structure of the developmental matrix *α* is lacking.

Second, the description in the second paragraph above gives a very short-term account of the evolutionary process. The Lande equation in the second paragraph strictly describes the evolution of mean traits $\bar{\mathbf{x}}$ but not of mean gene content $\bar{\mathbf{y}}$, that is, it does not describe change in allele frequency; yet, since *α* is a matrix of regression coefficients calculated for the current population, *α* depends on the current state of the population including allele frequency $\bar{\mathbf{y}}$. Thus, the Lande equation above describes the dynamics of some traits as an implicit function of traits whose dynamics are not described. The equation thus contains fewer dynamic equations (as many as there are traits in $\bar{\mathbf{x}}$) than dynamic variables (as many as there are traits $\bar{\mathbf{x}}$ and loci $\bar{\mathbf{y}}$), so it is underdetermined. Consequently, the Lande equation strictly admits an infinite number of evolutionary trajectories for a given initial condition. Technically, the evolutionary trajectory is ill-defined by Lande’s system, so the Lande equation is dynamically insufficient (i.e., it contains less information than needed to find how the mean phenotype evolves without making more assumptions; we note that these harsh-sounding terms do not mean that the Lande equation is wrong). The standard approach to this dynamic insufficiency is to assume Fisher’s (1918) infinitesimal model, whereby there is an infinite number of loci such that allele frequency change per locus per generation is negligible (Bulmer, 1971, 1980; Turelli and Barton, 1994; Barton et al., 2017; Hill, 2017). Thus, the Lande equation is said to describe short-term evolution, during which there is negligible allele frequency change per locus (Walsh and Lynch, 2018, pp. 504 and 879). The Lande equation is then supplemented by the Bulmer (1980) equation (Lande and Arnold, 1983, Eq.
12), which describes the dynamics of **G** primarily due to change in linkage disequilibrium under the assumption of negligible allele frequency change, thus still describing short-term evolution (Walsh and Lynch, 2018, p. 553). Typically, the **G** matrix is assumed to have reached an equilibrium in such short-term dynamics or to remain constant, although this has often been shown not to hold theoretically (Turelli, 1988) and empirically (Björklund et al., 2013). An alternative to the long-term dynamic insufficiency of the classic Lande’s system would be to consider the vector of gene content **y** to be a subvector of the vector of trait values **x** (Barfield et al., 2011), although such a vector **x** does not admit the normality assumption of the Lande equation and doing so does not yield a description of linkage disequilibrium dynamics. Indeed, there appears to be no formal derivation of such an extended Lande’s system that makes explicit the properties of its associated **G** matrix and the dependence of such a matrix on development. Overall, understanding how development affects evolution using the classic Lande equation might have been hindered by a lack of a general mechanistic understanding of the developmental matrix *α* and by the generally long-term dynamic insufficiency of the classic Lande’s system.

Nevertheless, there has been progress on general mathematical aspects of how development affects evolution on various fronts. Both the classic Lande equation (Lande, 1979) and the classic canonical equation of adaptive dynamics (Dieckmann and Law, 1996) describe the evolutionary dynamics of a multivariate trait in gradient form without an explicit account of development, by considering no explicit age progression or developmental (i.e., dynamic) constraints (there is also an analogous equation for allele frequency change for multiple alleles in a single locus, first incorrectly presented by Wright, 1937 but later corrected by Edwards, 2000 and presented in Lande’s form by Walsh and Lynch, 2018, Eq. 5.12a). Various research lines have extended these equations to incorporate different aspects of development. First, one line considers explicit age progression by implementing age structure, which allows individuals of different ages to coexist and to have age-specific survival and fertility rates. Thus, evolutionary dynamic equations in gradient form under age-structure have been derived under quantitative genetics assumptions (Lande, 1982), population genetics assumptions (Charlesworth, 1993, 1994), and adaptive dynamics assumptions (Durinx et al., 2008). An important feature of age-structured models is that the forces of selection decline with age due to demography, in particular due to mortality and fewer remaining reproductive events as age advances (Medawar, 1952; Hamilton, 1966; Caswell, 1978; Caswell and Shyu, 2017). Such age-specific decline in the force of selection does not occur in unstructured models.
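The age-specific decline in the force of selection can be illustrated with a minimal sketch in the spirit of Hamilton (1966). It assumes a stationary population (zero growth rate) and omits normalization by generation time, so that the force of selection on survival at a given age is proportional to the expected reproduction remaining after that age. The life table below is hypothetical.

```python
# Hypothetical life table for ages 1 to 3 (illustrative values only).
survival = [0.5, 0.5]         # p_a: probability of surviving from age a to a + 1
fertility = [0.0, 1.0, 2.0]   # m_a: offspring produced at age a

# Survivorship l_a: probability of being alive at age a, with l_1 = 1.
survivorship = [1.0]
for p in survival:
    survivorship.append(survivorship[-1] * p)

# Expected reproduction at each age, l_a * m_a (sums to 1, so the
# population is stationary, as assumed).
reproduction = [l * m for l, m in zip(survivorship, fertility)]

# Unnormalized force of selection on survival at age a:
# expected reproduction remaining strictly after age a.
force_of_selection = [sum(reproduction[a + 1:]) for a in range(len(reproduction))]
```

With these values the force of selection falls from 1.0 at age 1 to 0.5 at age 2 and to zero at the last age, reflecting both mortality and the dwindling number of remaining reproductive events.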

Second, another research line in life-history theory has extended age-structured models to consider explicit developmental constraints (Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983; Houston et al., 1988; Roff, 1992; Houston and McNamara, 1999; Sydsæter et al., 2008). This line has considered developmentally dynamic models with two types of age-specific traits: genotypic traits called control variables, which are under direct genetic control, and developed traits called state variables, which are constructed over life according to developmental constraints, although such literature calls these constraints dynamic. This explicit consideration of developmental constraints in an evolutionary context has mostly assumed that the population is at an evolutionary equilibrium. Thus, this approach identifies evolutionarily stable (or uninvadable) controls and associated states using techniques from dynamic optimization such as optimal control and dynamic programming (Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983; Houston et al., 1988; Roff, 1992; Houston and McNamara, 1999). While the assumption of evolutionary equilibrium yields great insight, it does not address the evolutionary dynamics which would provide a richer understanding. Moreover, the relationship between developmental constraints and genetic covariation is not made evident by this approach.

Third, another research line in adaptive dynamics has made it possible to mathematically model the evolutionary developmental (evo-devo) dynamics. By evo-devo dynamics we mean the evolutionary dynamics of genotypic traits that modulate the developmental dynamics of developed traits that are constructed over life subject to developmental constraints. A first step in this research line has been to consider function-valued or infinite-dimensional traits, which are genotypic traits indexed by a continuous variable (e.g., age) rather than a discrete variable as in the classic Lande equation. Thus, the evolutionary dynamics of univariate function-valued traits (e.g., body size across continuous age) has been described in gradient form by the Lande equation for function-valued traits (Kirkpatrick and Heckman, 1989) and the canonical equation for function-valued traits (Dieckmann et al., 2006). Although function-valued traits may depend on age, they are not subject to developmental constraints describing their developmental dynamics, so the consideration of the evolutionary dynamics of function-valued traits alone does not model the evo-devo dynamics. To our knowledge, Parvinen et al. (2013) were the first to mathematically model what we here call the evo-devo dynamics (but note that there have also been models integrating mathematical modeling of the developmental dynamics and individual-based modeling of the evolutionary dynamics, for instance, Salazar-Ciudad and Marín-Riera, 2013 and Watson et al., 2013). Parvinen et al. (2013) did so by considering the evolutionary dynamics of a univariate function-valued trait (control variable) that modulates the developmental construction of a multivariate developed trait (state variables) subject to explicit developmental constraints (they refer to these as process-mediated models). 
This approach requires the derivation of the selection gradient of the control variable affecting the state variables, which, as age is measured in continuous time, involves calculating a functional derivative (of invasion fitness; Dieckmann et al., 2006; Parvinen et al., 2013, Eq. 4). Parvinen et al. (2013) noted the lack of a general simplified method to calculate such a selection gradient, but they calculated it for specific examples. Metz et al. (2016) illustrate how to calculate such a selection gradient using a fitness return argument in a specific example. Using functional derivatives, Avila et al. (2021) derive the selection gradient of a univariate function-valued trait modulating the developmental construction of a univariate developed trait for a broad class of models (where relatives interact and the genotypic trait may depend on the developed trait). They obtain a formula for the selection gradient that depends on unknown associated variables (costate variables or shadow values) (Avila et al., 2021, Eqs. 7 and 23), but at evolutionary equilibrium these associated variables can be calculated by solving an associated partial differential equation (their Eq. 32). Despite these advances, the analysis of these models poses substantial technical challenges, by requiring calculation of functional derivatives or (partial) differential equations at evolutionary equilibrium with terminal age conditions in addition to the equations describing the developmental dynamics with initial age conditions (a two-point boundary value problem). These models have yielded evolutionary dynamic equations in gradient form for genotypic traits, but not for developed traits, so they have left unanswered the question of how the evolution of developed traits with explicit developmental constraints proceeds in the fitness landscape. Additionally, these models have not provided a link between developmental constraints and genetic covariation (Metz 2011; Dieckmann et al.
2006 discuss a link between constraints and genetic covariation in controls, not states; see Supplementary Information section S1 for further details).

Fourth, a separate research line in quantitative genetics has considered models without age structure where a set of traits are functions of underlying traits such as gene content or environmental variables (Wagner, 1984, 1989; Hansen and Wagner, 2001; Rice, 2002; Martin, 2014; Morrissey, 2014, 2015). This dependence of traits on other traits is used by this research line to describe development and the genotype-phenotype map. However, this research line considers no explicit age progression, so it considers implicit rather than explicit developmental (i.e., dynamic) constraints. Thus, this line has not considered the effect of age structure nor explicit developmental constraints (Wagner, 1984, 1989; Hansen and Wagner, 2001; Rice, 2002; Martin, 2014; Morrissey, 2014, 2015). Also, this line has not provided an evolutionarily dynamic understanding of the developmental matrix, nor long-term dynamically sufficient equations in gradient form describing the evolution of developed traits.

Here we formulate a tractable mathematical framework that integrates age progression (i.e., age structure), explicit developmental constraints, and evolutionary dynamics. The framework describes the evolutionary dynamics of genotypic traits and the concomitant developmental dynamics of developed traits subject to developmental constraints. It yields dynamically sufficient expressions describing the long-term evolutionary dynamics in gradient form including for developed traits, so it shows how the climbing of an adaptive topography proceeds for developed traits in a broad class of models. It also obtains a mechanistic counterpart of Fisher’s (1918) additive effects of allelic substitution and Wagner’s (1984) developmental matrix thus relating development to genetic covariation for a broad class of models. The resulting equations are long-term dynamically sufficient in the sense that the evolutionary dynamics of all variables involved are described over evolutionary time scales (i.e., for an arbitrary number of mutation-fixation events), including the evolutionary dynamics of the genotype, phenotype, environment, and genetic covariation modulated by development (provided the elementary components below are known or assumed).

We base our framework on adaptive dynamics assumptions (Dieckmann and Law, 1996; Metz et al., 1996; Champagnat, 2006; Durinx et al., 2008). We obtain equations describing the evolutionary dynamics in gradient form of traits that are constructed over a developmental process with explicit developmental constraints occurring as age progresses. Developmental constraints allow the phenotype to be “predisposed” to develop in certain ways, thus allowing for developmental bias (Arthur, 2004; Uller et al., 2018). We allow development to depend on the environment, which allows for a mechanistic description of plasticity (Pigliucci, 2001; West-Eberhard, 2003). We also allow development to depend on social interactions, which allows for a mechanistic description of extra-genetic inheritance (Boyd and Richerson, 1985; Jablonka and Lamb, 2014; Bonduriansky and Day, 2018) and indirect genetic effects (Moore et al., 1997). Such social development means that a mutant’s phenotype may change as the mutant genotype spreads, which complicates evolutionary invasion analysis. In turn, we allow the environment faced by each individual to depend on the traits of the individual and of social partners, thus allowing for individual and social niche construction although we do not consider ecological inheritance (Odling-Smee et al., 1996, 2003). We also let the environment depend on processes that are exogenous to the evolving population, such as eutrophication or climate change caused by members of other species, thus allowing for exogenous environmental change. To facilitate analysis, we let population dynamics occur over a short time scale, whereas environmental and evolutionary dynamics occur over a long time scale. Crucially, we measure age in discrete time, which simplifies the mathematics yielding closed-form formulas for otherwise implicitly defined quantities. 
Our methods use concepts from optimal control (Sydsæter et al., 2008) and integrate tools from adaptive dynamics (Dieckmann and Law, 1996) and matrix population models (Caswell, 2001; Otto and Day, 2007). In particular, we conceptualize the genotype as “control” variables and the phenotype as “state” variables whose developmental dynamics are modulated by controls. While we use concepts from optimal control, we do not use optimal control itself. Instead, we derive a method to model the evolutionary dynamics of controls, which yields an alternative method to optimal control that can be used to obtain optimal controls in a broad class of evolutionary models with dynamic constraints. Our use of optimal control concepts is thus useful to see how our results relate to optimal control, which has wide-ranging theory and applications. Our approach differs somewhat from standard matrix population models, where the stage (e.g., age and size) of an individual is discrete and described as indices of the population density vector (Caswell, 2001; Caswell et al., 1997; de Vries and Caswell, 2018; Caswell, 2019, Ch. 6); instead, we let the stage of an individual be partly discrete (specifically, age), described as indices in the population density vector, and partly continuous (e.g., size), described as arguments of various functions.
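The idea of following the evolutionary dynamics of controls while the states develop according to a constraint can be sketched with a toy model. Everything in this sketch is a hypothetical assumption made for illustration: the developmental constraint (growth by the amount of the genotypic trait), the fitness function (final size is beneficial and genotypic trait values are costly), the identity mutational covariation, and the step size. The total selection gradient is approximated here by finite differences through development, which stands in for closed-form formulas and is not the method derived in this paper.

```python
def develop(y, x_initial=0.0):
    """Construct the phenotype across ages from the genotype.
    Hypothetical developmental constraint: x[a + 1] = x[a] + y[a]."""
    x = [x_initial]
    for a in range(len(y) - 1):
        x.append(x[a] + y[a])
    return x

def fitness(y):
    """Hypothetical fitness: final size minus a quadratic cost of the genotype."""
    x = develop(y)
    return x[-1] - 0.5 * sum(value * value for value in y)

def total_selection_gradient(y, h=1e-5):
    """Approximate dw/dy by central differences, letting the phenotype
    redevelop after each perturbation of the genotype."""
    gradient = []
    for a in range(len(y)):
        up = y[:a] + [y[a] + h] + y[a + 1:]
        down = y[:a] + [y[a] - h] + y[a + 1:]
        gradient.append((fitness(up) - fitness(down)) / (2 * h))
    return gradient

# Evolutionary dynamics of the resident genotype across three ages,
# assuming identity mutational covariation and a hypothetical step size.
resident_genotype = [0.0, 0.0, 0.0]
step = 0.5
for _ in range(100):
    gradient = total_selection_gradient(resident_genotype)
    resident_genotype = [y + step * g for y, g in zip(resident_genotype, gradient)]

resident_phenotype = develop(resident_genotype)
```

In this toy model the direct effect of the genotype on fitness is only the cost, yet the resident genotype converges to approximately (1, 1, 0), where the total selection gradient vanishes because the cost is balanced by the developmental effect of the genotype on final size.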

We obtain three sets of main results. First, we derive several closed-form formulas for the total selection gradient of genotypic traits (i.e., of control variables) that affect the development of the phenotype (i.e., of state variables), formulas that can be easily computed with elementary operations. The total selection gradient of genotypic traits is the selection gradient that appears in the canonical equation of adaptive dynamics, so coupling the total selection gradient of genotypic traits, the canonical equation, and the developmental constraint describing the developmental dynamics of developed traits provides simple expressions to model the evo-devo dynamics in a broad class of models. In particular, these expressions provide an alternative method to dynamic optimization (e.g., dynamic programming or optimal control) to calculate evolutionary outcomes for evolutionary (e.g., life history) models with developmentally dynamic traits, both analytically for sufficiently simple models and numerically for more complex ones. Second, we derive equations in gradient form describing the evolutionary dynamics of developed traits and of the niche-constructed environment. These equations motivate a definition of the “mechanistic additive genetic covariance matrix” in terms of “mechanistic breeding value”, defined in turn in terms of a mechanistic counterpart of Fisher’s (1918) additive effects of allelic substitution obtained from the developmental process rather than from regression. Specifically, we obtain formulas for a mechanistic counterpart of the developmental matrix *α* for a broad class of models. This yields closed-form formulas for the sensitivity of the solutions of a system of recurrence equations, which are thus of use beyond evolutionary or biological applications and which seem to have been previously unavailable (Johnson, 2011).
Analogously to the classic Lande equation, our equation describing the evolutionary dynamics of the developed traits depends on the genotypic traits and so it is generally dynamically insufficient if the evolutionary dynamics of the genotypic traits is not considered. Third, we obtain synthetic equations in gradient form simultaneously describing the evolutionary dynamics of genotypic, developed, and environmental traits. These equations are in gradient form and are dynamically sufficient in that they include as many evolutionarily dynamic equations as evolutionarily dynamic variables, which enables one to describe the long-term evolution of developed multivariate phenotypes as the climbing of a fitness landscape. Such equations describe the evolutionary dynamics of the constraining matrix analogous to **G** as an emergent property, where genotypic traits play an analogous role to that of allele frequency under quantitative genetics assumptions while linkage disequilibrium is not an issue as we assume clonal reproduction. In this extended dynamically sufficient Lande’s system, the associated constraining matrix is always singular, which is mathematically trivial, but biologically crucial. The reason is that this singularity entails that genetic variation is necessarily absent in certain directions such that adaptive evolution at best converges to outcomes defined by both selection and development rather than by selection alone. Consequently, development plays a major evolutionary role.
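Why a constraining matrix in geno-phenotype space is singular can be seen in a minimal sketch with one genotypic trait and one phenotype. The sketch assumes, purely for illustration, that the phenotype is a linear function of the genotype with a hypothetical slope; the values are not derived from the framework.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

slope = 2.0      # hypothetical linear effect of the genotype on the phenotype
variance = 1.0   # hypothetical variance of the genotypic trait

# Covariance matrix of (phenotype, genotype) when phenotype = slope * genotype.
constraining_matrix = [[slope * slope * variance, slope * variance],
                       [slope * variance, variance]]

# The determinant is zero because the phenotype is fully determined by the genotype.
determinant = (constraining_matrix[0][0] * constraining_matrix[1][1]
               - constraining_matrix[0][1] * constraining_matrix[1][0])

# A nonzero selection gradient pointing in the direction without variation
# produces no evolutionary response.
selection_gradient = [1.0, -slope]
response = mat_vec(constraining_matrix, selection_gradient)
```

Here the response is zero even though the selection gradient is not, so the population can remain at a point that is not a peak of the fitness landscape.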

## 2. Problem statement

We begin by describing the mathematical problem we address. We consider a finite age-structured population with deterministic density-dependent population dynamics with age measured in discrete time. Each individual is described by three types of traits that we call genotypic, phenotypic (or developed), and environmental, all of which can vary with age and can evolve. We let all traits take continuous values, which allows us to take derivatives. Genotypic traits are defined by being directly specified by genetic sequence: for instance, a genotypic trait may be the presence or absence of a given nucleotide at a given single-nucleotide locus described with a continuous representation (see below). Phenotypic traits are defined by being constructed over life subject to a developmental constraint: for instance, a phenotypic trait may be body size subject to the influence of genes, developmental history, environment, social interactions, and developmental processes constructing the body. Environmental traits are defined as describing the local environment of the individual subject to an environmental constraint: for instance, an environmental trait may be ambient temperature, which the individual may adjust behaviorally such as by roosting in the shade. We assume that reproduction transmits genotypic traits clonally, but developed and environmental traits need not be transmitted clonally due to social interactions. Given clonal reproduction of genotypic traits, we do not need to further specify the genetic architecture (e.g., ploidy, number of loci, or linkage) and it may depend on the particular model. 
We assume that the genotypic traits are *developmentally independent*, whereby genotypic traits are entirely specified by the individual’s genotype and do not depend on other traits expressed over development: in particular, this means that the genotype can only be modified by mutation, but the genotype at a given locus and age does not depend on other loci, the phenotype, or the environment. Developmental independence corresponds to the notion of “open-loop” control of optimal control theory (Sydsæter et al., 2008). Genotypic traits may still be *mutationally* correlated, whereby genotypic traits may tend to mutate together or separately. We assume that environmental traits are mutually independent, which facilitates derivations. We obtain dynamically sufficient equations in gradient form for the evolution of the phenotype by aggregating the various types of traits. We give names to such aggregates for ease of reference. We call the aggregate of the genotype and phenotype the geno-phenotype. We call the aggregate of the genotype, phenotype, and environment the geno-envo-phenotype.

The above terminology departs from standard terminology in adaptive dynamics as follows. In adaptive dynamics, our genotypic traits are referred to as the phenotype and our phenotypic traits as function-valued phenotypes (or state variables). We depart from this terminology to follow the biologically common notion that the phenotype is constructed over development. In turn, adaptive dynamics terminology defines the environment as any quantity outside the individual, and thus refers to the global environment. In contrast, by environment we refer to the local environment of the individual. This allows us to model niche construction as the local environment of a mutant individual may differ from that of a resident.

We use the following notation (Table 1). Each individual can live from age 1 to age *N*_{a} ∈ {2, 3, …}. Each individual has a number *N*_{g} of genotypic traits at each age. A mutant’s genotypic trait *i* ∈ {1, …, *N*_{g}} at age *a* ∈ {1, …, *N*_{a}} is *y*_{ia} ∈ ℝ. For instance, *y*_{ia} may be a continuous representation of nucleotide presence at a single-nucleotide locus: let an indicator variable be 1 if nucleotide A is at locus *i* or 0 otherwise, which can be approximated by a smooth function of *y*_{ia} with a small parameter *γ* > 0, where *y*_{ia} ∈ ℝ. This example has the benefit of allowing one to represent the genotype in a continuous variable to take derivatives and meet the assumptions of weak and unbiased mutation we make, at the expense of introducing the artificiality that an individual can have a fraction of a nucleotide. Such artificiality can be minimized by reducing *γ* at the cost that changes in *y*_{ia} away from 0 may leave nucleotide presence largely unchanged. Another example is that *y*_{ia} is the value of a life-history trait *i* at age *a* assumed to be directly under genetic control (i.e., a control variable in life-history models; Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983). While *y*_{ia} may often be constant with age *a* in the first example, it generally is not in the second, so we allow genotypic traits to depend on age. Given our assumption of developmental independence of genotypic traits, the genotypic trait value *y*_{ia} for all *i* ∈ {1, …, *N*_{g}} and all *a* ∈ {1, …, *N*_{a}} of a given individual is exclusively controlled by her genotype, but mutations can tend to change the value of *y*_{ia} and *y*_{kj} simultaneously for *k* ≠ *i* and *j* ≠ *a*. Additionally, each individual has a number *N*_{p} of developed traits, that is, of phenotypes at each age. A mutant’s phenotype *i* ∈ {1, …, *N*_{p}} at age *a* ∈ {1, …, *N*_{a}} is *x*_{ia} ∈ ℝ.
Moreover, each individual has a number *N*_{e} of environmental traits that describe her local environment at each age. A mutant’s environmental trait *i* ∈ {1, …, *N*_{e}} at age *a* ∈ {1, …, *N*_{a}} is *ϵ*_{ia} ∈ ℝ. Although we do not consider the developmental or evolutionary change of the number of traits (i.e., of *N*_{g}, *N*_{p}, or *N*_{e}), our framework still allows for the modeling of the developmental or evolutionary origin of novel traits (e.g., the origin of a sixth digit where there were five previously in development or evolution; Chan et al., 1995; Litingtung et al., 2002; Müller, 2010) by implementing a suitable codification (e.g., letting *x*_{ia} mean sixth-digit length, being zero in a previous age or evolutionary time).

We use the following notation for collections of these quantities. A mutant’s *i*-th genotypic trait across all ages is denoted by the column vector , where the semicolon indicates a line break, that is, . A mutant’s *i*-th phenotype across all ages is denoted by the column vector . A mutant’s *i*-th environmental trait across all ages is denoted by the column vector . A mutant’s genotype across all genotypic traits and all ages is denoted by the block column vector . A mutant’s phenotype across all developed traits and all ages is denoted by the block column vector . A mutant’s environment across all environmental traits and all ages is denoted by the block column vector . To simultaneously refer to the genotype and phenotype, we denote the geno-phenotype of the mutant individual at age *a* as , and the geno-phenotype of a mutant across all ages as . Moreover, to simultaneously refer to the genotype, phenotype, and environment, we denote the geno-envo-phenotype of a mutant individual at age *a* as , and the geno-envo-phenotype of the mutant across all ages as . We denote resident values analogously with an overbar (e.g., is the resident geno-phenotype).

The developmental process that constructs the phenotype is as follows (with causal dependencies described in Fig. 1). We assume that an individual’s multivariate phenotype at a given age is a function of the genotypic, phenotypic, and environmental traits that the individual had at the immediately previous age as well as of the social interactions experienced at that age. Thus, we assume that a mutant’s multivariate phenotype at age *a* + 1 is given by the developmental constraint

for all *a* ∈ {1, …, *N*_{a} − 1} with initial condition . The function
is the developmental map at age *a*, which we assume is a differentiable function of the individual’s geno-envo-phenotype at that age and of the geno-phenotype of the individual’s social partners who can be of any age; thus, an individual’s development directly depends on the individual’s local environment but not directly on the local environment of social partners. The developmental constraint (1) is a mathematical, deterministic description of Waddington’s (1957) “epigenetic landscape”. Eq. (1) is a constraint in that the phenotype **x**_{a+1} cannot take any value but only those that satisfy the equality (e.g., an individual’s body size today cannot take any value but depends on her body size, gene expression, and environment since yesterday). The term developmental function can be traced back to Gimelfarb (1982) through Wagner (1984). The developmental map in Eq. (1) is an extension of the notions of genotype-phenotype map (often a function from genotype to phenotype, without explicit developmental dynamics) and reaction norm (often a function from environment to phenotype, also without explicit developmental dynamics), as well as of early mathematical descriptions of development in an evolutionary context (Alberch et al., 1979). The dependence of the mutant phenotype on the phenotype of social partners in (1) allows one to implement Jablonka and Lamb’s (2014) notion that extra-genetic inheritance transmits the phenotype rather than the genotype (see their p. 108), such that in (1) the mutant phenotype can be a possibly altered copy of social partners’ phenotype. The developmental map in Eq. (1) may be non-linear and can change over development (e.g., from *g*_{ia} = sin *x*_{ia} to for *a < j* and some parameter *β*, for instance, due to metamorphosis) and over evolution (e.g., from a sine to a power function if as nucleotide presence evolves from 0 to 1). 
Simpler forms of the developmental constraint (1) are standard in life-history models, which call such constraints dynamic, following the terminology of optimal control theory (Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983; Sydsæter et al., 2008). Simpler forms of the developmental constraint (1) are also standard in physiologically structured models of population dynamics (de Roos, 1997, Eq. 7). The developmental constraint (1) can describe gene regulatory networks (Alon, 2020), learning in deep neural networks (Saxe et al., 2019), and reaction-diffusion models of morphology (Murray, 2003) in discrete developmental time and space, once such models are written in the form of Eq. (1) (e.g., if space is one-dimensional, the *i*-th developed trait may refer to the *i*-th spatial location; more spatial dimensions would require care in the mapping from multidimensional space to the unidimensional *i*-th phenotypic index, but doing so is possible; Supplementary Information section S6). The developmental constraint (1) also admits that a slight perturbation in the geno-envo-phenotype at an early age yields a large change in the phenotype at a later age, possibly changing it from zero to an appreciable value (as in descriptions of developmental innovation; Goldschmidt, 1940; Gould, 1977; Orr and Coyne, 1992; Orr, 2005; Müller, 2010), possibly via exploratory processes highlighted by Gerhart and Kirschner (2007) and Kirschner and Gerhart (2010), provided a mathematical model of such processes satisfies Eq. (1). 
However, slight perturbations yielding large phenotypic effects raise the question of whether our assumption below that invasion implies fixation is violated if mutant phenotypes **x** deviate substantially from resident phenotypes ; indeed, it has previously been established that invasion implies fixation if mutant *genotypes* **y** do not deviate substantially from resident genotypes (Geritz et al., 2002; Geritz, 2005; Dieckmann et al., 2006; Priklopil and Lehmann, 2020), which we assume. We leave for future work to address explicitly whether large deviations in mutant phenotypes in our sense of the word still entail that invasion implies fixation because of small deviations in mutant genotypes. For simplicity, we assume that the phenotype at the initial age is constant and does not evolve. This assumption corresponds to the common assumption in life-history models that state variables at the initial age are given (Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983; Sydsæter et al., 2008).
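
As an illustration of the recurrence described above, the following minimal Python sketch iterates a developmental constraint in which the phenotype at age *a* + 1 is computed from the individual's phenotype, genotypic traits, and environment at age *a* and from the phenotype of resident social partners. The names `develop` and `toy_map`, the particular map, and all numbers are hypothetical and serve only as an example; in particular, the sketch simplifies social interactions to partners of the same age, whereas the framework allows partners of any age.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical signature: (age index, own phenotype, own genotypic traits,
# own environment, resident phenotype at the same age) -> phenotype at the next age.
DevMap = Callable[[int, Vector, Vector, Vector, Vector], Vector]


def develop(g: DevMap, x1: Vector, y: Sequence[Vector], eps: Sequence[Vector],
            x_res: Sequence[Vector]) -> List[Vector]:
    """Iterate the recurrence from the fixed initial phenotype x1."""
    n_ages = len(y)
    x = [list(x1)]
    for a in range(n_ages - 1):
        x.append(g(a, x[a], y[a], eps[a], x_res[a]))
    return x


def toy_map(a, x_a, y_a, eps_a, x_res_a):
    """Illustrative map (an assumption): growth driven by genotype and
    environment, plus partial copying of the resident phenotype."""
    return [x_a[0] + y_a[0] * eps_a[0] + 0.1 * (x_res_a[0] - x_a[0])]


# Three ages and one trait of each kind; all values are made up.
y = [[0.5], [0.5], [0.5]]
eps = [[1.0], [2.0], [1.0]]
x_res = [[1.0], [1.0], [1.0]]
x = develop(toy_map, [1.0], y, eps, x_res)
```

Passing the age index to the map lets the map itself change over development, as the text allows (e.g., at metamorphosis).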

We describe the local environment as follows. We assume that an individual’s local environment at a given age is a function of the genotypic traits, phenotype, and social interactions of the individual at that age, and of processes that are not caused by the population considered. Thus, we assume that a mutant’s environment at age *a* is given by the environmental constraint
for all *a* ∈ {1, …, *N*_{a}}. The function
is the environmental map at age *a*, which can change over development and evolution. We assume that the environmental map is a differentiable function of the individual’s geno-phenotype at that age (e.g., the individual’s behavior at a given age may expose it to a particular environment at that age), the geno-phenotype of the individual’s social partners who can be of any age (e.g., through social niche construction), and evolutionary time *τ* due to slow exogenous environmental change (so the exogenous process changing the environment in Fig. 1 acts as forcing, as *τ* appears explicitly in Eq. 2). We assume slow exogenous environmental change to enable the resident population to reach carrying capacity to be able to use relatively simple techniques of evolutionary invasion analysis to derive selection gradients. The environmental constraint (2) may also be non-linear and can change over development (i.e., over *a*) and over evolution (as the genotype or phenotype evolves or exogenously as evolutionary time advances). The environmental constraint (2) is a minimalist description of the environment of a specific kind (akin to “feedback functions” used in physiologically structured models to describe the influence of individuals on the environment; de Roos, 1997). A different, perhaps more realistic environmental constraint would be constructive of the form , in which case the only structural difference between an environmental trait and a developed trait would be the dependence of the environmental trait on exogenous processes (akin to “feedback loops” used in physiologically structured models to describe the influence of individuals on the environment; de Roos, 1997). 
The environmental constraint could be further extended to model ecological inheritance by letting the environmental constraint have the form , where the environmental map now depends on the resident environment at the previous evolutionary time (a similar lag could be added to the developmental map so it depends on the resident geno-phenotype at the previous evolutionary time to model certain aspects of symbolic social learning; Jablonka and Lamb, 2010, 2014; Odling-Smee, 2010). We use the minimalist environmental constraint (2) as a first approximation to shorten derivations; our derivations illustrate how one could obtain equations with more complex developmental and environmental constraints. With the minimalist environmental constraint (2), the environmental traits are mutually independent in that changing one environmental trait at one age does not *directly* change any other environmental trait at any age (i.e., ∂*ϵ*_{k j}*/*∂*ϵ*_{ia} = 0 if *i* ≠ *k* or *a* ≠ *j*). We say that development is social if .

Our aim is to obtain closed-form equations describing the evolutionary dynamics of the resident phenotype subject to the developmental constraint (1) and the environmental constraint (2). The evolutionary dynamics of the phenotype emerge as an outgrowth of the evolutionary dynamics of the genotype and environment . In the Supplementary Information section S3, we provide a short derivation of the canonical equation of adaptive dynamics closely following Dieckmann and Law (1996) although assuming deterministic population dynamics. The canonical equation describes the evolutionary dynamics of resident genotypic traits as:
where is invasion fitness, *ι* is a non-negative scalar measuring mutational input proportional to the mutation rate and the carrying capacity, and **H**_{y} = cov[**y, y**] is the mutational covariance matrix (of genotypic traits). The selection gradient in Eq. (3) involves total derivatives so we call it the *total* selection gradient of the genotype, which measures the effects of genotypic traits **y** on invasion fitness *λ* across all the paths in Fig. 1 (*λ* has the same derivatives with respect to mutant trait values as fitness *w*, defined below). Total selection gradients differ from Lande’s selection gradient in that the latter is defined in terms of partial derivatives and so measures only the direct effects of traits on fitness (e.g., measures the effect of **y** on *w* only across the path directly connecting the two in Fig. 1). We will be concerned with describing the evolutionary dynamics to first-order of approximation, so we will treat the approximation in Eq. (3) as an equality although we keep the approximation symbols throughout to distinguish what is and what is not an approximation.
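
A single step of the canonical equation can be sketched as follows, assuming the usual form in which the change in the resident genotype is the product of *ι*, the mutational covariance matrix, and the total selection gradient. The function name `canonical_step` and the numerical values are hypothetical; the total selection gradient is simply supplied as a vector here rather than derived from a model.

```python
def canonical_step(y_res, H_y, total_gradient, iota, dtau=1.0):
    """Return the resident genotype after one evolutionary time step,
    y_res + dtau * iota * H_y @ total_gradient (first-order approximation)."""
    n = len(y_res)
    response = [sum(H_y[i][j] * total_gradient[j] for j in range(n))
                for i in range(n)]
    return [y_res[i] + dtau * iota * response[i] for i in range(n)]


# Two genotypic traits with positive mutational covariance (made-up values).
y_res = [0.0, 0.0]
H_y = [[1.0, 0.5],
       [0.5, 1.0]]
grad = [1.0, 0.0]  # total selection acts on the first trait only
y_next = canonical_step(y_res, H_y, grad, iota=0.1)
```

In this example the second trait changes even though total selection on it is zero, because mutations tend to alter both traits simultaneously.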

The arrangement above describes the evolutionary developmental (evo-devo) dynamics: the evolutionary dynamics of the resident genotype are given by the canonical equation (3), while the concomitant developmental dynamics of the phenotype are given by the developmental (1) and environmental (2) constraints evaluated at resident trait values. To complete the description of the evo-devo dynamics, we obtain closed-form expressions for the total selection gradient of the genotype. Moreover, to determine whether the evolution of the resident developed phenotype can be described as the climbing of a fitness landscape, we derive equations in gradient form describing the evolutionary dynamics of the resident phenotype , environment , geno-phenotype , and geno-envo-phenotype . To do so, we first give an overview of the model, which describes a complication introduced by social development, how we handle it, and a fitness function that has the same gradient as invasion fitness in age-structured populations. We then use these descriptions to write our results.

## 3. Model overview

Here we give an overview of the model. We describe it formally in the Supplementary Information section S2.

### 3.1. Set up

We base our framework on standard assumptions of adaptive dynamics, particularly following Dieckmann and Law (1996). We separate time scales, so developmental and population dynamics occur over a short discrete ecological time scale *t* and evolutionary dynamics occur over a long discrete evolutionary time scale *τ*. Although the population is finite, in a departure from Dieckmann and Law (1996), we let the population dynamics be deterministic rather than stochastic for simplicity, so there is no genetic drift. Thus, the only source of stochasticity in our framework is mutation. We assume that mutation is rare, weak, and unbiased. Weak mutation means that the variance of mutant genotypic traits around resident genotypic traits is marginally small (i.e., a mutant **y** is marginally different from the resident , so . Weak mutation (Gillespie, 1983; Walsh and Lynch, 2018, p. 1003) is also called *δ*-weak selection (Wild and Traulsen, 2007). Unbiased mutation means that mutant genotypic traits are symmetrically distributed around the resident genotypic traits (i.e., the mutational distribution is even, so . Yet, unbiased mutation in genotypic traits still allows for bias in the distribution of mutant phenotypes since a function of a random variable may have a different distribution from that of the random variable (i.e., the distribution of is not even in general); thus, we do not make the isotropy assumption of Fisher’s (1930) geometric model (Orr, 2005), although isotropy may arise for mechanistic breeding values (defined below) with large *N*_{a}*N*_{g} and additional assumptions (e.g., high pleiotropy and high developmental integration) from the central limit theorem (Martin, 2014). We assume that a monomorphic resident population having geno-envo-phenotype undergoes density-dependent population dynamics that bring it to carrying capacity. 
At this carrying capacity, rare mutant individuals arise which have a marginally different genotype **y** and that develop their phenotype in the context of the resident. If the mutant genotype increases in frequency, it increasingly faces mutant rather than resident individuals. Thus, with social development, the mutant phenotype may change as the mutant genotype spreads, which complicates invasion analysis.
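
The point that unbiased mutation in genotypic traits can still bias the distribution of mutant phenotypes can be checked with a small sketch. It assumes a hypothetical non-linear developmental outcome (an exponential of a single genotypic trait) and compares two mutants placed symmetrically around the resident.

```python
import math


def phenotype(y):
    """Hypothetical non-linear developmental outcome of one genotypic trait."""
    return math.exp(y)


y_res = 0.0
delta = 0.1  # mutational step, taken symmetrically in both directions

dev_up = phenotype(y_res + delta) - phenotype(y_res)
dev_down = phenotype(y_res - delta) - phenotype(y_res)

# Zero only if the phenotypic deviations were symmetric around the resident.
asymmetry = dev_up + dev_down
```

For this map the asymmetry is of second order in `delta`, so it shrinks quickly as mutation becomes weaker but does not vanish for any finite step.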

### 3.2. A complication introduced by social development

With social development, the phenotype an individual develops depends on the traits of her social partners. This introduces a complication to standard evolutionary invasion analysis, for two reasons. First, the phenotype of a mutant genotype may change as the mutant genotype spreads and is more exposed to the mutant’s traits via social interactions, making the mutant phenotype frequency dependent. Thus, the phenotype developed by a rare mutant genotype in the context of a resident phenotype may be different from the phenotype developed by the same mutant genotype in the context of itself once the mutant genotype has approached fixation. Second, because of social development, a recently fixed mutant may not breed true, that is, her descendants may have a different phenotype from her own despite clonal reproduction of the genotype and despite the mutant genotype being fixed (Fig. 2; see also Kobayashi et al. 2015, Eq. 14 in their Appendix). Yet, to apply standard invasion analysis techniques, the phenotype of the fixed genotype must breed true, so that an individual with the mutant genotype developing in the context of individuals with the mutant genotype develops the same phenotype as theirs.

To carry out invasion analysis, we proceed as follows. Ideally, one should follow explicitly the change in mutant phenotype as the mutant genotype increases in frequency and achieves fixation, and up to a point where the fixed mutant phenotype breeds true. Yet, to simplify the analysis, we separate the dynamics of phenotype convergence and the population dynamics. We thus introduce an additional phase to the standard separation of time scales in adaptive dynamics so that phenotypic convergence occurs first and then resident population dynamics follow. Such an additional phase does not describe a biological process but is a mathematical technique to facilitate mathematical treatment (akin to using best-response dynamics to find Nash equilibria). However, this phase might still be biologically justified under somewhat broad conditions. In particular, Aoki et al. (2012, their Appendix A) show that such an additional phase is justified in their model of social learning evolution if mutants are rare and social learning dynamics happen faster than allele frequency change; they also show that this additional phase is justified for their particular model if selection is *δ*-weak. As a first approximation, here we do not formally justify the separation of phenotype convergence and resident population dynamics for our model and simply assume it.

### 3.3. Phases of an evolutionary time step

To handle the above complication introduced by social development, we partition a unit of evolutionary time in three phases: socio-developmental (socio-devo) dynamics, resident population dynamics, and resident-mutant population dynamics (Fig. 3).

At the start of the socio-devo dynamics phase of a given evolutionary time *τ*, the population consists of individuals all having the same resident genotype, phenotype, and environment. A new individual arises which has the same genotype as the resident, but develops a phenotype that may be different from that of the original resident due to social development. This developed phenotype, its genotype, and its environment are set as the new resident. This process is repeated until convergence to what we term a “socio-devo stable” (SDS) resident equilibrium or until divergence. These socio-devo dynamics are formally described by Eq. (S1) and illustrated in Fig. 2A. If development is not social, the resident is trivially SDS so the socio-devo dynamics phase is unnecessary. If an SDS resident is achieved, the population moves to the next phase; if an SDS resident is not achieved, the analysis stops. We thus study only the evolutionary dynamics of SDS resident geno-envo-phenotypes. More specifically, we say a geno-envo-phenotype is a socio-devo equilibrium if and only if is produced by development when the individual has such a genotype and everyone else in the population has that same genotype, phenotype, and environment (Eq. S2). A socio-devo equilibrium is locally stable (i.e., SDS) if and only if a marginally small deviation in the initial phenotype from the socio-devo equilibrium keeping the same genotype leads the socio-devo dynamics (Eq. S1) to the same equilibrium. A socio-devo equilibrium is locally stable if all the eigenvalues of the matrix
have absolute value (or modulus) strictly less than one. For instance, this is always the case if social interactions are only among peers (i.e., individuals of the same age) so the mutant phenotype at a given age depends only on the phenotype of immediately younger social partners (in which case the above matrix is block upper triangular so all its eigenvalues are zero; Eq. G9). We assume that there is a unique SDS geno-envo-phenotype for a given developmental map at every evolutionary time *τ*.
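
The socio-devo dynamics phase can be illustrated with a scalar caricature: a new individual develops in the context of the current resident, becomes the new resident, and the process repeats until convergence or divergence. The developmental rule below (a phenotype equal to a genotypic value `y` plus a fraction `c` of the resident phenotype) is a hypothetical example, not the model's developmental map. In this scalar case the derivative of the developed phenotype with respect to the resident phenotype is `c`, so the eigenvalue condition corresponds to |`c`| < 1.

```python
def socio_devo_dynamics(develop_in_context, x_init, tol=1e-12, max_iter=10_000):
    """Repeat 'develop in the context of the resident, then become the resident'.

    Returns the final resident phenotype and whether the iteration converged."""
    x_res = x_init
    for _ in range(max_iter):
        x_new = develop_in_context(x_res)
        if abs(x_new - x_res) < tol:
            return x_new, True    # socio-devo stable resident reached
        if abs(x_new) > 1e12:
            return x_new, False   # divergence: the analysis stops
        x_res = x_new
    return x_res, False


def make_map(y, c):
    """Hypothetical social development: x = y + c * (resident phenotype)."""
    return lambda x_res: y + c * x_res


x_sds, converged = socio_devo_dynamics(make_map(y=1.0, c=0.5), x_init=0.0)
x_div, converged_div = socio_devo_dynamics(make_map(y=1.0, c=1.5), x_init=0.0)
```

With `c = 0.5` the iteration settles at the equilibrium `y / (1 - c)`, whereas with `c = 1.5` no SDS resident is reached.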

If an SDS resident is achieved in the socio-devo dynamics phase, the population moves to the resident population dynamics phase. Because the resident is SDS, an individual with resident genotype developing in the context of the resident geno-phenotype is guaranteed to develop the resident phenotype. Thus, we may proceed with the standard invasion analysis. Hence, in this phase of SDS resident population dynamics, the SDS resident undergoes density-dependent population dynamics, which we assume converge asymptotically to a carrying capacity.

Once the SDS resident has achieved carrying capacity, the population moves to the resident-mutant population dynamics phase. At the start of this phase, a random mutant genotype **y** marginally different from the resident genotype arises in a vanishingly small number of mutants. We assume that the mutant becomes either lost or fixed in the population (Geritz et al., 2002; Geritz, 2005; Priklopil and Lehmann, 2020), establishing a new resident geno-envo-phenotype.

Repeating this evolutionary time step generates long-term evolutionary dynamics of an SDS geno-envo-phenotype.

### 3.4. Fitness in age-structured populations

To compute the total selection gradient of the genotype, we now write a fitness function that has the same derivatives with respect to mutant trait values as invasion fitness for age-structured populations. To do this, we first write a mutant’s survival probability and fertility at each age. At the resident population dynamics equilibrium, a rare mutant’s fertility at age *a* is
and the mutant’s survival probability from age *a* to *a* + 1 is
The first argument **m**_{a} in Eqs. (4) is the direct dependence of the mutant’s fertility and survival at a given age on her own geno-envo-phenotype at that age. The second argument in Eqs. (4) is the direct dependence on social partners’ geno-envo-phenotype at any age (thus, fertility and survival may directly depend on the environment of social partners, specifically, as it may affect the carrying capacity, and fertility and survival are density dependent). In the Supplementary Information section S2.3, we show that the gradients of invasion fitness *λ* with respect to mutant trait values are equal to (not an approximation of) the corresponding gradients of the relative fitness *w* of a mutant individual per unit of generation time (Eqs. S19), defined as
where a mutant’s relative fitness at age *j* is
and generation time is
(Charlesworth 1994, Eq. 1.47c; Bulmer 1994, Eq. 25, Ch. 25; Bienvenu and Legendre 2015, Eqs. 5 and 12). The superscript ◦ denotes evaluation at (so at as the resident is a socio-devo equilibrium). The quantity is the survivorship of mutants from age 1 to age *j*, and is that of neutral mutants. Thus, generation time in *w* is for a neutral mutant, or resident, rather than the mutant. This can be intuitively understood as the mutant having to invade over a time scale determined by the resident rather than the mutant. We denote the force of selection on fertility at age *j* (Hamilton 1966 and Caswell 1978, his Eqs. 11 and 12) as
and the force of selection on survival at age *j* (Baudisch 2005, her Eq. 5a) as
which are independent of mutant trait values because they are evaluated at the resident trait values. It is easily checked that *ϕ*_{j} and *π*_{j} decrease with *j* (respectively, if and provided that ).

In the Supplementary Information section S5, we show that the gradients of invasion fitness *λ* with respect to mutant trait values are also equal to 1*/T* times the corresponding gradients of a rare mutant’s expected lifetime reproductive success *R*_{0} (Eqs. S36) (Bulmer, 1994; Caswell, 2009). This occurs because of our assumption that mutants arise when residents are at carrying capacity (Mylius and Diekmann, 1995). For our life cycle, a mutant’s expected lifetime reproductive success is
(Caswell, 2001). From the equality of the gradients of invasion fitness and fitness, it follows that invasion fitness *λ* for age-structured populations is to first-order of approximation around resident genotypic traits equal to the mutant’s relative fitness *w*, that is, *λ* ≈ *w* (Eq. S21). Similarly, from the equality of the gradients of invasion fitness and a mutant’s lifetime reproductive success, it follows that invasion fitness *λ* for age-structured populations is to first-order of approximation around resident genotypic traits given by *λ* ≈ 1 + (*R*_{0} − 1)*/T* (Eq. S23). Taking derivatives of *w* with respect to mutant trait values is generally simpler than for *λ* or *R*_{0}, so we present most results below in terms of *w*.
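
The first-order relation *λ* ≈ 1 + (*R*_{0} − 1)*/T* can be checked numerically in a toy life cycle. The sketch below assumes the standard expressions for survivorship (the product of survival probabilities up to each age), lifetime reproductive success (the sum over ages of survivorship times fertility), and generation time (the mean age of reproduction of residents with *R*_{0} = 1). It compares the approximation with the growth rate obtained from the discrete Euler-Lotka equation. All vital rates are made up, and the mutant differs from the resident only slightly in fertility at age 2.

```python
def survivorship(p):
    """l_j: probability of surviving from age 1 to age j (l_1 = 1)."""
    l = [1.0]
    for p_a in p:
        l.append(l[-1] * p_a)
    return l


def lifetime_reproductive_success(f, p):
    return sum(l_j * f_j for l_j, f_j in zip(survivorship(p), f))


def generation_time(f, p):
    """Mean age of reproduction, sum_j j * l_j * f_j, for residents with R0 = 1."""
    pairs = zip(survivorship(p), f)
    return sum(j * l_j * f_j for j, (l_j, f_j) in enumerate(pairs, start=1))


def growth_rate(f, p, lo=0.5, hi=2.0, iters=200):
    """Solve sum_j lam**(-j) * l_j * f_j = 1 for lam by bisection."""
    l = survivorship(p)

    def excess(lam):
        return sum(l_j * f_j * lam ** -j
                   for j, (l_j, f_j) in enumerate(zip(l, f), start=1)) - 1.0

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid   # growth rate is larger than mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p = [0.5, 0.5]             # survival from age 1 to 2 and from age 2 to 3
f_res = [0.0, 1.0, 2.0]    # resident fertility, chosen so that R0 = 1
f_mut = [0.0, 1.1, 2.0]    # mutant with slightly higher fertility at age 2

T = generation_time(f_res, p)
R0_mut = lifetime_reproductive_success(f_mut, p)
lam_approx = 1.0 + (R0_mut - 1.0) / T
lam_exact = growth_rate(f_mut, p)
```

In this example the two values differ by less than 10^{-3}, as expected of a first-order approximation for a mutant close to the resident.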

## 4. Summary of main results

We use the model above to obtain three sets of main results. First, we obtain formulas for the total selection gradient of the genotype and underlying equations. Second, we obtain formulas and underlying equations for the evolutionary dynamics in gradient form for the phenotype and environment, which if considered on their own yield an underdetermined and so dynamically insufficient evolutionary system. Third, we obtain formulas and underlying equations for the evolutionary dynamics in gradient form for the geno-phenotype and the geno-envo-phenotype, which if considered on their own yield a determined and so dynamically sufficient system.

In section 4, we give an overview of these three sets of main results. We provide ancillary results and further interpretation and analysis in section 5. The derivations of the results in sections 4 and 5 are in the Appendices and involve repeated use of the chain rule due to the recurrence and feedbacks involved in the developmental constraint (1).

### 4.1. Total selection gradient of the genotype

The total selection gradient of the genotype is
This gradient depends on the block matrix of *total effects of a mutant’s genotype on her phenotype*, given by
This matrix is a mechanistic counterpart of Fisher’s (1918) additive effects of allelic substitution and of Wagner’s (1984) developmental matrix. This matrix also gives the sensitivity of a recurrence of the form (1) to perturbations in parameters **y** at any time *a*.

The block matrix of *total effects of a mutant’s phenotype on her phenotype* is
This matrix describes developmental feedback and gives the sensitivity of a recurrence of the form (1) to perturbations in state variables **x** at any time *a*.

The block matrix of *total immediate effects of the phenotype or genotype on a mutant’s phenotype* is
for ***ζ*** ∈ {**x**, **y**}. This matrix depends on direct niche construction (∂***ϵ***^{⊺}*/*∂***ζ***) and direct plasticity (∂**x**^{⊺}*/*∂***ϵ***).

The total selection gradient of the genotype also depends on the block matrix of *total effects of a mutant’s genotype on her environment*
This matrix quantifies total niche construction by the genotype and depends on direct niche construction by the phenotype and the genotype.

Additionally, the *direct selection gradient of the phenotype* is
where the direct selection gradient of the phenotype at age *a* ∈ {1, …, *N*_{a}} is
and the direct effect on fertility of the mutant phenotype at age *a* ∈ {1, …, *N*_{a}} is
The other direct selection gradients and direct effects on fertility or survival are defined analogously.

In turn, the block matrix of *direct effects of a mutant’s phenotype on her phenotype* is
where the matrix of *direct effects of a mutant’s phenotype at age a on her phenotype at age a* + 1 is

To build a minimal evo-devo dynamics model using our approach, one first needs expressions for fertility *f*_{a}, survival *p*_{a}, and development **g**_{a}, then one computes the partial derivatives (16), (18), and analogous partial derivatives, and finally these partial derivatives are fed to Eqs. (9)-(18) to compute the evo-devo dynamics using Eqs. (1)-(3).
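
The recipe above can be sketched end to end in a toy model. The sketch is not the closed-form route: it replaces the formulas for the total selection gradient with central finite differences taken through development. All model ingredients are hypothetical, namely three ages, one genotypic trait and one phenotype per age, development in which the phenotype grows by the genotypic value at each age, fertility equal to the phenotype minus a quadratic cost of the genotypic trait, constant survival, and a survival-weighted sum of fertilities as a stand-in for fitness. Density dependence and normalization by generation time are omitted.

```python
def develop(y, x1=1.0):
    """Toy developmental constraint: x_{a+1} = x_a + y_a, with x_1 fixed."""
    x = [x1]
    for y_a in y[:-1]:
        x.append(x[-1] + y_a)
    return x


def fitness(y, x, p=0.9):
    """Toy fitness stand-in: sum over ages of survivorship times fertility,
    with fertility f_a = x_a - y_a**2 and constant survival p."""
    return sum(p ** a * (x_a - y_a ** 2) for a, (x_a, y_a) in enumerate(zip(x, y)))


def perturbed(y, i, h):
    return y[:i] + [y[i] + h] + y[i + 1:]


def direct_gradient(y, h=1e-6):
    """Partial derivatives: perturb y while holding the developed phenotype fixed."""
    x = develop(y)
    return [(fitness(perturbed(y, i, h), x) - fitness(perturbed(y, i, -h), x)) / (2 * h)
            for i in range(len(y))]


def total_gradient(y, h=1e-6):
    """Total derivatives: perturb y and let the phenotype re-develop."""
    grad = []
    for i in range(len(y)):
        up, down = perturbed(y, i, h), perturbed(y, i, -h)
        grad.append((fitness(up, develop(up)) - fitness(down, develop(down))) / (2 * h))
    return grad


def evolve(y, H_y, iota=0.1, steps=500):
    """Iterate the canonical equation with the numerically computed total gradient."""
    n = len(y)
    for _ in range(steps):
        g = total_gradient(y)
        y = [y[i] + iota * sum(H_y[i][j] * g[j] for j in range(n)) for i in range(n)]
    return y


H_y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y0 = [0.0, 0.0, 0.0]
y_end = evolve(y0, H_y)
```

At `y0` the direct selection gradient of the genotype is zero while the total selection gradient is not, because genotypic traits at early ages affect fitness only through the phenotype they help construct. The genotype therefore keeps evolving until total genotypic selection vanishes.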

### 4.2. Evolutionary dynamics of the phenotype in gradient form

The evo-devo dynamics above allow for the modeling of the evolutionary dynamics of the phenotype under explicit development, but do not provide an interpretation of the evolutionary dynamics of the phenotype as the climbing of a fitness landscape. To obtain such an interpretation, we obtain equations in gradient form for the evolutionary dynamics of the phenotype.

As a first step, temporarily assume that the following four conditions hold: (I) development is non-social , and there is (II) no exogenous plastic response of the phenotype , (III) no total immediate selection on the genotype , and (IV) no niche-constructed effects of the phenotype on fitness . Then, in the limit as Δ*τ* → 0, the evolutionary dynamics of the phenotype satisfies
This is a mechanistic version of the Lande equation for the phenotype and is in gradient form, where the *mechanistic additive genetic covariance matrix of the phenotype* (H for heredity) is
which guarantees that developmental constraint (1) is met at all times given the formulas for the total effects of the genotype on phenotype given in section 4.1. The matrix **H**_{x} describes genetic covariation in the phenotype as the covariation of *mechanistic* breeding value **b**_{x}. Mechanistic breeding value is a mechanistic counterpart of breeding value, defined not in terms of regression coefficients but in terms of total derivatives and so has different properties explained in section 5.6. Now, Eq. (19) depends on the resident genotype but does not describe its evolutionary dynamics and is thus dynamically insufficient. Indeed, the only reason that the resident phenotype evolves under the assumptions of Eq. (19) is that the resident genotype evolves, but the evolution of the resident genotype is not described by such equation. Hence, the mechanistic Lande equation (19) is not sufficient to describe phenotypic evolution as the climbing of a fitness landscape.

### 4.3. Evolutionary dynamics of the geno-envo-phenotype in gradient form

To describe phenotypic evolution as the climbing of a fitness landscape, we obtain dynamically sufficient equations in gradient form for phenotypic evolution. Dropping assumptions (I-IV) above, one such equation describes the evolutionary dynamics of the geno-envo-phenotype as
which is an extended mechanistic Lande equation. This equation describes evolution of the geno-envo-phenotype as the climbing of a fitness landscape and is dynamically sufficient because it describes the evolution of all the variables involved, including the resident genotype . This equation depends on the *mechanistic additive socio-genetic cross-covariance matrix of the geno-envo-phenotype* (L for legacy)
We say that the matrix **L**_{m} describes the “socio-genetic” covariation of the geno-envo-phenotype, that is, the covariation between the stabilized mechanistic breeding value and mechanistic breeding value **b**_{m}. Whereas mechanistic breeding value considers the total effect of mutations on the phenotype within the individual, stabilized mechanistic breeding value includes the total effects of mutations after social development has stabilized in the population. Such stabilized effects are given by s**m***/*s**y**^{⊺}, which describes the stabilized effects of the genotype on the geno-envo-phenotype, that is, the total effects of the former on the latter after social development has stabilized in the population. Hence, **L**_{m} may be asymmetric and its main diagonal entries may be negative (unlike variances) due to social development. Moreover, the matrix **L**_{m} is always singular because d**m**^{⊺}*/*d**y** has fewer rows than columns, regardless of whether development is social. This means that there are always directions in geno-envo-phenotypic space in which there cannot be socio-genetic covariation, so there are always absolute socio-genetic constraints to adaptation of the geno-envo-phenotype (see Supplementary Information section 2.2 for a definition of absolute mutational or genetic constraints). In particular, this implies that evolution does not generally stop at peaks of the fitness landscape where the direct selection gradient is zero. The formulas for additional quantities involved in Eq. (21) are described in section 5.
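
The singularity argument can be illustrated with the smallest possible case. The sketch assumes non-social development and a covariance matrix of the sandwich form (total effects of the genotype) × **H**_{y} × (total effects of the genotype)^{⊺}, applied to a geno-phenotype made of one phenotype and one genotypic trait. The value `s` for the total effect of the genotype on the phenotype and the mutational variance are made-up numbers.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


s = 2.0            # hypothetical total effect of the genotype on the phenotype
J = [[s], [1.0]]   # total effects of y on the geno-phenotype (x; y): 2 rows, 1 column
H_y = [[0.3]]      # mutational variance of the single genotypic trait

# Sandwich form: a 2 x 2 matrix built from a 1 x 1 one, so its rank is at most 1.
H_z = matmul(matmul(J, H_y), transpose(J))
det = H_z[0][0] * H_z[1][1] - H_z[0][1] * H_z[1][0]

# A non-zero direct selection gradient that is orthogonal to the column of J.
grad = [[1.0], [-s]]
response = matmul(H_z, grad)
```

Here the determinant is zero and the response to a non-zero direct selection gradient vanishes. Evolution can thus stop away from a peak of the fitness landscape, at a point where selection has no component along the direction in which covariation exists.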

Although the extended mechanistic Lande equation (21) is useful to interpret evolution as the climbing of a fitness landscape, in practice it may often be more useful to compute the evo-devo dynamics using Eqs. (1)-(3) since the extended mechanistic Lande equation may still require computing the evo-devo dynamics, particularly when development is social.

## 5. The layers of the evo-devo process

In this section, we provide additional ancillary results, list the equations that underlie the results given in section 4, and provide additional interpretation and analysis. These results provide formulas for genetic covariation and other high-level quantities from low-level mechanistic processes. We term the “evo-devo process” the whole set of equations describing the evolutionary dynamics under explicit development. The evo-devo process can be arranged in a layered structure, where each layer is formed by components in layers below (Fig. 4). This layered structure helps see how complex interactions between variables involved in genetic covariation are formed by building blocks describing the direct interaction between variables. We thus present the evo-devo process starting from the lowest-level layer up to the highest. A reader interested in seeing an illustration of the method may jump to section 6.

### 5.1. Layer 1: elementary components

The components of the evo-devo process can be calculated from ten elementary components. These include five “core” elementary components: the fertility, survival probability, developmental map, and environmental map for all ages *a*, as well as the mutational covariance matrix **H**_{y} (Fig. 4, Layer 1). The remaining five elementary components of the evo-devo process are the mutation rate *μ* and the initial conditions for the various dynamical processes, namely, the evolutionarily initial resident genotype, the developmentally initial resident phenotype, the population density at carrying capacity of initial-age residents, and the socio-devo initial resident phenotype. Once the five core elementary components are available, either from purely theoretical models or using empirical data, the equations of all the remaining layers of the evo-devo process can be derived. The remaining elementary components are then needed to compute the solution of the evo-devo dynamics. The core elementary components other than **H**_{y} correspond to the elementary components of physiologically structured models of population dynamics (de Roos, 1997).

### 5.2. Layer 2: direct effects

We now write the equations for the next layer, that of the direct-effect matrices which constitute nearly elementary components of the evo-devo process. Direct-effect matrices measure the direct effect that a variable has on another variable. Direct-effect matrices capture various effects of age structure, including the declining forces of selection as age advances.

Direct-effect matrices include direct selection gradients, which have the following structure due to age-structure. The *direct selection gradient of the phenotype, genotype, or environment* is
for ***ζ*** ∈ {**x**, **y**, ***ϵ***}, with dimensions determined by the size of ***ζ***. These gradients measure direct directional selection on the phenotype, genotype, or environment, respectively. Analogously, Lande’s (1979) selection gradient measures direct directional selection under quantitative genetics assumptions. Also, the direct selection gradient of the environment measures the environmental sensitivity of selection (Chevin et al., 2010). The block entries of Layer 2, Eq. 1 can be computed by differentiating Eq. (5b). Note that Layer 2, Eq. 1 takes the derivative of fitness at each age, so from Eq. (5b) each block entry in Layer 2, Eq. 1 is weighted by the forces of selection at each age. Thus, the selection gradients in Layer 2, Eq. 1 capture the declining forces of selection in that increasingly rightward block entries have smaller magnitude if survival and fertility effects are of the same magnitude as age increases.

We use the above definitions to form the following aggregate direct selection gradients. The *direct selection gradient of the geno-phenotype* is
and the *direct selection gradient of the geno-envo-phenotype* is
Direct-effect matrices also include matrices that measure direct developmental bias. These matrices have specific, sparse structure due to *the arrow of developmental time*: changing a trait at a given age cannot have effects on the developmental past of the individual and only directly affects the developmental present or immediate future. Using matrix calculus notation (Appendix A), the block matrix of *direct effects of a mutant’s phenotype on her phenotype* is
which can be understood as measuring direct developmental bias from the phenotype. The equality (Layer 2, Eq. 2a) follows because the direct effects of a mutant’s phenotype on her phenotype are only non-zero at the next age (from the developmental constraint in Eq. 1) or when the phenotypes are differentiated with respect to themselves. The block entries of Layer 2, Eq. 2a can be computed by differentiating the developmental constraint (1). Analogously, the block matrix of *direct effects of a mutant’s genotype on her phenotype* is
which can be understood as measuring direct developmental bias from the genotype. Note that the main block diagonal is zero.

Direct-effect matrices also include matrices measuring direct plasticity and direct niche construction. Indeed, the block matrix of *direct effects of a mutant’s environment on her phenotype* is
which can be understood as measuring the direct plasticity of the phenotype (Noble et al., 2019). In turn, the block matrix of *direct effects of a mutant’s phenotype or genotype on her environment* is
for ***ζ*** ∈ {**x**, **y**}, which can be understood as measuring direct niche construction by the phenotype or genotype. The equality (Layer 2, Eq. 2d) follows from the environmental constraint in Eq. (2) since the environment faced by a mutant at a given age is directly affected by the mutant phenotype or genotype at the same age only (i.e., ∂***ϵ***_{j}^{⊺}/∂***ζ***_{a} = **0** for *a* ≠ *j*).

Direct-effect matrices also include a matrix describing direct mutual environmental dependence. This is measured by the block matrix of *direct effects of a mutant’s environment on itself*
The first equality follows from the environmental constraint (Eq. 2) and the second equality follows from our assumption that environmental traits are mutually independent, so ∂***ϵ***_{a}^{⊺}/∂***ϵ***_{a} = **I** for all *a* ∈ {1, …, *N*_{a}}. It is conceptually useful to write ∂***ϵ***^{⊺}/∂***ϵ*** rather than only **I**, and we do so throughout.

Additionally, direct-effect matrices include matrices describing direct social developmental bias, which includes the direct effects of extra-genetic inheritance and indirect genetic effects. The block matrix of *direct effects of social partners’ phenotype or genotype on a mutant’s phenotype* is
for , where the equality follows because the phenotype **x**_{1} at the initial age is constant by assumption. The matrix in Layer 2, Eq. 4 can be understood as measuring direct social developmental bias from either the phenotype or genotype, and mechanistically measures the direct effects of extra-genetic inheritance and indirect genetic effects. This matrix can be less sparse than direct-effect matrices above because the mutant’s phenotype can be affected by the phenotype or genotype of social partners of *any* age.

Direct-effect matrices also include matrices describing direct social niche construction. The block matrix of *direct effects of social partners’ phenotype or genotype on a mutant’s environment* is
for , which can be understood as measuring direct social niche construction by either the phenotype or genotype. This matrix does not contain any zero entries in general because the mutant’s environment at any age can be affected by the phenotype or genotype of social partners of any age.

We use the above definitions to form direct-effect matrices involving the geno-phenotype. The block matrix of *direct effects of a mutant’s geno-phenotype on her geno-phenotype* is
which measures direct developmental bias of the geno-phenotype, and where the equality follows because genotypic traits are developmentally independent by assumption. The block matrix of *direct effects of a mutant’s geno-phenotype on her environment* is
which measures direct niche construction by the geno-phenotype. The block matrix of *direct effects of social partners’ geno-phenotypes on a mutant’s environment* is
which measures direct social niche construction by partners’ geno-phenotypes. The block matrix of *direct effects of a mutant’s environment on her geno-phenotype* is
which measures the direct plasticity of the geno-phenotype, and where the equality follows because genotypic traits are developmentally independent.

We will see that the evolutionary dynamics of the environment depends on a matrix measuring “inclusive” direct niche construction. This matrix is the transpose of the matrix of *direct social effects of a focal individual’s geno-phenotype on hers and her partners’ environment*
where we denote by the environment a resident experiences when she develops in the context of mutants (a donor perspective for the mutant). Thus, this matrix can be interpreted as inclusive direct niche construction by the geno-phenotype. Note that the second term on the right-hand side of Layer 2, Eq. 10 is the direct effects of social partners’ geno-phenotypes on a focal mutant (a recipient perspective for the mutant). Hence, inclusive direct niche construction by the geno-phenotype as described by Layer 2, Eq. 10 can be equivalently interpreted either from a donor or a recipient perspective.

### 5.3. Layer 3: total immediate effects

We now proceed to write the equations of the next layer of the evo-devo process, that of total immediate effects. Total-immediate-effect matrices measure the total effects that a variable has on another variable only at a given age, thus without considering the downstream effects over development. With the developmental and environmental constraints assumed, if there are no environmental traits, total immediate effect matrices (*δ**ζ*^{⊺}*/δξ*) reduce to direct effect matrices (∂*ζ*^{⊺}*/*∂*ξ*).

Total-immediate-effect matrices include total immediate selection gradients, which capture some of the effects of niche construction. The *total immediate selection gradient of the phenotype, genotype, or geno-phenotype* is
for ***ζ*** ∈ {**x**, **y**, **z**}. Here, the total immediate selection gradient of ***ζ*** depends on direct directional selection on ***ζ***, direct niche construction by ***ζ***, and direct environmental sensitivity of selection. Thus, total immediate selection gradients measure total immediate directional selection, which is directional selection in the fitness landscape modified by the interaction of niche construction and environmental sensitivity of selection. In a standard quantitative genetics framework, the total immediate selection gradients correspond to Lande’s (1979) selection gradient if the environmental traits are not explicitly included in the analysis.

Total immediate selection on the environment equals direct selection on the environment because we assume environmental traits are mutually independent. The *total immediate selection gradient of the environment* is
Given our assumption that environmental traits are mutually independent, the matrix of direct effects of the environment on itself is the identity matrix. Thus, the total immediate selection gradient of the environment equals the selection gradient of the environment.

Total-immediate-effect matrices also include matrices describing total immediate developmental bias, which capture additional effects of niche construction. The block matrix of *total immediate effects of the phenotype, genotype, social partner’s phenotype, or social partner’s genotype on a mutant’s phenotype* is
where ***ζ*** stands for any of these four vectors. Here, the total immediate effects of ***ζ*** on the phenotype depend on the direct developmental bias from ***ζ***, direct niche construction by ***ζ***, and the direct plasticity of the phenotype. Consequently, total immediate effects on the phenotype can be interpreted as measuring total immediate developmental bias, which measures developmental bias in the developmental process modified by the interaction of niche construction and plasticity.

Moreover, total-immediate-effect matrices include matrices describing total immediate plasticity of the phenotype, which equals direct plasticity of the phenotype because environmental traits are mutually independent by assumption. The block matrix of *total immediate effects of a mutant’s environment on her phenotype* is
Given our assumption that environmental traits are mutually independent, the matrix of direct effects of the environment on itself is the identity matrix. Thus, the total immediate plasticity of the phenotype equals the direct plasticity of the phenotype.

We use the above definitions to form a matrix quantifying the total immediate developmental bias of the geno-phenotype. This is the block matrix of *total immediate effects of a mutant’s geno-phenotype on her geno-phenotype*
Consequently, the total immediate developmental bias of the geno-phenotype depends on the direct developmental bias of the geno-phenotype, direct niche construction by the geno-phenotype, and direct plasticity of the geno-phenotype.

### 5.4. Layer 4: total effects

We now move to write the equations for the next layer of the evo-devo process, that of total-effect matrices. Total-effect matrices measure the total effects of a variable on another one over the individual’s life, thus considering the downstream effects over development, but before the effects of social development have stabilized in the population. More generally, total-effect matrices include matrices that give the sensitivity to perturbations of the solution of a recurrence of the form (1).

The total effects of the phenotype on itself describe the *developmental feedback* of the phenotype. This is given by the block matrix of *total effects of a mutant’s phenotype on her phenotype*
which is always invertible (Appendix B, Eq. B15) and where the last equality follows by the geometric series of matrices. This matrix can be interpreted as a lifetime collection of total immediate effects of the phenotype on itself. Also, the developmental feedback of the phenotype can be seen as describing the total developmental bias of the phenotype. More generally, Layer 4, Eq. 1 gives the sensitivity of the solution **x** of the recurrence (1) to perturbations in the solution at other times (ages): in particular, d*x*_{k j}/d*x*_{ia} gives the sensitivity of the solution *x*_{k j} of the *k*-th variable at time *j* to perturbations in the solution *x*_{ia} of the *i*-th variable at time *a*. Developmental feedback may cause major phenotypic effects at subsequent ages as its block entries involve matrix products. Indeed, the total effects of the phenotype at age *a* on the phenotype at age *j* are given by
Since matrix multiplication is not commutative, the ↷ denotes right multiplication. By depending on the total immediate developmental bias from the phenotype, the developmental feedback of the phenotype depends on direct developmental bias from the phenotype, direct niche construction by the phenotype, and direct plasticity of the phenotype (Layer 3, Eq. 3). Layer 4, Eq. 1 has the same form as an equation for total effects used in path analysis (Greene 1977, p. 380; see also Morrissey 2014, Eq. 2) if is interpreted as a matrix listing the path coefficients of “direct” effects of the phenotype on itself (direct, without explicitly considering environmental traits).
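The interpretation of developmental feedback as a product of effects across ages, and as a geometric series of matrices, can be illustrated with a minimal sketch. It assumes a single phenotype, no environmental traits (so total immediate effects reduce to direct effects), four ages, and a hypothetical developmental map `g`. The map, the parameter values, and all function names are illustrative assumptions rather than part of the framework. The matrix `N` holds only the effects of each age on the next, and is therefore nilpotent, so the series is finite.

```python
def g(x, y):
    """Hypothetical developmental map: phenotype at the next age."""
    return x + y * x * (1.0 - x)

def dg_dx(x, y):
    """Direct effect of the current phenotype on the next-age phenotype."""
    return 1.0 + y * (1.0 - 2.0 * x)

def develop(x1, ys, bump_age=None, bump=0.0):
    """Iterate x_{a+1} = g(x_a, y_a), optionally perturbing the phenotype at one age."""
    xs = [x1 + (bump if bump_age == 0 else 0.0)]
    for a, y in enumerate(ys):
        nxt = g(xs[a], y)
        if bump_age == a + 1:
            nxt += bump
        xs.append(nxt)
    return xs

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*B)] for row in A]

x1, ys = 0.2, [0.5, 0.3, 0.4]   # illustrative values; four ages in total
xs = develop(x1, ys)
n = len(xs)

# N[a][a+1] is the direct effect of x at age a on x at age a+1; other entries are zero.
N = [[0.0] * n for _ in range(n)]
for a in range(n - 1):
    N[a][a + 1] = dg_dx(xs[a], ys[a])

# Total effects as the finite geometric series I + N + N^2 + N^3.
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
total = [row[:] for row in identity]
power = [row[:] for row in identity]
for _ in range(n - 1):
    power = matmul(power, N)
    total = [[t + p for t, p in zip(tr, pr)] for tr, pr in zip(total, power)]

# The effect of the first age on the last is the product of the intermediate effects.
product = N[0][1] * N[1][2] * N[2][3]

# Independent check by central finite differences on the recurrence itself.
h = 1e-6
fd = (develop(x1, ys, 0, h)[-1] - develop(x1, ys, 0, -h)[-1]) / (2 * h)
print(total[0][3], product, fd)
```

Rows index the age at which the phenotype is perturbed and columns the age at which the effect is measured, so the entries below the main diagonal are zero, reflecting the arrow of developmental time.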

The total effects of the genotype on the phenotype are a mechanistic analogue of Fisher’s additive effects of allelic substitution and of Wagner’s developmental matrix. The block matrix of *total effects of a mutant’s genotype on her phenotype* is given by
which is singular because the developmentally initial phenotype is not affected by the genotype (by our assumption that the initial phenotype is constant) and the developmentally final genotypic traits do not affect the phenotype (by our assumption that individuals do not survive after the final age; so has rows and columns that are zero; Appendix C, Eq. C16). From Layer 4, Eq. 3, this matrix can be interpreted as involving a developmentally immediate pulse caused by a change in genotypic traits followed by the triggered developmental feedback of the phenotype. The matrix of total effects of the genotype on the phenotype measures total developmental bias of the phenotype from the genotype. By giving the total effects of a perturbation in the genotype on the phenotype, the entries of this matrix are a mechanistic analogue of Fisher’s additive effect of allelic substitution, which he defined as regression coefficients (his α; see Eq. I of Fisher 1918 and p. 72 of Lynch and Walsh 1998). Also, this matrix is a mechanistic analogue of Wagner’s (1984, 1989) developmental matrix (his **B**) (see also Martin 2014), Rice’s (2002) rank-1 **D** tensor, and Morrissey’s (2015) total effect matrix (his **Φ**, but not Morrissey’s (2014) **Φ**, which is a regression-based form of d**x**^{⊺}/d**x**) (interpreting these authors’ partial derivatives as total derivatives, although using derivatives rather than regression coefficients violates the standard partition of phenotypic variance into genetic and “environmental” variances, as explained below). More generally, interpreting **y** as parameters affecting the recurrence (1) over **x**, Layer 4, Eq. 3 gives the sensitivity of the solution **x** to perturbation in the parameters at other times (ages): in particular, d*x*_{k j}/d*y*_{ia} gives the sensitivity of the solution *x*_{k j} of the *k*-th variable at time *j* to perturbations in the *i*-th parameter *y*_{ia} at time *a*. 
The definition of total effects of the genotype on the phenotype in terms of derivatives (Layer 4, Eq. 3) differs from Fisher’s in terms of regression coefficients both in that it reveals its structure and so it can be used for evo-devo dynamically sufficient analysis, and in that regression coefficients of phenotype to genotype are uncorrelated with residuals whereas the derivative analogues need not be.
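The reading of these total effects as an immediate pulse followed by developmental feedback, and as the sensitivity of a recurrence to its parameters, can be sketched numerically. The example assumes a single phenotype, a single genotypic trait per age, no environmental traits, and a hypothetical developmental map `g`. The map, the values, and the function names are illustrative assumptions only. Derivative-based total effects are compared against central finite differences of the recurrence.

```python
def g(x, y):
    """Hypothetical developmental map: phenotype at the next age."""
    return x + y * x * (1.0 - x)

def dg_dx(x, y):
    """Direct effect of the current phenotype on the next-age phenotype."""
    return 1.0 + y * (1.0 - 2.0 * x)

def dg_dy(x, y):
    """Direct effect of the current genotypic trait on the next-age phenotype."""
    return x * (1.0 - x)

def develop(x1, ys):
    """Phenotype at each age; the final-age genotypic trait is never used."""
    xs = [x1]
    for a in range(len(ys) - 1):
        xs.append(g(xs[a], ys[a]))
    return xs

def total_effects_of_genotype(x1, ys):
    """Entry [a][j] is the total effect of y at age a on x at age j (0-based ages)."""
    xs = develop(x1, ys)
    n = len(ys)
    T = [[0.0] * n for _ in range(n)]
    for a in range(n - 1):
        effect = dg_dy(xs[a], ys[a])        # immediate pulse on the next-age phenotype
        for j in range(a + 1, n):
            T[a][j] = effect
            effect *= dg_dx(xs[j], ys[j])   # carried forward by developmental feedback
    return T

x1, ys = 0.2, [0.5, 0.3, 0.4, 0.1]          # illustrative values; four ages
T = total_effects_of_genotype(x1, ys)

# Independent check by central finite differences on each genotypic trait.
h = 1e-6
max_err = 0.0
for a in range(len(ys)):
    up = develop(x1, [y + (h if k == a else 0.0) for k, y in enumerate(ys)])
    dn = develop(x1, [y - (h if k == a else 0.0) for k, y in enumerate(ys)])
    for j in range(len(ys)):
        max_err = max(max_err, abs((up[j] - dn[j]) / (2 * h) - T[a][j]))
print(max_err)
```

In this sketch the first column of `T` is zero because the initial phenotype is fixed, and the last row is zero because the final-age genotypic trait has no later phenotype to affect. This mirrors the zero rows and columns that make the matrix singular.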

The total effects of the environment on the phenotype measure the total plasticity of the phenotype, considering downstream effects over development. This is given by the block matrix of *total effects of a mutant’s environment on her phenotype*
Thus, the total plasticity of the phenotype can be interpreted as a developmentally immediate pulse of plastic change in the phenotype followed by the triggered developmental feedback of the phenotype.

The total effects of social partners’ genotype or phenotype on the phenotype measure the total *social* developmental bias of the phenotype. The block matrix of *total effects of social partners’ phenotype or genotype on a mutant’s phenotype* is
for . This matrix can be interpreted as measuring total social developmental bias of the phenotype from phenotype or genotype, as well as the total effects on the phenotype of extra-genetic inheritance, and the total indirect genetic effects. In particular, the matrix of total social developmental bias of the phenotype from phenotype, , is a mechanistic version of the matrix of interaction coefficients in the indirect genetic effects literature (i.e., Ψ in Eq. 17 of Moore et al. 1997, which is defined as a matrix of regression coefficients). From Layer 4, Eq. 5, the total social developmental bias of the phenotype can be interpreted as a developmentally immediate pulse of phenotype change caused by a change in social partners’ traits followed by the triggered developmental feedback of the mutant’s phenotype.

The total effects on the genotype are simple since genotypic traits are developmentally independent by assumption. The block matrix of *total effects of a mutant’s genotype on itself* is
and the block matrix of *total effects of a vector* *on a mutant’s genotype* is
(Appendix C, Eq. C13).

We can use some of the previous total-effect matrices to construct the following total-effect matrices involving the geno-phenotype. The block matrix of *total effects of a mutant’s phenotype on her geno-phenotype* is
measuring total developmental bias of the geno-phenotype from the phenotype. The block matrix of *total effects of a mutant’s genotype on her geno-phenotype* is
measuring total developmental bias of the geno-phenotype from the genotype. This matrix is singular because any matrix with fewer rows than columns is singular (Horn and Johnson, 2013, p. 14). This singularity will be important when we consider mechanistic additive genetic covariances (Layer 6). Now, the block matrix of *total effects of a mutant’s geno-phenotype on her geno-phenotype* is
which can be interpreted as measuring the developmental feedback of the geno-phenotype (Appendix E, Eq. E4). Since is square and block lower triangular, and since is invertible (Appendix B, Eq. B15), we have that is invertible.

Moreover, the total effects of the phenotype and genotype on the environment quantify total niche construction. Total niche construction by the phenotype is quantified by the block matrix of *total effects of a mutant’s phenotype on her environment*
which can be interpreted as showing that developmental feedback of the phenotype occurs first and then direct niche-constructing effects by the phenotype follow. Similarly, total niche construction by the genotype is quantified by the block matrix of *total effects of a mutant’s genotype on her environment*
which depends on direct niche construction by the genotype and on total developmental bias of the phenotype from the genotype followed by niche construction by the phenotype. The analogous relationship holds for total niche construction by the geno-phenotype, quantified by the block matrix of *total effects of a mutant’s geno-phenotype on her environment*
which depends on the developmental feedback of the geno-phenotype and direct niche construction by the geno-phenotype.

The total effects of the environment on itself quantify environmental feedback. The block matrix of *total effects of a mutant’s environment on her environment* is
which is always invertible (Appendix D, Eq. D5). This matrix can be interpreted as measuring *environmental feedback*, which depends on direct mutual environmental dependence, total plasticity of the phenotype, and direct niche construction by the phenotype.

We can also use some of the previous total-effect matrices to construct the following total-effect matrices involving the geno-envo-phenotype. The block matrix of *total effects of a mutant’s phenotype on her geno-envo-phenotype* is
measuring total developmental bias of the geno-envo-phenotype from the phenotype. The block matrix of *total effects of a mutant’s genotype on her geno-envo-phenotype* is
measuring total developmental bias of the geno-envo-phenotype from the genotype, and which is singular because it has fewer rows than columns.

The block matrix of *total effects of a mutant’s environment on her geno-envo-phenotype* is
measuring total plasticity of the geno-envo-phenotype. The block matrix of *total effects of a mutant’s geno-phenotype on her geno-envo-phenotype* is
measuring total developmental bias of the geno-envo-phenotype from the geno-phenotype. The block matrix of *total effects of a mutant’s geno-envo-phenotype on her geno-envo-phenotype* is
measuring developmental feedback of the geno-envo-phenotype, and which we show is invertible (Appendix F). Obtaining a compact form for analogous to Layer 4, Eq. 9 seemingly needs which appears to yield relatively complex expressions so we leave this for future analysis.

We will see that the evolutionary dynamics of the phenotype depends on a matrix measuring “inclusive” total developmental bias of the phenotype. This matrix is the transpose of the matrix of *total social effects of a focal individual’s genotype or phenotype on hers and her partners’ phenotypes*
for ***ζ*** ∈ {**x**, **y**}, where we denote by the phenotype that a resident develops in the context of mutants (a donor perspective for the mutant). Thus, this matrix can be interpreted as measuring inclusive total developmental bias of the phenotype. Note that the second term on the right-hand side of Layer 4, Eq. 19 is the total effects of social partners’ phenotype or genotype on a focal mutant (a recipient perspective for the mutant). Thus, the inclusive total developmental bias of the phenotype as described by Layer 4, Eq. 19 can be equivalently interpreted either from a donor or a recipient perspective.

Having written expressions for the above total-effect matrices, we can now write the total selection gradients, which measure total directional selection, that is, directional selection considering all the pathways in which a trait can affect fitness in Fig. 1 (see also Morrissey 2014). This contrasts with Lande’s (1979) selection gradient, which corresponds to the direct selection gradient measuring the direct effect of a variable on fitness in Fig. 1. In Appendix B-Appendix F, we show that the total selection gradient of vector ***ζ*** ∈ {**x**, **y**, **z**, ***ϵ***, **m**} is d*w*/d***ζ*** = (d**m**^{⊺}/d***ζ***)(∂*w*/∂**m**) (Layer 4, Eq. 20), which has the form of the chain rule in matrix calculus notation. Hence, the total selection gradient of ***ζ*** depends on the total effects of ***ζ*** on the geno-envo-phenotype and direct directional selection on the geno-envo-phenotype. Consequently, the total directional selection on ***ζ*** is the directional selection on the geno-envo-phenotype transformed by the total effects of ***ζ*** on the geno-envo-phenotype considering the downstream developmental effects. Layer 4, Eq. 20 has the same form as previous expressions by Caswell (e.g., Caswell, 1982, Eq. 4 and Caswell, 2001, Eq. 9.38), except that it is in terms of traits rather than vital rates (i.e., Caswell’s equations have the entries of the Leslie matrix in Eq. S7 in the place of **m**). Layer 4, Eq. 20 also recovers the form of Morrissey’s (2014) extended selection gradient. Total selection gradients take the following particular forms.
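The chain-rule structure of total selection gradients can be illustrated for the genotype with a minimal sketch. It assumes a single phenotype and a single genotypic trait per age, no environmental traits (so the geno-envo-phenotype reduces to the geno-phenotype), a hypothetical developmental map `g`, and a hypothetical fitness function. All names, functional forms, and values are illustrative assumptions. The gradient is assembled from direct selection on the genotype plus direct selection on the phenotype weighted by the total effects of the genotype on the phenotype, and is then compared with finite differences of fitness.

```python
def g(x, y):
    """Hypothetical developmental map: phenotype at the next age."""
    return x + y * x * (1.0 - x)

def develop(x1, ys):
    """Phenotype at each age; the final-age genotypic trait is never used."""
    xs = [x1]
    for a in range(len(ys) - 1):
        xs.append(g(xs[a], ys[a]))
    return xs

def fitness(x1, ys):
    """Hypothetical fitness: summed phenotypic benefit minus a quadratic genotypic cost."""
    xs = develop(x1, ys)
    return sum(x - 0.5 * y * y for x, y in zip(xs, ys))

def total_selection_gradient_of_genotype(x1, ys):
    """Chain rule: direct selection on y plus selection on x routed through development."""
    xs = develop(x1, ys)
    n = len(ys)
    dw_dx = [1.0] * n                    # direct selection gradient of the phenotype
    dw_dy = [-y for y in ys]             # direct selection gradient of the genotype
    grad = []
    for a in range(n):
        total = dw_dy[a]                 # genotypic traits affect themselves only directly
        effect = xs[a] * (1.0 - xs[a])   # immediate pulse: direct effect of y_a on x_{a+1}
        for j in range(a + 1, n):
            total += effect * dw_dx[j]
            effect *= 1.0 + ys[j] * (1.0 - 2.0 * xs[j])   # developmental feedback
        grad.append(total)
    return grad

x1, ys = 0.2, [0.5, 0.3, 0.4, 0.1]       # illustrative values; four ages
grad = total_selection_gradient_of_genotype(x1, ys)

# Independent check by central finite differences of fitness.
h = 1e-6
max_err = 0.0
for a in range(len(ys)):
    up = fitness(x1, [y + (h if k == a else 0.0) for k, y in enumerate(ys)])
    dn = fitness(x1, [y - (h if k == a else 0.0) for k, y in enumerate(ys)])
    max_err = max(max_err, abs((up - dn) / (2 * h) - grad[a]))
print(grad, max_err)
```

At the final age the total and direct selection gradients of the genotype coincide in this sketch, because the final-age genotypic trait has no downstream developmental effects.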

The total selection gradient of the phenotype is
This gradient depends on direct directional selection on the phenotype and direct directional selection on the environment (Layer 2, Eq. 1). It also depends on developmental feedback of the phenotype (Layer 4, Eq. 1) and total niche construction by the phenotype, which also depends on developmental feedback of the phenotype (Layer 4, Eq. 10). Consequently, the total selection gradient of the phenotype can be interpreted as measuring total (directional) phenotypic selection in the fitness landscape modified by developmental feedback of the phenotype and by the interaction of total niche construction and environmental sensitivity of selection. Additionally, the total selection gradient of the phenotype at an admissible stable evolutionary equilibrium equals generation time *T* times the vector of costate variables, which are key variables used for solving optimal control problems (Eq. K3; see also Appendix K and Metz et al. 2016). In optimal control problems with continuous time, costate variables must be found by solving differential equations with boundary conditions at the terminal time while simultaneously solving differential equations describing state dynamics with boundary conditions at the initial time (Bryson, Jr. and Ho, 1975; Sydsæter et al., 2008). This poses a two-point boundary value problem, which is typically challenging to solve. Our discrete-age approach allows us to obtain closed-form formulas for the total selection gradient of the phenotype (Layer 4, Eq. 21), thus providing closed-form formulas for costate variables and avoiding having to solve a two-point boundary value problem.

The total selection gradient of the genotype is
This gradient not only depends on direct directional selection on the phenotype and the environment, but also on direct directional selection on the genotype (Layer 2, Eq. 1). It also depends on the mechanistic analogue of Fisher’s (1918) additive effects of allelic substitution or of Wagner’s (1984, 1989) developmental matrix (Layer 4, Eq. 3) and on total niche construction by the genotype, which also depends on the developmental matrix (Layer 4, Eq. 11). Consequently, the total selection gradient of the genotype can be interpreted as measuring total (directional) genotypic selection in a fitness landscape modified by the interaction of total developmental bias of the phenotype from the genotype and directional selection on the phenotype and by the interaction of total niche construction by the genotype and environmental sensitivity of selection. In a standard quantitative genetics framework, the total selection gradient of the genotype would correspond to Lande’s (1979) selection gradient of the genotype if phenotypic and environmental traits were not explicitly included in the analysis. The fifth line of Layer 4, Eq. 22 has the form of previous expressions for the total selection gradient of controls in continuous age in terms of partial derivatives of the Hamiltonian involving costate variables for which closed-form formulas have been lacking (e.g., Day and Taylor 1997, Eq. 4, Day and Taylor 2000, Eq. 6, and Avila et al. 2021, Eq. 23; see also our Eq. K4). Our discrete-age approach allowed us to obtain closed-form formulas for the total selection gradient of states (Layer 4, Eq. 21), thus providing closed-form formulas for the total selection gradient of controls.

To derive equations describing the evolutionary dynamics of the geno-envo-phenotype, we make use of the total selection gradient of the environment, although such gradient is not necessary to obtain equations describing the evolutionary dynamics of the geno-phenotype. The total selection gradient of the environment is
This gradient depends on total plasticity of the phenotype and on environmental feedback, which in turn depends on total plasticity of the phenotype and niche construction by the phenotype (Layer 4, Eq. 13). Consequently, the total selection gradient of the environment can be understood as measuring total (directional) environmental selection in a fitness landscape modified by environmental feedback and by the interaction of total plasticity of the phenotype and direct directional selection on the phenotype.

We can combine the expressions for the total selection gradients above to obtain the total selection gradient of the geno-phenotype and the geno-envo-phenotype. The total selection gradient of the geno-phenotype is
Thus, the total selection gradient of the geno-phenotype can be interpreted as measuring total (directional) geno-phenotypic selection in a fitness landscape modified by developmental feedback of the geno-phenotype and by the interaction of total niche construction by the geno-phenotype and environmental sensitivity of selection. In turn, the total selection gradient of the geno-envo-phenotype is
which can be interpreted as measuring total (directional) geno-envo-phenotypic selection in a fitness landscape modified by developmental feedback of the geno-envo-phenotype.

### 5.5. Layer 5: stabilized effects

We now move on to write the equations for the next layer of the evo-devo process, that of (socio-devo) stabilized-effect matrices. Stabilized-effect matrices measure the total effects of a variable on another one considering downstream developmental effects, after the effects of social development have stabilized in the population. Stabilized-effect matrices arise in the derivation of the evolutionary dynamics of the phenotype and environment as a result of social development. If development is not social (i.e., ), then all stabilized-effect matrices reduce to the corresponding total-effect matrices , except one that reduces to the identity matrix.

The stabilized effects of social partners’ phenotypes on a focal individual’s phenotype measure *social feedback*. This is given by the transpose of the matrix of *stabilized effects of social partners’ phenotypes on a focal individual’s phenotype*
where the last equality follows by the geometric series of matrices. The matrix is invertible by our assumption that all the eigenvalues of have absolute value strictly less than one, to guarantee that the resident is socio-devo stable. The matrix can be interpreted as the total effects of social partners’ phenotypes on a focal individual’s phenotype after socio-devo stabilization (Eq. S1); or vice versa, of a focal individual’s phenotype on social partners’ phenotypes. Thus, the matrix describes social feedback arising from social development. This matrix corresponds to an analogous matrix found in the indirect genetic effects literature (Moore et al., 1997, Eq. 19b and subsequent text). If development is not social from the phenotype (i.e., ), then the matrix is the identity matrix. This is the only stabilized-effect matrix that does not reduce to the corresponding total-effect matrix when development is not social.
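The geometric-series step and the role of the eigenvalue assumption can be checked numerically. The following is a minimal sketch only: the matrix `A` is a hypothetical stand-in for the direct effects of social partners' phenotypes on a focal individual's phenotype, chosen here so that all its eigenvalues have absolute value below one; it is not derived from the framework's equations.

```python
import numpy as np

# Hypothetical direct social-effect matrix (an assumption for illustration);
# entries are small enough that every eigenvalue has absolute value < 1.
A = np.array([[0.2, 0.1, 0.0],
              [0.0, 0.3, 0.2],
              [0.1, 0.0, 0.1]])

spectral_radius = max(abs(np.linalg.eigvals(A)))

# Closed form (I - A)^{-1}; the inverse exists because the spectral radius is < 1.
closed_form = np.linalg.inv(np.eye(3) - A)

# Partial sums of the geometric (Neumann) series I + A + A^2 + ...
series = np.zeros((3, 3))
term = np.eye(3)
for _ in range(200):
    series += term
    term = term @ A

# Largest entrywise discrepancy between the truncated series and the inverse.
max_abs_error = np.max(np.abs(series - closed_form))
```

With a spectral radius below one the partial sums converge to the inverse; if the radius were one or larger, the series would not converge, which is the numerical counterpart of the resident failing to be socio-devo stable.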

The stabilized effects of a focal individual’s phenotype or genotype on her phenotype measure stabilized developmental bias. We define the transpose of the matrix of *stabilized effects of a focal individual’s phenotype or genotype on her phenotype* as
for ***ζ*** ∈ {**x**, **y**}. This matrix can be interpreted as measuring stabilized developmental bias of the phenotype from ***ζ***, where a focal individual’s genotype or phenotype first affects the development of her own and social partners’ phenotype which then feeds back to affect the individual’s phenotype. Stabilized developmental bias is “inclusive” in that it includes both the effects of the focal individual on herself and on social partners. If development is not social (i.e., ), then a stabilized developmental bias matrix reduces to the corresponding total developmental bias matrix.

The stabilized effects of the environment on the phenotype measure stabilized plasticity. The transpose of the matrix of *stabilized effects of a focal individual’s environment on the phenotype* is
This matrix can be interpreted as measuring stabilized plasticity of the phenotype, where the environment first causes total plasticity in a focal individual and then the focal individual causes stabilized social effects on social partners. Stabilized plasticity does not depend on the inclusive effects of the environment. If development is not social (i.e., ), then stabilized plasticity reduces to total plasticity.

The stabilized effects on the genotype are simple since genotypic traits are developmentally independent by assumption. The transpose of the matrix of *stabilized effects of a focal individual’s phenotype or environment on the genotype* is
for ***ζ*** ∈ {**x**, ***ϵ***}. The transpose of the matrix of *stabilized effects of a focal individual’s genotype on the genotype* is

We can use some of the previous stabilized-effect matrices to construct the following stabilized-effect matrices involving the geno-phenotype. The transpose of the matrix of *stabilized effects of a focal individual’s genotype on the geno-phenotype* is
measuring stabilized developmental bias of the geno-phenotype from the genotype. The transpose of the matrix of *stabilized effects of a focal individual’s environment on the geno-phenotype* is
measuring stabilized plasticity of the geno-phenotype. The transpose of the matrix of *stabilized effects of a focal individual’s geno-phenotype on the geno-phenotype* is
measuring stabilized developmental feedback of the geno-phenotype.

The stabilized effects of the phenotype or genotype on the environment measure stabilized niche construction. Although the matrix
appears in some of the matrices we construct, it is irrelevant as it disappears in the matrix products we encounter. The following matrix does not disappear. The transpose of the matrix of *stabilized effects of a focal individual’s genotype on the environment* is
which is formed by stabilized developmental bias of the geno-phenotype from genotype followed by inclusive direct niche construction by the geno-phenotype. This matrix can be interpreted as measuring stabilized niche construction by the genotype. If development is not social (i.e., ), then stabilized niche construction by the genotype reduces to total niche construction by the genotype (see Layer 4, Eq. 11 and Layer 2, Eq. 10).

The stabilized effects of the environment on itself measure stabilized environmental feedback. The transpose of the matrix of *stabilized effects of a focal individual’s environment on the environment* is
which depends on stabilized plasticity of the geno-phenotype, inclusive direct niche construction by the geno-phenotype, and direct mutual environmental dependence.

We can also use some of the previous stabilized-effect matrices to construct the following stabilized-effect matrices comprising the geno-envo-phenotype. The transpose of the matrix of *stabilized effects of a focal individual’s genotype on the geno-envo-phenotype* is
measuring stabilized developmental bias of the geno-envo-phenotype from the genotype. The transpose of the matrix of *stabilized effects of a focal individual’s environment on the geno-envo-phenotype* is
measuring stabilized plasticity of the geno-envo-phenotype. Finally, the transpose of the matrix of *stabilized effects of a focal individual’s geno-envo-phenotype on the geno-envo-phenotype* is
measuring stabilized developmental feedback of the geno-envo-phenotype.

### 5.6 Layer 6: genetic covariation

We now move to the next layer of the evo-devo process, that of genetic covariation. To present this layer, we first define mechanistic breeding value under our adaptive dynamics assumptions, which allows us to define mechanistic additive genetic covariance matrices under our assumptions. Then, we define (socio-devo) stabilized mechanistic breeding value, which we use to define mechanistic additive socio-genetic cross-covariance matrices. The notions of stabilized mechanistic breeding values and mechanistic socio-genetic cross-covariance generalize the corresponding notions of mechanistic breeding value and mechanistic additive genetic covariance to consider the effects of social development.

We follow the standard definition of breeding value to define its mechanistic analogue under our assumptions. The breeding value of a trait is defined under quantitative genetics assumptions as the best linear estimate of the trait from gene content (Lynch and Walsh, 1998; Walsh and Lynch, 2018). Specifically, under quantitative genetics assumptions, the *i*-th trait value *x*_{i} is written as *x*_{i} = *x̄*_{i} + ∑_{j} α_{ij}(*y*_{j} − *ȳ*_{j}) + *e*_{i}, where the overbar denotes population average, *y*_{j} is the *j*-th predictor (gene content in the *j*-th locus), α_{ij} is the partial least-square regression coefficient of *x*_{i} − *x̄*_{i} on *y*_{j} − *ȳ*_{j}, and *e*_{i} is the residual error; the breeding value of *x*_{i} is *a*_{i} = *x̄*_{i} + ∑_{j} α_{ij}(*y*_{j} − *ȳ*_{j}). Accordingly, we define the mechanistic breeding value **b**_{ζ} of a vector ***ζ*** as its first-order estimate with respect to genotypic traits **y** around the resident genotypic traits:

The key difference between this definition and that of breeding value is that, rather than using regression coefficients, this definition uses the total effects of the genotype on ***ζ***, d***ζ***^{⊺}/d**y**, which are a mechanistic analogue of Fisher’s additive effect of allelic substitution (his α; see Eq. I of Fisher 1918 and p. 72 of Lynch and Walsh 1998). As previously stated, this matrix also corresponds to Wagner’s (1984, 1989) developmental matrix, particularly when ***ζ*** = **x** (his **B**; see Eq. 1 of Wagner 1989).

That there is a material difference between breeding value and its mechanistic counterpart is made evident with heritability. Because breeding value under quantitative genetics uses linear regression via least squares, breeding value *a*_{i} is guaranteed to be uncorrelated with the residual error *e*_{i}. This guarantees that heritability is between zero and one. Indeed, the (narrow sense) heritability of trait *x*_{i} is defined as *h*^{2} = var[*a*_{i}]/var[*x*_{i}], where using *x*_{i} = *a*_{i} + *e*_{i} we have var[*x*_{i}] = var[*a*_{i}] + var[*e*_{i}] + 2cov[*a*_{i}, *e*_{i}]. The latter covariance is zero due to least squares, and so *h*^{2} ∈ [0, 1]. In contrast, mechanistic breeding values may be correlated with residual errors. Indeed, in our framework we have that phenotype , but mechanistic breeding value *b*_{ia} is not computed via least squares, so *b*_{ia} and the error may covary, positively or negatively. Hence, the classic quantitative genetics partition of phenotypic variance into genetic and “environmental” (i.e., residual) variance does not hold with mechanistic breeding value, as there may be mechanistic genetic and “environmental” covariance. Consequently, since the covariance between two random variables is bounded from below by the negative of the product of their standard deviations, mechanistic heritability defined as the ratio between the variance of mechanistic breeding value and phenotypic variance cannot be negative but it may be greater than one.
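The contrast between the two notions of heritability can be illustrated with a one-trait toy case. The sketch below is an illustration under stated assumptions, not part of the framework: the developmental map `develop` (a sine function), the resident value `y_bar`, and the evenly spaced genotypic values are all hypothetical choices made so that the result is deterministic.

```python
import numpy as np

# Hypothetical one-trait developmental map (an assumption for illustration).
def develop(y):
    return np.sin(y)

# Deterministic, symmetric spread of genotypic values around the resident.
y_bar = 0.0
y = np.linspace(-1.5, 1.5, 2001)
x = develop(y)

# Quantitative genetics breeding value: least-squares regression of x on y.
slope_ls = np.cov(x, y, bias=True)[0, 1] / np.var(y)
a = x.mean() + slope_ls * (y - y.mean())
e_ls = x - a

# Mechanistic breeding value: first-order expansion around the resident,
# using the derivative of the map instead of a regression slope.
slope_mech = np.cos(y_bar)            # d sin(y)/dy evaluated at y_bar
b = develop(y_bar) + slope_mech * (y - y_bar)
e_mech = x - b

h2_ls = np.var(a) / np.var(x)         # classic narrow-sense heritability
h2_mech = np.var(b) / np.var(x)       # mechanistic heritability
cov_a_e = np.cov(a, e_ls, bias=True)[0, 1]     # zero by least squares
cov_b_e = np.cov(b, e_mech, bias=True)[0, 1]   # need not be zero
```

In this toy case the least-squares residual is uncorrelated with the breeding value, so the classic heritability stays within [0, 1], whereas the derivative-based breeding value covaries negatively with its residual and the mechanistic heritability exceeds one.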

Our definition of mechanistic breeding value recovers Fisher’s (1918) infinitesimal model under certain conditions, although we do not need to assume the infinitesimal model. According to Fisher’s (1918) infinitesimal model, the normalized breeding value excess is normally distributed as the number of loci approaches infinity. Using Layer 6, Eq. 1, we have that the mechanistic breeding value excess for the *i*-th entry of **b**_{ζ} is
Let us denote the mutational variance for the *k*-th genotypic trait at age *a* by
and let us denote the total mutational variance by
If the *y*_{ka} are mutually independent and Lyapunov’s condition is satisfied, from the Lyapunov central limit theorem we have that, as either the number of genotypic traits *N*_{g} or the number of ages *N*_{a} tends to infinity (e.g., by reducing the age bin size), the normalized mechanistic breeding value excess
is normally distributed with mean zero and variance 1. Thus, this limit yields Fisher’s (1918) infinitesimal model, although we do not need to assume such limit. Our framework thus recovers the infinitesimal model as a particular case, when either *N*_{g} or *N*_{a} approaches infinity (provided that the *y*_{ka} are mutually independent and Lyapunov’s condition holds).
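The convergence to normality can be explored by simulation. The following minimal sketch assumes a hypothetical setting that is not specified in the text: independent, uniformly distributed mutational deviations with trait-specific spreads, and arbitrary total effects of each genotypic trait on one entry of ***ζ***. The names `effects`, `half_widths`, and `total_var` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_traits = 200        # stands in for N_g * N_a (large but finite)
n_samples = 20_000

# Hypothetical total effects of each genotypic trait on one entry of zeta.
effects = rng.uniform(0.5, 1.5, size=n_traits)
# Hypothetical half-widths of independent uniform mutational deviations.
half_widths = rng.uniform(0.01, 0.05, size=n_traits)

# Independent deviations y_k - y_bar_k; one row per sampled mutant.
deviations = rng.uniform(-half_widths, half_widths, size=(n_samples, n_traits))

# Breeding value excess and its exact variance under independence.
excess = deviations @ effects
mutational_var = half_widths**2 / 3.0     # variance of Uniform(-h, h)
total_var = np.sum(effects**2 * mutational_var)

# Normalized excess: mean near 0, variance near 1, excess kurtosis near 0.
normalized = excess / np.sqrt(total_var)
sample_mean = normalized.mean()
sample_var = normalized.var()
excess_kurtosis = np.mean((normalized - sample_mean)**4) / sample_var**2 - 3.0
```

Although each deviation is uniform rather than normal, the normalized sum has moments close to those of a standard normal distribution, as the Lyapunov central limit theorem suggests for many independent contributions.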

From our definition of mechanistic breeding value, we have that the mechanistic breeding value of the genotype is simply the genotype itself. From Layer 6, Eq. 1, the expected mechanistic breeding value of vector ***ζ*** is
In turn, the mechanistic breeding value of the genotype **y** is **b**_{y} = **y**, since d**y**^{⊺}/d**y** = **I** because, by assumption, the genotype does not have developmental constraints and is developmentally independent (Layer 4, Eq. 6).

We now define mechanistic additive genetic covariance matrices under our assumptions. The additive genetic variance of a trait is defined under quantitative genetics assumptions as the variance of its breeding value, which is extended to the multivariate case so the additive genetic covariance matrix of a trait vector is the covariance matrix of the traits’ breeding values (Lynch and Walsh, 1998; Walsh and Lynch, 2018). Accordingly, we define the *mechanistic additive genetic covariance matrix* of a vector ***ζ*** ∈ ℝ^{m×1} as the covariance matrix of its mechanistic breeding value:
where the fourth line follows from the property of the transpose of a product (i.e., (**AB**)^{⊺} = **B**^{⊺}**A**^{⊺}) and the last line follows since the mechanistic additive genetic covariance matrix of the genotype **y** is
Layer 6, Eq. 2 has the same form as previous expressions for the additive genetic covariance matrix under quantitative genetics assumptions, although those expressions use least-square regression coefficients in place of the derivatives, as is needed if the classic partitioning of phenotypic variance is to hold (see Eq. II of Fisher 1918, Eq. + of Wagner 1984, Eq. 3.5b of Barton and Turelli 1987, and Eq. 4.23b of Lynch and Walsh 1998; see also Eq. 22a of Lande 1980, Eq. 3 of Wagner 1989, and Eq. 9 of Charlesworth 1990). We denote the matrix **H** (for heredity) rather than **G** to note that the two are different, particularly as the former is based on mechanistic breeding value. Note that **H**_{ζ} is symmetric.
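The structure of this covariance matrix can be checked numerically. The sketch below assumes that the mechanistic breeding value takes the first-order form resident value plus total effects times the genotypic deviation, so that its covariance is a "sandwich" of the genotypic covariance between the total-effect matrix and its transpose. The matrix `J`, the resident values, and the sampled genotypes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_geno = 3                           # entries in y (illustrative size)

# Hypothetical total-effect matrix d zeta / d y^T (2 x n_geno), an assumption.
J = np.array([[1.0, 0.5, 0.0],
              [0.0, 2.0, -1.0]])

y_bar = np.zeros(n_geno)             # resident genotype
zeta_bar = np.array([1.0, -1.0])     # resident value of zeta

# Sample genotypes around the resident; rows are individuals.
y = y_bar + 0.1 * rng.standard_normal((5000, n_geno))

# Assumed first-order (mechanistic) breeding values, one row per individual.
b = zeta_bar + (y - y_bar) @ J.T

H_y = np.cov(y, rowvar=False)            # covariance of the genotype
H_zeta_direct = np.cov(b, rowvar=False)  # covariance of the breeding values
H_zeta_sandwich = J @ H_y @ J.T          # total effects, H_y, transposed effects

max_abs_diff = np.max(np.abs(H_zeta_direct - H_zeta_sandwich))
is_symmetric = np.allclose(H_zeta_sandwich, H_zeta_sandwich.T)
min_eig = np.linalg.eigvalsh(H_zeta_sandwich).min()
```

Because the breeding value is linear in the genotype, the two computations agree up to rounding error, and the resulting matrix is symmetric with non-negative eigenvalues.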

In some cases, Layer 6, Eq. 2 allows one to immediately determine whether a mechanistic additive genetic covariance matrix is singular, which means there are directions in the matrix’s space in which there is no genetic variation. Indeed, a matrix with fewer rows than columns is always singular (Horn and Johnson, 2013, section 0.5 second line), and if the product **AB** is well-defined and **B** is singular, then **AB** is singular (this is easily checked to hold). Hence, from Layer 6, Eq. 2 it follows that **H**_{ζ} is necessarily singular if d***ζ***^{⊺}/d**y** has fewer rows than columns, that is, if **y** has fewer entries than ***ζ***. Since **y** has *N*_{a}*N*_{g} entries and ***ζ*** has *m* entries, then **H**_{ζ} is singular if *N*_{a}*N*_{g} < *m*. Moreover, Layer 6, Eq. 2 allows one to immediately identify bounds for the “degrees of freedom” of genetic covariation, that is, for the rank of **H**_{ζ}. Indeed, for a matrix **A** ∈ ℝ^{m×n}, we have that the rank of **A** is at most the smallest value of *m* and *n*, that is, rank(**A**) ≤ min{*m*, *n*} (Horn and Johnson, 2013, section 0.4.5 (a)). Moreover, from the Frobenius inequality (Horn and Johnson, 2013, section 0.4.5 (e)), for a well-defined product **AB**, we have that rank(**AB**) ≤ rank(**B**). Therefore, for ***ζ*** ∈ ℝ^{m×1}, we have that rank(**H**_{ζ}) ≤ min{*m*, *N*_{a}*N*_{g}}. Intuitively, this states that the degrees of freedom of genetic covariation are at most given by the lifetime number of genotypic traits (i.e., *N*_{a}*N*_{g}). So if there are more traits in ***ζ*** than there are lifetime genotypic traits, then there are fewer degrees of freedom of genetic covariation than traits. This point is mathematically trivial and has undoubtedly been clear in the evolutionary literature for decades. However, this point will be biologically crucial because the evolutionary dynamic equations in gradient form that are generally dynamically sufficient involve an **H**_{ζ} whose ***ζ*** necessarily has more entries than **y**. Note also that these points on the singularity and rank of **H**_{ζ} also hold under quantitative genetics assumptions, where the same structure (Layer 6, Eq. 2) holds, except that **H**_{y} does not refer to mutational variation but to standing variation in allele frequency and total effects are measured with regression coefficients. Considering standing variation in **H**_{y} and regression coefficients does not affect the points made in this paragraph.
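The rank argument can be illustrated with arbitrary numbers. In the minimal sketch below, `J` is a hypothetical total-effect matrix with more rows (entries of ***ζ***) than columns (lifetime genotypic traits), and `H_y` is an arbitrary positive definite matrix standing in for the genotypic covariance; neither comes from a specific model.

```python
import numpy as np

rng = np.random.default_rng(2)
n_lifetime_geno = 2      # stands in for N_a * N_g
m = 5                    # number of entries in zeta

# Hypothetical total effects d zeta / d y^T (m x n_lifetime_geno).
J = rng.standard_normal((m, n_lifetime_geno))

# Arbitrary positive definite, hence full-rank, genotypic covariance.
M = rng.standard_normal((n_lifetime_geno, n_lifetime_geno))
H_y = M @ M.T + 0.1 * np.eye(n_lifetime_geno)

# Covariance of zeta built from fewer genotypic degrees of freedom than traits.
H_zeta = J @ H_y @ J.T

rank_H_y = np.linalg.matrix_rank(H_y)
rank_H_zeta = np.linalg.matrix_rank(H_zeta)

# Count eigenvalues that are zero up to rounding error.
n_zero_eigs = int(np.sum(np.abs(np.linalg.eigvalsh(H_zeta)) < 1e-10))
```

Even though `H_y` has full rank, the five-by-five matrix `H_zeta` has rank at most two, so at least three of its eigenvalues are zero: there are fewer degrees of freedom of genetic covariation than traits.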

Consider the following slight generalization of the mechanistic additive genetic covariance matrix. We define the mechanistic additive genetic cross-covariance matrix between a vector ***ζ*** ∈ ℝ^{m×1} and a vector ***ξ*** ∈ ℝ^{n×1} as the cross-covariance matrix of their mechanistic breeding values:
Thus, **H**_{ζζ} = **H**_{ζ}. Note that **H**_{ζξ} may be rectangular and, if square, asymmetric. Again, from Layer 6, Eq. 4 it follows that **H**_{ζξ} is necessarily singular if there are fewer entries in **y** than in ***ξ*** (i.e., if *N*_{a}*N*_{g} < *n*). Also, for ***ξ*** ∈ ℝ^{n×1}, we have that
In words, the degrees of freedom of genetic cross-covariation are at most given by the lifetime number of genotypic traits.

The mechanistic additive genetic covariance matrix of the phenotype takes the following form. Evaluating Layer 6, Eq. 2 at ***ζ*** = **x**, the mechanistic additive genetic covariance matrix of the phenotype is
which is singular because the developmental matrix is singular since the developmentally initial phenotype is not affected by the genotype and the developmentally final genotypic traits do not affect the phenotype (Appendix C, Eq. C16). However, a dynamical system consisting only of evolutionary dynamic equations for the phenotype, and thus having an associated **H**_{x}-matrix, is underdetermined in general because the system has fewer dynamic equations (i.e., the number of entries in **x**) than dynamic variables (i.e., the number of entries in (**x**; **y**; ***ϵ***)). Indeed, the evolutionary dynamics of the phenotype generally depends on the resident genotype, in particular, because the developmental matrix depends on the resident genotype (Layer 4, Eq. 3; e.g., due to non-linearities in the developmental map involving products between genotypic traits, or between genotypic traits and phenotypes, or between genotypic traits and environmental traits, that is, gene-gene interaction, gene-phenotype interaction, and gene-environment interaction, respectively). Thus, evolutionary dynamic equations of the phenotype alone generally have either zero or an infinite number of solutions for any given initial condition and are thus dynamically insufficient. To have a determined system in gradient form that is dynamically sufficient in general, we follow the evolutionary dynamics of both the phenotype and the genotype, that is, of the geno-phenotype, which depends on **H**_{z} rather than **H**_{x} alone.

The mechanistic additive genetic covariance matrix of the geno-phenotype takes the following form. Evaluating Layer 6, Eq. 2 at ***ζ*** = **z**, the mechanistic additive genetic covariance matrix of the geno-phenotype is
This matrix is necessarily singular because the geno-phenotype **z** includes the genotype **y** so d**z**^{⊺}/d**y** has fewer rows than columns (Layer 4, Eq. 8). Intuitively, Layer 6, Eq. 6 has this form because the phenotype is related to the genotype by the developmental constraint (1). From Layer 6, Eq. 3, the rank of **H**_{z} has an upper bound given by the number of genotypic traits across life (i.e., *N*_{a}*N*_{g}), so **H**_{z} has at least *N*_{a}*N*_{p} eigenvalues that are exactly zero. Thus, **H**_{z} is singular if there is at least one trait that is developmentally constructed according to the developmental constraint (1) (i.e., if *N*_{p} > 0). This is a mathematically trivial singularity, but it is biologically key because it is **H**_{z} rather than **H**_{x} that occurs in a generally dynamically sufficient evolutionary system in gradient form (provided the environment is constant; if the environment is not constant, the relevant matrix is **H**_{m} which is also always singular if there is at least one phenotype or one environmental trait).

Another way to see the singularity of **H**_{z} is the following. From Layer 6, Eq. 6, we can write the mechanistic additive genetic covariance matrix of the geno-phenotype as
where the mechanistic additive genetic cross-covariance matrix between **z** and **x** is
and the mechanistic additive genetic cross-covariance matrix between **z** and **y** is
Thus, using Layer 4, Eq. 6, we have that
That is, some columns of **H**_{z} (i.e., those in **H**_{zx}) are linear combinations of other columns of **H**_{z} (i.e., those in **H**_{zy}). Hence, **H**_{z} is singular.
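The column dependence can be verified with a small numerical example. The sketch below assumes the geno-phenotype is stacked as **z** = (**x**; **y**), so that the total effects of the genotype on **z** stack the total effects on **x** above an identity block; the matrix `dx_dyT` and the covariance `H_y` are hypothetical random choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n_x, n_y = 3, 2   # lifetime numbers of phenotypic and genotypic entries (illustrative)

# Hypothetical total effects of the genotype on the phenotype, d x / d y^T.
dx_dyT = rng.standard_normal((n_x, n_y))

# Arbitrary non-singular genotypic (mutational) covariance.
M = rng.standard_normal((n_y, n_y))
H_y = M @ M.T + 0.1 * np.eye(n_y)

# z = (x; y), so d z / d y^T stacks d x / d y^T on top of the identity.
dz_dyT = np.vstack([dx_dyT, np.eye(n_y)])
H_z = dz_dyT @ H_y @ dz_dyT.T

H_zx = H_z[:, :n_x]       # cross-covariance block between z and x
H_zy = H_z[:, n_x:]       # cross-covariance block between z and y

# Columns of H_zx are linear combinations of the columns of H_zy.
residual = np.max(np.abs(H_zx - H_zy @ dx_dyT.T))
rank_H_z = np.linalg.matrix_rank(H_z)
```

The residual vanishes up to rounding error, so the phenotypic columns carry no independent information and the rank of `H_z` cannot exceed the number of genotypic entries.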

The mechanistic additive genetic covariance matrix of the geno-phenotype is singular because the geno-phenotype includes the genotype (“gene content”). The singularity arises because the mechanistic breeding value of the phenotype is a linear combination of the mechanistic breeding value of the genotype by definition of mechanistic breeding value, regardless of whether the phenotype is a linear function of the genotype and regardless of the number of phenotypic or genotypic traits. In quantitative genetics terms, the **G**-matrix is a function of allele frequencies (which corresponds to our ), so a generally dynamically sufficient Lande system would require that allele frequencies are part of the dynamic variables considered; consequently, if the geno-phenotypic vector includes allele frequencies , then **G** is necessarily singular since by definition, breeding value under quantitative genetics assumptions is a linear combination of gene content. The definition of mechanistic breeding value implies that if there is only one phenotype and one genotypic trait, with a single age each, then there is a perfect correlation between their mechanistic breeding values (i.e., their correlation coefficient is 1). This also holds under quantitative genetics assumptions, in which case the breeding value *a* of a trait *x* is a linear combination of a single predictor *y*, so the breeding value *a* and predictor *y* are perfectly correlated (i.e., ). The perfect correlation between a single breeding value and a single predictor arises because, by definition, breeding value excludes residual error *e*. Note this does not mean that the phenotype and genotype are linearly related: it is (mechanistic) breeding values and the genotype that are linearly related by definition of (mechanistic) breeding value (Layer 6, Eq. 1). A standard approach to remove the singularity of an additive genetic covariance matrix is to remove some traits from the analysis (Lande, 1979). 
To remove the singularity of **H**_{z} we would need to remove at least either all phenotypic traits or all genotypic traits from the analysis. However, removing all phenotypic traits from the analysis prevents analysing phenotypic evolution as the climbing of a fitness landscape whereas removing all genotypic traits from the analysis renders the analysis dynamically insufficient in general because the evolutionary dynamics of some variables is not described. Thus, in general, to analyse a dynamically sufficient description of phenotypic evolution as the climbing of a fitness landscape, we must keep the singularity of **H**_{z}.

We now use stabilized-effect matrices (Layer 5) to consider social development by extending the notion of mechanistic breeding value (Layer 6, Eq. 1). We define the stabilized mechanistic breeding value of a vector ***ζ*** as:
Recall that the stabilized-effect matrix equals the total-effect matrix if development is non-social. Thus, if development is non-social, the stabilized mechanistic breeding value equals the mechanistic breeding value **b**_{ζ}. Also, note that .

With this, we extend the notion of mechanistic additive genetic covariance matrix to include the effects of socio-devo stabilization as follows. We define the *mechanistic additive socio-genetic cross-covariance matrix of* ***ζ*** ∈ ℝ^{m×1} as (**L** for legacy)
Note that **L**_{ζ} may be asymmetric and its main diagonal entries may be negative (unlike variances). If development is non-social, **L**_{ζ} equals **H**_{ζ}. As before, **L**_{ζ} is singular if **y** has fewer entries than ***ζ***. Also, for ***ζ*** ∈ ℝ^{m×1}, we have that
That is, the degrees of freedom of socio-genetic covariation are also at most given by the lifetime number of genotypic traits.

Similarly, we generalize this notion and define the *mechanistic additive socio-genetic cross-covariance matrix between* ***ζ*** ∈ ℝ^{m×1} *and* ***ξ*** ∈ ℝ^{n×1} as
Again, if development is non-social, **L**_{ζξ} equals **H**_{ζξ}. Note that **L**_{ζξ} may be rectangular and, if square, asymmetric. Also, **L**_{ζξ} is singular if **y** has fewer entries than ***ξ***. For ***ξ*** ∈ ℝ^{n×1}, we have that
That is, the degrees of freedom of socio-genetic cross-covariation are still at most given by the lifetime number of genotypic traits.

In particular, some **L**_{ζξ} matrices are singular or not as follows. The mechanistic additive socio-genetic cross-covariance matrix between ***ζ*** and the geno-phenotype **z** is singular if there is at least one phenotype (i.e., if *N*_{p} > 0). Thus, **L**_{ζz} has at least *N*_{a}*N*_{p} eigenvalues that are exactly zero. Also, the mechanistic additive socio-genetic cross-covariance matrix between ***ζ*** and the geno-envo-phenotype **m** is singular if there is at least one phenotype or one environmental trait (i.e., if *N*_{p} > 0 or *N*_{e} > 0). Thus, **L**_{ζm} has at least *N*_{a}(*N*_{p} + *N*_{e}) eigenvalues that are exactly zero. In important contrast, the mechanistic additive socio-genetic cross-covariance matrix between a vector ***ζ*** ∈ {**y**, **z**, **m**} and the genotype **y** is non-singular if **H**_{y} is non-singular because the genotype is developmentally independent (Appendix H and Appendix J). The **L**-matrices share various properties with similar generalizations of the **G**-matrix arising in the indirect genetic effects literature (Kirkpatrick and Lande, 1989; Moore et al., 1997; Townley and Ezard, 2013).

### 5.7 Layer 7: evolutionary dynamics

Finally, we move to the top layer of the evo-devo process, that of the evolutionary dynamics. This layer contains equations describing the evolutionary dynamics under explicit developmental and environmental constraints. In Supplementary Information section S3 and Appendix G-Appendix J, we show that, in the limit as Δτ → 0, the evolutionary dynamics of the phenotype, genotype, geno-phenotype, environment, and geno-envo-phenotype (i.e., for ***ζ*** ∈ {**x**, **y**, **z**, ***ϵ***, **m**}) are given by
which must satisfy both the developmental constraint
and the environmental constraint
If ***ζ*** = **z** in Layer 7, Eq. 1a, then the equations in Layers 2-6 guarantee that the developmental constraint is satisfied for all τ > τ_{1} given that it is satisfied at the initial evolutionary time τ_{1}. If ***ζ*** = **m** in Layer 7, Eq. 1a, then the equations in Layers 2-6 guarantee that both the developmental and environmental constraints are satisfied for all τ > τ_{1} given that they are satisfied at the initial evolutionary time τ_{1}. Both the developmental and environmental constraints can evolve as the genotype, phenotype, and environment evolve and such constraints can involve any family of curves as long as they are differentiable.

Importantly, although Layer 7, Eq. 1a describes the evolutionary dynamics of ***ζ***, this equation is guaranteed to be dynamically sufficient only for certain ***ζ***. Layer 7, Eq. 1a is dynamically sufficient if ***ζ*** is the genotype **y**, the geno-phenotype **z**, or the geno-envo-phenotype **m**, provided that the developmental and environmental constraints are satisfied throughout. In contrast, Layer 7, Eq. 1a is dynamically insufficient if ***ζ*** is the phenotype **x** or the environment ***ϵ***, because the evolution of the genotype is not followed but it generally affects the system.

Layer 7, Eq. 1a describes the evolutionary dynamics as consisting of selection response and exogenous plastic response. Layer 7, Eq. 1a contains the term
which comprises direct directional selection on the geno-envo-phenotype and socio-genetic cross-covariation between ***ζ*** and the geno-envo-phenotype (**L**_{ζm}). The term in Layer 7, Eq. 2 is the *selection response* of ***ζ*** and is a mechanistic generalization of Lande’s (1979) generalization of the univariate breeder’s equation (Lush, 1937; Walsh and Lynch, 2018). Additionally, Layer 7, Eq. 1a contains the term
which comprises the vector of environmental change due to exogenous causes and the matrix of stabilized plasticity. The term in Layer 7, Eq. 3 is the *exogenous plastic response* of ***ζ*** and is a mechanistic generalization of previous expressions (cf. Eq. A3 of Chevin et al. 2010). Note that the *endogenous* plastic response of ***ζ*** (i.e., the plastic response due to endogenous environmental change arising from niche construction) is part of both the selection response and the exogenous plastic response (Layers 2-6).

Selection response is relatively incompletely described by direct directional selection on the geno-envo-phenotype. We saw that the matrix **L**_{ζm} is always singular if there is at least one phenotype or one environmental trait (Layer 6, Eq. 12). Consequently, evolutionary equilibria of ***ζ*** can invariably occur with persistent direct directional selection on the geno-envo-phenotype, regardless of whether there is exogenous plastic response.

Selection response is also relatively incompletely described by total immediate selection on the geno-phenotype. We can rewrite the selection response, so the evolutionary dynamics of ***ζ*** ∈ {**x**, **y**, **z**, ***ϵ***, **m**} (Layer 7, Eq. 1a) is equivalently given by
This equation now depends on total immediate selection on the geno-phenotype, which measures total immediate directional selection on the geno-phenotype (or in a quantitative genetics framework, it is Lande’s (1979) selection gradient of the allele frequency and phenotype if environmental traits are not explicitly included in the analysis). We saw that the total immediate selection gradient of the geno-phenotype can be interpreted as pointing in the direction of steepest ascent on the fitness landscape in geno-phenotype space after the landscape is modified by the interaction of direct niche construction and environmental sensitivity of selection (Layer 3, Eq. 1). We also saw that the matrix **L**_{ζz} is always singular if there is at least one phenotype (Layer 6, Eq. 11). Consequently, evolutionary equilibria can invariably occur with persistent directional selection on the geno-phenotype after niche construction has modified the geno-phenotype’s fitness landscape, regardless of whether there is exogenous plastic response.

In contrast, selection response is relatively completely described by total genotypic selection. We can further rewrite selection response, so the evolutionary dynamics of ***ζ*** ∈ {**x**, **y**, **z**, ***ϵ***, **m**} (Layer 7, Eq. 1a) is equivalently given by
This equation now depends on total genotypic selection, which measures total directional selection on the genotype considering downstream developmental effects (or in a quantitative genetics framework, it is Lande’s (1979) selection gradient of allele frequency if neither the phenotype nor environmental traits are explicitly included in the analysis). We saw that the total selection gradient of the genotype can be interpreted as pointing in the direction of steepest ascent on the fitness landscape in genotype space after the landscape is modified by the interaction of total developmental bias from the genotype and directional selection on the phenotype and by the interaction of total niche construction by the genotype and environmental sensitivity of selection (Layer 4, Eq. 22). In contrast to the other arrangements of selection response, in Appendix H and Appendix J we show that **L**_{ζy} is non-singular for all ***ζ*** ∈ {**y**, **z**, **m**} if **H**_{y} is non-singular (i.e., if there is mutational variation in all directions of genotype space); this non-singularity of **L**_{ζy} arises because genotypic traits are developmentally independent by assumption. Consequently, evolutionary equilibria of the genotype, geno-phenotype, or geno-envo-phenotype can only occur when total genotypic selection vanishes if there is mutational variation in all directions of genotype space and if exogenous plastic response is absent.
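The difference between the singular and non-singular arrangements can be seen in a toy calculation. The sketch below makes simplifying assumptions that are not part of the general equations: development is non-social, the environment plays no role, the geno-phenotype is stacked as **z** = (**x**; **y**), and total genotypic selection is obtained from direct selection on **z** by the chain rule through the total effects of the genotype on **z**. All matrices and gradients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n_x, n_y = 3, 2

dx_dyT = rng.standard_normal((n_x, n_y))      # hypothetical developmental matrix
dz_dyT = np.vstack([dx_dyT, np.eye(n_y)])     # z = (x; y)
H_y = np.diag([0.5, 2.0])                     # mutational variation in all directions

H_z = dz_dyT @ H_y @ dz_dyT.T                 # singular (rank <= n_y)
H_zy = dz_dyT @ H_y                           # full column rank

# A non-zero direct selection gradient on z that the genotype cannot respond to:
# its genotypic part exactly cancels the developmental effect of its phenotypic part.
g_x = np.array([1.0, -2.0, 0.5])
g_persistent = np.concatenate([g_x, -dx_dyT.T @ g_x])

# Assumed chain rule: total genotypic selection from direct selection on z.
total_genotypic_selection = dz_dyT.T @ g_persistent
change_via_H_z = H_z @ g_persistent
change_via_H_zy = H_zy @ total_genotypic_selection

# A generic gradient, by contrast, gives non-zero total genotypic selection and change.
g_generic = np.ones(n_x + n_y)
generic_total = dz_dyT.T @ g_generic
generic_change = H_zy @ generic_total
```

Here the evolutionary change is zero although direct selection on the geno-phenotype persists, because total genotypic selection vanishes; with the full-column-rank arrangement, change stops only when total genotypic selection itself is zero.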

Layer 7, Eq. 1a and its equivalents are generally dynamically insufficient if only the evolutionary dynamics of the phenotype are considered (i.e., if ***ζ*** = **x**). Let us temporarily assume that the following four conditions hold: (I) development is non-social, and there is (II) no exogenous plastic response of the phenotype, (III) no total immediate selection on the genotype, and (IV) no niche-constructed effects of the phenotype on fitness. Then, the evolutionary dynamics of the phenotype reduces to
This is a mechanistic version of the Lande equation for the phenotype. The mechanistic additive genetic covariance matrix of the phenotype (Layer 6, Eq. 5) in this equation is singular because the developmentally initial phenotype is not affected by the genotype and the developmentally final genotypic traits do not affect the phenotype (so the developmental matrix has rows and columns that are zero; Appendix C, Eq. C16). This singularity might disappear by removing from the analysis the developmentally initial phenotype and developmentally final genotypic traits, provided additional conditions hold. Yet, the key point here is that a system describing the evolutionary dynamics of the phenotype alone is dynamically insufficient because such a system depends on the resident genotype, whose evolution must also be followed. In particular, setting the evolutionary change of the phenotype to zero does not generally imply an evolutionary equilibrium, or evolutionary stasis, but only an evolutionary nullcline in the phenotype, that is, a transient lack of evolutionary change in the phenotype. To guarantee a dynamically sufficient description of the evolutionary dynamics of the phenotype, we simultaneously consider the evolutionary dynamics of the phenotype and genotype, that is, the geno-phenotype.

Indeed, a dynamically sufficient system can be obtained by describing the dynamics of the geno-phenotype alone if the environment is constant or has no evolutionary effect. Let us now assume that the following three conditions hold: (i) development is non-social, and there is (ii) no exogenous plastic response of the phenotype, and (iii) no niche-constructed effects of the geno-phenotype on fitness. Then, the evolutionary dynamics of the geno-phenotype reduces to Layer 7, Eq. 7. This is an extension of the mechanistic version of the Lande equation to consider the geno-phenotype. The mechanistic additive genetic covariance matrix of the geno-phenotype (Layer 6, Eq. 6) in this equation is singular because the geno-phenotype **z** includes the genotype **y** (so d**z**^{⊺}/d**y** has fewer rows than columns; Layer 4, Eq. 8). Hence, the degrees of freedom of genetic covariation in geno-phenotype space are at most given by the number of lifetime genotypic traits, so these degrees of freedom are bounded by genotypic space in a necessarily larger geno-phenotype space. Thus, **H**_{z} is singular if there is at least one trait that is developmentally constructed according to the developmental map (Layer 7, Eq. 1b). The evolutionary dynamics of the geno-phenotype is now fully determined by Layer 7, Eq. 7 provided that (i)-(iii) hold and that the developmental (Layer 7, Eq. 1b) and environmental (Layer 7, Eq. 1c) constraints are met, which **H**_{z} guarantees they are if they are met at the initial evolutionary time τ_{1}. In such case, a vanishing evolutionary change of the geno-phenotype does imply an evolutionary equilibrium, but this does not imply absence of direct directional selection on the geno-phenotype (i.e., it is possible that ∂*w*/∂**z** ≠ **0**) since **H**_{z} is always singular. Due to this singularity, if there is any evolutionary equilibrium, there is an infinite number of them. Kirkpatrick and Lofsvold (1992) showed that if **G** is singular and constant, then the evolutionary equilibrium that is achieved depends on the initial conditions. Our results extend the relevance of Kirkpatrick and Lofsvold’s (1992) observation by showing that **H**_{z} is always singular and remains so as it evolves. Moreover, since both the developmental (Layer 7, Eq. 1b) and environmental (Layer 7, Eq. 1c) constraints must be satisfied throughout the evolutionary process, the developmental and environmental constraints determine the admissible evolutionary trajectory and the admissible evolutionary equilibria if mutational variation exists in all directions of genotype space. Therefore, developmental and environmental constraints together with direct directional selection jointly define the evolutionary outcome if mutational variation exists in all directions of genotype space.
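The singularity argument above can be illustrated numerically. The following is a minimal sketch, not the code used for the figures: it assumes that the constraining matrix has the form **H**_{z} = (d**z**^{⊺}/d**y**)^{⊺} **H**_{y} (d**z**^{⊺}/d**y**) with d**z**^{⊺}/d**y** = (d**x**^{⊺}/d**y**, **I**), and it uses random numbers as hypothetical total effects of the genotype on the phenotype. The names `dx_dy`, `dz_dy`, `n_x`, and `n_y` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_y = 3, 2   # hypothetical numbers of lifetime phenotypes and genotypic traits

# Hypothetical total effects of the genotype on the phenotype (n_y rows, n_x columns).
dx_dy = rng.normal(size=(n_y, n_x))
# Total effects on the geno-phenotype z = (x; y): fewer rows (n_y) than columns (n_x + n_y).
dz_dy = np.hstack([dx_dy, np.eye(n_y)])

# A non-singular (here diagonal) mutational covariance matrix H_y.
H_y = np.diag([0.02, 0.05])

# Assumed form of the constraining matrix in geno-phenotype space.
H_z = dz_dy.T @ H_y @ dz_dy
rank_H_z = np.linalg.matrix_rank(H_z)

# A vector in the null space of H_z: a non-zero direct selection gradient
# that nevertheless yields no selection response.
_, _, vt = np.linalg.svd(H_z)
null_gradient = vt[-1]          # right-singular vector of a zero singular value
response = H_z @ null_gradient

print(H_z.shape, rank_H_z)
```

In this sketch the rank of `H_z` equals the number of genotypic traits however the total effects are drawn, so there are always as many directions without genetic covariation as there are phenotypes.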

Since selection response is relatively completely described by total genotypic selection, further insight can be gained by rearranging the extended mechanistic Lande equation for the geno-phenotype (Layer 7, Eq. 7) in terms of total genotypic selection. Using the rearrangement in Layer 7, Eq. 5 and making the assumptions (i)-(iii) in the previous paragraph, the extended mechanistic Lande equation in Layer 7, Eq. 7 becomes Layer 7, Eq. 8. This equation is closely related to but different from Morrissey’s (2014) Eq. 4, which uses a different factorization of the constraining matrix (here **H**_{z}, there Lande’s **G**) in terms of a square total effect matrix of all traits on themselves (his **Φ** in his Eq. 2) and so Morrissey’s equation is in terms of the total selection gradient of the phenotype rather than of the genotype. Also, being a rearrangement of the classic Lande equation, Morrissey’s equation refers to the selection response of the phenotype rather than of the geno-phenotype and is thus dynamically insufficient. A dynamically sufficient equation with a factorization of the constraining matrix analogous to Morrissey’s factorization is obtained in Eq. (H4), which is in terms of the total selection gradient of the geno-phenotype premultiplied by a necessarily singular matrix, so such total selection gradient is not sufficient to identify evolutionary equilibria. In contrast, in Layer 7, Eq. 8, if the mutational covariance matrix **H**_{y} is non-singular, then the mechanistic additive genetic cross-covariance matrix between geno-phenotype and genotype **H**_{zy} is non-singular, so evolutionary equilibrium implies absence of total genotypic selection (i.e., d*w*/d**y** = **0**) to first order of approximation. Indeed, to first order, lack of total genotypic selection provides a necessary and sufficient condition for evolutionary equilibria in the absence of exogenous environmental change and of absolute mutational constraints (Layer 7, Eq. 5). Consequently, evolutionary equilibria depend on development and niche construction since total genotypic selection depends on Wagner’s (1984, 1989) developmental matrix and on total niche construction by the genotype (Layer 4, Eq. 22). However, since d*w*/d**y** = **0** has only as many equations as there are lifetime genotypic traits and since not only the genotype but also the phenotype and environmental traits must be determined, d*w*/d**y** = **0** provides fewer equations than variables to solve for. Hence, absence of total genotypic selection still implies an infinite number of evolutionary equilibria. Again, only the subset of evolutionary equilibria that satisfy the developmental (Layer 7, Eq. 1b) and environmental (Layer 7, Eq. 1c) constraints are admissible, and so the number of admissible evolutionary equilibria may be finite. Therefore, admissible evolutionary equilibria have a dual dependence on developmental and environmental constraints: first, by the constraints’ influence on total genotypic selection and so on evolutionary equilibria; and second, by the constraints’ specification of which evolutionary equilibria are admissible.

Because we assume that mutants arise when residents are at carrying capacity, the analogous statements can be made for the evolutionary dynamics of a resident vector in terms of lifetime reproductive success (Eq. 8). Using the relationship between selection gradients in terms of fitness and of expected lifetime reproductive success (Eqs. S22), the evolutionary dynamics of **ζ** ∈ {**x, y, z, ϵ, m**} (Layer 7, Eq. 1a) are equivalently given by

To close, the evolutionary dynamics of the environment can be written in a particular form that is insightful. In Appendix I, we show that the evolutionary dynamics of the environment is given by
Thus, the evolutionary change of the environment comprises “inclusive” endogenous environmental change and exogenous environmental change.

## 6. Example: allocation to growth vs reproduction

We now provide an example that illustrates some of the points above. To do this, we use a life-history model rather than a model of morphological development as the former is simpler yet sufficient to illustrate the points. In particular, this example shows that our results above enable direct calculation of the evo-devo dynamics and the evolution of the constraining matrices **H** and **L** and provide an alternative method to dynamic optimization to identify the evolutionary outcomes under explicit developmental constraints. We first describe the example where development is non-social and then extend the example to make development social.

### 6.1. Non-social development

We consider the classic life-history problem of modeling the evolution of resource allocation to growth vs reproduction (Gadgil and Bossert, 1970; León, 1976; Schaffer, 1983; Stearns, 1992; Roff, 1992; Kozłowski and Teriokhin, 1999). Let there be one phenotype (or state variable), one genotypic trait (or control variable), and no environmental traits. In particular, let *x*_{a} be a mutant’s phenotype at age *a* (e.g., body size or resources available) and *y*_{a} ∈ [0, 1] be the mutant’s fraction of resource allocated to phenotype growth at that age. Let mutant survival probability *p*_{a} = *p* be constant for all *a* ∈ {1, …, *N*_{a} − 1} with , so survivorship is *ℓ*_{a} = *p*^{a−1} for all *a* ∈ {1, …, *N*_{a}} with . Let mutant fertility be
where (1 − *y*_{a})*x*_{a} is the resource a mutant allocates to reproduction at age *a* and is a positive density-dependent scalar that brings the resident population size to carrying capacity. Let the developmental constraint be
with initial condition , where *y*_{a} *x*_{a} is the resource a mutant allocates to growth at age *a*. These equations are a simplification of those used in the classic life-history problem of finding the optimal resource allocation to growth vs reproduction in discrete age (Gadgil and Bossert, 1970; León, 1976; Schaffer, 1983; Stearns, 1992; Roff, 1992; Kozłowski and Teriokhin, 1999). In life-history theory, one assumes that at evolutionary equilibrium, a measure of fitness such as lifetime reproductive success is maximized by an optimal control **y*** yielding an optimal pair (**x***, **y***) that is obtained with dynamic programming or optimal control theory (Sydsæter et al., 2008). Instead, here we illustrate how the evolutionary dynamics of can be analysed with the equations derived in this paper, including identification of an optimal pair (**x***, **y***).

Let us calculate the elements of Layers 2-4 that we need to calculate genetic covariation and the evolutionary dynamics. Because there are no environmental traits, total immediate effects equal direct effects. Also, because development is non-social, stabilized effects equal total effects (except for social feedback, which is simply the identity matrix). Iterating the recurrence given by the developmental constraint (Example, Eq. 1) yields the mutant phenotype at age *a*
To find the density-dependent scalar, we note that a resident at carrying capacity satisfies the Euler-Lotka equation (Eq. S34), which yields
Using Eq. (5a), the entries of the direct selection gradients are given by
where the generation time without density dependence is
Thus, there is always direct selection for increased phenotype and against allocation to growth (except at the boundaries where or ). The entries of the matrices of direct effects on the phenotype (*a*: row, *j*: column) are given by
Using Layer 4, Eq. 2 and Eq. (C15), the entries of the matrices of total effects on the phenotype are given by
Then, using Layer 4, Eq. 21 and Layer 4, Eq. 22, the entries of the total selection gradients are given by
where we use the empty-product notation, such that an empty product equals one, and the empty-sum notation, such that an empty sum equals zero, for any *F*_{k}. There is thus always total selection for increased phenotype (except at the boundaries), although total selection for allocation to growth may be positive or negative.
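To make the notion of total effects concrete, the following minimal sketch computes the sensitivity of the phenotype at each age to the genotypic trait at each earlier age, first with the chain rule and then with finite differences as an independent check. It assumes a hypothetical developmental constraint *x*_{a+1} = *x*_{a} + *y*_{a} *x*_{a}, chosen only because it contains the product *y*_{a} *x*_{a} described above; the function names and the initial phenotype `x1` are illustrative.

```python
def develop(y, x1=1.0):
    """Iterate the assumed developmental recurrence x_{a+1} = x_a + y_a * x_a."""
    x = [x1]
    for a in range(len(y) - 1):
        x.append(x[a] + y[a] * x[a])
    return x


def total_effects(y, x1=1.0):
    """Total effects d x_j / d y_a (row a, column j) from the chain rule."""
    x = develop(y, x1)
    n = len(y)
    # For a < j, x_j = x1 * prod_{k<j} (1 + y_k), so d x_j / d y_a = x_j / (1 + y_a).
    return [[x[j] / (1.0 + y[a]) if a < j else 0.0 for j in range(n)]
            for a in range(n)]


def total_effects_fd(y, x1=1.0, h=1e-6):
    """The same matrix by central finite differences."""
    n = len(y)
    rows = []
    for a in range(n):
        up, dn = list(y), list(y)
        up[a] += h
        dn[a] -= h
        xu, xd = develop(up, x1), develop(dn, x1)
        rows.append([(xu[j] - xd[j]) / (2 * h) for j in range(n)])
    return rows


y = [0.5, 0.2, 0.8, 0.3]
T = total_effects(y)
T_fd = total_effects_fd(y)
```

Under this assumed recurrence, the column of the developmentally initial phenotype and the row of the developmentally final genotypic trait are zero, and the remaining entries depend on the genotype because of the product *y*_{a} *x*_{a}.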

Now, using Eqs. (1) and (3), the evo-devo dynamics are given by
Using Layer 7, Eq. 1a, Layer 7, Eq. 4, and Layer 7, Eq. 5, the evolutionary dynamics of the phenotype in the limit as Δτ → 0 are given by
Note these are not equations in Lande’s form. In particular, the mechanistic additive genetic cross-covariance matrices involved are not symmetric and the selection gradients are not those of the evolving trait in the left-hand side; Example, Eq. 7 cannot be arranged in Lande’s form because the genotypic trait directly affects fitness (i.e., Example, Eq. 3). Importantly, **H**_{xz} and **H**_{xy} depend on the resident genotype because of gene-phenotype interaction in development (i.e., the developmental map involves a product *y*_{a} *x*_{a} such that the total effect of the genotype on the phenotype depends on the genotype; Example, Eq. 4); consequently, Example, Eq. 7 is dynamically insufficient because the system does not describe the evolution of the resident genotype. In turn, the evolutionary dynamics of the geno-phenotype are given by
This system contains dynamic equations for all the evolutionarily dynamic variables, namely both the resident phenotype and the resident genotype, so it is determined and dynamically sufficient. The first equality in Example, Eq. 8 is in Lande’s form, but **H**_{z} is always singular. In contrast, the matrix **H**_{zy} in the second equality is non-singular if the mutational covariance matrix **H**_{y} is non-singular. Thus, the total selection gradient of the genotype provides a relatively complete description of the evolutionary process of the geno-phenotype.

Let the entries of the mutational covariance matrix be given by
where 0 < γ ≪ 1 so the assumption of marginally small mutational variance, namely 0 < tr(**H**_{y}) ≪ 1, holds. Thus, **H**_{y} is diagonal and becomes singular only at the boundaries where the resident genotype is zero or one. Then, from Example, Eq. 6, the evolutionary equilibria of the genotypic trait at a given age and their stability are given by the sign of its corresponding total selection gradient.
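The statement that equilibria and their stability follow from the sign of the total selection gradient can be explored with a small simulation. This is a sketch under stated assumptions rather than the model's actual equations: it assumes fertility proportional to (1 − *y*_{a})*x*_{a}, the hypothetical developmental constraint *x*_{a+1} = *x*_{a} + *y*_{a} *x*_{a}, a diagonal mutational covariance matrix with entries γ *y*_{a}(1 − *y*_{a}), and it drops positive scaling factors such as the density-dependent scalar and the generation time, which do not change the sign of the gradient.

```python
def lifetime_reproduction(y, p, x1=1.0):
    """R0 = sum_a l_a (1 - y_a) x_a with l_a = p**(a-1), up to a positive constant."""
    x, r0 = x1, 0.0
    for a, ya in enumerate(y):
        r0 += p**a * (1.0 - ya) * x   # survivorship times fertility at this age
        x += ya * x                   # assumed developmental constraint
    return r0


def total_gradient(y, p, h=1e-6):
    """Total selection gradient of the genotype (up to scaling), by central differences."""
    grad = []
    for a in range(len(y)):
        up, dn = list(y), list(y)
        up[a] += h
        dn[a] -= h
        grad.append((lifetime_reproduction(up, p) - lifetime_reproduction(dn, p)) / (2 * h))
    return grad


def evolve(y0, p, gamma=0.01, steps=10000):
    """Euler iteration of d y_a / d tau = gamma * y_a * (1 - y_a) * (total gradient)_a."""
    y = list(y0)
    for _ in range(steps):
        g = total_gradient(y, p)
        y = [ya + gamma * ya * (1.0 - ya) * ga for ya, ga in zip(y, g)]
    return y


y_high_survival = evolve([0.5, 0.5, 0.5, 0.5], p=0.9)
y_low_survival = evolve([0.5, 0.5, 0.5, 0.5], p=0.5)
```

Because the assumed mutational variance vanishes only at zero and one, each genotypic trait in this sketch moves towards the boundary indicated by the sign of its total selection gradient.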

Let us now find the evolutionary equilibria and their stability for the genotypic trait. Using Example, Eq. 5, starting from the last age, the total selection on the genotypic trait at this age is
which is always negative so the stable resident genotypic trait at the last age is
That is, no allocation to growth at the last age. Continuing with the second-to-last age, the total selection on the genotypic trait at this age is
Evaluating at the optimal genotypic trait at the last age (Example, Eq. 9a) and substituting *ℓ*_{a} = *p*^{a−1} yields
which is negative (assuming *p* < 1) so the stable resident genotypic trait at the second-to-last age is
Continuing with the third-to-last age, the total selection on the genotypic trait at this age is
Evaluating at the optimal genotypic trait at the last two ages (Example, Eq. 9a and Example, Eq. 9b) and substituting *ℓ*_{a} = *p*^{a−1} yields
which is positive if
So the stable resident genotypic trait at the third-to-last age is
If , the genotypic trait at such age is selectively neutral, but we ignore this case as without an evolutionary model for *p* it is biologically unlikely that survival is and remains at such precise value. Hence, there is no allocation to growth at this age for low survival and full allocation for high survival. Continuing with the fourth-to-last age, the total selection on the genotypic trait at this age is
Evaluating at the optimal genotypic trait at the last three ages (Example, Eq. 9a-Example, Eq. 9c) and substituting *ℓ*_{a} = *p*^{a−1} yields
If , this is
which is positive if
If , the gradient is
which is positive if
Hence, the stable resident genotypic trait at the fourth-to-last age is
for . Again, this is no allocation to growth for low survival, although at this earlier age survival can be smaller for allocation to growth to evolve. Numerical solution for the evo-devo dynamics using Example, Eq. 6 is given in Fig. 5. The associated evolution of the **H**_{z} matrix, plotting Layer 6, Eq. 6, is given in Fig. 6. The code used to generate these figures is in the Supplementary Information.
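The backward reasoning above can be checked against an exhaustive search. The sketch below is illustrative only and assumes the same hypothetical forms as before: lifetime reproductive success proportional to the sum over ages of *p*^{a−1}(1 − *y*_{a})*x*_{a}, and the developmental constraint *x*_{a+1} = *x*_{a} + *y*_{a} *x*_{a}. Under these assumptions lifetime reproductive success is linear in each *y*_{a} separately, so its maximum over [0, 1] for every age lies at a vertex and enumerating all-or-nothing allocations suffices.

```python
from itertools import product


def lifetime_reproduction(y, p, x1=1.0):
    """R0 = sum_a l_a (1 - y_a) x_a with l_a = p**(a-1), up to a positive constant."""
    x, r0 = x1, 0.0
    for a, ya in enumerate(y):
        r0 += p**a * (1.0 - ya) * x
        x += ya * x                   # assumed developmental constraint
    return r0


def brute_force_optimum(p, n_ages=4):
    """Best all-or-nothing allocation schedule found by exhaustive enumeration."""
    return max(product((0, 1), repeat=n_ages),
               key=lambda y: lifetime_reproduction(y, p))


def backward_induction(p, n_ages=4):
    """Work from the last age backwards, using the sign of the gradient at each age."""
    y, c = [0] * n_ages, 0.0          # c: future reproduction per unit of phenotype
    for a in reversed(range(n_ages)):
        l_a = p**a                    # survivorship to this age
        y[a] = 1 if c > l_a else 0    # sign of -l_a + c decides growth vs reproduction
        c = l_a * (1 - y[a]) + c * (1 + y[a])
    return tuple(y)
```

For instance, with four ages and *p* = 0.9, both routines return `(1, 1, 0, 0)`: growth at the first two ages and reproduction at the last two, in line with no allocation to growth at the last two ages and full allocation earlier for high survival.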

### 6.2. Social development

Consider a slight modification of the previous example, so that development is social. Let the mutant fertility be
where the available resource is now given by for some constant *q* (positive, negative, or zero). Here the source of social development can be variously interpreted, including that an immediately older resident contributes to (positive *q*) or scrounges from (negative *q*) the resource of the focal individual, or that the focal individual learns from the older resident (positive or negative *q* depending on whether learning increases or decreases the phenotype). Let the developmental constraint be
with initial condition .

To see what the stabilisation of social development is, imagine the first individual in the population having a resident genotype developing with such developmental constraint. As it is the first individual in the population, such individual has no social partners so the developed phenotype is with . Imagine a next resident individual who develops in the context of such initial individual. This next individual develops the phenotype with , but then . Iterating this process, the resident phenotype may converge to a socio-devo equilibrium , which satisfies Solving for yields a recurrence for the resident phenotype at socio-devo equilibrium provided that .

Iterating Example, Eq. 10 yields the resident phenotype at socio-devo equilibrium
where we drop the ^{**} for simplicity. To determine when this socio-devo equilibrium is socio-devo stable, we find the eigenvalues of as follows. The entries of the matrix of the direct social effects on the phenotype are given by
Hence, from Eqs. G8 and G9, is upper-triangular, so its eigenvalues are the values in its main diagonal, which are given by . Thus, the eigenvalues of have absolute value strictly less than one if |*q*| < 1, in which case the socio-devo equilibrium in Example, Eq. 11 is socio-devo stable.
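The stabilisation of social development described above can be sketched by iterating development over successive residents. The code assumes a hypothetical social developmental constraint *x*_{a+1} = *x*_{a} + *y*_{a}(*x*_{a} + *q* x̄_{a+1}), where x̄ is the phenotype of the social partners; this form, the function names, and the parameter values are assumptions made for illustration.

```python
def develop_social(y, partner_x, q, x1=1.0):
    """Develop one individual whose social partners have phenotype partner_x.

    Assumed constraint: x_{a+1} = x_a + y_a * (x_a + q * partner_x_{a+1}).
    """
    x = [x1]
    for a in range(len(y) - 1):
        x.append(x[a] + y[a] * (x[a] + q * partner_x[a + 1]))
    return x


def socio_devo_equilibrium(y, q, x1=1.0, generations=200):
    """Iterate development, each resident developing among the previous residents."""
    # The first individual has no social partners, so the partner phenotype is zero.
    x = develop_social(y, [0.0] * len(y), q, x1)
    for _ in range(generations):
        x = develop_social(y, x, q, x1)
    return x


def closed_form(y, q, x1=1.0):
    """Solve the assumed constraint with partner = self, assuming 1 - q * y_a != 0."""
    x = [x1]
    for a in range(len(y) - 1):
        x.append(x[a] * (1.0 + y[a]) / (1.0 - q * y[a]))
    return x


y = [0.5, 0.5, 0.5, 0.5]
x_iterated = socio_devo_equilibrium(y, q=0.5)
x_closed = closed_form(y, q=0.5)
x_nonsocial = closed_form(y, q=0.0)
```

In this sketch the iteration contracts at each age by a factor *q* *y*_{a}, which has absolute value below one whenever |*q*| < 1 and *y*_{a} ∈ [0, 1], so the iterated phenotype approaches the closed-form socio-devo equilibrium.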

Let be the SDS resident phenotype given by Example, Eq. 11 with |*q*| < 1. Then, the evo-devo dynamics are still given by Example, Eq. 6. Using Layer 7, Eq. 1a, Layer 7, Eq. 4, and Layer 7, Eq. 5, the evolutionary dynamics of the phenotype in the limit as Δτ → 0 are now given by
This system is dynamically insufficient as **L**_{xz} and **L**_{xy} depend on the resident genotype because of gene-phenotype interaction in development. In turn, the evolutionary dynamics of the geno-phenotype are given by
This system is dynamically sufficient as it contains dynamic equations for all evolutionarily dynamic variables, namely both the resident phenotype and the resident genotype. While **L**_{z} in the first equality is always singular, the matrix **L**_{zy} in the second equality is non-singular if the mutational covariance matrix **H**_{y} is non-singular. Thus, the total selection gradient of the genotype still provides a relatively complete description of the evolutionary process of the geno-phenotype.

We can similarly find that the total selection gradient of the genotypic trait at age *a* is
where the generation time without density dependence is now
This total selection gradient of the genotypic trait at age *a* has the same sign as that found in the model for non-social development (Example, Eq. 5). Hence, the stable evolutionary equilibria for the genotype are still given by Example, Eq. 9. Yet, the associated phenotype, given by Example, Eq. 11, may be different due to social development (Fig. 7). That is, social development here does not affect the evolutionary equilibria, as it does not affect the zeros of the total selection gradient of the genotype which gives the zeros of the evolutionary dynamics of the geno-phenotype (Example, Eq. 13). Instead, social development affects here the developmental constraint so it affects the admissible evolutionary equilibria of the phenotype. Numerical solution for the evo-devo dynamics using Example, Eq. 6 is given in Fig. 7. For the *q* chosen, the phenotype evolves to much larger values due to social feedback than with non-social development although the genotype evolves to the same values. The associated evolution of the **L**_{z} matrix, using Layer 6, Eq. 9, is given in Fig. 8. The code used to generate these figures is in the Supplementary Information.

## 7. Discussion

We have addressed the question of how development affects evolution by formulating a mathematical framework that integrates explicit developmental dynamics into evolutionary dynamics. The framework integrates age progression, explicit developmental constraints according to which the phenotype is constructed across life, and evolutionary dynamics. This framework yields a description of the structure of genetic covariation, including the additive effects of allelic substitution, from mechanistic processes. The framework also yields a dynamically sufficient description of the evolution of developed phenotypes in gradient form, such that their long-term evolution can be described as the climbing of a fitness landscape within the assumptions made. This framework provides a tractable method to model the evo-devo dynamics for a broad class of models. We also obtain formulas to compute the sensitivity of the solution of a recurrence (here, the phenotype) to perturbations in the solution or parameters at earlier times (here, ages), which are given by d**x**^{⊺}/d**ζ** for **ζ** ∈ {**x, y**}. Overall, the framework provides a theory of constrained evolutionary dynamics, where the developmental and environmental constraints determine the admissible evolutionary path (Layer 7, Eq. 1).

Previous understanding suggested that development affects evolution by inducing genetic covariation and genetic constraints, although the nature of such constraints had remained uncertain. We find that genetic constraints are necessarily absolute in a generally dynamically sufficient description of long-term phenotypic evolution in gradient form. This is because dynamic sufficiency in general requires that not only phenotypic but also genotypic evolution is followed. Because the phenotype is related to the genotype via development, simultaneously describing the evolution of the genotype and phenotype in gradient form entails that the associated constraining matrix (**H**_{z} or **L**_{z}) is necessarily singular with a maximum number of degrees of freedom given by the number of lifetime genotypic traits (*N*_{a}*N*_{g}). Consequently, genetic covariation is necessarily absent in as many directions of geno-phenotype space as there are lifetime developed traits (*N*_{a}*N*_{p}). Since the constraining matrix is singular, direct directional selection is insufficient to identify evolutionary equilibria, in contrast to common practice. Instead, total genotypic selection, which depends on development, is sufficient to identify evolutionary equilibria if there are no absolute mutational constraints and no exogenous plastic response. The singularity of the constraining matrix associated with direct geno-phenotypic selection entails that if there is any evolutionary equilibrium and no exogenous plastic response, then there is an infinite number of evolutionary equilibria that depend on development; in addition, development determines the admissible evolutionary trajectory and so the admissible equilibria. The adaptive topography in phenotype space is often assumed to involve a non-singular **G**-matrix where evolutionary outcomes occur at fitness landscape peaks (i.e., where direct directional selection vanishes). In contrast, we find that the evolutionary dynamics differ from that representation in that evolutionary outcomes occur at best (i.e., without absolute mutational constraints) at peaks in the admissible evolutionary path determined by development (i.e., where total genotypic selection vanishes), and that such path peaks do not typically occur at landscape peaks (so direct directional selection generally persists at them).

The singularity of the constraining matrix (**H**_{z} or **L**_{z}) is not due to our adaptive dynamics assumptions. Under quantitative genetics assumptions, the additive genetic covariance matrix of phenotype **x** is as described in the introduction, and here we use the subscripts **x** to highlight that this *α* matrix is for the regression coefficients of the phenotype with respect to gene content. Under quantitative genetics assumptions, the matrix cov[**y, y**] describes the observed covariance in allele frequency due to any source, so it describes standing covariation in allele frequency. Under our adaptive dynamics assumptions, we obtain an **H**_{x} matrix that has the same form of **G**_{x}, but where cov[**y, y**] describes the covariance in genotypic traits only due to mutation at the current evolutionary time step among the possible mutations, so it describes (expected) mutational covariation. Regardless of whether cov[**y, y**] describes standing covariation in allele frequency or mutational covariation, the additive genetic covariance matrix in geno-phenotype space is always singular because the developmental matrix of the geno-phenotype has fewer rows than columns: that is, the degrees of freedom of **G**_{z} have an upper bound given by the number of loci (or genetic predictors) while the size of **G**_{z} is given by the number of loci and of phenotypes. Thus, whether one considers standing or mutational covariation, the additive genetic covariance matrix of the geno-phenotype is always singular. Eliminating traits from the analysis to render **G**_{z} non-singular as traditionally recommended (Lande, 1979) either renders the gradient system underdetermined and so dynamically insufficient in general (if allele frequency is removed), or prevents a description of phenotypic evolution as the climbing of a fitness landscape (if the mean phenotype is removed). 
The singularity of **H** and **L** in geno-phenotype space persists despite evolution of the developmental map, regardless of the number of genotypic traits or phenotypes provided there is any phenotype, and in the presence of endogenous or exogenous environmental change. Thus, we find that a dynamically sufficient description of phenotypic evolution in gradient form generally requires a singular constraining matrix.

Dynamic sufficiency for phenotypic evolution in gradient form requires that the constraining matrix is in geno-phenotype space particularly because of non-linear development. The **H**-matrix in phenotype space generally depends on the resident genotype via both the mutational covariance matrix and the developmental matrix. The developmental matrix depends on the resident genotype due to non-linear development, particularly gene-gene interaction, gene-phenotype interaction, and gene-environment interaction (see text below Layer 6, Eq. 5). The analogous dependence of **G** on allele frequency holds under quantitative genetics assumptions for the same reasons (Turelli, 1988; Service and Rose, 1985). If development is linear (i.e., the developmental map for all phenotypes is a linear function in all its variables at all ages), the developmental matrix no longer depends on the resident genotype (or allele frequency under quantitative genetics assumptions). If in addition the mutational covariance matrix is independent of the resident genotype, then the constraining matrix **H** in phenotype space is no longer dependent on the resident genotype. Thus, if one assumes linear development and that both mutational covariation and phenotypic selection are independent of the resident genotype (in addition to no social interactions, no exogenous plastic response, no total immediate genotypic selection, and no niche-constructed effects of the phenotype on fitness; Layer 7, Eq. 6), the **H** matrix in phenotype space becomes constant and the mechanistic Lande equation (Layer 7, Eq. 6) becomes dynamically sufficient. However, even simple models of explicit development involve non-linearities (e.g., Example, Eq. 1) and mutational covariation depends on the resident genotype whenever the genotype is constrained to take values within a finite range (e.g., between zero and one). 
Thus, consideration of even slightly realistic models of development seems unlikely to allow for a dynamically sufficient mechanistic Lande equation (i.e., following only phenotypic evolution).

Extensive research efforts have been devoted to determining the relevance of constraints in adaptive evolution (Arnold, 1992; Hine and Blows, 2006; Hansen and Houle, 2008; Jones et al., 2014; Hine et al., 2014; Engen and Sæther, 2021). Empirical research has found that the smallest eigenvalue of **G** in phenotype space is often close to zero (Kirkpatrick and Lofsvold, 1992; Hine and Blows, 2006; McGuigan and Blows, 2007). Mezey and Houle (2005) found a non-singular **G**-matrix for 20 morphological (so, developed) traits in fruit flies. Our results suggest **G** singularity would still arise in all these studies if enough traits are included so as to guarantee a dynamically sufficient description of phenotypic evolution on an adaptive topography (i.e., if allele frequency were included in the analysis as part of the multivariate “geno-phenotype”).

Previous theory has offered limited predictions as to when the **G**-matrix would be singular. These include that incorporating more traits in the analysis renders **G** more likely to be singular as the traits are more likely to be genetically correlated, such as in infinite-dimensional traits (Gomulkiewicz and Kirkpatrick, 1992; Kirkpatrick and Lofsvold, 1992). Suggestions to include gene frequency as part of the trait vector in the classic Lande equation (e.g., Barfield et al., 2011) have been made without noticing that doing so entails that the associated **G**-matrix is necessarily singular. Kirkpatrick and Lofsvold (1992, p. 962 onwards) showed that, assuming that **G** in phenotypic space is singular and constant, then the evolutionary trajectory and equilibria depend on the evolutionarily initial conditions of the phenotype. In our framework, the evolutionarily initial conditions of the phenotype are given by the developmental constraint evaluated at the evolutionarily initial genotype and environment. Hence, the evolutionary trajectory and equilibria depend on the developmental constraint, which provides the admissible evolutionary path. Our results thus extend the relevance of Kirkpatrick and Lofsvold’s (1992) analysis by our observation that **H** is always singular in geno-phenotype space to yield a generally dynamically sufficient gradient system for the phenotype, even with few traits and evolving **H**.

Multiple mathematical models have addressed the question of the singularity of **G**. Recently, simulation work studying the effect of pleiotropy on the structure of the **G**-matrix found that the smallest eigenvalue of **G** is very small but positive (Engen and Sæther, 2021, Tables 3 and 5). Our findings indicate that this model and others (e.g., Wagner, 1984; Barton and Turelli, 1987; Wagner, 1989; Wagner and Mezey, 2000; Martin, 2014; Morrissey, 2014, 2015) would recover **G**-singularity by considering the geno-phenotype so both allele frequency and phenotype change are part of the gradient system. Other recent simulation work found that a singular **G**-matrix due to few segregating alleles still allows the phenotype to reach its unconstrained optimum if all loci have segregating alleles at some point over the long run, thus allowing for evolutionary change in all directions of phenotype space in the long run (Barton, 2017, Fig. 3). Our results indicate that such a model attains the unconstrained optimum because it assumes that fitness depends on a single phenotype at a single age, and that there is no direct genotypic selection and no niche-constructed effects of the genotype on fitness (i.e., there ∂*w*/∂**y** = **0** and , which since fitness depends on a single trait k at a single age j further reduces to (d*x*_{kj}/d*y*_{ia})(∂*w*/∂*x*_{kj}); hence, d*w*/d*y*_{ij} = 0 for any locus *i* there implies ∂*w*/∂*x*_{kj} = 0; Eq. Layer 4, Eq. 22). Our results show that when at least one of these assumptions does not hold, the unconstrained optimum is not necessarily achieved (as illustrated in Example, Eq. 3 and Fig. 5). In our framework, phenotypic evolution converges at best to constrained fitness optima, which may under certain conditions coincide with unconstrained fitness optima. 
Convergence to constrained fitness optima under no absolute mutational constraints occurs even with the fewest number of traits allowed in our framework: two, that is, one genotypic trait and one phenotype with one age each (or in a standard quantitative genetics framework, allele frequency at a single locus and one quantitative trait that is a function of such allele frequency). Such constrained adaptation has important implications for biological understanding (see e.g., Kirkpatrick and Lofsvold, 1992; Gomulkiewicz and Kirkpatrick, 1992) and is consistent with empirical observations of lack of selection response in the wild despite selection and genetic variation (Merilä et al., 2001; Hansen and Houle, 2004; Pujol et al., 2018), and of relative lack of stabilizing selection (Kingsolver et al., 2001; Kingsolver and Diamond, 2011).

Our results provide a mechanistic description of breeding value, thus allowing for insight regarding the structure and evolution of the constraining matrix, here **H** or **L**. We have defined mechanistic breeding value, not in terms of regression coefficients as traditionally done, but in terms of total derivatives with components mechanistically arising from lower level processes. This yields a mechanistic description of the constraining matrices in terms of total effects of the genotype, which recover previous results in terms of regression coefficients and random matrices (Fisher, 1918; Wagner, 1984; Barton and Turelli, 1987; Lynch and Walsh, 1998; Martin, 2014; Morrissey, 2014). Matrices of total effects of the genotype are mechanistic analogues of Fisher’s (1918) additive effects of allelic substitution (his α) and of Wagner’s (1984, 1989) developmental matrix (his **B**). Our formulas for total effects allow one to compute the effect of a perturbation of the genotype, phenotype, or environment at an early age on the phenotype at a later age. Yet, by being defined from derivatives rather than regression, mechanistic breeding values do not satisfy the classic partitioning of phenotypic variance into genetic and “environmental” variances, and so mechanistic heritability can be greater than one.
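
As an illustration of how a total effect across ages can be computed, the sketch below propagates a perturbation of a genotypic trait at an early age to the phenotype at the final age by multiplying direct effects along a developmental recurrence, and compares the result with a finite-difference estimate. The scalar developmental map `g`, its parameter values, and the function names are hypothetical and serve only to show the chain-rule structure; they are not the paper's formulas.

```python
import math


def g(x, y):
    """Hypothetical developmental map x_{a+1} = g(x_a, y_a)."""
    return x + y * math.exp(-x)


def dg_dx(x, y):
    """Direct (partial) effect of the phenotype on itself one age later."""
    return 1.0 - y * math.exp(-x)


def dg_dy(x, y):
    """Direct (partial) effect of the genotypic trait on the next phenotype."""
    return math.exp(-x)


def develop(x1, ys):
    """Phenotype across ages given the initial phenotype and genotypic traits."""
    xs = [x1]
    for y in ys:
        xs.append(g(xs[-1], y))
    return xs


def total_effect_of_genotype(x1, ys, a):
    """Total effect of the genotypic trait at age index a on the final phenotype.

    The perturbation enters through dg/dy at age a and is then carried
    forward by the product of dg/dx over the later ages (chain rule).
    """
    xs = develop(x1, ys)
    effect = dg_dy(xs[a], ys[a])
    for j in range(a + 1, len(ys)):
        effect *= dg_dx(xs[j], ys[j])
    return effect


x1, ys = 0.1, [0.5, 0.4, 0.3, 0.2]
analytic = total_effect_of_genotype(x1, ys, a=1)

# Finite-difference estimate of the same total derivative.
eps = 1e-6
bumped = list(ys)
bumped[1] += eps
numeric = (develop(x1, bumped)[-1] - develop(x1, ys)[-1]) / eps

print(analytic, numeric)
```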

Evolutionary analysis might have been hindered by lack of a mechanistic theory of breeding value and thus of the constraining matrix. Ever since Lande (1979) it has been clear that direct directional selection on the phenotype would be insufficient to identify evolutionary equilibria if the **G**-matrix were singular (Lande, 1979; Via and Lande, 1985; Kirkpatrick and Lofsvold, 1992; Gomulkiewicz and Kirkpatrick, 1992). Wagner (1984, 1989) constructed and analysed evolutionary models considering developmental maps, and wrote the **G**-matrix in terms of his developmental matrix to assess its impact on the maintenance of genetic variation. Yet, without a mechanistic theory of the constraining matrix, Wagner (1984, 1988, 1989) and Wagner and Mezey (2000) did not simultaneously track the evolution of genotypes and phenotypes, so did not conclude that the associated **G**-matrix is necessarily singular or that the developmental matrix affects evolutionary equilibria. Wagner’s (1984, 1989) models have been used to devise models of constrained adaptation in a fitness landscape, borrowing ideas from computer science (Altenberg, 1995, his Fig. 2). This and other models (Houle 1991, his Fig. 2 and Kirkpatrick and Lofsvold 1992, their Fig. 5) have suggested how constrained evolutionary dynamics would proceed although they have lacked a mechanistic theory of breeding value and thus of **G** and its evolutionary dynamics. Other models borrowing ideas from computer science have found that epistasis can cause the evolutionary dynamics to take an exponentially long time to reach fitness peaks (Kaznatcheev, 2019). Our mechanistic treatment of genetic covariation finds that as the **H**-matrix in geno-phenotype space has at least as many zero eigenvalues as there are lifetime phenotypes (i.e., *N*_{a}*N*_{p}), even if there were infinite time, the population does not necessarily reach a fitness peak in geno-phenotype space. 
However, the population eventually reaches a fitness peak in genotype space if there are no absolute mutational constraints after the landscape is modified by the interaction of the total effects of the genotype on phenotype and direct phenotypic selection and by the total niche-constructed effects of the genotype on fitness.

We find that total genotypic selection provides more information regarding selection response than direct directional selection or other forms of total selection. We show that evolutionary equilibria occur when total genotypic selection vanishes if there are no absolute mutational constraints and no exogenous plastic response. Direct selection or total selection on the phenotype need not vanish at evolutionary equilibria, even if there are no absolute mutational constraints and no exogenous plastic response. As total genotypic selection depends on development rather than exclusively on (unconstrained) selection, and as development determines the admissible evolutionary trajectory along which developmental and environmental constraints are satisfied, our findings show that development has a major evolutionary role by sharing responsibility with selection for defining evolutionary equilibria and for determining the admissible evolutionary path. Future work should assess to what extent these conclusions depend on our assumptions, particularly that of deterministic development.

Total selection gradients correspond to quantities that have received various names. Such gradients correspond to Caswell’s (1982, 2001) “total derivative of fitness” (denoted by him as dλ), Charlesworth’s (1994) “total differential” (of the population’s growth rate, denoted by him as d*r*), van Tienderen’s (1995) “integrated sensitivity” (of the population’s growth rate, denoted by him as IS), and Morrissey’s (2014, 2015) “extended selection gradient” (denoted by him as *η*). Total selection gradients measure total directional selection, so in our framework they take into account the downstream developmental effects of a trait on fitness. In contrast, Lande’s (1979) selection gradients measure direct directional selection, so in our framework’s terms they do not consider the developmentally immediate total effects of a trait on fitness nor the downstream developmental effects of a trait on fitness. We obtained compact expressions for total selection gradients as linear transformations of direct selection gradients, arising from the chain rule in matrix calculus notation (Layer 4, Eq. 20), analogously to previous expressions in terms of vital rates (Caswell, 2001, Eq. 9.38). Our mechanistic approach to total selection recovers the regression approach of Morrissey (2014) who defined the extended selection gradient as *η* = **Φ** *β*, where *β* is Lande’s selection gradient and **Φ** is the matrix of total effects of all traits on themselves (computed as regression coefficients between variables related by a path diagram rather than as total derivatives, which entails material differences with our approach as explained above). Morrissey (2014) used an equation for the total-effect matrix **Φ** (his Eq. 2) from path analysis (Greene, 1977, p. 380), which has the form of our matrices describing developmental feedback of the phenotype and of the geno-phenotype (Layer 4, Eq. 1 and Layer 4, Eq. 9). Thus, interpreting Morrissey’s (2014) **Φ** as our matrix describing developmental feedback of the phenotype (resp. of the geno-phenotype) and *β* as our direct selection gradient of the phenotype (resp. of the geno-phenotype) (i.e., Lande’s selection gradient of the phenotype or the geno-phenotype if environmental traits are not explicitly included in the analysis), then Layer 4, Eq. 21 (resp. Layer 4, Eq. 24) shows that the extended selection gradient *η* = **Φ** *β* corresponds to the total selection gradient of the phenotype (resp. of the geno-phenotype). We did not show that our matrix of total effects of the geno-envo-phenotype on itself has the form of the equation for **Φ** provided by Morrissey (2014) (his Eq. 2), but it might indeed hold. If we interpret **Φ** as our matrix of total effects of the geno-envo-phenotype on itself and *β* as our direct selection gradient of the geno-envo-phenotype (i.e., Lande’s selection gradient of the geno-envo-phenotype thus explicitly including environmental traits in the analysis), then Layer 4, Eq. 25 shows that the extended selection gradient *η* = **Φ** *β* corresponds to the total selection gradient of the geno-envo-phenotype.

Not all total selection gradients provide a relatively complete description of the selection response. We show in Appendix H (Eq. H4) and Appendix J (Eq. J4) that the selection response of the geno-phenotype or the geno-envo-phenotype can respectively be written in terms of the total selection gradients of the geno-phenotype or the geno-envo-phenotype, but such total selection gradients are insufficient to predict evolutionary equilibria because they are premultiplied by a singular socio-genetic cross-covariance matrix. Also, the selection response of the phenotype can be written in terms of the total selection gradient of the phenotype, but this expression for the selection response has an additional term involving the total immediate selection gradient of the genotype, so the total selection gradient of the phenotype is insufficient to predict evolutionary equilibria (even more so considering that following the evolutionary dynamics of the phenotype alone is generally dynamically insufficient). In contrast, we have shown that the total selection gradient of the genotype predicts evolutionary equilibria if there are no absolute mutational constraints and no exogenous plastic response. Thus, out of all total selection gradients considered, only total genotypic selection provides a relatively complete description of the selection response. Morrissey (2015) considers that the total selection gradient of the genotype (his “inputs”) and of the phenotype (his “traits”) would be equal, but the last line of Layer 4, Eq. 22 shows that the total selection gradients of the phenotype and genotype are different in general, particularly due to direct genotypic selection and the total effects of genotype on phenotype.
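
The relation between an extended (total) selection gradient and a direct selection gradient discussed above, *η* = **Φ** *β*, can be sketched for a small path diagram. The example assumes the standard path-analysis form in which the total-effect matrix of an acyclic diagram is the finite sum of powers of the direct-effect matrix (equivalently, the inverse of the identity minus that matrix). The three traits, the direct-effect values, and the convention that `B[i][j]` is the direct effect of trait *i* on trait *j* are illustrative assumptions, not values or conventions taken from Morrissey (2014) or from our equations.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def total_effects(B):
    """Phi = I + B + B^2 + ...; the sum terminates for an acyclic diagram
    and then equals the inverse of (I - B)."""
    n = len(B)
    phi, power = identity(n), identity(n)
    for _ in range(n - 1):
        power = matmul(power, B)
        phi = [[phi[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return phi


# Assumed diagram: trait 0 affects trait 1 (0.5); trait 1 affects trait 2 (2.0).
B = [[0.0, 0.5, 0.0],
     [0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0]]
beta = [0.0, 0.0, 0.3]  # only trait 2 is under direct selection

phi = total_effects(B)
eta = [sum(phi[i][j] * beta[j] for j in range(3)) for i in range(3)]

# Check that phi is the inverse of (I - B).
i_minus_b = [[identity(3)[i][j] - B[i][j] for j in range(3)] for i in range(3)]
check = matmul(phi, i_minus_b)

print(eta)
```

Traits 0 and 1 experience no direct selection in this toy diagram, yet their total gradients are non-zero because their effects reach fitness through trait 2.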

Our results allow for the modeling of evo-devo dynamics in a wide array of settings. First, developmental and environmental constraints (Layer 7, Eq. 1b and Layer 7, Eq. 1c) can mechanistically describe development, gene-gene interaction, and gene-environment interaction, while allowing for arbitrary non-linearities and evolution of the developmental map. Several previous approaches have modelled gene-gene interaction, such as by considering multiplicative gene effects, but general frameworks mechanistically linking gene-gene interaction, gene-environment interaction, developmental dynamics, and evolutionary dynamics have previously remained elusive (Rice, 1990; Hansen and Wagner, 2001; Rice, 2002; Hermisson et al., 2003; Carter et al., 2005; Rice, 2011). A historically dominant yet debated view is that gene-gene interaction has minor evolutionary effects as phenotypic evolution depends on additive rather than epistatic effects (under normality or to a first order of approximation), so epistasis would act by influencing a seemingly effectively non-singular **G** (Hansen, 2013; Nelson et al., 2013; Paixão and Barton, 2016; Barton, 2017). Our finding that the constraining matrix **H** is necessarily singular in a dynamically sufficient phenotypic adaptive topography entails that evolutionary equilibria depend on development and consequently on gene-gene and gene-environment interactions. Hence, gene-gene and gene-environment interaction can generally have strong and permanent evolutionary effects in the sense of defining together with selection what the evolutionary equilibria are (e.g., via developmental feedbacks) even by altering the **H**-matrix alone. This contrasts with a non-singular constraining matrix whereby evolutionary equilibria are pre-determined by selection.

Second, our results allow for the study of long-term evolution of the **H**-matrix as an emergent property of the evolution of the genotype, phenotype, and environment (i.e., the geno-envo-phenotype). In contrast, it has been traditional to study short-term evolution of **G** by treating it as another dynamic variable under constant allele frequency (Bulmer, 1971; Lande, 1979; Bulmer, 1980; Lande, 1980; Lande and Arnold, 1983; Barton and Turelli, 1987; Turelli, 1988; Gavrilets and Hastings, 1994; Carter et al., 2005; Débarre et al., 2014). Third, our results allow for the study of the effects of developmental bias, biased genetic variation, and modularity (Wagner, 1996; Pavlicev and Hansen, 2011; Pavlicev et al., 2011; Wagner and Zhang, 2011; Pavlicev and Wagner, 2012; Watson et al., 2013). While we have assumed that mutation is unbiased for the genotype, our equations allow for the developmental map to lead to biases in genetic variation for the phenotype. This may lead to modular effects of mutations, whereby altering a genotypic trait at a given age tends to affect some phenotypes but not others.

Fourth, our equations facilitate the study of life-history models with dynamic constraints. Life-history models with dynamic constraints have typically assumed evolutionary equilibrium, so they are analysed using dynamic optimization techniques such as dynamic programming and optimal control (e.g., León, 1976; Iwasa and Roughgarden, 1984; Houston and McNamara, 1999; González-Forero et al., 2017; Avila et al., 2021). In recent years, mathematically modeling the evolutionary dynamics of life-history models with dynamic constraints, that is, of what we call the evo-devo dynamics, has been made possible with the canonical equation of adaptive dynamics for function-valued traits (Dieckmann et al., 2006; Parvinen et al., 2013; Metz et al., 2016). However, such an approach poses substantial mathematical challenges by requiring derivation of functional derivatives and solution of associated differential equations for costate variables (Parvinen et al., 2013; Metz et al., 2016; Avila et al., 2021). By using discrete age, we have obtained closed-form equations that facilitate modeling the evo-devo dynamics. By doing so, our framework yields an alternative method to dynamic optimization to analyse a broad class of life-history models with dynamic constraints (see Example).

Fifth, our framework allows for the modeling of the evo-devo dynamics of pattern formation by allowing the implementation of reaction-diffusion equations in *discrete space* in the developmental map, once equations are suitably written (e.g., Eq. 6.1 of Turing, 1952; Tomlin and Axelrod, 2007; Supplementary Information section S6). Thus, the framework may allow one to implement and analyse the evo-devo dynamics of existing detailed models of the development of morphology (e.g., Salazar-Ciudad and Jernvall, 2010; Salazar-Ciudad and Marín-Riera, 2013), to the extent that developmental maps can be written in the form of Eq. (1). Sixth, our framework also allows for the mechanistic modeling of adaptive plasticity, for instance, by implementing reinforcement learning or supervised learning in the developmental map (Sutton and Barto, 2018; Paenke et al., 2007). In practice, to use our framework to model the evo-devo dynamics, it may often be simpler to compute the developmental dynamics of the phenotype and the evolutionary dynamics of the genotype (as in Fig. 5), rather than the evolutionary dynamics of the geno-phenotype or geno-envo-phenotype. When this is the case, after solving for the evo-devo dynamics, one can then compute the matrices composing the evolutionary dynamics of the geno-phenotype and geno-envo-phenotype to gain further understanding of the evolutionary factors at play, including the evolution of the **H**-matrix (as in Fig. 6).
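
To indicate how a reaction-diffusion system in discrete space can take the form of a developmental recurrence, the sketch below writes one age step as a map from the current phenotype (morphogen concentrations per cell) and genotypic traits (rates) to the phenotype at the next age. Everything in it is an assumption made for illustration: the ring of cells, the linear reaction terms, the unit time step per age, and the names `developmental_map` and `develop`. It is not Eq. 6.1 of Turing (1952) nor the formulation in the Supplementary Information.

```python
def laplacian(u):
    """Discrete Laplacian on a ring of cells (periodic boundary, assumed)."""
    n = len(u)
    return [u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n] for i in range(n)]


def developmental_map(x, y):
    """One age step x_{a+1} = g(x_a, y_a) for two morphogens on a ring.

    x = (u, v): concentrations of two morphogens in each cell.
    y = (d_u, d_v, a, b, c, d): assumed genotypic traits, namely two
    diffusion rates and four linear reaction coefficients.
    """
    u, v = x
    d_u, d_v, a, b, c, d = y
    lap_u, lap_v = laplacian(u), laplacian(v)
    cells = range(len(u))
    new_u = [u[i] + a * u[i] + b * v[i] + d_u * lap_u[i] for i in cells]
    new_v = [v[i] + c * u[i] + d * v[i] + d_v * lap_v[i] for i in cells]
    return new_u, new_v


def develop(x1, y, n_ages):
    """Apply the developmental map from the first age to the last age."""
    xs = [x1]
    for _ in range(n_ages - 1):
        xs.append(developmental_map(xs[-1], y))
    return xs


# Initial phenotype: each morphogen concentrated in a single cell.
x1 = ([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.5, 0.0, 0.0])
# Diffusion only (reaction coefficients set to zero) as a simple check.
y_diffusion_only = (0.1, 0.2, 0.0, 0.0, 0.0, 0.0)

trajectory = develop(x1, y_diffusion_only, n_ages=5)
final_u, final_v = trajectory[-1]
print(sum(final_u), sum(final_v))  # total amounts under pure diffusion
```

Setting the reaction coefficients to non-zero values turns the same map into a coupled reaction-diffusion step; the diffusion-only case is used here because conservation of the total amount of each morphogen gives a simple sanity check.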

By allowing development to be social, our framework allows for a mechanistic description of extra-genetic inheritance and indirect genetic effects. Extra-genetic inheritance can be described since the phenotype at a given age can be an identical or modified copy of the geno-phenotype of social partners. Thus, social development allows for the modeling of social learning (Sutton and Barto, 2018; Paenke et al., 2007) and epigenetic inheritance (Jablonka et al., 1992; Slatkin, 2009; Day and Bonduriansky, 2011). However, in our framework extra-genetic inheritance is insufficient to yield phenotypic evolution that is independent of both genetic evolution and exogenous plastic change (e.g., in the framework, there cannot be cultural evolution without genetic evolution or exogenous environmental change). This is seen by setting mutational covariation and exogenous environmental change to zero (in particular, **H**_{y} = **0** for mutational covariation), which eliminates evolutionary change. The reason is that although there is extra-genetic *inheritance* in our framework, there is no extra-genetic *variation* because development is deterministic and we use adaptive dynamics assumptions: without mutation, every SDS resident develops the same phenotype as every other resident. Extensions to consider stochastic development might enable extra-genetic variation and possibly phenotypic evolution that is independent of genetic and exogenously plastic evolution. Yet, we have only considered social interactions among non-relatives, so our framework at present only allows for social learning or epigenetic inheritance from non-relatives.

Our framework can mechanistically describe indirect genetic effects via social development because the developed phenotype can be mechanistically influenced by the genotype or phenotype of social partners. Indirect genetic effects mean that a phenotype may be partly or completely caused by genes located in another individual (Moore et al., 1997). Indirect genetic effect approaches model the phenotype considering a linear regression of individual’s phenotype on social partner’s phenotype (Kirkpatrick and Lande, 1989; Moore et al., 1997; Townley and Ezard, 2013), whereas our approach constructs individual’s phenotype from development depending on social partners’ genotype and phenotypes. We found that social development generates social feedback (Layer 5, Eq. 1), which closely though not entirely corresponds to social feedback found in the indirect genetic effects literature (Moore et al., 1997, Eq. 19b and subsequent text). The social feedback we obtain depends on total social developmental bias from the phenotype (Layer 4, Eq. 5); analogously, social feedback in the indirect genetic effects literature depends on the matrix of interaction coefficients (**Ψ**) which contains the regression coefficients of phenotype on social partner’s phenotype. Social development leads to a generalization of mechanistic additive genetic covariance matrices **H** = cov[**b**, **b**] into mechanistic additive socio-genetic cross-covariance matrices **L** = cov[**b**^{s}, **b**]; similarly, indirect genetic effects involve a generalization of the **G**-matrix, which includes **C**_{ax} = cov[**a**, **x**], namely the cross-covariance matrix between multivariate breeding value and phenotype (Kirkpatrick and Lande, 1989; Moore et al., 1997; Townley and Ezard, 2013). However, there are differences between our results and those in the indirect genetic effects literature. 
First, social feedback (in the sense of inverse matrices involving **Ψ**) appears twice in the evolutionary dynamics under indirect genetic effects (see Eqs. 20 and 21 of Moore et al. 1997) while it only appears once in our evolutionary dynamics equations (Layer 6, Eq. 10). This difference may stem from the assumption in the indirect genetic effects literature that social interactions are reciprocal, while we assume that they are asymmetric in the sense that, since mutants are rare, mutant’s development depends on residents but resident’s development does not depend on mutants (we thank J. W. McGlothlin for pointing this out). Second, our **L** matrices make the evolutionary dynamics equations depend on total social developmental bias from the genotype (Layer 5, Eq. 2a) in a non-feedback manner (specifically, not in an inverse matrix) but this type of dependence does not occur in the evolutionary dynamics under indirect genetic effects (Eqs. 20 and 21 of Moore et al. 1997). This difference might stem from the absence of explicit tracking of allele frequency in the indirect genetic effects literature in keeping with the tradition of quantitative genetics, whereas we explicitly track the genotype. Third, “social selection” plays no role in our results consistently with our assumption of a well-mixed population, but social selection plays an important role in the indirect genetic effects literature even if relatedness is zero (McGlothlin et al., 2010, e.g., setting *r* = 0 in their Eq. 10 still leaves an effect of social selection on selection response due to “phenotypic” kin selection).

Our framework offers formalizations to the notions of developmental constraints and developmental bias. The two notions have been often interpreted as equivalents (e.g., Brakefield, 2006), or with a distinction such that constraints entail a negative, prohibiting effect while bias entails a positive, directive effect of development on the generation of phenotypic variation (Uller et al., 2018; Salazar-Ciudad, 2021). We defined developmental constraint as the condition that the phenotype at a given age is a function of the individual’s condition at their immediately previous age, which both prohibits certain values of the phenotype and has a “directive” effect on the generation of phenotypic variation. We offered quantification of developmental bias in terms of the slope of the phenotype with respect to itself at subsequent ages. No bias would lead to zero slopes and thus to identity matrices, and deviations from the identity matrix would constitute bias.

Our results clarify the role of several developmental factors previously suggested to be evolutionarily important. We have arranged the evo-devo process in a layered structure, where a given layer is formed by components of layers below (Fig. 4). This layered structure helps see that several developmental factors previously suggested to have important evolutionary effects (Laland et al., 2014) but with little clear connection (Welch, 2017) can be viewed as basic elements of the evolutionary process. Direct-effect matrices (Layer 2) are basic in that they form all the components of the evolutionary dynamics (Layer 7) except mutational covariation and exogenous environmental change. Direct-effect matrices quantify direct (i) directional selection, (ii) developmental bias, (iii) niche construction, (iv) social developmental bias (e.g., extra-genetic inheritance and indirect genetic effects; Moore et al. 1997), (v) social niche construction, (vi) environmental sensitivity of selection (Chevin et al., 2010), and (vii) phenotypic plasticity. These factors variously affect selection and development, thus affecting evolutionary equilibria and the admissible evolutionary trajectory.

Our approach uses discrete rather than continuous age, which substantially simplifies the mathematics. This treatment allows for the derivation of closed-form expressions for what can otherwise be a difficult mathematical challenge if age is continuous (Kirkpatrick and Heckman, 1989; Dieckmann et al., 2006; Parvinen et al., 2013; Metz et al., 2016; Avila et al., 2021). For instance, costate variables are key in dynamic optimization as used in life-history models (Gadgil and Bossert, 1970; León, 1976; Schaffer, 1983; Stearns, 1992; Roff, 1992; Kozłowski and Teriokhin, 1999; Sydsæter et al., 2008), but general closed-form formulas for costate variables were previously unavailable and their calculation often limits the analysis of such models. In Appendix K, we show that our results recover the key elements of Pontryagin’s maximum principle, which is the central tool of optimal control theory to solve dynamic optimization problems (Sydsæter et al., 2008). Under the assumption that there are no environmental traits (hence, no exogenous plastic response), in Appendix K, we show that an admissible locally stable evolutionary equilibrium solves a local, dynamic optimization problem of finding a genotype that both “totally” maximises a mutant’s lifetime reproductive success *R*_{0} and “directly” maximises the Hamiltonian of Pontryagin’s maximum principle. We show that this Hamiltonian depends on costate variables that are proportional to the total selection gradient of the phenotype at evolutionary equilibrium (Eq. K3), and that the costate variables satisfy the costate equations of Pontryagin’s maximum principle. Thus, our approach offers an alternative method to optimal control theory to find admissible evolutionary equilibria for the broad class of models considered here. By exploiting the discretization of age, we have obtained various formulas that can be computed directly for the total selection gradient of the phenotype (Layer 4, Eq. 
21), so for costate variables, and of their relationship to total genotypic selection (fifth line of Layer 4, Eq. 22), thus facilitating analytic and numerical treatment of life-history models with dynamic constraints. Although discretization of age may induce numerical imprecision relative to continuous age (Kirkpatrick and Heckman, 1989), numerical and empirical treatment of continuous age typically involves discretization at one point or another, with continuous curves often achieved by interpolation (e.g., Kirkpatrick et al., 1990). Numerical precision with discrete age may be increased by reducing the age bin size (e.g., to represent months or days rather than years; Caswell, 2001), potentially at a computational cost.
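
The connection between costate variables and the total selection gradient of the phenotype can be illustrated with the standard discrete-time costate (adjoint) recursion, which runs backwards in age. The sketch below is not the derivation of Appendix K: the developmental map `g`, the age-specific fitness contribution `w_age`, the parameter values, and the function names are hypothetical. It computes the gradient of lifetime fitness with respect to the phenotype at each age, where a perturbation at age *a* is propagated only to later ages, and checks one entry by finite differences.

```python
import math


def g(x, y):
    """Hypothetical developmental map x_{a+1} = g(x_a, y_a)."""
    return x + y * math.exp(-x)


def dg_dx(x, y):
    """Direct effect of the phenotype on itself one age later."""
    return 1.0 - y * math.exp(-x)


def w_age(x, y):
    """Assumed age-specific fitness contribution: benefit of x, cost of y."""
    return math.log(1.0 + x) - 0.5 * y * y


def dw_age_dx(x, y):
    """Direct effect of the phenotype on fitness at the same age."""
    return 1.0 / (1.0 + x)


def develop(x1, ys):
    """Phenotype at each age; there is one genotypic trait value per age."""
    xs = [x1]
    for y in ys[:-1]:
        xs.append(g(xs[-1], y))
    return xs


def costates(x1, ys):
    """Backward recursion lambda_a = dw_a/dx_a + (dg/dx_a) * lambda_{a+1}."""
    xs = develop(x1, ys)
    lam = [0.0] * len(xs)
    lam[-1] = dw_age_dx(xs[-1], ys[-1])
    for a in range(len(xs) - 2, -1, -1):
        lam[a] = dw_age_dx(xs[a], ys[a]) + dg_dx(xs[a], ys[a]) * lam[a + 1]
    return lam


def fitness_from(a, x_a, xs, ys):
    """Lifetime fitness when the phenotype at age a is reset to x_a and
    development continues from there; earlier ages are unaffected."""
    total = sum(w_age(xs[j], ys[j]) for j in range(a))
    x = x_a
    for j in range(a, len(ys)):
        total += w_age(x, ys[j])
        if j < len(ys) - 1:
            x = g(x, ys[j])
    return total


x1, ys = 0.1, [0.5, 0.4, 0.3, 0.2]
xs = develop(x1, ys)
lam = costates(x1, ys)

# Finite-difference check of the total gradient at age index 1.
eps, a = 1e-6, 1
numeric = (fitness_from(a, xs[a] + eps, xs, ys)
           - fitness_from(a, xs[a], xs, ys)) / eps
print(lam[a], numeric)
```

A single backward pass yields the gradient at every age, which is the practical appeal of having closed-form expressions in discrete age.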

By simplifying the mathematics, our approach yields insight that may be otherwise challenging to gain. Life-history models with dynamic constraints generally find that costate variables are non-zero under optimal controls (Gadgil and Bossert, 1970; Taylor et al., 1974; León, 1976; Schaffer, 1983; Houston et al., 1988; Houston and McNamara, 1999; Sydsæter et al., 2008). This means that there is persistent total selection on the phenotype at evolutionary equilibrium. Our findings show that this is to be expected for various reasons including absolute mutational constraints (i.e., active path constraints so controls remain between zero and one, as in the Example), the occurrence of direct genotypic selection, and there being more state variables than control variables (in which case *δ***x**^{⊺}/*δ***y** is singular as it has more rows than columns, even after removing initial states and final controls from the analysis; Eq. C10) (fifth line of Layer 4, Eq. 22). Thus, zero total genotypic selection at equilibrium may involve persistent total phenotypic selection. Moreover, life-history models with explicit developmental constraints have found that their predictions can be substantially different from those found without explicit developmental constraints. In particular, without developmental constraints, the outcome of parent-offspring conflict over sex allocation has been found to be an intermediate between the outcomes preferred by mother and offspring (Reuter and Keller, 2001), whereas with developmental constraints, the outcome has been found to be that preferred by the mother (Avila et al., 2019). Our results show that changing the particular form of the developmental map may induce substantial changes in predictions by influencing total genotypic selection and the admissible evolutionary equilibria. 
In other words, the developmental map used alters the evolutionary outcome because it modulates absolute socio-genetic constraints (i.e., the **H** or **L** matrices in geno-phenotype space).

We have obtained a term that we refer to as exogenous plastic response, which is the plastic response to exogenous environmental change over an evolutionary time step (Layer 7, Eq. 3). An analogous term occurs in previous equations (Eq. A3 of Chevin et al. 2010). Additionally, our framework considers *endogenous* plastic response due to niche construction (i.e., endogenous environmental change), which affects both the selection response and the exogenous plastic response. Exogenous plastic response affects the evolutionary dynamics even though it is not ultimately caused by change in the resident genotype (or in gene frequency), but by exogenous environmental change. In particular, exogenous plastic response allows for a straightforward form of “plasticity-first” evolution (Waddington, 1942, 1961; West-Eberhard, 2003) as follows. At an evolutionary equilibrium where exogenous plastic response is absent, the introduction of exogenous plastic response generally changes socio-genetic covariation or directional selection at a subsequent evolutionary time, thereby inducing selection response. This constitutes a simple form of plasticity-first evolution, whereby plastic change precedes genetic change, although the plastic change may not be adaptive and the induced genetic change may have a different direction to that of the plastic change.

Empirical estimation of the developmental map may be facilitated by it defining a dynamic equation. Whereas the developmental map defines a dynamic equation to construct the phenotype, the genotype-phenotype map corresponds to the solution of such dynamic equation. It is often impractical or impossible to write the solution of a dynamic equation, even if the dynamic equation can be written in practice. Accordingly, it may often prove impractical to empirically estimate the genotype-phenotype map, whereas it may be more tractable to empirically infer developmental maps. Inference of developmental maps from empirical data can be pursued via the growing number of methods to infer dynamic equations from data (Schmidt and Lipson, 2009; Brunton et al., 2016; Ghadami and Epureanu, 2022, and papers in the special issue).
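
As a toy illustration of why inferring a dynamic equation can be tractable, the sketch below fits a one-step developmental map to a phenotype-by-age trajectory using ordinary least squares. This is a deliberately simple stand-in, not one of the cited equation-discovery methods: the linear form *x*_{a+1} = *c*_0 + *c*_1 *x*_a, the synthetic noiseless data, and the name `fit_linear_map` are assumptions for illustration only.

```python
def fit_linear_map(xs):
    """Ordinary least-squares fit of x_{a+1} = c0 + c1 * x_a
    from a single phenotype trajectory across ages."""
    prev, nxt = xs[:-1], xs[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    c1 = cov / var_prev
    c0 = mean_next - c1 * mean_prev
    return c0, c1


# Synthetic trajectory generated by an assumed map x_{a+1} = 1.0 + 0.8 * x_a.
xs = [0.0]
for _ in range(10):
    xs.append(1.0 + 0.8 * xs[-1])

c0, c1 = fit_linear_map(xs)
print(c0, c1)  # estimates of the intercept and slope of the one-step map
```

The fit only needs pairs of consecutive ages, whereas recovering the genotype-phenotype map directly would require the closed-form solution of the recurrence across all ages.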

To conclude, we have formulated a framework that synthesizes developmental and evolutionary dynamics yielding a theory of long-term phenotypic evolution on an adaptive topography by mechanistically describing the long-term evolution of genetic covariation. This framework shows that development has major evolutionary effects by showing that selection and development jointly define the evolutionary outcomes if mutation is not absolutely constrained and exogenous plastic response is absent, rather than the outcomes being defined only by selection. Our results provide a tool to chart major territory on how development affects evolution.

## 8. Acknowledgements

I thank Andy Gardner for extensive support throughout this project, by discussing, reading, and commenting on the many drafts, and offering interpretation, advice, and funding. I thank K.N. Laland, R. Lande, L.C. Mikula, A.J. Moore, and M.B. Morrissey for comments on previous versions of the manuscript, and J.W. McGlothlin, J.A.J. Metz, I. Salazar-Ciudad, and D.M. Shuker for discussion. I thank T.F. Hansen, M. Pavlicev, S.J. Schreiber, and three anonymous reviewers for very helpful criticism. I thank M.B. Morrissey for discussion and explanation of his work. This work was funded by an ERC Consolidator Grant to A. Gardner (grant no. 771387), by the School of Biology of the University of St Andrews, and by a John Templeton Foundation grant to K.N. Laland and T. Uller (grant ID 60501).

## Appendix A. Matrix calculus notation

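Following Caswell (2019), for vectors **a** ∈ ℝ^{n×1} and **b** ∈ ℝ^{m×1}, we denote

$$
\frac{\partial \mathbf{a}}{\partial \mathbf{b}^{\intercal}} =
\begin{pmatrix}
\frac{\partial a_1}{\partial b_1} & \cdots & \frac{\partial a_1}{\partial b_m} \\
\vdots & \ddots & \vdots \\
\frac{\partial a_n}{\partial b_1} & \cdots & \frac{\partial a_n}{\partial b_m}
\end{pmatrix} \in \mathbb{R}^{n \times m},
$$

so (∂**a**/∂**b**^{⊺})^{⊺} = ∂**a**^{⊺}/∂**b**. The same notation applies with total derivatives.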

## Appendix B. Total selection gradient of the phenotype

Here we derive the total selection gradient of the phenotype, which is part of and simpler to derive than the total selection gradient of the genotype.

### Appendix B.1. Total selection gradient of the phenotype in terms of direct fitness effects

We start by considering the total selection gradient of the *i*-th phenotype at age *a*. By this, we mean the total selection gradient of a perturbation of *x*_{ia} taken as initial condition of the recurrence equation (1) when applied at the ages {*a*, …, *n*}. Consequently, a perturbation in a phenotype at a given age does not affect phenotypes at earlier ages, in short, due to *the arrow of developmental time*. By letting **ζ** in Eq. (S19) be *x*_{ia}, we obtain Eq. (B1). Note that the total derivatives of a mutant’s relative fitness at age *j* in Eq. (B1) are with respect to the individual’s phenotype at possibly another age *a*. From Eq. (S17), we have that a mutant’s relative fitness at age *j*, *w*_{j}, depends on the individual’s phenotype at the current age (recall **z**_{j} = (**x**_{j}; **y**_{j})), but from the developmental constraint (1) the phenotype at a given age depends on the phenotype at previous ages. We must then calculate the total derivatives of fitness in Eq. (B1) in terms of direct (i.e., partial) derivatives, thus separating the effects of phenotypes at the current age from those of phenotypes at other ages.

To do this, we start by applying the chain rule, and since we assume that genotypic traits are developmentally independent (hence, they do not depend on the phenotype, so d**y** _{j}/d*x*_{ia} = **0** for all *i* ∈ {1, …, *N*_{p}} and all *a, j* ∈ {1, …, *N*_{a}}), we obtain
Applying matrix calculus notation (Appendix A), this is
Applying matrix calculus notation again yields
Factorizing, we have
Eq. (B2) now contains only partial derivatives of age-specific fitness.

We now write Eq. (B2) in terms of partial derivatives of lifetime fitness. Consider the *direct selection gradient of the phenotype at age j* defined as
Such selection gradient of the phenotype at age *j* forms the selection gradient of the phenotype at all ages (Layer 2, Eq. 1). Similarly, the *direct selection gradient of the environment at age j* is
and the matrix of *direct effects of a mutant’s phenotype at age j on her environment at age j* is
From Eq. (S18), *w* only depends directly on **x** _{j}, **y** _{j}, and *ϵ*_{j} through *w*_{j}. So,
which substituted in Eq. (B2) yields
where the *total immediate selection gradient of the phenotype at age j* is
Consider now the total immediate selection gradient of the phenotype at all ages. The block column vector of *total immediate effects of a mutant’s phenotype on fitness* is
Using Layer 2, Eq. 2d, we have that
is a block column vector whose *j*-th entry equals the rightmost term in Eq. (B5). Thus, from (B5), Layer 2, Eq. 1, and (B6), it follows that the total immediate selection gradient of the phenotype is given by Layer 3, Eq. 1.

Now, we write the total selection gradient of *x*_{ia} in terms of the total immediate selection gradient of the phenotype. Substituting Eq. (B4) in Eq. (B1) yields
where we use the block row vector
Therefore, the total selection gradient of all phenotypes across all ages is
where the total immediate selection gradient of the phenotype is given by Layer 3, Eq. 1 and the block matrix of *total effects of a mutant’s phenotype on her phenotype* is
Using Layer 3, Eq. 1, expression (B7) is now in terms of partial derivatives of fitness, partial derivatives of the environment, and total effects of a mutant’s phenotype on her phenotype, d**x**^{⊺}/d**x**, which we now proceed to write in terms of partial derivatives only.

### Appendix B.2. Matrix of total effects of a mutant’s phenotype on her phenotype

From the developmental constraint (1) for the *k*-th phenotype at age *j* ∈ {2, …, *N*_{a}} we have that , so using the chain rule and since genotypic traits are developmentally independent we obtain
Applying matrix calculus notation (Appendix A), this is
Applying matrix calculus notation again yields
Factorizing, we have
Rewriting *g*_{k, j−1} as *x*_{k j} yields
Hence,
where we use the matrix of *direct effects of a mutant’s phenotype at age j on her phenotype at age j* + 1
and the matrix of *direct effects of a mutant’s environment at age j on her phenotype at age j* + 1
We can write Eq. (B8) more succinctly as
where we use the matrix of *total immediate effects of a mutant’s phenotype at age j on her phenotype at age j* + 1
The block matrix of *total immediate effects of a mutant’s phenotype on her phenotype* is
The equality (B11) follows because total immediate effects of a mutant’s phenotype on her phenotype are only nonzero at the next age (from the developmental constraint in Eq. 1) or when a variable is differentiated with respect to itself. Using Layer 2, Eq. 2d and Layer 2, Eq. 2c, we have that
which equals the rightmost term in Eq. (B10) for *j* = *a* + 1. Thus, from (B10), Layer 2, Eq. 2a, (B11), and (B12), it follows that the block matrix of total immediate effects of a mutant’s phenotype on her phenotype satisfies Layer 3, Eq. 3.

Eq. (B9) gives the matrix of total effects of the *i*-th phenotype of a mutant at age *a* on her phenotype at age *j*. Then, it follows that the matrix of total effects of all the phenotypes of a mutant at age *a* on her phenotype at age *j* is
Eq. (B13) is a recurrence equation for d**x**_{j}^{⊺}/d**x**_{a} over age *j* ∈ {2, …, *N*_{a}}. Because of the arrow of developmental time (due to the developmental constraint (1)), perturbations in an individual’s late phenotype do not affect the individual’s early phenotype (i.e., d**x**_{j}^{⊺}/d**x**_{a} = **0** for *j* < *a* and *j* ∈ {1, …, *N*_{a} − 1})^{1}. Additionally, from the arrow of developmental time (Eq. 1), a perturbation in an individual’s phenotype at a given age does not affect any other of the individual’s phenotypes at the *same* age (i.e., d**x**_{a}^{⊺}/d**x**_{a} = **I**, where **I** is the identity matrix). Hence, expanding the recurrence in Eq. (B13), we obtain for *j* ∈ {1, …, *N*_{a}} that
Thus, the block matrix of *total effects of a mutant’s phenotype on her phenotype* is
which is block upper triangular and its *a j*-th block entry is given by Layer 4, Eq. 2. Eq. (B15) and Layer 4, Eq. 2 write the matrix of total effects of a mutant’s phenotype on her phenotype in terms of partial derivatives, given Eq. (B10), as we sought.
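The expansion of the recurrence can be illustrated numerically. The sketch below assumes a hypothetical scalar developmental map with one phenotype per age and no genotype or environment, so that every block reduces to a single number. The map `g`, its derivative `dg`, and the helper names are illustrative assumptions rather than part of the framework. The code builds the upper triangular matrix of total effects from products of direct effects and compares each entry with a finite-difference perturbation in which the perturbed phenotype is taken as the initial condition of development.

```python
N_A = 4  # number of ages (hypothetical)

def g(x):
    # Hypothetical developmental map: x_{j+1} = g(x_j)
    return x + 0.1 * x * x

def dg(x):
    # Direct effect of the phenotype at one age on the phenotype at the next age
    return 1.0 + 0.2 * x

def develop(x_start, start_age, x_prev):
    """Phenotype at all ages (0-based), restarting development at start_age.

    Ages before start_age are copied unchanged from x_prev, which mimics
    the arrow of developmental time.
    """
    x = list(x_prev[:start_age]) + [x_start]
    for _ in range(start_age + 1, N_A):
        x.append(g(x[-1]))
    return x

def total_effects(x):
    """Upper triangular matrix with entry (a, j) = d x_j / d x_a."""
    D = [[0.0] * N_A for _ in range(N_A)]
    for a in range(N_A):
        D[a][a] = 1.0  # a phenotype differentiated with respect to itself
        for j in range(a + 1, N_A):
            # Recurrence over age j: multiply by the direct effect of age j-1 on age j
            D[a][j] = D[a][j - 1] * dg(x[j - 1])
    return D

def fd_entry(x, a, j, h=1e-6):
    """Central finite difference of x_j with respect to a perturbation of x_a."""
    up = develop(x[a] + h, a, x)
    dn = develop(x[a] - h, a, x)
    return (up[j] - dn[j]) / (2 * h)

x = develop(1.0, 0, [])
D = total_effects(x)
max_error = max(abs(D[a][j] - fd_entry(x, a, j))
                for a in range(N_A) for j in range(N_A))
```

In this toy setting, the entries below the diagonal are zero in both computations because a perturbation at a late age leaves earlier ages untouched.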

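From Eq. (B15), it follows that the matrix of total effects of a mutant’s phenotype on her phenotype is invertible. Indeed, since d**x**^{⊺}/d**x** is square and block upper triangular, its determinant is the product of the determinants of its diagonal blocks, det(d**x**^{⊺}/d**x**) = ∏_{*a*=1}^{*N*_{a}} det(d**x**_{a}^{⊺}/d**x**_{a}) (Horn and Johnson, 2013, p. 32). Since d**x**_{a}^{⊺}/d**x**_{a} = **I**, then det(d**x**_{a}^{⊺}/d**x**_{a}) = 1 for all *a* ∈ {1, …, *N*_{a}}. Hence, det(d**x**^{⊺}/d**x**) = 1 ≠ 0, so d**x**^{⊺}/d**x** is invertible.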

We now obtain a more compact expression for the matrix of total effects of a mutant’s phenotype on her phenotype in terms of partial derivatives. From Eq. (B11), it follows that
which is block 1-superdiagonal (i.e., only the entries in its first block super diagonal are non-zero). By definition of matrix power, we have that (*δ***x**^{⊺}/*δ***x** − **I**)^{0} = **I**. Now, from Eq. (B16), we have that
Using Eq. (B16), taking the second power yields
which is block 2-superdiagonal. This suggests the inductive hypothesis that
holds for some *i* ∈ {0, 1, …}, which is a block *i*-superdiagonal matrix. If this is the case, then we have that
This proves by induction that Eq. (B17) holds for every *i* ∈ {0, 1, …}, which together with Layer 4, Eq. 2 proves that
holds for all *i* ∈ {0, 1, …, *N*_{a}}. Evaluating this result at various *i*, note that
is a block matrix of zeros except in its block main diagonal which coincides with the block main diagonal of Eq. (B15). Similarly,
is a block matrix of zeros except in its first block super diagonal which coincides with the first block super diagonal of Eq. (B15). Indeed,
is a block matrix of zeros except in its *i*-th block superdiagonal, which coincides with the *i*-th block superdiagonal of Eq. (B15) for all *i* ∈ {1, …, *N*_{a} − 1}. Therefore, since any non-zero entry of the matrix (*δ***x**^{⊺}/*δ***x** − **I**)^{i} corresponds to a zero entry for the matrix (*δ***x**^{⊺}/*δ***x** − **I**)^{j} for any *i* ≠ *j* with *i, j* ∈ {0, …, *N*_{a} − 1}, it follows that
From the geometric series of matrices we have that
The last equality follows because *δ***x**^{⊺}/*δ***x** − **I** is strictly block triangular with block dimension *N*_{a} and so *δ***x**^{⊺}/*δ***x** − **I** is nilpotent with index smaller than or equal to *N*_{a}, which implies that (*δ***x**^{⊺}/*δ***x** − **I**)^{*N*_{a}} = **0**. From Eq. (B11), the matrix 2**I** − *δ***x**^{⊺}/*δ***x** is block upper triangular with only identity matrices in its block main diagonal, so all the eigenvalues of 2**I** − *δ***x**^{⊺}/*δ***x** equal one and the matrix is invertible; thus, the inverse matrix in Eq. (B19) exists. Finally, using Eq. (B19) in (B18) yields Layer 4, Eq. 1, which is a compact expression for the matrix of total effects of a mutant’s phenotype on her phenotype in terms of partial derivatives only, once Layer 3, Eq. 3 is used.
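The nilpotency argument can be checked on a small example. The following sketch assumes one phenotype per age and hypothetical direct effects `p`, so that the matrix `N` plays the role of *δ***x**^{⊺}/*δ***x** − **I** with non-zero entries only on its first superdiagonal. It sums the finite geometric series of powers of `N` and verifies that multiplying by **I** − `N` (that is, 2**I** − *δ***x**^{⊺}/*δ***x**) returns the identity. All names and values are illustrative assumptions.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

N_A = 4                 # number of ages (hypothetical)
p = [1.2, 0.7, -0.5]    # hypothetical direct effects of age a on age a + 1

# N stands in for (delta x^T / delta x - I): strictly upper triangular
N = [[p[i] if j == i + 1 else 0.0 for j in range(N_A)] for i in range(N_A)]

# Finite geometric series: sum of N^i for i = 0, ..., N_A - 1
series = identity(N_A)
power = identity(N_A)
for _ in range(1, N_A):
    power = matmul(power, N)
    series = add(series, power)

# N^{N_A} vanishes because N is nilpotent with index at most N_A
N_to_Na = matmul(power, N)

# (I - N) equals 2I - delta x^T / delta x in this toy setting
I_minus_N = [[(1.0 if i == j else 0.0) - N[i][j] for j in range(N_A)]
             for i in range(N_A)]
check = matmul(I_minus_N, series)  # should reproduce the identity matrix
```

Each power of `N` fills a different superdiagonal, so the summed series is upper triangular with products of consecutive direct effects as entries, as in the expanded recurrence.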

### Appendix B.3. Conclusion

#### Appendix B.3.1. Form 1

Using Eq. (B7) and Layer 3, Eq. 1 for **ζ** = **x**, we obtain the total selection gradient of the phenotype in terms of partial derivatives of fitness, partial derivatives of the environment, and d**x**^{⊺}/d**x**. Thus, using Layer 4, Eq. 10 yields the first line of Layer 4, Eq. 21.

#### Appendix B.3.2. Form 2

Using Eq. (B7), the total selection gradient of the phenotype is given by the second line of Layer 4, Eq. 21.
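To make the structure of this form concrete, the sketch below evaluates the product of the matrix of total effects of the phenotype on itself and a direct selection gradient, and compares it with a finite-difference estimate of the total derivative of fitness. It assumes a hypothetical scalar developmental map, no environment (so the total immediate selection gradient reduces to the direct one), and a hypothetical fitness function that sums age-specific components. None of these specific functions or values come from the framework.

```python
N_A = 4                       # number of ages (hypothetical)
C = [0.5, 1.0, -0.3, 2.0]     # hypothetical age-specific fitness coefficients

def g(x):
    # Hypothetical developmental map: x_{j+1} = g(x_j)
    return x + 0.1 * x * x

def dg(x):
    return 1.0 + 0.2 * x

def develop(x_start, start_age, x_prev):
    # Restart development at start_age, keeping earlier ages fixed
    x = list(x_prev[:start_age]) + [x_start]
    for _ in range(start_age + 1, N_A):
        x.append(g(x[-1]))
    return x

def fitness(x):
    # Hypothetical fitness: sum of age-specific components w_j(x_j)
    return sum(c * xj - 0.1 * xj * xj for c, xj in zip(C, x))

def direct_gradient(x):
    # Partial derivative of fitness with respect to each x_j, holding other ages fixed
    return [c - 0.2 * xj for c, xj in zip(C, x)]

x = develop(1.0, 0, [])

# Total effects of the phenotype on itself, entry (a, j) = d x_j / d x_a
D = [[0.0] * N_A for _ in range(N_A)]
for a in range(N_A):
    D[a][a] = 1.0
    for j in range(a + 1, N_A):
        D[a][j] = D[a][j - 1] * dg(x[j - 1])

dw = direct_gradient(x)
# Total selection gradient: matrix of total effects times the direct gradient
total_gradient = [sum(D[a][j] * dw[j] for j in range(N_A)) for a in range(N_A)]

def fd_total(a, h=1e-6):
    # Perturb x_a as an initial condition and let later ages develop
    up = fitness(develop(x[a] + h, a, x))
    dn = fitness(develop(x[a] - h, a, x))
    return (up - dn) / (2 * h)

fd_gradient = [fd_total(a) for a in range(N_A)]
```

At the last age the total and direct gradients coincide in this sketch, since a perturbation there has no developmentally downstream effects.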

#### Appendix B.3.3. Form 3

Using Eq. (B7), Layer 3, Eq. 1 for **ζ** = **z**, and Layer 4, Eq. 7, we have that the total selection gradient of the phenotype is given by the third line of Layer 4, Eq. 21, where the *total immediate selection gradient of the geno-phenotype* is

#### Appendix B.3.4. Form 4

Finally, using the first line of Layer 4, Eq. 21 and Layer 4, Eq. 14, we obtain the fourth line of Layer 4, Eq. 21.

## Appendix C. Total selection gradient of the genotype

### Appendix C.1. Total selection gradient of the genotype in terms of direct fitness effects

Here we derive the total selection gradient of the genotype following a procedure analogous to the one used in Appendix B for the total selection gradient of the phenotype. The *i*-th genotypic trait value at age *a* is *y*_{ia}, so letting **ζ** in Eq. (S19) be *y*_{ia} gives Eq. (C1). The total derivatives of a mutant’s relative fitness at age *j* in Eq. (C1) are with respect to the individual’s genotypic trait at possibly another age *a*. We now seek to express such selection gradient entry in terms of partial derivatives only.

From Eq. (S17), we have with **z** _{j} = (**x** _{j}; **y** _{j}), so applying the chain rule, we obtain
Applying matrix calculus notation (Appendix A), this is
Applying matrix calculus notation again yields
Factorizing, we have
We now write Eq. (C2) in terms of partial derivatives of lifetime fitness. Consider the *direct selection gradient of the genotype at age j*
and the matrix of *direct effects of a mutant’s genotype at age j on her environment at age j*
Using Eqs. (B3) and (B5) in Eq. (C2) yields
where we use the *total immediate selection gradient of the genotype at age j* or, equivalently, the *total immediate effects of a mutant’s genotype at age j on fitness*
Consider now the *total immediate selection gradient of the genotype* for all ages
Using Layer 2, Eq. 2d, we have that
is a block column vector whose *j*-th entry is the rightmost term in Eq. (C4). Thus, from (C4), Layer 2, Eq. 1, and (C5), it follows that the total immediate selection gradient of the genotype satisfies Layer 3, Eq. 1.

Now, we write the total selection gradient of *y*_{ia} in terms of the total immediate selection gradient of the genotype. Substituting Eq. (C3) in Eq. (C1) yields
where we use the block row vectors
Therefore, the total selection gradient of the genotype for all genotypic traits across all ages is
where we use the block matrix of *total effects of a mutant’s genotype on her phenotype*
and the block matrix of *total effects of a mutant’s genotype on her genotype*
Expression (C6) is now in terms of partial derivatives of fitness, partial derivatives of the environment, total effects of a mutant’s genotype on her phenotype, d**x**^{⊺}/d**y**, and total effects of a mutant’s genotype on her genotype, d**y**^{⊺}/d**y**, once Layer 3, Eq. 1 is used. We now proceed to write d**x**^{⊺}/d**y** and d**y**^{⊺}/d**y** in terms of partial derivatives only.

### Appendix C.2. Matrix of total effects of a mutant’s genotype on her phenotype and her genotype

From the developmental constraint (1) for the *k*-th phenotype at age *j* ∈ {2, …, *N*_{a}} we have that , so using the chain rule we obtain
Applying matrix calculus notation (Appendix A), this is
Applying matrix calculus notation again yields
Factorizing, we have
Rewriting *g*_{k, j−1} as *x*_{k j} yields
Hence,
where we use the matrix of *direct effects of a mutant’s genotypic trait values at age j on her phenotype at age j* + 1
We can write Eq. (C7) more succinctly as
where we use the matrix of *total immediate effects of a mutant’s genotypic trait values at age j on her phenotype at age j* + 1
We also define the corresponding matrix across all ages. Specifically, the block matrix of *total immediate effects of a mutant’s genotype on her phenotype* is
The equality (C10) follows because the total immediate effects of a mutant’s genotypic trait values on her phenotype are only non-zero at the next age (from the developmental constraint in Eq. 1). Using Layer 2, Eq. 2d and Layer 2, Eq. 2c, we have that
which equals the rightmost term in Eq. (C9) for *j* = *a* + 1. Thus, from Eqs. (C9)–(C11), it follows that the block matrix of total immediate effects of a mutant’s genotype on her phenotype satisfies Layer 3, Eq. 3.

Eq. (C8) gives the matrix of total effects of a mutant’s *i*-th genotypic trait value at age *a* on her phenotype at age *j*. Then, it follows that the matrix of total effects of a mutant’s genotypic traits for all genotypic traits at age *a* on her phenotype at age *j* is
Eq. (C12) is a recurrence equation for d**x**_{j}^{⊺}/d**y**_{a} over age *j* ∈ {2, …, *N*_{a}}. Since a given entry of the operator d/d**y** takes the total derivative with respect to a given *y*_{ia} while keeping all the other genotypic traits constant and genotypic traits are developmentally independent, a perturbation of an individual’s genotypic trait value at a given age does not affect any other of the individual’s genotypic trait values at the same or other ages (i.e., d**y**_{a}^{⊺}/d**y**_{a} = **I** and d**y**_{j}^{⊺}/d**y**_{a} = **0** for *j* ≠ *a*). Thus, the matrix of total effects of a mutant’s genotype on her genotype is
Moreover, because of the arrow of developmental time (due to the developmental constraint in Eq. 1), perturbations in an individual’s late genotypic trait values do not affect the individual’s early phenotype (i.e., d**x**_{j}^{⊺}/d**y**_{a} = **0** for *j* < *a* and *j* ∈ {1, …, *N*_{a} − 1})^{2}. Additionally, from the arrow of developmental time (Eq. 1), a perturbation in an individual’s genotypic trait values at a given age does not affect any of the individual’s phenotypes at the *same* age (i.e., d**x**_{j}^{⊺}/d**y**_{a} = **0** for *j* = *a*). Consequently, Eq. (C12) for *j* ∈ {1, …, *N*_{a}} reduces to
That is,
Expanding this recurrence yields
Evaluating Eq. (C14) at *j* = *a* + 1 yields
which substituted back in the top line of Eq. (C14) yields
Hence, the block matrix of *total effects of a mutant’s genotype on her phenotype* is
whose *a j*-th block entry is given by
where we use Layer 4, Eq. 2 and adopt the empty-product convention that
Eqs. (C16) and (C17) write the matrix of total effects of a mutant’s genotype on her phenotype in terms of partial derivatives, given Eq. (C9), as we sought.

We now obtain a more compact expression for the matrix of total effects of a mutant’s genotype on her phenotype in terms of partial derivatives. To do this, we note a relationship between the matrix of total effects of a mutant’s genotype on her phenotype with the matrix of total effects of a mutant’s phenotype on her phenotype. Note that the *a j*-th block entry of (*δ***x**^{⊺}/*δ***y**)(d**x**^{⊺}/d**x**) is
where we use Eq. (C10) in the second equality and Eq. (C17) in the third equality, noting that and for *j* ≤ *a*. Hence, Layer 4, Eq. 3 follows, which is a compact expression for the matrix of total effects of a mutant’s genotype on her phenotype in terms of partial derivatives only, once Layer 4, Eq. 1 and Layer 3, Eq. 3 are used.
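The relationship between the two matrices of total effects can be checked on a toy model. The sketch below assumes one phenotype and one genotypic trait per age and a hypothetical developmental map *x*_{j+1} = *x*_{j} + *y*_{j}*x*_{j}. It forms the product of the matrix of total immediate effects of the genotype on the phenotype and the matrix of total effects of the phenotype on itself, then compares the result with finite-difference perturbations of each genotypic trait. The map, the values of `y`, and the helper names are assumptions made only for illustration.

```python
N_A = 4                        # number of ages (hypothetical)
y = [0.3, -0.2, 0.5, 0.1]      # hypothetical genotypic trait values per age
X1 = 1.0                       # initial phenotype, held constant

def develop(y_vals):
    # Hypothetical developmental map: x_{j+1} = x_j + y_j * x_j
    x = [X1]
    for j in range(N_A - 1):
        x.append(x[j] + y_vals[j] * x[j])
    return x

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = develop(y)

# Total effects of phenotype on phenotype, entry (a, j) = d x_j / d x_a
Dxx = [[0.0] * N_A for _ in range(N_A)]
for a in range(N_A):
    Dxx[a][a] = 1.0
    for j in range(a + 1, N_A):
        Dxx[a][j] = Dxx[a][j - 1] * (1.0 + y[j - 1])

# Total immediate effects of genotype on phenotype: only the next age is affected,
# with entry (a, a + 1) equal to the partial derivative of x_{a+1} with respect to y_a
dxy_imm = [[x[a] if j == a + 1 else 0.0 for j in range(N_A)] for a in range(N_A)]

# Total effects of genotype on phenotype as the product of the two matrices
Dxy = matmul(dxy_imm, Dxx)

def fd_entry(a, j, h=1e-6):
    # Perturb y_a only, keeping all other genotypic traits and X1 fixed
    up = list(y)
    dn = list(y)
    up[a] += h
    dn[a] -= h
    return (develop(up)[j] - develop(dn)[j]) / (2 * h)

max_error = max(abs(Dxy[a][j] - fd_entry(a, j))
                for a in range(N_A) for j in range(N_A))
```

In this sketch the diagonal and lower triangle of `Dxy` are zero, which matches the statements that genotypic traits affect neither the phenotype at the same age nor the phenotype at earlier ages.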

### Appendix C.3. Conclusion

#### Appendix C.3.1. Form 1

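Using Eqs. (C6), (C13), and Layer 3, Eq. 1 for **ζ** ∈ {**x**, **y**}, we obtain the total selection gradient of the genotype in terms of partial derivatives of fitness, partial derivatives of the environment, and d**x**^{⊺}/d**y**. Thus, using Layer 4, Eq. 11 yields the first line of Layer 4, Eq. 22.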

#### Appendix C.3.2. Form 2

Using Eqs. (C6) and (C13), the total selection gradient of the genotype is given by the second line of Layer 4, Eq. 22.

#### Appendix C.3.3. Form 3

Using Eqs. (C6), (B20), and Layer 4, Eq. 8, we have that the total selection gradient of the genotype is given by the third line of Layer 4, Eq. 22.

#### Appendix C.3.4. Form 4

Using the first line of Layer 4, Eq. 22 and Layer 4, Eq. 15, we obtain the fourth line of Layer 4, Eq. 22.

#### Appendix C.3.5. Form 5

Finally, we can rearrange total genotypic selection (Layer 4, Eq. 22) in terms of total selection on the phenotype. Using Layer 4, Eq. 3 in the second line of Layer 4, Eq. 22, and then using the second line of Layer 4, Eq. 21, we have that the total selection gradient of the genotype is given by the fifth line of Layer 4, Eq. 22.

## Appendix D. Total selection gradient of the environment

Here we proceed analogously to derive the total selection gradient of the environment, which allows us to write an equation describing the evolutionary dynamics of the geno-envo-phenotype.

### Appendix D.1. Total selection gradient of the environment in terms of direct fitness effects

As before, we start by considering the total selection gradient entry for the *i*-th environmental trait at age *a*. By this, we mean the total selection gradient of a perturbation of *ϵ*_{ia} taken as initial condition of the developmental constraint (1) when applied at the ages {*a*, …, *n*}. Consequently, an environmental perturbation at a given age does not affect the phenotype at earlier ages due to the arrow of developmental time. Letting **ζ** in Eq. (S19) be *ϵ*_{ia} gives Eq. (D1). The total derivatives of a mutant’s relative fitness at age *j* in Eq. (D1) are with respect to the individual’s environmental traits at possibly another age *a*. We now seek to express such selection gradient in terms of partial derivatives only.

From Eq. (S17), we have with **z** _{j} = (**x** _{j}; **y** _{j}), so applying the chain rule and, since we assume that genotypic traits are developmentally independent (hence, genotypic trait values do not depend on the environment, so d**y** _{j}/d*ϵ*_{ia} = **0** for all *i* ∈ {1, …, *N*_{p}} and all *a, j* ∈ {1, …, *N*_{a}}), we obtain
In the last equality we applied matrix calculus notation (Appendix A). Using Eq. (B3) we have
Substituting Eq. (D2) in (D1) yields
Therefore, the total selection gradient of all environmental traits across all ages is
where we use the block matrix of *total effects of a mutant’s environment on her phenotype*
and the block matrix of *total effects of a mutant’s environment on her environment*
Expression (D3) is now in terms of partial derivatives of fitness, total effects of a mutant’s environment on her phenotype, d**x**^{⊺}/d**ϵ**, and total effects of a mutant’s environment on her environment, d**ϵ**^{⊺}/d**ϵ**. We now proceed to write d**x**^{⊺}/d**ϵ** and d**ϵ**^{⊺}/d**ϵ** in terms of partial derivatives only.

### Appendix D.2. Matrix of total effects of a mutant’s environment on her environment

From the environmental constraint (2) for the *k*-th environmental trait at age *j* ∈ {1, …, *N*_{a}} we have that , so using the chain rule since genotypic traits are developmentally independent yields
In the last equality we used matrix calculus notation and rewrote *h*_{k j} as *ϵ*_{k j}. Since we assume that environmental traits are mutually independent, we have that ∂*ϵ*_{ka}/∂*ϵ*_{ia} = 1 if *i* = *k* or ∂*ϵ*_{ka}/∂*ϵ*_{ia} = 0 otherwise; however, we leave the partial derivatives ∂*ϵ*_{ka}/∂*ϵ*_{ia} unevaluated as it is conceptually useful. Hence,
Then, the matrix of total effects of a mutant’s environment at age *a* on her environment at age *j* is
Hence, the block matrix of *total effects of a mutant’s environment on her environment* is
Note that, for *j* > *a*, the *a j*-th block entry of (d**x**^{⊺}/d**ϵ**)(∂**ϵ**^{⊺}/∂**x**) can be evaluated using Layer 2, Eq. 2d. Note also that since environmental traits are mutually independent, ∂**ϵ**_{j}^{⊺}/∂**ϵ**_{a} = **0** for *j* ≠ *a* from the environmental constraint (2). Finally, note that because of the arrow of developmental time, d**ϵ**_{j}^{⊺}/d**ϵ**_{a} = **0** for *j* < *a* due to the developmental constraint (1). Hence, Layer 4, Eq. 13 follows, which is a compact expression for the matrix of total effects of a mutant’s environment on itself in terms of partial derivatives and the total effects of a mutant’s environment on her phenotype, which we now write in terms of partial derivatives only.

### Appendix D.3. Matrix of total effects of a mutant’s environment on her phenotype

From the developmental constraint (1) for the *k*-th phenotype at age *j* ∈ {2, …, *N*_{a}} we have that , so using the chain rule and since genotypic traits are developmentally independent yields
In the last equality we used matrix calculus notation and rewrote *g*_{k, j−1} as *x*_{k j}. Hence,
Then, the matrix of total effects of a mutant’s environment at age *a* on her phenotype at age *j* is
Using Eq. (D4) yields
Using Eq. (B10), this reduces to
Expanding this recurrence yields
which using Layer 4, Eq. 2 yields
It will be useful to denote the matrix of *total immediate effects of a mutant’s environment at age j on her phenotype at age j* for *j* > 0 as
The matrix of *direct effects of a mutant’s environment on itself* is given by Layer 2, Eq. 3. In turn, the block matrix of *total immediate effects of a mutant’s environment on her phenotype* is
so Layer 3, Eq. 4 follows from Eqs. (D7), Layer 2, Eq. 3, and Layer 2, Eq. 2c.

Using Eq. (D7), Eq. (D6) becomes
Evaluating the *a j*-th entry of (*δ***x**^{⊺}/*δ***ϵ**)(d**x**^{⊺}/d**x**) using Eq. (D8) shows that Layer 4, Eq. 4 follows. Layer 4, Eq. 4, Eq. (D8), and Layer 4, Eq. 1 thus give a compact expression for the matrix of total effects of a mutant’s environment on her phenotype in terms of partial derivatives only.

### Appendix D.4. Conclusion

#### Appendix D.4.1. Form 1

Eq. (D3) gives the total selection gradient of the environment as in the first line of Layer 4, Eq. 23.

#### Appendix D.4.2. Form 2

Using Eq. (D3) and Layer 4, Eq. 13 yields
Collecting for d**x**^{⊺}/d**ϵ** and using Layer 3, Eq. 1 for **ζ** = **x** as well as Layer 3, Eq. 2, we have that the total selection gradient of the environment is given by the second line of Layer 4, Eq. 23.

#### Appendix D.4.3. Form 3

Using the first line of Layer 4, Eq. 23 and Layer 4, Eq. 16, we obtain the third line of Layer 4, Eq. 23.

#### Appendix D.4.4. Form 4

Finally, we can rearrange total selection on the environment in terms of total selection on the phenotype. Using Layer 4, Eq. 4 in the second line of Layer 4, Eq. 23, and then using the second line of Layer 4, Eq. 21, we have that the total selection gradient of the environment is given by the fourth line of Layer 4, Eq. 23.

## Appendix E. Total selection gradient of the geno-phenotype

We have that the mutant geno-phenotype is **z** = (**x**; **y**). We first define the direct, total immediate, and total selection gradients of the geno-phenotype and write the total selection gradient of the geno-phenotype in terms of the total immediate selection gradient of the geno-phenotype and of the partial selection gradient of the geno-envo-phenotype.

### Appendix E.1. Total selection gradient of the geno-phenotype in terms of direct fitness effects

We have the *selection gradient of the geno-phenotype*
the *total immediate selection gradient of the geno-phenotype*
and the *total selection gradient of the geno-phenotype*
Now, we write the total immediate selection gradient of the geno-phenotype as a linear combination of the selection gradients of the geno-phenotype and environment. Using Layer 3, Eq. 1 for **ζ** ∈ {**x**, **y**}, we obtain the total immediate selection gradient of the geno-phenotype in Eq. (E1). Using Layer 2, Eq. 7, Eq. (E1) becomes Layer 3, Eq. 1 for **ζ** = **z**.

#### Appendix E.1.1. Form 2

Now we bring together the total selection gradients of the phenotype and genotype to write the total selection gradient of the geno-phenotype as a linear transformation of the total immediate selection gradient of the geno-phenotype.

Using the third lines of Layer 4, Eq. 21 and Layer 4, Eq. 22, we obtain the second line of Layer 4, Eq. 24.

#### Appendix E.1.2. Form 3

Now we use the expressions of the total selection gradients of the phenotype and genotype as linear transformations of the geno-envo-phenotype to write the total selection gradient of the geno-phenotype. Using the fourth lines of Layer 4, Eq. 21 and Layer 4, Eq. 22, we obtain the third line of Layer 4, Eq. 24.

#### Appendix E.1.3. Form 1

Now, we obtain the total selection gradient of the geno-phenotype as a linear combination of selection gradients of the geno-phenotype and environment. Using Layer 3, Eq. 1 for **ζ** = **z**, the second line of Layer 4, Eq. 24 becomes Eq. (E2). We define the block matrix of total effects of a mutant’s geno-phenotype on her environment, which using Layer 4, Eq. 10 and Layer 4, Eq. 11 yields Layer 4, Eq. 12 after factorizing and using Layer 4, Eq. 9. Using this in Eq. (E2), the first line of Layer 4, Eq. 24 follows.

### Appendix E.2. Matrix of total effects of a mutant’s geno-phenotype on her geno-phenotype

Here we obtain a compact expression for . Before doing so, let us obtain the block matrix of *total immediate effects of a mutant’s geno-phenotype on her geno-phenotype*
where the equality follows from the assumption that genotypic traits are developmentally independent. Using Layer 2, Eq. 6, Layer 2, Eq. 7, and Layer 2, Eq. 9 we have that
which equals the right-hand side of Eq. (E3) so Layer 3, Eq. 5 holds.

Now, motivated by Layer 4, Eq. 1 and the equation for total effects in path analysis (Greene, 1977), suppose that
for some matrix **E**_{z} to be determined. Then,
Using Layer 4, Eq. 9 and a formula for the inverse of a 2 × 2 block matrix (Horn and Johnson, 2013, Eq. 0.7.3.1), we have
Using Layer 4, Eq. 3 yields
Simplifying and using Layer 4, Eq. 1 yields
Substituting in Eq. (E4) and simplifying yields
Hence,
and so Layer 4, Eq. 9 holds.
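The block structure behind this result can be verified on a small example. The sketch below assumes one phenotype and one genotypic trait per age, hypothetical direct effects `p` and `q`, and the block layout in which rows are indexed by the variable differentiated with respect to, so that the total immediate effects of the geno-phenotype on itself have blocks (*δ***x**^{⊺}/*δ***x**, **0**; *δ***x**^{⊺}/*δ***y**, **I**) and the total effects have blocks (d**x**^{⊺}/d**x**, **0**; d**x**^{⊺}/d**y**, **I**). This layout is an assumption inferred from the surrounding derivation. The code checks that 2**I** minus the first matrix, multiplied by the second, gives the identity.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0.0] * n for _ in range(n)]

def block(A, B, C, D):
    # Assemble the 2 x 2 block matrix [[A, B], [C, D]]
    return [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

N_A = 3
p = [0.8, -1.5]   # hypothetical direct effects of x_a on x_{a+1}
q = [0.4, 2.0]    # hypothetical direct effects of y_a on x_{a+1}

I = identity(N_A)
Z = zeros(N_A)

# Total immediate effects (identity diagonal plus first superdiagonal for x on x)
dxx_imm = [[1.0 if i == j else (p[i] if j == i + 1 else 0.0)
            for j in range(N_A)] for i in range(N_A)]
dxy_imm = [[q[i] if j == i + 1 else 0.0 for j in range(N_A)] for i in range(N_A)]

# Total effects of phenotype on phenotype via the expanded recurrence
Dxx = zeros(N_A)
for a in range(N_A):
    Dxx[a][a] = 1.0
    for j in range(a + 1, N_A):
        Dxx[a][j] = Dxx[a][j - 1] * p[j - 1]

# Total effects of genotype on phenotype as a product of matrices
Dxy = matmul(dxy_imm, Dxx)

dzz_imm = block(dxx_imm, Z, dxy_imm, I)   # total immediate effects of z on z
Dzz = block(Dxx, Z, Dxy, I)               # total effects of z on z

n = 2 * N_A
two_I_minus = [[(2.0 if i == j else 0.0) - dzz_imm[i][j] for j in range(n)]
               for i in range(n)]
check = matmul(two_I_minus, Dzz)          # should reproduce the identity matrix
```

The lower-left block of the product cancels exactly because the total effects of the genotype on the phenotype equal the total immediate effects propagated through the total effects of the phenotype on itself.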

## Appendix F. Total selection gradient of the geno-envo-phenotype

We have that the mutant geno-envo-phenotype is **m** = (**x**; **y**; **ϵ**). We now define the direct, total immediate, and total selection gradients of the geno-envo-phenotype and write the total selection gradient of the geno-envo-phenotype in terms of the partial selection gradient of the geno-envo-phenotype.

We have the *selection gradient of the geno-envo-phenotype*
the *total immediate selection gradient of the geno-envo-phenotype*
and the *total selection gradient of the geno-envo-phenotype*
Now we use the expressions of the total selection gradients of the phenotype, genotype, and environment as linear transformations of the geno-envo-phenotype to write the total selection gradient of the geno-envo-phenotype. Using the fourth lines of Layer 4, Eq. 21 and Layer 4, Eq. 22 and the third line of Layer 4, Eq. 23, we have
which is Layer 4, Eq. 25.

To see that d**m**^{⊺}/d**m** is non-singular, we factorize it as follows. We define a first block matrix, which is non-singular since it is square, block upper triangular, and ∂**ϵ**^{⊺}/∂**ϵ** = **I** (Layer 2, Eq. 3). We also define a second block matrix, which is non-singular since it is square, block lower triangular, and d**x**^{⊺}/d**x** is non-singular (Eq. B15). The product of these two matrices can be evaluated using Layer 4, Eq. 10, Layer 4, Eq. 11, and Layer 4, Eq. 13. Using Layer 4, Eq. 18, it follows that d**m**^{⊺}/d**m** equals this product. Hence, d**m**^{⊺}/d**m** is non-singular since both factors are square and non-singular.

## Appendix G. Evolutionary dynamics of the phenotype

Here we derive an equation describing the evolutionary dynamics of the resident phenotype.

From Eqs. (S10) and (S19), we have that the evolutionary dynamics of the resident genotype satisfy the canonical equation
whereas the developmental dynamics of the resident phenotype satisfy the developmental constraint
for *a* ∈ {1, …, *N*_{a} − 1}.

Let be the resident geno-phenotype at evolutionary time τ, specifically at the point where the socio-devo stable resident is at carrying capacity, marked in Fig. 3. The *i*-th mutant phenotype at age *j* + 1 at such evolutionary time τ is . Then, evolutionary change in the *i*-th resident phenotype at age *a* ∈ {2, …, *N*_{a}} is
Taking the limit as Δτ → 0, this becomes
Applying the chain rule, we obtain
Applying matrix calculus notation (Appendix A), this is
Applying matrix calculus notation again yields
Factorizing, we have
Rewriting *g*_{i,a−1} as *x*_{ia} yields
Hence, for all resident phenotypes at age *a* ∈ {2, …, *N*_{a}}, we have
Here we used the following series of definitions. The matrix of *direct effects of social partners’ phenotype at age a on a mutant’s phenotype at age j* is
and the block matrix of direct effects of social partners’ phenotype on a mutant’s phenotype is given by Layer 2, Eq. 4 with . The matrix is the *a*-th block column of .

Similarly, the matrix of *direct effects of social partners’ genotypic trait values at age a on a mutant’s phenotype at age j* is
and the block matrix of direct effects of social partners’ genotype on a mutant’s phenotype is given by Eq. (Layer 2, Eq. 4) with . The matrix is the *a*-th block column of .

In turn, the matrix of *direct effects of social partners’ phenotype at age a on a mutant’s environment at age j* is
and the block matrix of direct effects of social partners’ phenotype on a mutant’s environment is given by Layer 2, Eq. 5 with . The matrix is the *a*-th block column of .

Similarly, the matrix of *direct effects of social partners’ genotypic trait values at age a on a mutant’s environment at age j* is
and the block matrix of *direct effects of social partners’ genotype on a mutant’s environment* is given by Layer 2, Eq. 5 with . The matrix is the *a*-th block column of .

Having made these definitions explicit, we now write Eq. (G2) as
where we used the transpose of the total immediate effects of a mutant’s phenotype and genotype on her phenotype (Eqs. B10 and C9), and the matrix of *total immediate effects of social partners’ phenotype or genotype at age a on a mutant’s phenotype at age j*
for since the initial phenotype **x**_{1} is constant by assumption. We also define the corresponding matrix of *total immediate effects of social partners’ phenotype on a mutant’s phenotype* as
for . The matrix is the *a*-th block column of . Using Layer 2, Eq. 2c and since the initial phenotype **x**_{1} is constant by assumption, we have that
for , which equals the rightmost term in Eq. (G4). Thus, from Eqs. (G4), (G5), and (G6), it follows that the block matrix of total immediate effects of social partners’ phenotype or genotype on a mutant’s phenotype satisfies Layer 3, Eq. 3.

Noting that and that evaluation of d**z**_{a}/dτ and ∂*ϵ*_{a}/∂τ at is and respectively, Eq. (G3) can be written as
which is a recursion for over *a*. Expanding this recursion two steps yields
Collecting the derivatives with respect to τ yields
Inspection shows that by expanding the recursion completely and since we assume that initial phenotype does not evolve (i.e., ), the resulting expression can be succinctly written as
where the ↶ denotes left multiplication. Note that the products over *k* are the transpose of the total effects of a mutant’s phenotype at age *j* + 1 on her phenotype at age *a* (Layer 4, Eq. 2). Hence,
Before simplifying Eq. (G7), we introduce a series of matrices that are analogous to those already provided, based on Eq. (C17). The matrix of *total effects of social partners’ phenotype or genotypic traits at age a on a mutant’s phenotype at age j* is
for . The block matrix of *total effects of social partners’ phenotype or genotype on a mutant’s phenotype* is thus
for . Then, from Eq. (G8), the block matrix in Eq. (G9) satisfies Layer 4, Eq. 5.

Using Eqs. (C17) and (D9) and given the property of transpose of a product (i.e., (**AB**)^{⊺} = **B**^{⊺}**A**^{⊺}), Eq. (G7) can be written more succinctly as
Note that from Eq. (C16), we have that for *j* ≥ *a*, from Eq. (D10), we have that for *j* ≥ *a*, and from Eq. (B15), we have that for *j* + 1 ≥ *a*. Hence, the same expression holds extending the upper bounds of the sums to the last possible age:
Changing the sum index for the rightmost sum yields
Expanding the matrix calculus notation for the entries of in the rightmost sum yields
Expanding again the matrix calculus notation for the entries of and in the two rightmost sums yields
Using the transpose of the matrix in Eq. (G8) in the two rightmost terms, noting that and for *j* = 1 (from Eq. G5), yields
Applying matrix calculus notation to each term yields
for *a* ∈ {2, …, *N*_{a}}. Since , it follows that
which contains our desired on both sides of the equation.

The matrix premultiplying on the right-hand side of Eq. (G10) is , which is square. We now make use of our assumption that the absolute value of all the eigenvalues of is strictly less than one, which guarantees that the resident geno-phenotype is socio-devo stable (Eq. S3 and following text). Given this property of , then is invertible. Hence, we can define the transpose of the matrix of *stabilized effects of a focal individual’s phenotype on a social partner’s phenotype* (second equality of Layer 5, Eq. 1). Thus, solving for in Eq. (G10), we finally obtain an equation describing the evolutionary dynamics of the phenotype
Let us momentarily write for some differentiable function to highlight the dependence of a mutant’s phenotype **x** on her genotype **y** and on the genotype of resident social partners. Consider the resident phenotype that develops in the context of the mutant genotype, denoted by . Hence,
where the second equality follows by exchanging dummy variables. Then, the transpose of the matrix of *total social effects of a mutant’s genotype on her and a partner’s phenotypes* is
Similarly, let us momentarily write for some differentiable function to highlight the dependence of a mutant’s phenotype **x** on her (developmentally earlier) phenotype **x** and on the phenotype of resident social partners. Consider the resident phenotype that develops in the context of the mutant phenotype, denoted by . Hence,
where the second equality follows by exchanging dummy variables. Then, the transpose of the matrix of *total social effects of a mutant’s phenotype on her and a partner’s phenotypes* is
Thus, from Eq. (G13) and the second equality of Layer 5, Eq. 1, the transpose of the matrix of stabilized effects of a focal individual’s phenotype on social partners’ phenotype may also be written as
where the last equality follows from the geometric series of matrices. This equation is the first and third equalities of Layer 5, Eq. 1.
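As an illustrative numerical sketch of the geometric-series step (not part of the derivation), the following Python snippet checks that the inverse of **I** − **A** agrees with the partial sums **I** + **A** + **A**^{2} + … for a small matrix whose eigenvalues all have absolute value below one. The 2 × 2 matrix `A` and all helper names are hypothetical stand-ins chosen for illustration; they are not quantities computed from the model.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    """Add two matrices of equal shape entry by entry."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inverse_2x2(M):
    """Invert a 2x2 matrix with the closed-form adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical feedback matrix (an assumption for illustration).
# Its eigenvalues are 0.5 and 0.2, both strictly inside the unit circle.
A = [[0.3, 0.2], [0.1, 0.4]]
I2 = [[1.0, 0.0], [0.0, 1.0]]

# Closed form: (I - A)^{-1}.
I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]
closed_form = inverse_2x2(I_minus_A)

# Series form: partial sum I + A + A^2 + ... + A^200.
partial_sum = [row[:] for row in I2]
power = [row[:] for row in I2]
for _ in range(200):
    power = matmul(power, A)
    partial_sum = matadd(partial_sum, power)

# Largest entrywise discrepancy between the two forms.
max_gap = max(abs(closed_form[i][j] - partial_sum[i][j])
              for i in range(2) for j in range(2))
print(closed_form)
print(max_gap)
```

Loosely, each additional power of `A` can be read as one more round of feedback through social partners; the eigenvalue condition is what makes these rounds sum to a finite total.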

Therefore, using Layer 5, Eq. 2 and Layer 5, Eq. 2b, the evolutionary dynamics of the phenotype are given by
where the second line follows by using Eq. (G1) in the limit Δτ → 0, and the third line follows from Layer 6, Eq. 13. The first line of Eq. (G15) describing evolutionary change of the phenotype in terms of evolutionary change of the genotype is a generalization of previous equations describing the evolution of a multivariate phenotype in terms of allele frequency change (e.g., the first equation on p. 49 of Engen and Sæther 2021). Eq. (G15) is Layer 7, Eq. 5 for **ζ** = **x**. Using the third line of Layer 4, Eq. 22 and Layer 6, Eq. 11 yields Layer 7, Eq. 4 for **ζ** = **x**, whereas using the fourth line of Layer 4, Eq. 22 and Layer 6, Eq. 12 yields Layer 7, Eq. 1a for **ζ** = **x**.

## Appendix H. Evolutionary dynamics of the geno-phenotype

### Appendix H.1. In terms of total genotypic selection

Here we obtain an equation describing the evolutionary dynamics of the resident geno-phenotype, that is, . In this section, we write such an equation in terms of the total genotypic selection. Since , from Eqs. (G15) and (S10a), we can write the evolutionary dynamics of the resident geno-phenotype as
Using Layer 6, Eq. 13 and Layer 5, Eq. 3, this is
Using Layer 5, Eq. 4, this reduces to
Using Layer 6, Eq. 13 yields Layer 7, Eq. 5 for **ζ** = **z**. Using the third line of Layer 4, Eq. 22 and Layer 6, Eq. 11 yields Layer 7, Eq. 4 for **ζ** = **z**, whereas using the fourth line of Layer 4, Eq. 22 and Layer 6, Eq. 12 yields Layer 7, Eq. 1a for **ζ** = **z**.

In contrast to other arrangements, the premultiplying matrix **L**_{zy} is non-singular if **H**_{y} is non-singular. Indeed, if
for some vector **r**, then from Layer 5, Eq. 4a and Layer 5, Eq. 3b we have
Doing the multiplication yields
which implies that **r** = **0**, so is non-singular. Thus, **L**_{zy} is non-singular if **H**_{y} is non-singular.
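The argument above can be illustrated numerically. The sketch below assumes, as the reasoning suggests, that the premultiplying matrix has the form of a block of effects of the genotype on the phenotype stacked on top of an identity block, multiplied by **H**_{y}. All numerical entries and function names are hypothetical. The snippet checks that the product has full column rank (so only **r** = **0** is mapped to zero) when the stand-in for **H**_{y} is invertible, and that it loses rank when that stand-in is singular.

```python
from fractions import Fraction


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def rank(M):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


# Hypothetical effects of 2 genotypic traits on 3 phenotypes (3 x 2).
effects_on_phenotype = [[2, -1], [0, 3], [1, 1]]
# Effects of the genotype on itself: the identity (2 x 2).
identity_block = [[1, 0], [0, 1]]
stacked = effects_on_phenotype + identity_block  # 5 x 2

H_invertible = [[2, 1], [1, 1]]  # determinant 1 (assumed example)
H_singular = [[1, 2], [2, 4]]    # determinant 0 (assumed example)

L_good = matmul(stacked, H_invertible)
L_bad = matmul(stacked, H_singular)

# Full column rank (= 2) means the null space contains only r = 0.
rank_good = rank(L_good)
rank_bad = rank(L_bad)
print(rank_good, rank_bad)
```

The identity block is what does the work: the lower rows of the product reproduce the stand-in for **H**_{y} itself, so no non-zero vector can be annihilated unless that matrix already annihilates it.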

### Appendix H.2. In terms of total selection on the geno-phenotype

Here we write the evolutionary dynamics of the geno-phenotype in terms of the total selection gradient of the geno-phenotype.

First, using Layer 6, Eq. 2, we define the *mechanistic additive genetic covariance matrix of the unperturbed geno-phenotype* as
By definition of , we have
From Eq. (S10c), the resident phenotype is independent of mutant genotype, so
Doing the matrix multiplication yields
The matrix is singular because the unperturbed geno-phenotype includes the genotype (i.e., has fewer rows than columns). For this reason, the matrix would still be singular even if the zero block entries in Eq. (H2) were non-zero (i.e., if ).

Now, we write an alternative factorization of **L**_{z} in terms of . Using Layer 4, Eq. 9 and Layer 5, Eq. 5, consider the matrix
Doing the matrix multiplication yields
Using Layer 5, Eq. 3b, we have
Notice that the matrix on the right-hand side is
Hence, we obtain an alternative factorization for **L**_{z} as
Thus, we can write the selection response of the geno-phenotype (in the form of Layer 7, Eq. 4) as
Using the relationship between the total and total immediate selection gradients of the geno-phenotype (second line of Layer 4, Eq. 24), this becomes
We can further simplify this equation by noticing the following. Using Layer 6, Eq. 10 and , we have that the *mechanistic additive socio-genetic cross-covariance matrix of the geno-phenotype and the unperturbed geno-phenotype* is
Expanding, we have
Using Layer 5, Eq. 3b and since the resident phenotype does not depend on mutant genotype, then
Doing the matrix multiplication yields
Notice that the last matrix equals
We can then write the evolutionary dynamics of the resident geno-phenotype in terms of the total selection gradient of the geno-phenotype as
The cross-covariance matrix is singular because has fewer rows than columns since the unperturbed geno-phenotype includes the genotype. For this reason, would still be singular even if the zero block entries in Eq. (H3) were non-zero (i.e., if ). Then, evolutionary equilibria of the geno-phenotype do not imply absence of total selection on the geno-phenotype, even if exogenous plastic response is absent.
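A minimal numerical sketch may help to see why singularity decouples equilibrium from the absence of selection. It assumes one genotypic trait and two phenotypes, so that the cross-covariance matrix is a scalar mutational variance times the outer product of two vectors of effects of the genotype on the geno-phenotype. The vectors, the variance, and the selection gradient below are all hypothetical values chosen for illustration.

```python
def outer(u, v):
    """Outer product of two vectors, returned as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Assumed scalar mutational variance of the single genotypic trait.
mutational_variance = 3

# Assumed effects of the genotype on (phenotype 1, phenotype 2, genotype).
# The last entry is 1 because the genotype affects itself one-to-one.
stabilized_effects = [3, 1, 1]
unperturbed_effects = [2, -1, 1]

# 3 x 3 stand-in for the cross-covariance matrix; its rank is at most 1.
L = [[mutational_variance * entry for entry in row]
     for row in outer(stabilized_effects, unperturbed_effects)]

# A non-zero (hypothetical) total selection gradient on the geno-phenotype
# that is orthogonal to the unperturbed effects.
selection = [1, 0, -2]
response = matvec(L, selection)

print(det3(L))
print(response)
```

In this toy case the determinant vanishes and the non-zero gradient produces no response, which mirrors the statement that an evolutionary equilibrium need not coincide with zero total selection on the geno-phenotype.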

## Appendix I. Evolutionary dynamics of the environment

### Appendix I.1. In terms of endogenous and exogenous environmental change

Here we derive an equation describing the evolutionary dynamics of the environment. Let be the resident geno-phenotype at evolutionary time τ, specifically at the point where the socio-devo stable resident is at carrying capacity, marked in Fig. 3. From the environmental constraint (2), the *i*-th environmental trait experienced by a mutant of age *a* at such evolutionary time τ is . Then, evolutionary change in the *i*-th environmental trait experienced by residents at age *a* ∈ {1, …, *N*_{a}} is
Taking the limit as Δτ → 0, this becomes
Applying the chain rule, we obtain
Applying matrix calculus notation, this is
Applying matrix calculus notation again yields
Rewriting *h*_{ia} as *ϵ*_{ia}, we obtain
Hence, for all environmental traits at age *a*, we have
Note that evaluation of d**z**_{a}/dτ and ∂*ϵ*_{a}/∂τ at is and , respectively. Using Layer 2, Eq. 2d and Layer 2, Eq. 2d yields
Then, we have
Now note that ∂*ϵ*_{a}/∂**z**^{⊺} = (∂*ϵ*_{a}/∂**x**^{⊺}, ∂*ϵ*_{a}/∂**y**^{⊺}), so
Hence, for all environmental traits over all ages, we have
where we use Layer 2, Eq. 7 and the block matrix of direct effects of social partners’ geno-phenotype on a mutant’s environment (Layer 2, Eq. 8; see also Layer 2, Eq. 5).

Let us momentarily write for some differentiable function to highlight the dependence of a mutant’s environment **ϵ** on her geno-phenotype **z** and on the geno-phenotype of resident social partners. Consider the environment a resident experiences when she is in the context of mutants, denoted by . Hence,
where the second equality follows by exchanging dummy variables. Then, the transpose of the matrix of *direct social effects of a mutant’s geno-phenotype on her and a partner’s environment* is
Similarly, the transpose of the matrix of *direct social effects of a mutant’s phenotype on her and a partner’s environment* is
and the transpose of the matrix of *direct social effects of a mutant’s genotype on her and a partner’s environment* is
Consequently, the evolutionary dynamics of the environment are given by Layer 7, Eq. 10.
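The chain-rule structure of this result can be checked on a scalar toy case. The sketch below assumes a single environmental trait given by a made-up function `h` of the focal individual's geno-phenotype, her partners' geno-phenotype, and evolutionary time, together with a made-up resident trajectory. It compares the sum of own direct effects, partner direct effects, and exogenous change against a finite-difference derivative of the resident environment. Every function and number here is hypothetical.

```python
def h(z, z_partner, tau):
    """Hypothetical environmental trait of a focal individual."""
    return z * z_partner + 2.0 * z + 0.3 * tau


def resident(tau):
    """Hypothetical resident geno-phenotype over evolutionary time."""
    return tau ** 2


def resident_environment(tau):
    """Environment of a resident, who interacts with other residents."""
    z_bar = resident(tau)
    return h(z_bar, z_bar, tau)


def central_difference(func, t, step=1e-5):
    """Numerical derivative of func at t."""
    return (func(t + step) - func(t - step)) / (2.0 * step)


tau = 1.5
z_bar = resident(tau)

# Partial derivatives of h, evaluated where mutant equals resident.
direct_own = z_bar + 2.0   # with respect to the focal geno-phenotype
direct_partner = z_bar     # with respect to the partners' geno-phenotype
exogenous = 0.3            # with respect to evolutionary time
dz_dtau = 2.0 * tau        # evolutionary change of the resident

# Direct social effects (own plus partner) times evolutionary change,
# plus exogenous environmental change.
chain_rule = (direct_own + direct_partner) * dz_dtau + exogenous
numerical = central_difference(resident_environment, tau)

print(chain_rule, numerical)
```

The sum `direct_own + direct_partner` plays the role of the direct social effects in this one-dimensional caricature: a resident's environment changes both because she changes and because her partners do.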

### Appendix I.2. In terms of total genotypic selection

Using the expression for the evolutionary dynamics of the geno-phenotype (Layer 7, Eq. 5 for **ζ** = **z**) in that for the environment (Layer 7, Eq. 10) yields
Using Layer 6, Eq. 13 for **ζ** = **z** yields
Collecting for ∂**ϵ**/∂τ and using Layer 5, Eq. 6 yields
Using Layer 6, Eq. 13 yields Layer 7, Eq. 5 for **ζ** = **ϵ**. Using the third line of Layer 4, Eq. 22 and Layer 6, Eq. 11 yields Layer 7, Eq. 4 for **ζ** = **ϵ**, whereas using the fourth line of Layer 4, Eq. 22 and Layer 6, Eq. 12 yields Layer 7, Eq. 1a for **ζ** = **ϵ**.

## Appendix J. Evolutionary dynamics of the geno-envo-phenotype

### Appendix J.1. In terms of total genotypic selection

Here we obtain an equation describing the evolutionary dynamics of the resident geno-envo-phenotype, that is, . In this section, we write such an equation in terms of total genotypic selection. Since , from Eqs. (G15) and (S10a) and Layer 7, Eq. 5 for **ζ** = **ϵ**, we can write the evolutionary dynamics of the resident geno-envo-phenotype as
Using Layer 6, Eq. 10 and Layer 5, Eq. 3, this is
Using Layer 5, Eq. 7, this reduces to
Using Layer 6, Eq. 13 yields Layer 7, Eq. 5 for **ζ** = **m**. Using the third line of Layer 4, Eq. 22 and Layer 6, Eq. 11 yields Layer 7, Eq. 4 for **ζ** = **m**, whereas using the fourth line of Layer 4, Eq. 22 and Layer 6, Eq. 12 yields Layer 7, Eq. 1a for **ζ** = **m**.

In contrast to other arrangements, the premultiplying matrix **L**_{my} is non-singular if **H**_{y} is non-singular. Indeed, if
for some vector **r**, then from Layer 5, Eq. 7a and Layer 5, Eq. 3b we have
Doing the multiplication yields
which implies that **r** = **0**, so is non-singular. Thus, **L**_{my} is non-singular if **H**_{y} is non-singular.

### Appendix J.2. In terms of total selection on the geno-envo-phenotype

Here we write the evolutionary dynamics of the geno-envo-phenotype in terms of the total selection gradient of the geno-envo-phenotype.

First, using Layer 6, Eq. 2, we define the *mechanistic additive genetic covariance matrix of the unperturbed geno-envo-phenotype* as
By definition of , we have
From Eqs. (S10c) and (S10d), the resident phenotype and environment are independent of the mutant genotype, so
Doing the matrix multiplication yields
The matrix is singular because the unperturbed geno-envo-phenotype includes the genotype (i.e., has fewer rows than columns). For this reason, the matrix would still be singular even if the zero block entries in Eq. (J2) were non-zero (i.e., if and ).

Now, we write an alternative factorization of **L**_{m} in terms of . Using Layer 4, Eq. 18 and Layer 5, Eq. 8, we have
Doing the matrix multiplication yields
Using Layer 5, Eq. 3b, this is
Notice that the matrix on the right-hand side is
Hence, we obtain an alternative factorization for **L**_{m} as
We can now write the selection response of the geno-envo-phenotype (in the form of Layer 7, Eq. 1a) as
Using the relationship between the total and partial selection gradients of the geno-envo-phenotype (Layer 4, Eq. 25), this becomes
We can further simplify this equation by noticing the following. Using Layer 6, Eq. 10 and , we have that the *mechanistic additive socio-genetic cross-covariance matrix of the geno-envo-phenotype and the unperturbed geno-envo-phenotype* is
Expanding, we have
Using Layer 5, Eq. 3b and since the resident phenotype and environment do not depend on the mutant genotype, then
Doing the matrix multiplication yields
Notice that the last matrix equals
Thus,
We can then write the evolutionary dynamics of the resident geno-envo-phenotype in terms of the total selection gradient of the geno-envo-phenotype as
The cross-covariance matrix is singular because has fewer rows than columns since the unperturbed geno-envo-phenotype includes the genotype. For this reason, would still be singular even if the zero block entries in Eq. (J3) were non-zero (i.e., if and ). Then, evolutionary equilibria of the geno-envo-phenotype do not imply absence of total selection on the geno-envo-phenotype, even if exogenous plastic response is absent.

## Appendix K. Connection to dynamic optimization

Life-history models often consider genetically controlled traits (controls) that depend on an underlying variable (e.g., age) together with traits (states) constructed via dynamic (e.g., developmental) constraints over the underlying variable. When such a model is simple enough, analytical solution (i.e., identification of evolutionarily stable strategies) is possible using optimal control or dynamic programming methods (Sydsæter et al., 2008). A key tool from optimal control theory that enables finding such analytical solutions (i.e., optimal controls) is Pontryagin’s maximum principle. The maximum principle is a theorem that essentially transforms the dynamic optimization problem into a simpler problem of maximizing a function called the Hamiltonian, which depends on control, state, and costate (or adjoint) variables. The problem is then to maximize the Hamiltonian with respect to the controls, while state and costate variables can be found from associated dynamic equations. We now show that our results imply the key elements of Pontryagin’s maximum principle for a standard life-history problem.

First, we state the optimization problem. Let **y** and **x** respectively denote the control and state variables over age, and assume that there are no environmental traits. Let survivorship be a state variable, denoted by *x*_{ℓa} = *ℓ*_{a}, so it satisfies the developmental constraint with initial condition . Thus, using Eq. 8, we can write the expected lifetime number of offspring of a mutant with pair **z** = (**x**; **y**) in the context of a resident with pair as
Consider the optimization problem of finding an optimal pair **z*** = (**x***; **y***) such that
subject to the dynamic constraint
for *a* ∈ {1, …, *N*_{a}}, with given and free. Hence, **z*** is a best response to itself under the best response function *R*_{0}, where **y*** is an optimal control and **x*** is its associated optimal state. The optimization problem in (K1) is a standard life-history problem generalized to include social interactions. From Layer 7, Eq. 5 for **ζ** = **z** and Eq. (S22b), it follows that since there is no exogenous environmental change, an admissible locally stable evolutionary equilibrium **z*** locally solves the problem (K1).

Second, we define the costate variables and show that they are proportional to the total selection gradient of states evaluated at an admissible locally stable evolutionary equilibrium. The costate variable of the *i*-th state variable at age *a* for problem (K1) is defined as
(section 9.6 of Sydsæter et al. 2008). Hence, from Eq. (S22b), we have that the costate for the *i*-th state variable at age *a* is
That is, costate variables are proportional to the total selection gradient of state variables at an admissible locally stable evolutionary equilibrium **z***. The total selection gradient of states thus generalizes the costate notion to the situation where controls and states are outside of evolutionary equilibrium for the life-history problem of *R*_{0} maximization. We have obtained various equations (Layer 4, Eq. 21) that enable direct calculation of such generalized costates in age-structured models with *R*_{0} maximization. Moreover, we have obtained an equation that relates such generalized costates to the evolutionary dynamics (fifth line of Layer 4, Eq. 22). Since we are assuming that there are no environmental traits, total immediate effect matrices reduce to direct effect matrices. Thus, the fifth line of Layer 4, Eq. 22 shows that such generalized costates affect the evolutionary dynamics indirectly by being transformed by the direct effects of controls on states, ∂**x**^{⊺}/∂**y**.

Third, we show that total maximization of *R*_{0} is equivalent to direct maximization of the Hamiltonian, which is the central feature of Pontryagin’s maximum principle. We have that the total selection gradient of controls can be written in terms of the total selection gradients of states (fifth line of Layer 4, Eq. 22), so for the controls at age *a* we have
where we substituted total immediate derivatives for partial derivatives because we are assuming that there are no environmental traits. Using Eqs. (S22) yields
From Eqs. (C10) and (K1a) given that the partial derivative ignores the dynamic constraint (K1c), it follows that
Using Eqs. (K2) and (K1c) and evaluating at optimal controls yields
This suggests defining
which recovers the Hamiltonian of Pontryagin’s maximum principle in discrete time (section 12.5 of Sydsæter et al. 2008) for the objective function (K1a). Then, the total derivative of the objective function with respect to the controls at a given age equals the partial derivative of the Hamiltonian when both derivatives are evaluated at optimal controls:
This is the essence of Pontryagin’s maximum principle: the signs of the left-hand side derivatives are the same as the signs of the derivatives on the right-hand side, which are simpler to compute (although one must then compute costate variables).
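The equality between the total derivative of the objective and the partial derivative of the Hamiltonian can be checked numerically on a toy problem. The sketch below is a generic discrete-time example and not the *R*_{0} objective of problem (K1): it assumes one scalar state and one scalar control per age, a made-up developmental map `g`, and a made-up per-age payoff `f`. Costates are computed by a backward recursion as the total effect of each state on the objective, and the index convention (the costate of the state at the next age multiplies `g`) is an assumption of this sketch.

```python
N_AGES = 5
X_INITIAL = 0.1  # assumed fixed initial state


def g(x, y):
    """Hypothetical developmental map giving the state at the next age."""
    return x + y * (1.0 - x)


def f(x, y):
    """Hypothetical per-age payoff."""
    return (1.0 - y) * x


# Partial derivatives of g and f, written out by hand.
def g_x(x, y): return 1.0 - y
def g_y(x, y): return 1.0 - x
def f_x(x, y): return 1.0 - y
def f_y(x, y): return -x


def develop(controls):
    """Construct the states over ages from the controls."""
    states = [X_INITIAL]
    for a in range(N_AGES - 1):
        states.append(g(states[a], controls[a]))
    return states


def objective(controls):
    """Sum of per-age payoffs along the developed trajectory."""
    states = develop(controls)
    return sum(f(x, y) for x, y in zip(states, controls))


def costates(states, controls):
    """Backward recursion; lam[a] is the total effect of state a on the objective."""
    lam = [0.0] * (N_AGES + 1)  # no state exists beyond the last age
    for a in range(N_AGES - 1, -1, -1):
        lam[a] = (f_x(states[a], controls[a])
                  + lam[a + 1] * g_x(states[a], controls[a]))
    return lam


def hamiltonian_gradient(controls):
    """Partial derivative of H_a = f + lam[a + 1] * g with respect to control a."""
    states = develop(controls)
    lam = costates(states, controls)
    return [f_y(states[a], controls[a]) + lam[a + 1] * g_y(states[a], controls[a])
            for a in range(N_AGES)]


def finite_difference_gradient(controls, step=1e-6):
    """Total derivative of the objective, letting later states respond."""
    grad = []
    for a in range(N_AGES):
        up, down = controls[:], controls[:]
        up[a] += step
        down[a] -= step
        grad.append((objective(up) - objective(down)) / (2.0 * step))
    return grad


controls = [0.6, 0.5, 0.4, 0.2, 0.1]  # arbitrary, not optimal
via_hamiltonian = hamiltonian_gradient(controls)
via_differences = finite_difference_gradient(controls)
max_gap = max(abs(p - q) for p, q in zip(via_hamiltonian, via_differences))
print(max_gap)
```

In this sketch the two gradients agree at arbitrary controls rather than only at optimal ones, which is in the spirit of treating the total selection gradient of states as a generalized costate away from equilibrium.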

Fourth, we show that the formulas we found for the costate variables (K2) imply the costate equations of Pontryagin’s maximum principle for discrete time. Such costate equations are dynamic equations that allow one to calculate the costate variables. Using Layer 4, Eq. 21 and Eqs. (S22), we have that
Expanding the matrix multiplication on the right-hand side, this is
where we used Eq. (B15). Using the expression of the total effect of states on themselves as a product (Layer 4, Eq. 2) yields
Doing the sum over *j* yields
Using the second line of Layer 4, Eq. 21 and Eqs. (S22) again yields
This equals the partial derivative of the Hamiltonian with respect to the states at age *a*. Indeed, using (K5) we have
Substituting this in Eq. (K6) and evaluating at optimal controls yields
This is the costate equation of Pontryagin’s maximum principle in discrete time (Eq. 4 in section 12.5 of Sydsæter et al. 2008).
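The step from a closed-form costate to a costate recursion can be illustrated on a toy problem. The sketch below assumes a generic scalar state and control per age with made-up functions `g` and `f` (it is not the *R*_{0} objective of problem (K1)). It computes each costate in two ways: as an explicit sum over later ages of products of direct effects of the state on itself, and by the backward recursion in which the costate at one age equals the partial derivative of the Hamiltonian with respect to the state at that age. All names and values are hypothetical.

```python
N_AGES = 5
X_INITIAL = 0.2  # assumed fixed initial state


def g(x, y):
    """Hypothetical developmental map (logistic-like growth)."""
    return x + y * x * (1.0 - x)


# Partial derivatives with respect to the state, written out by hand.
def g_x(x, y): return 1.0 + y * (1.0 - 2.0 * x)
def f_x(x, y): return 2.0 * (1.0 - y) * x  # for payoff f = (1 - y) * x**2


def develop(controls):
    """Construct the states over ages from the controls."""
    states = [X_INITIAL]
    for a in range(N_AGES - 1):
        states.append(g(states[a], controls[a]))
    return states


def costate_by_sum(a, states, controls):
    """Sum over later ages j of (product of g_x from a to j - 1) times f_x at j."""
    total = 0.0
    for j in range(a, N_AGES):
        path_effect = 1.0  # total effect of state a on state j
        for k in range(a, j):
            path_effect *= g_x(states[k], controls[k])
        total += path_effect * f_x(states[j], controls[j])
    return total


def costates_by_recursion(states, controls):
    """Costate equation: lam[a] = f_x + lam[a + 1] * g_x, with lam[N_AGES] = 0."""
    lam = [0.0] * (N_AGES + 1)
    for a in range(N_AGES - 1, -1, -1):
        lam[a] = (f_x(states[a], controls[a])
                  + lam[a + 1] * g_x(states[a], controls[a]))
    return lam


controls = [0.7, 0.6, 0.4, 0.3, 0.1]  # arbitrary illustrative controls
states = develop(controls)

explicit = [costate_by_sum(a, states, controls) for a in range(N_AGES)]
recursive = costates_by_recursion(states, controls)[:N_AGES]
max_gap = max(abs(p - q) for p, q in zip(explicit, recursive))
print(max_gap)
```

The recursion avoids recomputing the products for every pair of ages, which is one practical reason the costate equation is convenient even though both forms carry the same information.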