Abstract
Most phenotypes, such as gene expression profiles, developmental trajectories, behavioural sequences or other reaction norms are function-valued traits, since they vary across an individual’s age and in response to various internal and/or external factors (state variables). In turn, many individuals live in populations subject to some limited genetic mixing and are thus likely to interact with their relatives. We here formalise selection on function-valued traits when individuals interact in a group-structured population, by deriving the marginal version of Hamilton’s rule for function-valued traits. This rule simultaneously gives a condition for the invasion of an initially rare mutant function-valued trait and its ultimate fixation in the population (invasion thus implies substitution). Hamilton’s rule thus underlies the gradual evolution of function-valued traits and gives rise to necessary first-order conditions for uninvadability (evolutionary stability). Using and extending results from optimal control theory and differential game theory, we characterise the first-order condition for time-dependent traits (dynamic traits) in terms of dynamic constraints on state variables and their marginal effects on reproductive value. Our results apply to both open-loop traits, which are function of time (or age) only, and closed-loop (state-feedback) traits, which are function of both time and state. This allows us to delineate role of state-dependence of trait expression and thus to other’s traits affects selection on function-valued trait, which pertains to both life-history and social evolution.
1 Introduction
Any biological organism is an open system exchanging energy and information with its surrounding. As such, most if not all traits of an organism may vary in response to changes of its internal factors as well as to changes in its external biotic and abiotic environmental conditions. Examples include gene expression profiles, physiological processes, developmental trajectories, morphological shapes, and behavioural sequences, which we here refer collectively as function-valued traits and by which we mean a phenotype whose expression level depends on some index variable describing the domain over which the phenotype can vary (e.g., time, space, temperature, behaviour of others; and where traits that vary with the environment are often collectively called reaction norms). The shapes of function-valued traits are moulded by natural selection and therefore formalising how selection shapes these traits helps to understand their evolution (Gomulkiewicz et al., 2018).
Selection on quantitative function-valued traits has been formalised using different theoretical approaches, which consider different perspectives on the evolution of these traits. First, the evolution of life-history schedules has often been studied by applying Pontryagin’s maximum principle (e.g., León, 1976; Oster and Wilson, 1977; Schaffer, 1982; Iwasa and Roughgarden, 1984; Stearns, 1992; Perrin, 1992; Bulmer, 1994; Irie and Iwasa, 2005; Metz et al., 2016). Here, a trait evolves to vary as a function of the age or time of interaction of individuals, while individual fitness can be constrained by the dynamics of “state variables”. These are observables describing internal condition(s) of the individual and/or that of its environment, e.g., fat reserves, information, resource availability, behaviour of others, that in turn depend on trait expression. This approach formalises so-called open-loop traits (Weber, 2011; Liberzon, 2011), which are only time- or age-dependent traits. Selection on open-loop traits has been formalised to include interactions between relatives, which allows the assumption of limited dispersal or spatial or family-structured populations (Day and Taylor, 1997, 1998, 2000; Wild, 2011). Second, in behavioural ecology and evolutionary game theory, selection on function-valued traits has typically been studied by using dynamic programming (see Houston et al., 1999; Mangel et al., 1988 for textbook treatments and e.g. Leimar, 1997; Ewald et al., 2007; McNamara and Houston, 1987; Dechaume-Moncharmont et al., 2005. Here, a trait evolves to vary as a function of (some relevant) state variables and time. This formalises so-called closed-loop strategies (Weber, 2011; Liberzon, 2011) as these involve a feedback between current and future trait expression. Both Pontryagin’s maximum principle and dynamic programming are optimal control theory approaches (e.g., Bryson and Ho, 1975; Kamien and Schwartz, 2012; Weber, 2011; Liberzon, 2011), whose general aim is to identify a control variable over a period of time that maximises (in the best response sense in the presence of interactions between individuals) an objective function (fitness in biology) under the constraints of a dynamical system on state variables. Third, in quantitative genetics theory, a directional selection coefficient on function-valued has been derived (Gomulkiewicz and Beder, 1996; Beder and Gomulkiewicz, 1998) (assuming no interactions between individuals), which can be decomposed with component-wise selection coefficient describing how the mean of each component of a function-valued trait evolves over short-time scales, i.e., the time scales of demographic changes (Kirkpatrick and Heckman, 1989; Gomulkiewicz and Kirkpatrick, 1992; Gomulkiewicz and Beder, 1996). Finally, the connected approach of invasion analysis has investigated long-term adaptive dynamics of open-loop function-valued traits, which allows to focus on candidate endpoints of evolution (Dieckmann et al., 2006; Parvinen et al., 2006, 2013).
What are the relationships between these different approaches? When do closed-loop traits and open-loop traits lead to different evolutionary outcomes? How does the directional selection coefficient on function-valued traits connect to gradual evolution and Hamilton’s rule in group-structured populations? There are many open questions about the evolution of function-valued traits. By contrast, many general principles have been proven to hold for scalar traits (e.g. body size at maturity, sex allocation throughout life). In particular, for small trait deviations (weak selection), the selection coefficient on a scalar quantitative trait in a population subject to limited genetic mixing can be expressed as a marginal version of Hamilton’s rule, where the direct and indirect fitness effects (the “cost” and “benefit”) are given by partial derivatives of individual fitness (e.g., Taylor and Frank, 1996; Frank, 1998; Roze and Rousset, 2003; Rousset, 2004; Lehmann and Rousset, 2014; Van Cleve, 2015). This provides two useful results about gradual quantitative evolution. First, since the selection coefficient is independent of allele frequency at all frequencies and is of constant sign (Roze and Rousset, 2003, 2004; Rousset, 2004; Lehmann and Rousset, 2014), gradual evolution is vindicated even when the survival and reproduction of individuals depend on the behaviour of others, such like under density- and frequency-dependent selection. Second, when the selection gradient vanishes, Hamilton’s rule provides the necessary first-order condition for a strategy to be locally uninvadable; that is, it allows to determine candidate evolutionary stable strategies, which is central to the characterisation of long-term evolution (Geritz et al., 1998; Rousset, 2004) and thus of adaption.
Our goal in this paper is twofold. First, it is to formalise the selection coefficient on quantitative function-valued traits under weak selection and limited genetic mixing (taking panmictic population as a special case into account) in order to characterise the gradual evolution of function-valued traits. Our second aim is to understand the role of trait dependence to state variables for selection on function-valued traits. To achieve these aims, the rest of this paper is organised as follows. (1) We derive the selection coefficient acting on a mutant allele coding for a function-valued trait in the island model of dispersal (group-structured population), which yields the marginal version of Hamilton’s rule for function-valued traits. We deduce from Hamilton’s rule the necessary first-order condition for local uninvadability, which yields the candidate uninvadable function-valued traits and applies to both continuous and discrete traits. (2) We apply these results to time-dependent function-valued traits (dynamic traits) by deriving necessary conditions for uninvadability expressed in terms of dynamic constraints on state variables and their (marginal) effects on the reproductive value. This allows to compare how selection acts on open-loop versus closed-loop traits, specifying the role of trait responsiveness, and thus allows to establish the connection between the dynamic programming and the maximum principle type of approaches in the context of gradual phenotypic evolution. (3) We illustrate the different main concepts of our approach by analysing the evolution of temporal common pool resource production within groups. (4) Finally, we discuss the scope of our results.
2 Model
2.1 Biological scenario
Consider a haploid population subdivided into an infinite number of homogeneous groups (without division into class structure) with a fixed number N of individuals, where censusing takes place at discrete demographic time periods. All groups are subject to the same environmental conditions and are equally connected to each other by random dispersal. A discrete demographic time period spans an entire life cycle iteration where various events can occur (e.g. growth, reproduction, dispersal) to individuals. The life cycle may allow for one, several, or all individuals per group to die (thus including whole group extinction through environmental stochasticity or warfare). Generations can thus overlap but when this occurs, the parents are considered equal (in respect to their “demographic properties”) to their offspring in each generation (since there is no within group class structure). Dispersal can occur before, during, or after reproduction, and in groups, so that more than one offspring from the same natal group can establish in a non-natal group (i.e., propagule dispersal). We refer to this group-structured population where all individuals within groups are indistinguishable, as the homogeneous island population (i.e., broadly this corresponds to the infinite island model of dispersal of Wright, 1931, used since at least Eshel, 1972 under various versions to understand selection on social traits, e.g., Rousset, 2004, and where the specifics of our demographic assumptions are equivalent to those considered in Mullon et al. 2016).
We assume that two alleles segregate in the homogeneous island population at a locus of interest: a mutant allele with trait and a resident (wild-type) allele with trait . Here, is the set of feasible traits that individuals can potentially express and defined as the set of real-valued functions with range and domain , where is a space of some index variable(s) representing, for instance, time, an environmental gradient or some cue. We assume here that is a closed interval over some discrete or continuous index variable t. If t is discrete, then the element is a vector and if t is continuous then the element is a piece-wise continuous function. Hence, we write to emphasise that it consists of the (finite or infinite) collection of all point-wise values u(t) of the trait. Namely, u can be thought of as a continuous or a discrete “path” (or a schedule, or a trajectory) on the space .
The crucial assumption of this paper is that the mutant trait um can be expressed in the following form as a small deviation from the resident trait: where the phenotypic deviation function must satisfy for sufficiently small non-negative parameter ϵ. Because may have a boundary, not all phenotypic deviations generate a mutant strategy um that remains within the bounds of the feasible trait space , independent of the choice of ϵ (see also Section 2.3.2). Note that we are making a distinction between a phenotypic deviation η and the effect size ϵ of that deviation (in the literature, a (scalar) mutant effect is often modelled with the notation δ = ηϵ, e.g. Rousset, 2004). This distinction in the notation is necessary for analysing selection on function-valued traits.
2.2 Allele frequency change and short-term evolution
Our first aim is to characterise the change in mutant allele frequency in the homogeneous island population under weak selection (ϵ ≪ 1). To that end, it is useful to follow the direct fitness approach (Taylor and Frank, 1996; Rousset and Billiard, 2000; Rousset, 2004) and introduce the individual fitness function such that w(u•, u∘, u) gives the expected number of successful offspring produced over one life cycle iteration by a focal individual (possibly including self through survival) with trait u•, when its average neighbour in the focal group has trait u∘ and an average individual (from other groups) in the population has trait u, which is taken here to be the resident trait. We note that any individual in the population can be taken to be the focal individual (Rousset and Billiard, 2000; Rousset, 2004) and that the fitness of this individual can always be expressed in terms of average phenotypes of other individuals in different roles with respect to the focal (e.g., group neighbour, cousin, members of other groups, etc.), whenever mutant and resident phenotypes are closely similar (see the argument in Appendix A.2 for function-valued traits and a textbook argument for scalar traits e.g. Rousset, 2004, p. 95). These individuals in different roles as well as the focal individual itself are actors on the fitness of the focal, which will throughout be regarded as a focal recipient of the trait expressions of different categories of actors (i.e., recipient-centred approach).
In terms of this definition of individual fitness, we define the direct fitness effect of expressing the mutant allele as which is the effect that the focal individual has on its own fitness if it would switch from expressing the resident to the mutant allele for a small allelic effect. Analogously, we define the indirect fitness effect of expressing the mutant allele as which is the effect that the whole set of neighbours have on focal’s fitness if they were to all switch from expressing the resident to the mutant allele. Finally, let us denote by r(u) the neutral relatedness between two randomly sampled group neighbours (Michod and Hamilton, 1980; Frank, 1998) in the homogeneous island population that is monomorphic for the resident; namely, r(u) is the probability that in a neutral process (where all individuals are alike) the two homologous alleles of these individuals coalesce in the same common ancestor (e.g. Roze and Rousset, 2003; Rousset, 2004; Lehmann and Rousset, 2014; Van Cleve, 2015). Note that relatedness defined as such depends only on the resident trait. In Appendix A, we show that the change Δp in the frequency p of the mutant allele over one demographic time period (one life cycle iteration) can be expressed in terms of these quantities as follows.
Invasion implies substitution-principle result
In the homogeneous island population with two alleles, the change in mutant allele frequency p in the population takes the form where p(1 − p) is the genetic variance at the locus of interest, is a selection coefficient of order O(ϵ) that is independent of p, and O(ϵ2) is a remainder of all higher order terms. This entails an “invasion implies substitution” property of the mutant allele, which says that if sη(u) > 0, the mutant allele coding for a small function-valued deviation ϵη is selected for and not only invades but substitutes the (ancestral) resident allele [since effects of order O(ϵ2) can be neglected in eq. (4) whenever sη(u) is non-zero].
We have thus formalised an “invasion implies substitution”-principle (see Priklopil and Lehmann, 2020 for a review) for function-valued traits in the homogeneous island population and which takes the form of Hamilton’s rule: the mutant spreads if r(u)bη(u) − cη(u) > 0. This result is a generalisation of previous analogous results for scalar traits (Roze and Rousset, 2003, 2004; Rousset, 2004; Lehmann and Rousset, 2014). Owing to its simplicity, the function-valued trait nature of our result is perhaps yet not fully apparent, but is made explicit on noting that the direct and indirect effects (eqs. 2–3) are both formally Gâteaux derivatives, which are directional derivatives (see section A.1 in Appendix for a formal definition and e.g., Troutman, 1991, p. 45–50, Luenberger, 1997, p. 171–178) and represent change in fitness resulting from a sum of all weighted component-wise changes in trait expression (over the domain ) induced by the mutation function η. To outline the component-wise change in fitness, it is useful to write the selection coefficient as where · is an inner product on functions (the generalisation of a dot product, and will be used as such throughout), is the selection gradient function, where the component s(t, u) gives the selection gradient on component u(t) of the trait, i.e. the value of u at time t, holding other components u(t′) (for all ) of the trait fixed. Each component of the selection gradient function is then given by where are, respectively, the effect on the focal’s own fitness from changing marginally component u•(t) of its trait, while holding other trait components u•(t′) (for t′ ≠ t) fixed, while b(t, u) is the effect of all group neighbours on the focal individuals fitness when changing marginally component u∘(t) of their traits, while holding other components u∘(t′) (for t′ ≠ t) of their traits fixed. That is, the costs and benefits are partial derivatives and s(t, u) is the inclusive fitness effect on a focal individual. When t is discrete and finite, eq. (7) corresponds to the trait specific inclusive fitness effect derived previously for a backdrop monomorphic resident homogeneous island population (Mullon et al., 2016, eq. 12).
Eq. (6) shows that the selection coefficient is a weighted change of trait-specific changes. Note that for continuous index variable t over the interval , the partial derivatives −c(t, u) and b(t, u) in eq. (8) are formally functional derivatives (e.g. Troutman, 1991, p. 45–50, Luenberger, 1997, p. 171–178). In the absence of interactions between relatives s(t, u) reduces to β(y) in eq. 1 of Gomulkiewicz and Kirkpatrick (1992) for y = t, G(a) in eq. 4 of Parvinen et al. (2006) for a = t, or g(a; u) in eq. 3 in Metz et al. (2016) for a = t (but see ΔWincl(t) in eq. 25 of Day and Taylor, 2000, which incorporates interactions between relatives).
2.3 Necessary condition for local uninvadability and long-term evolution
It follows from Section 2.2 that a necessary first-order condition for a trait u* to be locally uninvadable (resistant to invasion by any mutant in a small radius ϵ ≪ 1) is that the selection coefficient is nonpositive for all admissible mutants in the resident u* population, that is
Local resistance to invasion by sets of alternative mutants allows to characterise candidate long-term evolutionary outcomes (Eshel and Feldman, 1984; Eshel, 1996; Eshel et al., 1998) and is a first-step (and often the only accessible computational step) towards characterising uninvadable traits.
A crucial question is whether a locally uninvadable strategy u* will be approached by gradual evolution from within its neighbourhood and thus be convergence stable (Eshel, 1983; Lessard, 1990; Geritz et al., 1998; Rousset, 2004; Leimar, 2009). Because characterising convergence stability involves a second order analysis of the selection coefficient, which is involved for multidimensional traits (Lessard, 1990; Leimar, 2009), it will not be investigated further in this paper. For the same reason, we will also not consider sufficient conditions for local uninvadability. In the remainder of this section, we focus on characterising in more detail the necessary condition of local univadability (eq. 9) in terms of the selection gradient function s(u), which allows removing the considerations of mutational effect η.
2.3.1 Local uninvadability for scalar-valued traits
Let us first consider the case of scalar quantitative traits, where the trait of each individual is an element belonging to a bounded subset of a real line. That is, the resident and mutant traits stay within the feasibility bounds umin ≤ u, um ≤ umax]). For this case, the index t in eq. (7) can be dropped and one obtains the standard selection gradient on a scalar-valued trait for the homogeneous island population: which is the standard selection gradient for scalar quantitative traits in the homogeneous island population (Taylor and Frank, 1996; Frank, 1998; Roze and Rousset, 2003, 2004; Rousset, 2004; Lehmann and Rousset, 2014; Van Cleve, 2015).
Note that for u = umin an admissible phenotypic deviation η must be non-negative η ≥ 0 and for u = umax it must be non-positive η ≤ 0 while for umin ≤ u ≤ umax the deviation η is unrestricted. Substituting this into the first-order condition for uninvadability eq. (9) yields that the necessary condition for uninvadability for scalar bounded traits can be expressed in the following form
Note that if the set of admissible traits is unbounded (i.e. ), then the first-order necessary condition for local uninvadability is given by the second line of eq. (11).
2.3.2 Local uninvadability for function-valued traits
Let us return to the general case where the trait of each individual is an element of , being either a vector ( discrete) or a (bounded and piece-wise continuous) function ( continuous). More precisely, for all the resident and mutant traits stay within the feasibility bounds umin(t) ≤ u(t), um(t) ≤ umax(t), such that and . Now, an admissible deviation must satisfy for all similar conditions as given for the scalar-traits in Section (2.3.1), that is, for u(t) = umin(t) an admissible phenotypic deviation η must be non-negative η(t) ≥ 0 and for u(t) = umax(t) it must be non-positive η(t) ≤ 0 while for umin(t) ≤ u(t) ≤ umax(t) the deviation η(t) is unrestricted. Substituting the admissible deviations into eq. (9) yields that a candidate uninvadable strategy satisfies for all : which is thus equivalent to eq. (11) in a point-wise way.
3 From the selection gradient to candidate optimal controls
The point-wise description of the candidate uninvadable trait u* given by eq. (12) is unlikely to be directly useful in solving for u* in concrete applications because factors characterising the organism and its environment change over time (e.g. organisms can grow and the resources in the environment can get depleted). Hence, solving u* from eq. (12) would entail simultaneously solving a large number of equations, while tracking the changes in the relevant time-dependent factors. A more useful characterisation of u* can be achieved with the use of the mathematical framework of optimal control theory, most notably dynamic programming and Pontryagin’s maximum principle, both of which have been used abundantly in evolutionary biology (e.g., León, 1976; Iwasa and Roughgarden, 1984; Mangel et al., 1988; Houston et al., 1999; Stearns, 1992; Perrin, 1992; Kozłowski, 1992; Day and Taylor, 1997, 2000; Cichon and Kozlowski, 2000; Irie and Iwasa, 2005; Parvinen et al., 2006, 2013; Priklopil et al., 2015; Metz et al., 2016; Avila et al., 2019). Yet these previous models often do not discuss and integrate the various modes of trait expression, in particular the open-loop and closed-loop strategies (see e.g. p. 225-226 in Basar and Olsder, 1999 for a discussion in the game theory literature). In this section, we connect our analysis to optimal control theory and in so doing, we develop an integrative approach to the functional dependence of trait evolution.
3.1 Key concepts
3.1.1 Concept of control and state variables
For space reasons, we focus on a continuous time formulation (but parallel developments apply to discrete time), and assume that a demographic time period is characterised by the time interval during which the trait expression is observed. This time interval can be thought of as the length of the reproductive season or the time during which behavioural interaction occur between individuals, which eventually leads to reproduction. More specifically, we now assume that the fitness of the focal individual can be written in the form where f (u•(t), x•(t)) is the rate of increase of individual fitness at time t and Φ(x•(tf)) is the contribution to individual fitness at the final time t = tf (formally and Φ: ℝ3 → ℝ+). Here, collect, respectively, the trait expression levels u•(t), u∘(t), and u(t) at time t of the focal individual, that of an average neighbour, and an average individual from the population, and the state variables x•(t), x∘(t), and x(t) of these respective individuals. State variables describe some measurable condition of an individual, e.g. size, stored energy, hunting skill, or that of its environment, e.g. amount of resource patches, environmental toxicity (note that the “•” in the subscript of u• and x• emphasises that these controls and state variables are those of all actors on a focal recipient). The defining feature of a state variable in our model is that its time dynamics depends on the evolving trait of one or more individuals in interaction and we will henceforth from now on call the elements of u•(t) the control variables, which is customary for these type of models in the evolutionary literature (e.g., Perrin, 1992; Day and Taylor, 2000).
Because models with both control and state variables become rapidly notationally complex, we assume that both the controls and the state variables are one-dimensional real numbers. The state of every individual is assumed to change according to the function , such that with initial condition (“i.c.”) x•(0) = xinit and which is the rate of change of the state of a focal individual with control u•(t) in state x•(t), when its neighbours have average control u∘(t) and average state x∘(t) in a population where the average control (in other groups) is u(t) and the average state is x(t). Similarly, we can also express the rate of change of the state of an average neighbour of the focal and an average individual in the rest of the population, respectively, as where the vectors collect the (average) controls and states of actors on the state variables of an average neighbour of the focal individual (first line), and on an average individual in the population (second line), respectively (here and throughout all vectors are defined by default as being column vectors). These actors are thus second-order level actors on the focal recipient since they affect the state variables of actors affecting the focal’s fitness. Note that the subscript of the control vectors (u∘(t) and x∘(t)) and state vectors (u(t) and x(t)) emphasise the individual (actor) from who’s perspective the second-order actors’ control and variables are collected. Accordingly, the vectors in eq. (17) contain elements which are, for an average neighbour of the focal, the control and state expressions of average neighbours viewed as actors on the focal individual. While we have so far explicitly distinguished between the states of different individuals, which is required if state represents some property of individual’s condition (e.g. body size or individual knowledge), nothing prevents the individual state to represent some environmental condition common to the group or population and which can be influenced by individual behaviour (e.g. local amount of resources in the group, in which case x•(t) = x∘(t), see concrete example in section 4). Note that while tracking the dynamics of three state variables (eqs. 15–18) may appear complicated, it is much simpler than tracking the state of all individuals in a group separately (which would require as many equations as there are individuals in the group and is the approach taken in Day and Taylor, 1997, 2000).
We now make a couple of remarks about the properties of the fitness function w(u•, u∘, u) (eq. 13) and its dynamic constraints (eqs. 15–16), which is a special case of a fitness function w(u•, u∘, u) considered in section 2. The fitness function (13) depends on the full trajectories of the control and state variables, but since the state variables are fully determined by the controls (by way of eqs. 15–16) and the initial condition xinit (which we assume here to be fixed), then fitness is determined by the controls. In particular, if fitness depends only on the state of the system at the final time tf(w(u•, u∘, u) = Φ(x•(tf))), then fitness still depends critically on the control variables. We assumed in section 2 that the fitness w(u•, u∘, u) is Gâteaux differentiable (eqs. 2 and 3), which means here that functions f, Φ and g are smooth enough with respect to its arguments (see e.g. section 3 of Liberzon, 2011 for textbook treatment of assumptions and Clarke, 1976 for minimal assumptions needed). We finally note that in the homogeneous island population, individual fitness may depend in a non-linear way on various vital rates (e.g, Roze and Rousset, 2003, eq. 35, Akçay and Van Cleve, 2012, eq. A12, Van Cleve, 2015, eq. 38, Mullon et al., 2016, eq. box 1a), which themselves may depend on integrals depending on the control schedules of the individuals in interaction. Such situations can be analysed either by defining state variables whose integrated values represent the integral, and are covered by the scrap value Φ(x•(tf)) in eq. (13), or by noting that to the first-order, functions of integrals can be replaced by integrals of first-order Taylor series of fitness and hence the f (u•(t), x•(t)) fitness component in eq. (13) may be evaluated as a first-order Taylor expansion of fitness in its vital rates (e.g., Van Cleve, 2015, eq. 39).
3.1.2 Concept of neutral reproductive and shadow value
A central role in our analysis will be played by the (neutral) reproductive value of an individual at time t in a resident population, which gives the total contribution to fitness from time t onward of a (recipient) individual when the current state variables of the actors on its fitness is x(t). The argument u has been separated with the semicolon in order to emphasise that the reproductive value is evaluated assuming a fixed control trajectory where u is treated as a parameter. Hence, the reproductive value is formally a function of current time t and state . In Appendix B.1, we show that the reproductive value satisfies the following partial differential equation (PDE) with final condition (“f.c.”) v(tf, x(tf); u) = Φ(x(tf)), where 1 = (1, 1, 1) and the vector is the gradient of the reproductive value with respect to the changes in the state variables of each individual affecting the focal’s fitness (and associated with fixed resident control path u). Note that we use the vector x•(t) as an argument of v when expressing the partial derivatives to emphasise who’s actor state we are varying, even though we are here considering only a resident population where all actors on a focal recipient have the same controls. The “-” sign on the left-hand-side of eq. (20) indicates that the reproductive value of an individual is growing when looking backwards in time. Hence, it grows according to the current rate f (u(t), x(t)) of fitness increase and the sum 1 · λ(t, x(t); u) of the effects of the current state change of each type of actor on the future fitness of the focal individual, weighted by the change g(u(t), x(t)) of state of the actors that are all the same in a resident population. The elements of the gradient λ(t, x(t); u) are called the shadow values of the states in the optimal control literature (see e.g. Dorfman, 1969; Caputo and Caputo, 2005), since by changing state, there is no immediate effect on fitness, but only future fitness effects.
3.1.3 Concept of open and closed-loop controls
Because the internal and external conditions of organisms vary, trait expression can evolve to be functionally dependent on these conditions (Sibly and McFarland, 1976; McFarland, 1977; McFarland and Houston, 1981; Houston et al., 1999). Hence, trait expression may not only depend on time but also on state variables. Focusing on the resident trait u(t), we can conceptualise trait expression in (at least) two different ways that are relevant to evolutionary biology. Namely, where the function is the trait expression rule (or decision rule for short) in the so-called closed-loop (or feedback or Markovian) form of the control variable (Basar and Olsder, 1999, p. 221, Dockner et al., 2000, p. 59). A closed-loop (feedback) control can be thought of as a contingency plan, which specifies a trait expression rule according to both the state and age (or time during interaction phase ) of the individual. An open-loop control and can be thought of as an entirely fixed course of trait expression from birth to death of an individual (trait expression happens “by the clock”). Note that closed-loop (feedback) control is the general characterisation that subsumes open-loop control.
Closed-loop traits are biologically more general and certainly relevant to life-history, since trait expressions is more likely to depend on individual state that on chronological age. Yet, in many life-history evolution models, mostly open-loop control traits have been assumed (e.g., Macevicz and Oster, 1976; Schaffer, 1983; Bulmer, 1983; Iwasa and Roughgarden, 1984; Sibly et al., 1985; Stearns, 1992; Perrin, 1992; Kozłowski, 1992; Perrin et al., 1993; Day and Taylor, 1997, 1998, 2000; Irie and Iwasa, 2005; Parvinen et al., 2006; Metz et al., 2016). We know from the literature of dynamic game theory that if there are no interactions between individuals, the open- and closed-loop (feedback) representations of control lead to the same uninvadable control paths (see e.g. Basar and Olsder, 1999, chapter 5.6, Dockner et al., 2000, chapter 3.5 for optimal-control textbook treatment), thus circumventing the need to distinguish between them. This means that the use of closed-loop controls appearing in life-history models without interactions between individuals (e.g., Houston et al., 1999) do not lead to different results if instead an open-loop representation of traits would have been used. In the presence of interactions between individuals, however, open-loop and closed-loop controls can lead to substantially different uninvadable control paths and it becomes crucial to distinguish between them (Basar and Olsder, 1999, chapter 5.6, Dockner et al., 2000, chapter 3.5). In light of these results from game theory, we now proceed to analyse candidate uninvadable open- and closed-loop controls.
3.2 First-order condition for closed-loop controls
3.2.1 The first-order condition in terms of dynamic constraints
Let us now evaluate the point-wise fitness effects of Hamilton’s marginal rule (7) by substituting the fitness function eq. (13) into eq. (8) and taking the derivative with respect to u•(t) and u∘(t). Calculations displayed in Appendix B (in particular eqs. B.16–B.28) then show that the direct effect is and the indirect effect is
The derivatives are evaluated at u = d(x) = {d(t, x(t))}t∈[0,tf] for closed-loop controls and at u = d = {d(t)}t∈[0,tf] for open-loop controls, where d(t, x(t)) = (d(t, x(t)), d(t, x(t)), d(t, x(t))) and d(t) = (d(t), d(t), d(t)) are vectors of closed-loop and open-loop trait expression rules, respectively, evaluated in a resident population. Hence, the direct effect −c(t, u) and the indirect effect b(t, u) are given in terms of the effect on the Hamiltonian (the rate of change of the reproductive value) of the infinitesimal variations (perturbations) of current trait components u•(t) and u∘(t).
We now make two observations about the direct and indirect effects. First, the perturbations of the change of the state variables (∂g/∂u• and ∂g/∂u∘) have cascading downstream effects on fitness growth rate f (u•(t), x•(t)), but since under a first-order analysis everything else than the original perturbation needs to be held constant, the downstream effects are accounted for by the shadow values λ(t, x(t); u) evaluated in the resident population. Second, the state dynamics of an average individual in the population is not affected from variations in u• and u∘(∂g(u(t), x(t))/∂u• = ∂g(u(t), x(t))/∂u∘ = 0, by way of eq. 16).
In order to obtain a full characterisation of the first-order condition taking into account the dynamic constraints brought by the shadow value, it is useful to introduce the Hamiltonian function
This can be thought of as the contribution to individual fitness of all current “activities” (Dorfman, 1969, p. 822); namely, the (phenotypic) expressions u•(t) of all individuals currently in state x•(t) at t, holding everything else fixed in a resident population. It is thus the sum of the current rate of fitness contribution f(u•(t), x•(t)) and the changes in current states g(u•(t), x•(t)) resulting from the activities weighted by λ(t, x(t); u) evaluated in the resident population, since the shadow values do not directly depend on the activities at time t. Our next result (proved in Appendices B.1 and B.2) establishes the necessary condition for uninvadability for closed-loop control paths as follows.
Candidate uninvadable closed-loop control result
Let be a candidate uninvadable closed-loop (feedback) control path with associated state path , where x*(t) = (x*(t), x*(t), x*(t)) and shadow value λ*(t, x*(t)) = λ(t, x*(t); u*). The candidate uninvadable control path d*(x*) has to necessarily satisfy eq. (12), where the point-wise selection coefficient s(t, u*) on control component u*(t) = d*(t, x*(t)) can be written for all as where d*(t, x*(t)) = (d*(t, x*(t)), d*(t, x*(t)), d*(t, x*(t))), the state variable satisfies and the shadow value function λ*(t, x*(t)) = ∇v*(t, x*(t)) is obtained from the reproductive value function v*(t, x*(t)) = v(t, x*(t); u*) satisfying
Hence, all quantities are evaluated on the resident control u* = d*(x*) and state x* paths.
We now emphasise two points about this result. First, the dynamic constraints entail solving forward in time eq. (27), which is an ODE (ordinary differential equation), and solving backwards in time eq. (28), which is a PDE (partial differential equation). Thus, the Hamiltonian can be thought as the growth rate of the reproductive value (when looking backwards in time). For a reader familiar with the dynamic programming literature, the reproductive value v*(t, x*(t)) is not the so-called value function of the model and hence eq. (20) (even when evaluated along the candidate uninvadable control path u• = u*) is not the eponymous Hamilton-Jacobi-Bellman equation (e.g., Bryson and Ho, 1975; Kamien and Schwartz, 2012; Basar and Olsder, 1999; Dockner et al., 2000; Liberzon, 2011; Weber, 2011). This means that the above result says nothing about the sufficiency of uninvadability, like any standard first-order selection analysis. In this regard, our result provides a weaker, yet simpler and novel condition to characterise closed-loop controls compared to the standard approach in the optimal control theory literature.
Second, by substituting eq. (25) into (26) yields that any interior candidate uninvadable strategy satisfying s(t, u*) = 0 (recall 12) must satisfy
This fundamental balance condition says that the current fitness effect (on the focal individual) is traded-off (hence the negative sign) by the state-modulated future fitness effect resulting from the change in state variables. This trade-off is instrumental in allowing to characterise the candidate uninvadable control u*(t) = d*(t, x*(t)), which can be typically done in two steps. The first step is to determine d*(t, x*(t)) satisfying (29), while treating the system state x*(t) and its shadow value λ*(t, x*(t)) as parameters, yielding the implicit expression in terms of some function D (that satisfies eq. 29). Essentially, this step is akin to solving a static (one-dimensional) first-order condition (and which can in principle also be used whenever s(t, u*) ≠ 0, Dockner et al., 2000, p. 97). We will refer to this first-step characterisation as the static characterisation, since it allows to characterise the general nature of the solution in terms of x*(t) and λ*(t, x*(t)) independently of their explicit values. The second step entails solving for the trajectories of x*(t) and v*(t, x*(t)) generated by eqs. (27) and (28) under eq. (30) and then taking the gradient of ∇v*(t, x*(t)) to obtain λ*(t, x*(t)). Finally, after solving for trajectories x*(t) and λ*(t, x*(t)) we can explicitly characterise the candidate uninvadable control by substituting the these solutions into eq. (30). Solving eq. (28) for v*(t, x*(t)) is the main technical challenge in finding the candidate uninvadable traits.
It is also often the case in biological models that the Hamiltonian is affine in the control variables so that fitness depends linearly on the evolving traits (e.g. Macevicz and Oster, 1976; Perrin et al., 1993; Irie and Iwasa, 2005; Avila et al., 2019). In such cases, controls do not appear in the selection gradient (26) and hence, one can not directly determine from it the static characterisation (30). These types of controls are known to be singular arcs (see Kelley, 1964; Kopp and Moyer, 1965; Goh, 1966 for classic developments and see e.g. Sethi and Thompson, 2006; Bryson and Ho, 1975 for textbook treatments). In order to characterise the candidate uninvadable singular arc, we can take the total time derivative of the selection gradient s(t, u*), which (potentially) provides an additional algebraic equation in the variables (u*, x*, λ*) that can contain the control(s) with a non-zero coefficient. In case it does not, another time derivative can be taken until expression for u* can be obtained. Hence, for singular arcs, we can obtain the static characterisation (30) by applying until u* can be obtained. Note that for a candidate uninvadable control to be a singular arc, eq. (31) has to hold for a finite interval. If eq. (31) does not hold over a finite interval, then u* is known to be a bang-bang control (see e.g. Sethi and Thompson, 2006; Bryson and Ho, 1975), meaning that u* takes the values only on its boundaries (u*(t) = umax(t) or u*(t) = umin(t), owing to eq. 12).
3.2.2 Shadow value dynamics and state feedback in a resident population
From the static (first-step) characterisation of d*(t, x*(t)) (eq. 30), we observe that the candidate uninvadable trait is at most a function of λ*(t, x*(t)), but does not directly depend on the reproductive value v* itself. Furthermore, taking the partial derivative of eq. (28) with respect to x*(t) and using the definition of the Hamiltonian (25) and re-arranging (see sections B.1.2 and B.3 in Appendix) yields with f.c. λ*(tf, x*(tf)) = ∇Φ(x*(tf)) and where d(t, x•(t)) = (d(t, x•(t)), d(t, x∘(t), d(t, x(t)). The dynamics of the shadow value, given by eq. (32), may at first glance appear to be an ODE (and therefore easier to solve than eq. 28, which is clearly a PDE for the reproductive value v*). This may lead one to hope that it is possible to circumvent from explicitly determining v*, by simply solving eq. (32) to directly obtain λ*. But this hope is crushed by the trait dependence on state, which entails that eq. (32) depends on the derivatives of the elements of d(t, x•(t)) with respect to state, which in turn depends on higher-order derivatives of v(t, x*(t); u*). This can be seen by using eq. (30), whereby which unveils that eq. (32) is actually not an ODE. This means that in general it is not possible to determine the candidate uninvadable trait from using eq. (32), which has been repeatedly stressed in optimal control theory (e.g. Starr and Ho, 1969b,a; Başar, 1977). However, the analysis of the components of eq. (32) has less been stressed, but turns out to be informative in highlighting the main similarities and differences between selection on closed-loop and open-loop controls.
Lets now decompose eq. (32) for the component (similar results hold for the other shadow values (see Appendix B.3) and write with f.c. . This says that the rate of change of the shadow value is given by a direct effect of state change on the Hamiltonian (current fitness effect and state-modulated fitness effect) and a feedback effect on the Hamiltonian, which arises since closed-loop traits react to changes in the state. Using the expression for the Hamiltonian in eq. (25), the expressions for direct and indirect effects in eq. (23)–(24), and noting that ∂d(t, x(t))/∂x•(t) = 0, the trait feedback effect can be further expanded as where the derivatives ∂d(t, x•(t))/∂x•(t) and ∂d(t, x∘(t))/∂x•(t) give the trait sensitivities of the focal individual and its average neighbour, respectively, at time t to changes in focal’s state variable x•(t). Hence, the feedback effect of state change is equal to the trait sensitivities of all individuals in the group weighted by their effects on the focal’s fitness (the latter are effectively the direct and indirect fitness effects).
We now make three observations about eqs. (34)–(35). First, trait sensitivities result in inter-temporal feedbacks in trait expressions. We can see this by first observing that current trait expression affects changes in state variables (by way of the second line of eq. 29) which affect future fitness (measured by the shadow value). In turn, the dynamics of shadow value takes into account that closed-loop traits respond to changes in state variables (by way of the second line of eq. 35). That is, the shadow value takes into account the effects of current trait expression on future trait expression. Hence, under closed-loop control, the current trait expression of one individual is linked to future trait expression of itself and other individuals in its group. Second, the sign of the feedback effect of state change (sign of eq. 35) determines if the trait sensitivity causes the shadow values to be higher (for negative feedback term) or lower (for positive feedback term), assuming that the sign is the same throughout the interaction period. This means that the sign of feedback effect determines how trait sensitivity affects the trade-off between current and future (state-modulated) fitness effects (by way of eq. 29). Third, the shadow value dynamics given by eqs. (34)–(35) is different from that in classical results from dynamic game theory (first developed by Starr and Ho, 1969a,b), where the feedback effect through the focal’s own trait variation does not appear due to the absence of interactions between relatives, whereby −c(t, u*) = 0 at s(t, u*) = 0. By contrast, in our model with interactions between relatives one has −c(t, u*) + r(u*)b(t, u*) = 0 at s(t, u*) = 0. Thus, we recover the classical result for the feedback effect from dynamic game theory when r(u*) = 0.
We now consider three scenarios (which are relevant for biology) under which the feedback term (given by eq. 35) that describes the dynamics of vanishes (similar arguments also hold for the feedback term and recall that we do not need to consider the dynamics λ(t) here, because λ*(t, x*(t)) does not affect the selection gradient). That is, we consider scenarios for which eq. (32) is a system of ODE’s (for components and λ∘) and therefore solving a PDE (28) for v*(t, x*(t)) is not necessary to determine the candidate uninvadable trait. These three scenarios are as follows.
Open-loop u(t) = d*(t) controls. Because the traits do not depend on the state variables, then the are no trait sensitivities and ∂d•(t)/∂x•(t) = 0 and ∂d∘(t)/∂x•(t) = 0 which implies that eq. (35) vanishes.
No social interactions in the group, meaning that fitness components of individuals do not depend on traits and states of other individuals in the group, i.e. the fitness components f, Φ and g of the focal individual do not depend on u∘ and x∘, hence b(t, u*) = 0. It then further follows from eq. (26) that in order for s(t, u*) = 0 to be satisfied, we need c(t, u*) = 0. It then follows directly that eq. (35) vanishes.
In a population of clonal groups (r(u*) = 1) that share a common state variable (x•(t) = x∘(t), e.g. common resource in the group). Two observations can be made for this scenario. First, from eq. (26) it follows that −c(t, u*)+ b(t, u*) = 0 for clones at s(t, u*) = 0. Second, since x•(t) = x∘(t), then e.g. ∂d(t, x•(t))/∂x•(t) = ∂d(t, x∘(t))/∂x•(t). Combining these two observations directly leads to conclude that the feedback term eq. (35) vanishes.
The are a two implications that follow for these three cases. First, open-loop and closed-loop evolutionary equilibria are in general very different, since the state-feedback effect causes inter-temporal feedbacks between trait expressions of locally interacting individuals under closed-loop controls, which are not possible under open-loop controls. However, if individuals do not locally interact or if they interact in clonal groups, then closed-loop and open-loop representation of controls produces the same candidate uninvadable trait and state trajectories.
Second, the sign of feedback effect of state change (eq. 35) determines if the shadow value for closed-loop traits is higher or lower than the shadow value for open-loop traits (given that this sign is the same for all t ∈ [0, tf]). If the sign of the net trait sensitivity is negative, then changing the state results in a lower future inclusive fitness effect (recall the definition from eq. 29) for closed-loop controls than for open-loop controls (in a candidate uninvadable population). Hence, for a negative trait sensitivity, state variables change less for closed-loop controls than for open-loop controls. If the sign is positive then the opposite is true (i.e. future inclusive fitness effect is higher for closed-loop controls, hence state variables are expected to change more).
3.3 First-order condition for open-loop controls
We now focus specifically on open-loop controls by pointing out the simplifications in our analysis that arise when
As we showed in the previous section, under eq. (36) the state-feedback term eq. (35) for shadow value dynamics vanishes (similarly it vanishes also for and λ*(t, x*(t))), which implies that eq. (32) is a system of ODE’s. Hence, we can characterise the necessary condition for an open-loop control paths as follows.
Candidate uninvadable open-loop control result
Let be the candidate uninvadable open-loop control path with associated state path and shadow value λ*(t, x*(t)). The candidate uninvadable control path u* = d* has to necessarily satisfy eq. (12), where the point-wise selection coefficient s(t, u*) on a control component u*(t) = d*(t) can be written for all as where d*(t) = (d*(t), d*(t), d*(t)), the state variable satisfies and the shadow values satisfy
This result is Pontryagin’s weak principle for interactions between relatives (since only small mutant deviations are considered, Speyer and Jacobson, 2010, p. 74) and only requires consideration of the shadow value. It has been derived previously Day and Taylor (1997, 2000), for a slightly less general model, where individuals locally play the field or interact in a pairwise way (see also Day and Taylor, 1998; Wild, 2011 for related work). We here re-derived this result as a corollary of the closed-loop result of the previous section when the feedback-terms describing the dynamics of the shadow value λ(t) vanish (i.e, eq. 35 vanishes). Hence, we closed the loop between the selection gradient on function-valued traits, invasion implies substitution, Hamilton’s rule, dynamic programming, and Pontryagin’s (weak) maximum principle.
4 Example: Common pool resource production
4.1 Biological scenario
We here present an application of our model to production of common pool resource that returns fitness benefits to all group members but is produced at a personal fitness cost. The evolving trait we consider is the schedule of effort invested into resource production during an interaction period and we will hereinafter refer to the control u(t) as the production effort at time t. Let u•(t), u∘(t), u(t) denote the production efforts at time t of the focal individual, average neighbour in the focal group, and average individual (from other groups) in the population, respectively. Let xc(t) and x(t) be the resource level at time t (the total amount of resources produced) in the focal group and the average group in the population, respectively. Note that here we have a common state variable xc(t) = x•(t) = x∘(t) between individuals in the same group and individuals interact through the state only locally (resources produced in other groups does not directly affect the fitness of individuals in the focal group).
We study the evolution of the production effort under two different scenarios: (i) individuals can adjust their production effort according to the (local) resource level xc(t) (closed-loop control) and (ii) individuals are committed to a fixed schedule of production effort (open-loop control). One difficulty in analysing the evolution of such traits is that limited genetic mixing generates relatedness between group members but also competition between them, which leads to kin competition (e.g., Taylor, 1988; Frank, 1998; Rousset, 2004; Van Cleve, 2015). Since we want to highlights the key effects of the evolution of open-loop and closed-loop controls in the context of interactions between relatives in a simple way, we want to avoid the complexities brought by kin competition and thus assume implicitly a life cycle that entails no kin competition. In particular, we assume that individual fitness takes the form which depends on the resource level xc(tf) in the group at the end of interaction period tf and on the accumulated (personal) cost of producing the common resource for the group (the second term in eq. 40). Here ceffort is a parameter scaling the personal cost. The resource level xc(tf) and ceffort are measured in units of the number of offspring to the focal individual and scaled such that they inherently take into account the proportional effect of density-dependent competition (proportional scaling of fitness does not affect the direction of selection). Neglecting the effects of kin competition in eq. (40) does not lead to any loss of generality in our forthcoming analysis, since taking kin competition into account would only affect the final results by re-scaling the value of relatedness (e.g., Van Cleve, 2015).
We assume that the rate of increase of the resource level in a group depends on the total amount of production effort that individuals in the group invest into producing it and that the return from this effort decreases exponentially with the current level of resource where the parameter a > 0 is the efficiency of producing the common resource.
4.2 Static characterisation of the production effort
We can express the reproductive value (19) as where the state variable is only one-dimensional (since fitness depends only on resources produced in the group xc(t), which is here evaluated at resident strategy, i.e. xc(t) = x(t)). We will denote the corresponding shadow value (common to all actors) with , which gives the effect on fitness of changing the state variable in the group evaluated at the resident population.
Using eq. (40), the fitness components f and Φ (as defined in eq. 13) take the form while the rate of change of the state variable xc(t) is given by
On substituting eqs. (43) and (44) into the Hamiltonian function eq. (25) produces
Since reproduction is assumed to take place after the interaction period [0, tf] and producing resources is costly, then the current fitness contribution f (the first term in eq. 45) is negative for all t ∈ [0, tf]. Hence, the effect on the current state change weighted by the shadow value gcλc (the second term in eq. 45) has to be positive.
In terms of eq. (45), the direct fitness effect is while the indirect fitness effect is where from which we obtain the fundamental balance equation (eq. 29) for this model:
This says that the net effect of accumulation of personal cost due to spending effort to produce a unit resource must balance out the inclusive fitness benefit associated with the unit resource. Solving eq. (48) for u*(t) yields where scales the benefit to cost ratio of producing the resource, note also that γ > 0. Eq. (49) says that (candidate uninvadable) production effort u*(t) decreases exponentially with the resource level x*(t), increases linearly with the shadow value and relatedness, and is not directly dependent on time. This general nature of the solution applies to both open-loop and closed-loop controls and is depicted in Fig. 1.
4.3 Full characterisation of the production effort
4.3.1 Closed-loop production effort
We now turn to analyse u*(t) explicitly as a function of time, which requires to evaluate x*(t) and v*(t, x*(t)) = v(t, x*(t); u*). To that end, we evaluate the dynamic eq. (41) for xc(t) along u• = u* and xc = x* and substituting the expression for u*(t) from eq. (49) therein yields
Substituting the Hamiltonian (45) for our example into eq. (28) and substituting the right-hand-side of eq. (49) to express u*(t) and simplify yields where and note that c1 > 0. Using the method of characteristics, Ewald et al. (2007) have showed that the partial differential equation (52) has the following solution (our eq. 52 corresponds to eq. (21) of Ewald et al., 2007 where c1 = 3/2k and their solution is presented on page 1459 of their paper, where c1 = c = 3/2k). Taking the derivative with respect to x*(t) and upon simplifying yields the expression for the shadow value
Substituting this into the static characterisation eq. (49) shows that where the state variables is the solution of which was obtained by substituting the shadow value from eq. (55) into the dynamic eq. (51) of resource level x*(t) and for which we were not able to find a closed form expression.
4.3.2 Open-loop production effort
We now derive an explicit expression for the candidate uninvadable open-loop trait u*(t) = d*(t). Substituting the Hamiltonian (45) for our example into eq. (39) and using the right-hand-side of eq. (49) to express u*(t), yields the dynamic equation of the shadow value , and with eq. (51) for the dynamic equation of resource level x*(t) we arrive at the following two-point boundary value system
This system has one real-valued solution and substituting this solution back into the expression for static characterisation eq. (49), we obtain the explicit expression for the candidate uninvadable open-loop trait, which turns out to be constant in time.
4.4 Comparison of closed-loop and open-loop production efforts
With this example, we recover the result that in a population of clonal groups (r(u*) = 1) closed-loop and open-loop equilibria coincide (Figure 2). In a population of non-clonal groups (r(u*) < 1) the production effort u*(t) and the resulting amount of resource x*(t) tend to be lower under closed-loop candidate uninvadable equilibrium (hereinafter, we simply write “equilibrium”) than under open-loop equilibrium (Figure 2). Overall, the production effort monotonically increases over time for the closed-loop control and stays constant under the open-loop control (Figure 2).
The difference between closed-loop and open-loop (non-clonal) equilibria arises from the difference in the dynamic constraint on the shadow value (recall section 3.2.2). We find that the shadow value is lower under closed-loop control than under open-loop control when r(u*) < 1 (Fig. 3). This is so, because of the feedback effect of state change (eq. 35), which for our example is
Since, this is negative, the shadow value declines faster backwards in time than under closed-loop equilibrium. In order to understand why the feedback effect is negative, we need to consider the signs of b(t, u*) − c(t, u*) and the trait sensitivity ∂d(t, xc(t))/∂xc(t) (which is here same for all group members). The term b(t, u*) − c(t, u*) is positive when groups are non-clonal and zero when they are clonal (Fig. 4, panel a). This means that if everyone in the group produces more of the resource, then the focal’s fitness increases under non-clonal equilibrium and is unaffected under the clonal equilibrium. The trait sensitivity is always negative (Fig. 4, panel b). Hence, individuals will reduce their production effort in response to an increase in the resource level and the magnitude of this effect is larger for higher values of relatedness.
In conclusion, investment effort is lower for closed-loop traits than for open-loop traits in a population of non-clonal groups (r(u*) < 1), because closed-loop trait expression takes into account that other individuals will reduce their production effort in response to the focal individual increasing its production effort. In a population of clonal groups where the focal individual increases its trait expression, the response from other individuals will not affect the fitness returns to the focal. An open-loop equilibrium can thus be thought of as a group fitness maximising equilibrium, which can be achieved under closed-loop trait expression only when there are no genetic conflicts between individuals (i.e. clonal groups). For open-loop controls this equilibrium is always achieved, regardless of how related individuals are, since genes code for control trajectories that are pre-determined at birth (full commitment to control trajectory) and trait expression can not be adjusted in response to changes in the resource level. For clonal groups, trait sensitivity to changes in the resource level does not alter individual behaviour, because everyone’s interests in the group are aligned.
5 Discussion
We formalised the directional selection coefficient on function-valued traits when interactions occur between individuals in a group-structured population subject to limited genetic mixing. This selection coefficient describes the directional evolution of quantitative function-valued traits and determines three relevant evolutionary features. First, it gives the invasion condition of a mutant allele coding for a multidimensional phenotypic deviation (the deviation of a whole function) of small magnitude and takes the form of Hamilton’s marginal rule −c+rb > 0, where the marginal direct fitness effect −c and the marginal indirect fitness effect b are given by directional derivatives (formally Gâteaux derivatives). Second, the selection gradient is frequency-independent (same for all allele frequencies) and thus underlies gradual evolution of function-valued traits, since −c + rb > 0 implies not only initial invasion of the mutant function-valued deviation, but also substitution of the resident ancestral type in the population. Finally, the stationary selection gradient (i.e. when −c + rb = 0) gives the necessary first-order condition for uninvadability and allows to characterise long-term evolutionary outcomes. While these three features are well known to hold for scalar traits (e.g., Rousset, 2004; Lehmann and Rousset, 2014; Van Cleve, 2015), our derivation of Hamilton’s marginal rule for multidimensional traits generalises them to traits of arbitrary complexity and establishes a link between selection on scalar and function-valued quantitative traits.
We also showed how selection on dynamic function-valued traits (i.e. traits that are time-dependent and are potentially subject to dynamic state constraints) can be analysed using optimal control theory. In particular, we analysed selection on both open-loop controls, whose expression is only time-dependent, and closed-loop controls, whose expression is also state-dependent. Regardless of the mode of trait control, the direct and the indirect effects (−c and b) of Hamilton’s rule can in a dynamic context be decomposed into, on one side, the direct and indirect effects on current fitness of a focal individual and, and on the other side, the direct and indirect effects on its future contributions to fitness. These latter effects arise through changes in the states of interacting individuals and depend on the shadow values, which are the focal’s future contributions to fitness resulting from current variations in states and are formally given by the effects of state changes on the reproductive value function. The shadow values are thus central in balancing the trade-off between current and future fitness effects and thus in shaping life-history trade-offs.
Our analysis allows to compare the differences of how selection acts on open-loop and closed-loop controls. While the selection coefficient takes the same form (Hamilton’s rule) for these two modes of trait control, the dynamic constraints on the two cases are different and this arises from the difference in the dynamics of the shadow value, which plays a central role in understanding these differences. For open-loop traits, the shadow value dynamics depends only on how state variation directly affects current fitness and state dynamics. For closed-loop traits, the shadow value dynamics is shown to depend additionally on the state feedback effect, which captures how state variations of the individuals in interaction affect current fitness and state dynamics through trait sensitivity to state variations (eq. 35). This causes interdependencies between the states of different individuals and inter-temporal effects between trait expressions, which are absent under open-loop controls. Analysis of the feedback effect leads to two insights about the role of state-dependence of trait expression in shaping trait evolution. First, the sign of the feedback effect (the sign of eq. 35) determines if the shadow value is larger (for positive feedback effect) or smaller (for negative feedback effect) for closed-loop traits than for open-loop traits. Second, state-dependence of trait expression plays no role if there are no social interactions between individuals or interactions occur only between clones (r = 1), in which cases the candidate uninvadable open-loop and closed-loop trait expressions coincide (and the state-feedback term is zero). Thus, in both cases only open-loop analysis is required, which turns out to be useful because only ordinary differential equation’s (ODE’s) need to be solved to determine the uninvadable open-loop and closed-loop controls.
We also worked out an example of common pool resource production to illustrate these concepts, and where strategic interactions under closed-loop trait expression cause individuals to invest less into common pool resource at an evolutionary equilibrium (the state feedback effect is negative). This result is in line with the literature of economic game theory, where “period commitments” lead to higher levels of cooperation in common pool situations (Meinhardt, 2012). Indeed, an open-loop control can be viewed as a strategy that commits to its expression over the entire interaction time period [0, tf], since it cannot be altered in response to some change experienced by individuals, while a closed-loop control is a strategy with no commitment over time since it is conditional on state. However, depending on the nature of the interaction between individuals, closed-loop trait expression can also lead to higher levels of cooperation. A concrete example is an analysis of the repeated prisoner’s dilemma game, where open-loop controls lead to defection, while closed-loop controls can sustain cooperation (e.g., Weber, 2011), since under closed-loop strategies current actions are linked to future ones. Hence, only closed-loop strategies can sustain the reciprocity principle of repeated games by giving rise to incentives that differ fundamentally from those of unconditional trait expression (see Binmore, 2020, p. 87 for a characterisation of this principle).
We finally discuss some scopes and limitations of our formalisation. First, concerning scopes, we focused explicitly on two types of controls, but other types of controls are taken implicitly into account as special cases of our formalisation. For instance, closed-loop stationary controls, where trait expression depends only on state and not on time are included in our analysis as special cases of closed-loop controls. Stationary controls are obtained under certain scenarios when the components of fitness (such as functions f, Φ and g in eqs. 13 and 15) do not explicitly depend on time (as we have assumed here) and the time horizon of the interaction period becomes very large (i.e. tf → ∞, see e.g. McNamara et al., 1991; Venner et al., 2006). Then, the uninvadable closed-loop strategy convergences to a stationary strategy d*(x*) and the PDE for the value function becomes an ODE, which is easier to solve (see e.g. Dockner et al., 2000, p. 97). This could be applied for modelling reaction norms to (stationary) environmental factors (e.g. salinity, altitude, temperature).
Another special case included in our analysis is that of constant controls, where trait expression depends neither on state nor time, but nevertheless affects the dynamics of some state variable that in turn affects fitness, in which case the value function becomes a constant. Several concrete biological situations fall into this category. For instance, neural networks are dynamical systems whose output is controlled by a finite number of scalar weights (Haykin, 2009), the selection on which is encapsulated by our formalisation if weights are taken to be traits evolving genetically and thus treated as controls (see Ezoe and Iwasa, 1997 for an application to evolutionary biology). Likewise, phenomena as different as gene expression profiles and learning during individual’s lifespan can be regarded as outcomes of dynamical systems controlled by a finite number of constant traits (e.g. see respectively Alon, 2020 and Dridi and Akçay, 2018).
Concerning limitations, we modelled a population reproducing in discrete time, where within each time period individuals can interact for a fixed time interval [0, tf]. As such, the vital rates of individuals can change over this interaction period, but not between interaction periods. Hence, our model with limited dispersal and time-dependent vital rates (during [0, tf]) applies to semelparous species, which covers models with conflict between relatives of annual organisms (Day and Taylor, 2000; Avila et al., 2019). Further, if we allow for complete dispersal between groups (r(u) = 0), then our framework can be used to address the evolution of function-valued traits under overlapping generations with time-dependent vital rates as in continuous time classical life history models with and without social interactions (e.g. León, 1976; Schaffer, 1983; Stearns, 1992; Perrin, 1992), but we add the possibility of considering the evolution of closed-loop controls. This scenario is encapsulated in our formalisation because the individual fitness function we used to analyse dynamic trait evolution (eq. 13) takes the same functional form as the basic reproductive number in age-structured populations (and which is sign equivalent to the Malthusian or geometric growth rate, e.g., Karlin and Taylor, 1981, p. 423-424). As such, our results on closed-loop controls (section 3.2) allow to characterise long-term evolutionary outcomes when the fitness of an individual takes the form of the basic reproductive number. For this situation, our results for open-loop controls (section 3.3) reduce to the standard Pontryagin’s weak principle used in life-history models (e.g. Schaffer, 1983; Stearns, 1992; Perrin, 1992). In order to cover time-dependent vital rates with overlapping generations within groups under limited dispersal, one needs to track the within-group age structure (e.g. Ronce et al., 2000), which calls for an extension of our formalisation. Finally, we did not consider between-generation fluctuations in environmental conditions, which certainly affect the evolution of function-valued traits and it would be interesting to investigate this case. Hence, while our results are not demographically general, our hope is that the present formalisation is nevertheless helpful in providing broad intuition about the nature of directional selection on function-valued traits.
Acknowledgements
We thank Charles Mullon for helpful comments on the manuscript.
Appendix A Derivation of Hamilton’s rule for function-valued traits
In this Appendix, we prove the gradient version of Hamilton’s rule for function-valued traits and show that this provides an invasion implies substitution principle under weak selection (eqs. 4–8). A central concept used in our proof is the notion of a Gâteaux derivative.
A.1 Gâteaux derivative and point-wise functional derivative
Let be some functional where is a vector space over a domain and assume that for some the limit exists for all deviations η that satisfy for a sufficiently small non-negative parameter ϵ. Then, the function F is said to be Gâteaux differentiable at y, and δηF (y)/δηy is the shorthand notation for a Gâteaux derivative at y in the direction of η (Hille and Phillips, 1957, Section 3). The Gâteaux derivative can thus be thought of as a generalization of the directional derivative familiar from finite dimensional spaces. Most rules that hold for ordinary derivatives also hold for Gâteaux derivatives, e.g. Taylor’s theorem and chain rule (e.g. see Section 2.1C in Berger, 1977, or, Appendix A of Engel and Dreizler, 2013). The Gâteaux derivative can be expressed in terms of point variations (e.g. see Engel and Dreizler, 2013, eq. A.15 and eq. A.28) as where is the point-wise functional derivative of F at y(t) and δt is the Dirac measure taking value 1 at t and otherwise it is 0. That is, eq. (A.3) is the partial derivative of F with respect to x at t and hence we use the more familiar ‘partial derivative’ notation from finite dimensional spaces. The representation in eqs. (A.2)–(A.3) is useful because it allows, for instance, to take a functional derivative of fitness with respect to the trait, and partition it into a deviation η(t) and a marginal fitness effect at a specific (single) time point (i.e. a point-wise marginal fitness effect), and only then integrate over the domain .
A.2 Dynamics of mutant-frequency
Consider that the mutant allele coding for trait um and the resident coding for trait u, segregate in the homogeneous island population as described in the main text. Because no individual-level demographic heterogeneity is assumed withing groups (i.e., no class structure), each group can be characterised, from a population genetic state perspective, by the number of mutants that inhabit the group and we denote the set of all group genetic states with . The state of the entire homogeneous island population can thus be described with the vector where ϕi,τ is the frequency of groups with i mutants at demographic time τ. Since population size is constant in the homogeneous island population (mean fitness is one), the change in the average frequency Δpτ = pτ +1 − pτ of the mutant allele from demographic time τ to τ + 1 (over one life-cycle iteration) can be expressed as where W(um, u, ϕτ) is the marginal fitness (or lineage fitness) of the mutant allele. Namely, this is the expected number of offspring (including the surviving self) produced by a randomly sampled mutant individual from the collection of all mutants in the population when the distribution of mutants and resident across groups is ϕτ. This fitness can be written as the average where is the probability that a randomly sampled mutant resides in a group with i mutants (whence ∑i qi,τ = 1) and where ui = (um, u, i − 1) and uϕ = (um, u, ϕτ) are vectors that describe, from the perspective of an individual sampled in a group with i mutants, the distribution of traits among group neighbours (local individuals) and in the groups in the population at large (non-local individuals), respectively. The function is the individual fitness where denotes the space of frequency distributions on (i.e. the simplex in ℝN +1), and as such, gives the fitness of a mutant when among its neighbours i − 1 individuals have trait um and N − (i − 1) have trait u, and in the groups in the population at large, mutant and resident traits follow the ϕτ distribution. When the mutant is rare, eq. (A.5) reduces to the invasion fitness of the mutant in the homogeneous island population (Mullon et al., 2016, eq. 1).
A.3 Weak-selection approximation
We now study mutant gene frequency change Δpτ assuming small ϵ. To that end, it is useful to note that the fitness of a mutant in a group in state i can be approximated by writing it in terms of average traits as where specifies that all group neighbours have the same group average trait with pi = [(i − 1)/(N − 1)] being the frequency of mutants among neighbours, while specifies that all non-local individuals have the same average population trait with being the mutant frequency in the population. Eq. (A.6), which has been used for scalar traits (Rousset, 2004, p. 95), follows by Taylor expanding to the first-order about ϵ = 0 and using the chain rule (which applies to Gâteaux derivatives, e.g. eq. A.38 in Engel and Dreizler, 2013) to see that the coefficients of the Taylor series involve (at most) Gâteaux derivatives weighted by average allele frequencies. This is an instantiation of the so-called generalised law of mass action (Meszéna et al., 2005; Dercole, 2016) and is secured by the assumption that all individuals within a group that have the same trait are exchangeable (individuals are demographically homogeneous).
Because all non-local (mutant and resident) individuals are considered to have the same average trait (the same is true for group neighbours), is de facto independent of ϕτ. This allows us to further simplify the right-hand side of eq. (A.6) by writing where the function is the (average) fitness function introduced in section 2 of the main text, where we do not need to detail mutant distributions. Hence, is the fitness of an individual with trait um in terms of only the average trait of its group-neighbours and the average trait of individuals in the population.
A.3.1 Allele frequency change
Substituting eq. (A.5)–(A.7) into eq. (A.4), we can express the change in allele frequency as
Taylor expanding the fitness function to the first-order about ϵ = 0 yields where (fitness in a monomorphic population is one), whereby
We now apply eq. (A.1) and use the chain rule for Gâteaux derivatives (see e.g. eq. A.38 in Engel and Dreizler, 2013), which produces where all partial derivatives here and henceforth are evaluated at the resident value u. Since all the partial derivatives are independent of any allele frequency, they give the effects on any individual’s fitness stemming, respectively, from itself, its average neighbour, and an average population member by varying (infinitesimally) trait expression. Hence, the type of the actor is not relevant when evaluating the fitness effects and we can equivalently write eq. (A.11) as where we took into consideration that the sum of partial derivatives of the fitness function with respect to all of its arguments is zero (since population size is constant, see e.g. Rousset, 2004, p. 96 for scalar traits) and where we replaced the variables um, , and with u•, u∘, and u (note that we have already substituted the resident trait into the final argument). This will be useful subsequently as it makes clear that fitness effects are independent of individual types and thus allows us to focus attention on the fitness of a focal individual.
Substituting eq. (A.12) into eq. (A.10) gives
Because is the probability that, conditional on being a mutant, a randomly sampled neighbour is also a mutant, and is the probability that two randomly sampled individuals are both mutants (i.e., frequency of mutant pairs), eq. (A.13) can be written
Hence, to the first order in ϵ, the dynamics of Δpτ is a function of only direct and indirect fitness effects evaluated in the resident population, and the average frequency pτ and mutant-pair frequency pmm,τ. Further, we only need to study the dynamics of pmm,τ under neutrality (ϵ = 0) because any higher order terms contribute to O(ϵ2) in eq. (A.14). Eq. (A.14) thus generalises to function-valued traits, a standard result for scalar traits (first detailed in Roze and Rousset, 2003 and re-derived a number of times since, e.g., Roze and Rousset, 2004; Rousset, 2004; Roze and Rousset, 2008; Lehmann and Rousset, 2014).
A.3.2 Mutant-pair dynamics and relatedness
Using standard population genetic arguments for writing recursions of moments of allelic state (e.g., Jacquard, 1974; Nagylaki, 1992; Roze and Rousset, 2008), we have where P1(u) is the the fraction of pairs within groups (of two randomly sampled individuals in the same group without replacement) that descended from the same individual in the previous demographic time step (so that possibly one individual in the pair is the parent of the other in the presence of survival). The quantity P2(u) is the fraction of pairs that have descended from two distinct individuals in the previous demographic time period, and where all these coefficients are constant since they are evaluated under ϵ = 0 and thus depend at most on the resident trait u. The steady state can be solved explicitly and one gets where is the relatedness in a patch at the steady state, i.e. the fraction of pairs at the steady state that have a common ancestor in the patch. Owing to neutrality, this is also the probability that a randomly sampled neighbour of a randomly sampled focal individual, carries the same allele as the focal. More-over, the steady state changes continuously with a resident trait whenever P1(u) and P2(u) change continuously.
A.4 Timescale separation and the invasion implies substitution - principle
We can observe that the dynamics of mutant frequency pτ, given by eq. (A.14), is dominated by terms of order O(ϵ), while the mutant-pair frequency pτ,mm, given by eq. (A.15), is dominated by terms of order O(1). Hence, when ϵ is small, the variable pmm,τ undergoes significant fluctuations over the demographic time step Δτ = (τ + 1) − τ = 1 (one iteration of a life cycle) while pτ is (nearly) constant. By contrast, pτ changes significantly over a slower time interval Δτ* = ϵΔτ while pmm,τ is near its equilibrium value. We will refer to Δτ * as the evolutionary time step and the phenotypic effect ϵ scales the relationship between evolutionary and demographic time (i.e. one evolutionary time step contains demographic time steps, and equivalently we can write ).
Combining eq. (A.14) and eq. (A.15), we see that the dynamics of the mutant frequency is thus fully described by the coupled system in demographic time and by a change of variables the system in eq. (A.18) can be equivalently expressed in slow evolutionary time as
We now separate the demographic and evolutionary timescales (i.e. the timescales of pτ,mm and pτ) by letting ϵ → 0 and the two last systems above reduce, respectively, to and Eq. (A.20) says that in a purely fast demographic time (ϵ = 0) the mutant frequency pτ = p stays constant (“frozen in time”), while mutant-pair frequency pτ,mm changes. Eq. (A.21) says that in a purely slow evolutionary time (ϵ = 0) the mutant-pair frequency has reached the steady state (its solution given in eqs. A.16–A.17), while the mutant frequency changes (thus p is in a so-called quasi-steady state – it changes so slowly that it is considered a steady state in one timescale but a fluctuating variable in another). By performing the substitution and in eq. (A.21) the dynamics of mutant frequency in slow evolutionary time is where is given in eq. (A.17). Because in eq. (A.17) persists under small perturbation of the resident phenotype u (Section A.3.2), we can approximate eq. (A.22) with an equation in fast demographic time whenever ϵ is sufficiently small, i.e. where we used Δτ = 1. This gives us the invasion implies substitution - principle on the time of the demographic process we began with (e.g., eq. A.4). Therefore, we can re-write eq. (A.23) as with and by using the definition of Gâteaux derivatives in eq. (A.1) we can explicitly write which is the effect a focal individual has on itself if it were to express the mutant phenotype and which is the effect that all local individuals have on the focal individual if they were to express the mutant phenotype (where we have likewise substituted u in the second equality). Hence, we have derived eqs. (4)–(5) of the main text.
Appendix B First-order condition for state-dependent models
In this Appendix we derive the results of main text section 3. These derivation are based on standard approach of calculus of variations as used in optimal control theory (Liberzon, 2011; Weber, 2011), but our argument will somewhat differ from standard approaches insofar as we will not make use of the Hamilton-Jacobi-Bellman equation, since we are interested only in the necessary first-order conditions (as opposed to necessary conditions in the standard approach). As such, it is important to stress that throughout sections B.1 and B.3, where we derive the dynamics of the (neutral) reproductive value v(t, x(t); u) and the shadow value λ(t, x(t); u) = ∇v(t, x(t); u), we evaluate all the traits u•(t) = u∘(t) = u(t) and states x•(t) = x∘(t) = x(t) at some resident values. Only in section B.2 we look at small deviations from the resident population, by analysing the Gâteaux derivatives of the fitness function w(u•, u∘, u), where we show that we only need to analyse the (neutral) reproductive value v(t, x(t); u).
For conciseness of notation, we also use the following short-hand notation: for total derivatives w.r.t. time t we write , for partial derivatives we write ∂f (t, x(t))/∂x(t) ≡ fx(t, x(t)), and second-order partial derivatives we write ∂2f (t, x(t))/∂x(t)∂x(t) ≡ fxx(t, x(t)). As in the main text, we always use the gradient ∇ notation for gradient with respect to state variables x(t).
B.1 Reproductive value dynamics in a resident population
We here derive the dynamic equations for the reproductive value, eq. (20) of the main text, and an associated equation for the reproductive value that is useful for the other derivations.
B.1.1 Partial differential equation for the reproductive value function
Recall from eq. (19) of the main text that the reproductive value at time t is defined as where the argument u has been separated with the semicolon in order to emphasise that the controls have been fixed. Hence, for a given u and initial condition x(t) at time t the state trajectory x is fully determined (i.e. the solution to the ODE in eq. (16) exists and is unique). Because both functions u and x are now given functions, the reproductive value in eq. (B.1) is considered to be a function of time t and the initial condition x(t) only (strictly speaking it should be a function also of the final time tf).
In order to derive a dynamic equation of v(t, x(t); u), we consider a very small (but positive) time interval Δt and write eq. (B.1) as where Δx(t) = x(t + Δt) − x(t) is the change in the state variables over Δt and v(t + Δt, x(t)+ Δx(t); u) is the reproductive value at t + Δt and all arguments have been noted accordingly. Using a first-order Taylor expansion around t, we approximate the second term in the second line of eq. (B.2) as where is the partial derivative with respect to the first-argument while is the vector of partial derivatives with respect to the last argument. Now approximating the first term on the right-hand-side of eq. (B.2) by f (u(t), x(t), t)Δt, we can write eq. (B.2) as
Subtracting v(t, x(t); u) from both sides, dividing by Δt, letting Δt → 0, noting that Δx(t)/Δt → g(u(t), x(t)) (as Δt → 0), and rearranging leads to which is a PDE for v(t, x(t); u) with a final condition (f.c.) v(tf, x(tf); u) = Φ(x(tf)) for the fixed control path u.
An analogous equation (in the context of life-history evolution) to eq. (B.7) has been previously derived for any resident population in Metz et al. (2016, see eq. 71), but where the traits were assumed to be open-loop controls. It is important to stress here that eq. (B.7) is not a form of the so-called Hamilton-Jacobi-Bellman equation for the value function evaluated on the optimal control path of optimal control theory (e.g., eq. 3.7 Dockner et al., 2000, chapter 3.2, or eq. 5.10 in Liberzon, 2011 or eq. 3.16 in Weber, 2011), even though it has a similar structure. This is because (i) the reproductive value v is here defined to hold for any resident control schedule u (and is not evaluated at optimality like the value function), and (ii) the value function for our model cannot be computed from the reproductive value of the focal individual, but needs to be computed from the invasion fitness of the mutant, which is the value function in an evolutionary model (invasion fitness is given by eq. A.5 when the mutant becomes rare or eq. 38 in Day and Taylor, 2000, but in the latter case only open-loop traits were allowed).
B.1.2 Dynamic equation for shadow value (gradient of reproductive value)
Recall that the controls u(t) = d(t, x) are functions of x. We now derive the dynamic equation for the shadow value (gradient of reproductive values), which will be useful in later proofs. Taking the gradient of eq. (B.7) with respect to x(t), we have with f.c. ∇v(tf, x(tf); u) = ∇Φ(x(tf)), where are (column) vectors. Bringing all the terms to the same side and using the chain in rule to expand ∇ [∇v(t, x(t); u) · g(u(t), x(t))], we obtain where is the transpose operator, 0= (0, 0, 0) is a zero (column) vector and is the Hessian matrix of the reproductive value function and is a gradient of vector g.
Now total differentiating ∇v(t, x(t); u) with respect to time and using the property that u is fixed along a path, we get which, on substitution into eq. (B.11), and noting that the order of taking partial derivatives can be changed yields which will be used in the next section.
B.2 First-order condition and the Hamiltonian
We now turn to deriving the (point-wise) direct effect −c(t, u(t)) and the indirect effect b(t, u(t)), given by eqs. (23) and (24), as well as the point-wise selection gradient for closed-loop traits, eq. (26) and the dynamic equation for the shadow value, eq. (32).
In Appendix A we showed that we can express the direct effect (A.26) and indirect effect (A.27) as Gâteaux derivatives
In order to compute these Gâteaux derivatives we first re-write the fitness function w(u•, u∘, u) by augmenting to it a zero quantity containing of adjoint system of constraints (see e.g. Liberzon, 2011, p. 97) and we then we show how to decompose the direct effect −cη(u) and indirect effect bη(u) into point-wise direct effects −c(t, u(t)) and point-wise indirect effects b(t, u(t)), respectively, which allows to characterise the point-wise first-order condition (26).
B.2.1 Augmenting the fitness function with an adjoint system of constraints
Recall the individual fitness function eq. (13) of the main text and let us append to it a zero quantity where recalling (eq. 21 of the main text) that λ(t, x(t); u) = ∇v(t, x(t); u) is the shadow value (gradient of reproductive value function). We can integrate the last term in eq. (B.17) by parts and hence the eq. (B.17) becomes
B.2.2 Computing the Gâteaux derivatives of the fitness function
We now substitute eq. (B.19) into eq. (B.16) and use eq. (A.2), which yields that we can express the Gâteaux derivatives of the fitness function in terms of point-wise variations as follows and where the term λ(0, x(0); u) · x•(0) has disappeared under differentiation because it is a given initial condition, and where all the derivatives under the integrals are evaluated at u•(t) = u(t), x•(t) = x(t) and are point variations of x(t) caused by variations in u•(t) and u∘(t), respectively.
Finally, by definition of the shadow value and from eqs. (B.14) and (B.15) we have with f.c. λ(tf, x(tf); u) = ∇Φ(x(tf)). Hence, it follows from eqs. (B.14) and (B.15) that the terms (in brackets) multiplying and are zero. Therefore, we can write and where and are point variations of w(u•, u∘, u) caused by u•(t) and u∘(t) (recall eq. A.2) and with this we have derived eqs. (23) and (24) of the main text.
Substituting eqs. (B.27) and (B.28) into eq. (7) of the main text (substituting u* = d*(x*), where d*(x*) = (d*(t, x*(t)), d*(t, x*(t)), d*(t, x*(t))) and x* = (x*, x*, x*)) and using the definition of the Hamiltonian (eq. 25). This yields where λ*(t, x*(t)) = ∇v(t, x*(t); u*) and the evaluation can be expressed as u*(t) = d*(t, x*(t)) for closed-loop traits and as u*(t) = d*(t) for open-loop traits. Now recall that the dynamics of x*(t) can be obtained from eq. (16) when evaluating it along u*, which yields
The dynamics of λ*(t, x*(t)) = ∇v(t, x*(t); u*) can be obtained from eq. (B.15) and taking into account the Hamiltonian function and evaluating along u* we obtain for closed-loop control u* = d*(x*) path and open-loop control u* = d* path
In conclusion, we have derived the point-wise direct and indirect effects, given by eqs. (23) and (24) of the main text (given here by eqs. B.27 and B.28, respectively). In addition we derived the point-wise selection gradient eq. (26) of the main text (here, eq. B.30) along with the dynamic eqs. (32) and (39) on the shadow value function λ*(t, x*(t)) (here, eqs. B.32 and B.33) for closed-loop controls and open-loop controls, respectively. With this we have derived the first-order condition of uninvadability for closed-loop and open-loop controls.
B.3 Shadow value dynamics and the state feedback
In this section we derive (34)–(33) of the main text, namely we show the components of the shadow value dynamics and it depends on higher order derivatives of v*(t, x*(t)). To that end, it will turn out to be useful to explicitly express the control in closed-loop form u(t) = d(t, x(t)), unless we are explicitly evaluated at singular path u* = d*(x*). Substituting the Hamiltonian (B.29) into eq. (B.15) yields where
We can express the last gradient (B.37) as where gives all the components of the feedback effect of state variables on trait expressions.
Lets now further explore the elements of a matrix (B.39). From eq. (17) it follows that ∂d(t, x(t))/∂x•(t) = ∂d(t, x(t))/∂x•(t) = 0. From eqs. (17) and (18) it also follows that
Hence, we can express all the non-zero derivatives in matrix (B.39) as effects of the different actors changing their state on the focal recipient trait expression. Recall the static characterisation (30) from the main text, which holds for interior solutions (when selection gradient (B.30) vanishes)
Thus, from eq. (B.41) it follows we can express all the derivations of closed-loop contols d in terms of derivations of function D, i.e.
Substituting eqs. (B.40) and (B.42) into eq. (B.39) yields where all the derivatives in the matrix are evaluated at x•(t) = x(t) = x*(t). We can observe from eq. (B.43) that that all the non-zero elements of matrix (B.43) depend on higher-order derivatives of v*(t, x*(t)) and hence eq. (B.34) is not and ODE for λ(t, x*(t)) = ∇v*(t, x*(t)).