The moulding of intra-specific diversity by selection under ecological inheritance

Organisms continuously modify their environment, often impacting the fitness of future conspecifics due to ecological inheritance. When this inheritance is biased towards kin, selection favours modifications that increase the fitness of downstream individuals. How such selection shapes trait diversity within populations, however, remains poorly understood. Using mathematical modelling, we investigate the coevolution of multiple traits in a group-structured population when these traits affect the group environment, which is then bequeathed to future generations. We examine when such coevolution favours polymorphism as well as the resulting associations among traits. We find in particular that two traits become associated when one trait affects the environment while the other influences the likelihood that future kin experience this environment. To illustrate this, we model the coevolution of (a) the attack rate on a local renewable resource, which deteriorates environmental conditions, with (b) dispersal between groups, which reduces the likelihood that kin suffers from such deterioration. We show this often leads to the emergence of two highly-differentiated morphs: one that readily disperses and depletes local resources; and another that maintains these resources and tends to remain philopatric. More broadly, we suggest that ecological inheritance can contribute to phenotypic diversity and lead to complex polymorphism.


Introduction
By consuming, polluting or engineering, most if not all organisms modify and transform the environment they live in. Via such modifications, an individual impacts its own fitness as well as the fitness of conspecifics who share its environment. These fitness effects can further extend to future generations when environmental modifications are transmitted to offspring under what is referred to as ecological inheritance (Laland et al., 1996;Odling-Smee et al., 2003;Bonduriansky, 2012;Bonduriansky and Day, 2020). Pseudomonas aeruginosa for example, release long-lasting iron-scavenging siderophores, thus benefiting close-by conspecifics in the short and long term (Ratledge and Dover, 2000;Imperi et al., 2009) including individuals that are not living yet (Kümmerli and Brown, 2010). Many plants continuously modify their substrate via plant-soil feedback (Ehrenfeld et al., 2005; e.g. by producing and absorbing tannin Kraus et al., 2003), which can affect downstream generations. Humans construct complex infrastructures, such as schools, hospitals, or whole cities, which can last and be enjoyed for centuries. Ecological inheritance of course can also be harmful, for instance when individuals over-consume a slowly renewable resource or release pollutants that are difficult to degrade.
Perhaps most notably, the current climate change induced by human activities will likely affect many generations to come (Collins et al., 2013).
For natural selection to shape the inter-generational ecological effects of a trait, the genes underlying this trait must be statistically associated to the environment they transform (Dawkins, 1982(Dawkins, , 2004Brodie, 2005;Lehmann, , 2008. This association entails that an environmental modification is more likely to be experienced by individuals in the future that carry the same genes as the individual who caused the initial modification. One simple and ubiquitous way for a gene-environment association to emerge is via spatial structure. When the population is subdivided and dispersal between subpopulations is limited, individuals in the same local environment are more likely to share the same genes than individuals sampled at random in the population (i.e. they are related; Hamilton, 1964;. This is true of individuals living at the same but also at different generations . As a result, the inter-generational ecological modification made by an individual preferentially affects the fitness of its future relatives when dispersal is limited (Lehmann, 2008).
How directional selection steers the gradual evolution of traits with inter-generational ecological effects under limited dispersal has been extensively studied (Lehmann, , 2008Sozou, 2009;Arnoldi et al., 2020;. One of the main insights from this theory is that populations in which dispersal is more limited are more likely to evolve traits that are costly to the individual but yield delayed ecological benefits, such as the preservation of a common good (Silver and Di Paolo, 2006;Lehmann, , 2008Sozou, 2009;Krakauer et al., 2009;Arnoldi et al., 2020;; for review: Estrela et al., 2019). While directional selection can explain diversity between isolated populations or species, it is not sufficient to investigate diversity within populations Dercole and Rinaldi, 2008). But intra-specific diversity in ecologically-relevant traits is common and has potentially significant environmental effects (Bolnick et al., 2011;Des Roches et al., 2018). How trait diversity within species is moulded by natural selection under ecological inheritance remains unclear. Agent-based simulations suggest that polymorphism in traits that have long lasting ecological effects can emerge in spatially-structured population (e.g. Silver and Di Paolo, 2006;Han et al., 2006;Behar et al., 2014;Joshi et al., 2020) but a more general theory to understand such emergence is still wanting.
In this paper, we extend current theory to understand when gradual evolution leads to polymorphism in traits that have long-lasting environmental effects. We investigate mathematically the coevolution of multiple traits in a group-structured population when these traits affect the group environment, which is then passed down to future generations. We use our model to investigate the type of between-traits within-individuals correlations that are favored by selection when polymorphism emerges. Our analyses reveal in particular that two traits tend to be correlated when one modifies the environment in a long-lasting manner while the other influences the likelihood that future relatives experience this environmental modification. To illustrate this, we model the coevolution of the attack rate on a local renewable resource, which deteriorates environmental conditions, with dispersal, which reduces the likelihood that relatives suffer from such deterioration. Beyond this specific example, we discuss the other pathways revealed by our model via which selection favours between-traits association under ecological inheritance, with potential implications for dispersal and behavioural syndromes, phenotypic plasticity, and niche construction. finally (iv) offspring compete locally in each patch for N spots. Generations are thus non-overlapping but as we detail next, individuals of different generations can interact with one another indirectly via the environment.
To allow for individuals to transform their environment in a way that can be passed onto future generations, we write the state t +1 of a patch at a generation t +1 as a function of the traits expressed by individuals in that patch at the previous generation t , as well as the previous state; specifically as t +1 = F (z 1 , z 2 , . . . , z N , ), where the vector z i denotes the phenotype expressed by individual i ∈ {1, . . . , N } living in the patch at generation t and is the state of the patch at generation t ( Fig. 1B for diagram). The environmental dynamics given by eq. (1) unfold even in the absence of genetic variation (i.e. when z i = z for all i ). We assume that these dynamics converge to a stable equilibrium, meaning that in a population monomorphic for z (i.e. where all individuals in the population express the same phenotype z), all patches are eventually characterised by the same environmental equilibrium, which we denote byˆ (z) (sometimes denotedˆ for short). This equilibrium satisfiesˆ = F (z, z, . . . , z,ˆ ), as well as the stability condition, Here and hereafter, all derivatives are estimated where all individuals express the same phenotype z and all patches are characterised by the associated ecological equilibriumˆ (z). So all derivatives should be seen as functions of the evolving traits z. The quantity ∂F /∂ gives the effect of a perturbation in the state of a patch at one generation on the state at the next generation. This derivative can thus be thought of as the effect of ecological inheritance: the greater the absolute value of ∂F /∂ is, the more consequential an environmental modification is to future generations.
The consequences of an environmental modification also depend on how the fitness of individuals varies with their local environment. To capture this in a general way, we write the fitness of an individual from a given patch (say individual i ∈ {1, . . . , N }), which is defined as its expected number of successful offspring over one full iteration of the life-cycle, as where z i = (z i 1 , . . . , z i n ) is the phenotype expressed by this individual i (with z i p the value expressed for trait p); z −i = (z 1 , . . . , z i −1 , z i +1 , . . . , z N ) collects the phenotypes expressed by its N − 1 patch-neighbours; and is the environmental state of its patch. This formulation allows for the fitness of an individual to depend on interactions between its own traits, the traits of its neighbours, and the environment (Fig. 1B).

Evolutionary dynamics
We are interested in the coevolution of the n traits, and particularly in whether such coevolution leads to polymorphism and highly differentiated types. To investigate this, we assume that traits evolve via the input of rare genetic mutations with weak phenotypic effects. Under these assumptions, evolutionary dynamics take place in two steps (Metz et al., 1995;Dercole and Rinaldi, 2008), which can be inferred from the invasion fitness (i.e. the geometric growth rate), W (ζ, z) of a rare allele coding for a deviant phenotype ζ = (ζ 1 , . . . , ζ n ) in a population otherwise monomorphic for z = (z 1 , . . . , z n ) (Tuljapurkar, 1989;Ferriere and Gatto, 1995;Tuljapurkar et al., 2003;Otto and Day, 2007).

Directional selection
First, the population evolves under directional selection whereby selected mutants rapidly sweep the population before a new mutation arises, so that the population can be thought of as "jumping" from one monomorphic state to another. Trait dynamics during this phase are characterised by the selection gradient vector, which points in the direction favoured by selection in phenotypic space (the space of all possible phenotypes, here R n or a subset thereof), i.e.
Specifically, selection favours an increase in trait p when s p (z) > 0, and conversely a decrease when s p (z) < 0.
The population may thus eventually converge to a singular phenotype, z * = (z * 1 , . . . , z * n ), which is such that each entry of the selection gradient is zero at z * , i.e. s(z * ) = 0.
For the population to converge to z * , it is sufficient that the so-called Jacobian matrix, with (p, q) entry (J(z * )) pq = ∂s p (z) is negative-definite (i.e. the symmetric real part of J(z * ), [J(z * ) + J(z * ) T ]/2, has only negative eigenvalues, . Such a phenotype is an attractor of evolutionary dynamics and typically referred to as (strongly) convergent stable ).

Disruptive, stabilising and correlational selection
Once the population expresses a convergence stable phenotype z * , selection is either: (i) stabilising, in which case any mutant different from z * is purged so that the population remains monomorphic for z * ; or (ii) disruptive, favoring alternative phenotypes leading to polymorphism (Metz et al., 1995;Dercole and Rinaldi, 2008). Whether selection is stabilising or disruptive when n traits coevolve depends on the so-called Hessian matrix , whose (p, q)-entry is given by On its diagonal, H(z * ) indicates whether selection is disruptive on each trait (Lande and Arnold, 1983). Specifically, when h pp (z * ) > 0, selection on trait p is disruptive when p evolves in isolation from the other traits (i.e. when all the other traits are fixed). Conversely, selection on trait p is stabilising when h pp (z * ) < 0. The off-diagonal elements of H(z * ), meanwhile, give the strength of correlational selection, h pq (z * ), on each pair of traits p and q (Lande and Arnold, 1983). These indicate the type of among-traits associations that selection favours within individuals: when h pq (z * ) > 0, selection favours a positive association (or correlation) among traits p and q, and a negative association when h pq (z * ) < 0. With n traits coevolving, selection in a population expressing a convergence stable phenotype z * is stabilising when the leading eigenvalue of H(z * ) is negative, and disruptive when the eigenvalue is positive .

Selection under ecological inheritance
The selection gradient (s(z), eq. 6) and Hessian (H(z * ), eq. 11) are defined in terms of the invasion fitness W (ζ, z) of a genetic mutant, which can be seen as a measure of fitness at the level of the gene that codes for a deviant phenotype. To reveal selection on the inter-generational effects that an individual expressing such a deviant phenotype has, one needs to shift from a gene-to an individual-centred perspective (i.e. express selection in terms of the individual fitness function eq. 4 rather than W (ζ, z), Lehmann, , 2008. This shift in perspective has been achieved for the selection gradient, thus characterising directional selection on inter-generational effects (Lehmann, , 2008. We summarize these previous findings in the context of our model in Box I. Our aim here is to characterise the Hessian matrix and thus correlational and disruptive selection in terms of individual fitness and inter-generational effects. Our approach, which is explained in Appendix A, is based on a second-order sensitivity analysis of invasion fitness (from eq. 11). Our mathematical derivations can be found in Appendix B. We summarise our results in the next section.

Correlational selection under ecological inheritance
We first show in Appendix B.1 that correlational selection on traits p and q (or disruptive selection on trait p when p = q) can be decomposed as the sum of two terms, which we detail in sections 3.1 and 3.2, respectively.

Intra-generational fitness effects
The first term of eq. (12), h g,pq (z), corresponds to correlational selection due to the intra-generational effects of traits on fitness (so ignoring inter-generational ecological effects on fitness and focusing on genetic effects on fitness only, hence the g in the subscript of h g,pq (z)). We show in Appendix B.2 that it can be expressed as which is equivalent to the coefficient of correlational selection derived in previous papers where traits have intra-generational effects only (eqs. 13a-b-c in Mullon et al. 2016, eqs. 7a-b-c in Mullon andLehmann 2019; see also Ajar, 2003;Wakano and Lehmann, 2014 for the case where p = q). We refer interested readers to these papers for a detailed interpretation of eq. (13), but briefly this equation can be read as the sum of three terms.
The first, ∂ 2 w i /(∂z i p ∂z i q ), is the effect of joint changes in traits p and q of the focal individual on its own fitness. This cross derivative quantifies the synergistic (or "multiplicative" or "interaction") effects of traits on fitness: when positive, it tells us that fitness increases more when both traits p and q change in a similar way (i.e. both increase or both decrease, so that p and q have complementary effects on fitness); conversely when negative, fitness increases more when both traits p and q change in opposite ways (i.e. one increases and the other decreases, so that p and q have antagonistic effects on fitness). In a well mixed population, correlational selection depends only on such "direct" synergy (i.e. synergistic effects of focal traits on focal fitness, so that Lande, 1979;Phillips and Arnold, 1989;. The rest of eq. (13) is due to limited dispersal and the interactions among contemporary relatives (i.e. living at the same generation) that result from such limitation. The remainder of the first of line of eq. (13) consists of what can be referred to as "indirect" synergistic effects (Mullon and Lehmann, 2019). These are effects of joint changes in traits p and q on focal fitness, where at least one of these changes occurs in a neighbour to the focal, weighted by relevant relatedness coefficients. Specifically, ∂ 2 w i /(∂z i p ∂z j q ) is the effect on focal fitness of joint changes in trait p in the focal and in trait q in a neighbour (indexed j ), and ∂ 2 w i /(∂z j p ∂z j q ), of changes in traits p and q in the same neighbour j . Both are weighted by r • 2 , which is the pairwise relatedness coefficient: the probability that two individuals randomly sampled in a patch under neutrality are identicalby-descent. Throughout, quantities with a superscript • are evaluated under neutrality, i.e. in a population that is monomorphic for z. Quantities with a superscript • may thus depend on z but we do not write such dependency to avoid notational clutter. The first line of eq. (13) also features ∂ 2 w i /(∂z j p ∂z hq ), which is the effect on focal fitness of joint changes in different neighbours (indexed j and h with j = h). This is weighted by r • 3 , which is the coefficient of threeway relatedness, i.e. the probability that three individuals randomly sampled in a patch under neutrality are identical-by-descent. Finally, the second line of eq. (13) consists of the product between the indirect fitness effect of one trait (∂w i /∂z j p ), and the effect of the other on pairwise relatedness (∂r 2 /∂z q ), which quantifies the effect of a trait change on the probability that a rare mutant individual expressing this change interacts with another mutant in the same patch (eq. B-17 for formal definition). This reveals in particular that selection favours an association among two traits when one trait improves the fitness of neighbours, and the other trait increases the probability that these neighbours are relatives Mullon and Lehmann, 2019).

Inter-generational effects: three pathways for correlational selection via ecological inheritance
The second term of eq. (12), h e,pq (z), is correlational selection due to the inter-generational ecological effects of traits on fitness (hence the e of h e,pq (z)) and thus constitutes the more novel part of our results. We find that this coefficient can be decomposed as three terms, corresponding to three pathways through which correlational selection can act owing to ecological inheritance (eq. B-20 in Appendix for decomposition).

Environmentally-mediated synergy
The first pathway, can be thought of as the synergistic effects of traits on fitness via the environment (hence the e×e subscript).
Each of the two terms of eq. (15) reveals one way such synergy can come about. The first, labelled "synergy on ", is the most intuitive. It consists of the product between: (i) ∂w i /∂ , which is the fitness effect of an environmental change; and (ii) E • ∂ 2 /(∂z p ∂z q ) , which is the expected effect of a change in both traits p and q in all the local ancestors of a focal individual on the environment experienced by this focal, where expectation is taken over the neutral distribution of local genealogies of the focal (i.e. the distribution of the number of local ancestors of the focal at each past generation when the population is monomorphic, hence the superscript ). This expectation quantifies the inter-generational environmental modifications made by a lineage of individuals that express a joint change in traits p and q. The first term of eq. (15) says that selection will associate two traits p and q when these have synergistic effects on the environment (according to E • ∂ 2 /(∂z p ∂z q ) ) that in turn affects fitness (i.e. ∂w i /∂ = 0). As an example, consider a scenario where is the amount of a common good that can be transferred between generations (e.g. pyoverdine in siderophore-producing bacteria), so that fitness increases with such an amount (∂w i /∂ > 0).
Let trait p be the production of this common good and q its protection against degradation or expropriation (so that depends on the product between both traits). Traits p and q would have complementary effects on in this example (E • ∂ 2 /(∂z p ∂z q ) > 0). The first term of eq. (15) tells us that in this case, selection favours a positive association between both traits, i.e. that individuals who tend to participate more in the common good also tend to protect it more.
The second term of eq. (15), labelled "non-linear fitness effects of ", indicates that correlational selection can also associates traits that influence the environment independently of one another, according to E • ∂ /∂z p × ∂ /∂z q , which is the expectation of the product between the effect of a change in trait p in all the local ancestors of a focal individual on the environment experienced by this focal, and such an effect of a change in trait q. These independent ecological effect lead to correlational selection when the environment has non-linear effects on fitness (so ∂ 2 w i /∂ 2 = 0). In the common good example introduced in the previous paragraph for instance, selection for a positive association among production and protection would be strengthened where fitness accelerates with the amount of common good (∂ 2 w i /∂ 2 > 0) and weakened where it decelerates (∂ 2 w i /∂ 2 < 0). This is because such non-linearity in fitness creates synergy among traits via their ecological effects.
Correlational selection thus emerges when traits have synergistic effects on the environment, or when traits influence the environment which in turn has non-linear effects on fitness. The strengths of these two effects depend on the inheritance of ecological effects from local ancestors, as quantified in eq. (15) by E • ∂ 2 /(∂z p ∂z q ) and E • ∂ /∂z p × ∂ /∂z q , respectively. We expand those in terms of the local genealogy of a focal individual in Appendix B.3.1. We show in particular that,

Genes-environment interactions
The second pathway through which correlational selection can act owing to ecological inheritance (second term of eq. 14) is given by This pathway emerges when fitness is influenced by the interaction between the environment and traits (or the genes coding for these traits, hence the g×e subscript). Specifically, the first term of eq. (17), labelled "direct g × e interactions", consists of the interaction between the environment and the expression of one trait by the focal (∂ 2 w i /(∂z i p ∂ )), multiplied with to the inter-generational ecological effects of the other trait (E • ∂ /∂z q , which is the expected effect of a change in trait q in all the local ancestors of a focal individual on the environment experienced by this focal and is given by eq. I.B in Box I). To understand the implications of this, consider again a scenario where is some inter-generational common good and one trait, say q, is the production or maintenance of this common good (so that E • ∂ /∂z q > 0). The other trait p, however, now is some costly competitive trait whose cost depends on environmental conditions (e.g. horn length in beetles; Emlen, 1994), so that individuals living in better patches (i.e. with greater ) pay a lower expression cost (leading to ∂ 2 w i /(∂z i p ∂ )) > 0). The first term of eq. (17) in this example would be positive, indicating that it favours a positive association between traits p and q, i.e. individuals who participate more to the common good also express larger competitive traits. This is because individuals who contribute more to the common good also tend to live in better habitats (owing to limited dispersal and past relatives contributing to the environment, see eq. I.B). They can thus also afford to express larger competitive traits.
The second term of eq. (17), labelled "indirect g × e interactions", consists of the interaction between the environment and the expression of one trait by a neighbour of the focal (∂ 2 w i /(∂z j p ∂ )), multiplied to , which is the expected product between: (i) the ecological effects of the other trait (∂ /∂z q ); and (ii) the frequency of relatives among the neighbours of the focal, R, which here should be seen as a ran- According to eq. (17), this favours a negative correlation between investing into current members through helping (via p) and investing into future patch members through patch maintenance (via q), i.e. individuals that invest more into future relatives invest less in present relatives and vice-versa.
The relevance of indirect gene-environment interactions for correlational selection depends on E • [R ×∂ /∂z q ], which we show is given by with r • 3,h is the probability that two individuals living in the same generation, plus a third individual h generations ago, all randomly sampled from the same patch are identical-by-descent (eq. B-41 for definition; Appendix B.3.2 for derivation eq. 18). This probability indicates the likelihood for an individual to influence the environment of at least two downstream relatives living h generations away in the same patch and thus interacting socially with one another. Accordingly, the greater r • 3,h , the more influence a modification to the patch environment can have on social interactions, and thus the more relevant indirect gene-environment effects are to selection (Supplementary Fig. S2 for a graphical interpretation of eq. 18).

Biased ecological inheritance
The third and final pathway for correlational selection can target (h r×e,pq (z) in eq. 14) can be expressed as where E (1) q ∂ /∂z p is the expected effect of a change in trait p in all the local ancestors of a focal individual on the environment experienced by this focal, where expectation is taken over the perturbation of the distribution of local genealogies owing to a change in trait q (hence the superscript (1) and subscript q in , which is expectation over the neutral distribution; eq. B-6 for a formal definition of E (1) p [.]).
More intuitively perhaps, E (1) q ∂ /∂z p quantifies how trait q influences the way an environmental modification driven by a change in trait p is inherited. This can be seen more explicitly when we unroll E (1) q ∂ /∂z p over its inter-generational effects: (Appendix B.3.3 for derivation). Here, ∂r 2,h /∂z q is the effect of a change in trait q on the relatedness between individuals living in the same patch separated by h generations. To understand this effect better, consider a focal individual who expresses a change in trait q relative to a resident. This ∂r 2,h /∂z q quantifies how such a trait change influences the probability that an individual randomly sampled in that same patch h generations ago is identical-by-descent to the focal, relative to the probability that two resident individuals separated by h generations are identical-by-descent under neutrality (i.e. relative tor • 2,h , Appendix. D.2 for details). So when for instance ∂r 2,h /∂z q > 0, individuals that express greater values of trait q are more likely to transmit environmental modifications to their kin. What eq. (20) substituted into (19) in turn says, is that correlational selection will associate this trait q with another trait p when trait p leads to inter-generational environmental modifications that increase fitness (i.e. so that ∂F /∂z i p > 0 in eq. 20 and ∂w i /∂ > 0 in eq. 19; Fig. S3 for diagram). We explore the potential implications of such correlational selection in section 4 with a specific example.

Summary
To summarize our findings so far, we have identified three main pathways via which traits can be linked by correlational selection under ecological inheritance (eq. 14). Each of these pathways can be expressed in terms of how a trait change in an individual causes an environmental modification that in turn influences the fitness of future relatives (eq. 16, eq. 18, eq. 20, eq. I.B in Box I, and eq. B-36 in Appendix B.3.1). This perspective not only offers a clear view on correlational selection on inter-generational effects, it also allows us to efficiently compute the Hessian matrix and thus investigate the conditions that lead to polymorphism in traits that influence the environment in the long term. In fact, what remains to be characterized for such computation are the various relevant relatedness coefficients (e.g.r • 3,h,h ) or their perturbation due to selection (e.g. ∂r 2,h /∂z q ).

Joint evolution of dispersal with the attack rate on a local renewable resource
We now go over a specific model that looks at the joint evolution of two ecologically-relevant traits: (i) the rate of attack or consumption of a resource within patches; and (ii) dispersal between patches. The evolution of both traits have been studied in isolation in multiple studies (for dispersal: Hamilton and May (1977); Taylor and Frank (1996); Gandon and Michalakis (1999); Gandon and Rousset (1999); Ajar (2003); for resource ex-

A resource-consumer model in a patch-structured population
We first specify a resource-consumer scenario and lay the building blocks of our analysis.
Traits. Each individual is characterised by two traits: (i) the rate z 1 of attack on a local resource in a patch (during step (i) of the life-cycle; see section 2.1); and (ii) the probability z 2 of juvenile dispersal, which we assume is costly with a probability c d of dying during dispersal (step (iii) of the life-cycle; see section 2.1).
Environment. The environmental state, t , of a patch at generation t is the density of the resource in that patch before consumption. From generation t to t + 1, this density in a patch where the average attack rate is z 1 changes according to the map (eq. 1), t +1 = F (z 1 , z 2 , . . . , z N , ) = e r (e r − 1) + e τ 1 Nz 1 , where the parameter r > 0 is the rate of renewal of the resource per generation,z 1 = N j =1 z j 1 /N is the average attack rate in the patch, and τ 1 > 0 modulates the effect of the attack rate on resource density (the greater τ 1 is, the more the resource density decreases due to consumption by individuals in the patch, where the total attack rate is given by Nz 1 ). Eq. (21) emerges from an explicit model of resource dynamics that happen within generations (Box II). Solving for equilibriumˆ = F (z, z, . . . , z,ˆ ) (eq. 2) and checking its stability (using eq. 3), we find that in a population of consumers monomorphic for z = (z 1 , z 2 ), the resource density stabilises for as long as the renewal rate r is large enough compared to consumption (specifically when r > N z 1 τ 1 , Appendix E.1 for details). Otherwise, the resource goes extinct. We focus our attention on the case where the resource is maintained (so where r > N z 1 τ 1 ).
Fitness. An individual uses the resources it has collected to produce offspring, favoring increased attack rate.
Increasing one's attack rate may however also be costly, for instance due to lost opportunities or increased risk.
To reflect these benefits and costs, we assume that the fecundity of a focal individual i with attack rate z i 1 in a patch where its neighbours express ratesz −i ,1 = (z 11 , z 21 , . . . , z (i −1)1 , z (i +1)1 , . . . , z N 1 ) and the resource density is , is given by This equation consists of the product between the amount of resources consumed by the focal individual (Box II, eq. II.D for how this amount emerges from a resource-consumer dynamical model) and the individual cost of consumption. From eq. (23), fecundity in a population monomorphic for z is where the equilibrium amount of resourcesˆ is given by eq. (22). Under the island model of dispersal (e.g. eq.
6.5 in , the fitness of a focal individual can then be written as where the first term, w p,i , is the philopatric component of fitness, i.e. the expected number of offspring that establish in their natal patch (consisting of the ratio of offspring of the focal that remain in their natal patch to the total number of offspring that enter competition in that patch); and the second, w d,i , is the dispersal component, i.e. the expected number of offspring that establish in non-natal patches (consisting of the ratio of offspring of the focal that leave their natal patch to the total number of offspring that enter competition in another patch).
Relatedness. In addition to the resource map (eq. 21) and fitness (eq. 25), the analysis described in section 3 relies on several relatedness coefficients under neutrality and selection (Table 1). Those in turn depend on the backward probability of dispersal, m (i.e. the probability that in a monomorphic population, a randomly sampled individual is an immigrant). In this model where dispersal z 2 is evolving, such probability is given by which consists of the ratio of the number of individuals dispersing into a patch to the total number of individuals that compete for breeding spots.
Equations (21)-(26) (together with Table 1) are all the ingredients necessary to perform the analysis of disruptive selection laid out in section 3 (and Box I for directional selection). Details can be found in Appendix E whose main results we summarize below, paying special attention on whether polymorphism emerges.

Directional selection: convergence to intermediate dispersal and attack traits
Substituting eqs. (21)-(25) into the selection gradient eq. (I.A) and analysing this gradient according to the approach described in Box I, we find that the population first converges to a singular strategy z * = (z * 1 , z * 2 ) for both traits. In line with previous results (e.g., Hamilton and May, 1977;Ajar, 2003), the singular value z * 2 for dispersal reads as, which decreases with the cost c d of dispersal and with the number N of individuals per patch ( Fig. 2.A.i, Appendix E.2.2). The singular value z * 1 for the attack rate, meanwhile, satisfies the following equality, where λ = 1 − m(z * ) is the probability that an individual is philopatric in a population monomorphic for z * , which thus depends on the evolved dispersal strategy z * 2 (found by substituting eq. 27 into eq. 26). Solving eq. (28) for z * 1 numerically, we find that when dispersal is costly (i.e. c d is large) the population evolves lower attack rate (z * 1 is small, Fig. 2.A.ii). This is because when c d is large, dispersal evolves to be limited (eq. 27). As a result, inter-generational relatedness becomes large (i.e.r • 2,h in eq. (I.B) becomes large), which in turn favors the evolution of restraint so that individuals leave more resources to their downstream relatives (Lehmann, 2008). Conversely, when dispersal cost c d is low, dispersal evolves to be large, which causes inter-generational relatedness to drop, and thus the evolution of high attack rates as consumption no longer affects relatives.
Another relevant parameter to the evolution of the attack rate is the rate r of resource renewal. In particular, high renewal rate r leads to more exploitative strategies (i.e. greater z * 1 , Fig. 2.A.ii). This is because when r is large, ecological inheritance is weak as the resource renews itself quickly after consumption. Individuals can thus consume more resources while incurring little cost to their descendants and the descendants of their relatives (i.e. ∂F /∂ is small in eq. I.B).
Plugging the singular strategy for attack rate z * 1 (given by eqs. 27-28) into the equilibrium resource density (eq. 22) allows us to understand the effect of evolutionary dynamics on the resource. As expected from the previous paragraph, high dispersal cost c d leads to higher resource density as consumption evolves to be more restrained ( Fig. 2.A.iii). Conversely, low dispersal cost reduces resource density at evolutionary equilibrium. If in addition to low dispersal cost, the rate r of renewal is also low, then the resource may in fact go extinct as it is unable to renew itself fast enough in the face of increased consumption ( Fig. 2.A.iii).

Disruptive and correlational selection: the emergence of dispersive overconsumers and sessile scrimpers
To determine whether the population becomes polymorphic once it has converged to the singular phenotype z * = (z * 1 , z * 2 ) (given by eqs. 27-28), we substitute eqs. (21)-(25) into eqs. (12)-(19) and perform the analysis described in section 2.2.2 (Appendix E for details). Our analysis indicates that when either trait evolves in isolation from the other, selection is always stabilising. In other words, when dispersal evolves alone or the attack rate evolves alone, the population remains monomorphic. But when both traits coevolve, selection is disruptive when the dispersal cost c d and resource renewal rate r are intermediate (gray region in Fig. 2.B). In contrast, when c d and r are high, selection is stabilising (white region in Fig. 2.B). Eco-evolutionary dynamics in our model can thus lead to three outcomes depending on the dispersal cost c d and resource renewal rate r : (i) when c d and r are both low, consumer evolution leads to resource extinction (so that if the consumer relies entirely on this resource, it would also go extinct); (ii) when c d and r are high, the consumer remains monomorphic for dispersal and attack rate such that the resource is maintained; (iii) when c d and r are intermediate, the consumer becomes polymorphic for both traits.
A closer look at correlational selection on dispersal and attack rate, h 12 (z * ), reveals two things about the nature of the polymorphism. The first is that since h 12 (z * ) > 0 is always positive (Appendix E.2.3), the polymorphism should be characterised by a positive association between the two traits. We thus expect two types to emerge: (i) one that consumes and disperses more ("dispersive overconsumers"); and (ii) another that consumes and disperses less ("sessile scrimpers"). The second relevant aspect of this polymorphism that our analysis shows is that the term that mainly contributes to correlational selection is the one capturing biased ecological inheritance, h r×e,12 (z * ) (eq. 19-20, Appendix E.2.3 for details). More specifically, it is the combination of negative effects of dispersal on inter-generational relatedness, ∂r 2,h /∂z q < 0, and of consumption on the environment, ∂F /∂z i 1 < 0, that leads correlational selection to be positive (owing to eq. 20). This indicates that polymorphism in our model is due to a positive association between dispersal and attack rate, leading scrimpers to preferentially inherit the patch they maintain from relatives, and overconsumers to preferentially inherit the patch they deplete from non-relatives.

The rise and fall of overconsumption
To check our mathematical analyses, we ran individual-based simulations under conditions that lead to stabilising and disruptive selection (Appendix E.4 for simulation procedure). As predicted, the population gradually converges to the singular strategy for dispersal and consumption in both cases (Fig. 3AB). Concomitantly, the resource density goes to its equilibriumˆ given by eq. (22) (Fig. 3CD). Where selection is stabilising, the population remains monomorphic for both traits (i.e. unimodally distributed around this strategy, Fig. 3.A), and the resource density within patches remains distributed around the ecological equilibriumˆ ( Fig. 3.C). In contrast, two morphs that correspond to dispersive overconsumers and sessile scrimpers emerge and become increasingly differentiated where selection is disruptive ( Fig. 3.B). In this case, the distribution of resource densities becomes bimodal so that the population consists of patches of either low (i.e. with small ) or high quality (i.e. with larger , Fig. 3.D). This is due to variation in morph composition among patches such that patches with a greater frequency of overconsumers are typically of low quality (Fig. S4).
When two morphs coexist, our simulations reveal eco-evolutionary cycles whereby the population alternates between generations during which scrimpers are common and resources are plentiful, and generations during which overconsumers are more abundant and resources are scarce ( Fig. 4.AB). With evolution favouring increasingly differentiated morphs, the amplitude of these cycles increases as in particular, overconsumers have an increasingly detrimental effect on their patch (Fig. 4AB). In fact, there comes a time when overconsumers are so rare in periods of low-abundance that they stochastically go extinct (i.e. they become so rare that by chance, none reproduce). The population is then monomorphic for the scrimper morph, which in the absence of the overconsumer morph is counterselected. The population thus converges once again to the singular strategy (given by eqs. 27-28), whereupon polymorphism emerges and collapses again and again ( Fig. 4.C).

Discussion
Here, we have extended current theory on the gradual evolution of traits under ecological inheritance to understand how phenotypic variation within populations is moulded by disruptive and correlational selection.
Our analyses indicate that ecological inheritance opens three pathways for correlational selection to shape polymorphism and create associations among traits (eq. 14).
The first of these pathways associates traits that have synergistic effects on fitness via the environment (h e×e,pq (z), eq. 15). This is relevant to situations where several traits jointly contribute to the local environment which can be inherited by future generations of relatives. In a naked mole-rat colony for instance, a well-maintained burrow rests on multiple tasks, such as gnawing at the tunnel walls, digging, sweeping substrate, bringing in material to build the nest. According to our model, the tendencies to perform these different tasks may become linked within individuals in two cases: (i) when traits have multiplicative effects on the environment (first term of eq. 15); or (ii) when the environment has non-linear effects on fitness (second term of eq. 15). In the absence of interference among characters, we may expect different traits with beneficial consequences for the environment, such as those contributing to a burrow, to have positive multiplicative effects on the environment. In this case, correlational selection favours a positive association between traits and within individuals (under case (i) above). Interestingly, such a positive association has been reported in the naked mole-rat among nest building and burrowing (Siegmann et al., 2021), so that rather than specialising in either of these two different tasks, individuals contribute either more or less to both within colonies. In fact, task specialization and labour division appears to be absent from many animal societies (controlling for sex-or condition-specific effects, Kitchen and Packer, 1999;e.g . This would happen for instance when fitness saturates or plateaus with environmental quality, which is a natural assumption for most models (Foster, 2004 for a review on the diminishing returns of helping).
The two other pathways for correlational selection, h r×e,pq (z) and h g×e,pq (z), favour the association between traits that lastingly modify the environment with traits that belong to two broad classes, respectively. One class consists of characters that influence the likelihood that environmental modifications are experienced by downstream relatives (h r×e,pq (z), eq. 19). This should be especially relevant for traits that underlie gene flow as gene flow is the main driver of relatedness. We showed for instance in section 4 that selection readily links dispersal with the attack rate on a local resource, leading to the coexistence of two morphs: a dispersive morph that consumes more and depletes the local resource, and a sessile morph that consumes less and maintains the resource. This positive association between dispersal and attack rate allows overconsumers to preferentially bequeath the patch they deplete to non-relatives, and more frugal individuals to preferentially leave the patch they maintain to relatives. Other than the attack rate, variation in handling time or feeding efficiency can also affect resource density (Holling, 1959;Rueffler et al., 2006) and may thus also become linked with dispersal.
The expectation from our model is that individuals that have a more negative impact on resource density, such as with shorter handling time or greater feeding efficiency, evolve to disperse more readily.
In addition to resources, individuals can modify many other environmental factors that are relevant to fitness (Estrela et al., 2019 for review). For example, Drosophila larvae metabolically release nitrogenous waste thereby deteriorating the rearing environment of future larvae (Borash et al., 1998). Microbes can modify the pH of their environment which feeds back on their growth and survival (Ratzke and Gore, 2018 : Cote et al., 2010a;Spiegel et al., 2017), are ecologically and evolutionarily significant as they influence the demographic and genetic consequences of movement (Ronce and Clobert, 2012;Edelaar and Bolnick, 2012;Raffard et al., 2021). In the model presented in section 4 for instance, the association between dispersal and attack rate on a resource led to complex meta-population dynamics, with cycles occurring both on ecological and evolutionary timescales ( Fig. 4).
The second class of traits that selection associates with characters that modify the environment consists of traits whose effects on fitness depend on that environment, i.e. due to "genes × environment" interactions (h g×e,pq (z), eq. 17). Such context-or environment-dependent effects are not uncommon. Traits that are useful during competitive interactions, like conspicuous traits to attract mates (Mappes et al., 1996;Woods Jr et al., 2007;Dougherty, 2021) or fighting appendages like antlers or horns (Emlen, 2008;Miller, 2013), are costly to produce but expression costs likely depend on the environment, at least partly. Indeed, individuals that grow in better conditions or are better provisioned often show more extravagant traits without suffering a greater cost of expression (Mappes et al., 1996;Vehrencamp et al., 1989). The suggestion from our analysis is that context-dependent traits of the sort should become linked to characters that improve the environment when this environment is bequeathed to relatives. This is because such combination of linkage and ecological inheritance allows genes that are good in certain environments to be expressed more often in those environ-ments. This reasoning extends to traits that have indirect context-dependent fitness effects ("indirect g × e interactions" term in eq. 17), for instance favouring the association of helping with traits that deteriorate the environment when the fitness effects of helping increase as the environment decreases in quality (as observed e.g. for cooperative breeding in birds Emlen, 1982). The above considerations should be especially relevant to plastic phenotypes through reaction norms (West-Eberhard, 1989;Stearns, 1989), where one of the evolving trait is the response to the environment and another is modification to this environment. Our model can in fact readily be used to investigate how correlational selection associates environmental response with environmental modification within individuals, and thus help understand the maintenance of variation in plasticity and reaction norms (Pigliucci, 2005).
The genes × environment interactions of our model can also be connected to the so-called process of niche construction, which is "the process whereby organisms actively modify their own and each other's evolutionary niches" (Laland et al., 2016). This definition is made more explicit by considering the formal models developed by the authors that use it. The typical set-up is a population genetics model with two loci, E and A, at each of which two alleles segregate (Laland et al., 1996(Laland et al., , 1999Odling-Smee et al., 2003;Silver and Di Paolo, 2006). The current and past allele frequency in the population at locus E determines an environmental variable, which in turn determines whether carrying allele a or A at locus A is more beneficial. Such fitness epistasis via the environment is precisely captured by the direct genes × environment interactions in eq. (17) (specifically by the first term where trait p is allelic frequency within individuals at locus A and trait q at locus E). Our approach thus encompasses these models. But whereas population genetics models focus on short term evolution through changes in frequency of alleles with potentially large effects, the approach we take here investigates phenotypic evolution in the long term under the constant input of mutation with weak effects. One of our main contributions to this approach is having provided to a way to determine whether gradual evolution in a dispersal-limited population leads to polymorphism in "niche construction" traits owing to disruptive selection and frequency-dependent interactions. The polymorphism here contrasts in two ways with the genetic polymorphism reported in the population genetics models of niche construction (Laland et al., 1996(Laland et al., , 1999Odling-Smee et al., 2003). First, the genetic polymorphism in these previous models is due to specific assumptions of fitness that create overdominance rather than because of disruptive selection on traits like in our model. Second, while we allow for limited dispersal and local environmental effects, these population genetics models assume that the population is well-mixed and that the same environment is experienced by all individuals in the population (though see Silver and Di Paolo, 2006 for simulations). This entails that a trait cannot be statistically associated to its environmental effect and as a result, there cannot be any selection on a trait's inter-generational effects (to see this, one can put all relatedness coefficients and their perturbations to zero in our equations; Dawkins, 1982Dawkins, , 2004Brodie, 2005;Lehmann, , 2008. The inter-generational environmental effects of traits that evolve in those population genetics models are a thus a complete by-product of evolution rather than an adaptation. Like all formal models, ours relies on many assumptions that are violated in nature. One is that individuals haploid and reproduce asexually. Provided genes have additive effects on traits, diploidy and sexual reproduction do not influence evolutionary dynamics under directional selection (Geritz and Kisdi, 2000;. The emergence of polymorphism due to correlational selection may however depend on the genetic architecture of traits. Where different traits are encoded by separate loci, meiotic recombination breaks the positive genetic linkage favoured by correlational selection. But if the genetic architecture is allowed to evolve (through e.g. recombination modifiers or pleiotropic loci), then correlational selection favours an architecture that allows associations among traits to be heritable, which in turn leads to polymorphism . Another useful simplifying assumption we have made is that individuals disperse uniformly among patches so that there is no isolation-by-distance. While isolation-by-distance does not lead to fundamental changes in how selection shapes traits with lasting ecological effects (as shown by analyses of directional selection, Lehmann, 2008), it introduces interesting effects whereby selection depends not only on temporal but also on spatial environmental effects of traits. Finally, we have assumed that patches are of constant size and that traits influence a single environmental variable. To capture a greater variety of ecological effects, it would be interesting, albeit challenging, to allow for demographic fluctuations owing to trait evolution and multiple environmental variables (extending e.g. Rousset and Ronce, 2004;, to consider disruptive selection).
To sum up, we have investigated the coevolution of multiple traits in a group-structured population when these traits affect the group environment, which is then bequeathed to future generations. We found that such bequeathal provides ground for different types of traits to become linked by selection, with implications for a wide range of traits involved in niche construction, division of labor, dispersal syndromes, conditiondependence and phenotypic plasticity. Our results broadly suggest that ecological inheritance can contribute to phenotypic diversity and potentially lead to complex polymorphism involving multiple traits with longlasting effects on the environment. Yurtsev, E. A., A. Conwill, and J. Gore, 2016. Oscillatory dynamics in a bacterial cross-protection mutualism.
Proceedings of the National Academy of Sciences 113:6236-6241. i.e. the probability that an individual randomly sampled in a patch under neutrality is an immigrant. This probability may be a fixed parameter or evolve, in which case m depends on the resident trait z (as e.g. when traits that influence gene flow evolve, as in our section 4). The quantity w p,i is the philopatric component of individual fitness, i.e. the expected number of offspring of individual i that remain in their natal patch, which depends on the same parameter as total individual fitness eq. (4) (eq. 25 for an example). † Relevant equation in Appendix where derivation can be found.

Box I: Directional selection under ecological inheritance
The selection gradient on a trait p with inter-generational effects (given by eq. 1) can be expressed in terms of individual fitness (eq. 4) as (eqs. 13-15 in , eq. 2 in Lehmann, 2007, eq. 4 in Lehmann, 2008 in Sozou, 2009 -appendix A here for re-derivation). Eq. (I.A) consists of three weighted fitness effects.
(i) ∂w i /∂z i p is the effect of a change in trait p in the focal individual i on its own fitness (direct fitness effect). (ii) ∂w i /∂z j p is the effect of a change in trait p in a patch-neighbour (individual j = i ) on focal fitness (indirect fitness effect). This is weighted by the number (N − 1) of such neighbours, and the pairwise relatedness coefficient under neutrality, r • 2 , which is the probability that a randomly sampled neighbour is identical-by-descent to a focal when the population is monomorphic for z (see eq. A-20 in Appendix A for formal definition). The first two summands of eq. (I.A) thus correspond to the standard selection gradient in group-structured population (which can be read as Hamilton's rule: is that selection favours a trait change when this change in a local lineage of individuals perturbs the local environment in a way such that on average, it increases the fitness of its members (e.g. produce some long lasting common good that increases the fitness of downstream relatives,   , where ∂F /∂z i p is the effect of a change in trait p in one individual (say i ) on its local environment over one generation (from eq. 1), andr • 2,h is the probability that an individual randomly sampled h ≥ 1 generations ago in the same patch to a focal individual is identical-by-descent to this focal (eq. A-28 in Appendix A for formal definition). Eq. (I.B) can be understood as follows (Fig. 1C for diagram). Consider the patch of the focal h generations ago. In this past generation, the focal individual has on average Nr • 2,h ancestors in the patch. Each of these ancestors perturbs the environment by ∂F /∂z i p (solid green arrow in Fig. 1C) due to a change in trait p. In turn, each perturbation persists through time via ecological inheritance, carried over each generation by a factor ∂F /∂ . Thus a perturbation initiated h generations ago has decayed by a factor (∂F /∂ ) h−1 by the time it reaches the focal (dashed green arrow in Fig. 1C). Summing such effects from all past ancestors (so from h = 1 to ∞) obtains eq. (I.B).

Box II: Intra-and inter-generational resource dynamics
Here, we derive the inter-generational dynamics of resource density (eq. 21) from a model of consumerresource dynamics that occur in continuous time within generations. To track these dynamics, we denote intra-generational time by τ which runs from 0 to T , where T is the length in continuous time of a generation (i.e. a whole iteration of the life-cycle). We let t ,τ be the resource density at time τ within generation t in a focal patch. With this notation, the density t of the resource in that patch before consumption at generation t is given by t = t ,0 = t −1,T . To obtain t +1 (eq. 21), we thus track t ,τ from τ = 0 to T . We assume that resource dynamics during that time period are decomposed into two phases: a consumption phase (for 0 ≤ τ ≤ τ 1 ) followed by a renewal phase (for τ 1 < τ ≤ T ) so that τ 1 is the amount of time the resource is being consumed (and T − τ 1 the amount of time it renews itself).
(1) Consumption. First, each individual i in the patch consume the resource at a rate give by their trait z i 1 . Specifically, the rate of change in the amount ρ i ,τ of resources collected by individual i = 1, . . . , N while the rate of change in resource density in the patch is Solving eqs. (II.A) and (II.B) with initial conditions ρ i ,0 = 0 and t ,0 = t , we obtain that by the end of consumption, the resource density in the patch is t ,τ 1 = t e −Nz 1 τ 1 , (II.C) and that the amount of resources consumed by a focal individual i is, which is the first term of eq. (23) (with t = ).
(2) Renewal. After consumption, we assume the resource renews itself growing logistically, according to where r 0 is the per capita growth rate of the resource when at low abundance, and k the carrying capacity of a patch for the resource, which we set to k = 1 for simplicity. Solving eq. (II.E) for τ 1 < τ ≤ T with initial condition given by eq. (II.C), we obtain that by the end of generation t (so at the beginning of generation of t + 1), the resource density is (II.F) Letting r = r 0 (T − τ 1 ) be the rate of renewal per-generation, eq. (II.F) becomes eq. (21), as required.

A.i)
A.ii) B. (1) (2) (3) (4) C. (2) as overconsumers become more frequent, resource density falls; (3) when the resource is scarce, overconsumers are counter-selected; (4) once overconsumers are rare, resource density increases. C. Long-term evolutionary cycles. As the morphs becomes increasingly differentiated due to disruptive selection, the over-consumer morph becomes increasingly rare, leading to its stochastic extinction, until it re-appears through mutation whereupon the evolutionary cycles starts again. Figure S1: Diagram for synergy due to non-linear fitness effects of the environment, eq. (16) substituted into eq. (15) (second term). To be read similarly to Fig. 1C of main text.  As expected, this shows that patches with more over-consumers tend to carry fewer resources. In black: expected equilibrium in resource densityˆ as a function of mean attack ratez 1 (from eq. 22). For these parameter values, the resource goes to extinction whenz 1 > 0.1 (i.e.ˆ = 0 whenz 1 > 0.1).

Appendix for "The moulding of intra-specific diversity by selection under ecological inheritance"
Iris Prigent 1 and Charles Mullon 1,*

A Invasion analysis and directional selection gradient
Our investigations are based on an invasion analysis of rare mutants. To infer on invasion, we use the basic reproductive number as a proxy of invasion fitness (i.e. a quantity that is sign equivalent around one to invasion fitness so that it equally tells about invasion). We detail this basic reproductive number below and then use it to re-derive the selection gradient (eq. I.A in Box I) that has been described in previous papers (Lehmann, , 2008. For ease of presentation, we consider the case where the phenotype consists of a single trait. The extension to multiple traits is straightforward, as we later show.

A.1 Basic reproductive number
We denote the basic reproductive number of a rare mutant coding for trait ζ ∈ R in a resident population monomorphic for z ∈ R by R 0 (ζ, z). In the island model of dispersal, the basic reproductive number R 0 (ζ, z) is given by the expected number of successful offspring produced by an individual that is randomly sampled from a local lineage of rare mutants with trait ζ in a resident population with trait z (where a local lineage consists of all the individuals carrying the mutant who reside in a focal patch; Lehmann et al., 2016). We will sometimes refer to such randomly sampled individual from a local lineage of rare mutants as a "representative" mutant for short.
Suppose the mutation appeared as a single copy at generation t = 0 in a focal patch. For our model (see section "life-cycle" in the main text), we need to take into account that the fitness of an individual at an arbitrary generation t > 0 depends on the environmental state of the patch and that in turn, this state depends on the genetic history of the patch (i.e. the sequence of mutants and residents there have been in the patch since generation t = 0). Because the population consists of patches carrying a finite number of individuals, the genetic history of a focal patch is stochastic due to local sampling effects. To capture this history and its stochastic nature, we define M t ∈ {1, ..., N } as a random variable for the number of mutants at generation t in the focal patch. With the mutation arising at generation t = 0 as a single copy, we have M 0 = 1. We write H t = {M i } 0≤i ≤t ∈ H t for a random collection of the number of mutants in the focal patch from the time the mutation appeared until generation t (where H t is the countable set of all possible H t ) and let Pr(H t ) be the probability that history H t is realized.
Using the above notation and following  (see their appendix A), the probability ψ t (H t ) that an individual, randomly sampled from the local mutant lineage, lives in the patch at time t and that this patch has history H t can be written as, is the expected total size of the local mutant lineage over its lifetime, where we have deliberately stressed the dependence of M t on H t by writing it as M t (H t ). Because the population is otherwise monomorphic for the resident z and patches are not completely isolated from one another, there is constant immigration of zindividuals into the focal patch. As a result, a local mutant lineage eventually vanishes with probability one (i.e. lim t →∞ Pr(M t = 0) = 1). In other words, a local mutant lineage has a finite lifetime and so its total size n L < ∞ is bounded. For our analyses, it will sometimes be convenient to rewrite (A-1) as where M t (H t )/N is the mutant frequency in the focal patch at generation t .
The probability density function ψ t (H t ) characterises the genetic history preceding a representative mutant.
To calculate R 0 (ζ, z), we further need to specify the individual fitness of such a mutant, which is defined as the expected number of successful offspring of this individual. To do so, let us first define z −i ,t as a vector with N − 1 entries collecting the phenotypes of the patch-neighbours of a focal mutant, so that z −i ,t consists of M t − 1 entries ζ and N − M t entries z, i.e., To highlight the dependence of z −i ,t on M t , we will sometimes denote this vector as z −i (H t ). The fitness of an individual also depends on the environmental state of the patch it resides in, which in turn depends on the genetic history of the patch. To capture this, we write the environmental state of the focal patch at generation t as (H t ). Using the notation introduced here and eq. (4) of the main text, the fitness of a mutant living at time t in the focal patch with history H t can be written as where ζ is the (mutant) phenotype of the individual whose fitness is being considered, z −i (H t ) is the vector of phenotypes of its neighbours, and (H t ) is the environmental state of its patch.
Finally, we can use the above to write R 0 (ζ, z), i.e. the fitness of a representative mutant, as where we have defined the conditional expectation operator, for any function f . From the theory of multi-type branching processes, it has been shown that if R 0 (ζ, z) ≤ 1, then the mutant will vanish with probability one . Otherwise, there is a non-zero probability that the mutant invades and replaces the resident. R 0 (ζ, z) thus constitutes a proxy for invasion fitness W (ζ, z) of the mutant, equally telling about the nature of selection.

A.2 The directional selection gradient
Here, we derive eq. (I.A) in Box I of the main text. First, we compute the directional selection gradient s(z) acting on a single trait, which we then generalize to multiple traits.

A.2.1 The two components of the selection gradient
The selection gradient is given by the marginal effect of the trait on the reproductive number, i.e. by where here and thereafter all derivatives are evaluated under neutrality, i.e., where the population is monomorphic for the resident, ζ = z, and at environmental equilibrium, (H t ) =ˆ (z) =ˆ , for short. Strictly speaking, this selection gradient defined from R 0 (ζ, z) is proportional to the one defined from invasion fitness (eq. 6) but we do not distinguish between these gradients as they give the same information about the direction of selection (same for the Hessian). Substituting eq. (A-6) into eq. (A-8) and using the chain rule, we obtain where ψ • t (H t ) denotes the probability density function ψ t (H t ) under neutrality and thus characterises patch dynamics for a neutral mutant.
To simplify eq. (A-9), we first note that w(z, z,ˆ ) is the fitness of a resident individual in the resident population.
Since the population size is constant, this is one, w(z, z,ˆ ) = 1. Second, because ψ t (H t ) is a probability density function, it sums to one over its support, i.e. ∞ t =0 H t ∈H t ψ t (H t ) = 1. The derivative of this sum with respect to ζ therefore vanishes, ∂( ∞ t =0 H t ∈H t ψ t (H t ))/∂ζ = 0. From these observations, the selection gradient simplifies to, where we use to denote expectation of a function f over ψ • t (H t ).
Next, we re-write the fitness of a focal individual as eq. (4) in the main text, is the vector collecting the traits of the patch neighbours to the focal (eq. A-4), and t = (H t ) is the environmental state of the focal patch. Using the chain rule, we can decompose the derivative of individual fitness as where z j ,t is entry j of the vector z −i ,t . Since the marginal fitness effect of a trait change in every neighbour j ∈ {1, 2, . . . , N − 1} is the same, we can take the derivative out of its sum over j in eq. (A-13), and using eq. (A-4) is the indirect fitness effect.
Substituting eq. (A-14) into eq. (A-10), we find that the selection gradient can be decomposed into two components, captures selection due to genetic effects of the trait on fitness (hence the subscript g), and captures selection due to the ecological effects of the trait (hence the subscript e). We detail these components in sections A.2.2 and A.2.3 respectively.

A.2.2 Directional selection due to genetic effects
Following straightforward re-arrangements, s g (z) (eq. A-17) turns out to be the standard selection gradient on traits with fitness effects only (Frank, 1998;, for textbook treatments), -20) corresponds to the standard coefficient of pairwise relatedness, i.e., the probability that under neutrality, a random individual sampled among the N −1 neighbours of a representative mutant is also a mutant (or equivalently, the probability that two individuals sampled without replacement from the same patch at the same generation are identical-by-descent, IBD; eq. C-2 for derivation).

A.2.3 Directional selection due to ecological effects
Conditional mutant effect on the environment. To specify s e (z) (eq. A-18), we first derive with respect to ζ both sides of where F gives the dynamics of the local environment over one generation (eq. 1 of the main text). We thus where we have used for short. Since there are M t −1 mutants in the patch at generation t − 1, eq. (A-22) becomes Eq. (A-24) can be viewed as a linear (non-homogeneous) recurrence in ∂ t /∂ζ. Since the first mutant appeared at t = 0, this recurrence has initial condition ∂ 0 /∂ζ = 0. Solving eq. (A-24) using standard methods, we obtain where M t −h is the random variable for the number of mutants in the focal patch, h generations before t .
Unconditional mutant effect on the environment. Eq. (A-25) formalises the notion that the effect of the mutation on the environment at generation t depends on the whole (random) sequence of mutants in the focal patch up to t (i.e. on H t ). As per eq. (A-18), we further need to take the expectation of this effect over the (using eq. A-11). A change of dummy variables in the sums of the right-hand side of this equation allows us to rewrite it as corresponds to the probability that under neutrality, an individual that is randomly sampled from the focal patch h ≥ 1 generations before a representative mutant is also a mutant. Alternatively,r • 2,h can also be understood as the probability that under neutrality, two individuals sampled h generations apart from the same patch are IBD. We computer • 2,h in Appendix C.2.1, obtaining eq. (C-7). Substituting eq. (C-7) into eq. (A-27), we obtain is the probability that two individuals sampled with replacement from the same patch at the same generation are IBD (under neutrality).

A.2.4 Selection gradient vector
Extending the selection gradient to multiple traits is straightforward . One needs to consider that each trait p of the phenotype vector z = (z 1 , . . . , z n ) can influence individual fitness (eq. 4) and the environmental state of the patch (eq. 1). From these considerations and using the same arguments as the ones used for eqs. (A-8)-(A-19), we get find that this latter effect is where the first line can be rearranged to read as eq. (I.B) of Box I, and the second line to row 5 of Table 1.

B The disruptive selection coefficient
In this appendix, we derive the results presented in section 3 of the main text, which shows disruptive and correlational selection. As done for the selection gradient, we first compute the second-order effects of selection, when a single trait is evolving which we then generalize to multiple traits . With a single evolving trait, h(z) corresponds to the coefficient of disruptive selection on z.

B.1 Decomposing the disruptive selection coefficient B.1.1 Neutral and mutant genetic structure
First, we substitute for the reproductive number R 0 (ζ, z) (eq. A-6) into eq. (B-1) and using the chain rule, we (with eq. A-12). We again use the fact that the sum of ψ t (H t ) over its support is always equal to one (i.e. constant), so that the last term of eq. (B-2) vanishes. This leaves us with two components for disruptive selection, captures the second-order fitness effects of a mutation on fitness averaged over neutral mutant patch dynamics (i.e. over the neutral distribution ψ • t (H t )), and depends on the effect of the trait on mutant patch dynamics (via ∂ψ t (H t )/∂ζ). For short hand, we use to denote expectation of a function f over the perturbation of the probability density distribution ψ t (H t ). With this, eq. (B-5) becomes To characterise h w (z) (eq. B-4) further, we first expand the second-order derivatives using the chain rule: Substituting eq. (B-8) into eq. (B-4), we obtain after some straightforward re-arrangements, and The other term participating to disruptive selection, h r (z) (eq. B-7), meanwhile, can be specified by substituting eq. (A-14) into it, yielding (B-12)

B.1.2 Intra-and inter-generational effects
To better contrast our model with previous results that ignore inter-generational ecological effects (Ajar, 2003;Wakano and Lehmann, 2014;Mullon and Lehmann, 2019), we re-arrange the coefficient of disruptive selection h(z) (eqs. B-3, B-9-B-12) as, is due to intra-generational effects of traits on fitness, while is due to inter-generational effects of traits on fitness. This is the decomposition presented in the main text eq. (12) in the form of correlational selection for the multi-trait case.

B.2.1 Disruptive selection due to intra-generational fitness effects
From eq. (A-20), we first note that E • [(M t − 1)/(N − 1)] = r • 2 is neutral pairwise relatedness in eq. (B-14). Second, (B-16) corresponds to the three-way relatedness coefficient under neutrality, i.e. the probability that three individuals sampled without replacement from the same patch are IBD in a monomorphic population. Finally, using eq. (B-6), we have where r 2 (ζ, z) is the probability that in a mutant patch, a randomly sampled neighbour to a mutant is also mutant. This r 2 (ζ, z) can thus be thought of as "mutant intra-generational relatedness" and eq.  (B-18) which corresponds to eq. (8) in Ajar (2003) and eqs. (26-27-29) in Wakano and Lehmann (2014).

B.2.2 Correlational selection due to intra-generational fitness effects
Eq. (B-18) can be straightforwardly extended to consider the case where multiple traits coevolve. Following previous papers connecting disruptive selection when a single and when multiple traits evolve , we obtain from eq. (B-18) that correlational selection acting on traits p and q due to intra-generational fitness effect is This corresponds to eq. (13) of the main text, where we denote ∂r 2 /∂z p = ∂r 2 (ζ, z)/∂ζ p for the effect of a change in trait p on the probability of sampling a mutant among the N − 1 neighbours of a focal mutant. We calculate this effect on relatedness in Appendix D.1.

B.3 Inter-generational ecological fitness effects
We decompose h e (z) (eq. B-15) into three biologically relevant terms, -20) which correspond to three inter-generational effects of traits on fitness that may lead to polymorphism. We develop each of these terms below and connect them to the main text equations of section 3.2, which are for the multi-traits case (with eq. B-20 equivalent to eq. 14).
Squared mutant effect on the environment. From eq. (A-25), we have which we substitute into eq. (A-11) to obtain We can change the order of summations in eq. (B-23) to obtain, as the probability that under neutrality, two individuals sampled respectively h and h generations before a representative mutant are also mutants (all from the same patch).

Second order mutant effect on the environment.
To specify E • ∂ 2 t /∂ζ 2 of eq. (B-21), we first derive both sides of eq. (A-21) twice with respect to ζ. Using similar argument as for the first order ecological effects (eq. A-21 to eq. A-24), we obtain Equation (B-26) can be seen as a linear (non-homogeneous) recurrence for ∂ 2 t /∂ζ 2 in t with initial conditions ∂ 2 0 /∂ζ 2 = 0 and ∂ 0 /∂ζ = 0 (since the first mutant appeared at time t = 0). Solving eq. (B-26) using standard which can be written as wherer • 2,h andr • 3,h,h are defined in eq. (A-28) and eq. (B-25), respectively, and were we have defined as the probability that under neutrality, two distinct individuals sampled h generation before a representative mutant are also mutants.
Correlational selection due to non-linear ecological effects. Eq. (B-21) can be extended to obtain the component h e×e,pq (z) of correlational selection on traits p and q that is due to non-linear ecological effects: which corresponds to eq. (15) of the main text. From eq. (B-24), meanwhile, we obtain the expected effect of the product of those two traits p and q on the environment, After straightforward re-arrangements, eq. (B-34) corresponds to eq. (16) of the main text. For computational purposes, it will turn out to be convenient to decompose eq. (B-34) as wherer • 3,h,h is the probability that under neutrality, two individuals sampled with replacement h generations before a representative mutant are also mutants (i.e. defined by eq. B-25, where h = h ).

B.3.2 Genes-environment interactions
The second term of h e (z) (eq. B-20) collects the effects that are caused by gene-environment interactions: Product between mutant frequency and mutant effect. Using eqs. (A-11) and (A-25) obtains D. where changing the summation order yields as the probability that under neutrality, two individuals, one sampled among the neighbours of a representative mutant and another sampled h generations ago, are also mutants.
Correlational selection due to genes-environment interactions. Extending eq. (B-38) to the multi-trait case gives h g×e,pq (z), the correlational selection on traits p and q that is due to genes-environment interactions: (B-42) which corresponds to eq. (17) of the main text, where we denote From eq. (B-40), this expectation is which gives us eq. (18) of the main text. The relatedness coefficient, r • 3,h , is computed in Appendix C (eq. C-15), which substituted into eq. (B-44) gives row 8 in Table 1.

B.3.3 Biased ecological inheritance
The last term of h e (z) (eq. B-20), (B-45) consists of the effects that are caused by biased ecological inheritance, whose multi-trait equivalent is eq. (19) of the main text. Eq. (B-45) is the product of two terms: (i) the fitness effect of a change in the environment; and (ii) the first order effect of the trait on the environment experienced by a mutant. To compute this latter effect, we substitute eq. (A-25) into eq. (B-6) obtaining , (B-46) wherer 2,h (ζ, z) is the probability that given a mutant individual in a mutant patch, a randomly sampled individual from that same patch h generations ago is also mutant. Under neutrality, this reduces tor 2,h (z, z) =r • 2,h .
The derivative ofr 2,h (ζ, z) with respect to ζ can thus be interpreted on the effect of such trait on intergenerational relatedness.
Correlational selection due biased ecological inheritance. Eq. (B-45) can be readily extended to obtain h r×e,pq (z), that is the correlational selection on two traits p and q that is due to biased ecological inheritance, which corresponds to eq. (19) of the main text, where we denote as the expectation of a function f over the perturbation of the probability density function ψ t (H t ) due to a change in trait p (i.e. the multi-trait version of eq. B-6). Substituting eq. (A-25) into eq. (B-48), we find the effect of trait p on the environment experienced by a mutant averaged over patch dynamics perturbed by a change in trait q (and similarly for E (1) p ∂ t /∂ζ q ). From eq. (B-46), this is giving us eq. (20) of the main text, where for simplicity we write for the effect of a change in trait q on inter-generational relatedness. We calculate this effect on relatedness in Appendix D.2 (eq. D-48), which substituted into eq. (B-50) lead to row 10 in Table 1.

C Relatedness under neutrality
Here, we calculate the neutral relatedness coefficients that are necessary to compute directional, disruptive and correlational selection (and necessary for the rows 1-8 of Table 1 of the main text), using coalescent arguments (eg., Karlin, 1968;.

C.1 Intra-generational relatedness
The relevant intra-generational relatedness coefficients for the island model under a Wright-Fisher life cycle are well known (e.g. Ajar, 2003;Ohtsuki, 2010). We rederive them here for the sake of completeness and to illustrate the type of arguments used.

C.1.1 Pairwise relatedness
The probability r • 2 (t ) that two individuals living at time t in a focal patch are IBD satisfies the recurrence, which can be understood as follows. To be IBD, two individuals must be philopatric, which occurs with probability (1 − m) 2 . Then with probability 1/N they have the same parent (and thus coalesce immediately). Otherwise, with probability (N −1)/N , they have different parents and therefore are related with probability r • 2 (t −1). . In turn, the probability that two individuals sampled with replacement from the same patch reads asr

Solving for the equilibrium r
as with probability 1/N , the same individual is sampled twice, and with probability (N − 1)/N , two different individuals are sampled.

C.1.2 Threeway relatedness
Using a similar argument as above, the equilibrium probability that three individuals randomly sampled without replacement form the same patch are IBD satisfies, which yields, Ohtsuki, 2010, eq. 9-10). Meanwhile, gives the probability that three individuals sampled with replacement at the same generation from the same patch are IBD.

C.2 Inter-generational relatedness
Inter-generational relatedness is computed the same way as intra-generational .

C.2.1 Pairwise inter-generational relatedness
The probability that two individuals sampled h > 0 generations from the same patch are IBD is (eq. 6 in . To understand this, consider the lineage back in time of a randomly sampled individual. With probability (1−m) h , this lineage remains in the focal patch for h generations and in turn coalesces with the lineage of another individual from that generation with probabilityr • 2 .

C.2.2 Threeway inter-generational relatedness
Computing disruptive and correlational selection require three further inter-generational relatedness coefficients that involve three individuals. These coefficients have so far not been characterized in the literature.

(i) IBD between one individual at generation t and two patch neighbours at generation t −h.
We first compute r • 3,h,h (eq. B-32), which is the probability that one individual at some random generation t and two patch neighbours at generation t − h, all randomly sampled from the same patch are IBD. This probability is given by, (ii) IBD between two patch neighbours at generation t and one individual at generation t − h. Next, let us calculate r • 3,h (eq. B-41), which is the probability that two patch neighbours at generation t and a third individual randomly sampled at generation t − h are all IBD. We need to consider two cases: either (1) the lineages of the two individuals sampled at t remain in the focal patch without coalescing within h generations; or (2) their lineages coalesce into one common ancestor within h generations and their common lineage remains in the focal patch ( Fig. C-1). To reflect these two pathways, we decompose r • 3,h as where P • h,1 is the probability of coalescence in case (1) and P • h,2 in case (2) (see igure C-1: Decomposition of r • 3,h . On the left, P • h,1 corresponds to scenario (1) (eq. C-12), where the two lineages of the individuals sampled at generation t do not coalesce within h generations while staying in the focal patch. On the right, P • h,2 corresponds to scenario (2) (eq. C-14), where the two lineages of the individuals sampled at generation t coalesce within h generations. 1 − 2/N , however, a randomly sampled individual from the focal patch at generation t − h generations is not a direct ancestor to either individuals, in which case they are all IBD with probability r • 3 ( Fig. C-1, middle). These arguments lead us to for the first term of r • 3,h .
For P • h,2 , consider again the two individuals from generation t and their lineages back in time ( Fig. C-1, right hand side). The probability that their lineages coalesce at generation t − i (1 ≤ i ≤ h) and further remain in the focal until generation t − h is given by (C-14) Plugging eqs. (C-12) and (C-14) into eq. (C-11) finally leads us to (C-15) (iii) IBD between individuals at generations t , t − h and t − h . Finally, we calculater • 3,h,h (eq. B-25), which is the probability that individuals randomly at generations t , t − h and t − h (h > 0 and h > 0) from the same patch are all IBD. It is useful to distinguish two cases, which we consider separately. The case where h = h reads as as with probability (1 − m) h an individual living in the focal patch at generation t has a direct ancestor in that same patch at generation t −h. In turn, this direct ancestor is IBD with two individuals sampled randomly with replacement at generation t − h with probabilityr • 3 . For the case where h = h, let us assume without loss of generality that h > h. We then havē We thus have altogetherr (C-20) wherer • 2,h −h is given by eq. (C-7) and r • 3,h −h by eq. (C-15).

D Effect of selection on relatedness
In this appendix, we characterize the effect of selection on intra-and inter-generational relatedness, ∂r 2 (ζ, z)/∂ζ (which appears in eq. B-18) and ∂r 2,h (ζ, z)/∂ζ (which appears in eq. B-46), and necessary for the rows 9-10 of Table 1 of the main text. For the ease of presentation, we do so when the phenotype consists of a single trait. The extension to multiple traits is straightforward.

D.1 Intra-generational relatedness
Intra-generational relatedness under selection has been derived in previous papers in the absence of ecological inheritance (Ajar, 2003;Wakano and Lehmann, 2014;. Here, we extend these derivations to include such inheritance. As a starting point, we substitute eq. (A-3) into eq. (B-17) to obtain is the probability that an individual randomly sampled from a mutant patch at generation t is mutant, and is the probability that two individuals randomly sampled without replacement from a mutant patch at generation t are both mutants. Both k t andk t depend on mutant and resident trait values, ζ and z, but we do not write such dependency explicitly to avoid notational clutter. Taking the derivative of eq. (D-1) with respect to ζ and evaluating at z, we get using the quotient rule.
We can unpackk t (eq. D-3) using conditional probability as where Pr(M t |H t −1 ) is the probability that there are M t mutants in the focal patch at generation t given the history H t −1 of this patch up to generation t − 1. Expanding eq. (D-5) then yields, are the first and second moments of the distribution for the number of mutants at generation t given H t −1 , respectively. To characterise these moments, let us first decompose individual fitness of a mutant at generation t − 1 (from eq. A-5) as is the number of offspring that remain in their natal patch, and w d ( those that establish in other patches (and where also z i = ζ and z −i ,t −1 is a N − 1 vector with M t −1 − 1 entries consisting of ζ and N − M t −1 of z, see eq. A-4). Note that under neutrality (i.e. ζ = z), these fitness components reduce to (D-8) We can then use the fact the life-cycle is Wright-Fisher, so that the number of mutants at generation t in a mutant patch follows a binomial distribution with parameters Hence, the two first moments of the mutant distribution are given by (D-10) Substituting eq. (D-10) into eq. (D-6), we thus obtaiñ Taking the derivative of eq. (D-11) and using eq. (D-8), we obtain using k t −1 andk t −1 defined in eq. (D-2) and eq. (D-3).
Before substituting eq. (D-12) into eq. (D-4), note first that the first term of eq. (D-4) can be expanded as However since the mutation arises as a single copy at generation t = 0, the second term of eq. (D-13) vanishes (i.e. ∂k 0 /∂ζ = 0), leaving us with Substituting eq. (D-12) into the right hand eq. (D-14), we obtain using eq. (A-3). Changing the dummy variable t on the right hand side of eq. (D-15) and using eq. (A-11) lead us to (D-16) with E • f (H t ) defined in eq. (A-11). Re-arranging eq. (D-16) and using eq. (C-2), we finally obtain Plugging eq. (D-17) into eq. (D-4) gives into which we substitute leaving us with (D-21) The first line of eq. (D-21) corresponds to the effect of selection on intra-generational relatedness in the absence of ecological inheritance (eq. B.46 in Wakano and Lehmann, 2014) while the second is due to such inheritance.
For B (k) (with k ≥ 1), we first substitute eq. (D-19) into eq. (D-33) to obtain =r • 2,k (eq. A-28) (D-41) The last term of eq. (D-33) can be computed using eqs. (A-11) and (A-25): Since the mutant appeared at t = 0, we have M t −l M t −k = 0 for any t < l or t < k. We can thus re-arrange the order of the summation in eq. (D-42) to obtain 3,k,l . (D-45) Finally, plugging eq. (D-45) into eq. (D-27), we obtain after some straightforward re-arrangements ∂r 2,h (ζ, z) For computational purposes (and be able to user • 3,k,l as calculated in eq. C-19), we re-arrange the last term of eq. (D-47) such thatr • 3,k,l has either k = l or l > k: 3,k,l . (D-48) Substituting for the relevant inter-generational relatedness coefficients found in Appendix C into eq. (D-48), which is in turn substituted into eq. (20) of the main text (or equivalently eq. B-50 in Appendix B.3.3), we obtain row 10 of Table 1 in the main text.

E Coevolution of dispersal and attack rate on a local resource
In this appendix, we describe in brief our analysis behind the results presented in section 4 of the main text.

E.1 Ecological dynamics
We first characterise the equilibriumˆ of the resource system when the consumer population is monomorphic.

E.2 Evolutionary dynamics under directional selection
First, we investigate directional selection on attack rate and dispersal.

E.3 Disruptive selection and polymorphism
The population thus gradually converges to z * = (z * 1 , z * 2 ). Whether it remains monomorphic for z * or becomes polymorphic depends on the Hessian matrix H(z * ) (eq. 10). Substituting eqs. (21)-(25) into eqs. (12)-(19) and using Table 1, we find that the Hessian matrix has the following sign structure (we do not show the actual entries as those are too complicated to generate valuable information but the sign structure can be easily derived using an algebraic computer program, such as Mathematica, Wolfram Research, Inc., 2016). The negative diagonal entries reveal that when either trait evolves in isolation form the other, selection is always stabilising.
Meanwhile, the positive off-diagonal entry of H(z * ) means that the coefficient of correlational selection is positive, so favouring a positive association between dispersal and attack rate. Further insights into correlational selection can be brought by decomposing this coefficient as in eqs. (12) and (14) of the main text, whose different terms we find have the following signs h 12 (z * ) = h g,12 + h e×e,12 + h g×e,12 (z * ) ≤ 0 + h r×e,12 (z * ) ≥ 0 ≥ 0. (E-12) This shows that the main driver for a positive correlation between dispersal and attack rate is biased ecological inheritance, h r×e,12 (z * ). In turn, following eqs. (19) and (20)  which shows that correlational selection is caused by the negative effect of the attack rate on the resource, and the negative effect of dispersal on relatedness. More specifically, eq. (E-13) demonstrate that correlational selection is the strongest when ecological inheritance high, that is, when r is small and τ 1 is large. For those parameter values, the correlational selection on dispersal and the attack rate is larger than the stabilising selection applying to each of those traits in isolation, thus the population experiences disruptive selection (Fig. 2.B).

E.4 Individual based simulations
We used R (version R-4.0.2, R Core Team, 2022) to model the coevolution of attack and dispersal following the life-cycle described in main text section 4 with individual based simulations. The population is subdivided among N p = 2, 000 patches, each carrying N = 10 individuals. Each individual i is characterized by its two quantitative traits: (z 1,i , z 2,i ), attack rate z 1,i and dispersal probability z 2,i ; and each patch by its resource density . At the beginning of each generation, we calculate the fecundity f i of each individual i according eq. (24).
Then, we form the offspring generation by sampling in each patch N = 10 individuals from the parental generation, where each parent is weighted according to the patch they live in and their fecundity. Specifically, if the parent say i resides in the same patch where the spot is being filled, it is weighted by f i (1 − z 2,i ), otherwise it is weighted by f i z 2,i (1 − c d )/(N p − 1) (to capture offspring dispersal). With probability µ = 0.001, an offspring mutates, in which case its phenotype consists of its parental phenotype to which we add a perturbation from a bivariate Normal distribution with mean (0, 0) and covariance matrix Traits values are controlled so that the attack rate remains positive and the dispersal probability remains between 0 and 1. Finally, we calculate the resource density available to the offspring generation, according to eq. (21). We repeat this iteration for 100 000 generations.