## Abstract

The standard axiomatic theory of rationality posits that agents order preferences according to the average utilities associated with the different choices. Expected Utility Theory has repeatedly failed as a predictive theory of choice behavior, as reflected in the enormous new literature in behavioral economics. A frequent thread in this literature is that apparently irrational behaviors in contemporary contexts may have once served important functions, but there has been little attempt to formalize the relationship between evolutionary fitness and choice behavior. Biological agents should maximize fitness, but fitness itself is not a reasonable choice variable since its time-scale exceeds the lifespan of the decision-maker. Consequently, organisms use proximate motivational systems that work on appropriate time-scales and are amenable to feedback and learning. We develop an evolutionary principal-agent model in which individuals maximize a set of proximal choice variables, the interests of which are aligned with fitness. We show that age-specific demographic rates can be used as choice variables. The solution to our model yields probability weightings similar to Cumulative Prospect Theory and Rank-Dependent Expected Utility Theory. The pessimistic probability weighting characteristic of these models emerges naturally in an evolutionary framework because of extreme intolerance to zeros in multiplicative growth processes. We show that even under a model of constant absolute risk aversion for choice variables at the proximate level, agents are highly risk-averse at the lowest levels of consumption and suggest a consistency with empirical research on the risk preferences of the poor.

## Introduction

Organisms, humans included, must make decisions about foraging, reproductive, social, and political behaviors that have consequences for proximate outcomes such as satiety, income, wealth, happiness, sexual satisfaction, and well-being. These decisions also have consequences for ultimate outcomes such as fitness and, consequently, there are strong expectations that selection acting on differential fitness will shape the decision-making system. The rational-choice tradition suggests that individuals make decisions that maximize some objective function typically denoted by the catch-all term “utility” (1). In particular, a set of possible payoffs $x = (x_1, \ldots, x_n)$ (i.e., a “lottery”) with associated probabilities $p = (p_1, \ldots, p_n)$ is preferred to some other lottery $y$ with probabilities $q$ if some value function $V$ that associates payoffs with probabilities is greater for $(x, p)$ than for $(y, q)$. A natural value function to associate payoffs with probabilities is linear in the probabilities and utilities of distinct outcomes, e.g., $V(x, p) = \sum_i p_i u(x_i)$. The decision model in which preferences are ordered by their expected utilities is known as expected utility theory (EUT) and was first formalized in its modern form in (1).
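The preference ordering described above can be sketched in a few lines of Python. This is only an illustration: the two lotteries, the square-root utility function, and the helper name `expected_utility` are assumptions made for the example and do not come from the analysis in this paper.

```python
import math

def expected_utility(payoffs, probs, u):
    """Return sum_i p_i * u(x_i) for a lottery with payoffs x_i and probabilities p_i."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u(x) for x, p in zip(payoffs, probs))

# Hypothetical lotteries with the same mean payoff (50).
safe = ([50.0], [1.0])
risky = ([0.0, 100.0], [0.5, 0.5])

u = math.sqrt  # assumed concave utility, chosen only for illustration

eu_safe = expected_utility(*safe, u)
eu_risky = expected_utility(*risky, u)

# Under EUT, the lottery with the larger expected utility is preferred.
preferred = "safe" if eu_safe > eu_risky else "risky"
```

With a concave utility function, the sure payoff is preferred even though both lotteries have the same mean, which is the standard EUT account of risk aversion.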

The EUT approach has been very fruitful for economics, political science, and other behavioral sciences. Like many economic theories, it is axiomatic, and the fundamental axioms that underlie EUT (completeness, transitivity, continuity, and independence) are sensible requirements that ensure that preferences are consistent (1). However, an enormous literature has developed showing that people violate the axioms underlying EUT in both experimental and naturalistic contexts. Some examples include the common consequence effect (Allais paradox), the common ratio effect (2), ambiguity aversion (3), preference reversals between gambles represented as bids versus choices (4), the incommensurability of risk-sensitive behavior for high- vs. low-stakes gambles (5), and abundant evidence that framing and reference points induce departures from canonically predicted behavior (6; 7; 8).^{1}

Machina noted (10; 11) that there is nothing inevitable about expected utility serving as the objective function for preference ordering and suggested that nonlinear functions mapping values and utilities might account for these types of systematic departures from the predictions of EUT. What is less clear is what mechanistic or functional basis such nonlinear functions might have. A frequent thread in the behavioral economics literature is that behaviors that appear irrational in contemporary contexts may have served important functions during human evolutionary history, but there has been little attempt to formalize the relationship between evolutionary fitness and choice behavior. In line with Machina’s observation, we describe and interpret a formal model of the evolution of preferences that leads to decision-making rules which violate EUT but nevertheless maximize organisms’ evolutionary fitness. In particular, organisms display a profound intolerance for zeros or low values in key evolutionary parameters. This intolerance for zeros arises from a fundamental difference between economic utility and fitness. Cumulative expected utility is additive across time periods, whereas fitness is multiplicative (12; 13). Within an individual’s lifetime, the individual must survive each previous time period to reach a given age to reproduce. Across generations, individual lineages must persist. Zeros in such processes represent absorbing states. Consequently, risky strategies that may be acceptable under an additive metric such as cumulative expected utility can be unacceptable, and even catastrophic, given a multiplicative metric such as survival or lineage persistence. We show how this type of evolutionary conservatism leads to pessimistic subjective probability weights when rank-dependent expected utility theory (RDEUT) replaces EUT in the evolutionary model. In so doing, we present a first-principles, evolutionary foundation for RDEUT and cumulative prospect theory (CPT), a successful phenomenological theory that incorporates RDEUT (7). The evolutionary perspective also implies that the canonical explanation for risk-sensitive behavior, namely, curvature of the utility function, is overly simplistic, consistent with Rabin’s Calibration Theorem (5).
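The contrast between additive and multiplicative metrics can be made concrete with a small numerical sketch. The per-period multipliers below are hypothetical, and the function names are ours. The additive metric is the expected per-period payoff. The multiplicative metric is the expected logarithm of the per-period multiplier, which governs long-run growth when outcomes compound.

```python
import math

def additive_value(outcomes, probs):
    """Expected per-period payoff: the quantity an additive metric accumulates."""
    return sum(p * x for x, p in zip(outcomes, probs))

def log_growth_rate(outcomes, probs):
    """Expected log of the per-period multiplier (long-run growth rate of a
    multiplicative process). Returns -inf if a zero has positive probability."""
    total = 0.0
    for x, p in zip(outcomes, probs):
        if p == 0:
            continue
        if x == 0:
            return float("-inf")  # a zero is absorbing: the process cannot recover
        total += p * math.log(x)
    return total

# Hypothetical per-period multipliers, chosen only for illustration.
cautious = ([1.05], [1.0])
risky = ([0.0, 1.2], [0.05, 0.95])  # higher mean, but a 5% chance of a zero

mean_cautious = additive_value(*cautious)
mean_risky = additive_value(*risky)
growth_cautious = log_growth_rate(*cautious)
growth_risky = log_growth_rate(*risky)
```

Under these assumed numbers the risky strategy has the higher mean payoff, yet its long-run growth rate is negative infinity because a zero eventually occurs with certainty. The cautious strategy grows at a positive rate.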

## Natural Selection and Preferences

We are interested in economic decisions in the broadest sense. At the basic level, organisms are making decisions over what can be thought of as different lotteries. For example, should a forager hunt for sand monitor lizards or hill kangaroo (14)? Should a peasant farmer intensify cultivation of a nearby garden plot or spread effort across two geographically distinct plots (15)? Should a woman wean her infant and have another baby or continue nursing and delay reproduction (16; 17)? In these examples, each lottery yields different payoffs probabilistically. All of these examples have a clear impact on fitness, but it is extremely unlikely that any conscious fitness-maximization goal plays significantly into any of the decision-makers’ choices. Instead, their decisions are shaped by preferences over a variety of proximate currencies like hunger/satiety, feelings of security, or feelings of love and responsibility for children. Samuelson and Swinkels (18) raise an important question: given the evolutionary mandate to successfully leave descendants, why do people have preferences for anything but fitness?

### Why Have Preferences for Proximate Quantities?

In the substantial literature that addresses the question of why people have preferences for proximate currencies rather than fitness itself (19; 20; 21), three inter-related factors loom largest. First, natural selection operates on time-scales that are longer than the lifespans of the organisms whose behavior it shapes and, furthermore, it is a stochastic, undirected process. Second, organisms regularly encounter novel situations for which natural selection is unable to directly specify behaviors. Third, the types of solutions that might emerge via natural selection to address the first two factors are constrained by trade-offs imposed by the cost of gathering and processing information. These observations can be accommodated within a single analytic framework by utilizing the economic concept of the principal-agent problem (20) which will allow us to address the question of why organisms have preferences defined over proximate currencies, rather than the ultimate currency of fitness.

### The Evolutionary Principal-Agent Problem

Consider a principal that possesses certain goals that it is unable to achieve unless it acts through agents over which it has only indirect control (20). Natural selection is the ultimate arbiter of which biological entities remain and increase in a population. While the process of natural selection clearly lacks agency, it is nonetheless useful to consider the outcomes of selection as having been designed (22). It is in this sense that natural selection can be thought of as a principal with a goal of maximization of fitness. However, there are clear limitations in the ability of selection to achieve a solution, as enumerated above. Due to the obvious lack of direct control, selection shapes cognitive mechanisms which are, on average, consistent with fitness maximization. As noted by Binmore (20), the collective proximate cognitive mechanisms can be thought of as the agent in the evolutionary principal-agent problem, wherein the principal (selection) “seeks to design an incentive scheme that minimizes the distortions resulting from having to delegate to the agents” (20). The extent to which selection minimizes these distortions for a given organism in a given setting depends on the structure and strengths of the constraints it faces.^{2}

Figure 1 encapsulates the conceptual model that emerges from the principal-agent framework. At the bottom level, an individual is choosing between lotteries. The outcomes of these lotteries contribute to utilities, *u*, at the next level. These utilities differ somewhat from the classical economic notion of utility. They can be thought of as either motivational systems such as satiety, sexual gratification, or happiness or as proximate determinants of fitness such as infant survival or total fertility. What links these two different notions of “utility” is that they work on a time-scale where the organism can use feedback from outcomes to change its preferences and therefore its decision-making. These proximal utilities then contribute ultimately to fitness.

## Applications

To illustrate the utility of the hierarchical principal-agent framework described in the preceding section, we present two applications that utilize it. Since the examples involve quite sophisticated mathematics and draw on evolutionary and economic theory that may be unfamiliar to some readers, the appendix provides background and mathematical material that could not be included in the main text. In addition, we summarize here the common elements that the two examples share as a result of the principal-agent framework. In particular, implementing the framework requires: (1) a hierarchical evolutionary model that specifies both the dependence of determinants of fitness on the lotteries and of fitness on the determinants of fitness; (2) an economic model of decision making; and (3) a formal mechanism to link the evolutionary and economic theory.

The first example is a generalization of the Arrow-Pratt risk premium. It is the more general example insofar as both the evolutionary and economic components of the model are maximally general. That is, it applies to any specification of fitness of the (compositional) functional form *f*(*u*(*x*)) and any specification of utility. The formal link between the evolutionary and economic theory is the assumption that utility, *u*, is the determinant of fitness accessible to the organism to maximize over. However, from the standpoint of the principal, natural selection, maximizing expected utility is only correct to first order, and the generalized risk premium contains a non-traditional component that is especially relevant at low resource levels. We illustrate this with a formal model based on demographic data from Madagascar in 1966.

The second example also illustrates how expected utility maximization is an acceptable first-order solution of the evolutionary principal-agent problem, but is incorrect to higher orders. The evolutionary component of the model is stochastic age-structured life history theory, the same theory used to illustrate the first example. The economic component is rank-dependent expected utility theory (RDEUT), a generalization of Expected Utility Theory (EUT). The formal link between the evolutionary and economic theory is the assumption that the economic decision-making mechanism (RDEUT) should yield the same preference ordering on uncertain outcomes as one based directly on fitness maximization. Enforcing this equivalence leads to subjective probability weighting, which is at the heart of RDEUT.

### Example 1: Generalizing the Arrow-Pratt Risk Premium

The standard explanation for risk preferences is based on the curvature in the utility function (26). Individuals with a concave utility function (i.e., $u''(x) < 0$) prefer a sure payoff to a gamble with the same mean value because, measured in utility, the upside gain of the gamble is smaller than the downside loss. Such individuals are “risk averse” and must be offered a higher mean payoff to accept a risky gamble over a certain payoff. The additional payoff needed to compensate for the risk of the gamble is called the risk premium, π (see Appendix 1 for more detailed explication). For gambles with small or moderate levels of risk, the risk premium is approximated by

$$\pi \approx \frac{\sigma^2}{2}\,\alpha_u, \qquad (1)$$

where $\alpha_u = -u''(x)/u'(x)$ is the classical Arrow-Pratt Index of Absolute Risk Aversion (27; 28) and *σ*^{2} is the variance of the gamble in the variable *x*. In the hierarchical evolutionary framework captured by Figure 1, *u* is the proximate determinant of fitness that an organism utilizes to assess trade-offs. To first approximation, an organism may make decisions based on the expected value of *u*, such as the expected number of offspring or expected survival. This decision rule is straightforward to apply and may be suitable if there is little variation in the organism’s life history and in the absence of environmental uncertainty, but otherwise leads to sub-optimal decisions for the organism’s fitness (17). We suggest that natural selection, the principal, can correct the first-order approximation by adding a correction to the agent’s decision rules that accounts for the linkage between *f* and *u*. This correction accounts for the hierarchical or compositional dependence of *f* on *x*, *f*(*u*(*x*)), and yields an illuminating generalization of the Arrow-Pratt measure.
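The quality of the small-risk approximation can be checked numerically. The sketch below assumes the standard Arrow-Pratt form, in which the risk premium is approximately half the variance times the index $-u''(x)/u'(x)$, and a utility function with constant absolute risk aversion, $u(x) = 1 - e^{-cx}$, for which that index equals $c$. The value of $c$, the baseline $x_0$, and the gamble sizes are arbitrary choices made for illustration.

```python
import math

C = 1.0  # assumed CARA coefficient (illustrative value)

def u(x):
    """CARA utility u(x) = 1 - exp(-C x); its Arrow-Pratt index is exactly C."""
    return 1.0 - math.exp(-C * x)

def u_inverse(v):
    """Inverse of the CARA utility, valid for v < 1."""
    return -math.log(1.0 - v) / C

def exact_risk_premium(x0, h):
    """Premium for a fair gamble paying x0 - h or x0 + h with equal odds."""
    expected_u = 0.5 * u(x0 - h) + 0.5 * u(x0 + h)
    certainty_equivalent = u_inverse(expected_u)
    return x0 - certainty_equivalent

def approx_risk_premium(h):
    """Second-order approximation (sigma^2 / 2) * alpha_u, with sigma^2 = h^2."""
    return 0.5 * h**2 * C

x0 = 2.0
small = (exact_risk_premium(x0, 0.1), approx_risk_premium(0.1))
large = (exact_risk_premium(x0, 1.5), approx_risk_premium(1.5))
```

For the small gamble the exact and approximate premiums nearly coincide. For the large gamble the approximation overstates the exact premium noticeably, which is why the formula is restricted to small or moderate risks.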

Let π_{f} be the risk premium for gambles that leave the measure of fitness *f* unchanged. The measure of curvature that figures into this risk premium is $\alpha_f = -\frac{d^2 f/dx^2}{df/dx}$. Application of the chain rule to $f(u(x))$ yields $\frac{df}{dx} = f'(u)\,u'(x)$ and $\frac{d^2 f}{dx^2} = f''(u)\,[u'(x)]^2 + f'(u)\,u''(x)$, which provides a formula for the generalized risk premium,

$$\pi_f \approx \frac{\sigma^2}{2}\,\alpha_f = \frac{\sigma^2}{2}\left(\alpha_u + \alpha_e\right), \qquad (2)$$

where $\alpha_e = -u'(x)\,f''(u)/f'(u)$. While $\alpha_u = -u''(x)/u'(x)$ is the classic Arrow-Pratt index of absolute risk aversion, $\alpha_e$ is a term that has no analogue in EUT. Figure 2 plots the Arrow-Pratt coefficient of absolute risk aversion (*α*_{u}) and the evolutionary coefficient of absolute risk aversion for the age-structured model described in Appendix 1, with demographic data drawn from Madagascar in 1966 (29). The utility function we utilize exhibits constant absolute risk aversion and governs juvenile survivorship. The evolutionary term is plotted both with and without environmental uncertainty (see Appendix). At high consumption levels, the evolutionary curves approach the EUT curve. At low consumption levels, however, greater risk aversion (a higher risk premium) is predicted for both evolutionary curves. This suggests that the individuals least likely to be expected utility maximizers are poor individuals. The fact that greater risk aversion is predicted both with and without environmental uncertainty demonstrates that this effect is distinct from pessimism, which requires environmental uncertainty.
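The qualitative pattern described above can be reproduced with a deliberately simple stand-in for the fitness function. The sketch below does not use the age-structured model of Appendix 1 or the Madagascar data. It assumes a CARA utility $u(x) = 1 - e^{-cx}$ and a hypothetical link $f(u) = \ln u$ (log fitness proportional to log survivorship). By the chain rule, the curvature of the composite $f(u(x))$ equals the Arrow-Pratt index plus the additional term $-u'(x) f''(u)/f'(u)$, and the code checks this decomposition against a finite-difference estimate.

```python
import math

C = 1.0  # assumed CARA coefficient (illustrative value)

# CARA utility and its first two derivatives.
def u(x):
    return 1.0 - math.exp(-C * x)

def du(x):
    return C * math.exp(-C * x)

def d2u(x):
    return -C * C * math.exp(-C * x)

# Hypothetical fitness link f(u) = ln(u), NOT the paper's age-structured model.
def f(v):
    return math.log(v)

def df(v):
    return 1.0 / v

def d2f(v):
    return -1.0 / v**2

def alpha_u(x):
    """Classic Arrow-Pratt index -u''/u' (constant and equal to C for CARA)."""
    return -d2u(x) / du(x)

def evolutionary_term(x):
    """Additional chain-rule term -u'(x) f''(u) / f'(u)."""
    return -du(x) * d2f(u(x)) / df(u(x))

def alpha_total(x):
    """Curvature of the composite f(u(x)): sum of the two terms."""
    return alpha_u(x) + evolutionary_term(x)

def alpha_total_numeric(x, h=1e-4):
    """Finite-difference estimate of -(d^2 f/dx^2) / (df/dx) for f(u(x))."""
    g = lambda z: f(u(z))
    first = (g(x + h) - g(x - h)) / (2.0 * h)
    second = (g(x + h) - 2.0 * g(x) + g(x - h)) / h**2
    return -second / first

low, high = 0.1, 5.0  # low and high consumption levels (arbitrary units)
```

Under these assumptions the total coefficient is many times the Arrow-Pratt index at low consumption and converges to it at high consumption, even though the proximate utility function itself has constant absolute risk aversion.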

### Example 2: Pessimism due to Environmental Uncertainty

In this section, we utilize stochastic age-structured life history theory to show that organisms that face uncertainty arising, e.g., from environmental fluctuations will act as pessimistic decision makers *sensu* RDEUT.^{3} Appendix 2 provides background material and more detailed explication for this example. Let *k* index strategies an agent can choose and let *s* index states of the world that influence the fertility the agent will achieve. The agent faces uncertainty in choosing the strategy since the state of the world is not known when the strategy is chosen, but the agent does know the probability *p*_{s} that each state occurs. The agent’s fertility given strategy *k* and state *s* is $F_{ks}$. The mean fertility for strategy *k* is

$$\bar{F}_k = \sum_s p_s F_{ks}. \qquad (3)$$

Table 1 illustrates this model for a simple case with three possible strategies and two states of the world, each of which occurs with probability 0.5. What makes these three strategies interesting is that they offer the same expected utility (i.e., fertility) but, as we show next, different evolutionary fitnesses. In particular, strategy *k* = 1 offers the highest fitness because it has the lowest variance and strategy *k* = 3 offers the lowest fitness because it has the highest variance.^{4} To demonstrate this, we utilize stochastic age-structured life history theory. Let **A** represent a Leslie matrix in which all elements are fixed except for one stochastic fertility term, which is simply the fertility term discussed above and in Table 1. Tuljapurkar (31) shows that the appropriate fitness measure to use given environmental stochasticity is the long-term logarithmic growth rate, *a*. Furthermore, he shows that in a serially independent environment, in which the world state is chosen randomly each time period with no dependence on previous states, the long-term logarithmic growth rate can be approximated by

$$a \approx \ln \lambda_0 - \frac{1}{2\lambda_0^2} \sum_{i,j} \left( \frac{\partial \lambda_0}{\partial A_{ij}} \right)^2 \sigma_{ij}^2, \qquad (4)$$

where λ_{0} is the growth rate of the (hypothetical) mean phenotype with mean Leslie matrix 〈**A**〉, the partial derivatives are evaluated at 〈**A**〉, and $\sigma_{ij}^2$ is the variance in the *ij*-th term due to environmental fluctuations (i.e., uncertainty over states of the world). The crucial insight to be gained from Equation 4 is that the second term is never positive and is strictly negative whenever the variance is nonzero. The effect of environmental uncertainty, therefore, is to reduce *a* if the mean Leslie matrix is held constant. That is, for two strategies *k* and *l* with the same mean fertility, $\sigma_k^2 > \sigma_l^2$ implies $a_k < a_l$, and vice versa. Thus, we have demonstrated that strategy *k* = 1 in Table 1 is preferred to *k* = 2 and *k* = 2 is preferred to *k* = 3.
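The ranking of equal-mean strategies by variance can be illustrated with a toy calculation. Everything numerical below is hypothetical: the values of Table 1 are not reproduced here, so the three strategies, the two-age-class Leslie matrix, and its fixed entries are assumptions. The sketch also assumes the small-noise approximation takes the single-stochastic-element form $a \approx \ln\lambda_0 - (\partial\lambda_0/\partial F)^2 \sigma^2 / (2\lambda_0^2)$, where $F$ is the stochastic fertility. For a 2×2 Leslie matrix both $\lambda_0$ and its sensitivity have closed forms, so no linear algebra library is needed.

```python
import math

# Hypothetical two-age-class Leslie matrix [[F1, F2], [P, 0]].
F1 = 0.5  # fixed fertility of age class 1 (assumed)
P = 0.8   # fixed survival from age class 1 to age class 2 (assumed)

# Hypothetical strategies: fertility F2 in each of two equiprobable states.
STATE_PROBS = (0.5, 0.5)
STRATEGIES = {1: (1.8, 2.2), 2: (1.5, 2.5), 3: (1.0, 3.0)}

def mean_and_variance(values, probs):
    """Mean and variance of fertility across states of the world."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, var

def dominant_eigenvalue(f2):
    """Largest root of lambda^2 - F1*lambda - P*f2 = 0."""
    return 0.5 * (F1 + math.sqrt(F1**2 + 4.0 * P * f2))

def sensitivity_to_f2(f2):
    """d(lambda_0)/d(F2), obtained by differentiating the closed-form root."""
    return P / math.sqrt(F1**2 + 4.0 * P * f2)

def approx_log_growth(values, probs):
    """Small-noise approximation to the long-term log growth rate."""
    mean, var = mean_and_variance(values, probs)
    lam0 = dominant_eigenvalue(mean)
    s = sensitivity_to_f2(mean)
    return math.log(lam0) - (s**2 * var) / (2.0 * lam0**2)

growth = {k: approx_log_growth(v, STATE_PROBS) for k, v in STRATEGIES.items()}
deterministic = math.log(dominant_eigenvalue(2.0))  # growth with no variance
```

All three hypothetical strategies share a mean fertility of 2, so an expected-fertility criterion cannot distinguish them. The approximation ranks them in order of increasing variance, and each falls below the deterministic growth rate.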

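The stage is now set to introduce the economic theory and show that environmental uncertainty induces deviations from expected utility maximization consistent with pessimistic subjective probability weighting. An expected utility maximizer will evaluate strategies solely by the mean fertility, $\bar{F}_k$, which is equivalent to writing $a_{\mathrm{EU}} = \ln \lambda_0$, where $a_{\mathrm{EU}}$ is the expected utility valuation of the fitness. Symbolically, with $\sigma^2$ denoting the variance in fertility due to environmental uncertainty, we can write

$$\sigma^2 > 0 \;\Leftrightarrow\; a < a_{\mathrm{EU}}, \qquad (5)$$

where ⇔ means implies and the implication is in both directions. This result is pertinent to RDEUT since the definition of pessimism in RDEUT is that a re-weighted lottery is valued as less than its expected utility value (for a discussion of this definition see Appendix 2). For a concave utility function, this is equivalent to assuming that *w*(*p*) ≤ *p* for all *p* (32). Symbolically, writing $V_{\mathrm{RDEU}}$ for the rank-dependent valuation of a lottery and $V_{\mathrm{EU}}$ for its expected utility, we can write

$$w(p) \le p \ \text{for all } p \;\Leftrightarrow\; V_{\mathrm{RDEU}} \le V_{\mathrm{EU}}. \qquad (6)$$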

If we assume that natural selection (the principal) has imparted subjective probability weighting to the agent in order to “fix” the optimism of the EUT decision rule we can posit, by comparing Equation 5 with Equation 6, that natural selection should instill its agents with pessimistic subjective probability weights.
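A short sketch shows how pessimistic probability weighting lowers the valuation of a lottery relative to its expected utility. Several choices here are assumptions made for illustration: the weighting function $w(p) = p^2$ (one simple function satisfying $w(p) \le p$), the convention that $w$ is applied to decumulative probabilities (the probability of doing at least as well as a given outcome), the identity utility over fertility, and the three hypothetical equal-mean strategies.

```python
def expected_utility(outcomes, probs, u):
    """Standard expected utility: sum_i p_i * u(x_i)."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))

def rdeu(outcomes, probs, u, w):
    """Rank-dependent expected utility with the weighting function applied
    to decumulative probabilities P(X >= x_i) (convention assumed here)."""
    ranked = sorted(zip(outcomes, probs))  # worst outcome first
    value = 0.0
    tail = 1.0  # P(X >= current outcome)
    for x, p in ranked:
        next_tail = max(tail - p, 0.0)     # P(X > current outcome)
        weight = w(tail) - w(next_tail)    # decision weight for this outcome
        value += weight * u(x)
        tail = next_tail
    return value

def identity(x):
    return x

def pessimistic(p):
    """Assumed weighting function with w(p) <= p on [0, 1]."""
    return p**2

# Hypothetical strategies: fertility in two equiprobable states, equal means.
PROBS = (0.5, 0.5)
STRATEGIES = {1: (1.8, 2.2), 2: (1.5, 2.5), 3: (1.0, 3.0)}

eu = {k: expected_utility(v, PROBS, identity) for k, v in STRATEGIES.items()}
rd = {k: rdeu(v, PROBS, identity, pessimistic) for k, v in STRATEGIES.items()}
```

With these assumed inputs, the worse outcome of each strategy receives a decision weight of 0.75 rather than 0.5. The expected utility is 2 for every strategy, whereas the rank-dependent values are 1.9, 1.75, and 1.5 for *k* = 1, 2, 3. The pessimistic agent therefore reproduces the ordering by variance that an expected utility maximizer misses.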

## Discussion

The key scientific finding of this article is the derivation of RDEUT-like pessimism weighting from evolutionary first principles. An explanation for pessimistic probability weighting emerges naturally from coupling utility-maximizing decision-making and fitness maximization. This pessimism arises from a profound intolerance for zeros in key evolutionary parameters which, in turn, arises from a fundamental difference between economic utility and fitness. As we have previously observed, cumulative expected utility is additive across time periods, whereas key evolutionarily-salient processes undergirding fitness, such as survival and the persistence of lineages, are multiplicative. Consequently, additive decision metrics such as EUT do not necessarily lead to optimal behavior from a fitness standpoint, and can even lead to catastrophic outcomes. It is now possible to draw a striking correspondence between the evolutionary intolerance for zeros and the theoretical foundations of RDEUT. John Quiggin, in describing the mentality with which he approached the derivation of RDEUT, writes, “The crucial idea was that the overweighting of small probabilities proposed by Handa and others should be applied only to low probability extreme outcomes, and not to low probability intermediate outcomes” (32, 56). Since low probability extreme outcomes are the key factor driving both the evolutionary intolerance for zeros and the development of RDEUT, it may come as no surprise that our evolutionary model predicts the biasing of probability weights for survival (i.e., utility) in a manner consistent with RDEUT. Aversion to zeros has been used in population biology to explain a range of life-history phenomena which are not favored under standard non-stochastic models, such as the regular production of clutches smaller than the most productive clutch (33), delayed reproduction (34), and iteroparity (35).

Our results give new life to long-standing debates on the persistent risk-aversion of agricultural peasants (36; 37; 38) and, more recently, the willingness of the poorest poor to adopt microfinance and other development schemes (39). The general expectation stemming from EUT, and following the foundational paper of Friedman and Savage (26), is that the poorest poor should be willing to take substantial risks to remove themselves from poverty because of the convexity of the putative sigmoid utility function. This logic contributes to the notion that the poorest poor are natural entrepreneurs. However, an increasing body of evidence indicates that the poorest poor are entrepreneurial only to the extent that they lack alternatives such as reliable wage employment. As Banerjee and Duflo (39) write, “are there really a billion barefoot entrepreneurs, as the leaders of MFIs and the socially minded business gurus seem to believe? Or is this just an illusion, stemming from a confusion about what we call an ‘entrepreneur’?” Our results predict substantial departures from standard expectations based on EUT for the poorest poor in exactly the direction (Figure 2) documented by extensive research of the Poverty Action Lab and collaborators (39). The poorest poor are exactly the people for whom we expect the symmetry-breaking second term of Equation 2 to dominate, making them substantially more risk-averse than the standard theory predicts. The evolutionarily-constructed aversion to zeros imbues the human mind with a conservatism that is most clearly expressed at the lowest levels of consumption.

## Acknowledgements

This paper was completed while the second author was a fellow at the Center for Advanced Study in the Behavioral Sciences and is a contribution to Imperial College’s Grand Challenges in Ecosystems and the Environment initiative. We thank Rebecca Bird, Elly Power, Elspeth Ready, Matt Jackson, Ken Wachter, Andrew Iannaccone, and Tim Barraclough for insightful comments.

## Footnotes

^{1}Violations of the stationarity of time preferences, another canonical assumption, include the common difference effect and the absolute magnitude effect (8; 9).

^{2}A virtually identical perspective underlies the so-called “indirect approach,” in which the utility function is defined on proximate goods or outcomes, and the utility function in turn determines the success of an organism (23; 24; 25).

^{3}Our model generalizes that of (30), who also pointed out that evolution could induce preferences that do not accord with EUT.

^{4}The gambles we discuss in this section all depend implicitly on underlying consumption levels that determine fertility. However, the details of the dependence do not impact the results, so for simplicity we choose not to explicitly model them.