Reduced neuronal value signals in monkey orbitofrontal cortex during relative reward-specific satiety

Alexandre Pastor-Bernier; Arkadiusz Stasiak; Wolfram Schultz

doi:10.1101/2020.07.04.187518

SUMMARY

Reward-specific satiety changes the subjective value of one reward relative to other rewards. Two-dimensional indifference curves (IC) capture relative reward-specific values of two-component choice options according to Revealed Preference Theory. Any change of reward value would be captured by specific IC distortions. We estimated two-dimensional ICs from stochastic choice and found that natural on-going consumption of two liquid rewards led to characteristic IC changes indicative of relative value reduction of specific rewards, suggesting reward-specific satiety. Licking changes confirmed the satiety in a mechanism-independent manner. Neuronal reward signals in monkey orbitofrontal cortex (OFC) followed the specific IC distortions and indicated value changes compatible with relative reward-specific satiety. A neuronal classifier predicted well the value changes inferred from the altered behavioral choices. These results demonstrate that neuronal signals in OFC reflect the altered subjective value of selectively sated rewards during economic choice.

INTRODUCTION

An animal’s internal state markedly influences subjective reward value (Cabanac, 1971; Rolls et al., 1983). A classic case is satiety, where two processes come into play. General, non-differential satiety concerns the reduction in subjective value of all rewards and also involves changes in general arousal, attention and motivation. By contrast, reward-specific satiety reduces the subjective value of specific rewards relative to other rewards (sensory-specific satiety; Cabanac, 1971; Rolls et al., 1983; Reichelt et al., 2014); in analogy, salt depletion makes salt attractive, indicating increased value of salt, while leaving sugar attraction unchanged (Robinson & Berridge 2013). However, the neuronal mechanisms of reward-specific satiety are poorly understood in animals, partly because the accompanying general satiety impairs task performance necessary for well-controlled tests. Existing data point to a role of orbitofrontal cortex (OFC) (Rolls et al., 1989; Critchley and Rolls, 1996; Small et al., 2001; Kringelbach et al., 2003).

Testing reward-specific satiety requires comparison between a reward on which the animal is sated and at least one other reward on which the animal is less or not sated. This requirement matches the notion that choice options have multiple reward components. For example, a meal is composed of meat and vegetables, and the choice of the meal involves both rewards. This multi-component nature is conceptualized in Revealed Preference Theory; its two-dimensional indifference curves (IC) graphically display reward preferences and subjective reward values that are revealed by measurable choice (Fisher, 1892; Samuelson, 1937; Samuelson, 1938). The preferences may be fixed, as the theory assumes, or they may be flexibly constructed on the fly at the time of choice; the distinction is debatable but not crucial for the current experiment (Payne, Bettman, & Schkade, 1999; Simonson, 2008; Dhar & Novemsky, 2008; Kivetz, Netzer & Schrift, 2008; Warren, McGraw & Van Boven, 2011). Our previous work established ICs in rhesus monkeys that represent subjective reward values in an orderly manner and fulfill necessary requirements for rationality, including completeness (preference for one or the other option, or indifference), transitivity, and independence of option set size (Pastor-Bernier et al., 2017). Similar ICs were empirically estimated in humans (Pastor-Bernier et al., 2020). The ICs represent the relative subjective values of the two bundle rewards; thus, important for the present study, IC changes would indicate changes in relative reward value. Responses of substantial fractions of OFC neurons follow the IC scheme, namely increasing with higher subjective value, and being equal with differently composed but equally valued bundles (Pastor-Bernier et al., 2019). The feasibility of testing OFC neurons with two-reward bundles allowed us to investigate value changes indicative of relative, reward-specific satiety during on-going reward consumption.

The current study used the rigorous formalisms of ICs to investigate the influence of on-going reward consumption on OFC neurons. We presented monkeys with bundles containing a common juice (blackcurrant) and one other reward liquid. Natural, on-going reward consumption of the reward bundle altered systematically the geometry and key parameters of ICs, which suggested reward value changes reflecting reward-specific satiety. Neuronal signals in OFC coding the chosen value of multiple rewards followed the altered ICs that indicated reward-specific satiety. These data from a novel, concept-driven approach unequivocally demonstrate reward-specific satiety of OFC value neurons.

RESULTS

Behavioral test design

Our study followed the notions that subjective reward value can be inferred from observable economic choice, that altered choice would indicate a change in reward value, and that a reduction in reward value with on-going consumption would reflect satiety. An assessment of differential, reward-specific value change requires at least two rewards. To this end, we tested choices between bundles that each had two liquid rewards and established two-dimensional indifference curves (IC) whose slope and curvature reflect, and change with, the subjective value of one bundle reward relative to the other bundle reward. We tested stochastic choices rather than single-shot choices for reasons of neuronal response statistics.

The two-dimensional ICs represent choices of two-reward bundles in a convenient graphical manner (Figure 1A). In choice between two bundles, relative reward value can be inferred from the amount of one reward the animal gives up in order to obtain one unit of the other reward (called the Marginal Rate of Substitution, MRS). The trade-off is measured at choice indifference between the new bundle and the old bundle (equal probability of P = 0.5 for choosing each of two options). The positions of equally preferred bundles on the two-dimensional graph are called choice indifference points (IP). Any reward value change between the components from reward-specific satiety would be manifested as a change in the trade-off amounts at the IP (MRS change).

Figure 1. Design, task and behavior

(A) Test scheme: relative reward-specific satiety indicated by decreasing trade-off: with on-going consumption of both juices, the animal gave up progressively less blackcurrant juice for obtaining the same amount (0.3 ml) of grape juice while maintaining choice indifference between the black and one of the colored bundles (from green to red). The two colored curves show indifference curves estimated from choices of bundles between the colored dots. These changes suggested subjective value loss of grape juice relative to blackcurrant juice.

(B) Choice options. Each bundle contained two rewards (A, B) with independently set amounts indicated by the vertical bar position within each rectangle (higher was more). The Reference Bundle contained two preset reward amounts. The Variable Bundle contained a specific amount of one reward and an experimentally varied amount of the other reward.

(C) Task sequence: In each trial the animal contacted a central touch key for 1.0 s; then the two choice options appeared on a computer monitor. After 2.0 s, two blue spots appeared on the monitor, and the animal touched one option within 2.0 s. After touching the target for 1.0 s, the blue spot underneath the chosen bundle turned green as feedback for successful selection, and the blue spot disappeared. The computer-controlled liquid solenoid valve delivered reward A at 1.0 s after the choice, and reward B 0.5 s later.

(D) Psychophysical assessment of choice between constant Reference Bundle (0.6 ml blackcurrant juice, 0.0 ml grape juice) and Variable bundle (varying blackcurrant juice, 0.3 ml grape juice) (same bundles as in C). Green and violet curves inside green ±95% confidence intervals: initial choices; blue, orange and red curves: on-going consumption. Each curve was estimated from 80 trials (Weibull fits).

(E) Gradual changes in slope and curvature of choice indifference curves between pre-satiety (green, violet) and during increasing satiety (blue, orange, red).

At the onset of a daily experiment, the black and green bundles of Figure 1A were chosen with equal probability. When choosing the green bundle, the animal gave up 0.5 ml of blackcurrant juice (from 0.6 ml to 0.1 ml) to gain 0.3 ml of grape juice. With repeated choices, the animal consumed both juices, and the trade-off amounts changed: to gain the same 0.3 ml amount of grape juice, the animal gave up progressively less blackcurrant juice, from 0.45 ml via 0.38 ml and 0.25 ml to finally only 0.18 ml (upward arrow, from violet via blue and orange to red). Thus, the slope of the IC between the black and the colored bundles changed as the animal ‘paid’ progressively less blackcurrant juice for the same amount of grape juice. The IC also changed in shape; the curvature changed from initially convex (relative to origin; green) to concave (red), indicating that the animal was reluctant to give up any blackcurrant juice unless it received substantial amounts of grape juice. Both changes indicated a reduction of subjective reward value of grape juice relative to blackcurrant juice during on-going consumption of both juices, which suggested relative reward-specific satiety for grape juice. These IC changes constituted our test scheme for satiety.
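The trade-off arithmetic in this paragraph can be made explicit with a short sketch. It computes the amount of blackcurrant juice given up per ml of grape juice gained, using the first four trade-off amounts reported above (0.5, 0.45, 0.38 and 0.25 ml for 0.3 ml of grape juice). The function name and the use of a simple discrete ratio as a stand-in for the Marginal Rate of Substitution are illustrative assumptions, not the authors' analysis code.

```python
# Illustrative sketch: discrete trade-off ratio at choice indifference.
# The amounts are those reported in the text; all names are hypothetical.

def tradeoff_ratio(given_up_ml: float, gained_ml: float) -> float:
    """Amount of reward A given up per unit of reward B gained."""
    if gained_ml <= 0:
        raise ValueError("gained amount must be positive")
    return given_up_ml / gained_ml

GRAPE_GAINED_ML = 0.3
blackcurrant_given_up_ml = [0.5, 0.45, 0.38, 0.25]

ratios = [tradeoff_ratio(g, GRAPE_GAINED_ML) for g in blackcurrant_given_up_ml]
for given_up, ratio in zip(blackcurrant_given_up_ml, ratios):
    print(f"gave up {given_up:.2f} ml blackcurrant -> "
          f"{ratio:.2f} ml blackcurrant per ml grape")
```

A monotonically falling ratio corresponds to the flattening IC slope described above: each ml of grape juice is worth progressively less blackcurrant juice to the animal.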

On-going reward consumption affects subjective value represented by ICs

To establish ICs representing subjective reward value, we presented the monkey simultaneously with two composite stimuli on a horizontally mounted touch screen (binary choice task with two discrete, mutually exclusive and collectively exhaustive options; Figure 1B, C). Two rectangles in each stimulus represented a bundle with two reward components whose individual amounts were indicated by a vertical bar (higher was more). The two components were blackcurrant juice or blackcurrant juice with added monosodium glutamate (MSG) in all bundle types as reward A, and grape juice, strawberry juice, mango juice, water, apple juice, peach juice or grape juice with added inosine monophosphate (IMP) as reward B.

We set both rewards in the Reference Bundle to specific amounts, varied psychophysically the amount of one reward in the Variable Bundle over the whole testing range and estimated the amount of reward at which both bundles were stochastically chosen with equal probability using a Weibull fit on the choice function. These two amounts defined the IP on the two-dimensional graph. As schematized in Figure 1A, on-going juice consumption during these choices resulted in increasing amounts of blackcurrant juice being retained for gaining the same amount of grape juice at choice indifference (Figure 1D; rightward shifts of IPs from green via violet, blue and orange to red). The initial two IPs were close together (green and violet in green zone), whereas the next IPs showed substantial change, suggesting initially maintained relative subjective value between the two rewards until an outright drop occurred (blue, orange and red IPs). These changes indicated progressive value reduction of grape juice with on-going consumption.
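The IP estimation described above can be sketched as follows. The exact Weibull parameterization is not given in this section, so the sketch assumes the two-parameter cumulative form P(x) = 1 − exp(−(x/scale)^shape); the choice proportions are hypothetical and noise-free, and a coarse grid search stands in for a proper optimizer. All function and variable names are illustrative.

```python
import math

def weibull_cdf(x, scale, shape):
    """Assumed choice function: P(choose Variable Bundle) at reward amount x."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def indifference_amount(scale, shape):
    """Amount x at which the assumed Weibull function equals P = 0.5."""
    return scale * math.log(2.0) ** (1.0 / shape)

def fit_weibull(amounts, p_choice, scales, shapes):
    """Least-squares grid search over candidate (scale, shape) pairs."""
    best = None
    for scale in scales:
        for shape in shapes:
            sse = sum((weibull_cdf(x, scale, shape) - p) ** 2
                      for x, p in zip(amounts, p_choice))
            if best is None or sse < best[0]:
                best = (sse, scale, shape)
    return best[1], best[2]

# Hypothetical choice proportions generated from known parameters.
amounts = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]          # ml of the varied reward
true_scale, true_shape = 0.40, 4.0
p_choice = [weibull_cdf(x, true_scale, true_shape) for x in amounts]

scales = [round(0.05 * i, 2) for i in range(1, 21)]  # 0.05 .. 1.00 ml
shapes = [0.5 * i for i in range(1, 21)]             # 0.5 .. 10.0
scale, shape = fit_weibull(amounts, p_choice, scales, shapes)
ip_ml = indifference_amount(scale, shape)
print(f"scale={scale}, shape={shape}, indifference amount={ip_ml:.3f} ml")
```

The indifference amount, together with the fixed amount of the other reward in the Variable Bundle, gives one IP on the two-dimensional graph.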

The IPs were used to fit indifference curves (IC) along which all bundles were equally preferred (Figure 1E; see Methods; Eq. 1). For example, the green IC was fitted from bundles that were all equally preferred to each other (and equally preferred to the black bundle at top left, given previous transitivity tests; Pastor-Bernier et al. 2019). On-going juice consumption resulted in well-ordered, monotonic change of IC slope from green to red and concomitant transition from convex via linear to concave curvature, indicating relative reward-specific value reduction and satiety for grape juice.

Positioning of single-component bundles along the x- and y-axes allowed numeric value assessment without liquid interaction within bundles. In contrast to the preceding test, we held blackcurrant juice constant and psychophysically estimated the trade-in amounts of grape juice at IPs (Figure S1A-C). With on-going juice consumption, the animal gave up the same constant blackcurrant juice amount only when gaining monotonically increasing grape juice amounts at IP, thus reducing the ratio blackcurrant:grape juice and confirming the relative value reduction of grape juice. The IC curvature changed in a similar way as with the original testing scheme (Figure S1D). The ICs with Monkey B showed similar changes (Figures S1E and S1F). These tests demonstrate robust value reduction of grape juice with on-going consumption irrespective of the test scheme employed.

Consistency across different bundles

Two rhesus monkeys performed 74,659 trials with the eight bundle types (Figure 2). Given that relative reward-specific satiety would change the ratio of reward amounts at IPs, and the observation that animals sated least on blackcurrant juice, we defined the boundary between presated and sated states by the confidence interval of the initial, left-most choice function between blackcurrant juice and any reward (green in Figures 1D, S1A and S1E); any IP outside this interval would reflect value reduction. Before satiety, we estimated 408 IPs in 38,443 trials (3-5 ICs estimated from 54 IPs / bundle type, 4-16 IPs / IC); during satiety, we estimated 400 IPs in 36,216 trials (3-5 ICs from 50 IPs / bundle type, 4-18 IPs / IC; see Behavioral database in Methods for breakdown).

Figure 2. Indifference curves reflect relative reward-specific satiety for different bundle types

(A) - (F) Behavioral indifference curves (ICs) for all bundle types used in the current experiment with Monkey A. Lines show ICs fitted hyperbolically to indifference points (IP) of same color (Eq. 1). Dots in A, C, E show measured IPs (choice indifference between all bundles of same color). Dotted lines in B, D, F show ± 95% confidence intervals. Reward A is plotted on the y-axis, reward B on the x-axis. Bc, blackcurrant juice; MSG, monosodium glutamate; IMP, inosine monophosphate.

(G), (H) Monkey B.

On-going consumption of all eight bundles by both animals produced asymmetric satiety-related changes of IC shape (Figure 2). Stronger satiety for 7 of the 8 liquids (x-axis) relative to blackcurrant (y-axis) resulted in flattening of ICs and gradual transition from convexity via linearity to concavity. However, Monkey B seemed to become less sated on peach juice compared to blackcurrant juice, as suggested by steeper ICs (Figure 2H); with on-going consumption, the animal gave up more blackcurrant juice for gaining the same amount of peach juice, indicating value loss of blackcurrant juice relative to peach juice.

Numeric comparisons of IC parameters substantiated these findings. The IC slope relative to blackcurrant decreased significantly with on-going consumption of all rewards except for peach juice and strawberry juice (Figure S1G; P = 0.0156, Wilcoxon paired test). The IC curvature flattened significantly and switched from convex to concave with five of the eight tested bundle types (Figures 2; S1H; P = 0.0313). These IC changes demonstrated robust relative subjective value loss with on-going liquid consumption in a variety of bundle types.

Control for other choice variables

To confirm that bundle choice continued to vary only with the bundle rewards and did not reflect unrelated variables during satiety, we performed a logistic regression (Eq. 2). As before satiety (Pastor-Bernier et al., 2019), the probability of choosing the Variable Bundle continued to correlate positively with the amounts of both of its rewards, and inversely with the amounts of both Reference Bundle rewards (Figure S1I; VA, VB vs. RA, RB). Further, choice probability for the Variable Bundle was anticorrelated with the accumulated consumption of blackcurrant juice (MA) and positively correlated with grape juice consumption (MB). This asymmetry is explained by the trade-off at IPs; as grape juice lost more value than blackcurrant juice during satiety, the animal consumed more grape juice and gave up less blackcurrant juice. Trial number within individual trial blocks (CT) and spatial choice (CL) did not explain the choice. Thus, even with on-going consumption, the animals based their choice on the reward amounts of the bundles and the actually consumed rewards according to the experimental design; unrelated variables continued to have no significant influence.
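The structure of such a choice regression can be illustrated with a minimal sketch. Eq. 2 is not reproduced in this section, so the sketch assumes a standard logistic model with a linear combination of the named regressors; the coefficient values are hypothetical and chosen only to mirror the reported sign pattern (positive for VA, VB and MB; negative for RA, RB and MA; no effect of CT and CL). They are not the fitted values.

```python
import math

# Hypothetical coefficients mirroring the reported signs, not fitted values.
COEFFS = {
    "VA": 4.0, "VB": 3.0,      # Variable Bundle reward amounts (positive)
    "RA": -4.0, "RB": -3.0,    # Reference Bundle reward amounts (negative)
    "MA": -0.02, "MB": 0.02,   # accumulated blackcurrant / grape consumption
    "CT": 0.0, "CL": 0.0,      # trial number in block, spatial choice
}
INTERCEPT = 0.0

def p_choose_variable(regressors):
    """Probability of choosing the Variable Bundle under the assumed model."""
    z = INTERCEPT + sum(COEFFS[name] * value
                        for name, value in regressors.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical trial: amounts in ml, consumption in ml, CL coded as 0/1.
trial = {"VA": 0.2, "VB": 0.3, "RA": 0.6, "RB": 0.0,
         "MA": 10.0, "MB": 10.0, "CT": 5.0, "CL": 1.0}
more_vb = dict(trial, VB=0.5)       # more grape juice in Variable Bundle
other_side = dict(trial, CL=0.0)    # same bundles, other screen side

print(p_choose_variable(trial),
      p_choose_variable(more_vb),
      p_choose_variable(other_side))
```

Under these assumed coefficients, raising a Variable Bundle reward raises the choice probability, whereas changing the spatial position leaves it unchanged, which is the pattern the regression is meant to detect.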

Licking and liquid consumption

Licking durations are a crude means for assessing subjective reward value and could represent a mechanism-independent confirmation for the value changes seen with the ICs. Trial-by-trial time courses of licking durations with on-going consumption showed gradual and asymmetric decreases for the bundle rewards. Licking remained nearly constant for blackcurrant juice (slope = −2.86 deg, R² = 0.56; linear regression) but decreased strongly for grape juice (slope = −20.6 deg, R² = 0.50), suggesting stronger value loss for grape juice compared to blackcurrant juice (Figure 3A, B). Cumulative lick durations were significantly longer in the pre-sated state (green) compared to the sated state (violet) with the main liquids tested in both monkeys (Figure 3C-G). The reward value changes inferred from lick durations corresponded to those inferred from IC slope and curvature changes. The lick durations indicated also some value reduction of blackcurrant juice, suggesting that the differential reward-specific value changes did not derive from a single bundle reward but were relative between the two rewards.
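The trial-by-trial regression of lick duration can be sketched with ordinary least squares. The lick durations below are hypothetical values constructed to show one nearly flat and one declining time course; the units and the conversion of the slope to degrees used in the text are not specified in this section and are therefore not reproduced. Function and variable names are illustrative.

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x; returns R^2 too."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical lick durations (arbitrary units) over ten consecutive trials.
trials = list(range(1, 11))
licks_blackcurrant = [1.9, 2.0, 1.95, 1.9, 2.0, 1.85, 1.9, 1.95, 1.85, 1.9]
licks_grape = [2.0, 1.9, 1.7, 1.6, 1.5, 1.3, 1.2, 1.0, 0.9, 0.8]

slope_bc, _, r2_bc = linear_regression(trials, licks_blackcurrant)
slope_grape, _, r2_grape = linear_regression(trials, licks_grape)
print(f"blackcurrant: slope={slope_bc:.4f}, R2={r2_bc:.2f}")
print(f"grape:        slope={slope_grape:.4f}, R2={r2_grape:.2f}")
```

A clearly more negative slope for one liquid than for the other is the asymmetry that the text interprets as stronger value loss for that liquid.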

Figure 3. Anticipatory licking and differential juice consumption

(A), (B) Anticipatory licking with bundles (blackcurrant juice, grape juice) with advancing reward consumption within single test sessions. Red lines show linear regressions of lick duration across trials. Lick durations remained nearly constant for blackcurrant juice, but decreased for grape juice, indicating relative value loss for grape juice.

(C) - (G) Cumulative distributions of lick durations between bundle appearance and reward delivery for several bundles. Both animals showed significantly more trials with longer lick durations before (green) than during satiety (violet). Monkey A, blackcurrant juice: P = 5.46 x 10^-4 (Kolmogorov-Smirnov test; N = 5,740 / 5,894 pre-sated/sated trials); grape juice: P = 2.59 x 10^-9; N = 6,910 / 2,902; water: P = 3.60 x 10^-3; N = 4,143 / 2,718; strawberry juice: P = 8.66 x 10^-6; N = 4,920 / 3,281. Monkey B, mango juice: P = 2.41 x 10^-9; N = 4,730 / 7,840.

(H) Cumulative consumption of water and blackcurrant juice during 10 advancing blocks and 7,160 anchor trials (each bundle contained only one non-zero liquid). For constant blackcurrant amounts (red), the animal consumed significantly more water than blackcurrant in gradually advancing trial blocks.

(I) Exponential reduction of blackcurrant:water ratio from 0.32 (1:3) to 0.15 (1:6) after initial trials (vertical grey line). Single exponential function f(β, x) = β₁ + β₂e^(β₃x); [β₁, β₂, β₃] = [0.15, 254.78, −1.41] (β₁: final ratio, pink line; β₃: decay constant). Consecutive trial blocks for fitting included last block with stable ratio (green dots).

The IC flattening with on-going consumption indicated that the animal required increasing amounts of the more devalued reward B for giving up the same amount of the less devalued reward A at trade-off (Figures 1E, 2). This led to increasing consumption of the more devalued reward B, which seems paradoxical but can be explained by the choice properties for two-component bundles; at trade-off, the animal gave up some of the less sated reward only if it received more of the sated reward. As the animal had no control over the Reference Bundle that defined the IP, the animal ended up consuming relatively more of the devalued reward as the session advanced. For example, with the bundle (blackcurrant juice, water), the consumption of the devalued water increased relative to that of the less devalued blackcurrant juice (Figure 3H; blue vs. red; P = 5.0979 x 10^-7; Kolmogorov-Smirnov test; N = 7,160 trials). Concomitant with consumption, the ratio blackcurrant:water amounts at IP decreased, indicating that water had lost more subjective value than blackcurrant juice, as shown by exponential decay (Figure 3I). The correlation between this ratio and the combined consumption of bundles blackcurrant juice with grape juice, water, strawberry juice and mango juice was highly significant (Rho = 0.3859; P = 0.0056; Pearson).
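The exponential decay of the blackcurrant:water ratio can be evaluated directly from the parameters reported in the legend of Figure 3I. The sketch below assumes the single-exponential form β₁ + β₂e^(β₃x) given there; the unit of x is not stated in this section and is treated as arbitrary, and the function name is illustrative.

```python
import math

# [beta1, beta2, beta3] as reported in the Figure 3I legend.
BETA = (0.15, 254.78, -1.41)

def ratio_fit(x, beta=BETA):
    """Single exponential: beta1 + beta2 * exp(beta3 * x)."""
    b1, b2, b3 = beta
    return b1 + b2 * math.exp(b3 * x)

# Evaluate over an arbitrary range of x to show the approach to beta1.
for x in range(5, 12):
    print(f"x={x:2d}  ratio={ratio_fit(x):.3f}")
```

Because β₃ is negative, the exponential term vanishes for large x and the fitted ratio converges on β₁ = 0.15, the final ratio of about 1:6.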

Thus, the licking changes confirmed in a mechanism-independent manner the relative reward-specific value changes inferred from IC choices.

Neuronal test design

We used the IC changes with on-going reward consumption observed in a large variety of bundles to investigate altered value coding in OFC reward neurons. Given the shallower slopes and the less convex and more concave curvatures, we placed bundles on specific segments of the ICs that would change with on-going consumption, such that the physically unaltered bundles would end up on different ICs or IC parts. We tested neurons in either or both of two situations: (i) during choice over zero-bundle, both rewards were set to zero in one bundle, and the animal unfailingly chose the alternative, non-zero bundle; (ii) during choice between two non-zero bundles, at least one reward was set to non-zero in both bundles, and the animal chose either bundle. All tested neuronal responses were sensitive to multiple rewards and coded the value of the bundle the animal chose (chosen value). The tested responses followed the basic scheme of ICs (Pastor-Bernier et al., 2019): monotonic increase with bundles placed on different ICs (testing bundles with different value), and insignificant response variation with bundles positioned along same ICs (testing equally preferred bundles with equal value). Our satiety test involved two bundle placements that considered the IC properties: variation of blackcurrant juice while holding grape juice constant, and variation of grape juice while holding blackcurrant juice constant. Comparison of the x-y plots between the pre-sated state (Figure 4A and B) and the sated state (C and D) illustrates this test scheme. The IC flattening with satiety moved the bundle positions relative to the ICs substantially for grape juice variation (compare B and D) but very little for blackcurrant juice variation (compare A with C). Thus, tests following this design should be sensitive for detecting neuronal changes with satiety.

Figure 4. Reward-specific satiety in single OFC neuron

(A) Monotonic response increase across three indifference curves (IC) with increasing blackcurrant juice before satiety during choice over zero-bundle. Each colored dot indicates a bundle with specific amounts of blackcurrant and grape juice located on a specific IC. Responses varied monotonically and significantly across ICs with increasing blackcurrant juice (grape juice remained constant) (P = 0.0053, F = 8.88, 36 trials; 1-way Anova).

(B) As (A) but significant response variation with grape juice across ICs (blackcurrant juice remained constant) (P = 1.97141 x 10^-6, F = 39.73, 25 trials). Same colors as in (A).

(C) Despite IC change after on-going reward consumption, the three bundles remained on the same three ICs, and the neuronal response variation remained significant (P = 0.0029, F = 10.28, 36 trials). Note 29% reduction of peak response, from 15.5 to 11 impulses/s (red), and indiscriminate responses between intermediate and low bundles. Grey dotted lines repeat the ICs before satiety shown in (A).

(D) IC change from convex to concave indicates relative value reduction and satiety for grape juice. The three unchanged bundles were now located near the same, intermediate IC, indicating about equal reward value among them. The neuronal response to grape juice was reduced by 75% (from 15.2 to 3.8 imp/s at peak, red) and had lost significant variation (P = 0.1116, F = 2.68, 34 trials). Dotted ICs are from pre-sated state. Thus, while continuing to code reward value (C), the responses followed the satiety-induced IC change.

Single-neuron value-coding follows IC changes

At the beginning of daily testing, neuronal responses followed monotonically the increase of both bundle rewards, confirming value coding by the tested neuron (Figure 4A and B). With on-going reward consumption, the ICs changed; as a consequence, bundles aligned with increasing blackcurrant juice kept their position on the ICs, and the neuronal responses continued to distinguish reward value during choice over zero-bundle (Figure 4C). By contrast, as the ICs flattened and became concave, the three, physically unaltered bundles aligned with increasing grape juice were now almost on the same IC (Figure 4D), which indicated similar reward value for these bundles. Correspondingly, the neuronal responses failed to vary with grape juice amounts, and the response peak for the largest grape juice quantity had dropped by 75% as this reward was now located on the second highest IC instead of the highest IC (Figure 4D). This result is consistent with the stronger value reduction of grape juice compared to blackcurrant juice as inferred from the flattened ICs.

The neuronal changes with on-going reward consumption occurred also in choices between two non-zero bundles (Figure S2). The positions of bundles aligned with increasing blackcurrant juice remained on the same ICs as before, and the responses continued to code the value of the chosen option, as the intermediate responses to bundles on the intermediate IC suggested (Figure S2A and C; blue; dotted line for hollow dot). By contrast, the three physically unaltered bundles aligned with varying grape juice were now distributed over a narrower and lower IC range, indicating smaller differences of lower value, and the chosen value responses became correspondingly less differential and lower (Figure S2B vs. S2D, red, blue, green). Further, the responses to the physically unaltered bundle whose position had changed from intermediate to highest IC (hollow blue) now dominated all other responses (dotted blue line).

With all these changes, OFC neurons continued to code reward value with on-going reward consumption. Their responses continued to follow the amount of blackcurrant juice whose value had changed less (Figures 4A and C, and S2A and C) but were substantially altered for grape juice whose value had changed more (Figure 4B and D and Figure S2B and D). These OFC signals reflected reward-specific relative value change and satiety as inferred from the altered ICs.

Neuronal population

We investigated satiety in a total of 272 task-related OFC neurons in area 13 at 30-38 mm anterior to the interaural line and lateral 0-19 mm from the midline (which were a part of the population reported previously; Pastor-Bernier et al., 2019). Responses in 98 of these OFC neurons followed the IC scheme in any of the four task epochs (Bundle stimulus, Go, Choice or Reward) during choice over zero-bundle or choice between two non-zero bundles (Table 1). Of the 98 tested neurons, 82 showed satiety-related changes with bundles composed of blackcurrant juice (component A) and grape juice, water or mango juice (component B) (Table 2).

Table 1. Numbers of neurons tested before and during satiety

Table 2. Satiety-induced neuronal changes

We tested averaged z-scored neuronal population responses with the same scheme of bundle alignment on ICs as with single neurons. Bundles aligned with blackcurrant juice (component A) remained on the same three ICs during satiety; by contrast, with the satiety-induced IC flattening, bundles aligned with grape juice, water or mango juice (component B) that were on different ICs before satiety were now very close to a single, intermediate IC with little value variation (see left x-y maps in Figure 4A-D). The population of 101 positive value coding responses in 31 neurons continued to vary with blackcurrant juice amount during satiety in any task epoch (Bundle stimulus, Go, Choice or Reward), although with a 12% peak reduction (Figure 5A, B); response variations with reward amounts of component B in the same neurons went from significant differences before satiety to insignificant differences during satiety, with a 43% peak reduction (Figure 5C, D). Thus, the neuronal population responses confirmed the satiety pattern seen in single neurons.

Figure 5. Population responses

(A) - (D) Averaged z-scored population responses from 31 positive coding neurons showing response reduction during satiety. Each part shows responses to bundles on lowest and highest of three indifference curves (IC) during choice over zero-bundle. Data are from choice over zero-bundle, both animals, four bundle types (component A: blackcurrant juice, component B: grape juice, water or mango juice). With component A, the response differences between lowest and highest ICs were statistically significant both before satiety (P = 1.53862 x 10^-5, F = 19.28, 1-way Anova) and during satiety (P = 2.96646 x 10^-16, F = 72.18), but degraded and lost statistical significance with component B (before satiety: P = 4.39918 x 10^-16, F = 73.24; during satiety: P = 0.6796, F = 0.17). Dotted lines show ± 95% confidence intervals.

(E) Response changes in positively coding neurons in any of four task epochs (Bundle stimulus, Go, Choice and Reward; Table 2) during choice over zero-bundle. Red: significant response decrease in population reflecting satiety-induced value reduction (P = 7.15 x 10^-4; 101 responses in 31 neurons; 1-tailed t-test). Black: significant response increase (P = 0.0014; 69 responses in 21 neurons). Imp/s: impulses/second.

(F) As (E) but for negative (inverse) value coding neurons. Red: significant response increase reflecting satiety-induced value reduction (P = 0.0013; 54 responses in 15 neurons). Black: insignificant response decrease (P = 0.1274; 33 responses in 14 neurons).

(G) As (E) but for choice between two non-zero bundles. Red: response decrease (P = 0.0156; 54 responses in 16 neurons; 1-tailed t-test). Black: response increase (P = 0.0101; 57 responses in 16 neurons). Imp/s: impulses/second.

(H) As (F) but for choice between two non-zero bundles. Red: significant response increase (P = 0.0242; 31 responses in 9 neurons). Black: insignificant response decrease (P = 0.1939; 36 responses in 14 neurons).

Numeric quantification of individual responses demonstrated satiety-induced significant response reduction with positive value coding neurons and significant response increases with negative (inverse) coding neurons during choice over zero-bundle (Figure 5E and F, red) and during choice between two non-zero bundles (Figure 5G and H, red; Table 2). A minority of neurons showed either inverse changes that were difficult to reconcile with value coding (black in Figures 5E-H), or no significant changes at all.

Neuronal satiety changes indicated by classification accuracy

To confirm the changes in neuronal value coding with a different approach, we tested the extent to which a hypothetical observer could use the neuronal responses to distinguish bundles on different ICs before and during satiety. Specifically, how well could neuronal responses obtained before satiety distinguish the same bundles during satiety, and vice versa? If the neuronal bundle responses reflected the substantial IC changes, the classification of the unchanged bundles should be rather low. To this end, we trained a support vector machine (SVM) classifier on neuronal responses to randomly selected bundles positioned on the lowest and highest of three ICs, respectively. Good classifier performance was evidenced by decent discrimination with as few as five neurons and increasing accuracy with added neurons (Figure 6). The two tests provided similar accuracy drops:

Figure 6. Classifier performance demonstrates substantial satiety-induced value change

(A) Classification by support vector machine (SVM) using neuronal responses to stimuli of bundles positioned on the lowest and highest indifference curve (IC), respectively (choice over zero-bundle). Left two maps show identical bundle positions on changed ICs with on-going juice consumption. Satiety-induced value change is inferred from altered ICs (red). Right: results from classifier trained before satiety and tested for bundle distinction between the two ICs before satiety (black) and during satiety (red). The higher accuracy of bundle distinction with increasing neuron numbers attests to classifier validity. Error bars indicate standard error of the mean (SEM).

(B) As (A), but training of classifier during satiety using bundles positioned in relation to satiety-altered ICs.

First, the classifier trained on neuronal responses to bundle stimuli before satiety provided good bundle distinction before satiety during choice over zero-bundle, testifying to its accuracy. However, accuracy dropped dramatically when the classifier trained before satiety tested bundle distinction during satiety, despite continuing accuracy increase with added neurons (Figure 6A).

Second, in the reversed procedure, accuracy was high when training and testing the classifier for bundle distinction during satiety, but lower when training during satiety but testing before satiety. These accuracy differences were seen during choice over zero-bundle with neuronal responses to Bundle (Figure 6B) and Go stimuli but not during Choice and Reward epochs (Figure S3A-C). The changes were not explained by pretrial baseline changes (Figure S3D). Substantial accuracy differences were also seen in choice between two non-zero bundles during the Bundle stimulus, Go and Choice epochs but not during the Reward epoch (Figure S3E-H), again not explained by baseline changes (Figure S3I). The changes in accuracy were consistent across ongoing consumption (Figure S3J).

Together, these data demonstrate that the neuronal responses dynamically followed the substantial IC changes that reflected the value changes and satiety from on-going reward consumption.
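The cross-condition logic of this analysis (train before satiety, test before vs. during satiety) can be illustrated with a minimal sketch. Note the assumptions: the study used an SVM classifier, whereas this sketch swaps in a simple nearest-centroid classifier to stay dependency-free; all response values, neuron counts and trial counts are synthetic and hypothetical, chosen only so that the same physical bundles evoke more similar responses during satiety.

```python
import random

def simulate(rng, mean, n_trials, n_neurons, sd=0.5):
    """Synthetic population responses: one list of n_neurons values per trial."""
    return [[rng.gauss(mean, sd) for _ in range(n_neurons)] for _ in range(n_trials)]

def centroid(trials):
    """Mean response of each neuron across trials."""
    return [sum(col) / len(trials) for col in zip(*trials)]

def classify(trial, centroids):
    """Label of the nearest centroid (squared Euclidean distance)."""
    def sqdist(c):
        return sum((t - ci) ** 2 for t, ci in zip(trial, c))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def accuracy(test_sets, centroids):
    """Fraction of test trials assigned to their true IC label."""
    correct = total = 0
    for label, trials in test_sets.items():
        for trial in trials:
            correct += classify(trial, centroids) == label
            total += 1
    return correct / total

rng = random.Random(0)
n_neurons, n_trials = 10, 40

# Hypothetical mean responses to bundles on the lowest and highest IC.
train_before = {"low_IC": simulate(rng, 1.0, n_trials, n_neurons),
                "high_IC": simulate(rng, 2.0, n_trials, n_neurons)}
test_before = {"low_IC": simulate(rng, 1.0, n_trials, n_neurons),
               "high_IC": simulate(rng, 2.0, n_trials, n_neurons)}
# During satiety the physically unchanged bundles evoke more similar responses.
test_during = {"low_IC": simulate(rng, 0.8, n_trials, n_neurons),
               "high_IC": simulate(rng, 1.2, n_trials, n_neurons)}

centroids = {label: centroid(trials) for label, trials in train_before.items()}
acc_before = accuracy(test_before, centroids)
acc_during = accuracy(test_during, centroids)
```

In this toy setting, a classifier trained before satiety separates the two ICs well on held-out pre-satiety trials but performs much worse on trials recorded during satiety, which is the qualitative pattern described for Figure 6A.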

Neuronal satiety changes with single-reward bundles

Using choice options with two reward components differs in several ways from previous studies (Tremblay & Schultz 1999; Padoa-Schioppa & Assad 2006) and requires controls and additional analyses. We used the same two visual component stimuli but set only one, but different, reward in each bundle to a non-zero amount, which positioned the bundles graphically along the x-axis and y-axis but not inside the IC map; the ICs had been estimated with conventional bundles with two mostly non-zero rewards varying over the whole test range.

First, we used single-reward bundles to confirm the results obtained with conventional bundles. The responses of the neuron shown in Figure 7A, B distinguished both rewards well during choice over zero-bundle before satiety. With on-going consumption of both rewards, the ICs flattened, preserving the blackcurrant juice positions on the ICs (Figure 7C) but changing the position of the two physically unchanged water amounts relative to the ICs (Figure 7D). The neuron kept discriminating blackcurrant juice amounts during satiety (Figure 7C). However, with the satiety-induced IC change, the large water amount was now positioned on a lower IC than before (Figure 7D, red on x-axis), which was approximately the IC of the small blackcurrant amount (blue on y-axis). Correspondingly, the neuronal activity with the large water amount lost its peak (reduction by 50%) and was now very similar to the activity with the small blackcurrant amount (Figure 7C, D, red dotted vs. blue solid arrows). Further, the position of the small water amount was now below its original IC (blue on x-axis), and the neuron, with its lost response, failed to distinguish between the two water amounts. Thus, the neuronal changes with single-reward bundles followed the satiety-induced IC changes, indicating that the neuronal satiety changes reported above were not specific for multi-component bundles.

Figure 7. Reward-specific satiety with single-reward bundles

(A-D) Responses of same single neuron before and during satiety. Each bundle contained specific non-zero amounts of only blackcurrant juice or only water (colored dots on indifference curves, ICs) and was tested during choice over zero-bundle.

(A) Significant response increase across two ICs with increasing blackcurrant juice (Bc) before satiety (water remained zero) (red vs. blue; P = 0.0091, F = 6.92, 23 trials; 1-way Anova).

(B) As (A) but significant response variation with increasing water across two ICs (blackcurrant juice remained zero) (P = 0.0113, F = 7.32, 31 trials). Same colors as (A).

(C) Despite IC flattening after on-going reward consumption, the two bundles with blackcurrant juice variation remained on the same two ICs. The neuronal response variation remained significant (P = 0.002, F = 11.04, 40 trials), and the peak response was only slightly reduced (red). Dotted ICs are from the pre-sated state.

(D) IC flattening after on-going reward consumption indicates relative value reduction and satiety for water. The two unchanged bundles with water variation were now located below and at the IC. The neuronal response was substantially reduced by 50% (red) and had lost significant variation (P = 0.4337, F = 0.64, 40 trials). Further, the large-water bundle (dashed red line) now elicited a response similar to that of the low-blackcurrant bundle, which was now located on the same IC (solid blue line). Thus, while continuing to code reward value (C), the responses followed the satiety-induced IC change.

(E) Polar and vectorial population plots for neuronal responses for bundle (blackcurrant juice, grape juice) (black, red), and vector plots for behavioural choice over zero-bundle (green). Neuronal vector slopes were 35 deg before satiety and 62 deg during satiety, using all significantly positive and normalized negative (inverse) coding responses from all four task epochs; all included responses followed the IC scheme. Dots refer to neuronal responses, vectors represent averages from behavioral choices (green; dotted lines: 95% confidence intervals) and neuronal responses (red), based on Eqs. 1 and 3, respectively (see Methods). Neuronal correlation coefficients (β’s) on axes refer to Eq. 3.

(F) As (E) but for choice between two non-zero bundles. Neuronal vector slopes were 38 deg before and 45 deg during satiety.

(G), (H) As (E, F) but for bundle (blackcurrant juice, water).

(I) As (E) but for bundle (blackcurrant juice, mango juice).

(J) Correlation between rectified neuronal and behavioral IC slopes during satiety in all tested neurons (rho = 0.604; P = 8 × 10^-6, Pearson correlation; rho = 0.595, P = 2 × 10^-5, Spearman rank-correlation; N = 90 responses during choice between two non-zero bundles).

Next we used single-reward bundles for more quantification. We plotted neuronal population vectors from dots on polar plots that showed the influence of each of the two rewards on the neuronal response (Figure 7E-I). The usually unequal value of the two rewards was manifested as deviation from the diagonal, and the relative value change with on-going consumption was expressed as change in the neuronal population vector. For example, in tests with the bundle (blackcurrant juice, grape juice), the elevation angle of the neuronal population vector increased from 35 deg before satiety to 62 deg during satiety in choice over zero-reward bundle (Figure 7E, red), and from 38 deg to 45 deg with choice between two non-zero bundles (Figure 7F). This change indicated value reduction of grape juice (plotted on x-axis) relative to blackcurrant juice (y-axis) with on-going consumption. Further, the shorter neuronal vectors during satiety indicated reduced overall responding (red). Similar changes indicated reduced value coding for water and mango juice (x-axis) relative to blackcurrant juice (y-axis) (Figure 7G-I). These neuronal changes were paralleled by changes of the behavioral vector (Figure 7E-I, green). Both before satiety and during satiety, the neuronal vectors (red) were within the confidence intervals of the behavioral vectors (green). An analysis of IC slopes during satiety confirmed the neuronal-behavioral correspondence seen with the vector plots. Estimated from regression coefficient ratios (−β₂ / β₁) (Eq. 3) and (−b / a) (Eq. 1), the slopes of the linear neuronal ICs of single-reward bundles correlated well with the slopes of linear behavioral ICs (Figure 7J). Thus, the vector analysis of population responses confirmed and quantified the reward value changes with on-going consumption seen with the single-neuron responses to bundles aligned to ICs.
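As a minimal sketch of the vector analysis, the elevation angle and length of a population vector can be computed from the per-response regression coefficients for the two rewards. The sketch assumes that the coefficient for reward B is plotted on the x-axis and the coefficient for blackcurrant juice on the y-axis, and that the population vector is the average of the dots; the coefficient pairs are hypothetical and are not data from the study.

```python
import math

def population_vector(betas):
    """Elevation angle (deg) and length of the mean of (beta_x, beta_y) pairs."""
    n = len(betas)
    mean_x = sum(bx for bx, _ in betas) / n
    mean_y = sum(by for _, by in betas) / n
    angle = math.degrees(math.atan2(mean_y, mean_x))
    length = math.hypot(mean_x, mean_y)
    return angle, length

# Hypothetical (beta for reward B, beta for blackcurrant juice) pairs.
before = [(0.8, 0.6), (0.9, 0.6), (0.7, 0.5)]
during = [(0.20, 0.40), (0.25, 0.45), (0.15, 0.35)]

angle_before, length_before = population_vector(before)
angle_during, length_during = population_vector(during)
```

A larger elevation angle during satiety corresponds to a weaker influence of the x-axis reward relative to blackcurrant juice, and a shorter vector corresponds to reduced overall responding.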

Taken together, the results with single-reward bundles confirmed the findings with our conventional two-reward bundles: neuronal value responses changed with on-going consumption in good correlation with behavioral changes, indicating a neuronal correlate for relative, reward-specific satiety.

DISCUSSION

This study used bundles of two rewards and found changes in value coding of OFC neurons during on-going reward consumption that indicated relative reward-specific satiety. Behavioral choices were captured by graphic ICs that represented relative subjective values of two juice rewards in a conceptually rigorous manner. The ICs changed with on-going reward consumption during individual experimental sessions in a characteristic manner that indicated an orderly change in reward value and suggested relative, reward-specific satiety (Figures 1 and 2). Satiety was mechanism-independently suggested by changes in licking behavior (Figure 3). Specifically, on-going consumption of both bundle rewards resulted in progressive flattening of the ICs, which indicated value loss for one bundle reward relative to the other bundle reward. Our preceding study had established neuronal chosen value responses in OFC that were sensitive to multiple rewards and followed the animal’s rational choice of two-reward bundles, including completeness, transitivity and independence from option set size (Pastor-Bernier et al., 2019). The current study shows that such OFC value responses matched the IC changes during relative reward-specific satiety. Specifically, the responses were similar with all equally valued rewards on flattened ICs (Figures 4 and 5). Machine learning classifiers predicting bundle discrimination from neuronal responses confirmed accurate reward value coding both before and during satiety and demonstrated the substantial nature of the neuronal changes (Figure 6). Responses to conventional single rewards confirmed these satiety-induced changes (Figure 7). These data from a particularly sensitive reward value test demonstrate that neuronal responses in OFC follow the value alterations induced by reward-specific satiety.

The current demonstration of systematically altered reward value coding with reward-specific satiety builds on previous studies on monkey OFC neurons that investigated satiety in a more basic manner. There are notably the studies from Rolls’ laboratory in which monkeys were presented with syringes or tubes containing various fruit juices; rating scales were used to assess behavioral acceptance or rejection of these juices after bolus injections or on-going consumption (Rolls et al. 1989; Critchley & Rolls 1996). The studies report on OFC neurons that responded to several juices and lost the response only for the particular juice on which the animal was sated. The response reduction with sensory-specific satiety in OFC contrasts with Rolls’ studies on earlier stages of the gustatory system, including the nucleus of the solitary tract, the frontal opercular taste cortex, and the insular taste cortex, where no such satiety-related changes were found (Yaxley et al. 1985; Yaxley et al. 1988; Rolls et al. 1988). These studies were the first to describe neuronal correlates of sensory-specific satiety, although it is unknown whether the neurons coded subjective reward value inferred from choices in the absence of satiety or covaried with other crucial aspects of reward value, such as reward amount and behavioral preference that formed the basis for our study. Another study found reward response increases with satiety in some OFC neurons (Pritchard et al. 2008), which might correspond to some of our results that were incompatible with reward value coding (satiety-induced response increases in positive value coding neurons, satiety-induced response decreases in inverse value coding neurons; Figure 5E-H).

While reward-specific satiety affects subjective reward value, on-going consumption also induces a general reduction of arousal, attention and motivation. Such general satiety affects the processing of all rewards in an environment or context in which some satiation occurs, both for rewards on which the animal has been sated and for those on which the animal has not been sated. General satiety effects cannot be distinguished from reward-specific satiety when testing only a single reward, and the effects may be attributed to motivation, as in the case of reduced dopamine responses in mice that received food pellets for extended periods of time (Rossi et al. 2013). Nevertheless, even with testing restricted to a single reward, dopamine reward signals may be susceptible to genuine satiety, as the reduction of human midbrain responses with on-going consumption of Swiss chocolate suggests (Small et al., 2001). In our results, the shorter neuronal population vectors might indicate an effect of general satiety on neuronal responses, in addition to the reward-specific satiety suggested by the changes in vector angle (Figure 7E-I). However, general satiety cannot explain our asymmetric behavioral and neuronal effects that indicate relative reward-specific value changes in OFC.

The observed increase in consumption of sated liquids like water (Figure 3H) seemed to contradict earlier findings and the general intuition that satiety would rather reduce consumption of rewards on which an animal is sated (Rolls et al. 1989; Critchley & Rolls 1996). Differences in study design might explain these discrepancies. When an animal has the choice between a sated and a non-sated reward, or the choice between accepting and not accepting a reward, it would naturally prefer the non-sated reward which by definition would have more value. This was the case in the cited earlier studies. By contrast, in our study, the animal chose between two bundles that each had two rewards on which the animal was differently sated. As the animal was still interested in obtaining the less sated reward, it would inadvertently also receive the other, more sated bundle reward. The animal had no control over the setting of the Reference Bundle against which it would choose the alternative bundle. At the IP, the animal had the choice to give up some of the non-sated reward in order to receive more of the sated reward. If the animal was still interested in a less sated reward, it might give up a limited amount of it if it were to receive a lot more of the other reward as compensation (as long as it did not outright reject it, which was not the case). This trade-off was represented by the increasing concavity of the ICs with on-going consumption, which indicated that really large amounts of the more devalued reward B were required for giving up the less devalued reward A (Figures 1E, 2). Outright rejection of reward B would be represented not by a downward sloped IC but by an upward sloped IC, which was observed in our animals with lemon juice, yoghourt and saline (Pastor-Bernier et al., 2017) but not with the currently used rewards; such upward sloped ICs indicate that an animal needed to be ‘bribed’ with more reward for accepting these normally rejected rewards. By contrast, in the current satiety experiment, the animal inadvertently consumed more of the sated reward during satiety compared to before, and the maintained downward IC slope indicated that the animal was not entirely averse to the sated reward.

STAR METHODS

Animals

Two adult male macaque monkeys (Macaca mulatta; Monkey A, Monkey B), weighing 11.0 kg and 10.0 kg, respectively, were used in these experiments that had already yielded behavioral and neuronal data without satiety (Pastor-Bernier et al., 2017; Pastor-Bernier et al., 2019). Neither animal had been used in any other study.

Ethical approval

This research has been ethically reviewed, approved, regulated and supervised by the following institutions and individuals in the UK and at the University of Cambridge (UCam): the UK Home Office implementing the Animals (Scientific Procedures) Act 1986 with Amendment Regulations 2012, the local UK Home Office Inspector, the UK Animals in Science Committee (ASC), the UK National Centre for Replacement, Refinement and Reduction of Animal Experiments (NC3Rs), the UCam Animal Welfare and Ethical Review Body (AWERB), the Certificate Holder of the UCam Biomedical Service (UBS), the UCam Welfare Officer, the UCam Governance and Strategy Committee, the UCam Named Veterinary Surgeon (NVS), and the UCam Named Animal Care and Welfare Officer (NACWO).

General behavior

The animals were habituated during several months to sit in a primate chair (Crist Instruments) for a few hours each working day. They were trained in a specific, computer-controlled behavioral task in which they contacted visual stimuli on a horizontally mounted touch-sensitive computer monitor (Elo) located 30 cm in front of them. The animal’s eye position in the horizontal and vertical planes was monitored with a non-invasive infrared oculometer (Iscan). Matlab software (Mathworks) running on a Microsoft Windows XP computer controlled the behavior and collected, analyzed and presented the data on-line. A solenoid valve (ASCO, SCB262C068) controlled by the same Windows computer served to deliver specific liquid amounts. A Microsoft SQL Server 2008 Database served for Matlab off-line data analysis. Following task training for about 6 months, animals were surgically implanted with a recording chamber for electrophysiological recordings, which typically lasted for another 6-10 months.

Stimuli, task and rewards

A computer touch monitor presented the subject with two visual stimuli (4° apart) representing two bundles, a Reference Bundle and a Variable Bundle (Figure 1A). Each bundle contained two rewards (Component reward A: violet rectangle, and component reward B: green rectangle) with independently set amounts indicated by the vertical bar position within each rectangle (higher was more). The Reference Bundle contained two preset reward amounts that were fixed for a given block of trials. The Variable Bundle contained a specifically set amount of one reward and an experimentally varied amount of the other reward. The task sequence (Figure 1B) has been described in detail (Pastor-Bernier et al., 2017; Pastor-Bernier et al., 2019) and is summarized below. Reward A in all bundles was blackcurrant juice, or blackcurrant juice with added monosodium glutamate (MSG); reward B was grape juice, strawberry juice, mango juice, water, apple juice, peach juice, or grape juice with added inosine monophosphate (IMP).

Each trial began when the animal contacted a centrally located touch sensitive key for 1.0 s after a pseudorandom inter-trial interval of 1.6 ± 0.25 s. Then two bundles appeared and remained on the screen for 2.0 s, after which two blue spots appeared as GO stimulus underneath the bundles, upon which the animal released the touch key and touched the blue spot of its choice within 2.0 s. After a hold time of 1.0 s, the chosen blue spot turned green and the unchosen blue spot disappeared. Simultaneously a white frame around the chosen bundle appeared providing feedback for successful choice. The computer-controlled liquid solenoid valve delivered liquid A at 1.0 s after the choice, followed 0.5 s later by liquid B (except when using peach juice as reward B; here the sequence was reversed: liquid B was delivered first, then 0.5 s later liquid A, blackcurrant juice).

Estimation of behavioral ICs

The behavioral method used to obtain an IP from stochastic choice has been presented in full detail (Pastor-Bernier et al., 2017; Pastor-Bernier et al., 2019). With two bundle options, the animal chose between the pre-set Reference Bundle (left in Figure 1A) and the Variable Bundle (right) in repeated trials. Thus, the constant Reference Bundle provided a stable reference against the changing bundle composition in the Variable Bundle. We set one reward in the Variable Bundle to one unit (> 0.1 ml) above the amount of the same reward in the Reference Bundle, while pseudorandomly varying the amount of the other reward widely. The variation of the animal’s repeated choice with that single varying reward allowed us to construct a full psychophysical function and estimate the IP from a Weibull fit (point of subjective equivalence; P = 0.5 choice of each bundle). We obtained each IP from a total of 80 trials (2 left-right stimulus positions with 5 equally spaced reward amounts in 8 trials). To avoid known adaptations in OFC neurons (Tremblay and Schultz, 1999; Padoa-Schioppa, 2009; Kobayashi et al., 2010; Rustichini et al., 2017), we always tested the full reward range of the experiment.
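The IP estimation can be sketched as follows, under stated assumptions: the psychophysical function is taken to be a two-parameter cumulative Weibull, P(x) = 1 − exp(−(x / λ)^k), where x is the varied reward amount in the Variable Bundle, and the fit is a coarse least-squares grid search. The fitting procedure actually used is described in the cited studies and may differ; the amounts, parameter values and choice proportions below are synthetic.

```python
import math

def weibull(x, lam, k):
    """Cumulative Weibull: probability of choosing the Variable Bundle."""
    return 1.0 - math.exp(-((x / lam) ** k))

def fit_weibull(amounts, p_choice):
    """Least-squares grid search over scale (lam) and shape (k)."""
    best = None
    for lam in (i / 100 for i in range(5, 101)):
        for k in (j / 10 for j in range(5, 81)):
            sse = sum((weibull(x, lam, k) - p) ** 2
                      for x, p in zip(amounts, p_choice))
            if best is None or sse < best[0]:
                best = (sse, lam, k)
    return best[1], best[2]

def indifference_point(lam, k):
    """Amount x at which P(x) = 0.5 (point of subjective equivalence)."""
    return lam * math.log(2.0) ** (1.0 / k)

# Five equally spaced amounts (ml) of the varied reward; synthetic, noise-free
# choice proportions generated from hypothetical parameters lam=0.4, k=3.
amounts = [0.10, 0.25, 0.40, 0.55, 0.70]
p_choice = [weibull(x, 0.4, 3.0) for x in amounts]

lam_hat, k_hat = fit_weibull(amounts, p_choice)
ip = indifference_point(lam_hat, k_hat)
```

With real data, each proportion would come from the repeated trials at one amount (2 stimulus positions × 8 trials in the design above), and a maximum-likelihood fit would usually be preferred over this grid search.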

To obtain an IC, we fit a series of IPs with a hyperbolic function using weighted least mean squares:

d = ay + bx + cxy (Eq. 1)

with y and x as milliliter amount of reward A (plotted at y-axis on 2D graph, Figure 1A and 1E) and reward B (plotted at x-axis), a and b as weights of the influence of the reward amounts plotted on the y- and x-axes, respectively, c as curvature, and d as constant. A potent reward that contributes strongly to the choice of the bundle would have a large weight (high coefficient a or b), whereas a less potent reward would have lower weight coefficients. Thus, with the potent (more weight) reward plotted on the x-axis, and the less potent (less weight) reward plotted on the y-axis, choice indifference between them (IC) would occur with smaller milliliter amounts on the x-axis compared to the y-axis. Hence, the IC slope would be steeper than the diagonal line (see Figure 1A, D). By resolving Eq. 1 as y = −(b / a) * x, the IC slope would be the ratio of the coefficients that reflect the weights of the rewards: −b / a. With a higher potency of reward B (x-axis) compared to reward A (y-axis), the rectified IC slope would be larger than 1. Relatively stronger satiety for reward B (x-axis) compared to reward A (y-axis) would reduce the weight of reward B, reduce the absolute value of the ratio −b / a, and flatten the IC slope. Thus, the IC slope −b / a describes the relative impact of the two bundle rewards (reflecting the value ratio between the two rewards), whereas the weights (a and b) describe the influence of the reward amounts. The hyperbolic function can be written in an equivalent form to the regression with interaction used for analysing neuronal responses (β₀ = β₁A + β₂B + β₃AB; see Eq. 3 below).
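A minimal sketch of how the IC and its slope follow from the hyperbolic model, assuming the form d = ay + bx + cxy implied by the definitions above (a weighting reward A on the y-axis, b weighting reward B on the x-axis). The coefficient values are hypothetical and only illustrate how a reduced weight b flattens the IC.

```python
def ic_amount_a(x, a, b, c, d):
    """Amount of reward A (y) that keeps bundle (y, x) on the IC d = a*y + b*x + c*x*y."""
    denom = a + c * x
    if denom == 0:
        raise ValueError("IC is undefined where a + c*x equals zero")
    return (d - b * x) / denom

def ic_slope(a, b):
    """Slope of the linear IC (c = 0): y = d/a - (b/a) * x."""
    return -b / a

# Hypothetical coefficients: reward B (x-axis) loses weight with satiety.
before = dict(a=1.0, b=1.5, c=0.0, d=0.6)
during = dict(a=1.0, b=0.5, c=0.0, d=0.6)

slope_before = ic_slope(before["a"], before["b"])
slope_during = ic_slope(during["a"], during["b"])

# Reward A amount needed at 0.2 ml of reward B to stay on the same IC.
y_before = ic_amount_a(0.2, **before)
y_during = ic_amount_a(0.2, **during)
```

In this example, the rectified slope drops from 1.5 to 0.5, so after satiety a given amount of reward B compensates for less of reward A, which corresponds to a flatter IC.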

Definition and criteria for pre-sated and sated states

Satiety was detected by psychophysical choice functions exceeding the confidence intervals of initial tests (see Figures 1C, S1A and S1E); this measure indicated a changed value relation between the two bundle rewards. More specifically, the gradual effect of satiety on choice preference was identified by tracking the IPs as consumption advanced across blocks of 80 trials. The Weibull-fitted IPs were obtained psychophysically for fixed and equally spaced amounts of reward B. Changes in relative value of the two bundle rewards were assessed with interleaved anchor trials in choices between bundles with only one non-zero reward: bundle (non-zero blackcurrant juice; no reward B) vs. bundle (no blackcurrant juice; non-zero reward B), using any reward B. To aggregate IP data across sessions and compensate for across-session variability, we normalized the reward value ratio to the first titration block in all sessions. We then compared the normalized distributions of IPs within the CI of the first block with the distributions of IPs exceeding the CI of the first block.

Behavioral database

In the pre-sated state, we estimated 56 IPs for fitting 5 ICs with the bundle (blackcurrant juice, grape juice), 68 IPs for 4 ICs with bundle (blackcurrant juice, strawberry juice), 58 IPs for 4 ICs with bundle (blackcurrant juice, water), 38 IPs for 5 ICs with bundle (blackcurrant juice, mango juice) (Monkey A), 65 IPs for 5 ICs with bundle (blackcurrant+MSG, grape+IMP), 55 IPs for 5 ICs with bundle (blackcurrant juice, mango juice), 45 IPs for 3 ICs with bundle (blackcurrant juice, apple juice), and 40 IPs for 2 ICs with bundle (blackcurrant juice, peach juice) (Monkey B).

In the sated state, we estimated 52 IPs for 3 ICs with bundle (blackcurrant juice, grape juice), 37 IPs for 4 ICs with bundle (blackcurrant juice, strawberry juice), 63 IPs for 4 ICs with bundle (blackcurrant juice, water), 48 IPs for 5 ICs with bundle (blackcurrant juice, mango juice) (Monkey A), 49 IPs for 4 ICs with bundle (blackcurrant+MSG, grape+IMP), 52 IPs for 4 ICs with bundle (blackcurrant juice, mango juice), 55 IPs for 3 ICs with bundle (blackcurrant juice, apple juice), and 44 IPs for 2 ICs with bundle (blackcurrant juice, peach juice) (Monkey B).

Control regressions for behavioral choice

To test whether the animal’s choice reflected the amount of the bundle rewards during satiety, rather than other, unintended variables such as spatial bias, we used the logistic regression with P(V) as probability of choice of Variable Bundle, β₀ as offset coefficient, β₁ − β₇ as correlation strength (regression slope) coefficients indicating the influence of the respective regressor, CT as trial number within block of consecutive trials, RA as amount of reward A of Reference Bundle, RB as amount of reward B of Reference Bundle, VA as amount of reward A of Variable Bundle, VB as amount of reward B of Variable Bundle, CL as choice of any bundle stimulus presented at the left, MA as consumed amount of reward A, MB as consumed amount of reward B, and ε as error. We used a binomial fit with logit link function to obtain standardized β coefficients. Choices over zero-reward bundles were excluded in the regression to avoid internal correlation between value and consumption.
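A minimal sketch of the control model, assuming the standard logit form in which the probability of choosing the Variable Bundle is the logistic transform of an offset plus a weighted sum of the regressors named above. The coefficient values and the trial are hypothetical; in practice the coefficients would be estimated by a binomial fit on the recorded choices.

```python
import math

# Regressor names as defined in the text above.
REGRESSORS = ("CT", "RA", "RB", "VA", "VB", "CL", "MA", "MB")

def p_choose_variable(beta0, betas, trial):
    """P(V) under a logit link: 1 / (1 + exp(-(beta0 + sum_i beta_i * x_i)))."""
    linear = beta0 + sum(betas[name] * trial[name] for name in REGRESSORS)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients: choice driven by reward amounts only,
# with no influence of trial number, left position or consumed amounts.
betas = {"CT": 0.0, "RA": -1.0, "RB": -0.5, "VA": 1.0, "VB": 0.5,
         "CL": 0.0, "MA": 0.0, "MB": 0.0}

# Hypothetical trial in which both bundles contain identical amounts (ml).
equal_trial = {"CT": 0.1, "RA": 0.3, "RB": 0.2, "VA": 0.3, "VB": 0.2,
               "CL": 1.0, "MA": 0.5, "MB": 0.4}
# Same trial but with more reward B in the Variable Bundle.
richer_trial = dict(equal_trial, VB=0.6)

p_equal = p_choose_variable(0.0, betas, equal_trial)
p_richer = p_choose_variable(0.0, betas, richer_trial)
```

With these made-up coefficients, identical bundles yield a choice probability of about 0.5, and adding reward B to the Variable Bundle raises it; a non-zero coefficient for CL would instead indicate a spatial bias.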

Licking

Licking was monitored with an infrared optosensor positioned below the juice spout (V6AP; STM Sensors). Anticipatory licking durations were measured between the appearance of the bundle stimuli and delivery of the first reward liquid (approximate duration 5 - 6 s) in bundles containing only one non-zero component reward with advancing trials in satiety and within single working sessions. Licking data were collected with four different bundles, namely (blackcurrant juice, grape juice), (blackcurrant juice, water), (blackcurrant juice, strawberry juice) and (blackcurrant juice, mango juice).

Surgical procedures and electrophysiology

As described before for the same animals (Pastor-Bernier et al., 2019), a head-restraining device and a recording chamber (40 x 40 mm, Gray Matter) were implanted on the skull under full general anesthesia and aseptic conditions. The stereotactic coordinates of the chamber enabled neuronal recordings of the orbitofrontal cortex (OFC) (Paxinos et al., 2000). We located the OFC from bone marks on coronal and sagittal radiographs taken with a guide cannula inserted at a known coordinate in reference to the implanted chamber, using a medio-lateral vertical and a 20° forward directed approach aiming for area 13. Monkey A provided data from the left hemisphere, Monkey B from the right hemisphere, via a craniotomy in each animal ranging from Anterior 30 to 38, and Lateral 0 to 19. We conducted single-neuron electrophysiological recordings using both custom made glass-coated tungsten electrodes (Merrill & Ainsworth, 1972), and commercial electrodes (Alpha Omega, Israel) (impedance of about 1 MOhm at 1 kHz). Electrodes were inserted into the cortex with a multi-electrode drive (NaN drive, Israel) with the same angled approach as used for the radiography. Neuronal signals were collected at 20 kHz, amplified using conventional differential amplifiers (CED 1902 Cambridge Electronics Design) and band-pass filtered (high: 300 Hz, low: 5 kHz). We used a Schmitt-trigger to digitize the analog neuronal signal online into a computer-compatible TTL signal. However, we did not use the Schmitt-trigger to separate simultaneous recordings from multiple neurons, in which case we searched for another recording from only a single neuron, or we occasionally stored the data in analog form for off-line separation by dedicated software (Plexon offline sorter). An infrared eye tracking system monitored eye position (ETL200; ISCAN), with temperature check on an experimenter’s hand at the approximate position of the animal’s head.
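The principle of Schmitt-trigger digitization can be illustrated with a software analogue. This is only a sketch of the hysteresis idea: the study used a hardware trigger, and the threshold values and sample trace below are hypothetical. The output switches high when the signal reaches an upper threshold and returns low only after the signal falls to a lower threshold, so that noise near a single threshold does not produce spurious pulses.

```python
def schmitt_trigger(samples, upper, lower):
    """Convert an analog trace into a 0/1 (TTL-like) trace with hysteresis.

    Returns the digital trace and the sample indices of rising edges,
    which would mark detected impulses.
    """
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    state = False
    output, rising_edges = [], []
    for i, value in enumerate(samples):
        if not state and value >= upper:
            state = True
            rising_edges.append(i)
        elif state and value <= lower:
            state = False
        output.append(1 if state else 0)
    return output, rising_edges

# Hypothetical analog samples (arbitrary units) and thresholds.
signal = [0.0, 0.2, 0.8, 0.6, 0.9, 0.1, 0.75, 0.0]
ttl, impulses = schmitt_trigger(signal, upper=0.7, lower=0.3)
```

In this trace, the dip to 0.6 at the fourth sample does not end the first pulse because it stays above the lower threshold, so only two impulses are detected.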

Definition for neurons following the revealed preference scheme

We analysed single-neuron activity during four task epochs vs. Pretrial control (1 s): visual Bundle stimulus (2 s), Go signal (1 s), Choice (1 s) and Reward (2 s, starting with reward A, followed 0.5 s later by reward B, thus covering both rewards). To establish neuronal relationships to these task epochs, we compared the activity in each neuron during each task epoch separately against the Pretrial control epoch using the paired Wilcoxon test (P < 0.01). A neuron was considered task-related if its activity in at least one of the four task epochs differed significantly from the activity during the Pretrial control epoch.

Responses of individual neurons should follow the scheme of two-dimensional ICs that characterizes revealed behavioral preferences for two-dimensional bundles. Specifically, the responses should comply with three characteristics defined previously (Pastor-Bernier et al., 2019).

(Characteristic 1) Neuronal responses should change monotonically with increasing behavioral preference across behavioral ICs, irrespective of bundle composition. Such monotonic neuronal response changes should reflect increasing amounts of one or both bundle rewards, assuming a positive monotonic subjective value function on reward amount.

(Characteristic 2) Neuronal responses should vary insignificantly for all equally preferred bundles positioned along a same behavioral IC, despite different physical bundle composition.

(Characteristic 3) Neuronal responses should follow the IC slope and the non-linear curvature of behavioral ICs. The IC slope reflects the value relationship between the two bundle rewards, indicating the revealed preference relation between the two rewards of a bundle, and thus the value of one reward relative to a common reference reward.

We used a combination of three statistical tests to assess these characteristics.

Characteristic 1: To capture the change across ICs in the most conservative, assumption-free manner possible, we used a simple linear regression on each Wilcoxon-identified task-related response:

y = β₀ + β₁A + β₂B + β₃AB + ε (Eq. 3)

with y as neuronal response in any of the four task epochs, measured as impulses/s and z-scored normalized to the Pretrial control epoch of 1.0 s (z-scoring of neuronal responses applied to all regressions listed below), A and B as milliliter amount of reward A (plotted at y-axis) and reward B (x-axis), respectively, β₀ as offset coefficient, β₁ and β₂ as neuronal regression coefficients, β₃ as coefficient of the interaction term AB, and ε as error consisting of the sum of individual errors of each expression (err₀, err₁, err₂, err₃ for offset and respective regressors 1-3). The regression defined by Eq. 3 is equivalent to the hyperbolic model used for fitting behavioral ICs (d = ay + bx + cxy; Eq. 1).

The coefficients β₁ and β₂ needed to be either both positive (indicating positive neuronal relationship, higher neuronal activity reflecting more reward quantity) or both negative (inverse neuronal relationship) to reflect the additive nature of the individual bundle components giving rise to revealed preference (P < 0.05, unless otherwise stated; t-test).
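The regression and the sign criterion can be illustrated with a minimal sketch. It assumes the regression has an offset, the two reward amounts and their interaction term AB as regressors (the form implied by coefficients β₀ to β₃ in the text), fits it by ordinary least squares, and checks only the sign agreement of β₁ and β₂. The function names and the noiseless example data are hypothetical, and the t-tests on the coefficients are omitted because they require a statistics library.

```python
from typing import List, Sequence


def solve_linear_system(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_interaction_regression(a_ml: Sequence[float], b_ml: Sequence[float],
                               y: Sequence[float]) -> List[float]:
    """Return least-squares [b0, b1, b2, b3] for y = b0 + b1*A + b2*B + b3*A*B."""
    design = [[1.0, a, b, a * b] for a, b in zip(a_ml, b_ml)]
    k = 4
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def same_sign(beta1: float, beta2: float) -> bool:
    """True when both amount coefficients are positive or both are negative."""
    return beta1 * beta2 > 0


# Hypothetical noiseless example on a 3 x 3 grid of reward amounts (ml).
amounts_a = [a for a in (0.1, 0.3, 0.5) for _ in range(3)]
amounts_b = [b for _ in range(3) for b in (0.2, 0.4, 0.6)]
responses = [0.5 + 2.0 * a + 1.5 * b + 0.8 * a * b
             for a, b in zip(amounts_a, amounts_b)]
betas = fit_interaction_regression(amounts_a, amounts_b, responses)
print([round(v, 6) for v in betas], same_sign(betas[1], betas[2]))
```

With real, noisy responses the sign check would be applied only to coefficients that also pass the significance criterion described above.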

This linear regression assessed the degree of linear monotonicity of neuronal response change across ICs (P < 0.05 for β coefficients; t-test). Further, all significant positive or negative response changes identified by Eq. 3 needed to be also significant in a Spearman rank-correlation test that assessed ordinal monotonicity of response change across ICs without assuming linearity and numeric scale (P < 0.05).
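As an illustration of the ordinal check, the following sketch computes a Spearman rank correlation between IC rank and neuronal response as the Pearson correlation of tie-averaged ranks. The helper names and example values are hypothetical, and the associated P value is not computed here.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: two bundles on each of three ICs (ranks 1 to 3).
ic_rank = [1, 1, 2, 2, 3, 3]
z_response = [0.2, 0.5, 1.1, 0.9, 2.4, 3.0]
rho = spearman_rho(ic_rank, z_response)
print(round(rho, 4))
```

Because only ranks enter the calculation, a response that rises across ICs in a strongly nonlinear way still yields a rho close to 1.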

Characteristics 1 and 2

To assess the two-dimensional across/along IC scheme in a direct and intuitive way, and without assuming monotonicity, linearity and numeric scale, we used a two-factor Anova on each Wilcoxon-identified task-related response that was significant for both regressors in Eq. 3; the factors were across-IC (ascending rank order of behavioral ICs) and along-IC (same rank order of behavioral IC). To be a candidate for following the IC scheme of revealed preferences, changes across-IC should be significant (P < 0.05), changes along-IC should be insignificant, and their interaction should be insignificant.
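The across/along logic of the two-factor Anova can be sketched as follows for a balanced design with replication. The data layout (`cells[i][j]` holding the responses for IC rank i and along-IC position j) and the example values are assumptions made for illustration; only the F statistics are computed, since converting them to P values needs the F distribution from a statistics library.

```python
from typing import Dict, List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def two_way_anova_f(cells: List[List[List[float]]]) -> Dict[str, float]:
    """F statistics for a balanced two-factor design with replication.

    cells[i][j] lists the responses for across-IC level i and along-IC level j.
    """
    n_across, n_along, n_rep = len(cells), len(cells[0]), len(cells[0][0])
    cell_mean = [[mean(cell) for cell in row] for row in cells]
    across_mean = [mean(row) for row in cell_mean]
    along_mean = [mean([cell_mean[i][j] for i in range(n_across)])
                  for j in range(n_along)]
    grand = mean(across_mean)

    # Sums of squares for the two main effects, their interaction and the error.
    ss_across = n_along * n_rep * sum((m - grand) ** 2 for m in across_mean)
    ss_along = n_across * n_rep * sum((m - grand) ** 2 for m in along_mean)
    ss_inter = n_rep * sum(
        (cell_mean[i][j] - across_mean[i] - along_mean[j] + grand) ** 2
        for i in range(n_across) for j in range(n_along))
    ss_error = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(n_across) for j in range(n_along) for x in cells[i][j])

    ms_error = ss_error / (n_across * n_along * (n_rep - 1))
    return {
        "across": ss_across / (n_across - 1) / ms_error,
        "along": ss_along / (n_along - 1) / ms_error,
        "interaction": ss_inter / ((n_across - 1) * (n_along - 1)) / ms_error,
    }


# Hypothetical responses: two ICs x two bundle positions x two trials.
responses = [
    [[1.0, 3.0], [1.0, 3.0]],  # lower IC
    [[5.0, 7.0], [5.0, 7.0]],  # higher IC
]
f_values = two_way_anova_f(responses)
print(f_values)
```

In this constructed example the response differs only between ICs, so the across-IC F statistic is large while the along-IC and interaction F statistics are zero, which is the pattern a candidate neuron should show.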

Characteristic 3

Whereas the regression defined by Eq. 3 estimated neuronal responses across ICs, a full estimation of neuronal ICs for comparison with behavioral ICs would require inclusion of the IC slope and curvature, both of which depended on both rewards. Simplifying Eq. 3 by setting the β₃ coefficient to zero and holding the neuronal response constant along the IC, the neuronal IC slope would be the ratio of coefficients (−β₂ / β₁). Note the different meanings of the slope term: the neuronal IC slope (−β₂ / β₁) describes the relative coding strength of the two bundle rewards (reflecting the neuronal ratio of the two rewards), whereas each neuronal regression slope alone (β) describes the coding strength of neuronal response (correlation with the specific regressor). The neuronal IC curvature was estimated from the β₃ coefficient of the interaction term AB (all βs P < 0.05; t-test).
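The slope expression follows from holding the response constant along an IC. The short derivation below is a sketch that assumes the regression with the interaction term AB and reward A on the y-axis, as described in the text; it also shows how a non-zero β₃ makes the slope depend on the position along the curve, which is what produces curvature.

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 A + \beta_2 B + \beta_3 AB \\
0 = dy &= (\beta_1 + \beta_3 B)\,dA + (\beta_2 + \beta_3 A)\,dB \\
\frac{dA}{dB} &= -\frac{\beta_2 + \beta_3 A}{\beta_1 + \beta_3 B}
\;\;\xrightarrow{\;\beta_3 = 0\;}\;\; -\frac{\beta_2}{\beta_1}
\end{aligned}
```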

Polar plot of OFC reward sensitivity

The purpose of this analysis was to provide quantitative and graphic information about satiety-induced behavioral and neuronal changes that would allow comparison with previous OFC studies that had not established ICs (Tremblay & Schultz 1999; Padoa-Schioppa & Assad 2006). The analysis concerned monotonic response increase or decrease with increasing amounts of bundle rewards across ICs (characteristic 1 above), but did not address other IC characteristics such as trade-off, slope and curvature (characteristics 2 and 3) that had not been investigated previously. We established 2D polar plots whose dots indicated the relative contribution of each of the two bundle rewards to the neuronal response. We then constructed vectors by averaging these dots of neuronal responses. We then compared vectors of averaged neuronal responses and averaged behavioral choices before and during satiety.

For the behavioral choices, we plotted vectors (with 95% confidence intervals) from averaged polar plot dot positions defined by magnitude (distance from center: sqrt (a² + b²)) and relative weight (elevation angle: arctangent (a / b)); coefficient a refers to reward A (blackcurrant, y-axis), coefficient b refers to any of the other rewards (x-axis) (Eq. 1). The angle of the vector reflected the relative contribution of the two bundle rewards to the choice, as estimated by the a and b coefficients (Eq. 1). A deviation of the alignment angle from the diagonal line indicated an unequal contribution weight to bundle choice, and thus a non-1:1 reward ratio.

For the neuronal plots, each dot on the 2D plot was defined by the two β regression coefficients for neuronal responses (Eq. 3; P < 0.01, t-test) for each of the two rewards in any of the four task epochs. Each dot was characterized by the z-scored response magnitude (distance from center: sqrt (β₁² + β₂²)), the coding sign (positive or negative), and the relative weight of the two β coefficients (elevation angle: arctangent (β₁ / β₂)). Coefficient β₁ referred to reward A (blackcurrant, y-axis), coefficient β₂ referred to any of the other rewards (x-axis). Responses with negative (inverse) coding were rectified. Further IC characteristics such as systematic trade-off across multiple IPs and IC curvature played no role in these graphs. The alignment of the dots along the diagonal axis showed the relative coding strength for the two bundle rewards, as estimated by the β regression coefficients; a deviation from the diagonal line indicated an unequal influence of the two bundle rewards on the neuronal responses, reflecting a neuronal correlate of reward ratio.
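The conversion of a coefficient pair into a polar-plot dot, and of several dots into an average vector, can be sketched as below. Two points are assumptions not stated in the text: rectification is implemented by taking the absolute value of each coefficient, and the average vector is taken as the mean of the Cartesian dot positions. Function names and example coefficients are hypothetical.

```python
import math
from typing import List, Tuple


def polar_dot(beta_a: float, beta_b: float) -> Tuple[float, float]:
    """Return (magnitude, elevation angle in degrees) for one coefficient pair.

    beta_a belongs to reward A (y-axis), beta_b to the other reward (x-axis).
    Inverse (negative) coding is rectified with abs(), an assumption here.
    """
    a, b = abs(beta_a), abs(beta_b)
    magnitude = math.hypot(a, b)                # sqrt(a^2 + b^2)
    angle_deg = math.degrees(math.atan2(a, b))  # arctangent(a / b), safe for b == 0
    return magnitude, angle_deg


def mean_vector(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Average the rectified dot positions, then express the mean in polar form."""
    mean_a = sum(abs(a) for a, _ in pairs) / len(pairs)
    mean_b = sum(abs(b) for _, b in pairs) / len(pairs)
    return polar_dot(mean_a, mean_b)


# Hypothetical coefficient pairs (beta for reward A, beta for reward B).
print(polar_dot(1.0, 1.0))    # equal weights lie on the 45-degree diagonal
print(polar_dot(-3.0, -4.0))  # inverse coding, rectified before plotting
print(mean_vector([(1.0, 1.0), (3.0, 3.0)]))
```

An angle below 45 degrees indicates a weaker contribution of reward A relative to the other reward, and an angle above 45 degrees the opposite.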

Neuronal decoders

We used linear support vector machine (SVM) algorithms to decode neuronal activity according to bundles presented at different behavioral ICs during choice over zero-reward bundle (bundle distinction) and, separately, according to the behavioral choice between two non-zero bundles located on different ICs (choice prediction). As in our main study on revealed preferences (Pastor-Bernier et al., 2019), we implemented both decoders as custom-written software in Matlab R2015b (Mathworks). The SVM decoder with linear kernel was implemented with the svmtrain and svmclassify procedures (our previous work had shown that use of nonlinear SVM kernels does not improve decoding; Tsutsui et al., 2016). The SVM decoder was trained to find the optimal linear hyperplane for the best separation between two neuronal populations relative to lower vs. higher ICs.

All analyses employed single-neuron data, consisting of single-trial impulse counts that had been z-normalised to the activity during the Pretrial epoch in all trials recorded with the neuron under study. The analysis included activity from all neurons whose responses followed the IC scheme of revealed preferences during any of the four task epochs, as identified by our three-test statistics, except where noted. The neurons were recorded one at a time; therefore, the analysis concerned aggregated pseudo-populations of neuronal responses.

The decoding analysis used 10 trials per neuron for each of two ICs (total of 20 trials). Extensive analysis suggested that higher inclusion of 15-20 trials per group did not provide significantly better decoding rates (while reducing the number of included neurons). For neurons that had been recorded with > 10 trials per IC, we randomly selected 10 trials from each neuron for each of the two ICs. We used a leave-one-out cross-validation method in which we removed one of the 20 trials and trained the SVM decoder on the remaining 19 trials. We then used the SVM decoder to assess whether it accurately detected the IC of the left-out trial. We repeated this procedure 20 times, every time leaving out another one of the 20 trials. These 20 repetitions resulted in a percentage of accurate decoding (% out of n = 20). The final percentage estimate of accurate decoding resulted from averaging the results from 150 iterations of this 20-trial random selection procedure. To distinguish from chance decoding, we randomly shuffled the assignment of neuronal responses to the tested ICs, which should result in chance decoding (accuracy of 50% correct). A significant decoding with the real, non-shuffled data would be expressed as a statistically significant difference against the shuffled data (P < 0.01; Wilcoxon rank-sum test).
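The leave-one-out procedure and the label-shuffle control can be sketched as follows. This is not the Matlab SVM used in the study: a nearest-centroid classifier stands in for the linear SVM so that the example stays dependency-free, and the synthetic pseudo-population, function names and seed are hypothetical.

```python
import random
from typing import List


def nearest_centroid_predict(train_x: List[List[float]], train_y: List[int],
                             test_x: List[float]) -> int:
    """Stand-in linear classifier: assign the label of the closest class mean."""
    centroids = {}
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def squared_distance(u: List[float], v: List[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(u, v))

    return min(centroids, key=lambda lab: squared_distance(centroids[lab], test_x))


def leave_one_out_accuracy(x: List[List[float]], y: List[int]) -> float:
    """Percentage of trials whose IC is recovered when that trial is left out."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return 100.0 * correct / len(x)


def shuffled_accuracy(x: List[List[float]], y: List[int],
                      iterations: int, seed: int) -> float:
    """Mean leave-one-out accuracy after randomly reassigning trials to ICs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(iterations):
        labels = list(y)
        rng.shuffle(labels)
        total += leave_one_out_accuracy(x, labels)
    return total / iterations


# Hypothetical pseudo-population: 20 trials (10 per IC) x 3 neurons, z-scored.
low_ic = [[0.1 * t] * 3 for t in range(10)]
high_ic = [[5.0 + 0.1 * t] * 3 for t in range(10)]
trials = low_ic + high_ic
ic_labels = [0] * 10 + [1] * 10
real_accuracy = leave_one_out_accuracy(trials, ic_labels)
chance_accuracy = shuffled_accuracy(trials, ic_labels, iterations=150, seed=0)
print(real_accuracy, chance_accuracy)
```

In the full procedure, the real-data accuracy would additionally be averaged over repeated random selections of 10 trials per IC and compared statistically with the distribution of shuffled accuracies.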

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author contributions

A.P.-B. and W.S. designed the research, A.P.-B. performed the experiments, A.P.-B. and A.S. analyzed the data, A.P.-B. and W.S. wrote the manuscript.

Additional information

Supplementary Information accompanies this paper (3 figures)

Competing Interests

The authors declare no competing interests.

SUPPLEMENTARY MATERIAL

Figure S1. Supportive behavioral measures

(A) Psychophysical assessment of choice between single-component bundles with grape juice variation (constant Reference Bundle: 0.6 ml blackcurrant juice, 0.0 ml grape juice; Variable bundle: 0.0 ml blackcurrant juice, varying grape juice). Green and violet curves inside green +95% confidence intervals: initial choices; blue, orange and red curves: on-going consumption. Each curve was estimated from 80 trials (Weibull fits). The decrease in ratio blackcurrant/grape juice amounts at IP was significant between the confidence interval of the first IP and all IPs exceeding it (ratios of 1.9857 ± 0.0173, N = 139, green, vs. 1.0077 ± 0.02, orange and red; mean ± standard error of the mean, SEM; individual trial blocks: p = 9.6943 x 10^-7, Kolmogorov-Smirnov test; p = 2.336 x 10^-32, Wilcoxon rank-sum test; p = 3.1712 x 10^-46, t-test; N = 43 trial blocks).

(B) Gradually developing relative satiety for grape juice indicated by increasing choice indifference points (IP; same bundles as in A): with on-going consumption of both juices, the animal gave up progressively more grape juice for obtaining the same 0.4 ml of blackcurrant juice (from green to red). The ratio blackcurrant/grape juice amounts at IP decreased from approximately 2:1 (0.4 ml of blackcurrant juice for 0.25 ml of grape juice, black vs. green dots) to about 1:1 (0.4 ml blackcurrant for 0.45 ml grape juice, black vs. red), suggesting subjective value loss of grape juice relative to blackcurrant juice.

(C) Significant decrease of ratio blackcurrant/grape juice amounts at IP with on-going consumption (same bundles as in A; Wilcoxon test). N= 139 and 76 IPs estimated in 43 trial blocks.

(D) Gradual changes with grape juice variation in slope and curvature of choice indifference curves (IC) between pre-satiety (green, violet) and during increasing satiety (blue, orange, red).

(E), (F) Psychophysical tests and consumption-dependent change of ICs in Monkey B during choice between single-component bundles (constant Reference Bundle: 0.25 ml blackcurrant juice, 0.0 ml water; Variable bundle: 0.0 ml blackcurrant juice, varying water). With on-going consumption of both liquids, the animal gave up progressively more water for obtaining the same 0.25 ml of blackcurrant juice (from green to red), suggesting subjective value loss of water relative to blackcurrant juice. Same conventions as in A and C.

(G), (H) Significant IC slope and curvature changes from pre-sated to sated states with on-going consumption with individual bundles (Bc, blackcurrant juice; MSG, monosodium glutamate; IMP, inosine monophosphate; p = 0.0156 and p= 0.0313, respectively; Wilcoxon test). The slope parameter reflects the amount ratio blackcurrant/other liquids at IP.

(I) Value control by logistic regression for choice of Variable Bundle over non-zero Reference Bundle during satiety (Eq. 2). According to significance of β regression coefficients, choice of the Variable Bundle (Choice VarBundle) correlated significantly with amount of rewards A and B in the Variable Bundle (VA, VB) and the Reference Bundle (RA, RB) and the consumed amount of bundle rewards A (blackcurrant; MA) and B (various other liquids; MB). Choice varied insignificantly with consecutive trial number within blocks (CT) and left-right choice (CL). * P < 0.05; ** P < 0.01; t-test on βs.

Figure S2. Asymmetric neuronal response change with reward-specific satiety (choice between two nonzero bundles)

(A) Significant monotonic neuronal response increase with value of chosen bundle across indifference curves (IC) before satiety (from green via blue to red) (P = 0.0055, F = 10.49, 17 trials; 1-way Anova). The animal chose between the Reference Bundle (hollow blue dot) and one of the Variable Bundles (solid colored dots). The responses to the two blue bundles on the same IC (indicating equal preference) varied insignificantly despite different juice composition (P = 0.5488, F = 0.38, 18 trials). Response to Reference Bundle (hollow blue dot) is indicated by dotted line.

(B) As (A) but for grape juice variation. Responses varied significantly across ICs with grape juice (P = 0.0046, F = 9.7, 27 trials). The responses to the two blue bundles on the same IC differed insignificantly (P = 0.2622, F = 1.31, 29 trials). Same color labels as in (A).

(C) Despite IC change indicating satiety, the neuronal response increase across ICs remained significant (P = 0.0014, F = 10.87, 17 trials). However, the two unchanged blue bundles were now on different ICs, and their responses varied significantly (P = 0.0028, F = 5.46, 40 trials).

(D) With IC change from convex to concave indicating satiety, the three bundles with grape juice variation were now located within only two ICs. Although the neuronal response increase across ICs remained significant (P = 0.0144, F = 6.02, 35 trials), the peak response was reduced by 25% (from 40 to 30 imp/s, red) and the three responses were closer to each other. Further, the two unchanged blue bundles were now on different ICs, and their responses now differed significantly (P = 0.0201, F = 9.27, 52 trials). Thus, the changes of neuronal responses were consistent with the IC change indicating satiety.

Figure S3. Decoding of bundle discrimination and bundle choice from neuronal activity is maintained during satiety

(A) - (D) Classification by support vector machine (SVM) using neuronal responses during different task epochs. The classifier was trained before satiety and tested for bundle distinction between the two ICs before satiety (black) and during satiety (red). The bundles were positioned on the lowest and highest indifference curve (IC), respectively, as shown in the ICs of Figure 5. Data are from choice over zero-bundle.

(E) - (I) As (A-D) but for choice over non-zero bundle.

(J) Classification accuracy of neuronal responses across on-going liquid consumption. Same data selection as for (A-D) and collapsed across all task epochs.

Acknowledgements

We thank Aled David for invaluable help with animal training, Dr. Polly Taylor for anaesthesia during implantation, Paul Cisek for sharing his SQL-Matlab toolbox (NeuroMath), Charles R. Plott, Christopher Harris, Simone Ferrari-Toniolo and Fabian Grabenhorst for inspiration and insightful comments on experimental economics and neuronal data analysis. The Wellcome Trust (WT 095495, WT 204811), European Research Council (ERC; 293549) and US National Institutes of Mental Health Caltech Conte Center (NIMH; P50MH094258) supported this work.

Footnotes

Email addresses: Alexandre Pastor-Bernier: ap787{at}cam.ac.uk, Arkadiusz Stasiak: as863{at}cam.ac.uk

REFERENCES

Cabanac, M. (1971). Physiological role of pleasure. Science 173, 1103–1107.
Critchley, H.D., and Rolls, E.T. (1996). Olfactory neuronal responses in the primate orbitofrontal cortex: analysis in an olfactory discrimination task. J. Neurophysiol. 75, 1659–1672.
Dhar, R., & Novemsky, N. (2008). Beyond rationality: the content of preferences. J Consum Psychol 18, 175–178.
Fisher, I. (1892). Mathematical Investigations in the theory of value and prices. Trans Connecticut Acad 9, 1–124.
Kivetz, R., Netzer, O., & Schrift, R. (2008). The synthesis of preference: Bridging behavioral decision research and marketing science. Journal of Consumer Research 18, 179–186.
Kobayashi, S., Pinto de Carvalho, O., and Schultz, W. (2010). Adaptation of reward sensitivity in orbitofrontal neurons. J. Neurosci. 30, 534–544.
Kringelbach, M.L., O’Doherty, J., Rolls, E.T., and Andrews, C. (2003). Activation of the human orbitofrontal cortex to a liquid food stimulus is correlated with its subjective pleasantness. Cereb. Cortex 13, 1064–1071.
Merrill, E. G., and Ainsworth, A. (1972). Glass-coated platinum-plated tungsten microelectrodes. Med. Biol. Eng. 10, 662–672.
Padoa-Schioppa, C. (2009). Range-adapting representation of economic value in the orbitofrontal cortex. J. Neurosci. 29, 14004–14014.
Pastor-Bernier, A., Volkmann, K., Stasiak, A., Grabenhorst, F., and Schultz, W. (2020). Experimentally revealed stochastic preferences for multi-component choice options. J. exp. Psych.: Anim. Learn. Cog. (in press).
Pastor-Bernier, A., Stasiak, A., and Schultz, W. (2019). Orbitofrontal signals for two-component choice options comply with indifference curves of Revealed Preference Theory. Nat. Comm. 10, 4885.
Pastor-Bernier, A., Plott, C.R., and Schultz, W. (2017). Monkeys choose as if maximizing utility compatible with basic principles of revealed preference theory. Proc Natl Acad Sci U S A 114, E1766–E1775.
Paxinos, G., Huang, X.-F. and Toga, A. W. (2000). The Rhesus Monkey Brain in Stereotaxic Coordinates (Academic Press, San Diego).
Payne, J. W., Bettman, J. R., & Schkade, D. A. (1999). Measuring constructed preferences: towards a building code. J Risk Uncert 19, 243–270.
Pritchard, T.C., Nedderman, E.N., Edwards, E.M., Petticoffer, A.C., Schwartz, G.J., and Scott, T.R. (2008). Satiety-responsive neurons in the medial orbitofrontal cortex of the macaque. Behav. Neurosci. 122, 174–82.
Reichelt, A.C., Morris, M.J., and Westbrook, R.F. (2014). Cafeteria diet impairs expression of sensory-specific satiety and stimulus-outcome learning. Front. Psychol. 5, 852.
Robinson, M.J.F., and Berridge, K.C. (2013). Instant transformation of learned repulsion into motivational “wanting”. Curr Biol 23, 282–289.
Rolls, E.T., Sienkiewicz, Z.J., and Yaxley, S. (1989). Hunger Modulates the Responses to Gustatory Stimuli of Single Neurons in the Caudolateral Orbitofrontal Cortex of the Macaque Monkey. Eur. J. Neurosci. 1, 53–60.
Rolls, B.J., Rolls, E.T., Rowe, E.A., and Sweeney, K. (1983). Sensory-specific and motivation-specific satiety for the sight and taste of food and water in man. Physiol. Behav. 30, 185–192.
Rolls, E.T., Scott, T.R., Sienkiewicz, Z.J., and Yaxley, S. (1988). The responsiveness of neurons in the frontal opercular gustatory cortex of the macaque monkey is independent of hunger. J Physiol. 397, 1–12.
Rustichini, A., Conen, K.E., Cai, X., Padoa-Schioppa, C. (2017). Optimal coding and neuronal adaptation in economic decisions. Nat. Comm. 8, 1208.
Samuelson, P. A. (1937). A note on measurement of utility. Rev Econ Stud 4, 155–161.
Samuelson, P. A. (1938). A note on the pure theory of consumer’s behavior. Economica 5, 61–71.
Simonson, I. (2008). Will I like a “medium” pillow? Another look at constructed and inherent preferences. J Consum Psychol 18, 155–169.
Small, D.M., Zatorre, R.J., Dagher, A., Evans, A.C., and Jones-Gotman, M. (2001). Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain 124, 1720–33.
Tremblay, L., and Schultz, W. (1999). Relative reward preference in primate orbitofrontal cortex. Nature 398, 704–708.
Tsutsui, K. I., Grabenhorst, F., Kobayashi, S. & Schultz, W. (2016). A dynamic code for economic object valuation in prefrontal cortex neurons. Nat. Comm. 7, 12554.
Warren, C., McGraw, A. P., & Van Boven, L. (2011). Values and preferences: defining preference construction. WIREs Cogni Sci 2, 193–205.
Yaxley, S., Rolls, E.T., Sienkiewicz, Z.J., and Scott, T.R. (1985). Satiety does not affect gustatory activity in the nucleus of the solitary tract of the alert monkey. Brain Res. 347, 85–93.
Yaxley, S., Rolls, E.T., and Sienkiewicz, Z.J. (1988). The responsiveness of neurons in the insular gustatory cortex of the macaque monkey is independent of hunger. Physiol. Behav. 42, 223–229.


Posted July 05, 2020.

Download PDF

Citation Tools

Subject Area

Neuroscience

Subject Areas

All Articles

Animal Behavior and Cognition (5215)
Biochemistry (11753)
Bioengineering (8752)
Bioinformatics (29201)
Biophysics (14974)
Cancer Biology (12100)
Cell Biology (17413)
Clinical Trials (138)
Developmental Biology (9422)
Ecology (14182)
Epidemiology (2067)
Evolutionary Biology (18309)
Genetics (12245)
Genomics (16804)
Immunology (11869)
Microbiology (28098)
Molecular Biology (11596)
Neuroscience (60975)
Paleontology (451)
Pathology (1871)
Pharmacology and Toxicology (3238)
Physiology (4959)
Plant Biology (10427)
Scientific Communication and Education (1683)
Synthetic Biology (2886)
Systems Biology (7340)
Zoology (1651)

[1] ↵
Cabanac, M. (1971). Physiological role of pleasure. Science 173, 1103–1107.
OpenUrl Abstract/FREE Full Text

[2] ↵
Critchley, H.D., and Rolls, E.T. (1996). Olfactory neuronal responses in the primate orbitofrontal cortex: analysis in an olfactory discrimination task. J. Neurophysiol. 75, 1659–1672.
OpenUrl PubMed Web of Science

[3] ↵
Dhar, R., & Novemsky, N. (2008). Beyond rationality: the content of preferences. J Consum Psychol 18, 175–178.
OpenUrl

[4] ↵
Fisher, I. (1892). Mathematical Investigations in the theory of value and prices. Trans Connecticut Acad 9, 1–124.
OpenUrl

[5] ↵
Kivetz, R., Netzer, O., & Schrift, R. (2008) The synthesis of preference: Bridging behavioral decision research and marketing science. Journal of Consumer Research 18, 179–186.
OpenUrl

[6] ↵
Kobayashi, S., Pinto de Carvalho, O., and Schultz, W. (2010). Adaptation of reward sensitivity in orbitofrontal neurons. J. Neurosci. 30, 534–544.
OpenUrl Abstract/FREE Full Text

[7] ↵
Kringelbach, M.L., O’Doherty, J., Rolls, E.T., and Andrews, C. (2003). Activation of the human orbitofrontal cortex to a liquid food stimulus is correlated with its subjective pleasantness. Cereb. Cortex 13, 1064–1071.
OpenUrl CrossRef PubMed Web of Science

[8] ↵
Merrill, E. G., and Ainsworth, A. (1972). Glass-coated platinum-plated tungsten microelectrodes. Med. Biol. Eng. 10, 662–672.
OpenUrl CrossRef PubMed Web of Science

[9] ↵
Padoa-Schioppa, C. (2009). Range-adapting representation of economic value in the orbitofrontal cortex. J. Neurosci. 29, 14004–14014.
OpenUrl Abstract/FREE Full Text

[10] ↵
Pastor-Bernier, A, Volkmann, K., Stasiak, A., Grabenhorst, F., and Schultz, W. (2020). Experimentally revealed stochastic preferences for multi-component choice options. J. exp. Psych.: Anim. Learn. Cog. (in press).

[11] ↵
Pastor-Bernier, A., Stasiak, A., and Schultz, W. (2019). Orbitofrontal signals for two-component choice options comply with indifference curves of Revealed Preference Theory. Nat. Comm. 10, 4885.
OpenUrl

[12] ↵
Pastor-Bernier, A., Plott, C.R., and Schultz, W. (2017). Monkeys choose as if maximizing utility compatible with basic principles of revealed preference theory. Proc Natl AcadSci U S A. 114, E1766–E1775.
OpenUrl Abstract/FREE Full Text

[13] ↵
Paxinos, G., Huang, X.-F. and Toga, A. W. (2000). The Rhesus Monkey Brain in Stereotaxic Coordinates (Academic Press, San Diego).

[14] ↵
Payne, J. W., Bettman, J. R., & Schkade, D. A. (1999). Measuring constructed preferences: towards a building code. J Risk Uncert 19, 243–270.
OpenUrl CrossRef Web of Science

[15] ↵
Pritchard, T.C., Nedderman, E.N., Edwards, E.M., Petticoffer, A.C., Schwartz, G.J., and Scott, T.R. (2008). Satiety-responsive neurons in the medial orbitofrontal cortex of the macaque. Behav. Neurosci. 122, 174–82.
OpenUrl CrossRef PubMed Web of Science

[16] ↵
Reichelt, A.C., Morris, M.J., and Westbrook, R.F. (2014). Cafeteria diet impairs expression of sensory-specific satiety and stimulus-outcome learning. Front. Psychol. 5, 852.
OpenUrl

[17] ↵
Robinson, M.J.F., and Berridge, K.C. (2013). Instant transformation of learned repulsion into motivational ‘‘wanting’’. Curr Biol 23, 282–289.
OpenUrl CrossRef PubMed

[18] ↵
Rolls, E.T., Sienkiewicz, Z.J., and Yaxley, S. (1989). Hunger Modulates the Responses to Gustatory Stimuli of Single Neurons in the Caudolateral Orbitofrontal Cortex of the Macaque Monkey. Eur. J. Neurosci. 1, 53–60.
OpenUrl CrossRef PubMed Web of Science

[19] ↵
Rolls, B.J, Rolls, E.T., Rowe, E.A., and Sweeney, K. (1983). Sensory-specific and motivationspecific satiety for the sight and taste of food and water in man. Physiol. Behav. 30, 185–192.
OpenUrl CrossRef PubMed Web of Science

[20] ↵
Rolls, E.T., Scott, T.R., Sienkiewicz, Z.J., and Yaxley, S. (1988). The responsiveness of neurons in the frontal opercular gustatory cortex of the macaque monkey is independent of hunger. J Physiol. 397, 1–12.
OpenUrl PubMed Web of Science

[21] ↵
Rustichini, A., Conen, K.E., Cai, X., Padoa-Schioppa, C. (2017). Optimal coding and neuronal adaptation in economic decisions. Nat. Comm. 8, 1208.
OpenUrl

[22] ↵
Samuelson, P. A. (1937). A note on measurement of utility. Rev Econ Stud 4, 155–161.
OpenUrl CrossRef

[23] ↵
Samuelson, P. A. (1938). A note on the pure theory of consumer’s behavior. Economica 5, 61–71.
OpenUrl CrossRef

[24] ↵
Simonson, I. (2008). Will I like a “medium” pillow? Another look at constructed and inherent preferences. J Consum Psychol 18, 155–169.
OpenUrl

[25] ↵
Small, D.M., Zatorre, R.J., Dagher, A., Evans, A.C., and Jones-Gotman, M. (2001). Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain 124, 1720–33.
OpenUrl CrossRef PubMed Web of Science

[26] ↵
Tremblay, L., and Schultz, W. (1999). Relative reward preference in primate orbitofrontal cortex. Nature 398, 704–708.
OpenUrl CrossRef PubMed Web of Science

[27] ↵
Tsutsui, K. I., Grabenhorst, F., Kobayashi, S. & Schultz, W. (2016). A dynamic code for economic object valuation in prefrontal cortex neurons. Nat. Comm. 7, 12554.
OpenUrl

[28] ↵
Warren, C., McGraw, A. P., & Van Boven, L. (2011). Values and preferences: defining preference construction. WIREs Cogni Sci 2, 193–205.
OpenUrl

[29] ↵
Yaxley, S., Rolls, E.T., Sienkiewicz, Z.J., and Scott, T.R. (1985). Satiety does not affect gustatory activity in the nucleus of the solitary tract of the alert monkey. Brain Res. 347, 85–93.
OpenUrl CrossRef PubMed Web of Science

[30] ↵
Yaxley, S., Rolls, E.T., and Sienkiewicz, Z.J. (1988). The responsiveness of neurons in the insular gustatory cortex of the macaque monkey is independent of hunger. Physiol. Behav. 42, 223–229.
OpenUrl CrossRef PubMed