Summary
Forrest Gump or Matrix? Choosing which movie you prefer is a subjective decision that entails self-reflection, a feature unaccounted for by known neural mechanisms of valuation and choice. Here, we show that subjective valuation is functionally coupled to the neural circuitry monitoring physiological variables, i.e. the simplest biological form of self-reflection. Human participants chose which movie they preferred, or performed a control objective discrimination that did not require self-reflection. Using magnetoencephalography, we measured heartbeat-evoked responses (HERs) before option presentation, and retrieved the decision network during choice. In subjective preference-based decisions only, HERs modulated the encoding of chosen value in ventro-medial prefrontal cortex, and this neural interaction increased choice precision. Results could not be trivially explained by changes in cardiac activity or in arousal. The neural monitoring of physiological variables thus supports subjective valuation based on self-reflection and improves the consistency of decisions based on subjective values.
Highlights
Preference-based decisions are subjective and require self-reflection
Neural responses to heartbeats interact with subjective value encoding in vmPFC
This interaction predicts choice precision
The influence of neural responses to heartbeats is specific to subjective decisions
Introduction
Do you prefer Forrest Gump or Matrix? The decision is subjective: only you know which movie you like best. The subjective values used in preference-based decision making are internally generated, intrinsically private, and entail self-reflection. In contrast, the evidence required to decide which of the two words ‘listen’ and ‘look’ has more characters is publicly available to any reader of this article. Given that a ‘ground truth’ exists (six letters in ‘listen’, four in ‘look’), this perceptual decision can be operationally defined as objective. The comparison between subjective preference-based and objective perceptual decisions revealed that the ventro-medial prefrontal cortex (vmPFC) is particularly engaged when comparing two options based on their subjective values1–3. However, subjective values have so far been treated as quantities already present in the environment, similar to sensory evidence4. Current approaches thus leave unspecified the biological mechanisms supporting the self-reflection intrinsic to the assignment of subjective values to different options.
The self-reflection required for preference-based decisions might derive from the simplest biological implementation of self-reflection, i.e., the monitoring of one’s current physiological state5–9 required to decide which behavior is best suited to restore homeostatic balance and ensure the integrity of the living organism. In other words, the organism takes into account its internal state to assign a value to a given option4, 10– and might do so even for subjective choices that have no immediate physiological consequences. The neural response reflecting the constant and automatic cortical monitoring of heartbeats, also known as the heartbeat-evoked response (HER)11, 12, is related to subjective, self-related cognitive processes in vmPFC13–15. HERs in vmPFC are thus a likely candidate neural mechanism underlying the self-reflection required for subjective valuation. We thus hypothesized that fluctuations in HERs should affect valuation in subjective preference-based decisions, but not in decisions such as perceptual discriminations, based on objective evidence publicly available in the outside world.
We asked participants to perform either a subjective, preference-based choice, or a control, objective perceptual discrimination, between two visually presented movie titles (Fig. 1), while their neural and cardiac activity were measured with magnetoencephalography (MEG) and electrocardiography (ECG), respectively. Each trial began with a symbol instructing participants which type of decision to perform. In subjective preference trials, participants selected the movie they preferred, while in objective perceptual discrimination trials they had to indicate which movie title was written with the highest contrast. We measured HERs during the instruction period, i.e. before option presentation. In line with our hypothesis, we found that HERs are differently involved in the preparation for subjective and objective decisions. In addition, HERs influenced the precision of the neural encoding of subjective value and improved choice consistency in preference-based decisions, but they did not impact the objective, perceptual decisions.
Results
Behavioral results
Participants were asked to choose between two simultaneously presented movie titles according either to their subjective preferences or to the visual contrast of movie titles, as indicated by trial-by-trial instructions presented before the alternatives (Fig. 1A). Decision difficulty, operationalized as the difference between the two options (difference between likeability ratings measured one day before the MEG session in the preference task (Supplementary Fig. 1), difference between contrasts in the perceptual task), had the expected impact on behavior in both tasks. Accuracy increased and reaction times decreased for easier decisions (Fig. 1C; preference task, one-way repeated-measures ANOVA, main effect of difficulty: accuracy, F(3,60) = 99.25, p < 10^−15; RT, F(3,60) = 41.14, p < 10^−13; perceptual task, main effect of difficulty: accuracy, F(3,60) = 280.2, p < 10^−15; RT, F(3,60) = 87.67, p < 10^−15). Preference and perceptual decisions were matched in accuracy (two-way repeated-measures ANOVA, main effect of task on accuracy, F(1,20) = 0.38, p = 0.55, interaction between task and difficulty, F(3,60) = 2.53, p = 0.07). Preference-based decisions were generally slower (two-way repeated-measures ANOVA, main effect of task, RT, F(1,20) = 57.64, p < 10^−6) and reaction times decreased less rapidly for easier decisions (interaction effect between task and difficulty, F(3,60) = 4.08, p = 0.01). Participants only used task-relevant information (subjective preference value or objective contrast) to reach their decision, since non-relevant information could not predict choice (Fig. 1D; Supplementary Table 1).
Neural responses to heartbeats are larger when preparing for preference-based decisions
Because preference-based choices are intrinsically subjective and require self-reflection, we hypothesized that HERs would be larger when preparing for subjective, preference-based decisions than when preparing for objective, perceptual ones. We found that HERs during the instruction period, before option presentation, were indeed larger when participants prepared for preference-based decisions than for perceptual ones (Fig. 2A, 2B; non-parametric clustering, 201-262 ms after T-wave, sum(t) = 1789, Monte Carlo cluster level p = 0.037). The cortical regions that mostly contributed to this effect (Fig. 2C, Table 1) were localized in right and left anterior vmPFC (areas 11m and 14 bilaterally; cluster peak at MNI coordinates, [1 57 −21] and [−3 47 −6]), in the right post-central complex ([32 −22 56]) and right supramarginal gyrus ([41 −33 43]).
The HER difference between subjective preference-based trials and objective perceptual discrimination trials was not accompanied by any difference in ECG activity (Supplementary Table 2), in cardiac parameters (inter-beat intervals, inter-beat interval variability, stroke volume) or arousal indices (alpha power and pupil diameter) measured during the instruction period (Supplementary Table 3). Importantly, the difference was time-locked to heartbeats (Monte Carlo p = 0.026; see Methods for details).
The subjective value of the chosen option is encoded in medial prefrontal cortices in preference-based decisions
We then identified when and where subjective value was encoded during preference-based choice. First, we modeled single trial response-locked neural activity at the sensor level using a GLM (GLM1a, see Methods), using as regressors the subjective values of the chosen (ChosenSV) and unchosen (UnchosenSV) options, as well as the response button used. Neural activity over frontal sensors encoded the subjective value of the chosen option in two neighboring time-windows (ßChosenSV, first cluster: −580 to −370 ms before response, sum(t)=−7613, Monte Carlo p = 0.004. Second cluster: −336 to −197 ms before response, sum(t) = −4405, Monte Carlo p= 0.033; Fig. 2D and Fig. 2E). No cluster of neural activity significantly encoded the subjective value of the unchosen option. Motor preparation was encoded later in time in two posterior-parietal clusters of opposite polarities (ßButton Press, negative cluster: −287 to −28 ms before response, sum(t) = −10918, Monte Carlo p = 0.003; positive cluster: −373 to −196 ms before response, sum(t) = 5848, Monte Carlo p = 0.02).
To identify the cortical regions contributing to the encoding of subjective value at sensor-level, we used the same model (GLM1a) to predict source-reconstructed activity averaged in the time-window identified at sensor level (−580 to −197 ms before response). A network of medial regions comprising right posterior vmPFC (area 32 and 24, cluster peak at MNI coordinates: [7 40 0]), right dorso-medial prefrontal cortex (dmPFC, area 8m, [5 30 40]), bilateral occipital poles ([6 −77 11], [−1 −85 16]) and mid-posterior insula ([−34 −27 17]) encoded the subjective value of the chosen option (Fig. 2F; Supplementary Table 4).
HER amplitude modulates trial-by-trial subjective value encoding in right vmPFC
We thus show that two different sub-regions of vmPFC were involved at different moments in a trial: during the instruction period, HERs differed when participants prepared for preference-based vs. perceptual decisions in left and right anterior vmPFC, and during the choice period, subjective value was encoded in right posterior vmPFC. We then addressed our main question: does the amplitude of neural responses to heartbeats during the instruction period affect the encoding of subjective value during choice in vmPFC (Fig. 1B)?
We first focused on right vmPFC, which shows both a differential HER during instructions anteriorly and a parametric encoding of subjective value during choice posteriorly. For each participant, we median-split preference-based choice trials according to HER amplitude in anterior r-vmPFC during the instruction period. We then determined the strength of subjective value encoding during choice in posterior r-vmPFC separately for trials with small vs. large HERs, by regressing posterior r-vmPFC activity against the subjective value of the chosen option. Subjective value encoding was significantly modulated by HERs (Fig. 3A), with greater encoding for large HER trials compared to small HER trials (two-tailed paired t-test on ßChosenSV in large vs. small HER trials, t(20) = 2.52, p = 0.02). This effect was confined to the right hemisphere: HER amplitude during instruction presentation in left anterior vmPFC did not influence subjective value encoding in right posterior vmPFC during choice (median split of trials according to HER amplitude in left anterior vmPFC, comparison of ßChosenSV in right vmPFC, two-tailed paired t-test, t(20) = 0.83, p = 0.42, BF = 0.40). Importantly, HER amplitude did not vary with pupil diameter or alpha power during choice (Supplementary Table 5), indicating that encoding strength is not driven by an overall change in brain state. In addition, the influence of HER amplitude was specific to value encoding since it did not affect the visual responses evoked by option presentation (Supplementary Table 5).
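The logic of this median-split analysis can be illustrated with a minimal sketch for a single participant. The array names (`her`, `chosen_sv`, `activity`) are hypothetical, and the use of ordinary least squares with an intercept is an assumption; the actual regression settings are those given in the Methods and may differ.

```python
import numpy as np

def median_split_betas(her, chosen_sv, activity):
    """Regress activity on chosen value separately for small- and large-HER trials.

    Returns (beta_small, beta_large), the chosen-value slope in each half.
    All inputs are per-trial arrays for one participant (hypothetical names).
    """
    her = np.asarray(her, dtype=float)
    chosen_sv = np.asarray(chosen_sv, dtype=float)
    activity = np.asarray(activity, dtype=float)
    large = her > np.median(her)  # trials above this participant's median HER
    betas = []
    for mask in (~large, large):
        # Design matrix: intercept + chosen value (assumed OLS, not the paper's exact model)
        X = np.column_stack([np.ones(mask.sum()), chosen_sv[mask]])
        coef, *_ = np.linalg.lstsq(X, activity[mask], rcond=None)
        betas.append(coef[1])
    return tuple(betas)
```

The two slopes obtained for each participant would then be compared across participants with a two-tailed paired t-test, as reported in the text.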
The interaction between HER amplitude and subjective value encoding in right vmPFC was further tested using a full parametric approach. Here (GLM2), we predicted the activity of posterior r-vmPFC during choice from the subjective value of the chosen option, the HER amplitude in anterior r-vmPFC during instruction period and the interaction between these two terms (Fig. 3B). Since the posterior vmPFC region of interest was defined based on its encoding of the chosen value, the parameter estimate for chosen value was, as expected, large (ßChosenSV = −0.06 ± 0.02, two-tailed t-test against 0, t(20) = −3.37, p = 0.003). Activity in posterior vmPFC was also predicted by the amplitude of HERs occurring about 1.5 s earlier, during the instruction period, independently from the chosen value (ßHER = 0.04 ± 0.02, two-tailed t-test against 0, t(20)= 2.13, p = 0.046), and importantly by the interaction between HERs and chosen value (ßHER* ChosenSV = −0.05 ± 0.02, two-tailed t-test against 0, t(20)= −2.41, p = 0.025). Both the median-split analysis and the parametric model thus reveal a significant interaction between the amplitude of HERs during instruction period and the neural encoding of subjective value during choice.
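As an illustration of the parametric model, the sketch below fits a regression with two main effects and their product for one participant. Z-scoring the regressors before forming the interaction term, the ordinary least-squares fit, and all variable names are assumptions made for this example, not details confirmed by the text.

```python
import numpy as np

def zscore(x):
    """Standardize a 1-D array to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def fit_glm2(chosen_sv, her, activity):
    """Fit activity ~ intercept + ChosenSV + HER + HER*ChosenSV by least squares.

    Regressors are z-scored before the product is formed (an assumption).
    Returns a dict of parameter estimates.
    """
    sv = zscore(chosen_sv)
    h = zscore(her)
    X = np.column_stack([np.ones_like(sv), sv, h, sv * h])
    beta, *_ = np.linalg.lstsq(X, np.asarray(activity, dtype=float), rcond=None)
    return dict(zip(("intercept", "chosen_sv", "her", "interaction"), beta))
```

The `interaction` estimate plays the role of ßHER*ChosenSV; per-participant estimates would then be tested against zero at the group level.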
We then verified that the effect on the neural encoding of subjective value was specific to HER amplitude, and not due to an overall baseline shift in anterior r-vmPFC during the instruction period. We ran an alternative model (GLM3) in which the activity in posterior r-vmPFC was predicted from the subjective value of the chosen option, the activity in anterior r-vmPFC averaged during the whole instruction period, i.e. activity not time-locked to heartbeats, and the interaction between the two terms. This analysis revealed that while the subjective value of the chosen option still significantly predicted the activity of posterior r-vmPFC (ßChosenSV = −0.05 ± 0.02, two-tailed t-test against 0, t(20) = −3.27, p = 0.004), the other two terms did not (activity in anterior r-vmPFC averaged during instruction period: ßBL vmPFC = 0.006 ± 0.03, two-tailed t-test against 0, t(20) = 0.22, p = 0.83, BF = 0.25; interaction: ßBL vmPFC* ChosenSV = −0.03 ± 0.02, two-tailed t-test against 0, t(20) = −1.55, p = 0.14, BF = 1.14). The encoding of subjective value is thus specifically modulated by HER amplitude in anterior r-vmPFC and not by an overall baseline shift unrelated to heartbeats in the same region.
The functional coupling between HERs and subjective value encoding was also region specific: HER amplitude in anterior r-vmPFC did not modulate the strength of value encoding in any other value-related regions (dmPFC, occipital poles and posterior insula; all p ≥ 0.13, all BF ≤ 0.81; Supplementary Table 6). Conversely, HERs outside anterior r-vmPFC did not significantly interact with value encoding in posterior r-vmPFC. Splitting trials based on HER amplitude in the two other cortical regions showing differential heartbeat-evoked responses (Fig. 2C) showed no significant modulation of value encoding in right posterior vmPFC (post-central complex: two-tailed paired t-test, t(20) = −1.41, p = 0.17, BF = 0.90; right supramarginal gyrus: two-tailed paired t-test, t(20) = −1.96, p = 0.06, BF = 2.41).
The interaction between HER and value encoding affects choice consistency
To what extent does the interaction between HER and value encoding in vmPFC influence behavior? We first tested whether the interaction between HER and value encoding relates to inter-individual differences in choice consistency, i.e. whether participants selected the movie to which they had attributed the greatest likeability rating the day before. Given the overall high performance in preference-based decisions, which may reduce our ability to detect significant relationships, we computed mean choice consistency using the top-50% most difficult trials (i.e. trials above median difficulty in each participant). We regressed the model parameter of the interaction between HER and value encoding (ßHER*ChosenSV obtained from GLM2) against mean choice consistency across participants. The larger the interaction between HER and value encoding, the more consistent participants were in their choices (ßrobust = 0.41, R² = 0.22, t(19) = 2.29, p = 0.03; Fig. 3C). In other words, 22% of inter-individual differences in behavioral consistency is explained by the magnitude of the interaction between HER and value encoding. This correlation with behavior was specific to the interaction parameter: inter-individual differences in choice consistency could not be predicted from the model parameter estimate of HER (ßHER from GLM2; ßrobust = 0.02, R² = 4 × 10^−4, t(19) = 0.09, p = 0.93, BF = 0.39), nor from the parameter estimate of value (ßChosenSV from GLM2; ßrobust = −0.19, R² = 0.04, t(19) = −0.88, p = 0.39, BF = 0.52). Neither the personality traits tested nor interoceptive ability, assessed with the heartbeat counting task, significantly co-varied with the interaction between HER and subjective value encoding (Supplementary Table 7).
So far, results are based on parameter estimates computed across trials for a given participant. To assess the within-participant trial-by-trial influence of the interaction between HERs and subjective value encoding on behavior, we computed the z-scored product of the HER amplitude in anterior r-vmPFC during the instruction period and the value-related activity in posterior r-vmPFC during choice. We then median-split the trials according to this product and modeled participants’ choices separately for trials with a small vs. large interaction (Fig. 3D). When the interaction was large, psychometric curves featured a steeper slope, corresponding to an increased choice precision (two-tailed paired t-test, t(20) = −2.24, p = 0.037; after removal of the unique outlier with a slope estimate exceeding 3 SD from population mean, t(19) = −3.30, p = 0.003; Fig. 3E), while decision criterion was not affected (two-tailed paired t-test, t(20) = −1.20, p = 0.25, BF = 0.64; after outlier removal t(19) = −0.96, p = 0.35, BF = 0.46; Fig. 3E).
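A minimal sketch of this trial-level analysis is given below. It assumes a logistic psychometric function with a slope (choice precision) and a bias term (standing in for the decision criterion); the functional form actually used may differ, and the interaction index is taken as a pre-computed per-trial array. All names are hypothetical. The fit uses unregularized Newton steps and would not converge on perfectly separable choices.

```python
import numpy as np

def fit_psychometric(delta_value, choice, n_iter=50):
    """Fit P(choice = 1) = 1 / (1 + exp(-(bias + slope * delta_value))).

    Uses Newton's method on the logistic log-likelihood (assumed model).
    """
    x = np.asarray(delta_value, dtype=float)
    y = np.asarray(choice, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        gradient = X.T @ (y - p)
        hessian = (X * (p * (1.0 - p))[:, None]).T @ X
        w = w + np.linalg.solve(hessian, gradient)
    return {"bias": w[0], "slope": w[1]}

def fit_by_interaction(interaction, delta_value, choice):
    """Median-split trials on the interaction index and fit each half.

    Returns (fit_small, fit_large).
    """
    interaction = np.asarray(interaction, dtype=float)
    dv = np.asarray(delta_value, dtype=float)
    ch = np.asarray(choice, dtype=float)
    large = interaction > np.median(interaction)
    return (fit_psychometric(dv[~large], ch[~large]),
            fit_psychometric(dv[large], ch[large]))
```

Comparing the `slope` estimates of the two halves across participants corresponds to the test of choice precision, and comparing the `bias` estimates to the test of the criterion.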
To control for the specificity of the interaction, we estimated the psychometric function on trials median-split on HER amplitude alone but found no difference in choice precision (two-tailed paired t-test, t(20)= 0.41, p = 0.69, BF = 0.27) nor in criterion (two-tailed paired t-test, t(20)= 0.52, p = 0.61, BF = 0.29). Similarly, median-splitting trials on value-related posterior r-vmPFC activity alone revealed no difference in the psychometric curve (two-tailed paired t-test, slope, t(20)= 0.21, p = 0.84, BF = 0.25; criterion: t(20)= −0.57, p = 0.58, BF = 0.30). Therefore, our results indicate that trial-by-trial choice precision is specifically modulated by the interaction between HERs in anterior r-vmPFC and value-related activity in posterior r-vmPFC.
Altogether, these results indicate that the interaction between HER amplitude and subjective value encoding accounts both for within-subject inter-trial variability, and for inter-individual differences in preference-based choice consistency.
HER influence is specific to preference-based choices
Finally, we tested whether the influence of HER was specific to subjective value or whether it is a more general mechanism modulating all types of decision-relevant evidence. To this aim, we analyzed perceptual discrimination trials using the same approach as for preference-based trials. First, we modeled the single trial response-locked MEG sensor-level data using a GLM (GLM1b) with the parameters accounting for choice in the perceptual task (i.e. contrast of the chosen option – ChosenCtrs – and the contrast of the unchosen option – UnchosenCtrs), as well as response button. We then identified the time windows where the contrast of the chosen option, unchosen option and response button were encoded (Supplementary Table 8) and determined the cortical regions contributing to the encoding of the contrast of the chosen option (Supplementary Fig. 2, Supplementary Table 9). Finally, we median-split perceptual trials according to the amplitude of HERs in anterior r-vmPFC. The encoding strength of the contrast of the chosen option did not depend on HER amplitude in any of the contrast-encoding regions (all p ≥ 0.26, all BF ≤ 0.62; Supplementary Table 10). The results thus indicate that HER amplitude in r-vmPFC specifically influences the cortical representation of subjective value.
Discussion
We show that preference-based choice consistency depends on the influence that neural responses to heartbeats exert on the neural representation of subjective value in vmPFC. Preparing for the subjective preference-based decisions led to larger responses to heartbeats than preparing for the objective discrimination task, in vmPFC and post-central gyrus, two regions known to respond to heartbeats 12–15. HERs during task preparation influenced the neural representation of decisional evidence in the subjective task only. The interaction between HERs and subjective value encoding was specific to vmPFC, showing that this region – known to play a role in both valuation16–18 and self-related cognition19, 20 – integrates those two processes. HERs’ multiplicative influence on subjective value benefitted behavior, improving choice precision at the single trial level, and predicted inter-individual variability in choice consistency. HER influence could not be trivially explained by changes in cardiac parameters (heart rate, heart rate variability, ECG, stroke volume) or changes in arousal (pupil diameter, alpha power), indicating that the HER effect corresponds to change in the quality of the neural monitoring in cardiac inputs, rather than to changes in cardiac inputs or in overall brain state. Our results thus differ from previous observations in risky decisions, where changes in peripheral bodily signals predicted behavioral performance21. Altogether, our results indicate that the self-reflection intrinsic to preference-based decisions involves the neural read-out of a physiological variable and its integration in the valuation process.
We successfully replicated with MEG the cortical valuation network described in the fMRI literature 3, 22, 23, including dmPFC and vmPFC, during the choice period. These findings are interesting per se, as data on the temporal course of value-based choices in the human prefrontal cortex remain scarce 24–26. Here, we find that the chosen value is robustly encoded in the valuation network from 600 ms to 200 ms before motor response, with a temporal (but not spatial) overlap with motor preparation. Note that we did not find a robust encoding of the unchosen value, in line with electrophysiological recordings in the monkey orbitofrontal cortex where the encoding of the chosen value dominates 25, 27–29. The encoding of the chosen value interacted with HER specifically in vmPFC. In the objective discrimination task, the contrast of the chosen option was encoded, among other regions, in posterior parietal cortex, consistent with monkey electrophysiology literature30, 31. Importantly, HERs did not exert any influence on the neural encoding of objective perceptual evidence. This result might seem at odds with previously reported effects of HERs on sensory evidence during visual detection at threshold15. As opposed to the perceptual discrimination task used here, visual detection at threshold is intrinsically subjective, as participants are asked to report their fluctuating experience in response to physically and objectively constant stimuli 32, 33.
What is the nature of the interaction between HERs and value encoding in vmPFC? Because fluctuations in HERs occurred during task preparation, before option presentation, their influence on value encoding might generally pertain to the interaction between ongoing activity (during task preparation) and stimulus-evoked activity (in response to option presentation). However, the interaction with value encoding was specific to the transient neural activity evoked by heartbeats; no interaction between overall ongoing activity in vmPFC and value encoding could be found. In addition, HERs modulated value encoding in a multiplicative manner. Our results thus differ from a previously reported vmPFC baseline-shift additive effect in pleasantness ratings 34. A more detailed mechanistic account of how the response to heartbeats influences the subjective valuation process taking place about 1.5 seconds later remains to be established. Still, our results offer a new perspective on the unspecified ‘neural noise’ driving fluctuations in choice consistency 35–37: in subjective preference-based decisions, choice consistency is specifically affected by the interaction between neural responses to heartbeats and neural evaluation.
Decisions on primary goods such as food take into account internal state to select options that preserve the integrity of the organism that needs to be fed and protected. We show here that the subjective valuation of cultural goods, which relies on the same cortical valuation network as that employed for primary goods 36–39, has inherited a functional link with the central monitoring of physiological variables. Even when choosing between cultural goods that do not fulfill any immediate basic need, the neural monitoring of heartbeats supports the self-reflection underlying evaluation, contributes to the precision of subjective decisions and fosters the stable expression of long-lasting preferences that define, at least in part, our identity.
Methods
Participants
Twenty-four right-handed volunteers with normal or corrected-to-normal vision took part in the study after having given written informed consent. They received monetary compensation for their participation. The ethics committee CPP Ile de France III approved all experimental procedures. Three subjects were excluded from further analysis: one for low overall performance (74%, 2 standard deviations below the mean of 87.6%), one for an excessive number of artifacts (29.7% of trials, more than 2 standard deviations above the mean of 7%), and one because the ICA correction of the cardiac artifact was not successful.
21 subjects were thus retained for all subsequent analyses (9 male; mean age: 23.57 ± 2.4 years; mean ± SD).
Tasks and procedure
Participants came on two consecutive days to the lab (mean elapsed time between the two sessions 22.28 ± 3.55 hours) to complete two experimental sessions. The first session was a likability rating on movies (behavior only), from which we drew the stimuli used in the second experimental session, during which brain activity was recorded with magnetoencephalography (MEG).
Rating session
We selected 540 popular movies from the Web (allocine.fr) whose titles had a maximal length of 16 characters (spaces included). DVD covers and titles of the pre-selected movies were displayed one by one on a computer screen and subjects had to indicate whether they had previously watched the currently displayed movie by pressing a ‘yes’ or ‘no’ key on a computer keyboard, without any time constraint (Supplementary Fig. 1A). Participants were then presented with the list of movie titles they had previously watched and asked to name the 2 movies they liked the most and the 2 they liked the least. Participants were explicitly instructed to use these 4 movies as reference points (the extremes of the rating scale) to rate all other movies. Last, the titles and the covers of the movies belonging to the list were displayed one by one at the center of the computer screen in random order. Participants assigned to each movie a likability rating by displacing (with arrow keys) a cursor on a 21-point Likert scale and validated their choice with an additional button press (Supplementary Fig. 1B). Likability ratings were self-paced and the starting position of the cursor was randomized at every trial.
Stimuli
Experimental stimuli consisted of 256 pairs of written movie titles drawn from the list of movies that each participant had rated on the first day. Each movie title was characterized along two experimental dimensions: its likability rating (as provided by the participant) and its contrast. The mean contrast was obtained by averaging the luminance value (between 40 and 100; grey background at 190) randomly assigned to each character of the title. We manipulated trial difficulty by pairing movie titles so that the differences between the two items along the two dimensions (i.e., likability and contrast) were parametric and orthogonal. Additionally, we controlled that the sum of ratings and the sum of contrasts within each difficulty level was independent of their difference and evenly distributed. Each pair of stimuli was presented twice in the experiment: once per decision type. A given movie title could appear in up to 10 different pairs. The position of the movie titles on the screen was pseudo-randomly assigned so that the position of the correct option (higher likeability rating or higher contrast) was fully counter-balanced.
Experimental task
On the second day, subjects performed a two-alternative forced-choice (2AFC) task while brain activity was recorded with MEG. At each trial, participants were instructed to perform one of the two decision types on the pair of movie titles (Fig. 1A): either a preference decision, in which they had to indicate the item they liked the most, or a perceptual discrimination, in which they had to indicate the title written with the higher contrast. Each trial began with a fixation period of variable duration (uniformly distributed between 0.8 and 1.2 seconds in steps of 0.05 s) indicated by a black fixation dot surrounded by a black ring (internal dot, 0.20° of visual angle; external black ring, 0.40° of visual angle), from which point participants were required not to blink. Next, the outer ring of the fixation turned either into a square or a diamond (0.40° and 0.56° visual angle, respectively) indicating which type of decision participants were to perform (preference-based or perceptual, counter-balanced across participants), for 1.5 seconds. Then, the outer shape turned again into a ring and two movie titles appeared above and below it (visual angle 1.09°). Options remained on screen until a response was provided (via button press with the right hand) or until 3 seconds had elapsed. After response delivery, movie titles disappeared and the black fixation dot surrounded by the black circle remained on screen for 1 more second. The central dot then turned green and stayed on screen for a variable time (uniformly distributed between 2.5 and 3 seconds in steps of 0.05 s), indicating to participants that they were allowed to blink before the beginning of the next trial. Each recording session consisted of 8 blocks of 64 trials each.
Prior to the recording session, participants familiarized themselves with the experimental task by carrying out 3 training blocks. The first 2 blocks (10 trials each) comprised trials of one type only, hence preceded by the same cue symbol. The last block contained interleaved trials (n=20), as in the actual recording. The movie pairs used during training were not presented again during the recording session.
Heartbeat counting task
After performing the 8 experimental blocks, we assessed participants’ interoceptive abilities by asking them to count their heartbeats by focusing on their bodily sensations, while fixating the screen 42. Subjects performed six blocks of different durations (30, 45, 60, 80, 100, 120 seconds) in randomized order. No feedback on performance was provided. Since the acquisition of our dataset, this widely used paradigm has been criticized in several respects 43–45.
Recordings
Neural activity was continuously recorded using a MEG system with 102 magnetometers and 204 planar gradiometers (Elekta Neuromag TRIUX, sampling rate 1000 Hz, online low-pass filter at 330 Hz). Cardiac activity was simultaneously recorded (BIOPAC Systems, Inc.; sampling frequency 1000 Hz; online filter 0.05-35 Hz). The electrocardiogram was obtained from 4 electrodes (2 placed over the left and right clavicles, 2 over left and right supraspinatus muscles 50) and referenced to another electrode on the left iliac region of the abdomen, corresponding to four vertical derivations. The four horizontal derivations were computed offline by subtracting the activity of two adjacent electrodes. Additionally, we measured beat-to-beat changes in cardiac impedance, to compute the beat-by-beat stroke volume (i.e. the volume of blood ejected by the heart at each heartbeat 51). Impedance cardiography is a non-invasive technique based on the impedance changes in the thorax due to the changes in fluid volume (blood). A very low-intensity (400 μA rms) high-frequency (12.5 kHz) electric current was injected via two source electrodes: the first one was placed on the left side of the neck and the second 30 cm below it (roughly on the sixth rib). Two other monitoring electrodes (placed 4 cm apart from the source ones: below the source electrode on the neck and above the source electrode on the rib cage) measured the voltage across the tissue. To determine left ventricular ejection time, aortic valve activity was recorded by placing an a-magnetic homemade microphone (online band-pass filter 0.05-300 Hz) on the chest of the subject.
Pupil diameter and eye movements were tracked using an eye-tracker device (EyeLink 1000, SR Research) and 4 electrodes (2 electrodes placed on the left and right temples and 2 electrodes placed above and below the participant’s dominant eye).
Cardiac events and parameters
Cardiac events were detected on the right clavicle-left abdomen ECG derivation in all participants. We computed a template of the cardiac cycle by averaging a subset of cardiac cycles, and this template was then convolved with the ECG time series. R-peaks were identified as peaks of the result of the convolution, normalized between 0 and 1, exceeding 0.6. All other cardiac waves were defined with respect to the R-peak. In particular, T-waves were identified as the maximum amplitude occurring within 420 milliseconds after the Q-wave. R-peak and T-wave automatic detection was visually verified for each participant.
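The template-matching step can be illustrated with a minimal sketch. It assumes that the "convolution" is implemented as a sliding dot product (cross-correlation) between the ECG and the template; the function name `detect_r_peaks` and the local-maximum rule are illustrative choices, not the authors' code.

```python
def detect_r_peaks(ecg, template, threshold=0.6):
    """Return sample indices of R-peak candidates in `ecg`.

    The ECG is matched against a cardiac-cycle template with a sliding dot
    product, the result is rescaled to [0, 1], and local maxima above
    `threshold` are kept (0.6 is the value given in the text).
    """
    n, m = len(ecg), len(template)
    # score[i] aligns template[0] with ecg[i] (valid part only).
    score = [sum(ecg[i + k] * template[k] for k in range(m))
             for i in range(n - m + 1)]
    lo, hi = min(score), max(score)
    if hi == lo:
        return []  # flat signal: nothing to detect
    norm = [(s - lo) / (hi - lo) for s in score]
    # Position of the template's own maximum, used to report peaks on the ECG axis.
    offset = max(range(m), key=lambda k: template[k])
    peaks = []
    for i in range(1, len(norm) - 1):
        is_local_max = norm[i] >= norm[i - 1] and norm[i] > norm[i + 1]
        if norm[i] > threshold and is_local_max:
            peaks.append(i + offset)
    return peaks


# Toy ECG with two identical beats centred on samples 10 and 30.
template = [0, 1, 5, 1, 0]
ecg = [0] * 50
for centre in (10, 30):
    ecg[centre - 1:centre + 2] = [1, 5, 1]
peaks = detect_r_peaks(ecg, template)
```

On real data the detected peaks would still need the visual check described above.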
Inter-beat intervals (IBIs) were defined for each phase of the trial as the intervals between two consecutive R-peaks. More specifically, we considered for ‘fixation’, ‘instruction period’ and ‘response’ phases the two R-peaks around their occurrence. IBIs during ‘choice’ were based on the two R-peaks preceding response delivery. Inter-beat variability was defined as the standard deviation across trials of IBIs in a given trial phase.
Stroke volume (SV) was computed according to the formula 51, 52:
$$SV = \rho \cdot \frac{L^2}{Z_0^2} \cdot LVET \cdot \left(\frac{dZ}{dt}\right)_{max}$$
where ρ is the resistivity of the blood (135 Ohms*cm) 53, L is the distance between the two source electrodes, Z0 is the base impedance, LVET is the systolic left ventricular ejection time (in seconds), and (dZ/dt)max is the largest impedance change during systole (Ohms/sec). Note that because we obtained stroke volume by injecting a current at 12.5 kHz, rather than the more typical frequency of 100 kHz, absolute stroke volumes are systematically underestimated, but relative values are preserved.
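As a worked illustration, the quantities defined above can be combined as in the following sketch. It assumes the standard Kubicek form of the formula (resistivity times squared electrode distance over squared base impedance, times ejection time, times maximal impedance change); the function and argument names are hypothetical.

```python
def stroke_volume(l_cm, z0_ohm, lvet_s, dzdt_max_ohm_per_s, rho_ohm_cm=135.0):
    """Beat-by-beat stroke volume in mL (cm^3), Kubicek-type formula.

    l_cm: distance between the two source electrodes (cm)
    z0_ohm: base impedance (Ohms)
    lvet_s: left ventricular ejection time (s)
    dzdt_max_ohm_per_s: largest impedance change during systole (Ohms/s)
    rho_ohm_cm: blood resistivity, 135 Ohms*cm as stated in the text
    """
    return rho_ohm_cm * (l_cm ** 2 / z0_ohm ** 2) * lvet_s * dzdt_max_ohm_per_s


# Illustrative (made-up) values: L = 30 cm, Z0 = 30 Ohms, LVET = 0.3 s, dZ/dt max = 1 Ohm/s.
sv = stroke_volume(30.0, 30.0, 0.3, 1.0)
```

The units work out to Ohms*cm x cm^2/Ohms^2 x s x Ohms/s = cm^3, which is why relative values remain comparable across beats even if the absolute scale is biased.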
MEG data preprocessing
External noise was removed from the continuous data using the MaxFilter algorithm. Continuous data were then high-pass filtered at 0.5 Hz (4th order Butterworth filter). Trials (defined as epochs ranging from fixation period to 1 second after response) contaminated by muscle and movement artifacts were manually identified and discarded from further analyses (6% of trials on average, ranging from 0% to 15%).
Independent component analysis (ICA) 54, as implemented in the FieldTrip Toolbox 55, was used to attenuate the cardiac artifact on MEG data. ICA was computed on MEG data epoched ±200 ms around the R-peak of the ECG, in data segments that were free of artifacts, blinks and saccades above 3 degrees. The number of independent components to compute was set to be equal to the rank of the MEG data. Mean pairwise phase consistency (PPC) was estimated for each independent component 56 with the right clavicle-left abdomen ECG derivation signal in the frequency band 0-25 Hz. Components (up to 3) that exceeded 3 standard deviations from mean PPC were then removed from the continuous data.
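The component-selection rule can be sketched as follows. This is only an illustration of the thresholding step (mean PPC plus 3 standard deviations, at most 3 components), not of the ICA or PPC computation. The use of the sample standard deviation and the name `select_components` are assumptions.

```python
from statistics import mean, stdev


def select_components(ppc, n_sd=3.0, max_components=3):
    """Return indices of components whose PPC exceeds mean + n_sd * SD.

    If more than `max_components` exceed the threshold, only those with the
    highest PPC values are returned.
    """
    threshold = mean(ppc) + n_sd * stdev(ppc)
    above = [i for i, value in enumerate(ppc) if value > threshold]
    above.sort(key=lambda i: ppc[i], reverse=True)
    return sorted(above[:max_components])


# Made-up PPC values: 20 components unrelated to the ECG and one cardiac component.
ppc_values = [0.01] * 20 + [0.9]
to_remove = select_components(ppc_values)
```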
To correct for blinks, 2-second segments of data were used to estimate blink and eye-movement components. Mean PPC was then computed with respect to the vertical EOG signal, and components exceeding mean PPC + 3 standard deviations were removed from continuous data. The procedure was iterated until no component was beyond 3 standard deviations or until 3 components in total were removed. Stereotypical blink components were manually selected in two participants as the automated procedure failed to identify them.
ICA-corrected data were then low-pass filtered at 25 Hz (6th order Butterworth filter).
Trial selection
Trials had to meet the following criteria to be included in all subsequent analyses: no movement artifacts, sum of blinking periods less than 20% of total trial time, at least one T-peak during the instruction period (cf. HERs section), and reaction time (RT) neither too short (at least 250 ms) nor too long. To identify exceedingly long RTs, we binned the trials of each task into 4 difficulty levels based on the difference between the two options (i.e., difference in ratings for preference-based choices and difference in contrast for perceptual ones). Within each difficulty level, for correct and error trials separately, we excluded the trials with reaction times exceeding the participant’s mean RT + 2 standard deviations.
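The reaction-time criterion can be written as a short sketch. It assumes that the mean and standard deviation are computed within each group of trials (one difficulty level, correct or error) and that the sample standard deviation is used; `rt_keep_mask` is a hypothetical helper name.

```python
from statistics import mean, stdev


def rt_keep_mask(rts, min_rt=0.250, n_sd=2.0):
    """Return one boolean per trial: True if the RT (in seconds) is kept.

    `rts` holds the reaction times of one participant for one difficulty
    level and one outcome (correct or error).
    """
    upper = mean(rts) + n_sd * stdev(rts)
    return [min_rt <= rt <= upper for rt in rts]


# Made-up RTs: nine typical trials, one very slow trial, one too-fast trial.
rts = [0.50, 0.60, 0.55, 0.52, 0.58, 0.54, 0.56, 0.53, 0.57, 3.00, 0.10]
mask = rt_keep_mask(rts)
```

In this toy example the 3 s trial exceeds mean + 2 SD and the 0.1 s trial falls below the 250 ms floor, so both are dropped.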
The average number of trials retained per participant was 421.67 ± 43.36 (mean ± SD).
Heartbeat evoked responses
Heartbeat-evoked responses (HERs) were computed on MEG data time-locked to T-waves occurring during the instruction period. T-waves had to occur at least 400 ms before the subsequent R-peak. In order to avoid contamination by transient visual responses or by preparation to the subsequent visual presentation, we only retained T-waves that occurred at least 300 milliseconds after the onset of the instruction cue and 350 milliseconds before the onset of options presentation. If more than one T-wave occurred in this period, HERs for that trial were averaged. HERs were analyzed from T-wave + 50 ms to minimize contamination by the residual cardiac artifact 57 after ICA correction.
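The three timing constraints on T-waves can be made explicit in a small sketch. All times are in milliseconds on a common clock. The function name and the choice to discard a T-wave with no known subsequent R-peak are assumptions.

```python
def select_t_waves(t_waves, r_peaks, instruction_onset, options_onset,
                   min_after_cue=300, min_before_options=350, min_to_next_r=400):
    """Return the T-wave times (ms) usable for HER computation in one trial."""
    selected = []
    for t in t_waves:
        next_r = min((r for r in r_peaks if r > t), default=None)
        if next_r is None or next_r - t < min_to_next_r:
            continue  # too close to (or no) subsequent R-peak
        if t - instruction_onset < min_after_cue:
            continue  # possible contamination by the visual response to the cue
        if options_onset - t < min_before_options:
            continue  # possible contamination by preparation for the options
        selected.append(t)
    return selected


# Made-up trial: instruction cue at 0 ms, options at 1500 ms.
kept = select_t_waves(t_waves=[200, 950, 1700],
                      r_peaks=[-100, 650, 1400, 2150],
                      instruction_onset=0, options_onset=1500)
```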
We verified that differences in HERs between the two types of decision were truly locked to heartbeats, and that a difference of similar magnitude could not arise by locking the data to any time point of the instruction period. To this end, we created surrogate timings for heartbeats (within the instruction period), to break the temporal relationship between neural data and heartbeats, and computed surrogate HERs. We created 500 surrogate heartbeat data sets by permuting the timings of the real T-waves between trials belonging to the same decision type (i.e., the timing of the T-wave at trial i was randomly assigned to trial j). We then searched for surrogate HER differences between trial types using a cluster-based permutation test 58 (see below). For each of the 500 iterations, we retained the value of the largest cluster statistic (sum(t)) to estimate the distribution of the largest difference that could be obtained by randomly sampling ongoing neural activity during the same instruction period. To assess statistical significance, we compared the cluster statistic from the original data against the distribution of surrogate statistics.
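The surrogate logic can be sketched in two small steps: reassigning T-wave latencies across trials of the same decision type, and comparing the observed cluster statistic with the surrogate distribution. Expressing the comparison as the proportion of surrogate statistics at least as extreme (in absolute value) as the observed one is an assumption. Both function names are hypothetical.

```python
import random


def shuffle_t_wave_timings(timings, rng):
    """Reassign T-wave latencies across trials of the same decision type.

    `timings` holds one latency per trial (relative to instruction onset);
    the input list is left untouched.
    """
    shuffled = list(timings)
    rng.shuffle(shuffled)
    return shuffled


def surrogate_p_value(observed_stat, surrogate_stats):
    """Proportion of surrogate cluster statistics at least as extreme as observed."""
    n_extreme = sum(1 for s in surrogate_stats if abs(s) >= abs(observed_stat))
    return n_extreme / len(surrogate_stats)


# Made-up latencies (ms) for four trials of one decision type.
original = [310, 420, 515, 640]
surrogate = shuffle_t_wave_timings(original, random.Random(0))
# Made-up largest sum(t) values from four surrogate data sets.
p = surrogate_p_value(10.0, [1.0, -2.0, 3.0, 12.0])
```

In the actual analysis each surrogate data set would be passed through the same HER computation and clustering procedure before its largest cluster statistic is stored.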
Nonparametric statistical testing of MEG data
The difference in HERs between preference-based and perceptual trials during instruction presentation was tested for statistical significance using a cluster-based permutation two-tailed t-test 58 as implemented in the FieldTrip toolbox 55, on magnetometer activity in the time-window 50-300 ms after the T-wave. This method defines candidate clusters of neural activity based on spatio-temporal adjacency exceeding a statistical threshold (p < 0.05) for a given number of neighboring sensors (n = 3). Each candidate cluster is assigned a cluster-level test statistic corresponding to the sum of t values of all samples belonging to the given cluster. The null distribution is obtained non-parametrically by randomly shuffling condition labels 10,000 times, computing at each iteration the cluster statistics and saving the largest positive and negative t sums. The Monte Carlo p value corresponds to the proportion of cluster statistics under the null distribution that exceed the original cluster-level test statistic. Because the largest chance values are retained to construct the null distribution, this method intrinsically corrects for multiple comparisons across time and space. Control analyses involving the clustering procedure were performed with the same parameters.
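The cluster-level statistic can be illustrated with a deliberately simplified sketch that clusters along time only, for a single sensor. The spatial adjacency criterion and the permutation loop are omitted, and the threshold is expressed directly as a t value. This is not the FieldTrip implementation.

```python
def cluster_sums(t_values, t_threshold):
    """Sum t values within runs of consecutive supra-threshold samples.

    Positive and negative runs form separate clusters, as in a two-tailed test.
    """
    clusters = []
    current, sign = 0.0, 0
    for t in t_values:
        s = (t > t_threshold) - (t < -t_threshold)  # +1, -1 or 0
        if s != 0 and s == sign:
            current += t  # extend the ongoing cluster
        else:
            if sign != 0:
                clusters.append(current)  # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        clusters.append(current)
    return clusters


# Made-up t time series with two positive clusters and one negative cluster.
sums = cluster_sums([0.5, 2.5, 3.0, 1.0, -2.2, -2.4, 0.1, 2.1], t_threshold=2.0)
largest_positive = max(sums)
largest_negative = min(sums)
```

Under each label shuffle, the largest positive and negative sums would be stored to build the null distribution described above.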
The significance of beta time-series obtained from GLM analyses at the sensor level was assessed using a cluster-based permutation two-tailed t-test against zero.
Bayes factor
We used Bayes factors (BF) to quantify the evidence in support of the null hypothesis (H0 = no difference between 2 measures). To this aim, we computed the maximum log-likelihood of a Gaussian model in favor of the alternative hypothesis and for the model favoring the null adjusting the effect size to correspond to p = 0.05 for our sample size (n = 21 for all analyses except for pupil, for which n = 16, and for 3 ECG derivations, for which n = 20). Finally, we computed the Bayesian information criterion and the corresponding Bayes factor. As a summary indication, BF < 0.33 provides substantial evidence in favor of the null hypothesis, whereas BF between 0.33 and 3 does not provide enough evidence for or against the null 59.
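The last step, from information criteria to a Bayes factor, is commonly approximated as BF ≈ exp((BIC_H0 − BIC_H1) / 2). The sketch below shows that generic approximation only. It does not reproduce how the authors set the effect size of the alternative model, and the function names are hypothetical.

```python
import math


def bic(max_log_likelihood, n_params, n_obs):
    """Bayesian information criterion of a fitted model."""
    return n_params * math.log(n_obs) - 2.0 * max_log_likelihood


def bayes_factor_10(bic_null, bic_alt):
    """Approximate Bayes factor for H1 over H0; values below 0.33 favor the null."""
    return math.exp((bic_null - bic_alt) / 2.0)


# Made-up BIC values in which the null model fits better (lower BIC).
bf = bayes_factor_10(bic_null=10.0, bic_alt=14.0)
```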
For regression analyses, Bayes Factor was computed using the online calculator tool (http://pcl.missouri.edu/bf-reg) based on Liang and colleagues 60.
Generalized linear model on response-locked single trials
To analyze how task-related variables are encoded in neural activity during decision, we ran a generalized linear model (GLM) on baseline-corrected (−500 to −200 ms before instruction presentation) single trial MEG data time-locked to button press. We predicted z-scored MEG activity at each time-point and channel using task-relevant experimental variables. For preference-based decisions we modeled MEG activity as:
$$MEG_{t,c} = \beta_0 + \beta_{ChosenSV} \cdot ChosenSV + \beta_{UnChosenSV} \cdot UnChosenSV + \beta_{Button\ press} \cdot ButtonPress$$
where MEG_{t,c} is the MEG activity at time-point t and channel c, β0 is the intercept, ChosenSV is the z-scored rating of the chosen option, UnChosenSV is the z-scored rating of the alternative unchosen option and ButtonPress is a categorical variable representing motor response (i.e. top or bottom). Similarly, for perceptual decisions we used:
$$MEG_{t,c} = \beta_0 + \beta_{ChosenCtrs} \cdot ChosenCtrs + \beta_{UnChosenCtrs} \cdot UnChosenCtrs + \beta_{Button\ press} \cdot ButtonPress$$
where ChosenCtrs and UnChosenCtrs are the z-scored contrasts of the chosen and unchosen options, respectively.
This procedure provided us with time series of beta values at each channel that could be tested against zero for significance using spatio-temporal clustering 58. Once significant clusters encoding task-related variables were identified at the sensor level, we reconstructed the cortical sources corresponding to the sensor-level activity averaged within the significant time-window. We modeled source-reconstructed neural activity with the same GLMs to identify the cortical areas mostly contributing to the significant sensor-level effect.
Generalized linear model on posterior right vmPFC
To quantify the influence of HER in anterior r-vmPFC during instructions on subjective value encoding during choice, we modeled the activity of posterior r-vmPFC, encoding subjective value, with the following GLM:
$$\text{posterior r-vmPFC} = \beta_0 + \beta_{ChosenSV} \cdot ChosenSV + \beta_{HER} \cdot HER + \beta_{ChosenSV*HER} \cdot (ChosenSV \times HER)$$
where β0 is the intercept, ChosenSV is the z-scored rating of the chosen option, HER is the z-scored activity in the anterior right vmPFC cluster defined by comparing HERs in preference-based vs. perceptual choices, and ChosenSV × HER is the interaction term obtained by multiplying the two preceding z-scored predictors.
To verify that the interaction between subjective value encoding and HER amplitude was specifically time-locked to heartbeats and not a general influence of baseline activity in anterior r-vmPFC, we ran an alternative model to explain posterior r-vmPFC activity:
$$\text{posterior r-vmPFC} = \beta_0 + \beta_{ChosenSV} \cdot ChosenSV + \beta_{BLvmPFC} \cdot BLvmPFC + \beta_{BLvmPFC*ChosenSV} \cdot (BLvmPFC \times ChosenSV)$$
where β0 is the intercept, ChosenSV is the z-scored rating of the chosen option, BLvmPFC is the z-scored activity in anterior r-vmPFC during instructions, averaged across the whole instruction period and not time-locked to heartbeats, and BLvmPFC × ChosenSV is the interaction between the two preceding predictors.
Note that regressors are not orthogonalized in any of the GLMs.
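The interaction model for posterior r-vmPFC can be illustrated with a self-contained ordinary least-squares sketch. It assumes one value per trial for each variable and solves the normal equations directly. The helper names (`zscore`, `solve`, `ols`, `fit_interaction_model`) and the toy data are hypothetical, and regressors are left non-orthogonalized as stated above.

```python
from statistics import mean, pstdev


def zscore(x):
    """Standardize a list of values to zero mean and unit (population) SD."""
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares betas (intercept first) for the given predictor columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(row[j] * row[k] for row in X) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)


def fit_interaction_model(chosen_sv, her, vmpfc):
    """Return [b0, b_ChosenSV, b_HER, b_ChosenSV*HER] for single-trial data."""
    sv_z, her_z = zscore(chosen_sv), zscore(her)
    interaction = [a * b for a, b in zip(sv_z, her_z)]  # product of z-scored predictors
    return ols([sv_z, her_z, interaction], vmpfc)


# Toy data generated from known betas (0.5, 1, 2, 3) to check the fit.
sv = [1, 2, 3, 4, 5, 6, 7, 8]
her = [2, -1, 0, 3, 1, -2, 4, 0.5]
sv_z, her_z = zscore(sv), zscore(her)
y = [0.5 + a + 2 * b + 3 * a * b for a, b in zip(sv_z, her_z)]
betas = fit_interaction_model(sv, her, y)
```

The alternative baseline model has the same structure, with the HER regressor replaced by activity averaged over the whole instruction period.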
Anatomical MR acquisition and preprocessing
An anatomical T1 scan was acquired for each participant on a 3 Tesla Siemens TRIO (n = 2), Siemens PRISMA (n = 20) or Siemens VERIO (n = 2) scanner. Cortical segmentation was obtained using the automated procedure implemented in the FreeSurfer software package 61. The results were visually inspected and used for minimum-norm estimation.
Source reconstruction
Cortical localization of neural activity was performed with the BrainStorm toolbox 62. After co-registration of individual anatomy and MEG sensors, 15,003 current dipoles were estimated using a linear inverse solution from time-series of magnetometers and planar gradiometers (weighted minimum-norm, SNR of 3, whitening PCA, depth weighting of 0.5) using an overlapping-spheres head model. Current dipoles were constrained to be normally oriented to the cortical surface, based on individual anatomy. Source activity was obtained by averaging sensor-level time-series in the time-windows showing significant effects (difference between HERs and beta values different from zero), spatially smoothed (FWHM 6 mm) and projected onto a standard brain model (ICBM152_T1, 15,003 vertices). Note that sources in subcortical regions cannot be retrieved with the reconstruction method used here.
To assess which cortical areas contributed the most to the effects observed at the sensor level, we ran parametric two-tailed t-tests and reported all clusters of activity spatially extending over more than 20 vertices with individual t-values corresponding to p < 0.005 (uncorrected for multiple comparisons). We reported the coordinates of vertices with the maximal t value and their anatomical labels according to the AAL atlas 63. For clusters falling into prefrontal cortices, we reported the corresponding areas according to the connectivity-based parcellation developed by Neubert and colleagues 64.
Pupil data analysis
Pupil data that contained blinks (automatically detected with EyeLink software and extended before and after by 150 ms), saccades beyond 2 degrees and segments in which pupil size changed abruptly (signal temporal derivative exceeding 0.3, arbitrary unit) were linearly interpolated. All interpolated portions of the data that exceeded 1 second were removed from further analyses. Continuous pupil data from each experimental block were then band-pass filtered between 0.01 and 10 Hz (second order Butterworth) and z-scored. 16 subjects were retained for pupil analysis; 5 subjects were excluded because of insufficient data quality. Pupil analysis was performed in two ways: 1) averaged pupil diameter in the same time period used for HER computation (i.e., 300 ms after instruction presentation until 350 ms before options display) and 2) averaged pupil diameter in the time-window spanning 1 second before button press until its execution.
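The blink padding and linear interpolation steps can be sketched as follows. The sketch handles blink flags only (saccades and abrupt changes would be flagged in the same way, without padding). It marks gaps longer than 1 second, or gaps touching the edges of the recording, as missing (`None`). The function name and these edge-handling choices are assumptions.

```python
def interpolate_blinks(pupil, blink, fs=1000, pad_s=0.150, max_gap_s=1.0):
    """Linearly interpolate pupil samples flagged as blinks.

    pupil: list of pupil sizes; blink: list of booleans of the same length.
    Blink periods are extended by `pad_s` on both sides before interpolation.
    """
    n = len(pupil)
    pad = int(round(pad_s * fs))
    bad = [False] * n
    for i, flag in enumerate(blink):
        if flag:
            for j in range(max(0, i - pad), min(n, i + pad + 1)):
                bad[j] = True
    out = list(pupil)
    i = 0
    while i < n:
        if not bad[i]:
            i += 1
            continue
        start = i
        while i < n and bad[i]:
            i += 1
        end = i  # index of the first good sample after the gap, or n
        too_long = (end - start) > max_gap_s * fs
        if start == 0 or end == n or too_long:
            for j in range(start, end):
                out[j] = None  # cannot or should not be interpolated
        else:
            left, right = pupil[start - 1], pupil[end]
            span = end - (start - 1)
            for j in range(start, end):
                out[j] = left + (right - left) * (j - (start - 1)) / span
    return out


# Toy trace sampled at 10 Hz with one blink sample (value 99) padded by one sample.
clean = interpolate_blinks([0, 1, 2, 99, 4, 5, 6],
                           [False, False, False, True, False, False, False],
                           fs=10, pad_s=0.1)
```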
Author contributions
DA, SP and CTB designed the study. DA collected the data. DA, CTB and AB analyzed the data. DA and CTB interpreted the results and wrote the first draft of the paper. All authors contributed to the final version of the paper.
Competing interests
The authors declare no competing interests.
Materials & correspondence
Correspondence and requests for materials should be addressed to C.T.B. (email: catherine.tallon-baudry{at}ens.fr)
Data availability
The data that support the findings of this study will be available from the corresponding authors upon reasonable request.
Code availability
The custom code used for the main analyses of this paper can be accessed online at https://github.com/DamianoAzzalini/HER-preferences.
Acknowledgements
The authors thank Clémence Alméras, Maximilien Chaumon and Christophe Gitton for assistance in data acquisition.
This work was supported by funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No 670325, Advanced grant BRAVIUS) and a senior fellowship of the Canadian Institute For Advanced Research (CIFAR) program in Brain, Mind and Consciousness to CTB; by a grant from the Ecole des Neurosciences de Paris Ile de France to DA; and by ANR-17-EURE-0017.