Frontoparietal dynamics and value accumulation in intertemporal choice

Intertemporal choice requires choosing between a smaller reward available after a shorter time delay and a larger reward available after a longer time delay. Previous studies suggest that intertemporal preferences are formed by generating a subjective value of the monetary rewards that depends on reward amount and the associated time delay. Neuroimaging results indicate that this subjective value is tracked by ventral medial prefrontal cortex (vmPFC) and ventral striatum. Subsequently, an accumulation process, subserved by a network including dorsal medial frontal cortex (dmFC), dorsal lateral prefrontal cortex (dlPFC) and posterior parietal cortex (pPC), selects a choice based on the subjective values. The mechanisms of how value accumulation interacts with subjective valuation to make a choice, and how brain regions communicate during decision making are undetermined. We developed and performed an EEG experiment that parametrically manipulated the probability of preferring delayed larger rewards. A computational model equipped with time and reward information transformation, selective attention, and stochastic value accumulation mechanisms was constructed and fit to choice and response time data using a hierarchical Bayesian approach. Phase-based functional connectivity between putative dmFC and pPC was found to be associated with stimulus processing and to resemble the reconstructed accumulation dynamics from the best performing computational model across experimental conditions. By combining computational modeling and phase-based functional connectivity, our results suggest an association between value accumulation, choice competition, and frontoparietal connectivity in intertemporal choice. Author summary Intertemporal choice is a prominent experimental assay for impulsivity. Behavior in the task involves several cognitive functions including valuation, action selection and self-control. It is unknown how these different functions are temporally implemented during the course of decision making. In the current study, we combined formal computational models of intertemporal choice with a phase-based EEG measure of activity across brain regions to show that functional connectivity between dmFC and pPC reflects cognitive mechanisms of both visual stimulus processing and choice value accumulation. The result supports the notion that dynamic interaction between frontopatietal regions instantiates the critical value accumulation process in intertemporal choice.

The ability to make future-oriented decisions is a key component of long-term 2 well-being [1][2][3]. Often, securing the largest cumulative reward requires forgoing 3 immediate gratification, and holding out for a long term reward. A decision that 4 requires an individual to choose among two or more rewards occurring at different time 5 points is referred to as an intertemporal choice. Neuroimaging studies have identified a 6 number of brain regions involved in intertemporal decision making [4][5][6][7], and scoped a 7 general framework for understanding the functions implemented by these neural 8 systems [8][9][10]. According to this framework, an intertemporal choice requires the 9 completion of a subjective value representation, and a choice selection 10 processes [4-6, 9, 11, 12]. To construct a subjective value representation, the information 11 about the monetary value (i.e., reward) of a choice alternative and the amount of time 12 (i.e., delay) associated with it must be fused together. This subjective value 13 representation process is now widely acknowledged to be closely associated with neural 14 processes in the ventral striatum and the ventromedial prefrontal cortex (vmPFC) 15 across different experimental tasks [7,8,12]. 16 The process of choosing among intertemporal choice alternatives creates a type of 17 competition for choice selection, and consideration of different aspects of the 18 alternatives (e.g., reward and delay) creates different instantaneous subjective values, 19 which must be compared. We have argued that value comparison occurs by 20 accumulating stochastic evidence of the relative values of the choice options [13,14]. 21 This choice value accumulation process has been associated with a large brain network, 22 including dorsal medial frontal cortex (dmFC), dorsolateral prefrontal cortex (dlPFC) 23 and posterior parietal cortex (pPC) [12,15,16]. Disrupting brain activity using 24 transcranial magnetic stimulation (TMS) to bilaterally inhibit prefrontal cortex and 25 right pPC increases impulsive behavior [11,17], implicating these brain regions in the 26 consideration of future rewards. Harris et al. [18] exploited the temporal dynamics of 27 intertemporal choice with EEG and found two separate roles of dlPFC: top-down 28 attention filtering at early decision periods, and value modulation at late decision 29 periods. Although there is extensive research and general consensus about the 30 subjective valuation functions of vmPFC and ventral striatum, there is less consensus 31 about the functional roles of dlPFC, dmFC and pPC because most previous studies 32 have focused on only one specific region or mechanism. Such focus can be productive for 33 localizing a region's functional role, but it leaves open the question of how neural 34 interaction between fronto-parietal regions contributes to decision-making. At least two 35 important questions wait to be explored: (1) Do fronto-parietal regions assume separate 36 or collaborative roles for making intertemporal choices? (2) What neural mechanisms 37 most closely adhere to the value accumulation process? 38 To answer these questions, our current study uses the temporal resolution of EEG to 39 probe the spatio-temporal covariates of cognitive mechanisms during intertemporal among putative dmFC, pPC and dlPFC sources estimated from EEG data. Our results 48 suggest that the functional connectivity between dmFC and pPC is a signature of the 49 value accumulation process. Specifically, the dmFC-pPC functional connectivity, 50 measured by theta-band phase clustering patterns in EEG data, is positively associated 51 with the competitive dynamics of the value accumulation process. This finding enhances 52 our understanding about the spatio-temporal neural correlates of intertemporal choice. 53

54
Subjects completed a pretest session and an EEG session of the intertemporal choice 55 task. The purpose of the pretest session was to execute better control over subjects' 56 behaviors and to adequately decide choice offers for the EEG session. Specifically, we 57 estimated two decision parameters for each subject according to a staircase procedure in 58 the pretest session, such that we could tempt each subject to choose the delayed option 59 with approximate probabilities of 0.1, 0.3, 0.5, 0.7, or 0.9 in the EEG session. We refer 60 to such expected probabilities of choosing delayed options as experimental P D 61 conditions throughout the text. 62 In every trial of the EEG session, a delay t was randomly selected from a range of 15 63 to 45 days. We then calculated and offered an amount r that would give P D of 0.1, 0.3, 64 0.5, 0.7, or 0.9; given the estimated parameters for the subject from the pretest session 65 and the selected delay for the trial (see Methods section for details).  71 stimulus presentation. Delays (t) were presented first for 1000 milliseconds. The amount 72 information (r) was then shown and kept on screen until a choice was made or for a 73 maximum of 4000 milliseconds. The immediate reward was the same ($10) on every 74 trial, but it was only instructed to the subjects before the EEG session, instead of being 75 presented visually. EEG data were collected during the EEG session. Behavioral data 77 To confirm that subjects' behavior was consistent with the experimental manipulation, 78 we first calculated the empirical P D as the proportion of delayed choice responses for 79 each subject in each experimental P D condition. Fig 2a illustrates that the empirical 80 P D (y-axis) during the EEG session closely matches the experimental P D (x-axis) 81 constructed from the pretest session. The solid bordered points correspond to the 82 average empirical P D across subjects, and they fall closely onto the diagonal line, 83 suggesting that our experimental stimuli produced patterns of choice response we 84 intended, despite some individual differences. To test that, we performed a 23 (subjects) 85 × 5 (experimental P D conditions) mixed effect logistic regression analysis with the P D 86 condition as a repeated measure and subject as a random effect. We found a significant 87 effect of the experimental P D condition, and the estimated probability of choosing the 88 delayed option is 0.089, 0.287, 0.509, 0.759, and 0.869 for conditions P D = 0.1, 0.3, 0.5, 89 0.7, and 0.9, respectively. The match between the experimental P D and empirical P D 90 was thus confirmed.

109
In addition to analyses on manifest variables, we fit a computational model to the 110 behavioral data to gain insight into the underlying within-trial temporal dynamics of 111 intertemporal choice [15]. The model we adopted assumed a trade-off mechanism 112 between features of rewards and time delays in forming the representation of choice 113 alternatives [19]. Previous studies have shown that such attribute-wise models could 114 mimic various delay discounting curves in terms of subjective valuation [15] and that 115 these models explain choice probabilities and response times as well or better than 116 hyperbolic discounting models do [20][21][22]. Depending on specific experimental designs, 117 attribute-wise models can provide more information on temporal dynamics when 118 different attributes were presented sequentially (as in the current experimental design). 119 Similarly, attribute-wise models can be convenient to explain eye tracking data where (a) The model takes the objective value of rewards (i.e., r I and r D ; blue nodes) and time delays (i.e., t I and t D ; yellow nodes), and converts these inputs to subjective representations (i.e., I r and I t ) through transformation parameters α r and α t , respectively. Features are selected at each moment with the parameter ω, creating an integrated subjective value of each alternative (i.e., the green node). Deliberation among the immediate and delayed alternatives is modulated by lateral inhibition parameters β I and β D (i.e., the orange node). Once an accumulator reaches a threshold amount of preference (θ), a decision is made corresponding to the winning accumulator (i.e., the red node). (b) Model fits from each model in (b), aggregated across subjects. (c) Model fitting results in terms of a z-transformed BIC statistic for each model configuration (by row) and each subject (by column). Each model variant is described on the left panel by a set of empty or filled circles, where empty circles indicate that a parameter was free to vary, whereas filled nodes indicate that a parameter was fixed. The model structures are grouped by their number of free parameters: black (3), purple (4), blue (5), green (6) and yellow (7). For the zBIC, lower values (blue) suggest better model performance.
In Fig 3a, x . The power transformation is intended to represent how subjects encode stimulus 131 values, where the reward r x and delay t x (x = I or D) information are transformed into 132 subjective representations r * x and t * x through the parameters α r and α t , respectively.

133
The reward and time delay can be viewed as two separate feature dimensions that 134 are selectively attended to at each moment in time t. Let w(t) denote the feature 135 dimension that is being attended at the moment t. For simplicity, we assume that w(t) 136 follows a Bernoulli distribution over time, such that where ω is an attention parameter. We arbitrarily specified the values of w(t) such 138 that when w(t) = 0, attention was directed toward the time information, whereas when 139 w(t) = 1, attention was directed toward the reward information. Hence, estimates of the 140 ω parameter that are large (i.e., near one) indicate greater attention to the reward 141 information.  Fig 3a, this isolated subjective valuation is represented as the green node. As independently from the reward information (i.e. w(t) = 0). To account for this, we 149 assumed that the subjective values were constructed in a "piecewise" manner, such that 150 for the first 1000 milliseconds, V I (t) = t αt D , and V D (t) = t αt I . Hence, before the reward information is displayed (i.e., before one second), the 152 subjective value of V I and V D only contains the encoded value of time delay. Once the 153 reward information is presented (i.e., after one second), the subjective value oscillates leakage on the temporal dynamics of decision making [15,16,21,23,[26][27][28][29][30][31]. The lateral 164 inhibition mechanism creates mutual competition between the immediate and delayed 165 options during the course of value accumulation. The leakage mechanisms allow for the 166 passive loss of information in the value accumulation process. For example, in our 167 previous work [15], we have shown that trial-to-trial variability in the lateral inhibition 168 parameter covaries with trial-to-trial measures of self control expressed in several of the 169 integration brain regions discussed in the introduction (e.g., dmFC and dlPFC), yet 170 were not linked to aspects of the valuation process (e.g., vmPFC).  Unlike the leakage term, the noise term σ was estimated in all model variants.

183
Together, the value accumulation process including lateral inhibition, leakage and 184 noise for each of immediate (A I ) and delayed (A D ) alternatives can be expressed as Although the model is actually a continuous diffusion process, Eq 2 is written as the 186 recursive numerical approximation of the process in the Euler method [32]. The term 187 denotes a discrete time step in the accumulation process. We set = 0.1 in our 188 approximation of the model's dynamics.

189
There are two other dynamics of the model to discuss before finalizing the model 190 specification, and both relate to boundaries that the accumulators A I and A D take.

191
First, there is a threshold parameter θ that designates when enough value has 192 accumulated for either option to trigger a response. On the presentation of the stimulus, 193 both accumulators race to this threshold value θ, and a choice is made that corresponds 194 to the accumulator that reaches the threshold first. Additionally, we specified that the 195 accumulation process started at a fixed distance away from the threshold value θ. The 196 starting point was fixed to be θ/5. Fig 3a illustrates that the accumulation process 197 produces the response, designated as the red node O. For our purposes, we assumed a 198 common threshold for both accumulators, and this parameter is estimated for every 199 model variant we discuss below. Second, there is a lower boundary on the accumulation 200 process, such that no accumulator can go below zero. This assumption follows the 201 tradition of the Leaky Competing Accumulator model and is based off the principle that 202 neurons cannot have negative firing patterns. To enforce this constraint, we specified July 28, 2020 7/28 the five parameters was set to a particular constant value when they were fixed and not 230 freely estimated. Namely, we set α r = 1, α t = 1, ω = 0.9, β I = 0, and β D = 0. We 231 chose to set ω to 0.9 because we suspected reward information would be prioritized 232 asymmetrically because delay information was presented first and in isolation for the 233 first second. Recall that for the first second, ω is set to zero for all models. In addition 234 to these settings, we also freely estimated the accumulation threshold θ, 235 moment-to-moment noise σ, and an additional non-decision time parameter τ for each 236 model variant. 237 We fit each configuration to all behavioral data in a hierarchical Bayesian framework, 238 meaning that each parameter just discussed was estimated for each subject, along with 239 group-level parameters in the hierarchy (see Methods section for details of model  To ensure that Model 11 is the best performing model among all tested model 254 configurations, we further compared the best five models in Table 1  better performance by the model. Table 1 shows that Model 14 outperformed any other 263 model for 9 out of 23 subjects, but it did struggle to fit certain subjects, such as subject 264 9, 14, and 16 (as suggested in Fig 3c). Model 11 worked best for 7 out of 23 subjects, 265 and it had the lowest sum of zBIC rank. Model 11 also had a satisfactory fit to all the 266 subjects according to Fig 3c. Together, we selected Model 11 as containing the most 267 plausible representation mechanisms to explain our behavioral data. Hence, the model 268 performed best when the following parameters were free to vary across subjects: α t , β I , 269 and β D . To get a sense of the optimal parameter values, we computed the maximum a 270 posteriori (MAP) values from estimated posterior distributions for each parameter from 271 both Model 11 (S1 Table) and Model 14 (S2 Table).

272
Although the model evaluation process assessed model performance in a relative 273 sense (i.e., performance across models), it is also important to assess performance in an 274 absolute sense. To do this, we evaluated how well the model fit the empirical data by  Value accumulation dynamics reconstructed from the behavioral model 295 The purpose of the model selection process was to identify the most plausible 296 representation of value accumulation to explain the pattern of choices and response 297 times across conditions. Our ultimate goal was to use this most plausible representation 298 to interpret the temporal dynamics in our EEG data. Fig 5b provides an example of the 299 model's accumulation dynamics, and how well they fit to an individual subject. The top 300 row shows the time course (0 -5 seconds) of the modeled accumulation process for each 301 of delayed (red) and immediate (blue) accumulators, where each line indicates one 302 simulated trial. Across the five P D conditions, the delayed and immediate accumulators 303 race toward a common threshold. In each condition, the initial delay information causes 304 some initial preference for the immediate option. However, once the reward information 305 is presented (i.e., after one second), the delayed option gains preference, and the rate of 306 this gain varies by condition, where the P D = 0.1 condition is slowest, and the P D = 0.9 307 is fastest. This rate differential produces a dynamic that results in the delayed option In all panels, red colors correspond to delayed options, whereas blue colors correspond to immediate options. In panels in the middle and bottom rows, choice probabilities can be inferred by assessing the relative heights of the two histograms.
We applied a quantitative measure -absolute balance of evidence (ABOE; [34]) that 316 summarizes accumulation dynamics for each experimental P D condition. Specifically, we 317 wanted to know how much competition occurred between the two alternatives across the 318 conditions. In the model, competition reflects the degree to which each alternative is After calculating the ABOE for each simulation, we averaged the ABOEs across 335 conditions, such that where ABOE c refers to the absolute balance of evidence for condition P D = c, N c We analyzed the event-related potentials (ERPs) on several electrodes of EEG data 356 after the extracted epochs were transformed to current source density (CSD) estimates 357 (see Methods section). The ERPs were calculated based on a set of electrodes that were 358 considered to correspond to dmFC, left/right pPC, and left/right dlPFC. Table 2 listed 359 the different sets of electrodes in the analysis, the approximately corresponding 360 electrodes in the International 10/20 system, and the respective putative brain areas. 361 We included those electrodes in the ERPs analysis and the following ISPC analysis 362 based off previous studies. For example, the putative electrode E6 that represents dmFC 363 region corresponds to FCz electrode in 10/20 system [35,36], and the FCz electrode has 364 been implicated with decision related functions in other domains [37,38]. The location 365 of F3 and F4 have been localized to Brodmann area 46 [39], roughly corresponding with 366 dlPFC [40]. The location of TP3 and TP4 have been localized to Brodmann area 40 (or 367 the inferior parietal lobule) [39], which constitutes the lateral part of pPC [41].  ERPs from Putative dmFC, dlPFC, and pPC for each P D condition. ERPs were calculated on the preprocessed EEG data after CSD transformation to reduce volume conduction. Three panels from top to bottom show scalp electrical activities approximately located to dmFC, dlPFC, and pPC, respectively. Trials were aligned by the onset of time delay presentation, and the time course extends from -500 ms to 2000 ms. Two dashed lines indicate the presentation of time delay (0ms) and reward amount (1000ms). Trials were separated by P D conditions for comparison. region, as we have previously found its association with value accumulation activity 398 using fMRI [43]. The connectivity was established between dmFC and other regions 399 previously identified important in intertemporal choice, including bilateral pPC and 400 bilateral dlpFC. Each brain area was localized by a set of electrodes listed in Table 2. between dmFC and dlPFC based on the pISPC. In contrast with dmFC-pPC, the 411 connectivity between dmFC and dlPFC did not yield the obviously increased pISPC.

412
Although previous studies have identified gamma frequency band (30 -90 Hz) in EEG 413 activity as reflecting evidence accumulation mechanisms [44,45], the present study 414 localized the informative frequency band to lower frequency, especially to theta band (4 415 -8 Hz). Therefore, we focused subsequent analyses on theta band activity and tested for 416 increases of pISPC following the stimulus presentation by analyzing pISPC within the 417 time windows 0 -0.3 s and 1 -1.3 s (demarcated by black borders in Fig 8). processes that underlie these choices.

448
To investigate whether connectivity is related to the cognitive aspects of our task, We further examined the topographical characteristics across time windows for the 468 P D = 0.5 condition, which had the largest pISPC between dmFC and pPC. Fig 9b   469 shows topographical plots of the pISPC between the seed region E6 (putative dmFC)   Together, we concluded that the magnitude of pISPC was modulated by the time delay 533 in the 0 -0.3 second period, with higher pISPC associated with longer time delay.

534
Combining the above findings during 0 -0.3 s and during 1 -1.3 s, we observed that 535 the magnitude of pISPC is modulated by the time delay before the amount of reward is 536 presented, and is afterwards modulated by the experimental P D , which represents the 537 integral information of both time delay and amount of reward. Given that experimental 538 P D is closely associated with the amount of value competition between two choice 539 alternatives, pISPC can be viewed as an indicator of general value accumulation, such 540 that it tracks either the value of time delay when only time delay is available, or the 541 amount of value competition when both time delay and amount of reward are available. 542

543
In this article, we have shown that fronto-parietal connectivity is an expression of 544 valuation accumulation within an intertemporal choice task. We have also shown that a 545 computational model in which subjective values are constructed through 546 moment-to-moment sampling of feature dimensions exhibits patterns of choice 547 competition that closely resembles the pattern of connectivity results across conditions 548 of our experiment. In this section, we first discuss the relevance of the cognitive 549 mechanism we identified from our behavioral data. We then discuss the connectivity 550 results, and what they might suggest from a cognitive perspective. Finally, we discuss 551 how our neural results can be more fully integrated into computational models in future 552 work. In this study, we constructed a model in which a few different mechanisms could explain 555 temporal discounting behavior, such as distorted encoding of the stimuli, attention 556 prioritization to the delay information, or an inability to suppress the shorter sooner 557 option. Fifteen different models were constructed from this overarching structure, each 558 of which possessed a unique configurations of mechanisms. In addition to the core 559 parameters θ, τ , and σ, our current study revealed that model performance was best 560 achieved when the following three additional parameters were allowed to freely vary 561 across subjects: α t , β I , and β D .

562
In a previous application [15], we used a similar model to account for behavioral 563 data from another intertemporal choice task. In a similar analysis, a factorial model 564 fitting exercise suggested that model performance was best achieved when ω, β I , and 565 β D were freely estimated -similar to Model 13 presented here. The consensus of the 566 lateral inhibition terms β I and β D across the two studies suggest that these mechanisms 567 are strong contributors in explaining valuation accumulation dynamics within 568 intertemporal choice. By contrast, we can speculate about the disagreement between ω 569 and α t on two grounds. 570 First, in contrast to the design presented in [15], our current experimental design 571 showed the time delay and amount of reward for the delayed option sequentially. Here, 572 we adjusted the model to account for the sequential structure by first allowing 573 subjective valuation of the alternatives to accumulate for the first second on the basis of 574 only delay information. Once the reward information was presented, the model 575 performed identically to that presented in [15]. Additionally, when parameters are not 576 freely estimated, a choice must be made as to what value these fixed parameters should 577 take. In [15], ω was set to 0.5, where attention is equally divided between the two 578 dimensions. Here, we set ω to 0.9 in light of the potential asymmetry that might ensue, 579 due to the delay information being presented first, and in isolation. Although we also 580 tested ω = 0.5, with negligible difference in model performance, it is possible that these 581 fixed parameter choices could at least partially explain the differences between the 582 results. Second, it is also possible that choosing between a delayed option and a fixed 583 immediate option in our study, and choosing between two delayed options (i.e., a 584 shorter sooner and a larger later option) in [15], can partially induce the use of different 585 mechanisms in the choice process. The present study identified that phase-based frontoparietal functional connectivity 588 between putative dmFC and pPC was associated with the degree of competition 589 occurring in the value accumulation process. This finding is in line with previous 590 studies where EEG gamma band activity reflected the gradual accumulation of evidence 591 in value-based decision making tasks [44]. [45] adopted simultaneous EEG-fMRI 592 approach and identified the posterior-medial frontal cortex (PMFC) as the spatial 593 correlates of such accumulation mechanisms. In addition, they found that the PMFC 594 exhibited task-dependent coupling with the vmPFC and the striatum. Our study 595 further endorsed the proposition of an evidence accumulation process during 596 value-based decision making, and extended it with a choice task modulated by higher 597 level of self-control rather than a preference-based hedonic decision making task in [45]. 598 As such, our study postulated that the evidence accumulation mechanism may subserve 599 a variety of decision making tasks. Also, the phase-based functional connectivity that 600 we identified from EEG was between putative dmFC and putative pPC, extending from 601 previous focus on within-frontal connectivity dynamics. Although it may be 602 problematic to directly compare the spatial correlates of accumulation dynamics across 603 different studies due to the different task structure and different modality of neural 604 measures, some consensus between the two results is assuring. Future studies may shed 605 light on the connections between these different studies.

606
Previous investigations of connectivity dynamics in intertemporal choice mostly 607 relied on fMRI. For example, [12,46] investigated the connectivity between vmPFC and 608 dlPFC, as evidence for the mechanism of self-control that the dlPFC has on the 609 putative valuation region vmPFC. [13] identified functional connectivity between the 610 vmPFC and three regions of dlPFC, bilateral pPC and dmFC, as a partial support for 611 the value accumulation mechanism of the three regions. Our study was unable to 612 localize the vmPFC or the ventral striatum due to the low spatial resolution of EEG 613 measurements for sites distal from the scalp. Instead, our results add to our knowledge 614 about how frontoparietal connectivity relates to choice competition in the accumulation 615 process. Across these results, is not yet clear whether the connectivity between the 616 dmFC and the pPC reflects a genuine connection between the two anatomical regions, 617 or a common connection with vmPFC. Interestingly, we did not find such phase-based 618 connectivity between dmFC and dlPFC.

619
Links between brain and behavior 620 In this study, we constructed value accumulation dynamics from a computational model 621 and related them to connectivity dynamics in our EEG data. Such a link between the 622 two modalities is considered purely theoretical [47], and is relatively weak from the 623 perspective of statistical power. A more integrated approach would be to use the neural 624 dynamics to directly drive the computational model [48][49][50][51], or to use the covariation 625 betwen their trial-to-trial fluctuations to modulate and constrain the compuational 626 model [52,53]. Although an ideal neural measure would have trial-level information to 627 substantiate mechanistic claims [15,[54][55][56], functional connectivity measured by ISPC 628 requires several trials to establish a reliable measure, making a trial-level analytic 629 approach infeasible for such precise linking in our study. Another possibility would be 630 to establish a link to the computational model at the moment-by-moment level, which 631 would further exploit the temporal resolution of EEG data provided by EEG measures. 632 For example, a time-frequency analysis on the reconstructed accumulation trajectories 633 might reveal similar information with respect to time and frequency, compared to a 634 time-frequency result in ISPC.

635
Although providing high temporal resolution, EEG suffers from limited spatial 636 resolution. The dmFC and pPC electrodes we identified provide tentative connections of 637 the anatomical regions. However, it is difficult or impossible to establish connectivity 638 results for other important deep structures or subcortical regions, such as vmPFC and 639 ventral striatum. Therefore, the previous finding of connectivity between vmPFC and 640 dlPFC cannot be replicated from purely EEG analysis. Due to the limitations of both 641 approaches, a combination of EEG and fMRI, or even more modalities could provide a 642 workaround on the measurement limitations and offer complementary correlates of 643 neural activity. The idea of fusing EEG and fMRI is not novel, and there have been 644 attempts using simultaneous fMRI and EEG on various decision making tasks including 645 intertemporal choice (e.g. [57][58][59]. We argue that including cognitive representations for 646 the behavioral data will also be an important connective medium, which has been 647 discussed in detail elsewhere [56,[60][61][62][63]. pointing out that discounting across a time delay is not consistent with discounting 656 across subintervals of that delay (i.e. discounting between 0 and t and between t and t 657 can be greater than simply between 0 and t, where 0 < t < t, a phenomenon known as 658 subadditivity). Scholten and Read [19] have proposed a new model of discounting that 659 accounts for these effects and that simultaneously accommodates hyperbolic discounting. 660 Their model is mechanistically similar to the model presented here, and we have also 661 shown that our model provides excellent fits to hyperbolic discounting patterns [15]. A 662 further significant advantage of our feature-based process model is that it provides a 663 means to incorporate order dependence in intertemporal preference. In this experiment 664 we showed delay information ahead of reward amount and our findings suggest that 665 participants are influenced by this asynchrony so that decision making begins from 666 biased starting points once reward information is presented, corresponding to well 667 documented order effects in intertemporal choice (e.g., [67]). A next step for our 668 modeling efforts will be to extend the generality of phenomena that feature-based 669 process models accommodate.  been used in our previous fMRI studies [13][14][15]. The pretest session used a staircase 682 procedure to measure each individual's discount rate k, assuming a hyperbolic 683 discounting function, where V is the subjective value of a reward (in $), r is the monetary amount offered 685 (in $), and t is the delay (in days; t = 0 for payment available "Today"). After 686 completing the behavioral pretest task, we fit a softmax decision function to subjects' 687 choices. We assumed that the likelihood of choosing the delayed reward (P D ) was given 688 by a softmax rule, where m accounts for individual sensitivity to changes in discounted value, V I is the 690 value of the smaller sooner option (in $, t = 0 for all choices in this task), and V D is the 691 value of the larger later option. Using a maximum likelihood procedure, decision 692 parameters (k and m) were estimated from the data obtained during the staircase task 693 and used to generate stimuli for the EEG session. In this way, we were able to 694 systematically manipulate the expected probability of choosing the delayed reward (P D ) 695 as the within-subject independent variable for the EEG session.

696
The staircase procedure used during the first task required subjects to select between 697 a delayed reward (of r dollars available at delay t) and a fixed immediate reward of $10 698 (V I ). For any choice, indifference between the immediate and delayed options implies a 699 discount rate of k = (r − V I )(V I t). We refer to this implied equivalence point as k eq .

700
Our procedure amounted to varying k eq until indifference was reached. Specifically, we 701 began with k eq = 0.02, and if the subject chose the delayed reward, k eq decreased by a 702 step size of 0.01 for the next trial. Otherwise, k eq increased by the same amount. Every 703 time the subject chose both a delayed and an immediate offer within five consecutive 704 trials, the step size was reduced by 5%. Subjects completed 60 trials of this procedure, 705 and we placed no limits on the response time in this pretest session.

706
In the EEG session, we selected a 1000 ms separation between the presentation of 707 delay and amount information to maintain a minimal epoch length that allowed 708 subjects to make accurate intertemporal decisions. We wished to minimize the epoch 709 length to ensure subject's continual engagement and accurate time-locked signals in the 710 EEG data. The 1000 ms separation was sufficient to allow us to distinguish between 711 neural responses to delay and amount presentations with EEG. In addition, the 712 sequential design allowed us to keep subjects fixated on the center of the screen, 713 minimizing eye movement artifacts. After removing trials with excessive eye movements, 714 the median number of trials across subjects was 179. Except for one subject left with 715 only 115 trials after trial rejection (indexed as subject 15 in Fig 3c), trial numbers for 716 other subjects were in the range of 168-180.

717
EEG recordings and preprocessing 718 EEG data was collected using a 128 channel Geodesic Sensor Net (Electrical Geodesics, 719 Inc., Eugene OR, USA), with a 500 Hz sampling rate, using the vertex as reference.

720
During pre-processing, we re-referenced to the average reference, epoched trials from 721 -1500 to +6500 ms around the onset of delay presentation, baseline corrected trials using 722 the average from 0 to 5000 ms, and band-pass filtered the data at 0.5 -200 Hz. Trials 723 were visually inspected and rejected if excessive artifacts were present. Normally occurring artifacts were rejected using an independent component analysis algorithm 725 from the EEGLab toolbox [68]. Epochs were then transformed to current source density 726 (CSD) using the CSD toolbox [69]. CSD transformations of EEG data reduce the 727 influence of volume conduction across the scalp and constrain the effect of cortical 728 sources to within ≤ 5cm 2 from the given electrode, facilitating the analysis of oscillatory 729 neural activity and localization of activity from recordings at the scalp [70,71]. 730 Intersite phase clustering 731 We analyzed functional connectivity in the EEG data by performing statistical tests of 732 intersite phase clustering (ISPC). ISPC quantifies the time-dependent connectivity 733 strengths between putative brain regions by measuring the consistency of the difference 734 of phase angles between a pair of electrodes [42]. To obtain ISPC from our EEG data, 735 we convolved CSD-EEG epochs with a set of Morlet wavelets, which are defined as 736 e −i2πf t e −t 2 /2σ 2 , where t is time, f is frequency, and σ defines the width of each 737 frequency band in terms of cycles. We constructed wavelets for 30 frequencies, from 738 2 − 30Hz in logarithmic steps, and varied the width for each of these wavelets, from 739 6 − 10 cycles, also in logarithmic steps. We computed ISPC between pairwise electrodes 740 by extracting the phase angle (φ) from the convolved signal, and then calculated  Model parameter estimation 761 We fit our behavioral model to individual data in a hierarchical Bayesian framework.

762
For simplicity and based on our previous analyses, model parameters at the subject 763 level were assumed to be distributed according to a normal distribution with group 764 each subject using the MAP estimates from Model 11. The MAP estimates were 793 practically calculated as the average values of the last 100 iterations from the Markov 794 chains. Given the synthetic data of each subject, we followed the same procedures to 795 recover the model parameter, as were performed for actual experimental data.