Abstract
Everyday behaviors are governed by decisions about what we see and which actions to take. Here we present a model of the evolution of decisions from visual perception to voluntary action in humans. We combine accumulation-to-threshold modelling of visuomotor decisions under different levels of uncertainty with electro-/magneto-encephalographic recording to trace the sequence of localised decision processes, separately encoded in beta and gamma frequency ranges, and the flow of information through cortical networks. We show that evidence accumulation in motor and prefrontal cortex, to resolve action uncertainty, begins within 100ms from the onset of visual evidence accumulation, before the threshold in sensory regions is reached, suggesting a continuous (rather than sequential) processing of information from perception to action. Moreover, the direction of flow of information between sensory, motor and association cortices is opposite in beta and gamma frequency bands. The frequency, temporal and spatial distributions of the decision processes reveal widespread hierarchical information processing networks through which we resolve trial-by-trial action decisions despite environmental uncertainty.
Introduction
Human behaviors are the result of many decisions, from early or automatic perceptual inferences about our environment to complex goal-directed choices between alternate courses of action. Three broad lines of research have made separate contributions to understanding such decisions. First, the psychophysical analysis of visuomotor task performance and reaction times, in health 1 or in the presence of focal 2 and degenerative brain lesions 3.
Second, the functional anatomical analysis of decision making using brain imaging and neurophysiology, including paradigms that manipulate visual uncertainty 4, action selection 5, or outcome evaluation 6. Third, the development of computational models of how decisions can be reached, at the level of neuronal ensembles 7 or groups of individuals 8.
It remains a challenge, however, to bring these separate lines of enquiry together in a unified model of neurophysiologically informed decision processes, embedded in a functional anatomical framework, that can explain the transformation of noisy visual inputs to alternative motor outputs. The anatomical framework has an additional requirement: it must accommodate the evidence for functional segregation between sensory and motor areas while allowing the flow of information through hierarchical and distributed brain networks.
Here we develop an integrated account of visuomotor decision-making, as summarised in Figure 1, working from a novel visuomotor task that adjusts sensory and action uncertainty during functional brain imaging by combined electro-/magnetoencephalography (MEEG).
A long tradition in mathematical psychology has argued that decisions and their latencies are controlled by when cumulative evidence in favour of a choice reaches a criterion decision threshold 9. We identify the accumulation-to-threshold of latent variables representing sensory evidences, based on the transformation of visual signals into evidence about the behaviorally relevant stimulus features (perceptual decisions,10, 11); and the analogous ‘evidence’ for motor schema, which have been termed motor intentions (action decisions 12, 13).
Previous studies of visuomotor tasks typically focus on either perceptual decisions (e.g. judgement of motion direction) or on action decisions (e.g. choice of a motor response), whereas in real-world scenarios agents are required to use the outcome of their perceptual deliberations to inform decisions between alternate responses. This distinction can be lost in experimental paradigms where perceptual decisions are rigidly mapped onto motor responses 14, 15, potentially conflating perceptual and action decisions or attributing variance to one or other process 16.
Rather than arbitrarily divide visual from motor transformations, we investigated their associated decision processes by separately manipulating uncertainty in the identity of visual features (perceptual uncertainty, by variable motion coherence) and range of possible actions (action uncertainty, by variable number of response options). While many studies have used two-alternative forced-choice paradigms with differential rewards, we adopted an n-way decision task to study decisions made between equivalent outcomes17 (Figure 2).
Several brain regions have been identified that accumulate perceptual evidence10, 11 and motor intentions12, 13. However, it is also necessary to understand how a network of accumulator regions orchestrates their activity for the critical transformation between perceptual and action decisions.
Specifically, we sought to distinguish (i) a serial process 18 where perceptual decisions are complete and their output passed to motor accumulators, from (ii) a continuous flow of information 19, through perceptual to associative and motor regions before completion of perceptual analysis. A serial process would be robust to error, but continuous flow would enable faster action decisions. To differentiate these alternatives, we mapped the modelled temporal profile of evidence accumulation to neurophysiological signatures, trial-by-trial. The temporal evolution of predicted evidence was based on a behaviorally optimized generative linear ballistic accumulator model of the decision (Figure 3f).
We exploited the temporal resolution of MEEG to measure spatiotemporal variance of the induced power20. We focused on the beta and gamma band power as the candidate correlates of the evidence for three reasons.
First, the growing evidence for separate functions of gamma and beta in the feedforward and feedback of information respectively in hierarchical brain networks 21, 22. Second, that the accumulation of evidence for perceptual choices correlates with gamma-frequency oscillations 23. Third, that the processes underlying the deliberation between alternate actions have been associated with beta power modulation 24–26.
The use of MEEG affords a source model of cortical generators 27 and enables the functional segregation of sensory and motor areas, as well as areas where sensory-motor transformations occur. Complementary connectivity measures (phase transfer entropy28) reveal the flow of information between areas, orchestrating the emergence of decision-evidences across decision networks.
We show that evidence accumulation in motor and prefrontal cortex begins very soon after visual cortex, and before perceptual decisions are concluded. We further demonstrate that the timing of evidence accumulation and the direction of flow of information between widespread sensory, motor and association cortices differ between the beta (13-30Hz) and gamma (31-90Hz) frequency ranges. An early sweep of gamma activity across an occipito-parietal-frontal network precedes the gradual emergence of beta-mediated decision signals.
These signals emerge progressively in a lateralized caudo-rostral cascade unfolding along the dorsal stream. The cascade is mainly driven by a lateralized and continuous flow of information from posterior visual areas to distant anterior action control regions. Crucially, the strength of the information flow (as measured by phase-transfer entropy) determines the speed of progression throughout all stages of information processing from perception through action as reflected by a positive relationship between connectivity and both faster model accumulation-rates and shorter reaction-times. This provides an important formal link between behaviour, established models of decision-making, and connectivity measures. Taken together, the results reveal a continuous flow of information transmitted and integrated through a hierarchical network that transforms decision-making from perception to action.
Results
Behavior
To functionally segregate computations mediating visual and action decisions, we used a novel decision-making task to separately manipulate uncertainty in the identity of visual features (perceptual uncertainty), and actions (action uncertainty). The task combined elements of the classic motion discrimination task29 with a response selection task13. Noisy visual stimuli indicated one or more response options, which were executed by pressing a corresponding button (Figure 2 and Methods).
Uncertainty in perceptual and action decisions was manipulated by varying the noise in the option stimuli and manipulating the number of permitted responses in a full factorial design. The noise in the visual stimuli introduces perceptual uncertainty11, 29. The variable number of permitted response options introduced action uncertainty13, 30.
Previous work has shown that the uncertainty associated with both the stimulus motion and the number of available choices systematically influences the parameters of models of decision-evidence accumulation 11, 13, 30. Therefore, by manipulating motion coherence in the random dots stimuli and the number of offered choices, we sought to isolate the neural signatures of decision-evidence accumulation for perceptual and action decisions, respectively.
Participants performed the task first in a training session where individual motion thresholds were estimated for both low and high action uncertainty levels (Figure 3a). Subsequently, participants performed the task with the motion thresholds that standardized performance, while undergoing MEEG scan.
During training, participants were slower and less accurate when motion coherence was lower (Figure 3a). Similarly, during the scan session (Figure 3b and Supplementary Figure 1) responses were slower under high perceptual (low = 0.77s ±0.1; high = 0.88s±0.1; F(1,17) = 158.17 p < 0.0001; post-hoc p < 0.0001) and action uncertainty (low = 0.80s ± 0.13; high = 0.85s±0.1; F(1,17) = 6.28 p = 0.022; post-hoc p = 0.022; 2-by-2 repeated measures ANOVA; Tukey-Kramer correction). In summary, behavior scaled with levels of perceptual and action uncertainties, confirming the efficacy of our manipulations.
To verify that participants’ choices were substantially independent over trials, Shannon’s equitability index was calculated for sequential choice pairs 13. The Shannon’s equitability index for all participants had mean 0.77 (SD ± 0.016) and did not differ significantly from the index generated by random permutations of trial order (see Supplementary Figure 2) confirming that subjects’ choices were not biased by previous responses.
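The independence check can be illustrated with a minimal sketch. It assumes that the equitability index is the Shannon entropy of consecutive choice pairs divided by its maximum, ln(S), where S is the number of possible pairs; the exact computation used in the study is not given here, and the function names and the number of response options are hypothetical.

```python
import math
import random
from collections import Counter


def equitability(choices, n_options):
    """Shannon equitability (H / ln S) of consecutive choice pairs.

    S = n_options ** 2 is the number of possible (previous, current) pairs.
    A value of 1 means all pairs are equally frequent.
    """
    pairs = list(zip(choices[:-1], choices[1:]))
    counts = Counter(pairs)
    total = len(pairs)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_options ** 2)


def permutation_null(choices, n_options, n_perm=1000, seed=0):
    """Equitability values obtained after randomly permuting trial order."""
    rng = random.Random(seed)
    shuffled = list(choices)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(equitability(shuffled, n_options))
    return null
```

An observed index that falls well within the permutation distribution gives no evidence that a choice depends on the one before it.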
Uncertainty modulates the rate of evidence accumulation
Summary statistics of behavioral data cannot adequately explain the mechanism by which uncertainty slows decisions. We adopted formal models of decision-making to decompose the behavioral performance into cognitively relevant latent variables. We fitted accumulation-to-threshold models (Linear Ballistic Accumulator, LBA, 31) to each participant’s reaction time and accuracy data.
The LBA model of decisions is more tractable than drift-diffusion models for n-way decisions while still remaining physiologically informative 32. In the LBA each decision was represented by an accumulator that integrated decision-evidence up to a boundary. When the accumulated evidence crosses the boundary a decision is committed (Figure 3c). Instead of adopting a two-stage model, which assumes a discrete serial process between perceptual and action decisions, we opted for a ‘unitary’ model where both perceptual and action uncertainty concur in determining participant’s performance in a given trial. The factorial design of the experiment enabled us to divorce perceptual and action decision processes using connectivity metrics (see below).
Uncertainty can slow responses by reducing the speed of information accumulation (accumulation-rate), increasing response caution (decision boundary), stretching the time required by perceptual and motor processes not directly related to the decision process (non-decision time), or by a combination thereof.
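To make the three candidate mechanisms concrete, the following is a minimal simulation sketch of a generic linear ballistic accumulator race. It is not the fitting code used in the study. The parameter values (other than a non-decision time of 0.37 s, taken from the estimate reported in the text), the resampling of trials in which no accumulator has a positive drift, and all function names are assumptions made for illustration.

```python
import random
import statistics


def simulate_lba_trial(mean_rates, boundary=1.0, start_max=0.5, t0=0.37,
                       rate_sd=1.0, rng=random):
    """Simulate one LBA trial; returns (winning accumulator index, RT in s).

    Each accumulator starts at a uniform point in [0, start_max] and rises
    linearly with a drift drawn from Normal(mean_rate, rate_sd). The first
    accumulator to reach the boundary determines the choice.
    """
    while True:
        finish_times = []
        for mean_rate in mean_rates:
            drift = rng.gauss(mean_rate, rate_sd)
            start = rng.uniform(0.0, start_max)
            if drift > 0:
                finish_times.append((boundary - start) / drift)
            else:
                finish_times.append(float("inf"))  # never reaches the boundary
        fastest = min(finish_times)
        if fastest != float("inf"):  # assumption: resample if nothing finishes
            return finish_times.index(fastest), t0 + fastest


def median_rt(mean_rates, n_trials=2000, seed=1, **params):
    """Median simulated reaction time for a given set of mean drift rates."""
    rng = random.Random(seed)
    return statistics.median(
        simulate_lba_trial(mean_rates, rng=rng, **params)[1]
        for _ in range(n_trials)
    )
```

For example, lowering the mean rate of the leading accumulator from 3.0 to 1.5 lengthens the median simulated reaction time, which is the signature of a rate-based account; raising `boundary` or `t0` lengthens it through the other two mechanisms.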
To differentiate these competing mechanisms, we fitted all possible combinations of free parameters in a set of 15 LBAs. We compared the goodness-of-fit of each model using random-effects Bayesian model comparison 33, 34. The model comparison revealed that changes in the accumulation-rate alone (model number 2; Figure 3d top panel) accounted parsimoniously for the effects of uncertainty on behavior. The goodness-of-fit of the winning model was further confirmed by posterior predictive checks (Figure 3d bottom panel), performed by simulating data under the winning model and then comparing these to the observed data.
In the winning model (henceforth, the LBA model), high uncertainty is associated with comparatively slow accumulation rates. This relationship between uncertainty and accumulation rate held for both perceptual (z = 3.723, p = 0.00019; Wilcoxon signed-rank test) and action (z = 3.723, p = 0.00019; Wilcoxon signed-rank test) uncertainty, as well as for each subject (Figure 3e), in accord with previous studies 11, 30.
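The reported statistic can be cross-checked with a short sketch of the signed-rank test using the normal approximation, without continuity or tie correction (an assumption; the software and options actually used are not stated). If the sample comprised 18 participants, as the F(1,17) degrees of freedom suggest, and every participant showed the effect in the same direction, the approximation gives z ≈ 3.72 and a two-sided p of about 0.0002, consistent with the values above.

```python
import math


def wilcoxon_signed_rank_z(differences):
    """Normal-approximation z and two-sided p for paired differences.

    Zero differences are dropped; tied absolute values share average ranks.
    No continuity or tie-variance correction is applied (assumption).
    """
    diffs = [d for d in differences if d != 0]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tie group
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical data: 18 rate differences (low minus high uncertainty),
# all positive, standing in for an effect present in every participant.
z, p = wilcoxon_signed_rank_z([0.1 * (i + 1) for i in range(18)])
```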
Non-decision time (t0), encompassing sensory delays and motor execution, was estimated to be 370ms on average (see Supplementary table 1), which is within the plausible range of non-decision times for humans 35, 36.
Localization of decision-evidence accumulation
To localize neural signatures of decision-evidence represented across the brain, we derived temporally resolved estimates of neuronal population activity from the winning model, which we fitted to a combined MEG and EEG signal, inverted to source space using the L2-Minimum Norm27.
We reduced the dimensionality of the MEEG data by parcellating the cortical surface into a set of 96 regions of interest (ROIs) defined using the Harvard-Oxford cortical atlas (FSL, FMRIB, Oxford) and by representing the dynamic of each ROI with a single time-course, obtained using principal component analysis37. Dimensionality reduction allows for improved computational efficiency. Further, it reduces multiple comparisons issues and increases statistical power, while retaining the maximum amount of information38.
The temporal evolution of the spectral power (power envelope) in each region served as the signal for our analysis in beta (13-30Hz) and gamma (31-90Hz) bands. The time onset of evidence accumulation across ROIs was identified by optimizing the split of the non-decision time before and after the accumulation period using Spearman correlation to the MEEG power envelope (Figure 3f, see Supplementary Figure 3 for the statistical map). This allows one to depict in space and time the emergence of decision-evidence accumulation.
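The onset search can be sketched as a grid search over how the non-decision time is divided into a pre-accumulation delay (the onset) and a post-accumulation delay, keeping the accumulation period fixed at the reaction time minus the non-decision time. The sketch assumes a normalised linear ramp as the model prediction and selects the split with the most negative Spearman correlation, in line with the desynchronisation described below; the ramp shape, grid step, selection rule and function names are illustrative assumptions rather than the published procedure.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def predicted_evidence(times, onset, duration):
    """Flat at 0 before onset, linear ramp, flat at 1 once the bound is hit."""
    return [min(max((t - onset) / duration, 0.0), 1.0) for t in times]


def best_onset(times, envelope, rt, t0, step=0.01):
    """Return (onset, rho) for the split of t0 with the most negative rho."""
    duration = rt - t0  # accumulation period, fixed by the behavioural fit
    best = None
    for k in range(int(round(t0 / step)) + 1):
        onset = k * step
        rho = spearman(predicted_evidence(times, onset, duration), envelope)
        if best is None or rho < best[1]:
            best = (onset, rho)
    return best
```

Applied per region and trial, the selected onset gives a latency estimate that can then be mapped across regions.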
Traditionally, evidence accumulation is associated with increased activity (e.g. firing rates) during decisions. However, recent studies indicate that both increasing and non-increasing activity can mediate evidence accumulation39–41. In agreement with this idea, we found significant (negative) correlations between the LBA model predictions and the MEEG oscillations in beta and gamma bands20 (Figure 4). Specifically, for both beta and gamma, neural activity after coherence onset desynchronized in a graded fashion and peaked shortly before the response, suggesting a form of threshold mechanism (Figure 4a)42–44.
In the beta band, desynchronization was strongly modulated by uncertainty in good agreement with our predictions. As the decision unfolds, the accumulated decision-evidence will ramp quickly with low perceptual uncertainty, and slowly with high perceptual uncertainty. Accordingly, desynchronization of beta power-envelopes averaged across trials and ROIs was larger (p < 0.0001, cluster corrected random permutations) for low than high perceptual uncertainty22, 42.
When a response is chosen between multiple options, the race underlying the selection of each alternative is characterized by an overall larger amount of decision-evidence summed across all the racing accumulators by the time of response13, 30. Accordingly, desynchronization of beta power-envelopes averaged across trials and ROIs was larger for high than low action uncertainty (p < 0.0001, cluster corrected random permutations). Gamma power-envelopes showed a similar trend, but the effects were not statistically significant.
To locate activity related to decision-evidence accumulation, the time course of power-envelopes was correlated (Spearman) to time-varying model predictions in a trial-to-trial fashion. This allows one to take advantage of inter-trial variability. Statistical significance of the resulting z-transformed correlation values was assessed for each ROI by comparisons against a null distribution created from correlating the model predictions with single trial power-envelopes scrambled by phase (10^4 permutations).
This analysis revealed a brain-wide network displaying decision-related dynamics expressed in the beta range (Fig3b, mean across significant ROIs: sign-test z = −3.15 ± 0.48, p = 0.00065 ± 0.0016, FDR corrected). These observations agree with previous human EEG work suggesting that evidence accumulation might correlate with widespread low-frequency desynchronization45.
In the gamma band we observed a more localized mosaic of ROIs including contralateral motion sensitive areas (inferior lateral occipital region), bilateral extrastriate areas and bilateral frontal motor regions (comprising premotor areas and supplementary motor area; mean across significant ROIs: sign-test z = −2.27 ± 0.27, p = 0.0058 ± 0.003, FDR corrected).
In addition, we compared the z-transformed correlation values for each of the four levels of our manipulations in isolation and confirmed that the quality of fit and the results did not vary across trials types (p>0.05, FDR corrected).
A continuous flow of information
We traced the spectrally resolved temporal evolution of decisions through the visuo-motor hierarchy, finding that decision-evidence accumulation emerges with distinct spatio-temporal profiles between beta and gamma (Figure 4b).
An early wave of accumulation begins at ∼120ms from coherence onset within the sparse network oscillating at gamma frequency. It is followed by a second wave mediated by beta at ∼160ms from coherence onset (Figure 4c; Conjunction of significant ROIs in beta and gamma, median latency across participants, z = 5.53, p<0.0001, Wilcoxon rank test). No difference in latencies was found between hemispheres across frequency bands.
The latency maps (Figure 4b) show an accumulation gradient towards the precentral gyrus. We fitted a piecewise regression model with a free internal knot to the mean latencies of ROIs located along the dorsal path (Figure 4d), a critical system for visuomotor decisions26, 46.
In keeping with our observations, the model (Figure 4e left top-bottom panels) identified the precentral gyrus (comprising primary motor cortex and part of the premotor cortex) as the point of convergence of two linear functions (R2 = 0.734, p < 0.0001) and outperformed a single regression model (piecewise R2 adj = 0.681; linear R2 adj = 0.649; adjusted R2 penalizes extra free parameters in favor of simple models).
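A piecewise regression with a free knot can be sketched as a continuous hinge model, y = a + b·x + c·max(0, x − k), fitted by least squares for each candidate knot k and keeping the knot with the highest R2. This is one common formulation and is assumed here for illustration; the paper's exact fitting routine is not described in this section, and all names are hypothetical. The adjusted R2 uses the standard formula 1 − (1 − R2)(n − 1)/(n − p − 1). As an arithmetic check only, n = 19 observations with p = 3 penalised parameters is one combination that reproduces the reported 0.681 from R2 = 0.734; neither number is stated in the text.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def ols_r2(columns, y):
    """Least-squares fit via the normal equations; returns (beta, R2)."""
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
           for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    beta = solve(xtx, xty)
    pred = [sum(b * col[i] for b, col in zip(beta, columns))
            for i in range(len(y))]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


def fit_piecewise(x, y):
    """Grid search over interior knots; returns (knot, beta, R2)."""
    best = None
    for knot in sorted(set(x))[1:-1]:
        hinge = [max(0.0, xi - knot) for xi in x]
        beta, r2 = ols_r2([[1.0] * len(x), list(x), hinge], y)
        if best is None or r2 > best[2]:
            best = (knot, beta, r2)
    return best


def adjusted_r2(r2, n, p):
    """Adjusted R2 for n observations and p penalised parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
```

Comparing `adjusted_r2` for the hinge model against that of a straight line (p = 1) mirrors the model comparison described above.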
Interestingly, in the gamma band (Figure 4d bottom left panel) we found a mirror-symmetric trend with increasing accumulation latencies while proceeding from the precentral gyrus to more posterior and anterior regions (R2 = 0.245, p = 0.042). Thus, accumulation starts with gamma at ∼120ms from coherence onset in the precentral gyrus and at ∼160ms in the occipital and frontal poles.
The onset of the accumulation in beta overlaps with gamma in the occipital pole at ∼160ms from coherence onset 47. The interval from earliest onset of accumulation to last onset, is only ∼100ms and the onset in precentral gyrus is on average ∼570ms before a motor response 44. The delay from motion onset to the beginning of the accumulation on the occipital pole (∼160ms), and the delay from action decision to movement initiation in precentral gyrus (∼100ms) are close to the sensory (∼200ms) and motor (∼80ms) delays measured from neural recordings on macaque 11, 48.
These patterns, albeit with lower spatial resolution, were also found at the sensor level (Supplementary Figure 4). As a note of caution for the piecewise regression, the fit of the LBA model for some of the ROIs within the dorsal path was not significant in the gamma band, reducing the accuracy of their latency estimates.
An important observation is that the latest ROIs in the gradient for both beta and gamma start accumulating decision-evidence before the earliest ROI (e.g. the occipital lobe for beta) has reached its decision boundary (Figure 4e right top-bottom panels). This suggests that decisions are made on the basis of a continuous flow of information, rather than a serial sequence of discrete decisions.
From perception to action
The above analyses identified a flow of information across a widespread visuomotor network. To functionally segregate accumulators sub-serving perceptual and action decisions, and to reveal the influx and efflux of information across them we measured the phase-transfer entropy, a data-driven measure of information flow that is robust to signal leakage28.
The analyses focused on regions whose activity significantly fitted the LBA model’s prediction. We first identified ROIs that preferentially accumulated evidence for perception or action decisions. We reasoned that in a continuous flow of information, the amount of information transferred between perceptual and action accumulators is expected to co-vary with the rate of the accumulating process. Since the estimated accumulation-rates scale with uncertainty, the amount of information sent by a given region should also scale with uncertainty. This relationship enables one to identify regions where the amount of information varies systematically with the levels of either perceptual or action uncertainty.
Figure 5a shows, for the beta band, the regions modulated by action uncertainty (Action decision regions, pcorrected < 0.0005 in all ROIs) and perceptual uncertainty (Perceptual decision regions, pcorrected < 0.0005 in all ROIs). Action decision regions include ipsilateral cingulate and paracingulate cortex 49, contralateral frontopolar cortex, ventromedial cortex, insula, supplementary motor cortex, inferior parietal lobule and medial parietal cortex13, 50. Of note, bilateral precentral gyri were identified as action decision regions, which replicates previous findings13, 51.
Perceptual decision regions in the contralateral hemisphere include posterior areas typically associated with decisions about motion direction. These include lateral occipital cortex (including motion area MT-complex), superior temporal cortex (comprising the superior temporal sulcus) and the superior parietal lobule comprising the superior intraparietal sulcus52 along with the dorsomedial frontal cortex4, 53. Interestingly, two areas along the dorsal path on the left hemisphere were sensitive to both perceptual and action uncertainty manipulations (superior frontal gyrus, middle frontal gyrus, lateral occipital cortex superior division (comprising V2 and V3); pcorrected < 0.0005 in all ROIs).
In the gamma band, we observed bilateral involvement of the superior frontal gyrus54 and inferior frontal gyrus pars triangularis5, along with contralateral frontal medial cortex (Rowe et al, 2010) and ipsilateral paracingulate gyrus in action decisions (pcorrected < 0.005 in all ROIs). Perceptual decision areas (pcorrected < 0.0005 in all ROIs) included bilateral superior temporal areas (comprising the superior temporal sulcus; Pesaran and Freedman, 2016), cuneal cortex, and subcallosal cortex which has been linked to early encoding of confidence for perceptual decisions55.
The dominant direction of information transfer between ROIs was estimated using the directed phase-transfer entropy (Hillebrand et al, 2016). The average direction of information flow for each ROI was computed resulting in a single estimate of preferred direction of information flow (either inflow or outflow). Based on these estimates, we calculated a posterior-anterior index (Hillebrand et al., 2016; PAx) to quantify the direction of flow between caudal and rostral ROIs.
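The normalisation behind a directed measure of this kind can be sketched as follows, assuming the commonly used definition dPTE(x→y) = PTE(x→y) / (PTE(x→y) + PTE(y→x)), so that values above 0.5 indicate preferential outflow from x to y. The sketch starts from an already computed matrix of phase-transfer entropy values (hypothetical numbers in the usage note) and does not estimate PTE itself or reproduce the posterior-anterior index.

```python
def directed_pte(pte):
    """Normalise a PTE matrix so that dpte[i][j] + dpte[j][i] == 1.

    pte[i][j] is the (non-negative) phase-transfer entropy from ROI i to j.
    Pairs with no transfer in either direction are set to the neutral 0.5.
    """
    n = len(pte)
    dpte = [[0.5] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                total = pte[i][j] + pte[j][i]
                dpte[i][j] = pte[i][j] / total if total > 0 else 0.5
    return dpte


def mean_outflow(dpte):
    """Average dPTE from each ROI to all others (> 0.5: net sender)."""
    n = len(dpte)
    return [sum(dpte[i][j] for j in range(n) if j != i) / (n - 1)
            for i in range(n)]
```

With the hypothetical matrix `[[0, 3, 3], [1, 0, 2], [1, 2, 0]]`, the first ROI has a mean outflow of 0.75 and would be classed as a net sender, while the other two fall below 0.5 and would be classed as net receivers.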
Figure 5b shows the smooth global pattern of preferential information flow in the beta range, with caudal ROIs preferentially sending information to anterior regions. This pattern is similar to that reported by Hillebrand et al. 2016 in human resting state, except that our results show a task-related lateralization, with the contralateral PAx almost twice the size of the ipsilateral one (left: p = 0.0002, PAx = 0.47; right: p = 0.0051; PAx = 0.27).
It can be seen from Figure 5b-c that, for beta, the strongest information flow was from the left lateral occipital cortex to the left middle frontal gyrus and the frontopolar cortex. This accords with previous reports of beta-synchronization between primate MT and frontal regions during motion discrimination56. No significant effect was seen for the gamma range in either hemisphere which might reflect the shorter range of gamma interactions57.
Integration of behavioral, computational and physiological evidence
To highlight the behavioral relevance of the integrated account of visuomotor decision-making, we explored the relationships between connectivity, accumulator model parameters and behavior. To account for multiple-comparisons, we used Holm-Bonferroni correction over eight tests.
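For reference, the Holm-Bonferroni step-down procedure can be sketched as below. The raw p-values in the usage note are hypothetical and are not the values from this study.

```python
def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    adjusted values are capped at 1 and forced to be non-decreasing in the
    order of the sorted raw p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

For example, `holm_bonferroni([0.01, 0.04, 0.03])` gives approximately `[0.03, 0.06, 0.06]`: the second-smallest value is doubled to 0.06, and the largest is raised from 0.04 to 0.06 to keep the sequence monotonic.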
In the beta range, the caudo-rostral gradient of evidence-accumulation is matched by a gradual transition from perception to action decisions, as shown by a positive correlation between regional specificity to the type of uncertainty and the estimated accumulation latencies (Figure 6a top left panel, r = 0.27, pcorrected = 0.044). Moreover, the information flow is aligned with the caudo-rostral gradient of accumulation since the flow proceeds from perceptual-decision regions to action-decision regions (Figure 6a bottom left panel, correlation between regional specificity and direction of information flow: r = −0.37, pcorrected = 0.0016).
In contrast, for the gamma band we found neither a significant relationship between region specificity and accumulation latency (Figure 6a top right panel, r = 0.19, pcorrected = 0.759) nor significant evidence of flow of information from perception to action decision regions (Figure 6a bottom right panel, r = −0.37, pcorrected = 0.068).
Finally, we hypothesized that in a continuous flow of information the amount of information transferred between perceptual and action accumulators co-varies with the rate of accumulation. If so, phase transfer entropy should correlate positively with model accumulation rate and negatively with reaction times, reflecting faster progression from perception through action.
This was the case in the beta range, where strong flow was associated with short reaction times (Figure 6b top left panel, repeated-measures correlation: rrm= −0.378, pcorrected = 0.03, CI [-0.588, −0.12]) and high accumulation rates (Figure 6b bottom left panel, rrm= 0.356, pcorrected = 0.045, CI [0.096, 0.572]). No significant correlation was observed in gamma (Figure 6b top-bottom right, reaction times vs information flow: rrm= −0.079 pcorrected = 0.564; accumulation-rate vs information flow: rrm = 0.113, pcorrected = 0.816).
Conclusions
There are two principal results from this study that illuminate the interaction between neural systems for perception and action. The first is that decisions in regions sensitive to motor precision do not wait until sensory decisions are completed. Instead, the accumulation of evidence for motor decisions begins within 100ms of the initiation of evidence accumulation in the first sensory regions. This indicates a continuous flow or cascade of information and its gradual transformation from sensory evidence to motor ‘intention’58.
The second is that the correlates of evidence-accumulation in the beta and gamma frequency ranges have distinct spatiotemporal profiles, and opposite dominant directions of flow. This spectral directionality is predicted by hierarchical cortical networks for prediction and inference in visuomotor control 22, 59–61. In the beta band, there is not only a spatial gradient in the timing of accumulation-to-threshold between occipital and pre-central cortex, but also a qualitative change in the accumulated signals: from sensitivity to visual uncertainty to sensitivity to response uncertainty. Moreover, the more sensitive a region is to action uncertainty (vs. perceptual uncertainty), the later its onset of beta accumulation, and the greater its bias to inflow (vs. outflow) as measured by phase transfer entropy (Figure 6). These effects were not confined to classical visual and motor regions, or even to the ‘dorsal stream’, but were identified throughout much of the cortex.
We set out to integrate the analysis of information flow, with decision-making implemented by the accumulation of evidence, and their joint influence on trial-to-trial variation in behavior (see Figure 1). Independent manipulation of perceptual and action uncertainty was coupled with the decomposition of performance into latent variables in a parsimonious linear ballistic accumulator model 31, which accurately generated the response distributions in each task condition including the expected effects of task variance on response latencies 11, 30. The model predictions of within-trial accumulation were correlated with change in beta and gamma power after the onset of stimulus coherence. Beta desynchronization has been shown to scale with uncertainty 51, but here we show its interaction with the temporal evolution of decision making over sub-second intervals. The observed desynchronization displays two signatures of the accumulation-to-threshold class of models: accumulation of decision-evidence over time and the consistent bound reached shortly before each movement10, 42–44.
Beta and gamma desynchronization have previously been correlated with behavioural performance. For example, in direct recording from non-human primates during working memory62 and sensory discrimination25, the beta band desynchronization was greater for accurate trials compared with inaccurate trials. Such beta power encoding of decision outcomes is supramodal in many cortical areas63. The change in beta power followed the change in gamma power as in the current study: we found an early wave of gamma followed by a second wave of beta.
Although gamma and beta rhythms have been observed to occur together or in close succession64, 65, the temporal relationship is functionally relevant. For hierarchical cortical networks, message passing between regions is a function of the laminar asymmetry of afferent vs. efferent connections59, and the properties of columnar circuitry which preferentially generates gamma rhythms superficially, and lower frequencies from deep layers66, 67. This promotes predictive feedback connectivity in beta and lower frequencies, and preferential feedforward ‘error’ signalling in gamma22, 61. The beta band’s lower frequency makes it inherently more suitable for coordination of information processing over longer conduction delays than gamma46.
As seen in Figure 4, where changes in spectral power were predicted by the LBA model, the latency to accumulation was confirmed as shorter for gamma than beta. Indeed, the spatial distribution of beta latencies in the dominant hemisphere (Figure 4e) also shows a gradient from occipital, to parietal and prefrontal, and lastly motor cortex. The motor cortex is also a region of strong net influx of beta (Figure 5b), even more than premotor cortex, consistent with the active inference model of motor control22, 61.
The spatial gradient of gamma latencies is reversed, with earliest changes observed in precentral cortex, before occipital cortex, and later gamma latencies in time with beta responses in occipital cortex. This may be because of the difference between predicting when a response may be required and what that response should be68. The sensory stimulus change (visual coherence) in our task is not the result of the participant’s own response, but is predictable a second after the onset of the non-coherent display. The participant can predict when an action is required, but not which actions are permitted or specified. An increase in localized and predominantly short-range interactions in the gamma range may therefore be permissive of the information required for the beta-mediated decision between action alternatives69.
Despite the similarity of onset of beta and gamma accumulation in occipital cortex, the connectivity analyses indicated distinct channels routing information at longer and shorter spatial scales, respectively. The pattern of net efflux vs. influx of beta (Figure 5b) shows a clear division between frontal cortex and posterior lobes. In other words, there was a cascade of overlapping accumulators and information flow along a rostro-caudal axis from perceptual to motor regions for beta, at least in the hemisphere contralateral to the response hand.
Lateralized beta activity during a decision-making task reflects not just movement preparation, but has also been related to a dynamic decision process with updating of a motor plan as a decision evolves42–44, 51. The beta power lateralization in motor areas was correlated with the state of decision-evidence. Crucially, these earlier MEG and EEG studies used a fixed mapping between decision outcomes and categorical behavioural responses, without choice or independence of perception and action decisions. When this fixed mapping between perceptual decision outcomes and motor responses is removed, sensorimotor beta lateralization disappears15. Our findings complement this work by directly revealing a lateralized progression of evidence accumulation from posterior perceptual regions to anterior motor areas.
Moreover, previous pioneering work on visuomotor decisions has focused on processes occurring at the final choice stage, leaving unresolved the question of whether evidence accumulation is coordinated throughout the whole cortex or just in specific regions. Our findings rest on a generalized model in which accumulation-to-threshold provides a canonical mechanism evolving throughout all layers of a visuomotor transformation (Figure 3a) and suggest that evidence accumulation is not a limited (perceptual) process with a single cortical focus, but distributed70, 71 and applicable to non-sensory evidence or intentions. This multi-focal property of evidence accumulation resonates with results from animal optogenetic70 and pharmacological71 studies showing that inactivation of local cortical areas carrying decision-related activity did not affect decision-making performance.
Taken together, our observations support the hypothesis that the beta band response links sensory evidence to motor plans, throughout a widespread network72. We propose that early neural signalling regarding the need for a response is followed by a second phase that integrates a continuous flow of information to make a decision between the alternative responses73. In this second phase, decisions unfold on the basis of a continuous flow of information (Figure 4d), rather than sequential completion of intermediate decisions. However, this hypothesis refers to the population level, and we cannot exclude the possibility that within each region, a subsection of neurons completes the relevant decision and forwards this outcome to the next level in the hierarchy, while others in that region continue to accumulate.
The fluctuations in the strength of information flow caused by changes in uncertainty are behaviourally relevant, in their positive correlation with accumulation rate and negative correlation with reaction times. This establishes an important formal link between behaviour, models of decision-making, and physiological connectivity. Fast accumulation rates in the linear ballistic accumulator model are associated with a more effective information flow throughout the visuo-motor processing hierarchy, resulting in faster decisions and responses. This relationship could be exploited to investigate clinical conditions in which the ability to use sensory inputs to guide actions is impaired.
In summary, our analytical approach explains visuomotor decisions through the combination of computational modelling of behaviour to derive latent decision variables that are identified by their neurophysiological signatures in distributed cortical networks. Variations of beta and gamma power reflect the temporal and spatial dynamics of the accumulation and transfer of decision-evidence, with a continuous flow of information between regions rather than sequential discrete decisions. During this flow, there is a gradual transition from the resolution of sensory uncertainty to resolution of response uncertainty enabling goal-directed actions in the face of sensory uncertainty.
Methods
Participants
Twenty healthy volunteers (9 females, 11 males, age range 18-39 years) took part in this study, after providing informed consent. Inclusion criteria were age 18-40 years, right-handedness, and screening for neurological or psychiatric illness. Two subjects failed to reach the requisite performance criterion during training and were excluded, leaving 18 subjects in all subsequent behavioral and neural analyses. Experimental protocols conformed to the guidelines of the Declaration of Helsinki and were approved by the local research ethics committee.
Stimuli
Stimuli were presented using Matlab and the Psychtoolbox routines in a sound-proof and dimly lit room. For the psychophysical training, stimuli were displayed on a CRT monitor at 60cm, and for the scan session, stimuli were projected on a screen through a projector at 130cm (both with a 60Hz refresh rate) with equivalent pixel resolution of 0.03°.
Stimuli were four random dot kinematograms 29 displayed within four circular apertures (4° diameter) positioned along a notional semi-circular arc (3.4° eccentricity) on a black background (100% contrast). Two hundred dots were displayed during each frame and spatially displaced in the next frame to introduce apparent downward motion (6°/sec velocity). To manipulate motion strength (i.e. motion coherence) between trials, on each frame only a certain proportion of dots moved downward whilst the rest of the dots were randomly reallocated. Motion coherence level was kept constant throughout the trial.
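The frame-by-frame dot update described above can be sketched as follows. This is a minimal illustration, not the stimulus code used in the study: the function name `update_dots`, the square dot field standing in for the circular aperture, and the wrap-around at the lower edge are all assumptions. The step of 0.1° per frame follows from 6°/sec at a 60Hz refresh rate.

```python
import random

def update_dots(dots, coherence, step=0.1, half_size=2.0, rng=random):
    """Advance one frame of a random dot kinematogram (illustrative sketch).

    dots: list of (x, y) positions in degrees, centred on the aperture.
    coherence: proportion of dots displaced downward on this frame.
    step: downward displacement per frame (6 deg/s at 60 Hz = 0.1 deg).
    half_size: half-width of the dot field (square here, an assumption).
    """
    n_coherent = round(coherence * len(dots))
    # Choose which dots carry the coherent motion signal on this frame.
    coherent_idx = set(rng.sample(range(len(dots)), n_coherent))
    new_dots = []
    for i, (x, y) in enumerate(dots):
        if i in coherent_idx:
            y -= step                     # coherent dots move downward
            if y < -half_size:            # assumed wrap-around at the edge
                y += 2 * half_size
        else:
            # Remaining dots are randomly reallocated within the field.
            x = rng.uniform(-half_size, half_size)
            y = rng.uniform(-half_size, half_size)
        new_dots.append((x, y))
    return new_dots
```

Calling `update_dots` once per frame with a fixed `coherence` keeps the motion strength constant within a trial, as in the task.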
Since abrupt stimulus onset and offset could elicit large sensory-evoked potentials which might mask decision processes, the 1.5 seconds long coherent motion interval was preceded and followed by intervals of zero-coherence levels lasting 1sec and 0.5sec, respectively.
Task and procedures
Participants performed a finger-tapping task adapted from previous studies 13, 30. Their goal was to detect the onset of coherent motion and to press the button corresponding to one of the downward moving stimuli (coherent stimuli). The number of coherent stimuli defined two trial types: Low action uncertainty trials, where a single coherent stimulus commanded which button to press; and high action uncertainty trials, where three coherent stimuli required the participants to make a simple choice and press any one of the three corresponding buttons (a “fresh choice, regardless of what you have done in previous trials”30). Equal emphasis was placed on the speed and accuracy of the responses. Participants were instructed to fixate on a central red mark throughout the trial. Eye-tracking data collected during the first six scanning sessions confirmed participants were able to successfully perform the task while maintaining fixation (see supplementary results). Each trial started with the presentation of the fixation mark, and stimulus onset followed after a variable interval of between 0.5sec and 1sec. The imaging session was preceded by one training psychophysical session and one test session scheduled on separate days; the scanning session was conducted a maximum of four days after the psychophysical training, depending on the availability of the participants.
Psychometric calibration
Participants were first familiarized with the finger-tapping task during a short practice session in which 100% coherent stimuli were used. The familiarization phase was completed when participants reached 90% accuracy across all trial types. In the following psychophysical training, motion coherence was randomly varied between trials to estimate individual motion thresholds. Eight logarithmically spaced motion coherence levels (0 0.5 0.10….0.9) were used (32 trials per level) following extensive piloting to ensure coverage of a wide range of individual motion sensitivity. Each training session comprised 16 blocks of 32 trials. Feedback was provided for correctness of responses as well as for too early or too late responses (100ms and 2.5s from motion coherence onset, respectively).
To ensure that participants perceived all the available options (i.e. coherent stimuli) before committing to a decision, occasionally (p = 0.2) after a correct choice they had to perform a secondary match-to-sample task: a set of grey discs replaced the stimuli and participants had to report whether their locations matched the location of the previously displayed coherent stimuli. They had to press any button to report a match and withhold any response otherwise. A trial was considered as correct only when both choice and matching were correct. Trials with un-matching responses were discarded and repeated within the session.
To tailor the sensory evidence to the participants’ individual motion sensitivity across number of options, the discrimination accuracy of each trial type in each training session was fitted using a maximum likelihood method, with a Log-Quick function defined as
$$F(x; \alpha, \beta) = 1 - 2^{-10^{\beta (x - \alpha)}}$$
where α is the threshold, β is the slope and x is the coherence level. To obtain the proportion correct for each trial type, the Log-Quick function was scaled by
$$\psi(x; \alpha, \beta, \gamma, \lambda) = \gamma + (1 - \gamma - \lambda)\, F(x; \alpha, \beta)$$
where γ is the guess rate and λ is the lapse rate controlling the lower and upper asymptote of the psychometric function, respectively.
Individual low and high perceptual uncertainty levels for each trial type were estimated as the 75th and 90th percentile of the psychometric functions from the last session. The reason for adopting these thresholds was twofold: firstly, participants need to perceive all the available options before committing to a decision. Secondly, supra-threshold trials are best suited for investigating neural correlates of evidence accumulation74.
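As an illustration of how coherence levels could be read off a fitted psychometric function, the sketch below evaluates a scaled Log-Quick function and inverts it for a target accuracy. It assumes the standard Log-Quick form F(x) = 1 − 2^(−10^(β(x − α))) scaled by guess rate γ and lapse rate λ, and it interprets the 75th and 90th percentiles as 75% and 90% correct; the function names and the parameter values in the usage note are hypothetical.

```python
import math

def log_quick(x, alpha, beta):
    """Unscaled Log-Quick function (assumed standard form)."""
    return 1.0 - 2.0 ** (-(10.0 ** (beta * (x - alpha))))

def proportion_correct(x, alpha, beta, gamma, lam):
    """Log-Quick scaled by guess rate gamma and lapse rate lam."""
    return gamma + (1.0 - gamma - lam) * log_quick(x, alpha, beta)

def coherence_for_accuracy(p, alpha, beta, gamma, lam):
    """Invert the scaled function: stimulus level giving accuracy p."""
    f = (p - gamma) / (1.0 - gamma - lam)
    if not 0.0 < f < 1.0:
        raise ValueError("target accuracy lies outside the fitted asymptotes")
    return alpha + math.log10(-math.log2(1.0 - f)) / beta
```

For example, with illustrative values α = 0.3, β = 2, γ = 0.25 and λ = 0.02, `coherence_for_accuracy(0.90, 0.3, 2.0, 0.25, 0.02)` returns a higher stimulus level than the same call with `0.75`, corresponding to the lower perceptual uncertainty condition.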
Test and scan sessions
Test and scan sessions were scheduled on separate days; the scanning session was conducted a maximum of four days after the psychophysical training, depending on the availability of the participants. The test session was to ensure that the participants were able to perform well under the individually adjusted motion thresholds. In the test and scan sessions, coherence levels were fixed to the individual thresholds corresponding to high and low levels of perceptual uncertainty, the match-to-sample task was removed, and no feedback was provided except for too late or too long responses. Levels of perceptual and action uncertainty were randomly interspersed across trials. Each session consisted of 10 blocks (total 720 trials per participant) separated by a short rest. Trials on which responses were made before 0.1-sec or after 2-sec (on average 1.3% of total trials) were excluded from subsequent analyses.
MEG and EEG data acquisition and processing
An Elekta Neuromag Vectorview System simultaneously acquired magnetic fields from 102 magnetometers and 204 paired planar gradiometers, and electrical potential from 70 Ag-AgCl scalp electrodes in an Easycap extended 10-10% system. Additional electrodes provided a nasal reference, a forehead ground, paired horizontal and vertical electro-oculography (EOG), electrocardiography (ECG) and neck electromyography (EMG). All data were recorded and digitized continuously at a sample rate of 1kHz and high-pass filtered above 0.01 Hz.
Before scanning, head shape, the locations of five evenly distributed head position indicator coils, EEG electrode locations, and the position of three anatomical fiducial points (nasion and left and right pre-auricular) were recorded using a 3D digitizer (Fastrak Polhemus Inc., Colchester, VA). The initial impedance of all EEG electrodes was optimized to below 10 kΩ, and if this could not be achieved in a particular channel, or if it appeared noisy to visual inspection, it was excluded from further analysis. The 3D position of the head position indicators relative to the MEG sensors was monitored throughout the scan. These data were used by Neuromag Maxfilter 2.2 software, to perform environmental noise suppression, motion compensation, and Signal Space Separation.
Subsequent analyses were performed using in-house Matlab (Mathworks) code, SPM12 (http://www.fil.ion.ucl.ac.uk/spm) and EEGLab (Swartz Center for Computational Neuroscience, University of California San Diego). Separate independent component analysis was computed for the three sensor types and artifactual components were rejected. For EEG data, components temporally and spatially correlated to eye movements, blinks and cardiac activity were automatically identified with EEGLab’s toolbox ADJUST. For MEG data, components were automatically identified that were both significantly temporally correlated with electrooculography and electrocardiography data, and spatially correlated with separately acquired topographies for ocular and cardiac artifacts. Artifactual components were finally projected out of the dataset with a translation matrix.
The continuous artifact-corrected data were low-pass filtered (cut-off = 100Hz, Butterworth, fourth order), notch filtered between 48 and 52Hz to remove mains power supply artifacts, down-sampled to 250Hz, and epoched from −1500 to 2500ms relative to motion coherence onset. EEG data were referenced to the average over electrodes.
MEEG source reconstruction
MEG and EEG data were combined before inversion into source space 27. The forward model (lead field) was estimated from a single shell canonical cortical mesh with >8000 vertices of each participant’s anatomical T1-weighted MRI image. Lead fields were calculated over a window from −1500 to 2500ms relative to motion coherence onset. The cortical mesh was co-registered to the MEEG data using the digitised fiducial and scalp points. We computed the inverse source reconstruction for single trials using the minimum norm algorithm as implemented by SPM12. All conditions were included in the inversion to ensure an unbiased linear mapping. The source images were spatially smoothed using an 8 mm FWHM Gaussian kernel.
Dimensionality reduction
To address the problem of multiple comparisons and reduce the computational load when comparing the model predictions with the source-localized time series, we applied a parcellation-based dimensionality reduction to our data following the procedure described by Colclough and colleagues 37. First, the whole-brain surface was parcellated into 96 anatomical regions of interest (ROIs) as defined by the Harvard-Oxford cortical brain atlas. We then represented the dynamics of each ROI with a single time-course, obtained using principal component analysis. The reconstructed sources within each ROI were first bandpass-filtered. The coefficients of the principal component accounting for the majority of the variance of the vertices within each ROI were then taken as an appropriate representation of source activity for that region.
Accumulator model of perceptual and action decisions
Behavioral data were analyzed using a variant of the linear ballistic accumulator (LBA) model which has been previously applied to a finger tapping task to model fMRI evidence accumulation 13, 30. According to this class of models, a decision about when and which action to select is dictated by a ‘race’ competition among independent accumulators. Each accumulator linearly integrates the decision-evidence (or the intention) over time in favor of one action, and the decision is made when the accumulated activity reaches threshold. In our task possible actions correspond to a button press from one of four fingers, each modeled by independent accumulators i ∈{1, 2, 3, 4}. When three valid actions are available, three accumulators are engaged with activation starting at levels independently drawn from a uniform distribution [0, c0], and increasing linearly over time with an accumulation rate (v) drawn from an independent normal distribution with mean μi and standard deviation σi.
A response is triggered once one accumulator wins the ‘race’ and reaches a decision bound b. When only one action is available, only the accumulator corresponding to the available action is engaged. Predicted reaction time (RT) is given by the duration of the accumulation process for the winning accumulator, plus a constant non-decision time t0 representing the latency associated with stimulus encoding and motor response initiation 31.
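The race described above can be sketched as a single simulated trial. This is a minimal illustration under stated assumptions, not the fitting code used in the study: the function name `simulate_lba_trial` is hypothetical, accumulators that draw a non-positive rate are treated as never reaching the bound, and the trial is redrawn if no accumulator finishes. The bound b is assumed to be at least c0.

```python
import random

def simulate_lba_trial(mu, sigma, b, c0, t0, rng=random, max_tries=1000):
    """Simulate one LBA trial; returns (winning accumulator index, RT).

    mu, sigma: per-accumulator mean and SD of the accumulation rate
               (three entries when three actions are valid, one otherwise).
    b: decision bound; c0: upper limit of the uniform start point;
    t0: constant non-decision time added to the accumulation time.
    """
    for _ in range(max_tries):
        finish_times = []
        for m, s in zip(mu, sigma):
            start = rng.uniform(0.0, c0)     # start point ~ U[0, c0]
            rate = rng.gauss(m, s)           # rate ~ N(mu_i, sigma_i)
            # Linear accumulation: time to travel from start to the bound.
            finish_times.append((b - start) / rate if rate > 0 else float("inf"))
        fastest = min(finish_times)
        if fastest != float("inf"):
            return finish_times.index(fastest), fastest + t0
    raise RuntimeError("no accumulator reached the bound")
```

A low action uncertainty trial corresponds to passing single-element lists for `mu` and `sigma`, so that only the accumulator for the specified action is engaged.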
Parameter estimation and model selection
To identify the combinations of free parameters that best accounted for the observed behavioral data we first fitted 15 variants (i.e. all possible combinations without repetition) of the LBA. Each variant was characterized by a unique combination of free parameters allowed to vary across trials. We followed the former procedure 30 to estimate the model prediction of reaction time quantiles and selection probabilities of each condition. The best-fitting parameters for each model variant were used to calculate the Bayesian Information Criterion (BIC), which penalizes extra free parameters in favor of simpler models. BIC values were then used to compare the goodness-of-fit of each variant using random-effects Bayesian model comparison 33, 34. In this comparison, each model variant is treated as a random effect that could differ between participants. The critical statistical quantity is the probability that any given model outperforms the other variants most of the time (exceedance probability).
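For orientation, the sketch below enumerates model variants and computes the BIC in its usual form, k·ln(n) − 2·ln(L). Two things are assumptions rather than details from the text: the four parameter names in `CANDIDATES` are a hypothetical choice (15 is the number of non-empty subsets of four items), and the log-likelihood is taken as given rather than derived from the quantile-based fitting procedure.

```python
import math
from itertools import combinations

# Hypothetical set of four parameters that may vary across conditions.
CANDIDATES = ("mu", "sigma", "b", "t0")

def enumerate_variants(parameters):
    """All non-empty combinations (without repetition) of free parameters."""
    return [combo
            for size in range(1, len(parameters) + 1)
            for combo in combinations(parameters, size)]

def bic(log_likelihood, n_free_params, n_observations):
    """Bayesian Information Criterion; lower values indicate a better model."""
    return n_free_params * math.log(n_observations) - 2.0 * log_likelihood
```

With equal log-likelihoods, a variant with more free parameters receives a higher (worse) BIC, which is the penalty for complexity referred to above.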
Estimation of expected neural activity
We generated predictions of decision-related activity from the LBA model to locate neural signatures of decision-evidence accumulation in single-trial analyses of MEEG data. For multiple options, the LBA model assumes multiple active accumulators, one for each finger option. Let $v_w$ be the accumulation rate of the winning option (i.e. the one reaching response threshold b), sampled from the normal distribution $N(\mu_w, \sigma_w)$.
Let $v_{l1}$ and $v_{l2}$ be the sampled accumulation rates of the alternative options (i.e. the losers), sampled from normal distributions $N(\mu_{l1}, \sigma_{l1})$ and $N(\mu_{l2}, \sigma_{l2})$, respectively. If the reaction time of a given trial is RT, the latency of the accumulation process is RT – t0, such that the expected accumulation of the winning option is:
$$E[v_w] = \frac{b - c_0/2}{RT - t_0}$$
Since the losing accumulators have not reached the threshold by the time of the response RT, the expected values of $v_{l1}$ and $v_{l2}$ are smaller than $E[v_w]$. Therefore, the losing accumulation rates have truncated normal distributions with an upper bound of $E[v_w]$ and with expected values of:
$$E[v_{li}] = \mu_{li} - \sigma_{li}\,\frac{\phi(\alpha_i)}{\Phi(\alpha_i)}, \quad i \in \{1, 2\}$$
where $\alpha_i = (E[v_w] - \mu_{li})/\sigma_{li}$, and $\phi$ and $\Phi$ are the standard normal probability density and cumulative distribution functions.
The sum of the winning and losing accumulation rates gives an estimation of total accumulation activity for single trials. For trials with only one available option, the accumulation activity is determined by the only active accumulator.
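The single-trial prediction can be sketched as below. The mean of a normal distribution truncated from above is a standard result; the expression used for the winning rate, (b − c0/2)/(RT − t0), is an assumption that takes the expected start point of a uniform [0, c0] distribution. Function names are hypothetical.

```python
from statistics import NormalDist

def expected_loser_rate(mu, sigma, upper):
    """Mean of N(mu, sigma) truncated above at `upper`."""
    a = (upper - mu) / sigma
    std = NormalDist()                     # standard normal
    return mu - sigma * std.pdf(a) / std.cdf(a)

def expected_total_accumulation(rt, t0, b, c0, loser_mu=(), loser_sigma=()):
    """Sum of expected winning and losing accumulation rates for one trial.

    With empty loser_mu/loser_sigma (one available option), the result is
    the expected rate of the only active accumulator.
    """
    # Assumed form: distance from the expected start point to the bound,
    # divided by the accumulation time RT - t0.
    winner = (b - c0 / 2.0) / (rt - t0)
    losers = [expected_loser_rate(m, s, winner)
              for m, s in zip(loser_mu, loser_sigma)]
    return winner + sum(losers)
```

Because the truncation bound is the expected winning rate, each losing rate is guaranteed to fall below it, so faster responses yield larger predicted total accumulation activity.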
Single-trial analysis
To identify the spatio-temporal profile of decision-related accumulation over the brain we derived model-predicted signals for each trial to compare with neural oscillations in theta (4-8 Hz), alpha (8-12 Hz), beta (12-30 Hz) and gamma (31-90 Hz) frequency bands. To estimate the power of oscillations on a single-trial basis, we extracted stimulus-locked epochs from 500ms before to 1500ms after coherence onset. Next, we extracted frequency-specific signal envelope modulations using a Hilbert transform of the source data from each reconstructed ROI. The Hilbert envelope is a convenient measure of how the power of the signal varies over time in the frequency range of interest, and is thus particularly suited to capture relatively slow fluctuations associated with the instantaneous accumulation of evidence/intentions. The power estimates of individual participants were down-sampled to 100Hz and normalized by their baseline (from 400ms to 100ms before coherence onset).
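To make the envelope step concrete, the sketch below computes a Hilbert envelope from the analytic signal using a naive discrete Fourier transform, followed by a baseline normalisation. It is an illustration only: band-pass filtering is omitted, the O(n²) transform is suitable only for short signals (in practice a routine such as `scipy.signal.hilbert` would be used), and expressing the normalisation as relative change from the baseline mean is an assumption, since the text does not specify the form.

```python
import cmath
import math

def hilbert_envelope(signal):
    """Amplitude envelope via the analytic signal (naive O(n^2) DFT)."""
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies to obtain the analytic signal.
    for k in range(1, n):
        if n % 2 == 0 and k == n // 2:
            continue
        spectrum[k] *= 2.0 if k < (n + 1) // 2 else 0.0
    analytic = [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]
    return [abs(z) for z in analytic]

def baseline_normalise(power, baseline):
    """Relative change from the mean of power[baseline] (assumed form)."""
    window = power[baseline]
    base = sum(window) / len(window)
    return [(p - base) / base for p in power]
```

For an epoch starting 500ms before coherence onset and sampled at 100Hz, the baseline from 400ms to 100ms before onset would correspond to `slice(10, 40)`.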
We estimated the maximum lagged absolute Spearman correlation between the model predicted activity and the signal envelope in a trial-by-trial fashion. The lagged correlation was used to optimally split the non-decision time before and after the accumulation period to determine the time delay between the neural signal and the model predictions. The time before accumulation provides a measure of the temporal separation between coherence onset and accumulation onset.
If the model prediction x is a lagged version of the neural signal y so that
$$x(t) = y(t + \tau_0)$$
where τ0 is a time delay that can vary from 0ms to the individual non-decision time (t0) with steps of 10ms, then the maximum absolute lagged correlation between x and y is defined as
$$\rho_{max} = \max_{\tau_i} \left| \rho_{xy}(\tau_i) \right|$$
where τi = [0, 10, 20 … t0],
with the peak value of ρxy(τi) occurring when τi = τ0, which allows us to determine the time delay. We estimated the largest absolute lagged correlation value for each ROI and individual by comparing concatenated epochs and model predictions. This choice allows us to measure accumulation lags specific to each ROI, under the assumption that they differ across brain regions for each participant. The strength of the Fisher-transformed maximum lagged correlations for each ROI was then quantified (z-score) using a one-sample sign-test. To provide a conservative estimate of significant correlations between model prediction and neural activity, we repeated the above procedure 10,000 times, each iteration using a different phase-randomized version of the original MEEG signal, to obtain a distribution of correlations under chance. Two-tailed statistical significance was assessed by computing the proportion of absolute values of the distribution of correlations generated by chance exceeding the correlation between model predictions and the original MEEG signal. The resulting p-values were corrected for multiple comparisons (False Discovery Rate) across ROIs and frequency bands.
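The lag search can be illustrated with a small sketch that slides the model prediction along the neural envelope and keeps the lag with the largest absolute Spearman correlation. It assumes the convention that the model prediction at sample t is compared with the neural signal at sample t + lag, with one sample per 10ms step at 100Hz. The function names are hypothetical, and the phase-randomised surrogate test is not covered. A constant segment has undefined rank correlation and is not handled here.

```python
def _ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def max_lagged_abs_spearman(model, neural, max_lag):
    """Return (lag in samples, correlation) with the largest |rho|."""
    best_lag, best_rho = 0, 0.0
    for lag in range(max_lag + 1):
        segment = neural[lag:lag + len(model)]
        if len(segment) < len(model):
            break                          # ran out of neural samples
        rho = spearman(model, segment)
        if abs(rho) > abs(best_rho):
            best_lag, best_rho = lag, rho
    return best_lag, best_rho
```

Setting `max_lag` to the participant's non-decision time in samples restricts the search to delays between 0ms and t0, and the returned lag estimates the interval between coherence onset and accumulation onset for that ROI.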
Connectivity analysis
To explore the direction of the information flow we employed phase-transfer entropy, a data-driven effective-connectivity measure robust to signal leakage 28. The preferred direction of information between ROIs whose activity best matched with model’s predicted activity was estimated using the directed phase-transfer entropy.
To identify the ROIs that preferentially accumulated evidence for perception or action decisions, the average information flow (quantified by phase transfer entropy) sent by each ROI was calculated for each subject and condition. The difference of information flow between uncertainty levels for perception and action was compared at the ROI level with a surrogate distribution generated by flipping the condition labels for a random number of participants (10,000 iterations). Since significance was estimated separately for perception and action, the critical value for the FDR correction was halved to α = 0.025.
To quantify the direction of information flow, we calculated a posterior to anterior index (PAx) as implemented by Hillebrand et al., 2016. A positive PAx indicates preferential flow from posterior regions toward anterior regions. ROIs were split into anterior and posterior regions with respect to the precentral gyrus (see Table S1). Significance was assessed with permutation testing where the average directional phase-transfer entropies were shuffled across ROIs and PAx was estimated. This procedure was repeated 10,000 times to generate a surrogate distribution of PAx values against which the observed PAx values were tested (p<0.025 to account for multiple comparisons).
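Flipping the condition labels of a participant is equivalent to reversing the sign of that participant's between-condition difference, so the surrogate test for one ROI can be sketched as follows. This is an illustrative implementation under assumptions: the function name is hypothetical, each participant's labels are flipped with probability 0.5, and the p-value uses the common (count + 1)/(iterations + 1) convention.

```python
import random

def sign_flip_p_value(differences, n_iter=10000, rng=random):
    """Two-tailed p-value for the mean paired difference.

    differences: one value per participant (high minus low uncertainty
    information flow for a given ROI).
    """
    n = len(differences)
    observed = abs(sum(differences) / n)
    exceed = 0
    for _ in range(n_iter):
        # Randomly swap condition labels, i.e. flip the sign of each difference.
        surrogate = sum(d if rng.random() < 0.5 else -d for d in differences) / n
        if abs(surrogate) >= observed:
            exceed += 1
    return (exceed + 1) / (n_iter + 1)
```

The resulting per-ROI p-values would then be passed to a false discovery rate correction at the halved critical value.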
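A minimal sketch of the index and its permutation test is given below. It assumes that PAx is the mean directed phase-transfer entropy of posterior ROIs minus that of anterior ROIs, so that positive values indicate net posterior-to-anterior flow. The ROI labels in the usage note and the function names are hypothetical.

```python
import random
from statistics import mean

def pax(dpte_by_roi, posterior_rois, anterior_rois):
    """Posterior-anterior index: mean posterior minus mean anterior dPTE."""
    posterior = [dpte_by_roi[r] for r in posterior_rois]
    anterior = [dpte_by_roi[r] for r in anterior_rois]
    return mean(posterior) - mean(anterior)

def pax_permutation_test(dpte_by_roi, posterior_rois, anterior_rois,
                         n_iter=10000, rng=random):
    """Observed PAx and two-tailed p-value from shuffling dPTE across ROIs."""
    rois = list(dpte_by_roi)
    values = [dpte_by_roi[r] for r in rois]
    observed = pax(dpte_by_roi, posterior_rois, anterior_rois)
    exceed = 0
    for _ in range(n_iter):
        shuffled = values[:]
        rng.shuffle(shuffled)              # break the ROI-to-value assignment
        surrogate = pax(dict(zip(rois, shuffled)), posterior_rois, anterior_rois)
        if abs(surrogate) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_iter + 1)
```

For example, `pax({"occipital": 0.6, "parietal": 0.6, "prefrontal": 0.4, "motor": 0.4}, ["occipital", "parietal"], ["prefrontal", "motor"])` is positive, indicating preferential posterior-to-anterior flow in this toy case.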
For the correlations in Figure 6a, we first confirmed homoscedasticity of our data and then calculated bootstrapped Pearson’s correlations. For the correlations in Figure 6b we used repeated-measures correlation (as implemented in the rmcorr package in R) which accounts for non-independence among observations due to multiple measurements per participant. The resulting p-values were corrected for multiple comparisons by applying Holm-Bonferroni correction.
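The Holm-Bonferroni step-down correction can be written in a few lines; the sketch below returns adjusted p-values in the original order. The function name is hypothetical, and the input is assumed to be the set of uncorrected p-values from the correlation analyses.

```python
def holm_bonferroni(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[i])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted
```

An adjusted value below 0.05 then corresponds to rejecting the null hypothesis at the family-wise 5% level.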
Hypothesis testing
Differences in reaction times were tested with a 2-way repeated measures ANOVA (Low/High Uncertainty x Action/Perception). All other hypothesis tests used non-parametric tests or random permutation methods that do not rely on specific assumptions about the distributions of data values. All tests were evaluated at the p<0.05 level (two-tailed), correcting for multiple comparisons where appropriate.