Abstract
A gradual buildup of electrical potential over motor areas precedes self-initiated movements. These “readiness potentials” (RPs) could simply reflect stochastic fluctuations in neural activity. We operationalised self-initiated actions as endogenous ‘skip’ responses while waiting for target stimuli in a perceptual decision task. Across-trial variability of EEG decreased more markedly prior to self-initiated compared to externally-triggered skip actions. This convergence towards a fixed pattern suggests a consistent preparatory process prior to self-initiated action. A leaky stochastic accumulator model could reproduce these features of the data, given the additional assumption of a decrease in noise level at the input to the accumulator prior to self-initiated, but not externally-triggered actions. The assumed reduction in neural noise was supported by analyses of both within-trial EEG variability and of spectral power. We suggest that a process of noise reduction is consistently recruited prior to self-initiated action. This precursor event may underlie the emergence of RP.
Introduction
Functional and neuroanatomical evidence has been used to distinguish between two broad classes of human actions: self-initiated actions that happen endogenously, in the absence of any specific stimulus (Haggard, 2008; Passingham, Bengtsson, & Lau, 2010), and reactions to external cues. Endogenous actions are distinctive in several ways. First, they depend on an internal decision to act and are not triggered by external stimuli. In other words, the agent decides internally what to do, or when to do it, without any external cue specifying the action (Passingham et al., 2010). Second, we often deliberate and consider reasons before choosing and performing one course of action rather than an alternative. Thus, endogenous actions should be responsive to reasons (Anscombe, 2000). These features of endogenous action capture many key attributes of human volition. While there has been extensive research on action selection and initiation in response to external cues, the brain mechanisms for endogenous, self-initiated actions have been less studied.
Many studies of human voluntary action involve the paradoxical instruction to ‘act freely’, e.g., “press a key when you feel the urge to do so” (Cunnington, Windischberger, Deecke, & Moser, 2002; Jahanshahi et al., 1995; Libet, Gleason, Wright, & Pearl, 1983; Wiese et al., 2004). However, the situation and task demands of such experiments are complex, and have been justly criticised (Nachev & Hacker, 2014). We adapted for humans a paradigm previously used in animal research (Murakami, Vicente, Costa, & Mainen, 2014), which embeds endogenous actions within the broader framework of reward-guided perceptual decision-making. Participants responded to the direction of unpredictably-occurring dot motion stimuli by pressing left or right arrow keys (Gold & Shadlen, 2007). Importantly, they could also choose to skip waiting for the stimuli to appear, by pressing both keys simultaneously whenever they wished. The skip response thus reflects a purely endogenous decision to act, without any direct external stimulus, and provides an operational definition of a self-initiated action. Self-initiated ‘skip’ responses were compared to a block where participants made the same bilateral ‘skip’ actions in response to an unpredictable change in the fixation point (Figure 1).
Controversies regarding precursor processes have been central to neuroscientific debates about volition (Dennett, 2015; Libet et al., 1983). The classical neural marker of precursor processes for endogenous action is the readiness potential (RP: Kornhuber & Deecke, 1965). The RP is taken to be “the electro-physiological sign of planning, preparation, and initiation of volitional acts” (Kornhuber & Deecke, 1990) and was considered a pre-requisite of the conscious intention to act (Libet et al., 1983; Sinnott-Armstrong & Nadel, 2010). Classical studies explicitly or implicitly assume that the RP reflects a putative ‘internal volitional signal’, with a constant, characteristic ramp-like form, necessarily preceding action initiation - although this signal is heavily masked by noise in any individual trial (Dirnberger, Lang, & Lindinger, 2008). However, the very idea that the RP reflects a specific precursor process leading to endogenous action has been recently challenged. Alternative models suggest that the rising ramp pattern of the mean RP does not reflect a goal-directed process but rather reflects subthreshold stochastic fluctuations that influence the precise time of crossing the threshold for movement (Murakami et al., 2014; Schurger, Sitt, & Dehaene, 2012). Crucially, averaging these random fluctuations time-locked to action initiation results in the rising ramp pattern of the mean RP, with its appearance of a stable ERP component. According to the stochastic account, the RP reflects cross-trial averaging of data epochs time locked to crests in autocorrelated neural noise, rather than a specific, goal-directed process that causes action. This view, which can be formally expressed in a quantitative model, and tested against neural data (Murakami et al., 2014; Schurger et al., 2012), has radically revised neuroscientific theories of voluntary action.
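The stochastic account can be illustrated with a minimal simulation sketch. It assumes a leaky accumulator of the form dx = (I − k·x)·dt + c·√dt·ξ, with drift I, leak k, noise scaling c and Gaussian noise ξ, in the spirit of the cited accumulator models. The parameter values, the threshold, the function names and the 1.5 s epoch length are illustrative assumptions rather than values taken from this study, and the sign convention of the RP (negative-going) is ignored.

```python
import numpy as np

def simulate_accumulator(n_trials=400, trial_s=20.0, dt=0.002,
                         drift=0.11, leak=0.5, noise=0.1, seed=0):
    """Euler simulation of dx = (drift - leak * x) * dt + noise * sqrt(dt) * xi.

    All default values are illustrative assumptions, not fitted parameters.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(trial_s / dt))
    x = np.zeros((n_trials, n_steps))
    for t in range(1, n_steps):
        xi = rng.standard_normal(n_trials)
        x[:, t] = (x[:, t - 1]
                   + (drift - leak * x[:, t - 1]) * dt
                   + noise * np.sqrt(dt) * xi)
    return x

def crossing_locked_epochs(x, threshold=0.3, n_epoch=750):
    """Cut the n_epoch samples ending at each trial's first threshold crossing."""
    crossed = x >= threshold
    first = crossed.argmax(axis=1)  # index of first crossing (0 if none)
    # Keep trials that cross, and cross late enough to fill a whole epoch.
    keep = crossed.any(axis=1) & (first >= n_epoch - 1)
    return np.stack([x[i, first[i] - n_epoch + 1: first[i] + 1]
                     for i in np.flatnonzero(keep)])

x = simulate_accumulator()
epochs = crossing_locked_epochs(x)   # 750 samples = 1.5 s at dt = 0.002
mean_trace = epochs.mean(axis=0)     # backward average across trials
```

Because every retained trial is below threshold until its final sample, the backward average necessarily rises towards the threshold at the time of "action", even though no trial contains a dedicated preparatory signal.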
Both classical and stochastic models can reproduce the existence of mean RP, treating it as signal or as averaged noise, respectively. However, the two models make different predictions about EEG variability prior to action. On the stochastic model, neural activity eventually and necessarily converges because stochastic fluctuations must always approach the motor threshold from below. The time of convergence depends only on the temporal autocorrelation of the EEG signal. On the classical model, the distribution of single trial RPs additionally converges because the RP marks a consistent precursor process that reliably precedes self-initiated action. While variability of RP activity has rarely been studied previously (but see Dirnberger et al., 2008), several studies of externally-triggered processing have used variability of neural responses to identify neural codes. For example, variability goes down in the interval between a go cue and movement onset (Churchland, Yu, Ryu, Santhanam, & Shenoy, 2006), and during perceptual processing (He, 2013; Schurger, Sarigiannidis, Naccache, Sitt, & Dehaene, 2015). We thus hypothesised that variability of neural activity should decrease more markedly prior to self-initiated skip actions. The additional decrease in variability would be a marker of a consistent preparatory process, or precursor for self-initiated action.
We found an additional drop in inter-trial variability beginning 1.5 s before self-initiated skips, compared to externally-driven skip actions. This differential convergence suggests a consistent precursor process that reliably precedes self-initiated action. The gradual onset of differential convergence suggests that the precursor process has a variable duration, rather than a fixed duration (Dirnberger et al., 2008), although we could not formally compare these alternative models using our methods. We showed that a modified version of an established computational model based on stochastic fluctuations in neural activity (Schurger et al., 2012; Usher & McClelland, 2001) could capture the patterns of EEG variability found in our data. Crucially, the model required a specific modification to explain the data, namely a neurocognitive process of noise reduction, time-locked to the moment of action. Our model further predicts that decreases in within-trial EEG fluctuation and spectral power should be observed prior to self-initiated action. These predictions were confirmed. Importantly, these differences between conditions began well before the stimulus onset in our externally-triggered condition (Figures 2 & 7), so presumably reflect changes in ongoing EEG prior to self-initiated action. Thus, impending self-initiated voluntary action appears to be associated with a distinctive, time-specific stabilisation of premovement EEG patterns.
Results
Behavioural data
Participants (n=22) waited for a display of random dots to move coherently (step change from 0% to 100% coherence) towards the left or right. They responded with the left or right hand by pressing a left or right arrow key on a keyboard, accordingly. They received a reward for correct responses. However, the time of coherent motion onset was drawn unpredictably from an exponential distribution, so waiting was sometimes extremely long. In the ‘self-initiated’ condition blocks (Figure 1A), participants could skip waiting if they chose to, by pressing the left and right response keys simultaneously. The skip response saved time, but produced a smaller reward than a response to dot motion. The experiment was limited to one hour, so using the skip response implied a general understanding of the trade-off between time and money. A skip response thus reflects a purely endogenous decision to act, in the absence of any external instruction to act, and based on the trade-off between larger, later rewards and smaller, earlier rewards. This provides an operational definition of volition within our experimental design, which captures some of the important features of endogenous voluntary control, as well as the linkage of self-initiated action to other aspects of cognition, such as decision-making and judgement (Schüür & Haggard, 2011). In the ‘externally-triggered’ condition blocks, participants could not choose for themselves when to skip. Instead, they were instructed to make skip responses by an external signal (Figure 1B) (see materials and methods).
On average participants skipped 108 (SD = 16) and 106 (SD = 17) times in the self-initiated and externally-triggered conditions, respectively. They responded to coherent dot motion in the remaining trials (N = 177, SD = 61), with a reaction time of 767 ms (SD = 111 ms). Those responses were correct on 86% (SD = 4%) of trials. The average waiting time before skipping in the self-initiated condition (7.3 s, SD = 1.6) was similar to that in the externally-triggered condition (7.6 s, SD = 1.6), confirming the success of our yoking procedure (see materials and methods). The waiting time varied more across trials within each individual, than across individuals, suggesting that self-initiated skip responses represented an on-line decision to act, rather than a pre-decided stereotyped response. Thus, the SD across trials had a mean of 3.17 s (SD across participants = 1.42 s) for self-initiated skips. Our yoking procedure ensured similar values for externally-triggered skips (mean of SD across trials 3.15 s, SD = 1.43 s). In the externally-triggered condition, the average reaction time to the fixation cross change was 699 ms (SD = 67 ms). On average participants earned £2.14 (SD = £0.33) from skipping and £2.78 (SD = £0.99) from correctly responding to dot motion stimuli. This reward supplemented a fixed fee for participation. The mean and distribution of waiting time before skip action of each participant are presented in Table S1 and Figure S1.
EEG variability decreases disproportionately prior to action in self-initiated and externally-triggered conditions
EEG data were pre-processed and averaged separately for self-initiated and externally-triggered conditions. Figure 2A shows the grand average RP amplitude in both conditions. The mean RP for self-initiated actions showed the familiar negative-going ramp. Note that our choice to baseline-correct at the time of the action itself (see materials and methods) means that the RP never in fact reaches negative voltage values. This negative-going potential is absent from externally-triggered skip actions (Jahanshahi et al., 1995; Papa, Artieda, & Obeso, 1991). The morphology of the mean RP might simply reflect the average of stochastic fluctuations, rather than a goal-directed build-up. However, these theories offer differing interpretations of the variability of individual EEG trajectories across trials (see intro).
To investigate this distribution we computed the standard deviation of individual trial EEG, and found a marked decrease prior to self-initiated skip action. This decrease is partly an artefact of the analysis technique: individual EEG epochs were time-locked and baseline-corrected at action onset, making the across-trial standard deviation at the time of action necessarily zero (but see Figure S2). However, this premovement drop in EEG standard deviation was more marked for self-initiated than for externally-triggered skip actions, although the analysis techniques were identical. Paired-samples t-tests on jack-knifed data showed that this difference in SD was significant in the last three of the four pre-movement time bins before skip actions (see materials and methods; p values are Bonferroni corrected for four comparisons): that is, from -1.5 to -1 s (t(21) = 4.32, p < 0.01, dz = 0.92), -1 to -0.5 s (t(21) = 5.97, p < 0.01, dz = 1.27), and -0.5 to 0 s (t(21) = 5.39, p < 0.01, dz = 1.15).
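The across-trial SD measure, and the reason it is forced to zero at action onset, can be sketched as follows. This is a minimal illustration rather than the analysis pipeline used here: it assumes epochs stored as a trials-by-samples array whose last sample is action onset, a single-sample baseline at that point, and equal-width time bins. The function names are hypothetical, and the jack-knife step is omitted.

```python
import numpy as np

def across_trial_sd(epochs):
    """SD across trials at each time point.

    epochs: array of shape (n_trials, n_samples); the last sample is
    assumed to be action onset and is used as the baseline (assumption).
    """
    corrected = epochs - epochs[:, -1:]      # baseline-correct at action onset
    return corrected.std(axis=0, ddof=1)

def bin_means(values, n_bins=4):
    """Average a time course within n_bins consecutive, near-equal bins."""
    return np.array([chunk.mean() for chunk in np.array_split(values, n_bins)])

# Synthetic example: random-walk "trials" standing in for autocorrelated EEG.
rng = np.random.default_rng(0)
fake_epochs = rng.standard_normal((50, 2000)).cumsum(axis=1)
sd = across_trial_sd(fake_epochs)
bins = bin_means(sd)                         # e.g. four pre-movement bins
```

Even for pure noise, `sd` shrinks towards the baseline sample and is exactly zero there, which is why the comparison between conditions, rather than the drop itself, carries the information.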
To mitigate any effects of arbitrary selection of electrodes or time-bins, we also performed cluster-based permutation tests (see materials and methods). For the comparison between SDs prior to self-initiated vs externally-triggered skip actions, a significant cluster (p < 0.01) was identified extending from 1488 to 80 ms premovement (Figure 2B, see also Figure S2 for a different baseline). This suggests that neural activity gradually converges towards an increasingly reliable pattern prior to self-initiated actions. Importantly, this effect is not specific to FCz but could be observed over a wide cluster above central electrodes (Figure S3). However, the bilateral skip response used here makes the dataset suboptimal for thoroughly exploring the fine spatial topography of these potentials, which we hope to address in future research.
It has been shown that stimulus anticipation is preceded by a cortical negative wave, the contingent negative variation (CNV) (Walter, Cooper, Aldridge, McCallum, & Winter, 1964). The CNV has been associated with expectation and temporal processing (Casini & Vidal, 2011; Van Rijn, Kononowicz, Meck, Ng, & Penney, 2011). Hence, our measures in the self-initiated condition could reflect both accumulating conditions that make a skip action desirable (e.g., passage of time without dot motion onset), and the preparation of the skip action itself. However, our externally-triggered skip condition controls for effects of mere passage of time, and expectation of dot motion onset.
To ensure that the key cognitive factors in the task were balanced between self-initiated and externally-triggered conditions, we also analysed mean and SD EEG amplitude prior to stimulus-triggered responses to coherent dot motion (as opposed to skip responses). We did not observe any negative-going potential prior to coherent dot motion (Figure 3A), again suggesting that temporal expectation did not strongly contribute to our ERPs. More importantly, the SD of EEG prior to coherent dot motion onset did not differ between conditions in any time window (p > 0.5, Bonferroni corrected for four comparisons) (Figure 3B). This suggests that the disproportionate drop in SD prior to skip actions cannot be explained merely by a difference in expectation of dot stimuli or temporal processing. Moreover, an explanation based on temporal processing would presumably predict stronger EEG convergence when participants wait longer before skip action. In fact, we found a negative correlation between EEG convergence and waiting time (Figure S4).
Finally, variability in the reaction time to respond to externally-triggered skip cues could potentially smear out stimulus-driven preparation of skip actions. Such jitter in RT would have the artefactual effect of increasing EEG variability across trials. To rule out this possibility we checked whether across-trial EEG convergence was correlated across participants with variability in behavioural reaction time to the skip response cue, but found no significant correlation between the two variables. This suggests that the difference in EEG convergence between self-initiated and externally-triggered skip conditions could not be explained by mere variability in RT to skip cues (Figure S5).
Modelling the converging EEG distribution of self-initiated actions
Leaky stochastic accumulator models have been used previously to explain the neural decision of ‘when’ to move in a self-initiated task (Schurger et al., 2012). A general imperative to perform the task shifts the premotor activity up closer to threshold and then a random threshold-crossing event provides the proximate cause of action. Hence, the precise time of action is driven by accumulated internal physiological noise, and could therefore be viewed as random, rather than decided (Schurger et al., 2012). However, the across-trial variability of cortical potentials in our dataset suggests that neural activity converges on a fixed pattern prior to self-initiated actions, to a greater extent than for externally-triggered actions. This differential convergence could reflect a between-condition difference in the autocorrelation function of the EEG. The early and sustained additional reduction in SD before self-initiated actions motivated us to hypothesise an additional process of noise control associated with self-initiated actions.
Sensitivity analysis
To investigate this hypothesis we first performed a sensitivity analysis by investigating how changing key parameters of the model could influence across-trial variability of the output (for details see materials and methods). We modelled the hypothesised process of noise control by allowing a gradual change in noise (Δc) prior to action. We also explored how changes in the key drift (I) and leak (k) parameters would influence the trial-to-trial variability of RP. We gradually changed each parameter while holding the others fixed, and simulated RP amplitude in 1000 trials time locked to a threshold-crossing event. SD was then measured across these simulated trials. Simulated across-trial SDs showed that lower drift rates and shorter leak constants were associated with a higher across-trial SD. Conversely, reductions in noise were associated with a lower across-trial SD (Figure 4A-C). Thus, for the model to reproduce the differential EEG convergence found in our EEG data, either the drift or the leak should be higher, or the change in noise parameter should be lower, in self-initiated compared to externally-triggered skip action conditions.
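A sensitivity analysis of this kind can be sketched as below. The sketch makes several assumptions that may differ from the procedure described in materials and methods: the accumulator has the form dx = (I − k·x)·dt + c(t)·√dt·ξ; the noise scaling c(t) ramps linearly from c to c + Δc over the whole simulated trial (rather than being locked to the as-yet-unknown crossing time); and the threshold, trial length, epoch length and default parameter values are illustrative. The function name `locked_sd` is hypothetical.

```python
import numpy as np

def locked_sd(drift, leak, noise, delta_noise=0.0, threshold=0.3,
              n_trials=300, trial_s=10.0, epoch_s=1.0, dt=0.005, seed=0):
    """Across-trial SD of simulated output, time-locked to first crossing.

    ASSUMPTION: noise scaling changes linearly from `noise` to
    `noise + delta_noise` across the simulated trial.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(trial_s / dt))
    n_epoch = int(round(epoch_s / dt))
    c_t = np.clip(noise + delta_noise * np.linspace(0.0, 1.0, n_steps), 0.0, None)
    x = np.zeros((n_trials, n_steps))
    for t in range(1, n_steps):
        xi = rng.standard_normal(n_trials)
        x[:, t] = (x[:, t - 1]
                   + (drift - leak * x[:, t - 1]) * dt
                   + c_t[t] * np.sqrt(dt) * xi)
    crossed = x >= threshold
    first = crossed.argmax(axis=1)
    keep = np.flatnonzero(crossed.any(axis=1) & (first >= n_epoch - 1))
    epochs = np.stack([x[i, first[i] - n_epoch + 1: first[i] + 1] for i in keep])
    epochs = epochs - epochs[:, -1:]         # baseline at the crossing sample
    return epochs.std(axis=0, ddof=1)

# Vary one parameter while holding the others fixed.
baseline = locked_sd(drift=0.11, leak=0.5, noise=0.1)
less_noise = locked_sd(drift=0.11, leak=0.5, noise=0.1, delta_noise=-0.03)
```

The resulting SD curves can then be compared or plotted for each parameter in turn; the direction and size of any difference will depend on the assumed ramp and parameter values.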
Model fitting and optimal parameters
We next fitted the model on the mean RP amplitude of each participant, separately for the self-initiated and externally triggered conditions (Table S2, S3). The best fitting parameters were then compared between the two conditions. The drift was significantly lower (t(21) = - 4.47, p < 0.001, after Bonferroni correction for the three parameters tested) in the self-initiated (mean across participants = 0.09, SD = 0.03) compared to the externally-triggered condition (mean across participants = 0.13, SD = 0.04) (Figure 4D). The leak was also significantly lower (t(21) = -4.20, p < 0.001, Bonferroni corrected) in the self-initiated (mean across participants = 0.40, SD = 0.21) compared to the externally-triggered condition (mean across participants = 0.58, SD = 0.20) (Figure 4E). The change in noise was negative in the self-initiated (mean across participants = -0.03, SD = 0.05) but positive in the externally-triggered condition (mean across participants = 0.02, SD = 0.05). This difference was significant between the conditions (t(21) = -5.38, p < 0.001, Bonferroni corrected) (Figure 4F). Finally, to investigate which parameters were most sensitive to the difference between self-initiated and externally-triggered conditions, we expressed the effect of condition on each parameter as an effect size (standardized mean difference, Cohen’s dz). Importantly, the effect size for the between-condition difference in the change in noise parameter (dz = 1.15, 95%CI = [0.60 1.68]) was larger than that for the drift (dz = 0.95, 95%CI = [0.44 1.45]) or the leak (dz = 0.89, 95%CI = [0.39 1.38]) parameters (Figure 4G).
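For readers who wish to check the relation between the reported test statistics and effect sizes, the following sketch computes a paired-samples t statistic and Cohen’s dz, using the standard definitions dz = mean(d) / SD(d) and t = dz·√n for paired differences d. The function names and the toy data are illustrative assumptions.

```python
import math

def paired_stats(a, b):
    """Return (t, dz) for paired samples a and b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    dz = mean_d / sd_d                # standardized mean difference
    t = dz * math.sqrt(n)             # paired t statistic, df = n - 1
    return t, dz

def dz_from_t(t, n):
    """Recover Cohen's dz from a paired t statistic and sample size."""
    return t / math.sqrt(n)

t_toy, dz_toy = paired_stats([1, 2, 3, 4], [0, 0, 1, 1])   # toy data only
dz_noise = dz_from_t(5.38, 22)        # magnitude of the reported t, n = 22
```

With n = 22, a t statistic of magnitude 5.38 corresponds to dz ≈ 1.15, in line with the value reported for the change in noise parameter.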
So far, we fitted model parameters to the mean RP amplitude, and noted through separate sensitivity analysis their implications for across-trial SD. Next, we directly predicted the drop in across-trial SD of simulated RP data in self-initiated compared to externally-triggered conditions, using the optimal model parameters for each participant in each condition. We therefore simulated 22 RP data sets, using each participant’s best fitting parameters in each condition (see materials and methods), and computed the SD across the simulated trials. We observed a marked additional drop in simulated across-trial SD in the self-initiated compared to externally-triggered condition (Figure 5A, B). The differential convergence between conditions in the simulated data closely tracked the differential convergence in our EEG data (Correlation across participants, Pearson’s r = 0.90, p < 0.001) (Figure 5C).
Within-trial reduction in EEG variability
Optimum parameter values from the model suggest that a consistent process of noise reduction reliably occurs prior to self-initiated actions. This theory predicts that, compared to externally-triggered actions, EEG variability should reduce more strongly not only across trials but also within each single self-initiated action trial. To test this prediction we measured SD within a 100 ms sliding window for each trial, and each condition (see materials and methods) (Figure 6A). We then used linear regression to calculate the slope of the within-trial SD change for each trial, and compared slopes between the self-initiated and externally-triggered conditions using a multilevel model with single trials as level 1 and participants as level 2 variables. While EEG variability decreased within self-initiated skip trials (mean slope = -0.01 μV/sample, SD across participants = 0.02 μV), it increased within externally-triggered trials (mean slope = 0.01 μV/sample, SD across participants = 0.02 μV). The between-condition difference in slopes was highly significant (t(4102) = 3.39, p < 0.001; Figure 6B), consistent with a progressive reduction of EEG variability prior to self-initiated actions.
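The within-trial measure can be sketched as a sliding-window SD followed by a per-trial linear fit. The window length in samples depends on the sampling rate, which is not restated here, so the 25-sample window below (100 ms at an assumed 250 Hz) is purely illustrative, as are the function names and the synthetic trials. The multilevel comparison of slopes is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def sliding_sd(trial, win):
    """SD within each length-`win` window, advanced one sample at a time."""
    return sliding_window_view(trial, win).std(axis=1, ddof=1)

def sd_slope(trial, win):
    """Slope (signal units per sample) of the within-trial SD time course."""
    sd = sliding_sd(trial, win)
    slope, _intercept = np.polyfit(np.arange(sd.size), sd, 1)
    return slope

# Synthetic trials: noise whose amplitude shrinks or grows over the epoch.
rng = np.random.default_rng(1)
n = 500
quieting = rng.standard_normal(n) * np.linspace(2.0, 0.2, n)
louder = rng.standard_normal(n) * np.linspace(0.2, 2.0, n)
slope_quieting = sd_slope(quieting, win=25)
slope_louder = sd_slope(louder, win=25)
```

A negative slope indicates that fluctuations become smaller as the action approaches; the per-trial slopes would then enter the between-condition comparison.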
Previous discussions of amplitude variation in EEG focussed on synchronised activity within specific spectral bands (Pfurtscheller & Neuper, 1994). Preparatory decrease in beta-band power has been used as a reliable biomarker of voluntary action (Kristeva, Patino, & Omlor, 2007). While time-series methods identify activity that is phase-locked, spectral methods identify EEG power that is both phase-locked and non-phase-locked, within each specific frequency band (Cohen, 2014; Pfurtscheller & Lopes da Silva, 1999). Since motor threshold models simply accumulate all neural activity, whether stochastic or synchronised, we reasoned that reduction in the noise scaling factor within an accumulator model might be associated with reduction in the synchronised activity. We therefore also investigated the decreasing variability of neural activity prior to self-initiated action using spectral methods (see materials and methods). Specifically, we focused on the event-related desynchronization (ERD) of beta band activity (Bai et al., 2011; Calmels et al., 2006; Stancák & Pfurtscheller, 1996). We compared ERD between the self-initiated and externally-triggered conditions in a 500 ms window (1 to 0.5 s prior to action, based on previous reports (Tzagarakis, Ince, Leuthold, & Pellizzer, 2010)). Beta power in this period decreased prior to self-initiated skip actions (mean percentage change = -9.3%, SD = 7.4%) (Figure 7A), but not before externally-triggered skip actions (mean percentage change = -0.6%, SD = 6.9%) (Figure 7B). Importantly, percentage change in beta power was significantly different between the two conditions (t(21) = -4.16, p < 0.001) (Figure 7C).
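The percentage change in band power can be sketched using the conventional ERD definition, 100 × (A − R) / R, where A is power in the window of interest and R is power in a reference period. The beta-band limits (15 to 30 Hz), the sampling rate, the simple FFT-based power estimate and the choice of reference period are all assumptions made for illustration; the time-frequency method actually used is described in materials and methods.

```python
import numpy as np

def band_power(segment, fs, f_lo=15.0, f_hi=30.0):
    """Summed FFT power of a 1-D segment within [f_lo, f_hi] Hz."""
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segment)) ** 2
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return power[band].sum()

def percent_change(test_power, reference_power):
    """ERD-style percentage change relative to a reference period."""
    return 100.0 * (test_power - reference_power) / reference_power

# Synthetic check: a 20 Hz oscillation whose amplitude halves.
fs = 250                                    # assumed sampling rate (Hz)
t = np.arange(125) / fs                     # 500 ms window
reference = np.sin(2 * np.pi * 20 * t)
test = 0.5 * reference
change = percent_change(band_power(test, fs), band_power(reference, fs))
```

Halving the amplitude quarters the power, so `change` is -75% in this synthetic case. With real data, power would typically be estimated per trial and averaged before the percentage change is computed.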
Discussion
The capacity for endogenous voluntary action lies at the heart of human nature, but the brain mechanisms that enable this capacity remain unclear. A key research bottleneck has been the lack of convincing experimental paradigms for studying volition. Many existing paradigms rely on paradoxical instructions equivalent to “be voluntary” or “act freely” (Haggard, 2005; Libet et al., 1983). In a novel paradigm, we operationalized self-initiated actions as endogenous ‘skip’ responses embedded in a perceptual decision task, with a long, random foreperiod. Participants could decide to skip waiting for an imperative stimulus, by endogenously initiating a bilateral keypress. Although previous studies in animals also used ‘Giving up waiting’ to study spontaneous action decisions (Murakami et al., 2014), we believe this is the first use of this approach to study self-initiated actions in humans.
Our experimental task provided a purely operational definition of self-initiated, endogenous skip responses. Thus, the skip action can be understood from a strict behaviourist perspective. However, skip responses have many of the hallmarks of volition traditionally used in philosophy, including internal-generation (Passingham et al., 2010), reasons-responsiveness (Anscombe, 2000), freedom from immediacy (Shadlen & Gold, 2004), and a clear counterfactual alternative (Pereboom, 2011). Crucially, operationalising self-initiated voluntary action in this way avoids explicit instructions to “act freely”, and avoids subjective reports about “volition”. We compared such actions to an exogenous skip response triggered by a visual cue in control blocks.
The neural activity that generates self-initiated voluntary actions remains controversial. Several theories attribute a key role to medial frontal regions (Krieghoff, Waszak, Prinz, & Brass, 2011; Nachev, Kennard, & Husain, 2008; Passingham, 1995). Averaged scalp EEG in humans revealed a rising negativity beginning 1 s or more before the onset of endogenous actions (Kornhuber & Deecke, 1965), and appearing to originate in medial frontal cortex (Boschert, Hink, & Deecke, 1983; Deecke & Kornhuber, 1978). Since this ‘readiness potential’ does not occur before involuntary or externally-triggered movements, it has been interpreted as the electro-physiological sign of planning, preparation, and initiation of self-initiated actions (Keller & Heckhausen, 1990; Kornhuber & Deecke, 1990). RP-like brain activities preceding self-initiated actions were also reported at the single-neuron level (Fried, Mukamel, & Kreiman, 2011). However, the view of the RP as a causal signal for voluntary action has been challenged, because simply averaging random neural fluctuations that trigger a motor action also produces RP-like patterns (Schurger et al., 2012). Such stochastic accumulator models were subsequently used to predict humans’ (Schurger et al., 2012) and rats’ self-initiated actions in a task similar to ours (Murakami et al., 2014). Thus, it remains highly controversial whether the RP results from a specific precursor process that prepares self-initiated actions, or from random intrinsic fluctuations. We combined an experimental design that provides a clear operational definition of volition, and an analysis of distribution across and within individual trials of pre-movement EEG. Our results support a novel combination of both the classical and the stochastic views. We report the novel finding that self-initiated movements are reliably preceded by a process of noise reduction. This process alters the pattern of stochastic fluctuations that accumulate towards the motor threshold.
EEG showed decreased trial-to-trial variability prior to skip actions. This partly reflects the time-locking and baseline-correction at the time of action: ERP methods necessarily imply zero variability at the baseline (Luck, 2005). However, around 1.5 s prior to skip actions, the decrease in variability became more marked for self-initiated compared to control externally-triggered skip actions. Since the skip action in the externally-triggered control condition has no endogenous volitional component, the decrease in variability prior to skip actions in the control condition presumably reflects only the effects of time-locking, and the temporal autocorrelation of the background EEG. However, the additional decrease in variability prior to self-initiated action may reflect convergence of neural activity towards a steady trajectory that precedes self-initiated actions. We hypothesised that this could indicate a consistent preparatory process leading to self-initiated voluntary action.
Measurement of inter-trial variability has been extensively used in the analysis of neural data (Averbeck & Lee, 2003; Churchland et al., 2011; Churchland et al., 2010, 2006; He, 2013; Saberi-Moghadam, Ferrari-Toniolo, Ferraina, Caminiti, & Battaglia-Mayer, 2016; Schurger et al., 2015). For example, presenting a target stimulus decreases inter-trial variability of neural firing rate in premotor cortex (Churchland et al., 2006). Interestingly, RTs to external stimuli are shortest when variability is lowest, suggesting that a decrease in neural variability is a marker of motor preparation (Churchland et al., 2006). Moreover, reducing neural variability is characteristic of cortical responses to any external stimulus (Churchland et al., 2010), and could be a reliable signature of conscious perception (Schurger et al., 2015). Importantly, in previous studies, the decline in neural variability was triggered by a target stimulus, i.e. decreasing neural variability was triggered exogenously (Churchland et al., 2010). Our results show that inter-trial variability also decreases prior to a self-initiated action, in the absence of any external target.
Integration-to-bound models have recently been used to account for the neural activity preceding self-initiated actions in humans (Schurger et al., 2012) and rodents (Murakami et al., 2014). In the absence of external evidence, these models are fed solely with internal physiological noise. Importantly, when signal-to-noise ratio is low the timing of the decision to move is mainly determined by random fluctuations. Schurger et al.’s model first shifts premotor activity closer to a motor threshold. This is followed by a threshold-crossing event, triggered by stochastic fluctuations (Schurger et al., 2012). By fitting a modified version of the leaky stochastic accumulator model on each participant’s mean RP amplitude, we observed that integration of internal noise evolves differently prior to self-initiated and externally-triggered skip actions. The drift rate and the leak were lower, and the change in noise was negative, prior to self-initiated actions, compared to externally-triggered actions. Importantly, analysis of across-trial variability of simulated data, using model parameters optimised for each participant, implied that the marked drop in variability that we observed prior to self-initiated action was mainly driven by a gradually reducing noise level. Previous studies show that changes in noise level influence choice, RT and confidence in accumulation-to-bound models of perceptual decision (Fetsch, Kiani, Newsome, & Shadlen, 2014; Furstenberg, Breska, Sompolinsky, & Deouell, 2015; Kiani, Hanks, & Shadlen, 2008; Zylberberg, Fetsch, & Shadlen, 2016). Interestingly, the motivating effects of reward on speed and accuracy of behaviour were recently shown to be attributable to active control of internal noise (Manohar et al., 2015). In general, previous studies show an important role of active noise control in tasks requiring responses to external stimuli (Kool & Botvinick, 2013; Manohar et al., 2015). We have shown that similar processes may underlie self-initiated action, and that a consistent process of noise reduction may be a key precursor of self-initiated voluntary action.
Finally, we showed that a decrease in premotor neural variability prior to self-initiated action is not only observed across trials, but is also realised within trials and as a reduction in beta-band power. Clearly, any natural muscular action must have some precursors. Sherrington’s final common path concept proposed that descending neural commands from primary motor cortex necessarily preceded voluntary action (Sherrington, 1906). However, it remains unclear how long before action such precursor processes can be identified. Our result provides a new method for addressing this question. The question is theoretically important, because cognitive accounts of self-initiated action control divide into two broad classes. In classical accounts, a fixed, and relatively long-lasting precursor process is caused by a prior decision to act (Anscombe, 2000; Kornhuber & Deecke, 1990). In other recent accounts, stochastic fluctuations dominate until a relatively late stage, and fixed precursor processes would be confined to brief, motoric execution processes (Schurger et al., 2012).
Our study cannot show whether self-initiated voluntary actions are caused by prior decisions, or by randomness. However, our results do suggest that the contribution of stochastic fluctuations is supplemented by a precursor process of noise reduction starting from around 1.5 s prior to action. Importantly, and by the same token, our results cannot show whether the precursor process of noise reduction is initiated by some top-down decision, or is itself triggered by some ongoing spontaneous fluctuations. Further, the precursor processes that our method identifies may be necessary for self-initiated action, but may not be sufficient: identifying a precursor process prior to self-initiated movement says nothing about whether and how often such a process might also be present in the absence of movement. On one view, the precursor process might occur quite frequently, but a last-minute decision might influence whether the precursor process completes with a movement, or is vetoed. Our movement-locked analyses cannot identify any putative vetoed precursor processes, or precursor-like processes that failed to result in a movement. However, our spectral analyses (Figure 7) make this possibility unlikely. They show a gradual decline in total beta-band power beginning around 1 s prior to self-initiated action. Any putative vetoed precursor processes would produce partial versions of this effect at other time-points in the epoch, but these are not readily apparent.
Inter-trial variability provides an additional dimension to information coding in the brain (He, 2013; Schurger et al., 2015; Stein, Gossen, & Jones, 2005). We showed that both inter- and within-trial variability decrease prior to a self-initiated action, akin to a reliable preparatory process. Our computational modelling further suggests that this preparatory stabilising process may itself reflect some as yet unknown mechanism of noise reduction. Actively regulating noise at optimal levels typically enhances system performance (Faisal, Selen, & Wolpert, 2008; Fitts, 1954; Groen & Wenderoth, 2016; Shu, Hasenstaub, Badoual, Bal, & McCormick, 2003). However, noise regulation can also arise incidentally, as a result of attractor dynamics in the motor system, as in the “optimal subspace hypothesis” (Shenoy, Sahani, & Churchland, 2013). Further, the noise reduction mechanism in our data could itself be triggered either by a specific top-down signal or by a stochastic event. Finally, we speculate that the process of noise reduction not only explains the reduction in inter- and within-trial EEG variability prior to self-initiated action, but also generates the slow rising negativity of the RP, and the well-known beta ERD that precedes voluntary actions. ERD represents an activation in cortical areas that produce motor behaviour (Pfurtscheller, 1992; Pfurtscheller & Lopes da Silva, 1999). Factors such as effort and attention enhance the ERD (Defebvre, Bourriez, Destée, & Guieu, 1996). However, the relation between ERD and RP has remained unclear. Our modelling results suggest that a reliable process of noise reduction could explain both the ERD and the RP.
Interestingly, our endogenous skip response resembles the decision to explore during foraging behaviour (Constantino & Daw, 2015; Kolling, Behrens, Mars, & Rushworth, 2012). That is, endogenous skip responses amounted to deciding to look out for dot-motion stimuli in forthcoming time-periods, rather than the present one. This prompts the speculation that spontaneous transition from rest to foraging or vice-versa could be an early evolutionary antecedent of human volition.
In conclusion, we show that self-initiated actions have a reliable precursor, namely a consistent process of neural noise reduction prior to movement. We began this paper by distinguishing between a classical model, in which a fixed preparation process consistently preceded self-initiated action, and a fully stochastic model, in which the triggering of self-initiated action is essentially random – though the artefact of working with movement-locked epochs might give the appearance of a consistent precursor event such as the RP. We have identified a reliable precursor process, but this precursor process can be accommodated as a parameter change within the stochastic model framework. Future research might usefully investigate whether the precursor process is the cause or the consequence of the subjective ‘decision to act’.
Materials and Methods
Participants
Twenty-four healthy volunteers aged 18-35 years (9 male, mean age = 23 years) were recruited from the Institute of Cognitive Neuroscience subject data pool. Two participants were excluded before data analysis (they provided insufficient EEG data because of excessive blinking). All participants were right-handed, had normal or corrected-to-normal vision, and had no history or family history of seizure, epilepsy or any neurological or psychiatric disorder. Participants affirmed that they had not participated in any brain stimulation experiment in the last 48 h, and had not consumed alcohol in the last 24 h. Participants were paid an institution-approved amount for participating in the experiment. Experimental design and procedure were approved by the UCL research ethics committee, and followed the principles of the Declaration of Helsinki.
Behavioural task and procedure
Participants were seated in an electrically shielded chamber, 55 cm in front of a computer screen (60 Hz refresh rate). After participants had signed the consent form, the experimental procedure was explained and the EEG cap was set up. The behavioural task was as follows: participants were instructed to look at a fixation cross in the middle of the screen. The colour of the fixation cross changed slowly and continuously throughout the trial. It always started from ‘black’ and then changed gradually to other colours in a randomised order, with each change (e.g., from green to pink) taking 2.57 s. At the same time, participants waited for a display of randomly moving dots (displayed within a circular aperture of 7° diameter with a density of 14.28 dots/degree, initially moving with 0% coherence at a speed of 2°/s; Desantis, Waszak, & Gorea, 2016; Desantis, Waszak, Moutsopoulou, & Haggard, 2016) to move coherently (step change to 100% coherence) towards the left or right. They responded with the left or right hand by pressing a left or right arrow key on a keyboard, accordingly. The change in dot motion coherence happened abruptly. Correct responses were rewarded (2p). Conversely, participants lost money (-1p) for giving a wrong answer (responding with the left hand when dots were moving to the right, or vice versa), for responding before the dots started moving coherently, or for not responding within 2 s after the onset of coherent motion. The trial was interrupted while such error feedback was given. Importantly, the time of coherent movement onset was drawn unpredictably from an exponential distribution (min = 2 s, max = 60 s, mean = 12 s), so waiting was sometimes extremely long. However, this wait could be avoided by a ‘skip’ response (see later).
Participants could lose time by waiting, but receive a big reward (2p) if they responded correctly, or could save time by ‘skipping’ but collect a smaller reward (1p) (Fig. 1A). The experiment was limited to one hour, so using the skip response required a general understanding of the trade-off between time and money. Participants were carefully informed in advance of the rewards for responses to dot motion and for skip responses, and were clearly informed that the experiment had a fixed duration of one hour.
There were two blocked conditions, which differed only in the origin of the skip response. In the ‘self-initiated’ condition blocks, participants could skip waiting if they chose to, by pressing the left and right response keys simultaneously. The skip response saved time, but produced a smaller reward (1p) than a response to dot motion. Each block consisted of 10 trials. To ensure consistent visual attention, participants were required to monitor the colour of the fixation cross, which cycled through an unpredictable sequence of colours. At the end of each block they were asked to classify the number of times the fixation cross turned ‘yellow’, according to the following categories: never, less than 50%, 50%, more than 50%. They lost money (-1p) for giving a wrong answer. At the end of each block, participants received feedback of total reward values, total elapsed time, and number of skips. They could use this feedback to adjust their behaviour and maximise earnings, by regulating the number of endogenous ‘skip’ responses.
In the ‘externally-triggered’ condition blocks, participants could not choose for themselves when to skip. Instead, they were instructed to make skip responses by an external signal. The external signal was an unpredictable change in the colour of the fixation cross to ‘red’ (Fig. 1B). Participants were instructed to make the skip response as soon as they detected the change. The time of the red colour appearance was yoked to the times of the participant’s own skip responses in the immediately preceding self-initiated block, in a randomised order. For participants who started with the externally-triggered block, the timing of the red colour appearance in the first block only was yoked to the skip times in the previous participant’s last self-initiated block. The colour cycle of the fixation cross had a random sequence, so that the onset of a red fixation cross could not be predicted. The fixation cross ramped to ‘red’ from its previous colour in 300 ms. Again, a small reward (1p) was given for skipping. The trial finished and the participant lost money (-1p) if s/he did not skip within 2.5 s from the beginning of the colour ramp. The ‘red’ colour was left out of the colour cycle in the self-initiated blocks. To control for any confounding effect of attending to the fixation cross, participants were also required to attend to the fixation cross in the self-initiated blocks and to roughly estimate the number of times the fixation cross turned ‘yellow’ (see above). Each externally-triggered block had 10 trials, and feedback was displayed after each block. Each self-initiated block was interleaved with an externally-triggered block, and the order of the blocks was counterbalanced between participants. The behavioural task was implemented in Psychophysics Toolbox Version 3 (Brainard, 1997).
EEG recording
While participants were performing the behavioural task in a shielded chamber, EEG signals were recorded and amplified using an ActiveTwo Biosemi system (BioSemi, Amsterdam, The Netherlands). Participants wore a 64-channel EEG cap. To shorten the preparation time we recorded from a subset of electrodes that mainly covers central and visual areas: F3, Fz, F4, FC1, FCz, FC2, C3, C1, Cz, C2, C4, CP1, CPz, CP2, P3, Pz, P4, O1, Oz, O2. Bipolar channels placed on the outer canthi of each eye and below and above the right eye were used to record horizontal and vertical electro-oculogram (EOG), respectively. The Biosemi Active electrode has an output impedance of less than 1 Ohm. EEG signals were recorded at a sampling rate of 2048 Hz.
EEG preprocessing
EEG data preprocessing was performed in Matlab (MathWorks, MA, USA) with the help of the EEGLAB toolbox (Delorme & Makeig, 2004). Data were downsampled to 250 Hz and low-pass filtered at 30 Hz. No high-pass filtering and no detrending were applied, to preserve slow fluctuations. All electrodes were referenced to the average of both mastoid electrodes. Separate data epochs of 4 s duration were extracted for self-initiated and externally-triggered skip actions. Each epoch ran from 3 s before to 1 s after the action. To avoid overlapping EEG epochs, any trial in which participants skipped earlier than 3 s after trial initiation was removed. On average, 5% and 4% of trials were removed from the self-initiated and externally-triggered conditions, respectively.
RP recordings are conventionally baseline-corrected using a baseline from 2.5 to 2 s before action. This involves the implicit assumption that RPs begin only in the 2 s before action onset (Shibasaki & Hallett, 2006), but this assumption is rarely articulated explicitly, and is in fact questionable (Verbaarschot, Farquhar, & Haselager, 2015). We instead took a baseline from -5 ms to +5 ms with respect to action onset. This choice avoids making any assumption about how or when the RP starts. To ensure this choice of baseline did not capitalize on chance, we performed parallel analyses on demeaned data (effectively taking the entire epoch as baseline), with consistent results (see Figure S2). Finally, to reject non-ocular artefacts, data epochs from EEG channels (not including EOG) with values exceeding a threshold of ±150 μV were removed. On average 7% and 8% of trials were rejected from the self-initiated and externally-triggered conditions, respectively. In the next step, Independent Component Analysis (ICA) was used to remove ocular artefacts from the data. Ocular ICA components were identified by visual inspection. Trials with artefacts remaining after this procedure were excluded by visual inspection.
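The two baselining schemes can be illustrated with a minimal sketch. It is written in Python purely for illustration (the analysis itself was run in Matlab with EEGLAB); the function names, the representation of an epoch as a plain list of samples, and the rounding of window edges to the nearest sample are assumptions of the sketch, not details taken from the original pipeline.

```python
from statistics import fmean

FS = 250            # sampling rate after downsampling (Hz)
EPOCH_START = -3.0  # epochs begin 3 s before action onset (s)


def time_to_index(t, fs=FS, start=EPOCH_START):
    """Convert a time relative to action onset (s) to the nearest sample index."""
    return round((t - start) * fs)


def baseline_correct(epoch, t0=-0.005, t1=0.005):
    """Subtract the mean of the samples between t0 and t1 (s, relative to action)."""
    i0, i1 = time_to_index(t0), time_to_index(t1)
    base = fmean(epoch[i0:i1 + 1])
    return [v - base for v in epoch]


def demean(epoch):
    """Subtract the mean of the whole epoch (entire epoch taken as baseline)."""
    m = fmean(epoch)
    return [v - m for v in epoch]
```

At 250 Hz the ±5 ms window spans the three samples at -4, 0 and +4 ms, so the peri-action baseline is effectively the mean of the sample at action onset and its two neighbours.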
EEG analysis
Preliminary inspection showed a typical RP-shaped negative-going slow component that was generally maximal at FCz. Therefore, data from FCz were chosen for subsequent analysis. Time series analysis was performed in Matlab (MathWorks) with the help of the FieldTrip toolbox (Oostenveld et al., 2010). We measured two dependent variables as precursors of both self-initiated and externally-triggered skip actions: mean RP amplitude across trials, and variability of RP amplitudes across and within trials (measured by SD). To compare across-trial SD between the two conditions, data epochs were divided into four 500 ms windows, starting 2 s before action onset: [-2, -1.5 s], [-1.5, -1 s], [-1, -0.5 s], [-0.5, 0 s]. All p-values were Bonferroni corrected for four comparisons. To get a precise estimate of the standard error of the difference between conditions, paired-samples t-tests were performed on jack-knifed data (Efron & Stein, 1981; Kiesel, Miller, Jolicoeur, & Brisson, 2008). Unlike traditional methods, this technique compares variation of interest across subsets of the total sample rather than across individuals, by temporarily leaving each subject out of the calculation. In addition, we also performed cluster-based permutation tests on SD (Maris & Oostenveld, 2007). These involve a priori identification of a set of electrodes and a time-window of interest, and incorporate appropriate corrections for multiple comparisons. Importantly, they avoid further arbitrary assumptions associated with selecting specific sub-elements of the data of interest, such as individual electrodes, time-bins or ERP components. The cluster-based tests were performed using the following parameters: time interval = [-2, 0 s] relative to action, minimum number of neighbouring electrodes required = 2, number of draws from the permutation distribution = 1000.
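The logic of the jack-knifed paired t-test can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the analysis code: the function names are hypothetical, and the `measure` argument stands in for whatever statistic is computed on each leave-one-out subsample (in the present study, the SD within a time window). Because leave-one-out scores vary much less than individual scores, the resulting t value is divided by (n - 1), following the correction described by Kiesel et al. (2008).

```python
from math import sqrt
from statistics import fmean, stdev


def paired_t(diffs):
    """Ordinary one-sample t statistic on a list of paired differences."""
    n = len(diffs)
    return fmean(diffs) / (stdev(diffs) / sqrt(n))


def jackknife_paired_t(cond_a, cond_b, measure=fmean):
    """Jack-knifed paired t statistic with the (n - 1) correction.

    cond_a, cond_b: one score per participant in each condition.
    measure: statistic computed on each leave-one-out subsample
             (assumed here to be the mean; any scalar statistic works).
    """
    n = len(cond_a)
    diffs = []
    for i in range(n):
        # Temporarily leave participant i out of both conditions.
        rest_a = cond_a[:i] + cond_a[i + 1:]
        rest_b = cond_b[:i] + cond_b[i + 1:]
        diffs.append(measure(rest_a) - measure(rest_b))
    # Leave-one-out scores are artificially homogeneous, so the raw t is inflated.
    return paired_t(diffs) / (n - 1)
```

When the measure is the mean, the corrected jack-knife statistic reduces exactly to the ordinary paired t, which provides a convenient sanity check. The corrected statistic is evaluated against a t distribution with n - 1 degrees of freedom, and the resulting p-values would then be multiplied by four for the Bonferroni correction.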
To measure variability of RP amplitudes within each individual trial, the SD of the EEG signal from FCz was measured across time in a 100 ms window. This window was applied successively in 30 time bins from the beginning of the epoch (3 s prior to action) to the time of action onset. We used linear regression to calculate the slope of the within-trial SD as a function of time (Figure 6A). This was performed separately for each trial and each participant. Slopes greater than 0 indicate that EEG within the 100 ms window becomes more variable with the approach to action onset. Finally, we compared slopes of this within-trial SD measure between self-initiated and externally-triggered conditions in a multilevel model with single trials as level 1 and participants as level 2 variables. Multilevel analysis was performed in R (R Core Team, Vienna, Austria).
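A minimal sketch of the within-trial measure is given below, in Python for illustration only. It assumes non-overlapping 100 ms windows (25 samples at 250 Hz) and regresses SD on bin index, so the slope is expressed per 100 ms bin; the function names are hypothetical and the multilevel model itself is not reproduced.

```python
from statistics import fmean, stdev

FS = 250        # sampling rate of the preprocessed EEG (Hz)
WIN = FS // 10  # 100 ms window = 25 samples
N_BINS = 30     # 30 windows cover the 3 s before action onset


def windowed_sd(trial, win=WIN, n_bins=N_BINS):
    """SD of the signal in successive, non-overlapping windows of one trial."""
    return [stdev(trial[b * win:(b + 1) * win]) for b in range(n_bins)]


def slope(y):
    """Least-squares slope of y regressed on bin index 0, 1, ..., len(y) - 1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = fmean(y)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den
```

Applying `slope(windowed_sd(trial))` to each trial yields one slope per trial, which is the level 1 observation entered into the multilevel comparison; a negative slope means the signal becomes less variable as action onset approaches.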
Time-frequency analysis was performed with custom-written Matlab scripts. The preprocessed EEG time series were decomposed into their time-frequency representation using complex Morlet wavelets at 20 frequencies, ranging linearly from 5 to 30 Hz. The number of wavelet cycles increased from 3 to 7 in the same number of steps used to increase the frequency of the wavelets from 5 to 30 Hz. Power at each trial, each frequency and each time point was measured by convolving the time series with the wavelet and taking the squared magnitude of the resulting complex number. The power at each frequency and each time point was then averaged across trials for each participant. Edge artefacts were removed by discarding the first and last 500 ms of the epoch. The baseline time window was defined as the first 500 ms of the epoch (after removal of edge artefacts: 2.5 - 2 s prior to skip action). Changes in power during action preparation were subsequently expressed as the percentage of change relative to the average power during the baseline time window, across time at a specific frequency. Baseline normalisation was performed by using the following equation:
$$\Delta P(t, f) = 100 \times \frac{P(t, f) - \bar{P}_{\mathrm{baseline}}(f)}{\bar{P}_{\mathrm{baseline}}(f)}$$
where P(t, f) is the trial-averaged power at time t and frequency f, and \bar{P}_{\mathrm{baseline}}(f) is the average power at frequency f during the baseline time window.
Values > 0 indicate that power at a specific frequency (f) and a specific time (t) is higher relative to the average power at the same frequency during the first 500 ms of the epoch. Finally, we asked whether percentage change in power relative to baseline differs between self-initiated and externally-triggered skip conditions in the beta band (15 – 30 Hz). Beta band Event-related Desynchronization (ERD) during action preparation is a well-established phenomenon (Bai, Mari, Vorbach, & Hallett, 2005; Doyle, Yarrow, & Brown, 2005; Pfurtscheller & Lopes da Silva, 1999). Beta power was calculated in a 500 ms window starting from 1 s and ending 0.5 s prior to skip action. We avoided analysing later windows (e.g., 0.5 - 0 s prior to action) to avoid possible contamination from action execution following presentation of the red fixation cross that cued externally-triggered responses. The average normalised power across all pixels within the selected window was then calculated for each participant and compared across conditions using paired-samples t-tests.
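The wavelet power estimate and the baseline normalisation can be illustrated with the following Python sketch. It is not the custom Matlab code used for the analysis. The Gaussian width convention (SD = number of cycles / (2πf)), the 0.5 s wavelet half-width, the absence of wavelet amplitude normalisation and all function names are assumptions made for the example.

```python
import cmath
from math import exp, pi
from statistics import fmean

FS = 250  # sampling rate of the preprocessed EEG (Hz)


def morlet(freq, n_cycles, fs=FS, half_width=0.5):
    """Complex Morlet wavelet: a complex sinusoid under a Gaussian envelope."""
    sigma = n_cycles / (2 * pi * freq)  # Gaussian SD in seconds (assumed convention)
    n_half = int(half_width * fs)
    times = [i / fs for i in range(-n_half, n_half + 1)]
    return [cmath.exp(2j * pi * freq * t) * exp(-t * t / (2 * sigma ** 2))
            for t in times]


def wavelet_power(signal, wavelet):
    """Squared magnitude of the same-length convolution of signal and wavelet."""
    half = len(wavelet) // 2
    n = len(signal)
    power = []
    for i in range(n):
        acc = 0j
        for j, w in enumerate(wavelet):
            k = i + half - j  # signal sample paired with wavelet sample j
            if 0 <= k < n:
                acc += signal[k] * w
        power.append(abs(acc) ** 2)
    return power


def percent_change(power, n_base):
    """Percentage change relative to the mean of the first n_base samples."""
    base = fmean(power[:n_base])
    return [100.0 * (p - base) / base for p in power]
```

In use, `wavelet_power` would be applied to every trial at each of the 20 frequencies, the result averaged across trials, the edge samples discarded, and `percent_change` applied with `n_base` equal to the number of samples in the 500 ms baseline window (125 at 250 Hz).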
Modelling and simulations
All simulations were done in Matlab (MathWorks). We used a modified version of the Leaky Stochastic Accumulator Model (Usher & McClelland, 2001), in which the activity of accumulators increases stochastically over time but is limited by leakage:
$$\delta x_i = (I - k x_i)\,\Delta t + c\,\xi_i \sqrt{\Delta t}$$
where I is the drift rate, k is the leak (exponential decay in x), ξ is Gaussian noise, c is the noise scaling factor, and Δt is the discrete time step (we used Δt = 0.001). This leaky stochastic accumulator has been used previously to model the neural decision of ‘when’ to move in a self-initiated task (Schurger et al., 2012). In that experiment, I was defined as the general imperative to respond (with a constant rate). This imperative, if appropriately small in magnitude, moves the baseline level of activity closer to the threshold, but not over it. Thus, the imperative alone does not trigger action, but does increase the likelihood of a random threshold-crossing event triggering action. In the original model, c was assumed to be constant and was fixed at 0.1. In a departure from the original model, we assumed that the noise scaling factor could change linearly from an initial value of c1 to a final value of c2 during action preparation. Consequently, Δc was defined as the magnitude of change in the noise scaling factor during the trial:
$$\Delta c = c_2 - c_1$$
A negative Δc means that the signal becomes less noisy as it approaches the threshold for action. Therefore, the modified model in our experiment had five free parameters: I, k, c1, c2 and threshold:
$$\delta x_i = (I - k x_i)\,\Delta t + c_t\,\xi_i \sqrt{\Delta t}$$
where c_t is the noise scaling factor at time t. The threshold was expressed as a percentile of the output amplitude over a set of 1000 simulated trials (each of 50,000 time steps). Epochs of simulated data were matched to epochs of actual EEG data by identifying the point of the first threshold-crossing event within each simulated trial and then extracting an epoch from 3000 time steps before to 1000 time steps after the threshold crossing.
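A minimal simulation of the modified accumulator and of the epoch extraction might look as follows. The sketch is in Python rather than Matlab and rests on several assumptions that are not specified in the text: the update rule follows the standard leaky stochastic accumulator form, the noise scaling factor is ramped linearly over the whole simulated trial, the accumulator starts at zero, and the threshold is passed in as an absolute value rather than derived as a percentile. The names `drift` and `leak` correspond to I and k; all function names are hypothetical.

```python
import random
from math import sqrt


def simulate_trial(drift, leak, c1, c2, n_steps, dt=0.001, rng=random):
    """Simulate one trial of a leaky stochastic accumulator.

    The noise scaling factor moves linearly from c1 to c2 across the trial
    (assumption: the ramp spans all n_steps).
    """
    x = 0.0  # assumption: accumulator starts at zero
    trace = []
    for t in range(n_steps):
        c_t = c1 + (c2 - c1) * t / max(n_steps - 1, 1)
        # Euler step: drift minus leak, plus scaled Gaussian noise.
        x += (drift - leak * x) * dt + c_t * rng.gauss(0.0, 1.0) * sqrt(dt)
        trace.append(x)
    return trace


def first_crossing(trace, threshold):
    """Index of the first sample at or above threshold, or None if never reached."""
    for i, v in enumerate(trace):
        if v >= threshold:
            return i
    return None


def extract_epoch(trace, threshold, pre=3000, post=1000):
    """Epoch from `pre` steps before to `post` steps after the first crossing.

    Returns None when there is no crossing or the epoch would run off the trace.
    """
    i = first_crossing(trace, threshold)
    if i is None or i < pre or i + post > len(trace):
        return None
    return trace[i - pre:i + post]
```

With the noise switched off (c1 = c2 = 0) the accumulator settles at drift / leak, which shows why a small imperative raises baseline activity towards the threshold without crossing it; a negative c2 - c1 makes trajectories progressively less noisy over the trial.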
Parameter estimation for self-initiated skip actions was performed by fitting the model to the real mean RP amplitude of each participant in the self-initiated condition. First, 1000 unique trials of Gaussian noise, each 50,000 time steps long, were generated for each participant and were fed into the model. The initial values of the model’s parameters were derived from previous studies (Schurger et al., 2012). The output of the model was then averaged across trials and downsampled to 250 Hz to match the sampling rate of the real EEG data. A least squares approach was used to minimise the root mean squared deviation (RMSD) between the simulated and real mean RP, by adjusting the free parameters of the model for each participant (using the MATLAB ‘fminsearch’ function). Note that this procedure optimised the model parameters to reproduce the mean RP, rather than individual trials.
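The quantity minimised during fitting can be sketched as below, again in Python for illustration. Only the objective is shown, not the optimiser: ‘fminsearch’ implements a Nelder-Mead simplex search, and any equivalent routine would call an objective of this kind. The downsampling by simple decimation (every fourth sample, from 1000 to 250 samples per second, without anti-alias filtering) and the function names are assumptions of the sketch.

```python
from math import sqrt


def mean_across_trials(trials):
    """Sample-by-sample mean of equal-length simulated epochs."""
    n = len(trials)
    return [sum(col) / n for col in zip(*trials)]


def downsample(signal, factor=4):
    """Keep every `factor`-th sample (assumption: plain decimation, no filtering)."""
    return signal[::factor]


def rmsd(simulated, real):
    """Root mean squared deviation between two equal-length waveforms."""
    if len(simulated) != len(real):
        raise ValueError("waveforms must have the same length")
    return sqrt(sum((s - r) ** 2 for s, r in zip(simulated, real)) / len(real))
```

For a candidate parameter set, the objective would be `rmsd(downsample(mean_across_trials(epochs)), real_mean_rp)`, where `epochs` are the simulated threshold-locked epochs and `real_mean_rp` is the participant’s mean RP at 250 Hz (both names are placeholders).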
To fit the model to our externally-triggered skip condition, we fixed the threshold of each participant at their best fitting threshold from the self-initiated condition. We wanted to keep the threshold the same in both conditions so that we could test the effect of changing noise levels for a given threshold. Importantly, we also fixed the value of c1 at its optimal value from the self-initiated condition. By using this strategy, we can ask how the noisiness of the signal changes from its initial value, and we can compare this change in noise between conditions. We additionally performed parallel simulations without the assumption of a common initial noise level, and obtained essentially similar results. Specifically, Δc in the all-parameter-free model (mean = 0.02, SD = 0.06) was similar to Δc in the model with c1 and threshold fixed (mean = 0.02, SD = 0.05). The remaining parameters (I, k, c2) were optimised by minimising the deviation between the simulated mean RP and the real mean RP in the externally-triggered condition.
Finally, we tested the model on the across-trial variability of RP epochs, having fitted the model parameters to the mean RP. All parameters of the model were fixed at each participant’s optimised values for the self-initiated condition, and for the externally-triggered condition respectively. The model was run 44 times (22 participants × 2 conditions) with the appropriate parameters, and 1000 separate trials were generated on each run, each corresponding to a putative RP exemplar. The Gaussian noise element of the model ensured that these 1000 exemplars were non-identical. The standard deviation across trials was calculated from these 1000 simulated RP exemplars, for each participant and each condition. Importantly, this procedure fits the model to each participant’s mean RP amplitude, but then tests the fit on the standard deviation across the 1000 simulated trials. Finally, to assess similarity between the real and predicted SD reduction, the predicted SD in self-initiated and externally-triggered conditions was plotted as a function of time and the area between the two curves was computed. We then compared the area between the SD curves in a 2 s interval prior to self-initiated and externally-triggered actions for all participants’ simulated data and actual data (Figure 5), using Pearson’s correlation.
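The variability test can be summarised in a short Python sketch. It assumes that the area between the two SD curves is a signed trapezoidal integral of their difference (the text does not say whether a signed or absolute area was used), and the function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def across_trial_sd(trials):
    """SD across trials at each time point of equal-length epochs."""
    return [stdev(col) for col in zip(*trials)]


def area_between(sd_a, sd_b, dt):
    """Signed trapezoidal area between two SD curves sampled every dt seconds."""
    diff = [a - b for a, b in zip(sd_a, sd_b)]
    return sum((diff[i] + diff[i + 1]) / 2 * dt for i in range(len(diff) - 1))


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

Computing `area_between` once from each participant’s simulated epochs and once from their real epochs gives two values per participant; `pearson_r` over the 22 pairs then quantifies how well the model, fitted only to the mean RP, predicts the observed difference in variability.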
Author Contributions
Conceptualization, N.K., P.H., A.S., and A.D.; Methodology, N.K., A.S., and A.D.; Formal Analysis, N.K., L.Z., and P.H.; Investigation, N.K. and L.Z.; Writing (original draft), N.K.; Writing (review & editing), P.H. and A.S.; Supervision, P.H. and A.S.
Acknowledgement
This work was supported by the European Research Council Advanced Grant HUMVOL (Grant number: 323943). This work was additionally supported by a grant from The Leverhulme Trust (Ref. RPG-2016-378) to PH. AS was supported by a starting grant from the European Research Council (Grant number 640626). The authors declare no competing financial interest.
Footnotes
NK is currently at Department of Experimental Psychology, University of Oxford, Oxford, UK.
AD is currently at Laboratoire Psychologie de la Perception, Universite Paris Descartes & CNRS (UMR 8242), Paris, France.