Abstract
Metacognition, the ability to know about one's thought process, is self-referential. Here, we studied the brain mechanisms underlying metacognitive inferences in a self-generated behavior. Human participants generated a time interval, and evaluated the signed magnitude of their timing (first- and second-order behavioral judgments, respectively) while being recorded with time-resolved neuroimaging. We show that the first- and second-order judgments relied on the power of beta oscillations (β; 15-40 Hz), while error monitoring subsystems engaged alpha oscillations (α; 8-14 Hz). The spread of an individual’s β power state-space trajectories during timing was indicative of the individual’s metacognitive inference. Our results suggest that network inhibition (β power) instantiates a state variable determining future network trajectory; this naturally provides a code for duration, and metacognitive inferences would consist in reading out this state variable. Altogether, our study describes oscillatory mechanisms for timing, suggesting that temporal metacognition relies on inferential processes of self-generated dynamics.
INTRODUCTION
A central question in cognitive neuroscience is how humans evaluate and self-monitor their cognitive states, a generic ability referred to as metacognition (Fleming & Dolan, 2012). In daily life, self-monitoring and action evaluation typically involve some form of time estimation, from appropriate turn taking during conversations to sophisticated movement timing when creating enjoyable artistic expressions. Surprisingly, however, theories of psychological time do not account for time monitoring abilities despite early work suggesting their implication in metacognition (Miltner et al., 1997). So far, only one observation in animals (Meck et al., 1984) and one behavioral study in humans (Akdogan & Balci, 2017) have reported the ability to self-monitor and to rate confidence in decision-making during time estimation, respectively. We will refer to (temporal) metacognition in humans as the intelligible inference of one's magnitude estimation, measured herein through temporal production. Here, we asked to what extent temporal metacognition can be reliable during time estimation, and which neural indices may dissociate the self-generation of a time interval (first order judgment, FOJ) from the self-assessment of the produced time interval (second order judgment, SOJ).
To address these questions, human participants were asked to produce time intervals and, post factum, to self-estimate the signed error magnitude of their time production, i.e., to evaluate whether they were shorter or longer than the required target duration (Fig. 1). In this temporal production task, participants were required to initiate a time interval by button press followed by a second button press after they considered that 1.45 s had elapsed. Hence, participants had full volitional control over their time production in the absence of any sensory cues. Additionally, participants could solely rely on their self-generated internal dynamics coding for the target duration, both for their time production and for their metacognitive assessment (FOJ and SOJ, respectively). This design made it possible to address the possibility of internal timing or duration being explicitly available to awareness through self-referential metacognition (Block, 1995; Fleming et al., 2012). Because current neuroscientific models posit that internal dynamics in timing are mediated by oscillatory or state-dependent network dynamics (Laje & Buonomano, 2013; Karmarkar & Buonomano, 2007; Buhusi & Meck, 2005; Merchant et al., 2013; Allman et al., 2014; Gu et al., 2015; van Wassenhove, 2016; Bueno et al., 2017), we combined magneto- and electroencephalography (M/EEG) to quantify the dynamics of oscillatory brain responses when participants engaged in self-generated and self-assessed timing.
Using a temporal production task to assess metacognition provided several paradigmatic and conceptual benefits. First, the requirements of a temporal production task may capitalize on forward-inverse models posited in motor execution (Miall & Wolpert, 1998): here, the key variable for the motor goal would be the timing of the second button press given when the first button press was made. This internal state variable could be used forward in the elicitation of a ballistic motor trajectory for the execution of the second button press, thereby allowing the minimization of execution errors (Harris & Wolpert, 1998) in time. Second, the availability of an internal variable coding for the target duration could provide useful information regarding the uncertainty or noise level of the motor timing goal (Harris & Wolpert, 1998), which, in turn, could be utilized for metacognitive assessment. In this context, a parsimonious hypothesis was that the neural markers of the internal timing variable would support both FOJ and SOJ. In agreement with a two-staged process in time estimation (van Wassenhove, 2009), an internal variable coding for duration would be set forth in the brain, and decoding the current-state (via reverse-inference) would help establish temporal meta-representations. Our experimental approach also differed from previous metacognitive tasks in at least two important ways. A majority of studies investigating second order metacognitive inferences have focused on the role of sensory evidence in driving confidence, and used rewards that were contingent on sensory stimuli being smaller or larger than an imposed categorical boundary (Bogacz et al., 2006; Gold & Shadlen, 2007). Here, the task fully relied on internal variables in the absence of sensory-driven confidence. 
Additionally, metacognitive inferences consisted in indicating the signed magnitude of timing-errors in the produced duration unlike a recent temporal error monitoring approach (Akdogan & Balci, 2017) in which participants categorically estimated their confidence in making a temporal judgment. We also provided participants with an unlimited decision time to prevent the tendency of fast and error-prone decisions.
In this study, we report that participants can reliably estimate the direction and the magnitude of their errors prior to receiving any feedback on the validity of their produced time interval. These results confirmed the ability to detect and estimate one's temporal errors during time production. Second, we describe several neural markers tracking participants' first and second order judgments. In particular, our results suggest that the power of beta (β) oscillations may indicate the internal state variable posited to code for duration in FOJ. The second order inferences (SOJ) were non-linearly related to the β power coding for FOJ. We also report that the distance in β state-space could inform on the reliability of an individual's metacognitive inferences. Finally, alpha (α) power in distinct brain regions correlated with FOJ and SOJ after time estimation, indicating parallel analyses of FOJ and SOJ likely based on the β state-dependent networks. Altogether, our findings provide novel evidence for the neural signatures of self-referential awareness in time estimation.
Portions of these results have been presented at conferences (C. Roger, M. Buiatti, V. van Wassenhove, Soc. Neurosci. Abstr. 2012-S-8620, 2012; H. Tajuddin, T.W. Kononowicz, C. Roger, V. van Wassenhove, Soc. Neurosci. Abstr. 439.21, 2015), and another group has recently shown metacognitive effects in line with our current report. Nevertheless, their task and their conclusions, which are based on psychophysical and computational approaches, differ markedly from the magneto- and electrophysiological approaches presented here and from the conceptual insights these provide.
RESULTS
We first provide behavioral evidence showing that participants can accurately perform a temporal FOJ and retrospectively access its precision with a SOJ. We then quantify oscillatory signatures of the FOJ, and explore to what extent they may serve metacognitive inferences (SOJ). We also explore typical markers of time-related processes and error monitoring (contingent negative variation, CNV, and error-related negativity, ERN, respectively). Our analytical approach uses statistical modeling of single-trial data and dimensionality reduction methods for M/EEG data.
Drifting time production (FOJ) over time with no loss of precision
In the course of the experiment, participants (n = 12) accurately produced the 1.45 s target interval (Blocks 1-3) and the adjusted 1.56 s intervals (Blocks 4-6: feedback was implicitly and individually adjusted to a new target value; cf. Methods) as depicted in the density function over FOJ (Fig. 2a). Moreover, as can be seen in Fig. 2b, the increase of FOJ was steady and progressive over the course of the entire experiment. Given the task structure, the consecutive changes in feedback and implicit duration co-occurred with the unexpected drift of participants' duration estimation: specifically, in the 100% feedback blocks (Blocks 1 and 4), both time intervals were accurately produced with a mean of 1.490 s and 1.582 s, respectively (Fig. 2a); in the 15% feedback blocks (Blocks 2, 3 and 5, 6), the length of the FOJ significantly increased (Fig. 2a; dashed lines). In these 15% feedback blocks, participants' temporal productions were significantly longer for both the 1.45 s target interval (Fig. 2a; F(8.0) = 11.3, p < 10⁻¹⁵; +90 ms on average) and the 1.56 s adjusted interval (F(7.6) = 10.4, p < 10⁻¹⁵; +78 ms). This suggested a natural tendency of participants to lengthen their time estimates over time when less feedback was provided; however, the blocks providing less feedback systematically occurred after the 100% feedback blocks. Although for both the 1.45 s and 1.56 s time intervals, the amount of feedback was the only factor to be warranted in the models containing the intercept (1.45 s target: ΔAIC = 123, χ²(1.0) = 6084832, p < 10⁻¹⁵; adjusted 1.56 s interval: ΔAIC = 104, χ²(0.9) = 4500629, p < 10⁻¹⁵), we could not disentangle with full certainty the time-on-task effect of consecutive blocks, and the specific feedback effects.
Nevertheless, some patterns suggested that the amount of feedback was not the main contributor to the observed drift in time estimation (Fig. 2b). For instance, to assess the possibility of a step-like transition between Block 3 (explicit 1.45 s target) and Block 4 (implicit adjusted 1.56 s target), we used the last 10 trials of Block 3 and the first 10 trials of Block 4. The transition trials from Blocks 3 and 4 did not significantly differ (Fig. 2c; F(1) = 0.6, p > 0.1), suggesting no clear behavioral transition when the feedback was implicitly changed, even when provided on every single trial (Block 4). Given the lengthening of FOJ, we also tested whether the precision of time production changed in the course of the experiment. If so, this would indicate that participants may not perform the task with consistent focus over time, or may change their strategy when feedback was implicitly changed. For this, we measured the coefficients of variation (CVs, standard deviation divided by the mean) per block, and interestingly found that, in spite of the drift, the precision of FOJ did not significantly differ across blocks as tested by repeated measures ANOVA (F(5, 65) = 0.9, p > 0.1). This demonstrated the stability of behavioral precision throughout the experiment, and indicated that participants were performing the task as required.
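As an illustration of the precision measure used here, the following minimal sketch computes a coefficient of variation per block, taking the CV as the standard deviation divided by the mean (the definition given for CV-FOJ later in the Results). The function names, the (block, duration) data layout, the use of the sample standard deviation and the example values are assumptions made for illustration only; they are not taken from the study's analysis code or data.

```python
from statistics import mean, stdev


def coefficient_of_variation(durations):
    """Return the sample standard deviation divided by the mean."""
    return stdev(durations) / mean(durations)


def cv_per_block(trials):
    """Group (block, duration) pairs by block and return {block: CV}."""
    by_block = {}
    for block, duration in trials:
        by_block.setdefault(block, []).append(duration)
    return {block: coefficient_of_variation(vals)
            for block, vals in by_block.items()}


# Hypothetical produced durations in seconds (illustrative, not study data).
example_trials = [
    (1, 1.40), (1, 1.50), (1, 1.60),
    (2, 1.50), (2, 1.60), (2, 1.70),
]
print(cv_per_block(example_trials))
```

In this toy example the second block is longer on average but has the same absolute spread, so its CV is slightly smaller; a drift in the mean therefore does not by itself imply a loss of relative precision.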
It is thus noteworthy that despite not explicitly informing participants about the change in target duration introduced in Blocks 4, 5, and 6, participants readily adjusted their temporal production (FOJ) without any loss in the precision of their temporal production (Fig. 2c). This observation suggests that humans can implicitly monitor their temporal criterion as was previously shown in rats (Meck et al., 1984).
We then asked whether humans could monitor such an internal criterion explicitly, as recent work would suggest (Akdogan & Balci, 2017), by asking participants to introspect about their produced duration (SOJ). Due to the progressive lengthening of FOJ across experimental blocks, the behavioral data were z-scored separately for each block so that the local variations in single-trial estimates could be analyzed without any confounding influence of time-on-task effects.
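The block-wise normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the (block, duration) layout and the choice of the population standard deviation are hypothetical, since the text does not specify these details.

```python
from statistics import mean, pstdev


def zscore_within_blocks(trials):
    """Z-score each duration against the mean and SD of its own block.

    `trials` is a list of (block, duration) pairs; trial order is preserved.
    Assumes every block contains at least two distinct durations, otherwise
    the block SD is zero and the division fails.
    """
    by_block = {}
    for block, duration in trials:
        by_block.setdefault(block, []).append(duration)
    stats = {b: (mean(v), pstdev(v)) for b, v in by_block.items()}
    return [(b, (d - stats[b][0]) / stats[b][1]) for b, d in trials]


# Hypothetical durations in seconds: block 2 is shifted by +0.1 s (a drift).
example_trials = [
    (1, 1.40), (1, 1.50), (1, 1.60),
    (2, 1.50), (2, 1.60), (2, 1.70),
]
for block, z in zscore_within_blocks(example_trials):
    print(block, round(z, 3))
```

Because each block is centered on its own mean, the shift between blocks disappears from the z-scores and only the trial-to-trial variations within a block remain.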
Behavioral evidence for metacognitive inference in time estimation
Individuals could estimate and track their FOJ over time (Fig. 2d, e, respectively). To understand how precisely participants could self-estimate their temporal productions, we assessed whether the SOJ predicted the performance during FOJ on a single-trial basis using a generalized mixed model approach. We found that FOJ and SOJ were strongly correlated on a trial-by-trial basis (Fig. 2f, F(4.0) = 192.5, edf = 4.0, p < 10⁻¹⁵), suggesting that participants could correctly assess the signed error magnitude of their just produced time interval. Additionally, the model fit revealed a nonlinearity in the regression slope between FOJ and SOJ (Fig. 2f). The nonlinear term was modeled using a flexible nonlinear spline term provided by GAMM (cf. Methods) and was compared to the model allowing only linear terms (ΔAIC = 15.5, χ²(4.6) = 18576, p < 0.001). This nonlinearity indicated that the correlation was slightly less steep for the FOJ intervals close to the target duration, and steeper for FOJ away from the target duration. This pattern indicated that temporal productions (FOJ) closest to the internal target duration criterion were harder to self-estimate (SOJ). This intuitive pattern will also be observed later on in brain activity.
We then asked whether this SOJ effect was driven by a sequential effect due to feedback delivered on the previous trial. To test this, we fitted the model adding the factor ‘FOJ on n-1 trial’ as a fixed effect. The analysis was constrained to blocks with 100% feedback (Blocks 1 and 4) so that all trials but the first one were effectively preceded by feedback. The model comparison showed a marginal trend towards an n-1 duration effect (ΔAIC = 0.9, χ²(1.0) = 2135, p = 0.091), suggesting that the longer the previous FOJ was, the shorter the current SOJ tended to be (F(1.0) = 2.9, edf = 1.0, p = 0.09). The trend towards a sequential effect in duration production was suggestive of implicit monitoring in humans, in line with previous studies (Meck et al., 1984; Meck, 1988) showing that rats' anticipatory behavior was negatively correlated with response times in the previous trial. Nevertheless, on a trial-by-trial basis, the strongest statistical relationship we observed was between FOJ and SOJ (F(1.0) = 293.2, edf = 1.0, p < 10⁻¹⁵), suggesting that human participants almost exclusively based their self-estimation (SOJ) on the current trial estimation.
Although unlikely given the experimental instructions, an additional scenario accounting for the interaction between FOJ and SOJ was that participants intentionally increased the variance of their temporal productions (FOJ) to improve their self-assessment (SOJ). This cognitive strategy would confound the possibility of an internal state variable being monitored, and this possibility was not controlled for in previous work (Akdogan & Balci, 2017). To address this important issue, we performed a control experiment in which we manipulated the incentives: participants produced 1.45 s durations, and either received points for their FOJ accuracy in one block (Supplementary Fig. 1a) or for the congruence between their FOJ and their SOJ in another block (Supplementary Fig. 1a). If participants used the confounding strategy described above, the regression slope for FOJ and SOJ should be significantly larger in the congruence incentive condition than in the FOJ accuracy incentive condition. In this control experiment, the FOJ significantly covaried with SOJ (Supplementary Fig. 1a; F(1.0) = 76.0, edf = 1.0, p < 10⁻¹⁵) just as we observed in our main experimental design. Most importantly, the addition of the block type factor was not justified in the model (ΔAIC = 1.9, χ²(1) = 1.7, p > 0.1), demonstrating that participants performed the task sequentially by first estimating FOJ and then estimating SOJ without alternative cognitive strategies. To further strengthen the observations in the control experiment, we checked whether the spread of FOJ estimations could account for an individual's metacognitive inference. Individual differences in metacognitive inference (FOJ*SOJ) were not correlated with the individuals' coefficient of variation observed in FOJ (Supplementary Fig. 1b, rho = 0.08; p = 0.81).
This observation did not support the idea that participants boost their FOJ and SOJ correlation, but rather that participants effectively complied with the original task goal of inferring the magnitudes of their temporal errors following their temporal productions.
One last confounding factor regarding the existence of reliable SOJ in temporal estimation could be the uncertainty associated with the timing process itself. Specifically, the width of diffusion of an unknown number of accumulators could be used for metacognitive inference as opposed to timing per se (Akdogan & Balci, 2017). In our task, this hypothesis would predict that the speed of decision-making in SOJ (measured as reaction times) should significantly differ as a function of the magnitude of the SOJ. To test this, we fitted a model in which FOJ and SOJ were entered as predictors for reaction times on a given trial: neither FOJ nor SOJ terms were significant (all p > 0.1). Thus, we found no evidence that decision speed covaried with FOJ or with SOJ. Participants did not seem to use chronometric uncertainties to reach their decision for their metacognitive inference, suggesting that the second order decision in our task could be a readout of the FOJ motor timing goal. This operation would be conceptually and computationally very different from a second order decision established on the certainty of temporal error (Akdogan & Balci, 2017) as will be commented in the Discussion section.
To sum up, our behavioral findings indicated that participants produced temporal targets with overall constant precision in the course of the experiment, and that timing errors during temporal production could be accessed through metacognitive inference. We then tested the working hypothesis that the neural markers of the internal variable coding for the target duration may support both first and second order judgments. For this, we analyzed the M/EEG data recorded while participants performed the behavioral task.
β power as a signature state variable for FOJ
We first explored the implication of known timing-related signatures in FOJ, such as slow activity (Contingent Negative Variation, CNV; Kononowicz & Penney, 2016; Macar & Vidal, 2004) and β power (Kononowicz & Van Rijn, 2015; Kulashekar et al., 2016). Using all sensors over the interval ranging from 0.4 s to 1.2 s following the first button press, we found no significant differences in the amplitude of the slow evoked responses as a function of duration estimation (all p > 0.1, Supplementary Fig. 2a). This replicated previous work discussing the functional implication of slow activity in timing (Kononowicz & Van Rijn, 2011; Kononowicz & Van Rijn, 2014). In the same time window (0.4 to 1.2 s), we instead observed that longer produced durations showed a significant cluster of larger β power in both EEG and MEG data (p = 0.020, Fig. 3a; Supplementary Fig. 2bc). To avoid binning FOJ data according to performance, which could be considered somewhat arbitrary, we also used FOJ as a continuous predictor in a Generalized Additive Mixed Model (GAMM; Wood, 2017), which could accommodate nonlinear predictors and interactions: this analysis confirmed that longer durations elicited larger β power in a large cluster of sensors (Fig. 3b, Supplementary Table 1). The bilateral motor and midline cortices were the likeliest generators for the β power changes as a function of duration production (Fig. 3a). No other oscillatory responses showed significant changes as a function of temporal production (FOJ, p > 0.1).
Crucially, duration production did not correlate with EMG activity (Supplementary Fig. 3, p > 0.1) or with press duration (measured between the button press and its release) (p > 0.1, GAMM), thereby excluding the contribution of low-level factors to the β power effect which could have confounded the role of β oscillations in time estimation in previous work (Kononowicz & Van Rijn, 2015).
Nevertheless, one possibility was that β power was directly related to motor preparation, and not to timing per se. For instance, previous research has shown that changes in β power correlated with latency changes caused by the directional uncertainty of motor responses (Tzagarakis et al., 2010). Here, no changes in direction were necessary as both button presses were identical, and no variable other than internal timing was required by the task. An alternative hypothesis was that β power changes were a marker of a drift-like accumulation so that the peak latency of β power preceding the second button press would differ across produced durations. Such latency differences would predict that, at a given latency near the second button press, β power would significantly differ across produced durations. To test this hypothesis, we fitted the GAMM to the mean values of β power locked to the second button press (0.4 s before the second button press). None of the fitted model terms were significant after FDR correction in either MEG or EEG (e.g. Supplementary Fig. 4 for EEG), providing no substantial evidence for the contribution of motor preparation or accumulation-like processes in the fluctuations of β power.
Across many studies (e.g., Kononowicz & Van Rijn, 2015; Meijer et al., 2016), β power can readily be seen at a stable latency of roughly 500 ms following the onset of the timed interval, as also observed here following the first button press. A plausible and parsimonious way to reconcile seemingly disparate findings would be to posit that the internal state variable for timing in motor execution always contributes to the uncertainty in motor execution. This would be consistent with reports of β power in motor execution tasks (Tan et al., 2016), the more specific observations that β power fluctuates according to uncertainty and predictability in motor timing (Meijer et al., 2016; Tzagarakis et al., 2010) and, more generally, in many explicit motor timing tasks (e.g. Fujioka et al., 2012). Altogether, our results replicate and strengthen the implication of β oscillations in duration estimation (Kononowicz & Van Rijn, 2015; Kulashekar et al., 2016) and their possible instantiation as a state variable indicative of the motor timing goal. Our and others' results would thus be consistent with the notion that a state-dependent variable for timing can be initiated after the start of a time interval, and β power appears to be a good predictor of the subsequent length of the time interval.
Non-linear use of β power for metacognitive inference
One working hypothesis was that the availability of an internal variable coding for duration may be shared by both FOJ and SOJ. Hence, the next question was whether β power also varied as a function of SOJ. The analysis of oscillatory power in the same time window showed no significant clusters, whether splitting SOJ in categories (Supplementary Fig. 5, all p > 0.1) or using SOJ as a continuous predictor in a GAMM (p > 0.1). Although β power did not directly correlate with SOJ, metacognitive inferences may still rely on β power by means of establishing the difference between the β power indicating the length of the produced time interval, and the actually produced interval. To address this possibility, we defined a neural congruence score between FOJ and SOJ, which consisted of the absolute difference of z-scored β power in single-trials FOJ and SOJ. This congruence score was meant to capture the distance of an individual's SOJ from the individual's temporal production (FOJ) so that high Congruence indicated that the SOJ matched the FOJ. The sign of the congruence score was captured by the interaction between FOJ and SOJ.
The full GAMM allowing non-linear terms had the following specification: β power ~ FOJ + SOJ + Congruence + FOJ*SOJ + FOJ*Congruence. FOJ, SOJ, and other predictors were entered as continuous variables (cf. Methods). Although the main term Congruence and the interaction between FOJ and SOJ showed no significant clusters, the interaction between FOJ and Congruence revealed a significant change of β power in selected sensors (Fig. 4, right panel, Supplementary Table 2). Specifically, the model comparison showed a significant nonlinear interaction between FOJ and Congruence when compared to the model without the interaction term (Fig. 4; ΔAIC = 14.0, χ²(2.1) = 9.6, p < 0.001). The 3D plot and the heat map illustrate that for high Congruence scores, the power of β oscillations strongly predicted the FOJ. This effect could be seen in the increase of β power from point ‘b’ to point ‘c’ in the ‘congruent’ part of the surface (Fig. 4, left panel). This indicated that when participants correctly self-estimated their temporal production, β power was also predictive of FOJ. As the Congruence score decreased towards point ‘a’ in the plot (Fig. 4, left panel), the predictive power of β for FOJ also decreased. In sum, the use of GAMM allowed us to show that β power was non-linearly related to SOJ with increased β power for trials in which participants provided an accurate metacognitive inference on their temporal production, i.e., when participants were aware of the direction and the magnitude of their time errors. This pattern was consistent with the non-linearities reported in the behavioral results.
As β power appeared to spread according to the produced duration, we hypothesized that the more distant β power was from the planned target duration, the better metacognitive judgments might be on a per individual basis, as also suggested in behavioral data (Fig. 2f) in which SOJ closest to the FOJ were harder to self-estimate. To test this hypothesis, we decomposed the β power into latent components using the dPCA method (Machens, 2010; Kobak et al., 2016) and quantified the distance in β state-space as a function of FOJ and SOJ.
Distance in β state space predicts metacognitive inference
We quantified metacognitive inference as the correlation (Spearman's rho) between FOJ and SOJ on a per individual basis. To avoid an arbitrary topographic selection of sensors we employed two methods to obtain distance in β state space driven by the FOJ split.
First, we quantified the grand average power in each frequency band for each individual and across all sensors in order to decompose brain activity into latent components with the dPCA method (Machens, 2010; Kobak et al., 2016). Although the dPCA method was originally designed for cell population analyses in a single cortical region, we used it here across sensors, treating sensors as somewhat homologous to a population of neurons with mixed selectivities. Specifically, we looked at the morphology of the first two demixed Principal Components (dPC; cf. Methods), which showed that β power rose fast at the onset of the FOJ and remained segregated in different locations of the neural state-space corresponding to the ‘short’, ‘correct’, and ‘long’ FOJ (Fig. 5a). This pattern was found for nearly all participants in the study as depicted in Fig. 5a, and the separation is illustrated for participant 1 in Fig. 5b.
Next, we conservatively quantified the multidimensional distance in β power state-space between FOJ (‘short’, ‘correct’, ‘long’) as a function of the individual's metacognitive inference (FOJ*SOJ), i.e. the Spearman's rho correlation between FOJ and SOJ computed on a per individual basis. We found a significant correlation between an individual's SOJ and the distance between FOJ conditions in the defined β state-space (rho = 0.75, p = 0.005; Fig. 5b). This result indicated that the larger the distance between the FOJ, the better the SOJ were. Importantly, the multidimensional distance in β state-space was not driven by the spread of individual FOJ responses, quantified by the Coefficient of Variation (CV-FOJ; standard deviation of an individual's FOJ divided by its mean) (rho = 0.26, p = 0.42; Supplementary Fig. 6a). To further ensure that there was no confound with the behavioral variance in CV-FOJ, we computed robust linear regressions: a comparison of the null model to the model containing β power was justified (F_robust = 8.0, p = 0.004) whereas the inclusion of CV-FOJ against the null model was not warranted (F_robust = 0.03, p = 0.9), demonstrating that including the CV-FOJ did not significantly account for the variance in the model. CV-FOJ and dPC1 were also dissociated as the model containing CV-FOJ and dPC1 was preferred over the model containing only CV-FOJ (F_robust = 7.3, p = 0.006).
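The per-individual index of metacognitive inference used above, Spearman's rho between single-trial FOJ and SOJ, can be sketched with the standard library alone as the Pearson correlation of average ranks. This is a generic textbook implementation, not the authors' code; all function names and the example values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical single-trial values for one participant (not study data).
foj = [-1.2, -0.4, 0.1, 0.6, 1.5]   # z-scored produced durations
soj = [-0.8, -0.9, 0.0, 0.9, 0.7]   # z-scored self-estimates
print(spearman_rho(foj, soj))
```

Because the measure relies on ranks only, it captures any monotonic relation between FOJ and SOJ, which suits the nonlinear FOJ-SOJ relation reported in the behavioral results.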
The dissociation between CV-FOJ and dPC1 showed that the distance in β state-space was not solely driven by inter-individual variability in FOJ, but rather, by timing. No other frequency bands or components were significant (all p > 0.1; Supplementary Fig. 6bc). To strengthen the outcomes of the dPCA analysis, we also quantified the Euclidean distance in β power separately, for each sensor. With this method, the estimated distance in β space for a cluster of sensors was significantly correlated with metacognitive inference (Fig. 5c, white topographic dots).
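To make the sensor-wise distance measure more concrete, the sketch below averages the Euclidean distances between the β power vectors of the three FOJ categories for a single sensor. How the three pairwise distances were combined is not stated in the text, so taking their mean is an assumption, as are the function name, the data layout and the example values.

```python
from itertools import combinations
from math import dist


def mean_pairwise_distance(condition_vectors):
    """Average Euclidean distance over all pairs of condition vectors.

    `condition_vectors` maps a condition label to a sequence of β power
    values (for example, one value per time bin) for a single sensor.
    """
    pairs = list(combinations(condition_vectors.values(), 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


# Hypothetical β power (arbitrary units) in two time bins for one sensor.
beta_power = {
    "short": [0.0, 0.0],
    "correct": [3.0, 4.0],
    "long": [6.0, 8.0],
}
print(mean_pairwise_distance(beta_power))
```

Under this assumption, a participant whose β power differs more between ‘short’, ‘correct’ and ‘long’ productions obtains a larger distance, which is the quantity that would then be correlated with the individual's metacognitive inference.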
We thus showed that the larger the distance in β state-space (using dPCA or directly computing Euclidean distances of β power), the more accurate the metacognitive inference on a per individual basis. For instance, the individuals framed in blue (Fig. 5c) showed a much larger spread of their β state-space (Fig. 5a) and higher metacognitive inference (Fig. 5c). Conversely, the individuals framed in red (Fig. 5c) showed a much smaller spread of their β state-space (Fig. 5a) and lower metacognitive inference (Fig. 5c). Our β state-space analysis provided strong evidence that the distance in β state-space could serve temporal metacognition by providing a second order estimation of FOJ. The distance in β state-space may thus reflect the individual degree of metacognitive inference, akin to reading out β power retrospectively.
Sustained activity as signature of SOJ readout
If such β power readout for SOJ occurred retrospectively, it may also be time-locked to the moment at which participants decided to make their second button press in order to terminate the planned time interval. Recent findings (Schultze-Kraft et al., 2016) have demonstrated that, at roughly 200 ms preceding a movement, participants could not veto their decision to move. Here, this signifies that information relevant to the outcome of the temporal production task may be available for the SOJ, and thus that this time period may contribute to SOJ. To assess this possibility, evoked activity was locked to the second keypress in order to look at the amplitude of the evoked brain responses preceding (-0.4 to 0 s) the second button press. We found significant differences in the amplitude of the evoked activity as a function of SOJ (Fig. 6a) by considering SOJ as a continuous factor using GAMM (Fig. 6b). Specifically, we observed a positive frontal cluster (Fig. 6b, Supplementary Table 3) that covaried negatively with SOJ (Fig. 6b) and a posterior negative cluster that covaried positively with SOJ (Fig. 6b, Supplementary Table 4). In line with this bipolar EEG scalp distribution, and in agreement with previous work (Wiener et al., 2010), the brain sources at the origin of this activity were located in the motor, the premotor and the mid-frontal regions (Fig. 6c). These results indicate that the SOJ of temporal production was available before the second button press was made. However, as FOJ contribution to slow activity was not relevant, the whole process may signify some initial stages of readout in which first order information becomes available after termination of the second button press. Therefore, we had a closer look at the post-interval signatures of temporal error-monitoring.
Post-interval signatures of temporal error-monitoring
A seminal signature of error monitoring and metacognitive inference is the Error Related Negativity (ERN), which is typically observed after a decisional error has been committed (Yeung et al., 2004; Holroyd & Coles, 2004). As the sources of the ERN are typically located in cingulate cortices, we mainly focused the analysis on EEG, which is more sensitive to midline structures (although contrasts were also run on MEG). Surprisingly, contrasting evoked brain responses following the second button press did not reveal significant differences across FOJ or SOJ conditions (Supplementary Fig. 7; all p > 0.1). The lack of differential ERF/P allowed us to exclude the possibility that refractory effects played a role at the interval offset (Kononowicz & Van Rijn, 2014).
To fully explore the post-duration evaluation period (0 to 0.4 s following the second button press), we then performed spatiotemporal cluster permutation tests in time-frequency data. We found significant clusters in the alpha band (α, 8-14 Hz) as a function of performance in FOJ (Fig. 7a, p = 0.035) with main sources originating in the medial and prefrontal regions (Fig. 7a, bottom row). Single-trial GAMM analysis on normalized mean α power revealed a significant group of electrodes showing a linear relationship between α power and FOJ (Fig. 7b, Supplementary Table 5); another cluster showed a significant linear relationship in the theta band (4-7 Hz) with FOJ (Supplementary Fig. 8). A distinct cluster of electrodes showed α power inversely correlating with SOJ (Fig. 7c, p = 0.031), so that trials judged as ‘long’ were associated with larger α power decreases (Fig. 7c). Source analysis revealed bilateral sources in the precuneus (Fig. 7c, bottom row), a region markedly different from those which contributed to the FOJ effects. Considering that the sources at the origin of the FOJ and SOJ α power effects were distinct (Fig. 7a, c), we refitted the GAMM model without including the FOJ term. This analysis revealed a significant group of electrodes for which α power linearly predicted SOJ (Fig. 7d, Supplementary Table 6). As for the FOJ fits, the linear relationship between SOJ and post-interval α power indicated that the post-production period was sensitive to the whole spectrum of SOJ.
The linear trend of α power following FOJ was surprising given that error detection would predict a V-shaped pattern expressed as the typical ERN (Yeung et al., 2004; Holroyd & Coles, 2004): trials considered to be too short or too long were expected to yield a larger magnitude of error signal coding for the difference between the internal duration variable and the behavioral outcome. Instead, we found that α power following FOJ was indicative of the directional difference (or self-assessed residual error) between being too short and being too long.
DISCUSSION
Using a temporal metacognition task with time-resolved neuroimaging and statistical modeling, we investigated how the human brain monitors its self-generated timing. We report several main findings: first, humans can maintain their precision of temporal production over time despite implicit changes and drifts in mean duration values (FOJ). Second, human participants can accurately self-evaluate the signed error magnitude of their temporal productions (SOJ). Third, the power of β oscillatory activity following the initiation of the time interval was an accurate predictor of FOJ, even more so when participants were aware of their time-errors (SOJ). In particular, we report that the distance in β state-space separating the self-generated time intervals was indicative of the precision with which an individual could infer their performance. Fourth, sustained activity locked to the termination of the generated time interval provided a distinctive signature of second order decision. Fifth, α power following temporal production negatively correlated with both SOJ and FOJ in different medial regions implicated in self-monitoring and metacognition. Altogether, we interpret our results as supporting the availability of an internal state variable coding for duration, which sets up the goal of a state-dependent trajectory for time production. The metacognitive inferences would consist in reading out or decoding the internal state variables, consistent with a recent proposal (Fleming & Daw, 2017) and possibly relying on forward-inverse models of state-dependent computations in motor systems (Harris & Wolpert, 1998). We review and discuss the main evidence supporting this viewpoint along with the limitations of the current study.
β power as a marker of a state variable coding for time
Our behavioral results build on a recent report (Akdogan & Balci, 2017) by showing that participants can report the signed error magnitude of self-generated time intervals. This observation provides support to the notion that metacognition is a self-referential process (Block, 2005; Fleming & Daw, 2017), in turn questioning which mechanisms may support inference of self-generated dynamics (e.g., Kepecs et al., 2008). Departing from the basic requirements of the volitional task used here, the second button press (timed action) may rely on an internal state variable informing when the second button press should be made, given the first button press has occurred. The post-movement rebound of β oscillations has seminally been proposed to reflect the idling state of motor cortices (Pfurtscheller et al., 1996) possibly sensitive to sensory afferents (Casimi et al., 2001). That the power of β oscillations post-movement be stronger (smaller) for timed intervals that were longer (shorter) is parsimoniously consistent with the notion that the strength of network inhibition or idling would be predictive of the produced time interval (FOJ): the stronger the inhibition, the longer the time delay before the next button press can occur.
However, it is important to add a few more observations, so as to not trivialize the importance of β power as passive motor rebound, notably in light of recent studies indicating active cognitive components encoded in the β rebound (Tan et al., 2016). First, we found no significant impact of simultaneous EMG in the β effect that was indicative of FOJ, providing no evidence for the implication of the strength or afferent feedback confounded with time estimation. Second, the variability in β power did not appear to reflect random variance in time production (or explicit variance induced by the participant, cf. control experiment), but the actual time interval that participants were subsequently aware of. In fact, our results suggested that β power was even more telling of an individual's FOJ when participants correctly self-estimated their FOJ (SOJ). Considering that β power reflects the amount of inhibition in a network (Whittington et al., 2000; Wang, 2010), β power de facto provides an estimation of the time delay before the next network update can be made, which, in our context, corresponds to the next movement - consistent with general motor inhibition schemes (Duque et al., 2017). Additionally, that the power of β oscillations determines the duration of network inhibition is consistent with the implication of β oscillations in predictive timing (Arnal & Giraud, 2012) and in the notion of an idle state until network updating or reorganization (Engel & Fries, 2010; Spitzer & Haegens, 2017). The power of β has been repeatedly shown to reflect explicit time estimation and prediction (Kononowicz & Van Rijn, 2015, Fujioka et al., 2010) not only in motor timing (Bartolo et al., 2015) but also in sensory-driven duration estimations (Kulashekhar et al., 2015). Hence, we suggest that, because the power of β oscillations is indicative of the strength of inhibition in the network, it also naturally provides a state-dependent variable, which could be used as a duration code.
It is noteworthy that this observation departs from the notion that time estimation relies on the accumulation of sensory or internal evidence (Buonomano & Maass, 2009; Buhusi & Meck, 2009; Hardy & Buonomano, 2016; Kononowicz & Penney, 2016; van Wassenhove, 2016). As depicted in Fig. 8, our state-variable hypothesis rather suggests that, by indicating the amount of inhibition in the network, β power inherently determines the planned trajectory that the network will subsequently follow. Hence, the initial value of β power may be sufficient to predict the time interval that will be produced by participants, and this was essentially captured by the positive interaction between FOJ and β power in our study. Additionally, the initial parameters (mean, noise) defined by this state variable would be predictive of the cascade of neural events leading to the next volitional button press (Fig. 8). With respect to behavioral outcome, the second press would be the outcome of a planned trajectory with limited revisions, or vetoing, shortly before the course of the next action, a scenario that has been recently described (Schultze-Kraft et al., 2016). Until reaching the state at which the motor plan for the second button press was initialized, the uncertainty in β state-space would thus reflect the uncertainty with which a decoder or a reader may estimate the duration. This interpretation is consistent with two distinct results: first, β power was non-linearly related to SOJ, with increased β power for trials in which participants provided an accurate metacognitive inference; second, in the state-space analysis, the more distinct the β state-space trajectories leading to a FOJ category were, the better participants were at self-estimating their time errors.
This view thus supports the working hypothesis that network states (Maass et al., 2002; Laje & Buonomano, 2013) or population vectors (Rigotti et al., 2013) may provide a duration code subsequently used for classification by reader units (Maass et al., 2002; Sussillo & Abbott, 2009), serving as self-referential read-out in temporal cognition (van Wassenhove, 2009). Future neurophysiological and neurocomputational work could test the possibility of training such network models.
Markers of metacognitive inference as internal state variable readout
The non-linear interaction between FOJ and SOJ, and the spread of β state-space trajectories indicative of an individual's metacognitive ability were also used by the second order decision system before the end of the trial. The notion that β power may be used to compute second order decision is consistent with theoretical work suggesting that secondary read-out networks can learn to interpret the contingencies in the first order network (Buonomano & Maass, 2009; Cleeremans et al., 2007; Deneve et al., 1999; Dayan & Abbott, 2001; Pouget et al., 1998). In support of this theoretical work, pulvinar neurons have been shown to encode confidence, a second order variable, independently of other areas processing first order variables; this study thus suggested that one population of neurons can read out neural populations that encode primary sensory variables (Komura et al., 2013). Other studies have also suggested that particular brain regions independently code for first and second order signals (Lak et al., 2014). In line with these ideas, the mapping between β power and duration interval may be realized via networking through higher order brain regions. For example, prefrontal cortex, implicated in timing (Kim et al., 2017), monitors signals in motor cortex (Narayanan & Laubach, 2006), and such anatomical separation was found following the termination of the timed interval: larger α power was found for FOJ and SOJ that were shorter than the target duration, whereas smaller α power was found for FOJ and SOJ that were longer than the target duration. α oscillations have been implicated in performance monitoring (van Driel et al., 2012) and the cortical sources observed here were consistent with the acknowledged role of midline cingulate regions in self-monitoring (Miyamoto et al., 2017) and error detection (Ullsperger et al., 2014).
The differences in FOJ originated from prefrontal cortices and ACC, whereas differences in SOJ implicated the precuneus, which has been reported during confidence judgments (De Martino et al., 2013), error processing (Menon et al., 2001) and intentional actions, self-awareness and self-processing (Den Ouden et al., 2005). Gray matter volume in precuneus predicts introspective accuracy (Fleming et al., 2010), and metacognitive efficiency in memory (McCurdy et al., 2013). Altogether, this pattern of results following the end of the time interval suggests that signals are integrated in the performance monitoring system (Ullsperger et al., 2014).
Conclusions
We showed that the brain can read-out self-generated signals to assess timing together with time errors. The spread of an individual's β power trajectories during timing was indicative of the individual's metacognitive inference. Our results suggest that network inhibition (β power) instantiates a state variable determining future network trajectory, thus providing a code for duration.
METHODS
Participants
Nineteen right-handed volunteers (11 females, mean age: 24 years old) with no self-reported hearing/vision loss or neurological pathology were recruited for the experiment and received monetary compensation for their participation. Prior to the experiment, each participant provided written informed consent in accordance with the Declaration of Helsinki (2008) and the Ethics Committee on Human Research at Neurospin (Gif-sur-Yvette). The data of seven participants were excluded from the analysis due to the absence of anatomical MRI, technical issues with the head positioning system during MEG acquisition, abnormal artifacts during MEG recordings, and two participants not finishing the experiment. These datasets were excluded a priori and were not visualized or inspected. Hence, the final sample comprised twelve participants (7 females, mean age: 24 y.o.). All but two participants performed six experimental blocks; the first block was removed for one participant due to excessive artifacts, and the last block was removed for another participant who did not conform to the requirements of the task.
Stimuli and Procedure
Before the MEG acquisitions, participants were told that they were taking part in a time estimation experiment, and written instructions were provided explaining all steps of the experimental protocol. In a given trial, the participant had to perform three successive steps: first, the participant produced a 1.45 s time interval; second, they self-estimated their time production as too short or too long as compared to the instructed time interval; and third, they received feedback on their produced time interval (Fig. 1a). We will refer to the produced time interval as the first order temporal judgment (FOJ), and to the self-estimation of the first order judgment as the second order temporal judgment (SOJ). Feedback was provided on all trials in the 1st and 4th experimental blocks, and on 15% of the trials in the other blocks (Fig. 1a). To tailor accurate feedback for each individual, a perceptual threshold for duration discrimination of the same 1.45 s duration was collected before the experiment; this individual threshold was used to scale the spacing of the feedback categories as too short, correct or too long (Fig. 1b), as well as to change the feedback unbeknownst to the participant in blocks 4 to 6.
Each trial started with the presentation of a fixation cross “+” on the screen, indicating to participants that they could start whenever they decided to (Fig. 1a). The inter-trial interval ranged between 1 s and 1.5 s. Participants initiated their production of the time interval with a brief but strong button press once they felt relaxed and ready to start. Once they estimated that a 1.45 s interval had elapsed, they terminated the interval by another brief button press. To initiate and terminate their time production (FOJ), participants were asked to press the top button on a Fiber Optic Response Pad (FORP, Science Plus Group, DE) using their right thumb (Fig. 1b). The “+” was removed from the screen during the estimation of the time interval to avoid any sensory cue or confounding responses in brain activity related to the FOJ. Following the production of the time interval, participants were asked to self-estimate their time estimation (second order judgment; Fig. 1b). For this, participants were provided with a scale displayed on the screen 0.4 s after the keypress that terminated the produced time interval. Participants could move a cursor continuously using the yellow and green FORP buttons (Fig. 1b). Participants were instructed to place the cursor according to how close they thought their FOJ was with respect to the instructed target interval, indicated by the sign ‘~’ placed in the middle of the scale. Participants placed the cursor to indicate whether they considered their produced time interval to be too short (‘--’, left side of the scale) or too long (‘++’, right side of the scale). Participants were instructed to take as much time as needed to be accurate in their SOJ and there was no time limit imposed on participants.
Following the completion of the SOJ, participants received feedback displayed on a scale identical to the one used for SOJ. The row of five symbols indicated the length of the just produced FOJ (Fig. 1a). The feedback range was set to the value of the perceptual threshold estimated on a per-individual basis (mean population threshold = 0.223 s, SD = 0.111 s). A near correct FOJ turned the middle ‘~’ symbol green; a too short or too long FOJ turned the symbols ‘-’ or ‘+’ orange, respectively (Fig. 1b); a FOJ that exceeded these categories turned the symbols ‘--’ or ‘++’ red. In Blocks 1 and 4, feedback was provided in all trials; in Blocks 2, 3, 5 and 6, feedback was randomly assigned to 15% of the trials (Fig. 1a). From Block 4 on, and unbeknownst to participants, the target duration was increased to 1.45 s + (individual threshold/2), giving a mean population duration of 1.56 s. In Blocks 1 and 4, participants had to produce 100 trials; in Blocks 2, 3, 5, and 6, participants produced 118 trials. Between the experimental blocks, participants were reminded to produce the target duration of 1.45 s as accurately as possible and to maximize the number of correct trials in each block.
Estimation of temporal discrimination threshold
The psychoacoustics toolbox was used to calculate the temporal discrimination threshold for each participant (Soranzo & Grassi, 2014) by adapting the available routine “DurationDiscriminationPureTone” provided in the toolbox. An adaptive procedure was chosen using a staircase method with a two-down one-up rule, and stopped after twelve reversals (Levitt, 1971). For each trial, three identical tones of 1 kHz were presented to the participant. One of the tones lasted longer than 1.45 sec (deviant tone) while the other 2 tones lasted precisely 1.45 sec (standard tones). The position of the deviant tone changed randomly across trials. The task was to identify the deviant tone and to give its position in the sequence. Tones were provided by earphones binaurally. The value of the correct category was set as target duration +/– (threshold/3), the lower and upper limit values were set as target duration +/– (2* individual threshold/3). These values were used to provide feedback to participants.
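As an illustration only, the feedback rule described above can be sketched in a few lines of Python. The sketch assumes that the correct category spans the target duration +/- threshold/3, that the orange symbols cover errors up to 2 * threshold/3, and that anything beyond is shown in red; the function names, the inclusive boundary handling and the use of the population mean threshold (0.223 s) as a default are our assumptions, not a description of the original experimental code.

```python
def feedback_symbol(produced_s, target_s=1.45, threshold_s=0.223):
    """Map a produced interval (s) onto one of five feedback symbols.

    Assumed boundaries (an illustrative reading of the text):
      |error| <= threshold/3                   -> '~'  (green, near correct)
      threshold/3 < |error| <= 2*threshold/3   -> '-' or '+'   (orange)
      |error| > 2*threshold/3                  -> '--' or '++' (red)
    """
    error = produced_s - target_s
    inner = threshold_s / 3.0        # half-width of the correct category
    outer = 2.0 * threshold_s / 3.0  # lower/upper limit of the scale
    if abs(error) <= inner:
        return "~"
    if abs(error) <= outer:
        return "-" if error < 0 else "+"
    return "--" if error < 0 else "++"


def shifted_target(target_s=1.45, threshold_s=0.223):
    """Target used from Block 4 on: original target plus half the threshold."""
    return target_s + threshold_s / 2.0


if __name__ == "__main__":
    for produced in (1.20, 1.35, 1.45, 1.55, 1.70):
        print(produced, feedback_symbol(produced))
    print("shifted target (s):", round(shifted_target(), 4))
```

With the population mean threshold, the shifted target evaluates to about 1.56 s, in line with the mean population duration reported for Blocks 4 to 6.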
Simultaneous M/EEG recordings
The experiment was conducted in a dimly-lit, standard magnetically-shielded room located at Neurospin (CEA/DRF) in Gif-sur-Yvette. Participants sat in an armchair with eyes open looking at a screen used to show visual stimuli using a projector located outside of the magnetically shielded room. Participants were asked to respond by pushing a button on a FORP response pad (Science Plus Group, DE) held in their right hand. Electromagnetic brain activity was recorded using the whole-head Elekta Neuromag Vector View 306 MEG system (Neuromag Elekta LTD, Helsinki) equipped with 102 triple-sensor elements (two orthogonal planar gradiometers, and one magnetometer per sensor location) and the native 64-channel EEG system using Ag-AgCl electrodes (EasyCap, Germany) with impedances below 15 kΩ. Participants were seated in upright position and their head position was measured before each block using four head-position coils placed over the frontal and the mastoid areas. The four head-position coils and three additional fiducial points (nasion, left and right pre-auricular areas) were used during digitization to help with co-registration of the individual's anatomical MRI. MEG and EEG (M/EEG) recordings were sampled at 1 kHz and band-pass filtered between 0.03 Hz and 330 Hz. The electro-oculograms (EOG, horizontal and vertical eye movements), -cardiograms (ECG), and -myograms (EMG) were recorded simultaneously with MEG. The head position with respect to the MEG sensors was measured using coils attached to the scalp. The locations of the coils and EEG electrodes were digitized with respect to three anatomical landmarks using a 3D digitizer (Polhemus, US/Canada). Stimuli were presented using a PC running Psychtoolbox software (Brainard, 1997) in the Matlab environment.
Data Analysis
M/EEG data preprocessing
Signal Space Separation (SSS) was applied to decrease the impact of external noise on recorded brain signals (Taulu & Simola, 2006). SSS correction, head movement compensation, and bad channel rejection were done using MaxFilter Software (Elekta Neuromag). Trials containing excessive ocular artifacts, movement artifacts, amplifier saturation, or SQUID artifacts were automatically rejected using rejection criteria applied to magnetometers (55e-12 T/m) and to EEG channels (250e-6 V). Trial rejection was performed using epochs ranging from -0.8 s to 2.5 s following the first press initiating the time production trial. Eye blinks, heart-beats, and muscle artifacts were corrected using Independent Component Analysis (Bell & Sejnowski, 1995) with MNE-Python. Baseline correction was applied using the mean value ranging from -0.3 s to -0.1 s before the first key press.
Preprocessed M/EEG data were then analyzed using MNE Python 0.13 (Gramfort et al., 2014) and custom written Python code. For time-domain evoked response analysis, a low-pass zero phase lag FIR filter (40 Hz) was applied to raw M/EEG data. For time frequency analyses, raw data were filtered using a double-pass bandpass FIR filter (0.8 – 160 Hz). The high-pass cutoff was added to remove slow trends which could lead to instabilities in time frequency analyses. To reduce the dimensionality, all evoked and time-frequency analyses were performed on virtual sensor data combining magnetometers and gradiometers into single MEG sensor types using ‘as_type’ method from MNE-Python 0.13 for gradiometers. This procedure largely simplified visualization and statistical analysis without losing information provided by all types of MEG sensors (gradiometers and magnetometers).
M/EEG-aMRI coregistration
Anatomical Magnetic Resonance Imaging (aMRI) was used to provide high-resolution structural images of each individual's brain. The anatomical MRI was recorded using a 3-T Siemens Trio MRI scanner. Parameters of the sequence were: voxel size: 1.0 x 1.0 x 1.1 mm; acquisition time: 466 s; repetition time TR = 2300 ms; and echo time TE = 2.98 ms. Volumetric segmentation of participants' anatomical MRI and cortical surface reconstruction were performed with the FreeSurfer software (http://surfer.nmr.mgh.harvard.edu/). A multi-echo FLASH pulse sequence with two flip angles (5 and 30 degrees) was also acquired (Jovicich et al., 2006; Fischl et al., 2004) to improve co-registration between EEG and aMRI. These procedures were used for group analysis with the MNE suite software (Gramfort et al., 2014). The co-registration of the M/EEG data with the individual's structural MRI was carried out by realigning the digitized fiducial points with MRI slices. Using mne_analyze within the MNE suite, digitized fiducial points were aligned manually with the multimodal markers on the automatically extracted scalp of the participant. To ensure reliable coregistration, an iterative refinement procedure was used to realign all digitized points with the individual's scalp.
MEG source reconstruction
Individual forward solutions for all source locations located on the cortical sheet were computed using a 3-layer boundary element model (BEM) constrained by the individual's aMRI. Cortical surfaces extracted with FreeSurfer were sub-sampled to 10,242 equally spaced sources on each hemisphere (3.1 mm between sources). The noise covariance matrix for each individual was estimated from the baseline activity of all trials and all conditions. The forward solution, the noise covariance and source covariance matrices were used to calculate the dSPM estimates (Dale et al., 2000). The inverse computation was done using a loose orientation constraint (loose = 0.4, depth = 0.8) on the radial component of the signal. Individuals' current source estimates were registered on the FreeSurfer average brain for surface based analysis and visualization.
ERF/P analysis
The analyses of event-related fields (ERF) and potentials (ERP) with MEG and EEG, respectively, focused on the quantification of the amplitude of slow evoked components using non-parametric cluster-based permutation tests which control for multiple comparisons (Maris & Oostenveld, 2007). This analysis combined all sensors and electrodes without predefining a particular subset of electrodes or sensors, which allowed us to keep the sets of MEG and EEG data as similar and consistent as possible. We used a period ranging from 0.3 s to 0.1 s before the first press as a baseline. For the ERF/P analysis, the data were low-pass filtered using a 40 Hz FIR filter.
Time-frequency analysis
To analyze the oscillatory power in different frequency bands, we used a single Hanning taper with an adaptive time window of 5 cycles per frequency in 4 ms steps for frequencies ranging from 3 to 100 Hz. We used a period ranging from 0.3 s to 0.1 s before the first press as a baseline. The data were processed using the ‘tfr_morlet’ function from MNE-Python. Statistical analyses were performed on theta (3-7 Hz), alpha (8-14 Hz), β (15-40 Hz), and gamma bands (41-100 Hz) submitted to spatiotemporal cluster permutation tests in the same way as for evoked response analyses. Both time-frequency and power spectral density (PSD) estimates were computed using discrete prolate spheroidal sequence (DPSS) tapers (Slepian, 1978). PSD estimates were computed in 1 Hz steps.
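As a small worked example of the settings above: with a fixed number of cycles per frequency, the temporal window shrinks as frequency increases (window = cycles / frequency). The sketch below only performs this arithmetic and lists the band limits given in the text; the helper names are illustrative, and it is not the MNE-Python call itself.

```python
# Frequency bands as listed in the text (Hz, inclusive limits).
BANDS = {"theta": (3, 7), "alpha": (8, 14), "beta": (15, 40), "gamma": (41, 100)}
N_CYCLES = 5  # cycles per frequency, as stated in the text


def window_length_s(freq_hz, n_cycles=N_CYCLES):
    """Duration (s) of an adaptive window spanning n_cycles at freq_hz."""
    return n_cycles / freq_hz


def band_of(freq_hz):
    """Return the band name for an integer frequency, or None outside 3-100 Hz."""
    for name, (low, high) in BANDS.items():
        if low <= freq_hz <= high:
            return name
    return None


if __name__ == "__main__":
    for name, (low, high) in BANDS.items():
        # Longest window at the lower band edge, shortest at the upper edge.
        print(f"{name}: {window_length_s(low):.3f} s to {window_length_s(high):.3f} s")
```

For instance, a 5-cycle window lasts 0.5 s at 10 Hz and 0.25 s at 20 Hz, which is worth keeping in mind when interpreting the temporal extent of effects in the lower bands.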
Cluster based statistical analysis of MEG and EEG data
Cluster-based analyses identified significant clusters of neighboring electrodes or sensors in the millisecond time dimension. To assess the differences between the experimental conditions as defined by behavioral outcomes, we ran cluster-based permutation analysis (Maris & Oostenveld, 2007), as implemented by MNE-Python by drawing 1000 samples for the Monte Carlo approximation and using FieldTrip's default neighbor templates. The randomization method identified the MEG virtual sensors and the EEG electrodes whose statistics exceeded a critical value. Neighboring sensors exceeding the critical value were considered as belonging to a significant cluster. The cluster level statistic was defined as the sum of values of a given statistical test in a given cluster, and was compared to a null distribution created by randomizing the data between conditions across multiple participants. The p-value was estimated based on the proportion of the randomizations exceeding the observed maximum cluster-level test statistic. Only clusters with corrected p-value < 0.05 are reported. For visualization, we have chosen to plot the MEG sensor or the EEG electrode of the significant cluster, with the highest statistical power.
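The logic of the cluster-level statistic can be illustrated with a deliberately simplified sketch. It assumes a paired contrast between two conditions, clusters along the time dimension only (no sensor neighborhood), a fixed critical t-value of 2.2 (roughly the two-tailed 5% level for twelve participants) and random sign flips of the per-participant differences as the randomization scheme. These choices and all function names are ours, for illustration; the analyses reported here relied on the MNE-Python implementation.

```python
import math
import random


def t_values(diffs):
    """One-sample t-value at each time point for per-participant differences."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [d[t] for d in diffs]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def max_cluster_mass(tvals, crit):
    """Largest absolute sum of t-values over runs of same-sign supra-threshold points."""
    best, mass, sign = 0.0, 0.0, 0
    for t in tvals:
        s = (t > crit) - (t < -crit)  # +1, -1, or 0 when below threshold
        if s != 0 and s == sign:
            mass += t  # extend the current cluster
        else:
            sign = s  # start a new cluster, or reset when below threshold
            mass = t if s != 0 else 0.0
        best = max(best, abs(mass))
    return best


def cluster_permutation_p(diffs, crit=2.2, n_perm=1000, seed=0):
    """Return (observed max cluster mass, Monte Carlo p-value)."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_values(diffs), crit)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for d in diffs:
            sgn = rng.choice((-1.0, 1.0))  # swap condition labels for this participant
            flipped.append([sgn * x for x in d])
        if max_cluster_mass(t_values(flipped), crit) >= observed:
            exceed += 1
    return observed, exceed / n_perm
```

Here `diffs` is a list with one entry per participant, each entry being the condition difference at every time sample. The p-value is the proportion of randomizations whose largest cluster mass is at least as large as the observed one.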
Behavioral data analysis
The analysis of behavioral data was performed within the GAMM framework, fully described below in the section Single-trial analysis of MEG and EEG data, unless stated otherwise in the Results section. Each model was fitted with subject as a random factor. For the Block analysis, Block was included as a fixed factor. For the analysis of metacognitive inference, SOJ was entered as a linear predictor of FOJ. Relative model comparison was performed using the Akaike Information Criterion (AIC) and the χ2 test.
Binning procedure of behavioral and neuroimaging data
All cluster-based analyses were performed on three experimentally-driven conditions defined on the basis of either the objective performance in time production (FOJ: short, correct, long) or the subjective self-estimation (SOJ: short, correct, long) separately for each experimental block. Computing these three conditions within a block focused the analysis on local variations of brain activity as a function of objective or subjective performance. Additionally, to overcome limitations of arbitrary binning, and to capitalize on the continuous performance naturally provided by the time production and time self-estimation tasks, we also used a single trial approach, which allowed the investigation of interactions between the first and second order terms.
Single-trial analysis of MEG and EEG data
To analyze single trial data we used generalized additive mixed models (GAMM; Wood, 2017). Detailed discussions on the GAMM method can be found elsewhere (Wood, 2017). Here, we briefly introduce the main advantages and overall approach of the method. GAMMs are an extension of the generalized linear regression model in which non-linear terms can be modeled jointly. They are more flexible than simple linear regression models as they do not require that a non-linear function be specified: the specific shape of the non-linear function (i.e. smooth) is determined automatically. Specifically, the non-linearities are modeled by so-called basis functions that consist of several low-level functions (linear, quadratic, etc.). We have chosen GAMMs as they can estimate the relationship between multiple predictors and the dependent variable using a non-linear smooth function. The appropriate degrees of freedom and overfitting concerns are addressed through cross-validation procedures. Importantly, interactions between two nonlinear predictors can be modeled as well. In that case, the fitted function takes the form of a surface defined over the two predictors. Mathematically, this is accomplished by modeling tensor product smooths. Here, we used thin plate regression splines as they seemed most appropriate for large data sets and flexible fitting (Wood, 2003). In all presented analyses, we used a maximum likelihood method for smooth parameter optimization (Wood, 2011). GAMM analyses were performed using the mgcv R package (Wood, 2009, version 1.8.12). GAMM results were plotted using the itsadug R package (Van Rij et al., 2016, version 1.0.1).
Although not widely used, GAMMs have proven useful for modeling EEG data (Tremblay & Newman, 2015). Contrary to some of the previous studies using GAMMs for modeling of multidimensional electrophysiological data, sensors were not included as fixed effects. Rather, we took a more conservative approach and fitted the same model for every sensor separately. The resulting p-values were then corrected for multiple comparisons using false discovery rate (FDR) correction (Genovese et al., 2002). For plotting purposes, we collapsed the data across significant sensors after FDR correction and refitted the model.
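The per-sensor correction step can be sketched as follows. This is a minimal illustration assuming the standard Benjamini-Hochberg step-up procedure and an FDR level of q = 0.05; the level, the function name and the example p-values are assumptions made for the example rather than values reported here.

```python
def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg FDR: return a list of booleans, True where a test survives."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Every test ranked at or below k_max is declared significant.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            keep[idx] = True
    return keep


if __name__ == "__main__":
    # Hypothetical p-values, one per sensor, for a single model term.
    sensor_p = [0.001, 0.008, 0.039, 0.041, 0.6]
    print(fdr_bh(sensor_p))
```

Sensors flagged `True` would then be the ones collapsed for plotting and refitting.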
We fitted the same GAMMs for several neurophysiological measurements chosen on the basis of previous literature. The fitted model contained a random effects term for participant and fixed effects that were based on theoretical predictions. Specifically, the full model had the following specification: uV/Tesla/power ~ s(FOJ) + s(SOJ) + s(Incongruence) + ti(FOJ, SOJ) + ti(FOJ, Incongruence) + s(participant, bs="re"). Besides the random term for participants, the model contained smooth terms for the first and second order judgments and for the Incongruence between the first and second order judgment, as well as interaction terms between FOJ and SOJ and between FOJ and Incongruence. Notably, FOJ, SOJ and other predictors were entered as continuous variables in GAMM analyses, as opposed to post-hoc experimental conditions, which suffer from the limitation of choosing an arbitrary split point in the data.
Although GAMMs have built-in regularization procedures (meaning that they are somewhat inherently resistant to multicollinearity), multicollinearity can be assessed using the variance inflation factor (VIF; fmsb R package, version 0.5.2). Here, VIFs were assessed for the final model, fitted to data averaged over the multiple sensors collapsed for the particular variable at hand. None of the VIF values exceeded 1.1, indicating that multicollinearity was unlikely to have had a major influence on the reported findings. Note that Rogerson (2001) recommended a maximum VIF value of 5, and the author of fmsb recommended a value of 10.
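For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The Python sketch below computes this quantity directly; it is not the fmsb implementation, and the simulated predictors are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (samples x predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress predictor j on an intercept plus all other predictors
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
# Three unrelated predictors: VIFs should sit close to 1
independent = rng.normal(size=(500, 3))
vif_ok = vif(independent)

# Make the third predictor nearly a copy of the first: VIFs inflate
collinear = independent.copy()
collinear[:, 2] = collinear[:, 0] + 0.1 * rng.normal(size=500)
vif_bad = vif(collinear)
```

Values near 1, such as those reported here, indicate that each predictor is almost unexplained by the others.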
Before entering empirical variables in the model, we calculated normalized values (z-scores): trials in which a given variable deviated by more than 3 z-scores were removed from further analysis. This normalization was computed separately for every MEG sensor and every EEG electrode. For single-trial analyses of β power in FOJ, we focused on the maximum power within the 0.4 s to 0.8 s period following the first button press. This time window overlapped with the time window that was used in the cluster analyses. The window differs in width because the GAMM used a single value per trial; this choice ensured that we captured only the β power elicited by the first button press and not spurious preparatory brain responses to the second button press. For the single-trial analyses of other brain signatures (e.g., sustained evoked activity or alpha oscillations), we focused on the mean values in a given time window.
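The two preprocessing steps described above can be sketched for a single sensor as follows. The array layout (trials × time samples), the time axis, and the simulated power values are assumptions for this example; only the 0.4 to 0.8 s window and the 3 z-score criterion come from the text.

```python
import numpy as np

def beta_peak(power, times, tmin=0.4, tmax=0.8):
    """Maximum power per trial within [tmin, tmax] s after the first press."""
    mask = (times >= tmin) & (times <= tmax)
    return power[:, mask].max(axis=1)

def within_z(values, z_max=3.0):
    """Boolean mask of trials whose z-score lies within +/- z_max."""
    z = (values - values.mean()) / values.std(ddof=1)
    return np.abs(z) <= z_max

rng = np.random.default_rng(1)
times = np.linspace(0.0, 1.2, 121)               # hypothetical time axis (s)
power = rng.normal(size=(200, times.size))       # hypothetical trials x time
power[0, 50] = 40.0                              # artefactual trial at 0.5 s

peaks = beta_peak(power, times)                  # one value per trial
keep = within_z(peaks)                           # flag trials within 3 z-scores
clean_peaks = peaks[keep]                        # values entered in the model
```

In the analysis this would be repeated for every MEG sensor and EEG electrode, since the normalization was computed separately for each.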
Demixed Principal Component Analysis (dPCA)
To extract patterns in different frequency bands of brain activity averaged over experimental conditions, we used dPCA (Kobak et al., 2016). Although dPCA is commonly used in the population analysis of single-cell recordings, we applied it to a collection of sensors, which is analogous to a population of neurons with mixed selectivities. Detailed discussions and proofs of the dPCA method can be found elsewhere (Kobak et al., 2016). In brief, dPCA aims to find a decomposition of the data into latent components that are easily interpretable with respect to experimental conditions, while preserving the original data to a maximal extent. The method compresses the data but also demixes the dependencies of the measured quantity on the task parameters. dPCA is essentially driven by a trade-off between demixing and compression, and is thus a mixture of ordinary PCA and linear discriminant analysis (LDA): PCA aims at determining a projection of the data that minimizes the reconstruction error between the projections and the original data, and LDA aims at determining a projection of the data that optimally separates conditions. All analyses were performed using a Python version of the dPCA module (https://github.com/machenslab/dPCA). To prevent overfitting, we used a regularization procedure with 10-fold cross-validation to find the optimal λ parameter for our dataset. We focused dPCA analyses on the first two components (dPCA 1 and dPCA 2) of the β oscillatory activity because they explained on average 73% (SD = 20%) and 26% (SD = 20%) of the overall variance, respectively. Multidimensional distance was quantified as Euclidean distance for both the dPCA analysis and the per-sensor analysis.
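As a minimal sketch of the distance measure, the snippet below computes the Euclidean distance between two condition trajectories at each time point in a two-component space. The trajectories are made-up numbers standing in for projections onto dPCA 1 and dPCA 2, and how distances were aggregated across time or conditions in the study is not specified here.

```python
import numpy as np

def trajectory_distance(traj_a, traj_b):
    """Euclidean distance between two trajectories at each time point.

    Each trajectory has shape (n_components, n_times).
    """
    diff = np.asarray(traj_a, dtype=float) - np.asarray(traj_b, dtype=float)
    return np.linalg.norm(diff, axis=0)

# Hypothetical projections of two conditions onto two components
# (rows: component 1, component 2; columns: time points)
cond_a = np.array([[0.0, 1.0, 2.0],
                   [0.0, 0.0, 0.0]])
cond_b = np.array([[3.0, 1.0, 2.0],
                   [4.0, 0.0, 2.0]])

dist = trajectory_distance(cond_a, cond_b)   # array([5., 0., 2.])
```

The same function applies to the per-sensor analysis if the rows are taken to be sensors rather than components.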
Robust regression of MEG and EEG data
Robust regression is an alternative to least squares regression when the data contain outliers or influential observations. It is also an alternative to Spearman's correlation when a more complex model has to be built. Model comparisons were performed using the robust F test. All analyses were performed using the robust R package (Wang et al., 2017, version 0.4-18).
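To illustrate why a robust fit is less sensitive to outliers than least squares, the sketch below fits a straight line by iteratively reweighted least squares with Huber weights. This is a generic M-estimator chosen for brevity; it is not claimed to be the estimator implemented in the robust R package, and the tuning constant, iteration count, and simulated data are assumptions.

```python
import numpy as np

def huber_irls(x, y, c=1.345, n_iter=50):
    """Straight-line fit via iteratively reweighted least squares (Huber)."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]           # start from OLS
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale from the median absolute deviation
        mad = np.median(np.abs(resid - np.median(resid)))
        scale = max(mad / 0.6745, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, c/u for large ones
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta                                           # intercept, slope

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)    # true slope is 2
y[-3:] += 40.0                                            # three outliers

ols = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
rob = huber_irls(x, y)
```

Comparing `ols[1]` with `rob[1]` shows how much the three high-leverage outliers pull the least squares slope away from the true value relative to the reweighted fit.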
ACKNOWLEDGMENTS
This work was supported by an ERC-YStG-263584 and an ANR10JCJC-1904 to V.vW. We thank the members of UNIACT and the medical staff at NeuroSpin for their help in recruiting and scheduling participants. We thank members of UNICOG for fruitful discussions. Preliminary data were presented at SFN (2012, Washington DC), and SFN (2015, Washington DC). We thank four anonymous reviewers for the depth and incisiveness of their comments.