Electrophysiological correlates of confidence differ across correct and erroneous perceptual decisions

Every decision we make is accompanied by an estimate of the probability that our decision is accurate or appropriate. This probability estimate is termed our degree of decision confidence. Recent work has uncovered event-related potential (ERP) correlates of confidence both during decision formation and after a decision has been made. However, the interpretation of these findings is complicated by methodological issues related to ERP amplitude measurement that are prevalent across existing studies. To more accurately characterise the neural correlates of confidence, we presented participants with a difficult perceptual decision task that elicited a broad range of confidence ratings. We identified a frontal ERP component with an onset prior to the behavioural response, which exhibited more positive-going amplitudes in trials with higher confidence ratings. This frontal effect also biased measures of the centro-parietal positivity (CPP) component at parietal electrodes via volume conduction. Amplitudes of the error positivity (Pe) component that followed each decision were negatively associated with confidence for trials with decision errors, but not for trials with correct decisions, with Bayes factors providing moderate evidence for the null in the latter case. We provide evidence for both pre- and post-decisional neural correlates of decision confidence that are observed in trials with correct and erroneous decisions, respectively. Our findings suggest that certainty in having made a correct response is associated with frontal activity during decision formation, whereas certainty in having committed an error is instead associated with the post-decisional Pe component. These findings also highlight the possibility that some previously reported associations between decision confidence and CPP/Pe component amplitudes may have been a consequence of ERP amplitude measurement-related confounds.
Re-analysis of existing datasets may be useful to test this hypothesis more directly.

Highlights
– We mapped the event-related potential correlates of decision confidence
– A frontal component was associated with confidence during decision formation
– The error positivity component was associated with confidence in error trials
– The error positivity was not associated with confidence in correct response trials


Introduction
Every decision we make is accompanied by an estimate of the probability that our choice is accurate or appropriate for the task at hand. This probability estimate is known as our sense of decision confidence (Pouget et al., 2016). We can use our sense of confidence to decide whether to adjust our decision-making strategies in preparation for future events (van den Berg et al., 2016a; Desender et al., 2019a) and to rapidly correct decision errors when accuracy-related feedback is unavailable (Yeung & Summerfield, 2012). Computation of confidence is often conceptualised as a 'second-order' decision across a continuous dimension (e.g., ranging from 'certainly wrong' to 'certainly correct') that relates to a corresponding first-order decision (Yeung & Summerfield, 2012; Fleming & Daw, 2017). Within this framework, researchers have proposed two broad classes of theoretical models which delineate different sources of evidence that inform confidence judgments.
The first set of 'decisional-locus' models (as labelled in Yeung & Summerfield, 2012) assume that confidence judgments are based on information that directly relates to the first-order decision, such as the relative extent of evidence accumulated in favour of each choice alternative (Vickers, 1979; Vickers & Packer, 1982; Ratcliff & Starns, 2009; Kiani & Shadlen, 2009; Kiani et al., 2014). The other class of 'postdecisional-locus' models posit that confidence judgments are informed by processes which occur after the time of the initial decision, for example via postdecisional evidence accumulation (Rabbitt & Vyass, 1981; Pleskac & Busemeyer, 2010; Moran et al., 2015; van den Berg et al., 2016b; Desender et al., 2021a; Maniscalco et al., 2021) or motor action-related processes (e.g., Fleming & Daw, 2017; Turner et al., 2021). The main point of disagreement between these model classes relates to whether post-decisional processes are relevant to the computation of confidence (discussed in Moran et al., 2015; Fleming & Daw, 2017). For example, the decisional-locus model described in Vickers and Packer (1982) specifies no role of post-decisional evidence accumulation, whereas the model in Moran et al. (2015) specifies that post-decisional evidence accumulation is an important determinant of confidence judgments.
Because each account differs with respect to the timing of confidence-related computations relative to the first-order decision, electrophysiological measures with high temporal resolution, such as electroencephalography (EEG), have been used to identify neural correlates of decision confidence. This work has provided some support for both decisional and post-decisional locus models; however, there are important methodological issues that limit the inferences we can draw from these studies.

1.1 Support for Decisional Locus Models
In line with predictions of decisional locus models, previous work has revealed that subjective and model-derived confidence ratings monotonically scale with the amplitude of the centro-parietal positivity (CPP) event-related potential (ERP) component (O'Connell et al., 2012) from around 300 ms after target stimulus onset (Squires et al., 1973; Gherman & Philiastides, 2015, 2018; Herding et al., 2019; Zarkewski et al., 2019; Rausch et al., 2020) or immediately preceding a keypress response used to report a decision (Philiastides et al., 2014). The CPP is thought to be analogous to the parietal P3 component in perceptual decision tasks (Twomey et al., 2015) and typically increases in amplitude to a fixed threshold around the time of a decision (O'Connell et al., 2012; Kelly & O'Connell, 2013; Twomey et al., 2015), closely resembling the accumulation-to-bound trajectories of decision variables in evidence accumulation models (Ratcliff, 1978; Ratcliff et al., 2016; Twomey et al., 2015; Kelly et al., 2021). These findings have been interpreted as reflecting higher levels of decision evidence accumulation in favour of the chosen option in trials with higher confidence ratings (e.g., Philiastides et al., 2014; Gherman & Philiastides, 2018). Consequently, this has been taken as support for the 'balance of evidence hypothesis' described in some decisional-locus models of confidence, which specifies that confidence indexes differences in the positions of racing accumulators in discrete choice tasks (Vickers, 1979; Vickers & Packer, 1982; Kiani & Shadlen, 2009). Here, we assume that the CPP is time-locked to the response, in line with the original definition of this component (O'Connell et al., 2012). However, we note that in some studies the CPP and P3 are also considered to be stimulus-locked (e.g., Rausch et al., 2020).
The abovementioned studies have measured CPP/P3 amplitudes within a fixed time window relative to stimulus onset, except for Philiastides et al. (2014), which analysed model-derived (rather than self-reported) confidence ratings. Importantly, the CPP has been found to be tightly time-locked to the time of the keypress used to report a decision (e.g., O'Connell et al., 2012; van Vugt et al., 2019). Higher confidence ratings are typically given in trials with faster choice response times (RTs; e.g., Johnson, 1939; Vickers & Packer, 1982; Kiani et al., 2014), at least for a sizeable majority of individuals (for an analysis of 4,089 participants see Rahnev et al., 2020). In addition, participant-level RT distributions are strongly right-skewed, meaning that there is a larger amount of timing variability for relatively slower RTs. For example, there is typically a much larger range of RTs between the 70th and 90th percentiles of an RT distribution as compared to the range between the 10th and 30th percentiles.
This means that, in many perceptual decision tasks, the CPP typically peaks within commonly-used stimulus-locked CPP/P3 amplitude measurement windows (e.g., 350-500 ms in Rausch et al., 2020) in trials with faster RTs and higher confidence ratings (see Figure 1 of Kelly & O'Connell, 2013). In trials with slower RTs (and lower confidence ratings) the CPP is likely to peak later than these typical stimulus-locked measurement windows and will also show higher amounts of timing variability (i.e., temporal smearing, see Ouyang et al., 2015), producing apparently smaller stimulus-locked CPP amplitude measures in those trials. This, in turn, can artificially produce differences in stimulus-locked CPP amplitude measures across higher/lower confidence ratings in cases where there are no real differences during the pre-response time window in response-locked ERPs (e.g., O'Connell et al., 2012; Kelly & O'Connell, 2013). Consequently, based on our assumption that the CPP is a response-locked component, we believe it is important to measure CPP amplitudes using response-locked ERPs in addition to stimulus-locked measures, to ensure that effects on stimulus-locked ERPs are not simply by-products of RT differences across confidence rating conditions.
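The measurement confound described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not an analysis from this study: the Gaussian waveform, the RT distributions and the function names `cpp` and `stim_locked_window_mean` are all hypothetical. Every simulated trial contains an identical response-locked component, yet the stimulus-locked mean amplitude in a fixed 350-500 ms window is smaller for the slower, more variable RTs.

```python
import math
import random


def cpp(t_ms, rt_ms, peak_uv=10.0, width_ms=80.0):
    """Toy response-locked component: a Gaussian bump that peaks at the response."""
    return peak_uv * math.exp(-0.5 * ((t_ms - rt_ms) / width_ms) ** 2)


def stim_locked_window_mean(rts, start=350, stop=500, step=10):
    """Mean of the trial-averaged, stimulus-locked waveform over a fixed window."""
    times = range(start, stop + 1, step)
    per_time = [sum(cpp(t, rt) for rt in rts) / len(rts) for t in times]
    return sum(per_time) / len(per_time)


rng = random.Random(7)
# Hypothetical RT samples (ms): fast and tightly clustered vs. slow and variable.
fast_rts = [rng.gauss(420, 40) for _ in range(500)]
slow_rts = [rng.gauss(700, 120) for _ in range(500)]

fast_amp = stim_locked_window_mean(fast_rts)
slow_amp = stim_locked_window_mean(slow_rts)
# At the moment of the response the component is identical in every trial.
resp_locked_amp = cpp(0, 0)

print(f"stimulus-locked window mean: fast {fast_amp:.2f} uV, slow {slow_amp:.2f} uV")
print(f"response-locked peak in every trial: {resp_locked_amp:.2f} uV")
```

In this sketch the difference between `fast_amp` and `slow_amp` arises purely from when the component occurs relative to the measurement window, which is why response-locked measures are a useful check.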
The findings of a recent study (Kelly & O'Connell, 2013) also raise questions about whether these previously reported effects at centro-parietal electrodes actually reflect amplitude modulations of the CPP component. Kelly and O'Connell (2013) observed larger pre-response CPP amplitudes in conditions of higher stimulus discriminability (which often closely correlates with confidence). However, this effect was not observed after applying a current source density (CSD) transformation (Kayser & Tenke, 2006), which better isolates distinct cortical sources of EEG signals that are often conflated in analyses of standard ERPs. Kelly and O'Connell also identified a fronto-central component which exhibited more positive-going amplitudes in higher discriminability conditions, which appeared to bias CPP measurements at centro-parietal channels via volume conduction. CSD transformations were not used in the primary data analyses of the other studies listed above, and so it remains to be verified whether these findings reflect genuine modulations of the CPP, or contributions from other temporally-overlapping sources.

1.2 Support for Post-Decisional Locus Models
Studies supporting post-decisional locus models have described negative correlations between the amplitude of the error positivity (Pe) component (Falkenstein et al., 1991) and decision confidence ratings (Boldt & Yeung, 2015; Desender et al., 2019b). The Pe component occurs around 200-400 ms after the participant has formed a first-order decision and is measured at the same centro-parietal electrodes as the CPP component. A negative association between Pe amplitudes and confidence was first described by Boldt and Yeung (2015), who reported that Pe amplitudes were larger (i.e. more positive-going) when participants gave confidence ratings indicating that they had made an error, and also when they were less confident that they had made a correct decision. More specifically, they identified a monotonic relationship between confidence and Pe amplitude across the confidence spectrum ranging from 'certainly wrong' to 'unsure' to 'certainly correct'.
The Pe component was proposed to be a neural correlate of post-decisional evidence accumulation that is specifically framed in terms of detecting a response error, which in turn informs decision confidence judgements (Desender et al., 2021b; see also Murphy et al., 2015). Congruent with this notion, the Pe is also more positive-going when participants detect that they have committed a response error (e.g., Ridderinkhof et al., 2009; Steinhauser & Yeung, 2010; Wessel et al., 2011; Murphy et al., 2015).
Although Boldt and Yeung (2015) developed an innovative framework that attempted to unify confidence and error detection, there are also methodological issues that should be considered when interpreting their findings. Most importantly, for their key analyses they measured response-locked ERPs and used a pre-response baseline. This pre-response baseline largely overlapped with a time window over which CPP/P3 amplitudes varied across confidence ratings, and the CPP and Pe were measured over similar sets of centro-parietal electrodes (their Figure 3B, see also Philiastides et al., 2014; Gherman & Philiastides, 2015, 2018). In such cases, any systematic differences across conditions during the pre-response baseline period will lead to spurious differences in post-response ERPs (Luck, 2014). In the case of Boldt and Yeung (2015), their baseline subtraction procedure would have artificially inflated Pe amplitudes in trials with lower pre-response CPP amplitudes, such as trials with lower confidence ratings or response errors (e.g., von Lautz et al., 2019). This issue also applies to a subsequent study that replicated this effect (Desender et al., 2019b). Therefore, it remains to be verified whether Pe amplitudes do monotonically scale with decision confidence ratings in perceptual decision tasks (e.g., as claimed by Desender et al., 2021b).
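The arithmetic behind this baseline confound can be made explicit with a small worked example. The amplitudes below are hypothetical values chosen purely for illustration, and `rebaseline` is an illustrative helper rather than a function from any analysis package.

```python
def rebaseline(post_response_uv, pre_response_uv):
    """Express a post-response amplitude relative to a pre-response baseline."""
    return post_response_uv - pre_response_uv


# Hypothetical mean amplitudes (microvolts), both relative to a pre-stimulus baseline.
# The two conditions differ before the response but NOT after it.
high_conf = {"pre_response": 6.0, "post_response": 4.0}
low_conf = {"pre_response": 3.0, "post_response": 4.0}

# Difference in the post-response window with the pre-stimulus baseline.
true_diff = low_conf["post_response"] - high_conf["post_response"]

# Difference in the same window after subtracting each condition's pre-response baseline.
spurious_diff = (
    rebaseline(low_conf["post_response"], low_conf["pre_response"])
    - rebaseline(high_conf["post_response"], high_conf["pre_response"])
)

print(true_diff, spurious_diff)
```

With these assumed values the post-response amplitudes are identical under the pre-stimulus baseline, but the pre-response baseline makes the low-confidence condition appear more positive-going by exactly the size of the pre-response difference.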

1.3 The Present Study
To more accurately characterise associations between decision confidence and CPP and Pe component amplitudes, we presented a difficult perceptual discrimination task and required participants to give confidence ratings in each trial. To better understand the sources of pre-decisional ERP correlates of confidence, we assessed the effects of applying CSD transforms to our data in line with Kelly and O'Connell (2013).
We expected to find response-locked CPP amplitudes to be positively correlated with confidence in trials with correct responses (as reported in Philiastides et al., 2014; Rausch et al., 2020); however, we were agnostic about whether this effect would remain once a CSD transform had been applied. We also investigated the effects of using target stimulus- and response-locked baselines on associations between confidence and Pe amplitudes. We predicted that the associations between decision confidence and Pe amplitudes reported in Boldt and Yeung (2015) would not be replicated when using a pre-stimulus ERP baseline, but would be artificially produced when using a pre-response baseline.
[...] M = 24.7, SD = 4.8) participated in this experiment. Participants were right-handed, fluent in English and had normal or corrected-to-normal vision. Four participants were excluded due to near-chance task performance (i.e., accuracy below 55% for any of the three stimulus discriminability conditions). One additional participant was excluded due to excessively noisy EEG data. One participant was excluded because they were unable to complete the task, leaving 29 participants for both behavioural and EEG data analyses (17 female, 12 male, aged 18-36 years, M = 25.0, SD = 4.9). This study was approved by the Human Ethics Committee of the Melbourne School of Psychological Sciences (ID 1750871).

2.2. Stimuli
Stimuli were presented using a gamma-corrected 24" Benq RL2455HM LCD monitor with a refresh rate of 60 Hz. Stimuli were presented using functions from MATLAB (Mathworks) and PsychToolbox (Brainard, 1997; Kleiner et al., 2007). Code used for stimulus presentation will be available at https://osf.io/gazxO/ at the time of publication.
The critical stimuli consisted of two overlaid diagonal gratings within a circular aperture, presented against a grey background (similar to Steinemann et al., 2018; Feuerriegel et al., 2021a). The two gratings were oriented 45° to the left and right of vertical, respectively. The circular aperture was divided into two concentric circles: an inner circle (target stimulus) and an outer circle (distractor) with radii of 2.86° and 5.31° of visual angle (Figure 1B). The inner circle contrast-reversed at a rate of 20 Hz; the outer circle contrast-reversed at a rate of 30 Hz (which allowed for frequency tagging of target and distractor stimuli in analyses of steady state visual evoked potentials; however, these signals were not relevant to the research question of this paper).

2.3. Procedure
Participants sat 80 cm from the monitor in a darkened room and were asked to fixate on a central cross throughout all trials. The trial structure is depicted in Figure 1A. In each trial, a white fixation cross appeared for 800 ms. Following this, both left- and right-tilted gratings within each circle increased from 0% to 50% contrast. Both gratings remained at 50% contrast for a further 1,000 ms, during which the contrast levels of both gratings were identical (i.e. the stimulus was "neutral", see Figure 1B).
Immediately after this neutral stimulus period, one of the gratings within the inner circle increased in contrast and the other decreased in contrast (see Figure 1C; labelled the "S1" target). This contrast difference persisted for 100 ms, after which the neutral stimulus was presented again. Participants indicated which grating within the inner circle (i.e. left-tilted or right-tilted) was dominant (i.e. of higher contrast) by pressing keys on a TESORO Tizona Numpad (1,000 Hz polling rate) using their left and right index fingers. Participants were required to respond within 1,000 ms of S1 target onset.
If responses were made prior to the S1 target onset, or after the 1,000 ms S1 response deadline, then "Too Early" or "Too Slow" feedback appeared, respectively. Feedback signalling the accuracy of the decision was not provided.
Relative contrast levels of the dominant and non-dominant gratings varied throughout the experiment according to an accelerated stochastic approximation (ASA) staircase procedure (Kesten, 1958; initial step size = 0.25, minimum contrast level = 0.51, maximum contrast level = 0.95). Rather than using an up-down staircase procedure that only converges on a small number of accuracy levels, this ASA procedure was used because it quickly converges to pre-specified accuracy targets (Lu & Dosher, 2013). We used three different staircases (interleaved across trials) that were designed to converge to accuracy levels of 60%, 75% and 90%. The staircase procedure was employed continuously throughout the experiment to account for any improvements in task performance that can occur across the first few blocks of an experiment, and to provide a wide range of stimulus contrast values (and a wide range of confidence ratings). In each trial, staircase 1, 2 or 3 was pseudorandomly selected to determine the contrast level of the target stimulus. Equal numbers of each staircase condition were presented within each block, and across the experiment. Both the left- and right-tilted gratings within the outer distractor circle were kept constant at 50% contrast throughout each trial. The purpose of presenting the distractors was to increase the difficulty of the task by encircling the target with a dynamically contrast-reversing neutral stimulus.
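The update rule behind such a staircase can be sketched as follows. This is the commonly cited textbook form of Kesten's accelerated stochastic approximation, not the code used in this study. The function names, the starting level of 0.75 and the exact way shifts in response category are counted are illustrative assumptions; the step size and contrast limits are the values stated above.

```python
def asa_update(level, correct, trial_n, n_shifts, target,
               step=0.25, lo=0.51, hi=0.95):
    """One accelerated stochastic approximation step (textbook form).

    level    : current contrast of the dominant grating
    correct  : whether the response on this trial was correct
    trial_n  : 1-based trial count within this staircase
    n_shifts : number of changes in response category (correct <-> error) so far
    target   : accuracy the staircase should converge to (e.g. 0.75)
    """
    z = 1.0 if correct else 0.0
    # The first two trials use step/n; later trials shrink only after category shifts.
    denom = trial_n if trial_n <= 2 else 2 + n_shifts
    new_level = level - (step / denom) * (z - target)
    return min(hi, max(lo, new_level))


def run_staircase(responses, start=0.75, target=0.75):
    """Apply the update to a sequence of correct/error outcomes (start is assumed)."""
    level, n_shifts, prev = start, 0, None
    levels = [level]
    for n, correct in enumerate(responses, start=1):
        if prev is not None and correct != prev:
            n_shifts += 1
        level = asa_update(level, correct, n, n_shifts, target)
        levels.append(level)
        prev = correct
    return levels


print(run_staircase([True, True, False, True]))
```

Correct responses lower the contrast by an amount proportional to (1 - target) and errors raise it by an amount proportional to target, so the expected change is zero when accuracy equals the target. Three interleaved staircases would simply call `run_staircase` logic with targets of 0.60, 0.75 and 0.90.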

Following the response to the S1 target, the neutral stimulus was presented for a further 1,000 ms in 50% of trials. In the other 50% of trials a second S2 target was presented at the time of the next inner circle (target) contrast reversal after the response (i.e. within 50 ms following the response), whereby the dominant S1 grating in the inner circle was again presented at 75% contrast, and the non-dominant S1 grating at 25% contrast. This second target was presented for 400 ms, after which a neutral stimulus was presented for 600 ms. Note that the dominant grating orientation was always consistent across the first and second targets, meaning that the second target was informative as to the correct response in that trial. Participants were instructed not to respond to the second target but were advised that the information conveyed by this stimulus would be useful in forming their confidence judgements in the trials in which it appeared. These S2 targets were originally included to investigate the neural correlates of decision updating that occurs when additional information is provided after making a perceptual decision (similar to Fleming et al., 2018), and the respective analyses will be part of a separate publication. Importantly, none of the analyses testing for associations between ERPs and confidence presented here included the trials in which the S2 target appeared. Numbers of trials with and without the second target were balanced within each staircase condition. In all trials the grating stimuli were then replaced with a blank screen with a fixation cross for 400 ms.
Participants then rated their confidence in their decision on a continuous scale (ranging from -100 to 100) with equal intervals between the labels 'Certainly Wrong', 'Probably Wrong', 'Maybe Wrong', 'Maybe Correct', 'Probably Correct' and 'Certainly Correct' (Figure 1D). Please note that these labels were indicators to guide selection of a continuously-valued confidence response, but not discrete rating choice options. The zero value was the midpoint of the scale, indicating maximal uncertainty as to whether a correct response or an error had occurred. To provide confidence ratings, participants held down the left and right response keys to move a vertical bar to the left or right on the scale. To discourage premature preparation of motor responses associated with specific confidence ratings, the vertical bar was initially placed in a random location on the scale in each trial. The confidence rating scale was presented for 3,000 ms. The location of the bar at the end of this period constituted the confidence rating for that trial.

To encourage participants to perform the task with maximum accuracy and make confidence judgements that reflected their true degree of belief in the correctness of their choices, we implemented a points system based on both task performance and the correspondence between participants' confidence ratings and their objective accuracy within each trial (as done by Fleming et al., 2018). Participants were awarded 10 points if they made a correct decision in each trial. No points were lost if the decision was incorrect. Trials with more than one response, responses prior to S1 target onset, or no response within the 1,000 ms deadline, resulted in a loss of 50 points.
Participants could gain or lose up to an additional 50 points by making an accurate confidence rating regarding their response to the S1 target. As the confidence responses were graded, the most extreme confidence ratings were associated with the highest number of points wagered. An accurate confidence rating resulted in a gain of between 1-50 points; a confidence rating in the incorrect direction resulted in a loss of between 1-50 points. For example, if a participant made a correct response and moved their rating bar halfway toward 'certainly correct' from the midpoint, they would win 25 points. In order to encourage optimal performance, participants were told they could earn between 20-25 AUD based on how many points they accumulated, with 1 AUD awarded for every 5,000 points obtained (maximum possible score = 28,800). However, all participants were actually reimbursed 25 AUD at the end of the experiment.
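The points scheme can be summarised in a short sketch. The linear mapping from rating to wager and the rounding rule are assumptions (the text only gives the endpoints and the halfway example), as is the treatment of invalid trials as a flat 50-point loss with no confidence wager; the function names are hypothetical.

```python
def confidence_points(correct, confidence):
    """Points won or lost on the confidence wager (confidence is on the -100..100 scale).

    Assumes the wager scales linearly with distance from the scale midpoint.
    """
    wager = round(abs(confidence) / 100 * 50)
    rated_correct = confidence > 0
    return wager if rated_correct == correct else -wager


def trial_points(correct, confidence, valid=True):
    """Total points for one trial: decision points plus the confidence wager."""
    if not valid:  # multiple, premature or missing responses
        return -50
    decision_points = 10 if correct else 0
    return decision_points + confidence_points(correct, confidence)


# Correct response, bar moved halfway toward 'certainly correct'.
print(confidence_points(True, 50))
# Best case on every one of the 480 trials.
print(480 * trial_points(True, 100))
```

Under these assumptions the halfway example yields 25 wager points, and a perfect score over 480 trials reproduces the stated maximum of 28,800 points.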
The experiment consisted of eight blocks, each containing 60 trials (total number of trials = 480). This included 240 trials where the S2 target appeared, and 240 trials in which it did not appear. Participants could take self-paced breaks between each block (minimum break length = 30 seconds). Prior to the experiment participants completed a brief practice block of 20 trials. During this practice phase, participants received feedback at the end of each trial as to whether their response was correct or an error.
Participants were allowed to repeat the practice block until both they and the experimenter were confident that they understood the task.
Figure 1. Trial structure and task. A) Each trial commenced with the presentation of a fixation cross. Following this, two circular apertures containing overlaid left- and right-tilted gratings were presented. Red and blue lines respectively depict the contrast levels of the dominant (i.e. higher contrast) and non-dominant (lower contrast) gratings within the target stimulus (inner circle) in each phase of the trial. Participants indicated which set of stripes in the S1 target stimulus was dominant (i.e. of higher contrast). In 50% of trials the S1 target was followed by a second S2 target that appeared within 50 ms of the response to the S1 target. Each S2 target contained a dominant grating at 75% contrast in the same direction as the preceding S1 target (S2 trials were not relevant to our research questions and were excluded from all confidence-related ERP analyses; see main text). Participants rated their decision confidence at the end of each trial. B) Example of the neutral stimulus. Gratings in the inner (target) and outer (distractor) circles contrast-reversed at 20 Hz and 30 Hz, respectively. The dashed orange line denotes the boundary between the inner (target) and outer (distractor) circles. Both the left- and right-tilted gratings within the distractor circle were kept constant at 50% contrast throughout the trial. C) Example of an S1 target stimulus with a dominant left-tilted grating. D) Confidence rating screen. Participants used their left and right index fingers to move the yellow cursor to their desired level of confidence.

2.4. Analyses of Accuracy, Response Times and Confidence Ratings
Code used for all behavioural and EEG data analyses will be available at https://osf.io/gazxO/ at the time of publication. Trials with responses slower than the response deadline or earlier than S1 target stimulus onset were removed from the dataset. Only trials with correct or erroneous responses and response times (RTs) of >100 ms were included for analyses of RTs. For analyses of accuracy and RTs we included trials in which the S2 target appeared because this target was presented after the time of the response to the S1 target and so could not influence these measures. For analyses of confidence ratings, we excluded trials where the S2 target appeared.
We modelled proportions of correct responses using generalised linear mixed effects logistic regressions (binomial family) as implemented in the R package lme4 (Bates et al., 2015). We modelled RTs using generalised linear mixed effects regressions (Gamma family, identity link function) as recommended by Lo and Andrews (2015). We modelled confidence ratings using linear mixed effects models (Gaussian family) as done by Fleming et al. (2018).
To test for effects of each factor of interest on measures of accuracy, RTs and confidence ratings, we compared models with and without that fixed effect of interest using likelihood ratio tests. For each comparison, both models included identical random effects structures, including random intercepts by participant. The fixed effect of interest in all analyses was target discriminability (i.e. contrast level). The fixed effect of correct/error trial outcome was included in all models for RT analyses.
Random slopes were also included for effects of target discriminability (Accuracy and RT analyses) and trial outcome (RT analyses) as these models converged successfully.
Models of confidence ratings were fit to correct and error trials separately (as done by Fleming et al., 2018). The structure of each model and the coefficients of each fitted model are detailed in the Supplementary Material.

2.5. EEG Data Acquisition and Processing
We recorded EEG at a sampling rate of 512 Hz from 64 active electrodes using a Biosemi Active Two system (Biosemi). Recordings were grounded using common mode sense and driven right leg electrodes (http://www.biosemi.com/faq/cms&drl.htm). We added six additional channels: two electrodes placed 1 cm from the outer canthi of each eye, and electrodes placed above and below the center of each eye.
We processed EEG data using EEGLab v13.4.4b (Delorme & Makeig, 2004). All data processing and analysis code and data will be available at https://osf.io/gazxO/ at the time of publication. First, we identified excessively noisy channels by visual inspection (median number of bad channels = 2, range 0-8) and excluded these from average reference calculations and Independent Components Analysis (ICA). Sections with large artefacts were also manually identified and removed. We re-referenced the data to the average of all channels, low-pass filtered the data at 40 Hz (EEGLab Basic Finite Impulse Response Filter New, default settings), and removed one extra channel (AFz) to correct for the rank deficiency caused by the average reference. We processed a copy of this dataset in the same way and additionally applied a 0.1 Hz high-pass filter (EEGLab Basic FIR Filter New, default settings) to improve stationarity for the ICA.
ICA was performed on the high-pass filtered dataset (RunICA extended algorithm, Jung et al., 2000). We then copied the independent component information to the non-high-pass filtered dataset (e.g., as done by Feuerriegel et al., 2018). Independent components generated by blinks and saccades were identified and removed according to guidelines in Chaumon et al. (2015). After ICA we interpolated any excessively noisy channels and AFz using the cleaned data (spherical spline interpolation). EEG data were then high-pass filtered at 0.1 Hz (EEGLab Basic Finite Impulse Response Filter New, default settings).
The resulting data were segmented from -3,200 ms to 4,000 ms relative to S1 target onset, and were baseline-corrected using the -200 to 0 ms pre-target interval (note that, for some analyses described below, ERPs were baseline-corrected relative to a pre-response baseline at a later step). These long epochs were derived to also allow for analyses of SSVEPs and time-frequency data, which are not relevant to the research questions here. Epochs containing amplitudes exceeding ±200 μV at any scalp channels between -500 ms and 2,500 ms from S1 target onset were rejected (mean trials retained = 405 out of 480, range 289-456). This long time window was used to ensure that the same epochs were included for analyses of both stimulus- and response-locked ERPs. Numbers of retained epochs by condition are displayed in Supplementary Tables S1 and S2. From the resulting epoched data, we then derived stimulus-locked epochs using the interval from -500 ms to 1,000 ms relative to the S1 target onset. We also derived response-locked epochs using the interval from -1,500 ms to 1,500 ms relative to the time of the next inner circle contrast reversal after the response to the S1 target. Because the gratings in the inner circle contrast-reversed at a rate of 20 Hz (i.e., every 50 ms), the time point for deriving response-locked epochs always occurred within a very short latency (0-50 ms) following the keypress response. This epoching method was used to align the timing of target stimulus contrast-reversals across conditions, so that there would be no systematic discrepancies in the timing of visual evoked responses associated with these reversals.
This likely resulted in a small amount of temporal smearing of response-locked ERP components, the extent of which is smaller than the width of the time windows used to measure the ERP components of interest. We also note that, although the inner circle is termed the target stimulus, the inner circle actually contained a neutral stimulus (i.e., left-and right-tilted gratings at equal contrast) at the time of the keypress response in each trial. Please also note that the derived epochs extended beyond the time windows used for ERP analyses to also allow analyses of steady state visual evoked potentials (SSVEPs) and time-frequency power measures. However, such measures were not directly relevant to the research questions of this paper and are not reported here.
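The response-locking rule described above amounts to rounding each RT up to the next contrast reversal. A minimal sketch follows, assuming that reversals fall on exact multiples of 50 ms from S1 target onset and that a response coinciding with a reversal is locked to that same reversal; `lock_time_ms` is a hypothetical helper name.

```python
import math


def lock_time_ms(rt_ms, period_ms=50.0):
    """Time (relative to S1 target onset) of the first reversal at or after the response."""
    return math.ceil(rt_ms / period_ms) * period_ms


for rt in (437.0, 450.0, 451.0):
    lock = lock_time_ms(rt)
    print(f"RT {rt:.0f} ms -> epoch locked at {lock:.0f} ms (lag {lock - rt:.0f} ms)")
```

Under these assumptions the lag between the keypress and the locking point is always at least 0 ms and less than 50 ms, which bounds the temporal smearing introduced by this epoching method.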

A.O. ERP Component Amplitude Analyses
We measured mean ERP amplitudes of the pre-response CPP between -130 and -70 ms relative to the response at parietal electrodes Pz, P1, P2, CPz, and POz (same time window as Steinemann et al., 2018; Feuerriegel et al., 2021a). For these analyses we used a pre-stimulus baseline (i.e., the -200 to 0 ms pre-target interval). To link our results to previous work using stimulus-locked CPP measures (e.g., Gherman & Philiastides, 2018; Rausch et al., 2020) we also measured the CPP as the mean amplitude between 350-500 ms from S1 target onset (as done by Rausch et al., 2020).
For the reasons described in section 1.1 we do not focus on the results of these analyses in our paper. However, we acknowledge that these results may be interesting to those who assume that the CPP is best understood as a component that is time-locked to the stimulus rather than the response.

For analyses of the Pe component we first derived single-trial ERPs using the pre-stimulus baseline described above. To more directly compare our results with those of Boldt and Yeung (2015), we additionally ran the same set of Pe mean amplitude analyses using a -100 to 0 ms pre-response baseline. We measured Pe amplitudes as the mean amplitude between 200-350 ms relative to the response, at

For analyses of the CPP component we compared correct and erroneous responses using paired-samples frequentist and Bayesian t-tests as implemented in JASP v0.9.1 (JASP Core Team; Cauchy prior distribution, width 0.707, default settings). We additionally fitted linear regression models using MATLAB to predict mean amplitudes based on confidence ratings. This was done separately for analyses of trials with correct responses and trials with errors. The resulting beta coefficients (slopes) were tested at the group level using one-sample frequentist and Bayesian t-tests (as done by Feuerriegel et al., 2021b). For analyses of the CPP the correct/error comparison included both trials in which the S2 target did and did not appear, as the stimulation conditions were not systematically different until the time of the response.
For analyses including confidence ratings, only trials in which the S2 target did not appear were included. As described above, this is because the appearance of this informative, easily-discriminable S2 target systematically biased confidence ratings toward the extremes of the rating scale.
Analyses of the Pe component used data from trials in which the S2 target did not appear. Paired-samples t-tests were conducted to compare Pe amplitudes across trials with correct and erroneous responses. Within-subject regressions and group-level t-tests were performed using the predictor of confidence as described above.
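The two-stage approach described above (a within-subject regression of single-trial mean amplitude on confidence, followed by a group-level one-sample t-test on the slopes) can be sketched as follows. This is an illustrative Python sketch only. The original analyses were run in MATLAB and JASP, the toy data and function names are hypothetical, and the Bayesian t-test is not reproduced here.

```python
import math
from statistics import mean, stdev


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def one_sample_t(values, mu=0.0):
    """t statistic for testing whether the mean of values differs from mu."""
    n = len(values)
    return (mean(values) - mu) / (stdev(values) / math.sqrt(n))


# Hypothetical toy data: per participant, confidence ratings (-100 to 100)
# and single-trial mean amplitudes (microvolts) in the measurement window.
participants = {
    "p1": ([-100, 0, 50, 100], [1.0, 2.0, 2.5, 3.0]),
    "p2": ([-100, 0, 50, 100], [0.0, 2.0, 3.0, 4.0]),
    "p3": ([-100, 0, 50, 100], [-1.0, 2.0, 3.5, 5.0]),
}

# Stage 1: one slope (beta coefficient) per participant.
slopes = [ols_slope(conf, amp) for conf, amp in participants.values()]

# Stage 2: test the slopes against zero at the group level (df = n - 1).
t_stat = one_sample_t(slopes)

if __name__ == "__main__":
    print("slopes:", slopes)
    print("t =", t_stat, "df =", len(slopes) - 1)
```

In the study this procedure would be run separately for correct and error trials, so each participant contributes one slope per trial type.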
We also performed complementary, post hoc regression analyses using restricted ranges of confidence ratings, including the range from "unsure" (0) to "certainly correct" (100; indexing participants' certainty that they had made a correct response) and, in separate analyses, the range from "certainly wrong" (-100) to "unsure" (indexing participants' certainty that they had made an error). For these analyses, we included both trials with correct responses and errors. We only included

A.P. Current Source Density Transformation
Based on observed positive associations between CPP amplitudes and confidence, we repeated the CPP analyses using CSD-transformed data estimated

3.1. Task Performance Results
The interleaved staircase procedure had the intended effects on measures of accuracy, RT and decision confidence (plotted in Figure 2). Group-averaged mean contrast levels of the dominant S1 gratings were 71%, 80% and 91% for staircases 1 (low discriminability), 2 (medium discriminability) and 3 (high discriminability).

Based on our assumption that the CPP reflects a response-locked ERP component, we have not focused on these results in our paper. However, we acknowledge that they might be interesting to other researchers who assume that the CPP is better described as a stimulus-locked component.

3.2.1. Effects of Current Source Density Transformation
After observing positive associations between CPP amplitudes and confidence we repeated these analyses using CSD-transformed data. This approach follows Kelly We also repeated our CPP analyses using the parietal ROI electrodes that were

3.3. Pe Component Amplitudes
We analysed Pe component mean amplitudes (between 200-350 ms from the time of the response) using both pre-stimulus and pre-response ERP baselines in separate analyses. This was done to systematically assess whether the use of pre-response baselines artificially produces observed associations between Pe amplitudes and confidence in cases where there are already ERP differences across conditions prior to the response (e.g., in Boldt & Yeung, 2015). Both sets of analyses only included trials in which the informative S2 stimulus did not appear.

3.3.1. Analyses Using Pre-Stimulus Baseline-Corrected ERPs
We first measured mean amplitudes of the Pe component using a pre-stimulus baseline, which is not influenced by ERP differences across conditions that might already exist prior to the response (as we discuss in detail in section 1.2). Using this type of baseline correction, we did not observe Pe amplitude differences between trials

3.3.2. Analyses Using Pre-Response Baseline-Corrected ERPs
We also repeated our analyses using a pre-response baseline (-100 to 0 ms relative to the response) to compare our results to those of Boldt and Yeung (2015). In Figure 5F). Notably, the apparent timing and duration of these effects are almost identical to the ERP differences across confidence rating categories in Boldt and Yeung (2015, their Figure 3A).
Taken together, the results from analyses of pre-stimulus and pre-response baseline-corrected ERPs show that, when there are already ERP differences across conditions prior to the response (as indicated by effects on the CPP), the use of a pre-response baseline produces artefactual associations between Pe amplitudes and confidence. Using the more appropriate pre-stimulus baseline correction, however, these results demonstrate that Pe amplitudes were only truly associated with variations in confidence in error trials.

3.4. Analyses Using Restricted Ranges of Confidence Ratings
We also performed complementary, post hoc regression analyses using restricted ranges of confidence ratings, including the range from "unsure" (0) to "certainly correct" (100; indexing participants' certainty that they had made a correct response) and, in separate analyses, the range from "certainly wrong" (-100) to "unsure" (indexing participants' certainty that they had made an error). Results are detailed in the Supplementary Material.
In summary, the results were broadly consistent with those of the main analyses. Pre-response amplitudes were positively associated with confidence across the range of "unsure" to "certainly correct" for the CPP in conventional (i.e., non CSD-transformed) ERPs, but not for CSD-transformed ERPs. Frontal component amplitudes were positively associated with confidence across the range of "unsure" to "certainly correct". For confidence ratings across the range of "certainly wrong" to "unsure", a negative-going association was found; however, the Bayes factor of BF10 = 1.38 indicated only weak evidence in favour of the alternative hypothesis. Pe amplitudes were associated with confidence across the range of "certainly wrong" to "unsure" when using both pre-stimulus and pre-response baselines. However, associations with confidence across the range of "unsure" to "certainly correct" were only found for the Pe when using pre-response baselines.

Figure caption (beginning truncated): ERPs for trials with correct and erroneous responses. B, E) ERPs for higher/lower confidence ratings in trials with correct responses. C, F) ERPs for higher/lower confidence ratings in trials with errors. In all plots the grey shaded area denotes the 200-350 ms time window used to measure the Pe component. The shaded magenta area denotes the pre-response baseline time window. Asterisks denote statistically significant differences between correct responses and errors or significant associations between confidence ratings and ERP component amplitudes (** denotes p < .01 and *** denotes p < .001).

4. Discussion
To characterise the electrophysiological correlates of confidence we presented participants with a challenging perceptual decision task. We varied stimulus discriminability (i.e., target contrast) over a wide range, which produced marked variation in self-reported levels of confidence. We identified ERP correlates of confidence both during decision formation and after a decision had been made. By analysing conventional (non CSD-transformed) ERPs, we found that the amplitude of the response-locked CPP component positively correlated with decision confidence.
Subsequent analyses using CSD-transformed data, however, did not find evidence for this association, and instead provided moderate evidence in favour of the null hypothesis (BF10 = 0.22). Analyses of activity at electrode FCz revealed that associations observed in the non CSD-transformed data may instead be attributable to a frontal ERP component that influenced measures at parietal electrodes via volume conduction.
We also tested for associations between confidence and amplitudes of the post-decisional Pe component using a pre-stimulus baseline, and ran the same analyses using the conventional pre-response ERP baseline. Importantly, effects on ERPs corrected using a pre-response baseline were likely to reflect signals of a pre-response origin rather than a true modulation of the Pe component. Indeed, when we used a pre-response baseline, there was a strong negative association between confidence and Pe amplitudes for both trials with correct responses and errors. However, when using a more appropriate pre-stimulus baseline, we found this association for trials with errors, but not for trials with correct responses. In the latter case, the Bayes factor provided moderate evidence in favour of the null hypothesis (BF10 = 0.22).
Our findings, which are not subject to the same methodological issues as previous work, encourage a re-evaluation of existing evidence that links ERPs and decision confidence. They suggest that certainty in having made a correct decision is indexed by fronto-central activity during the evidence accumulation stage, whereas certainty in having committed an error is indexed by the amplitude of the post-decisional Pe component. By extension, it appears that confidence does not correlate with any single ERP component in a consistent direction across the rating spectrum ranging from 'certainly wrong' to 'unsure' to 'certainly correct'. Instead, confidence judgments may be jointly informed by processes occurring over distinct pre- and post-decisional time windows, with one's degree of confidence in favour of a correct decision being computed during decision formation, and error detection occurring after a decision has been made (as proposed by Rausch et al., 2020). Importantly, our findings are not fully compatible with existing theoretical accounts that use ERP findings to claim preferential support for decisional locus and post-decisional locus models of confidence (e.g., Philiastides et al., 2014; Boldt & Yeung, 2015; Desender et al., 2021b).

4.1. Neural Correlates of Confidence During Decision Formation
Our analyses of conventional (i.e., non CSD-transformed) ERPs revealed that CPP amplitudes (measured at centro-parietal channels) were positively associated with confidence ratings for both trials with correct responses and errors. This pattern, seen in our response-locked ERPs, aligns with existing studies that have measured response-locked CPP amplitudes and model-derived (rather than self-reported) confidence ratings. By analysing CPP amplitudes time-locked to the behavioural response, we verified that the association between CPP amplitudes and self-reported confidence was not simply due to differences in the timing of the CPP component across confidence rating conditions (also see section 1.1 above). Our findings demonstrate the utility of including both stimulus- and response-locked ERP measures that provide complementary information about an ERP component.
Based on the observed accumulation-to-threshold morphology of the CPP, researchers have interpreted larger CPP amplitudes as reflecting a greater degree of accumulated evidence in favour of the chosen option in trials with higher confidence ratings (e.g., Philiastides et al., 2014; Gherman & Philiastides, 2015; von Lautz et al., 2019). This has been taken as support for the 'balance-of-evidence hypothesis' as specified in some decisional locus models (e.g., Vickers, 1979; Vickers & Packer, 1982; Ratcliff & Starns, 2009), which specifies that confidence indexes differences in the positions of racing accumulators in discrete choice tasks. However, when we applied a CSD-transform to our data we no longer found associations between CPP amplitudes and confidence, with the Bayes factor for analyses of trials with correct responses (BF10 = 0.22) indicating moderate evidence in favour of the null hypothesis. Instead, we found that the CPP-confidence associations identified using non CSD-transformed ERPs may have been due to a temporally overlapping frontal component whose amplitude positively correlated with confidence. Importantly, this frontal component

Here, we note that our findings are broadly congruent with decisional-locus models (which were developed using behavioural data and do not specify ERP correlates of evidence accumulation). The fact that confidence for correct responses, but not errors, was correlated with frontal component amplitudes is also consistent with decisional-locus models. If participants thought they were committing an error at the time of the decision in our difficult perceptual discrimination task, they would presumably have changed their decision. Therefore, it is reasonable that error detection (indexed by the Pe) would occur only after the response had been made.
Although frontal component amplitudes positively correlated with confidence ratings, it is unclear whether this reflects processes that are specifically associated with confidence computations. For example, Kelly and O'Connell (2013) surprising that ERP components linked to motor action execution might also correlate with confidence. To better understand the relationships between fronto-central activity, response speed and confidence, it may be useful to investigate graded variations in confidence that are not closely correlated with RT (e.g., using similar designs to Bang & Fleming, 2018; Fleming et al., 2018) or use model-based approaches that explicitly account for differences in confidence across fast and slow RTs (e.g., Rausch et al., 2020).
We also caution that we may not have had sufficient statistical power to detect associations between confidence and frontal component amplitudes in trials with errors, as there were smaller numbers of these trials compared to correct responses.
Our post-hoc analyses (that included both correct and error trials) identified positive-going associations for confidence ratings within the range of "unsure" to "certainly correct", but weak evidence (BF10 = 1.38) for a negative-going association across the range of "certainly wrong" to "unsure". This suggests that frontal component amplitudes may actually scale with certainty in having made a correct response or an error, rather than confidence per se (for a distinction between these concepts see Pouget et al., 2016). Future work should investigate the relationship between frontal component amplitudes and confidence ratings indicating that an error had occurred, to better characterise any possible links between this ERP component, certainty, and error detection.
We additionally note that, in the CSD-ERPs, there were apparent differences between higher/lower confidence ratings prior to the CPP measurement window previous work that identified this time window as suitable for this purpose (Steinemann et al., 2018; Kelly et al., 2021; Feuerriegel et al., 2021). Please also note that, although CPP build-up rates are often important to consider in decision-making research (e.g., O'Connell et al., 2012; Kelly et al., 2021), we did not measure CPP slopes as they were not relevant to claims about the extent of evidence accumulation as specified in decisional-locus models.

4.2. Post-Decisional Correlates of Confidence
We systematically tested for associations between Pe amplitudes and confidence using pre-stimulus and pre-response baselines in separate analyses. We analysed non CSD-transformed ERPs to be consistent with prior work on the Pe component and decision confidence. We found that, when using a pre-stimulus baseline, Pe amplitudes (measured at centro-parietal electrodes) inversely scaled with confidence in trials with decision errors, but not in trials with correct responses. In the latter case, the Bayes factor (BF10 = 0.22) indicated moderate evidence in favour of the null hypothesis. However, when we used a pre-response baseline, we replicated previous reports of more positive-going Pe amplitudes in trials with lower confidence ratings, for both correct responses and errors (Boldt & Yeung, 2015). This difference in patterns of results arises because amplitudes at the same centro-parietal electrodes were already positively correlated with confidence during the pre-response period (indexed by effects on CPP component amplitudes).
These findings demonstrate that ERP differences which occur before the response can be mistakenly interpreted as amplitude differences of post-response ERP components (such as the Pe) when pre-response baselines are used. The reason for this is that a pre-response baseline correction will nullify existing differences in the respective baseline time window and, if these differences are systematically related to the conditions of interest, artificially propagate them into subsequent time windows.
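The propagation mechanism described above can be made concrete with a small numerical sketch in Python. The waveforms below are invented for illustration and are not data from the study. They assume two conditions that differ only before the response (a larger CPP-like amplitude for higher confidence) and are identical in the post-response Pe window.

```python
from statistics import mean

# Hypothetical pre-stimulus-baselined waveforms (microvolts) at these
# response-locked time points (ms). The conditions differ only pre-response.
times = [-100, -50, 0, 100, 200, 300]
high_conf = [4.0, 5.0, 6.0, 3.0, 3.0, 3.0]
low_conf = [2.0, 3.0, 4.0, 3.0, 3.0, 3.0]


def window_mean(wave, start_ms, end_ms):
    """Mean amplitude over samples whose time lies in [start_ms, end_ms]."""
    return mean([a for a, t in zip(wave, times) if start_ms <= t <= end_ms])


def baseline_correct(wave, start_ms, end_ms):
    """Subtract the mean amplitude of the baseline window from every sample."""
    base = window_mean(wave, start_ms, end_ms)
    return [a - base for a in wave]


# Pe window difference (low minus high) before any pre-response correction.
diff_prestim = window_mean(low_conf, 200, 300) - window_mean(high_conf, 200, 300)

# Apply a -100 to 0 ms pre-response baseline to each condition.
high_bc = baseline_correct(high_conf, -100, 0)
low_bc = baseline_correct(low_conf, -100, 0)
diff_preresp = (sum(low_bc[4:]) - sum(high_bc[4:])) / 2

if __name__ == "__main__":
    print("Pe difference, pre-stimulus baseline:", diff_prestim)
    print("Pe difference, pre-response baseline:", diff_preresp)
```

With these toy values the Pe window shows no condition difference under the pre-stimulus baseline, but a 2 μV difference (more positive for lower confidence) appears once the pre-response baseline is subtracted, because the 2 μV pre-response difference is removed from the baseline window and reappears with opposite sign afterwards.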
This suggests that associations between confidence and Pe amplitudes (in correct response trials) reported in Boldt and Yeung (2015; see also Desender et al., 2019b) may reflect differences in pre-response CPP amplitudes across confidence ratings, rather than true differences in Pe amplitudes. However, our results are broadly congruent with the pattern of Pe amplitudes visible when using a pre-stimulus baseline in Boldt and Yeung (2015), whereby Pe amplitudes increased with higher certainty in having made an error, but not with higher certainty in having made a correct response (their Figure 3B, see also Desender et al., 2019b).
These findings run contrary to a recently proposed model that attempts to unify error detection and decision confidence into a single framework (Desender et al., 2021b). According to this model, two-choice decisions are initially made according to a double-bounded evidence accumulation process. Following the decision, the 'reference frame' of an ensuing metacognitive decision is proposed to shift to a single-bounded accumulation process that reflects one's degree of evidence that a decision error has been committed. In other words, the decision is framed similarly to a single- inverse relationship with decision confidence ratings (see their Figure 1C). By framing post-decisional evidence accumulation in this way, the model fits error detection and confidence judgments (ranging from 'certainly incorrect' to 'unsure' to 'certainly correct') into the same underlying framework.
Contrary to the assumptions of the model, we did not find evidence supporting the notion that decision confidence shows a simple monotonic relationship with Pe amplitudes across the full confidence rating spectrum. Rather, it appears that Pe amplitudes scale with one's degree of certainty that they had made an error, specifically in trials where an error had been committed. Importantly, we did not observe evidence of covariation between Pe amplitudes and confidence ratings for trials with correct responses. This pattern more closely resembles a hypothetical evidence accumulation associated with error detection (e.g., Murphy et al., 2015) rather than decision confidence more generally. This in turn suggests that decision confidence and error detection do not neatly fit into the single framework proposed by Desender et al. (2021b).
Our findings hint at dissociable sources of information being used to compute confidence for correct responses and for errors. However, we caution that it is unclear whether these effects on ERP components reflect computations that are critical to our sense of confidence, or changes to other decision processes that co-vary with confidence ratings. For example, errors typically constitute a rare and surprising event when performance is well above chance (Wessel & Aron, 2017). When errors are detected, this triggers a cascade of processes that onset rapidly after the error, for example those associated with the orienting response (reviewed in Wessel, 2018).
Consequently, it is unclear whether Pe amplitudes in our study (and other paradigms with similar properties) reflect different proportions of detected errors (and associated surprise-related responses) across confidence rating conditions. Formal models of error detection and confidence (e.g., Desender et al., 2021a) may be useful for identifying patterns that are more specifically related to decision confidence.
We additionally note that the baseline-related issue described above is not particular to Boldt and Yeung (2015) and is present in the work of many others who have investigated post-decisional ERP correlates of error detection and confidence

4.3. Study Limitations
Our findings should be interpreted with the following caveats in mind. Firstly, our experiment design is different from previous work in that, in 50% of trials, an S2 target (which was informative regarding the correct response in the trial) appeared after responding to the S1 target. Although the appearance of this stimulus was not predictable, it is possible that participants anticipated the onset of the S2 stimulus, which could help them make more accurate confidence ratings in trials where they were unsure of their decision (i.e., had lower confidence). Based on this idea, it could be argued that the frontal component identified in our study reflects the focusing of attention in preparation for the S2 stimulus rather than confidence. We do not believe this is the case because a similar frontal component was observed in Kelly and O'Connell (2013), and they did not present informative S2 stimuli. However, we recommend that future work attempts to identify this frontal component in situations where there is no anticipation of upcoming task-relevant information (or even task performance feedback).
We also note that, based on analyses of our own data, we cannot be certain that the same patterns of results will be found in re-analyses of existing datasets. For example, the study of Boldt & Yeung (2015) did not use post-target masking, and sensory information may have been available for post-decisional evidence accumulation to a greater extent than in our experiment, or others that used post-target masks (e.g., Rausch et al., 2020; for discussion of the dynamics of post-decisional evidence accumulation see Resulaj et al., 2009; Turner et al., 2022). In addition, we used interleaved, continuously-running staircases to determine target contrast, which differs from previous work that used a single stimulus discriminability level (e.g., Boldt & Yeung, 2015) or multiple, discrete levels (e.g., Rausch et al., 2020).
Our inferences here are based on the fact that we replicated effects seen in previous work when using similar analysis methods to those studies, but found markedly different results when using other analysis methods that avoid the issues mentioned above. For example, the lack of Pe amplitude variation across confidence ratings in favour of a correct decision mirrors the apparent lack of Pe amplitude differences when using a pre-stimulus baseline in Boldt & Yeung (2015, their Figure 3B). However, we believe that our analysis approach should be applied to a range of existing datasets to assess whether our results generalise across different stimulation and task contexts, as well as different confidence rating scales.
In addition, we found that participants varied in how they used the confidence rating scale. Although mean confidence ratings positively scaled with stimulus discriminability (depicted in Figure 2B) and group-averaged confidence ratings showed similar patterns to previous work (e.g., Fleming et al., 2018; Turner et al., 2021), some participants provided a much broader range of confidence ratings than others (shown in Supplementary Figure S3). For our dataset (and many others), it is difficult to know whether inter-individual differences in confidence rating distributions reflect actual differences in internal estimates of confidence, or differences in how such internal estimates map onto the ratings given by participants (known as the criterion problem, see Peters & Lau, 2015). Because of this, we were restricted to testing for linear relationships between confidence and ERP component amplitudes (following the analysis approach of Boldt & Yeung, 2015). Consequently, we may have missed more complex, non-linear relationships between ERP amplitudes and confidence ratings across the scale (ranging from "certainly wrong" to "unsure" to "certainly correct") that may be associated with certainty rather than confidence (for a distinction between these concepts see Pouget et al., 2016). Future work seeking to identify fine-grained non-linear relationships between confidence and neural measures could employ strategies that promote a standardised use of the entire confidence rating scale, although this may require extensive training prior to the experiment.
We also note that the frontal component identified in our study (which had an onset of ~250 ms prior to the response) appeared to overlap in time with the later error-related negativity (ERN) component (Falkenstein et Gregorio et al., 2018), and has been investigated as a possible neural correlate of confidence (e.g., Boldt & Yeung, 2015; Rausch et al., 2020). Although we measured the frontal component over a time window earlier than that used to measure the ERN (e.g., -40 to 60 ms relative to the response in Boldt & Yeung, 2015), we could not accurately measure the ERN itself due to the overlap. Further work is needed to clearly define the extent of covariance between these two components, in order to ascertain whether they reflect similar processes during the time-course of decision formation. If they do reflect distinct sources of neuronal activity, then measuring the ERN using a pre-response baseline window that overlaps with the frontal component (e.g., as in

There are also two factors to consider when comparing our non CSD-transformed and CSD-transformed results. The first is that CSD-transformation attenuates sources of neural activity that are broadly-distributed across the scalp (Kayser & Tenke, 2006). The CPP is reliably found in CSD-transformed data and shows accumulation-to-bound trajectories that are characteristic of this ERP component (Kelly & O'Connell, 2013; Steinemann et al., 2018; Feuerriegel et al., 2021a; Kelly et al., 2021). However, we cannot rule out the possibility that there may have been more broadly-distributed sources of ERPs that covary with confidence and were attenuated by CSD transformation. Whether these (if they exist) can be classified as the CPP component, however, is unclear.
In any case, the topographies of associations between confidence and ERP amplitudes during the pre-response CPP time window in Supplementary Figure S6 show that the frontal component identified in our study is very likely to bias measures in non CSD-transformed data, and caution should be taken to dissociate any overlapping effects.
The second factor is that CSD-transformed ERP measures tend to be more variable compared to non CSD-transformed ERPs (Vidal et al., 2003). This may have prevented us from identifying associations between CPP amplitudes and confidence in CSD-transformed data. However, we believe this is unlikely, as beta coefficients were tightly clustered around zero (Supplementary Figure S4), and the Bayes factor of BF10 = 0.22 indicated moderate evidence for the null hypothesis, rather than showing values around 1 that do not provide substantial support for the null or alternative hypotheses, as would be expected if only the variability had increased. However, we note that this Bayes factor does not indicate overwhelming evidence for the null, and analyses of existing datasets will be useful to see if this null result can be replicated.
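As a brief worked illustration of why a Bayes factor of this size is read as moderate rather than overwhelming evidence for the null, the reported value can be inverted to express the evidence in favour of the null directly. The evidence bands used here are a commonly used convention and are an assumption of this sketch, not something specified in the text.

```latex
\[
  \mathrm{BF}_{01} \;=\; \frac{1}{\mathrm{BF}_{10}} \;=\; \frac{1}{0.22} \;\approx\; 4.5,
  \qquad
  3 < \mathrm{BF}_{01} < 10 \;\;\text{(conventionally labelled ``moderate'' evidence for the null)}.
\]
```

In other words, the data are roughly 4.5 times more likely under the null than under the alternative, which sits in the lower half of the moderate band and well short of the values above 10 that are usually described as strong evidence.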

4.4. Conclusion
We probed the neural correlates of decision confidence using a difficult perceptual discrimination task. By analysing conventional (non CSD-transformed) ERPs we confirmed that pre-response CPP amplitudes are correlated with confidence. However, analyses of CSD-transformed ERPs revealed that these effects at centro-parietal channels might be due to the influence of a frontal component whose amplitude was also correlated with confidence. This frontal effect appeared to influence measures of the CPP at centro-parietal channels via volume conduction. By systematically analysing the post-decisional Pe component using pre-stimulus and pre-response baselines, we also determined that the amplitude of the Pe inversely scaled with confidence, but we only observed this association in trials with erroneous decisions. Our findings highlight the possibility that previously reported relationships between Pe component amplitudes and the full spectrum of confidence across correct and error trials were (at least partly) due to methodological issues related to the use of pre-response baselines. Taken together, our findings suggest that certainty in having made a correct decision is indexed by fronto-central activity during decision formation, and certainty in having made an error is indexed by the amplitude of the post-decisional Pe component. These processes, which occur over distinct time windows, may jointly inform confidence judgments in perceptual decision tasks.