Neural Mechanisms of Feedback Processing and Behavioral Adaptation during Neurofeedback Training

The acquisition of new skills can be facilitated by providing individuals with feedback that reflects their performance. This process creates a closed loop that utilizes feedback processing and behavioral adaptation following feedback to promote effective training. Functional magnetic resonance imaging (fMRI)-based neurofeedback is a specific instantiation of this principle, where the brain is trained directly by providing feedback of its self-regulation. Neurofeedback is unique in that it is the most direct form of brain training and it trains something we do not normally have conscious access to – our brain activity. To understand how learning with neurofeedback or other forms of feedback is accomplished, it is essential to understand how the feedback is evaluated and how behavior is adjusted following guidance from the feedback signal. In this pre-registered mega-analysis, we re-analyzed data from eight intermittent fMRI neurofeedback studies (N = 153 individuals) to investigate brain regions whose activity and connectivity are associated with feedback processing and behavioral adaptation to feedback during neurofeedback training. We converted and harmonized feedback scores across studies, and computed their linear associations with brain activity and connectivity in parametric general linear model analyses. We observed that, during feedback processing, feedback scores were positively associated with (1) activity in key regions of the reward system, as well as the dorsal attention network, default mode network, and cerebellum; and with (2) reward system-related connectivity in the salience network. During behavioral adaptation (i.e., regulation after feedback), no significant associations were observed between feedback scores and either activity or associative learning-related connectivity. 
Our results demonstrate that neurofeedback is processed in the reward system, thereby endorsing the theory that reinforcement learning shapes this form of brain training towards behavioral change. In addition, the association of large-scale networks with feedback suggests that higher-level processing, involving the continuous transition between the evaluation of external feedback and the subsequent internal evaluation of the adopted cognitive state, is also involved in this type of learning. Our findings highlight the pivotal role of performance-related feedback as a driving force during learning, a conclusion that can potentially be extended to other processes beyond neurofeedback training.


Introduction
Feedback can facilitate the acquisition of new skills by providing individuals with information about their performance. This information has to be evaluated and compared to the previous action in order to improve task performance. This process creates a closed-loop system, where feedback is constantly updated and individuals adjust their behavior accordingly. In order to maximize positive and minimize negative feedback, individuals learn by adapting their behavior, which involves processing feedback information (requiring emotional valuation and memory), and the motivation for and planning of new behavior. This learning process is known as reinforcement learning.
Reinforcement learning is usually associated with the modulation of dopaminergic activity in the reward system, a brain network comprising limbic regions (midbrain, ventral and dorsal striatum, orbitofrontal cortex, ventromedial prefrontal cortex, amygdala, thalamus), as well as the anterior and posterior cingulate, insular cortex, and inferior frontal gyrus (O'Doherty, 2004; Schultz et al., 1997; Tricomi and DePasque, 2016; Tsukamoto et al., 2006). Functional imaging studies have shown greater activity in several of these regions in response to positive feedback (Elliott et al., 2003; Marco-Pallarés et al., 2007; Nieuwenhuis et al., 2005a, 2005b). The limbic system is thought to have an important role in feedback processing due to its involvement in motivation and assigning value to information (Haber, 2011). Notably, the nucleus accumbens (NAcc) is a key integrative region for feedback processing, motivation, and learning, and is thought to modulate behavior according to a goal (Goto and Grace, 2005), being connected to the amygdala, the hippocampus (Goto and Grace, 2008), the ventral tegmental area (Camara et al., 2009; Knutson and Gibbs, 2007), as well as other limbic and cortical regions. After receiving feedback on task performance, the individual can reframe behavior or cognitive strategies to improve performance. The dorsolateral prefrontal cortex (dlPFC) plays a critical role in associative learning due to its engagement in working memory, attentional switching, and response selection (Niendam et al., 2012). Top-down attentional control is primarily associated with the posterior parietal cortex (and intraparietal sulcus) (Corbetta and Shulman, 2002; Green and McDonald, 2008). Alongside parietal regions, the anterior cingulate cortex and NAcc have been suggested to be involved in behavior adaptation (Holroyd and Coles, 2002).
Feedback is essential for learning, but when it comes to training the brain directly, we do not have conscious access to our own brain activity. Neurofeedback overcomes this shortcoming by converting brain activity into sensory feedback (e.g., a visual thermometer). Using neurofeedback, participants can learn to modulate their own brain activity voluntarily, which can lead to behavioral changes (Sitaram et al., 2017; Weiskopf et al., 2004). Neurofeedback experiments entail a distinctive form of learning, wherein the objective is to improve the self-regulation of a particular neural signal. Neurofeedback learning is thought to be driven by reinforcement learning, whereby neural states become more probable when they are associated with performance-related rewards (Lubianiker et al., 2022; Shibata et al., 2019; Sitaram et al., 2017). In this context, effective feedback processing represents a crucial factor in reinforcement learning. Several other factors make neurofeedback a worthwhile paradigm to investigate the neural mechanisms of feedback processing and behavioral adjustment. Often, neurofeedback experiments provide graded feedback rather than binary feedback, which can be studied using parametric general linear models (GLM) (Radua et al., 2018). Neurofeedback experiments also often acquire data with whole-brain coverage. Finally, neurofeedback studies using intermittent feedback, as opposed to continuous feedback, separate feedback presentation and periods of self-regulation into temporally distinct blocks, allowing feedback evaluation to be examined without interference from self-regulation (Johnson et al., 2012; Lubianiker et al., 2022). Although several cognitive theories have been proposed to elucidate the mechanisms underlying learning in neurofeedback experiments (Lubianiker et al., 2022; Sitaram et al., 2017), and some studies have focused on brain responses to feedback during neurofeedback training (Dewiputri et al., 2021; Hinterberger et al., 2005; Mathiak et al., 2015; Radua et al., 2018; Shibata et al., 2019), the relationship between feedback scores and brain changes during feedback and regulation blocks in the context of neurofeedback training remains poorly understood.
Here, our goal was to capitalize on existing neurofeedback studies to investigate the brain correlates of feedback processing and behavioral adjustment, i.e., adapting or reinforcing behavior depending on the feedback received. More specifically, we reanalyzed a dataset of 153 participants from eight fMRI-based intermittent neurofeedback studies using parametric GLMs.
Pooling data from several neurofeedback studies allows for examining generalizable neural mechanisms while minimizing study-specific effects (e.g., feedback representation, directionality of regulation, and target brain regions). We only included studies using intermittent feedback to ensure that feedback processing and self-regulation do not overlap in time.
We hypothesized that, in the context of neurofeedback training, (1) during feedback presentation blocks, performance-related feedback scores are positively associated with activity in reward-related brain regions (ventral tegmental area, NAcc, ventral striatum, anterior cingulate cortex, anterior insular cortex), and (2) with connectivity between those regions. We also hypothesized that (3) during regulation blocks, performance-related feedback scores on the previous trial are negatively associated with activity in brain regions related to neurofeedback control (dorsolateral prefrontal cortex, posterior parietal cortex, lateral occipital cortex, thalamus) and (4) with connectivity between those regions. The negative association posited in hypotheses 3 and 4 would indicate that a reduction in the level of performance-related feedback received would necessitate greater effort for the recalibration of the regulatory strategy, resulting in an increase in brain activity/connectivity.
Testing these hypotheses will contribute to our understanding of how the brain processes feedback and how behavior is adapted after feedback presentation in neurofeedback training.
The insights we gain from this study might also generalize to other feedback-based learning contexts that depend on reward processing and behavioral adaptation. Thus, our findings may inform neuroscientific theories of learning and future brain-based interventions to optimize learning.

Methodology
The hypotheses and the analyses were preregistered prior to conducting the study: osf.io/bzweg/

Data collection procedure
The present study is a mega-analysis (Costafreda, 2009), which involved systematically searching the literature for relevant studies, requesting data sharing from the authors, and analyzing raw data using a different approach than those employed in the original studies.
First, we systematically searched for articles whose data could potentially be included in our mega-analysis study using the Scopus systematic search (www.scopus.com). Inclusion criteria were (1) original research on neurofeedback, (2) studies using fMRI as the acquisition technique, (3) paradigms using intermittent (rather than continuous) neurofeedback, and (4) published as peer-reviewed scientific articles (details of the procedure can be found in Supp. Material). The search identified seven studies that could be included in our mega-analysis. Two additional studies were included that were not identified in the search but met the inclusion criteria for this analysis (Amano et al., 2016; Zweerings et al., 2019). We contacted all authors via e-mail and inquired about data sharing of raw anatomical and functional imaging data, performance-related feedback scores for each feedback block, and basic information about the experimental design (e.g., block onsets and durations). We also requested minimal descriptive demographic information on gender and age. Of the nine selected studies, we obtained consent and collected data from eight fMRI-neurofeedback studies (Amano et al., 2016; Hellrung et al., 2018; Keller et al., 2021; Krause et al., 2021; Pamplona et al., 2020; Scheinost et al., 2020; Zweerings et al., 2019, 2020). This study protocol was approved by the ethics committee of the University of Vienna (EK 00621).

Participants
Relevant demographic information about each study is described in Table 1. Irrespective of the original design, only data from healthy participants who received intermittent and veridical feedback were considered. Participants who received sham feedback (Scheinost et al., 2020) were not included in this study. Due to technical issues in the data sharing process, data from some participants could not be recovered (four from Zweerings et al., 2019, and one from Hellrung et al., 2018), resulting in a data set of 153 participants.

fMRI acquisition parameters and experimental design of the studies
A summary of the fMRI acquisition parameters is shown in Table 2. For all studies, data acquisition was performed using 3T MRI scanners, echo-planar sequences and axial slice orientation. A summary of the experimental design for each study can be found in the supplementary material (Table S1).

Harmonization of feedback values
We investigated linear associations between performance-related feedback during neurofeedback training and estimates of brain activity and connectivity in parametric analyses (i.e., over a range of feedback scores), rather than in a categorical analysis (i.e., positive versus negative feedback). Feedback scores from each dataset were harmonized across studies before computing these associations.
In the first step, we converted feedback scores into numerical values. Three studies presented a two-digit representation proportional to the previous block of self-regulation performance (Keller et al., 2021; Zweerings et al., 2019, 2020). Accordingly, no numerical transformation was conducted for these studies. For the other five studies, feedback was presented graphically in the form of thermometers (Hellrung et al., 2018; Pamplona et al., 2020), concentric discs (Amano et al., 2016; Krause et al., 2021), or a speedometer (Scheinost et al., 2020). We converted the feedback scores of all datasets to decimal numbers in the range 0-1, proportional to the level of steps compared to the maximum and minimum possible representations provided after the regulation block. Datasets from three studies (Hellrung et al., 2018; Krause et al., 2021; Zweerings et al., 2019) combined up- and down-regulation trials during neurofeedback training. To reflect feedback representation of success and failure independent of directionality, we inverted the scale for feedback presentation blocks following down-regulation (i.e., 0 and 1 represented the upper and lower feedback limits, respectively). For one study, the requested direction of feedback representation was rightward (Scheinost et al., 2020), so 0 and 1 represented the leftmost and rightmost feedback limits, respectively.
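The conversion to the 0-1 range can be illustrated with a minimal sketch. It is not the authors' code: the function name `harmonize_score`, its arguments, and the 10-step thermometer in the example are hypothetical, and clipping to the limits is an added assumption.

```python
def harmonize_score(raw, lowest, highest, down_regulation=False):
    """Map a raw feedback value onto the 0-1 range.

    `lowest` and `highest` are the minimum and maximum representations the
    display could show. For down-regulation blocks the scale is inverted,
    so the upper display limit maps to 0 and the lower limit maps to 1.
    """
    if highest == lowest:
        raise ValueError("feedback range must not be degenerate")
    score = (raw - lowest) / (highest - lowest)
    score = min(1.0, max(0.0, score))  # assumption: clip to the display limits
    return 1.0 - score if down_regulation else score


# Hypothetical thermometer with steps 0..10 showing step 7.
up_score = harmonize_score(7, 0, 10)
down_score = harmonize_score(7, 0, 10, down_regulation=True)
```

In this toy case the same display level counts as fairly successful after an up-regulation block and as fairly unsuccessful after a down-regulation block, which is the direction-independent meaning the harmonized score is intended to carry.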
For some datasets, the converted feedback values showed a highly non-uniform distribution that could affect the linear regression of the parametric analysis. In extreme cases, a high number of feedback scores near the upper or lower limits would, in practice, reduce the parametric analysis to a categorical one. Therefore, we applied the following iterative procedure to remove highly frequent range-specific values. First, for each individual, we calculated the histogram of feedback scores with h bins, in which h is the number of possible subject-specific graphical feedback steps between the minimum and maximum feedback scores received by the subject during training (Hellrung et al., 2018; Keller et al., 2021; Pamplona et al., 2020; Zweerings et al., 2019, 2020). In some studies, continuous graphical feedback steps were presented, and h was determined by subtracting for each subject the minimum feedback score from the maximum and dividing the result by 100 (Amano et al., 2016; Krause et al., 2021; Scheinost et al., 2020). We then calculated the mean and standard deviation (SD) of the counts across the bins. If the bin with the highest count of feedback scores was above the threshold of the mean value plus three SD, one score from a randomly selected feedback presentation block belonging to that bin was marked as censored. Then, the mean and SD of the counts across bins were recalculated, and the procedure was repeated until the bin with the highest count was not higher than the threshold. Finally, scores from feedback presentation blocks marked as censored were removed from the analyses. The final mean percentage (± SD) of censored feedback scores across individuals in each dataset was 30.0 ± 15.1% for Keller et al., 2021; 40.3 [...] Hellrung et al., 2018, 40 [...] for Pamplona et al., 2020; 98.6 ± 34.9 for Amano et al., 2016; 11.5 ± 0.5 for Scheinost et al., 2020; 375.5 ± 8.4 for Krause et al., 2021.
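The iterative censoring procedure can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: the function name `censor_frequent_scores` is hypothetical, bins are taken to be of equal width between the subject's minimum and maximum score, and the sample SD is used (the text does not say whether the sample or population SD was applied).

```python
import random
import statistics


def censor_frequent_scores(scores, n_bins, n_sd=3.0, seed=0):
    """Return the indices of feedback blocks to censor.

    While the fullest histogram bin exceeds mean + n_sd * SD of the bin
    counts, one randomly chosen block from that bin is marked as censored
    and the counts are recomputed.
    """
    rng = random.Random(seed)
    low, high = min(scores), max(scores)
    width = (high - low) / n_bins
    if width == 0:
        width = 1.0  # all scores identical: everything falls into bin 0

    # Assign every block index to an equal-width bin (assumption).
    bins = [[] for _ in range(n_bins)]
    for index, score in enumerate(scores):
        position = min(int((score - low) / width), n_bins - 1)
        bins[position].append(index)

    censored = []
    while True:
        counts = [len(members) for members in bins]
        threshold = statistics.mean(counts) + n_sd * statistics.stdev(counts)
        fullest = max(range(n_bins), key=counts.__getitem__)
        if counts[fullest] <= threshold:
            break
        victim = bins[fullest].pop(rng.randrange(counts[fullest]))
        censored.append(victim)
    return sorted(censored)


# Toy subject: one score per step k/16 plus 30 ceiling scores of 1.0.
toy_scores = [k / 16 for k in range(16)] + [1.0] * 30
toy_censored = censor_frequent_scores(toy_scores, n_bins=16)
```

In this toy example the ceiling bin is trimmed until it no longer stands out from the other bins. Note that with the sample SD a single bin can only exceed the mean by more than three SD when there are at least 11 bins, so the rule has no effect on coarser histograms.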

fMRI preprocessing
We applied the same preprocessing pipeline to all datasets to minimize variance across studies that could arise from preprocessing disparities. All fMRI data were preprocessed prior to statistical mapping using MATLAB (version R2022b, www.mathworks.com) and the Statistical Parametric Mapping toolbox (SPM12, Wellcome Department of Imaging Neuroscience, University College London, UK). We first slice-time corrected the functional MRI data (according to the slice acquisition order of the dataset) using the middle slice as the reference. From the resulting images, we estimated the translation and rotation parameters of head motion, and resliced the images to a created mean image using a fourth-degree B-spline interpolation. We then coregistered the anatomical image to this mean image. The coregistered anatomical image was used to generate a deformation field to normalize the resulting anatomical and functional images of all subjects to the standard Montreal Neurological Institute (MNI) stereotactic space. Finally, we performed spatial smoothing using a Gaussian kernel of 8 mm full width at half maximum (FWHM).
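For readers who reproduce the smoothing step outside SPM, the kernel width usually has to be given as a standard deviation rather than as FWHM. The short sketch below shows the standard conversion for a Gaussian; the 3 mm voxel size is a hypothetical value used only for illustration.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert the FWHM of a Gaussian to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)   # kernel SD in millimetres
sigma_voxels = sigma_mm / 3.0   # assumption: isotropic 3 mm voxels
```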

First-level analysis of the association between performance-related feedback and estimates of activity/connectivity
We performed voxel-wise mass-univariate GLM analyses using SPM12 to determine regions whose activity/connectivity are parametrically associated with feedback values during feedback presentation blocks as well as during the subsequent regulation blocks. One GLM was specified separately for each model to test for distinct hypotheses.
Prior to GLM analyses, we concatenated the runs of each individual to center feedback scores subject-wise rather than run-wise, to increase the number of blocks for parametric analysis and to extract single time series for connectivity analyses. This step was done using the SPM 'spm_fmri_concatenate' function (github.com/spm/spm12/blob/master/spm_fmri_concatenate.m), which includes run-specific regressors to remove run effects (henceforth, run-effect regressors) and corrects the high-pass filter and non-sphericity calculations.
Model 1. Model 1 investigated the association between performance-related feedback and activity during feedback blocks (Hypothesis 1). As indicated in the preregistration, for some datasets, the last feedback block of each run was not included in the regressor because this block was at the end of the training run and its delayed hemodynamic response could not be determined. For each individual, we constructed two regressors, one for feedback presentation blocks and one for regulation blocks, as boxcar functions using study-specific onsets and durations. The (unmodulated) regressor representing feedback presentation was orthogonalized with a parametrically modulated regressor representing the performance-dependent feedback score presented in each block.
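The relation between the unmodulated and the parametrically modulated regressor can be illustrated with a simplified sketch. It is not SPM's implementation: the function names, onsets, durations, and scores are hypothetical, time is expressed in scans, and convolution with the hemodynamic response function is omitted. The sketch only shows that weighting each block by its mean-centered feedback score yields a modulated regressor that is orthogonal to the plain boxcar when all blocks have the same duration.

```python
def boxcar(n_scans, onsets, duration):
    """Unmodulated regressor: 1 during each block, 0 elsewhere."""
    regressor = [0.0] * n_scans
    for onset in onsets:
        for scan in range(onset, min(onset + duration, n_scans)):
            regressor[scan] = 1.0
    return regressor


def modulated_boxcar(n_scans, onsets, duration, scores):
    """Boxcar whose height in each block is the mean-centered score."""
    mean_score = sum(scores) / len(scores)
    regressor = [0.0] * n_scans
    for onset, score in zip(onsets, scores):
        for scan in range(onset, min(onset + duration, n_scans)):
            regressor[scan] = score - mean_score
    return regressor


# Hypothetical design: four feedback blocks of 4 scans each.
onsets = [10, 30, 50, 70]
scores = [0.2, 0.9, 0.5, 0.4]
main = boxcar(80, onsets, 4)
pmod = modulated_boxcar(80, onsets, 4, scores)
dot_product = sum(m * p for m, p in zip(main, pmod))
```

Because the scores are centered over all blocks of the concatenated runs, the modulated regressor captures only block-to-block variation in feedback, while the mean response to feedback presentation is left to the unmodulated regressor.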
Model 2. Model 2 was designed to investigate brain regions whose connectivity with a reward-related region, the NAcc, is associated with the level of performance-related feedback during feedback blocks (Hypothesis 2). Similar to Model 1, the last feedback block of each run was not considered for some datasets. We used the atlas IBASPM 71 (Alemán-Gómez et al., 2006) in the WFU PickAtlas toolbox (Maldjian et al., 2003, www.nitrc.org/projects/wfu_pickatlas/) to define a binary mask of the bilateral NAcc. Next, we extracted the time-course averaged within this ROI over the concatenated runs, from which the variance explained by the six head motion parameters, run-effect regressors, and regressors of no interest was removed. We then used the regressors [...]
Model 3. Model 3 investigated the association between performance-related feedback and activity during regulation blocks (Hypothesis 3). The first regulation block of each run was not included in the model, because there was no feedback presentation block prior to the first regulation block. As announced in the preregistration, we constructed four regressors representing feedback presentation blocks, baseline blocks, the first four seconds of the regulation blocks (referred to here as "onset of regulation"), and the remaining regulation block.
We included only the beginning of regulation blocks in the model because we assumed that the modulation would be stronger in the initial phase of the new strategy, as compared to its maintenance over the block. In addition, we aimed to control for different durations of the regulation block across studies. The (unmodulated) regressor representing the onset of regulation was orthogonalized with a parametrically modulated regressor representing the performance-related feedback score presented in each previous feedback block.
Model 4. Model 4 was designed to investigate brain regions whose connectivity with the dorsolateral prefrontal cortex (dlPFC) is associated with the level of performance-related feedback during the onset of regulation blocks (Hypothesis 4). Similar to Model 3, the first regulation block in each run was not considered. We selected the left dlPFC as ROI due to its link to behavior adaptation during neurofeedback (Emmert et al., 2016; Sitaram et al., 2017). For this ROI definition, we first downloaded a meta-analytic map from Neurosynth (https://neurosynth.org/) using the term "dlpfc". Then, using the MarsBaR toolbox (marsbar.sourceforge.net), we built a 6-mm-radius spherical ROI centered on the MNI coordinate peak (-46, 38, 30) of the meta-analytic map to represent the dlPFC. Next, we extracted the time-course averaged within this ROI over the concatenated runs, controlling for the variance from covariates of no interest, i.e., the head motion, run-effect regressors, and regressors of no interest (i.e., other than the ones representing the modulated onset of regulation and baseline). We then constructed the PPI regressor following the same procedure described for Model 2 and using the regressors created for Model 3. The task regressor here was a boxcar function constructed with the blocks of modulated onset of regulation and baseline.
All parametric modulators were specified as first order (linear association). In addition to unmodulated, modulated, and run-effect regressors, the six head motion parameters estimated in the preprocessing step were added to the first-level design matrices. For six datasets (Hellrung et al., 2018; Keller et al., 2021; Krause et al., 2021; Pamplona et al., 2020; Zweerings et al., 2019, 2020), we applied a high-pass filter with a 128-s cut-off to remove the low-frequency signal. Due to the long periods between feedback presentation blocks, a high-pass filter with a cut-off of half the run duration (Nurmi et al., 2018) was applied to the remaining two datasets. We used a liberal whole-brain mask with a threshold of 0.1 and a first-order autoregressive model to remove the autocorrelation in the signal. Regressors were convolved with the canonical hemodynamic response function (HRF) of SPM12. For each individual and each model, we created contrast maps with the beta estimates of the feedback-modulated regressors representing activity in feedback blocks (Model 1), connectivity in feedback blocks (Model 2), activity in regulation onset blocks (Model 3), and connectivity in regulation onset blocks (Model 4). For Models 1 and 2, we included the following study-specific regressors to represent covariates of no interest: self-regulation training blocks (Keller et al., 2021; Zweerings et al., 2019, 2020), blocks for assessing emotional neural responses (Keller et al., 2021), for presenting a percent sign (neutral feedback) (Keller et al., 2021; Zweerings et al., 2019, 2020), for passive viewing of a picture (Keller et al., 2021; Zweerings et al., 2020), for backward counting (Hellrung et al., 2018; Zweerings et al., 2019), and for feedback related to backward counting (Hellrung et al., 2018). The same covariates were included in Models 3 and 4, as well as the remaining regulation block period for all studies. Finally, we estimated the t-value map to test for voxelwise activation differences from zero for each individual. The PPI estimation, i.e., Models 2 and 4, was not performed for the data of Scheinost et al. (2020) because of the absence of baseline blocks.

Group-level analysis of activation and connectivity estimates
To investigate group estimates of whole-brain activation and connectivity, we performed second-level random-effects analyses using individual SPM t-value maps. We used t-values instead of the more conventional contrast maps (which here correspond to beta activation maps) to make the datasets more comparable.
Second-level one-sample t-tests were performed separately for each model to test for non-zero voxel-wise group estimates. Binary regressors representing studies (i.e., with values assigned as one or zero for inputs belonging to or not belonging to a study) were included as covariates of no interest. We used an explicit mask that included the entire cerebrum and the superior part of the cerebellum. As specified in the preregistration, to generate whole-brain group-level thresholded maps, we used a voxel-level inclusion threshold of p < 0.001 and a cluster-level threshold of p < 0.05, FWE (family-wise error)-corrected for multiple comparisons (random-field theory; Worsley et al., 1996). Because many brain regions were identified for Model 1, we further applied a more stringent statistical threshold (voxel-level FWE-corrected p < 0.05) to identify regions with the strongest associations among the findings. We generated whole-brain maps for visualization using bspmview (bobspunt.com/software/bspmview/). We reported peak coordinates (multiple peaks were reported for the same clusters if the separation between them was greater than 10 mm) and the results were automatically labeled using Automated Anatomical Labelling 3 (AAL3) (Rolls et al., 2020). For visualization purposes only and due to the lack of significant results for Models 3 and 4, we also report whole-brain maps thresholded for small effect sizes for these models. These liberal maps were obtained by thresholding absolute t-values at t = 2.46 for Model 3 and t = 2.38 for Model 4, which would correspond to the threshold for a weak effect size (d = 0.20), obtained by the following formula:
$t = d\sqrt{N}$ (1)
where d is the effect size (Cohen's d), and N is the sample size (N = 152 for Model 3 and N = 142 for Model 4).
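As a check on the liberal thresholds, the sketch below assumes the standard one-sample relation between Cohen's d and the t statistic, t = d·sqrt(N), which closely reproduces the reported values. The function name `t_for_effect_size` is hypothetical.

```python
import math


def t_for_effect_size(d, n):
    """t-value corresponding to Cohen's d in a one-sample test with n subjects."""
    return d * math.sqrt(n)


t_model3 = t_for_effect_size(0.20, 152)  # close to the reported 2.46
t_model4 = t_for_effect_size(0.20, 142)  # close to the reported 2.38
```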

ROI analysis
An additional exploratory ROI analysis that was not preregistered was performed to investigate whether activation or connectivity in predefined regions was related to feedback processing or behavior adjustment after feedback. This analysis complements the whole-brain analyses because averaging the voxels within predefined ROIs might increase the signal-to-noise ratio, leading to higher statistical power. We included the following ROIs involved in (1) performance-related reward processing (Camara et al., 2009; Drueke et al., 2015; Marco-Pallarés et al., 2007; Tricomi and DePasque, 2016): caudate nucleus, putamen, thalamus, NAcc, and substantia nigra; (2) self-regulation (Sitaram et al., 2017): [...] NAcc and the dlPFC, respectively, these regions were removed from the ROI analyses for these Models. The ROIs and their center coordinates are described in Table 3. The procedure for the definition of ROI masks is detailed in the Supp. Material. Using MATLAB custom scripts, we extracted and averaged the SPM t-values, obtained in the first-level analysis, from voxels within each defined ROI. This procedure was repeated for each [...]
value ~ 1 + study (2)
in which value is the mean SPM t-value within the ROI, and study is the study to which the individual belongs. Thus, we tested whether the intercept of the linear regression was non-null (i.e., if the association of the estimate with the feedback scores is significantly different from zero) while the studies were treated as covariates of no interest. The intercept was considered significant if the estimated 95% confidence interval of the result did not cross the zero-level. These confidence intervals were adjusted for multiple comparisons using the Bonferroni method at the ROI level. Using R, forest plots were generated for visualization.

Results

Model 1: association between feedback and activation during feedback blocks
In the whole-brain analysis using thresholds set at a voxel-wise uncorrected p < 0.001 and cluster-wise FWE-corrected p < 0.05, we found a positive association between the level of performance-related feedback and activity in clusters comprising the basal ganglia (NAcc, ventral part of the caudate nucleus, anterior part of the putamen, and ventral pallidum), bilateral IFG, rostral ACC, PCC, mPFC, left angular gyrus, and bilateral cerebellum (Fig. 1A, Table S2) during the feedback blocks. For illustration purposes, we report a map showing that the strongest associations were in the bilateral NAcc, mPFC, and right cerebellum (Fig. 1B, Table 4; FWE-corrected voxel-level threshold of p < 0.05). In the ROI analysis, we also observed this positive association for the NAcc, the rostral ACC, and the mPFC (Fig. 1C).

Model 2: association between feedback and connectivity with NAcc during feedback blocks
In the whole-brain analysis, we found no associations between the level of performance-related feedback and task-related connectivity with the NAcc during the feedback blocks. For illustration purposes, we provide a map for associations at a lower threshold (Fig. S1, Table S3; uncorrected voxel-level threshold of p < 0.001). In the ROI analysis, however, we observed a positive association for the superior ACC, the bilateral anterior insula, and the substantia nigra (Fig. 2).

Model 3: association between feedback and activation during self-regulation
We found no associations between the preceding level of performance-related feedback and activity during regulation blocks in the whole-brain analysis. A whole-brain map thresholded for small effect sizes for this association is shown for visualization (Fig. S2A, Table S4). Similarly, no associations were found between the preceding level of performance-related feedback and activity during regulation blocks in the ROI analysis (Fig. S2B).

Model 4: association between feedback and connectivity with the dlPFC during self-regulation
We found no associations between the preceding level of performance-related feedback and task-related connectivity with the dlPFC during regulation blocks in the whole-brain analysis. A whole-brain map thresholded for small effect sizes for this association is shown for visualization (Fig. S3A, Table S5). Similarly, no associations were found between the preceding level of performance-related feedback and task-related connectivity with the dlPFC during regulation blocks in the ROI analysis (Fig. S3B).

Discussion
We investigated brain regions whose activity and connectivity were associated with feedback processing or behavioral adjustment during neurofeedback training. The findings are based on a mega-analysis of eight studies and a total of 153 individuals. We found positive associations between activity and feedback scores during feedback processing in the nucleus accumbens (NAcc), putamen, caudate, ventral pallidum, medial prefrontal cortex, bilateral inferior frontal gyrus, rostral anterior cingulate cortex, posterior cingulate cortex, left angular gyrus, left superior parietal lobule, and cerebellum (Fig. 1). In addition, connectivity between the NAcc and several brain regions, namely the substantia nigra, anterior insula, and superior anterior cingulate cortex, was positively associated with feedback scores during feedback processing (Fig. 2). We found no regions whose activity or connectivity with the dlPFC was significantly associated with feedback scores during behavioral adjustment. In this discussion, we will elaborate on our findings of brain regions associated with reward processing during neurofeedback training by grouping them into core functional brain networks, namely the basal ganglia, the default mode network, and the salience network, as well as other regions associated with attentional control and activity-modulating dopaminergic regions. A summary of the findings and possible feedback-related roles of each network is depicted in Figure 3.
In neurofeedback training, individuals can learn control over, e.g., activity within a predefined region by receiving contingent feedback that reflects the regulation performance. Several theories have been postulated to explain feedback-based learning during neurofeedback training. For example, it was proposed that operant conditioning (i.e., reinforcement) drives learning in neurofeedback training (Hellrung et al., 2022; Lubianiker et al., 2022; Sitaram et al., 2017). After the feedback is processed and evaluated, the individual's strategy is modulated in order to improve performance, and prediction error is thought to decrease over successful training. For example, a "neurofeedback control network" has been proposed based on contrasting self-regulation over sham feedback (Ninaus et al., 2013), on the identification of common activations during regulation blocks across fMRI-neurofeedback studies (Emmert et al., 2016), or based on the comparison between continuous and intermittent feedback (Dewiputri et al., 2021). These studies indicate the involvement of regions comprising the salience and the frontoparietal control networks, related to the control of cognitive processes modulated by bottom-up stimuli and the switch between externalized and internalized cognitive processes (Corbetta and Shulman, 2002; Dosenbach et al., 2008; Menon, 2011; Sridharan et al., 2008). All these processes represent important aspects of neurofeedback learning, and revealing their underlying neural mechanisms is paramount to a better understanding of feedback-based learning.

Model 1: association between feedback and activation during feedback blocks
We confirmed our first hypothesis, which stated that, during feedback presentation blocks, performance-related feedback scores are positively associated with activity in reward-related brain regions. More specifically, we observed that brain activity associated with feedback processing was mainly located in the bilateral NAcc, which is part of the basal ganglia and an important region of the dopaminergic reward system. The fact that the reward-related regions are associated with feedback scores provided during neurofeedback training supports the theory that neurofeedback learning is driven by reinforcement learning (Hellrung et al., 2022; Lubianiker et al., 2022; Shibata et al., 2019). We also observed this association in the medial prefrontal cortex, rostral anterior cingulate cortex, and cerebellum (Fig. 1A and 1C), as well as in other parts of the basal ganglia, the bilateral inferior frontal gyrus, posterior cingulate cortex, left angular gyrus, and left superior parietal lobule (Fig. 1A).
However, such findings on self-regulation could potentially be entangled with feedback processing, as the analyses were performed in neurofeedback experiments employing continuous feedback (Haller et al., 2013). During neurofeedback experiments with continuous feedback, the evaluation of feedback and the regulation of the target brain regions occur simultaneously. Here, we analyzed data from neurofeedback experiments that temporally segregate the feedback appraisal and regulation phases. Therefore, we suggest that basal ganglia involvement is related to feedback appraisal rather than to self-regulation during neurofeedback training.
We observed a strong association between feedback scores and activity during feedback blocks in the NAcc (part of the ventral striatum). This association is consistent with several studies showing that the NAcc is more active for positive than for negative feedback (Delgado et al., 2000; Fouragnan et al., 2018; Marco-Pallarés et al., 2007; Nieuwenhuis et al., 2005b; Yacubian et al., 2006). The NAcc is an integrative region that connects cortical and limbic regions, as well as the midbrain and regions associated with dopamine release (Garris et al., 1999). Other studies have shown that activity in the NAcc correlates with gain-related prediction error (Montague et al., 1996; Shohamy, 2011; Yacubian et al., 2006), which is crucial for associative learning (Daniel and Pollmann, 2012). The involvement of the NAcc in feedback processing during neurofeedback, and its established link to prediction error, may also indicate the applicability of the associative learning theory to neurofeedback learning. The ventral striatum has also been implicated in unconscious reward processing (Ramot et al., 2016; Sitaram et al., 2017). Therefore, it is noteworthy that neurofeedback studies using implicit feedback, i.e., a setup that operates without the conscious awareness of the participant but with indirect methods to induce intended brain activity (Watanabe et al., 2017), also elicit activation of the ventral striatum (Shibata et al., 2019).
Although the highest association between activity and feedback scores during neurofeedback training was found in the NAcc, we also found this association in other regions of the basal ganglia (Fig. 1A). We observed this positive association in the ventral part of the caudate nucleus and the anterior part of the putamen. These regions comprise the dorsal striatum, which has been reported to be more activated in response to positive than negative feedback (Drueke et al., 2015; Duijvenvoorde et al., 2008; Wächter et al., 2009). It has been suggested that the dorsal striatum receives reward-related information from the ventral striatum and uses this information to predict and maximize positive outcomes (Tricomi and DePasque, 2016; Yin et al., 2005). The dorsal striatum is part of the "associative loop", which links rewards or punishments with previous actions (Tricomi and DePasque, 2016). Furthermore, activation in the caudate due to performance-related intrinsic feedback is similar to that produced by extrinsic (e.g., monetary) rewards or punishments (Tricomi and DePasque, 2016). Tricomi et al. (2006) proposed that the caudate facilitates feedback-based learning by identifying and assigning value to correct and incorrect responses (Tricomi et al., 2006). The ventral caudate has also been associated with short-term reward and has been correlated with the magnitude of behavioral adaptation (Haruno et al., 2004), guiding behavioral learning through prediction error. Furthermore, the stimulus-action-reward association was found to be located in the anterior part of the putamen (Haruno and Kawato, 2006). These fMRI findings regarding a more precise anatomical definition of the ventral striatum in reward processing are consistent with our results. We also observed that the anterior part of the ventral pallidum (Fig. 1A), another major basal ganglia region, was positively associated with feedback scores during neurofeedback training. Studies indicate that the ventral pallidum codes several aspects of reward, such as information about prediction error, valence, and surprise (Schultz, 2016; Tachibana and Hikosaka, 2012). Furthermore, we observed a positive association between feedback scores and activity in the rostral anterior cingulate cortex. The present findings are consistent with previous research indicating that the rostral anterior cingulate cortex is influenced by a positive discrepancy between actual and anticipated feedback, and that this region plays a pivotal role in evaluating salient feedback and shaping optimal learning (Amiez et al., 2012).
We observed a positive association between activity and feedback scores during neurofeedback training in key regions of the default mode network (Andrews-Hanna et al., 2014): mPFC, posterior cingulate cortex, and left angular gyrus (Fig. 1A). This large-scale brain system, which is commonly associated with deactivation in response to stimuli demanding externally-directed attention (Fox et al., 2005), conversely shows activation when internally-directed attention is induced (Gusnard et al., 2001; Harrison et al., 2008; McDonald et al., 2017; Pamplona et al., 2020; Spreng, 2012). Previous findings indicate that the presentation of feedback during neurofeedback training engages some regions of the default mode network (Shibata et al., 2019) or even the whole network (Radua et al., 2018). Interestingly, our results show that the more positive the performance-related feedback, the greater the activity in the default mode network. This finding indicates that positive rewards resulting from successful self-regulation during training may elicit higher levels of internally-focused attention, potentially related to the evaluative aspects of the strategy employed. Alternatively, or possibly in conjunction with this, negative feedback may prompt an active, goal-directed control process that alters the self-regulation strategy, drawing on higher-order cognitive processes and suppressing the default mode network. The process of revisiting, evaluating, and selecting self-regulatory strategies essentially uses internally-focused attention (Kam and Handy, 2013). An internal evaluation of the self-regulatory strategy is presumably necessary to maximize performance in subsequent trials and thus optimize reward. The association between default mode network activity and feedback scores may also support the reinforcement learning theory in neurofeedback: higher rewards elicit a higher level of internal appraisal, which in turn may be beneficial to drive learning over training. Methodologically, our findings also suggest that caution should be taken when designing neurofeedback experiments that target voluntary deactivation of the default mode network (or the other regions resulting from Model 1) using continuous feedback. While down-regulation of this brain system is intended, positive feedback generated by successful regulation could, at least partially, elicit positive responses and thus operate in the opposite direction to that intended. Therefore, this feedback presentation interference favors the use of intermittent over continuous feedback, depending on the target region.
Our results from this analysis also show that activity in cortical regions, such as the bilateral inferior frontal gyrus, was positively associated with feedback scores during neurofeedback training (Fig. 1A). These findings are consistent with previous studies on the neural mechanisms of reward (Duijvenvoorde et al., 2008; Radua et al., 2018) and strategy execution during intermittent neurofeedback (Dewiputri et al., 2021). Involvement of the inferior frontal cortex may reflect the semantic conceptualization of abstract (nonverbal) information (Hoffman et al., 2015) or the selection of competing executed strategies retrieved from semantic memory during feedback presentation (Thompson-Schill et al., 1997). Reward-related activation in the visual cortex has been previously reported and interpreted as enhanced visual processing of stimuli (Drueke et al., 2015). We also observed that the cerebellum was positively associated with feedback scores. While relatively little is known about cerebellar activations and feedback processing, activation in the right cerebellum has previously been associated with positive (compared to negative) feedback (Marco-Pallarés et al., 2007). A recent animal study also showed that the cerebellum sends excitatory projections to reward-encoding regions (Carta et al., 2019). Finally, we found that feedback scores were positively associated with activity in the left superior parietal lobule (Fig. 1A). This region has been implicated in top-down attentional control processes (Corbetta and Shulman, 2002; Green and McDonald, 2008), goal-directed behavior (Bressler and Menon, 2010), and feedback processing (Crone et al., 2008). Altogether, we interpret this region to be responsible for controlling attentional effort in proportion to feedback values.

Model 2: association between feedback and connectivity with the NAcc during feedback blocks
Model 1 confirmed the NAcc as the key region for feedback processing during neurofeedback training. Model 2 investigated regions whose connectivity with the NAcc was modulated by the feedback scores. This was done using PPI, which controls for functional connectivity that is independent of the task (e.g., in resting-state designs), thus revealing only functional connectivity that is modulated by feedback processing (O'Reilly et al., 2012). We confirmed our second hypothesis, which stated that, during feedback presentation blocks, performance-related feedback scores are positively associated with connectivity in reward-related brain regions. More specifically, significant associations were identified between feedback scores and connectivity with the NAcc in the substantia nigra, anterior insula, and superior anterior cingulate cortex (Fig. 2). It is known that the substantia nigra is connected to the NAcc via the dopaminergic pathway, with the substantia nigra projecting to various sites of the basal ganglia (Camara et al., 2009; Rabey and Hefti, 1990; Schultz, 2016). In fact, reward has been associated with activation in the substantia nigra (Cohen et al., 2012; Yasuda et al., 2012).
The anterior insula and the superior anterior cingulate cortex (Fig. 2) constitute the salience network, which is responsible for switching cognitive processes between internally- and externally-oriented thoughts (Menon, 2011; Sridharan et al., 2008). This finding is consistent with a recent study reporting the salience network underlying feedback processing during intermittent neurofeedback (Dewiputri et al., 2021). In the context of feedback-related reward, the recruitment of the salience network may be necessary to mediate the external evaluation of reward followed by the internal weighting and judgment of mental strategies used to regulate neurofeedback. The functional coupling between the NAcc and the salience network indicates that the switching of mental processes undertaken by the salience network may be triggered by basal ganglia activity.
The positive association between connectivity and feedback scores indicates that this triggering of switching external and internal processes is stronger for positive feedback.

Models 3 and 4: null results for associations between feedback values and activation/dlPFC connectivity during self-regulation
Models 3 and 4 were designed to investigate activation/connectivity during the regulation blocks associated with the preceding feedback score. Our aim was to investigate how feedback can modulate behavioral adjustment, specifically strategy adaptation during self-regulation after receiving feedback. We hypothesized that there would be negative associations with feedback scores, because negative feedback places greater demands on adaptation. We hypothesized that this association would be primarily located in the dlPFC. The dlPFC has been suggested to be involved in cognitive control during neurofeedback learning (Ninaus et al., 2013; Sitaram et al., 2017). However, we could not confirm our third and fourth hypotheses, which stated that, during regulation blocks, performance-related feedback scores would be negatively associated with activity, or with dlPFC connectivity, in brain regions related to neurofeedback control. Therefore, although activation in the dlPFC during self-regulation has been found to be ubiquitous across studies (Emmert et al., 2016), its activation or functional coupling with other regions may not be modulated by feedback scores. Another possibility is that the process we hypothesized to start at the onset of the regulation blocks may have already been initiated when the related feedback was presented. This interpretation would explain why we did not observe time-locked responses in the subsequent regulation blocks. In this case, behavioral adaptation could have already been occurring during the feedback blocks, and the results in Models 1 and 2 would reflect not only feedback processing but also behavioral adaptation.

Limitations of the study
Firstly, the feedback-modulated connectivity analyses were based on one seed each for Models 2 and 4. Due to our pre-registered methodology and computationally demanding analysis, we restricted our analysis to a priori defined seeds. Future work may investigate other potentially representative regions.
Secondly, it should be noted that feedback appraisal is inherently subjective (Tricomi and DePasque, 2016). For example, highly motivated or initially successful individuals might perceive a half-full feedback representation as more negative than those who are poorly motivated or initially unsuccessful. Similarly, in bidirectional neurofeedback studies, regulation in one direction may be easier than in the other. Therefore, an individual's perception of success may depend on the difficulty of regulating in each direction. To account for such differences, future studies may include data on subject-specific motivation levels in their analysis.
Thirdly, only one out of eight studies included sham feedback; therefore, we could not compare sham and veridical feedback and could not extend our results to sham feedback. However, we argue that the neural mechanisms for processing sham feedback should lead to similar results, as long as the feedback presentation is not perceived as sham feedback (Ninaus et al., 2013).
Fourthly, feedback success and failure were evaluated using a one-dimensional scale and cannot be dissociated in our study. From a methodological perspective, it would not be reasonable to separate failure and success for some datasets of our study (Radua et al., 2018), since the feedback representations varied parametrically on the same scale. However, it is conceivable that some of the anticipated regions (e.g., in ROI analyses) were not identified as associated with feedback due to the modeling strategy of defining success and failure within a single one-dimensional scale. We nevertheless argue that studying the neural mechanisms of graded, as opposed to binary, feedback is more informative and ecologically valid (Radua et al., 2018).

Conclusion
Our mega-analysis using data from eight fMRI-neurofeedback experiments revealed that feedback processing is primarily associated with activity in subcortical regions (NAcc, putamen, caudate, and pallidum) and the cerebellum. Such findings indicate that neurofeedback is processed in core regions of the reward system, suggesting that inherent motivational reward/punishment aspects shape neurofeedback learning. The evoked neural responses are similar to those elicited by extrinsic or primary rewards, representing the dopaminergic release for rewarding feedback. We also observed that activity, and connectivity with the NAcc, were positively associated with feedback scores in several large-scale networks. This involvement represents the internally-directed attention to the strategies adopted and their appraisal for the subsequent trials, the switch between attention to internal appraisal and external reward modulated by the basal ganglia, and the top-down attention for associating feedback and self-regulatory performance. As a corollary to the positive association between activity and feedback scores, our findings have implications for neurofeedback paradigms that use continuous feedback to train down-regulation of the brain regions reported here. Specifically, feedback reward associated with successful down-regulation would elicit a positive neural response to the down-regulation effort; researchers may therefore need to consider a possible interference between down-regulation performance and feedback-related activation.
Our findings contribute to the understanding of how self-regulatory learning is promoted by neurofeedback paradigms. We provide evidence that this learning process occurs through reinforcement learning: positive feedback elicits activity in the reward system, which in turn promotes performance improvement over training. These findings may extend to other feedback-dependent learning paradigms and subsequent behavioral adaptation (Tricomi and DePasque, 2016). In addition, we show that large-scale networks, which allocate and modulate attentional resources to both externally-presented feedback and evaluative processing of self-regulatory strategies, are involved in the learning process of neurofeedback training. Such a finding may be more specific to neurofeedback paradigms due to their introspective nature of evaluating internal self-regulatory strategies. Overall, our findings highlight the importance of feedback as a driving force for learning, from grades on a school exam to complex experimental paradigms such as neurofeedback.
created for Model 1 to create the PPI regressor (O'Reilly et al., 2012), which calculates the element-by-element product of the NAcc time-course and the task regressor. The task regressor was a boxcar function constructed with the blocks of modulated feedback presentation.
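The element-by-element product described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `boxcar` and `ppi_regressor`, the block onsets and durations (in scans), and the block heights standing in for feedback scores are all hypothetical, and any hemodynamic convolution or deconvolution step is omitted.

```python
def boxcar(n_scans, blocks):
    """Build a block-wise task regressor of length n_scans.

    blocks: list of (onset, duration, height) tuples in scan units.
    The height is a hypothetical stand-in for a feedback score that
    modulates the block; scans outside any block stay at 0.
    """
    regressor = [0.0] * n_scans
    for onset, duration, height in blocks:
        for t in range(onset, min(onset + duration, n_scans)):
            regressor[t] = height
    return regressor


def ppi_regressor(seed_timecourse, task_regressor):
    """Element-by-element product of seed time course and task regressor."""
    if len(seed_timecourse) != len(task_regressor):
        raise ValueError("seed and task regressors must have equal length")
    return [s * k for s, k in zip(seed_timecourse, task_regressor)]


# Toy example: six scans, two feedback blocks with made-up heights.
seed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]            # hypothetical seed signal
task = boxcar(6, [(1, 2, 0.5), (4, 1, -1.0)])    # hypothetical blocks
ppi = ppi_regressor(seed, task)
```

Because the product is zero outside the feedback blocks, the resulting regressor can only capture seed-related variance that coincides with, and scales with, the task term.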
: subgenual, rostral and superior anterior cingulate cortex (ACC), anterior and posterior insula, dorsolateral prefrontal cortex (dlPFC), posterior parietal cortex, and lateral occipital cortex; and (3) internally-oriented attention (Andrews-Hanna et al., 2014): medial prefrontal cortex (mPFC) and posterior cingulate cortex. ROI analyses were performed separately for Models 1-4. Because Models 2 and 4 considered connectivity with the individual, study, and model. Due to incomplete coverage, one individual was removed from the Keller et al. dataset for the analysis of the subgenual ACC; two individuals were removed from the Hellrung et al. dataset for the analysis of the posterior cingulate gyrus; and all individuals from the Hellrung et al. dataset were removed for the analysis of the posterior parietal cortex. Using R (version 4.3.2 (2022-10-31), PBC, Boston, MA, USA; rstudio.com), we ran linear regressions for each ROI and model using the following function: lm(value ~ 1 + study, data = data, contrasts = list(study = contr.sum))
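In a one-way model such as `lm(value ~ 1 + study)` with sum-to-zero contrasts (`contr.sum`), the intercept estimates the unweighted mean of the per-study means, so each study contributes equally regardless of its sample size. The sketch below reproduces only that point estimate in plain Python, under the assumption of a single `study` factor; it is not the authors' code, the function name and toy data are hypothetical, and it computes no standard errors or significance tests.

```python
from collections import defaultdict


def sum_coded_intercept(values, studies):
    """Return (intercept, study_means) for value ~ 1 + study with sum coding.

    With sum-to-zero coding of a single factor, the fitted value for each
    study is its own mean, and the intercept is the unweighted average of
    those study means.
    """
    grouped = defaultdict(list)
    for value, study in zip(values, studies):
        grouped[study].append(value)
    study_means = {s: sum(v) / len(v) for s, v in grouped.items()}
    intercept = sum(study_means.values()) / len(study_means)
    return intercept, study_means


# Toy data: study "a" has two individuals, study "b" has one.
values = [1.0, 3.0, 10.0]
studies = ["a", "a", "b"]
intercept, study_means = sum_coded_intercept(values, studies)
pooled_mean = sum(values) / len(values)
```

In the toy data the intercept differs from the pooled mean across individuals, which shows how sum coding keeps larger studies from dominating the group-level estimate.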

Figure 1. Model 1. (A) Whole-brain map (N = 153) showing brain areas whose activation is positively associated with feedback scores during feedback blocks. The statistical threshold was set at voxel-wise uncorrected p < 0.001 and cluster-wise FDR-corrected p < 0.05. The z-coordinates for axial slices are displayed at the bottom left corner of each slice. (B) For illustration purposes, we show that the strongest associations were in the nucleus accumbens, medial prefrontal cortex, and cerebellum (FWE-corrected voxel-level threshold of p < 0.05). (C) Results from region-of-interest (ROI) analysis showing regions whose activation is positively associated with feedback values during feedback blocks. Asterisks and red color represent significant positive associations, and gray color represents no significant association. CI = confidence interval, ACC = anterior cingulate cortex.

Figure 2. Results from region-of-interest (ROI) analysis of Model 2 showing regions whose connectivity with NAcc is positively associated with feedback values during feedback blocks. Asterisks and red color represent significant positive associations, and gray color represents no significant association. CI = confidence interval, ACC = anterior cingulate cortex.

Figure 3. Summary of the findings and suggested interpretation in the context of neurofeedback training and feedback. The colors represent key functions associated with the corresponding brain regions and presumably employed during neurofeedback training. These functional groupings are based on established associations from previous studies, described in sections 4.1 and 4.2. The shapes represent whether the finding is related to either activity or connectivity estimations in the analysis. mPFC = medial prefrontal cortex, rACC = rostral anterior cingulate cortex, sACC = superior anterior cingulate cortex, IFG = inferior frontal gyrus, Ins = insular cortex, SN = substantia nigra, PCC = posterior cingulate cortex, Ang = angular gyrus, SPL = superior parietal lobule, Cer = cerebellum, L/R = left/right.

Table 1. Demographic details of each included study (studies ordered by sample size after study-specific exclusion criteria).

Table 2. Summary of the fMRI acquisition parameters for each included study.

Table 3. Regions of interest and their center coordinates.

Table 4. Significant clusters in whole-brain analyses for Model 1 (Fig. 1B, strong association between activity and feedback scores). Regions were labeled based on the meta-analytic associations of Neurosynth.