ABSTRACT
Intermittent Theta Burst Stimulation (TBS) applied to the left dorsolateral prefrontal cortex suppressed reward-related signaling in the anterior midcingulate cortex (aMCC), resulting in a change in goal-directed behavior. Continuous TBS had no effect. While these results are inconsistent with reported TBS effects on motor cortex, the present findings offer normative insights into the magnitude and time course of TBS-induced changes in aMCC excitability during goal-directed behavior.
Theta-burst stimulation (TBS) protocols have recently emerged as having a fast and robust faciliatory (intermittent TBS: iTBS) and inhibitory (continuous TBS: cTBS) action on motor cortex excitability1. As a result, several groups have begun exploiting its potential in the study of prefrontal function in both typical2 and atypical3 populations. While TBS effects on motor cortical excitability can be assessed relatively directly by measuring motor‐evoked potentials, the evaluation of TBS when applied to prefrontal regions requires different approaches, such as PET, EEG, and fMRI. However, combined TBS neuroimaging studies designed to investigate the after-effects of TBS beyond the motor cortex have largely produced conflicting results4,5. Collectively, the action of TBS protocols on neurocognitive functioning remain unclear. Here, we used robot-assisted TBS, in combination with scalp electrophysiological recordings, to examine whether TBS protocols applied to the left dorsal lateral prefrontal cortex (DLPFC) can differentially modulate the activity of the anterior midcingulate cortex (aMCC) during goal-directed behavior.
The function of aMCC is hotly debated, but an influential theory holds that aMCC utilizes reward prediction error signals (RPEs) for the purpose of reinforcing adaptive behaviors. In humans, this process is revealed by a component of the event-related brain potential called the reward positivity6 (Figure 1C). Converging evidence indicates that the reward positivity is generated in aMCC and indexes an RPE signal6. In parallel, 10-Hz repetitive transcranial magnetic stimulation (rTMS) applied to the left DLPFC has been shown to enhance dopamine release7, neuronal activity, and cerebral blood flow8 in the aMCC, as well as increase the amplitude of the reward positivity9. Leveraging these two streams of evidence, we first asked whether iTBS and cTBS applied to the left DLPFC could facilitate or disrupt aMCC activity, as evaluated by the reward positivity. If true, we then asked whether a TBS-induced change in aMCC activity would cause a change in goal-directed behaviour (e.g. win-stay and lose-shift actions). Given the importance of the aMCC to normal and pathological function, it would be a great theoretical and therapeutic advantage if one could specifically drive or inhibit aMCC activity with TBS.
We recorded the electroencephalogram from 19 right-handed participants (9 females, aged 18–28 [M = 22.2, ±2.8]) freely navigating a “virtual” T-maze to find monetary rewards (reward: 5 cents, no-reward: 0 cents). Beginning the experiment, a robotic arm positioned the TMS coil over the left DLPFC (electrode location F3) (Fig. 1A), and continuously tracked this position to ensure precise pulse delivery (< 1 mm). Each participant received iTBS or cTBS on separate sessions (counterbalanced), with each session consisting of four blocks of 100 trials. Following the first block, 600 pulses of either iTBS or cTBS were delivered at 80% resting motor threshold. Subjects then complete three post-TBS blocks (duration 15-20 minutes).
An analysis of the reward positivity revealed a significant quadratic interaction between Block and TBS, F1, 18 = 6.7, p < .01, η2 = 0.21. Specifically, the delivery of iTBS following Blk1−Baseline (M = −6.63 μV, ±.8) strongly reduced the reward positivity at Blk2−postTBS (M = −4.80 μV, ±.7; t(18) −2.78, p<.01 Cohen’s d = .63), Blk3−postTBS (M = −4.66±.7 μV; t(18) −2.7, p<.05 Cohen’s d = .47), and Blk4−postTBS (M = −5.0 μV, ±.7; t(18) −2.2, p<.05 Cohen’s d = .50). No differences were observed across cTBS blocks, and we have since replicated these results. To note, the reward positivity at baseline was comparable to other non-TBS studies (see SOM), and the amplitudes of other prominent ERP components (N2, P2 and P3) were not significantly impacted by TBS, confirming that the iTBS effect was isolated to the reward positivity.
Consistent with previous work, participants adopted a win-stay (M = 70%, +4, t(18) 5.04, p<.001) and lose-shift strategy (M = 62%, +2, t(18) −3.66, p<.005), and this strategy was maintained across blocks and did not differ between TBS protocols (Figure 2B). Further, a significant quadratic contrast was observed for reaction time for the iTBS session, F1, 18 = 6.5, p < .05, η2 = 0.27, with a nonsignificant linear contrast F1, 18 =0.73, p = .40, η2 = 0.04,. Conversely, a significant linear contrast was observed for the cTBS session, F1, 18 = 7.01, p < .01, η2 = 0.28, with a nonsignificant quadratic contrast F1, 18 = .67, p = .42, η2 = 0.04. These results suggest that reaction time constantly decreased following cTBS, but for the iTBS session, reaction time became slower and more variable across behavioral strategies following the second block. To note, aMCC lesions typically results in longer and more variable reaction times6, suggestive that iTBS induced a “virtual” lesion. Finally, a multiple regression analysis indicated that a decrease in reward positivity amplitude (relative to baseline) strongly predicted a decrease in selecting the alternative maze alley following feedback across Blk3−postTBS and Blk4−postTBS (see Figure 2C). This pattern of results was not observed in the cTBS session and no other brain – behavior relationships were observed.
The prominent feature of TBS over other conventional rTMS protocols is its ability to produce inhibitory (cTBS) and excitatory (iTBS) effects on motor cortex physiology using a shorter application time and lower stimulation intensity1. Here, we report the first evidence that reward positivity amplitude – which is said to reflect phasic modulation of aMCC activity by RPEs – was attenuated following iTBS, but not cTBS, in a simple decision-making task. Although surprising, a closer examination of the literature reveals the after-effects of TBS on prefrontal cortex function are often in conflict with motor cortex studies4,5. Nevertheless, such decreases in reward positivity amplitude most likely reflect either a net decrease in dopamine release to aMCC 7, or disrupted aMCC response to RPE signaling9. While the precise mechanisms of such effects are unknown, the iTBS effect on aMCC-dopamine interface is consistent with several sources of evidence. PET imaging studies indicate that left DLPFC stimulation can modulate dopamine release in aMCC 7; and the reward positivity, but not of other ERP components, is predictably altered by pharmaceutical and neuropsychological insults of the dopamine system 6,10. Alternatively, animal studies show that iTBS, but not cTBS, can induce a profound effect on cortical inhibition, which could itself cause dysfunction of cortical networks 11 (e.g., frontocingulate system) and possibly perturb reward processing in aMCC. Together, these interpretations may provide a common framework for relating TBS effects on aMCC-related electrophysiological signals in humans and nonhuman animals, and warrant future investigations.
Understanding the after-effects of TBS is critical if this technology is to be effectively implemented in cognitive neuroscience and psychiatric research. For instance, current theories of aMCC function — derived primarily from correlative techniques such EEG and fMRI — span a plethora of functions that include conflict processing, reinforcement learning, and error detection12. Because TMS offers a powerful tool for investigating causal brain-behavior relations2, understanding its modulatory role may help reconcile or dispute theories of aMCC function. Here, we show that decreasing reward positivity amplitude consequently reduced participants choice to select the alternative alley following feedback, regardless of its valence. One interpretation is that iTBS disrupted the aMCC valuation of the policy itself (switch alley locations following feedback), and by doing so, decreased the pursuit of such effortful behaviors. This interpretation is more in line with aMCC’s role in the valuation and selection of extended, goal-directed behavior6, rather than simple stimulus-response associations that characterize response-conflict and trial-and-error learning accounts of aMCC function12. In parallel, a substantial body of evidence implicates aMCC hypoactivity in depression patients10, and it has been argued that modulating cingulate activity via DLPFC might mediate the therapeutic effects of TMS13. If the goal of TMS in depression is to increase aMCC excitability, then applying iTBS may actually compound the impairment. However, in psychiatric conditions afflicted by aMCC hyperactivity (e.g. obsessive compulsive disorder10, and drug-cue hypersensitivity observed in addiction 9), then iTBS may prove to be beneficial.
In conclusion, instead of a general excitatory or inhibitory effect of TBS, it is reasonable to assume that TBS modulates prefrontal systems in a more complex manner than previously thought5. More importantly, our studies so far have now revealed both an excitatory (10 Hz rTMS)9 and suppressive (iTBS) action on aMCC activity, and thereby open an exciting new era of investigative possibilities in the understanding of aMCC function and treatment of aMCC dysfunction.
Methods
Participants
Twenty young adults (9 females, aged 18–28 [M = 22.3, SD = 2.7]) participated in the present study. All participants were right-handed (Laterality index = 70, SD = 7.5) as defined by the Edinburgh Handedness Inventory 14. All participants provided written informed consent and were screened for TMS permissibility, current use of Central Nervous System drugs (antidepressants, anxiolytics, antipsychotics, etc.) and recent / present neurologic symptoms using standardized forms. Participants were compensated for their participation at the rate of $15 per-hour, totaling approximately $80 USD for two experimental sessions spanning a 2 week interval. Participants were paid in full at the end of the final session and in the event that a subject wished to discontinue their participation, they were paid to the nearest half-hour. All experimental sessions took place at the Center for Molecular and Behavior Neuroscience, Rutgers-Newark. All participants had normal or corrected-to-normal vision, and were naive about the purpose of the study. The study was approved by the local research ethics committee at Rutgers University, USA, and was conducted in accordance with the ethical standards prescribed in the 1964 Declaration of Helsinki.
Procedure
Virtual T-Maze task
The virtual T-maze is a reward-based choice task that elicits robust reward positivities9,15. In brief, at the start of each trial, participants were presented with an image of the base arm of a T-maze showing the length of the arm and two alleys projecting to the left and to the right from its far end (Figure 1C). Participants were instructed to choose one of the two arms by pressing either a left or right button. Then, they were shown a view of the alley that they selected, followed by an image of either an apple or an orange appearing against the far wall of that alley. Participants were told that presentation of one type of fruit indicated that the alley they selected contained 5 cents (reward feedback) and that the presentation of the other fruit indicated that the alley they selected was empty (no-reward feedback). Participants were informed that, at the end of the experiment, they would be rewarded all the money they found and that they were to respond in a way that maximized the total amount of money earned. Unbeknownst to them, on each trial, the type of feedback stimulus was selected at random (50% probability for each feedback type). The experiment consisted of four blocks of 100 trials each separated by rest periods (Figure 1A). Following the first block of trials, subjects received 600 pulses of TBS (see methods below). Following their last pulse (< 30 seconds), subjects continued with the three post-TBS blocks of trials.
At the end of the session, participants were asked to fill out the Alcohol, Smoking and Substance Involvement Screening Test (ASSIST),16 and the Substance Use Risk Profile Scale (SURPS)17. Participants who scored above 36 on the Global Continuum of Substance Risk (GCR) were excluded from further analysis. The data of one participant was excluded because of a history of substance misuse. Following the questionnaires, subjects were debriefed about the purpose of the study and were compensated for their participation.
T-maze Behavioral Analysis
The percentage of choice (e.g. win-stay and lose-shift) and the reaction time were calculated for behavioral analyses. All trials with reaction time faster than 100ms or slower than 3000ms were excluded from the analyses. Specifically, if subjects chose the same or different alley on the trial N after receiving reward (i.e. apple) on the previous trial, the trial N is defined as a Win-Stay or Win-Shift trial. Similarly, if subjects chose the same or different alley on the trial N after receiving no-reward (i.e. orange) on the previous trial, the trial N is defined as a Lose-Stay or Lose-Shift trial. The percentage of Win-Stay or Win-Lose choice were calculated based on the number of Win-Stay or Win-Lose trials divided by the total of Win-Stay and Win-Shift trials. Similarly, the percentage of Lose-Shift choice were calculated based on the number of Lose-Shift trials divided by the total of Lose-Shift and Lose-Stay trials. The averaged reaction time on these four trial types (i.e. Win-Stay, Win-Shift, Lose-Stay, and Lose-Shift) were also calculated.
EEG Data Acquisition and Analysis
Continuous EEG was recorded from 27 scalp electrodes (Fp1, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FCz, FC2, FC6, T7, C3, Cz, C4, T8, CP5, CPz, CP6, P3, Pz, P4, PO7, POz, PO8, Oz), placed according to the 10/20 system. Signals were acquired using Ag–AgCl ring electrodes mounted in a nylon electrode cap with a conductive gel (Falk Minow Services, Herrsching, Germany). Signals were amplified by low-noise electrode differential amplifiers with a frequency response of DC 0.017–67.5 Hz (90-dB octave roll-off) and digitized at a rate of 1000 samples per second. Digitized signals were recorded to disk using Brain Vision Recorder software (Brain Products GmbH, Munich, Germany). Electrode impedances were maintained below 10 kΩ. Two electrodes were also placed on the left and right mastoids. The EEG was recorded using the average reference.
ERP Analysis
Postprocessing and data visualization were performed using Brain Vision Analyzer software (Brain Products GmbH). The digitized signals were filtered using a fourth-order digital Butterworth filter with a bandpass of .10–20 Hz. An 1-sec epoch of data extending from 200 msec before to 800 msec after the onset of each feedback stimulus was extracted from the continuous data file for analysis. Ocular artifacts were corrected using the eye movement correction algorithm described by Gratton, Coles, and Donchin (1983). The EEG data were rereferenced to linked mastoids electrodes. The data were baseline-corrected by subtracting from each sample the mean voltage associated with that electrode during the 200-msec interval preceding stimulus onset. Muscular and other artifacts were removed using a ±150-μV level threshold and a ±35-μV step threshold as rejection criteria. ERPs were then created for each electrode and participant by averaging the single-trial EEG according to feedback type (Reward, No-reward) and Block (Block 1 [pre-TBS baseline], Block 2 [post-TBS 1], Block 3 [post-TBS 2], Block 4 [post-TBS 3]).
Reward Positivity
Reward positivity amplitude is typically assessed as the size of the difference in the ERPs elicited to positive and negative feedback stimuli of equal expectancy 9,15. Thus, to isolate the reward positivity from other overlapping ERP components, the reward positivity was evaluated for each electrode channel and participant as a difference wave by subtracting the Reward feedback ERPs from the corresponding No-reward feedback ERPs. As before9, reward positivity amplitude was determined at FCz by identifying the maximum absolute amplitude of the difference wave within a 200-to 400-ms window following feedback onset for block (Blk1−Baseline, Blk2−postTBS, Blk3−postTBS, Blk4−postTBS).
Other ERP components
A consideration of TBS modulators’ effects is also valuable in interpreting additional ERP components in the post-feedback waveform, particularly the P200 (Table S1), N200 (Table S2), and P300 (Table S3). The P200 was measured using a local peak detection method by identifying the maximum positive value of the ERP within a window extending from 100 ms to 250 ms following the presentation of the feedback, the time of which was taken as the peak latency of the P200. This algorithm was applied to the Reward and No-reward ERPs associated with electrode site FCz. The N200 was quantified at FCz and its amplitude was measured for each participant and feedback condition (reward, no reward) using a local peak detection algorithm, in which the most negative value within a time window starting from 200 to 400 ms after feedback presentation was taken as the peak. P300 amplitude was measured by identifying the maximum positive-going value of the Reward and No-reward ERPs recorded at electrode site Pz, within a window extending from 300 to 600 ms following the presentation of the feedback stimulus.
Statistical analysis strategy
Reward Positivity
For the purpose of statistical analysis, reward positivity was evaluated at channel FCz where it reaches is maximal. To test our predictions, we conducted a two-factor mixed-design analysis of variance (ANOVA) on reward positivity amplitude with block (Blk1−Baseline, Blk2−postTBS, Blk3−postTBS, Blk4−postTBS) and TBS session (iTBS, cTBS) as within-subject factors. Sex, age, and TBS order were controlled for throughout our analysis. The Greenhouse– Geisser correction for nonsphericity was applied where appropriate, and the Benjamini– Hochberg procedure was used to correct for multiple testing in all post hoc analyses.
Behavior
For reaction time, we conducted a four-factor mixed-design ANOVA on reaction time with block (Blk1−Baseline, Blk2−postTBS, Blk3−postTBS, Blk4−postTBS), feedback (win, lose), behavior (stay, switch), and TBS session (iTBS, cTBS) as within-subject factors. For percentage of choice, we conducted a three-factor mixed-design ANOVA with block (Blk1−Baseline, Blk2−postTBS, Blk3−postTBS, Blk4−postTBS), strategy (win-stay, loose-shift) and TBS session (iTBS, cTBS) as within-subject factors. Sex, age, and TBS order were controlled for throughout our analysis. The Greenhouse–Geisser correction for nonsphericity was applied where appropriate, and the Benjamini–Hochberg procedure was used to correct for multiple testing in all post hoc analyses.
Brain and Behavior relationships
A multiple regression analysis was performed using the computer program AMOS 18.0.1 (Arbuckle, 1995–2009) to identify unique relationships between a change in reward positivity amplitude and a change in behavior following iTBS and cTBS ([Blk1−Baseline -Blk2−postTBS], [Blk1−Baseline - Blk3−postTBS], [Blk1−Baseline - Blk4−postTBS]). Type 1 errors were statistically controlled following Benjamini and Hochberg (1995) with a corrected significance level of α = 0.05. Sex, age, and TBS order were included in the regression model as nuisance variables.
Robot-assisted TMS protocol
Robotic-assisted TMS is a novel approach in the application for TBS protocols. For each subject, a 3D model of the head surface was created in Smartmove (ANT, Enschede, Netherlands). A headmarker carrying three passive Polaris reflective markers on a plastic frame was fixed to the head and tracked by a Polaris infrared stereo-optical tracking system. The construction of the surface of the 3D head model was done by pointer acquisition of up to 600 surface points. This also determined the coordinate transformation from the robot end-effector to the coil. The robot and the tracking system were registered to a common coordinate system. To set the target for TBS, the left DLPFC target (here, electrode location F3), the coil was placed on the surface of the 3D head model, and the corresponding point on the surface of the virtual head model was selected and the orientation of the coil was calculated tangentially to cortical surface and 45° to a sagittal plane. The distance between the TMS coil and the selected DLPFC target was less than 10 mm. The virtual coil position was then transformed to robot coordinates, which defined the movement of the robot to the corresponding target position relative to the subject’s head. The position of the cranium was continuously tracked and the trajectory of the coil’s path was adapted to movements of the head by a motion compensation module in real-time, which guaranteed a precise coil to target position and allowed free head movements during the study18.
TBS protocol
The stimulator device was a MagPro X100 with the Cool-B70 figure-of-eight coil (MagVenture, Falun, Denmark). Resting Motor Threshold (rMT) measurements were performed via visual twitch in the contralateral (right) hand. The coil was positioned over the supposed motor cortex area, then the coil was moved until the location at which a reproducible abductor pollicis brevis response, elicited at the lowest stimulator intensity could be identified. rMT was defined as the lowest stimulation intensity, expressed as a percentage of max output of the Magstim equipment that reliably (3/5 times) yielded a visible muscle twitch in the hand when stimulating the hand area of the contralateral motor cortex with a single pulse, and that was set as rMT. In keeping with safety guidelines, stimulation intensity for the experiment was set for each individual participant as 80% of motor threshold (range 55% - 75%). Average stimulation intensity was 53% (range: 44%– 56%) of maximum output.
Participants received iTBS or cTBS over the predefined left DLPFC target (F3) on separate occasions, with order counterbalanced across participants. For each session, participants completed 400 trials of the virtual T-maze task. Following the first block of 100 trials, participants received 600 pulses of either iTBS or cTBS. Following Huang et al. (2005) protocol, cTBS involved an uninterrupted train of TBS for 40s, in which the inhibitory after-effects can last for up to 50min post-stimulation. Conversely, iTBS was comprised a 2s TBS train (10 TBS bursts) delivered every 10s (8s inter-burst interval between trains) over a total of 190s, resulting in an excitatory effect for up to 60min post stimulation. As noted above, stimulation intensity for each TBS session was set for each participant as 80% of motor threshold as evaluated on the first session. One person could not tolerate the TBS protocol, and the experiment was stopped. In total, 20 participants completed both TBS sessions. Following TBS, participant then completed 3 blocks of 100 trials, with each block lasting approximately 5-7 minutes. After the task was completed, the EEG cap was removed and participants were debriefed and compensated for their participation in the study.