Rewarded extinction increases amygdalar connectivity and stabilizes long-term memory traces in the vmPFC

Neurobiological evidence in rodents indicates that threat extinction incorporates reward neurocircuitry. Consequently, incorporating reward associations with an extinction memory may be an effective strategy to persistently attenuate threat responses. Moreover, while there is considerable research on the short-term effects of extinction strategies in humans, the long-term effects of extinction are rarely considered. In a within-subjects fMRI study, we compared counterconditioning (a form of rewarded extinction) to standard extinction, at recent (24 hours) and remote (∼1 month) retrieval tests. Relative to standard extinction, counterconditioning diminished 24-hour relapse of arousal and threat expectancy, and reduced activity in brain regions associated with the appraisal and expression of threat (e.g., thalamus, insula, periaqueductal gray). The retrieval of reward-associated extinction memory was accompanied by functional connectivity between the amygdala and the ventral striatum, whereas the retrieval of standard-extinction memories was associated with connectivity between the amygdala and ventromedial prefrontal cortex (vmPFC). One month later, the retrieval of both standard and rewarded extinction memories was associated with amygdala-vmPFC connectivity. However, only rewarded extinction created a stable memory trace in the vmPFC, identified through overlapping multivariate patterns of fMRI activity from extinction to 24-hour and 1-month retrieval. These findings provide new evidence that reward may generate a more stable and enduring memory trace of attenuated threat in humans.

Significance Statement
Prevalent treatments for pathological fear and anxiety are based on the principles of Pavlovian extinction. Unfortunately, extinction forms weak memories that only temporarily inhibit the retrieval of threat associations. Thus, to increase the translational relevance of extinction research, it is critical to investigate whether extinction can be augmented to form a more enduring memory, especially after long intervals. Here, we used a multi-day fMRI paradigm in humans to compare the short- and long-term neurobehavioral effects of aversive-to-appetitive counterconditioning, a form of augmented extinction. Our results provide novel evidence that including an appetitive stimulus during extinction can reduce short-term threat relapse and stabilize the memory trace of extinction in the vmPFC, for at least one month after learning.


Introduction
While learning about threats is adaptive, persistent and misattributed fearful responses are characteristic of anxiety disorders. Exposure therapy, based on the principles of Pavlovian extinction, is a widely used treatment for anxiety-related disorders (Abramowitz et al., 2019).

bound to the spatiotemporal context in which extinction memories were formed (Bouton, 2002). Several strategies that augment standard extinction have shown success in promoting relatively short-term (~24 hours) retention of extinction memories in humans (Craske et al., 2018; Dunsmoor et al., 2015). However, the long-term success (> 1 week) of extinction protocols in humans is very rarely evaluated, which limits the clinical translational relevance of extinction research, as symptoms frequently return some time after treatment (Vervliet et al., 2013). Here, we compared the neurobehavioral effects of standard extinction and augmented extinction in healthy adults at recent (24 hours) and remote (~1 month) intervals in the same individuals. Whereas standard extinction involved simply omitting an expected aversive electrical shock, augmented extinction involved replacing the shock with a positive outcome, a paradigm known as aversive-to-appetitive counterconditioning (Dickinson & Pearce, 1977; Keller et al., 2020a).

In counterconditioning (CC), behavior is modified through a new association with a stimulus of the opposite valence. Research on counterconditioning dates to the earliest studies of conditioning in humans (Jones, 1924), and forms the basis for popular treatments for anxiety disorders such as systematic desensitization (Wolpe, 1954, 1968, 1995).
Contemporary

One possibility is that reduced relapse following CC is mediated by augmented activity in networks involved in the formation of extinction memories, specifically activity within and between the ventromedial prefrontal cortex (vmPFC) and amygdala (Giustino &

could further engage reward-related regions of the mesostriatal dopamine system shown to be

One month later: Participants returned for their third and final session ~1 month later. This session followed the same format as Day 2, and included four functional imaging runs: a final threat renewal test, a recognition memory test for the rest of the CS exemplars from Day 1, and two runs of a perceptual category localizer. Before participants entered the scanner, shock and SCR electrodes were re-attached and the shock was re-calibrated.

Psychophysiology Analysis. SCRs were calculated using prior criteria (Keller & Dunsmoor, 2020). SCRs were considered valid to the CS trial if the trough-to-peak deflection of electrodermal activity occurred between 0.5 and 6 seconds following CS onset and was greater than 0.2 µS. Trials that did not meet these criteria were scored as zero. SCRs were scored by an automated analysis script implemented in MATLAB (Green et al., 2014), and were later visually inspected by research assistants blind to the experimental conditions. SCR data were square-root transformed prior to statistical analysis to normalize the distributions. Following recommendations from the field of human threat conditioning (Lonsdorf et al., 2017), participants were not excluded from the analysis based on any response criteria for SCRs. Two-AFC shock expectancy was coded as 1 = expect to receive a shock, 0 = do not expect.

Imaging parameters. Brain images were recorded on a 3T Siemens Vida with 64-channel head coil at the University of Texas at Austin Biomedical Imaging Center.
Functional task and localizer data were acquired using T2*-weighted EPI sequences (TR = 1000 ms, TE = 86 ms, FOV = 86 x 86 mm, 2.5 mm isotropic voxels), with slices oriented parallel to the hippocampal long axis and positioned to provide whole-brain coverage. High-resolution T1-weighted anatomical images were obtained using 3D MPRAGE sequences (TR = 2400 ms, TE = 1000 ms, FOV = 208 x 300 mm, 0.8 mm isotropic voxels) before the EPIs in each session, to aid in co-registration and normalization. Diffusion-weighted images were also acquired but were not examined.

surface reconstruction with FreeSurfer 6.0.1 recon-all (Dale et al., 1999). The skull-stripped T1w images were registered using FreeSurfer's mri_robust_template to generate a single unbiased T1w-reference map per participant for spatial normalization (Reuter et al., 2010). Spatial normalization to MNI space was performed via nonlinear registration (ANTs Registration), using skull-stripped versions of both the T1w reference volume and MNI152NLin2009cAsym template (Fonov et al., 2009).

Functional data from each BOLD run were corrected for field distortion based on a B0-nonuniformity map estimated via AFNI 3dQwarp (Cox & Hyde, 1997), then co-registered to the corresponding T1w reference using boundary-based registration (Greve & Fischl, 2009) with 6 degrees of freedom (FreeSurfer bbregister). Head-motion parameters, including transformation matrices and six rotation and translation parameters, were estimated for each BOLD run prior to any spatiotemporal filtering (FSL mcflirt). Framewise displacement and DVARS were calculated for each functional run using Nipype (Power et al., 2014), and frames exceeding 0.3 mm FD or 1.5 standardized DVARS were annotated as motion outliers.
In addition, six principal components of a combined CSF and white matter signal accounting for the most variance were extracted using aCompCor (Behzadi et al., 2007) following highpass filtering (128 s cutoff) with discrete cosine filters. The BOLD runs were then slice-time corrected (AFNI 3dTshift; Cox, 1996), and resampled onto original native space using custom methodology of fMRIPrep that applies all correction transformations in a single interpolation step. Additional details on the fMRIPrep pipeline may be found in the online documentation: https://fmriprep.org/en/1.5.9/.

Following preprocessing in fMRIPrep, we masked the preprocessed BOLD data for each participant with the intersection of the average T1-reference brain mask with the average BOLD reference mask. In final preparation of the MRI data for analysis with FSL (FMRIB's Software Library, www.fmrib.ox.ac.uk/fsl, Version 6.00), the following pre-statistical processing was

threat ROIs, a sphere was drawn around peak coordinates reported in these studies, with a radius of 10 mm. Parameter estimates for ROIs were extracted using FSL's featquery tool and input to RStudio for further analyses with paired t-tests. The vmPFC, an a priori ROI for functional connectivity and RSA analyses, was defined functionally from the CS- > CS+s contrast during acquisition. A 10 mm sphere was drawn around the coordinates of a significant cluster (z > 3.1, cluster corrected p < 0.05) corresponding to the medial frontal gyrus (MNI coordinates, -14, 50, -1) (Table 1).
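
The spherical ROI construction described above can be illustrated with a minimal sketch. It is not the authors' pipeline (which used FSL tools); the function name `sphere_mask`, the grid shape, and the affine below are hypothetical values chosen only so the example runs, and the only details taken from the text are the 10 mm radius and the vmPFC peak coordinates.

```python
import numpy as np

def sphere_mask(shape, affine, center_mm, radius_mm=10.0):
    """Boolean mask of voxels whose centers lie within radius_mm of center_mm.

    shape     : 3D grid dimensions of the image
    affine    : 4x4 voxel-to-world (mm) matrix
    center_mm : world-space (e.g., MNI) coordinates of the sphere center
    """
    # Enumerate every voxel index and convert to homogeneous coordinates.
    ijk = np.indices(shape).reshape(3, -1)
    ijk1 = np.vstack([ijk, np.ones((1, ijk.shape[1]))])
    # Map voxel indices to world coordinates in millimetres.
    xyz = (affine @ ijk1)[:3]
    # Euclidean distance of each voxel center from the sphere center.
    dist = np.linalg.norm(xyz - np.asarray(center_mm, dtype=float)[:, None], axis=0)
    return (dist <= radius_mm).reshape(shape)

# Hypothetical 2.5 mm isotropic grid; this affine is illustrative only.
affine = np.array([[2.5, 0.0, 0.0, -90.0],
                   [0.0, 2.5, 0.0, -126.0],
                   [0.0, 0.0, 2.5, -72.0],
                   [0.0, 0.0, 0.0, 1.0]])
mask = sphere_mask((73, 87, 73), affine, center_mm=(-14, 50, -1), radius_mm=10.0)
```

In practice the affine would come from the image header, and the resulting mask would be saved in the same space as the statistical maps before extracting parameter estimates.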

Task-Based Functional Connectivity. We used generalized psychophysiological interaction (gPPI) to examine functional connectivity at the 24-hour and ~1-month renewal tests, in two a priori pathways (basolateral amygdala (BLA) → nucleus accumbens (NAc) and vmPFC → central amygdala (CeM)). The timeseries for the seeds (BLA and vmPFC) were extracted using FSL's meants command and input as regressors in the model. Interactions between the physiological variable (i.e., the seed's respective timeseries) and each of the psychological variables (i.e.,

Mean z-scores of connectivity from target ROIs were extracted using Featquery for each regressor of interest (CS+CC, CS+EXT and CS-), at both the 24-hour and ~1-month renewal tests. These connectivity means were then input into RStudio for further statistical analyses.

Representational Similarity Analysis (RSA). In order to facilitate RSA, LS-S style beta series were computed for each scanner run (Mumford et al., 2012, 2014). Within each scanner run, trial-specific beta images were iteratively computed in FEAT using a design matrix which modeled a single trial of interest and all other trials as regressors of no interest based on trial type (e.g., separate CS+CC, CS+EXT, CS- regressors of no interest). FEAT settings were identical to those in our univariate analysis, with the exception that no spatial smoothing was applied in order to respect the boundaries of our a priori ROIs in multivariate analyses. In addition to these trial-specific beta estimates, we also generated conventional estimates of average activity for each CS type during each phase (i.e., all CS+CC in one regressor of interest), again without spatial smoothing. For the renewal sessions, separate regressors were used to model the early vs. late trials.

RSA was accomplished using custom Python code.
The goal of our analyses was to iteratively compare multivoxel patterns of activity in the vmPFC, between memory encoding in the extinction/CC session, recent renewal, and remote renewal. In order to reduce noise across the multivoxel pattern prior to estimating pattern similarity, each LS-S beta image was weighted (multiplied) by the overall univariate activity estimate of the corresponding CS type and time point

Greenhouse-Geisser (GG) correction was applied when sphericity was violated. Main effects or interactions were followed by post-hoc two-tailed paired t-tests.
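
The weighting and pattern-comparison steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' custom code: the similarity metric (Fisher z-transformed Pearson correlation averaged over all cross-phase trial pairs) is an assumption, the function names are hypothetical, the data are simulated, and a single univariate map is reused for every phase for brevity, whereas the text specifies one estimate per CS type and time point.

```python
import numpy as np

def weight_patterns(trial_betas, univariate_map):
    """Multiply each trial pattern (trials x voxels) by a univariate estimate (voxels,)."""
    return trial_betas * univariate_map[np.newaxis, :]

def pattern_similarity(patterns_a, patterns_b):
    """Mean Fisher-z Pearson correlation over all trial pairs across two phases."""
    # Center and scale each trial pattern so dot products equal Pearson r.
    a = patterns_a - patterns_a.mean(axis=1, keepdims=True)
    b = patterns_b - patterns_b.mean(axis=1, keepdims=True)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # Clip to keep arctanh finite if any pair is perfectly correlated.
    r = np.clip(a @ b.T, -0.999999, 0.999999)
    return float(np.arctanh(r).mean())

# Simulated data: 8 trials per phase, 200 voxels in a hypothetical ROI.
rng = np.random.default_rng(0)
n_trials, n_vox = 8, 200
signal = rng.normal(size=n_vox)                       # shared "memory trace"
univariate = np.abs(rng.normal(size=n_vox)) + 0.1     # stand-in univariate estimate
encoding = weight_patterns(signal + 0.5 * rng.normal(size=(n_trials, n_vox)), univariate)
retrieval = weight_patterns(signal + 0.5 * rng.normal(size=(n_trials, n_vox)), univariate)
unrelated = weight_patterns(rng.normal(size=(n_trials, n_vox)), univariate)

sim_related = pattern_similarity(encoding, retrieval)    # patterns sharing a trace
sim_unrelated = pattern_similarity(encoding, unrelated)  # no shared trace
```

Under this sketch, a CS type whose extinction-phase pattern is reinstated at renewal would yield a higher similarity value than one whose pattern is not, which is the quantity compared across CS types in the repeated-measures analyses.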

Behavioral Results
Threat acquisition and extinction. Analyses of mean shock expectancy and SCRs during the acquisition and extinction phases on Day 1 were separated into the first and second half of trials (i.e., early/late) (Fig. 1B, 1C). Shock expectancy was significantly higher for both CS+s in comparison to CS- during both early and late trials of acquisition (all p < 0.001) (Fig. 1B). A repeated-measures ANOVA of SCR during acquisition revealed a main effect of CS type (F(1.50, 36.04) = 11.462, pGG < 0.001, η²G = 0.025) and a main effect of early/late trials (F(1,24) = 21.194, p < 0.001, η²G = 0.053), but no interaction (pGG = 0.071). Post-hoc paired t-tests showed successful acquisition towards both CS+s, as SCRs were significantly higher for CS+CC vs CS- and CS+EXT vs CS- (all p < 0.01) (Fig. 1C). Importantly, shock expectancy and SCR did not differ between CS+s during acquisition. Thus, participants successfully acquired equivalent expectancy responses and conditioned arousal towards both CS+s.
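
For readers who want to see how trial-wise SCR values of the kind analyzed above could be derived, here is a minimal sketch of the scoring rule given in the Psychophysiology Analysis paragraph. It is not the authors' MATLAB script: the function name `score_scr` and its arguments are hypothetical, the latency window and amplitude value are taken from the text, and the amplitude criterion is assumed to be a minimum threshold.

```python
import numpy as np

def score_scr(onset_latency_s, amplitude_us,
              min_latency=0.5, max_latency=6.0, min_amp=0.2):
    """Return square-root transformed SCRs; invalid trials are scored as zero.

    onset_latency_s : latency of the trough-to-peak deflection after CS onset (s)
    amplitude_us    : trough-to-peak amplitude (microsiemens)
    """
    latency = np.asarray(onset_latency_s, dtype=float)
    amp = np.asarray(amplitude_us, dtype=float)
    # Assumption: a response is valid if it starts inside the latency window
    # and its amplitude exceeds the minimum threshold.
    valid = (latency >= min_latency) & (latency <= max_latency) & (amp > min_amp)
    raw = np.where(valid, amp, 0.0)
    # Square-root transform to reduce skew before statistical analysis.
    return np.sqrt(raw)

# Three hypothetical trials: valid, too late, too small.
scores = score_scr([1.0, 7.0, 2.0], [0.49, 0.9, 0.1])
```

Scoring invalid trials as zero, rather than dropping them, keeps every trial in the condition means, which matches the decision not to exclude participants on SCR response criteria.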

24-hour threat renewal test.
Mean shock expectancy during early 24-hour renewal (first 4 trials) was higher for both CS+s in comparison to CS− (all p < 0.01), and there were no differences between CS+s (p = 0.387) (Fig. 1B).

Notably, given the limited sensitivity of a 2AFC, we did not expect to see differences between CS+s within sessions. As such, we assessed expectancy during the end of extinction, and compared it to expectancy during the renewal phase. A repeated measures ANOVA with factors of CS Type and phase (last half of extinction and early renewal) revealed a main effect of CS Type (F(1.73,41.56) = 11.26, pGG < 0.001, η²G = 0.115), a trend toward a significant main effect of phase (F(1,24) = 4.04, p = 0.056, η²G = 0.010), but no significant CS Type by phase interaction (pGG = 0.072). Post-hoc paired t-tests revealed that expectancy for CS+EXT significantly increased (t(24) = 3.894, p < 0.001, 95% CI [0.075, 0.245]) from late extinction to early renewal, but did not differ between phases for either CS+CC (p = 0.720) or CS- stimuli (p = 0.818). Thus, at 24 hours, participants exhibited renewal of shock expectancy towards items from the category that underwent standard extinction, but not towards items from the control category, nor the CC category.
Repeated-measures ANOVA of SCRs during 24-hour renewal revealed a main effect of CS type (F(1.81,43.39) = 3.732, pGG = 0.036, η²G = 0.010) (Fig. 1C)

(Fig. 1B). While a repeated measures ANOVA of mean shock expectancy revealed no significant main effect of CS Type (pGG = 0.080), and no differences between CS+s (p = 1). Interestingly, autonomic arousal to each CS was exceptionally low (Fig. 1C). A repeated measures ANOVA of mean SCR revealed no main effect of CS type (pGG = 0.395). Thus, 1 month later, participants expressed some retrieval of Day 1 CS+ shock contingencies, but did not display heightened physiological arousal towards CS+ items.

thalamus and PAG). We focused these ROI analyses on the second half of extinction. This revealed diminished activity to the CS+CC in comparison to the CS+EXT (Fig. 2C), indicating that the outcome during counterconditioning attenuated activity in regions involved in maintaining and expressing threat expectations relative to merely omitting the shock.

24-hour threat renewal test. Univariate fMRI analysis of the CS+EXT > CS+CC and CS+CC > CS+EXT contrasts did not reveal any significant activity that survived whole-brain correction for multiple comparisons. A more liberal exploratory threshold of p < 0.001 (uncorrected) for the CS+CC > CS+EXT contrast revealed a cluster in the left amygdala (MNI -16, -7, -21; 27 voxels, z = 3.49, p uncorrected < 0.001; cluster corrected at p < 0.05 with SVC) (Table 3, Fig. 2B). No regions emerged at this liberal threshold for the inverse contrast (CS+EXT > CS+CC).

1-month threat renewal test.
No regions emerged at the whole-brain level for the univariate contrasts CS+CC > CS+EXT or CS+EXT > CS+CC at 1 month, even using a liberal threshold (p < .001, uncorrected).

A BLANAc circuit for retrieval of rewarded extinction. To examine the involvement of fMRI 445
derived amygdala projections, we conducted a generalized psychophysiological interaction 446 analysis (gPPI) during recent and remote threat renewal tests (Fig. 3A). This analysis was 447 inspired by neurobiological evidence that a BLA to NAc circuit preferentially supports reduced 448 threat relapse of rewarded extinction (Correia et al., 2016). The seed region was an anatomically 449 defined BLA, and the target region was an anatomically defined NAc.  is considered a critical region that inhibits conditioned defensive responses via projections that 461 inhibit the central nucleus of the amygdala (CeM) (Ghashghaei & Barbas, 2002;McDonald et al., 462 1996).This circuit is considered critical for successful extinction retrieval. We therefore conducted 463 a gPPI during recent and remote threat renewal tests using the vmPFC as the seed region and 464 an anatomically defined region of the CeM as the target region (Fig. 3B). The vmPFC was 465 functionally defined based on a medial frontal gyrus cluster from the CS-> CS+ contrast during 466 acquisition (Table 1)

Similarity patterns in the vmPFC across recent and remote renewal are enhanced for CC stimuli. A repeated measures ANOVA of pattern similarity from recent to remote renewal (24-hour renewal session → 1-month renewal session) revealed no main effect of CS Type (pGG = 0.083). Post-hoc paired t-tests revealed that across renewal phases, similarity was marginally enhanced for CS+CC in comparison to CS- stimuli (t(22) = 2.047, p = 0.052, 95% CI [-0.001, 0.155]), but not in comparison to CS+EXT stimuli (p = 0.887) (Fig. 4C).

Discussion
As extinction is a transient form of inhibitory learning, there is interest in optimized strategies that more effectively inhibit relapse of extinguished threat. Counterconditioning (CC) may be more effective than standard extinction (Keller et al., 2020), but the neurobehavioral mechanisms of CC in humans have remained unclear. Further, to our knowledge, the long-term neurobehavioral effects of threat attenuation strategies (> 1 week) have remained unexamined in humans. Here we found that, in comparison to standard extinction, rewarded extinction using CC attenuated activity in regions associated with threat appraisal and expression and reduced 24-hour conditioned responses. Twenty-four-hour renewal was accompanied by enhanced functional connectivity between the BLA and NAc for stimuli from the CC category, and connectivity between the vmPFC and CeM for stimuli from the standard extinction category. One-month renewal was associated with reduced conditioned responses and accompanied by connectivity between vmPFC and CeM for both extinction strategies. Representational similarity analysis showed that memory traces of CC are stable in the vmPFC across recent and remote time points.

An overarching question about CC is whether it should simply be considered another form of extinction or whether it operates through different neural mechanisms (Keller et al., 2020). Here, we found that CC attenuated activity in regions associated with threat appraisal and expression (insula, thalamus, dACC, PAG), suggesting that providing a positive experience during extinction may facilitate safety learning. Notably, this finding is consistent with a recent fMRI study in which a shock was replaced with a neutral outcome (a tone) (Dunsmoor et al., 2019).
As previously suggested, replacing shock with a non-aversive stimulus might reduce ambiguity and uncertainty otherwise generated when a shock is merely omitted.

At 24-hour and 1-month renewal tests, there was a surprising lack of differentiation in whole-brain fMRI activity between the retrieval of CC and standard extinction memories. A more liberal statistical threshold did reveal greater activity for CC in the left amygdala at 24-hour renewal. On one hand, this finding may seem counterintuitive, given that the amygdala is critical for threat learning and expression (Phelps and LeDoux, 2005) and conditioned responses were slightly more attenuated by CC. However, the amygdala also responds to rewarding stimuli (Beyeler et

We used functional connectivity analysis to further assess the neural differences between CC and standard extinction. At 24 hours, functional connectivity between the vmPFC and CeM was enhanced for standard extinction in comparison to CC; in contrast, functional connectivity between the BLA and the NAc was enhanced for CC in comparison to standard extinction. These findings can be interpreted in the well-explored neurocircuitry of threat extinction in rodents. For

the present results help extend rodent neurobiological findings to humans and indicate that separate patterns of connectivity dissociate CC from standard extinction. Interestingly, connectivity between vmPFC and CeM was observed at 1 month for both CS types, suggesting that over longer periods of time, extinction recruits medial prefrontal inhibition of the amygdala regardless of the particular threat inhibition strategy. It is worth noting that the 24-hour renewal test served as another standard extinction session, as positive outcomes were not included at test.
Thus, the memory of CC at the 1-month test comprised a mix of CC (from Day 1) and standard extinction (from Day 2) that may be reflected in the switch in connectivity from BLA→NAc to vmPFC→CeM over time.

A multivariate RSA was used to further interrogate the fidelity of CC and standard extinction memories. The reactivation of neural activity patterns from extinction was enhanced by CC in the vmPFC both 24 hours and 1 month later. It is notable that the vmPFC showed neural reactivation patterns for CC, as functional connectivity analyses indicated a vmPFC→amygdala connection was selectively enhanced 24 hours following standard extinction but not CC. However, neurobiological evidence shows that activation of the BLA→NAc circuit by rewarded extinction increases activity in the infralimbic cortex (IL) to prevent threat relapse (Correia et al., 2016). Thus, CC may likewise enhance involvement of the vmPFC for storing long-term memory traces of safety.

The results from the 1-month retrieval test were intriguing for several reasons. First, although shock expectancy returned slightly, autonomic arousal was remarkably low. This might indicate that both threat attenuation strategies were successful over the long term. It is notable that functional connectivity between the vmPFC and the CeM was evident for both CS+ categories at 1 month (albeit only at a marginal level for CS+EXT), suggesting this is a mechanism for successfully reducing conditioned responses over long durations in humans. It is also important to note that participants were all reportedly free of psychopathology, and thus memory of laboratory conditioned threat might simply weaken over long durations in the healthy brain. This calls for future studies comparing the return of threat over longer intervals in patients with anxiety disorders, particularly posttraumatic stress disorder (PTSD).
Threat conditioning is a popular model for PTSD (Mahan & Ressler, 2012), but immediate dysregulated responses to a CS may better reflect Acute Stress Disorder, which refers to the stress symptoms that arise in the first month after a traumatic event (Bryant, 2019). A key criterion in a PTSD diagnosis is the persistence of symptoms at least 1 month following the trauma (American Psychiatric Association, 2013). Importantly, acute stress disorder can develop when PTSD does not, and vice versa (Bryant, 2010). More research is warranted on the long-term endurance of different extinction strategies in clinical populations who display extinction retrieval deficits.

A limitation of the present study concerns the broad definition of "reward" for the outcomes used to replace shocks in counterconditioning. Simply put, were the pictures actually rewarding? More generally, by what operational definition should "reward" be applied? It is worth noting that the pictures used in this study were rated highly in positive valence by a separate group of participants. CC paradigms have employed a wide variety of appetitive outcomes (see Table 1:

an expected shock could be construed as a psychological reward (or at least a relief). It is therefore possible that facilitating extinction through any number of strategies simply promotes engagement of a threat-inhibition process that overlaps with reward-responsive neurocircuitry. One way future research could evaluate whether there is a unique effect of "reward" would be to compare outcomes that vary in reward intensity, such as comparing positive pictures to primary reinforcers, like food or juice, or to compare passive delivery versus instrumental responses (Thomas et al., 2012).
Insofar as Pavlovian extinction serves as a theoretical foundation for exposure therapy, and symptoms frequently return following treatment (Vervliet et al., 2013), examining the neurobehavioral endurance of different threat attenuation strategies is important. These results provide new evidence that the presence of a rewarding stimulus during extinction may boost threat attenuation through an amygdala-striatal pathway, and stabilize memory representations in the vmPFC over long time intervals. These results extend neurobiological findings on the overlap between reward and threat extinction from rodents to healthy humans. While neuroimaging research comparing these strategies in clinical populations is warranted, this type of research could serve as a foundation for translational efforts that result in a paradigm shift for exposure therapy.