5-HT2c receptors regulate the balance between instrumental vigour and restraint

1 Department of Experimental Psychology, University of Oxford, OX1 3UD, U.K. 2 Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford OX1 9DU, U.K. 3 Wellcome Centre for Integrative Neuroimaging, University of Oxford, Oxford OX1 3UD, U.K. 4 Current address: Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia, 20147, U.S.A. 5 These authors contributed equally to this work


Introduction
The central neurotransmitter serotonin (5-HT) has been implicated in the motivational control of behaviour [1][2][3][4]. Several lines of research have shown that 5-HT, and its 5-HT 2C receptor in particular, plays an important role in regulating instrumental vigour. For instance, depletion of central 5-HT or antagonism of the 5-HT 2C receptor both speed responding and increase willingness to work for reward [5][6][7][8][9][10][11]. This has resulted in 5-HT 2C receptors being considered a possible target for the treatment of disorders of motivation such as apathy [7][8][9][10][12,13]. However, a largely separate literature has highlighted a key function for intact 5-HT signalling (and again, the 5-HT 2C receptor in particular) in enabling appropriate response restraint. Tonic firing of 5-HT neurons increases whilst waiting for reward and decays in the period before an animal ceases to wait [14], and depletion of central 5-HT or administration of a 5-HT 2C antagonist can also increase inappropriate motor responses in rodents, particularly in anticipation of reward [6,11][15][16][17][18][19][20]. Therefore, a fundamental yet unaddressed question is under what circumstances transmission at 5-HT 2C receptors promotes action over inaction and how this might interact with potential future reward.
One possible reason for this lack of clarity is that most studies to date have required animals to work for a constant reward, yet the activity of 5-HT neurons is modulated both by reward magnitude and reward context [21][22][23]. A second issue is that the study of action vigour often uses internally guided instrumental paradigms, whilst those investigating the role of serotonin in action restraint have predominantly used stimulus-driven tasks. Such differences are likely to be important as 5-HT has been implicated in gating sensory processing [24], and it has recently been shown that administration of a 5-HT 2C receptor antagonist can specifically reduce the influence of cues over decision-making policies [19]. Therefore, to better understand the role of 5-HT 2C receptors in shaping the influence of environmental stimuli on action initiation and restraint, we investigated the effect of systemic and intra-NAcC (nucleus accumbens core) administration of SB242084, a functionally selective ligand that acts broadly antagonistically at 5-HT 2C receptors [25,26], on rats' performance of a Go/No-Go task designed to separate action requirements from current motivation (Fig 1) [27]. We predicted that while disruption of 5-HT 2C receptors would invigorate instrumental responding (Go), it would in tandem impair action restraint for reward (No-Go); and, crucially, that these alterations in performance would be mediated by the size of potential reward and whether behaviour was internally-motivated or cue-driven. In addition, we examined whether such effects might be mediated specifically via 5-HT 2C transmission in the NAcC, which has previously been implicated in inhibitory response control [16,28]. To do this, we infused SB242084 into the NAcC and compared the manipulation to intra-NAcC administration of d-amphetamine, a sympathomimetic drug known to potentiate dopamine and 5-HT, under the hypothesis that both drugs might weaken action restraint for reward in the NAcC.
. CC-BY 4.0 International license perpetuity. It is made available under a preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in The copyright holder for this this version posted December 13, 2020. ; https://doi.org/10.1101/2020.12.13.422360 doi: bioRxiv preprint

Subjects
All procedures were carried out in accordance with the UK Animals (Scientific Procedures) Act (1986) and its associated guidelines. 26 group-housed male Sprague-Dawley rats (Envigo, U.K.), aged ~2 months at the beginning of training, were used. Animals were maintained on a twelve-hour light/dark cycle (lights on 07.00). During testing, rats were food restricted to maintain ~85-90% of their free-feeding weight. Water was available ad libitum in their home cages. For the NAcC infusion studies, animals were implanted with bilateral guide cannulae (Plastics One) 1.5mm above the target site of the NAcC (AP:+1.4mm, ML:±1.7mm, DV:-6.0mm from skull surface) under isoflurane anaesthesia and secured with dental acrylic.

Behavioural task
Paradigm. The task design is shown in Fig 1a-b. Animals were trained as described by [27].
In the full task, a trial was initiated when the rat voluntarily entered and stayed in a central nose-poke for 0.3-0.7s (Fig.1a). This triggered the presentation of one of four auditory cues, which signalled the action requirement (Go Left/Right or No-Go) and available reward for a correct response (Small or Large) (Fig.1b). Go trials required animals to make two presses on the correct lever within 5s of cue onset (Fig.S1a). On No-Go trials, animals were required to remain in the nose-poke for the No-Go hold period (Fig.S1b). Successful trials caused reward to be delivered to a magazine on the rear wall.
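The trial rules above can be sketched as a small outcome classifier. This is only an illustration under stated assumptions, not the authors' task code: the `Trial` fields, the outcome labels, the default No-Go hold duration and the rule that any wrong-lever press within the response window counts as an incorrect response are all hypothetical. Only the 5s response window and the two-press requirement come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

GO_RESPONSE_WINDOW_S = 5.0  # from the text: presses must occur within 5 s of cue onset
REQUIRED_PRESSES = 2        # from the text: two presses on the correct lever


@dataclass
class Trial:
    trial_type: str                    # "go" or "nogo"
    correct_lever: Optional[str]       # "left"/"right" on Go trials, None on No-Go trials
    presses: List[Tuple[float, str]]   # (time from cue onset in s, lever pressed)
    head_exit_s: Optional[float]       # time the rat left the nose-poke, None if it never left
    nogo_hold_s: float = 1.7           # hypothetical hold requirement, for illustration only


def classify(trial: Trial) -> str:
    """Return a hypothetical outcome label for one trial."""
    if trial.trial_type == "go":
        in_window = [lever for t, lever in trial.presses if t <= GO_RESPONSE_WINDOW_S]
        # Assumption: any press on the wrong lever makes the trial an incorrect response.
        if any(lever != trial.correct_lever for lever in in_window):
            return "incorrect_lever"
        if len(in_window) >= REQUIRED_PRESSES:
            return "correct"
        return "omission"
    # No-Go: the rat must stay in the nose-poke for the whole hold period.
    if trial.head_exit_s is None or trial.head_exit_s >= trial.nogo_hold_s:
        return "correct"
    return "premature_exit"


if __name__ == "__main__":
    example = Trial("go", "left", [(0.8, "left"), (1.1, "left")], head_exit_s=0.5)
    print(classify(example))
```

Separating omissions from incorrect-lever responses in this way mirrors the distinction drawn later in the results, where the two error types are analysed separately.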

Pharmacological challenges
SB242084 (Tocris) was dissolved in 25mM citric acid in 8% w/v cyclodextrine in distilled water, and the pH adjusted to 6-7 using 5M NaOH. Systemic injections of drug (0.1mg/ml, "Low" dose; or 0.5mg/ml, "High" dose) or vehicle, containing 25mM citric acid and 8% w/v cyclodextrine in distilled water, were given intraperitoneally in a volume of 1ml/kg 20 minutes prior to testing. For local SB242084 infusions, stock solutions (Vehicle, SB242084 0.2μg/μl, 1.0μg/μl) were prepared. For the local infusion positive control study, d-amphetamine was dissolved in 0.9% NaCl solution to a concentration of 10μg/μl.

Local infusion procedure
33-gauge bilateral infusion cannulae were inserted into the NAcC. 0.5µl of vehicle or drug solution was injected per hemisphere at a rate of 0.25µl/min. The infusion cannulae were left in place for 2 minutes after the cessation of the infusion to allow diffusion of solution from the cannulae. Testing commenced 10 minutes after.
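As a quick check of the dosing arithmetic, the sketch below converts the stated SB242084 concentrations and volumes into delivered amounts. The function names are hypothetical; the numbers (1ml/kg injection volume, 0.5µl per hemisphere, 0.25µl/min) are taken from the text.

```python
def systemic_dose_mg_per_kg(conc_mg_per_ml: float, volume_ml_per_kg: float = 1.0) -> float:
    """Systemic dose = concentration x injection volume per kg body weight."""
    return conc_mg_per_ml * volume_ml_per_kg


def infused_mass_ug(conc_ug_per_ul: float, volume_ul: float = 0.5) -> float:
    """Mass delivered per hemisphere = stock concentration x infused volume."""
    return conc_ug_per_ul * volume_ul


def infusion_duration_min(volume_ul: float = 0.5, rate_ul_per_min: float = 0.25) -> float:
    """Time needed to deliver the infusion volume at the stated rate."""
    return volume_ul / rate_ul_per_min


if __name__ == "__main__":
    for label, conc in (("Low", 0.1), ("High", 0.5)):
        print(label, "systemic dose (mg/kg):", systemic_dose_mg_per_kg(conc))
    for conc in (0.2, 1.0):
        print("SB242084 stock", conc, "ug/ul ->", infused_mass_ug(conc), "ug per hemisphere")
    print("Infusion duration (min):", infusion_duration_min())
```

Under these figures the two SB242084 stock solutions correspond to 0.1µg and 0.5µg per hemisphere, and each infusion takes 2 minutes.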

Data analysis
Performance and time measures were mainly analysed using repeated-measures ANOVAs with drug dose and reward size (small, large) as the within-subject factors, unless specified otherwise. Latency and performance measures of interest are summarized in Table 1.

Table 1. Overview of the behavioural variables of interest within each trial. Further details can be found in supplementary information and Fig. S1. 'N/A', not applicable.
Re-engagement [s]: first nose-poke after success or 1s after failure.

SB242084 increases accuracy and instrumental drive on Go trials
We first examined how SB242084 influenced action selection on Go trials, where rats were required to initiate an action and press the appropriate lever twice for small or large reward.
In vehicle sessions, rats' success rate was on average >80% on both trial types.
A first analysis did not find any significant effects of the drug on Go performance (all F<1.49, p>.248). However, as prior research has reported SB242084 to have non-linear effects on performance with increasing drug doses [6], we sought to test for non-linearities by running within-subject contrasts. This revealed that the drug improved performance on both small and large reward trials selectively at the low dose, where 11 of the 12 rats showed higher average success rates than on vehicle (quadratic effect of drug: F 1,11 =11.26, p=.006; drug X reward interaction: F 2,22 =0.29, p=.789; Fig.1c). Further analyses showed that the improvement in performance at the low dose was caused by a decrease in lever press omissions (quadratic effect of drug: F 1,11 =7.73, p=.018; Fig. 1d), with no corresponding change in the frequency of incorrect lever response trials (all F<1.43, p>.261; table S1).
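For readers unfamiliar with within-subject polynomial contrasts, the sketch below shows one common way a quadratic contrast across three dose levels can be computed: each subject's scores are combined with the weights (1, -2, 1), and the resulting contrast scores are tested against zero, giving F(1, n-1) = t². This is an illustration only. The weights assume equally spaced levels (as in default polynomial contrasts), the function name is hypothetical, the data are made up, and the source does not specify the software or exact procedure used.

```python
import math
from statistics import mean, variance
from typing import List, Sequence, Tuple

# Standard orthogonal polynomial weights for the quadratic trend over three levels
# (vehicle, low dose, high dose), treated here as equally spaced (an assumption).
QUADRATIC_WEIGHTS = (1.0, -2.0, 1.0)


def quadratic_contrast_F(scores: List[Sequence[float]]) -> Tuple[float, Tuple[int, int]]:
    """Return (F, (df1, df2)) for the quadratic within-subject contrast.

    scores: one (vehicle, low, high) triple per subject.
    """
    contrast = [sum(w * x for w, x in zip(QUADRATIC_WEIGHTS, s)) for s in scores]
    n = len(contrast)
    # One-sample t-test of the contrast scores against zero; F = t squared.
    t = mean(contrast) / math.sqrt(variance(contrast) / n)
    return t ** 2, (1, n - 1)


if __name__ == "__main__":
    # Made-up success rates (%) for four hypothetical subjects, not data from the study.
    fake_scores = [(80, 90, 81), (85, 92, 84), (78, 88, 80), (82, 89, 83)]
    F, df = quadratic_contrast_F(fake_scores)
    print(f"F{df} = {F:.2f}")
```

With 12 subjects this approach yields the 1 and 11 degrees of freedom reported for the quadratic effects above.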
We next examined whether administration of the drug altered motor responses within Go trials. While there was no reliable change in latency to initiate action (all F<1.65, p>.216;

Figure 1 (caption, partial). [...] vehicle v low dose, p=.004; all other comparisons, p > .166. D. Change in percentage of Go trial omissions on small and large reward trials, calculated as the difference between drug and vehicle treatments. Pairwise comparisons: vehicle v low dose, p=.002; all other comparisons, p > .306. Data in C-D are depicted as treatment means (thick bars), superimposed by individual subject data (dots and grey lines). E. Action initiation latency on successful small and large reward Go trials. F. Travel time on successful small and large reward Go trials. Pairwise comparisons: low dose v vehicle, p=.001; high dose v vehicle, p=.017. G. Magazine latency on successful small and large reward Go trials. Pairwise comparisons: vehicle v low dose, p=.187; vehicle v high dose, p=.008. Data in E-G are shown as means (large coloured circles), superimposed by individual rat data (lines). *p < .05

Systemic SB242084 increases impulsive action on No-Go trials
We next considered the effect of SB242084 on rats' ability to withhold action for reward by comparing the proportion of correct small and large reward No-Go trials in each of the drug administration conditions. On vehicle, animals successfully withheld responding for small and large rewards on average on >84% of No-Go trials. However, in contrast to Go trials, administration of either dose of the drug impaired performance when the prospective future reward was small (drug x reward interaction: F 2,22 =5.18, p=.014; Fig. 2a). This was not caused by a general inability to withhold action as not only did the ligand have no effect on performance when the large reward was on offer, it also did not change the number of aborted trials, when the rats failed to sustain the pre-cue nose poke required to initiate a new trial (no main effect or interactions with drug: F<0.93, p>.409; table S1).
We reasoned that these No-Go failures might be made up of a mixture of fast cue-driven responses and slower premature responses as the promise of reward becomes more imminent. To examine which of these processes the ligand influenced, we quantified premature head exits in either the 'early' or 'late' epoch of the No-Go hold period of error trials (Fig.2b). This revealed that both doses of the ligand promoted erroneous 'late' over 'early' head exits, again only when the rats were anticipating a small reward (small reward: vehicle v low dose : χ 2 (2)=9.16, p=.010; vehicle v high dose: χ 2 (2)=6.86, p=.032; large reward: all χ 2 <5.26, p>.072; Fig.2c).
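The early/late split of premature head exits can be sketched as follows. The exact epoch boundary is defined in Fig.2b and is not stated in the text, so the midpoint split used here (`split_fraction=0.5`), the function names and the example exit times are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def exit_epoch(exit_s: float, hold_s: float, split_fraction: float = 0.5) -> str:
    """Label a premature head exit as 'early' or 'late' within the No-Go hold period.

    split_fraction is a hypothetical boundary (midpoint of the hold period by default).
    """
    if not 0 <= exit_s < hold_s:
        raise ValueError("a premature exit must fall inside the hold period")
    return "early" if exit_s < split_fraction * hold_s else "late"


def epoch_counts(exit_times_s: Iterable[float], hold_s: float) -> Dict[str, int]:
    """Count early and late premature exits across error trials."""
    counts = Counter(exit_epoch(t, hold_s) for t in exit_times_s)
    return {"early": counts["early"], "late": counts["late"]}


if __name__ == "__main__":
    # Made-up exit times (s) for illustration, with a hypothetical 1.7 s hold period.
    print(epoch_counts([0.2, 0.4, 1.0, 1.5, 1.6], hold_s=1.7))
```

Counts of this kind, tabulated per drug condition and reward size, are the sort of input a test of early versus late proportions would take.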
We also investigated whether SB242084 affected action initiation latencies following the successful completion of the No-Go waiting requirement. Although there were no significant differences in correct No-Go trial hold durations on and off the ligand (all F<0.83, p>.451), based on the patterns of No-Go errors we predicted the ligand might have also altered the balance of fast and slow head exit timings with respect to salient task events (cue offset, signalling the end of the No-Go period, and reward delivery 1s later). We therefore examined the distribution of head exit latencies on correct No-Go trials within the pre-reward interval (i.e., in the 1s between No-Go cue offset and reward delivery) as well as an equivalent post-reward interval, both split into 'early' and 'late' epochs. As can be observed in Figure 2d, rats given vehicle injections on average had the highest likelihood of leaving in the early part of each interval. However, this pattern switched such that the rats given the ligand, particularly at the lower dose, became more likely to leave in the late part of each interval (early-late X drug interaction: F 2,22 =5.81, p=.009; Fig 2e). Similar to Go trials, once an action had been initiated, the drug dose-dependently reduced magazine latencies (main effect of drug, F 2,18 =7.51, p=.004; vehicle v low dose, p=.006; vehicle v high dose, p=.023; table S1).
In summary, SB242084 selectively impaired animals' ability to withhold responding when the reward on offer was small. This reflected a significant increase in inappropriately leaving the nose-poke port late in the hold period without a concomitant increase in early cue-driven responses. This shift from early to late responses persisted on successful No-Go trials, both in the pre-reward and post-reward interval.

Effects of 5-HT 2C receptor manipulation on action restraint / instrumental drive not localisable to NAcC
It has previously been suggested that 5-HT 2C receptors in the NAcC are important for incentive motivation and inhibitory control [16]. Therefore, to determine whether the attenuation of goal-directed inhibition and invigoration of reward-related lever pressing we observed after systemic injections was dependent on 5-HT 2C receptors in the NAcC, we tested a second cohort of animals on the Go/No-Go task following local infusion of either vehicle, 0.1µg or 0.5µg per hemisphere of SB242084. Of the 14 implanted animals, one was excluded for having misplaced cannulae and two others did not complete all testing sessions, resulting in an n of 11 rats (Fig.3a).
In contrast to the effects of systemic administration, and against our expectations, there were no reliable effects of intra-NAcC SB242084 compared to vehicle on either Go or No-Go trial success rate or latencies (all F<0.97, p>.395; Fig.3b-h; table S2). This remained the case even when limiting analyses to animals that showed a change following d-amphetamine infusions (see next section).
Figure 3 (caption, partial). [...] Data are shown as means (horizontal bars), superimposed by individual subject data (dots and light grey lines). C-E. Effect of intra-NAcC vehicle or SB242084 on action initiation, travel time and magazine latency on Go trials. Data are depicted as mean (large dots), superimposed by individual subject data (thin grey lines and small dots). G. Effect of intra-NAcC vehicle or SB242084 on the timing of premature nose poke exits, taking place in either the early or late epochs of unsuccessful No-Go trials (small or large reward, upper and lower panel, respectively). H. Effect of intra-NAcC vehicle or SB242084 on timing of nose-poke head exits in the 'early' or 'late' epochs of the pre-reward interval and post-reward interval of successful No-Go trials. Data in G and H are depicted as mean ± SEM, normalized to the within-subject variance across small and large reward trials either just within the No-Go hold period (G) or within the pre- or post-reward interval (H).

Amphetamine in NAcC impairs action restraint but does not improve instrumental performance
The lack of effect of intra-NAcC infusions of the ligand appears to suggest no direct role for 5-HT 2C receptors in performance on this task. However, it is also possible that it reflects a failure of patency of the cannulae or a difference in the behavioural strategy of this cohort of animals from the group that was used for the systemic manipulation.
To rule out the former explanation, and also to provide a direct comparison with a dopaminergic manipulation, we examined the effect of intra-NAcC infusions of 0.5µg d-amphetamine in the same cohort of animals (n=13). Infusions of d-amphetamine did not influence Go trial success rate (F < 0.24, p > .630) (Fig.4a). Nonetheless, the drug substantially and selectively speeded action initiation on successful Go trials (main effect of drug: F 1,12 =12.74, p=.004; Fig.4b).
On No-Go trials, intra-NAcC d-amphetamine markedly impaired rats' ability to withhold action. It caused a substantial increase in No-Go errors, and these occurred both in the early and the late epochs of the No-Go holding interval (main effect of drug: F 1,12 =14.05, p=.003; all other F<3.66, p>.080) (Fig.4e-g). Notably, these No-Go errors were often followed by lever pressing within the No-Go holding interval (F 1,12 =12.40, p=.004). This deficit in withholding actions also generalised to the pre-cue period, where intra-NAcC d-amphetamine substantially increased the number of aborted trials (main effect of drug: F 1,12 =5.76, p=.035; table S3).
Taken together, these findings demonstrate that intra-NAcC d-amphetamine significantly biased behavioural strategies towards action over inaction, speeding action initiation and impairing action restraint. In turn, this demonstrates that the null effects observed following intra-NAcC infusions of SB242084 cannot be simply attributed to a lack of cannula patency.

Figure 4. Effects of intra-NAcC d-amphetamine on Go/No-Go performance and latencies. A. Percentage of correct responses for small and large reward trials performed per drug session in Go trials. Data are shown as means (horizontal bars), superimposed by individual subject data (light grey lines and dots). B-D. The effect of intra-NAcC d-amphetamine on action initiation, travel time and inter-press latency on Go trials. Data are depicted as mean (large dots), superimposed by individual subject data (thin grey lines and small dots). E. The effect of NAcC d-amphetamine on No-Go trial success rate. F,G. Timing of premature nose poke exits on unsuccessful small (upper panels) and large (lower panels) reward No-Go trials. Distributions are normalised to overall numbers of small or large reward No-Go trials. Data are shown in distributions of 100ms bins (F) or divided into 'early' and 'late' epochs of the No-Go hold requirement (G). Data are shown as mean ± SEM, normalized to the within-subject variance across the No-Go hold period of small (top panel) or large (bottom panel) reward trials. *p < .05.

Effect of SB242084 does not depend on training history or changes in task parameters
The systemic and local NAcC 5-HT 2C perturbations were carried out in separate cohorts, each having trained with slightly different No-Go hold intervals (1.7-1.9s and 1.5-1.7s for the systemic and the local 5-HT 2C manipulation, respectively). To ensure that the effects found were not attributable to any such differences in task parameters, or to training experience, we analysed the effects of two replications of systemic administration of the low dose of SB242084 in the second cohort of animals, one administered before cannulae surgery and one performed after the infusion experiments had been completed.
Arguing against this possibility, systemic administration of SB242084 caused a very similar pattern of changes in task performance. The ligand again improved Go trial success rate, especially on small reward trials (drug X reward interaction: F 1,13 =13.56, p=.003; vehicle vs drug on small and large reward trials: p=.001 and p=.869, respectively). Likewise, the ligand again speeded up travel times and magazine latencies (main effect of drug: both F 1,13 >13.22, p<.004). Similarly, on No-Go trials, there was a robust interaction between the effects of the drug and reward, caused by a reduction in success rate in small but not large reward trials following SB242084 (drug X reward interaction: F 1,13 =7.14, p=.019). Importantly, there was no effect of whether the drug was given before or after the local infusion experiments in any of the above analyses (all F 1,13 <1.40, p>.258). Therefore, the lack of effects seen after the NAcC infusions of the 5-HT 2C ligand cannot simply be attributed to changes in specific task parameters or experience.

Discussion
Here we studied the role of 5-HT 2C receptors, an important modulator of instrumental vigour [7][8][9][10], in controlling action initiation and restraint. Systemic, but not intra-NAcC, administration of a low dose of the 5-HT 2C receptor ligand SB242084, which broadly acts as a selective antagonist, improved instrumental performance on Go trials. This was apparent even in the face of high baseline rates of accuracy, and was caused by a reduction in the rates of response omissions. Furthermore, although systemic 5-HT 2C antagonism had no effect on cued action initiation latencies, the drug dose-dependently speeded progress through the trial regardless of the reward size on offer. By contrast, 5-HT 2C blockade had a detrimental effect on goal-directed restraint of actions, but only on No-Go trials which promised small rewards. This was characterised by a potentiation of impulsive responses in the later part of the No-Go holding interval. This contrasted with the effects of intra-NAcC infusions of d-amphetamine, which amplified both early and late premature action. Taken together, this suggests that 5-HT 2C receptors, outside the NAcC, play an important role in orchestrating the balance between internally-driven response likelihood and instrumental vigour, shaped by the anticipated benefits of acting or restraint.
A number of previous studies, using a variety of tasks, have reported reduced operant responding latencies and increased motivation to work, particularly in high effort situations, following systemic administration of SB242084 [7][8][9][10]. Here, we also observed increased success rate on Go trials and the dose-dependent invigoration of instrumental actions. Notably, however, this occurred in the context of a task with no equivalent effort requirements (note that while this task was arguably cognitively demanding, a recent study found no effect of 5-HT 2C receptor agents on cognitive effort allocation [11]). Moreover, this enhanced response speed did not come at any cost to instrumental precision; after having received the 5-HT 2C receptor ligand, rats were no less likely to choose the correct lever and no more likely to make more lever presses than required. Therefore, while our results are generally consistent with the idea that perturbing transmission at 5-HT 2C receptors boosts goal-directed motivation and willingness to work [7,9,10], the current data refine these definitions and also suggest that there are potentially several distinct processes at play. First, it was not the case that all response latencies were faster after systemic SB242084, as the average speed of cued action initiation on both Go and No-Go trials remained unchanged. Second, while there was a monotonic effect with increasing drug dose on those latencies that were affected, the change in Go trial response omissions was limited to the low dose, consistent with previous reports [6]. Moreover, there was evidence that performance, though not response latencies, was most affected when the small reward was on offer. This is in line with previous work showing that the most prominent effects of SB242084 were apparent when animals were working for lower net value options [8,13], suggesting that 5-HT 2C receptors may be important for incentivising engagement with a current goal even in the face of a low net payoff.
In parallel to the improvements in Go trial performance, we also observed an increase in inappropriate, premature responses on No-Go trials following systemic SB242084, which once again was specific to small reward trials. There is much evidence from both manipulation [29,30] and physiological [31,32] approaches that central serotonin is a key modulator of the ability to wait for reward, with 5-HT 2C receptors playing a central role in mediating this [6,11,12,15,17,20]. However, the effect we observed here did not manifest as an overall increase in impulsivity nor a gross timing deficit. Instead, close inspection of the pattern of errors on these trials showed that the ligand specifically increased the proportion of errors that occurred in the late period of the No-Go holding interval on small reward trials, but had no influence on the rate of fast, cue-elicited No-Go errors. Moreover, when the animals were able correctly to withhold responding during the No-Go period, the 5-HT 2C receptor ligand shifted the pattern of subsequent action initiation latencies away from cue-elicited responses (i.e., ones clustered around cue offset at the end of the No-Go holding period or reward delivery 1s later). This resilience to inappropriate cue-evoked responding was also reflected by a previous report which demonstrated selective amelioration of the influence of cues on risk-based decision-making in a rat gambling task after SB242084 administration [19].
Thus, 5-HT 2C receptors appear to influence the balance of how internal versus cue-driven processes shape instrumental drives to act, with systemic SB242084 potentiating the former over the latter. Such a perspective is compatible with recent evidence from optogenetic activation of dorsal raphe serotonin neurons that suggested serotonin may modulate the speed of evidence accumulation when deciding whether or not to switch away from a current behavioural policy, which in turn can also influence the vigour of the ongoing behaviour [33]. However, while the authors of this study suggested this was caused by serotonin modulating levels of uncertainty, it is unlikely that this factor is playing any major role here as the rats in the current study were highly trained and the cue-action-reward contingencies were deterministic and fixed. Instead, our data implicate 5-HT 2C receptors more closely in the regulation of instrumental drive based on the future benefits and opportunity costs of acting.
The exact locus of the effects on instrumental drive and restraint observed following systemic administration of SB242084 is unclear. What is evident from the results presented here is that none of the described effects appear to rely on 5-HT 2C receptor signalling within the NAcC. This was unexpected as not only is there an abundance of 5-HT 2C receptors in the NAcC [34,35] but also a previous study, using the same doses of SB242084, reported an increase in impulsivity on the 5-choice serial reaction time task (5-CSRTT) specifically after infusions into NAcC and not into medial frontal regions [16]. This discrepancy was not caused by the cannulae targeting a part of the NAcC that is not important for the task, or by a loss of cannulae patency, as subsequent microinjections of d-amphetamine into the NAcC had a substantial influence on performance. Nor can the lack of effect result from subtle task or training differences between the cohorts of animals that underwent the main systemic experiment and the NAcC cannulation respectively. This is because systemic administration of SB242084 in the cannulated cohort replicated the original pattern of results in the first cohort. Instead, what the findings presented here favour is a possible fractionation of 5-HT 2C-modulated "waiting" impulsivity [28,36], depending on whether the animals have to withhold a specific response until a cue is presented (as happens in the 5-CSRTT) or, as here, to withhold competing motor responses in the presence of a cue predicting future reward.
Previous studies have demonstrated a direct influence of systemic SB242084 on midbrain dopamine firing rates and dopamine levels in the NAcC [9,37,38], and mesolimbic dopamine is known to shape goal-directed motivation and the balance of action initiation and restraint [27,28][39][40][41]. Here, microinjections of d-amphetamine into NAcC, known to potentiate and prolong dopamine signalling, also caused a marked increase in premature responses on No-Go trials and speeded action latencies on Go trials. However, the pattern of these changes was strikingly different to those observed after systemic administration of SB242084. Specifically, d-amphetamine exclusively speeded action initiation latencies on Go trials, which had been unaffected by administration of SB242084, but had no effect on the speed of other responses in a trial or overall success rates, all of which had been altered by systemic 5-HT 2C receptor manipulations. Similarly, premature responses were elevated both in the pre-cue period and throughout the No-Go holding interval after NAcC d-amphetamine, as compared to a selective elevation in the later No-Go holding period after systemic SB242084. This implies that mesolimbic dopamine and serotonin, acting through 5-HT 2C receptors, provide synergistic but distinct modulation of instrumental drive and response restraint, with the former regulating the rapid initiation of reward-seeking actions and the latter affecting instrumental drive for reward. Note though that this does not rule out additional, more direct interactions between 5-HT 2C transmission and dopamine elsewhere in the basal ganglia implicated in regulating action restraint and response vigour, such as dorsal striatum or subthalamic nucleus [10,18,35].
The effect of SB242084 on goal-directed instrumental vigour has led to the possibility that ligands targeting the 5-HT 2C receptor (and in particular SB242084, given its functional selectivity over signalling pathways coupled to 5-HT 2C receptors) could potentially be used to treat patients with motivational deficits such as apathy [7,10,37]. Our and others' data add a note of caution by showing that the improvement in instrumental drive for reward may, in certain contexts, also have detrimental effects on response restraint. Similarly, 5-HT 2C receptor blockade has been observed both to reduce and to amplify particular behaviours associated with obsessive-compulsive disorders [42,43]. Therefore, further research will be required to understand the neural mechanisms underlying the complex changes in response vigour and in response restraint reported here, to allow for more precise targeting of the 5-HT 2C receptor and its associated signalling pathways in the future.

Funding and Disclosure
This work was supported by Wellcome (fellowships WT090051MA and 202831/Z/16/Z to MEW, 206330/Z/17/Z to MH), the Clarendon and the Archimedes Foundation (awards SFF1819_CB2_MSD_1196514 and Kristjan Jaagu scholarship to OH), and the ESRC (award ES/J500112/1 to LLG). The authors declare no competing financial interests.