Lateralized role of prefrontal cortex in guiding orienting behavior

Orienting movements are essential to sensory-guided reward-seeking behaviors. Prefrontal cortex (PFC) is believed to exert top-down control over a range of goal-directed behaviors and is hypothesized to bias sensory-guided movements. However, the nature of PFC involvement in controlling sensory-guided orienting behaviors has remained largely unknown. Here, we trained rats on a delayed two-alternative forced-choice task requiring them to hold an orienting decision in working memory before execution is cued. Inactivation of medial PFC (mPFC) using either muscimol or optogenetics impaired choice behavior. However, the optogenetic impairment depended on the trial epoch during which inactivation took place. In particular, we found a lateralized role for mPFC during the presentation of instruction cues, but this role became bilateral when inactivation occurred later in the delay period. Electrophysiological recording of multiple single-unit activity further provided evidence that this lateralized selectivity is cell-type specific. Our results suggest a previously unknown role of mPFC in mediating sensory-guided representation of orienting behavior and a potentially distinct cell-type-specific role in shaping such representation across time.


INTRODUCTION
Movement is pivotal to reward-seeking behaviors. To obtain rewards, animals have to continuously move to sense their environment. Since movement is costly, both in terms of energy expenditure and time, significant brain resources have been dedicated to optimizing goal-directed movements. The prefrontal cortex (PFC), in particular, has been implicated in guiding goal-directed actions (Miller and Cohen, 2001; Quintana and Fuster, 1999; Szczepanski and Knight, 2014), among other cognitive functions that serve that purpose, such as attention modulation, working memory, rule learning and task switching (Donahue and Lee, 2015; Duan et al., 2015; Fujisawa et al., 2008; Greene et al., 2001; Romo and de Lafuente, 2013). Receiving sensory input from the mediodorsal nucleus of the thalamus and projecting to the premotor cortex, PFC is believed to gate in relevant information for sensorimotor integration (Euston et al., 2012; Gabbott et al., 2005; Narayanan and Laubach, 2006). It has been proposed that PFC influences motor goal representation by encoding rules and strategies that govern reward-seeking actions (Gisquet-Verrier and Delatour, 2006; Karlsson et al., 2012; Powell and Redish, 2016; Rich and Shapiro, 2009). However, the extent to which this top-down influence affects more basic actions, such as orienting the head, eyes and body, has remained largely unknown.
Orienting behavior is a primal, stereotyped movement essential to directing attention towards salient sensory input, which in turn facilitates foraging and navigation, among other behaviors. Numerous cortical and subcortical structures involved in orienting behavior have been identified in rodents. In particular, lesion and electrophysiological recording studies have established the superior colliculus (SC) and frontal orienting field (FOF) as critical for orienting execution and planning, respectively (Duan et al., 2015; Erlich et al., 2011; Kopec et al., 2015). However, the role of medial PFC in guiding orienting behaviors in rodents has remained poorly understood.
Here, we used a delayed two-alternative forced-choice task (2-AFC) to study the role of rat mPFC in sensory-guided orienting behaviors. In particular, we first show that prolonged inactivation of mPFC leads to impaired performance. This inactivation affected both choice accuracy and reaction time. We then demonstrate that this impairment is inactivation time-dependent: inactivating mPFC during cue presentation only affects contralateral choices, whereas inactivation during motor planning affects both contralateral and ipsilateral choices.

RESULTS

Two-alternative forced-choice task
We trained a total of n=19 rats to perform a two-alternative forced-choice task. Details of behavioral training are described elsewhere (Mohebi and Oweiss, 2014). Briefly, adult Sprague-Dawley rats (Charles-River) were food deprived to 85% of their ad libitum weight and placed in acoustically isolated training boxes (Coulbourn Instruments, PA). The box comprised an operant conditioning chamber with three nose poke holes (a fixation hole in the center and two target holes on the sides) and a reward delivery trough on the side opposite the nose poke holes, as shown in Figure 1a. Subjects self-initiated a trial by placing their snout inside the fixation hole. Following a 1 s fixation period, a brief (500 ms) single-frequency tone (60 dB) was played. The pitch of the tone instructed the rat to make a choice to its right or left. The instruction cue was followed by a delay period of variable length (1-1.5 s), which was then immediately followed by a Go cue (white noise). Premature retractions terminated the trial. Correct choices were rewarded by delivering a 45 mg food pellet in the trough. Rats adapted their choice depending on the instruction cue. Choice behavior was captured by a sigmoid function at both the individual and group level (Fig. 1b). We only used the two extremes of the stimulus spectrum (5 kHz and 14.2 kHz, associated with right and left choices, respectively) throughout the rest of the study. Rats reached >85% performance after an average of 30±8 sessions (Fig. 1c, n=1039 sessions, p<10⁻⁶, binomial test).
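As an illustration of the kind of binomial test cited above, the following minimal sketch computes an exact one-sided tail probability for a session's choice accuracy against chance. The chance level of 0.5, the one-sided form, and the trial counts in the example are assumptions made for illustration only; the text does not specify how the reported test was set up.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p_chance: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance ** k * (1.0 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical session (not data from the study): 170 correct out of 200 trials.
p_value = binomial_tail(170, 200)
print(f"one-sided p = {p_value:.3g}")
```

With two equally likely targets, a session at 85% correct over a few hundred trials falls far in the upper tail of the chance distribution, which is why very small p-values are expected once performance stabilizes.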

Unilateral inactivation of mPFC impairs choice behavior
We locally infused muscimol, a GABA-A receptor agonist, to reversibly inactivate the dorsal prelimbic cortex (Baker and Ragozzino, 2014; Willcocks and McNally, 2013). Microinjections were delivered unilaterally through a chronically implanted cannula (Fig. 2a), and drug infusion sessions were interleaved with sham injections (ACSF). We found that choice was selectively impaired following drug injection. In particular, choices ipsilateral to the injection were as accurate as in the control group; however, contralateral choices were significantly less accurate (Fig. 2b, Mann-Whitney U test, p<10⁻³). In addition to this ipsilateral choice bias, reaction times were significantly slower under muscimol (Fig. 2c, paired KS-test, p<10⁻⁵). Unlike choice behavior, however, the slowing of reaction times was not selective and was evident for both contralateral and ipsilateral movements.

Optogenetic dissection of mPFC activity reveals separable states
Choice impairment and slowing of movements following muscimol injection suggest a role for mPFC neurons in guiding orienting behavior during the task, similar to previous reports (Erlich et al., 2011). However, pharmacological inactivation lacks the temporal precision to dissect the dynamics of mPFC during different trial epochs. We therefore turned to an optogenetic approach to perturb neural activity during specific task epochs. We expressed the inhibitory opsin ArchT3.0 in excitatory cells of PFC under the CaMKII promoter and used a green laser (532 nm, 10 mW) to inactivate pyramidal cells in mPFC during the instruction, delay and action execution epochs.
First, we sought to recapitulate the inactivation effects observed during the muscimol inactivation experiments. We suppressed the activity of mPFC pyramidal cells in a fraction of trials (25%, selected in random order) for the entire trial period (Fig. 3b). This led to contralateral choice impairment and nonselective slowing of reaction times (Fig. 3c-e), similar to our pharmacological inactivation results. Using blue light as a control (473 nm, 10 mW) did not affect choice or reaction time (Fig. 3c).
Figure caption (fragment): Choice accuracy comparing laser trials to non-laser trials for contralateral and ipsilateral choices. c. Cumulative distribution of reaction times for laser and non-laser trials for ipsilateral and contralateral choices. d-f. Similar to a-c but for optogenetic suppression during the late delay period (0.5 s before the Go cue). g-i. Similar to a-c but for optogenetic suppression delivered during the choice period (from Go cue onset until the side choice).
Next, we limited the inactivating laser pulse to specific epochs of the task. Unilateral, optogenetically induced suppression (500 ms pulses) limited to the instruction period impaired choice performance and slowed reaction time, but only for contralateral choice trials (Fig. 4a-c). Inactivation during the delay and reaction periods, on the other hand, equally impaired choice accuracy and slowed reaction times for both ipsilateral and contralateral choices (Fig. 4d-i).

Lateralized representation of choice among mPFC cells
We observed that suppression of mPFC during the instruction epoch caused selective impairment in contralateral choices whereas similar inactivation during delay and reaction epochs caused a nonselective impairment. Given this perplexing observation, we hypothesized that during the instruction period, only contralateral orienting information is represented by mPFC activity and ipsilateral representations evolve at later timepoints. To test this hypothesis, we used chronically implanted microwire arrays (Tucker Davis Technologies, Alachua, FL) to record from many single neurons in prefrontal cortex simultaneously during the task, sampling a broad range of layer V dorsal prelimbic cortex along the rostro-caudal axis. We recorded wideband signals and then isolated single units using offline analysis (Fig. 5a).
We classified recorded units based on their waveform shape as either putative regular-spiking (RS) or fast-spiking (FS) neurons (Fig. 5c). Examples of spiking activity of each cell type are shown in Fig. 5d. As expected in prefrontal cortex, activity profiles were quite heterogeneous among individual units. A significant proportion of cells (45%) were modulated beyond their baseline firing rate (random shuffle test comparing 30 ms bins to the prefixation period), but briefly and at different timepoints, spanning the entire duration of the trial (Fig. 6a). Although we did not find evidence of persistently active single units during the delay period, RS cells as a population were persistently active during this epoch (Fig. 6b). FS cells, in contrast, did not show persistent activity during the delay period and became disproportionately active during choice execution.
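One way to picture the shuffle test mentioned above is as a label-permutation test on per-trial spike counts: counts from a 30 ms task bin are pooled with counts from a baseline window, the labels are shuffled, and the observed difference in means is compared with the shuffled differences. The sketch below is a generic version under stated assumptions (two-sided statistic, 1000 shuffles, made-up counts); the exact statistic and procedure used in the study are not given in the text.

```python
import random
from statistics import mean


def shuffle_test(bin_counts, baseline_counts, n_shuffles=1000, seed=0):
    """Two-sided permutation p-value for a difference in mean spike count."""
    rng = random.Random(seed)
    observed = abs(mean(bin_counts) - mean(baseline_counts))
    pooled = list(bin_counts) + list(baseline_counts)
    n_bin = len(bin_counts)
    exceed = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign trial labels at random
        diff = abs(mean(pooled[:n_bin]) - mean(pooled[n_bin:]))
        if diff >= observed:
            exceed += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (exceed + 1) / (n_shuffles + 1)


# Hypothetical per-trial counts for one unit (illustration only).
task_bin = [3, 4, 5, 4, 3, 5, 4, 4, 5, 3] * 2
baseline = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1] * 2
print(shuffle_test(task_bin, baseline))
```

Repeating such a test for every 30 ms bin of every unit would also call for some correction for multiple comparisons; the text does not say which, if any, was applied.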
Both RS and FS populations encoded information regarding the upcoming choice after the Go cue (i.e., during choice execution). However, only RS cells significantly encoded the upcoming choice during the instruction period (Fig. 6c). Also, both populations encoded the outcome (rewarded/unrewarded) after the trial outcome was revealed to the animal. At no point during the trial (or inter-trial interval) did a significant population of RS or FS cells carry information about the previous trial's choice or outcome (p>0.05, binomial test). Finally, RS (but not FS) cells encoded information about contralateral choices during the delay period (Fig. 6d).
Given the heterogeneous nature of response encoding by individual cells at different timepoints during each trial, we then turned to population-level analysis to investigate PFC dynamics. Reduced-dimension trajectories were constructed for contralateral and ipsilateral trials using Principal Component Analysis (PCA) projections. Dimensionality reduction was performed on firing rates smoothed with a Gaussian kernel of 25 ms width. During fixation (and before the instruction cue), there was little variability in the population response, and the trajectories for contra and ipsi trials were almost indistinguishable from one another (Fig. 7a). At instruction cue onset, however, the population trajectory deviated significantly from baseline (p<0.01, shuffle test, n=1000), but only for contralateral choice trials, suggesting that a decision is made about an upcoming choice. By the time the Go cue arrived, distinct trajectories had evolved for both ipsilateral and contralateral trials. These trajectories remained separate until the outcome of the trial was revealed, at which point the population activity returned to its initial state.
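The smoothing step that precedes the PCA projection can be sketched as a convolution of a binned spike train with a normalized Gaussian kernel. This is a minimal illustration, not the analysis code used in the study: it assumes that the 25 ms "width" refers to the kernel's standard deviation, uses 1 ms bins, and truncates the kernel at four standard deviations. It covers the smoothing only; the PCA step is not shown.

```python
import math


def gaussian_kernel(sigma_ms: float, dt_ms: float = 1.0, n_sigmas: float = 4.0):
    """Normalized Gaussian weights sampled every dt_ms (assumed: width = sigma)."""
    half = int(round(n_sigmas * sigma_ms / dt_ms))
    weights = [
        math.exp(-0.5 * ((i * dt_ms) / sigma_ms) ** 2) for i in range(-half, half + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rate(spike_counts, sigma_ms: float = 25.0, dt_ms: float = 1.0):
    """Convert binned spike counts into a smoothed firing rate in spikes/s."""
    kernel = gaussian_kernel(sigma_ms, dt_ms)
    half = len(kernel) // 2
    n = len(spike_counts)
    rate = []
    for t in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < n:  # bins beyond the edges are treated as zero
                acc += w * spike_counts[idx]
        rate.append(acc / (dt_ms / 1000.0))
    return rate


# Toy spike train: a single spike at 500 ms in a 1 s window of 1 ms bins.
counts = [0] * 1000
counts[500] = 1
rate = smooth_rate(counts)
print(rate.index(max(rate)), round(max(rate), 2))
```

Because the kernel is normalized, the smoothed rate integrates to the original spike count away from the window edges, so the smoothing changes temporal resolution but not the total activity entering the projection.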

DISCUSSION
Prefrontal cortex plays an important role in a wide range of cognitive behaviors. Here, we sought to investigate its role in sensory-guided orienting behavior in rats. We trained rats to perform a delayed two-alternative forced-choice task. Using two independent manipulation methods and multielectrode recordings in the prelimbic cortex, our data reveal a previously uncharacterized role of mPFC in top-down control of decision making during orienting behavior. We first demonstrated that unilateral suppression of mPFC activity during whole trial intervals impairs contralateral choices, consistent with previous reports (Erlich et al., 2011). We further found that precise photo-suppression during different epochs of the task has dichotomous effects on choice. Specifically, activity suppression during the instruction period only affected contralateral choices, while delay period suppression equally affected both contralateral and ipsilateral choices. We explained this apparent paradox by recording from mPFC cells and demonstrating that both single-unit and population dynamics deviate from baseline only for contralateral trials during the instruction cue. This role is primarily carried out by putative RS pyramidal cells. RS cells constitute a primarily excitatory neuron cell type that was targeted by our ArchT-mediated suppression protocol using the CaMKIIa promoter. Based on this demonstration, we propose that mPFC plays an asymmetrical role in orienting behavior during the stage in which salient features are extracted from sensory input to guide decisions about future orienting movement. These results are in contrast with previous reports investigating the role of other subregions of PFC in orienting behavior. In particular, whereas prior work targeted the so-called "frontal orienting field (FOF)" (AP:2.0, ML:1.3, DV:-0.8 mm), we have targeted the prelimbic area (AP:2.6-4.0, ML:0.8, DV:-3.00 mm), which is considered an upstream target of FOF neurons.
In addition, while FOF has been labeled a premotor area dedicated to preparation of orienting movements, the prelimbic area is thought to be more involved in sensorimotor integration, reasoning and rule retrieval (Brasted et al., 2000; Duan et al., 2015; Euston et al., 2012; Hardung et al., 2017; Kamigaki and Dan, 2017; Kim et al., 2016; Sul et al., 2010). As such, the distinct patterns of activity in the prelimbic cortex during the delay period compared to the instruction period are likely serving different roles in mediating choice behavior.
We observed persistent activity during the delay period only at the population level and not in individual cells (Stokes, 2015; Stokes et al., 2017). Prior modeling studies have emphasized the role of persistent activity of prefrontal neurons in the representation of decisions in working memory (Bernacchia et al., 2011; Daie et al., 2015; Soltani et al., 2006). These studies were based on earlier neurophysiological observations of memory cells in primates (Funahashi, 2006; Funahashi et al., 1989; Fuster et al., 1971), in which a balance between excitation and inhibition was proposed to maintain such representations (Lim and Goldman, 2013; Murray et al., 2017; Wang et al., 2004). A recent modeling study has suggested that, depending on the task requirements, mPFC cells may or may not demonstrate persistent activity (Masse et al., 2018). As such, it remains a topic of debate whether persistent activity is a characteristic of mPFC cells or a consequence of biased observations (Arnsten et al., 2010; Constantinidis et al., 2018; Lundqvist et al., 2018; Stokes, 2015; Stokes et al., 2017).
In our study, however, only putative RS pyramidal cells demonstrated persistent activity. While individual RS cells were choice selective as early as the instruction cue onset, they demonstrated sustained activity throughout the delay period, albeit in the form of short, transiently active clusters of cells. FS cells, on the other hand, were only recruited after the Go cue during choice execution. Anatomical studies suggest that the majority of FS cells are parvalbumin (PV)-expressing interneurons targeting the somatic compartments of excitatory projection cells (Pi et al., 2013; Pinto and Dan, 2015). This connectivity would enable them to rapidly suppress projection cell activity and thus exert significant control over the output of mPFC circuits. This might explain the activity pattern of FS cells observed around choice execution: suppressing an unwanted element of a potential motor plan in order to bias the motor system towards executing the appropriate orienting movement. This role would be consistent with projection pathways from mPFC (Chatham et al., 2014) as well as other parts of cortex (e.g., posterior parietal cortex, PPC; Hwang et al., 2019; Lyamzin and Benucci, 2019) to the basal ganglia that facilitate desired movements and suppress undesired movements. Given that suppression of excitatory neural activity in mPFC during the motor planning stage affected both contralateral and ipsilateral movements, our data suggest that the response selectivity of RS and FS neurons contributes to these parallel but coordinated elements of the motor plan.
Figure caption (fragment): Each dot represents the two-dimensional projection of the population dynamics, obtained from firing rates smoothed with a Gaussian kernel of 25 ms width and sampled at 1 ms intervals. The trajectories are plotted for 0.5 s before and after each trial event (denoted by a large circle for contralateral and ipsilateral trials). During fixation, population dynamics show small deviation from baseline. The introduction of the instruction cue, however, triggers a deviation from the initial state, but only for contralateral trials. By the time the subject plans the movement and awaits the Go cue, population trajectories become completely separated and distinct from baseline. Trajectories continue to be distinct until the animal fully commits to a choice.
We limited our recordings to layer 5 of the dorsal prelimbic cortex since it comprises most projection cells that send efferents downstream to other cortical regions (particularly PPC; Denardo et al., 2015) and subcortical regions (particularly striatal regions), and thus may be directly responsible for top-down control signals that affect orienting decisions. Pharmacological inactivation, in contrast, is likely to have extended outside of this layer. It remains to be examined in future studies whether projection targets of these pyramidal cells can explain their selectivity for ipsilateral and contralateral choices. For example, it is quite possible that intratelencephalic (IT) and pyramidal tract (PT) neurons disproportionately represent contralateral and ipsilateral choices, respectively (Shepherd, 2013). Using optogenetic identification combined with retrograde transport may help resolve such dichotomies (Anikeeva et al., 2011; Kvitsiani et al., 2013; Tervo et al., 2016).
It would also be of interest to compare the population dynamics in layer 2/3 of dorsal prelimbic cortex. While we did not find representations of ipsilateral choice during the instruction cue in layer 5 responses, layer 2/3, which receives more direct sensory afferents than layer 5, may carry such representations. Layer 5 and layer 2/3 also have distinct interneuronal cytoarchitecture, with layer 2/3 containing disproportionately more VIP and SOM interneurons compared to the dense presence of PV interneurons in layer 5 (Gabbott et al., 1997). In contrast to PV cells, VIP and SOM cells target distal dendrites (Kepecs and Fishell, 2014; Pi et al., 2013). Given these differences, we predict that layer 2/3 interneurons help sculpt the sensorimotor transformation, whereas layer 5 FS cells (mostly PV) inhibit the outputs of these circuits and thus are only active during choice execution, as we demonstrated in the present study.
Our unit recording data confirmed the heterogeneous activity patterns reported by many previous studies, suggestive of network dynamics encoding (Rigotti et al., 2013). However, it is possible that this heterogeneous response could be explained by the phenotype and projection patterns of these mPFC cells (Hirokawa et al., 2019) as discussed above. Further experiments with cell-type and projection specific targeting of individual cells are required to test the origin of this heterogeneity in network patterns in the context of similar task design.

METHODS

Two-alternative forced-choice task
We used a two-alternative forced-choice task design similar to previously published studies (Brunton et al., 2013; Gage et al., 2010). Adult Sprague-Dawley rats (Charles-River) were food deprived to 85% of their ad libitum weight and placed in acoustically isolated training boxes (Coulbourn Instruments, PA). The box comprised an operant conditioning chamber with three nose poke holes (a fixation hole in the center and two target holes on the sides) and a reward delivery trough on the side opposite the nose poke holes. A speaker was centrally placed on the reward trough side and used to deliver auditory cues to the subject. Subjects were required to place their nose inside the fixation hole for 1 s while waiting for an instruction cue. The instruction cue was a brief (0.5 s) single tone delivered at 60 dB. A low-pitch tone (5 kHz) indicated an instruction to choose the right nose poke hole, whereas a tone 1.5 octaves higher (14.2 kHz) indicated an instruction to choose the left nose poke hole. The instruction cue was followed by a delay period of variable length (chosen pseudorandomly from a uniform distribution between 1 and 1.5 s), which was followed by a Go cue, a white noise stimulus at 60 dB. If the subject retracted from the fixation hole before the Go cue, the trial was terminated and considered a premature retraction. Visits to the instructed targets were rewarded immediately by delivering a 45 mg food pellet, while incorrect visits were punished by extending the inter-trial interval from 7 s for successful trials to 12 s for failed trials, with no reward pellet delivery. Stimulus generation, behavioral state control and reward delivery were all conducted using the Coulbourn Habitest system (Coulbourn Instruments, PA), and the data were logged in real time on a desktop computer. All behavioral events were sampled at 1 kHz, and subsequent analysis was performed using custom-developed MATLAB routines.
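The trial structure described above can be summarized in a short sketch that draws the parameters of one trial and returns the inter-trial interval for a given choice. This is an illustrative model only, not the Habitest control code: the function and constant names are hypothetical, and the assumption that the instructed side is drawn with equal probability on each trial is ours. The sketch also shows that a nominal 1.5-octave step from 5 kHz gives about 14.1 kHz, close to the reported 14.2 kHz cue.

```python
import random

LOW_TONE_HZ = 5_000.0       # cue instructing a right choice (from the text)
OCTAVE_STEP = 1.5           # spacing between the two cues (from the text)
DELAY_RANGE_S = (1.0, 1.5)  # uniform delay before the Go cue (from the text)
ITI_CORRECT_S = 7.0         # inter-trial interval after a correct choice
ITI_ERROR_S = 12.0          # extended interval after an incorrect choice


def octave_shift(freq_hz: float, octaves: float) -> float:
    """Frequency reached by moving `octaves` octaves above freq_hz."""
    return freq_hz * 2.0 ** octaves


def draw_trial(rng: random.Random) -> dict:
    """Draw one trial; equal-probability side selection is an assumption."""
    side = rng.choice(("right", "left"))
    cue_hz = LOW_TONE_HZ if side == "right" else octave_shift(LOW_TONE_HZ, OCTAVE_STEP)
    return {
        "side": side,
        "cue_hz": cue_hz,
        "fixation_s": 1.0,
        "cue_s": 0.5,
        "delay_s": rng.uniform(*DELAY_RANGE_S),
    }


def inter_trial_interval_s(choice: str, instructed_side: str) -> float:
    """Return the inter-trial interval that follows a completed trial."""
    return ITI_CORRECT_S if choice == instructed_side else ITI_ERROR_S


rng = random.Random(1)
print(draw_trial(rng))
print(round(octave_shift(LOW_TONE_HZ, OCTAVE_STEP)))  # nominal high cue in Hz
```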

Microelectrode array implant surgery
Once subjects became proficient in the task and maintained a steady performance score above 85% correct, they were removed from the food deprivation protocol for a week prior to the surgery day. Subjects were anesthetized by inhalation of 2% isoflurane in an induction chamber. Body temperature and vital signs were monitored throughout the surgery. A craniotomy was performed over the medial prefrontal cortex (+2-5 mm AP, 0.5-1.5 mm ML) to expose the brain surface. A 32-channel microwire array (Tucker-Davis Technologies) was slowly advanced into the cortex. Extracellular potentials were recorded during the penetration procedure with respect to cerebellar ground and reference screws touching the dura mater posterior to the lambda point. Local field potentials (LFPs) and multiple single-unit activity were monitored until the probe reached the target depth. The craniotomy was covered with GelFoam (Pfizer, NY), and the probe was then anchored to the skull by applying dental cement (C&B Metabond, Parkell, NY) around 6 skull screws surrounding the probe. Subjects were then given one full week to recover from surgery before being returned to the food deprivation protocol.

Reversible inactivation
For the muscimol inactivation studies, cannulas were placed above the prelimbic cortex (AP:3.00, ML:0.8, DV:-3.2 mm) and anchored to the skull using stainless steel screws and dental cement. Rats were allowed two weeks to recover and rehabilitate in the task. Prior to the infusions, rats were lightly anesthetized using isoflurane. The dummy cannula was removed and replaced on one side with an injector (Hamilton Company, Nevada, USA), the end of which extended 0.5 mm from the tip of the guide cannula. The GABA-A receptor agonist muscimol (Sigma, Missouri, USA) was infused into the mPFC to reversibly inactivate this region. Muscimol solution (1 μg/μL in artificial cerebrospinal fluid (ACSF), 1.0 μL) was infused unilaterally into one side of the cannula. On the day prior to and the day following the muscimol infusion, an ACSF solution (1.0 μL) was unilaterally infused into the same side as a control. Infusions were performed at the slow rate of 0.1625 μL/min, and the infusion needle remained in place for 10 minutes following the injection. Rats began the task 90 minutes after infusion and performed it for 90 minutes. One week after the first ACSF infusion, the infusion procedure was repeated on the other side of the brain.

Optogenetic suppression
Trained rats with performance scores above 85% were selected for the optogenetic experiment. Standard surgical procedures were followed for intracranial injection. Glass pipettes were pulled to an outer diameter (OD) of ~40 μm with a long taper and lowered into the brain via burr holes made above the prelimbic cortex, bilaterally. Each injection delivered 1 μL of AAV5-CaMKIIa-ArchT-eYFP (Han et al., 2011) at the slow rate of 1 μL/10 min at the same coordinates as the muscimol injections. Ferrule-attached optical fibers (NA = 0.48, 200 μm in diameter, Doric Lenses, ON) were then implanted stereotaxically, sitting on top of the infected area. Ferrules were then connected to diode-pumped solid-state lasers (either 446 nm or 520 nm) delivering laser pulses at 10 mW.

Electrophysiological recordings
Data acquisition was performed using a Tucker-Davis RZ2 through a high-impedance headstage and an automated commutator to permit unrestricted movement in the behavioral box. Wide-band signals were recorded at a rate of 25 kHz/channel, amplified and digitized to 16 bits. LFPs were separated from multi-unit activity using a Symlet wavelet filtering algorithm. Once filtered, the noise root mean square (RMS) value was calculated from the raw data, and samples surpassing a threshold of 4 times this noise floor were used as an indication of action potential occurrence. Extracted waveforms were then sorted using custom-developed software, referred to as 'EZsort', in MATLAB.
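The thresholding step described above can be sketched as follows. This is a generic illustration, not the authors' EZsort pipeline: it estimates the noise floor as the RMS of the trace, flags samples whose absolute value exceeds 4 times that floor, and applies a short lockout so that one spike is not counted several times. The use of the absolute value (either polarity) and the 25-sample lockout (1 ms at 25 kHz) are assumptions that the text does not specify.

```python
import math


def rms(signal) -> float:
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def detect_threshold_crossings(signal, n_rms: float = 4.0, lockout_samples: int = 25):
    """Return sample indices where |signal| first exceeds n_rms times the RMS."""
    threshold = n_rms * rms(signal)
    events = []
    last = -lockout_samples
    for i, x in enumerate(signal):
        # The lockout (assumed) prevents double-counting within one spike.
        if abs(x) > threshold and i - last >= lockout_samples:
            events.append(i)
            last = i
    return events


# Toy trace: low-amplitude oscillation standing in for noise, plus two
# large negative deflections standing in for spikes.
trace = [0.5 * math.sin(0.3 * i) for i in range(5000)]
trace[1000] = -10.0
trace[3000] = -10.0
print(detect_threshold_crossings(trace))
```

Because large spikes inflate the RMS, estimators based on the median absolute deviation are a common alternative for the noise floor; the RMS is used here only because it is what the text describes.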
Brains were explanted and post-fixed in 4% paraformaldehyde overnight at 4 °C. Tissue was sectioned on a vibrating microtome at a thickness of 50 microns; coronal sections were used to identify ArchT/GFP-expressing neurons, and sections transverse to the probe shank were used to validate the implant location. For optogenetic experiments, sections were counterstained with Hoechst (1 μg/mL, 10 minutes), rinsed and mounted with ProLong Gold Anti-fade mounting medium (Invitrogen, Carlsbad, CA). For probe location, sections were rinsed, blocked with 10% normal goat serum, and incubated overnight at 4 °C in a solution of mouse-anti-CD11b (Abcam) to identify the microglial sheath (1:100 dilution in 0.3% Triton, 5% normal goat serum in PBS) (Ward et al., 2009). Sections were subsequently labeled with goat-anti-mouse 488 (Invitrogen, Carlsbad, CA) (1:200 dilution in 0.3% Triton, 5% normal goat serum in PBS), counterstained with Hoechst (1 μg/mL, 10 minutes), rinsed and mounted with ProLong Gold Anti-fade mounting medium. Sections were imaged on an Olympus FluoView confocal microscope. For optogenetic studies, an observer blinded to the experimental condition evaluated the expression level. Only those subjects with observable expression were included in the study.