Spatiotemporal organization of movement-invariant and movement- specific signaling in the output layer of motor cortex

1. Centre for Discovery Brain Sciences / Patrick Wild Centre, Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, Edinburgh, Scotland, EH8 9XD, UK. 2. Simons Initiative for the Developing Brain, Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, Edinburgh, Scotland, EH8 9XD, UK. 3. Department of Bioengineering, Imperial College London, London, SW7 2AZ, UK. 4. Department of Computing, Imperial College London, London, SW7 2AZ, UK. 5. MRC London Institute of Medical Sciences, London, W12 0NN, UK. 6. Institute for Adaptive and Neural Computation, School of Informatics, The University of Edinburgh, Edinburgh EH8 9AB, UK. 7. Present address: Optophysiology, University of Freiburg, Faculty of Biology, 79100 Freiburg, Germany.


Introduction
In mammals, descending cortical output is necessary for the learning and execution of voluntary movements (Guo et al., 2015;Hwang et al., 2019;Hwang et al., 2021;Kawai et al., 2015;Lawrence and Kuypers, 1968;Martin and Ghez, 1991). Deep layer projections from primary motor cortex form multiple descending pathways innervating cortical, subcortical, brainstem and spinal cord circuits necessary for triggering and controlling movement (for review, see Lemon, 2008;Ruder and Arber, 2019;Shepherd, 2013). Individual layer 5 projection neurons display complex firing patterns that correlate with various aspects of limb trajectories, such as joint angle, direction, and speed (Georgopoulos et al., 1982;Moran and Schwartz, 1999;Paninski et al., 2004;Thach, 1978) and during single action tasks in rodents, most layer 5 projection neurons (>70%) display movement-related activity in the form of bidirectional changes in firing rate (Dacre et al., 2021;Estebanez et al., 2017;Levy et al., 2020;Park et al., 2021;Sauerbrei et al., 2019;Wang et al., 2017), suggesting widespread encoding of movement. However, in non-human primates the largest components of motor cortex population responses during a delayed-multi-direction reach task have been shown to be "condition-invariant", meaning the population response magnitude and time course were similar irrespective of reach direction (Kaufman et al., 2016). Condition-invariant responses are tightly linked to the onset of movement likely reflecting movement timing rather than movement type, similar to condition-invariant population transitions observed in recurrent neural networks trained to recapitulate complex muscle patterns in reaching primates (Sussillo et al., 2015). Deciphering how condition-invariant (which we term "movement-invariant") and movement-specific signaling is spatiotemporally organized in the output layers of motor cortex and how they map onto specific projection classes would provide an important step towards understanding descending cortical control of movement.
In rodents, descending information from the main output layer of motor cortex, layer 5B, is routed via two molecularly and anatomically defined projection pathways: pyramidal tract (PT) neurons innervate multiple targets including thalamus, subthalamic nucleus, superior colliculus, ipsilateral striatum, brainstem and spinal cord, but not cortex or contralateral striatum (Economo et al., 2018;Kita and Kita, 2012;Muñoz-Castañeda et al., 2021;Ueta et al., 2014;Winnubst et al., 2019); while intratelencephalic (IT) neurons target bilateral striatum and cortex, but not other subcortical targets (Levesque et al., 1996;Muñoz-Castañeda et al., 2021;Wilson, 1987;Winnubst et al., 2019). Although layer 5B neurons are reciprocally connected (Kiritani et al., 2012;Morishima and Kawaguchi, 2006;Morishima et al., 2011), connectivity is essentially unidirectional from IT to PT neurons (Kiritani et al., 2012). This form of asymmetric across-projection class connectivity appears to be a common cortical motif necessary for sensorimotor processing (Brown and Hestrin, 2009;Kiritani et al., 2012;Molyneaux et al., 2007;Reiner et al., 2010). From a descending control perspective, asymmetric connectivity coupled with differential PT and IT intrinsic excitability, sensitivity to neuromodulation and local-and long-range inputs (for review, see Baker et al., 2018;Shepherd, 2013) provides a mechanism for flexible routing of information via distinct output channels depending on behavioral state and task requirements. Accumulating evidence suggests that PT neurons provide an essential source of descending control signals for the execution of voluntary limb movements (Economo et al., 2018;Nelson et al., 2021;Peters et al., 2017;Soma et al., 2017;Wang et al., 2017), while IT neurons provide input to cortical and striatal circuits contributing to movement preparation and specification (Panigrahi et al., 2015;Park et al., 2021;Yttri and Dudman, 2016), but how movement-specific information is spatiotemporally organized across the two output channels remains unclear.
Here, we performed 2-photon calcium imaging in deep layers of the caudal forelimb area (CFA) of mice trained to perform two diametrically opposing forelimb movements (i.e., an alternating push / pull lever task). By combining population imaging with neural classifiers and dimensionality reduction, we show that the majority of layer 5B neurons display movementinvariant signaling, correlated with movement timing rather than movement type, while small subpopulations of PT and IT neurons convey movement-specific information. Decoding of movement type was most prevalent prior to movement initiation in IT neurons, and during movement execution in PT neurons, with high decoding accuracy neurons from both projection classes being temporally uncorrelated and distributed across layer 5B. These findings provide evidence that movement-invariant signaling dominates layer 5B activity, while movementspecific information is spatially and temporally distributed across molecularly and anatomically defined projection neuron classes.

CFA is required for execution of a cued push / pull lever task for mice
To explore how layer 5B signaling relates to the execution of different movements, we first developed a cued linear push / pull lever task for mice. The task design required mice to push or pull a horizontal lever during presentation of a 2 second 6 kHz auditory cue in order to receive a water reward. After a 4-6 s inter-trial interval (ITI), mice had to push the lever 4 mm forward from an initial starting position, the lever would then be locked, and a servo-controlled lick spout would deliver a 5 µl reward following a 1s delay. The lever was then unlocked and a second ITI commenced where mice would then be expected to pull the lever backwards 4 mm to the original starting position after presentation of the same 6 kHz auditory cue. Miss trials or spontaneous movements during the ITI resulted in a lever reset and restarting of the ITI ( Figure 1A). Individual mice displayed idiosyncratic strategies to engage with the lever but displayed reproducible trial-to-trial forelimb trajectories ( Figure 1B). In general, mice reoriented their forelimb and paw upon cue presentation (lift and rotate backwards for pushes, lift and rotate forwards for pulls) during movement initiation (Video S1). Mice rapidly learned the task (mean 10.5 days, [4] inter-quartile range (IQR), N = 24 mice), displaying fast reaction times and movement durations that reflect the combination of paw reorientation and lever manipulation ( Figure 1C). 'Expert' mice completed 44. 5 [9.5]  IQR; N = 24 mice) ( Figure 1D-1E).
To confirm that the caudal forelimb area (CFA) of motor cortex is required for the execution of both push and pull movements, we focally injected the GABAA receptor agonist muscimol (1.6 mm lateral, 0.6 mm rostral of bregma, see Methods) (Dacre et al., 2021;Schiemann et al., 2015). By applying muscimol during behavioral engagement, we could assess the immediate effects of CFA inactivation within the first 10 mins following drug injection ( Figure 1F), where drug diffusion remained within the targeted region (Figure S1A-S1C  Figure 1G). Sham injections of saline into CFA or muscimol injections into hindlimb motor cortex had no effect (Figure S1D-1G). Blocking activity in CFA resulted in both an inability to initiate push or pull movements, and monoparesis of the contralateral forelimb (i.e., localized weakness without complete loss of function), as evidenced by the significant reduction in paw position accuracy (i.e., the forepaw was not positioned on the lever at cue presentation) ( Figure 1H, Figure S1H-1J and Video S2). The effect of muscimol inactivation was most pronounced in mice that displayed the highest number of successful trials, confirming that task execution is CFA-dependent even in expert mice (Hwang et al., 2019;Hwang et al., 2021;Kawai et al., 2015) ( Figure S1K).

Movement-invariant signaling dominates layer 5B activity patterns
To examine how output from CFA relates to the execution of push and pull movements, we restricted imaging of behavior-related population activity to cortical depths corresponding to  (Schiemann et al., 2015) (Figure 2A-2B and Figure S2A). Cell density estimates suggested that we imaged the majority of layer 5B neurons at depths up to 650 µm from the pial surface ( Figure S2A-C). A large proportion of layer 5B neurons displayed movement-related activity (468/653 neurons, mean = 73.5% [54.7 81.8] 95% CI per field-ofview (FOV) (n = 12 FOVs, N = 6 mice), where ∆F/F0 changes occurred within a peri-movement window spanning 150 ms prior to movement initiation to 40 ms after median movement completion. The remaining neurons were classified as either non-responsive, or 'rewardrelated' if changes in ∆F/F0 occurred after the peri-movement window ( Figure 2C-2D). The trial-to-trial similarity in population responses of movement-related neurons strongly correlated with the similarity in forelimb movement magnitude (i.e., motion index) during both push and pull movements, suggesting ∆F/F0 changes reflected movement of the forelimb ( Figure 2E).
By comparing push and pull trials, we found that most layer 5B neurons displayed movement-  Figure 2H and 2K). Although there was a high degree of variability in ∆F/F0 changes trial-to-trial, no consistent differences in mean pairwise trial-to-trial ∆F/F0 correlations were found between movement-invariant and movement-bias neurons across trial types ( Figure   2L). A small proportion of bias neurons displayed differences in baseline ∆F/F0 between push and pull trials, which could reflect postural differences (i.e., different start positions for push and pull trials) or differential preparatory activity (Li et al., 2015) ( Figure S2D-S2F). However, baseline differences were on average smaller than those observed during the peri-movement epoch (data not shown), thus unlikely to be the main driver of movement-specific signaling.
Given that ∆F/F0 changes provide an indirect readout of neural activity, we sought to confirm the proportions of movement-invariant and movement-bias neurons in layer 5B using highdensity silicone probe recordings. Putative layer 5B projection neurons were identified using spike-width thresholding and electrode depth profiling based on retrograde labeling from the pons ( Figure S2G-S2I). We found similar proportions of movement-invariant (no bias) and   push-/ pull-bias neurons when comparing both recording methods ( Figure 2M and Figure   S2J-S2L), confirming that movement-invariant signaling dominated layer 5B responses, while a small proportion of neurons conveyed movement-specific information.

A small proportion of layer 5B neurons decode movement type
Next, we investigated how reliably movement-type could be decoded from single neuron and population changes in ∆F/F0 using a Gaussian Naïve Bayes classifier and logistic regression, respectively (see Methods). Approximately 37 % of neurons (172 / 468 neurons) displayed decoding accuracy scores above chance ( Figure 3A), similar to, but slightly higher than the combined proportion of identified push and pull bias neurons (see Figure 2M), likely reflecting subtle differences in the sensitivity of both approaches. Given the trial-to-trial variability in ∆F/F0 and resultant moderate decoding scores (see Figure 2I and Figure 3A), we reasoned that population responses could provide a more robust movement-related signal that would enhance decoding of movement type. By applying logistic regression, population decoding was found to be consistently more accurate when compared to single neuron decoding (  (A) Maximum decoding accuracy during peri-movement epochs generated using a Gaussian naïve Bayes classifier. Circles represent individual neurons; black horizontal lines indicate significance threshold. HDA, high decoding accuracy (orange); LDA, low decoding accuracy (gray). (B) Box-and-whisker plots showing median, interquartile range and range of single cell (naïve Bayes classifier, orange) and population (logistic regression, brown) decoding accuracy (n = 12 fields-of-view (FOVs), N = 6 mice, P = 2.8x10 -2 , Wilcoxon Signed-rank test). Black dots represent individual mice. (C) Mean population decoding accuracy for all neurons from a representative FOV or after removal of 10-50% neurons in order from high to low single cell decoding accuracy ( Figure 3A). Red shaded line depicts the 95% CI based on shuffled data. Dashed line, movement initiation (MI). (D) Change in population decoding accuracy for an example FOV after the sequential removal of neurons in order from high (orange) to low (gray) single cell decoding accuracy ( Figure 3A). Line, mean ± 95% CI. Red line, 95% CI based on shuffled data. (E) Change in population decoding accuracy for an example FOV after the random removal of individual neurons. Line, mean ± 95% CI. Red line, 95% CI based on shuffled data. (F) Box-and-whisker plots showing the median, interquartile range and range for ordered (high-to-low decoding accuracy) versus random removal of neurons (n = 12 FOVs, N = 6 mice, P = 2.8x10 -2 , Wilcoxon Signed-rank test). Black dots represent individual mice. larger proportion of neurons having to be removed before decoding accuracy reduced to chance (median prop. removed = 0.64, [0.57] IQR, N = 6 mice, P = 2.8x10 -2 , Wilcoxon signed rank test) ( Figure 3F). This dependency on high decoding accuracy neurons suggests that movement-specific information is routed through a selected subset of layer 5B neurons.
To further explore the underlying structure of layer 5B population activity, we employed principal component analysis (Churchland et al., 2012;Churchland et al., 2010;Cunningham and Yu, 2014;Kaufman et al., 2014;Stopfer et al., 2003). For the leading 16 principal components, we compared the difference between push and pull trials to compute a discrimination index (d') ( Figure S3A). Leading principal components tended to be more similar across actions, while action type was often better represented by higher components ( Figure   S3B). Despite correlating with population decoding scores, high d' values were not preferentially associated with the leading principal components of the population activity ( Figure  the proportion of IT neurons with decoding accuracy above chance was highest prior to movement initiation, while for PT neurons this occurred during movement execution (IT peak proportion of neurons, 0.19 at -192 ms; PT peak proportion of neurons, 0.21 at +544 ms, N = 6 and 5 FOVs from 5 and 4 mice, respectively), suggesting differential roles for IT and PT neurons in movement initiation and execution, respectively. Importantly, at no time during the peri-movement window was the proportion of high decoding accuracy neurons above 21 %  for either cell type ( Figure 4J), consistent with a small proportion of projection neurons conveying time-dependent movement-specific information.

Movement-specific signaling is distributed across layer 5B
To explore whether high decoding accuracy PT and IT neurons form functional clusters within CFA, we first detected the onset of movement-related ∆F/F0 changes. Within each FOV, activity changes occurred ~300 ms prior to movement, consistent with a role in preparation / initiation (Dacre et al., 2021;Estebanez et al., 2017;Isomura et al., 2009;Li et al., 2015), and tiled the peri-movement window up to reward delivery. Neurons displaying a range of ∆F/F0 onsets were spatially distributed across each FOV ( Figure 5A-5B). To explore correlations in peri-movement activity patterns, we split PT and IT neurons based on their decoding accuracy scores (high, low and all) and compared pairwise activity during push and pull trials. We found  Figure S5). Moreover, comparing the activities of PT and IT neurons as a function of their pairwise distance, suggested that neighboring neurons did not show correlated activity or spatiotemporal clustering ( Figure 5E-5G). Thus, our data suggest a model whereby movement-specific information is routed through small, distributed subpopulations of layer 5B projection neurons, while most neurons convey movement-invariant information related to the timing of movement execution ( Figure 5H).

Discussion
Here, we show that the majority of layer 5B neuron signaling in mice was movement-invariant, where similar activity patterns were generated during the execution of two diametrically opposing movements. Changes in activity were tightly locked to the peri-movement period, indicative of a generic motor signal relating to movement timing, but not to movement type.
Movement-or condition-invariant signaling also dominates in primate motor cortex, thought to trigger state-dependent switching from stable neural dynamics during rest towards oscillatory dynamics underpinning movement execution (Churchland et al., 2012;Churchland et al., 2010;Kaufman et al., 2014;Kaufman et al., 2016;Kurtzer et al., 2005) and is an emergent property of recurrent neural networks trained to recapitulate complex muscle patterns during reaching (Sussillo et al., 2015). In contrast to primate motor cortex, we found widespread movement-invariant responses at the single neuron level (Kaufman et al., 2016). This is unlikely to reflect differences in recording sensitivity, given that our imaging and electrophysiology approaches identified similar proportions of movement-invariant neurons across layer 5B (Wei et al., 2020;Zhou and Tin, 2021), or the limited number of movements in our task as movement-invariant responses have been shown in relatively simple tasks requiring few actions (Evarts, 1968;Hocherman and Wise, 1991;Messier and Kalaska, 2000;Riehle et al., 1994;Weinrich et al., 1984). Instead, this might reflect evolutionary differences in how motor cortex recruits and controls muscle activation during the transition from rest to movement execution. Cell-type-specific imaging identified that movement-invariant signaling  Nelson et al., 2021), providing an external trigger to transform motor cortical dynamics necessary for postural maintenance at rest to a neural state required for movement execution. This switch in neural dynamics would signify the intention to move, but not which movement will be executed (Elsayed et al., 2016;Kaufman et al., 2016). An important next step will be to develop methods to identify and selectively manipulate neurons displaying movement-invariant signaling to demonstrate their causal contribution to postural control and timing of movement initiation.
We reasoned that the execution of two diametrically opposing movements should, in principle, generate distinct patterns of cortical output dynamics, given differences in starting posture, direction of movement and temporal sequence of muscle activation (Isomura et al., 2009;Miri et al., 2017). Although we found that the majority of layer 5B neuron signaling was movementinvariant, a relatively small proportion of neurons displayed response bias towards either push or pull movements. The low prevalence of movement-specific signaling is unlikely to be due to masking of subtle changes in spike rate when using calcium reporters (Wei et al., 2020;Zhou and Tin, 2021), since we observed similar proportions of movement-specific signaling when performing high-density extracellular recordings of putative layer 5B projection neurons.
The firing rates of individual neurons in motor cortex reflect a complex combination of signals that have been shown to correlate with joint angle, direction, and speed (Georgopoulos et al., 1982;Moran and Schwartz, 1999;Paninski et al., 2004;Thach, 1978), while population dynamics reflect time-varying changes in neural state during the transition from rest to movement execution (Churchland et al., 2012;Churchland et al., 2010;Kaufman et al., 2014;Kaufman et al., 2016;Kurtzer et al., 2005;Sauerbrei et al., 2019). In mice, individual layer 5B neurons displayed moderate decoding accuracy scores, likely due to relatively high trial-totrial variability, while the population average was consistently higher across mice. We found that only a small proportion (~20%) of neurons contributed to high population decoding accuracy scores, with their combined effects accurately decoding three quarters of all trials.
Removing only a handful of neurons per FOV was sufficient to abolish decoding, confirming that a minority of neurons convey the majority of information regarding movement type. This dependency on only a few neurons has important implications for understanding how movement-specific information is encoded in primary motor cortex, given that mapping neural dynamics during the execution of a single movement task will likely uncover widespread movement-invariant signaling, that relates to limb movement, but not the specific movement being executed.
In mouse cortex, projection neurons display connectivity patterns both within and across classes that suggest general organizing principles (Brown and Hestrin, 2009;Kiritani et al., 2012;Maruoka et al., 2017;Morishima et al., 2011). IT neurons in motor cortex are strongly recurrently connected, whereas inter-class connectivity is largely directional from IT to PT but not vice versa, generating a hierarchical organization with unidirectional signaling from higherorder to lower-order output neurons (Kiritani et al., 2012). Asymmetric projection-class connectivity, as well as differences in input structure and intrinsic excitability (Anderson et al., 2010;Hooks et al., 2013;Kiritani et al., 2012;Oswald et al., 2013) provide a mechanism to flexibly route movement-specific information via two independent output channels depending on behavioral requirements. Our cell-type-specific imaging identified that only a small proportion of PT neurons conveyed movement-specific information. In contrast, almost half of IT neurons displayed movement-specificity, with similar proportions of both push and pull bias.
Although PT and IT activity onsets occurred prior to and throughout movement, consistent with both pathways contributing to movement initiation and execution (Chen et al., 2017;Economo et al., 2018;Li et al., 2015;Park et al., 2021;Wang et al., 2017), the proportion of IT neurons with high decoding accuracy was highest prior to movement initiation, while for PT neurons this occurred during movement execution. This suggest that information relating to movement type is first conveyed by IT neurons, which project to bilateral striatum and cortex,  Winnubst et al., 2019). Importantly, the proportions of PT or IT neurons decoding movement type never exceeded 25%, consistent with movement-specific signaling being confined to a relatively small subpopulation of layer 5B projection neurons. What is unclear is the extent to which movement-specific signaling in PT and IT neurons is organized by projection target as seen in anterolateral motor cortex during directional licking (Chen et al., 2017;Economo et al., 2018;Li et al., 2015). Targeting neurons based on molecular expression profiles and projection specificity (Muñoz-Castañeda et al., 2021;Winnubst et al., 2019) will provide a finer grained understanding of how movement-specific information is routed via molecularly distinct projection pathways.
We found that PT and IT neurons displaying high decoding accuracy were distributed across FOVs. This lack of functional clustering differs from the proposed modular organization of directionally tuned cells in primate motor cortex, where neurons with similar preferred direction tend to cluster into vertically oriented minicolumns approximately 50 -100 µm wide, repeated every 250 µm (Amirikian and Georgopoulos, 2003;Cheney et al., 1985;Georgopoulos et al., 2007;Jones and Wise, 1977), but consistent with the distributed spatial organization of direction-specific layer 5B projection neurons in mouse anterolateral motor cortex during the execution of a whisker-based object location discrimination task (Li et al., 2015). The apparent lack of spatial clustering in CFA is unlikely to be due to reduced sensitivity of our analysis methods as confidence intervals provide a lower bound indication of cluster size such that, if present, spatial clusters based on decoding accuracy would have to be less than ~50 µm.
Similarly, we found no evidence of temporal clustering in either high (movement-specific) or low (movement-invariant) decoding accuracy neurons, as expected given that the onsets of both PT and IT neuron activity changes occurred ~300 ms prior to movement and tiled the peri-movement period up to reward delivery. Our work extends previous findings in superficial layers of motor cortex showing neurons with task-related response properties are spatially intermingled (Galiñanes et al., 2018;Huber et al., 2012), to support a model in which movement-specific signaling in layer 5B is routed via small but distinct subpopulations of

Animal husbandry and general surgery
Male adult C57BL/6J wild-type mice (5-12 weeks old, 20-30 g, 1-4 animals per cage) were maintained on a reversed 12:12 hour light:dark cycle and provided ad libitum access to food and water as well as environmental enrichment. All experiments and procedures were approved by the University of Edinburgh local ethical review committee and performed under license from the UK Home Office in accordance with the Animal (Scientific Procedures) Act Instruments, CA, USA) using a hand-held dentist drill with 0.5 mm burr. A small lightweight headplate (0.5 g) was implanted on the surface of the skull using cyanoacrylate glue and dental cement (Lang Dental, IL, USA) and mice were left for at least 48 hours to recover.

Behavioral training
After recovery from head plate surgery, mice were handled extensively before being head restrained and habituated to a custom forelimb lever push / pull behavioral setup. Mice were trained to perform two diametrically opposing movements -a 4 mm push or pull of a moveable lever -in response to a 6 kHz auditory cue to obtain a 5 µl water reward. Mice rested their right forepaw on a stationary lever while making push or pull movements with their left forepaw.
Upon completion of a successful push or pull (determined by the status of IR beams at either end of the lever's travel), the moveable lever was locked in place for the duration of the reward period (3 s) and the water reward was delivered by an automated spout -both locking mechanism and spout were controlled by micro servo motors (HXT900, HexTronik). To increase task engagement, mice were placed on a water control paradigm (1 ml/day) and weighed daily to ensure body weight remained above 85% of baseline. Mice were trained once per day for 30 mins and advanced through different phases of the task once they achieved > 50 rewards in each of two consecutive sessions or > 70 rewards in a single session. Initially, mice were required to perform uncued push and pull movements to obtain rewards (phase 1).
Next, an auditory cue was introduced with pseudo-random inter-trial-interval (ITI) of 4-6 s and a response window of 10 s (phase 2). During the ITI, mice had to hold the moveable lever still -spontaneous movements of the lever within the ITI triggered a reset, where the lever was locked in the original position for 1 s. The response window was gradually reduced to 5 (phase 3) and then 2 s (phase 4) across training sessions. Mice were deemed "expert" after achieving > 70 rewards on two consecutive days of training with a response window of 2 s. During 2photon imaging experiments (see below), a 1 s delay between completion of a successful movement and reward delivery was introduced to temporally separate activity related to reward consumption from the forelimb movements of primary interest.

Forelimb kinematic tracking
Behavior was recorded using a high speed camera (60 fps during population calcium imaging of layer 2/3 or 5B -Prosilica GC780C, Allied Vision, Germany; or either 100 or 300 fps for cell-type specific calcium imaging or in vivo pharmacology, respectively -Mako U U-029, Allied Vision) and acquired using Streampix 7 (Norpix, Canada) or Mantis64 (https://www.mantis64.com/). To measure gross forelimb movement, a region of interest (ROI) was manually drawn around the left forelimb and the frame-to-frame difference in pixel intensity was calculated using the formula: where cf,i is the grayscale level of pixel i in frame f. The resulting motion index was smoothed with a 1 s LOESS filter then aligned to behaviorally relevant time points (lever displacement, cue presentation etc.), with videos and behavior resampled to a common sampling rate. Motion index onsets were calculated by aligning the motion index to the lever movement and identifying the first point prior to movement where mean motion index was > threshold (mean upper bound of 95% confidence interval during baseline). Directional tracking of forelimb and lever movement was performed using Deep Lab Cut, a markerless video tracking toolbox (Mathis et al., 2018).
Tracking data was aligned to cue presentation, baselined to mean xy position during the 100 ms prior to cue and then cropped between movement initiation and movement completion.
For presentation, trials of different durations were resampled to a fixed length to enable a mean trajectory to be plotted across multiple trials.

In vivo pharmacology
To assess the behavioral effects of caudal forelimb area (CFA; N = 10) or hind limb motor cortex (M1hl; N = 5) inactivation, mice trained to expert level in the task had a small burr hole opened directly above the target area (CFA: 1.6 mm lateral, 0.6 mm rostral to bregma; M1hl: 1.25 mm lateral, 1.25 caudal to bregma) before recovering for > 90 mins. After 10 mins of baseline behavioral task execution, the lever was locked and a small volume of the GABAA receptor agonist muscimol (dissolved in external saline solution containing 150 mM NaCl, 2.5 mM KCl, 10 mM HEPES, 1.5 mM CaCl2 and 1 mM MgCl2, adjusted to pH 7.3) was injected into the target area -200 nl of 2mM muscimol was injected at each of 4 or 5 sites forming a 0.5 mm X centered on 1.6 mm lateral, 0.6 mm rostral to bregma (CFA) or 1.25 mm lateral, 1.25 caudal to bregma (M1hl). Each injection site was at a depth of 0.7 mm below the cortical surface. To confirm the anatomical location of drug injection, 1% w/v of retrobeads (red, Lumaflor Inc.) was included in the injected solution. In a subset of mice (N = 5 of 10), animals were also injected in CFA with saline (vehicle only; injection performed on a different day). In these experiments, mice were randomly assigned to drug or control groups (each mouse received one injection of muscimol and one injection of saline) and experiments performed blinded. After each experiment, mice were transcardially perfused and coronal (60 μm) or sagittal (100 μm) sections were cut with a vibratome (Leica VT1000S), mounted with Vectashield mounting medium (H-1000, Vector Laboratories), imaged using a fluorescence microscope (Leica DMR, 5x objective) and manually referenced to the Paxinos and Franklin Mouse Brain Atlas (Paxinos and Franklin, 2008). Behavioral metrics were analyzed comparing videos of 10 mins pre and post injection -the experimenter was blinded to the experimental groupings. Behavioral video data for all pharmacology experiments was captured using a highspeed camera (Mako U U-029, Allied Vision), and paw position accuracy was calculated as the proportion of trials in which mice were holding the moveable lever with their left forepaw at the onset of the auditory cue.

Quantifying muscimol diffusion
To measure muscimol diffusion, a small volume of muscimol-BODIPY TMR-X Conjugate (ThermoFisher Scientific; dissolved in 0.1PBS w/1% dimethyl sulfoxide) was injected into CFA (200 nl of 2 mM at 4 sites centered on 1.6 mm lateral, 0.6 mm rostral to bregma at a depth of 0.7 mm below the cortical surface). To mark the center of injection, pipettes were backfilled with a small volume (~20 nl) of green retrobeads (Lumafluor Inc.) prior to filling with muscimol-BODIPY. Following injection, animals were transcardially perfused and brains snap-frozen on dry ice 10 minutes from completion of the muscimol injection. Brains were stored on dry ice, coronal sections (60 μm) collected with a cryostat (Leica) at -20C and imaged with a light microscope (Leica DMR, 5x objective). We assumed maximum fluorescence ≈ maximum injected concentration and that grayscale pixel intensity was proportional to muscimol-BODIPY concentration. Therefore, pixel values were thresho lded at the equivalent pixel value of an EC20 concentration of muscimol and fluorescence outlines were drawn to generate a 'spread profile'. Green retrobeads were used to mark the center of each injection, and images were aligned to the injection center of gravity. From the aligned profiles, a modal spread profile (i.e., pixels with positive grayscale values across all mice) was generated and aligned to the Paxinos and Franklin Mouse Brain Atlas (Paxinos and Franklin, 2008).

Retrograde tracing
To selectively label pyramidal tract (PT) neurons in layer 5B of CFA, red retrobeads

2-photon imaging
To perform population calcium imaging in layer 5B ( to 920 nm wavelength. Images were acquired at ~30 Hz with a 16x objective lens (0.8 NA; Nikon) with SciScan image software (Scientifica) and synchronized with external high speed videos and behavioral data using Mantis64. To facilitate reliable imaging at depths > 500 µm, all imaging was performed acutely following cranial window implantation -usually within 24 hours of the surgery. Laser power was 91 -173 mW (mean = 143 mW) across all imaging sessions, well below the damage thresholds of 250 -300 mW outlined in Podgorski and Ranganathan (2016). The combination of low pixel dwell time and systematic blanking of fieldof-view edges, where the dwell time is higher, and the addition of room temperature artificial we used linear ramps with defined onset times and a rise time of 0.5 s mimicking GCaMP6s kinetics. We then calibrated the onset detection algorithm on the simulated data (100 simulated cells with 30 simulated trials per cell and artificially added noise in each trial matching the noise level in the imaging data) and updated it by a small correction factor.
Neurons with movement bias were detected using the same classification criteria described above but across movements (i.e. significant differences in bootstrapped ∆F/F0 baseline-topeak or 250ms peri-movement bins). Where significant differences between push and pull trials were identified, the neuron was classified as biased for the movement with the larger response from baseline.

Trial to trial correlations
To assess the similarity of trial-to-trial activity, the average pairwise trial-to-trial correlation coefficients (Pearson's r) of the peri-movement ∆F/F0, smoothed with a 1 s LOESS filter, were computed for each neuron. Data are presented as bootstrapped medians per animal for each movement preference class (10,000 repetitions, 50 samples). To investigate the relationship between trial-to-trial similarity of movement and population ∆F/F0, pairwise trialto-trial correlation coefficients (Pearson's r) of peri-movement motion index and pairwise trialto-trial correlation coefficients (Pearson's r) of the peri-movement population ∆F/F0 of the same trials were compared. Population ∆F/F0 was the sum of all movement-related neurons in each FOV. Data were binned according to the pairwise trial-to-trial correlation coefficients of their motion index and are presented as bootstrapped median (10,000 repetitions, 50 samples) within each bin.

Extracellular recording and spike sorting
To compare neural activity during the task, extracellular unit recordings in CFA were performed using acutely implanted silicone probes (Neuropixels Phase 3B probes, IMEC).
Data were acquired from the 384 channels closest to the probe tip (bank 0) with SpikeGLX software at 30 KHz, an amplifier gain of 500 for each channel and high-pass filtered with a cutoff frequency of 300 Hz. Spike data were synchronized with external high-speed videos and behavioral data (cue presentation, lever movement and reward delivery) using Mantis64. Spike sorting was performed using Kilosort3 to automatically cluster units from raw data (Pachitariu et al., 2016). The resulting spike clusters were manually curated using Phy (https://github.com/cortex-lab/phy), and any unit with sufficient refractory period violations, inconsistent waveform amplitude across the duration of the recording, or clipped amplitude distribution was excluded from analyses. Probe location was confirmed via DiI (Thermofisher) reconstruction of the recording track and compared to retrogradely labeled PT neurons -FastBlue (Polysciences) injected into the pons -in each animal to limit analysis to units within layer 5B (upper boundary, 500-680 μm; lower boundary, 900-1080 μm; N = 5 mice). Spike widths were calculated as the duration from trough to following maxima of the spike waveform.
Putative pyramidal neurons were identified as units with median spike widths greater than 0.4 ms.
To classify units as responsive to the push or pull movements, firing rates were calculated by convolving motion index-aligned spike times with a 50 ms Gaussian kernel and mean changes in firing rate were calculated by subtracting the mean firing rate during a baseline period (1 s period before cue presentation) from the mean firing rates in 250 ms bins tiling a response period extending back from ( ℎ ( , ( )) to include motion index onset. Briefly, motion index onsets were calculated as the first point after cue where the motion index was > threshold (threshold = mean motion index in a baseline window from the 1.5 s before cue plus 2 SD). Trials where the motion index onset came before cue presentation were excluded from analysis. Significant responses were identified by comparing bootstrapped 95% confidence intervals of these mean changes in firing rates to 0; if at least one bin differed from 0, that unit is considered movement-responsive. Movement-responsive units were classified as having a push or pull bias if the confidence intervals in the earliest significantly responsive bin did not overlap; in these cases the units were classified as biased for the movement with the larger change from baseline.

Neural decoding
To decode movement type in single neurons we employed a naïve Bayes classifier, where distributions of features are assumed to be Gaussian. Movement-aligned ∆F/F0 data were assessed within a 5 s peri-movement window to produce a time series for the decoding accuracy. At each time point, leaving one trial out (test trial), the likelihood of determining push or pull was estimated based on the remaining trials (training set). The leave-one-out procedure was then repeated for all trials to calculate a mean decoding accuracy time series for each neuron. The resulting time series were analyzed within a peri-movement epoch -the perimovement epoch began at -0.15 s relative to motion index onset and ended based on the peak ∆F/F0 response of each neuron; the position of the median peak was calculated for each movement type and the later of these time points used as the cut off. To identify neurons with decoding performance above chance, the bootstrapped distributions of decoding accuracy scores were compared against a threshold value for each session. Only neurons with at least 1 bin significantly higher than threshold were defined as high decoding accuracy. The threshold for each session was calculated based on modeled data composed of random samples from a Gaussian distribution with the same number of trials as the experimental data.
For each session, modeled data accuracy was calculated 1000 times, assuming a prior probability of 50:50, and the mean + 2 SD was used as the threshold for significance. For population level classification of movement type, we employed logistic regression. As above, the decoding accuracy of time series for each population was generated via leave-one-out design looped over all the trials in a given session. Population decoding accuracy was defined as the maximum decoding accuracy in any 250 ms bin within the peri-movement epoch.
Population decoding was also performed on subsets of the entire population. Neurons were removed from the population one at a time, either in order from highest to lowest decoding accuracy score or randomly, with the network retrained for each iteration. The process was repeated 25 times in the random condition and the median of all responses used as the representative example for comparison with the ordered removal condition. Subpopulations of neurons decoding significantly above chance were determined by comparing decoding scores with a shuffled dataset (sampled randomly from 1000 time points with the trial labels (push or pull) randomized for each sample). If confidence intervals from the population data did not overlap with those of the shuffled data, population scores were deemed to be above chance.
In 3/12 FOVs of the layer 5 dataset with low proportions of high decoding neurons and / or low trial numbers, the population decoding accuracy was never significantly above chance. These FOVs were excluded from the comparison between ordered and random removal of neurons.

Dimensionality reduction
Raw fluorescence traces for all trials with successful movements in a 7.5 s peri-movement window were concatenated, filtered with a three frame (75 ms) wide boxcar kernel, whitened and transformed with principal component analysis. The principal components (PCs) corresponding to the 16 highest eigenvalues, which corresponded to an average 83% (range [77 94]) cumulative explained variance, were analyzed. To compute trajectories in PC space, PC projections for all trials were averaged (separately for push or pull) and the variance and 95% confidence intervals for each time point estimated via 100-fold bootstrapping. The separability of the trajectories for push or pull was computed in each PC separately as d'(t) = |mpush(t) -mpull(t)| / √0.5(vpush(t) + vpull(t)), where mpush(t) and mpull(t) are the mean trajectories for push and pull, and vpush(t) and vpull(t) the corresponding variances, estimated from trials.
The separability d'(t) was bootstrapped from 400 samples, and variance and 95% confidence intervals estimated from this sample. d'(t) was computed for all frames from movement onset to completion, where the latter was the longest movement duration recorded in each session.
PCs were considered separable if the difference between d'(t) and dshuffle'(t) (obtained in the same way from trial-shuffled data) was within the 95% confidence interval, which was estimated from the sum of the relevant bootstrapped variances. For each FOV, the largest significant d'(t) was used; in 1/12 FOVs no PCs showed significant separability (this FOV also had a chance level population decoding score), and were excluded in the summary data.

Spatiotemporal organization
To assess the functional (temporal) organization of simultaneously recorded populations of neurons, pairwise correlation coefficients (Pearson's r) from the smoothed (1 s LOESS filter) and motion index-aligned ∆F/F0 within the peri-movement epoch were compared. Data were split based on their decoding accuracy scores and the bootstrapped median difference between high decoding accuracy neurons and those of the population were subtracted and a median difference calculated per sample. This process was repeated 10,000 times to generate a distribution for high decoding neurons versus the entire population and the same sampling procedure was used to investigate the correlations within low decoding accuracy neurons. To investigate spatial clustering, bootstrapped median differences between high decoding accuracy neurons and the population using pairwise distances (defined as the Euclidean distance between the centroids of manually drawn ROIs from 2-photon imaging processing) were compared. A Generalized Linear Mixed-Effects Model: *+",-"./ × 0/120/, + 34*/ + was used to model the pairwise correlation coefficient as a function of pairwise distance (continuous), decoding accuracy and movement type. Pairwise distance and decoding accuracy were modeled as interacting fixed terms, while movement type and animal were modeled as random intercepts to account for the dependency of the measurements on observations from the same animal and across the different movement types. The model was estimated using the restricted maximum likelihood, or REML, method (Bartlett and Fowler, 1937). Model assumptions were verified by comparing residual versus fitted values for each covariate in the model against each covariate removed from the model.

Statistics
Data analysis was performed using custom-written scripts in MATLAB 2019a or Python 3.
Where multiple measurements were made from a single animal, suitable weights were used to evaluate summary population statistics. Statistical significance was considered when p < 0.05 unless otherwise stated. Data were tested for normality and parametric/non-parametric tests were used as appropriate and as detailed in the text. The GLMM was designed in Python using the statsmodels library (Seabold and Perktold, 2010). Bartlett, M.S., and Fowler, R.H. (1937). Properties of sufficiency and statistical tests. Proceedings of the Royal Society of London Series A -Mathematical and Physical Sciences 160, 268-282.