Abstract
The ability to sequence movements in response to new task demands enables rich and adaptive behavior. Such flexibility, however, is computationally costly and can result in halting performances. Practicing the same motor sequence repeatedly can render its execution precise, fast, and effortless, i.e., ‘automatic’. The basal ganglia are thought to underlie both modes of sequence execution, yet whether and how their contributions differ is unclear. We parse this in rats trained to perform the same motor sequence in response to cues and in an overtrained, or ‘automatic’, condition. Neural recordings in the sensorimotor striatum revealed a kinematic code independent of execution mode. While lesions affected the detailed kinematics similarly across modes, they disrupted high-level sequence structure for automatic, but not visually-guided, behaviors. These results suggest that the basal ganglia contribute to learned movement kinematics and are essential for ‘automatic’ motor skills but can be dispensable for sensory-guided motor sequences.
Introduction
Our brain’s capacity to organize movements and actions in response to new challenges allows us to imitate trendy dance moves or play Chopin etudes from sheet music. Assembling motor sequences in such controlled and deliberate ways can be mentally taxing and computationally costly, resulting in slow1 and error-prone performances subject to cognitive interference2,3. However, executing the same motor sequence, such as typing a password or playing a favorite piano sonata, repeatedly and consistently, can turn it into a continuous task-specific movement pattern that is fluid, fast4–7, precise5, efficient8, and less cognitively demanding2,9,10; in a word: ‘automatic’11–13. Thus, the very same motor sequence can be executed in modes that are both qualitatively and subjectively distinct1,12–14.
Given that the specification of the same motor sequence can differ so markedly (Fig. 1), the underlying neural circuits, or the functions they implement, are thought to differ as well10,15–21. For example, a discrete motor sequence whose progression is informed by external sensory cues will engage a serial action selection process that likely engages higher-level circuits22,23 (Fig. 1a). Highly overtrained, or ‘automatic’, motor sequences, on the other hand, can be specified in terms of sequential low-level motor commands (Fig. 1c)24–26. Indeed, our colloquial reference to automatic behaviors being stored in ‘muscle memory’ reflects a subjective sense that they are, in contrast to sensory-guided motor sequences, less reliant on cognitive processes and produced by circuits closer to the motor periphery11,14,15,19,27.
Besides being sensory-guided and automatic, motor sequences can also be informed by working memory (Fig. 1b), as is the case when we try to imitate our piano teacher or reproduce our own improvisations from a few moments ago. Such motor sequences are akin to sensory-guided ones in that they too demand considerable mental effort28 and are informed by (remembered) sensory experiences. Given these shared qualities, sensory-guided and working memory-guided sequences are often collectively referred to as ‘controlled’1,13,16,17,29 to distinguish them from automatic behaviors. However, working memory-guided motor sequences are also similar to automatic ones in that their progression is informed by internal neural processes rather than immediately available sensory cues. Hence these two modes of sequence execution, automatic and working memory-guided, are sometimes referred to as ‘internally’ generated, to distinguish them from ‘externally’ cued ones20,22,30.
Yet the degree to which the distinctions between ‘controlled’ vs ‘automatic’ and ‘externally’ vs ‘internally’ generated motor sequences map onto specific neural circuits and mechanisms is unclear16,22,31,32. Here we set out to probe how the neural implementations of automatic, visually- and working memory-guided sequences differ, focusing on the sensorimotor and associative arms of the basal ganglia (BG). While these pathways have been implicated in various aspects of motor sequence learning and execution18,27,33–36, their specific contributions to sensory-guided, working memory-guided, and automatic behaviors have yet to be fully understood15,19,37–39.
Because sensory-guided motor sequences can be equated to serial action-selection or decision making23,32,42, it follows from BG’s acknowledged role in these processes18,43–45 that they should be central to generating such behaviors. While recordings showing sequence-specific neural activity in the primate and human BG are consistent with this view46,47, some recent studies have called this into question. For example, inactivating a main output nucleus of the BG (Globus Pallidus internal segment, GPi) in monkeys did not affect their ability to generate visually-guided reaching sequences beyond slightly reducing the vigor of the constituent movements48.
BG has also been implicated in working memory-guided sequences. Indeed, recordings from GPi neurons in monkeys performing the same sequences guided by visual cues and working memory revealed more task-modulated neurons in working memory trials49,50, an indication that the BG contribute differently to the two execution modes, a notion supported also by studies in humans51.
In contrast to sensory- and working memory-guided motor sequences, ‘automatic’ motor sequences lack any flexibility – once initiated, the progression of the behavior is fixed, obviating the need for discrete action selection. Studies showing neural activity in sensorimotor regions of the BG preferentially encoding the start and stop of overtrained behaviors52–56 imply that repeated practice transforms an initially controlled motor sequence – with distinct elements and choice points – into an immutable continuous sequential movement pattern1,12. While sensorimotor regions of the BG are thought to help select and initiate such ‘chunked’ behaviors57–59, the implication is that their detailed progression and specification are elaborated in downstream circuits.
The generality of this view, however, was challenged by recent studies showing that the BG is necessary for generating the detailed kinematics of stereotyped learned behaviors24,60–63. In these studies, activity patterns in the sensorimotor striatum did not reflect the starts and stops of the overtrained behaviors, nor specific choice points in the sequences; rather, the neurons encoded the kinematic details of the sequential behaviors, suggesting an essential role for the BG in the continuous low-level kinematic specification of automatic behaviors (Fig 1c).
To get at the distinction in how the BG contribute to the different types of motor sequences, we designed a task for rats that trains them to perform the very same motor sequence in the three execution modes discussed above (Fig 1). In the first mode, each element in the sequence is cued by a visual stimulus (‘sensory-guided’); in the second, the sequence is informed by working memory (‘working memory-guided’); and in the third, the sequence is ‘automatic’ thanks to lengthy overtraining on the same sequence. Using the pianist as an analogy, the first is akin to playing a piece from sheet music, the second is repeating that very piece from memory, and the third is the condition that emerges after practicing it diligently for many weeks.
To distinguish the contributions of the BG to these different execution modes, we focused on the dorsal striatum, the input zone to the sensorimotor and associative arms of the BG. Surprisingly, neural recordings in sensorimotor striatum (dorsolateral striatum, DLS, in rodents) – a region implicated in behavioral automaticity11,64 – revealed no meaningful difference across the execution modes, with neurons representing low-level kinematic features in all cases. The neural population showed no selectivity for higher-level attributes of the behavior, such as the order or sequential context of a given movement.
Lesions to the DLS, however, revealed a stark contrast across execution modes, with sequence organization being essentially lost for automatic and working memory-guided sequences, but largely preserved for sequences informed by visual cues. Additionally, the movements across all three conditions were slower and more variable post-lesion, resembling the animal’s engagement with the task early in learning, a finding consistent with a general role for the BG in specifying the detailed kinematics, including the vigor24,48, of learned movements. Lesions of the associative region of the striatum (dorsomedial striatum in rodents; DMS), widely implicated in behavioral flexibility65–69, had no lasting effect on either the performance or the kinematics in either execution mode.
A network model simulating DLS and its interactions with other motor circuits recapitulated these results, including the similarity of DLS activity across execution modes and the differing impacts of DLS lesions on visually-guided and automatic sequences. The model also explained the shared role for DLS in the kinematic specification across execution modes by showing that differing inputs to DLS are transformed into common activity patterns that sculpt movement features through interactions with downstream control circuits.
Results
A discrete sequence production task for rats
To directly compare the neural substrates of visually-guided, working memory-guided, and automatic motor sequences, we designed, based on similar paradigms in humans and non-human primates (NHPs)20,23,70, a discrete sequence production task in which rats execute the same motor sequence in the three different modes.
To facilitate comparisons to other motor-related studies in rodents, including our own24,25, which probe forelimb33,36,54,56,62,72 and whole-body orienting73–75 movements, we opted for a ‘piano-playing’ task, in which rats are rewarded for performing sequences of three keypresses on a three-key ‘piano’ in a prescribed order, alternating between forelimb lever-presses and orienting movements (12 possible sequences; see Fig. 2a, Supplementary Movie 1).
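The count of 12 possible sequences can be reproduced by enumeration. The sketch below assumes that successive presses must land on different levers, a rule inferred here from the alternation of lever-presses and orienting movements rather than stated explicitly; the lever labels are hypothetical.

```python
from itertools import product

LEVERS = "LCR"  # hypothetical labels for the left, centre and right levers

# Every ordered triple of lever presses, with repeats allowed.
all_triples = ["".join(p) for p in product(LEVERS, repeat=3)]

# Assumed rule: an orienting movement separates successive presses,
# so the same lever is never pressed twice in a row.
valid = [s for s in all_triples if s[0] != s[1] and s[1] != s[2]]

print(len(all_triples), len(valid))
print(valid)
```

Under this assumption, a uniformly random choice matches a given target with probability 1/12 (about 8.33%) among the valid sequences, or 1/27 (about 3.7%) among all three-press triples.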
Rats were initially trained to press a single lever for a water reward, then to associate a visual cue above each lever with pressing that lever (see Methods). After acquiring the cue-action association for each of the levers (5164 ± 948 trials; mean ± s.e.m), rats transitioned from single presses to two- and, ultimately, three-element sequences. In ‘controlled’ training sessions, the rewarded keypress sequence was either signaled directly and sequentially by visual cues (CUE) or had to be remembered from the instructed sequence of previous trials (working memory, WM; see Methods). This trial design, adopted from studies in NHPs30, allowed us to compare performance and neural dynamics for sequences externally guided by visual cues and internally generated from working memory20,21,76–78. Blocks of CUE and WM trials were interleaved, and, in each, one of the 12 possible sequences was randomly selected and rewarded (Fig. 2b).
In separate ‘automatic’ sessions (see Methods), animals were trained to produce the very same pre-determined keypress sequence - randomly chosen for each rat from the 12 possible sequences - for the duration of the months-long experiment (overtrained mode, OT). Because this automatic sequence is one of the 12 sequences rewarded in the controlled sessions, we could compare the same motor sequence across the three distinct execution modes.
Rats master the ‘piano playing’ task
Rats learned to produce the prescribed sequences under all three conditions (CUE, WM, OT) (Fig. 2c-d). We deemed rats to be ‘experts’ when both their success rates and trial times were reliably within 0.5 σ of asymptotic performance values (see Methods). Expert performance was reached after 17,623 ± 7,616 trials (75 ± 23 days) in the CUE mode, 11,330 ± 5,278 trials (88 ± 30 days) in the WM mode, and 14,061 ± 7,082 trials (72 ± 21 days) in the OT mode.
The success rate of expert rats was, on average, 60.19 ± 10.73% (CUE), 44.29 ± 15.45% (WM) and 79.57 ± 11.84% (OT). Note that in all cases, this is many times chance performance, which would be 8.33% (1/12) considering only the 12 prescribed sequences or 3.7% (1/27) considering all possible three-element keypress sequences. As is expected from similar learning paradigms in humans and NHPs6,7,20, the mean and variability of the trial times decreased with learning (Fig. 2e-f), while the stereotypy of the associated movement patterns increased across all three execution modes (Fig. 2g). Finally, to estimate the efficiency, or smoothness8,79, of a movement, we used the inverse of the spectral arc length71, a noise-robust and dimensionless metric that reflects the smoothness, or fluidity, of a trajectory (see Methods). Consistent with expectations from humans and NHPs8,80,81, this metric increased over learning.
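To make the smoothness measure concrete, the following is a simplified sketch of a spectral arc length computation: the arc length of the amplitude-normalised Fourier magnitude spectrum of a speed profile, up to a cutoff frequency. The cutoff, zero-padding level, and toy speed profiles are illustrative assumptions, and the exact transformation used in the paper (the ‘inverse’ of this quantity) is specified in its Methods rather than here.

```python
import numpy as np

def spectral_arc_length(speed, fs, cutoff_hz=10.0, pad_level=4):
    """Arc length of the normalised magnitude spectrum of a speed profile.

    Values are negative; closer to zero means a smoother movement.
    cutoff_hz and pad_level are illustrative choices, not the paper's.
    """
    speed = np.asarray(speed, dtype=float)
    # Zero-pad to a power of two for a finely sampled spectrum.
    n_fft = int(2 ** (np.ceil(np.log2(len(speed))) + pad_level))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    mag = np.abs(np.fft.rfft(speed, n=n_fft))
    mag = mag / mag.max()              # amplitude-normalise (dimensionless)
    keep = freqs <= cutoff_hz
    f = freqs[keep] / cutoff_hz        # rescale the frequency axis to [0, 1]
    m = mag[keep]
    return -np.sum(np.sqrt(np.diff(f) ** 2 + np.diff(m) ** 2))

# Toy speed profiles: a bell-shaped movement, with and without a 6 Hz ripple.
fs = 100.0
t = np.arange(0, 1, 1 / fs)
smooth = np.exp(-0.5 * ((t - 0.5) / 0.15) ** 2)
jerky = smooth * (1 + 0.5 * np.sin(2 * np.pi * 6 * t))

sal_smooth = spectral_arc_length(smooth, fs)
sal_jerky = spectral_arc_length(jerky, fs)
print(sal_smooth, sal_jerky)
```

Because the spectrum is normalised by its peak and the frequency axis by the cutoff, the value does not depend on the amplitude units of the speed trace, which is what makes the measure dimensionless.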
Kinematic similarities across execution modes and movement elements
Probing a distinction in how neural circuits implement the different types of motor sequences requires dissociating differences in execution mode from differences in movement kinematics. To establish the degree to which kinematics for the same motor sequence across execution modes (i.e. CUE, WM or OT) were similar, we tracked the rat’s dominant forelimb (i.e. the one pressing the lever) and nose from videos recorded from the side and the top (Fig. 2g, Supplementary Movie 1, Extended Data Fig. 1a)82,83. Comparing trials of similar duration across execution modes revealed very similar forelimb trajectories. Pairwise correlations of trajectories, linearly warped to lever press times, had similar distributions whether the trials were from within or across modes (Fig. 2h). Nose trajectories were also similar across trials and execution modes and had high pairwise correlations (Extended Data Fig. 1b-c). The similarity in kinematics is important because it allows us to interpret any potential differences in neural activity and sensitivity to neural circuit manipulations as being due to differences in execution mode (OT, WM, CUE) rather than in low-level aspects of motor implementation.
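The warping-and-correlation comparison can be illustrated with a minimal sketch. It assumes piecewise-linear time warping that maps each trial's lever-press times onto a shared template, followed by Pearson correlation of the resampled traces; the function names, template times, and toy trajectories are hypothetical, and a single coordinate is used for brevity.

```python
import numpy as np

def warp_to_template(t, traj, press_times, template_times, n_samples=100):
    """Piecewise-linearly rescale time so each press lands on template_times."""
    t, traj = np.asarray(t, float), np.asarray(traj, float)
    inside = (t >= press_times[0]) & (t <= press_times[-1])
    # Map this trial's clock onto the shared template clock.
    warped_t = np.interp(t[inside], press_times, template_times)
    grid = np.linspace(template_times[0], template_times[-1], n_samples)
    return np.interp(grid, warped_t, traj[inside])

def pairwise_correlations(warped_trials):
    """Pearson correlation for every pair of warped trials (upper triangle)."""
    r = np.corrcoef(np.vstack(warped_trials))
    return r[np.triu_indices_from(r, k=1)]

# Toy example: the same movement produced at two different tempos.
template = np.array([0.0, 1.0, 2.0])
trials = []
for presses in (np.array([0.0, 0.4, 1.0]), np.array([0.0, 0.8, 1.2])):
    t = np.linspace(presses[0], presses[-1], 200)
    phase = np.interp(t, presses, template)
    trials.append(warp_to_template(t, np.sin(np.pi * phase), presses, template))

corr = pairwise_correlations(trials)
print(corr)
```

Comparing the distribution of such correlations for trial pairs drawn within a mode against pairs drawn across modes is then a matter of grouping the pairs by their mode labels.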
Overtrained motor sequences have signatures of automaticity
In humans, motor automaticity is distinguished by improved performance, increased movement speed, and less variable execution times1,11,12. While we found that the kinematics for the same motor sequence were similar across session types (Fig. 2e-f, Supplementary Movie 1), we parsed these metrics in the behavioral data to assess whether automaticity had been established in OT sessions (Fig 3a-d).
Consistent with signatures of automaticity, we found that trials in OT sessions were more successful, faster, and less variable than when performing the same sequence in controlled sessions (Fig. 3a-c). We also examined the failure modes, hypothesizing that rats performing automatic sequences should be more systematic, or less variable, in the errors they make84. In agreement with this, we found that the entropy, or randomness, of erroneous sequences was much lower in the OT sessions than for either of the controlled sessions (CUE, WM; Fig. 3d).
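As an illustration of the error-entropy measure, the sketch below computes the Shannon entropy (in bits) of the empirical distribution of erroneous sequences. The estimator, the logarithm base, and the toy error lists are assumptions made for illustration; the paper's exact procedure is described in its Methods.

```python
from collections import Counter
from math import log2

def sequence_entropy(sequences):
    """Shannon entropy (bits) of the empirical distribution of sequences."""
    counts = Counter(sequences)
    total = sum(counts.values())
    # Sum of p * log2(1/p) over the observed sequence types.
    return sum((c / total) * log2(total / c) for c in counts.values())

# Hypothetical error lists: one stereotyped, one spread over many sequences.
stereotyped_errors = ["LCR"] * 9 + ["LRC"]
varied_errors = ["LCR", "LRC", "CLR", "CRL", "RLC", "RCL", "LCL", "CLC", "RCR", "CRC"]

print(sequence_entropy(stereotyped_errors))
print(sequence_entropy(varied_errors))
```

A low value means that errors concentrate on a few sequences, as reported for OT sessions, whereas errors spread evenly over many sequences push the value towards its maximum.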
Furthermore, if automatic motor sequences are consolidated and defined in terms of continuous low-level motor commands24,62,63,85, and not as serial discrete action selection, unsuccessful trials in OT sessions should be dominated by errors relating to variability in movement kinematics (e.g. a forelimb swipe at the ‘correct’ lever that misses the target), as opposed to errors in higher-level sequencing aspects (Supplementary Movie 2).
In support of this, failures in OT trials were mostly due to rats swiping at but missing the ‘correct’ lever or failing to depress it beyond the threshold for detection (63.81 ± 19.56% of all OT errors were of this type; Fig. 3e). This was in contrast to controlled trials, in which unsuccessful attempts were dominated by true sequence errors, where rats orient towards and press the ‘wrong’ lever (only 17.6 ± 18.25% of CUE and 29.62 ± 19.02% of WM errors were motor errors). Interestingly, potential sequence errors in the OT sessions were twice as likely to come after trials with motor execution errors as after correct trials (Fig. 3f, see Methods), consistent with a drop in reward triggering increased motor exploration84.
Similar to studies in humans and NHPs performing both sensory cued and automatic motor sequences4,48,86, rats were more likely to execute the overtrained sequence in controlled sessions compared to chance (12.35±4.43% in CUE, 14.99±4.85% in WM, where chance is 8.3%; Fig. 3g).
These differences in the quality of motor sequence execution and the error modes across the distinct session types are consistent with studies comparing sensory cued and automatic motor sequences in humans6,7,13 and indicate that automaticity of the overtrained motor sequence, as commonly defined in the literature, had been achieved.
DLS represents motor sequences similarly irrespective of execution mode
We had designed our behavioral paradigm to directly probe whether and how the neural implementations of different types of motor sequences differ at the level of the striatum. If its sensorimotor region, DLS in rodents, distinguishes modes of execution at the action selection-level, as widely believed43,44,46,47,58, we argued that it should manifest as a difference in the associated neural activity patterns between visually-guided and automatic execution modes (Fig 1). Neural recordings should also speak to whether and how the BG’s role in high-level action selection differs across visually- and working memory-guided sequences.
Alternatively, if BG encodes sequences at a ‘lower’ execution level (Fig. 1), we may see no difference in the activity patterns, as the motor outputs are kinematically similar across modes (Figs. 2g,h). To directly and effectively probe this, we implanted 64-channel tetrode drives targeting the DLS in expert rats (n=4, Extended Data Fig. 2a) and compared the activity of the same neurons for the same motor sequence in OT, CUE and WM trials. We recorded neural activity continuously over several weeks, comparing units recorded for at least 5 trials for each execution mode (CUE, WM, OT) for the same motor sequence (in total: n=579 neurons selected from 2468 total, see Methods).
The comparisons across each execution mode (Fig. 4a) were striking for the lack of any qualitative difference: task-related activity patterns of DLS neurons for the same motor sequence were highly correlated across all trial types (CUE, WM and OT; Fig. 3a-b,d) with no significant difference in either the average firing rates or peak z-scored activity (Fig. 4c). We found a similar result when splitting the population into putative medium spiny neurons (MSNs) and fast-spiking interneurons (FSIs) (Extended Data Fig. 3). Thus, the neural recordings, on their own, did not suggest a meaningful difference in how the BG are engaged in sensory-guided, working memory-guided, and automatic motor execution despite prior suggestions to the contrary15,19,37–39,49,51.
The DLS does not encode high-level features of the sequence
We next analyzed neural activity patterns in DLS for clues about the general contributions it makes to motor sequence execution, focusing first on its putative role in discrete action selection. Prior studies supporting such a function have interpreted elevated average DLS activity at the boundaries of ‘chunked’ actions52,55,56,87 as an indication that the BG help bias their initiation and/or termination by facilitating and/or inhibiting downstream control circuits58. To test whether activity is concentrated at the boundaries of putative action ‘chunks’, we examined the firing rate modulation over the length of the whole sequence (Fig. 4e; see Methods). We found no evidence for population activity in the DLS preferentially marking the start and/or stop of overtrained sequences; neither did it consistently mark the boundaries between behavioral elements (lever-presses, orienting movements) or pairs of such (combined orient and lever-press movements) (Fig. 4e).
While average population activity in DLS did not demarcate the boundaries of motor elements or motor chunks, its activity could reflect their place in the sequence in other ways46,47,88. For example, it has been proposed that DLS neurons represent the sequential context of movements and actions, including their ordinal position or, in lever-pressing tasks, the identity of the lever being pressed54,89. In such a coding scheme, DLS activity associated with a specific movement should not be a mere function of its kinematics, but also reflect higher-order features of the sequence32.
To explicitly probe this, we expanded our analysis to all 12 motor sequences generated in controlled sessions. If DLS represents higher-order features of sequence organization, its neural activity should differ when the same lever-press or orienting movement is performed in different sequential contexts (e.g., the press and orienting movement L→C in the sequence L→C→L vs. the sequence C→L→C; Fig. 5a). We did not find this to be the case: neural activity across two sequences composed of different elements (L→C→L vs. C→R→C) but similar lever-press and orienting movements (press→move right→press→move left→press) was similar (Fig. 5a). More generally, DLS activity associated with a given motor element was highly correlated regardless of the sequence in which it was embedded, its ordinal position in the sequence (1st, 2nd, 3rd press), or the specific lever being pressed (i.e. L, C, or R) (Fig. 5a-e). There was, however, a very clear distinction in how striatal neurons represented orienting movements to the left and right, whether short (e.g., L→C) or long (e.g., L→R), consistent with an egocentric kinematic code (Fig. 5d-e, Extended Data Fig. 4a-b).
DLS encodes detailed movement kinematics
Based on this initial analysis and related studies24,60,62,63, a plausible alternative to DLS representing higher-order aspects of discrete motor sequences is that it encodes, and contributes to shaping, the detailed kinematics of learned sequential movement patterns, including their vigor. While DLS is not required for species-typical lever-press or orienting movements24,36,75,90, it can, by acting on downstream control circuits, help make them more adapted to a specific task24,62,91 (Fig. 2e). In this scenario, we would expect DLS neurons to encode kinematic features continuously throughout the behavior24,63,85.
To probe this idea24, we trained a multilayer neural network to predict, using the spiking activity of simultaneously recorded DLS units, the instantaneous velocity of the rats’ active forelimb and nose during the controlled task, as viewed from a side and top camera respectively (Fig. 5f, Extended Data Fig 5a). Consistent with observations from trial-averaged ensemble activity, we could decode movement kinematics on individual trials across the different sequences from populations of DLS neurons (Fig. 5g). Decoders trained on a subset of sequences could predict kinematics from held-out sequences just as well (Fig. 5g), implying that the kinematic code in DLS is invariant to the sequential context of the movements as suggested by our earlier analysis (Fig. 5d-e). Finally, to determine if we could decode the 3D kinematics beyond what is captured in the 2D camera projections, we triangulated the forelimb and nose position into 3D using multiple views across our three cameras (Extended Data Fig 5a-c, Methods). Decoders of DLS activity could predict variations in movement kinematics on individual trials similarly in 3 dimensions (Extended Data Fig 5d-f).
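The logic of the held-out-sequence test can be sketched on synthetic data. Note the substitution: a linear ridge regression stands in for the multilayer network used in the study, and the ‘recordings’ are simulated units with linear velocity tuning plus noise. All sizes, the noise level, and the train/test split are hypothetical.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Ridge regression with an unpenalised intercept (closed form)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ Y)

def predict(W, X):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ W

def r_squared(Y, Y_hat):
    ss_res = np.sum((Y - Y_hat) ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for recordings: 12 sequences, 200 time bins, 40 units.
rng = np.random.default_rng(0)
n_seq, n_t, n_units = 12, 200, 40
tuning = rng.normal(size=(2, n_units))       # assumed linear velocity tuning
velocity = rng.normal(size=(n_seq, n_t, 2))  # 2D velocity per time bin
rates = velocity @ tuning + 0.5 * rng.normal(size=(n_seq, n_t, n_units))

# Train on nine sequences, then test on three sequences never seen in training.
train, test = np.arange(9), np.arange(9, 12)
W = fit_ridge(rates[train].reshape(-1, n_units), velocity[train].reshape(-1, 2))
held_out_r2 = r_squared(velocity[test].reshape(-1, 2),
                        predict(W, rates[test].reshape(-1, n_units)))
print(held_out_r2)
```

If the code were tied to sequential context, a decoder fitted on one set of sequences would be expected to transfer poorly to others; comparable within- and across-sequence accuracy is what supports context invariance.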
Recent studies85,92,93 have suggested that DLS encodes the progression, or the ‘phase’, of a behavioral sequence. In the context of stereotyped movement sequences, however, kinematics and phase are tightly coupled, making it difficult to parse which of these attributes DLS activity reflects94–96. Because our controlled task condition breaks this coupling, kinematics and phase become dissociable. Training a decoder to predict the phase of the behavior, however, failed (Fig. 5h), meaning that phase in the sequence cannot be recovered from DLS activity alone. Only when phase and kinematics were coupled, i.e. when we considered only single sequences with no repeated motor elements (e.g., lever taps, see Methods for how these were selected), could we decode phase (Fig. 5h). Taken together, our results suggest that DLS encodes low-level continuous kinematics of movements in a way that is, in contrast to prior reports54,56, invariant to their sequential context.
Probing DLS function by lesions
Although our neural recordings showed that DLS activity reflects ongoing kinematics in a similar way for sensory-guided, working memory-guided, and automatic motor sequences, this does not establish a causal role for DLS in their execution. Alternatively, DLS activity could – in one or all modes – simply reflect input from essential sensorimotor control circuits97. To arbitrate between these possibilities, we lesioned DLS bilaterally in expert animals (n=7 rats, see Methods, Fig. 6a, Extended Data Fig. 2b).
In interpreting the effects of striatal lesions, we distinguish, as we did for the neural analysis, two aspects of performance, each associated with a putative function of the BG (Fig. 6b). The first is the ability to perform the prescribed sequence of lever presses (i.e. ‘sequencing’). The second is the ability to use fast and efficient movements refined and adapted to the task (i.e. ‘kinematics’). Note that only the sequencing aspect is required for reward. Parsing performance in this way allows us to probe whether striatum contributes to high-level sequence structure and low-level movement kinematics differently across execution modes.
DLS lesions affect high-level sequence structure on automatic and working-memory, but not visually cued, trials
Following a seven-day post-lesion recovery (see Methods), the rats’ ability to perform the OT sequence was severely impaired (Fig. 6c-d), dropping to near chance levels (8.33%). In stark contrast, success rates on CUE sequences were comparable to pre-lesion (Fig. 6c-d), save for a brief drop in the first few sessions after the lesions, consistent with non-specific transient effects of the surgery24,25. Success rates in WM trials, on the other hand, were chronically affected, and while the relative drop was less than for the OT mode, performance still dropped to near chance levels (Fig. 6c-d). A seven-day control break before the lesion (see Methods) did not significantly affect the behavior in either execution mode (Extended Data Fig. 6).
One possible explanation for the lesion resilience in CUE trials, and the less severe performance drop in WM trials (compared to OT), is that visual cues aid movement initiation in a DLS-independent manner90. However, the post-lesion drop in success rate on OT and WM trials could not be explained by a deficit in movement initiation62. Although DLS lesioned rats generated fewer lever-presses overall, they were actively engaged in the task and performed similar numbers of trials across controlled and automatic sessions (Extended Data Fig. 8a).
Taken together, these results are consistent with DLS having an essential role in controlling high-level sequence structure for automatic (OT) and working memory-guided motor sequences (WM)24,49,98. However, we found DLS to be dispensable for sequencing visually cued behaviors (CUE)99,100.
DLS lesions affect learned movement kinematics equally across execution modes
We next probed the effects of DLS lesions on movement kinematics, i.e. the aspect of the animals’ performance we could reliably decode from DLS activity. Consistent with a general role for the BG in specifying learned task-specific movement kinematics24, orienting and lever-pressing movements across all three execution modes were similarly affected. The ‘vigor’ of the movements, defined as the scalar gain factor applied to the kinematic features of a movement such as movement latency or speed101,102, was reduced (i.e. longer trial times and slower speeds, Fig. 6e-f). The change in vigor following lesion was similar for short (e.g. L→C) and long (e.g. L→R) movements (Extended Data Fig. 7a-d).
Deficits in kinematics/vigor and in the ability to generate the proper sequence could plausibly be coupled if the neural dynamics that inform the sequence in OT and WM trials can only be expressed at higher (i.e. closer to pre-lesion) speeds (CUE trials would be paced and informed by visual cues and hence would not be affected). However, we did not find a consistent relationship between vigor (movement speed) and trial outcome after DLS lesions (Extended Data Fig. 7e-h), which speaks against this idea.
While the lesion-induced effects on overall movement speed and latency are consistent with prior reports on BG’s role in controlling movement vigor48,60, other aspects of kinematics were also affected. Trial-to-trial movement variability, even for similar duration movements, was dramatically increased (Fig. 6g-h). Furthermore, the smoothness of task-related movements, which increases over learning71, also decreased following DLS lesions, a finding that mirrors what is seen in Parkinson’s patients103 (Fig. 6i). Interestingly, we found that the quality of the movements post-lesion reverted to what is seen early in learning (Extended Data Fig. 8b-g), a result consistent with DLS learning to control task-specific movement kinematics24,26 by interacting with downstream control circuits that generate species-typical movements23.
The DMS is not required for motor sequence execution in either mode
Our focus on sensorimotor striatum (DLS) was motivated by its known role in movement execution43,61,91. In contrast to DLS, which receives much of its input from sensorimotor cortex, dorsomedial striatum (DMS) receives its cortical input predominantly from PFC and PPC104,105. Though this associative region of the striatum has been implicated in flexible control of behavior, such as modifying, switching, and updating behavioral choices in response to previously learned associations65–68,106, whether it has an essential role in generating sensory- or working memory-guided motor sequences is less clear36,75. To probe this, we lesioned DMS bilaterally in a separate cohort of expert animals (n=6 rats, see Methods, Fig. 7a, Extended Data Fig. 2c).
Consistent with our previous work24, we found that DMS was not required for automatic motor sequence execution (Fig. 7b-c). More surprisingly, DMS lesions also did not have any lasting effects on controlled motor sequence execution in either WM or CUE trials (Fig. 7d-h). While there was a transient decrease in success rate in the first few days following the lesion, this is consistent with nonspecific effects of the surgical procedure and recovery (see similar effects in refs. 24–26 and in the CUE condition of Fig. 6c). While we cannot rule out that these transient effects reflect a real contribution of DMS to motor sequence execution, we note that any such putative function can be compensated for in a matter of days. Furthermore, unlike for DLS, lesions of DMS did not significantly affect kinematics in either execution mode, with lesioned rats showing no consistent increase in trial time or mean trial speed or a drop in the stereotypy of their task-related movements (Fig. 7d-h). This reinforces the dissociation between the DLS and DMS in terms of low-level kinematic control24, and further suggests that DMS is not necessary for either sequencing or kinematic aspects of well-trained motor sequences. Whether DMS plays a role in early motor sequence learning remains an intriguing open question34–36,87,107.
A simple neural network model can account for the results in both CUE and OT execution modes
At first glance, our results, which suggest both similar (e.g., in terms of coding properties and contribution to low-level kinematics) and different (e.g., in terms of effects of lesions on high-level sequence structure) functions for DLS across modes, may seem discrepant. To reconcile these findings and better inform the circuit-level logic underlying motor sequence execution, we built a simple neural network model of the motor system (Fig. 8a) in which a DLS-like circuit learns to interact with ‘downstream’ control circuits in different execution modes. Our modeling focused on sensory-guided and automatic execution modes, as these show the clearest distinctions in our experimental data, setting aside for this analysis the working memory-driven condition.
The DLS network in our model contained no recurrence, reflecting the fact that spiny projection neurons are coupled via relatively weak and sparse lateral inhibition108,109. The ‘downstream’ component of the model – intended to capture the control circuits modulated by the BG – was made recurrent. For simplicity, and to focus on the DLS, we abstracted away the details of how it connects to downstream control circuits (i.e., through other BG nuclei, thalamus, etc.), modeling them with a set of linear synaptic weights. To capture the ability of animals to execute cued lever presses prior to sequence training, we pretrained our model to perform a simple in-silico version of ‘lever-pressing’: moving a virtual manipulandum to a target in a 2D environment (Methods).
We then trained the DLS input synapses such that the network’s output ‘moved’ through sequences of three targets (Fig. 8b, Methods). The same network was tasked with producing sensory-guided sequences instructed by external inputs (CUE mode), and also a single internally generated sequence (OT mode; see Methods).
Since animals value time110 and reduce trial duration as a function of learning (Fig. 2d), our training procedure incentivized the model to reach the three target positions in the correct order as quickly as possible. The decision to train the DLS input synapses was inspired by the presumed role of cortico-striatal and thalamo-striatal plasticity in reinforcement learning26,67.
To compare the performance of our network model to that of real rats, we conducted analyses of the model analogous to those performed on our experimental data. We found that the activity of the neural units in the trained DLS network was largely independent of execution mode as seen in our data (Figs. 8c-d; compare with Figs. 4b,d). This cross-mode independence reflects an emergent alignment of DLS inputs (which are trained on both task modes) for OT and CUE trials with the same target sequence (Fig. 8e). We also recapitulated our experimental observation that DLS network representations reflected egocentric direction-of-motion information (Fig. 8f; compare with Figs. 5d-e).
Next, we simulated lesions to the DLS by removing this part of the model following training, comparing the effects on sequencing and kinematics separately. We found that DLS removal left high-level sequencing impaired for OT but not CUE trials (Fig. 8g; compare with Fig. 6a), recapitulating our experimental results. The model also captured the effect of DLS lesions on movement trajectories, decreasing movement velocity and increasing trial time across execution modes. This is consistent with a role for DLS in adapting movement kinematics for a specific task independent of execution mode (Figs. 8h-i; compare with Figs. 6a-b).
Thus, key features of the experimental data – invariance of DLS activity to execution mode, sensitivity of automatic task performance to DLS lesion, and a role for DLS activity in shaping learned movement kinematics – all emerge naturally during task learning when using a network model with basal ganglia-like circuitry.
In the model, we used a biologically-inspired circuit architecture (Fig. 8a) and matched the training procedure of the network to that of our rats (i.e., with sequence training overlaid on pretrained circuits). To probe how different features of our model contributed to the results, we also considered three alternative circuit models/paradigms. First, to test whether our results depended on DLS output being time-varying and high-dimensional, so as to specify detailed kinematics, we made the DLS output scalar, thus constraining it to represent coarse sequence-level information, e.g. movement speed modulation or ‘vigor’, as has been suggested48,60. This model led to markedly different DLS activity patterns across execution modes, in violation of our experimental findings (Extended Data Fig. 9a-g).
We next explored an ‘action selection’ model in which DLS is constrained to generate signals only at the boundaries of elementary movements, which inform the rest of the circuit of the next ‘lever-press’ to be executed. This model resulted in DLS activity patterns that lacked prominent representations of egocentric movement information, again in violation of our experimental data (Extended Data Fig. 9h-m). Finally, to test whether the mode-specific deficits of our lesions were due to the existence of pre-trained motor circuits capable of executing cued movements, we eliminated the pre-training of downstream circuits and trained the full network de novo on all aspects of both task modes (CUE and OT). This simulation failed to capture the DLS lesion resilience seen in our experiments (Extended Data Fig. 9n-s).
Comparing the results from our various models with our experimental data further supports the idea that, for behaviors with learned task-specific kinematics, the BG provide fine-scale kinematic control signals to downstream circuits, which in the case of automatic motor sequences defines the high-level sequential structure of the learned behavior.
Discussion
We set out to probe the BG’s contribution to motor sequence execution and how it differs depending on whether the sequence is informed by sensory cues, working memory, or is generated after lengthy overtraining (Fig. 1). Surprisingly, neural activity patterns in the sensorimotor striatum (DLS) of rats producing the same three-element sequences were similar across the execution modes (Figs. 2-4). In all cases, DLS activity represented low-level kinematic features – not high-level aspects (e.g. lever identity, ordinal position) – of the behaviors (Fig. 5). Consistent with this coding scheme, lesions to the DLS affected task-specific movement kinematics similarly in all modes. Interestingly, higher-level sequential organization, while not represented in DLS, was affected by DLS lesions on overtrained and working memory-guided trials but remained intact for visually cued sequences (Fig. 6). Lesions to the associative regions of the striatum (DMS) had only transient effects on performance and no effect on movement kinematics (Fig. 7). A simple network model could recapitulate our results and provide an explanation for the different findings: DLS learns to transform its inputs into mode-invariant activity patterns that provide similar kinematic control signals to downstream motor circuits in all execution modes (Fig. 8).
DLS’s role in specifying low-level kinematics of task-specific learned movements
The BG have been implicated in a diverse array of functions related to motor sequence execution, including action selection33,44,57,88,111, the storage of sensorimotor associations24,97, chunking57, and low-level kinematic specification of the requisite movements24,48,62,63. Evidence for the different functions comes from studies that challenge animals to produce motor sequences in different ways14,112, some relying on sensory cues and others on working memory to inform sequencing47,49,51,72, while many probe overtrained, or automatic, sequence execution24,35,53,56,85. Thus, the often-discrepant views of BG’s role in motor sequence execution could simply reflect the different computational demands of the various tasks and BG’s ability to contribute to each of them in different ways and to different degrees.
Our experimental paradigm allowed us to test this by probing striatal function across three distinct modes of motor sequence execution in the same animal. Surprisingly, we did not find any meaningful difference in DLS activity when animals performed the same motor sequence in the different execution modes. In all cases, DLS represented ego-centric movement kinematics, suggesting a general role for the BG in the specification and control of learned movements24,62,63,85,91.
Most prior studies advancing this view24,63,85 probed highly stereotyped movement patterns, in which kinematics and behavioral phase (or time) are intrinsically linked95, leaving open the possibility that BG controls not kinematics but the temporal progression92,96,113. Our ‘controlled’ motor sequence task breaks this link, allowing us to distinguish the two coding schemes by comparing across many different sequences. This analysis revealed that DLS indeed represents kinematics, not temporal progression, or phase, of the behavior (Fig. 5h).
Consistent with a role for the DLS in specifying time-varying kinematics, lesions affected task-specific learned movements across execution modes (Fig. 6e-f). While lesioned animals could still orient towards the levers and press them, the speed and variability of their movements reverted to what they expressed early in learning (Extended Data Fig. 8b-f). We also observed a similar reversion in a previous study, in which rats learned the idiosyncratic, highly precise, and complex movement patterns required to press a lever with an inter-press interval of 700 ms. After DLS lesions, rats reverted to species-typical repetitive lever-pressing behavior akin to what they had expressed early in learning24.
Together, these results are consistent with a role for DLS in the task-specific refinement of species-typical movements generated in downstream circuits, in our case, likely in the brainstem114–116. We saw a similar result in our circuit model: when the ‘DLS’ circuit was allowed to interact with pre-trained ‘downstream circuits’ that could independently respond to cues, variability in trial duration decreased following sequence learning and increased following DLS lesion for both automatic and cued execution modes (Fig. 8h-i). Dissecting the model revealed that, despite receiving different inputs across modes, the DLS learned mode-invariant activity that modulated movement kinematics through its downstream targets.
Our results also corroborated many studies that found DLS lesions to affect movement vigor (the overall speed/amplitude of a movement/action) (Fig 6b-c), a scalar aspect of kinematics48,60,85,110. However, vigor alone was not sufficient to explain the change in movement variability and smoothness following DLS lesions (Fig 6e-f), consistent with a role for the DLS in the detailed time-varying specification of movement kinematics24. Finally, a model constrained such that DLS provides a vigor signal to downstream circuits showed markedly different activity across execution modes in contrast to our data (Extended Data Fig. 9).
DLS role in generating high-level sequential structure?
Despite our experiments implicating the BG in the specification of task-specific movement kinematics24,62,85, successful performance in our task does not explicitly require such adapted kinematics. Reward is delivered contingent only on the three-element sequence being generated as prescribed, which can also be done with slow and inefficient movements akin to what DLS-lesioned animals express. That we nevertheless see deficits in task performance after DLS lesions would seem to implicate the DLS also in the control of high-level sequence structure.
However, we found no evidence for DLS activity representing such sequence structure. Furthermore, while DLS lesions severely affected performance in automatic and working memory-guided sessions, there was no lasting effect on the success rate on visually cued trials despite the kinematics being similarly affected. Interestingly, this mirrors what is seen in Parkinson’s patients, whose inability to execute internally generated motor sequences can be rescued by providing instructive visual cues27,117. Taken together with our results, this implies that cue-action associations, and the serial action-selection process they inform, are not reliant on the BG.
That DLS lesions affected sequence structure on trials in which the prescribed sequence was not informed by external cues (OT, WM trials) raises the question of how the DLS, and the BG more generally, contribute to sequencing such internally generated behaviors. For highly stereotyped overtrained motor sequences (expressed on OT trials), we posit that the behavior becomes consolidated in terms of DLS-dependent continuous low-level kinematics12,32. In this case, behavioral progression would no longer rely on the selection of discrete actions but would instead result from an invariant and continuous mapping of past to future behavior, a process we argue is DLS dependent14,24.
DLS’s contributions to working memory-dependent motor sequences are more of a mystery, though our results suggest that DLS-dependent dynamics are required (Fig 6). One possibility is that striatum is essential for working memory processes more generally. Studies showing that de novo Parkinson’s patients have performance deficits in a working memory task independent of the severity of their motor symptoms support this idea98,118. Though we did not see a distinction in task-related DLS activity between working memory trials and other types of trials (Fig. 4), our study was not designed to assess the time leading up to trial initiation, leaving open the possibility that there may be signatures of working memory processes in this preparatory phase, as has been observed in DMS in a different task paradigm119. Interestingly, lesions of DMS, a region implicated in working memory in other studies120, did not cause performance deficits in our paradigm (Fig. 7).
DLS role in ‘chunking’ and action selection?
The BG have also been widely implicated in the process of ‘chunking’57, through which a motor sequence initially generated by a serial decision-making process becomes linked, over training, in a way that allows the sequence to be selected and executed as a single action40,57. Support for BG’s role in chunking comes from studies showing that the neural representation of task-specific motor sequences in striatum becomes sparser with extended practice52,56,121, preferentially marking the boundaries (i.e. start and stop) of overtrained, or ‘chunked’, behaviors52,54,56. Such sparsification is consistent with a role for the BG in selecting actions elaborated in downstream circuits52,121,122. For sensory- and working memory-guided behaviors, action selection would occur at the level of individual elements and hence be more granular, while selection for overtrained behaviors would happen at the level of the whole sequence or ‘chunk’.
Our paradigm allowed us to directly probe whether DLS activity reflects such a function by comparing it for the same motor sequence with similar kinematics in an automatic (selection of the sequence as a chunk) and a controlled (serial selection of discrete motor elements) context (Fig. 1). Not only did we fail to see prominent start/stop activity on automatic trials, but we found that controlled and automatic motor sequences share highly correlated representations in the DLS (Fig. 4).
How should the differences between our results and those showing signatures of chunking and – implicitly – the selection of overtrained sequences by the BG be interpreted? First, most studies implicating BG in chunking tend to compare overall activity early in learning to what is seen in the expert (see, e.g.19,34,55,87,121). Any observed activity difference could thus reflect either the process of chunking or changes in movement kinematics, including vigor, that occur as a function of training. For example, one recent study found behaviorally-locked sequential activity patterns in DLS already early in learning, while position- and speed-related activity became more prominent after extensive practice85. That we see no difference across automatic and controlled motor sequences suggests that start/stop and sequence-specific activity seen in previous studies is not the consequence of motor chunking per se but may instead reflect learning-related changes in motor output or a shift in how the DLS contributes to it.
Alternatively, the lack of sequence-specific activity in our study could mean that the motor sequence we overtrained failed to coalesce into a single ‘chunk’ due to some peculiarity of our experimental approach. We do not find this plausible, given the clear signatures of automaticity we see (Fig. 3) and the lengthy training times. Furthermore, in an earlier study, in which we trained rats to generate highly stereotyped and stable movement patterns without any need for serial decision making or action selection, we similarly did not see start/stop activity24. However, we recognize that absence of evidence is not evidence of absence, and we cannot exclude the existence of sequence-specific activity in cells that we were not recording from.
While our results do not support a role for the BG in initiating consolidated motor chunks elaborated downstream, they are consistent with overtrained motor sequences becoming defined as BG-dependent motor chunks. Thus, rather than selecting them, BG’s role in motor chunking could be in transforming discrete motor sequences into single continuous actions in which the high-level sequential structure of the overtrained behavior is specified by BG-dependent low-level kinematics.
Circuits controlling sensory-guided motor sequences need to be elucidated
Our finding that visually guided motor sequences can be performed as prescribed after DLS and DMS lesions raises the question of which circuits control the progression of sensory-guided behaviors. Work in humans and NHPs has suggested a role for cortex17,77,123,124. Though we have shown that motor cortex is not required for executing highly overtrained automatic behaviors in rodents25, it remains an open question whether and how motor cortex contributes to sensory-guided motor sequences and the degree to which its function differs across execution modes. Work on skilled reaching movements, i.e. behaviors crucially instructed by visual input, has implicated motor cortex in their execution125. This suggests that the need to respond to environmental cues may render motor cortex a necessary controller. Future experiments will be needed to address the degree to which motor cortex’s contributions to movement control depend on the specific challenges posed by a task, and the conditions under which its function requires the BG.
Methods
Animals
The care and experimental manipulation of all animals were reviewed and approved by the Harvard Institutional Animal Care and Use Committee. Experimental subjects were female Long Evans rats, 3 to 8 months old at the start of training (Charles River).
Behavioral apparatus
Animals (n=23 total) were trained on our discrete sequence production task (the ‘piano’ task) in a fully automated home-cage training system126. Hardware was controlled by Teensy 3.6 and experiments and videos were recorded by Raspberry Pi 3. Home cage training was done in custom-made behavioral boxes. Boxes were outfitted with three levers spaced approximately 2.5 cm apart, and 14 cm above the floor. Plastic barriers 0.25” thick, 2.3” tall, and of 1” extent were placed between each lever to restrict the postures with which a rat can use their forelimb to press levers. A reward spout for water delivery was placed beneath the center lever. Lever presses were registered when lever displacement reached a threshold, corresponding to an angular deviation from horizontal of ∼14 degrees. Lever displacements were measured by optical sensors (Digi-key QRE1113-ND). Three cameras (Raspberry Pi Camera Module V2) recorded videos from each side of the box, and from the top (Extended Data Fig. 1a, Extended Data Fig. 5a).
Behavioral training
Water deprived rats received four 40-minute training sessions during their subjective night, spaced 2 hours apart. Starts of sessions were indicated by blinking house lights, a continuous 1kHz pure tone, and a few drops of water. At the end of each night, water was dispensed freely up to the daily minimum (5ml per 100g body weight).
Importantly, we wanted our paradigm to distinguish motor automaticity from habit formation11,12, two independent processes that can occur alongside each other and that may involve some of the same neural substrates11,127,128. Thus, we designed our task to ensure that rats achieve automaticity on a motor sequence without developing it into a habit. Since habits tend to form when the correlation between actions and outcomes is weak or variable129–131, if the reward is delayed132, or, further, if the reward is appetitive or addictive133–135, our paradigm directly linked behavioral variants to a water reward in a training process131 that resulted in highly overtrained behaviors expressed in goal-directed ways.
Stages of training
Rats were initially pre-trained to associate a visual cue with a lever press. On each trial, one of three LEDs (one above each lever), chosen randomly, would light up to signify the ‘correct’ lever. Correct presses were rewarded while incorrect presses triggered a 1.2 second timeout, which was retriggered for every press in the timeout period. To prevent rats from only selecting a single lever, we gradually decreased the probability of cueing a repeatedly pressed lever. All rats (n=18) learned to associate levers with cues in a median of 5284 trials. The criterion for learning was performing at >90% success rate for >100 trials.
After learning to associate visual cues with levers, rats were rewarded only when performing consecutive lever presses. Initially rewards were provided for every two successful consecutive presses, but after 500 rewards, reward was dispensed only after every three consecutive lever presses. Cued levers were constrained to not repeat, giving 12 different possible three-lever sequences. Rats quickly learned to press levers in a sequence, and no longer visited the reward port in between consecutive presses.
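The count of 12 possible sequences follows if "not repeat" is read as "no lever is cued twice in a row" (3 × 2 × 2 = 12). The short sketch below enumerates them under that reading; the lever labels L, C, R and the function name are illustrative and not taken from the task code.

```python
from itertools import product

LEVERS = ("L", "C", "R")  # illustrative labels: left, center, right


def non_repeating_sequences(levers=LEVERS, length=3):
    """All lever sequences in which no lever appears twice in a row."""
    return [
        "".join(seq)
        for seq in product(levers, repeat=length)
        if all(a != b for a, b in zip(seq, seq[1:]))
    ]


sequences = non_repeating_sequences()
print(len(sequences))  # 12
```

Note that sequences such as LCL remain allowed under this reading; forbidding any reuse of a lever would leave only 6 sequences, which does not match the stated count.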
Following 1000 successful three-lever trials, rats were introduced to the block structure (Fig. 2a), which was modeled on a sequence task in primates78. In the block structure, the same three-lever sequence is cued on each trial until there are 6 successful performances, then a new sequence is randomly chosen. Cues are presented sequentially with no delay.
After ∼1-2 weeks of training on the block structure, working memory (WM) trials were introduced by withholding the 2nd and 3rd cued lever in the 4th-6th trial of the block. Missed uncued levers were not required to be repeated until successful for a new sequence to be chosen.
One of the four nightly controlled sessions was changed to an automatic session. In this session, rats were required to perform only a single three-lever sequence, chosen randomly for each rat. This sequence was initially fully cued. Cues were then removed in reverse sequence, from last lever to first lever, every time the rat performed at >50% success rate over 30 trials. If success rate fell below 20% for 30 trials, or if rats failed to press the lever once in the entire session, a cue would be added back in. All rats were able to perform > 100 trials with at least two of the three cues withheld after 1024 ± 438 (mean ± SEM) trials or 17 ± 10 (mean ± SEM) days. After learning the sequence without cues, they would occasionally (5.18% ± 1.39% (SEM) of trials) hit the lower threshold prompting the addition of cues.
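The cue-fading rule for automatic sessions can be summarized as a small update function. This is a minimal sketch of the thresholds stated above (more than 50% success over 30 trials removes a cue, less than 20% adds one back); the function names and the choice to evaluate the rule on the most recent 30 trials are assumptions, and the additional session-level rule (no lever press in an entire session) is omitted.

```python
def update_cues_withheld(n_withheld, recent_outcomes, n_levers=3,
                         window=30, remove_above=0.5, restore_below=0.2):
    """Return the updated number of withheld cues.

    recent_outcomes: sequence of booleans (True = rewarded trial).
    Thresholds follow the text; evaluating them on the last `window`
    trials is an assumption of this sketch.
    """
    if len(recent_outcomes) < window:
        return n_withheld
    rate = sum(recent_outcomes[-window:]) / window
    if rate > remove_above:
        return min(n_withheld + 1, n_levers)
    if rate < restore_below:
        return max(n_withheld - 1, 0)
    return n_withheld


def cued_positions(n_withheld, n_levers=3):
    """Cues are withheld in reverse order, from the last lever to the first."""
    return [i < n_levers - n_withheld for i in range(n_levers)]
```

For example, `cued_positions(1)` cues the first two levers and withholds the third.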
Behavior analysis
In total, 23 rats were trained on the three-lever task. 12 of 23 were used to characterize the behavior; nine were excluded because kinematics was not captured or analyzed early in training.
Definition of expert performance
Expert performance was determined when success rates and trial times stabilized to within 0.5σ of final performance values (based on the last 2000 trials). Metrics (success rates and trial time) were smoothed with a moving average of 400 trials. Furthermore, we required automatic performance to reach >72% success rates to be considered ‘expert’, following previous studies55,85,87,125.
Calculation of performance metrics
Success rate
Success rate was defined as the number of rewarded trials divided by the total number of attempted trials.
Trial time
Defined as the interval between the first and third lever press. This only includes successful sequences, as incorrect sequences may not include three full lever presses.
Error variability
Defined as the Shannon entropy (in bits) of the probability of each sequence occurring for a given target sequence. Low probability sequences (p<0.001) are discarded. If mistakes are systematic, the probability distributions will be skewed towards particular erroneous sequences, and the entropy will be low. If mistakes are made randomly, the distribution will look more uniform, and the entropy will be high. For controlled sequences, the error calculation was done on the sequence chosen for the OT mode.
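As an illustration of this metric, the sketch below computes the entropy, in bits, of the distribution of performed sequences for one target sequence. The function name is hypothetical, and whether probabilities are renormalized after discarding rare sequences is not stated above; here they are left as they are.

```python
import math
from collections import Counter


def error_variability(performed_sequences, p_min=0.001):
    """Shannon entropy (bits) of performed sequences for one target.

    Sequences with probability below p_min are discarded. Probabilities
    are not renormalized afterwards (an assumption of this sketch).
    """
    counts = Counter(performed_sequences)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    probs = [p for p in probs if p >= p_min]
    return -sum(p * math.log2(p) for p in probs)
```

A rat that always produces the same erroneous sequence yields an entropy of 0 bits, whereas errors spread evenly over four sequences yield 2 bits.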
Error modes
Motor errors were defined as failures to touch or fully depress the ‘correct’ lever in what would otherwise have been a correct sequence (Supplementary Movie 2); that is, the rat oriented to the ‘correct’ lever and swiped at it but failed to depress it beyond the threshold for detection. Sequence errors, on the other hand, involved orienting to and pressing the wrong lever. For each rat and session type (controlled and automatic), ∼100 videos of error trials were manually inspected and labeled as either a sequence or motor error. To analyze behavior following different error mode types (Fig. 3f), we used a heuristic to automatically estimate and classify failures as motor or sequence errors. This allowed us to analyze >100 trials. In this analysis, motor errors were classified as any error sequence that resembled an omitted lever (e.g., for the target sequence LRC, RC and LC are considered motor errors), while sequence errors are any other type of mistake. This heuristic generally overestimates motor errors (29.65% ± 5.9% of errors for CUE trials, 16.95% ± 5.48% of errors for WM trials, 13.16% ± 5.32% of errors for OT trials, data are mean ± s.e.m.) and underestimates sequence errors. Trial-dependent accuracies and trial times, conditioned on the type of trial that came before (hit, motor error, sequence error), were then calculated using this heuristic (Fig. 3f). For trial times, we only considered sequences that had the same overall movement length as the overtrained sequence (since L→R is further to travel than L→C) as a control.
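The omitted-lever heuristic can be sketched as follows. The function names are hypothetical, and treating any single-lever deletion of the target as a motor error (which also includes LR for the target LRC, a case not listed in the example above) is an assumption of this sketch.

```python
def omission_variants(target):
    """All sequences obtained by deleting exactly one lever from the target."""
    return {target[:i] + target[i + 1:] for i in range(len(target))}


def classify_trial(target, performed):
    """Label a trial as 'hit', 'motor' error, or 'sequence' error."""
    if performed == target:
        return "hit"
    if performed in omission_variants(target):
        return "motor"  # looks like the target with one press missing
    return "sequence"   # any other mistake


print(classify_trial("LRC", "RC"))   # motor
print(classify_trial("LRC", "LCR"))  # sequence
```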
Average forelimb speed
Raw trajectories (position traces) of the active forelimb were smoothed and up-sampled (from 40 Hz to 120 Hz) using a cubic smoothing spline (csaps in Matlab, smoothing parameter of 0.1). Instantaneous velocities for the horizontal (x) and vertical (y) positions were calculated, then converted to instantaneous speed. This value was averaged from 0.1 second before the first lever press to 0.1 second after the last one. Since velocity was calculated from a side-view camera, and animals moved towards and away from the camera to press different levers, we left velocity measures in pixels/s.
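The speed calculation can be illustrated with a short NumPy sketch. It is not the original Matlab pipeline: plain linear interpolation stands in for the cubic smoothing spline (csaps), and the function and argument names are hypothetical.

```python
import numpy as np


def mean_forelimb_speed(x, y, press_times, fs_in=40.0, fs_out=120.0, pad=0.1):
    """Average speed (pixels/s) from `pad` s before the first press
    to `pad` s after the last press.

    x, y: position traces in pixels sampled at fs_in.
    press_times: lever-press times in seconds from the start of the trace.
    NOTE: linear interpolation replaces the smoothing spline used in the
    paper; this is a simplification made for the sketch.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    t_in = np.arange(len(x)) / fs_in
    t_out = np.arange(0.0, t_in[-1] + 1e-9, 1.0 / fs_out)
    x_up = np.interp(t_out, t_in, x)
    y_up = np.interp(t_out, t_in, y)
    vx = np.gradient(x_up, 1.0 / fs_out)  # horizontal velocity
    vy = np.gradient(y_up, 1.0 / fs_out)  # vertical velocity
    speed = np.hypot(vx, vy)              # instantaneous speed
    window = (t_out >= press_times[0] - pad) & (t_out <= press_times[-1] + pad)
    return speed[window].mean()
```

For a forelimb moving in a straight line at 30 pixels/s horizontally and 40 pixels/s vertically, the function returns a mean speed of 50 pixels/s.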
Movement smoothness
To quantify the smoothness of the movement, or its continuity and non-intermittency71, we measured each trajectory’s spectral arc length, a dimensionless metric that measures the arc length of the Fourier magnitude spectrum within an adaptive frequency range103. This metric quantifies smoothness independent of amplitude and duration, and is less sensitive to noise than another popular smoothness metric, the log-dimensionless jerk71,136. Values are scaled between the maximum and minimum average spectral arc length recorded across rats in Figures 2i, 6i, and 7i. Values closer to 1 are smoother.
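A minimal sketch of a spectral arc length computation on a speed profile is given below. The zero-padding level, the 10 Hz cutoff, and the 0.05 amplitude threshold are assumed values that are not stated above, and the rescaling to the range used in the figures is omitted.

```python
import numpy as np


def spectral_arc_length(speed, fs, pad_level=4, fc=10.0, amp_th=0.05):
    """Spectral arc length of a speed profile (more negative = less smooth).

    pad_level, fc and amp_th are assumed parameter values for this sketch.
    """
    speed = np.asarray(speed, dtype=float)
    nfft = int(2 ** (np.ceil(np.log2(len(speed))) + pad_level))
    freqs = np.arange(nfft) * fs / nfft
    mag = np.abs(np.fft.fft(speed, nfft))   # zero-padded magnitude spectrum
    mag = mag / mag.max()                   # normalize amplitude
    keep = freqs <= fc                      # hard upper frequency limit
    freqs, mag = freqs[keep], mag[keep]
    above = np.nonzero(mag >= amp_th)[0]    # adaptive range: first to last
    freqs = freqs[above[0]:above[-1] + 1]   # bin exceeding the threshold
    mag = mag[above[0]:above[-1] + 1]
    d_freq = np.diff(freqs) / (freqs[-1] - freqs[0])
    d_mag = np.diff(mag)
    return -np.sum(np.sqrt(d_freq ** 2 + d_mag ** 2))


t = np.arange(120) / 120.0
smooth = np.exp(-0.5 * ((t - 0.5) / 0.1) ** 2)                    # one bell-shaped movement
jerky = smooth * (1.0 + 0.5 * np.cos(2 * np.pi * 8 * (t - 0.5)))  # intermittent version
print(spectral_arc_length(smooth, 120.0) > spectral_arc_length(jerky, 120.0))  # True
```

Because the spectrum is normalized in both amplitude and frequency range, scaling the speed profile leaves the value unchanged, which is the property that makes the metric independent of movement amplitude.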
Trial selection for behavioral analysis
In Fig. 2, early performance is taken from the first 1000 trials, and late performance is measured from the last 1000 trials. For Fig. 3, metrics are calculated from all trials after expert performance was reached. For Fig. 6 and 7, pre- and postlesion accuracies are calculated from the week before and after lesion. Pre- and postlesion trial times, trial speeds, and kinematics are taken from the last 1000 trials before and after lesion. Finally, it is important to note that since automatic sessions are introduced after the controlled sessions, early performance on automatic (OT) trials benefits from prior controlled practice (Fig. 2).
Kinematic tracking
To track the movements of the rat’s active forelimb and head during our task, we utilized recent machine learning approaches to detect keypoints from individual video frames. Videos of animals performing the task were acquired at 40 Hz (90 Hz for DLS recording cohort) from cameras pointing at the lever from the two sides to obtain both forelimb trajectories, and one camera pointing down from the top to obtain the rat’s horizontal position. For video-based tracking, we trained ResNet-50 networks pretrained on ImageNet, using DeeperCut (https://github.com/eldar/pose-tensorflow)82. To refine the tracking for our rats, we randomly selected ∼200 frames per view and trained the network using manually labeled positions of the hand and nose. The network was then used to predict the position of body parts across all trials on a frame-by-frame basis, using GPUs on the Harvard Research Computing cluster. Tracking accuracy was qualitatively validated post-hoc by visual inspection of 5 trials across 3 different sessions. Frames with poor tracking (< 0.95 score from the model), due to occlusion of the forelimbs, were removed and trajectories in those frames were linearly interpolated. If > 5 consecutive frames were removed, the trial was discarded for tracking purposes. Additionally, any trial with >5% of poorly tracked frames was removed from the analysis. The full trial trajectory was then smoothed using a Gaussian filter in Matlab, with a σ of 0.6 frames.
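The frame-rejection and interpolation rules can be sketched for a single coordinate trace as follows. The function name and return convention (None for a discarded trial) are hypothetical, and the final Gaussian smoothing step is left out.

```python
import numpy as np


def clean_trajectory(pos, score, min_score=0.95, max_run=5, max_frac=0.05):
    """Interpolate over poorly tracked frames, or return None to discard.

    pos: one coordinate of a keypoint per frame; score: tracker confidence.
    A trial is discarded if more than max_frac of its frames are poorly
    tracked or if more than max_run consecutive frames are poorly tracked.
    """
    pos = np.asarray(pos, dtype=float)
    score = np.asarray(score, dtype=float)
    bad = score < min_score
    if bad.mean() > max_frac:
        return None
    longest = run = 0
    for is_bad in bad:                      # longest run of bad frames
        run = run + 1 if is_bad else 0
        longest = max(longest, run)
    if longest > max_run:
        return None
    frames = np.arange(len(pos))
    # linear interpolation across the removed frames
    return np.interp(frames, frames[~bad], pos[~bad])
```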
To track the movements of the rat’s active forelimb and nose in 3-dimensions, we first calibrated our multiple camera views (left side, right side, top side, see Extended Data Fig. 5a) to a set of manually labeled features in our box observable from both views in order to calculate camera extrinsics and world coordinates, drawing from camera calibration functions in the Computer Vision Toolbox (e.g. estimateWorldCameraPose, estimateCameraParameters, cameraPoseToExtrinsics). We could then use the calibrated cameras to triangulate our 2D estimated points into 3D137. For the forelimb, we triangulated from 2D points tracked on the left and right cameras (Extended Data Fig. 5b). The nose used either the left and top, or right and top cameras. 3D world coordinates are in mm, relative to one of the manually labeled features in the box.
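To make the triangulation step concrete, the sketch below uses the standard linear (direct linear transform) method in NumPy. It is a generic stand-in rather than the Matlab Computer Vision Toolbox routine used here, and it assumes each camera is described by a 3×4 projection matrix in the column-vector convention.

```python
import numpy as np


def triangulate(points_2d, projections):
    """Linear triangulation of one 3D point from two or more views.

    points_2d: list of (u, v) image coordinates, one per camera.
    projections: list of 3x4 projection matrices (assumed known from
    calibration), with image point ~ P @ [X, Y, Z, 1].
    """
    rows = []
    for (u, v), P in zip(points_2d, projections):
        P = np.asarray(P, dtype=float)
        rows.append(u * P[2] - P[0])   # each view contributes two
        rows.append(v * P[2] - P[1])   # linear constraints on the point
    _, _, vt = np.linalg.svd(np.array(rows))
    X = vt[-1]                         # null-space solution (homogeneous)
    return X[:3] / X[3]
```

With noise-free projections from two cameras, the function recovers the original 3D point; with noisy detections it returns a least-squares estimate in the algebraic sense.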
Kinematic analyses
To quantitatively compare kinematic similarity, we computed pairwise trial-to-trial correlations. Since trial times varied, movement trajectories were time warped to a common template. Specifically, trajectories from each trial were interpolated so that the time between the 1st and 2nd lever, and the time between the 2nd and 3rd lever, matched the median inter-lever intervals. For sub-movement correlations, trajectories are warped to the median inter-lever interval time. Though we tracked both forelimbs, analyses were performed only on the active forelimb used to press the lever. Rats used a single forelimb to perform lever presses (n=12/23 right, n=11/23 left).
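The warping and correlation steps can be sketched as below. This is an illustrative piecewise-linear warp for one coordinate of a trajectory; the function names are hypothetical, and the template press frames are assumed to have been derived from the median inter-lever intervals beforehand.

```python
import numpy as np


def warp_to_template(traj, press_frames, template_frames):
    """Piecewise-linear time warp aligning lever presses to a template.

    traj: one coordinate of a trajectory, one value per video frame.
    press_frames: frame indices of the three lever presses in this trial.
    template_frames: frame indices the presses should land on.
    """
    traj = np.asarray(traj, dtype=float)
    out_frames = np.arange(template_frames[0], template_frames[-1] + 1)
    # map each template frame to a (fractional) frame in the original trial
    source = np.interp(out_frames, template_frames, press_frames)
    return np.interp(source, np.arange(len(traj)), traj)


def mean_pairwise_correlation(warped_trials):
    """Mean Pearson correlation over all pairs of warped trials."""
    r = np.corrcoef(np.vstack(warped_trials))
    upper = np.triu_indices_from(r, k=1)
    return r[upper].mean()
```

After warping, every trial has the same number of samples, so correlations can be computed sample by sample even when the original trial durations differ.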
Electrophysiological recordings
Microdrive construction, surgical and recording procedures were performed as previously described138. After expert performance was reached on the sensory guided, memory guided, and automatic tasks, a microdrive containing arrays of 16 tetrodes was implanted into the DLS (n=4 rats) contralaterally to the active forelimbs as previously described24 (Extended Data Fig. 2a). Neural and behavioral data were recorded continuously for 95 ± 31 days. The drive was occasionally advanced by ∼200 µm, 0-4 times over the course of the experiment. At the end of the experiment, an electrolytic lesion was made to mark the electrode site. This was done by passing a 30 µA anodal current for 15 s through the electrode tips. For implant coordinates, according to Paxinos139, see Table 1.
Lesion surgeries
After the animals reached expert performance, bilateral striatal lesions were performed (n=7 DLS, n=6 DMS) as previously described24. For injection coordinates, see Supplementary Table 1. In brief, quinolinic acid (0.09 M in PBS (pH 7.3), Sigma-Aldrich) was injected in 4.5 nl increments via a thin glass pipette connected to a microinjector (Nanoject III, Drummond). Lesions were performed in two stages, starting with the side contralateral to the primary forelimb (the forelimb that presses the first lever). Animals recovered for 7 days before being reintroduced to training.
Histology
At the end of the experiment, animals were euthanized (100 mg/kg ketamine and 10 mg/kg xylazine) and transcardially perfused with either 4% paraformaldehyde (for Nissl staining to confirm lesion size and location), or 2% paraformaldehyde (PFA) and 2.5% glutaraldehyde (GA) (for osmium staining, to confirm location of electrode implant) in 1x PBS. For electrode implants, brains were then stained with osmium (as previously described140) and embedded in epoxy resin for micro-CT scanning. Micro-CT scans (X-Tek HMS ST 225, Nikon Metrology Ltd.) were taken at 130 kV, 135 µA with a 0.1 mm copper filter and a molybdenum source. 3D volume stacks were reconstructed (VG studio max), and brains were aligned along the coronal, medial, and sagittal plane using Fiji. The location of the electrolytic lesion could be calculated relative to anatomical landmarks (i.e., corpus callosum split at AP = 1.65 mm from bregma, anterior commissure split at 0 mm bregma). For lesioned animals, brains were sectioned into 80-µm slices using a Vibratome (Leica), then mounted and stained with Cresyl-Violet. Images of whole brain slices were acquired at 10x magnification with either a VS210 whole slide scanner (Olympus) or an Axioscan slide scanner (Zeiss). To quantify the extent and location of striatal lesions, we analyzed coronal sections spanning the anterior-posterior extent of the striatum from 4 calibration animals and 2 experimental animals (DLS) (7 hemispheres injected in total) or from 4 experimental animals (DMS). Lesion boundaries were manually marked based on differences in cell morphology and density141,142. The extent of the striatum was determined based on the Paxinos atlas139, using anatomical landmarks (external capsule, ventricle) and cell morphology and density.
Neural analysis
Spike sorting
Raw neural data were collected continuously over the course of the experiment (95 ± 31 days, mean ± s.e.m., n=4 fully trained rats). Spiking activity from populations of single units was sorted using our custom-designed spike-sorting algorithm, Fast Automated Spike Tracker (FAST)138. A custom MATLAB GUI (https://github.com/Olveczky-Lab/FAST-ChainViewer) was used to manually isolate and track single units over long timescales. On average, we isolated 20.5 ± 13.3 units simultaneously in the striatum within each session. We were able to track units for an average of 4.3 ± 1.2 days. The quality of sorted single units was assessed as previously described138. In brief, we evaluated unit quality by computing the isolation distance143, the L-ratio144, and the fraction of inter-spike interval (ISI) violations145,146, discarding units that did not meet our criteria (isolation distance >= 25, L-ratio <= 0.3, and fraction of ISIs below 2 ms <= 1%).
Unit type identification
Units were identified as putative spiny projection neurons (SPNs, also referred to as medium spiny neurons, MSNs) or fast-spiking interneurons (FSIs) based on their peak width (full width at half maximum) and the time interval between spike peak and valley147,148. Units with peak width >150 µs and peak-valley interval >500 µs were classified as SPNs, while units with peak width ≤150 µs and peak-valley interval ≤500 µs were classified as FSIs.
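The two waveform criteria amount to a simple decision rule, sketched below. The function name classify_unit and the "unclassified" label are assumptions made for illustration; the text does not say how units meeting only one of the two criteria were handled.

```python
def classify_unit(peak_width_us, peak_valley_us):
    """Assign a putative cell type from two waveform features, both in microseconds."""
    if peak_width_us > 150 and peak_valley_us > 500:
        return "SPN"   # broad spike with a long peak-to-valley interval
    if peak_width_us <= 150 and peak_valley_us <= 500:
        return "FSI"   # narrow spike with a short peak-to-valley interval
    # Mixed case (one criterion met, the other not): handling is an assumption here.
    return "unclassified"
```

For example, classify_unit(220, 640) falls in the SPN branch and classify_unit(110, 300) in the FSI branch.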
Criteria for unit selection
We selected a subset of the total population of recorded units for our neural analyses. To be included, a neuron had to fire at least 1 spike on at least 25% of all trials and be recorded over >5 rewarded trials in each execution mode (CUE, WM, OT) for Fig. 4, or >5 rewarded trials in each of the 12 different sequences. From a total of 2468 recorded and well-isolated units, these criteria reduced the units available for analysis to 579 units for the comparison across execution modes, and to 340 units for comparisons across sequences in the controlled sessions.
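As a rough sketch of how the inclusion criteria for the execution-mode comparison could be applied to one unit, consider the following. The function name include_unit and the per-trial dictionary fields ('mode', 'rewarded', 'n_spikes') are hypothetical and chosen only for this example.

```python
def include_unit(trials, modes=("CUE", "WM", "OT"),
                 min_active_frac=0.25, min_rewarded=5):
    """Return True if a unit passes both inclusion criteria.

    trials: list of dicts with keys 'mode', 'rewarded', 'n_spikes'
    (a hypothetical layout, one entry per trial the unit was recorded on).
    """
    if not trials:
        return False
    # Criterion 1: at least one spike on at least 25% of all trials.
    n_active = sum(1 for tr in trials if tr["n_spikes"] >= 1)
    if n_active / len(trials) < min_active_frac:
        return False
    # Criterion 2: more than 5 rewarded trials in every execution mode.
    for mode in modes:
        n_rewarded = sum(1 for tr in trials
                         if tr["mode"] == mode and tr["rewarded"])
        if n_rewarded <= min_rewarded:
            return False
    return True
```

The same function could be reused for the across-sequence comparison by passing the 12 sequence labels as modes, assuming each trial is tagged with its sequence instead.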
Neural metrics
Trial averaged, z-scored activity
First, instantaneous firing rates were calculated for each trial by convolving binned (10 ms bins) spike counts with a Gaussian kernel (σ=25 ms). To account for differences in trial times, firing rates were then locally linearly warped to the median press times. Warping was done only after calculating firing rates so as not to artificially increase or decrease the firing rate. Firing rates before the first and after the last lever press were not warped. After this alignment step, each trial was z-scored and then averaged across trials for each unit and each execution mode.
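The rate-estimation and z-scoring steps can be illustrated with a short standard-library sketch. The function names, the kernel truncation at 4σ, and the zero-padding at the edges are assumptions not stated in the text, and the warping step is omitted here.

```python
import math

def binned_counts(spike_times, t_start, t_stop, bin_s=0.010):
    """Count spikes in consecutive 10 ms bins between t_start and t_stop (seconds)."""
    n_bins = int(round((t_stop - t_start) / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        idx = math.floor((t - t_start) / bin_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def gaussian_kernel(sigma_s=0.025, bin_s=0.010, n_sigmas=4):
    """Unit-sum Gaussian kernel sampled at the bin width (truncation is an assumption)."""
    half = int(math.ceil(n_sigmas * sigma_s / bin_s))
    k = [math.exp(-0.5 * (i * bin_s / sigma_s) ** 2) for i in range(-half, half + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth_rate(counts, kernel, bin_s=0.010):
    """Convolve counts with the kernel and convert to spikes per second."""
    half = len(kernel) // 2
    rates = []
    for i in range(len(counts)):
        acc = 0.0
        for j, w in enumerate(kernel):
            src = i + j - half
            if 0 <= src < len(counts):  # zero-padded edges (an assumption)
                acc += w * counts[src]
        rates.append(acc / bin_s)
    return rates

def zscore(x):
    """Subtract the mean and divide by the standard deviation of one trial."""
    m = sum(x) / len(x)
    sd = (sum((v - m) ** 2 for v in x) / len(x)) ** 0.5
    return [(v - m) / sd if sd > 0 else 0.0 for v in x]
```

Because the kernel sums to one, a regular 20 Hz spike train yields a smoothed rate of roughly 20 spikes/s away from the edges, which is a convenient sanity check before z-scoring.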
Average firing rates
Firing rates on individual trials were calculated from 0.2 seconds before the first lever press to 0.2 seconds after the third lever press. This value was averaged over all trials for a given execution mode.
Average activity
To determine whether population activity was elevated at privileged timepoints in the sequence (e.g., at the boundaries of discrete motor elements), we averaged the time-varying z-scored activity across all units recorded in a rat. This average trace was compared to a distribution of average z-scored activity sampled from random times in the behavior (2 seconds before and after the first and last lever press, n = 1 × 10^4 permutations).
Correlation across execution modes
For each unit, correlation coefficients were computed between the time-varying vectors of trial-averaged activity across execution modes.
Correlating neural activity associated with behavioral elements across sequences
We computed the correlations between neural population vectors of combined orienting and lever-pressing movements across the 12 different sequences. Population vectors were calculated by averaging the activity of each neuron over orienting and lever press movements. Orienting movements were defined as those that occurred from 0.1 seconds after a prior lever press until 0.1 seconds before the next press. Lever press movements were defined as occurring within ±0.1 seconds of the lever deflection. We excluded the first lever press from this analysis, as it was not preceded by an orienting movement.
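To make the window definitions concrete, the sketch below derives orienting and lever-press windows from a list of press times. The function name movement_windows and the tuple layout are hypothetical, and skipping orienting windows of zero or negative length is an assumption.

```python
def movement_windows(press_times, margin=0.1):
    """Return (kind, press_index, start, stop) windows in seconds.

    The first press (index 0) is skipped because no orienting movement precedes it.
    """
    windows = []
    for i in range(1, len(press_times)):
        prev_t, t = press_times[i - 1], press_times[i]
        start, stop = prev_t + margin, t - margin
        if stop > start:  # presses closer than 2 * margin leave no orienting window
            windows.append(("orient", i, start, stop))
        windows.append(("press", i, t - margin, t + margin))
    return windows
```

Averaging each neuron's activity within these windows would then give one entry of the population vector per movement.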
Principal components
Principal component analysis was performed on the matrices of population activity (neurons vs. time) concatenated across either the three execution modes (CUE, WM, OT) or the twelve sequences along the time dimension. Execution modes or sequences were then disjoined to generate the plots in Fig. 4e and Fig. 5d.
Neural decoding analysis
We used a feedforward neural network with two hidden layers to predict the time-varying, 2D velocity components of the active forelimb (side camera) and the nose (top camera) from the spiking activity of ensembles of DLS neurons. Spiking activity was binned in 25ms bins. We used 75ms of coincident spiking activity as the input to the model. Other model parameters were the same as in previous work24. We additionally challenged our network to predict the 3D velocity components of the active forelimb and nose, from 3D world coordinates triangulated from calibrated cameras (described above) (Extended Data Fig 5d-f).
We trained our models on blocks of >50 trials in which there were at least 12 simultaneously recorded units that had an average firing rate >0.25 Hz during the trials. In each block of trials, we fit decoding models using the activity of up to n=20 randomly sampled ensembles of 12 striatal units. We quantified model performance using two-fold cross-validation by computing the pseudo-R2. Decoding performance (pseudo-R2) was measured in each ensemble, then averaged across all 20 ensembles, and then averaged across all blocks of trials for each rat. For the subset model, the training dataset was generated from only 6 of the 12 sequences, chosen at random for each of the n=20 ensembles. The test dataset was then generated from the remaining 6 sequences.
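The ensemble sampling and the train/test split over sequences can be sketched as follows. The function names, the fixed seeds, and the choice to always draw exactly 20 ensembles are assumptions for illustration; the decoder itself is not reproduced here.

```python
import random

def sample_ensembles(unit_ids, ensemble_size=12, n_ensembles=20, seed=0):
    """Draw random ensembles of units (without replacement within an ensemble)."""
    rng = random.Random(seed)
    units = list(unit_ids)
    return [sorted(rng.sample(units, ensemble_size)) for _ in range(n_ensembles)]

def split_sequences(sequence_ids, n_train=6, seed=0):
    """Randomly assign sequences to a training set and a held-out test set."""
    rng = random.Random(seed)
    ids = list(sequence_ids)
    rng.shuffle(ids)
    return sorted(ids[:n_train]), sorted(ids[n_train:])
```

For the subset model, split_sequences would be called once per ensemble (with a different seed each time), so that each ensemble is trained on its own random half of the 12 sequences and tested on the other half.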
Neural network model
We simulated an artificial neural network consisting of two populations, one corresponding to DLS and another to other downstream motor circuits. The DLS network contained no recurrent connections while the downstream network contained all-to-all recurrent weights. The two populations were bidirectionally connected with all-to-all feedforward and feedback weights. The downstream motor network directly controls movement via a set of feedforward weights, and also receives an additional source of input representing cue signals via a set of feedforward weights. Each network consists of 500 units with a rectified linear (ReLU) activation function. Network weights were initialized with the Kaiming Uniform initialization149. Gaussian noise of standard deviation 0.1 was added to the inputs to each neuron in both networks at every timestep in all simulations.
We modeled a simplified version of the experimental task in which the output of the network controls the velocity of a “forelimb” (represented simply as a point), and the network is tasked with moving it into a set of three circular target zones in a prescribed sequential order, as in the piano task. The target zones were positioned as shown in Fig. 8b. On each trial, the loss function measuring the performance of the network was defined as the sum of the squared distance between the forelimb position and the center of the current target. The identity of the current target changed to the next in the sequence once it was reached. On cued trials, on the first step and whenever the target lever changed, cue input was provided to the downstream network in the form of a vector indicating the position of the cue relative to the forelimb. The cue input was transient, lasting only one timestep for each cue. If the sequence was not performed successfully within T=40 timesteps of the simulation, the trial was halted and considered a failure.
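A toy version of this task environment is sketched below to make the trial logic explicit. It is an illustration only: the target coordinates, target radius, start position, and step size are hypothetical (the actual layout is given in Fig. 8b), the loss is assumed to be accumulated over timesteps, and a hand-written greedy controller stands in for the trained network.

```python
import math

# Hypothetical target layout; the real positions are those shown in Fig. 8b.
TARGETS = {"left": (-2.0, 2.0), "center": (0.0, 2.0), "right": (2.0, 2.0)}

def greedy_policy(pos, goal, speed=0.5):
    """Stand-in controller: move straight toward the goal at a fixed speed."""
    dx, dy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(dx, dy)
    if dist <= speed:
        return dx, dy
    return speed * dx / dist, speed * dy / dist

def run_trial(policy, targets, sequence, radius=0.5, max_steps=40, start=(0.0, 0.0)):
    """Simulate one trial; returns (success, accumulated loss, steps used)."""
    x, y = start
    idx = 0        # index of the current target in the sequence
    loss = 0.0
    for step in range(max_steps):
        goal = targets[sequence[idx]]
        vx, vy = policy((x, y), goal)   # the policy output is a velocity
        x, y = x + vx, y + vy
        d2 = (x - goal[0]) ** 2 + (y - goal[1]) ** 2
        loss += d2                      # squared distance to the current target
        if d2 <= radius ** 2:           # target zone reached: advance the sequence
            idx += 1
            if idx == len(sequence):
                return True, loss, step + 1
    return False, loss, max_steps       # not completed within T steps: failure
```

Calling run_trial(greedy_policy, TARGETS, ["right", "center", "left"]) completes the sequence well within 40 steps, whereas a policy that never moves times out and is scored as a failure.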
The network, excluding the DLS input and output weights, was pre-trained on the cued task for 100,000 iterations (well past the point where asymptotic performance was reached). The DLS input weights were then trained on randomly interleaved cued and automatic task trials (50% probability of each, with all 12 possible target trajectories equally likely on cued trials), again for 100,000 iterations. The target sequence on automatic trials was always the same (right, center, left). All network training used backpropagation and the Adam optimizer with learning rate set to 1e-4. Training was conducted using PyTorch.
In Extended Data Fig. 8a-g, we modified the network architecture by replacing the NxN DLS output weights with a chain of Nx1 and 1xN weights, corresponding to a rank-1 projection. In Extended Data Fig. 8h-m, we modulated the gain of DLS activity, suppressing it by a factor of 0.1 at all time steps except the first and those when the target lever changed. In Extended Data Fig. 8n-s, we omitted the pretraining stage and instead trained the entire model on all task modes for 200,000 iterations.
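Two of these manipulations are simple enough to write out directly. The sketch below is a plain-Python illustration rather than the PyTorch implementation: the function names and argument layout are hypothetical. It shows that chaining an Nx1 weight vector with a 1xN weight vector passes DLS activity through a single scalar, and how a per-timestep gain of 0.1 could be scheduled.

```python
def rank1_project(activity, read_out, read_in):
    """Apply an Nx1 read-out followed by a 1xN read-in (a rank-1 projection).

    Equivalent to multiplying by the outer product of read_in and read_out.
    """
    scalar = sum(w * a for w, a in zip(read_out, activity))  # N values -> 1 value
    return [w * scalar for w in read_in]                     # 1 value -> N values

def dls_gain_schedule(target_change_steps, n_steps=40, suppressed_gain=0.1):
    """Gain is 1.0 on the first step and on target-change steps, else 0.1."""
    full_gain_steps = {0} | set(target_change_steps)
    return [1.0 if t in full_gain_steps else suppressed_gain
            for t in range(n_steps)]
```

For instance, with read_out = [1, 0, 1] and read_in = [2, -1, 0.5], the activity vector [1, 2, 3] is collapsed to the scalar 4 and then expanded to [8, -4, 2.0], so every downstream unit receives a scaled copy of the same one-dimensional signal.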
Extended Data
Supplementary Info
Supplementary Video 1
Three trials of the same motor sequence performed in the CUE, WM, and OT execution mode. Videos are shown first from a top camera, then from a side camera. In the side videos, the kinematics of the active forelimb is tracked and plotted.
Supplementary Video 2
Trials from the OT mode from an example rat are shown to demonstrate the types of errors we observe. Example trials include five consecutive successful trials, two motor errors that follow a successful trial, and finally a run of sequence errors following motor errors.
Footnotes
Introduction and discussion were updated to improve the framing of the study and its results; new figure panels and supplemental figures.