Pupil diameter tracked during motor adaptation in humans

Pupil diameter, under constant illumination, is known to reflect individuals’ internal states, such as surprise about observation and environmental uncertainty. Despite the growing use of pupillometry in cognitive learning studies as an additional measure for examining internal states, few studies have used pupillometry in human motor learning studies. Here we provide the first detailed characterization of pupil diameter changes in a short-term reach adaptation paradigm. We measured pupil changes in 91 human participants while they adapted to abrupt, gradual, or switching force field conditions. Sudden increases in movement error caused by the introduction/reversal of the force field resulted in strong phasic pupil dilation during movement accompanied by a transient increase in tonic pre-movement baseline pupil diameter in subsequent trials. In contrast, clear changes in pupil responses were absent when the force field was gradually introduced, indicating that error drove the changes in pupil responses. Nevertheless, we found an association between baseline pupil diameter and awareness of the gradually-introduced perturbation assessed post-experimentally. In all experiments, we also found a strong co-occurrence of larger baseline pupil diameter and slower reaction and movement time after each set break. Interestingly, error-induced pupil responses gradually became insensitive after experiencing multiple reversals. Collectively, these results suggest that tonic baseline pupil diameter reflects one’s belief about environmental uncertainty, whereas phasic pupil dilation during movement reflects surprise about a sensory outcome (i.e., movement error), and both effects are modulated by novelty. Our results provide a new approach for non-verbally assessing participants’ internal states during motor learning.


Introduction
Motor learning, as a process of correcting movements to achieve a goal, is a complex mixture of multiple processes (1)(2)(3). For instance, riding a bicycle initially requires substantial effort to control the handlebars and pedals while balancing, but these conscious efforts are eventually taken over by more automatic and implicit control processes. Studies using arm reaching posit that motor adaptation to a novel dynamic/kinematic environment consists of multiple processes, which are often contrasted as conscious/explicit and automatic/implicit components (4)(5)(6), each separately depending on prefrontal and cerebellar function (7,2). Indeed, human functional imaging studies have revealed that during the acquisition of new motor skills, prefrontal and hippocampal areas are particularly active in the earliest stage of learning. In later stages, the motor and parietal areas, as well as subcortical regions, become more active (8)(9)(10)(11). However, the precise cognitive processes represented by early prefrontal/hippocampal activation and the ways in which these processes evolve at finer temporal scales remain unclear.
Pupil diameter under constant illumination is known to reflect a variety of internal cognitive states of individuals performing cognitive or simple motor tasks. Task-evoked changes reflected in phasic pupil dilation have been associated with (unsigned) prediction error or surprise about observations (12)(13)(14)(15)(16)(17)(18), and mental/physical effort (19)(20)(21). Relatively slow changes in tonic baseline diameter have often been associated with arousal/vigilance (22)(23)(24)(25)(26)(27), the exploration/exploitation trade-off (28,29), and more recently, subjective uncertainty about the environment (12)(13)(14)(15)(16)(17)(30). Although these constructs have different names, they are closely related to each other in terms of their dynamics. For instance, a surprising observation (i.e., a large deviation from an expectation) may imply a change in the environment, leading to a (transient) increase in subjective environmental uncertainty. In an uncertain situation, an animal may need to make more mental/physical effort to find a better solution (i.e., exploration), which may recruit increased arousal/vigilance. As the animal adapts to the new environment, surprise about observations and the other variables gradually return to their average levels. Thus, surprise and uncertainty appear to be essential for interpreting pupil responses in various tasks.
More importantly, surprise and uncertainty are believed to play a crucial role in learning behavior by dynamically adjusting learning rate, especially in reward-based learning, including conditioning (31) and choice decision-making (15)(32)(33)(34). Thus, these constructs could similarly affect motor learning, such as driving explicit processes to explore potential reach plans (5) or changing sensitivity to errors (35). However, there have been surprisingly few attempts to assess the trial-by-trial changes in pupil diameter during human motor learning. In the current study, to characterize pupil responses and the types of information they reflect in a commonly used motor learning paradigm, we ran a series of experiments in which we tracked pupil diameter during short-term force field adaptation in a reaching paradigm (36). We recruited 91 participants who performed reaching movements in the presence of a velocity-dependent force field that was introduced either abruptly (n = 28), abruptly and then reversed multiple times (n = 29), or gradually (n = 34). Our data suggest that the tonic baseline pupil diameter reflects participants' belief about environmental uncertainty, whereas the phasic pupil dilation during movement reflects surprise about a sensory outcome (i.e., movement error), and both are modulated by the novelty of perturbation. The current study thus provides a new approach for non-verbally assessing participants' cognitive states during motor learning.
we measured participants' pupil diameter for a variable duration (3-11 s). The start position then changed to green, informing the participants to initiate reaching. Movement was defined as the period during which the hand movement velocity was above a threshold (3.5 cm/s). At the completion of a reaching movement, endpoint feedback was provided by a magenta cursor (0.5 cm diameter) for 1,000 ms.
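The velocity-threshold definition of the movement period can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function name, the input format (a list of hand speeds in cm/s, one per sample), and the choice to return only the first supra-threshold run are assumptions made for illustration.

```python
def movement_period(speeds_cm_s, threshold=3.5):
    """Return (onset_index, offset_index) of the first run of samples whose
    speed exceeds the threshold, or None if the threshold is never exceeded.

    The 3.5 cm/s default is the threshold quoted in the text; everything
    else about this function is a hypothetical illustration.
    """
    onset = None
    for i, speed in enumerate(speeds_cm_s):
        if speed > threshold and onset is None:
            onset = i                      # first sample above threshold
        elif speed <= threshold and onset is not None:
            return onset, i - 1            # last sample still above threshold
    if onset is not None:
        return onset, len(speeds_cm_s) - 1  # movement lasts until the end
    return None
```

With a hypothetical speed trace such as `[0, 1, 4, 10, 6, 2, 0]`, the function returns the indices of the first and last samples above 3.5 cm/s.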
We introduced a velocity-dependent curl force field (39) to establish the relationship between the pupil response and the motor adaptation. The force field was applied according to the following equation:

$$\begin{bmatrix} f_x \\ f_y \end{bmatrix} = \begin{bmatrix} 0 & B \\ -B & 0 \end{bmatrix} \begin{bmatrix} v_x \\ v_y \end{bmatrix}$$

where $f_x$ and $f_y$ are the force applied to the handle (N), and $v_x$ and $v_y$ are the velocity of the handle (m/s) for the x- and y-directions, respectively. For the clockwise (CW) force field, the viscosity coefficient B (N/[m s⁻¹]) had positive values, and for the counter-clockwise (CCW) field, B had negative values. To quantify adaptation to the force field, we occasionally introduced "channel" trials, in which the handle motion was constrained to a straight path between the home position and the target by a simulated damper and spring (40), to measure the force applied to the channel.
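As a rough illustration of a velocity-dependent curl field, the following Python sketch computes the perturbing force from the handle velocity. It assumes the standard curl-field form in which the force is perpendicular to the velocity and a positive coefficient produces a clockwise field, as described in the text; the function name and the numeric values in the usage note are hypothetical.

```python
def curl_force(vx, vy, b):
    """Force (fx, fy) in N for handle velocity (vx, vy) in m/s.

    b is the viscosity coefficient in N/(m/s). With b > 0, a forward reach
    (vy > 0) is pushed toward +x, i.e., a clockwise field; b < 0 gives the
    counter-clockwise field. This sign convention is an assumption.
    """
    fx = b * vy
    fy = -b * vx
    return fx, fy
```

For example, `curl_force(0.0, 0.5, 10.0)` gives a purely rightward force of 5 N for a forward movement at 0.5 m/s. Because the force is always orthogonal to the velocity, the field deflects the hand without directly speeding it up or slowing it down.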
To make inter-participant comparisons of pupil diameter interpretable (e.g., for Exp 3), we additionally measured reference physiological response amplitudes of pupil diameter within each participant while eliciting a pupillary light reflex by changing the background color of the display from light blue to white (higher luminance) or black (lower luminance) (Table 1). These measured pupil limits were used for the within-individual normalization of pupil diameter data (Fig. 1B).

Experiment 1:
Experiment 1 was conducted at the Brain and Mind Institute, University of Western Ontario (Ontario, Canada). The self-reported right-handed participants (n = 28; 13 males, 15 females; age: 24.3 ± 4.5) sat on a height-adjustable chair and held the handle of a robotic manipulandum (1,000 Hz control rate) (41). The position of the handle was represented as a cursor (white dot) on an LCD monitor (60 Hz update rate) placed directly above the handle to prevent the participant from seeing their hand. The starting position (white circle), goal target (gray circle), and background (light blue circle, 15 cm diameter) to prevent sharp luminance changes around the target were also presented on the display throughout the task (Fig. 1A).
The eye tracker was mounted on the display to monitor participants' eye gaze and pupil diameter (Fig. 1A). The approximate distance between participants' eyes and the center of the monitor was 16 cm.
An experimental session consisted of five blocks of trials (59 trials per block, except for the fourth block, which consisted of 44 trials). Short breaks (up to 1 min) were inserted between the blocks. In each block, the first and the last two trials were used to measure the simple pupil light reflex of the participants. In each of these "light reflex" trials, the LCD screen was suddenly turned either black or white for 2,000 ms. The pupil response was measured during and up to 3,000 ms after the termination of the black/white color stimulus. The response strengths (i.e., trough for constriction, and peak for dilation) were averaged over the blocks and used to normalize the individual pupil diameter during the main task. Following the initial "light reflex" trials, participants performed 55 trials of the center-out reaching task. On each trial, after confirming stable eye fixation (1,000 ms of fixation without any blink) on the target and the cursor staying within the home position, the goal target turned green after a variable delay (1,000-1,500 ms), cueing the reaching movement.
To prevent possible predictive/reflexive eye movement and/or pupil dilation because of a moving cursor, visual feedback of the cursor was removed during movement and participants were instructed to maintain fixation on the center of the target. Terminal end-point feedback was provided with a near-isoluminant magenta cursor for 1,000 ms when the cursor speed was less than 1 cm/s. After the feedback period, the manipulandum handle automatically returned to the home position and the next trial started.
presented on the display throughout the task (Fig. 3A). The colors (gray, green, light blue, and magenta) were adjusted to be approximately isoluminant ( score: 9.9 ± 0.9). In this experiment, the force field was gradually introduced over the course of seven blocks (50 trials in each block, except for the last block, which consisted of 70 trials). The force field was first introduced at the 16th trial in the second block and incrementally increased by 5% of the full-strength (B = 0.12) after every 11 trials, until it reached the full-strength (Fig. 5A). Channel trials were randomly interspersed in 20% of trials, except for block seven, in which channel trials were repeated from the 321st trial to the
To remove high-frequency noise, the pupil data were smoothed with a Gaussian kernel with 235 ms FWHM. Importantly, we individually normalized the pupil diameter data relative to the minima and maxima of the pupil diameter data measured during the light reflex trials. We also calculated the pupil dilation velocity through the numerical derivative (diff.m function) of the normalized pupil diameter data. The trial-by-trial summaries of these pupil-related variables were defined as follows. The baseline pupil diameter was defined as the average pupil diameter during the waiting period before the onset of the go cue. The mean pupil dilation velocity at each trial was defined as the average of pupil dilation velocity during the period from 300 to 700 ms from movement onset.
These periods were defined in a post hoc manner to maximally reflect the effect of experimental manipulation (i.e., force perturbation) and roughly corresponded to the periods of p < 0.001 (uncorrected) for the comparison between the baseline vs. the first five perturbed trials (e.g., Fig. 2B). Pupil dilation velocity data for trials in which a saccade was detected during the movement period were excluded from the analysis. The percentages of excluded trials were 16.5% ± 12.9% (Exp 1), 15.5% ± 15.2% (Exp 2), and 10.5% ± 8.8% (Exp 3).
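The pupil preprocessing steps (Gaussian smoothing, normalization to the light-reflex limits, numerical differentiation, and averaging within a time window) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' MATLAB pipeline: the sampling interval `dt_ms`, the truncation of the kernel at three standard deviations, the renormalization of the kernel at the edges, and the scaling of the derivative to units per second are all choices made here, and every function name is hypothetical.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_smooth(samples, fwhm_ms, dt_ms):
    """Smooth evenly sampled data with a Gaussian kernel of the given FWHM.

    The kernel is truncated at 3 sigma and renormalized near the edges
    (both are assumptions, not details given in the text).
    """
    sigma = fwhm_to_sigma(fwhm_ms) / dt_ms          # sigma in samples
    half = int(math.ceil(3 * sigma))
    offsets = range(-half, half + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    n = len(samples)
    smoothed = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < n:
                num += w * samples[j]
                den += w
        smoothed.append(num / den)
    return smoothed


def normalize(samples, pupil_min, pupil_max):
    """Rescale so the light-reflex minimum maps to 0 and the maximum to 1."""
    span = pupil_max - pupil_min
    return [(s - pupil_min) / span for s in samples]


def velocity(samples, dt_ms):
    """Forward difference (analogous to MATLAB diff), in units per second."""
    dt_s = dt_ms / 1000.0
    return [(b - a) / dt_s for a, b in zip(samples, samples[1:])]


def window_mean(values, times_ms, start_ms, end_ms):
    """Average the values whose time stamps fall within [start_ms, end_ms]."""
    picked = [v for v, t in zip(values, times_ms) if start_ms <= t <= end_ms]
    return sum(picked) / len(picked)
```

In this sketch, the per-trial pupil dilation measure would be `window_mean(v, t, 300, 700)` with `t` expressed relative to movement onset, and the baseline pupil diameter would be the plain average of the normalized samples in the waiting period before the go cue.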

Statistical analysis:
Time-series comparison: We assessed changes in the pupil velocity time series between unperturbed (average of the first block data) vs. perturbed (average over the first five perturbed trials) trials using group-wise two-sided paired t-tests applied at each time frame.
Uncorrected p-values were reported. Correction for multiple comparisons (number of time frames since movement onset) was applied using the Holm-Bonferroni method (45) with a family-wise error rate of p = 0.05 and the Benjamini-Hochberg method (46) with a false-discovery rate of q = 0.05. As a complementary analysis, we also employed the cluster-mass permutation test (47) implemented as 'permutest.m' for MATLAB (48) with a significance level of p = 0.05 and a cluster-thresholding p-value of 0.01. However, it should be noted that the cluster-based permutation test does not establish the significance of effect latency (49).
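The two correction procedures named above can be sketched in a few lines of Python. This is a generic textbook implementation of the Holm-Bonferroni step-down and Benjamini-Hochberg step-up procedures, not the authors' code; the function names and the example p-values in the usage note are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Step-down Holm procedure: return a list of reject decisions."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared
        # with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-significant test
    return reject


def benjamini_hochberg(p_values, q=0.05):
    """Step-up BH procedure: return a list of reject decisions."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff = rank  # largest rank passing its threshold
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            reject[idx] = True
    return reject
```

For the hypothetical p-values `[0.001, 0.02, 0.03, 0.2]`, the Holm procedure rejects only the first test, whereas the Benjamini-Hochberg procedure rejects the first three, which illustrates why the latter is the less stringent correction.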

Effect of repetitive change points for Exp 2:
We assessed the effect of participants' repetitive experience of the change points by fitting a general linear mixed-effects model ('fitlme.m' function) to the individually z-scored response data around the change points (baseline pupil diameter, pupil dilation velocity). All models included a fixed intercept and random intercepts for subjects and groups (Exps 2A, 2B, and 2C) and fixed and random effects of the perturbation types (CW and CCW). As effects of interest, fixed and random effects for the number of change points (#cps) were also included. To account for the trivial effect of error magnitude, we additionally included the absolute error term: trajectory error and endpoint error at the immediately previous trial for the baseline pupil data, and trajectory error at the same trial for the pupil dilation velocity data. The subject, group, and perturbation types were treated as categorical variables. The importance of the random effect of #cp was assessed using the likelihood-ratio test (LRT) between the full model and an alternative model lacking the random slope term. The models were fitted to the data using the restricted maximum likelihood method (ReML) with random starting values. We evaluated the p-value of the estimated fixed-effect slope for the #cp term to test whether the effect of change in #cp was statistically significant. The significance level was set to 0.05.

Sub-group comparison for Exp 3:
The data for the 32 participants who completed the post-study questionnaire about perturbation were analyzed. We first split the data into two sub-groups based on the median value of the total score of the post-study questionnaire (i.e., the total number of blocks in which the participant was aware of perturbation, ranging from 0 to 7). The difference in the score (i.e., number of "yes" responses) between the sub-groups was assessed using a chi-squared test at each block. We then compared the pupil and behavior data between the sub-groups in the following way. Data were first averaged within each cycle (five-trial bins), and a linear mixed-effects model was fit to the data. The model contained a fixed intercept, a random intercept regarding subjects, a fixed effect for cycle number within each block, random effects for cycle number regarding subject and block, and a fixed interaction between block and sub-group as an effect of interest. A significant difference between the sub-groups was assessed by the significance of this interaction term for each block. The cycle number was treated as a continuous variable, and other variables were treated as categorical. The models were fit to the data with the ReML method with random starting values. We employed the Holm-Bonferroni method (45) for correction for multiple comparisons (i.e., blocks) to maintain the family-wise error rate at p = 0.05. For less-stringent correction, we also used the Benjamini-Hochberg method (46) to maintain the false-discovery rate at q = 0.05.
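The per-block chi-squared comparison of "yes" responses between the two sub-groups can be sketched as a test on a 2 × 2 contingency table. The sketch below assumes a Pearson chi-squared statistic without continuity correction and one degree of freedom, which is an assumption rather than a detail stated in the text; the function names and any counts passed to them are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2 x 2 table [[a, b], [c, d]].

    Rows could be the two sub-groups and columns the "yes"/"no" counts
    in one block. No continuity correction is applied (an assumption).
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_p_df1(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Under the one-degree-of-freedom assumption, a statistic of 11.52 corresponds to a p-value of approximately 6.9 × 10⁻⁴.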

Effect of set break (block novelty):
To test the effect of starting new blocks avoiding the effect of perturbation, we first selected the blocks in each experiment in which perturbation was either absent or very weak for the initial 10 trials (blocks 1, 2, and 5 for Exp 1, 1 and 2 for Exp 2, and 1-4 for Exp 3) and averaged the data within each trial defined relative to block initiation. We then compared the average of the initial two trials (early) vs. the 6th-10th trials (late) for baseline pupil diameter, pupil dilation, reaction time, and movement time, using a two-sided paired t-test for each experiment.
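The early-versus-late comparison can be sketched as follows. For each participant, the block-aligned data are reduced to the mean of the first two trials and the mean of the 6th-10th trials, and the two sets of means are compared with a paired t statistic. This is an illustrative Python sketch, not the authors' code; the function names are hypothetical, and the conversion of the t statistic to a two-sided p-value (which needs the t distribution) is omitted.

```python
import math
import statistics


def early_late_means(trials):
    """Mean of trials 1-2 (early) and trials 6-10 (late) after block start.

    `trials` is one participant's per-trial data, already averaged across
    the selected blocks and ordered from the first trial of the block.
    """
    early = sum(trials[0:2]) / 2.0
    late = sum(trials[5:10]) / 5.0
    return early, late


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched samples x, y.

    Assumes at least two pairs and non-identical differences.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

Here `x` would hold the early means and `y` the late means across participants, so a positive t indicates larger values at the start of a block.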

Data and code availability:
The data and the custom-written MATLAB code used for the analysis will be uploaded to a publicly available server upon publication. Until then, requests should be addressed to the corresponding author (ayokoi@nict.go.jp).

Results
Pupillometry during simple force field adaptation paradigm: Error-driven pupil responses. To obtain initial insights about pupil movement changes during a typical motor adaptation paradigm, we monitored 28 participants' eye movements while they performed a reaching task with a force field perturbation (Exp 1; Fig. 1A, C). To minimize measurement noise in pupillometry, we instructed the participants to keep their eyes fixated on the target and to refrain from blinking while they performed reaching movements. The visual cursor feedback was occluded during reaching. To minimize brightness-induced changes in pupil size, visual stimuli were isoluminant with respect to the background (Fig. 1A). Participants were instructed to aim directly at the target, and to reach straight to it. Participants showed a typical behavioral signature of force field adaptation. A sudden introduction of the perturbation force disturbed the smooth, relatively straight hand trajectory (cycle 11, Fig. 1D) resulting in a large lateral deviation (cycle 14, Fig. 1D). The lateral deviations rapidly decreased with repeated reaches made under the perturbation (cycle 32, Fig. 1D; Fig. 1E), and the sudden removal of the force perturbation resulted in a large hand deviation toward the opposite direction (cycle 33, Fig. 1C; Fig. 1D), a signature of motor adaptation known as an aftereffect. With further trials, the trajectory (and movement errors) returned to a near-baseline level (cycle 52, Fig. 1D; Fig. 1E). The learning quantified in the force channel trials also showed a typical learning curve (Fig. 1F). How do participants' pupils respond in this typical motor adaptation situation? The pupil typically showed phasic dilation during movement in the Null-field trials (Fig. 2A, p-1 and p-2). When the perturbation was unexpectedly applied, the pupil exhibited additional dilation (Fig. 2A, trial p, p+1).
Analysis of the time derivative of the pupil dilation (pupil dilation velocity) revealed that this perturbation-evoked pupil dilation started ~300 ms after movement onset and lasted until ~700 ms after movement onset (Fig. 2B). The trial-by-trial change in the pupil dilation velocity averaged over this window (we will refer to this as pupil dilation) showed a sharp rise upon the introduction of the force field and gradual decline as the participants adapted to the force field and the movement error decreased (Fig. 2C; Fig. 1E). This indicates that, in the current paradigm, the phasic pupil dilation during reaching did not simply reflect physical effort, as it decreased despite the increase in lateral force exerted by the participants (Fig. 1F; Fig. 2C). The increased pupil dilation was also followed by increased tonic baseline pupil diameter in the following trial. It is also noteworthy that the tonic baseline pupil diameter showed higher values at the beginning of a new block (Fig. 2D). We will later provide more detailed analysis regarding these points combined with the results of Exp 2 and 3.
The results described above suggest that the phasic pupil dilation is likely to be modulated by the size of movement error. The correlation between the trial-by-trial change of phasic pupil response and that of absolute movement error for the group-averaged data was highly significant (r = 0.64; p = 8.97 × 10⁻³¹). This finding appears to be consistent with recent proposals that task-induced pupil dilation reflects surprise (e.g., unsigned prediction error, or risk prediction error) and the tonic baseline pupil diameter reflects subjective uncertainty about the environment (15,14). On the one hand, this fits well with the observed change in baseline pupil diameter; an abrupt introduction of the force field strongly implies environmental change leading to a transient increase in subjective uncertainty about the task, which is reflected in the baseline pupil diameter. On the other hand, the sudden removal of the force field did not elicit such changes in the baseline pupil diameter, presumably because the participants were aware that the task had returned to a known "baseline" state, implying an effect of novelty on environmental uncertainty, or simply an effect of fatigue or reduced arousal.
To further investigate the relationships between pupil diameter, error size and environmental change during motor learning, we also monitored the pupil diameter in two additional experiments. In Exp 2, we employed a switching force field schedule, in which the direction of force field was unexpectedly reversed multiple times, inducing overt environmental change and a re-increase in error. In Exp 3, we employed a gradual force field schedule, in which the magnitude of force field was gradually increased, inducing covert environmental change with much smaller errors compared with Exp 1 and 2.
Pupil responses to a switching force field schedule: Dissociation between error sizes and pupil responses after multiple reversals.
In Exp 2, we monitored the pupil responses of another 29 participants while they reached in the presence of force fields with slightly different settings compared with Exp 1, including a light-weight manipulandum and a vertically aligned monitor (Fig. 3A). Exp 2 involved three sub-groups (Exp 2A, B, and C) in which participants experienced different force field schedules (Fig. 3B). In all of the sub-groups, the force field was abruptly introduced on the 11th trial in block 2 (CW for Exp 2A and C, and CCW for Exp 2B), and the direction of the force field was reversed at different timings and frequencies for different sub-groups (Fig. 3B; see Methods for detail). These change points, including the introduction and removal of the force field, induced sudden, unexpected increases in error size (Fig. 3D), indicating a change in task environment. We examined participants' pupil responses, both phasic and tonic, to these change points.
First, to replicate the basic results of pupil responses in Exp 1, we analyzed the pupil data around the first change point (i.e., the first introduction of force field) aggregating the data from all sub-groups. The time course of perturbation-evoked phasic pupil dilation showed a similar pattern (Fig. 3C) to that in the first experiment (Fig. 2B) with a slight difference in the timing of the effect. Thus, we employed a similar time window for averaging the pupil dilation velocity at each trial (300 to 700 ms). As shown in Figures 4A and C, both the phasic pupil dilation during movement and the tonic baseline pupil diameter responded to the first change point in a similar way to those in Exp 1. How does the pupil respond to the following change points? Intriguingly, despite the sharp re-increase in movement error at the following change points (Fig. 3D), both the pupil dilation and baseline pupil diameter quickly became insensitive to these large errors (Supplementary Figure 1). It should be noted that such a decline in pupil response sensitivity to errors was not reported previously in the context of reinforcement learning (e.g., 14,15). To quantify the decline in pupil responses, we focused on response amplitudes around each change point. The response amplitude in the baseline pupil diameter and the pupil dilation showed a dramatic decline for the subsequent change points (Fig. 4B, D). Such a decline in pupil response amplitude was also observed in the data in Exp 1 (Fig. 4B, D).
Figure 2B). Note that, as already mentioned, the decrease in pupil responses was significant even after taking this reduction of kinematic errors into account.
Pupil responses to gradual force perturbation: Association between baseline pupil diameter and perturbation awareness.
We also ran another experiment with 34 new participants in which they adapted to a gradually introduced force field. The magnitude of the force field was gradually increased, as shown in Figure 5A (see Methods for details). As previously noted, the gradual introduction of perturbations is believed to evoke substantially less awareness about the presence of perturbation compared with abruptly-introduced perturbations because of the smaller errors it induces (50,51). As expected, the magnitude of error experienced by participants was much smaller compared with Exp 1 and 2 (Fig. 5B). As shown in Figure 5C, participants gradually adapted to the force field to a level that was comparable to that shown by participants in Exp 2C (Supplementary Figure 1B). There was no significant difference in learning indices between the average of the last five channel trials in the CW field for Exp 2C vs. the average of the last five channel trials before the start of constant channel phase for Exp 3 (t 41 = −1.04, p = 0.31, two-sided independent samples t-test). How does the pupil respond to gradual perturbation? As expected, except for the clear and consistent increase in baseline pupil diameter at the start of new block, participants' pupils showed no clear responses, in terms of both phasic and tonic activity (Fig. 5D, E). The results clearly indicate that the pupil responses were more sensitive to a sudden and large change in the environment (Exp 1 and 2) than to a covert, gradual change (Exp 3), suggesting some similarity between participants' awareness about the perturbation and subjective uncertainty about the task state change reflected in baseline pupil diameter.
To further investigate the relationship between pupil responses and awareness about perturbation, we assessed participants' awareness of perturbation immediately after the experiment by asking them to report the presence/absence of force perturbation in each block without providing any information about this interview prior to the main session (see Methods for more detail). Based on these data, we split participants into two groups according to the total number of blocks in which they reported that perturbation was present. Figure 6A shows the resultant median-split report rate and their average. First, the data showed a gradual increase in report rate, consistent with the gradual increase in the force field. Interestingly, although the perturbation was gradually introduced, on average, more than half of participants were aware of its presence in the later stage of learning, in which the perturbation size almost reached its maximal value (blocks 5 and 6). The chi-squared test revealed a significant difference in report rate between the sub-groups in blocks 4 (χ² = 6.47, p = 0.01), 5 (χ² = 11.52, p = 6.896 × 10⁻⁴), and 6 (χ² = 9.49, p = 0.002). There was no significant bias in applied force direction between the sub-groups (χ² = 0.008; p = 0.9285), indicating that the difference was not simply caused by different sensitivity to force direction.
Intriguingly, comparison of pupil data between sub-groups revealed that baseline pupil diameter was larger for the "more aware" participants in blocks 5 (

Increased baseline pupil diameter at the start of new blocks suggests increased subjective uncertainty.
The most prominent feature of the pupil responses in Exp 3 was a characteristic re-increase in baseline pupil diameter at the beginning of a new block (Fig. 5E). This  Table   2). Notably, if the increased tonic baseline pupil diameter at the start of the new block merely indicated higher arousal/vigilance/vigor because of short rests between the blocks, the RT and MT would be expected to decrease (26,27,52). Longer RTs typically accompany decisions under uncertain conditions or choices with many potential options (Hick's law), and longer MTs are observed when the task is perceived as difficult (Fitts's law). Therefore, such increases in baseline pupil diameter should, at least in part, reflect an internal state related to increased subjective uncertainty.

Discussion
In the current study, we systematically studied pupillary responses during the process of force field reach adaptation in which the perturbation was applied either abruptly (Exp 1) As discussed below, our results support an interpretation that is generally consistent with proposals in the field of cognitive, reward-based learning (12-18, 30, 28, 29), suggesting that baseline pupil diameter is likely to reflect subjective uncertainty about the environment, whereas phasic pupil dilation during movement is likely to reflect surprise about sensory consequences (e.g., movement error), and both are modulated by the novelty of the environment. At the beginning of a new block, when the novelty of a task is presumably high because of participants' lack of complete knowledge about the experiment or forgetting during the break, both the tonic baseline pupil diameter and phasic pupil dilation were high.
In contrast, at change points in which the environment (i.e., force field direction) is switching between known conditions, at which time the novelty of an error is low, pupil responses are less sensitive to the size of the observed error. These processes may, at least in part, be mediated by central noradrenaline (NA) activity.

What do pupil responses reflect during motor adaptation?
Neurophysiologically, pupil diameter under constant luminance is known to reflect the activity of the locus coeruleus (LC) in the brainstem, a central source of NA (53-55). As suggested by numerous studies, pupil size and LC neurons show similar response patterns to surprising events and perceived environmental uncertainty (53,54). For instance, both LC neurons and the pupil show phasic responses to novel, infrequent, or surprising stimuli (60, 56, 18, 57-59, 14, 12, 13, 61, 15-17). Moreover, LC neurons and the pupil show a transient increase in tonic activity facing the reversal of target-reward contingency (i.e., sudden increases in prediction error) in both monkeys (61) and humans (12-17, 30, 28, 29). Thus, phasic pupil dilations and tonic increases in baseline pupil diameter are likely to reflect perceived surprise and environmental uncertainty, respectively, mediated by corresponding activation in the LC-NA system. Thus, the present results can be interpreted as follows. A sudden introduction/reversal of the force field induced substantial surprise, leading to a phasic response in the pupil-linked NA/arousal system. This is then followed by a transient increase in uncertainty regarding the task environment and a corresponding increase in tonic activity of the pupil-linked NA/arousal system (Exp 1, 2). The absence of such pupil responses in the gradually increasing force field condition (Exp 3) suggests that the size of (prediction) error is a key factor driving this process.
Interestingly, further observation of the results of Exp 2 indicated that pupil responses to errors might also be modulated by novelty. The decline in pupil responses to large errors after several change points (Fig. 4B, D) indicates that error size is not the only determinant of pupil responses during motor adaptation. Conceivably, acquisition of higher-level knowledge about task structure (e.g., the presence of reversal) might have made the sudden increase in movement errors no longer surprising, but somehow expected. A recent monkey study using a choice-reversal task also reported that the more reversals monkeys experienced, the faster they switched behavior, which was captured by a Bayesian choice model as gradually increasing prior belief on reversal (62). In accord with this notion, we observed a gradual reduction in kinematic error in the change point trials, which was not attributable to reduced learning in the preceding force direction or reduced peak hand velocity (Supplementary Figure 2). These results indicated that participants became gradually better at moving under the force field, including sudden reversals, potentially because of increased stiffness (63,64), improved feedback control (65,66), or both. It should be noted that the decline in pupil sensitivity to frequent increases in errors, as observed in the present study, has not been described for the similar reversal learning paradigm in the context of reward-based learning (e.g., 14,15). Although it is not yet clear whether this phenomenon is specific to motor learning, our results extend the findings of previous studies in revealing how the pupil-linked arousal/NA system responds to changes in the environment.
We speculate that the larger baseline pupil diameter observed at the beginning of new experimental blocks reflects heightened subjective uncertainty about the task (e.g., Fig. 5E; Exp 3). This characteristic pattern of larger pupil size early in experimental blocks has been frequently reported but is often interpreted simply as a (re-)increase in arousal after short rests (e.g., 68,69). However, as indicated by the increased RT and MT in this phase (Fig. 7), the larger pupil diameter here was more likely to be associated with increased subjective uncertainty, because RT typically increases when facing uncertain decisions (69) or choices with many potential options (70), and MT typically increases when the task is perceived as difficult (i.e., subjective expectation of movement accuracy is low) (71). A recent study also reported a larger pupil diameter and slower RT for uncertain decisions (72). If larger baseline pupil diameter reflected only higher arousal/wakefulness (22)(23)(24)(25), vigilance/alertness (26,27), and/or vigor (52,73), RT and MT would be expected to decrease (Fig. 7D). Thus, our data suggest that larger pupil diameter after set breaks at least partially reflects subjective environmental uncertainty.

How can pupil-linked arousal influence motor adaptation?
The transient increase in subjective uncertainty about the environment and in pupil-linked arousal/NA activity induced by the sudden introduction of force fields bears a similarity to several phenomena reported in the early phase of human motor adaptation. For example, adaptation to both stable and unstable force fields transiently increases muscle co-activation (74,64,75), as increasing limb impedance is an optimal strategy for moving in highly uncertain dynamic environments (63) or for increasing movement accuracy (76). Similarly, adaptation to a force field transiently increases the gains of visuomotor feedback responses (77), as well as of the long-latency muscle stretch reflex (78), which is mediated by the primary motor cortex (79)(80)(81). Furthermore, a recent study demonstrated that Ia afferent firing from muscle spindles is enhanced in the early phase of visuomotor adaptation (82). These central and peripheral gain control changes might originate from the NA projections to the motor cortices and spinal cord (83). Notably, a recent report showed an arousal-like transient increase in neuronal population activity in the primary motor cortex in response to errors caused by an environmental (brain-computer interface mapping) change, which correlated with pupil diameter change (68). Additional evidence suggests increased muscle spindle sensitivity following sympathetic up-regulation (84,85).
One intriguing question is whether/how pupil responses can be informative for understanding the "explicit" and "implicit" components of motor learning (86,5,6).
Currently, the dominant approach for trial-by-trial assessment of the conscious/explicit component of motor learning in the reach adaptation paradigm is to ask participants to verbally/manually report their aiming direction prior to each reach (87,5,88). However, this approach suffers from the inherent problem of interfering with the learning system itself and hence biasing learning processes to be more "explicit" (89)(90)(91). Moreover, although this aim reporting method allows researchers to assay the contribution of explicit processes to net learning, it does not provide direct clues regarding the underlying cognitive processes driving the explicit process. Thus, it is important to develop new complementary approaches to assess the explicit and/or implicit component (e.g., 92,93) or to examine the cognitive states that can drive it without affecting the learning system. Given the notion that motor learning is learning about movement selection and execution (1) and the significant effect of surprise/uncertainty (and the pupil-linked arousal/NA system) on reward-based learning of action selection (15), pupil responses may provide a window for assessing how learning about movement selection (i.e., the explicit process) progresses.
Although we did not explicitly quantify the explicit components, our results may provide some insight into the question above. The post-experiment questionnaire in Exp 3 (gradual force field) revealed an association between inter-individual differences in the overall correctness of perturbation awareness and baseline pupil diameter (Fig. 6B). Such inter-individual differences were also accompanied by differences in the amount of learning and movement error, such that higher awareness was associated with less learning and more error (Fig. 6C, D). One possibility is that participants with weaker adaptation experienced more error, resulting in larger baseline pupil diameter and an increased probability of later recall of perturbation awareness. Recent evidence also suggests that prediction error (94,95), as well as phasic pupil responses (96,97), can signal a subjective belief about environmental change, which helps to create event boundaries in a memory structure (96,95). Another possibility is that participants with larger baseline pupil diameter had higher subjective uncertainty about the task (and were hence more likely to recall it as "perturbed" afterward), leading to more frequent exploratory behavior, which resulted in a smaller amount of adaptation. This scenario is consistent with previous reports that awareness of a perturbation reduced the amount of implicit adaptation (51,98,99) and with a more recent study suggesting that the implicit adaptation process compensates for the noisy explicit strategy (88). One caveat is that, although we instructed our participants to aim straight to the target, we did not directly measure the explicit component of adaptation. Thus, at this point, we can only suggest that the tonic baseline pupil diameter reflects subjective uncertainty about the environment and may also be related to awareness of the perturbation. Further confirmatory studies will be needed to clarify this point.

Limitations and open questions
Despite the established link between pupil diameter and central LC-NA activity, the relationship is not necessarily one-to-one. For instance, a recent rodent study, in which cortical axons for both NA and acetylcholine (ACh) were recorded, reported that while rapid changes in pupil diameter and its time derivative (i.e., dilation velocity) were more strongly correlated with NA than ACh activity, slow pupil dilations on the timescale of a few seconds were correlated with activation of both NA and ACh (100). However, the detailed mechanisms by which central ACh affects pupil size remain unknown. Moreover, recent studies have reported that the activity of dorsal raphe nucleus (DRN) serotonin (5-HT) neurons in mice also tracks environmental uncertainty (101) and that photoactivation of DRN 5-HT neurons elicits pupil dilation (102). The bidirectional connection between the DRN and LC (83) further complicates this relationship. It is also important to note that 5-HT plays a key role in controlling the input-output gain of spinal motoneurons (103,104).
Overall, a more direct approach, such as invasive animal studies or pharmacological manipulation, is required to further establish the links among pupil diameter, these neuromodulators (NA, ACh, and 5-HT), and motor learning processes.
One topic that remains to be addressed is the relationship among motor learning rate, uncertainty/surprise, and the pupil-linked arousal/NA system. Theoretically, statistically optimal learning algorithms, such as the Kalman filter (105) and more recent extensions (33,32,34,106-108), take multiple sources of uncertainty into account to dynamically modulate the learning rate. Unfortunately, it was not easy to directly answer this question with the current data, because the current experiments were not designed for the accurate quantification of learning rate. One recent study (67), however, reported an association between baseline pupil diameter and learning rate in a modified saccade adaptation task. We will address this question in a separate report in which we directly measure the single-trial learning rate with/without experimental manipulation of the pupil-linked arousal system in the motor adaptation paradigm.
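To illustrate how uncertainty can set the learning rate in such algorithms, the following is a minimal sketch of a scalar Kalman filter, in which the Kalman gain plays the role of a trial-by-trial learning rate. It is not the model used or fitted in the present study. The function names (`kalman_step`, `simulate`), the parameter names (`process_var`, `obs_var`, `init_var`), and all numerical values are hypothetical and chosen only for illustration.

```python
def kalman_step(estimate, variance, observation, process_var, obs_var):
    """One scalar Kalman update; returns (new_estimate, new_variance, gain).

    Illustrative assumption: the perturbation is modeled as a random walk
    with variance `process_var`, observed with noise variance `obs_var`.
    """
    # Predict: uncertainty about the state grows by the process noise.
    prior_var = variance + process_var
    # The Kalman gain acts as the learning rate for this trial.
    gain = prior_var / (prior_var + obs_var)
    # Correct the estimate in proportion to the prediction error.
    new_estimate = estimate + gain * (observation - estimate)
    # Posterior uncertainty shrinks after incorporating the observation.
    new_variance = (1.0 - gain) * prior_var
    return new_estimate, new_variance, gain


def simulate(observations, process_var, obs_var, init_var=1.0):
    """Run the filter over a sequence; returns (final_estimate, gains)."""
    estimate, variance, gains = 0.0, init_var, []
    for y in observations:
        estimate, variance, gain = kalman_step(
            estimate, variance, y, process_var, obs_var
        )
        gains.append(gain)
    return estimate, gains


# Hypothetical comparison: a volatile versus a stable environment.
_, volatile_gains = simulate([1.0] * 20, process_var=0.5, obs_var=1.0)
_, stable_gains = simulate([1.0] * 20, process_var=0.0, obs_var=1.0)
```

In this sketch, greater assumed environmental volatility (larger `process_var`) keeps the gain high, whereas in a stable environment the gain decays across trials as the estimate becomes more certain. Whether pupil-linked arousal tracks such a quantity remains an open question, as noted above.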
In the present study, we provided the first detailed characterization of pupillary responses in a widely used motor adaptation paradigm. Our data revealed how the internal states of human participants, most likely surprise and uncertainty about the environment, dynamically change during motor adaptation, thus providing important clues for understanding the process of force field learning. The results of the current study highlight the utility of pupil diameter as a valuable window into the motor system.