A Double Dissociation between Savings and Long-Term Memory in Motor Learning

Both declarative and procedural memories are easier to reacquire than learn from scratch. This advantage, known as savings, has been widely assumed to result from the reemergence of stable long-term memories. In fact, the presence of savings has often been used as a marker for whether a memory had been consolidated. However, recent findings have demonstrated that motor learning rates can be systematically controlled, providing a mechanistic alternative to the reemergence of a stable long-term memory, and recent work has reported conflicting results about whether implicit contributions to savings in motor learning are present, absent, or inverted, suggesting a limited understanding of the underlying mechanisms. In order to elucidate the mechanism responsible for savings in motor learning, we investigate the relationship between savings and long-term memory by determining how they depend on different components of motor learning. To accomplish this, we experimentally dissect motor adaptation based on short-term (1-minute) temporal persistence. Surprisingly, we find that a temporally-volatile component of implicit learning leads to savings whereas temporally-persistent learning does not, but that temporally-persistent learning leads to long-term memory at 24 hours whereas temporally-volatile learning does not. Moreover, we find that temporally-persistent implicit learning not only fails to contribute to savings, but that it produces an anti-savings which acts to reduce the net savings, and we show that the balance between temporally-volatile and temporally-persistent components can explain seemingly inconsistent reports about implicit savings. The clear double dissociation between the mechanisms for savings and long-term memory formation challenges widespread assumptions about the connection between savings and memory consolidation, and provides new insight into the mechanisms for motor learning.


Memories, both declarative and procedural, are easier to reacquire than to learn from scratch. This advantage, known as savings, was first appreciated in Hermann Ebbinghaus's seminal work [1], in which he observed that relearning a forgotten list of words is faster than learning a novel list. Savings has since been demonstrated in a plethora of different paradigms, including cognitive tasks in humans [2,3], operant conditioning in animals [4-6], and motor tasks in humans such as saccade adaptation [7], force-field adaptation [8-11], visuomotor adaptation [12-20] and gait adaptation [21,22].

Previous research generally maintained that savings results from the recall of a previously consolidated memory. In fact, the presence or absence of savings itself has often been taken as a litmus test for whether a previously trained memory has been consolidated [14,23,24]. In line with this idea, savings has been further suggested to result from (1) the unmasking of a slower-learning, strong-retention process in a multi-rate learning model [10]; (2) context- or relevance-based switching between such multiple slow processes, each specific to a different memory [25-28]; or (3) reverting to the memory of a previously learned motor plan that was reinforced by success or mere repetition [16,29]. All of these proposed mechanisms focus on savings as the manifestation of a latent, stable, consolidated motor memory that is robust to both interference and the passage of time.

In line with this idea, a recent but influential view has proposed that savings in motor adaptation specifically

percentage of that required for full adaptation. In particular, normalized early adaptation (trial 10 after perturbation onset) was faster during relearning than during initial adaptation (initial adaptation: 62.8±4.1% vs. relearning: 81.7±2.5%; 40-trial and 800-trial washout data separately: 82.1±2.1% and 81.4±3.2%, respectively).
We defined savings simply as the difference between these normalized adaptation data for the retraining vs. the initial learning conditions. The top panel in Fig. 3e,f shows an estimate of this savings measure in the overall adaptation. We find statistically significant savings for early adaptation (trial 10; savings of 18.9±3.3% of the ideal adaptation, t(39)=5.8, p = 5.4×10⁻⁷ [40-trial washout data: 19.3±3.3%, t(39)=5.8, p = 4.4×10⁻⁷; 800-trial washout data: 18.6±3.6%, t(39)=5.1, p = 4.1×10⁻⁶]). Inspection of the overall adaptation data in the top panels of Fig. 3c,d reveals that both the readaptation and initial adaptation curves asymptote near the ideal adaptation level, meaning that the room for improvement, and thus the capacity for savings, is reduced as training proceeds. In line with this observation, the savings we observed for mid (trial 40) and late (trial 70) overall adaptation were smaller than the savings observed for early (trial 10) adaptation (<10% of the ideal adaptation in all cases). The savings we observed at trials 40 and 70 were, however, statistically significant (t(39)=2.6, p = 0.

Savings does not arise from the rapid reemergence of temporally-persistent memories

Intriguingly, the clear pattern of savings we found in the learning curves for overall adaptation was not present for temporally-persistent adaptation. In only one of the four conditions in Experiments 1 and 2 (readaptation after a 40-trial washout in Experiment 1) was the persistent adaptation higher during relearning than initial adaptation, and in that condition the readaptation built upon a substantially higher pre-training level than the corresponding initial training condition (Fig. 3a).
When pre-training levels of persistent adaptation were taken into account by normalizing learning curves, we found that relearning for temporally-persistent adaptation was slower, rather than faster, than initial learning as shown in Fig. 3a,b. This absence of savings is illustrated in

A post-hoc analysis asked whether temporally-persistent adaptation actually displayed slower relearning. It revealed that after both washout periods combined, savings in persistent adaptation was, in fact, significantly negative (t(38) = -2.4, p = 0.0232, 2-tailed paired t-test). This indicates that, for temporally-persistent adaptation, relearning was significantly slower than initial learning. Considering the washout conditions individually, this was clearest in the long 800-trial washout data (which allowed us to examine savings without any effects of residual temporally-persistent adaptation), with significantly negative savings (t(38) = -2.3, p = 0.0244, 2-tailed paired t-test). Savings after the short incomplete washout was also nominally negative but, in this case, not significantly so (t(37) = -1.7, p = 0.1073, 2-tailed paired t-test). The negative savings results we observe in the 800-trial and 40-trial washout data are similar, but the 800-trial result is likely more reliable because the 40-trial washout data suffer from incomplete washout of the initial adaptation before relearning. Thus our data show a conspicuous absence of savings in the reacquisition of temporally-persistent adaptation in all conditions we examined, instead showing clear anti-savings despite robust savings in the reacquisition of overall adaptation.

Savings arises from the faster acquisition of temporally-volatile memories

The absence of savings in the temporally-persistent component of adaptation suggests that the temporally-volatile component of adaptation is responsible for the savings observed in overall adaptation.
We calculated savings for temporally-volatile adaptation (bottom row in Fig. 3e,f) based on the normalized volatile adaptation (bottom row of Fig. 3c,d), which was computed as the difference between the normalized overall and normalized persistent adaptation (top row of Fig. 3c,d). We found that volatile adaptation during early training (trial 10) was 2- to 3-fold faster for retraining than for initial training after both the short and long washout periods in both experiments (as shown in the bottom row of Fig. 3c,d). Specifically, we found that trial 10 volatile readaptation was 52.4±3.8% of the ideal adaptation to the 30° VMR with data after both types of washout combined vs. 22.0±4.2% for initial adaptation, t(38)=6.2, p = 1.3×10⁻⁷ (readaptation for 40-trial washout data: 51.3±4.5%, t(37) = 5.7, p = 9.0×10⁻⁷ for savings; readaptation for 800-trial washout data: 52.4±4.1%, t(38) = 5.6, p = 8.7×10⁻⁷ for savings). This indicates substantial, statistically significant savings in temporally-volatile adaptation, as illustrated in the bottom row of Fig. 3e,f, that stands in stark contrast to the absence of savings observed in temporally-persistent adaptation, suggesting that overall savings arises from the former, but not the latter.

In line with the above, it is statistically clear, both for the combined data and for the 800-trial and 40-trial washout data separately, that overall savings is overwhelmingly driven by savings in volatile adaptation. In particular, when we analyzed the ratio of overall savings that is accounted for by volatile vs. persistent savings, we found that temporally-volatile savings accounted for essentially all overall savings (95% confidence intervals for the % contribution of volatile savings to overall: [110% to 219%], confidence interval estimates obtained using bootstrap, see Materials and Methods).
This was similar for both the 40-trial data [95% to 192%] and the 800-trial data [112% to 235%] separately. Correspondingly, we found the contribution of temporally-persistent savings to be overwhelmingly negative (95% confidence intervals for the % contribution of persistent savings to overall: [-119% to -10%]; 40-trial washout data: [-92% to 6%]; 800-trial washout data: [-135% to -12%]). The near or complete absence of savings in temporally-persistent adaptation indicates that savings in the overall adaptation is primarily driven by temporally-volatile savings, and further suggests that temporally-volatile savings may be the sole source of savings in overall adaptation. Thus, our result indicates that savings arises from the faster reacquisition of volatile memories, rather than the re-manifestation of persistent memories.

Temporally-volatile savings arise from implicit adaptation

Previous research associated savings in visuomotor adaptation with the rapid recall of explicit strategies, rather than faster implicit adaptation [18,30,31]. This led us to investigate the contributions of implicit and explicit processes in the temporally-volatile savings we observed in our paradigm. We thus ran Experiment 3 (N=40), which consisted of two 80-trial learning episodes separated by 800 washout trials. We dissected savings into implicit and explicit components using special instruction trials which prompted participants to disengage any explicit strategy by aiming their hand directly to the target [31,53-58]. These instructions were presented immediately before and after the first (trial 10) 60s time delay following the onset of the visuomotor rotation in both initial learning and relearning, and allowed us to dissect adaptation into four subcomponents: implicit-persistent; implicit-volatile; explicit-persistent; explicit-volatile (Figure 4B, see Materials and Methods for details).
In line with our findings in Experiments 1 and 2, we found savings for overall and volatile adaptation (14.3±3.6%, t(39) = 4.0, p = 0.00014 and 11.2±4.8%, t(37) = 2.4, p = 0.0119, respectively) but not persistent adaptation (4.0±4.9%, t(38) = 0.8, p = 0.21). Dissection of savings into explicit and implicit components revealed savings for both overall implicit and implicit-volatile adaptation (14.1±5.7%, t(38) = 2.5, p = 0.0088 and 13.3±5.5%, t(37) = 2.4, p = 0.0104, respectively) but not explicit-volatile adaptation (-2.1±6.2%, t(37) = -0.3, p = 0.63) or any of the persistent subcomponents (implicit-persistent: 3.8±4.7%, t(38) = 0.8, p = 0.21; explicit-persistent: 0.2±4.4%, t(38) = 0.1, p = 0.48). This finding suggests that overall savings were driven by the implicit and temporally-volatile component of adaptation, in turn suggesting that the temporally-volatile savings we observed in Experiments 1 and 2 predominantly reflect an implicit process rather than an explicit strategy. That the volatile component observed in Experiments 1 and 2 is primarily implicit is not surprising. First, it is unclear why an explicit strategy could be temporally-volatile to the point of being largely or completely forgotten after a short 1-minute delay. In fact, our recent work indicates that explicit adaptation displays essentially no temporal volatility, with over 95% stability across 1-minute delays [59]. Second, our paradigm elicited scant explicit adaptation (likely due to elements of our experiment design aimed at inducing implicit learning, such as the use of point-to-point (rather than shooting) movements, the lack of aiming instructions, the lack of markers that could aid off-target aiming, and the presence of low-latency online feedback [56,60-63]), and without substantial explicit adaptation we lacked power for measuring explicit savings.
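The four-way dissection used in Experiment 3 can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the `ProbeTrials` container, the `dissect` and `savings` helpers, and the example numbers are all hypothetical, and the derived explicit and explicit-volatile terms assume the components add up to overall adaptation.

```python
from dataclasses import dataclass


@dataclass
class ProbeTrials:
    """Adaptation levels (% of ideal) around the trial-10 delay (hypothetical container)."""
    overall: float             # mean of no-instruction trials flanking the probe sequence
    instr_before_delay: float  # instruction trial just before the 60 s delay
    instr_after_delay: float   # instruction trial just after the 60 s delay
    free_after_delay: float    # no-instruction trial following the second instruction trial


def dissect(p: ProbeTrials) -> dict:
    """Split overall adaptation into four subcomponents, assuming additivity."""
    implicit = p.instr_before_delay
    implicit_persistent = p.instr_after_delay
    implicit_volatile = implicit - implicit_persistent
    explicit = p.overall - implicit
    explicit_persistent = p.free_after_delay - p.instr_after_delay
    explicit_volatile = explicit - explicit_persistent
    return {
        "implicit_persistent": implicit_persistent,
        "implicit_volatile": implicit_volatile,
        "explicit_persistent": explicit_persistent,
        "explicit_volatile": explicit_volatile,
    }


def savings(initial: dict, relearning: dict) -> dict:
    """Savings per component: relearning minus initial learning."""
    return {name: relearning[name] - initial[name] for name in initial}


# Made-up numbers purely for illustration.
initial = dissect(ProbeTrials(60.0, 50.0, 30.0, 35.0))
relearning = dissect(ProbeTrials(75.0, 65.0, 32.0, 36.0))
print(savings(initial, relearning))
```

By construction, the four components sum to the overall adaptation estimate, so savings in the overall measure is the sum of the component savings.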
Dissecting long-term memory in visuomotor adaptation

We next investigated whether the ability to dissect motor learning into temporally-persistent and temporally-volatile components could shed light on the mechanisms for the formation of long-term memories. To accomplish this, we examined the relationship between the levels of temporally-persistent and temporally-volatile learning observed after initial training and the amount of retention observed 24 hours later (Experiment 4). After a baseline period, we trained 25 participants on a 30° VMR for 120 trials. After this initial training, they were tested for temporally-persistent adaptation, retrained for 60 trials, and retested before returning the following day when they were tested for retention (Fig. 5a

or temporally-persistent adaptation. When examining long-term memory, retained 24 hours after training, we found that participants retained 9.1±1.2° of the trained 30° rotation (orange bar in Fig. 5b). This corresponded to 32.4±4.2% of the overall learning and 45.6±4.6% of the temporally-persistent learning from day 1.

Dissociable effects of temporally-volatile and temporally-persistent adaptation on the formation of long-term memory

To examine whether the dissection of day 1 learning into temporally-persistent and temporally-volatile components could shed light on the mechanism for long-term motor memory formation, we compared the levels of temporally-volatile, temporally-persistent, and overall learning to the amount of 24-hour retention for each individual participant. Looking for positive contributions of each component to 24-hour retention (using linear regression with regression coefficients restricted to be positive), we found no significant relationship between overall learning on day 1 and 24-hour retention on day 2 (r = +0.14, F(23,1) = 0.4, p = 0.51).
However, we found a highly significant positive relationship between persistent learning on day 1 and 24-hour retention (slope = 0.80, r = +0.71, F(23,1) = 22.9, p = 0.00008). In contrast, we found no positive relationship between volatile learning on day 1 and 24-hour retention; in fact, the best-fit slope was zero (r = 0.0, F(23,1) = 0, p = 1), as the best-fit slope without restricting regression coefficients would have been negative. This indicates that temporally-volatile learning does not lead to 24-hour retention, consistent with the fact that volatile learning, by definition, will decay over the course of one minute. We thus find that, whereas neither overall adaptation nor the temporally-volatile component can predict it, the temporally-persistent component of adaptation, measured only one minute after training, is able to accurately predict retention 24 hours after training.

We next performed a stepwise bivariate regression analysis of how 24-hour retention depended on temporally-volatile and temporally-persistent learning from day 1, as illustrated in Fig. 6a,b. This analysis was particularly important here because temporally-volatile and temporally-persistent learning were not independent across individuals but instead displayed a strong negative relationship such that participants with higher day 1 temporally-volatile learning displayed smaller day 1 temporally-persistent learning and vice versa. This bivariate regression revealed that adding temporally-volatile learning as a second regressor after temporally-persistent learning resulted in no significant improvement in the ability to explain 24-hour retention (R² increased from 49.8% to 51.7%, corresponding to a partial R² of only 3.8%, F(22,1) = 0.4, p = 0.36).
In contrast, adding temporally-persistent learning as a second regressor after temporally-volatile learning resulted in a large improvement in the ability to explain 24-hour retention (R² increased from 0.0% to 51.7%, corresponding to a partial R² of 51.7%, F(22,1) = 23.6, p = 0.00007). The results of this analysis are shown in Fig. 6a

In summary, we find in Experiment 4 that increased temporally-persistent adaptation leads to stronger long-term memory, whereas increased temporally-volatile adaptation does not. This sharply contrasts with Experiments 1-2, where we found that temporally-volatile adaptation led to savings whereas temporally-persistent adaptation did not. Taken together, these results demonstrate a striking double dissociation between the contributions of temporally-persistent and temporally-volatile learning to long-term memory and savings.

To even more directly compare the two contributions to this double dissociation, we returned to the savings data from Experiments 1-2 and performed a bivariate analysis of the inter-individual differences in overall savings based on the levels of temporally-volatile and temporally-persistent learning during retraining (see Materials and Methods for details). This bivariate regression analysis is analogous to that performed on the 24-hour retention data above and is illustrated in Fig. 6c,d, in parallel format to Fig. 6a,b. Adding temporally-volatile learning as a second regressor after temporally-persistent learning resulted in a significant improvement in the ability to explain savings (R² increased from 0.0% to 13.8%, corresponding to a partial R² of 13.8%, F(37,1) = 5.9, p = 0.0197). We note here that this p-value corresponds to the variance reduction associated with adding temporally-volatile learning to the regression and is equivalent to a two-tailed test for its regression slope being non-zero.
Thus, if a one-tailed test had instead been used, in line with the idea of testing for a significantly positive relationship between savings and initial temporally-volatile learning, its p-value would have been < 0.01. In contrast, adding temporally-persistent learning as a second regressor after temporally-volatile learning resulted in no significant improvement in the ability to explain savings (R² increased from 13.7% to 13.8%, corresponding to a partial R² of 0.2%, F(37,1) = 0.1, p = 0.81). The findings from this regression analysis show that interindividual differences in the temporally-volatile but not the temporally-persistent component of initial learning explain individual differences in the amount of savings during relearning. This adds to the evidence illustrated in Fig. 3, summarized in Fig. 6e, that temporally-volatile learning displays savings whereas temporally-persistent learning does not.
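The stepwise comparisons above rest on two quantities: the partial R² of an added regressor and the F statistic for the change in explained variance. The sketch below shows one standard way to compute them for two regressors, using the textbook two-predictor R² formula. It is an illustration under assumptions, not the authors' code: the function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r2_two_regressors(y, x1, x2):
    """R² of an ordinary least-squares fit of y on x1 and x2 (with intercept)."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12)


def partial_r2(r2_reduced, r2_full):
    """Fraction of the previously unexplained variance explained by the added regressor."""
    return (r2_full - r2_reduced) / (1 - r2_reduced)


def f_change(r2_reduced, r2_full, n_obs, n_params_full=3):
    """F statistic for adding one regressor (1 and n_obs - n_params_full degrees of freedom)."""
    return (r2_full - r2_reduced) / ((1 - r2_full) / (n_obs - n_params_full))


def added_regressor(y, first, second):
    """Stepwise summary for adding `second` after `first`."""
    reduced = pearson(y, first) ** 2
    full = r2_two_regressors(y, first, second)
    return reduced, full, partial_r2(reduced, full), f_change(reduced, full, len(y))


# Toy data (made up): persistent and volatile learning are negatively related.
persistent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
volatile = [8.0, 6.0, 7.0, 5.0, 4.0, 2.0, 3.0, 1.0]
retention = [1.2, 1.9, 3.3, 3.8, 5.1, 6.2, 6.8, 8.1]
print(added_regressor(retention, persistent, volatile))
print(added_regressor(retention, volatile, persistent))
```

As a consistency check on the definition, `partial_r2(0.498, 0.517)` evaluates to about 0.038, in line with the 3.8% reported for adding temporally-volatile learning after temporally-persistent learning.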

Here we compared the mechanisms responsible for savings and long-term memory in human motor learning, finding that temporally-volatile adaptation leads to savings and that temporally-persistent adaptation leads to long-term memory. When we dissected adaptive responses into temporally-persistent and temporally-volatile components using 60-second breaks (Fig. 1), we found that temporally-persistent memories washed out 4-20x more slowly than temporally-volatile memories (Fig. 2), leaving a considerable temporally-persistent residual even after 100 washout trials (see Fig. 2

We therefore controlled for the effect of residual temporally-persistent adaptation, either experimentally with an extended 800-trial washout period which could eliminate it, or analytically with appropriate baseline subtraction and normalization, which allowed us to accurately assess savings. With this control, we consistently found significantly greater savings for temporally-volatile than temporally-persistent adaptation (Figs. 3, 6). In fact, whereas temporally-volatile adaptation showed savings by displaying relearning that was consistently faster than initial learning, temporally-persistent adaptation remarkably showed anti-savings, displaying relearning that was significantly slower than initial learning both in the overall data and specifically in the 800-trial washout condition in which complete washout of both persistent and volatile learning occurred, allowing savings to be most cleanly measured. Temporally-persistent learning was also nominally slower, though not significantly so, in the 40-trial washout data. Remarkably, we found that savings in temporally-volatile adaptation was sufficiently large to overcome the anti-savings in persistent adaptation, and still confer robust savings on overall adaptation.
Moreover, we found that the temporally-volatile savings we observed were due to implicit rather than explicit learning (Fig. 4). Our data thus suggest that savings in overall adaptation is derived from implicit, temporally-volatile adaptation, representing an increased propensity to more rapidly form a temporally-volatile memory, rather than the reemergence of a temporally-persistent memory.

When we dissected adaptation into volatile and persistent components to examine the mechanisms for long-term memory, we found a strong positive relationship between 24-hour retention and temporally-persistent but not temporally-volatile adaptation (Figs. 5, 6). Together, our findings for savings and 24-hour retention delineate a powerful double dissociation whereby temporally-persistent learning leads to long-term memory but not savings, and temporally-volatile learning leads to savings but not long-term memory.

Juxtaposition between savings in temporally-volatile and temporally-persistent adaptation

Our findings provide a resolution to the apparent discrepancy between previous studies which isolated implicit savings in adaptation, yet found either anti-savings [32] or savings [20]. The paradigm used in the former study likely promoted temporally-persistent implicit adaptation, which here we find to display anti-savings, because compared to the current experiments, which employed only one movement direction, the multiplicity of movement directions used would dramatically increase the temporal spacing between same-direction movements, limiting temporally-volatile adaptation, though the extent of this effect is unfortunately difficult to definitively evaluate because the inter-trial time intervals were not explicitly reported in the study. In contrast, the quickly-paced paradigm used in the latter study would permit greater accumulation of temporally-volatile adaptation, which here we find to display enough savings to overcome anti-savings in temporally-persistent adaptation. Other factors may also have affected the balance between temporally-volatile and temporally-persistent components to drive the savings vs. anti-savings observed in these two studies.

This effect, whereby temporally-volatile savings would be reduced when a larger number of target directions are present in the experiment design because same-direction inter-trial time intervals would increase and force temporally-volatile adaptation to decay to a greater extent, would also predict reduced savings even in studies that did not isolate implicit adaptation. This prediction is indeed borne out in previous work, with studies using 4-8-target paradigms finding either less pronounced savings [13,19]

Incomplete washout can contaminate the assessment of savings

Our experiments revealed that temporally-persistent adaptation requires a surprisingly long period to wash out: well above 100 trials, and much longer than overall adaptation, which combines volatile and persistent components and is effectively washed out in just 20-40 trials (Fig. 2). This occurs because a negative temporally-volatile adaptation acts to mask the enduring temporally-persistent component after 20-40 washout trials. Therefore, if the washout of temporally-persistent adaptation is not specifically measured, it is easy to get the false impression that a short block of 20-40 trials is sufficient to wash out all adaptation so that true savings, which refers to relearning after complete washout of all adaptation, can be cleanly measured.
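The masking described above can be illustrated qualitatively with a generic two-state learning model. The sketch assumes that a fast state stands in for temporally-volatile adaptation and a slow state for temporally-persistent adaptation. The retention and learning-rate values are illustrative only and are not fitted to the data, so the magnitudes should not be compared with Fig. 2.

```python
def simulate_two_state(perturbation, a_fast=0.59, b_fast=0.21,
                       a_slow=0.992, b_slow=0.02):
    """Simulate a two-state model; return (fast, slow, net) after each trial.

    Each state retains a fraction `a` of itself and learns a fraction `b`
    of the current error. All parameter values here are illustrative.
    """
    fast, slow = 0.0, 0.0
    states = []
    for p in perturbation:
        error = p - (fast + slow)  # error experienced on this trial
        fast = a_fast * fast + b_fast * error
        slow = a_slow * slow + b_slow * error
        states.append((fast, slow, fast + slow))
    return states


# 80 training trials with a 30-degree rotation, then 40 washout trials.
schedule = [30.0] * 80 + [0.0] * 40
states = simulate_two_state(schedule)
fast_end, slow_end, net_end = states[-1]
print(f"end of washout: fast={fast_end:.2f}, slow={slow_end:.2f}, net={net_end:.2f}")
```

In this sketch, the fast state is driven negative during washout, so the net output at the end of the 40-trial block understates the slow state that remains. The direction of the effect matches the masking argument, while its size depends entirely on the assumed parameters.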
the complete washout of all components of adaptation so that the savings they found could not be explained by apparent savings due to the unmasking of a slowly decaying learning process, used only 40 trials, meaning that the savings they observed almost certainly included apparent savings from incomplete washout of temporally-persistent adaptation. The current results from the 800-trial washout data, where temporally-persistent adaptation was entirely eliminated, do show clear evidence for the true savings that Zarahn et al. hypothesized, but it is unfortunately impossible to know just how much of the savings they reported was due to incomplete washout because of paradigmatic differences that may have increased or decreased the degree to which the washout was incomplete.

Our findings dissociating savings from long-term memory, and demonstrating savings to be driven by temporally-volatile, rather than temporally-persistent memories, upend the widespread view that the faster relearning that characterizes savings results from the recall of a previously consolidated, stable motor memory [8,9,14,16,17,19,30,64,65]. Whereas savings has been taken as a litmus test for the consolidation of motor memory 1, 2 or even 7 days after training [13,23], here we find that savings is primarily driven by the temporally-volatile component of adaptation, which decays over the course of one minute or less. These temporally-volatile savings could not arise from a consolidated memory of previous adaptation, as any memory solid enough to survive the hour-long, 800-trial washout period in our experiments would certainly be stable against the passage of time during the minute-long breaks used in our experiments to define temporal lability. Our findings thus indicate that savings is not driven by a consolidated memory of the adapted motor output, in line with the double dissociation between savings and long-term memory that we demonstrate.

In contrast, the savings we observe is driven by an increased learning rate for adaptive but temporally-volatile changes in motor output from one trial to the next when experiencing the same perturbation again after washout. This ability is very different from the usual conception of a consolidated motor memory in which the trained actions themselves are remembered. Thus, our findings are consistent with a model in which savings is based on an ability to improve the adaptation of actions, rather than a memory for the actions themselves.

Mechanisms for learning rate modulation in temporally-volatile adaptation

confidence) of slow learning on day 1 is retained after 24 hours, mirroring our Experiment 4 finding that 46±9% (95% confidence) of persistent learning on day 1 is retained after 24 hours. Moreover, the trial-to-trial learning characteristics of the fast and slow processes mirror the ones for volatile and persistent adaptation, respectively. In particular, slow adaptation displays slower learning and better retention than fast adaptation, just as temporally-persistent adaptation displays slower learning and better retention than temporally-volatile adaptation (see Fig. 3c,d and Fig. 2a, respectively). These parallels challenge the idea that the fast process corresponds to explicit adaptation [68], as they argue that the fast process corresponds to temporally-volatile adaptation, which here we find to be implicit. Taken together, these observations suggest that temporally-volatile adaptation is captured by the fast process of the two-state model, and temporally-persistent adaptation is captured by the slow process.

Experiments 1 and 2 consisted of reaches towards a 90° target direction (in the midline, directly away from the body). After a 220-trial baseline block with no visual rotation, subjects in Experiment 1 (N=20) entered the main part of the session, which contained three 80-trial training periods. During training, a 30° visuomotor rotation (VMR) was imposed about the starting position. The sign of this VMR was the same for all training periods for each subject, with half the subjects training with a clockwise VMR and the other half training with a counter-clockwise VMR. The first and second training periods were separated by a 40-trial washout period, whereas the second and third training periods were separated by an 800-trial washout period. The training schedule in Experiment 2 (N=20) was the same except that the 800-trial washout period came first (between the first and second training periods, see Fig. 1).

Experiment 3 (N=41) was similar to Experiment 2 in that it contained two 80-trial training periods separated by an 800-trial washout period (but not a third training period).
It was designed to examine whether temporally-volatile savings like the ones observed in Experiments 1 and 2 were due to an implicit or explicit process.

To dissect savings into implicit and explicit components, we used special instruction trials which prompted participants to disengage any explicit strategy by aiming their hand directly to the target. This method, also referred to as exclusion (since participants are to exclude strategies from their reach) [69], has been, in various forms, widely used to dissect implicit and explicit visuomotor adaptation [31,53-58]. Specifically, instructions were given to either move to the center of the target or to its near/far end (both of which would not alter the reaching angle) and were presented immediately before and after the first (trial 10) 60s time delay within both visuomotor rotation training episodes (initial learning and relearning).

This enabled us to directly assess overall implicit adaptation (the amount of adaptation on the first instruction trial) and implicit-persistent adaptation (the amount of adaptation in the second instruction trial, which followed a 60s delay); and, by comparing these two, this enabled us to assess implicit-volatile adaptation. Moreover, by comparing adaptation in the second instruction trial to the no-instruction trial following it, we assessed explicit-persistent adaptation; and, by estimating overall adaptation as the average adaptation two trials before and after all these delay/instruction trials, we obtained estimates of overall explicit, volatile, and persistent adaptation (Figure 4B).

To minimize delays in reaction time, which would increase the inter-trial time interval and lead to further reduction in temporally-volatile adaptation, participants were presented with an "upcoming instruction" sound during the trial preceding the instruction. To familiarize participants with instruction trials (and the preceding

The aim of Experiment 4 (N=25) was to examine the formation of long-term memories of VMR adaptation.
The experiment began with a baseline period with no VMR that consisted of 456 trials, spread evenly across 19 target directions. After this baseline, subjects were trained on a 30° VMR for 120 reaches to a target placed at 90° (in the midline, directly away from the body, see Fig. 4a). The direction of the 30° visual rotation was approximately balanced, with 13 subjects trained with a counter-clockwise VMR and 12 subjects with a clockwise VMR. This was followed by a testing block with 3 reaches towards each of the 19 targets. During this block, visual feedback was withheld so that repeated measurements could be made without being contaminated by additional training that visual feedback could elicit. We used the first movement towards the training direction to measure temporally-persistent adaptation. After this testing block, subjects were retrained on the 30° VMR for an additional 60 trials and were then tested again without visual feedback to measure temporally-persistent adaptation as described above. Participants returned the following day to be tested for 24-hour retention without visual feedback.

Sample size determination
While sample sizes for experiment groups in analogous studies typically range between 8 and 12, here we used somewhat larger sample sizes (N = 20, 20, 41 and 25 for Experiments 1, 2, 3 and 4, respectively). For Experiments 1 and 2, we examined a larger number of participants so that we could rigorously assess not only whether savings is present for temporally-persistent and temporally-volatile adaptation, but also the time course of savings for these two adaptation components at multiple points during training, as well as whether there are any subtle differences in savings or the extent of washout following the 40-trial vs. the 800-trial washout periods.
The larger sample sizes in Experiments 1 and 2 also enabled more precise comparisons between the time course of washout for temporally-persistent and temporally-volatile adaptation, as the time constant estimates for these washout curves can be especially susceptible to noise in the data. In Experiment 3, we doubled the sample size relative to Experiment 2, given that Experiment 3 involved dissection of adaptation into four (explicit-persistent, explicit-volatile, implicit-persistent, implicit-volatile), rather than two, components. In Experiment 4, we examined N=25 participants because we wanted to examine not just the group-average amount of 24-hour retention, but also how inter-individual differences in 24-hour retention on day 2 related to inter-individual differences in temporally-persistent and temporally-volatile adaptation on day 1 (Fig. 4b,c).

… temporally-volatile savings to overall savings: in these cases we used a bootstrapping procedure (see below) instead of comparing fits to individual subject data, because the high noise in these individual data leads to low confidence about the corresponding individual parameters.

Data inclusion criteria
We performed outlier rejection on the learning curves of each experiment. Specifically, for each trial, we excluded adaptation levels that were more than 3 inter-quartile ranges (IQRs) away from the subject median. This resulted in the inclusion of 99.4% of trials.
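The outlier criterion described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that, for a given trial, each value is compared against the median of the values supplied, and the function name `reject_outliers` and the quartile convention (Python's default "exclusive" method) are choices made here for illustration only.

```python
import statistics

def reject_outliers(values, n_iqr=3.0):
    """Keep values lying within n_iqr inter-quartile ranges of the median.

    `values` is assumed to hold the adaptation levels (in degrees) being
    compared on one trial; the quartile method is an assumption.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    med = statistics.median(values)
    return [v for v in values if abs(v - med) <= n_iqr * iqr]

if __name__ == "__main__":
    # Hypothetical adaptation levels; 90 lies far outside 3 IQRs of the median.
    print(reject_outliers([10, 11, 12, 13, 14, 15, 16, 90]))
```

With a 3-IQR threshold, only extreme values are dropped, which fits the high inclusion rate reported above.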

To assess the amount of adaptation to the trained VMR, we measured the direction of hand motion on each trial. In movements with visual feedback, this was defined as the direction of the vector between the hand position at movement onset (based on a 6.4 cm/s velocity threshold) and the hand position 150ms later. We used 150ms to measure feedforward adaptation, as feedback corrections should be minimal at this point. In movements with no visual feedback, this was defined as the direction of the vector between the hand position at movement onset and the movement endpoint. To examine learning-related changes in performance, we subtracted the small bias present in the baseline (0.13±0.11°) from all the movement-direction data.
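As an illustration of the movement-direction measure for trials with visual feedback, the sketch below finds movement onset with a speed threshold and returns the direction of the displacement 150ms later. The function name, the sample-to-sample speed estimate, and the input layout (time in seconds, positions in cm) are assumptions made for this example rather than details given in the text.

```python
import math

def movement_direction(t, x, y, speed_threshold=6.4, window=0.150):
    """Direction (degrees) of the vector from the hand position at movement
    onset to the hand position `window` seconds later.

    t: sample times in s; x, y: hand positions in cm (hypothetical layout).
    Onset is the first sample whose speed exceeds `speed_threshold` (cm/s).
    """
    onset = None
    for i in range(1, len(t)):
        speed = math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) / (t[i] - t[i - 1])
        if speed > speed_threshold:
            onset = i
            break
    if onset is None:
        raise ValueError("no movement onset found")
    target_time = t[onset] + window
    # First sample at or after onset + window (small tolerance for rounding).
    later = next((j for j in range(onset, len(t)) if t[j] >= target_time - 1e-9), None)
    if later is None:
        raise ValueError("recording ends before onset + window")
    return math.degrees(math.atan2(y[later] - y[onset], x[later] - x[onset]))
```

For no-feedback trials, the same function could be adapted by replacing the 150ms sample with the movement endpoint.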

In Experiments 1, 2 and 3, we measured temporally-persistent adaptation using one-minute breaks interspersed with training. Because the temporally-volatile component of motor adaptation decays with a time constant of 15-20 seconds [42,44], the one-minute breaks we impose here amount to 3-4τ, and thus lead to a 95-98% decay in temporally-volatile adaptation, effectively isolating the temporally-persistent component of adaptation. In contrast, the trial-to-trial decay in temporally-volatile adaptation for non-break trials would be much lower, as the experiments were fast-paced with a median inter-trial time interval of 2.5-2.7s, amounting to 0.1-0.2τ and thus leading to only 10-15% decay. We thus used the amount of adaptation on the trial immediately following such a break as a measure of the temporally-persistent adaptation on that trial. These timed one-minute breaks occurred every 30 trials during the VMR training blocks (on trials 10, 40, and 70 after the onset of each 80-trial training episode), and in 40-trial intervals during the long washout period, as shown in Fig. 1b,c. During these breaks, subjects held the handle still on the starting position.

In addition to these timed one-minute breaks, Experiments 1 and 2 contained a number of additional rest breaks which allowed subjects to put the handle aside and were not strictly timed. These breaks occurred only during baseline or washout periods, as shown in Fig. 1c. We used the amount of adaptation after these breaks as a measure of temporally-persistent adaptation on the corresponding trials as above, but only in cases where these breaks resulted in inter-trial intervals greater than 40s (65.8% of these breaks).
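The decay percentages quoted for the one-minute breaks follow from simple exponential decay, in which the fraction lost after time t is 1 − exp(−t/τ). The short sketch below reproduces that arithmetic; the function name is hypothetical, and the pure single-exponential form is the assumption implied by describing the decay with one time constant.

```python
import math

def fraction_decayed(elapsed_s, tau_s):
    """Fraction of temporally-volatile adaptation lost after `elapsed_s`
    seconds, assuming single-exponential decay with time constant `tau_s`."""
    return 1.0 - math.exp(-elapsed_s / tau_s)

if __name__ == "__main__":
    for tau in (15.0, 20.0):
        print(f"tau={tau:.0f}s: 60s break -> {100 * fraction_decayed(60.0, tau):.1f}% decay")
    # Fast-paced trials: shortest interval with longest tau, and vice versa.
    print(f"{100 * fraction_decayed(2.5, 20.0):.1f}% to {100 * fraction_decayed(2.7, 15.0):.1f}% per trial")
```

A 60s break gives about 95% decay for τ = 20s and about 98% for τ = 15s, matching the range above. For inter-trial intervals of 2.5-2.7s, the same formula gives roughly 12-16% across the extremes, close to the approximate range quoted above.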
Finally, we measured volatile adaptation for the same trials by first estimating the corresponding amount of overall adaptation around each persistent adaptation measurement, calculated as the average adaptation two trials before and two trials after the post-break trial on which persistent adaptation was measured. We then calculated temporally-volatile adaptation as the difference between these overall and temporally-persistent adaptation measurements.

In Experiment 4, temporally-persistent adaptation was measured during the no-feedback testing blocks that followed rest breaks (average break duration: 124±9s, minimum 48s) so that temporally-persistent adaptation could be assessed using the three reaches in each block that were towards the training target. We estimated volatile adaptation as the difference between persistent adaptation and overall adaptation. The latter was assessed as the average adaptation during the last 20 trials of the training and retraining blocks. Finally, we calculated 24-hour retention based on the no-feedback data from the testing block on day 2 (Fig. 5a).

Estimation of washout time constants
The washout of overall adaptation proceeded in two timescales: a very rapid initial washout phase during the first 2-3 washout trials, during which adaptation levels went from about 27° to about 11°, and then a slower washout phase that is illustrated in Fig. 1c. To compare the washout time constants for temporally-persistent and overall adaptation (Fig. 2a), we restricted our overall washout analysis to the period beginning at trial 3 of washout, i.e. the slower washout phase, because no temporally-persistent measurements were available during the very fast initial phase.
To estimate the values and confidence intervals associated with the time constants for washout, τ, we utilized a bootstrapping procedure [70]. Specifically, for each one of 10000 bootstrap iterations, we randomly sampled, with replacement, N=20 subjects from each group, and fit their average data with a single-exponential fit (Equation 1).

When analyzing the overall washout curves, we discarded not only the trials after each one-minute or rest break that removed the temporally-volatile component of adaptation in order to measure temporally-persistent learning, but also the three trials immediately thereafter, during which temporally-volatile adaptation might not be fully reequilibrated.
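The bootstrapping procedure described above can be sketched as follows. Several details are assumptions made for illustration, since the text does not specify them: the exponential is taken to have the form A·exp(−n/τ) with no offset, the fit uses a grid search over candidate τ values with a closed-form amplitude, and the interval is a simple percentile interval. The names `fit_tau` and `bootstrap_tau` are hypothetical.

```python
import math
import random
import statistics

def fit_tau(y, taus):
    """Least-squares fit of y[n] = A * exp(-n / tau) over a grid of tau values
    (assumed functional form; A is solved in closed form for each tau)."""
    best_tau, best_sse = None, float("inf")
    for tau in taus:
        e = [math.exp(-n / tau) for n in range(len(y))]
        amp = sum(yi * ei for yi, ei in zip(y, e)) / sum(ei * ei for ei in e)
        sse = sum((yi - amp * ei) ** 2 for yi, ei in zip(y, e))
        if sse < best_sse:
            best_tau, best_sse = tau, sse
    return best_tau

def bootstrap_tau(curves, taus, n_boot=10000, seed=0):
    """Resample subjects with replacement, fit the average curve each time,
    and return the median tau with a 95% percentile interval."""
    rng = random.Random(seed)
    n_sub, n_trials = len(curves), len(curves[0])
    estimates = []
    for _ in range(n_boot):
        sample = [curves[rng.randrange(n_sub)] for _ in range(n_sub)]
        mean_curve = [sum(c[k] for c in sample) / n_sub for k in range(n_trials)]
        estimates.append(fit_tau(mean_curve, taus))
    estimates.sort()
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[int(0.975 * n_boot) - 1]
    return statistics.median(estimates), (lo, hi)
```

Here `curves` would hold one washout curve per subject, after discarding the post-break trials as described above. Fitting the resampled group average rather than individual curves sidesteps the noise in single-subject fits.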

To systematically quantify savings, and specifically to take into account the systematically different baselines between post-long-washout vs. post-short-washout relearning, as well as the different baselines between temporally-persistent, temporally-volatile and overall adaptation, we subtracted baseline adaptation and normalized each learning curve by the distance between baseline and the ideal adaptation level of 30 degrees (Equation 2). The baseline level for overall adaptation was defined as the average of the last 5 trials before training onset, whereas the baseline level for persistent adaptation was defined as the average of the last three persistent-adaptation trials before training onset (in the case of baselines for initial training and training after an 800-trial washout) or as the single last persistent-adaptation trial, 10 trials before the onset of training (in the case of baselines for training after a 40-trial washout, since the 40-trial washout contained only a single persistent-adaptation measurement trial).

Throughout the study, we focused on savings around 1-minute delay trials (especially trial 10 after training onset, which captured early adaptation, but also trials 40 and 70), as these were the trials for which all three types of adaptation could be assessed. For the analysis of the across-individual relationships between savings and persistent/volatile adaptation (Fig. 6c,d), however, because the measurement of temporally-volatile adaptation was based on the same measurements as overall adaptation (volatile = overall [adaptation two trials before and after the 60s wait trial] - persistent [adaptation on the 60s wait trial]), we instead calculated overall savings based on trials 2-6 in order to ensure that any observed relationships were not due to measurements shared between the dependent (savings) and independent (temporally-volatile adaptation) variables.
This range was selected because it was both relatively far from the measurements used to calculate temporally-volatile adaptation and better captured the rapid rise of overall adaptation, providing more power to assess inter-individual differences in savings.
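The baseline subtraction and normalization described in this section can be illustrated with a short sketch. It follows the verbal description (subtract baseline, divide by the distance between baseline and the ideal 30° level). The definition of savings as the difference between normalized relearning and normalized initial learning is an assumption made here, and the function names are hypothetical.

```python
def normalize_adaptation(adaptation_deg, baseline_deg, ideal_deg=30.0):
    """Baseline-subtracted adaptation as a fraction of the distance between
    baseline and the ideal adaptation level."""
    return (adaptation_deg - baseline_deg) / (ideal_deg - baseline_deg)

def savings(relearn_deg, relearn_baseline_deg, initial_deg, initial_baseline_deg):
    """Assumed savings measure: normalized relearning minus normalized
    initial learning, each referenced to its own baseline."""
    return (normalize_adaptation(relearn_deg, relearn_baseline_deg)
            - normalize_adaptation(initial_deg, initial_baseline_deg))

if __name__ == "__main__":
    # Hypothetical values: relearning starts from a 3-degree residual baseline.
    print(savings(relearn_deg=18.0, relearn_baseline_deg=3.0,
                  initial_deg=12.0, initial_baseline_deg=0.0))
```

Normalizing each curve by its own baseline keeps a nonzero starting point, such as residual adaptation after a short washout, from being counted as faster relearning.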

To examine contributions of temporally-persistent or temporally-volatile adaptation to savings and long-term memory (Fig. 6a-d), we used linear regression with slopes restricted to positive values to model positive contributions of these components of adaptation to either savings or long-term retention. Specifically, for studying long-term memory, we compared temporally-persistent and temporally-volatile adaptation on day 1 in Experiment 4 against 24-hour retention on day 2, whereas, for studying savings, we compared temporally-persistent and temporally-volatile adaptation from trial 10 in the retraining blocks in Experiments 1 and 2 against overall savings calculated as in the preceding paragraph.

During baseline (left), the cursor follows the hand motion, whereas during training cursor motion is skewed by 30° from the hand motion (in this example, counter-clockwise), resulting in a 30° error before adaptation (middle). If full adaptation is achieved, hand motion must completely counter the imposed rotation, corresponding to a 30° clockwise hand motion in this example (right). (c) Top: experiment schedule and raw data for Experiment 1. There were three phases: a baseline period followed by the initial 80-trial VMR training (average adaptation level shown in gray); a short, 40-trial washout period followed by retraining (green); and a long, 800-trial washout period followed by another 80-trial retraining session (blue). Red dashed vertical lines indicate trials conducted after 60-second breaks to isolate temporally-persistent adaptation. Brown dashed vertical lines indicate trials following rest breaks. Note that, during the washout periods, adaptation peaks on these break trials, illustrating a slower washout for temporally-persistent vs. temporally-volatile adaptation. Bottom: same but for Experiment 2, where the long, 800-trial washout period came first. Errorbars indicate SEM.

Temporally-persistent adaptation washes out more slowly than overall adaptation. (a) Washout curves for the overall adaptation (shading indicates mean±SEM) and temporally-persistent adaptation (circles) for both Experiment 1 (blue) and Experiment 2 (light blue), illustrating the contrast between rapid washout for overall adaptation and slower washout for temporally-persistent adaptation. The thick dashed or dotted lines indicate exponential fits. Inset: Three different measures of washout for overall vs. persistent adaptation (combined data from Experiments 1 and 2). Retention expressed as a percentage of asymptote adaptation after 16-25 trials (left) or 51-150 trials (center) indicates slower washout for temporally-persistent compared to overall adaptation. Time constants for the washout curves (right) also show slower washout for temporally-persistent adaptation. Errorbars indicate SEM, except for the right panel of the inset, in which they indicate median and interquartile range of bootstrap estimates. (b) Residual adaptation before initial training (gray, left bar in each cluster), at the end of the 40-trial washout period (green, middle bar) and at the end of the 800-trial washout period (blue, right bar). The data show that the 40-trial washout period leaves a significant amount of temporally-persistent adaptation, and a smaller but also significant amount of overall adaptation. Consequently, retraining after only 40 washout trials starts from a nonzero baseline. Errorbars indicate SEM. ***p<0.001.

Detail from panel a illustrating the measurements and calculations involved in estimating overall, overall implicit, implicit-persistent, implicit-volatile and explicit-persistent components. Instructions one trial before and one trial after the 1-minute wait allow the direct measurement of overall implicit and implicit-persistent components, respectively, in turn allowing the calculation of implicit-volatile as the difference between the two. The following trial is a non-instruction trial, allowing the direct measurement of explicit-persistent adaptation. (c) Savings in overall, temporally-persistent, and temporally-volatile adaptation in Experiment 3, both for combined implicit and explicit adaptation (left column) and broken into implicit and explicit components. In line with Experiments 1 and 2, data show overall and volatile, but not persistent, savings for combined (implicit + explicit) adaptation. Further dissociation into implicit and explicit components reveals that this savings is due to implicit, temporally-volatile adaptation. Errorbars indicate SEM. * p<0.05; ** p<0.01.

Figure 5. Temporally-persistent adaptation leads to long-term memory. (a) Experiment schedule and raw data for Experiment 4. After a baseline period, subjects were trained with a 30° VMR for 120 trials, and were then tested for temporally-persistent adaptation, retrained for 60 trials, and then retested. Subjects returned the following day when they were tested for 24-hour retention (orange circle). Note that 24-hour retention is lower than overall or temporally-persistent adaptation but higher than zero. (b) Comparison of overall, temporally-persistent, and temporally-volatile adaptation from Experiments 1, 2 and 4 with the 24-hour retention from Experiment 4. Experiments 1, 2 and 4 display similar levels of persistent and volatile adaptation.

Figure 6
A double dissociation between savings and long-term memory, uncovered by dissection of learning into temporally-persistent and temporally-volatile components. (a) Illustration of the partial regression between temporally-persistent adaptation on day 1 (shown on x-axis) and 24-hour retention across individuals (N=25). The y-axis represents residuals of the univariate regression of 24-hour retention upon temporally-volatile adaptation. The positive relationship indicates that higher temporally-persistent adaptation explains higher long-term memory. The solid line indicates linear fit. (b) Same as a but for temporally-volatile adaptation, showing no significant relationship (slopes restricted to positive values). (c) Illustration of the partial regression between temporally-persistent adaptation during relearning (shown on x-axis) and savings for early (trials 2-6) training, combined across Experiments 1 and 2. The y-axis represents residuals of the univariate regression of savings upon temporally-volatile adaptation. There is no significant relationship. (d) Same as c but for temporally-volatile adaptation, showing a significant positive relationship and thus indicating that temporally-volatile adaptation explains overall savings. Note that mean y-axis values in (a)-(d) are zero, since they represent residuals of linear regression; these values do not reflect the actual amounts of long-term retention or savings, which are, on average, significantly above zero.
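The regression with slopes restricted to positive values can be illustrated with a two-predictor sketch. This is one possible implementation under stated assumptions, not the authors' code: the intercept is left free, slopes are constrained to be non-negative (so a slope may be exactly zero), and the solution is found by checking the unconstrained fit and then each boundary case. The function name is hypothetical.

```python
def nonneg_two_predictor_fit(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 with b1, b2 >= 0.

    Returns (b0, b1, b2). With two predictors the constrained optimum is
    either the unconstrained fit (if feasible) or lies on a boundary where
    at least one slope is zero, so all candidates can be enumerated.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))

    def sse_change(b1, b2):
        # Sum of squared errors relative to the intercept-only model.
        return (-2 * b1 * s1y - 2 * b2 * s2y
                + b1 * b1 * s11 + 2 * b1 * b2 * s12 + b2 * b2 * s22)

    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    if b1 < 0 or b2 < 0:
        candidates = [(max(0.0, s1y / s11), 0.0), (0.0, max(0.0, s2y / s22)), (0.0, 0.0)]
        b1, b2 = min(candidates, key=lambda b: sse_change(*b))
    return my - b1 * m1 - b2 * m2, b1, b2
```

In the setting above, `x1` and `x2` would be the temporally-persistent and temporally-volatile adaptation of each participant, and `y` either 24-hour retention or savings. When the unconstrained slope of one component is negative, that slope is fixed at zero and the other component is refit alone.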
TV adaptation shows savings, but TP adaptation does not