ABSTRACT
Across the domains of spatial navigation and episodic memory, the hippocampus is thought to play a critical role in forming distinct representations of overlapping events. However, it is less clear how and when hippocampal representations of overlapping events become distinct. Here, using spatiotemporal pattern analysis of human fMRI data, we measured learning-related changes in hippocampal representations of overlapping real-world routes. We found that learning resulted in highly targeted representational changes within the hippocampus that specifically reduced similarity among overlapping routes. Strikingly—and in contrast to representations in other navigation-related brain regions—hippocampal representations of overlapping routes diverged to the point where overlapping routes became less similar than non-overlapping routes. This learning-related plasticity preferentially occurred for hippocampal voxels that were initially shared across overlapping routes. Collectively, these findings indicate that overlap among event representations triggers a divergence of hippocampal activity patterns that dramatically reshapes representational structure.
INTRODUCTION
Distinct experiences often contain overlapping elements, creating the potential for memory interference. For example, a single location (e.g., a living room) may be the site of many different experiences. The hippocampus is widely thought to play a critical role in coding overlapping events such that interference is minimized. Compelling evidence for this function comes from intracranial recordings in rodents during spatial navigation. For example, when rodents alternate between left- and right-hand turns in a T-maze, cells within the hippocampus differentially fire during the central stem (the overlapping path), according to whether the current route is a ‘right-turn’ or ‘left-turn’ route1,2. Similarly, human fMRI evidence indicates that elements which are shared across multiple event sequences are distinctly coded in the hippocampus according to the specific sequence to which they belong 3. While these studies and others have led to general agreement that the hippocampus forms distinct codes for overlapping experiences4-14, it is less clear how and when hippocampal codes become distinct.
One way the hippocampus may minimize interference between overlapping events is by orthogonalizing, or pattern separating, neural activity patterns15-23. For example, if two spatially-overlapping routes are orthogonally coded, their neural representations should be no more similar than those corresponding to nonoverlapping routes. A more extreme, and seemingly paradoxical, outcome is that hippocampal representations of overlapping events may be even less similar than representations of non-overlapping events. Why might this occur? When events are highly overlapping, this may create competition between event representations that the hippocampus solves in a targeted way by pushing representations apart from one another. By analogy, when two children are bickering, a parent’s reaction may be to move the children to opposite corners of the room. Thus, it may be that initial overlap among hippocampal representations triggers an ‘over-compensation’ wherein overlapping events actually become less similar than non-overlapping events4. The idea that overlap triggers a repulsion of event representations has been referred to as ‘differentiation’ and is supported and inspired by computational models24,25 and a limited number of empirical observations 4, 26–29.
Importantly, hippocampal differentiation is thought to occur gradually, as a result of repeatedly experiencing or remembering an event 26. This can be contrasted with the idea of pattern separation, where sparse coding in the hippocampus gives rise to an immediate and automatic orthogonalization of activity patterns 30. Moreover, because differentiation is thought to be triggered by competition, neural activity patterns should change in a particular way. Namely, differentiation should preferentially occur for features within activity patterns that are initially shared across event representations. Indeed, differentiation can be thought of as an unsharing of initially-shared features.
Here, we used a real-world, human spatial-learning task inspired by canonical rodent T-Maze paradigms to test for hippocampal differentiation of overlapping routes. Using spatiotemporal analysis of fMRI data, we tracked learning-related changes in the similarity of hippocampal representations of overlapping and nonoverlapping routes. We predicted route overlap would trigger a repulsion of hippocampal representations, with similarity among overlapping routes dropping below similarity among non-overlapping routes4. Critically, while such a result would be readily explained by the differentiation account, it would not be explained by a pattern separation account. Additionally, to address the specific way in which hippocampal activity patterns changed with learning, we tested whether learning-related differentiation was related to the degree to which individual hippocampal voxels were initially shared across overlapping events.
RESULTS
Behavioral measures of route discrimination
In an initial behavioral experiment subjects studied sets of real-world routes that traversed the New York University campus. For each subject, the set of routes included (a) pairs that shared a common path before diverging to terminate at distinct destinations (‘overlapping routes’) and (b) pairs with no paths in common (‘non-overlapping routes’) (Fig. 1a). Importantly, each route contributed to both conditions. For example, ‘route 1’ and ‘route 2’ were overlapping routes, but ‘route 1’ and ‘route 3’ were non-overlapping routes (Fig. 1c). Each route contained an initial segment that was shared with another route (Segment 1), and a later segment, including the destination, that was route-specific (Segment 2; Fig. 1a). Although the real-world spatial locations of the overlapping segments were identical, the pictures for each route were taken at different times and therefore differed subtly in terms of pedestrians, vehicles, etc. (Fig. 1c and Supplementary Videos 1-8). Routes were studied twice per round for 14 rounds. Subjects were instructed to learn each route (i.e., the specific path to each destination) but were not told the destination at the start of the route. After each study round, subjects were shown individual pictures drawn from the routes and were asked to select the destination associated with each picture. Of central interest was accuracy for pictures drawn from Segment 1 of each route because selecting the correct destination for these pictures required discriminating between overlapping routes. Overall, subjects selected the correct destination (‘target’) at a higher rate than the destination of the overlapping route (‘competitor’) (F1,21 = 43.31, P = 0.000002; Fig. 2a). The relative rate of target vs. competitor responses also markedly increased across learning rounds (F1,21 = 38.11, P = 0.000004; Fig. 2b,c).
Hippocampal representations of overlapping routes diverge with learning
We next tested for hippocampal differentiation of overlapping routes in two fMRI studies. The first fMRI study used the same stimuli as the behavioral study (Fig. 1a). The second fMRI study used a new set of stimuli that again included overlapping and non-overlapping routes, but some of the non-overlapping routes terminated at a common destination (Fig. 1b). Unless otherwise noted, all analyses below combine data across experiments and all comparisons of non-overlapping routes are restricted to those that terminated at distinct destinations.
For Segment 1 of each route, we obtained a corresponding neural activity pattern by extracting voxel-wise patterns of activity as they unfolded over time. These spatiotemporal activity patterns were then correlated for every pair of routes, resulting in a correlation matrix reflecting pairwise route similarity (Fig. 3a). We considered pattern similarity for (1) repetitions of the same route, (2) overlapping routes, and (3) non-overlapping routes. Separate correlation matrices were generated for each subject’s hippocampus and for a control region: the ‘parahippocampal place area’ (PPA), which is adjacent to the hippocampus and is involved in scene processing and navigation31,32 (Fig. 3b). Because our behavioral experiment indicated that discrimination of overlapping routes robustly improved from the 1st to 2nd half of learning (Fig. 2c), we divided the fMRI data into halves and independently computed pattern similarity measures within each of these halves. As in the behavioral experiment, subjects in both fMRI experiments were able to successfully discriminate between the overlapping routes by the end of learning (Supplementary Fig. 1).
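The pairwise comparison described above can be illustrated with a minimal Python sketch. It is offered under stated assumptions and is not the analysis code used in the study: the function names (`pearson`, `flatten`, `mean_similarity`) and the dictionary layout are hypothetical, each route is assumed to be stored as a list of voxel timecourses from Segment 1, and any preprocessing or transformation of the correlations described in Methods is omitted.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def flatten(pattern):
    """Concatenate a voxels-by-timepoints pattern into one spatiotemporal vector."""
    return [value for voxel in pattern for value in voxel]


def mean_similarity(patterns, overlap_pairs):
    """Average pairwise correlation, split by route overlap.

    patterns: dict mapping a route id to its list of voxel timecourses
              (hypothetical layout, Segment 1 data only).
    overlap_pairs: set of frozensets, each naming one overlapping route pair.
    """
    values = {"overlapping": [], "non-overlapping": []}
    for a, b in combinations(sorted(patterns), 2):
        r = pearson(flatten(patterns[a]), flatten(patterns[b]))
        key = "overlapping" if frozenset((a, b)) in overlap_pairs else "non-overlapping"
        values[key].append(r)
    return {key: sum(rs) / len(rs) for key, rs in values.items()}
```

Under these assumptions, calling `mean_similarity` once on 1st-half data and once on 2nd-half data would give the four cell means that enter the overlap by learning comparison. Same-route similarity would additionally require separate repetitions of each route and is left out of the sketch.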
Of critical interest, there was a learning-related decrease in pattern similarity among overlapping compared to non-overlapping routes, as reflected by an interaction between overlap (overlapping/non-overlapping) and learning (1st half/2nd half) (F1,39 = 13.163, P = 0.0008; Fig. 3c). Whereas pattern similarity among overlapping routes decreased with learning (F1,39 = 35.21, P = 0.0000006), similarity among non-overlapping routes did not change (F1,39 = 0.24, P = 0.63; Fig. 3e). This dissociation is striking when considering that all routes contributed to both the overlapping and non-overlapping comparisons. Thus, learning did not globally reduce similarity among routes; rather, learning specifically reduced similarity between overlapping routes. Moreover, overlapping route similarity decreased to the point that in the 2nd half of learning overlapping routes were markedly less similar than non-overlapping routes (F1,39 = 14.20, P = 0.0005; Fig. 3f). This result was significant in each of the fMRI experiments (Ps < 0.05; see Supplementary Fig. 2 for results separated by experiment and Supplementary Fig. 3 for results separated by stimulus set). Thus, despite the fact that overlapping routes were spatially and visually more similar than non-overlapping routes, the hippocampus represented overlapping routes as less similar than non-overlapping routes, a result we refer to as a ‘reversal effect’ because it is opposite to the inherent similarity structure. This reversal effect was not present in the 1st half of learning (F1,39 = 1.41, P = 0.24), confirming that it was a result of learning (see Supplementary Fig. 4 for hippocampal pattern similarity computed at a finer time scale). Moreover, when considering data from Segment 2 (i.e., after overlapping routes diverged), the reversal effect was absent (see Supplementary Fig. 5). Thus, hippocampal activity patterns became relatively more similar precisely when the overlapping routes diverged. These data indicate that divergence of hippocampal activity patterns was a direct response to overlap.
Within PPA, there was no learning-related reduction in the similarity of overlapping compared to non-overlapping routes (F1,39 = 2.42, P = 0.13; Fig. 3d). In fact, overlapping route similarity was greater than non-overlapping route similarity in the 1st half (F1,39 = 21.01, P = 0.00005) and 2nd half of learning (F1,39 = 4.63, P = 0.038; Fig. 3f; note: this effect differed across experiments, see Supplementary Fig. 6). Thus, the inverted representational structure that was observed in hippocampus by the end of learning was absent in PPA. The dissociation between representational structure in PPA vs. hippocampus at the end of learning was reflected in a highly significant region x overlap interaction (F1,39 = 22.18, P = 0.00003). A similar dissociation was observed when comparing hippocampus to retrosplenial cortex, another region involved in scene-processing and navigation (see Supplementary Fig. 7).
Although we primarily focus on the comparison of overlapping vs. non-overlapping routes, the comparison of overlapping vs. same routes is also informative. To the extent that overlapping route representations are distinct, overlapping route similarity should be lower than same route similarity. Indeed, within the hippocampus there was a learning-related change such that overlapping route similarity decreased relative to same route similarity (F1,39 = 7.59, P = 0.009). Overlapping route similarity was significantly lower than same route similarity in the 2nd half of learning (F1,39 = 5.61, P = 0.023), but not in the 1st half of learning (F1,39 = 0.85, P = 0.35). Within PPA, there was no learning-related change in overlapping vs. same route similarity (F1,39 = 0.003, P = 0.96). The difference between overlapping and same route similarity was not significant in the 1st half (F1,39 = 0.89, P = 0.35) or 2nd half (F1,39 = 0.86, P = 0.36).
One way in which hippocampal route representations may diverge with learning is through the learned ability to predict the route destinations33-38. To test this possibility we considered data from Experiment 2, which contained pairs of non-overlapping routes that terminated at distinct destinations as well as pairs of non-overlapping routes that terminated at the same destination. If hippocampal activity patterns reflected navigational goals, spatiotemporal pattern similarity from Segment 1 should be greater for non-overlapping routes that terminated at the same destination relative to non-overlapping routes that terminated at distinct destinations. However, we found no evidence for a difference between these conditions (Supplementary Fig. 8). Specifically, there was no learning related increase in spatiotemporal similarity for same destination pairs relative to distinct destination pairs (F1,20 = 0.53, P = 0.47), nor was there a difference between same and distinct destination pairs when considering second-half data alone (t20 = 0.98, P = 0.34). Thus, we did not observe any evidence to suggest that destination coding was contributing to the divergence of hippocampal activity patterns.
Voxel-level plasticity is triggered by pattern overlap
The preceding results indicate that hippocampal representations of overlapping events diverged with learning. But how, exactly, were activity patterns shaped by learning? Putatively, differentiation of event representations is a reaction to competition26,39, with a competing memory subject to plasticity (and corresponding differentiation) to the extent that it becomes active during the processing of a target memory. Interestingly, however, it has been argued that this relationship between activation and differentiation is non-monotonic. Namely, differentiation is most likely to occur when competing memories are ‘moderately activated.’ When a competing memory is moderately activated, it is subject to inhibition that weakens associations with the target memory via the ‘drop out’ of overlapping features. However, when a competing memory is strongly activated, its features resist inhibition and the competing memory remains strongly associated with the target memory. Finally, if a competing memory is only weakly activated, there is little opportunity or need for plasticity to occur. This theory has been referred to as the non-monotonic plasticity hypothesis; it has been elegantly described in computational terms24,25 and can account for several empirical observations that are otherwise difficult to explain4, 26-28, 40, 41. We reasoned that this theory may also explain voxel-level changes across learning that were observed in the present study. Namely, within an individual voxel, we can think of a pair of overlapping routes as corresponding to partially overlapping assemblies of neurons. If the assemblies are highly overlapping, then presentation or processing of one route is likely to activate the overlapping route, via spread of activation.
Although we do not have a way to directly measure the overlap of neural ensembles within a voxel, we reasoned that if a voxel responds similarly to two overlapping routes, then it is likely that the two routes are represented by overlapping ensembles of neurons (within that voxel). In other words, the degree to which a voxel responds similarly to a pair of overlapping routes may provide a clue as to the overlap in the underlying neural representations. If so, then following the non-monotonic plasticity hypothesis, we would expect differentiation to maximally occur for voxels that initially represented overlapping routes as ‘moderately similar’ (Fig. 4a).
To test this prediction, the critical first step was to quantify the similarity with which individual voxels responded to each pair of routes at the beginning of learning. To do so, we correlated the timecourse of each voxel’s activity for each pair of routes, using Segment 1 data only (see Methods). For a given pair of routes, voxels with relatively high timecourse similarity were considered to be voxels that were more strongly ‘shared’ across the routes (Fig. 4b). Timecourse similarity values were first obtained based on data from the 1st half of learning. Voxels were then rank-ordered by 1st-half timecourse similarity and binned into groups corresponding to ‘weak,’ ‘moderate,’ or ‘strong’ values (i.e., the bottom 1/3, middle 1/3, and top 1/3 of similarity values). Importantly, this binning was independently repeated for every pair of routes, each region of interest, and each subject. Thus, a given voxel may have been strongly shared across one pair of routes but weakly shared across a different pair of routes. Timecourse similarity values from the 2nd half of learning were then obtained from these voxel bins, which allowed timecourse similarity values at the end of learning to be considered as a function of timecourse similarity at the beginning of learning. It is important to note that we did not measure changes in time course similarity from the first to second half, so as to avoid problems associated with regression to the mean.
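The binning procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the code used in the study: the function names (`voxel_timecourse_similarity`, `bin_voxels`, `second_half_by_bin`) are hypothetical, each route is assumed to be a list of per-voxel timecourses, and when the voxel count is not divisible by the number of bins the remainder is simply placed in the top bin.

```python
import math


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    return num / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def voxel_timecourse_similarity(route_a, route_b):
    """Correlate each voxel's timecourse for route A with its timecourse for route B."""
    return [pearson(a, b) for a, b in zip(route_a, route_b)]


def bin_voxels(first_half_sim, n_bins=3):
    """Rank voxels by 1st-half similarity and split the ranking into bins.

    Returns a list of voxel-index lists ordered weak -> moderate -> strong
    for the default of three bins.
    """
    order = sorted(range(len(first_half_sim)), key=lambda i: first_half_sim[i])
    size = len(order) // n_bins
    bins = [order[k * size:(k + 1) * size] for k in range(n_bins - 1)]
    bins.append(order[(n_bins - 1) * size:])  # top bin takes any remainder
    return bins


def second_half_by_bin(first_half_sim, second_half_sim, n_bins=3):
    """Mean 2nd-half similarity within bins defined only by 1st-half similarity."""
    return [
        sum(second_half_sim[i] for i in voxels) / len(voxels)
        for voxels in bin_voxels(first_half_sim, n_bins)
    ]
```

Because the bins are defined from 1st-half values alone and the outcome is the raw 2nd-half value (not a difference score), the sketch mirrors the choice described above of not measuring change from the first to the second half. In the study this would be repeated separately for every pair of routes, region of interest, and subject.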
Within the hippocampus, 2nd-half timecourse similarity for overlapping routes significantly varied according to 1st-half similarity (F2,78 = 4.74, P = 0.012), with the lowest levels of 2nd-half similarity occurring for voxels that were ‘moderately’ shared during the 1st half of learning (Fig. 4c). Qualitatively, the relationship strongly matched the predicted non-monotonic pattern (Fig. 4a). For nonoverlapping routes, 2nd-half timecourse similarity did not vary according to 1st half similarity (F2,78 = 0.28, P = 0.76). An ANOVA with factors of overlap (overlapping/non-overlapping) and bin (weak/moderate/strong) revealed a significant overlap x bin interaction (F2,78 = 3.19, P = 0.046), with the difference between overlapping and non-overlapping routes (the reversal effect) highly significant in the ‘moderate’ bin (F1,39 = 19.17, P = 0.00009), marginally significant in the ‘weak’ bin (F1,39 = 3.62, P = 0.064), and not significant in the ‘strong’ bin (F1,39 = 1.53, P = 0.22). Thus, consistent with predictions, voxels that were moderately shared across routes at the beginning of learning exhibited the strongest differentiation. The relationship between 1st and 2nd half timecourse similarity for overlapping routes was markedly different in PPA, as reflected by a significant region (hippocampus/PPA) x bin interaction (F2,78 = 18.12, P = 0.0000003) and a marginally significant region x bin x overlap interaction (F2,78 = 2.95, P = 0.0584). In PPA, patterns remained relatively stable as voxels that were moderately shared in the 1st half of learning remained moderately shared in the 2nd half of learning (Fig. 4d).
To more formally assess the non-monotonic relationship between 1st-half and 2nd-half timecourse similarity, we used a Bayesian curve-fitting algorithm, the Probabilistic Curve Induction and Testing Toolbox (P-CIT)42, that was specifically developed to test for non-monotonic relationships. Relative to quadratic trend analyses, the P-CIT algorithm has important advantages. Namely, the P-CIT algorithm allows for more detailed specification of a predicted curve shape, by explicitly including a set of curve parameters. In our case, the parameters describe the relationship between first-half timecourse similarity (x-axis) and second-half timecourse similarity (y-axis). Consistent with prior usage of the P-CIT algorithm 41, we parameterized the predicted curve shape as one in which the function, when moving from left to right, drops below the initial start value and then rises above the start value. The ‘dip’ in the middle of the curve reflects the prediction that differentiation (in the second half of learning) should be greatest for voxels that were ‘moderately’ shared at the beginning of learning. The rise above the starting point after the ‘dip’ reflects the prediction that voxels that were strongly shared at the beginning of learning will remain (relatively) strongly shared at the end of learning. Using the specified curve parameters, the algorithm estimates a probability distribution over curves, conditional on the observed data. To assess evidence in favor of the predicted curve shape, the algorithm labels each sample curve as theory consistent (in our case, if it drops below the starting value and then rises above the starting value) or inconsistent and then computes a log Bayes factor value that represents the log ratio of evidence in favor of or against the predicted shape. Positive log Bayes factor values indicate greater evidence in favor of the theory.
For this analysis, we re-binned all of the 1st half timecourse similarity values into 60 bins (5 voxels per bin) in order to allow for greater variability in the observed curve shape. This analysis used data aggregated across all subjects 28,41,42.
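The labeling rule stated above (a curve is theory consistent if it drops below its starting value and then rises above it) can be written as a short Python check. This is only an illustration of that verbal criterion, assuming a curve is given as a list of y-values ordered from left to right; the function name `is_theory_consistent` is hypothetical, and the sketch is not the P-CIT toolbox's own implementation, which works with parameterized curves and weights sampled curves when computing the log Bayes factor.

```python
def is_theory_consistent(curve):
    """Return True if the curve dips below its first value and later rises above it.

    curve: y-values ordered left to right (hypothetical representation).
    """
    start = curve[0]
    dipped = False
    for y in curve[1:]:
        if y < start:
            dipped = True          # the 'dip' below the starting value
        elif dipped and y > start:
            return True            # rise above the start, after the dip
    return False
```

A curve that rises first and only then dips, or one that dips without ever recovering above the starting value, is labeled inconsistent under this rule.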
The estimated curve (Fig. 4e) was consistent with the predicted curve shape (log Bayes factor = 1.51) and explained a significant amount of variance in the actual data (χ2 = 11.13, P = 0.0008). We next ran a permutation test to estimate the null distribution of theory consistency values. Out of 500 permutations, only 2.6% yielded log Bayes factor values that matched or exceeded the value obtained from the un-permuted data, indicating that it was unlikely to obtain this level of support for the predicted curve shape by chance (Fig. 4f). Finally, to assess the population-level reliability of the non-monotonic curve we ran a bootstrap resampling test in which we iteratively resampled data from subjects with replacement and then computed the log Bayes factor value for each iteration. Four-hundred and eighty-eight of the 500 bootstrap iterations (97.6%) yielded positive log Bayes factor values. Thus, the curve-fitting analyses provided converging evidence for the predicted non-monotonic relationship between voxel overlap at the beginning vs. end of learning: hippocampal voxels that were ‘moderately shared’ across overlapping routes at the beginning of learning were the ‘least shared’ by the end of learning.
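The two resampling summaries reported above reduce to simple proportions, which the following Python sketch makes explicit. It is a sketch under stated assumptions: the function names are hypothetical, `statistic` stands in for whatever is computed on each resample (in the study, the P-CIT log Bayes factor, which is not reimplemented here), and the null values are assumed to have been produced elsewhere, since the permutation scheme itself is not specified in this section.

```python
import random


def permutation_p(observed, null_values):
    """Fraction of permuted statistics that match or exceed the observed value."""
    return sum(value >= observed for value in null_values) / len(null_values)


def bootstrap_positive_fraction(subject_values, statistic, n_iter=500, seed=0):
    """Resample subjects with replacement and report how often the statistic is positive.

    subject_values: one entry per subject (any object the statistic accepts).
    statistic: callable applied to each resampled list of subjects (placeholder
               for the log Bayes factor computation).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(subject_values)
    positive = 0
    for _ in range(n_iter):
        sample = [subject_values[rng.randrange(n)] for _ in range(n)]
        if statistic(sample) > 0:
            positive += 1
    return positive / n_iter
```

With 500 permutations, the reported 2.6% corresponds to 13 permuted values at or above the observed log Bayes factor, and 488 positive bootstrap iterations out of 500 gives the reported 97.6%.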
DISCUSSION
Here, across two fMRI studies, we found that hippocampal representations of overlapping spatial routes dramatically diverged with learning, to the point that overlapping routes were coded as less similar than non-overlapping routes. This ‘reversal effect’ was clearly a result of learning as it was only evident after subjects gained considerable familiarity with the routes and it paralleled behavioral improvement in memory-based route discrimination. The result was also selective to the hippocampus, with no evidence of a reversal effect in PPA. Finally, using a novel analysis approach, we show that the learning-related divergence of hippocampal activity patterns was most likely to occur for voxels that were moderately shared across overlapping routes at the beginning of learning.
Several aspects of the stimuli we used and the analyses we employed are critical for the interpretation of our results. First, our analysis approach specifically compared representations of overlapping events to representations of non-overlapping events4. This allowed for learning-related changes to be expressed relative to a meaningful baseline. Indeed, the fact that hippocampal representations of visually- and spatially-overlapping routes became less similar than routes that contained no spatial overlap or visual similarity is not only striking, but it provides important insight into the underlying mechanism (a point we detail below). Second, our design did not involve separate sets of routes for the overlapping and non-overlapping comparisons; rather, each route was included in each ‘condition.’ For example, the comparison of routes 1 and 2 was a comparison of overlapping routes, whereas the comparison of routes 1 and 3 was a comparison of non-overlapping routes. As such, any observed differences between overlapping and non-overlapping routes cannot be attributed to differences between the actual stimuli or to differences in attention, familiarity, vigilance, etc. It is also of note that our findings generalized across different sets of stimuli (Experiments 1 and 2). Lastly, for our critical comparison of overlapping vs. non-overlapping routes, we only considered spatiotemporal activity patterns during the overlapping segments of the routes (Segment 1 data), that is, before the overlapping routes diverged. Indeed, once the overlapping routes diverged (Segment 2 data), the hippocampal reversal effect ‘disappeared’ (Supplementary Fig. 5). Thus, hippocampal representations of overlapping routes were most dissimilar when routes actually overlapped, clearly indicating that the reversal effect was triggered by event overlap.
The mechanism that has most frequently been associated with disambiguating hippocampal representations is pattern separation15-23. However, the reversal effect we observed in the hippocampus is not readily explained by pattern separation. Rather, ‘perfectly’ pattern-separated representations of overlapping routes should have similarity values at, but not below, the similarity values for non-overlapping routes. In other words, pattern separation should only reduce global similarity among route representations43. It has also been argued that hippocampal activity patterns for overlapping events become disambiguated as event-specific contextual elements are learned and relational networks are formed3,8. Contextual elements may include event outcomes, locations, rewards, etc. However, because overlapping routes inherently shared more contextual features than non-overlapping routes, the observed reversal effect is difficult to explain based on a simple context-learning account. A related possibility is that hippocampal activity patterns reflected predictions about route destinations33-38. However, we did not observe any evidence in favor of destination coding (Supplementary Fig. 8) and, again, this account does not explain the reversal effect we observed. Rather, associating each route with a unique destination would only reduce global similarity among routes, similar to pattern separation or context learning. Thus, at least in the present study, hippocampal activity patterns appeared to reflect route information other than predicted destinations13.
In contrast to accounts based on pattern separation, context learning, or destination coding, a differentiation account readily explains several critical aspects of our findings. First, according to a differentiation account, learning should result in highly-targeted representational changes where overlapping events are specifically distanced from one another. This prediction is supported by our observation of selective learning-related decreases in similarity among overlapping route representations in the hippocampus. Second, if sufficient differentiation occurs, overlapping events should become less similar than non-overlapping events4. The observed reversal effect is therefore naturally explained by differentiation. Third, differentiation is thought to occur incrementally over the course of learning26. This can be contrasted with the idea that pattern separation reflects an automatic orthogonalization that occurs during initial event encoding owing to sparse coding within the hippocampus16,30. In other words, whereas differentiation is inherently a learning phenomenon, pattern separation is not. Our results clearly provide evidence of learning-related changes, consistent with the idea of differentiation. Indeed, the reversal effect only began to emerge after routes had been presented approximately 20 times (see Supplementary Fig. 4). Thus, differentiation occurred quite slowly. One question raised by the slow time scale of differentiation is why hippocampal representations continued to diverge once overlapping route representations were no more similar than non-overlapping route representations. While speculative, it is possible that the divergence of hippocampal route representations may have been influenced by similarity among representations within input regions. For example, within PPA overlapping route representations tended to remain more similar than non-overlapping route representations even during the 2nd half of learning.
While our findings are highly consistent with a differentiation account, it is important to emphasize that our results do not argue against a role for pattern separation, context learning or destination coding. Indeed, the lack of greater hippocampal pattern similarity among overlapping compared to non-overlapping routes at the beginning of learning may reflect an initial orthogonalization (pattern separation) of the route representations, with differentiation being additive to this initial orthogonalization. On the flip side, by establishing a role for hippocampal differentiation in a task closely modeled after rodent spatial learning paradigms, our results raise the important possibility that prior reports of pattern separation in the rodent hippocampus during spatial learning1,2,19 might also reflect, at least in part, an influence of differentiation.
Beyond characterizing pattern similarity across voxels, a novel aspect of the present study is that we identified how responses within individual hippocampal voxels were reconfigured over the course of learning to produce pattern-level differentiation. Specifically, we observed a non-monotonic relationship in the similarity with which individual voxels responded to pairs of overlapping routes at the beginning vs. end of learning. Voxels that initially had ‘moderate’ levels of timecourse similarity were the voxels that exhibited the lowest timecourse similarity (or greatest differentiation) by the end of learning. This analysis was motivated by theoretical arguments and empirical evidence that plasticity in a competing memory’s representation is non-monotonically related to its activation, with plasticity preferentially occurring when competing memories (or corresponding neural ensembles) are moderately activated25,26,28. While our analysis was inspired by this theoretical prediction, it differed from prior examples in that we considered plasticity at the level of individual voxels as opposed to ‘network level’ memory strength or differentiation. Additionally, our analysis did not directly measure activation of competing memories; instead we used timecourse similarity as a way to infer when activation of competing neural ensembles was most likely to occur. Namely, we reasoned that timecourse similarity at the beginning of learning would be related to the degree of overlap among underlying neural ensembles (within a given voxel) and, therefore, to the probability that viewing one route would result in ‘spill over’ of activation to the competing neural ensemble. While our predictions were predicated on this assumption, the observed function was strongly consistent with the predicted non-monotonic relationship. To our knowledge, this is the first empirical evidence of non-monotonic plasticity within the hippocampus.
A final question concerns the nature of information coded for by the hippocampus. That is, what kinds of information were reflected in hippocampal activity patterns? Several prior studies have demonstrated spatial coding within the human hippocampus44-48. Here, our findings appear to be at odds with a spatial coding account in that hippocampal activity patterns diverged precisely when routes were spatially overlapping. However, it is possible that when routes overlap the hippocampus forms separate spatial reference frames for each route. Thus, hippocampal activity patterns in the present study may have reflected spatial codes if overlapping routes were represented within distinct maps. Alternatively, the hippocampus may have coded for non-spatial information that differentiated the overlapping routes (e.g., pedestrians, cars, etc.). While we cannot determine the precise nature of hippocampal representations in the present study, it is clear that hippocampal representations were very distinct from representations in PPA, a cortical region adjacent to the hippocampus that has been implicated in coding spatial landmarks49. Indeed, PPA and the hippocampus exhibited opposite representational structures at the end of learning: in PPA, overlapping routes were more similar than non-overlapping routes, whereas in the hippocampus this representational structure was reversed. Thus, while the hippocampus may be part of a broader network of regions involved in spatial navigation and memory9,35,45,48,50,51, our findings highlight unique representational dynamics within the hippocampus.
METHODS
Subjects
New York University (NYU) students and alumni who were familiar with the NYU campus participated in the study. Subjects were restricted to NYU alumni and students in order to facilitate route learning and to reduce potential between-subject variance. Subjects were between the ages of 18-35, right-handed, native English speakers, had normal or corrected-to-normal vision and had no history of neurological disorders. Twenty-two subjects participated in the behavioral experiment (15 female; mean age = 20.77). Two additional subjects’ data were not collected due to technical errors. Twenty subjects (13 female; mean age = 22.15) participated in fMRI Experiment 1. Four additional subjects were excluded from data analysis: one for falling asleep in the scanner, two for technical errors during scanning, and one due to unreliable localizer data (see Regions of Interest). Twenty-one subjects (9 female; mean age = 23.17) participated in fMRI Experiment 2. One additional subject’s data were excluded from data analysis due to excessive head motion and another additional subject was excluded for technical errors with the scanner. Sample sizes for the fMRI studies were based on a similar experiment from our lab4. Informed consent was obtained according to procedures approved by the New York University Committee on Activities Involving Human Subjects.
Stimuli and Design
In the behavioral experiment and fMRI Experiment 1 the stimuli consisted of eight routes that traversed the NYU campus (Fig. 1a). Each route comprised a series of 98 unique pictures. All pictures were taken at regular intervals (every 10 paces) from an egocentric perspective by a researcher walking along the route. All routes started in the same location and made exactly three turns before ending at distinct destinations. Critically, the 8 routes consisted of 4 overlapping pairs. Overlapping pairs followed the same path for the majority of the route before diverging on the third turn to their respective destinations. The pictures for each route were taken at different times and therefore the pictures during the overlapping portion of routes could be distinguished from one another based on subtle differences in the pedestrians, vehicles, lighting, etc. For analysis purposes, routes were divided into pairs that shared an overlapping path (‘overlapping routes’; e.g. routes 1 and 2) or took distinct paths (‘non-overlapping routes’; e.g. routes 1 and 3). Furthermore, each route was divided into two segments: ‘Segment 1’ refers to the segment of each route that overlapped with another route and ‘Segment 2’ refers to the route-unique segment of each route. The third turn, which marked the boundary between Segments 1 and 2, occurred at the exact same picture numbers within pairs of overlapping routes (e.g., for routes 1 and 2) and varied minimally (between picture numbers 74-77) across sets of overlapping pairs (e.g., for routes 1/2 vs. routes 3/4). Likewise, all turns within a pair of overlapping routes occurred at identical time points in order to maximize the similarity of overlapping routes. There was exactly one overlapping pair that left the starting point in each cardinal direction (north, south, east, west). The 8 routes were divided into 2 sets (north/south routes and east/west routes).
Each subject was assigned one set of routes (4 routes total) to learn, with the assignment of route sets alternating subject-by-subject. We included 2 sets of routes in order to ensure our results could not be explained by the idiosyncrasies of any one route.
A new set of 8 routes was used in fMRI Experiment 2 (Fig. 1b). The routes were constructed using the same parameters as the routes used in the behavioral and first fMRI experiments, with one key difference. Instead of all routes terminating at distinct locations, fMRI Experiment 2 contained pairs of routes that took distinct paths but ended at the same destination. As before, the 8 routes were divided into two sets of 4 and each set of 4 contained two pairs of overlapping routes. The routes in each set could be divided into pairs that (a) shared an overlapping path but terminated at distinct destinations (‘overlapping routes’; e.g. routes 1 and 2), (b) had non-overlapping paths and terminated at distinct destinations (‘non-overlapping routes’; e.g. routes 1 and 4) or (c) had nonoverlapping paths but terminated at the same destinations (‘same destination’; e.g. routes 1 and 3). Due to geographical constraints, the third turn (i.e. when overlapping routes diverged) in this set of routes occurred slightly later (between picture numbers 84-86) than in the set used in the behavioral and first fMRI experiments.
Videos of the overlapping route pairs used in the experiments are available in Supplementary Videos 1-8.
Procedure
Behavioral Experiment
Route Learning
Subjects completed 14 rounds of route learning, with each route presented twice per round in random order. During a route learning trial, pictures from a route were presented in rapid succession (220 ms per picture, 10 ms blank screen in between pictures). Importantly, subjects were not told the destination of the route prior to the trial. Rather, the destination was only revealed at the end of the route, with the final picture (the destination) presented for 1690 ms. The destination’s name was also displayed above the final picture. Each route learning trial lasted a total of 24 s and was followed by a 1-s inter-trial interval (ITI) during which a fixation cross was presented. Each round also contained two ‘catch’ trials, which were included to ensure subjects’ vigilance but were excluded from all analyses. For each catch trial, a route began as with a normal trial but the presentation stopped at a pre-selected picture number. A cue then appeared above the picture instructing participants to identify either (1) the route’s final destination (destination test) or (2) the direction of the next turn (direction test). During the 3-s response period the picture and test cue remained on screen with the four destination labels (destination test) or left/right labels (direction test) printed below the picture and participants selected their response using a keyboard. Catch trials stopped on pictures presented between 3-15 s after the trial onset and at intervals of 1.5 s (to coincide with the TR length in the fMRI experiments; see fMRI Acquisition). The combined duration of the two catch trials within each round was constrained to equal the duration of a full route learning trial (24 s). Although each subject completed an equal number of destination and direction catch trials throughout the experiment, and each route was tested an equal number of times, the assignment of catch trial type to both route number and round was randomized so as not to be predictable.
That is, within a given round there could be 2 destination catch trials, 2 direction catch trials, or 1 of each, and a given route could be tested twice via a destination catch trial, twice as a direction catch trial, or once as each test.
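The stated 24-s trial duration can be checked from the presentation parameters above. The following is a minimal sketch, not the authors' presentation code; it assumes (this is not stated explicitly in the text) that a 10-ms blank screen followed each of the first 97 pictures but not the destination picture. The function names are illustrative.

```python
# Timing parameters taken from the text (all in milliseconds).
N_PICTURES = 98
PICTURE_MS = 220
BLANK_MS = 10
FINAL_PICTURE_MS = 1690


def trial_duration_ms():
    """Total duration of one route learning trial."""
    # Assumption: every picture except the destination is followed by a blank.
    regular = (N_PICTURES - 1) * (PICTURE_MS + BLANK_MS)
    return regular + FINAL_PICTURE_MS


def catch_stop_times_s(first=3.0, last=15.0, step=1.5):
    """Candidate catch-trial stop times, aligned to the 1.5-s TR."""
    n_steps = int(round((last - first) / step))
    return [first + i * step for i in range(n_steps + 1)]


if __name__ == "__main__":
    print(trial_duration_ms() / 1000, "s per trial")
    print(catch_stop_times_s())
```

Under this assumption the components sum to exactly 24 s, and there are nine candidate stop times between 3 and 15 s.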
Inter-Round Picture Test
At the end of each of the 14 route learning rounds, subjects were shown 20 static pictures, one at a time, drawn from the routes (5 per route in random order) and for each picture subjects were asked to select the corresponding destination from a set of four label options. The inter-round picture test was self-paced and subjects responded via keyboard. To ensure that the five pictures tested from each route in each test round were evenly sampled across positions in the route, each route’s 95 pictures (excluding the last 3 pictures that contained visuals of the destination) were divided into 5 time-bins of 19 pictures. For each inter-round picture test, one picture from each time-bin, from each route, was randomly selected to be tested with the constraint that a given picture was only tested once throughout the experiment. Responses on the test were divided into three groups: (1) ‘target’ if subjects selected the correct destination, (2) ‘competitor’ if subjects selected the overlapping route’s destination, and (3) ‘other’ if subjects selected the destination from a non-overlapping route.
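The time-bin sampling scheme for one route can be sketched as follows. This is an illustrative reconstruction under stated assumptions rather than the original experiment code: pictures are numbered 1 to 95, and the no-repeat constraint is implemented by shuffling each bin once and dealing pictures out round by round (the text does not say how the constraint was enforced). The function names and the seed are hypothetical.

```python
import random


def make_time_bins(n_pictures=95, n_bins=5):
    """Split picture numbers 1..n_pictures into equal consecutive bins."""
    size = n_pictures // n_bins
    return [list(range(b * size + 1, (b + 1) * size + 1)) for b in range(n_bins)]


def sample_rounds(n_rounds=14, seed=0):
    """For one route, pick one picture per bin per round without reuse."""
    rng = random.Random(seed)
    bins = make_time_bins()
    # Shuffle each bin once, then deal pictures out round by round, so that
    # no picture can be selected twice across the experiment.
    shuffled = [rng.sample(b, len(b)) for b in bins]
    return [[shuffled[b][r] for b in range(len(bins))] for r in range(n_rounds)]


if __name__ == "__main__":
    for round_number, pictures in enumerate(sample_rounds(), start=1):
        print(round_number, pictures)
```

Because each bin holds 19 pictures and only 14 are needed, the constraint can always be satisfied.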
Map Test
In order to assess each subject’s knowledge of the routes, subjects also completed a map test after finishing all rounds of route learning. For each trial on the map test, subjects were cued with a picture of a route’s destination for 4s. A map of the NYU campus then appeared on screen and subjects had 8s to click on the spatial location of the cued destination using a computer mouse. They were then prompted to draw with a pen the route taken to that destination on a paper print out of the campus map. Finally, participants completed both the Santa Barbara Sense of Direction Scale (SBSOD) and the Questionnaire on Spatial Representation (QSR) to assess their spatial acuity and reasoning. Results from the map test and questionnaires are not reported in the current study.
fMRI Experiments 1 and 2
Route Learning
The procedures from the behavioral experiment were slightly modified to be suitable for fMRI scanning. In both fMRI experiments, subjects first completed 2 practice route learning rounds (2 repetitions of each route per round) to familiarize them with the routes and task structure. Subjects then entered the scanner and completed an additional 14 rounds of route learning. The practice rounds were identical to the scanner rounds except that the first practice round did not contain any catch trials. During the scanned route learning rounds, the ITI was 6 s (fixation cross) to allow for better separation of hemodynamic response functions.
Inter-Round Picture Test
The inter-round picture test used in the fMRI experiments was shorter than in the behavioral experiment. In the fMRI version, there were a total of only 4 trials which contained pictures randomly sampled from the 4 routes. The sampled pictures were not constrained to be from different routes. The only constraint was that the pictures used in the inter-round picture test were not used in the post-scan memory test (described below). Additionally, in the fMRI version of the inter-round picture test subjects were shown each picture for a fixed amount of time (2.5 s) and could only respond during that time, using an MRI-compatible button box. Because the inter-round picture tests in the fMRI experiments only sparsely assessed route learning, these data are not reported. These test trials were only included to motivate subjects to learn the routes.
Functional Localizer
Following the 14 rounds of route learning subjects completed one localizer scan that was used to functionally define regions of interest for the fMRI analyses. The localizer scan contained 36 alternating blocks of three image types (12 blocks per category): faces, scenes (hallways or houses), and objects (cars or guitars). Each block lasted a total of 6 s and contained 12 greyscale images presented for 500 ms each. Subjects pressed a button whenever they detected a scrambled image, which occurred on half of all blocks (counterbalanced across category). An additional 12 baseline ‘blocks’ showing a blank grey screen (also 6 s each) were randomly interspersed with the other blocks.
Post Tests
After exiting the scanner subjects first completed a map test (identical to the behavioral experiment). Next, subjects completed an extended picture test which included ten pictures drawn from each route (every 10th picture from picture 4 to 94), tested in random order. On each trial, the route picture was presented above the set of destination names (4 destination names in Experiment 1 and 2 destination names in Experiment 2). Subjects used a computer mouse to click on the destination name associated with each picture. This test was self-paced. Finally, subjects completed the Santa Barbara Sense of Direction Scale (SBSOD) and the Questionnaire on Spatial Representation (QSR).
fMRI Data Analysis
fMRI Acquisition
Scanning was performed on a 3T Siemens Allegra head-only scanner at the Center for Brain Imaging at New York University using a Siemens head coil. Structural images were collected using a T1-weighted protocol (256 × 256 matrix, 176 1-mm sagittal slices). Functional images were acquired using a T2* weighted EPI single shot sequence containing 26 contiguous axial slices oriented parallel to the long-axis of the hippocampus (repetition time = 1.5 s, echo time = 23 ms, flip angle = 77 degrees, voxel size = 2 × 2 × 2 mm). The functional images did not cover the entire brain; rather, a limited field of view centered on the hippocampus was chosen in order to improve spatial resolution of data from the hippocampus. For the route learning scans, the first 6 volumes (during which time a “Get Ready” screen was presented, followed by a fixation cross) were discarded to account for T1 stabilization. For the localizer scan, the first 8 volumes and last 8 volumes (during which time a fixation cross was presented) were discarded. Field map and calibration scans were collected to improve functional-to-anatomical coregistration.
fMRI Preprocessing
Images were preprocessed using SPM8 (Wellcome Department of Cognitive Neurology, London, United Kingdom), FSL (FMRIB’s Software Library, Oxford, United Kingdom) and custom Matlab (The MathWorks, Natick, MA) routines. The preprocessing procedures included correction for head motion, coregistration of functional to anatomical images (using a registration procedure that aligned both functional and anatomical images to a calibration scan), and an unwarping procedure. Images from the functional localizer scan were spatially smoothed using a 4-mm full-width/half-maximum Gaussian kernel. Images from the route learning phase, which were used for pattern analyses, were smoothed using a moderate 2-mm full-width/half-maximum Gaussian kernel in order to improve signal-to-noise ratio. Prior research suggests that smoothing does not reduce sensitivity of pattern-based fMRI analyses 50. All analyses were performed in subjects’ native space.
fMRI univariate analysis
To analyse the localizer data, SPM was used to construct a general linear model with three regressors of interest corresponding to the three visual categories (scenes, faces, objects). These regressors were constructed as boxcar functions that onset at the first image of a category block and lasted for the duration of the block. Motion, block, and linear drift were modeled as regressors of no interest. All regressors were convolved with a canonical double-gamma hemodynamic response function. A linear contrast of scenes vs. faces and objects was used to obtain voxelwise estimates of scene sensitivity and a linear contrast of faces, scenes, and objects vs. baseline was used to obtain voxelwise estimates of visual sensitivity.
Regions of interest
Analyses were performed using a region of interest (ROI) approach targeting the hippocampus, parahippocampal place area (PPA), and retrosplenial cortex (RSC). Anatomical hippocampal ROIs were defined using FreeSurfer’s automated sub-cortical segmentation procedure. The resultant hippocampal ROIs were then visually inspected and manually edited for any inaccuracies before registering them to each subject’s functional space. In order to identify voxels with high signal-to-noise ratios and to create ROI masks the same size as the PPA and RSC masks (see below), the hippocampal ROI consisted of the top 300 visually-responsive voxels within bilateral hippocampus, as determined from the category localizer (contrast of faces, scenes, and objects vs. baseline). Although this voxel selection procedure was implemented to increase our sensitivity to detect small differences in hippocampal patterns, it is important to note that our main findings were not dependent on such selection methods. Indeed, when no voxel selection was applied within the hippocampus the interaction between overlap (overlap/non-overlap) and learning (1st half/2nd half) remained significant (F(1,39) = 4.75, P = 0.0354), as did the reversal effect in the 2nd half of learning (F(1,39) = 7.30, P = 0.0102).
PPA and RSC were identified using a combination of the category localizer and group-based probabilistic scene-selective ROIs identified from previous studies (ref. 51; http://web.mit.edu/bcs/nklab/GSS.shtml). First, the group-based PPA and RSC masks were registered to each subject’s native space and voxels overlapping with the anatomically defined hippocampal masks were removed from the PPA/RSC masks to ensure independent ROIs. Then, the top 300 scene-selective voxels (contrast of scenes vs. faces and objects from the category localizer) within PPA and, separately, within RSC, were selected. This method ensured that the PPA and RSC ROIs were subject-specific but equal in size (number of voxels) and general location across all subjects 47. Note: we chose 300 voxels as an a priori threshold for all our ROIs. This number corresponded to roughly the top 20% of the hippocampal voxels, 30% of the voxels within the group-based PPA mask, and 15% of the voxels in the group-based RSC mask. One subject from Experiment 1 was excluded because the average t value within their PPA ROI was more than two standard deviations below the mean PPA response in Experiment 1 (this was the only subject with a mean PPA or RSC response that was more than 2 standard deviations below the experiment mean); subjective assessment of data from this subject confirmed that there was no well-defined cluster within the group-based PPA mask that selectively responded to scenes.
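The voxel selection step shared by all three ROIs (keeping the 300 voxels with the strongest localizer response inside a mask) can be illustrated with a short sketch. It assumes the localizer statistic for each voxel in the mask is available as a flat list; the function name `top_voxels` is hypothetical and is not taken from the authors' code.

```python
def top_voxels(stat_values, n=300):
    """Return the indices of the n voxels with the largest localizer statistic.

    stat_values holds one value per voxel inside the anatomical or group mask
    (e.g., the t value for scenes vs. faces and objects).
    """
    ranked = sorted(range(len(stat_values)), key=lambda i: stat_values[i], reverse=True)
    # Return indices in ascending order so they can be used as a stable mask.
    return sorted(ranked[:n])


if __name__ == "__main__":
    toy_stats = [0.1, 2.5, -1.0, 3.2, 0.7]
    print(top_voxels(toy_stats, n=2))
```

Selecting a fixed number of voxels, rather than applying a statistical threshold, is what keeps the ROIs equal in size across subjects and regions.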
Spatiotemporal pattern similarity
Pattern similarity analyses were performed on ‘raw’ (unmodeled) fMRI data. Several additional preprocessing steps were performed prior to performing pattern analyses. Functional images were detrended, high-pass filtered (0.01 Hz), and then z-scored within run. For route learning trials, volumes 3-19 (corresponding to 3-27s after stimulus onset) were divided into volumes corresponding to Segment 1 (i.e. the portion of each route that shared a path with another route) and Segment 2 (i.e. the unique portion of each route after overlapping paths diverged). The volume in each route corresponding to the transition between Segments 1 and 2 (i.e., the third turn in the routes) was discarded from analyses in order to keep Segments 1 and 2 distinct. In Experiment 1, Segment 1 occurred within the first 11 volumes and Segment 2 occurred within the last 4 volumes. In Experiment 2 the overlapping routes diverged slightly later; thus, Segment 1 corresponded to the first 12 volumes and Segment 2 corresponded to the last 3 volumes. To perform pattern analyses, spatial activity patterns were concatenated across volumes of interest so that each route Segment was represented by a spatiotemporal pattern of activity whose vector length was equal to the number of voxels within an ROI * the number of TRs included in the Segment.
For each subject and each ROI, we computed pattern similarity scores (Pearson correlations) reflecting the representational similarity across each pair of routes. Correlations were always performed using data from independent fMRI runs (odd and even runs) in order to ensure independence. Thus, for analysis of data from the first half of learning, each route’s average spatiotemporal activity pattern was obtained from runs 1, 3, and 5 (odd runs) and, separately, from runs 2, 4, and 6 (even runs); average ‘odd run patterns’ were then correlated with average ‘even run patterns.’ Likewise, for analysis of data from the second half of learning, each route’s average spatiotemporal activity pattern was obtained from runs 9, 11, and 13 (odd runs) and, separately, from runs 10, 12, and 14 (even runs), and odd and even patterns were correlated. Data from runs 7 and 8 were excluded in order to ensure an equal number of odd and even runs within each half. Because each subject studied 4 routes, a 4 x 4 correlation matrix was generated for each subject (Fig. 3a). Before any correlation values were averaged within conditions (e.g., overlapping routes), correlation coefficients were z-transformed (Fisher’s z).
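The split-half procedure described above can be sketched in a few lines. This is a simplified illustration, not the authors' Matlab routines: it assumes each volume is given as a list of voxel values, and all function names are hypothetical. Note that Fisher's z (the inverse hyperbolic tangent) is undefined for correlations of exactly 1 or -1.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spatiotemporal_vector(volumes):
    """Concatenate per-volume voxel patterns into one vector (voxels * TRs)."""
    return [value for volume in volumes for value in volume]


def average_patterns(patterns):
    """Element-wise mean of several equal-length vectors (e.g., runs 1, 3, 5)."""
    return [sum(values) / len(values) for values in zip(*patterns)]


def similarity_matrix(odd_patterns, even_patterns):
    """Fisher z-transformed correlations of odd-run with even-run route patterns."""
    return [[math.atanh(pearson(o, e)) for e in even_patterns] for o in odd_patterns]


if __name__ == "__main__":
    # Toy example with two routes and four-element patterns.
    odd = [[1, 2, 3, 5], [4, 1, 3, 2]]
    even = [[1, 3, 2, 5], [5, 1, 2, 2]]
    for row in similarity_matrix(odd, even):
        print([round(z, 3) for z in row])
```

With four routes per subject, `similarity_matrix` would return the 4 x 4 matrix described in the text, with rows indexed by odd-run patterns and columns by even-run patterns.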
Timecourse similarity
Timecourse similarity indexed the degree to which individual voxels were ‘shared’ across a given pair of routes. To compute timecourse similarity, we first obtained route-specific vectors of activation (using Segment 1 data only) for each voxel. The length of each timecourse vector was equal to the number of Segment 1 TRs (11 in Experiment 1; 12 in Experiment 2). Timecourse vectors were separately averaged across odd and even runs within each half (as with the spatiotemporal pattern analyses). Average timecourse vectors were then correlated (Pearson correlation) for every pair of routes, separately for each learning half (Fig. 4b). Resulting correlation coefficients were z-transformed (Fisher’s z).
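A voxel-wise version of the same computation can be sketched as follows. It assumes each voxel's averaged Segment 1 timecourse is stored as a list of TR values, and that the odd-run timecourse for one route is correlated with the even-run timecourse for the other route; the exact odd/even pairing is an assumption based on the description of the spatiotemporal analysis, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def voxel_timecourse_similarity(route_a_odd, route_b_even):
    """Fisher z timecourse similarity for each voxel.

    Each argument is a list with one timecourse (list of TR values) per voxel.
    """
    return [math.atanh(pearson(a, b)) for a, b in zip(route_a_odd, route_b_even)]


if __name__ == "__main__":
    # Two toy voxels with four TRs each: one 'shared', one not.
    route_a = [[0.1, 0.5, 0.9, 0.4], [0.1, 0.5, 0.9, 0.4]]
    route_b = [[0.2, 0.4, 1.0, 0.5], [0.9, 0.3, 0.2, 0.6]]
    print(voxel_timecourse_similarity(route_a, route_b))
```

A voxel whose timecourse rises and falls in the same way for both routes receives a high positive value, which is the sense in which it is 'shared' across the pair.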
Curve Fitting Procedure
To estimate the relationship between timecourse similarity for overlapping routes at the beginning vs. end of learning, we used the probabilistic curve induction and testing (P-CIT) Bayesian curve-fitting toolbox (ref. 34; http://code.google.com/p/p-cit-toolbox). To prepare the data for the algorithm, the 300 voxels within each subject’s hippocampal ROI were rank-ordered according to 1st half timecourse similarity values and then divided into 60 bins (5 voxels/bin) ordered from lowest to highest similarity. Second half timecourse similarity values were then obtained for each of these 60 bins. This process was independently repeated for each pair of overlapping routes and the resulting data (i.e., the sets of 60 2nd half timecourse similarity values) were then averaged across route pairs, producing a single set of 60 values per subject. Consistent with all prior implementations of the P-CIT toolbox 28,33,34, we then aggregated data across subjects (i.e., 41 subjects x 60 voxel bins = 2,460 data points) and the P-CIT algorithm was applied to these aggregated data.
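The rank-ordering and binning step for a single subject and route pair can be sketched as below. The sketch assumes that the value 'obtained' for each bin is the mean across the bin's 5 voxels, which the text does not state explicitly; the function name is hypothetical.

```python
def bin_by_first_half(first_half, second_half, bin_size=5):
    """Rank voxels by 1st-half similarity and summarize each bin.

    Returns one (mean 1st-half value, mean 2nd-half value) tuple per bin,
    ordered from lowest to highest 1st-half similarity.
    """
    order = sorted(range(len(first_half)), key=lambda i: first_half[i])
    bins = []
    for start in range(0, len(order), bin_size):
        members = order[start:start + bin_size]
        # Assumption: bin values are simple means across member voxels.
        mean_first = sum(first_half[i] for i in members) / len(members)
        mean_second = sum(second_half[i] for i in members) / len(members)
        bins.append((mean_first, mean_second))
    return bins


if __name__ == "__main__":
    first = [299 - i for i in range(300)]
    second = [float(i) for i in range(300)]
    binned = bin_by_first_half(first, second)
    print(len(binned), binned[0], binned[-1])
```

With 300 voxels and 5 voxels per bin this yields the 60 bins per subject described above.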
The P-CIT algorithm approximates the posterior distribution over possible plasticity curves (i.e., which curves are most probable given the data). As a first step, the algorithm rescales the predictor variable (here, 1st half timecourse similarity values) to fit within a range of 0 to 1. Next, the algorithm defines a parameterized family of curves (piecewise-linear curves with three segments) and then randomly samples 50,000 curves from this parameterized space in which the bounds of the x-axis extend from 0 to 1 and the bounds on the y-axis extend from -1 to 1. For each sampled curve the algorithm assigns an importance weight indicating how well the curve’s shape fits the observed data. After assigning importance weights to all 50,000 samples, the algorithm regenerates a new set of sample curves by taking the highest weighted curves from the previous sample and slightly distorting them. This process of assigning importance weights to sampled curves and resampling curves based on those importance weights was repeated for 20 iterations. The resultant collection of weighted samples can be interpreted as an approximate posterior probability distribution over curves. In other words, curve samples with high importance weights are relatively probable given the data whereas sample curves with low importance weights are relatively improbable. The algorithm then estimates a curve by averaging the sampled curves together weighted by their importance values. Credible intervals were computed by finding the range of y values contained within the middle 90% of the curve probability mass at each interval along the x-axis. To summarize the level of evidence in favor of the non-monotonic plasticity hypothesis the algorithm computes the log Bayes factor that reflects the log of the ratio of evidence in favor of the hypothesis to the evidence against the hypothesis. To compute the log Bayes factor, first each curve in the final set of samples was labeled as either theory consistent (i.e., if moving from left to right the curve dropped below its starting point and then rose above its starting value) or theory inconsistent. Then, the proportion of the posterior probability distribution mass taken up by theory-consistent samples was computed by summing together the weights corresponding to theory-consistent curves and dividing that value by the sum of the weights corresponding to theory-inconsistent curves28. The log Bayes factor was then computed as:
$$\log \mathrm{BF} = \log\left(\frac{\sum_{i \in C} w_i}{\sum_{j \in I} w_j} \times \mathrm{correction}\right)$$
where $w$ denotes the importance weights, $C$ and $I$ denote the sets of theory-consistent and theory-inconsistent curves, respectively, and
$$\mathrm{correction} = \frac{1 - 0.3328}{0.3328}$$
The correction factor compensates for the fact that only 33.28% of the space of possible curves is comprised of theory-consistent curves. Without this adjustment we would expect a large imbalance in favor of theory inconsistency due to chance. To test whether the obtained log Bayes factor value could have arisen due to chance, we ran a non-parametric permutation test. Specifically, we re-ran the algorithm 500 times with the 1st and 2nd half timecourse similarity values shuffled within subject, yielding an empirical null distribution over log Bayes factor values. We then measured where our observed log Bayes factor value fell on this distribution to compute the probability of obtaining this value under the null hypothesis. Lastly, the P-CIT toolbox also calculates a chi-squared likelihood ratio test that assesses how well the estimated curve explains the data. The P-value for this test indicates the probability of obtaining the observed predictive accuracy relative to a null model in which there is no relationship between timecourse similarity values across halves.
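The final steps (labeling curves, computing the log Bayes factor, and deriving a permutation P-value) can be illustrated with a small sketch. This is not the P-CIT toolbox code. It assumes that each piecewise-linear curve is summarized by the y values at its knots, that the correction factor is the inverse of the prior odds implied by the 33.28% figure, and that the permutation P-value is the proportion of null values at least as large as the observed value. All names are hypothetical.

```python
import math

# Share of the curve space that is theory consistent (value from the text).
PRIOR_CONSISTENT = 0.3328


def is_theory_consistent(y_knots):
    """True if the curve dips below its starting value and later rises above it.

    For a piecewise-linear curve the extremes occur at the knots, so checking
    the knot values is sufficient.
    """
    start = y_knots[0]
    dipped = False
    for y in y_knots[1:]:
        if y < start:
            dipped = True
        elif dipped and y > start:
            return True
    return False


def log_bayes_factor(weights, curves):
    """Log of (posterior odds of theory consistency * correction factor)."""
    flags = [is_theory_consistent(c) for c in curves]
    w_yes = sum(w for w, f in zip(weights, flags) if f)
    w_no = sum(w for w, f in zip(weights, flags) if not f)
    # Assumption: the correction is the inverse of the prior odds.
    correction = (1 - PRIOR_CONSISTENT) / PRIOR_CONSISTENT
    return math.log((w_yes / w_no) * correction)


def permutation_p(observed, null_values):
    """Proportion of null log Bayes factors at least as large as the observed one."""
    return sum(v >= observed for v in null_values) / len(null_values)


if __name__ == "__main__":
    curves = [[0.0, -0.5, 0.3, 0.2], [0.0, 0.5, -0.3, -0.2]]
    print(log_bayes_factor([0.8, 0.2], curves))
```

Under these assumptions, a posterior that simply mirrors the prior (33.28% of the weight on theory-consistent curves) gives a log Bayes factor of zero, and positive values indicate evidence for the non-monotonic hypothesis.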
Statistics
For all behavioral and fMRI analyses (with the exception of the curve-fitting procedures) we used standard random-effects statistics (paired sample t-tests and repeated measures ANOVA). Two-tailed tests were used throughout at an alpha threshold of 0.05. Unless otherwise noted, analyses combined data across Experiments 1 and 2. For all ANOVAs run on these combined data, experiment number was included as a factor. For all of the hippocampal ANOVA effects described in the main text, interactions with experiment number were not significant (Ps > 0.2). See Supplementary Figs 2 and 6 for hippocampal and PPA data separated by experiment. All statistical models were implemented using R (https://www.r-project.org).
Data and Code Availability
All relevant data and code are available upon request from the first author (avi.chanales{at}nyu.edu).
Acknowledgements
We thank Anthony Stigliani and Kalanit Grill-Spector for providing stimuli for the category localizer. This work was supported by a grant from the National Institutes of Health (1RO1NS089729) to B.A.K.