Abstract
The human brain parsimoniously situates past events by their order in relation to time. Here, we show the posteromedial cortex geometrically abstracts the time intervals separating pairs of event-moments in long-term, episodic memory. Transcranial magnetic stimulation targeted at the precuneus erases these locally distributed multivariate representations, and alters the correlation between precuneal activity patterns and mnemonic judgements, revealing a critical role of the precuneus in abstracting temporal distances during episodic memory retrieval.
Time in physics is operationally defined as “what a clock reads”. While the passage of time between two moments can be precisely measured by a quartz crystal oscillator or biologically registered by distributed sets of brain regions across intervals of time (Buhusi & Meck, 2005; Meck, Penney, & Pouthas, 2008), the neural mechanisms that abstract such temporal distances separating events in long-term, episodic memory are incompletely understood (Mauk & Buonomano, 2004).
Representations of brief elapsed time can be inferred from single neuron activities in the primate brain (Jin, Fujii, & Graybiel, 2009; Leon & Shadlen, 2003). Time-registering neurons are found to code time with high precision in the cortico-basal ganglia circuits (Jin et al., 2009) and inferior parietal cortex (Leon & Shadlen, 2003) across short timescales. In contrast, when complex, coherent experiences become consolidated into long-term memories (McGaugh, 2000), the neural circuits that build time representations as an infrastructure for episodic retrieval are theorized to be distinct from those implicated in hippocampal-dependent encoding (Ezzyat & Davachi, 2014; Manns, Howard, & Eichenbaum, 2007) and retrieval (Hsieh, Gruber, Jenkins, & Ranganath, 2014; Nielson, Smith, Sreekumar, Dennis, & Sederberg, 2015), and from those during transient temporal processing (Jin et al., 2009; Leon & Shadlen, 2003). For the recollection of long-term autobiographical memories or episodic events, the posterior medial (PM) memory system plays an instrumental role (Ranganath & Ritchey, 2012). However, the critical issue of how elapsed time between pairs of long-term episodic events – and its interplay with the encoding context – is represented by the PM system has yet to be addressed. Here we investigated the abstraction, at a macro-anatomical level, of temporal distances that were encoded more than 24 hours previously (Kwok, Shallice, & Macaluso, 2012; St Jacques, Rubin, LaBar, & Cabeza, 2008), and determined how several members of this large cortical system are differentially implicated in this putative mnemonic function (Richter, Cooper, Bays, & Simons, 2016).
Combining functional magnetic resonance imaging (fMRI) with an interactive-video memory paradigm and a temporal order judgement task (TOJ; Fig. 1a), a validated paradigm to study neural correlates underpinning temporal distances between units of memory traces (Kwok et al., 2012; Manns et al., 2007; St Jacques et al., 2008), we adopted a two-pronged protocol to ascertain how temporal distances separating pairs of past moments-in-time are represented in the human neocortex. We first investigated the neural correlates of temporal distances with a multivariate searchlight representational similarity analysis (RSA) (Nili et al., 2014) tuned to identify a locally distributed neural representation using a 9-mm-radius spherical searchlight. We parametrized a large set of pairs of event-moments geometrically separated by varying temporal intervals and applied RSA to compare neural representational dissimilarity matrices (RDMs) with a number of parametric, condition-rich hypothetical models. Applied across the entire brain, the searchlight approach identifies local multi-voxel patterns driven by structured co-activation at a voxel or sub-voxel level within the size of the searchlight, thereby giving us a snapshot of the locally distributed neural architecture supporting temporal order judgements. To enhance the causal strength of the anatomical associations thereby revealed, we then focally disrupted the identified critical region with repetitive transcranial magnetic stimulation (rTMS), seeking to confirm its functional necessity for mediating the distributed representation of temporal distances. The spatial scale of rTMS-induced disruption is comparable to that of our chosen searchlight, rendering it an optimal tool for targeted, reversible disruption of the distributed representation of interest.
For memory encoding, participants played an interactive video game containing seven distinct yet related chapters, each lasting tens of minutes, on day 1 (Supplementary Fig. 1, Supplementary Table 1). By the nature of the video game, the within-chapter segments contained more coherent narrative strands than those across chapters, yet all chapters were connected by a common plot. After a 24-hour retention period (day 2), on each trial, participants judged the temporal order of two images (extracted from their individually played video game, Fig. 1b), depicting two time-points in their encoded memory, while their blood-oxygen-level-dependent (BOLD) activity was measured (TOJ task, Fig. 1c). Assuming a scale-free temporal memory representation (Kwok & Macaluso, 2015), we manipulated the between-images temporal distances (TD) for all pairs of images so that the TD distribution adhered to a power function permitting scale-invariance across subjects (Gallistel & Gibbon, 2000) (60 levels of TD, Fig. 1d).
Figure 1. Experiment overview. (a) In experimental sessions 1 and 2, participants played a video game containing seven related chapters with a first-person perspective for encoding, and 24 hours later, received 20 min of repetitive transcranial magnetic stimulation (rTMS) to either one of two cortical sites before performing a temporal order judgement task during fMRI. Order of TMS sites (within-subjects) and choices of video game chapters were counterbalanced across subjects (Supplementary Table 1). The two experimental sessions were conducted on different days to minimize rTMS carry-over effects (mean separation = 8 days). Participants underwent structural MRI scans and familiarized themselves with the gameplay using a console prior to experimental sessions proper. (b) Gameplay video: each encoding session consisted of seven chapters (Supplementary Fig. 1). (c) Temporal order judgement task. Participants chose the image that happened earlier in the video game and reported their confidence level. (d) 60 levels of temporal distances (TD) were generated for each subject according to their subject-specific video-playing duration. Although the absolute TD were different across subjects (Supplementary Fig. 1), we ensured it to be scale-invariant using a power function during image selection. Actual TDs from one subject (subj01) are shown. (e) Two pairs of images were extracted from the same chapter (Within-chapter) or two adjacent chapters (Across-chapter). The 60 levels of TD were fully matched within-subjects for these two conditions. Note that scenes depicted in Within-chapter tended to be more contextually similar than those depicted in Across-chapter. (f) Stimulation sites, superimposed onto one subject’s MRI-reconstructed skull, are marked by a green pointer. The MNI coordinates for precuneus stimulation: x, y, z = 6, -70, 44.
We first searched for neural representations resembling the matrix of temporal distances using searchlight representational similarity analysis (Nili et al., 2014). Without a priori bias for any region of interest, we searched the entirety of the cortex using an RDM consisting of 60 levels of logarithmically transformed subject-specific TD, and identified voxels that contain information about the set of geometrically defined temporal distances in memory (Fig. 2a, Online Methods and Supplementary Fig. 2). Within this 60 × 60 matrix of moment-wise distances, we found that the neural pattern for judging the temporal order of a pair of memories separated by a given temporal distance is more similar to the patterns for other temporal order judgements that span temporal distances of a comparable scale. These voxels were in the posteromedial parietal areas, bilateral angular gyri, and middle frontal gyri (Fig. 2b, Supplementary Table 2). The scale-invariance of the design allowed us to generalize the resultant mnemonic abstraction of temporal distances across individuals, in line with the scale-free representation of time (Gallistel & Gibbon, 2000).
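As an illustration of the model matrix described above, the following minimal sketch builds a TD model RDM from log-transformed temporal distances. The distance measure (absolute difference of natural-log TDs), the function name `td_model_rdm`, and the example TD values are assumptions made for illustration; the text does not specify them.

```python
import math

def td_model_rdm(td_frames):
    """Return an n x n model RDM whose entry (i, j) is |log(TD_i) - log(TD_j)|.

    Assumption: dissimilarity is the absolute difference of log-transformed TDs.
    """
    log_td = [math.log(td) for td in td_frames]
    return [[abs(a - b) for b in log_td] for a in log_td]

# Hypothetical TD levels in video frames (not the study's values).
example_tds = [31, 62, 124, 248]
rdm = td_model_rdm(example_tds)
```

Under this assumed metric, pairs of levels with the same ratio (31 vs. 62 and 124 vs. 248) receive the same dissimilarity, which is one way to express the scale-free property the design relies on.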
The temporal-distance memory representation could be confounded by perceptual similarity in each pair of images. To address this concern, we conducted three separate RSAs, in which we indexed perceptual similarity between the pairs by three different metrics: RGB cross-correlation RDM, RGB-intensity RDM, and RGB-histogram RDM. No similar representation was observed in the posteromedial parietal cortex using these candidate RDMs, suggesting that our results could not be driven simply by the image properties (rows 3 – 5, Supplementary Fig. 3). To reinforce the point, considering previous work on space-time relationships in memory (Deuker, Bellmund, Navarro Schroder, & Doeller, 2016; Nielson et al., 2015), we also computed the number of locations each participant had virtually traversed in the video, quantified the spatial displacement embedded between the paired images, and entered them into a subject-specific Situational Changes RDM (Supplementary Fig. 4). The searchlight RSA results showed that the pattern representation observed in Figure 2b can be better accounted for by the temporal distance embedded between the images than by the spatial distance that participants had travelled (row 2, Supplementary Fig. 3). Inferential statistical comparisons of multiple models showed that the neural signals in the selected conjunction ROI (see below) were most attributable to the mnemonic representation of the moment-wise comparison of multiple temporal distances (Supplementary Fig. 5).
Having identified a multivariate pattern underlying the temporal memory abstraction in the posterior medial parietal cortex (Fig. 2b), we probed further whether there were voxels whose activities change monotonically as a function of temporal distance, irrespective of any locally distributed representation. Whole-brain parametric modulation analysis (pmod) revealed TD-specific BOLD signals in a cluster within the posteromedial region, including the precuneus (Fig. 2c, Supplementary Table 3). This relationship could not be attributed to difficulty (results were the same after trial-by-trial reaction times were accounted for, Supplementary Fig. 6a). Since the two types of analysis extracted two different kinds of neurally encoded information, their overlap in the precuneus strengthens the claim to a critical role for the region. We accordingly created a conjunction map (Fig. 2d), so that both the multivariate and univariate results underlying the temporal distance abstraction would be available for the next analysis.
Temporal distance or proximity is intertwined with the construct of context. A prominent memory model posits that item representations are linked to a changing “context” at encoding, such that a common retrieved context is triggered during recall for items that were experienced within a similar temporal context (Polyn, Norman, & Kahana, 2009). This theory predicts the TD-neural representation pattern similarity index to be higher when the two event-moment images are extracted from a “similar context” than when they are from two “different contexts”. To test this, we manipulated the factor “context” by controlling whether the paired images presented in the TOJ task were extracted from the same chapter or from two adjacent chapters of the video game while keeping the 60 TDs fully matched between the two conditions (Within-chapter vs. Across-chapter, Fig. 1e). We re-ran the searchlight RSA, now separately for the Within-chapter and Across-chapter trials. The representation of TD was observed only in the Within-chapter condition and not in the Across-chapter condition (Fig. 2e, right). The identified voxels were in the precuneus, retrosplenial cortex, and angular gyri bilaterally (Fig. 2e, left). For statistical inference, we extracted the similarity index from each subject using the conjunction mask combining the RSA and pmod maps, which contained the TD-modulated signals (Fig. 2d and Online Methods). In line with our prediction, the voxels in the Within-chapter trials showed higher pattern similarity to the TD RDM than those in the Across-chapter trials (Fig. 2e, middle panel; one-sided: P = 0.045), confirming that the neural pattern similarity related to the TD RDM was indeed stronger in Within-chapter trials. This difference was also found in a voxel-wise univariate analysis.
The beta-estimates (β) from pmod analyses using TD as a regressor were significantly higher in the Within-chapter condition (“Within-chapter > Across-chapter” P = 0.005 with RT effects regressed out, Supplementary Fig. 6c-d). This confirmed the mnemonic representation of temporal distances was determined by whether the pairs of images were experienced within a similar context, corroborating the interaction between temporally- and semantically-defined factors observed during memory encoding (Ezzyat & Davachi, 2014) and retrieval (Hsieh et al., 2014).
To strengthen the claim to a pivotal role of the precuneus in this operation, we strategically deployed a disruptive technique, targeting it with repetitive transcranial magnetic stimulation to interrogate changes on both neural and behavioral levels (within-subjects: TMS-precuneus vs. TMS-vertex; Fig. 1f and Supplementary Table 1). Strikingly, in the fMRI session in which subjects’ precuneus had received TMS immediately prior to TOJ, the widespread representation of TD was eradicated (row 1, Supplementary Fig. 3). To the best of our knowledge, no study has previously demonstrated such sharp susceptibility to TMS of such mnemonic representations. The multivariate representations were more vulnerable than the conventional, activation-based measures: TMS to the precuneus did not induce any discernible changes in the univariate BOLD intensity (Supplementary Fig. 7). Altogether, these findings strengthen our argument that memory traces that are represented during temporal order judgement are indeed conveyed in some localized multivoxel readouts housed in the PM system cortices, above and beyond the modulated changes in canonical BOLD activation.
In light of the fractionation view for the parietal cortex (Nelson, McDermott, & Petersen, 2012), we further tested the hypothesis that there might be differences in the patterns of neural activity associated with the abstraction of temporal distances in the sub-regions of the PM memory network (Ranganath & Ritchey, 2012). Based on our results (Fig. 2b and Supplementary Table 2) and previous work on the parcellation of the PM memory network (Richter et al., 2016), we chose six anatomical regions of interest in the PM memory network (ROIs: bilateral precuneus, bilateral angular gyrus and bilateral hippocampus; see Online Methods), together with the primary visual region (entire occipital cortex) as a control, to test for the disruptive effect caused by the TMS. We extracted the similarity indices from these ROIs and found that the neural-TD pattern similarity was significantly weakened in the left precuneus following TMS to the precuneus (Fig. 3; and to a lesser extent also the left angular gyrus, Supplementary Fig. 8). Such differences were not obtained in the other ROIs (Supplementary Fig. 8). Moreover, we found changes in individuals’ neural-TD pattern similarity in the vertex condition to be positively associated with their TOJ memory performance in this key region (also in the hippocampi, see Supplementary Fig. 8), implying that the multivoxel representations are important neurobiological prerequisites for the ability to support temporal order judgement. However, disrupting the precuneus with magnetic stimulation prior to retrieval put this neural–behavioral correlation into disarray (Fig. 3) and slowed response times (Supplementary Fig. 9), indicating that removing the TD representations from the precuneal voxels causally results in overt changes in memory performance.
Since the focal perturbation altered the mnemonic representation in the precuneus, the angular gyri and the hippocampi, the disruption might have taken effect by inducing alterations in functional connectivity between multiple regions, or more globally throughout the entire parietal memory network (Nilakantan, Bridge, Gagnon, VanHaerents, & Voss, 2017; Wang et al., 2014).
The present parietal representation of temporal distances between pairs of episodic events, as revealed by both univariate and multivariate pattern analyses, might act in parallel with hippocampal cells that code specific moments in time or temporal positions (Eichenbaum, 2014), or act independently as a separate mnemonic establishment of episodes over and above the hippocampal memory ensemble (Brodt et al., 2016). The current findings align with the temporal context model (Polyn et al., 2009) in that the fine-grained TD memory information distributed in this cortex is stronger when paired images were associated within a similar context. Building on extant connectivity findings between the hippocampus and neocortical regions (Moscovitch, Cabeza, Winocur, & Nadel, 2016; Ranganath & Ritchey, 2012; Vincent et al., 2006) and the hippocampal role in temporal context memory (Ezzyat & Davachi, 2014; Hsieh et al., 2014), our demonstration of a distributed pattern of temporal information in the posteromedial parietal region implies the existence of a higher-level parietal mnemonic readout of temporal distances between episodic experiences.
In summary, our multivariate searchlight results reveal that the temporal distance representations in the posterior parietal cortex, especially the precuneus, during TOJ retrieval are determined by how far apart two given event-moments were when the subjects encountered them, and by how similar their encoding contexts were (Kwok et al., 2012; St Jacques et al., 2008). We also establish that this multivoxel mnemonic abstraction is localized in the precuneal area and that perturbation to it alters the neural–behavior relationship across the global parietal memory network, identifying this structure as a locus for flexibly effecting the manipulation of physical time during episodic memory retrieval.
AUTHOR CONTRIBUTIONS
Q.Y. designed and conducted the experiments, analyzed data, and wrote the manuscript. Y.H. discussed the results and commented on drafts. Y.K. advised on TMS protocol. K.A. produced indices for RDM Models 4 and 5. S.C.K. conceived and designed the study, acquired funding, supervised the project, and wrote the manuscript.
COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
METHODS
Participants
Twenty individuals participated in the study (7 female, 22.55 ± 1.54 years, mean ± sd). Data from 3 subjects were excluded due to either poor performance (1 subject performed at chance level) or scanner malfunction (projector crashed during scanning for 2 subjects at TMS-vertex session), resulting in a final group of 17 subjects (7 female, 20.65 ± 1.54 years, mean ± sd). All subjects were unfamiliar with the video game, had normal or corrected-to-normal vision and did not report neurological or psychiatric disorders or current use of psychoactive drugs. All subjects were eligible for MRI and TMS procedures based on standard MRI safety screening as well as on their answers to a TMS safety-screening questionnaire (Rossi, Hallett, Rossini, Pascual-Leone, & The Safety of TMS Consensus Group, 2009). No subjects withdrew due to complications from the TMS or MRI procedures, and no negative treatment responses were observed. All subjects gave written informed consent and were compensated for their participation. All procedures were performed in accordance with the 1964 Helsinki declaration and its later amendments and approved by the University Committee on Human Research Protection of East China Normal University (UCHRP-ECNU). The number of participants was determined based on previous studies with similar design (Ezzyat & Davachi, 2014; Wang et al., 2014).
Experimental design, stimuli, and tasks
Encoding: Interactive video game
The action-adventure video game (Beyond: Two Souls) was created by the French game developer Quantic Dream and played on the PlayStation 4 video game console developed by Sony Computer Entertainment. Participants played the game using a first-person perspective. To ensure that participants mastered the controls, they were trained to play the game with two additional game chapters (Training chapters: Welcome to the CIA, and The Embassy). The training session varied in duration depending on each participant's dexterity with the console (40 – 60 min per chapter). After the training session, participants played 14 chapters in total across two sessions: 7 in Experimental Session 1 and then another 7 in Session 2 (Fig. 1). The gameplay of each session was recorded and stored as a single video file in MP4 format (Chapters 1–7: My Imaginary Friend, First Interview, First Night, Alone, The Experiment, Night Session, Hauntings; Chapters 8–14: The Party, Like Other Girls, Separation, Old Friends, Norah, Agreement, Briefing; see Supplementary Fig. 1).
Retrieval (scanned): Temporal Order Judgment (TOJ) task
The TOJ retrieval task required participants to choose the image that happened earlier in the video game they had encoded. The task was administered inside an MRI scanner, where visual stimuli were presented using E-prime software (Psychology Software Tools, Inc., Pittsburgh, PA) and back-projected via a mirror system to the participant. Each trial was presented for 5 s, during which participants performed the temporal order judgment. They were then allowed 3 s to report their confidence level following the memory judgement. Participants performed the TOJ task using the index and middle fingers of one hand via an MRI-compatible five-button response keyboard (Sinorad, Shenzhen, China). Participants reported their confidence level (“Very Low”, “Low”, “High”, or “Very High”) regarding their own judgment of the correctness of TOJ with four fingers (thumb was not used) of the other hand. The left/right hand response contingency was counterbalanced across participants. Participants were told they should report their confidence level in a relative way and make use of the whole confidence scale. Following these judgments, a fixation cross with a variable duration (1 – 6 s) was presented. Each participant completed 240 trials in each of the two experimental sessions. Before scanning, participants were given 15 practice trials outside the scanner, using paired images extracted from the two additional chapters they had played in the training session, to ensure they understood the task procedure. Participants completed a surprise recognition test outside the scanner after the TOJ task; those data are not reported here.
For the TOJ task, we selected static images from the subject-specific recorded videos which the participants had played the day before. Each second in the video consisted of 29.97 static images (frames). For each game-playing session, 240 pairs of images were extracted from the seven chapters and were paired up for the task based on the following criteria: (1) the two images had to be extracted from either the same chapter or adjacent chapters (Within- vs. Across-chapter condition); (2) the temporal distance (TD) between the two images was matched between the Within- and Across-chapter conditions; (3) in order to maximize the TD, we first selected the second longest chapter of the video and determined the longest TD according to a power function (power = 1.5), at the same time ensuring the shortest TD to be longer than 30 frames. We generated 60 progressive levels of TD among these pairs (each level repeated twice). In sum, three within-subjects factors regarding the TOJ retrieval task were manipulated: (1) 60 TD levels permitting scale-invariance across subjects between two images (see below); (2) Context (two images extracted from either Within- or Across-chapter); (3) TMS stimulation (TMS-precuneus vs. TMS-vertex, see below).
Selection of 60 levels of temporal distances (TDs)
In order to maximize the range of all TDs, we first selected the second longest chapter of the video game and determined the longest TD (L), while ensuring the shortest TD to be longer than 30 frames. The 60 TD levels were selected according to this function, where L denotes the duration of the second longest chapter of the video game in each experimental session, n denotes the TD level, and the values of TDn were rounded to the nearest integer using the “round” function in MATLAB. Note that the actual TDs were different across subjects, but since we applied a power function, the scale was thus rendered invariant (Kwok & Macaluso, 2015). Image-pair extraction from each of the chapters was conducted independently for each subject. The numbers of image pairs extracted from each of the chapters were approximately equal within subjects.
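The exact selection function is not reproduced in this text, so the following is only a hypothetical sketch of one power-law spacing that satisfies the stated constraints (60 levels, exponent 1.5, shortest TD longer than 30 frames, longest TD equal to L). The functional form, the function name `td_levels`, and the 15-minute chapter length are assumptions made for illustration, not the authors' formula.

```python
FPS = 29.97          # frames per second of the recorded video (from the text)
MIN_TD_FRAMES = 31   # shortest TD must be longer than 30 frames
N_LEVELS = 60        # number of TD levels (from the text)
POWER = 1.5          # exponent of the power function (from the text)

def td_levels(longest_td_frames):
    """Hypothetical spacing: TD_n grows as ((n - 1) / 59) ** 1.5 between 31 frames and L."""
    span = longest_td_frames - MIN_TD_FRAMES
    levels = []
    for n in range(1, N_LEVELS + 1):
        frac = ((n - 1) / (N_LEVELS - 1)) ** POWER
        # Python's round() is used here; it differs from MATLAB's round only at exact .5 ties.
        levels.append(round(MIN_TD_FRAMES + span * frac))
    return levels

# Hypothetical second-longest chapter: 15 minutes of video, expressed in frames.
L = round(15 * 60 * FPS)
tds = td_levels(L)
```

Because every level is a fixed fraction of the subject-specific span, two subjects with different chapter durations receive TD sets with the same relative spacing, which is the property the paragraph above attributes to the power function.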
Transcranial magnetic stimulation
TMS procedure and protocol
TMS was applied using a 70 mm Double Air Film Coil connected to a Magstim Rapid2 (The Magstim Company, Ltd., Whitland, UK). In order to localize the target brain regions precisely, we obtained individual anatomical T1-weighted magnetic resonance images and then imported them into BrainSight (Rogue Research Inc., Montreal, Canada) for stereotaxic registration of the TMS coil with the participants’ brain. The position of the coil and the subject’s head were co-registered with BrainSight, and monitored using a Polaris Optical Tracking System (Northern Digital, Waterloo, Canada) during TMS. Positional data for both rigid bodies were registered in real time to a common frame of reference and were superimposed onto the reconstructed three-dimensional MRI images of the subject using BrainSight. The center of the coil was continuously monitored to be directly over the site of interest. For all sites (vertex, precuneus, and motor areas for measuring active motor threshold), the TMS coil was held tangential to the surface of the skull and was placed in a rostro-caudal direction. An adjustable frame was used to hold the TMS coil firmly in place, while the participants rested their heads on the chin rest. Head movements were monitored constantly by BrainSight and were negligible. We measured subjects’ active motor threshold, defined as the lowest TMS intensity delivered over the motor cortex necessary to elicit visible twitches of the right index finger in at least 5 out of 10 consecutive pulses. The location used to determine the active motor threshold was identified with a single pulse of TMS over the motor cortex of the left hemisphere. The TMS coil was systematically moved until the optimal cortical site was located to induce the largest and most reliable motor response; this stimulus output was then recorded.
The TMS intensity was then calibrated at 110% of individual active motor threshold (stimulator output: 75.2 ± 6.9%, mean ± se, range from 63% to 88%, Supplementary Table 1). In Experimental Session 1 and 2, the TMS was applied at a low-frequency rate of 1 Hz with an uninterrupted duration of 20 min.
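As a worked check of the protocol arithmetic, the sketch below computes the stimulation intensity and total pulse count for a continuous train. The function name `rtms_dose` and the example threshold of 60% are hypothetical; only the 110% ratio, the 1 Hz rate, and the 20 min duration come from the text.

```python
def rtms_dose(amt_percent, frequency_hz=1.0, duration_min=20.0, intensity_ratio=1.10):
    """Return (stimulator output in % of maximum, total pulses) for an uninterrupted train."""
    intensity = amt_percent * intensity_ratio
    pulses = int(round(frequency_hz * duration_min * 60))
    return intensity, pulses

# Hypothetical participant with an active motor threshold of 60% stimulator output.
intensity, pulses = rtms_dose(60.0)
```

With the stated parameters, an uninterrupted 20 min train at 1 Hz delivers 1,200 pulses per session.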
TMS stimulation sites
The target stimulation was delivered to the precuneus (Kwok et al., 2012) (MNI x, y, z = 6, -70, 44), whereas the control stimulation was delivered to the vertex. The vertex was defined individually as the point equidistant from the left and right pre-auricular points and equidistant from the nasion and the inion. Due to the folding of the two cerebral hemispheres, the stimulated vertex site lies at a considerable distance from the TMS coil, thereby diminishing the effectiveness of the magnetic pulses. Stimulating the vertex is not known to produce any memory task-relevant effects, and the vertex is regarded as a reliable control site. Stimulation magnitude and protocols in the present study were comparable to those used in similar studies that robustly produced significant memory-related changes by targeting the precuneus (Bonni et al., 2015; Kraft et al., 2015; Mancini et al., 2017) or lateral parietal cortices (Nilakantan et al., 2017; Wang et al., 2014). Immediately after the end of the stimulation, participants performed four runs of the Temporal Order Judgment task in the MRI scanner (delay period between the end of TMS and the beginning of MRI: Mprecuneus = 15.29 min, Mvertex = 20.76 min, t(16) = -0.87, P = 0.4).
MRI data acquisition and preprocessing
Data acquisition
All the participants were scanned in a 3-Tesla Siemens Trio magnetic resonance imaging scanner using a 32-channel head coil (Siemens Medical Solutions, Erlangen, Germany) at ECNU. In each of the two experimental sessions, a total of 1,350 fMRI volumes were acquired for each subject across 4 runs. The functional images were acquired with the following sequence: TR = 2000 ms, TE = 30 ms, field of view (FOV) = 230 × 230 mm, flip angle = 70°, voxel size = 3.6 × 3.6 × 4 mm, 33 slices, scan orientation parallel to AC-PC plane. High-resolution T1-weighted MPRAGE anatomy images were also acquired (TR = 2530 ms, TE = 2.34 ms, TI = 1100 ms, flip angle = 7°, FOV = 256 × 256 mm, 192 sagittal slices, 0.9 mm thickness, voxel size = 1 × 1 × 1 mm).
Preprocessing
Preprocessing was conducted using SPM12 (http://www.fil.ion.ucl.ac.uk/spm). Scans were realigned to the middle EPI image. The structural image was co-registered to the mean functional image, and the parameters from the segmentation of the structural image were used to normalize the functional images that were resampled to 3 × 3 × 3 mm. The realigned normalized images were then smoothed with a Gaussian kernel of 8-mm full-width half maximum (FWHM) to conform to the assumptions of random field theory and improve sensitivity for group analyses (Friston et al., 2002). Data were analyzed using general linear models and representational similarity analyses as described below with a high-pass filter cutoff of 256 s and autoregressive AR(1) model correction for auto-correlation.
Functional MRI data analysis
Parametric modulation analysis
First-level models were performed on the fMRI data collected from the TMS-vertex session only (either all the trials altogether or separately for Across-chapter vs. Within-chapter conditions). In all of these models, each of the 240 trials was modeled with a canonical hemodynamic response function as an event-related response with a duration of 5 s.
For the TMS-vertex session as a whole (Across-chapter and Within-chapter trials collapsed), we performed three parametric modulation analyses (pmod), each with a different combination of modulatory regressors (namely, TD; TD + RT; SC). For the TD pmod, we assigned the actual TD values at encoding as the modulatory parameter, and used the polynomial function up to first order. Several regressors of no interest were also included: 6 head movement regressors and 1 missing trial regressor (i.e., no-response trials; number of missing trials of Across-chapter condition: 5.65 ± 6.96, of Within-chapter condition: 5.29 ± 6.8; n = 17, mean ± sd) and the run mean. The purpose of this analysis was to test for any linear relationship between the TD separating the two images at encoding and the signal intensity in the brain during TOJ retrieval of the same events. For the TD + RT pmod, we aimed to identify the voxels whose activities changed as a function of TD after the removal of the influence of reaction time. Each subject’s RTs corresponding to each TD level were entered as the modulatory parameter, together with the regressors of no interest as above. For the Situational Change (SC) pmod, we tested for any linear relationship between the spatial displacement separating two images at encoding and the signal intensity in the brain during TOJ retrieval of the same pairs of images. We determined the number of situational changes between the paired images by analyzing all the subject-specific videos frame by frame to mark out the boundaries at which a situational location had changed (Supplementary Fig. 4), and entered these video-specific and subject-specific SC values as the modulatory parameter.
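A first-order parametric modulator can be thought of as a trial-wise weight on the event regressor. The sketch below illustrates the TD + RT logic under two assumptions that are common in SPM-style models but are not stated in the text: modulators are mean-centered, and TD is residualized on RT so that only TD variance not shared with RT remains. The function names and trial values are hypothetical, and the convolution with the hemodynamic response function is omitted.

```python
def mean_center(values):
    """Subtract the mean so the modulator is uncorrelated with the main event regressor."""
    m = sum(values) / len(values)
    return [v - m for v in values]

def orthogonalize(target, reference):
    """Remove from `target` the component linearly explained by `reference`.

    Both inputs are expected to be mean-centered already.
    """
    scale = sum(t * r for t, r in zip(target, reference)) / sum(r * r for r in reference)
    return [t - scale * r for t, r in zip(target, reference)]

# Hypothetical per-trial values (not study data).
td = [31.0, 90.0, 400.0, 1500.0, 6000.0]   # temporal distance in frames
rt = [3.1, 2.9, 2.6, 2.2, 2.0]             # reaction time in seconds

rt_c = mean_center(rt)
td_c = mean_center(td)
td_given_rt = orthogonalize(td_c, rt_c)    # TD modulator with RT-shared variance removed
```

In this arrangement, any effect attributed to `td_given_rt` cannot be explained by a linear effect of reaction time, which matches the stated aim of the TD + RT model.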
For the Across-chapter vs. Within-chapter comparison, we also performed three pmod analyses with the same sets of regressors as described above (namely, TD; TD + RT; SC). We looked for changes in brain responses as a linear function of the regressor of interest (i.e., TD or SC). Maps were created by multiple regression analyses between the observed signals and the regressors. The contrast maps from the first-level parametric models were taken to second-level group analyses and entered into one-sample t-tests. The group analyses were performed for each contrast using a random effects model (Penny & Holmes, 2004). The statistical threshold was set at p < 0.05 (FWE corrected) at cluster level and p < 0.001 at an uncorrected peak level according to the SPM12 standard procedure. The activation cluster locations were indicated by the peak voxels on the normalized structural images and labeled using the nomenclature of Talairach and Tournoux (1988).
Searchlight Representational Similarity Analysis (searchlight RSA)
RSA was conducted using the RSA toolbox (http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/) on the fMRI data following realignment and normalization, but without smoothing. In the Across-chapter vs. Within-chapter comparison, each unique TD level was modeled with a separate regressor and was contrasted to produce a T-statistic map (spmT map), creating 120 statistical maps in total (Across- vs. Within-chapter conditions; 2 repetitions for each TD level). For the TMS-vertex vs. TMS-precuneus comparison, we collapsed the trials in the Across- and Within-chapter conditions and generated 60 statistical maps in each of the two sessions (4 repetitions for each TD level). Using searchlight RSA, spherical searchlights with a radius of 9 mm (93 voxels, volume = 2,511 mm³) were extracted from the brain volume. Within each searchlight, the data (i.e., signal intensity) for each of the 60 TD levels were correlated with those of every other level using the Pearson product-moment correlation, and the resulting distances (1 - r) formed a representational dissimilarity matrix (RDM) reflecting the between-condition dissimilarity of the BOLD signal response. These neural RDMs were then Spearman-rank correlated with a set of candidate RDMs (see Supplementary Fig. 2a), each reflecting a different prediction about the information carried by the similarity structure of the neural signal responses, to generate correlational maps (r-maps). Finally, these r-maps were converted to z-maps using the Fisher transformation. All the z-maps were then submitted to a group-level one-sample t-test to identify voxels in which the similarity between the predicted RDM and the observed neural RDM was greater than zero. This allowed us to identify voxels in which information about TD at retrieval might be represented (see Supplementary Fig. 2b). The statistical threshold was identical to that employed in the univariate analysis: p < 0.05 (FWE corrected) at cluster level and p < 0.001 at an uncorrected peak level.
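The computation inside a single searchlight can be sketched as follows. This is a minimal illustration rather than the RSA toolbox code: the function names, the random stand-in data, and the candidate RDM built from hypothetical TD values are assumptions of the example.

```python
import numpy as np

def rankdata(x):
    """Average ranks (tied values share their mean rank), as used by Spearman's rho."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    sorted_x = x[order]
    ranks = np.empty(len(x))
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and sorted_x[j + 1] == sorted_x[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def neural_rdm(patterns):
    """patterns: (n_conditions, n_voxels). Returns 1 - Pearson r between conditions."""
    return 1.0 - np.corrcoef(patterns)

def rdm_similarity_z(neural, candidate):
    """Spearman correlation between the upper triangles of two RDMs, then Fisher z."""
    iu = np.triu_indices_from(neural, k=1)
    r = np.corrcoef(rankdata(neural[iu]), rankdata(candidate[iu]))[0, 1]
    return np.arctanh(r)

# Toy usage: 60 TD levels by 93 voxels of random stand-in data for one searchlight.
rng = np.random.default_rng(0)
patterns = rng.standard_normal((60, 93))
rdm = neural_rdm(patterns)
log_td = np.log(np.linspace(1.0, 60.0, 60))  # hypothetical TD values
candidate = np.abs(log_td[:, None] - log_td[None, :])
z = rdm_similarity_z(rdm, candidate)
```

Repeating this for every searchlight centre yields one z-map per subject and candidate model, which is the input to the group-level one-sample t-test.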
Leave-one-subject-out approach (LOSO), functional and anatomical ROIs
We applied a LOSO approach to create functional ROIs in order to avoid statistical bias (Esterman, Tamber-Rosenau, Chiu, & Yantis, 2010). For instance, in order to identify an ROI (i.e., the conjunction mask in Figure 2d) for Subj01, we re-estimated the contrast using a one-sample t-test on the whole-brain searchlight z-maps obtained from Subj02 to Subj17. Likewise, we re-estimated the contrast using a one-sample t-test on the contrast maps obtained from the pmod analysis of Subj02 to Subj17. We applied the same threshold reported above to extract clusters from these two statistical maps. We next overlaid the two resultant maps and extracted a conjunction region (mask01), which we used to extract the values in the searchlight z-map of Subj01 for further statistical analysis. This procedure was repeated 17 times and generated 17 different ROIs, which provided statistically independent regions from which to extract values for testing differences between conditions (Figure 2e). For the anatomical ROIs (depicted in Figure 3 and Supplementary Figure 8), 7 regions of the AAL template (Hippocampus_L; Hippocampus_R; Precuneus_L; Precuneus_R; Angular_L; Angular_R; Occipital cortex) (Tzourio-Mazoyer et al., 2002) were created as masks. We extracted and averaged the similarity values within these masks for each subject for statistical tests.
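The leave-one-subject-out logic can be sketched as below. The sketch is illustrative only: it replaces the cluster-level FWE threshold with a plain voxel-wise t threshold (`t_thresh`, a hypothetical value), treats each map as a flat vector of voxels, and uses simulated data.

```python
import numpy as np

def one_sample_t(maps):
    """Voxel-wise one-sample t-statistic across subjects; maps: (n_subjects, n_voxels)."""
    n = maps.shape[0]
    return maps.mean(axis=0) / (maps.std(axis=0, ddof=1) / np.sqrt(n))

def loso_roi_values(rsa_z, pmod_con, t_thresh):
    """For each subject, build a conjunction mask from all other subjects and
    average the left-out subject's own searchlight z-values inside that mask."""
    n_subj = rsa_z.shape[0]
    values = np.full(n_subj, np.nan)
    for s in range(n_subj):
        others = np.delete(np.arange(n_subj), s)
        # Simplification: voxel-wise threshold instead of cluster-level FWE correction.
        rsa_mask = one_sample_t(rsa_z[others]) > t_thresh
        pmod_mask = one_sample_t(pmod_con[others]) > t_thresh
        mask = rsa_mask & pmod_mask
        if mask.any():
            values[s] = rsa_z[s, mask].mean()
    return values

# Toy usage: 17 subjects, 500 voxels, simulated effect in the first 50 voxels.
rng = np.random.default_rng(0)
rsa_z = rng.standard_normal((17, 500))
pmod_con = rng.standard_normal((17, 500))
rsa_z[:, :50] += 2.0
pmod_con[:, :50] += 2.0
values = loso_roi_values(rsa_z, pmod_con, t_thresh=3.0)
```

Because each mask is defined without the left-out subject, the extracted value is independent of the data used to select the region.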
Candidate representational dissimilarity matrices (RDM)
Model 1 (TD RDM, 60 × 60): We ranked the differences across the 60 TD levels, from the shortest to the longest TD, for the Within-chapter condition (or the Across-chapter condition; the two RDMs were identical because their TDs were matched). We first log-transformed the subject-specific TD values for each pair of images and then computed the difference between every pair of TD levels, producing 60 × 59/2 = 1,770 values, which were then assigned to the corresponding cells of the RDM.
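A candidate RDM of this kind can be sketched in a few lines. The natural logarithm, the use of the absolute difference, and the evenly spaced TD values are assumptions of the example; the function name `td_rdm` is hypothetical.

```python
import numpy as np

def td_rdm(td_values):
    """Candidate RDM: absolute pairwise difference between log-transformed TDs."""
    log_td = np.log(np.asarray(td_values, dtype=float))
    return np.abs(log_td[:, None] - log_td[None, :])

td = np.linspace(1.0, 60.0, 60)  # hypothetical TD values for the 60 levels
rdm = td_rdm(td)
# Unique off-diagonal cells: 60 * 59 / 2
n_unique = len(rdm[np.triu_indices(60, k=1)])
```

The same pairwise-difference construction applies to any per-level scalar, such as the situational-change counts of Model 2, with the log transform omitted.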
Model 2 (Situational Change RDM, 60 × 60): Since the temporal and spatial dimensions were closely inter-correlated, we checked whether situational change might influence the neural patterns in those voxels that represent the TD information. We analyzed the subject-specific videos frame by frame and marked out the boundaries at which a situational location had changed (see illustration in Supplementary Fig. 4). We then computed the number of situational changes contained between each pair of images and computed the difference between every pair of conditions, producing 60 × 59/2 values, which were then assigned to the corresponding cells of the RDM.
Model 3 (RGB-cross-correlation RDM, 60 × 60), Model 4 (RGB-intensity RDM, 60 × 60) and Model 5 (RGB-histogram RDM, 60 × 60) considered the perceptual characteristics of the images used in TOJ.
For Model 3, the similarity measure was based on the cross-correlation between the two images (each of size 1920 × 1080) for the three color channels (red, green, and blue; RGB). For every pair of images, we computed the cross-correlation coefficient between the pair in each of the three color channels. This is a measure of the displacement of one image relative to the other; the larger the cross-correlation coefficient (which ranges between −1 and 1), the more similar the two images were. We then computed the difference between every pair of conditions, producing 60 × 59/2 values, which were then assigned to the corresponding cells of the RDM.
For Model 4, we computed the pixel-wise difference between the paired images for the three color channels (RGB). This difference is useful when the compared images are taken from a stationary camera with an infinitesimal time difference. The output pixel for each color channel is assigned a value of 1 if the absolute difference between the corresponding pixels in the image pair is non-zero, and a value of 0 otherwise. A single value is generated for each of the three color channels by summing all the output pixel values (either 0 or 1). We averaged the summed differences across the three color channels to obtain the intensity value of each pair of images and then computed the difference between every pair of conditions, producing 60 × 59/2 values, which were then assigned to the corresponding cells of the RDM.
For Model 5, we constructed color histograms for the image pairs and computed the Sum-of-Square-Difference (SSD) error between them for the three color channels (RGB). For each color channel, the intensity values range from 0 to 255 (i.e., 256 bins). We first computed the total number of pixels at each intensity value and then computed the SSD across all 256 bins for each image pair. The smaller the value of the SSD, the more similar the two images of the pair were. We then computed the difference between every pair of conditions, producing 60 × 59/2 values, which were then assigned to the corresponding cells of the RDM. In contrast to Model 4, this approach does not require corresponding pixels in the image pair to be the same, but rather measures whether the same pixel intensities occur in both images.
Overall, the three perceptual-similarity models (3, 4 and 5) use different similarity measures that complement each other; thus any difference in the appearance of the two images, irrespective of the temporal distance, could be accounted for by at least one of the three models. For any two similar images, the RGB-intensity measure yields a very small value because the corresponding pixels are virtually the same across the entire image. The RGB-histogram measure will also yield a small value because the two images will have nearly the same histogram bins. The RGB-cross-correlation value will be close to 1, signifying the similarity of the images. When subsections of a scene are visible in both images with varied brightness, the RGB-cross-correlation value will still be close to 1, but the RGB-intensity value will be very high.
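The three image-pair measures can be sketched as follows. The sketch makes several assumptions not stated in the text: the cross-correlation is taken at zero lag and normalised, the per-channel values of Models 3 and 5 are averaged across RGB (the text specifies averaging only for Model 4), and images are 8-bit arrays of shape height × width × 3. The function names are hypothetical.

```python
import numpy as np

def cross_correlation(a, b):
    """Model 3 sketch: zero-lag normalised cross-correlation per channel
    (range -1 to 1), averaged over the three channels."""
    coefs = []
    for c in range(3):
        x = a[..., c].astype(float).ravel()
        y = b[..., c].astype(float).ravel()
        coefs.append(np.corrcoef(x, y)[0, 1])
    return float(np.mean(coefs))

def intensity_difference(a, b):
    """Model 4 sketch: number of pixels that differ in each channel, averaged over RGB."""
    differs = a.astype(int) != b.astype(int)
    return float(differs.sum(axis=(0, 1)).mean())

def histogram_ssd(a, b):
    """Model 5 sketch: SSD between 256-bin histograms per channel, averaged over RGB."""
    ssd = []
    for c in range(3):
        ha = np.bincount(a[..., c].ravel(), minlength=256).astype(float)
        hb = np.bincount(b[..., c].ravel(), minlength=256).astype(float)
        ssd.append(np.sum((ha - hb) ** 2))
    return float(np.mean(ssd))

# Toy usage: a random image and a copy with its pixels spatially shuffled.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
shuffled = rng.permutation(img.reshape(-1, 3)).reshape(img.shape)
same_hist = histogram_ssd(img, shuffled)          # histograms are unchanged by shuffling
moved_pixels = intensity_difference(img, shuffled)  # but most pixels now differ
```

The shuffled copy illustrates how the measures complement each other: the histogram measure treats the two images as identical, whereas the pixel-wise measure does not.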
Behavioral data analysis
To examine the TD-independent effects of precuneal disruption by TMS on memory performance, we collapsed the 60 TD levels for each subject and entered their percentage correct (ACC) or response times (RT) into repeated-measures ANOVAs with TMS (TMS-vertex vs. TMS-precuneus) and Context (Across-chapter vs. Within-chapter) as within-subjects factors.
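For a 2 × 2 within-subjects design such as this one, each effect has one degree of freedom, so its F(1, n − 1) statistic equals the squared one-sample t-statistic of the corresponding per-subject contrast. The sketch below uses that equivalence; the function name, the array layout, and the simulated accuracy data are assumptions of the example, not the analysis code used in the study.

```python
import numpy as np

def rm_anova_2x2(data):
    """data: (n_subjects, 2, 2) array indexed [subject, TMS site, context].
    Returns the F(1, n-1) statistic of each effect as the squared one-sample t
    of the matching per-subject contrast."""
    contrasts = {
        "TMS": data[:, 0, :].mean(axis=1) - data[:, 1, :].mean(axis=1),
        "Context": data[:, :, 0].mean(axis=1) - data[:, :, 1].mean(axis=1),
        "TMS x Context": (data[:, 0, 0] - data[:, 0, 1]) - (data[:, 1, 0] - data[:, 1, 1]),
    }
    n = data.shape[0]
    return {
        name: float((c.mean() / (c.std(ddof=1) / np.sqrt(n))) ** 2)
        for name, c in contrasts.items()
    }

# Toy usage: 17 simulated subjects with lower accuracy in the second TMS condition.
rng = np.random.default_rng(0)
acc = rng.normal(0.8, 0.05, size=(17, 2, 2))
acc[:, 1, :] -= 0.2
f_values = rm_anova_2x2(acc)
```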
Data and code availability
Data and code are available upon request.
ACKNOWLEDGMENTS
This research is sponsored by the National Natural Science Foundation of China 31371052 (Y.H.), by the Ministry of Education of PRC Humanities and Social Sciences Research grant 16YJC190006, STCSM Shanghai Pujiang Program 16PJ1402800, STCSM Natural Science Foundation of Shanghai 16ZR1410200, Large Instruments Open Foundation (ECNU), and NYU Shanghai and the NYU-ECNU Institute of Brain and Cognitive Science at NYU Shanghai (S.C.K.).