Abstract
Learning requires changing the brain. This typically occurs through experience, study, or instruction. We report a new way of acquiring conceptual knowledge by directly sculpting activity patterns in the human brain. We used a non-invasive technique (closed-loop real-time functional magnetic resonance imaging) to create novel categories of visual objects in the brain. After training, participants exhibited behavioral and neural biases for the sculpted, but not control categories. The ability to sculpt new conceptual distinctions in the human brain, applied here to perception, has broad relevance to other domains of cognition such as decision-making, memory, and motor control. As such, the work opens up new frontiers in brain-machine interface design, neuroprosthetics, and neurorehabilitation.
One Sentence Summary: Sculpting new visual categories in the brain with neurofeedback restructured subjective experience and neural processing.
Main Text
“For if someone were to mold a horse [from clay], it would be reasonable for us on seeing this to say that this previously did not exist but now does exist.”
Mnesarchus of Athens, ca. 100 BCE (1)
Humans continuously learn through experience, both implicitly (e.g., through statistical learning; 2,3) and explicitly (e.g., through instruction; 4,5). Brain imaging has provided insight into the neural correlates of acquiring new knowledge (6) and learning new skills (7). As humans learn to group distinct items into a novel category, neural patterns of activity for those items become more similar to one another and, simultaneously, more distinct from patterns of other categories (8–10). We hypothesized that we could leverage this process using neurofeedback to devise a fundamentally new way for humans to acquire conceptual knowledge. Specifically, sculpting patterns of activity in the human brain (‘molding the neural clay’) that mirror those expected to arise through learning of new visual categories may lead to enhanced perception of the sculpted categories (‘they now exist’), relative to similar, control categories that were not sculpted. To test this hypothesis, we implemented a closed-loop system for neurofeedback manipulation (11–18) using functional magnetic resonance imaging (fMRI) measurements recorded from the human brain in real time (every 2s) and used this method to create new neural categories for complex visual objects.
Using radial frequency components of an image (19,20), we generated a two-dimensional manifold of closed-contour shape stimuli that varied smoothly in appearance as a function of distance from a center shape (Fig. 1A). In this manifold, each of six equally spaced diameters (AG, BH, CI, DJ, EK, FL) defined a novel and arbitrary category boundary between the two groups of shapes on either side. We verified that the shapes were perceived similarly across categories in a psychophysical experiment with 10 participants using a self-paced two-alternative-forced-choice (2AFC) task. For each diameter through the space, we presented shapes along the diameter and participants judged which of the two “endpoint” shapes (e.g., A vs. G) more closely resembled the presented shape. We then used these categorization data to compute a psychometric function (Fig. 1B-G). Participants did not have a priori biases in how they categorized items across multiple partitions (psychometric function slope, repeated measures ANOVA, factor=‘direction’; F(5,54)=0.29, p=0.918).
(A) Stimuli were constructed using combinations of seven radial frequency components (RFCs) of fixed frequency F and phase P (19,20). We parametrically varied the amplitude A of two of the seven RFCs independently to generate a two-dimensional manifold of complex shapes: a displacement of 1 arbitrary distance unit (a.d.u.) on the manifold corresponded to an amplitude change of 6 arbitrary units for the 1.11 Hz component and an amplitude change of 3 arbitrary units for the 4.94 Hz component (changes in component appearance shown above and to the right of the shape space). Each point on this manifold corresponded to a unique shape that could be computed using the mathematical convention described above. Example shapes at 8 a.d.u. from the center are enlarged and shown on the outer edge (e.g., A, B, etc.); they lie 4.2 a.d.u. apart from each other on the manifold. (B-G) Psychometric function estimates for the 2AFC behavioral experiment conducted for each of six diameters: AG, BH, CI, DJ, EK, and FL. Thin lines in distinct colors represent psychometric functions for each participant (n=10) and thick black lines represent participant averages for each diameter.
Using this set of complex visual objects, we trained a new group of 10 participants over 10 days each in a real-time fMRI experiment. We used a closed-loop neurofeedback procedure to sculpt a categorical distinction between the objects on either side of one diameter through shape space. By sculpting divided neural representations for these objects, we sought to change how they are perceived (Fig. 2A). On Day 1, participants performed a 2AFC behavioral task (identical to the one used to norm the stimulus space) to obtain a baseline measure (psychometric function slope) of how they categorized stimuli along each diameter in the space. We verified that the training cohort of participants did not have any biases across categorization directions and that there were no significant differences between the training and norming cohorts (psychometric function slope, repeated measures ANOVA, factors=‘direction’, ‘cohort’; interaction: F(5,108)=0.19, p=0.964; main effect of direction: F(5,108)=0.57, p=0.722; main effect of cohort: F(1,108)=1.71, p=0.194) (Fig. S1). On Days 2-3, we ran two fMRI sessions in which we mapped how each individual participant’s brain represented the stimulus space, defined candidate brain regions for neurofeedback, and built a model of how the shape space was represented in participants’ brains to use for real-time tracking during the training phase of the experiment. After Day 3, we randomly selected one of the diameters as the category boundary to be sculpted during training for each participant (not disclosed to them). On Days 4-9, participants underwent real-time fMRI neurofeedback training, during which they were shown wobbling versions of the shapes (Movie S1) and asked to generate a mental state that would stabilize the shape in their visual display. 
Unbeknownst to the participants, they were given positive visual feedback (reduced wobbling) and monetary rewards when the neural representation of the shape they were viewing resembled the neural representations of other shapes in the category selected to be sculpted during training. Finally, on Day 10, participants repeated a version of the 2AFC behavioral task, in which we evaluated changes in categorical perception across both sculpted and untrained category boundaries.
(A) The experiment extended over 9-10 days for each participant: one behavioral pre-test (2AFC along each diameter from Fig. 1), two localizer fMRI sessions, 5-6 real-time fMRI neurofeedback training sessions (5 days: n=2, 6 days: n=8), and a final behavioral post-test (repeat of first day). (B) 25 shapes shown during the localizer scans were used to define parametric neural representations of the shape space: 5 shapes along each diameter, −8, −4, 0, +4, +8 a.d.u. from the center shape common to all diameters. (C) Ideal representational similarity matrix (RSM) for a neural parametric representation (left) and average RSM for the 25 shapes in the final neurofeedback region of interest (ROI) of the training cohort (n=10, right). The correlation between the ideal and observed RSMs was very high (r=0.971), suggesting strong parametric representation of the shape space in the ROI. (D) Cortical map of the neurofeedback ROI for an example participant encompassing several brain regions, including extrastriate visual cortex, parahippocampal gyrus, hippocampus, and medial frontal gyrus (see Methods for details of the selection procedure). Cortical maps for all participants (n=10) are shown in Figs. S3–S12.
We defined a region of interest (ROI) to target with neurofeedback by running two block-design fMRI localizer sessions (Days 2-3) in which we showed participants 12-15 repetitions of 49 shapes spanning the two-dimensional stimulus manifold. We constructed an ideal representational similarity matrix of a subset of 25 of these shapes (Fig. 2B) and performed a cortical searchlight analysis (21) to find all brain regions that represented the shape space parametrically (i.e., akin to a cognitive map (22,23), Fig. 2C,D). We defined the target neurofeedback ROI for each participant as the union of all their parametric representation voxels, excluding early visual cortex (780-2,209 voxels per participant; average 1,401±115 s.e.m. voxels; Figs. 2C & S2). A representative ROI is shown in Fig. 2D and all participants’ ROIs are shown in Figs. S3–S12. We hypothesized that if any portion of these ROIs (either individually or in concert) were causally related to the categorical perception of our stimuli, then sculpting novel neural categories across all of them collectively would maximize the chances of influencing participants’ perception after training.
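The logic of the ideal-RSM comparison can be illustrated with a minimal Python sketch (the study's analyses were implemented in Matlab). It assumes the 25 shape coordinates of Fig. 2B, with diameters spaced 30 degrees apart (as implied by the 4.2 a.d.u. spacing between adjacent endpoint shapes), and takes similarity to fall off linearly with distance in shape space; the exact similarity transform used in the paper is not reproduced here.

```python
import numpy as np

def shape_coords():
    """Coordinates (in a.d.u.) of the 25 localizer shapes: the shared center
    plus points at -8, -4, +4, +8 a.d.u. along six diameters 30 deg apart."""
    pts = [(0.0, 0.0)]
    for k in range(6):
        ang = np.deg2rad(30.0 * k)
        for d in (-8.0, -4.0, 4.0, 8.0):
            pts.append((d * np.cos(ang), d * np.sin(ang)))
    return np.array(pts)  # shape (25, 2)

def ideal_rsm(coords):
    """Ideal similarity matrix: similarity decreases linearly with pairwise
    distance in shape space (one simple choice among several plausible ones)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return 1.0 - d / d.max()

def rsm_fit(ideal, observed):
    """Pearson correlation between the upper-triangular (off-diagonal)
    entries of the ideal and observed RSMs."""
    iu = np.triu_indices_from(ideal, k=1)
    return np.corrcoef(ideal[iu], observed[iu])[0, 1]
```

Applying `rsm_fit` to the ideal matrix and a matrix of observed pattern similarities in a candidate region yields a statistic analogous to the r=0.971 reported for the final neurofeedback ROI.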
Using this ROI and the fMRI data from Days 2-3, we selected an arbitrary diameter in the stimulus space as a category boundary for each participant (Fig. 3A). They were not informed that this was a study about visual categories, that the shapes were drawn from a continuous circular space bisected by diameters, or which diameter was selected as the boundary. We built a model of the neural representations of the two resulting categories of shapes. Each category was modeled as a multivariate Gaussian distribution with shared covariance (Fig. 3B). The parameters of the category-specific distributions were computed using maximum-likelihood estimation for the top 100-200 principal components of the signal elicited by that category’s shapes in the neurofeedback ROI (we used a grid search to determine the optimal number of components and hemodynamic lag for each participant). To verify that our model could predict the distinction between shape categories in the brain, we inverted it into a discriminative log-likelihood-ratio-based pattern classifier. Decoding accuracy was high (chance = 50%): 73%-81% in the lateral occipital (LO) region (defined using FreeSurfer; 24), known to represent distinctions between closed contour objects (25), and 71%-79% in our neurofeedback ROI (Fig. 3C). Given that our ability to induce neural plasticity using neurofeedback is predicated on the accurate decoding of shape category during training, this measure was also used as a selection criterion for inviting participants back for training after Day 3 (criterion: >70% decoding accuracy for both LO and neurofeedback ROI; 7 participants excluded; Table S1).
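The category model can be illustrated with a minimal sketch: two multivariate Gaussians with class-specific means and a pooled (shared) covariance, inverted into a log-likelihood-ratio classifier. This is a simplified stand-in for the full pipeline (no PCA step, no hemodynamic-lag grid search), and the small ridge term is an assumption added for numerical stability.

```python
import numpy as np

class SharedCovGaussianLLR:
    """Two-category model: each category is a multivariate Gaussian with its
    own mean and a covariance matrix pooled across the two categories."""

    def fit(self, X1, X2):
        """X1, X2: (trials x features) activity patterns for each category."""
        self.mu1, self.mu2 = X1.mean(axis=0), X2.mean(axis=0)
        n1, n2 = len(X1), len(X2)
        # Pooled (shared) covariance estimate across both categories.
        S = ((X1 - self.mu1).T @ (X1 - self.mu1)
             + (X2 - self.mu2).T @ (X2 - self.mu2)) / (n1 + n2 - 2)
        S += 1e-6 * np.eye(S.shape[0])  # ridge term (assumed) for stability
        self.prec = np.linalg.inv(S)
        return self

    def llr(self, x):
        """log p(x | cat 1) - log p(x | cat 2); shared normalizers cancel."""
        d1, d2 = x - self.mu1, x - self.mu2
        return 0.5 * (d2 @ self.prec @ d2 - d1 @ self.prec @ d1)

    def predict(self, x):
        return 1 if self.llr(x) > 0 else 2
```

Because the covariance is shared, the LLR reduces to a linear function of the input pattern, which keeps per-timepoint evaluation cheap enough for real-time use.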
(A) Unbeknownst to participants, we chose an arbitrary category boundary (blue) and sought to sculpt separate neural representations of shape categories along the perpendicular direction (purple). For example, the neural representations of shapes Q1 and Q2 were sculpted to be increasingly distant from the category boundary, illustrated by P1 and P2, respectively. (B) Two-dimensional schematic of neural representations of shape categories (the actual neural space comprised 100-200 dimensions). We sculpted the neural representations of shapes to become more similar to the corresponding attractor regions (Att) for categories 1 and 2. Arrows indicate relative change in log-likelihood ratio (LLR) required to reach the goal. (C) Gaussian model accurately predicted shape category in all participants during localizer scans (Days 2-3, 75.7%±0.854% s.e.m.; chance=50%, *p<0.001). We observed no significant differences between trained (sculpted) and untrained categories. (D) Feedback was given if neural evidence for the category of the shape on the screen (e.g., Q1) exceeded a given threshold in the distribution of LLR values for all shapes shown during the localizer scans. LLR histograms with 75% thresholds for an example participant are shown here (see Fig. S13 for all participants).
During neurofeedback training (Days 4-9), participants were told that during each trial they would see a continuously wobbling shape and had to “Generate a mental state that will make the shape wobble less or even stop!” Participants received additional instructions and were incentivized with a monetary reward for each trial in which they made progress, but were not told that feedback was based on the shape of the object, much less its relationship to the selected category boundary (see Methods for details). For each neurofeedback trial, we selected a random shape (seed) drawn from one of the two categories and generated a continuous oscillation in parameter space centered at that shape’s coordinates. Visually, participants saw shapes morphing gradually between the seed shape and other nearby, similar shapes (Movie S1). The apparent magnitude of the shape morph on the screen was manipulated via neurofeedback at every fMRI timepoint (TR=2s). Positive feedback (less oscillation) was given if the neural evidence for the category of the current shape exceeded a given threshold in the distribution of LLR values under our estimated Gaussian distributions for all shapes shown during the localizer scans (Figs. 3D, S13, & S14, Movie S2). The initial threshold was selected based on pilot scans indicating that the neural model’s decoding performance decayed as a function of wobble intensity (Fig. S15). The threshold was continually adjusted during the experiment using an adaptive procedure designed to provide feedback on approximately 33% of the trials (Table S2). Each participant (n=10) completed 520–800 trials of closed-loop real-time neurofeedback training over the course of 5-6 daily sessions. A summary of feedback thresholds and performance for each participant, training session, and run is given in Table S3.
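The exact adaptive rule is not given above, but a minimal percentile-threshold controller targeting a ~33% feedback rate might look like the following Python sketch (the step size and bounds are placeholders, not values from the study):

```python
def update_threshold(pct, feedback_rate, target=1.0 / 3.0, step=5.0,
                     lo=50.0, hi=95.0):
    """Nudge the LLR percentile threshold up when feedback was given too
    often and down when it was too rare, clamped to sensible bounds."""
    if feedback_rate > target:
        pct += step  # feedback too frequent: make the criterion stricter
    elif feedback_rate < target:
        pct -= step  # feedback too rare: make the criterion more lenient
    return max(lo, min(hi, pct))
```

Run after each training run, such a rule keeps the task challenging but achievable, so that feedback remains informative across sessions and participants.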
After finishing the training sessions, we evaluated whether neural sculpting of the novel visual categories was successful; that is, whether the closed-loop real-time neurofeedback procedure induced neural plasticity of the shape space representation. To measure changes in the neural representations of shape categories, we computed the difference between the average LLR of shapes during the first two days of training versus the last two days of training. We compared this score between the “trained” shape categories that we expected to have separated in the brain because they were bisected by the selected diameter, versus the “untrained” shape categories that would have been created by the perpendicular diameter, for which no separation was expected (Fig. 4A). We found strong positive neural sculpting effects for 6 out of 10 participants (between +0.342 and +1.204) and weak negative effects for 4 out of 10 participants (between −0.212 and −0.067) (Fig. 4D). From the first two days to the last two days of training, we found a reliable difference in the LLR change between trained and untrained categories (Fig. 4B-D; +0.369±0.154 s.e.m; t(9)=2.27, p=0.049). Within this interaction, LLR showed an increase for the trained categories (Fig. 4B; +0.234±0.077 s.e.m., t(9)=2.87, p=0.019) and a numerical but nonsignificant decrease for the untrained categories (Fig. 4B; −0.136±0.087 s.e.m., t(9)=1.48, p=0.174). Effects of neural sculpting on neural representations for each individual participant are shown in Fig. S16A. These results were obtained using the LLR for the first timepoint of each neurofeedback trial to avoid potential repetition-suppression effects (26). However, similar results were obtained using the LLR from each of the up-to-five timepoints per trial during which the participants potentially received feedback (Table S4). Similar results were also observed using a ratio of the LLR for trained and untrained categories, instead of a difference score (Fig. S17).
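The per-participant sculpting effect described above amounts to a difference of differences — the change in mean LLR across training for the trained boundary minus the corresponding change for the untrained boundary — sketched here in Python:

```python
import numpy as np

def sculpting_score(trained_first, trained_last,
                    untrained_first, untrained_last):
    """Neural sculpting effect: (late - early) change in mean LLR for the
    trained category boundary minus the same change for the untrained one.
    Each argument is a sequence of per-trial LLR values."""
    d_trained = np.mean(trained_last) - np.mean(trained_first)
    d_untrained = np.mean(untrained_last) - np.mean(untrained_first)
    return d_trained - d_untrained
```

Positive scores indicate that the trained boundary strengthened relative to the untrained one, as for the 6 of 10 participants with effects between +0.342 and +1.204.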
Together, these results suggest that our real-time fMRI neurofeedback neural sculpting procedure was successful in manipulating how the human brain represents complex objects from the multidimensional shape space we constructed.
(A) Shape space with example category boundaries. Diameters for trained category distinction: LLR = blue, psychometric function slope = average of yellow lines. Diameters for untrained category distinction: LLR = purple, psychometric function slope = average of red lines. (B) Effects of neural sculpting on neural representations and perception for trained and untrained categories: differences in LLR and psychometric function slopes (colors as in subpanel A; *p<0.05). (C) Changes in the brain due to neural sculpting predict perceptual changes. (D) Change in LLR between the last two days and the first two days of training for individual participants. Positive values indicate stronger neural boundaries in trained vs. untrained categories. (E) Change in psychometric function slope for trained vs. untrained categories between behavioral pre- and post-tests for individual participants. Positive values indicate stronger categorical perception for trained versus untrained categories.
Our goal was not only to modify the brain but also to test the hypothesis that neural sculpting is sufficient to alter behavior. Namely, we predicted that training would induce categorical perception whereby shapes close to the category boundary come to be perceived as clearer category members. Thus, we hypothesized that the slope of the psychometric function running perpendicular to the sculpted boundary would become steeper. To measure these perceptual changes, we estimated psychometric functions for the trained and untrained categories from the 2AFC behavioral task conducted on Day 10 and calculated a normalized difference score from the slopes on Day 1 (Fig. 4A). The discrimination slope significantly increased for the trained categories compared to the untrained categories (Fig. 4B-E; +0.137±0.049 s.e.m., t(9)=2.79, p=0.021). We observed strong positive behavioral effects for 6 out of 10 participants (0.137-0.414), a weak positive effect for one participant (0.021) and weak negative effects for 3 out of 10 participants (−0.062, −0.033, −0.008). Estimated psychometric functions for all tests, directions, and participants are shown in Fig. S16B & Table S5.
We interpret this induced categorical perception as resulting from the neural sculpting of the categories. Accordingly, the increase in behavioral discrimination slope should be related to the increase in neural separation of the category representations. Indeed, across participants there was a significant positive relationship between training-related behavioral and neural changes: the relative increase in LLR between neural representations of shapes from distinct categories in the neurofeedback ROI was correlated with the relative strengthening of perceptual categorization in the trained direction compared to the untrained direction (Fig. 4C; Pearson r=0.761, p=0.011); a similar result was obtained using the LLR ratio measure (Fig. S17D; r=0.686, p=0.029).
The success of our neural sculpting procedure may be related to individual variation in the ability to decode category information from neural activity patterns. Specifically, category decoding was the basis of neurofeedback and thus better performance would lead to more precise feedback. To test this relationship, we defined a baseline neural decoding measure for each participant as the model decoding accuracy in the neurofeedback ROI during the two localizer scans (Days 2-3). As expected, baseline decoding was high for all participants (75.7%±0.854% s.e.m., chance=50%, Fig. 3C), and no significant difference was observed at baseline between trained (78.1%±1.46% s.e.m.) and untrained (75.3%±0.875% s.e.m.) categories (t(9)=1.12, p=0.293). However, collapsing across categories, baseline decoding was highly correlated with the neural LLR change (r=0.720, p=0.019) and the behavioral slope change (r=0.777, p=0.008) from pre- to post-training. Interestingly, these correlations persisted (LLR, r=0.710, p=0.021; slope, r=0.729, p=0.017) even after excluding the trained direction from the baseline decoding average, suggesting that the precision of shape representations (or our ability to recover them) can be considered as a general property of individuals that can be used to predict training success. Moreover, these findings provide strong post-hoc support for our decision to focus neurofeedback training on participants with high baseline neural decoding.
Our results cannot readily be explained by differences in the amount of feedback received by each participant, either overall, or for each individual category. Because of our adaptive feedback procedure, each participant received a similar amount of feedback as a proportion of total experiment trials (28.7%±0.657% s.e.m.) and for each stimulus category (Category 1: 14.9%±1.74% s.e.m.; Category 2: 13.9%±1.43% s.e.m.; t(9)=0.305, p=0.767). Furthermore, our results cannot be explained by participants explicitly learning the random category distinction enforced during the training. At the conclusion of the study (but before being fully debriefed), we informed participants that the stimuli came from an unspecified number of categories. When asked to guess freely, participants reported an average of 3.3 categories whose boundaries did not coincide with those of the categories randomly selected for each of them in the experiment (Fig. S18 & Table S6). When subsequently forced to split the stimulus space into 2 categories using a straight line, they were unable to correctly identify the category boundary, although their performance suggests a trend toward having some crude information (Figs. S18, S19 & Table S6; trained=0°, average absolute displacement=31.5°±5.89° s.e.m; untrained=90°, average absolute displacement=58.5°±5.89° s.e.m.; random choice=45°, t(9)=2.18, p=0.058). However, any such information about categories cannot readily be explained by implicit learning or priming (27,28), given that the training included an equal number of stimuli from both categories, these stimuli were presented in a random order during training, and there were no differences in the amount of feedback received for stimuli of each category.
In sum, neural sculpting with real-time fMRI puts forward two key advances. First, prior correlational studies have linked distributed patterns of fMRI activity to the perception of objects from different categories (29–31). By sculpting new neural representations and inducing perceptual changes, we provide causal evidence that these representations are sufficient for categorical perception. Second, prior neurofeedback studies focused exclusively on reinforcing existing neural representations of visual features or categories (11,12). In contrast, here we use neurofeedback for the more radical goal of sculpting categories that did not previously exist in the brain. Together, our findings broaden the possibility for non-invasive causal intervention in humans with fMRI neurofeedback, including the distant possibility of sculpting more extensive knowledge or complex concepts in the human brain, bypassing experience and instruction.
Previous studies have shown that learning novel categories induces increased within-category similarity (9,10), as well as neural and perceptual suppression of task-irrelevant features (32,33). Consistent with these findings, the observed differences between trained and untrained categories (both neurally and behaviorally) reflected numerical increases in sensitivity for trained categories and numerical decreases in sensitivity for untrained categories (Figs. 4B & S16). Additional work is required to disambiguate the contributions of these two opposite effects and their interaction with perceptual change within the context of neural sculpting. Furthermore, we sculpted visual categories within a feedback ROI that comprised multiple disparate brain regions, tailored to the individual response of each participant during independent localizer scans. Our approach was designed to maximize the possibility that our manipulation would bring about causal behavioral change, but leaves open the question of which individual brain region(s) are necessary and sufficient to influence in order to induce perceptual change. This is a question for future research that our neurofeedback sculpting technique is uniquely suited to answer.
We found that participants who showed higher baseline levels of neural shape decoding also showed larger training effects, suggesting that differences in either the precision of the neurofeedback signal or the quality of the underlying shape space representation may affect training outcomes. Different training outcomes across our cohort may also be due to individual differences in factors such as attentional control (12), ability to access and modify neural activity under an external constraint (i.e., non-responders; 34), or differential plasticity across brain regions (35). Given the open-ended nature of our instructions for the neurofeedback trials (‘Generate a mental state that will make the shape wobble less!’), it is possible that differences in strategies employed by the participants may have played a role in achieving success. Consistent with this, participants reported extremely varied approaches to the task, the most common of which involved naming the shapes and/or focusing on local features (e.g., an indentation) instead of the entire shape. Although no consensus strategy could be identified, nor one that could explain either the neural or behavioral outcomes (Table S6), it is possible that further work could identify such strategies.
Our findings suggest that neural sculpting may also be an effective tool for influencing and potentially enhancing the neural effects of other learning processes that involve neural differentiation. For example, in the visual domain, our technique may prove useful for enhancing neural differences associated with domain-level expertise for categorization or recognition (e.g., in the fusiform face area, FFA; 36,37) or brain-based education initiatives that complement classical learning paradigms (38). Conversely, our method may provide novel avenues to reverse de-differentiation of neural patterns stemming from aging (39) or from disorders impairing the natural function of visual brain regions such as visual agnosia (40) or prosopagnosia (41). Beyond the visual domain, previous studies have shown that other forms of neurofeedback aimed at enhancing or suppressing activity or connectivity of regions of interest (11,42) can be used to treat various neuropsychiatric disorders, such as major depressive disorder (43–47) and autism spectrum disorder (48). Our work provides a potential way to enact more complex interventions in patient populations that would attempt to sculpt specific patterns of brain activity within regions of interest, in order to neurally mimic (or increase alignment with) neural activity patterns of healthy controls. Finally, our work opens the possibility for novel neurorehabilitation approaches, including brain-machine interfaces (49) and neuroprosthetics (50) that rely on generating or maintaining a specific multivariate pattern of brain activity in real-time.
To conclude, the work presented here represents a new, non-invasive approach to investigating the causal relationship between neural representations and behavior using fMRI. More than 2,100 years after Mnesarchus of Athens made his observations about the nature of human experience (1), we show that his philosophical insights may, in fact, adequately describe how perception arises from human neural representations: sculpt a concept in the neural clay of the brain and it will subsequently exist.
Funding
John Templeton Foundation, Intel Corporation, NIH Award R01 MH069456.
Author contributions
All authors contributed equally to conceptualizing the research and the paradigm. M.C.I. and V.J.H.R. collected the data. M.C.I. performed the data curation, formal analysis, validation, and visualization of the results with input from all other authors. M.C.I. wrote the original draft of the manuscript and all authors contributed substantially to reviewing and editing the final written version of the work.
Competing interests
Authors declare no competing interests.
Data and materials availability
Before publication, data and code are available upon reasonable request from the corresponding author (M.C.I.). Upon publication, data and code will immediately be made publicly available through the OpenNeuro neuroscience data sharing platform, as well as on the corresponding author’s website (www.MariusCatalinIordan.com) and GitHub.
Supplementary Materials
Materials and Methods
Building and Norming the Shape Stimulus Space
We generated complex visual shapes defined by seven radial frequency components (RFCs) (19–20) (Fig. 1A). To obtain each shape, sine waves determined by the seven RFCs were added together and the resulting wave was wrapped around a circle to obtain a closed contour which was then filled in to create a shape. We chose this technique for building a complex visual stimulus space because prior work has shown that radial shapes generated using subsets of the same RFCs are perceived monotonically and that their neural representation is also monotonically related to parametric changes in the amplitude of the RFCs across multiple brain regions (19,51). We created a two-dimensional shape space by independently varying the amplitude of two of the seven RFCs (from 12.6 to 36.6 for the 4.94 Hz component and from −6.0 to +42.0 for the 1.11 Hz component), while holding the amplitudes of all other components constant. Distance in this two-dimensional space was computed via changes in amplitude on each axis independently: a displacement of 1 arbitrary distance unit (a.d.u.) corresponded to a change in amplitude of 6 arbitrary units for the 1.11 Hz component and a change in amplitude of 3 arbitrary units for the 4.94 Hz component. Using this mathematical convention allowed us to generate novel visual shapes for any arbitrarily chosen point within this two-dimensional manifold.
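As a concrete illustration of this construction, the Python sketch below (the study itself used Matlab) sums sinusoidal radius modulations and wraps them around a circle, and maps manifold coordinates to the two varied amplitudes. The base radius and the frequency convention (treated here as cycles per revolution) are placeholders; the center amplitudes are the midpoints of the ranges stated above.

```python
import numpy as np

def rfc_contour(freqs, amps, phases, base_radius=100.0, n_points=1000):
    """Closed contour from radial frequency components: the radius at each
    polar angle theta is a base radius plus a sum of sinusoidal modulations."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    r = base_radius + sum(a * np.sin(f * theta + p)
                          for f, a, p in zip(freqs, amps, phases))
    # Wrap the modulated radius around a circle to get x, y contour points.
    return r * np.cos(theta), r * np.sin(theta)

def manifold_to_amps(u, v):
    """Map manifold coordinates (a.d.u.) to amplitudes of the two varied RFCs:
    1 a.d.u. = 6 amplitude units (1.11 Hz component) and 3 units (4.94 Hz
    component); (18.0, 24.6) are the midpoints of the stated amplitude ranges."""
    return 18.0 + 6.0 * u, 24.6 + 3.0 * v
```

For instance, `rfc_contour([2, 3], [5.0, 3.0], [0.0, 0.0])` yields a smooth closed contour with two superimposed lobed modulations; filling the resulting polygon produces a shape of the kind shown in Fig. 1A.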
To map the stimulus space perceptually, we chose shapes that sat at ±2, ±4, ±6, and ±8 arbitrary distance units from a fixed center shape along 6 equally spaced radial directions (30 degrees apart; AG, BH, CI, DJ, EK, FL; Fig. 1A). This yielded 9 equally spaced shapes along each of 6 radial directions, including the common center shape (e.g., A and G are the 8 a.d.u. shapes from center for the AG direction). We then recruited 16 healthy individuals (‘norming cohort’) from the Princeton community with normal or corrected-to-normal vision who provided informed consent to a protocol approved by the Princeton University Institutional Review Board and were compensated ($12 per hour) to participate in a self-paced two-alternative-forced-choice behavioral experiment. For each radial direction, participants were told that the endpoints (e.g., shapes A and G, 8-distance-units away from center, shown continuously on the left and right of the screen) correspond to two different categories of shapes and they were instructed to use this information to categorize new shapes that appeared in the center of the screen. New shapes were drawn randomly from the 9 shapes chosen for that particular radial direction (see above) such that 40 repetitions of all 9 shapes were shown in each run of the experiment. To ensure that categorization effects were not influenced by the left/right placement of the endpoints on the screen, each participant performed two separate runs for each radial direction (e.g., AG vs. GA) and the order of the runs and assignment of endpoints to each run was counterbalanced across participants. All shapes subtended 5 degrees of visual angle on the screen and the experiment was run entirely using Matlab (version 2016a) and PsychToolbox software (52).
The endpoint shapes (identical to one of the ‘categories’) were also included as catch-trials and we excluded from the analysis six participants who incorrectly categorized more than 10% of the endpoint shapes for any given direction (criterion: >16 out of 160; 2 participants excluded: 29/160 and 67/160) or more than 5% of the endpoint shapes overall across all directions (criterion: >48 out of 960; 4 participants excluded: 77/960, 84/960, 93/960, and 456/960).
For the remaining 10 participants, for each radial direction, we computed the probability of categorizing each individual shape as each of the two endpoint shapes and fitted a corresponding sigmoid psychometric function (estimated slope and threshold; Fig. 1B-G):
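A fit of this kind can be sketched as follows, assuming a standard two-parameter logistic form for the psychometric function (an assumption; the exact functional form is not specified here). The data are synthetic.

```python
# Illustrative psychometric fit, assuming the logistic form
# p(x) = 1 / (1 + exp(-slope * (x - threshold))); synthetic stand-in data
# for one radial direction (40 repetitions per shape, as in the text).
import numpy as np
from scipy.optimize import curve_fit

def psychometric(x, slope, threshold):
    # Probability of choosing one endpoint category at position x (a.d.u.).
    return 1.0 / (1.0 + np.exp(-slope * (x - threshold)))

positions = np.array([-8, -6, -4, -2, 0, 2, 4, 6, 8], dtype=float)
true_slope, true_threshold = 0.8, 0.5
rng = np.random.default_rng(0)
p_observed = rng.binomial(40, psychometric(positions, true_slope,
                                           true_threshold)) / 40.0

# Fit slope and threshold by nonlinear least squares.
(slope_hat, threshold_hat), _ = curve_fit(psychometric, positions,
                                          p_observed, p0=[1.0, 0.0])
```

A steeper recovered slope indicates a sharper categorical transition along that direction.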
To check that the stimulus space was perceived similarly across all 6 radial directions, we performed a repeated measures ANOVA on the slopes of the resulting psychometric functions with ‘direction’ as a factor.
All participants in the main experiment (see below, n=10, ‘training cohort’) took part in an identical self-paced two-alternative-forced-choice behavioral experiment on Day 1 (see below) of the study. We performed a repeated measures ANOVA on the slopes of the resulting psychometric functions with ‘direction’ and ‘cohort’ as factors to verify that no significant differences existed between how participants in the norming and training cohorts perceived the stimulus space (Fig. S1).
fMRI Localizer Scans to Identify Cognitive Map Brain Regions
The localizer scans (which were not analyzed in real time) involved showing participants sample shapes from the stimulus space in order to (a) test whether we could decode shape category information with high precision from neural patterns (participant exclusion criterion detailed below); (b) find all brain regions that represent the stimulus space as a cognitive map (22,23), a prerequisite for defining our neurofeedback target region of interest (ROI), detailed below; and (c) build a neural model of the shape space (detailed below) to be used during real-time neurofeedback training to track how participants’ brains represented the stimuli presented on the screen in each experimental trial.
We performed two scan sessions per participant on separate days. Each scan comprised an anatomical scan (T1 MPRAGE acquisition), followed by 4-8 functional (echoplanar) runs (11m23s each), for a total of 12-15 functional runs per participant across the two scan days. During each functional run, participants were shown 49 distinct stimuli from the shape space using a short block design: 2.5s stimulus presentation (black shapes on an equiluminant gray background), followed by a 10s inter-stimulus interval during which a countdown to the next stimulus presentation was displayed in white font on the same equiluminant gray background. During the subsequent real-time fMRI scans, z-scoring with respect to the full timecourse of each run would not be possible during continuous acquisition. Thus, to normalize the signal in each voxel and run in a consistent manner, we included a 70s countdown (white font on equiluminant gray background) at the beginning of each run, during which the participants were instructed to keep their eyes open and stay alert. After eliminating the first 6 TRs to allow T1 equilibration and accounting for hemodynamic lag, this yielded 30 TRs of functional data per run, comparable between all runs (localizer and real-time), which could be used to normalize the signal acquired at each subsequent TR (see fMRI Preprocessing for Localizer Scans). Stimuli (color: black) were displayed on a rear-projection screen using a projector (1024×768 resolution, 60 Hz refresh rate) and subtended 5 degrees of visual angle against a uniform gray background. Participants viewed the visual display through a mirror that was mounted on the head coil. Stimulus presentation was performed entirely using Matlab (version 2016a) and PsychToolbox software.
For each localizer run, similarly to the shape space norming experiment, we chose shapes that sat at ±2, ±4, ±6, and ±8 arbitrary distance units from a fixed center shape along 6 equally spaced radial directions (60 degrees; AG, BH, CI, DJ, EK, FL; Fig. 1A). This yielded 9 equally spaced shapes along each of 6 radial directions, including the common center shape (e.g., A and G are the 8 a.d.u. shapes from center for the AG direction), for a total of 49 unique shapes. The presentation order was randomized across runs and across participants. During each unique shape presentation (2.5s), participants were asked to indicate using an MR-compatible button box whenever a shape ‘oscillated’ (an orthogonal task to ensure alertness). An ‘oscillation’ was defined as a parametric continuous perturbation of the shape in a random direction in the parametric manifold, after which it returned to its original position with the same speed at which it was perturbed (250 ms total; i.e., we picked a random point in parameter space near the original shape by drawing an i.i.d. sample from the circumference of a circle of radius 2 arbitrary distance units centered at the shape, then created an animated morph from the original shape to the one defined by the point on the circle, and back to the original shape). Each trial had either 0, 1, or 2 ‘oscillations’, with an average of 1 ‘oscillation’ per trial, randomized across all trials in a run. The absolute position of the center of mass of the shapes was always held constant at the center of the screen; the oscillation in parameter space simply created the visual impression that the current shape morphed into a similar shape and then reverted to the original (see Movie S1 for a demo of continuous oscillation).
fMRI Acquisition
Structural and functional MRI data were collected on a 3T Siemens Skyra scanner with a 64-channel head coil. Functional images were acquired using an echoplanar imaging sequence (TR, 2000 ms; TE, 28 ms; 36 transversal slices; voxel size, 3×3×3 mm; 64° flip angle; iPAT factor 2), providing full-brain coverage for each participant. Anatomical images were acquired using a T1-weighted MPRAGE sequence, using a GeneRalized Autocalibrating Partially Parallel Acquisitions (GRAPPA) acceleration factor of 2 (TR, 2530 ms; TE, 3.3 ms; voxel size, 1×1×1 mm; 176 transversal slices; 7° flip angle). Given the time constraints of real-time fMRI processing and our goal of matching acquisition parameters for classic and real-time scans, we did not correct for susceptibility-induced distortions in any of our echoplanar images.
fMRI Preprocessing for Localizer Scans
The images were preprocessed using custom AFNI (53), Freesurfer (24), and bash scripts. All analyses were performed in participants’ native space and no smoothing was applied. The first six volumes of each run were discarded to allow T1 equilibration. For each run, the remaining functional images were spatially realigned to correct for head motion and registered to the participants’ structural T1 image, using boundary-based registration implemented in AFNI’s afni_proc.py. We then performed polynomial trend correction using AFNI’s 3dDeconvolve tool and simultaneously regressed out 6 degrees of head motion (x, y, z, roll, pitch, yaw) from the functional data. We used FreeSurfer’s recon-all tool to estimate the boundaries of the gray matter for each participant, as well as anatomically defined lateral occipital (LO) and early visual cortex (EVC) regions of interest, and AFNI’s 3dSurf2Vol and 3dAllineate to obtain corresponding volume masks of the gray matter and regions of interest aligned to each participant’s functional data. The functional data for each localizer run within these regions was then z-scored using the mean and standard deviation of each voxel during the countdown at the beginning of each corresponding run (30 TRs after the first six volumes were discarded).
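The countdown-based normalization can be illustrated with a minimal sketch (toy data; not the authors' pipeline):

```python
# Minimal sketch of countdown-baseline normalization: each voxel's
# timecourse is z-scored using the mean and s.d. of the first 30 countdown
# TRs of its run (rather than the full run), so the identical procedure
# also works during continuous real-time acquisition.
import numpy as np

def zscore_by_countdown(run_data, n_baseline=30):
    """run_data: (n_TRs, n_voxels) array after discarding the first 6 TRs."""
    baseline = run_data[:n_baseline]
    mu = baseline.mean(axis=0)
    sd = baseline.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant voxels
    return (run_data - mu) / sd

rng = np.random.default_rng(1)
run = rng.normal(loc=500.0, scale=10.0, size=(200, 64))  # toy run
normalized = zscore_by_countdown(run)
```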
Neural Cognitive Map of the Shape Stimulus Space and Neurofeedback Region of Interest
To measure whether object-selective cortex (LO) represents the shape space as a cognitive map (i.e., similar to how it’s built parametrically and how it’s perceived by observers), we picked five out of the nine shapes for each direction (Fig. 2B; center, ±4 a.d.u, ±8 a.d.u.) and computed a representational similarity matrix (RSM) describing the relative relationships between neural activity elicited by the shapes in LO. We compared this RSM with an ideal RSM obtained by assuming a parametric linear distance relationship between the neural activity of the shapes (Figs. 2C & S2; e.g., the more distinct the shapes, the larger the distance between them) using Pearson correlation.
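The RSM comparison can be illustrated with a toy sketch; the patterns are synthetic, and the shape positions follow the center, ±4, ±8 a.d.u. sampling described above.

```python
# Sketch of the representational-similarity comparison: build a "neural"
# RSM from toy voxel patterns for the 5 shapes along one direction, then
# correlate its off-diagonal entries with an ideal RSM in which similarity
# falls off linearly with distance in shape space.
import numpy as np

positions = np.array([-8.0, -4.0, 0.0, 4.0, 8.0])  # a.d.u. from center
# Ideal similarity: the more distinct the shapes, the lower the similarity.
dist = np.abs(positions[:, None] - positions[None, :])
ideal_rsm = 1.0 - dist / dist.max()

rng = np.random.default_rng(2)
# Toy "neural" patterns that vary smoothly along the direction, plus noise.
patterns = positions[:, None] * rng.normal(size=(1, 50)) + \
           0.5 * rng.normal(size=(5, 50))
neural_rsm = np.corrcoef(patterns)

# Pearson correlation between the two RSMs' upper triangles.
iu = np.triu_indices(5, k=1)
r = np.corrcoef(neural_rsm[iu], ideal_rsm[iu])[0, 1]
```

A high correlation with the ideal RSM is taken as evidence for a parametric (cognitive-map-like) representation.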
To construct our neurofeedback region of interest (ROI), we sought all brain regions in each participant’s brain that represented the stimulus space parametrically (i.e., akin to a cognitive map (22,23)) and ran a searchlight analysis (21) (cube of side length 7 voxels intersected with the gray matter volume, minimum 172 voxels, distance between cube centers = 2 voxels) across the entire gray matter mask. In each voxel cube, we computed the Pearson correlation with the ideal RSM (Fig. 2D). First, to increase the probability that potential sources of top-down control (e.g., prefrontal cortex, parietal cortex, etc.) would be subsumed by the neurofeedback ROI, we required that it include at least one separate cluster of 50 or more voxels outside of extrastriate visual cortex (i.e., not connected to LO). Second, to ensure that ROI sizes were relatively similar across participants despite natural variability in signal strength, we restricted the final ROI size to between 750 and 2,250 voxels across all participants. To satisfy both of these constraints, we started by selecting all clusters of 20 or more voxels that had r>=0.50 with the ideal RSM, excluding EVC. If both conditions were met, this became our neurofeedback ROI. Otherwise, we lowered or raised the r threshold stepwise using values from the set {0.25, 0.33, 0.40, 0.50, 0.60, 0.66} until both conditions were met for each individual participant. Final thresholds and resulting neurofeedback ROI sizes for each participant are shown in Table S1. Surface maps of final ROIs for each participant in the full experiment are shown in Figs. S3–S12.
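The stepwise threshold search can be illustrated schematically. This is a simplified sketch: `clusters_at` and `toy_clusters` are hypothetical stand-ins for clustering the real searchlight maps, and scanning candidates from most to least stringent is a simplification of the relax/tighten procedure.

```python
# Schematic sketch of the threshold search for the neurofeedback ROI:
# find a correlation threshold at which (a) total ROI size falls within
# 750-2,250 voxels, and (b) at least one cluster of >=50 voxels lies
# outside extrastriate visual cortex.

def select_roi(clusters_at, thresholds=(0.25, 0.33, 0.40, 0.50, 0.60, 0.66)):
    """clusters_at(r) -> list of (n_voxels, outside_visual_cortex) clusters."""
    for r in sorted(thresholds, reverse=True):  # most stringent first
        clusters = [c for c in clusters_at(r) if c[0] >= 20]
        total = sum(n for n, _ in clusters)
        has_external = any(out and n >= 50 for n, out in clusters)
        if 750 <= total <= 2250 and has_external:
            return r, total
    return None

# Toy cluster map: lower thresholds admit more voxels.
def toy_clusters(r):
    base = round(4000 * (1.0 - r))
    return [(base, False), (100, True)]

choice = select_roi(toy_clusters)
```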
Neural Model of Shape Stimulus Space
To measure how the shape space was categorically represented in the neurofeedback ROI, we split the stimulus space into groups of two categories, each comprising half of the shape space circle (Fig. 3A). To maximize the number of training examples available, the category boundaries were chosen halfway between the diameters defined in Fig. 1A, e.g., halfway between lines AG and BH, halfway between lines BH and CI, etc. This yielded 6 categorical partitions of the shape space, each associated with 48 of the 49 unique trials per localizer run (24 shapes shown belonged to one category of the partition, 24 belonged to the other, and the center shape sat on the dividing category boundary line and thus was not used for estimating a category model). For each trial in each functional localizer run, we selected two TRs following each short block onset (accommodating for optimal activation given hemodynamic lag, see below) and treated them as independent training points in order to maximize the amount of training data available. This procedure yielded 48 training points per category per functional run, for a total of 576-720 training points per category per participant for each of the 6 categorical partitions (i.e., ABCDEF vs. GHIJKL, BCDEFG vs. HIJKLA, CDEFGH vs. IJKLAB, etc.). To build a neural model, we modeled each categorical partition as a pair of multivariate Gaussian distributions, one per category. The parameters of these category-specific distributions were computed using maximum likelihood estimation on the localizer scan data associated with each category partition (schematic shown in Fig. 3B).
Given the high dimensionality of the training data (750-2,250 voxels per participant) and the relatively small number of training examples (n=576-720 per category per participant), we used a dimensionality reduction procedure to project the training data (voxel activations) onto its first 100, 150, and 200 principal components, and we estimated a shared covariance matrix (Σ_S) across both shape categories (k ∈ {1,2}) in each partition. Because the two categories contributed equal numbers of training points, the pooled estimate reduces to the average of the per-category maximum likelihood covariance estimates:

Σ_S = (Σ_1 + Σ_2) / 2
Additionally, given potential variability in hemodynamic lag between brain regions and across participants (54), we also selected two potential lags: the training data comprised either the 3rd and 4th TRs following short block onset (4s lag) or the 4th and 5th TRs following short block onset (6s lag). We built neural category models for each direction in each participant using all six combinations of these two parameters (PC dimensions = 100, 150, 200 × hemodynamic lag = 4s, 6s) in both LO and the neurofeedback ROI. To verify that our model could correctly predict the distinction between shape categories in the brain, we used it as a linear, log-likelihood-ratio-based pattern classifier by computing the likelihood of a new point x (the n-dimensional neural representation of a new shape) under the two estimated distributions:

LLR(x) = log N(x; μ_1, Σ_S) - log N(x; μ_2, Σ_S),

where μ_k is the mean of category k and N denotes the multivariate Gaussian density; x was assigned to category 1 when LLR(x) > 0.
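A shared-covariance Gaussian LLR classifier of this kind can be sketched as follows (toy dimensions and means; not the authors' code):

```python
# Sketch of the log-likelihood-ratio classifier: two multivariate Gaussians
# with category-specific means and a shared (pooled) covariance, yielding a
# linear decision rule in the reduced PC space.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
d = 10                                    # stand-in for 100-200 PC dimensions
mu1, mu2 = np.zeros(d), np.full(d, 0.5)   # toy category means
cov1 = np.cov(rng.normal(size=(200, d)), rowvar=False)
cov2 = np.cov(rng.normal(size=(200, d)), rowvar=False)
shared_cov = 0.5 * (cov1 + cov2)          # pooled (equal class sizes)

def llr(x):
    """Log-likelihood ratio of x under category 1 vs. category 2."""
    return (multivariate_normal.logpdf(x, mean=mu1, cov=shared_cov)
            - multivariate_normal.logpdf(x, mean=mu2, cov=shared_cov))

# A point at the category-1 mean yields a positive LLR (and vice versa).
score = llr(mu1)
```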
Using our classifier, participants whose average leave-one-run-out cross-validated decoding accuracy across all 6 category partitions was below 70% (chance = 50%) in either LO or the neurofeedback ROI for all possible parameter combinations were excluded from the training portion of the experiment, as their decoding reliability was considered too low to afford real-time neurofeedback training. For all other participants, we selected the parameter combination (PC dimensions × hemodynamic lag) that gave the highest average decoding accuracy across all 6 categorical partitions in the neurofeedback ROI (Fig. 3C), regardless of performance in LO. Localizer-session decoding results for LO and the neurofeedback ROIs, as well as the final parameter combination chosen for each trained participant, are shown in Table S1.
Neurofeedback fMRI Experiment Design
The full neurofeedback experiment comprised ten total sessions per participant (Fig. 2A):
Session 1: Behavioral 2AFC pre-test
Sessions 2-3: Neural localizer scans / Model training data acquisition
Sessions 4-9: Real-time fMRI neurofeedback training
Session 10: Behavioral 2AFC post-test (identical to pre-test, counterbalanced presentation order)
We recruited twenty-five healthy participants over the course of 17 months (May 2018-October 2019; 13 female, 23 right-handed, ages 18-35, mean age 22.4) from the Princeton community with normal or corrected-to-normal vision who provided informed consent to a protocol approved by the Princeton University Institutional Review Board and who were compensated for their participation ($12/h for behavioral sessions, $20/h for fMRI sessions, retention incentives: $5/session cumulative bonus for each fMRI session beyond the first, $0.10 per successful neurofeedback trial, $100 for completing the full experiment). The behavioral pre-test and the neural localizer scans were run within the span of 1-3 weeks for each participant. If the participant had satisfactory performance on the behavioral test (using criteria listed in section Building and Norming the Shape Stimulus Space) and if a neural model with high decoding accuracy could be built using the neural data from the two localizer scans (using criteria listed in section Neural Model of Shape Stimulus Space), then the participant was invited to take part in the rest of the experiment. Of our initial set of twenty-five participants, two were rejected for not meeting MRI safety criteria after participating in the behavioral pre-test, eight were rejected for poor behavioral and/or neural decoding performance during sessions 1-3, and five withdrew their participation after 1-3 sessions. Ten participants (7 female, all right-handed, ages 20-35, mean age 24.0) took part in the full experiment, two of whom were trained for 5 days and eight of whom were trained for 6 days. Participant performance during the localizer sessions and reasons for exclusion are shown in Table S1.
Each session of the experiment was run on a separate day. The real-time training portion and subsequent behavioral post-test took place 2-40 days after the localizer scans. To minimize fatigue and afford potential benefits from overnight memory consolidation (55), we sought to have participants come into the lab for at most 2 hours each day on consecutive days during the training portion of the experiment. Participants performed a variable number of trials per day, depending on the amount of time the scanner setup and participant pre-scan procedures took (minimum 20 trials, maximum 140 trials per day). We ensured that participants performed at least 500 total neurofeedback training trials before the final behavioral post-test. To minimize potential effects of over-training and to guarantee a comparable amount of training across our entire cohort, we stopped the training procedure once a participant performed 800 trials (two participants). Participants were allowed to skip at most one day for emergencies and/or personal reasons (ensuring that no more than 48h elapsed between sessions when this happened) and three participants chose to avail themselves of this opportunity.
Real-Time fMRI Neurofeedback Procedure
For each participant, after choosing the optimal combination of parameters for our neural model (see Neural Model of Shape Stimulus Space), we randomly chose one of the 6 radial category partitions to become their training category boundary (information hidden from participants), ensuring that neural decoding accuracy was at least 70% for this specific partition and that it had not been chosen for any previous participant (after we trained 6 participants on all 6 distinct partitions, we reset this latter constraint).
We performed five to six neurofeedback training scan sessions per participant on separate days (see Table S3 for session details). Each scan comprised an anatomical scan (T1 MPRAGE acquisition), followed by one functional (echoplanar) localizer run (11m37s, identical in design to the Sessions 2-3 runs), and 1-7 functional (echoplanar) neurofeedback training runs per day (10m32s each), for a total of 26-40 functional neurofeedback runs collected per participant (active in-scanner training time: 4.6h-7.0h per participant). The training runs used a block design with twenty trials each (16s=8TRs stimulus presentation, followed by 12s=6TRs ITI) and a 72s countdown at the beginning of each run, analogous to the localizer scans (see fMRI Localizer Scans to Identify Cognitive Map Brain Regions), during which the participants were instructed to keep their eyes open and stay alert. Stimuli (color: black) were displayed on a rear-projection screen using a projector (1024×768 resolution, 60 Hz refresh rate) and subtended 5 degrees of visual angle against a uniform gray background. Participants viewed the visual display through a mirror that was mounted on the head coil. Stimulus presentation was performed entirely using Matlab (version 2016a) and PsychToolbox software. For each trial, a random shape was chosen from the stimulus space (i.i.d. from the shape space circle, real-valued parameters distinct from the localizer scan shapes), ensuring that equal numbers of shapes from both categories were shown in each training run (ten each, randomized order). Since preliminary tests showed that the decoding performance of the neural shape model deteriorated for shapes close to the category boundary (Fig. S15), we ensured that chosen shapes were at least one arbitrary distance unit away from the category boundary for each participant.
To ensure that this constraint did not introduce an artificial perceptual cue about the category boundary, we also excluded shapes that were within one arbitrary distance unit of the direction perpendicular to the category boundary (i.e., the direction of maximal separation). For each trial, we generated a continuous oscillation centered at each trial-specific shape, determined by drawing points i.i.d. from a uniform distribution on a circle of fixed radius around the shape (Fig. S14). Similarly to the alertness task in the localizer scans (except with no stops in movement), the shape traveled to the point on the circle via the corresponding radius, then returned to the original shape, then traveled to the next randomly selected point on the circle, etc. Each oscillation (back and forth to the circle) took 500ms, with 4 independent oscillations during each 2s TR (Movie S1).
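The trial-shape sampling constraints (at least 1 a.d.u. from both the boundary line and its perpendicular) can be sketched as follows; the circle radius of 8 a.d.u. is an assumption here.

```python
# Rough sketch of trial-shape sampling: draw shapes i.i.d. on the
# shape-space circle, rejecting any shape within 1 a.d.u. of the (hidden)
# category boundary line or of the direction perpendicular to it.
import numpy as np

def sample_trial_shape(rng, boundary_angle, radius=8.0, margin=1.0):
    """Return (x, y, category) for one trial shape in shape space."""
    while True:
        theta = rng.uniform(0.0, 2.0 * np.pi)
        # Perpendicular distances to the boundary line and to the line
        # orthogonal to it (both passing through the center).
        d_boundary = radius * abs(np.sin(theta - boundary_angle))
        d_perp = radius * abs(np.cos(theta - boundary_angle))
        if d_boundary >= margin and d_perp >= margin:
            category = 1 if np.sin(theta - boundary_angle) > 0 else 2
            return radius * np.cos(theta), radius * np.sin(theta), category

rng = np.random.default_rng(4)
shapes = [sample_trial_shape(rng, boundary_angle=np.pi / 6) for _ in range(100)]
```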
The radius of the circle that determined each cluster of 4 oscillations in each TR was the main method by which neurofeedback was given to study participants. They were told that during each trial they would see a continuously oscillating shape on the screen and given the following set of instructions:
“Generate a mental state that’s going to make each shape oscillate / wobble less or even stop!”;
“Different shapes may require different strategies.”; and
“When you are successful in slowing down the shapes, it has to do with your mental state over the past 8-10 seconds; progress is not instantaneous.”
The first instruction encouraged participants to generate mental state variability that may help shift representations in the neurofeedback ROI. The second instruction addressed the possibility that shifting representations in different (opposing) directions of a cognitive map may require different types of top-down influences; while it may have alerted participants to the possibility that multiple categories of shapes exist, the high variability of the shape space was designed to make boundaries between such categories difficult to intuit. The final instruction was necessary because of the hemodynamic lag inherent to the fMRI signal, which, together with its temporal autocorrelation, would invariably influence the visual (shape-related) feedback received during the experiment. As participants attempted to generate such a mental state while looking at the shape on the screen, during each TR (2s) we preprocessed their functional data (see fMRI Preprocessing for Real-Time Scans) and used our neural model to identify how far from the neural category boundary the shape was represented in the participant’s neurofeedback ROI. If the shape was strongly represented as a member of its ground-truth category (known to the experimenter, but not to the participant), then positive feedback was given to the participant by shrinking the radius of the circle from which the oscillation parameters were chosen (causing the oscillation to look less extreme; Movie S1, trial 2); otherwise, no feedback was given and the participant saw the oscillation continue with the same apparent amplitude (see Neurofeedback Online fMRI Data Analysis for details). Each oscillating shape was presented for a total of 8 TRs. During the first 3 TRs of the block, no feedback was given, since hemodynamic lag prevented it. Afterwards, for the remaining 5 TRs of the block, participants received appropriate (or no) feedback on a TR-by-TR basis.
Feedback was cumulative, such that if the participant maintained a mental state that kept the shape’s neural representation away from the category boundary for 3 of the 5 TRs in which feedback was given, then the shape stopped oscillating altogether (radius of oscillation decreased to zero). In practice, this was extremely rare, but it did occur occasionally, especially towards the end of the experiment. We limited each training trial block to 16s (8 TRs) to eschew potential repetition suppression effects that might diminish our ability to apply our neural model and/or decode the category of the shape on the screen (26). Participants were also instructed that they would receive a monetary reward of $0.10 every time they shrank the radius of the oscillation (positive feedback), together with a bonus of $0.25 if they managed to stop the oscillation completely (positive feedback on 3 of the 5 possible feedback TRs in a trial).
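The cumulative feedback rule can be illustrated with a worked sketch, using the radius parameters given in the Neurofeedback Online fMRI Data Analysis section (1.875 a.d.u. starting radius, 0.625 a.d.u. decrement per positive feedback TR); the LLR values and threshold below are toy numbers.

```python
# Worked sketch of the cumulative feedback rule: the oscillation radius
# starts each trial at 1.875 a.d.u. and shrinks by 0.625 a.d.u. on every
# feedback TR whose LLR exceeds threshold, reaching zero (a static shape)
# after 3 positive TRs (not necessarily consecutive).

def trial_radii(llr_per_feedback_tr, threshold, start=1.875, step=0.625):
    """Radius used on each of the 5 feedback TRs of one trial."""
    radius, radii = start, []
    for llr in llr_per_feedback_tr:
        if llr > threshold:                  # positive feedback this TR
            radius = max(0.0, radius - step)
        radii.append(radius)
    return radii

# Example: positive feedback on feedback TRs 1, 3, and 4 stops the shape.
radii = trial_radii([2.0, -1.0, 3.5, 4.0, 0.5], threshold=1.0)
```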
fMRI Preprocessing for Real-Time Scans
Structural and functional MRI data were collected in an identical fashion to the localizer scans (coil, equipment, parameters, etc.). The images were preprocessed using custom AFNI (53), Matlab (version 2016a), C++, and bash scripts. All analyses were performed in participants’ native space and no smoothing was applied. The first six volumes of each run were discarded to allow T1 equilibration. We used AFNI’s afni_proc.py script to preprocess the localizer run from each training day scan (analogous to fMRI Preprocessing for Localizer Scans), as well as align this functional run to the anatomical scan matching the Sessions 2-3 data used to generate the offline neural shape model. This allowed us to generate a template image (the 50th volume of the localizer EPI run) to which we would be able to align new volumes acquired during the neurofeedback training runs in a reasonable amount of time (aligning to an anatomical scan or to a full functional run usually requires tens of seconds and is not useable in a real-time setting, but aligning a single volume to another single volume requires <500ms on average). This template volume was also aligned to the neurofeedback ROI mask, which allowed fast recovery of the relevant functional data from each real-time acquired volume after fast single-volume-to-single-volume alignment, such that the neural model could subsequently be evaluated in real time.
The real-time functional scans were transferred via a fast network connection from the scanning console (Siemens) memory directly to the hard drive of a high-powered server on which all analyses were performed. Each real-time volume was aligned to the template volume (see above) using AFNI’s 3dvolreg and the neurofeedback ROI mask was used to subsequently extract the relevant functional data. Given the constraints of the real-time environment, classical preprocessing techniques and normalization procedures could not be implemented (e.g., polynomial trend regression, motion parameter regression, z-scoring with respect to the entire timecourse). Instead, the functional data acquired up to a given TR were temporally high-pass filtered (e.g., on TR 45, the first 45 TRs were filtered together; on TR 46, the first 46 TRs were filtered together; etc.) with a 52s (26 TRs) period cutoff (the length of two consecutive trials, including ITIs) using a fast custom script written in C++. The functional data was then z-scored using the mean and standard deviation of each voxel during the countdown at the beginning of each corresponding run (30 TRs after the first six volumes were discarded to allow T1 equilibration).
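The per-TR filtering-plus-normalization step can be approximated with a simple batch sketch. The actual implementation was a custom C++ routine applied incrementally; here an off-the-shelf Butterworth high-pass filter stands in, applied to the run-so-far as in the incremental scheme described above.

```python
# Batch stand-in for the real-time preprocessing: high-pass filter the TRs
# acquired so far with a 52s (26-TR) period cutoff, then z-score each voxel
# by the countdown-baseline TRs.
import numpy as np
from scipy.signal import butter, filtfilt

TR = 2.0                          # seconds per volume
nyquist = 0.5 / TR                # 0.25 Hz
b, a = butter(2, (1.0 / 52.0) / nyquist, btype="highpass")

def preprocess_up_to(run_so_far, n_baseline=30):
    """run_so_far: (n_TRs_acquired, n_voxels). Returns normalized data."""
    filtered = filtfilt(b, a, run_so_far, axis=0)
    mu = filtered[:n_baseline].mean(axis=0)   # countdown baseline mean
    sd = filtered[:n_baseline].std(axis=0)    # countdown baseline s.d.
    return (filtered - mu) / sd

rng = np.random.default_rng(5)
drift = np.linspace(0.0, 50.0, 80)[:, None]   # slow scanner drift
data = 1000.0 + drift + rng.normal(size=(80, 20))
out = preprocess_up_to(data)                  # state after 80 TRs acquired
```

In the incremental scheme, this function would be re-run at every new TR on the growing data array, and only the newest normalized row passed to the neural model.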
Neurofeedback Online fMRI Data Analysis
The functional data from the neurofeedback ROI was projected into the dimensionality-reduced space defined during the procedure used to select the optimal number of PC dimensions and hemodynamic lag for shape category decoding (see Neural Model of Shape Stimulus Space). The multivariate Gaussian neural model was then used to estimate the log-likelihood ratio (LLR) of the current data point (representing the neural activity elicited by the current stimulus being shown on the screen in the scanner) for each of the two categories of shapes (e.g., ABCDEF vs. GHIJKL) selected for training for the current participant. At the beginning of each trial, the default oscillation radius (see Real-Time fMRI Neurofeedback Procedure) was set to 1.875 arbitrary distance units in the parameter space, a trade-off, informed by pilot results (Fig. S15), between keeping the perturbation visually salient and limiting how much the wobble amplitude degraded stimulus category decoding. If the computed LLR for a given TR was above a particular threshold (described below), then the amplitude of the oscillation was reduced during the subsequent TR by 0.625 arbitrary distance units (for all four oscillation iterations shown during that 2s TR, 500ms each); otherwise, the radius was kept unchanged. Feedback was cumulative, allowing the radius of the oscillation to become zero (thus causing the shape to become static on the screen) after 3 cumulative (but not necessarily consecutive) TRs of positive feedback during a trial (see Fig. S14 for a diagram of this process). The LLR threshold for the first training run was initially set to the 70th percentile of the distribution of all data points (1,152-1,440 per participant) collected during the localizer scans for that particular participant under the optimal offline Gaussian neural model. After the first training run, the threshold was potentially adjusted using an adaptive procedure, for two reasons.
First, since the only way in which participants could learn to shift their neural representations was through receiving feedback via the shape oscillation radius, we sought to converge on giving feedback on approximately 1/3 of the twenty trials in each run. Second, given the high variability of conditions elicited by fMRI scanning across multiple days in terms of signal strength and sources of external noise, we expected our ability to decode using a static neural model to vary across the training week, as well as across training runs. As such, an adaptive threshold can recover from being too stringent (never providing feedback) or too lax (providing feedback on almost every trial), both of which would be undesirable outcomes of the training procedure. Thus, the threshold was adjusted based on participant performance (how many trials generated positive feedback) on all previous runs since the beginning of the training day, as detailed in Table S2. By design, positive feedback could only be given on trials where the model correctly guessed the ground truth category of the shape being shown and, to avoid potentially random feedback being given due to classifier uncertainty near the category boundary (Fig. S15), the threshold was never lowered below the 60th percentile of the localizer scan LLR distribution, regardless of participant performance. The threshold was also kept unchanged between the last run of a training day and the first run of the subsequent day, regardless of performance on the final run of the previous day. The LLR distribution for a representative participant, together with a sample threshold (75th percentile), are shown in Fig. 3D. Distributions and sample thresholds for all participants are shown in Fig. S13.
Post-Experiment fMRI Data Analysis
To evaluate our ability to influence neural representations using neurofeedback, we computed the model LLR of all stimuli shown to each participant across all their neurofeedback runs during their first two days of training (pre) and their last two days of training (post), for the trained category boundary (Trained, e.g., ABCDEF vs. GHIJKL) and for the orthogonal category boundary (Untrained, e.g., DEFGHI vs. JKLABC) (Fig. 4A). To maximize the sensitivity of this measure given potential repetition suppression effects (26), we focused on LLR values from the first TR of each trial only, as the most reliable measure of LLR across the neurofeedback training runs. Results using the first 2, 3, 4, and 5 TRs of each trial (participants received at most 5 instances of feedback per trial) were highly similar and are shown in Table S4.
To measure relative neural change as a consequence of training, we computed a participant-level summary statistic of the effect of neurofeedback training by averaging the LLR across the first two days (pre) and the last two days (post) and used two measures. First, we used a difference score to compute the LLR change separately for the trained and untrained categories:

ΔLLR_Trained = LLR_Trained(post) - LLR_Trained(pre)
ΔLLR_Untrained = LLR_Untrained(post) - LLR_Untrained(pre)
We then used a paired t-test to investigate whether training induced different changes for the trained and untrained category distinctions.
Second, we also computed the ratio between trained and untrained LLR for the pre and post conditions:
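In the same notation, a rendering consistent with the text is:

```latex
R_{\mathrm{pre}} \;=\; \frac{\overline{\mathrm{LLR}}^{\,\mathrm{pre}}_{\text{Trained}}}{\overline{\mathrm{LLR}}^{\,\mathrm{pre}}_{\text{Untrained}}},
\qquad
R_{\mathrm{post}} \;=\; \frac{\overline{\mathrm{LLR}}^{\,\mathrm{post}}_{\text{Trained}}}{\overline{\mathrm{LLR}}^{\,\mathrm{post}}_{\text{Untrained}}}
```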
We then used a paired t-test to investigate whether the LLR ratio changed as a consequence of training between the first two days and the last two days of the experiment (Fig. S17).
Behavioral Tests, Debrief, and Post-Experiment Data Analysis
To evaluate the influence of our neurofeedback training on participants’ perceptual experience, we conducted a 2AFC experiment (procedure identical to that described in Building and Norming the Shape Stimulus Space) during the first and last sessions of the full experiment. Analogously to the norming experiment, we estimated psychometric function slopes for each of the 6 radial directions in our stimulus space, separately for the behavioral pre-test (before training, first session) and the behavioral post-test (after training, last session). To compare the strength of category separation between the pre- and post-tests, we computed the average slope for the two directions most perpendicular to the training category boundary in each participant (Trained, e.g., for categories ABCDEF and GHIJKL, the average slope of lines CI and DJ), as well as the average slope for the two directions most parallel to the training category boundary (Untrained, e.g., for categories ABCDEF and GHIJKL, the average slope of lines AG and FL) (Fig. 4A). As a control analysis, we also computed this change for the average of the two directions that were neither parallel nor perpendicular to the trained category boundary (Neutral, e.g., for categories ABCDEF and GHIJKL, the average slope of lines BH and EK) (Fig. S20). We used a normalized difference score to compute the relative change in psychometric function slope before and after training for the Trained and Untrained directions:
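A plausible form of this normalized difference score, with $s_d$ denoting the average psychometric function slope for a direction $d$ (the specific normalization, by the slope sum, is an assumption here):

```latex
\Delta s_{d} \;=\; \frac{s^{\,\mathrm{post}}_{d} - s^{\,\mathrm{pre}}_{d}}{s^{\,\mathrm{post}}_{d} + s^{\,\mathrm{pre}}_{d}},
\qquad d \in \{\text{Trained}, \text{Untrained}\}
```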
Finally, we computed a two-sided paired t-test between Trained and Untrained slopes to evaluate whether our neurofeedback training procedure induced a significant behavioral change consistent with our hypothesis (i.e., greater relative psychometric function sharpening in the Trained direction compared to the Untrained direction). Estimated psychometric functions for all tests, directions, and participants are shown in Table S5.
To measure whether a relationship exists between neural change and perception, we computed the Pearson correlation between the LLR change (computed using both the difference score and the ratio) and the behavioral change, i.e., how well LLR change in the neurofeedback ROI predicts psychometric function change for the Trained vs. Untrained direction in behavior.
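A minimal sketch of this brain-behavior correlation, using synthetic per-participant scores in place of the study's data (values are random and illustrative, not the reported results):

```python
# Illustrative brain-behavior correlation with synthetic scores for
# n = 10 participants (matching the study's sample size).
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
llr_change = rng.normal(size=10)  # per-participant neural change scores
behavior_change = 0.8 * llr_change + rng.normal(scale=0.5, size=10)

# Pearson correlation between neural and behavioral change
r, p = pearsonr(llr_change, behavior_change)
```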
To investigate whether participant training outcomes were related to individual differences in the accessibility of neural information to computational analysis of fMRI data, we defined the baseline neural decoding measure for each participant as the Gaussian classifier decoding accuracy within the training ROI during the two localizer scans (Days 2 & 3), averaged across all six directions of the stimulus space we scanned (Fig. 3C). We then computed the correlations between each participant’s baseline decoding and their LLR neural change measures, as well as their behavioral outcomes.
Finally, after the two-alternative forced choice behavioral post-test was completed (but before full debrief as to the goals of the study), we asked our participants to fill out a short questionnaire about their experience, including asking them about whether they suspected there were multiple categories of shapes, if so, how many and which ones, as well as commenting on strategies they used to perform the task during neurofeedback. The questionnaire and a summary of the answers are shown in Table S6. The free guess category boundaries, as well as the 2-category forced choice category boundaries given by each participant are shown in Figs. S18 & S19.
Psychometric functions estimated for each of the six directions in the stimulus space (AG, BH, CI, DJ, EK, FL) for the norming cohort (A, n=10, replicated from Fig. 1B-G for convenience) and the training cohort (B, n=10, bottom). Colored lines indicate individual participants and the bolded line indicates the average. No significant differences were observed between the two cohorts or between the six directions (psychometric function slope, repeated measures ANOVA, factors=‘direction’, ‘cohort’; interaction: F(5,108)=0.19, p=0.964; main effect of direction: F(5,108)=0.57, p=0.722; main effect of cohort: F(1,108)=1.71, p=0.194).
Representational similarity matrices (RSMs) obtained by computing the Pearson correlation between neural activity elicited by pairs of shapes spanning the stimulus space (Fig. 2B) in the neurofeedback ROI of each participant during the localizer scans (Days 2-3). Correlation with the ideal RSM (Fig. 2C) is shown below each participant’s RSM.
Searchlight analysis indicating all of participant #1’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, parahippocampal gyrus, and hippocampus. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #2’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, inferior temporal cortex, anterior temporal cortex, and superior temporal sulcus. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #3’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, parahippocampal gyrus, hippocampus, and medial frontal gyrus. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #4’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, transverse occipital sulcus, and dorsolateral prefrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #5’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, inferior temporal cortex, parahippocampal gyrus, and dorsolateral prefrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #6’s brain regions that represented the shape space parametrically, including extrastriate visual cortex and orbitofrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #7’s brain regions that represented the shape space parametrically, including extrastriate visual cortex and parietal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #8’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, parietal cortex, and dorsolateral prefrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #9’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, transverse occipital sulcus, inferior parietal cortex, and dorsolateral prefrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
Searchlight analysis indicating all of participant #10’s brain regions that represented the shape space parametrically, including extrastriate visual cortex, inferior temporal cortex, anterior temporal cortex, and dorsolateral prefrontal cortex. See Methods for details of the searchlight procedure and threshold selection.
We computed a distribution of log-likelihood ratios (LLR) using the estimated neural model Gaussian distributions for 48 of the 49 shapes shown to participants (all except the ambiguous center shape) during each fMRI run scanned during the Days 2-3 fMRI sessions (12-15 runs per participant, 576-720 examples per category per participant). The full histograms of the LLR distributions are shown for each participant and each category (by definition, LLR1 = −LLR2), together with a representative threshold of 75% of the distribution, used during the neurofeedback trials to assess performance and give feedback to the participants for each category.
The four types of TRs shown to participants during the neurofeedback training. At the beginning of each trial, a random shape was selected and made to oscillate with a radius of 1.875 a.d.u. (default TR) (the shape position was static on the screen, but the shape identity morphed gradually according to the trajectory shown in parameter space). If a participant accumulated positive feedback within that trial (cf. Fig. 3D), the radius of the oscillation was reduced in increments of 0.625 a.d.u., otherwise it was left unchanged from the previous TR. Each 2s TR comprised four independent shape oscillations of 500 ms each (see Movie S1 for two example trials). Participants also received a monetary reward for each trial in which they were able to reduce the radius of the oscillation.
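The per-TR radius update described in this legend can be sketched as follows. The radii and step size are from the text (start at 1.875 a.d.u., shrink by 0.625 a.d.u. per positive-feedback TR, floor at 0); the function name is illustrative.

```python
# Sketch of the per-TR oscillation-radius update during neurofeedback
# trials; numeric values are taken from the text, function name is ours.
START_RADIUS = 1.875  # a.d.u. (default at the start of each trial)
STEP = 0.625          # a.d.u. reduction per positive-feedback TR

def next_radius(radius, positive_feedback):
    """Reduce the oscillation radius after positive feedback on this TR,
    otherwise leave it unchanged from the previous TR."""
    if positive_feedback:
        return max(radius - STEP, 0.0)
    return radius
```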
We collected two fMRI sessions for a pilot participant, who performed the localizer task during the first session (four scans on separate days, 26 runs, 20 shapes of varying distance from center per run) and a center dot color change task during the second session (8 runs of 20 shapes similar to the neurofeedback design, with varying levels of oscillation radius). Gray bars show classifier training accuracy. Blue bars show classifier testing accuracy for runs with varying distance from the stimulus space center in session 1. Pink bars show classifier testing accuracy for runs with varying oscillation radius from session 2. Model performance decayed for shapes closer to the category boundary (top left) and when the model was trained on static shapes (e.g., localizer runs) and tested on oscillating shapes (e.g., neurofeedback runs). Based on these data, we chose an upper limit of 1.875 a.d.u. for the shape oscillation during neurofeedback trials and did not train on shapes that were less than 1 a.d.u. from the category boundary, as model performance was not reliable for combinations of parameters outside of these bounds.
(A) LLR for each participant’s trained (blue) and untrained (purple) categories during the first two days (light color) and last two days (dark color). We observed a significant interaction effect between training and category (ANOVA; F(1,76)=4.92, p=0.030); (B) LLR for the trained (blue) and untrained (purple) categories, plotted together with the psychometric function changes for the trained (light orange) and untrained (dark orange) categories. Difference between LLR bars yields graph in Fig. 4D and difference between Behavior bars yields graph in Fig. 4E.
(A) Diagram of stimulus space for example participant with emphasized trained direction, category boundary (for LLR computation), and directions used to compute psychometric function slope changes in the behavioral experiments. LLR Ratio (green) represents the LLR in the Trained direction divided by the LLR in the Untrained direction. (B) Main effect of LLR Ratio change as a consequence of neurofeedback training (LLR Ratio difference between first two days and last two days of training). (C) Main behavioral effect of neurofeedback training (replicated from Fig. 4E for convenience). (D) Correlation plot of LLR ratio main effect vs. behavioral main effect. Similar to the LLR difference score measure, we observed a strong correlation between the neural change and perceptual change (r=0.686, p=0.029). (E) LLR Ratio for each participant during their first two days and last two days of neurofeedback training. We observed a significant increase of LLR ratio at the end of the experiment compared to the beginning (p=0.024).
At the conclusion of the experiment (but before being fully debriefed), participants were told that the shapes they had seen during training belonged to an unspecified number of categories. They were instructed to draw their best guess for the category boundary or boundaries between them. Participants reported an average of 3.3 categories, whose boundaries (blue lines) did not coincide with the category boundaries randomly selected for each of them in the experiment (orange lines). When subsequently forced to split the stimulus space into 2 categories using a straight line, they were still unable to correctly identify the category boundary; however, their performance suggests a trend toward being able to access some information about the category boundary given an explicit external constraint (purple line, trained=0°, average absolute displacement=31.5° ± 5.89° s.e.m.; random choice=45°, t(9)=2.18, p=0.058).
Histogram of angle displacement from the correct category boundary for the forced 2-category guesses in Fig. S18 (trained=0°, average absolute displacement=31.5° ± 5.89°; untrained=90°, average absolute displacement=58.5° ± 5.89°). Participants were unable to guess the trained category boundary; however, their performance suggests a trend toward having some crude information about the category boundary given an explicit external constraint (random choice=45°, t(9)=2.18, p=0.058).
As a control analysis, we computed the psychometric function slope change for the average of the two directions that were midway between parallel and perpendicular to the trained category boundary (in gray, Neutral, e.g., for categories ABCDEF and GHIJKL, the average slope of lines BH and EK). The behavioral change in the neutral direction was almost absent (−0.017±0.064 s.e.m., t(9)=0.257, p=0.803) and this change was not significantly different from that observed in the trained or untrained directions, albeit potentially more similar to the former than the latter (neutral < trained: t(9)=1.26, p=0.239; neutral > untrained: t(9)=2.08, p=0.067). The trained and untrained bar plots are replicated from Fig. S16B for convenience.
We recruited 25 participants for the study, of which 6 withdrew before completing the study, 2 were rejected due to not meeting MRI scanning safety criteria, 7 were rejected due to poor performance during the localizer scans, and 10 finished the full experiment. The embedded table legend provides detailed explanations for each column and field.
To account for high variability of fMRI data across multiple days and runs, the LLR feedback threshold for each training run was adjusted given participant performance (how many trials generated positive feedback) on all previous runs since the beginning of the training day using an adaptive procedure designed to converge on giving feedback for approximately 1/3 of the twenty trials in each run.
We report details on every feedback run for every training session of every participant, including number of trials with positive feedback and corresponding feedback LLR threshold for each run computed using the procedure detailed in Table S2.
LLR difference score computed using model estimates from a variable number of TRs across each neurofeedback trial (1-5; participants received at most 5 instances of feedback per trial). We observed similar results in all cases to when using only the first TR from each trial, with minor weakening of the reported main effect of neurofeedback training as more TRs were added, likely due to repetition suppression effects (26) across the length of the 16s trials.
The embedded table legend provides detailed explanations for each column and field.
At the conclusion of the study, but before being fully debriefed on the goals of the experiment, participants were asked to complete a questionnaire about their experience and strategies for completing the task, as well as their guesses (and drawings) of potential category boundaries on a never-before-seen picture of the circular stimulus space.
Video shows two example trials during a neurofeedback training run: one with no feedback (first trial, shape oscillation radius 1.875 a.d.u.) and one with simulated positive feedback that eventually stops the shape from oscillating (second trial, shape oscillation radii: 1.875 a.d.u., 1.250 a.d.u., 0.625 a.d.u., 0 a.d.u.).
External Link to Movie (opens in browser): http://www.princeton.edu/~miordan/sculpting.html
Video shows an example trial during a neurofeedback training run with simulated positive feedback that eventually stops the shape from oscillating (shape oscillation radii: 1.875 a.d.u., 1.250 a.d.u., 0.625 a.d.u., 0 a.d.u.). The top right corner shows the simulated trajectory of the current stimulus in the neural shape model. The bottom left corner shows the extent of shape oscillation in the circular shape space.
External Link to Movie (opens in browser): http://www.princeton.edu/~miordan/sculpting.html
Acknowledgments
The authors would like to thank M. L. Nguyen, A. N. Hoskin, E. A. Piazza, N. Keshavarzian, E. A. McDevitt, J. W. Antony, N. Y. Kim, and A. R. Burton for help with data collection; A. C. Mennen for discussions about the infrastructure for real-time fMRI neurofeedback; N. DePinto, L. E. Nystrom, and G. McGrath for technical support; and A. Tompary for contributions to initial pilot studies.