## Abstract

Abstraction can be defined as a cognitive process that identifies common features - abstract variables, or concepts - shared by many examples. Such conceptual knowledge enables subjects to generalize upon encountering new examples, an ability that supports inferential reasoning and cognitive flexibility. To confer the ability to generalize, the brain must represent variables in a particular ‘abstract’ format. Here we show how to construct neural representations that encode multiple variables in an abstract format simultaneously, and we characterize their geometry. Neural representations conforming to this geometry were observed in dorsolateral pre-frontal cortex, anterior cingulate cortex and the hippocampus in monkeys performing a serial reversal-learning task. Similar representations are observed in a simulated multi-layer neural network trained with back-propagation. These findings provide a novel framework for characterizing how different brain areas represent abstract variables that are critical for flexible conceptual generalization.

The capacity to conceptualize information underlies most forms of high-level cognitive processing, including categorizing, reasoning, planning, emotion regulation, and decision-making. Each of these processes relies on the ability of the brain not merely to represent information about the details of a particular event (or “example”), but to represent information that describes one or more features that are also shared by many other examples. These features correspond to abstract “cognitive” variables, or concepts. Knowledge of abstract variables (conceptual knowledge) enables one to generalize and immediately draw conclusions about a newly encountered example^{1}. Moreover, the capacity to generalize across multiple abstract variables enhances cognitive and emotional flexibility, enabling one to adjust behavior in a more efficient and adaptive manner. However, a conceptual framework and corresponding data for understanding how the brain simultaneously represents multiple variables in an abstract format - i.e., how the brain can link a single example to multiple concepts simultaneously - have been lacking.

One possibility is that when a population of neurons represents an abstract variable, all information about specific examples is discarded and only the combination of features essential to the abstract variable is retained. For example, the only information retained in an abstract format could be the feature that all the instances belonging to a conceptual set have in common. However, in this case, generalization applied to a new example can only occur with respect to this single encoded abstract variable. The capacity to link a new example to multiple abstract variables simultaneously promotes flexibility, but it requires neural populations to retain multiple pieces of information in an abstract format. To investigate whether and how variables are represented in an abstract format within a neural population, we trained monkeys to perform a serial reversal-learning task. The task involved switching back and forth between two contexts, which were un-cued and defined by sets of stimulus-response-outcome mappings (or contingencies) that differed between contexts. Monkeys utilized inference to perform this task efficiently, as demonstrated by the fact that once they experienced that one trial type had changed its contingencies upon a context switch, they changed their behavior for other trial types in the new context.

We sought to determine whether the variables related to this task, especially the variable ‘context’, were represented by neuronal populations in an abstract format that could support generalization. Neurophysiological recordings were targeted to the hippocampus (HPC) and two parts of the pre-frontal cortex (PFC), the dorsolateral pre-frontal cortex (DLPFC) and the anterior cingulate cortex (ACC). The hippocampus has long been implicated in generating episodic associative memories ^{2–4} that could play a central role in creating and maintaining representations of variables in an abstract format. Indeed, studies in humans have suggested a role for HPC in the process of abstraction^{5}. We targeted ACC and DLPFC due to their established role in encoding rules and other cognitive information ^{6–10}. Neural signals representing abstract cognitive variables have been described in PFC ^{7, 9–11}, but prior studies have not tested explicitly whether and how multiple variables are represented in an abstract format within a population of neurons.

Neurophysiological data show that multiple task-relevant variables, including context, operant response, and reinforcement outcome, are represented simultaneously in an abstract format in hippocampus, DLPFC, and ACC. This abstract format was revealed by an analysis of the geometry of the representations, which is characterized by the arrangement of the points representing different experimental conditions in the firing rate space of all recorded neurons. In this firing rate space, the parallelism of the coding directions for each abstractly represented variable was significantly enhanced compared to a random unstructured geometry in which abstraction does not occur. The observed geometry enables generalization across conditions within the recorded neural populations in all three brain areas, a signature of abstraction. In a multi-layered neural network trained with back-propagation, a similar geometry of the representations was observed. These results provide a conceptual and mechanistic framework for understanding how the brain can relate a single example to multiple abstract variables simultaneously within a population of neurons.

### Monkeys use inference to adjust their behavior

We designed a serial-reversal learning task in which switches in context involve un-cued and simultaneous changes in the operant and reinforcement contingencies of four stimuli. In other words, while the set of stimuli remains constant, two distinct sets of stimulus-response-outcome mappings exist implicitly, one for each context. Correct performance for two of the stimuli in each context requires releasing the button after stimulus disappearance; for the other two stimuli, the correct operant response is to continue to hold the button (Figure 1a,b; see Methods for details). For two of the stimuli, correct performance results in reward delivery; for the other two stimuli, correct performance does not result in reward delivery, but it does prevent both a time-out and monkeys’ having to repeat the trial (Figure 1b). Without warning, randomly after 50-70 trials, the operant and reinforcement contingencies switch to the other context; contexts switch many times within an experiment.

On average, the monkeys’ performance drops significantly below chance immediately after a context switch, as the change in contingencies is un-cued (see image number 1 in Fig. 1c). In principle, monkeys could simply re-learn the correct stimulus-response associations for each image independently after every context switch. Behavioral evidence indicates that this is not the case. Instead, the monkeys perform inference. After a context switch, as soon as they have experienced the changed contingencies for one or more stimuli, on average they infer that the contingencies have changed for the stimuli not yet experienced in the new context. Performance is significantly above chance for the stimulus conditions when inference can be applied (see image numbers 2-4 in Fig. 1c). As soon as monkeys exhibited evidence of inference by performing correctly on an image’s first appearance after a context switch, their performance was sustained at asymptotic levels for the remainder of the trials in that context (Fig. 1d).

The observation that monkeys can perform inference suggests that the different stimulus-response-outcome associations within the same context are somehow linked together. The observed behavior can be explained by utilization of at least two distinct strategies. First, monkeys could utilize a relatively large look-up table to decide what to do. This strategy entails remembering the stimulus-action-outcome combination of the previous trial and then using this information to select the action in response to the image on the current trial. Since there are 4 possible images that could appear on any given trial, and these images could appear after a trial from either context, the table would contain 32 entries (4 images multiplied by the 8 possible stimulus-action-outcome combinations of the previous trial). The second strategy requires that monkeys create a new abstract variable (a “concept”) that pools together all the stimulus-response-outcome combinations (trial types, or instances) that are present within each context to create representations of the two contexts. Upon seeing an image, a monkey can refer to its knowledge of the current context, and then select an action. Both strategies support the inference that we just discussed.
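The size of the hypothetical look-up table can be made concrete with a short sketch. All labels here (`img1`, `release`, and so on) are illustrative placeholders, not the actual stimuli or contingencies used in the experiment:

```python
from itertools import product

# Four visual stimuli (placeholder names).
images = ["img1", "img2", "img3", "img4"]

# Under correct performance, each image has one action-outcome pair per
# context, so the previous trial is summarized by one of 8 possible
# stimulus-action-outcome combinations (4 images x 2 contexts).
prev_trial_states = list(product(images, ["ctx1", "ctx2"]))

# A hypothetical contingency table: correct action per (image, context).
correct_action = {
    ("img1", "ctx1"): "release", ("img2", "ctx1"): "release",
    ("img3", "ctx1"): "hold",    ("img4", "ctx1"): "hold",
    ("img1", "ctx2"): "hold",    ("img2", "ctx2"): "hold",
    ("img3", "ctx2"): "release", ("img4", "ctx2"): "release",
}

# The look-up table maps (previous trial summary, current image) -> action:
# 8 previous-trial states x 4 current images = 32 entries.
lookup = {}
for prev, cur in product(prev_trial_states, images):
    prev_img, prev_ctx = prev
    # For convenience we fill the entries using the context implied by the
    # previous trial; the strategy itself only needs the memorized table.
    lookup[(prev, cur)] = correct_action[(cur, prev_ctx)]

print(len(lookup))  # 32 entries, as in the text
```

Note that the table is filled in here via the implied context only as a construction device; an animal using this strategy would simply memorize all 32 entries, with no explicit representation of context.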

### The geometry of neural representations that encode abstract variables

We now consider what types of patterns of activity in populations of neurons could support task performance for each of the two strategies that we just described. These patterns could encode information about context in different ways that would correspond to different levels of abstraction. One way to encode the look-up table of the first strategy is to represent each entry of the table with one distinct pattern of activity, which would encode one combination of the current stimulus and one of the 8 possible stimulus-action-outcome sequences of the previous trial. These 8 sequences could be encoded in the interval preceding the presentation of the stimulus with 8 distinct patterns of activity. When these patterns are random, it is likely that for a sufficient number of neurons the 4 conditions of one context are separable from the 4 conditions of the other context. This is true also in the case in which there is a cloud of points for each of the 8 conditions^{12}. Hence, a simple linear decoder can extract the information about context from neural activity in the case of random patterns. Nevertheless, random representations are obviously not in an abstract format, and they would not permit generalization across conditions.

To understand which features of a neural representation enable generalization, the signature of abstraction, it is instructive to consider the geometry of the firing rate space. In this space, each coordinate axis is the firing rate of one neuron, and hence, the total number of axes is as large as the number of recorded neurons.

One simple way to represent context in an abstract format is to retain only the information about context and discard all information about the stimulus, action and reinforcement outcome of the previous trial. Essentially, the neural population would not encode information about specific instances (the details of the prior trial), but would instead encode what is shared between all prior trials of the same context. In this case, the 4 points in the firing rate space that correspond to context 1 would coincide, or, more realistically, in the presence of noise, they would cluster around a single point (see Fig. 2a for an example of simulated data where the activity of 3 neurons are plotted against each other for each of 8 conditions). The other 4 points, for trials occurring in context 2, would form a different cluster. This geometry represents context in an abstract format because it provides a representation that is dissociated from specific instances. Importantly, clustering leads to a geometric arrangement that permits generalization. Indeed, a linear classifier trained to decode context from a small subset of clustered points will generalize right away to all the other points, if the noise is not too large. This is a fundamental property of abstraction that has already been discussed in ^{5,11}.
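The generalization property of a clustered geometry can be illustrated with a small simulation. This is a minimal sketch with arbitrary parameters (neuron count, noise levels), not the analysis pipeline applied to the recorded data: a linear classifier fit to trials from a single condition per context immediately classifies the held-out conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, spread, noise = 50, 0.1, 0.05

# Two cluster centers, one per context (a Fig. 2a-style simulation).
c1, c2 = rng.normal(size=n_neurons), rng.normal(size=n_neurons)

# Four condition points per context, tightly clustered around each center.
cond1 = c1 + spread * rng.normal(size=(4, n_neurons))
cond2 = c2 + spread * rng.normal(size=(4, n_neurons))

def trials(point, n=30):
    # Noisy single-trial responses around a condition mean.
    return point + noise * rng.normal(size=(n, n_neurons))

# Train a least-squares linear classifier on ONE condition per context...
X_train = np.vstack([trials(cond1[0]), trials(cond2[0])])
y_train = np.r_[np.ones(30), -np.ones(30)]
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# ...and test on the six held-out conditions: it generalizes immediately.
X_test = np.vstack([trials(p) for p in cond1[1:]] + [trials(p) for p in cond2[1:]])
y_test = np.r_[np.ones(90), -np.ones(90)]
accuracy = np.mean(np.sign(X_test @ w) == y_test)
print(accuracy)  # close to 1.0 when the noise is small
```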

The clustering geometry we just described allows one to encode only a single abstract variable, but in the real world a single example can often be linked to multiple abstract variables simultaneously. We now show that it is possible to construct neural representations that encode multiple variables in an abstract format at the same time. Consider the simple example that we illustrate in Figure 2b, where we show again the neural representation in the firing rate space in the case of 3 neurons. In this example, the firing rate *f*_{3} of the third neuron in the interval preceding image onset depends only on context and not the stimulus identity, operant action or value of the previous trial. The points of the two contexts lie on two parallel planes that are orthogonal to the 3rd axis. The other two neurons encode the other task-relevant variables as strongly as the third neuron encodes context. In this geometric format, there is no need for clustering, and indeed, the distances between data points within a context are as large as the distances for data points across contexts. However, an abstract representation of context is clearly embedded in the geometry because the third neuron encodes only context and discards all other information, enabling generalization.

The simple example of Fig. 2b relies on neurons that encode only one variable, context, to provide a representation in an abstract format. Neurons with such pure selectivity are rarely observed, either in our dataset (see below), or more generally in many studies that have demonstrated that neurons more commonly exhibit mixed selectivity for multiple variables ^{13, 14}. However, all the generalization properties of the representation of context shown in Figure 2b are preserved even when we rotate the points in the firing rate space (see Figure 2c). Here context is still an abstract variable, even though all neurons may now respond to multiple task-relevant variables. Notice that the neural representation constructed in Figure 2c already encodes multiple abstract variables at the same time. Indeed, all 3 neurons respond to more than one task-relevant variable, and, more specifically, they exhibit linear mixed selectivity ^{13, 14} to context, operant action and reward value of the previous trial. Context, action and value are also abstract, as illustrated in Figure 2d, where we show the exact same geometry of Figure 2c, but highlight how it also encodes the value of the previous trial in an abstract format. All the points corresponding to the rewarded conditions are contained in the yellow plane, and the non-rewarded points are in the gray plane. These two planes are parallel to each other, just like the ones for the different contexts in Figure 2c. Using a similar construction, it is possible to represent as many abstract variables as the number of neurons. However, additional limitations would arise from the amount of noise that might corrupt these representations.
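The construction of Figure 2c,d can be sketched numerically: place the 8 conditions at the vertices of a cube whose axes are context, action and value, rotate them into a larger firing-rate space, and verify that the coding direction for context is the same for every matched pair of conditions. This is a toy construction with arbitrary dimensionalities and scales:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

# Factorized geometry: 8 conditions at cube vertices, one axis each for
# context, action, and value of the previous trial (Fig. 2b-style).
conditions = np.array(list(product([-1, 1], repeat=3)), dtype=float)  # 8 x 3

# Embed in a larger firing-rate space and apply a random rotation
# (Fig. 2c-style): no neuron is purely selective any more, but each
# variable is still encoded along one fixed population direction.
n_neurons = 40
Q, _ = np.linalg.qr(rng.normal(size=(n_neurons, n_neurons)))
rates = conditions @ Q[:3, :]          # 8 conditions x n_neurons

# The coding direction of context is the same regardless of which matched
# conditions we compare: the differences between pairs are parallel.
diffs = rates[4:] - rates[:4]          # flip the context bit, 4 pairs
cosines = [diffs[i] @ diffs[j] / (np.linalg.norm(diffs[i]) * np.linalg.norm(diffs[j]))
           for i in range(4) for j in range(i + 1, 4)]
print(min(cosines))  # 1.0: perfectly parallel coding directions
```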

### The geometry of the recorded representations

We now show that the geometry of neural representations recorded from monkeys performing the serial reversal learning task conforms to a geometry that supports abstraction, just as we proposed in Figure 2c,d. We will first show that the information about context and other task-relevant variables are present in the recorded neural representations. Then, we will show that this information is encoded in a way that allows for generalization and conforms to the geometry just described.

We recorded the activity of 1378 individual neurons in the PFC and hippocampus in two monkeys while they performed the serial reversal learning task. Of these, 629 cells were recorded in HPC (407 and 222 from each of the two monkeys, respectively), 335 cells were recorded in ACC (238 and 97 from each of the two monkeys), and 414 cells were recorded in DLPFC (226 and 188 from the two monkeys). Individual neurons exhibit mixed selectivity to all the task-relevant variables, and their responses are very diverse in all three brain areas (see Figure S1). The task-relevant variables context, reinforcement outcome (value) and operant action of the previous trial were represented in each of the brain areas, as each variable could be decoded from the firing rates of populations of neurons in each area (Fig. S2a; see Methods). In particular, they could all be decoded in the time interval preceding the stimulus presentation (Fig. S2b,c right panels).

To understand in what format the information about these variables is represented, we used dimensionality reduction methods (multi-dimensional scaling, MDS) to visualize the recorded representations in 3 dimensions (see Methods). This method revealed that the representations appear similar to those illustrated in Figure 2c,d. Figure 3 depicts the MDS plots for all three brain areas, using the same notation as in Figure 2a-d. In HPC, the red and the blue points, which represent the two contexts, are well separated, suggesting some degree of clustering. However, also in this case the intra-context distances are clearly not negligible, and the points within the clusters are systematically organized. For example, the rewarded and non-rewarded conditions are well separated; this organization is particularly evident in the movies in the Supplementary Material, in which these plots can be viewed from many different angles. This indicates that value could also be represented in an abstract format. This type of geometric structure is even more prominent in the DLPFC and ACC. Moreover, the movies in the Supplementary Material suggest that the four points of each context are contained in a low-dimensional subspace, almost a plane, as in the geometry proposed in Figure 2c,d. The planes corresponding to the two contexts are approximately parallel.
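For readers who wish to reproduce this kind of visualization, classical MDS can be sketched in a few lines. This is a generic implementation under standard assumptions; the exact procedure applied to the recorded data is the one described in the Methods:

```python
import numpy as np
from itertools import product

def classical_mds(X, k=3):
    # Classical MDS: double-center the squared-distance matrix and keep the
    # top-k eigenvectors of the resulting Gram matrix.
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n                          # centering matrix
    B = -0.5 * J @ D2 @ J                                        # Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Toy check: 8 cube-vertex condition means embedded in a 40-d firing-rate
# space. Since they span only 3 dimensions, the 3-d embedding is exact.
rng = np.random.default_rng(0)
cube = np.array(list(product([-1, 1], repeat=3)), dtype=float)
Q, _ = np.linalg.qr(rng.normal(size=(40, 40)))
X = cube @ Q[:3, :]
Y = classical_mds(X, k=3)
d_orig = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
d_mds = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
print(np.allclose(d_orig, d_mds))  # True: pairwise distances are preserved
```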

### Measuring abstraction in high-dimensional spaces: cross-condition generalization and the parallelism score

We now introduce two complementary methods for characterizing the geometry of the neural representations in the original high-dimensional firing rate space. The first method is related to the ability of a linear readout to generalize for multiple variables simultaneously, which is a fundamental property of representing a variable in an abstract format. To illustrate this method, consider for simplicity only a subset of four of the eight conditions in our experiments, where two trial types come from each of contexts 1 and 2, and only one of the trial types in each context is rewarded. A decoder can be trained to classify context only on the conditions in which the monkey received a reward in the previous trial. Thanks to the arrangement of the four points (the lines going through the two points of each context are almost parallel), the resulting decoding hyperplane (gray line in Figure 4a) successfully classifies context when testing on the other two (held-out) conditions in which the monkey did not receive reward. This decoding performance on trials that the classifier was not trained on corresponds to generalization, and it is a signature of the process of abstraction. In other words, if a decoder is trained on a subset of conditions, and it immediately generalizes to other conditions unseen before, without any need for retraining, we conclude that a variable is represented in an abstract format (one that enables generalization). To determine whether the data exhibits the geometry of Figure 2c, we can directly test the ability to generalize by following the same procedure illustrated in Figure 4a: we train a decoder on a subset of conditions and test it on the other conditions. We define the performance of the decoder on these other conditions as the cross-condition generalization performance (CCGP).

It is important to stress that the ability to decode a variable like context when training a decoder on a subset of all the trials does not necessarily imply that the CCGP for context is large. For example, if the points corresponding to different conditions are at random positions in the firing rate space, the decoding accuracy can be arbitrarily high if the noise is small, but the CCGP will be at chance level. In other words, in the example of Figure 4a, if the points are at random positions and not arranged as in the figure, the probability that the two test conditions will be on the correct sides of the hyperplane is 0.5, and hence the CCGP would not be different from chance. This is true even when all four points are very well separated (spanning a 3-dimensional subspace), and context is therefore decodable (for points at random positions, the probability that the points of the two contexts are not separable goes to zero as the number of neurons increases).
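The contrast between decodability and CCGP can be made concrete in a simulation. This is a simplified sketch: a least-squares linear decoder, arbitrary noise levels, and a leave-one-condition-out-per-class scheme rather than the exact cross-validation scheme described in the Methods:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
n_neurons, noise, n_trials = 40, 0.3, 50

def decode_accuracy(train_pts, test_pts, train_labels, test_labels):
    # Least-squares linear decoder: a stand-in for the paper's classifier.
    X = np.vstack([p + noise * rng.normal(size=(n_trials, n_neurons)) for p in train_pts])
    y = np.repeat(train_labels, n_trials).astype(float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    Xt = np.vstack([p + noise * rng.normal(size=(n_trials, n_neurons)) for p in test_pts])
    yt = np.repeat(test_labels, n_trials)
    return np.mean(np.sign(Xt @ w) == np.sign(yt))

def ccgp(points, labels):
    # Train on all conditions except one per class, test on the held-out
    # pair; average the generalization accuracy over all held-out pairs.
    pos, neg = np.where(labels > 0)[0], np.where(labels < 0)[0]
    accs = []
    for i in pos:
        for j in neg:
            train = [k for k in range(len(points)) if k not in (i, j)]
            accs.append(decode_accuracy(points[train], points[[i, j]],
                                        labels[train], labels[[i, j]]))
    return float(np.mean(accs))

# Structured, factorized geometry (as in Fig. 2c): 8 cube vertices rotated
# into a 40-d firing-rate space; context is the first cube coordinate.
cube = np.array(list(product([-1, 1], repeat=3)), dtype=float)
Q, _ = np.linalg.qr(rng.normal(size=(n_neurons, n_neurons)))
structured = 3 * cube @ Q[:3, :]
context = cube[:, 0]
s_acc = ccgp(structured, context)

# Random geometry with matched scale: context is still linearly decodable
# from the trained conditions, but generalization to held-out conditions
# hovers around chance (averaged over several random geometries).
r_acc = float(np.mean([ccgp(3 * np.sqrt(3 / n_neurons) * rng.normal(size=(8, n_neurons)),
                            context) for _ in range(5)]))
print(s_acc, r_acc)  # high CCGP vs. near-chance CCGP
```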

The second method we used to characterize the geometry of neural representations posits that there is a specific aspect of the geometry that may account for generalization performance: the degree to which the coding directions determined when training a decoder are parallel for different sets of training conditions. Consider the case depicted in Figure 4b. Here we draw the two hyperplanes (which are lines in this case) obtained when a decoder is trained on the two points on the left (the rewarded conditions, gray) or on the two points on the right (unrewarded conditions, black). The two lines representing the hyperplanes are almost parallel, indicating that this geometry will allow good generalization regardless of which pair of points we train on.

We estimate the extent to which these hyperplanes are aligned by examining the coding directions (the arrows in the figure) that are orthogonal to the hyperplanes. For good generalization, these coding directions should be as close to parallel as possible. This is the main idea behind the parallelism score (PS), a measure described in detail in the Methods. A large PS indicates a geometry likely to permit generalization and therefore the corresponding variable would be represented in an abstract format. When multiple abstract variables are simultaneously represented, the PS should be large for all variables, constraining the points to approximately define a geometry of the type described in Figure 2c,d. As the PS focuses on parallelism between coding directions, it can detect the existence of abstract variables even when the neurons are not specialized, or, in other words, when the coding directions are not parallel to the coordinate axes.
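A simplified version of the parallelism score can be sketched as follows. The exact definition, including the normalization and null models, is given in the Methods; here coding vectors are simply the differences between matched conditions, and the score is the best average cosine similarity over all pairings:

```python
import numpy as np
from itertools import permutations, product

def parallelism_score(points, labels):
    # Simplified PS: pair each condition of one class with a condition of
    # the other class, compute the unit coding vectors linking the pairs,
    # and take the pairing that maximizes their mean cosine similarity.
    pos = points[labels > 0]
    neg = points[labels < 0]
    best = -1.0
    for perm in permutations(range(len(neg))):
        vecs = pos - neg[list(perm), :]
        vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
        cos = [vecs[i] @ vecs[j]
               for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
        best = max(best, float(np.mean(cos)))
    return best

cube = np.array(list(product([-1, 1], repeat=3)), dtype=float)
labels = cube[:, 0]  # context: the first cube coordinate

ps_cube = parallelism_score(cube, labels)  # factorized geometry: PS = 1
rng = np.random.default_rng(3)
ps_rand = parallelism_score(rng.normal(size=(8, 40)), labels)  # random geometry
print(ps_cube, ps_rand)  # 1.0 for the cube; much lower for random points
```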

In Figure 4 we report both the CCGP and PS measured in the three brain areas during the 900 ms time interval that starts 800 ms before the presentation of the visual stimulus (see Methods). The CCGP analysis reveals that context is abstract in all three areas, and the level of abstraction is comparable across brain areas. An analysis similar to what has been proposed in ^{5}, which only considered clustering for representing an abstract variable, would lead to the wrong conclusion that context is strongly abstract only in HPC (Figure S3). In fact, all three variables are represented in an abstract format in all three brain areas, except the action of the previous trial in the hippocampus. Interestingly, the action can be decoded in HPC (see Figure 4c), even though it is not abstract. Remarkably, the PS exhibits a pattern very similar to the CCGP, indicating a direct correspondence between the geometry of representations and generalization. In conclusion, these analyses show that multiple abstract variables are encoded in the populations of neurons recorded from each studied brain area. Moreover, the geometry of these representations is similar to the one constructed in Fig. 2c,d.

### Abstraction in multi-layer neural networks trained with back-propagation

The geometric features of neural representations that we have just described may constitute a general principle that neural networks exhibit when performing cognitive tasks relying on conceptual information. To examine this possibility, we trained a simple neural network model with back-propagation, and asked whether neural representations in the network exhibit similar geometric features as observed in the experiments. Using back-propagation we trained a two-layer network (see Figure 5a) to read an input representing a handwritten digit between 1 and 8 (from the MNIST dataset) and to output whether the input digit is odd or even, and, at the same time, whether the input digit is large (> 4) or small (≤ 4) (Figure 5b). We wanted to test whether the learning process would lead to abstract representations of two variables, or concepts: parity and magnitude (i.e. large or small). This abstraction process is similar to the one studied in the experiment in the sense that it involves aggregating together inputs that are visually very dissimilar (e.g. the digits ‘1’ and ‘3’, or ‘2’ and ‘4’). Analogously, in the experiment very different sequences of events (visual stimulus, operant action and value) are grouped together into what we defined as contexts.
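The training setup can be sketched as follows. Since MNIST images are not bundled here, random high-dimensional prototypes stand in for the digit classes; the architecture, loss, and training schedule are illustrative choices, not the exact hyper-parameters used in the simulations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in inputs: each digit 1-8 is a random 100-d prototype plus noise
# (the actual simulations used MNIST images of these digits).
digits = np.arange(1, 9)
prototypes = rng.normal(size=(8, 100))

def sample(n):
    idx = rng.integers(0, 8, size=n)
    X = prototypes[idx] + 0.3 * rng.normal(size=(n, 100))
    d = digits[idx]
    Y = np.stack([d % 2 == 0, d > 4], axis=1).astype(float)  # parity, magnitude
    return X, Y

# Network with two hidden layers and two sigmoid outputs (parity and
# magnitude), trained with plain back-propagation on cross-entropy.
sizes = [100, 50, 50, 2]
W = [rng.normal(scale=1 / np.sqrt(m), size=(m, n)) for m, n in zip(sizes, sizes[1:])]
b = [np.zeros(n) for n in sizes[1:]]
relu = lambda z: np.maximum(z, 0.0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

lr = 0.5
for _ in range(500):
    X, Y = sample(64)
    h1 = relu(X @ W[0] + b[0])
    h2 = relu(h1 @ W[1] + b[1])
    out = sigmoid(h2 @ W[2] + b[2])
    d3 = (out - Y) / len(X)          # cross-entropy gradient at the output
    d2 = (d3 @ W[2].T) * (h2 > 0)    # back-propagate through hidden layer 2
    d1 = (d2 @ W[1].T) * (h1 > 0)    # ...and through hidden layer 1
    for k, (inp, delta) in enumerate([(X, d1), (h1, d2), (h2, d3)]):
        W[k] -= lr * inp.T @ delta
        b[k] -= lr * delta.sum(axis=0)

# Held-out accuracy on both outputs.
X, Y = sample(1000)
h2 = relu(relu(X @ W[0] + b[0]) @ W[1] + b[1])
acc = np.mean((sigmoid(h2 @ W[2] + b[2]) > 0.5) == Y.astype(bool))
print(acc)  # well above chance on both outputs after training
```

The activity `h2` of the second hidden layer is what would be submitted to the MDS, CCGP and PS analyses described above.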

After training the network, we presented input samples that were held out from training, and we ‘recorded’ the activity of the two hidden layers. The multidimensional scaling plots, similar to those of Figure 3 for the experimental data (but reduced to two dimensions), are shown in Figure 5c for the input layer and for the two hidden layers of the simulated network. Each digit in these plots represents a different input. They are colored according to the parity/magnitude tasks illustrated in Figure 5b. While it is difficult to detect any structure in the input layer (the slight bias towards red on the left side is mostly due to the similarity between ‘1’s and ‘7’s), in the second hidden layer we observe the type of geometry that would be predicted for a neural representation that encodes two abstract variables, namely parity (even digits on the left, odd digits on the right), and magnitude (large at the top, small at the bottom). The digits tend to cluster at the four vertices of a square, which is the expected arrangement.

Just as in the experiments, we computed both the CCGP and the PS. We analyzed these two quantities for all possible dichotomies of the eight digits, not just for the dichotomies corresponding to magnitude and parity. This corresponds to all possible ways of dividing the digits into two equal size groups. In Figure 5d,e we ranked these dichotomies according to their CCGP and their PS, respectively. The largest CCGP and PS correspond to the parity dichotomy, and the second largest values correspond to the magnitude dichotomy (circles marked by crosses in Figure 5d,e). For these two dichotomies, both the CCGP and the PS are significantly different from those of the random models. There are other PS values that are significant. However, they correspond to dichotomies that are correlated with one or both of the two trained dichotomies. If one restricts the analysis to the dichotomies orthogonal to both of them (filled circles in Figure 5d,e), none are significantly above chance level. This analysis shows that the geometry of the neural representations in the simulated network is similar to that observed in the experiment. Furthermore, the CCGP and the PS can identify the dichotomies that correspond to abstract variables even when one has no prior knowledge about these variables. Indeed, it is sufficient to compute the CCGP and the PS for all possible dichotomies to discover that parity and magnitude are the abstract variables in these simulations.
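Enumerating the dichotomies analyzed in Figure 5d,e is straightforward. The parity and magnitude sets below follow the task definitions of Figure 5b; `orthogonal` is one reasonable formalization of "uncorrelated with both trained dichotomies" (balanced overlap with each):

```python
from itertools import combinations

digits = list(range(1, 9))

# All balanced dichotomies: ways of splitting 8 digits into two groups of
# 4. Each split is counted once by fixing digit 1 in the first group.
dichotomies = [set(c) for c in combinations(digits, 4) if 1 in c]
print(len(dichotomies))  # 35 dichotomies, as ranked in Figure 5d,e

parity = {2, 4, 6, 8}      # the trained parity dichotomy
magnitude = {5, 6, 7, 8}   # the trained magnitude dichotomy

# Dichotomies orthogonal to both trained variables split each trained
# group evenly (2 vs 2).
orthogonal = [d for d in dichotomies
              if len(d & parity) == 2 and len(d & magnitude) == 2]
print(len(orthogonal))
```

Computing the CCGP and PS for each of these 35 dichotomies, as in the text, identifies the abstract variables with no prior knowledge of which dichotomies were trained.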

Next, we asked whether a neural network trained to perform a simulated version of our experimental task would reveal similar principles. Because of the sequential character of the task, it was natural to model the monkeys’ interaction with their environment in the task within a Reinforcement Learning framework ^{15}. Specifically, we used Deep Q-learning, a technique that uses a deep neural network representation of the state-action value function of an agent trained with a combination of temporal-difference learning and back-propagation, refined and popularized by ^{16}. The use of a neural network is ideally suited for a comparative study between the neural representations of the model and the recorded neural data. As commonly observed in Deep Q-learning, the neural representations that we obtained display significant variability across runs of the learning procedure. However, in a considerable fraction of runs, the neural representations recapitulate the main geometric features that we observed in the experiment (Supplementary Material S3). In particular, after learning, context is represented as an abstract variable in the last layer, despite not being explicitly represented in the input, nor in the output. Moreover, the neural representations of the simulated network encode multiple abstract variables simultaneously, as observed in the experiment.
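The structure of the simulated task can be sketched as a minimal environment. This is an illustrative simplification: the stimulus labels, reward scheme and switch statistics only approximate the actual task and the Deep Q-learning setup described in the Methods:

```python
import random

class SerialReversalTask:
    """Minimal sketch of the serial reversal-learning environment.
    Stimulus indices, the reward assignment and the action coding
    (0 = release, 1 = hold) are illustrative simplifications."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.context = 0
        self._schedule_switch()
        # Correct action per stimulus in context 0; contingencies are
        # inverted (XOR with the context bit) in context 1.
        self.correct = [0, 0, 1, 1]
        # Correct performance is rewarded for two of the four stimuli.
        self.rewarded = [True, False, True, False]

    def _schedule_switch(self):
        # Contexts switch without warning after 50-70 trials.
        self.trials_left = self.rng.randint(50, 70)

    def new_trial(self):
        self.stimulus = self.rng.randrange(4)
        return self.stimulus

    def step(self, action):
        target = self.correct[self.stimulus] ^ self.context
        ok = (action == target)
        reward = 1 if (ok and self.rewarded[self.stimulus]) else 0
        self.trials_left -= 1
        if self.trials_left == 0:
            self.context ^= 1          # un-cued context switch
            self._schedule_switch()
        return ok, reward

env = SerialReversalTask()
s = env.new_trial()
print(env.step(env.correct[s]))  # a correct response in context 0
```

A Q-learning agent interacting with such an environment only observes the stimulus and its own recent actions and rewards; any representation of context in its hidden layers must therefore be constructed through learning, which is the point of the comparison with the recorded data.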

## Discussion

The cognitive process that finds a common feature – an abstract variable – shared by a number of examples or instances is called abstraction. Abstraction enables one to utilize inference and deduce the value of an abstract variable when encountering a new example. Here we proposed a geometric construction of neural representations that accommodate the simultaneous encoding of multiple abstract variables. These representations are characterized by a high parallelism score, a measure based on the angles between coding directions for any given variable. Crucially, this geometrical property is related to a useful statistical property – cross-condition generalization performance – that characterizes how linear readouts can generalize across conditions, which is a signature of representing a variable in an abstract format.

We used the parallelism score and cross-condition generalization to characterize the neural activity patterns recorded in a serial reversal learning experiment in which monkeys switched back and forth between two contexts. These measures revealed that the representation of the task-relevant variable “context” was in an abstract format in HPC, DLPFC and ACC prior to the stimulus presentation on a trial. This information can support performance of the task as knowledge of the stimulus identity combined with knowledge of context can be used to select which operant action to perform and to anticipate accurately whether to expect a reward.

In addition to context, all the recorded brain areas represented at least one other variable in an abstract format (the action and reward value of the previous trial in DLPFC and ACC, and the value of the previous trial in HPC). Representations that arise in simple neural network models trained with back-propagation also represent multiple variables in an abstract format simultaneously, as the representations exhibit the same geometric properties that we observed in the experimental data. The observed geometry may therefore be a general feature underlying the encoding of abstract variables in biological and artificial neural systems.

Although information about the value and action of the previous trial are represented in all three brain areas, this information is not needed to perform correctly on the next trial if the animal did not make a mistake. However, at the context switch, the reward received on the previous trial is the only feedback from the external world that indicates that the context has changed. Therefore, reward value is essential when adjustments in behavior are required, suggesting that storing the recently received reward could be beneficial. Consistent with this notion, in the Supplementary Material S3 we show in simulations that value becomes progressively more abstract as the frequency of context switches increases (see Figure S14). In addition, monkeys occasionally make mistakes that are not due to a context change. To discriminate between these occasional errors and those due to a context change, information about value is not sufficient and information about the previously performed action could be essential for deciding correctly the operant response on the next trial. Conceivably, the abstract representations of reward and action may also afford the animal more flexibility in learning and performing other tasks. Previous work has shown that information about recent events is represented whether it is task-relevant or not (see e.g. ^{17,18}). This storage of information (a memory trace) may even degrade performance on a working memory task ^{19}, but presumably the memory trace might be beneficial in other scenarios that demand cognitive flexibility in a range of real-world situations.

Our analysis showed that DLPFC and ACC represent more variables in an abstract format than the hippocampus, as the action of the previous trial is in an abstract format only in DLPFC and ACC. This may reflect the prominent role of pre-frontal areas in supporting working memory (see e.g. ^{20–22}). Moreover, the fact that the hippocampus represents fewer variables in an abstract format, as characterized by the parallelism score and cross-condition generalization, explains why context is strongly identified as abstract only in the hippocampus if one considers clustering alone as a signature of abstraction (see Figure S3). The geometric analysis of patterns of neural activity, however, reveals that pre-frontal cortex also represents context in an abstract format. In general, detecting abstract variables becomes more difficult as their number grows, since this increases the dimensionality of the sub-spaces encoding different values of each abstract variable. One therefore requires more samples in order to generalize, which affects the statistics of the cross-condition generalization performance.

Context, action and value of the previous trial can all be represented in an abstract format in the recorded areas, but context is particularly interesting because it is not explicitly represented in the sensory input, nor in the motor response, and hence it requires a process of abstraction (learning) based on the temporal statistics of sequences of stimulus-response-outcome associations. However, it is important to stress that learning may also be required for creating abstract representations of more concrete variables, such as action, which corresponds to a recent motor response, or value, which encodes a sensory experience, namely recent reward delivery.

### Dimensionality and abstraction in neural representations

Dimensionality reduction is widely employed in many machine learning applications and data analyses because, as we have seen, it leads to better generalization. In our theoretical framework, we constructed representations of abstract variables that are indeed relatively low-dimensional, as the individual neurons exhibit linear mixed selectivity ^{13, 14}. In fact, these constructed representations have a dimensionality that is equal to the number of abstract variables that are simultaneously encoded. Consistent with this, the neural representations recorded in the time interval preceding the presentation of the stimulus are relatively low-dimensional, as expected (Supplementary S4).
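The link between linear mixed selectivity and low dimensionality can be illustrated with a toy construction (our own sketch, not the authors' code): when every neuron responds to a linear combination of two binary variables, the mean activity patterns of the four resulting conditions span a space whose dimensionality equals the number of encoded variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 200

# Two binary abstract variables (+1/-1 coding) define four conditions.
conditions = np.array([[a, b] for a in (-1, 1) for b in (-1, 1)], dtype=float)

# Linear mixed selectivity: each neuron responds to a random linear
# combination of the two variables (no nonlinear interaction terms).
W = rng.normal(size=(2, n_neurons))
responses = conditions @ W  # shape: (4 conditions, n_neurons)

# Dimensionality of the mean responses: the number of non-trivial
# singular values equals the number of encoded variables (here, 2).
centered = responses - responses.mean(axis=0)
sv = np.linalg.svd(centered, compute_uv=False)
dimensionality = int(np.sum(sv > 1e-8 * sv[0]))
print(f"embedding dimensionality: {dimensionality}")  # -> 2
```

Adding nonlinear mixed selectivity (e.g. a term proportional to the product of the two variables) would raise the dimensionality above the number of variables, which is the regime discussed in the next paragraph.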

Previous studies showed that the dimensionality of neural representations can be maximal (monkey PFC, ^{13}), very high (rodent visual cortex^{23}), or as high as it can be given the structure of the task ^{24}. These results seem to be inconsistent with what we report in this article. However, dimensionality is not a static property of neural representations; in different epochs of a trial, dimensionality can vary significantly (see e.g. ^{25}). Dimensionality has been observed to be maximal in a time interval in which all the task-relevant variables had to be mixed non-linearly to support task performance ^{13}. Here we analyzed a time interval in which the variables that are encoded do not need to be mixed. In this time interval, the most relevant variable is context, and encoding it in an abstract format can enhance flexibility and support inference. However, during the presentation of the stimulus, the dimensionality of the neural representations increases significantly (Supplementary S4), indicating that the context and the current stimulus are mixed non-linearly later in the trial, similar to prior observations ^{13, 14,23, 26, 27}.

Finally, we should emphasize that our data are also consistent with intermediate regimes in which the coding directions are not perfectly parallel. Distortions of the idealized geometry can significantly increase dimensionality, providing representations that preserve some ability to generalize, but at the same time support operations requiring higher dimensional representations (see Supplementary Information S17).

### Other forms of abstraction in the computational literature

Machine learning, and in particular computational linguistics, has recently demonstrated impressive results on difficult lexical semantic tasks thanks to the use of word embeddings, which are vector representations whose geometric properties reflect the meaning of the linguistic tokens they represent. One of the most intriguing properties of recent forms of word embeddings is that they exhibit a linear compositionality that makes the solution of analogy relationships possible via linear algebra ^{28, 29}. A well-known example is provided by shallow neural network models that are trained in an unsupervised way on a large corpus of documents and end up organizing the vector representations of common words such that the difference of the vectors representing ‘king’ and ‘queen’ is the same as the difference of the vectors for ‘man’ and ‘woman’^{28}. These word embeddings, which can be translated along parallel directions to consistently change one feature (e.g. gender, as in the previous example), clearly share common coding principles with the parallel representations that we propose. Moreover, this type of vector representation has been shown to be highly predictive of fMRI BOLD signals measured while subjects are presented with semantically meaningful stimuli ^{30}.
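The analogy property can be demonstrated with hand-built toy vectors (purely illustrative; real embeddings such as word2vec are learned from large text corpora, not constructed by hand): one direction encodes gender and another encodes royalty, so analogies reduce to vector arithmetic.

```python
import numpy as np

# Hypothetical 3-d embedding: a "gender" direction and a "royalty"
# direction added to a common base vector (illustrative only).
gender = np.array([1.0, 0.0, 0.0])
royalty = np.array([0.0, 1.0, 0.0])
base = np.array([0.0, 0.0, 1.0])

emb = {
    "man":   base + gender,
    "woman": base - gender,
    "king":  base + gender + royalty,
    "queen": base - gender + royalty,
}

# Solving "king - man + woman = ?" by nearest neighbor in embedding space.
query = emb["king"] - emb["man"] + emb["woman"]
nearest = min(emb, key=lambda w: np.linalg.norm(emb[w] - query))
print(nearest)  # -> queen
```

Translating any word vector along the gender direction flips that one feature while leaving the others intact, which is exactly the parallel-coding-direction geometry described in the main text.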

A different but also very appealing approach to extracting compositional features in an unsupervised way relies on variational Bayesian inference to learn to infer interpretable factorized representations of some inputs^{31–34}. These methods have exhibited remarkable success in disentangling independent factors of variations of a variety of real-world datasets, and it will be exciting in the future to examine whether our methods will have any bearing on the analysis of the functioning of these algorithms.

Abstraction is also an important active area of research in Reinforcement Learning (RL), as it is a fertile concept for solution strategies to cope with the notorious “curse of dimensionality”, i.e. the exponential growth of the solution space of a problem with the size of the encoding of its states^{35}. Most abstraction techniques in RL can be divided into two main categories: ‘temporal abstraction’ and ‘state abstraction’.

Temporal abstraction is the workhorse of Hierarchical Reinforcement Learning ^{36–38} and is based on the notion of temporally extended actions (or options): the idea of enriching the repertoire of actions available to the agent with “macro-actions” composed of conditional sequences of atomic actions built to achieve useful sub-goals in the environment. Temporal abstraction can be thought of as an attempt to reduce the dimensionality of the space of action sequences: instead of having to compose policies in terms of long sequences of actions, the agent can select options that automatically extend for several time steps.

State abstraction methods are most closely related to our work. In brief, they rely on the idea of simplifying the representation of the domain exposed to the agent by hiding or removing information about the environment that is non-critical to maximize the reward function. Typical instantiations of this technique involve information hiding, clustering of states, and other forms of domain aggregation and reduction ^{39}. The use of neural networks as function approximators to represent a decision policy, which we used in the last section of the Results and in Supplementary Information S3, also falls within the category of state abstraction methods. The idea is that the inductive bias of neural networks induces generalization across similar inputs, allowing them to mitigate the curse of dimensionality in high-dimensional domains. A particularly well-known success story of neural networks applied to reinforcement learning is Deep Q-learning^{16}, which employed a deep neural network representation of the Q-function of an agent trained to play 49 different Atari video games with super-human performance, using a combination of temporal-difference learning and back-propagation. Our efforts to model the task used in the experiments in section S3 demonstrate that neural networks induce the type of abstract representations that we propose, and suggest that our analysis techniques could be useful to elucidate the geometric properties underlying the success of Deep Q-learning networks.

### Characterizing brain areas by analyzing the geometry of neural representations

Historically, brain areas have been characterized by describing what task-relevant variables are encoded, and by relating the encoding of these variables to behavior, either by correlating neural activity with behavioral measures, or by perturbing neural activity to assess the necessity or sufficiency of the signals provided by the brain area. Here we describe a geometric characterization of neural activity patterns that goes beyond variable encoding and corresponds to a notion of abstraction defined in terms of generalization across different instances of the same variable. As discussed previously, even random patterns can encode task-relevant variables, but without representing them in an abstract format. Consequently, merely requiring that a variable be encoded by a neural population may fail to identify those structures responsible for cognitive functions that demand utilization of information stored in an abstract format. The analysis of geometric features such as the parallelism score of the activity patterns evoked in different neural populations could thereby reveal important functional differences between brain areas, differences not apparent from a decoding analysis alone or from an analysis of single neuron response properties.

The generation of neural representations of variables in an abstract format is central to many different sensory, cognitive and emotional functions. For example, in vision, the creation of neural representations of objects that are invariant with respect to their position, size and orientation in the visual field is a typical abstraction process that has been studied in machine learning applications (see e.g. ^{40, 41}) and in the brain areas involved in representing visual stimuli (see e.g. ^{42, 43}). This form of abstraction may underlie fundamental aspects of perceptual learning. Here we have focused on a form of abstraction that we believe is essential to higher cognitive functions, such as context-dependent decision-making and emotion regulation, the use of conceptual reasoning to learn from experience, and the application of inference. The types of abstraction that underlie these processes almost certainly rely on reinforcement learning and memory, as well as the ability to forge conceptual links across category boundaries. The analysis tools developed here can be applied to electrophysiological, fMRI and calcium imaging data and may prove valuable for understanding how different brain areas contribute to various forms of abstraction that underlie a broad range of mental functions. Future studies must focus on the specific neural mechanisms that lead to the formation of abstract representations, which is fundamentally important for any form of learning, for executive functioning, and for cognitive and emotional flexibility.

## Competing Interests

The authors declare that they have no competing financial interests.

## Methods

### M1 Task and Behavior

Two rhesus monkeys (Macaca mulatta; two males, 8 and 13 kg) were used in these experiments. All experimental procedures were in accordance with the National Institutes of Health guide for the care and use of laboratory animals and the Animal Care and Use Committees at New York State Psychiatric Institute and Columbia University. Monkeys performed a serial-reversal learning task in which they were presented one of four visual stimuli (fractal patterns). Stimuli were consistent across contexts and sessions, and presented in random order. Each trial began with the animal holding down a button and fixating for 400 ms (Fig. 1a). If those conditions were satisfied, one of the four stimuli was displayed on a screen for 500 ms. In each context, correct performance for two of the stimuli required releasing the button within 900 ms of stimulus disappearance; for the other two, the correct operant action was to continue to hold the button. For two of the four stimuli, correct performance resulted in reward delivery; for the other two, correct performance did not result in reward. If the monkey performed the correct action, a trace interval of 500 ms ensued, followed by the liquid reward or by a new trial in the case of non-rewarded stimuli. If the monkey made a mistake, a 500 ms time-out was followed by the repetition of the same trial type if the stimulus was a non-rewarded one. In the case of incorrect responses to rewarded stimuli, the time-out was not followed by trial repetition and the monkey simply lost its reward. After a random number of trials between 50 and 70, the context switched without warning, and with it the operant and reinforcement contingencies changed. Operant contingencies switched for all images, but for two stimuli the reinforcement contingencies did not change, in order to ensure orthogonality between operant and reinforcement contingencies.
A different colored frame (red or blue) for each context appeared on the edges of the monitor on 10 percent of trials, randomly selected, and only on specific stimulus types (stimulus C in context 1 and stimulus D in context 2), although never in the first five trials following a context switch. All trials with a contextual frame were excluded from all analyses presented.

### M2 Electrophysiological Recordings

Recordings began only after the monkeys were fully proficient in the task and performance was stable. Recordings were conducted with multi-contact vertical array electrodes (V-Probes, Plexon Inc., Dallas, TX) with 16 contacts spaced at 100 *μm* intervals in ACC and DLPFC, and 24 contacts in HPC, using the Omniplex system (Plexon Inc.). In each session, we individually advanced the arrays into the three brain areas using a motorized multi-electrode drive (NAN Instruments). Analog signals were amplified, band-pass filtered (250 Hz – 8 kHz), and digitized (40 kHz) using a Plexon MAP system (Plexon Inc.). Single units were isolated offline using the Plexon Offline Sorter. To address the possibility that overlapping neural activity was recorded on adjacent contacts, or that two different clusters visible in PCA belonged to the same neuron, we compared the zero-shift cross-correlation of the spike trains (0.2 ms bin width) of each pair of neurons identified in the same area in the same session. If 10 percent of spikes co-occurred, the clusters were considered duplicates and one was eliminated. If 1-10 percent of spikes co-occurred, the cluster was flagged and isolation was checked for a possible third contaminant cell. Recording sites in DLPFC were located in Brodmann areas 8, 9 and 46. Recording sites in ACC were in the ventral bank of the ACC sulcus (area 24c). HPC recordings were largely in the anterior third, spanning CA1, CA2, CA3 and DG.

### M3 Selection of trials/neurons, and decoding analysis

The neural population decoding algorithm was based on a linear classifier (see e.g. ^{11}) trained on pseudo-simultaneous population response vectors composed of the spike counts of the recorded neurons within specified time bins and in specific trials ^{44}. The trials used in the decoding analysis are only those in which the animal responded correctly (both on the current trial and the directly preceding one), in which no context frame was shown (during neither the current nor the preceding trial), and which occurred at least five trials after the most recent context switch. We retain all neurons for which we recorded at least 15 trials satisfying these requirements for each of the eight experimental conditions (i.e. the combinations of context, value and action). Every decoding analysis is averaged across several repetitions to estimate trial-to-trial variability (as explained in more detail below). For every repetition, from among all selected trials we randomly split off five trials per condition to serve as our test set, and used the remaining trials (at least ten per condition) as our training set. For every neuron and every time bin, we normalized the distribution of spike counts across all trials in all conditions with means and standard deviations computed on the trials in the training set. Specifically, given an experimental condition *c* (i.e. a combination of context, value and action) in a time bin *t* under consideration, we generated the pseudo-simultaneous population response vectors by sampling, for every neuron *i*, the z-scored spike count in a randomly selected trial in condition *c*, which we indicate by *x_i*. This resulted in a single-trial population response vector **x** = (*x_1*, …, *x_N*), where *N* corresponds to the number of recorded neurons in the area under consideration.
This single-trial response vector can be thought of as a noisy measurement of an underlying mean firing rate vector **x̄**, such that **x** = **x̄** + **η**, with **η** indicating a noise vector modeling the trial-to-trial variability of spike counts. Assuming that the trial-to-trial noise is centered at zero, we estimate the mean firing rate vectors by taking the sample average **x̄** ≈ ⟨**x**⟩, where the angular brackets indicate averaging across trials. We then either trained maximum margin (SVM) linear classifiers on the estimated mean firing rate vectors for the eight conditions in the training set, or we trained such classifiers on the single-trial population response vectors generated from the training set of trials. In the latter case, in order to obtain a number of trials that is large compared to the number of neurons, we re-sampled the noise by randomly picking noisy firing rates (i.e. spike counts) from among all the training trials of a given experimental condition for each neuron independently. Specifically, we re-sampled 10,000 trials per condition from the training set. While this neglects correlations between different neurons within conditions, we had only limited information about these correlations in the first place, since only relatively small numbers of neurons were recorded simultaneously. Regardless of whether we trained on estimated mean firing rate vectors or on re-sampled single-trial population response vectors, the decoding performance was measured in a cross-validated manner on 1,000 re-sampled single-trial population response vectors generated from the test set of trials. For every decoding analysis, training and testing were then repeated 1,000 times over different random partitions of the trials into training and test sets. The decoding accuracies that we report were computed as the averages across repetitions.
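The resampling-and-decoding pipeline can be sketched on synthetic spike counts as follows (a minimal sketch, assuming two conditions instead of eight, far fewer re-sampled trials, and a least-squares linear readout standing in for the maximum-margin SVM classifier used in the actual analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy spike counts: 20 neurons, 2 conditions with separable mean rates
# (an illustrative stand-in for the recorded data).
n_neurons, shift = 20, 1.5
base = rng.normal(0.0, 1.0, n_neurons)
train = {c: base + c * shift + rng.normal(0, 0.5, (15, n_neurons)) for c in (0, 1)}
test = {c: base + c * shift + rng.normal(0, 0.5, (5, n_neurons)) for c in (0, 1)}

# z-score every neuron using training-set statistics only
all_train = np.vstack(list(train.values()))
mu, sd = all_train.mean(axis=0), all_train.std(axis=0)
zscore = lambda counts: (counts - mu) / sd

def pseudo_trials(counts, n_pseudo, rng):
    """Pseudo-simultaneous vectors: sample each neuron's trials
    independently within a condition (discards noise correlations)."""
    n_trials, n_units = counts.shape
    idx = rng.integers(0, n_trials, size=(n_pseudo, n_units))
    return counts[idx, np.arange(n_units)]

X = np.vstack([pseudo_trials(zscore(train[c]), 200, rng) for c in (0, 1)])
y = np.r_[np.zeros(200), np.ones(200)]

# least-squares linear readout (simple stand-in for the SVM in the text)
w = np.linalg.lstsq(np.c_[X, np.ones(len(X))], 2 * y - 1, rcond=None)[0]

X_test = np.vstack([zscore(test[c]) for c in (0, 1)])
y_test = np.r_[np.zeros(5), np.ones(5)]
pred = (np.c_[X_test, np.ones(len(X_test))] @ w > 0).astype(float)
accuracy = float((pred == y_test).mean())
print(f"cross-validated decoding accuracy: {accuracy:.2f}")
```

In the real analysis this whole procedure is wrapped in an outer loop over 1,000 random train/test partitions and the accuracies averaged.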

Statistical significance of the decoding accuracy was assessed using a permutation test for classification^{45}. Specifically, we repeated the same procedure just described, but at the beginning of every repetition of a decoding analysis, trials were shuffled, i.e., assigned to a random condition. This estimates the probability that the population decoders we used would have produced the same results by chance, i.e. when applied to data that contain no information about the experimental conditions.
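The permutation test can be sketched as follows (synthetic data, a reduced number of repetitions, and a least-squares readout standing in for the SVM; only the shuffle-and-recompute logic is the point here):

```python
import numpy as np

rng = np.random.default_rng(1)

def decode_accuracy(X_train, y_train, X_test, y_test):
    """Least-squares linear readout accuracy (stand-in for the SVM)."""
    w = np.linalg.lstsq(np.c_[X_train, np.ones(len(X_train))],
                        2.0 * y_train - 1.0, rcond=None)[0]
    pred = (np.c_[X_test, np.ones(len(X_test))] @ w > 0).astype(float)
    return float((pred == y_test).mean())

# toy separable data: 10 neurons, 2 conditions
X = rng.normal(0, 1, (80, 10))
y = np.r_[np.zeros(40), np.ones(40)]
X[y == 1] += 1.0

true_acc = decode_accuracy(X[::2], y[::2], X[1::2], y[1::2])

# null distribution: shuffle condition labels before every repetition
null = []
for _ in range(200):
    y_shuf = rng.permutation(y)
    null.append(decode_accuracy(X[::2], y_shuf[::2], X[1::2], y_shuf[1::2]))
p_value = (np.sum(np.array(null) >= true_acc) + 1) / (len(null) + 1)
print(f"accuracy = {true_acc:.2f}, permutation p = {p_value:.3f}")
```

The shuffled accuracies cluster around chance (0.5 here), so a true accuracy far above that band yields a small p-value.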

In Fig. S2 we show the cross-validated decoding accuracy as a function of time throughout the trial (for a sliding 500 ms time window) for maximum margin classifiers trained only on the mean neural activities for each condition, while Fig. 4c shows similar results for linear classifiers trained on the mean firing rates in the neural data within a time window from -800 ms to 100 ms relative to stimulus onset.

For all analyses, data were combined across monkeys, because all key features of the data set were consistent across the two monkeys.

### M4 The cross-condition generalization performance (CCGP)

The hallmark feature of abstract neural representations is their ability to support generalization. When several abstract (in our case binary) variables are encoded simultaneously, generalization must be possible for all the abstract variables. We quantify a powerful form of generalization using a measure we call the cross-condition generalization performance (in fact we can view this measure as a quantitative definition of the degree of abstraction of a set of neural representations). It is analogous to the cross-validated decoding performance commonly employed, except that instead of splitting up the data randomly, such that trials from all conditions will be present in both the training and test sets, we instead perform the split according to the condition labels, such that the training set consists entirely of trials from one group of conditions, while the test set consists only of trials from a disjoint group of conditions. We train (on the former) a linear classifier for a certain dichotomy that discriminates the conditions in the training set according to some label (one of the abstract variables), and then ask whether this discrimination generalizes to the test set by measuring the classification performance on the data from entirely different conditions, which were never seen during training.

This means that in order to achieve a large cross-condition generalization performance, it is not sufficient to merely generalize over the noise associated with trial-to-trial fluctuations of the neural activity around the mean firing rates corresponding to individual conditions. Instead, the classifier has to generalize also across different conditions on the same side of an (abstract) dichotomy, i.e., across those conditions that belong to the same category according to the abstract variable under consideration.

Given our experimental design with eight different conditions (distinguished by context, value and action of the previous trial), we can investigate different balanced (four versus four condition) dichotomies, and choose one, two or three conditions from each side of a dichotomy to form our training set. We use the remaining conditions (three, two or one from either side, respectively) for testing, with larger training sets typically leading to better generalization performance. For different choices of training conditions we will in general obtain different values of the classification performance on the test conditions, and we define the cross-condition generalization performance (CCGP) as its average over all possible sets of training conditions (of a given size). In Fig. 4a we show the CCGP (on the held out fourth condition) when training on three conditions from either side of the context, value or action dichotomies.
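The logic of the cross-condition split can be sketched on synthetic data (a minimal sketch assuming two binary variables and four conditions rather than three and eight, an idealized factorized geometry, and a least-squares readout standing in for the maximum-margin classifier):

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_trials = 50, 100

# Factorized ("abstract") geometry: four cluster centers on a rectangle,
# with orthogonal coding directions for the two binary variables.
dir_a = rng.normal(size=n_neurons); dir_a /= np.linalg.norm(dir_a)
dir_b = rng.normal(size=n_neurons)
dir_b -= dir_a * (dir_a @ dir_b); dir_b /= np.linalg.norm(dir_b)

def trials(a, b):
    """Noisy trials around the cluster center for condition (a, b)."""
    center = 3.0 * a * dir_a + 3.0 * b * dir_b
    return center + rng.normal(0, 1.0, (n_trials, n_neurons))

# Train a decoder for variable a using only the conditions with b = -1 ...
X_train = np.vstack([trials(-1, -1), trials(+1, -1)])
y_train = np.r_[-np.ones(n_trials), np.ones(n_trials)]
w = np.linalg.lstsq(np.c_[X_train, np.ones(len(X_train))], y_train, rcond=None)[0]

# ... then test it on the conditions with b = +1, never seen in training.
X_test = np.vstack([trials(-1, +1), trials(+1, +1)])
y_test = np.r_[-np.ones(n_trials), np.ones(n_trials)]
pred = np.sign(np.c_[X_test, np.ones(len(X_test))] @ w)
ccgp = float((pred == y_test).mean())
print(f"CCGP: {ccgp:.2f}")
```

For a random (non-factorized) arrangement of the four cluster centers, the same procedure would typically yield a CCGP near chance, even though each variable would remain decodable with an ordinary cross-validated split.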

The selection of trials used is the same as for the decoding analysis, except that here we retain all neurons that have at least ten trials for each experimental condition that meet our selection criteria (since the split into training and test sets is determined by the labels of the eight conditions themselves, we do not need to hold out additional test trials for the training conditions). We pre-process the data by z-scoring each neuron’s spike count distribution separately. Again, we can either train a maximum margin linear classifier only on the cluster centers, or on the full training set with trial-to-trial fluctuations (noise), in which case we re-sample 10,000 trials per condition; Fig. 4a shows results using the latter method.

### M5 The parallelism score (PS)

We developed a measure based on angles of coding directions to characterize the geometry of neural representations of variables in the firing rate space. When training a linear classifier on a pair of conditions (one from each side of a dichotomy) that differ only in one label (but agree on all others), the weight vector defining the resulting separating hyperplane will be aligned with the vector connecting the cluster centers corresponding to the neural representations of the two training conditions if we assume isotropic noise around both of them. This corresponds to the coding direction for the potentially abstract variable under consideration, given this choice of training set. Other coding directions for the same variable can be obtained by choosing a different pair of training conditions (defined by different, but equal values for the other variables corresponding to orthogonal dichotomies). The separating hyperplane associated with one such pair of training conditions is more likely to correctly generalize to another pair of conditions if the associated coding directions are parallel (as illustrated in Fig. 2d,e). Therefore, we introduce a measure to quantify the alignment of the different coding directions, which we call the parallelism score (PS).

If we had only four conditions (and hence at most two abstract variables) as shown in Fig. 2d, there would be only two coding directions for a given variable (from the two pairs of training conditions), and we would simply consider the cosine of the angle between them (i.e., the normalized overlap of the two weight vectors). In our experiments, there were eight conditions (leading to at most three perfectly abstract variables), and thus pairing them across the separating hyperplane leads to four normalized coding vectors **v**_*i*, *i* = 1, 2, 3, 4 (corresponding to four pairs of training conditions). In this case, we consider the cosines of the angles between pairs of them, cos θ_*ij* = **v**_*i* · **v**_*j*, and we average over all six of these angles (corresponding to all possible choices of two different coding vectors). Note that these coding directions are simply the unit vectors pointing from one cluster center to another, and we do not train any classifiers for this analysis.

In general there are multiple ways of pairing up conditions across the separating hyperplane of a dichotomy under consideration. Because we don’t want to assume a priori that we know the correct way of pairing up conditions (which would depend on the labels of the other abstract variables), we instead consider all possible ways of matching up the conditions on the two sides of the dichotomy one-to-one, and then define the PS as the maximum across all possible pairings of the average cosine. There are two such pairings in the case of four conditions, and 24 pairings for eight conditions (in general there are (*m*/2)! for *m* conditions, so there would be a combinatorial explosion if *m* were large). Therefore, the parallelism score (for eight conditions) is defined as

PS = max over pairings of (1/6) Σ_{*i*<*j*} cos θ_*ij*,

where θ_*ij* is the angle between the coding vectors of pairs *i* and *j* within a given pairing.

The parallelism scores of the context, value and action dichotomies of our data are plotted in Fig. 4b. The selection of trials used in this analysis is the same as for the decoding and cross-condition generalization analyses, retaining all neurons that have at least ten trials for each experimental condition that meet our selection criteria, and z-scoring each neuron’s spike count distribution individually.
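A direct implementation of this definition can be sketched as follows (our own sketch, not the authors' code): for four condition pairs per side, enumerate all 24 one-to-one pairings, compute the average cosine of the six angles between the unit coding vectors, and take the maximum.

```python
import numpy as np
from itertools import permutations

def parallelism_score(side1, side2):
    """PS of a dichotomy: maximum over one-to-one pairings of the mean
    cosine between unit coding vectors (cluster-center differences)."""
    m = len(side1)
    best = -np.inf
    for perm in permutations(range(m)):
        vecs = []
        for i, j in enumerate(perm):
            v = side2[j] - side1[i]
            vecs.append(v / np.linalg.norm(v))
        cosines = [vecs[a] @ vecs[b] for a in range(m) for b in range(a + 1, m)]
        best = max(best, float(np.mean(cosines)))
    return best

rng = np.random.default_rng(0)

# Cuboid-like geometry: all coding vectors equal a single direction d,
# so one pairing makes every cosine 1 and the PS is maximal.
d = rng.normal(size=10)
others = rng.normal(size=(4, 10))  # positions of the 4 condition pairs
ps_fact = parallelism_score([o for o in others], [o + d for o in others])

# Random geometry: coding vectors point in unrelated directions.
ps_rand = parallelism_score([rng.normal(size=10) for _ in range(4)],
                            [rng.normal(size=10) for _ in range(4)])
print(f"PS factorized: {ps_fact:.2f}, PS random: {ps_rand:.2f}")
```

The factorial cost over pairings is negligible for eight conditions (24 pairings) but, as noted above, would explode for a much larger number of conditions.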

Note that a high parallelism score for one variable/dichotomy doesn’t necessarily imply perfect generalization across other variables. Even if the coding vectors for a given variable are approximately parallel, the test conditions might be much closer together than the training conditions. In this case generalization would likely be poor and the orthogonal dichotomy would have a low parallelism score. (Moving the cluster centers in neural representation space affects the parallelism scores of at least some dichotomies; despite being based on angles, the set of all such scores depends implicitly on the pairwise distances, up to the overall scale of the geometry.) Even high parallelism scores for multiple variables don’t guarantee good generalization of one dichotomy across another one. When training a linear classifier on noisy data, the shape of the noise clouds could skew the weight vector of a maximum margin classifier away from the vector connecting the cluster centers of the training conditions. In addition, even if this is not the case and the noise is isotropic, generalization might still fail because of a lack of orthogonality of the coding directions for different variables (the eight conditions might be arranged at the corners of a parallelepiped instead of a cuboid). In summary, while the parallelism score is not equivalent to the cross-condition generalization performance, high scores for a number of dichotomies with orthogonal labels characterize a family of (approximately factorizable) geometries that can lead to good generalization properties if the noise is sufficiently well behaved (consider e.g. the case of the principal axes of the noise distributions being aligned with the coding vectors), and specifically, for the simple case of isotropic noise, if the coding directions for different variables are approximately orthogonal to each other.

### M6 Random models

In order to assess the statistical significance of the above analyses we need to compare our results (for the decoding performance, abstraction index, cross-condition generalization performance, and parallelism score, which we collectively refer to as scores here) to the distribution of values expected from an appropriately defined random control model. There are various sensible choices for such random models, each corresponding to a somewhat different null hypothesis we might want to reject. The simplest case we consider is a shuffle of the data, in which we assign a new, random condition label to each trial for each neuron independently (in a manner that preserves the total number of trials for each condition). When re-sampling artificial, noisy trials, we shuffle first, and then re-sample in a manner that respects the new, random condition labels. This procedure destroys almost all structure in the data, except the marginal distributions of firing rates of individual neurons. The error bars around chance level for the decoding performance in Figs. S2 and 4, and for the parallelism score in Figs. 4 and 5, are based on this shuffle control (plus/minus two standard deviations).

A different kind of structure is retained in a class of geometric random models, which we construct in order to rule out another type of null hypothesis. For the analyses that depend only on the cluster centers of the eight conditions, we can construct a random geometry by sampling new cluster centers from an isotropic Gaussian distribution (and rescaling it to keep the total signal variance the same as in the data). Such a random arrangement of the mean firing rates (cluster centers) is a very useful control to compare against, since such geometries do not constitute abstract neural representations, but nevertheless typically allow relevant variables to be decoded. For analyses that depend also on the structure of the noise (in particular, decoding and CCGP with re-sampled trials), our random model in addition requires some assumptions about the noise distributions. We could simply choose identical isotropic noise distributions around each cluster center, but training a linear classifier on trials sampled from such a model would essentially be equivalent to training a maximum margin classifier on the cluster centers only. Instead, we choose to preserve some of the noise structure of the data by moving the (re-sampled) noise clouds to the new random position of the corresponding cluster and performing a discrete rotation around it by permuting the axes (for each condition independently). If our scores are significantly different from those obtained using this random model, we can reject the null hypothesis that the data was generated by a random isotropic geometry with the same total signal variance and similarly shaped noise clouds as in the data. The error bars around chance level for the CCGP in Figs. 4 and 5 are derived from this geometric random control model.
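The isotropic random-geometry control for center-based analyses can be sketched as follows (synthetic cluster centers; the rescaling step preserves the total signal variance of the data exactly, as described above):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_isotropic_geometry(centers, rng):
    """Sample new cluster centers from an isotropic Gaussian and rescale
    them so the total signal variance matches the data (null geometry)."""
    centers = np.asarray(centers, dtype=float)
    signal_var = np.sum(np.var(centers, axis=0))
    new = rng.normal(size=centers.shape)
    new_var = np.sum(np.var(new, axis=0))
    return new * np.sqrt(signal_var / new_var)

# e.g. eight observed condition means in a 30-dimensional firing-rate space
observed = rng.normal(0, 2.0, size=(8, 30))
null_geometry = random_isotropic_geometry(observed, rng)

v_obs = float(np.sum(np.var(observed, axis=0)))
v_null = float(np.sum(np.var(null_geometry, axis=0)))
print(f"total signal variance: data = {v_obs:.2f}, null = {v_null:.2f}")
```

Scores computed on many such sampled geometries form the null distribution against which the data's CCGP is compared; the noise-cloud permutation described above would be layered on top for analyses that re-sample trials.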

We can also consider the distribution of scores across the 35 different balanced dichotomies we can form using the eight conditions in our data set. Since there are clearly correlations between the scores of different dichotomies (e.g. because the labels may be partially overlapping, i.e., not orthogonal), we do not think of this distribution as a random model to assess the probability of obtaining certain scores from unstructured data. However, it does allow us to make statements about the relative magnitude of the scores compared to those of other variables that may also be decodable from the data and possibly abstract.
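The count of 35 balanced dichotomies follows from choosing which 4 of the 8 conditions fall on one side of the split; since swapping the two sides gives the same dichotomy, there are C(8,4)/2 = C(7,3) = 35 of them. A minimal sketch of the enumeration (illustrative Python, not the original analysis code):

```python
from itertools import combinations

conditions = list(range(8))

def balanced_dichotomies(conditions):
    """Enumerate all splits of the conditions into two equal halves.
    Fixing the first condition to side A avoids counting each split
    twice: C(7, 3) = 35 dichotomies for eight conditions."""
    first, rest = conditions[0], conditions[1:]
    half_size = len(conditions) // 2
    dichotomies = []
    for extra in combinations(rest, half_size - 1):
        side_a = (first,) + extra
        side_b = tuple(c for c in conditions if c not in side_a)
        dichotomies.append((side_a, side_b))
    return dichotomies

dichos = balanced_dichotomies(conditions)
assert len(dichos) == 35
```

Each dichotomy defines a binary labeling of the eight conditions for which the decoding performance, CCGP and parallelism score can be computed.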

### M7 Simulations of the multi-layer network

The two-hidden-layer network depicted in Figure 5 contains 768 neurons in the input layer, 100 in each hidden layer and four neurons in the output layer. We used eight digits (1-8) of the full MNIST data set to match the number of conditions we considered in the analysis of the experiment. The training set contained 48128 images and the test set contained 8011 images. The network was trained to output the parity and the magnitude of each digit and to report them using four output units: one for odd, one for even, one for small (i.e. a digit smaller than 5) and one for large (a digit larger than 4). We trained the network with back-propagation using the `train` function of the MATLAB Neural Network Toolbox. We used a tan-sigmoidal transfer function (`tansig` in MATLAB), the mean squared normalized error (`mse`) as the cost function, and the maximum number of training epochs was set to 400. After training, we analyzed the neural representations using the same analytical tools that we used for the experimental data, except that we did not z-score the neural activities, since they were simultaneously observed in the simulations.
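The four-unit output coding described above can be made concrete with a short sketch. This is an illustrative Python version of the target encoding only (the actual training used MATLAB's `train`, as stated above); the function name `target_units` is hypothetical.

```python
def target_units(digit):
    """Four output units for a digit in 1..8, following the task
    description: [odd, even, small (< 5), large (> 4)]."""
    assert 1 <= digit <= 8
    odd = digit % 2 == 1
    small = digit < 5
    return [int(odd), int(not odd), int(small), int(not small)]

# Digit 3 is odd and small; digit 8 is even and large:
assert target_units(3) == [1, 0, 1, 0]
assert target_units(8) == [0, 1, 0, 1]
```

Because exactly two of the four units are active for every digit, parity and magnitude are two independent binary variables over the eight conditions, mirroring the 2x2x2 structure of the experimental task.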

## Acknowledgements

We are grateful to L.F. Abbott and R. Axel for many useful comments on the manuscript. This project is supported by the Simons Foundation, and by NIMH (1K08MH115365, R01MH082017). SF and MKB are also supported by the Gatsby Charitable Foundation, the Swartz Foundation, the Kavli Foundation and the NSF's NeuroNex program award DBI-1707398. JM is supported by the Fyssen Foundation. SB received support from NIMH (1K08MH115365, T32MH015144 and R25MH086466), and from the American Psychiatric Association and Brain & Behavior Research Foundation young investigator fellowships.

## Footnotes

↵† co-senior authors

We asked whether a neural network trained to perform a simulated version of our experimental task would exhibit a geometry similar to the one observed in the experiments. We used Deep Q-learning, a technique in which a deep neural network represents the state-action value function of an agent and is trained with a combination of temporal-difference learning and back-propagation, as refined and popularized by Mnih et al. (2015).
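The temporal-difference update at the core of Deep Q-learning can be illustrated in its simplest, tabular form; the deep variant replaces the table with a network trained by back-propagation on the same TD target. This is a generic sketch of the standard Q-learning rule, not the network or hyperparameters used in the paper; the state/action sizes and learning rate are assumptions.

```python
import numpy as np

# Tabular sketch of the TD update underlying (Deep) Q-learning.
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9  # learning rate and discount factor (illustrative)

def td_update(Q, s, a, r, s_next):
    """Move Q(s, a) toward the TD target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

Q = td_update(Q, s=0, a=1, r=1.0, s_next=1)
# From Q = 0: target = 1.0 + 0.9 * 0 = 1.0, so Q[0, 1] = 0.5 * 1.0
assert np.isclose(Q[0, 1], 0.5)
```

In Deep Q-learning, the same target is used as the regression label for the network's output at the chosen action, and the squared TD error is minimized by back-propagation.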