Abstract
The ability to make inferences using abstract rules and relations has long been understood to be a hallmark of human intelligence, as evidenced in logic, mathematics, and language. Intriguingly, modern work in animal cognition has established that this ability is evolutionarily widespread, indicating an ancient and possibly foundational role in natural intelligence. Despite this importance, it remains an open question how inference using abstract rules is implemented in the brain – possibly due to a lack of competing hypotheses at the level of collective neural activity and of behavior. Here we report the generation and analysis of a collection of neural networks (NNs) that perform transitive inference (TI), a classical cognitive task that requires inference of a single abstract relation between novel combinations of inputs (if A > B and B > C, then A > C). We found that NNs generated using standard training methods (i) generalize fully (i.e. to all novel combinations of inputs), (ii) generalize when inference requires working memory (WM), a capacity thought to be essential for inference in living subjects, (iii) express multiple emergent behaviors long documented in humans and animals, in addition to novel behaviors not previously studied, and (iv) adopt different solutions that yield alternative predictions for both behavior and collective neural activity. Further, a subset of NNs expressed a “subtractive” solution that was characterized in neural activity space by a simple dynamical pattern (an oscillation) and geometric arrangement (ordered collinearity). Together, these findings show how collective neural activity can accomplish generalization according to an abstract rule, and provide a series of testable hypotheses not previously established in the study of TI. More broadly, these findings suggest new ways to understand how neural systems realize abstract rules and relations.
1 Introduction
Cognitive faculties such as logical reasoning, mathematics, and language have long been recognized to be characteristic of human-level intelligence. Common to these faculties is abstraction: the ability to generalize prior knowledge and experience to novel circumstances. Importantly, abstraction typically entails understanding particular relationships (e.g. “adjacent to”, “same as”, “relevant to”) between items (e.g. stimuli, objects, behaviors, words, variables), which can then be used to infer equivalent relationships between items not previously observed together, i.e. completely novel combinations of items. Such relational inferences – which can be understood as entailing the learning and use of abstract rules – are the basis of a structured form of knowledge, often termed a “schema,” that is thought to enable humans to generalize in systematic and meaningful ways [1, 2, 3, 4, 5, 6], and thus has been posited as essential to advanced cognition.
Intriguingly, recent insights from landmark work in animals [7, 8, 9, 10, 11, 12] indicate that cognition based on relations is more prevalent and thus possibly more fundamental than previously appreciated. This prevalence is evidenced by the observation that cognitive abilities that entail relational inference – such as navigation [13, 14, 15], learning-to-learn [9, 16, 17], and concept/structure learning [11, 12, 18] – are in fact widespread across the animal kingdom. Further, these abilities have been linked to memory systems in the brain – variously termed “relational memory”, “cognitive maps”, “learning sets”, among others – that enable humans and animals alike to make systematic inferences [19, 20, 21, 5], generalize across different domains [16, 22, 23], learn rapidly [24, 25, 26, 27, 28], and plan and envision new experiences [29, 30, 31, 32, 33, 34, 35, 36]. These findings and insights extend the scope of relational inference to a wide range of species and cognitive abilities, and, further, imply that there exists a deep interrelationship between relational inference and memory.
Despite this unifying importance, it remains an open question how relational inference is implemented in neural systems, whether in artificial networks (e.g. those performing linguistic [37, 38, 39] or symbolic [40, 41] tasks) or in the brain. Toward answering this question, a fundamental scientific aim is to identify or generate putative neural implementations that can be used to derive multiple experimentally testable hypotheses. In particular, hypotheses at the level of behavior and of collective (population-level) neural activity may be crucial given that these levels have proved decisive in clarifying whether and how neural systems in the brain implement various cognitive functions (e.g. vision, movement, timing, decision-making [42, 43, 44, 45, 46]). Notably, in both neurobiology and machine intelligence, relational inference is often studied in relatively complex cases (e.g. spatial [16, 21, 24, 47, 48, 49] and linguistic [38, 37, 50] knowledge), for which it may be relatively difficult to formulate or generate hypotheses derived from neural implementations, especially at the level of neural populations and of behavior. In this way, studying simpler cases may be advantageous or even crucial towards understanding relational inference in the brain. We therefore took a two-part approach intended to yield such hypotheses: first, we stipulated a task paradigm that distills relational inference into a simple yet essential form, and second, we adopted a methodology suited to discover possible population-level and behavior-relevant neural implementations thereof.
2 Transitive inference: a classic cognitive task
We first sought to operationalize relational inference in a task that was (i) reduced to a single abstract relation and (ii) based on stimuli (e.g. images) that can be presented with temporal precision. We reasoned that each of these properties might be critically important, in that (i) reduces complexity, which could ultimately enable discovering neural implementations, and (ii) delimits periods of sensory-driven neural activity (and by extension neural activity underlying inference), thus facilitating subsequent interpretation of observed neural activity.
A classic task paradigm capturing these properties is transitive inference (TI) [10, 51, 52, 53, 54], which tests a subject’s ability to use premises A > B and B > C to infer A > C – an instance of an abstract rule (i.e. choose items that are ‘higher’, based on the transitive ‘>’ relation). TI operationalizes a simple yet powerful schema (Fig. 1a) that enables generalization from N premises (training cases) to order N^2 probes (test cases) in accordance with the relation-based rule (i.e. choose the ‘higher’ item), thus testing generalization based on the schema (‘schematization’) rather than interpolation or extrapolation (manifested in the pattern of correct responses in the task, Fig. 1b). The transitive ‘>’ relation is itself a component of various types of relations, and is of fundamental importance in symbolic reasoning – together suggesting that TI is not essentially based on specific stimuli (or stimulus features) in isolation.
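The split between premises and probes described above can be made concrete with a short enumeration. The following is a minimal sketch, assuming the 7-item version of the task (items A to G) and treating every ordered pair of distinct items as a trial type; the function name `trial_types` and the tuple layout are illustrative choices, not taken from the study.

```python
from itertools import permutations

ITEMS = "ABCDEFG"  # 7-item TI; index 0 (item A) is the highest rank


def trial_types(items=ITEMS):
    """Split all ordered item pairs into premise (training) and probe (test) trials."""
    rank = {item: i for i, item in enumerate(items)}
    premises, probes = [], []
    for first, second in permutations(items, 2):
        distance = abs(rank[first] - rank[second])         # symbolic distance
        correct = 1 if rank[first] < rank[second] else 2   # position of the higher-ranked item
        trial = (first + second, distance, correct)
        # Adjacent items (distance 1) are premises; all other pairs are novel probes.
        (premises if distance == 1 else probes).append(trial)
    return premises, probes


premises, probes = trial_types()
print(len(premises), len(probes))  # prints "12 30"
```

With N items there are 2(N-1) ordered premise pairs but (N-1)(N-2) ordered probe pairs, which is the sense in which the number of test cases grows on the order of N^2.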
At a broader level, TI can be understood as a simplified alternative to other paradigms testing relational inference but involving multiple abstract rules (e.g. linguistic syntax and semantics) and/or task stimuli that are relatively challenging to isolate (e.g. spatial paradigms [16, 21, 24, 47]). Further, in contrast to approaches that focus mainly on lower-level neurobiological phenomena (e.g. neural firing that has abstract correlates [14, 22, 55]), TI requires direct behavioral report of successful inference, thus affording identification of potentially critical relationships between behavior and underlying neural implementations.
Remarkably, though TI is a classic task in behavioral psychology [10, 51, 53, 56, 57] and indeed a cornerstone of symbolic reasoning, the neural basis of TI has remained unclear [53, 58, 59, 60], potentially due to a lack of putative neural implementations and testable hypotheses.
3 A neural approach to transitive inference
Despite the long history of TI as a cognitive task, investigation of its neural basis in the brain is relatively recent [52, 61, 62, 63, 64, 65]. Prior work implies that an approach seeking to identify biologically accurate neural implementations would benefit from three criteria:
First, an approach should impose minimal architectural constraints. TI is observed in an extremely broad range of species, including primates, birds, rodents, and insects ([53, 59, 60, 66, 67, 68]; possibly the result of convergent evolution [69]), a striking ubiquity that implies that highly specialized neural architecture may not be essential.
Second, an approach should explicitly require memory across time, particularly working memory (WM) [70, 71, 72, 73]. In living subjects, relational inferences such as TI characteristically rely on memory since subjects must assess relationships between events not experienced simultaneously (e.g. sensory stimuli separated in time); memory enables such events to be, in effect, brought together. WM in particular is thought to be essential to relational inference [74, 75, 76, 77], not only because WM is generally required in real-world cases of relational inference (e.g. language comprehension, spatial navigation), but also because prior work suggests that relational inference is accomplished in the brain by a neural system that intrinsically supports and/or relies upon WM (e.g. prefrontal cortex, possibly via a process akin to deliberation or reasoning [70, 72, 78]). Surprisingly, though TI exemplifies relational inference, prior work has only investigated indirect relationships between TI and WM (either by having separate WM vs. TI tasks [79], or by linking each to a common brain region [61, 80]), with no TI study to our knowledge directly testing WM by imposing an intervening delay between presented items whose relationship is to be inferred (e.g. item 1 - delay - item 2; schematic in Fig. 1c and Fig. S1). Notably, various behavioral studies of TI involve items that are sufficiently separated in space so as to require sequential observations (e.g. [52, 67, 68]), thus implicitly requiring WM.
Third, an approach should explicitly model dynamics. Even beyond the fact that WM necessitates dynamics over the course of a task trial, collective dynamical processes in the brain govern spontaneous neural activity (prominent in awake subjects; [81, 82, 83]) and likely also behavior in tasks, particularly tasks that expressly allow for self-determined responses. Such self-determined responses – and therefore the collective dynamics that underlie them – are relevant when tasks do not rigidly cue responses (e.g. via a ‘go’ cue), such as timing [84, 85] and reaction-time tasks [86, 87, 88], and are more broadly relevant in naturalistic or real-world settings [89].
Importantly, all three of these criteria are suited for the methodology of generating and analyzing task-optimized recurrent neural networks (RNNs), an approach that has proved successful in discovering neurobiologically insightful implementations of other cognitive abilities, and, further, in generating testable predictions at the level of collective neural activity [90, 91, 92]. We therefore adopted this approach, and, further, expanded upon it in two ways. First, in conjunction with RNNs, we also assessed whether and how trained models that cannot implement WM, yet have neurally relevant feedforward structure, might transitively generalize. To do so, we investigated two archetypal models: logistic regression (LR [93]) and multi-layer perceptron (MLP [94]) (schematic in Fig. S2a; see Methods), each tested on a task format having no delay (Fig. 1c). Second, to identify multiple potential solutions to TI, we investigated two neurobiologically relevant types of RNN variants:
Learnable connectivity
In NNs and the brain, network architecture comprises two types of connectivity: feedforward and recurrent. This distinction is important since the operations implemented by each type are not simply interchangeable: recurrent connectivity entails dynamics of the network’s internal state (also interpretable as the network’s working memory), which has no equivalent in networks that are solely feedforward. With regard to learning and performing tasks, the role of changes in feedforward vs. recurrent connectivity remains generally unclear (e.g. [95, 96, 97]). This is indeed the case for TI (a paradigm that requires learning of premise trials, e.g. A vs. B) – that is, it is not known whether the neural system implementing TI in the brain relies on learned connectivity that is feedforward, recurrent, or some configuration of both. It is worth emphasizing that this distinction is in fact critical for relational inference tasks such as TI, where the significance of the stimuli in the task (items A, B, C, etc.) has no a priori relationship to stimulus features, unlike tasks based on particular stimulus modalities and features (e.g. tactile frequency [98, 99], visual frequency and orientation [100], object categories [101]) that are known to be encoded in feedforward inputs from upstream brain regions. To address this gap in understanding, we investigated RNNs for which feedforward connectivity was either trainable or not trainable (fully-trainable RNN (f-RNN) vs. recurrent-trainable RNN (r-RNN), respectively; diagrammed in Fig. 1d).
Training regime
The accuracy of trained NNs in matching neural systems in the brain has been found to depend on including biologically relevant constraints on the training procedure, such as penalties (regularizations) that serve to limit excess neural activity [48, 102, 103, 104]. Further, differences in the initial strength of connectivity – the magnitude of connection weights prior to training [105, 106] – have been found to yield NNs having striking differences in implementation (internal representations and dynamics) that are concomitant to differences in neurobiological accuracy [48, 103, 104, 107, 108]. These findings indicate that it is critically important to investigate both factors in trained NNs — training constraints and initial strength of connectivity — when seeking to identify biologically accurate neural implementations. We thus trained NN variants using systematically varied parameter sets corresponding to these factors, referring to each set as a “training regime” (‘simple’ to ‘complex’, Table 1).
4 A variety of neural models perform TI
We first sought to determine whether trained RNNs could successfully perform TI. Unlike perceptual tasks or tasks based on statistical inference, TI expressly requires generalization to novel recombinations of inputs. Such generalization requires additional knowledge regarding the underlying relationship between inputs (the transitive schema, Fig. 1a). Thus TI is a task that requires an a priori inductive bias (here, for transitivity), and it is not generally known whether relatively unconstrained models, such as trained RNNs, manifest such a bias [92, 109, 110].
Mirroring TI as encountered by living subjects (7-item TI with items A to G; each item represented as a random high-dimensional (100-D) input; see Methods), we trained RNNs (100 recurrent units, tanh nonlinearity) exclusively on “premise” (training) trials (A vs. B, B vs. C, C vs. D, etc.) and evaluated whether RNNs generalized to test trials (B vs. D, etc.; all trial types shown in Fig. 1b). Training was conducted using gradient descent optimization and backpropagation-through-time; furthermore, for RNNs, all trials required working memory (WM) (in delay format, Fig. 1c and Fig. S1). Response choice was defined by which of two output units (linear readouts; corresponding to choice 1 vs. 2) was activated in the trial, and response time (RT) was defined as the time at which activity of the output unit met a fixed threshold (85% of maximum activation). For an initial assessment of whether RNNs could generalize, a simulation of all trial types was performed under noiseless conditions.
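The trial structure and the response criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the number of time steps for item presentation and delay (`n_on`, `n_delay`), the Gaussian item vectors, the default `max_activation` of 1.0, and the rule that the first output unit to reach threshold defines the choice are all hypothetical; the actual values and definitions are given in the Methods.

```python
import random


def random_items(n_items=7, dim=100, seed=0):
    """One random high-dimensional input vector per item (A..G)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_items)]


def make_trial(item1, item2, n_on=2, n_delay=10):
    """Delay-format trial: item 1, then a blank delay, then item 2."""
    blank = [0.0] * len(item1)
    return [item1] * n_on + [blank] * n_delay + [item2] * n_on


def choice_and_rt(out1, out2, max_activation=1.0, frac=0.85):
    """Choice (1 or 2) and response time: first step at which an output unit
    reaches the fixed threshold (frac * max_activation)."""
    threshold = frac * max_activation
    for t, (a, b) in enumerate(zip(out1, out2)):
        if a >= threshold or b >= threshold:
            return (1 if a >= b else 2), t
    return None, None  # no response within the trial


items = random_items()
trial = make_trial(items[0], items[1])  # premise trial "A vs. B"
print(len(trial), len(trial[0]))        # 14 time steps, 100 input dimensions
print(choice_and_rt([0.0, 0.2, 0.5, 0.9, 1.0], [0.0, 0.1, 0.1, 0.0, 0.0]))  # (1, 3)
```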
We found that trained RNNs often generalized perfectly, i.e. responded correctly to all test trials (example RNNs in Fig. 2; summary in Table 2). Interestingly, it was also common for trained RNNs to respond correctly on all premise pairs (training trials) yet not generalize (example RNNs in Fig. S3; Table 2), whereas feedforward models trained on a no-delay task format never failed to generalize (examples in Fig. S2; LR: 100 out of 100 instances; MLP: 100 out of 100 instances; see also [111] for additional results showing generalization in MLPs), highlighting an essential difference posed by the requirement for WM (see additional discussion in Methods, Task).
Notably, we also observed that, among RNNs, fully-trainable RNNs (f-RNN) and simple training regimes more frequently yielded RNNs that generalized (Table 2), providing an initial hint that RNN variants might have functionally important differences.
5 Trained RNNs show a collection of emergent behaviors
Decades of work have established that living subjects performing TI show striking and commonly shared patterns of behavior, manifesting both as patterns of errors and response times (RT) [53, 60]. These behavioral patterns are based on trial types, where each trial type is defined by items (e.g. AB, BA), and where each item is defined by its implicit position — or “rank” — in the transitive schema (Fig. 1a; the rank of A is highest, while G is lowest). As recognized previously [60], empirically observed behaviors are not only important constraints on explanatory accounts of TI, but also potential sources of insight into underlying implementations.
We therefore investigated whether these behaviors (described below), and possibly others, were expressed by RNNs that successfully perform TI (Fig. 2). Importantly, expression of these behaviors would effectively be emergent, since the RNNs were neither pre-configured nor trained to express particular behaviors beyond that of responding correctly on training trials. Further, prior work on TI has either omitted or left unanalyzed delay periods that intervene between item presentation – necessitating WM – leaving unknown whether and how previously described TI behaviors relate to WM. Indeed, imposing a delay requiring WM also raises the possibility of observing, either in models or subjects, behavioral patterns that have not been previously studied and that could reflect underlying implementations.
Given these considerations, we took the following approach to determine whether RNNs expressed either known or novel behavioral patterns. First, we focused on RNNs that generalized fully (correct responses for every test trial type). Next, for each individual RNN, we simulated the networks (5000 runs, each of all trial types) with progressively increasing levels of intrinsic noise until the average performance of the RNN was >50% on training trials and <95% on testing trials (i.e. sub-asymptotic performance, the level of performance for which behavioral patterns have been observed; see Methods for further details). Lastly, we ran simulations (5000 runs, all trial types) at this noise level, from which we then measured performance (% correct) and RTs. RTs were measured using a previously established criterion (time to a fixed threshold in output units [112]).
We found that RNNs exhibited not only previously observed TI behaviors, but also a novel behavioral pattern not previously studied (Fig. 3; an analogous analysis of behavior in feedforward models is shown in Fig. S2). Importantly, behavior was both qualitatively and quantitatively comparable to that of living subjects, both for single RNNs and across RNNs (Fig. S4). In the following sections, we address each behavioral pattern.
The symbolic distance effect
A standard observation across behavioral studies of TI is the “symbolic distance” effect: the larger the difference in rank between items (e.g. A vs. D compared to A vs. B), the higher the accuracy and the lower the RT [51, 53, 59, 60]. We found that RNNs performing TI invariably exhibited the symbolic distance effect (Fig. 3, second column; see also Fig. S4 for effect across all trial types).
The end item effect
Along with the symbolic distance effect, a standard behavioral observation in subjects is the “end item” (or “terminal item”) effect: trials containing either the highest- or lowest-rank item (A and G, respectively; both “end items”) are associated with higher performance and lower RTs [53, 60, 113]. We found that RNNs performing TI invariably exhibited the end item effect (Fig. 3, third column; see also Fig. S4 for effect across all trial types).
The “end order” effect: a novel behavior with two alternative versions
In some behavioral studies of TI, subjects perform better in trials containing the highest rank item (item A) compared to trials containing the lowest rank item (item G), a pattern termed “lexical marking” ([111, 114]; related to a ‘magnitude’ effect [66]). Further, a previous study of MLPs that perform TI in the non-delay format (Fig. 1c) has suggested that MLPs characteristically show lexical marking [111]. In contrast to these results, we did not observe a consistent lexical marking effect, either in feedforward models (LR and MLPs; see Fig. S2 for further discussion) or RNNs (Fig. 3, fourth column). This discrepancy may be due to differences in the relative size of MLPs, as the number of hidden layer units is comparable to the number of task items in [111], while in the present study the number of hidden layer units (100 units) is an order of magnitude larger than the number of task items (7 items) (further details in Methods).
While investigating lexical marking behavior, we made an unexpected observation: RNNs showed lower RTs if an end item (A or G) was presented first rather than second (Fig. 3, fourth column, f-RNN simple). We further unexpectedly found that other RNNs showed the inverse behavioral pattern, namely, lower RTs if an end item (A or G) was presented second rather than first (Fig. 3, fourth column, f-RNN complex, and both r-RNN simple and complex). We termed the shared pattern the “end order” effect (i.e. end-item order; quantified separately in Fig. 3, fifth column), and its former and latter patterns the ‘first-faster’ and ‘second-faster’ versions, respectively.
Crucially, in examining this behavior across networks using a quantitative index (end order index, ranging from -1 (second-faster) to 1 (first-faster); see Methods) (Fig. 4), we found that the two alternative versions of the behavior systematically differed with respect to both types of RNN variants (learnable connectivity (f-RNN vs. r-RNN) and training regime (simple to complex), Fig. 1d). First, we found that r-RNNs always showed the second-faster pattern (Fig. 4b, bottom row), whereas f-RNNs showed both (Fig. 4b, top row). Second, we found that complex-regime RNNs nearly always showed the second-faster pattern (Fig. 4b, red histograms in top and bottom rows). In contrast, simple-regime f-RNNs consistently showed the first-faster pattern (Fig. 4b, both simple (high) and simple (low), black and blue histograms, respectively, in top row), with f-RNNs trained in the simple (high) regime always showing the first-faster pattern (Fig. 4b, black histograms in top row). Notably, intermediate-regime f-RNNs showed mixed and lower magnitude patterns, consistent with a systematic relationship between training regime and behavior.
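To illustrate how an index with this range and sign convention could be computed, the sketch below uses a normalized contrast of mean RTs. This is an assumption made purely for illustration: the actual definition of the end order index is given in the Methods and may differ, and the function name and arguments are hypothetical.

```python
def end_order_index(rt_end_first, rt_end_second):
    """Hypothetical end order index (NOT necessarily the definition in the Methods).

    rt_end_first:  RTs (positive numbers) from trials with an end item (A or G) presented first
    rt_end_second: RTs (positive numbers) from trials with an end item presented second
    Returns a value between -1 (second-faster) and 1 (first-faster).
    """
    mean_first = sum(rt_end_first) / len(rt_end_first)
    mean_second = sum(rt_end_second) / len(rt_end_second)
    # Positive when responses are faster (lower RT) with the end item presented first.
    return (mean_second - mean_first) / (mean_second + mean_first)


print(end_order_index([10.0, 12.0], [20.0, 24.0]))  # positive: first-faster
print(end_order_index([20.0, 24.0], [10.0, 12.0]))  # negative: second-faster
```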
More broadly, the finding that RNNs performing TI consistently show alternative patterns of behavior (first- vs. second-faster end order behavior; Fig. 3, fifth column) implied that the networks had systematically different underlying implementations. This implication was further substantiated by the finding that these differing behaviors systematically varied across networks (Fig. 4). Together, these findings provided additional motivation to investigate the underlying implementations of these networks.
6 A simple dynamic and geometry implementing transitive comparison
Neural recordings in living subjects performing TI indicate that single neurons in the brain can encode variables relevant to TI, including symbolic distance [62, 63]. It remains unclear what collective neural process or activity pattern implements the comparison operation – akin to a ‘>’ operator – that generalizes transitivity to novel combinations of items (test cases).
For initial insight, we analyzed how solely feedforward models generalized transitivity (no-delay format in logistic regression and multi-layer perceptron models, Fig. S5). In these models, examination of unit activations indicated that transitive comparison was implemented using a “subtractive” solution, where the rank (A, B, C, etc.) and position (left vs. right) of each item was mapped to the magnitude and sign, respectively, of unit activation (Fig. S5). Notably, in either feedforward model (LR or MLP), this subtractive solution was sufficiently realized by the (trained) feedforward weights operating directly on the input [111], clarifying that a direct operation on otherwise arbitrary inputs (items A, B, C, etc. encoded as high-dimensional (100-D) random vectors in input activity space; see Methods) is adequate to perform TI when feedforward input can be learned or trained. This further raised the question of how TI is performed when modifiable feedforward connectivity is not sufficient (e.g. when memory over time is required, as in WM) and/or not available (e.g. in a putative neural system in the brain underlying TI, as is the case in r-RNNs).
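The subtractive solution in a feedforward model can be illustrated with a small logistic regression trained only on premise pairs. The sketch below makes simplifying assumptions that differ from the study: items are one-hot encoded (rather than random 100-D vectors), there is no bias term, and training uses plain full-batch gradient ascent on the log-likelihood with a hypothetical learning rate and epoch count.

```python
import math

N = 7  # items A..G; index 0 (item A) is the highest rank


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(n_items=N, lr=0.5, epochs=500):
    """Logistic regression on concatenated one-hot inputs [left item; right item].
    Target is 1 when the left item has the higher rank."""
    w_left = [0.0] * n_items
    w_right = [0.0] * n_items
    # Premise (training) trials only: adjacent items, in both presentation orders.
    premises = [(i, i + 1, 1.0) for i in range(n_items - 1)]
    premises += [(i + 1, i, 0.0) for i in range(n_items - 1)]
    for _ in range(epochs):
        g_left = [0.0] * n_items
        g_right = [0.0] * n_items
        for a, b, y in premises:
            err = y - sigmoid(w_left[a] + w_right[b])
            g_left[a] += err
            g_right[b] += err
        for i in range(n_items):
            w_left[i] += lr * g_left[i]
            w_right[i] += lr * g_right[i]
    return w_left, w_right


w_left, w_right = train_lr()

# Evaluate on every ordered pair, including all never-trained (test) pairs.
all_correct = all(
    (w_left[a] + w_right[b] > 0) == (a < b)
    for a in range(N) for b in range(N) if a != b
)
print(all_correct)
print([round(w, 2) for w in w_left])   # decreasing with rank (magnitude encodes rank)
print([round(w, 2) for w in w_right])  # approximately the negative of w_left (sign encodes position)
```

In this toy setting the learned weights for the right position are approximately the negative of those for the left position, so the decision variable is effectively a difference of two rank-ordered values, which is why correct responses on adjacent pairs extend to all non-adjacent pairs.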
We thus sought to clarify neural implementations in RNNs solving TI. Given the above findings in feedforward models (Fig. S5), we broadly hypothesized that other neural systems having trainable feedforward connectivity – such as f-RNNs – might adopt a similar subtractive implementation. Beyond this hypothesis, we sought to address three unknowns. First, it was not clear whether or how the subtractive implementation observed in feedforward models – which relies on simultaneous input and thus requires twice the number of input connections as RNNs – was relevant to the RNNs solving TI with delay requiring WM. Second, it was also not clear whether f-RNNs would consistently adopt the same implementation, across either different instances (different random initializations of network weights) or different training regimes (simple vs. complex, Table 1). Lastly, it remained unclear whether the implementation of TI in r-RNNs, which have no trained feedforward connectivity, would show any similarity to feedforward implementations.
To address these points, we began by first investigating the implementation of TI in fully-trainable RNNs (f-RNNs) trained in the simple regime (Fig. 5; RNN types and training regimes in Fig. 1d), as we thought these networks might show the most similarity to the feedforward models characterized previously (Fig. S5). We initially observed that population activity was consistently low-dimensional (% variance explained by top 3 PCs, Table 3). Next, in examining population trajectories (example network in Fig. 5a), we observed two signature patterns: (1) a linearly-arranged rank-ordered response to item presentation (Fig. 5a, observable in green-shaded circles, corresponding to the network response to item 1), and (2) a rotational pattern during the delay period (Fig. 5a, bottom), suggesting an oscillatory dynamic.
Pattern 1 suggested a similarity to the subtractive implementation, as the pattern was characterized by an intrinsically 1-D ordering of item ranks (observed in single units in feedforward models, whereas in N-D population activity space in RNNs) and was due to trained feedforward connectivity. Pattern 2 suggested that a low-dimensional dynamical process – specifically a single (2-D) oscillation – might be sufficient to account for how these networks implemented transitive comparison across time. Importantly, RNN activity during the delay period was effectively described by linear dynamics (ordinary least-squares fit) characterized by an oscillatory mode correspondent with pattern 2 (Fig. 5b-d, R^2 ∼0.8-0.9, see Fig. S7). Given these clues, we next sought to identify the underlying implementation explicitly. Importantly, prior work has found that, in trained RNNs, analysis of network-level neural dynamics with respect to fixed points (FPs) can identify dynamical components that have specific functional roles [92, 115, 116, 117] and are jointly sufficient to perform cognitive tasks.
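The idea of fitting linear dynamics to delay-period activity and reading off an oscillatory mode can be sketched in two dimensions. The example below is a simplified stand-in, not the analysis used in the study: it assumes a discrete-time model x[t+1] = A x[t], a noise-free synthetic 2-D trajectory, and a hypothetical delay length of 20 time steps.

```python
import math


def fit_linear_dynamics_2d(traj):
    """Ordinary least-squares fit of x[t+1] = A x[t] for a 2-D trajectory.
    Solves A = C G^{-1}, with C = sum x[t+1] x[t]^T and G = sum x[t] x[t]^T."""
    gxx = gxy = gyy = 0.0
    cxx = cxy = cyx = cyy = 0.0
    for (x0, y0), (x1, y1) in zip(traj[:-1], traj[1:]):
        gxx += x0 * x0
        gxy += x0 * y0
        gyy += y0 * y0
        cxx += x1 * x0
        cxy += x1 * y0
        cyx += y1 * x0
        cyy += y1 * y0
    det = gxx * gyy - gxy * gxy
    return [
        [(cxx * gyy - cxy * gxy) / det, (cxy * gxx - cxx * gxy) / det],
        [(cyx * gyy - cyy * gxy) / det, (cyy * gxx - cyx * gxy) / det],
    ]


def cycles_per_delay(a, steps_per_delay):
    """Frequency of the oscillatory mode of a 2x2 matrix, in cycles per delay."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = trace * trace - 4.0 * det
    if disc >= 0.0:
        return 0.0  # real eigenvalues: no oscillatory mode
    angle = math.atan2(math.sqrt(-disc) / 2.0, trace / 2.0)  # radians per time step
    return angle / (2.0 * math.pi) * steps_per_delay


# Synthetic delay-period activity: slowly decaying rotation, half a cycle over the delay.
steps = 20                      # hypothetical number of time steps in the delay
theta, decay = math.pi / steps, 0.98
state, traj = (1.0, 0.3), []
for _ in range(steps + 1):
    traj.append(state)
    x, y = state
    state = (decay * (math.cos(theta) * x - math.sin(theta) * y),
             decay * (math.sin(theta) * x + math.cos(theta) * y))

A = fit_linear_dynamics_2d(traj)
freq = cycles_per_delay(A, steps)
print(round(freq, 3))  # 0.5
```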
Taking this approach, we found that simple-regime f-RNNs had an “early-trial” fixed point located near activity trajectories at the beginning of trials (Fig. 5a, orange cross). This fixed point had an oscillatory mode of frequency ∼0.5 cycles/delay (highlighted by open arrowheads in Fig. 5b) and was orthogonal to the mean trajectory across trial types (i.e. the cross-condition mean (XCM), a population activity component observed across trial-based tasks [118, 119, 120]; plotted as yellow line in Fig. 5a; quantification of orthogonality in Fig. S8). Some RNN instances also showed additional FPs that were overtly associated with other task functions (Fig. 5a, black crosses): two “choice” FPs that are stable FPs (attractors) toward which trajectories corresponding to each of the two choices travelled and would eventually settle after item 2 presentation (not shown), and a saddle FP, located between the two choice FPs, that was stable except for a single unstable axis oriented toward each choice FP (Fig. 5a, choice FPs: black crosses respectively near end of choice 1 and choice 2 trajectories; saddle FP: center black cross, saddle unstable axis in pink line). This unstable axis was effectively a “choice axis,” as it was aligned with the subsequent movement of activity trajectories toward one of the two choice FPs following presentation of item 2 (Fig. 5a, item 2 presented at the time point corresponding to the colored stars), and was parallel to the direction between choice 1 and choice 2 activity states. More generally, these latter patterns of activity in these networks indicate an effective choice axis (direction between choice 1 and choice 2 activity states; see Methods) that is parallel to the oscillatory mode (quantification in Fig. S8).
Together, these dynamical components indicated that the transitive comparison could be performed within the 2D linear subspace (plane) of the oscillatory mode of the early-trial FP. To test this possibility, we evaluated whether TI could be performed solely by activity and dynamics within the identified oscillation (constituting a linear approximation of the oscillation seen in the RNN). We found that this oscillation (see Methods), when presented with trial-structured input (item 1 - delay - item 2), yielded activity trajectories for which correct choice was linearly separable (Fig. 5d).
This clarified a simple implementation of transitive comparison over time (Fig. 5e): namely, a single oscillation that uniformly re-orients activity states corresponding to item 1 (A, B, C, etc.), doing so via a common angular displacement. In the delay period (following the presentation of item 1), this re-orientation serves to shift the activity state in the direction opposite to the direction of the activity displacement due to item presentation. Note that the activity displacement due to item presentation is rectilinear – a consequence of its implementation as feedforward input – whereas the dynamics-driven activity displacement in the delay period is angular. The re-oriented activity state is thereby “subtractive” with respect to the activity shift elicited by the subsequent item presentation (item 2).
Notably, a particular angular shift of ∼0.5 cycles (resulting from an oscillatory frequency of ∼0.5 cycles/delay) re-orients the activity state to be opposite (diametric) to that imposed by the presentation of the item prior to the delay. This can be likened to a change-of-sign along a “comparison” axis in population activity space (Fig. 5e), and is analogous to the mapping of signs (- vs. +) to item position (left vs. right) in the subtractive implementation of TI in feedforward models (Fig. S5). The resulting activity along the comparison axis accounts for transitive comparison and the symbolic distance effect, as long as the output implemented by the network (the readout mechanism, e.g. “choice axis” of saddle-point dynamics, Fig. 5a) is aligned with the comparison axis.
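The logic of this subtractive comparison over time can be reproduced in an idealized 2-D model. The sketch below is an illustration under simplifying assumptions that are not taken from the trained networks: rank values are equally spaced along a single comparison axis, the delay-period dynamics are a pure rotation without decay or noise, and the readout is the coordinate along the comparison axis.

```python
import math

ITEMS = "ABCDEFG"
# Assumed equally spaced rank values along the comparison axis (A = +3, ..., G = -3).
RANK_VALUE = {item: 3.0 - i for i, item in enumerate(ITEMS)}


def rotate(state, angle):
    x, y = state
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def compare(item1, item2, cycles_per_delay=0.5):
    """Readout along the comparison axis after item 1 - delay - item 2."""
    state = (RANK_VALUE[item1], 0.0)                         # item 1: rectilinear displacement
    state = rotate(state, 2.0 * math.pi * cycles_per_delay)  # delay: angular displacement
    state = (state[0] + RANK_VALUE[item2], state[1])         # item 2: rectilinear displacement
    return state[0]  # approximately value(item 2) - value(item 1)


def choose(item1, item2):
    # A negative readout means item 1 has the higher rank (choice 1).
    return 1 if compare(item1, item2) < 0 else 2


print(choose("B", "D"), choose("F", "C"))              # 1 2
print(abs(compare("A", "B")), abs(compare("A", "E")))  # margin grows with symbolic distance
```

Because the half-cycle rotation flips the sign of the item 1 displacement, the readout equals the difference of the two rank values: its sign gives the correct choice for every pair, and its magnitude scales with symbolic distance, consistent with the account given above.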
7 Geometric signatures of alternative neural implementations
The above findings identify a neural implementation of TI (Fig. 5), but do not address other putative neural implementations. As outlined further above (‘A neural approach to transitive inference’), prior studies have found that RNN variants performing a task may show strikingly different levels and signatures of neurobiological accuracy [48, 103, 121, 97], implying that variants adopt different underlying neural implementations. Likewise, we previously observed that RNN variants performing TI systematically showed alternative behavioral patterns (Fig. 4), similarly implying alternative neural implementations.
To clarify these alternative implementations – in particular, the population-level patterns of neural activity they respectively predict – we took the ‘subtractive’ implementation shown by simple-regime f-RNNs as a point of reference (Fig. 5). We sought to identify an activity pattern that characterizes this implementation while also being easily quantifiable (whether in neural models or experimental data). Such a characteristic or signature activity pattern could then be a means by which to survey or discover alternative implementations. From the results above (Fig. 5), two candidate activity patterns were (1) the oscillation associated with transitive comparison (pattern 1), and (2) the rank-ordered linear arrangement of activity states (pattern 2), which we here refer to as “ordered collinearity.” Across RNN variants, we assessed these two patterns in turn.
Surprisingly, we found that pattern 1 was not unique to the subtractive implementation observed in simple-regime f-RNNs. A striking indication that this was the case came from analyzing r-RNNs (recurrent-trainable RNNs), an RNN variant that necessarily adopts an alternative overall implementation, since the subtractive implementation (Fig. 5e) depends in part on the ordered collinear activity made possible by learned feedforward connectivity (not available in r-RNNs, see Methods). Despite this fundamental difference, neural activity in r-RNNs appeared remarkably similar to that of simple-regime f-RNNs (Fig. 6a, compare to Fig. 5a): in particular, r-RNNs showed oscillatory activity reminiscent of the single oscillation associated with transitive comparison previously seen in simple-regime f-RNNs (frequency ∼0.5 cycles/delay, inferred either from fixed-point linearization or from linear dynamics inferred from delay period activity, Fig. 6b-c). As with f-RNNs previously (Fig. 5d), linearization analysis of these networks revealed that a single linear oscillatory mode was sufficient to accomplish transitive comparison in time (Fig. 6d). Interestingly, this sufficiency was only apparent when the oscillation was applied to full-dimensional activity (Fig. 6d, bottom row; see Methods for further explanation) and not to activity restricted to the 2-D subspace of the oscillation (Fig. 6d, top row), an indication that the implementation of TI in these networks also depends on higher-dimensional activity components. These results indicate that though r-RNNs cannot adopt the subtractive solution (Fig. 5e), the implementation in r-RNNs nonetheless can rely on a single dynamical pattern (here an oscillation) to accomplish transitive comparison over time.
Unlike pattern 1, pattern 2 (ordered collinearity) differed substantially across alternative implementations of TI. To assess pattern 2 – a simple geometric pattern in population activity space (seen in particular activity subspaces, e.g. PCs and oscillatory subspaces, Fig. 5a, c) – we first defined a measure that quantifies the degree to which neural activity conforms to this pattern, which we termed the ordered collinearity index (OCI) (ranging from -1 to +1, schematic in Fig. 7a; see Methods). The OCI is measured in the full activity space of individual networks (ambient N-D space reduced to top 10 PCs, see Methods). As expected from previous results (Figs. 5 and 6), simple-regime f-RNNs expressed OCIs that were relatively high (>0.5) or near 1 (Fig. 7a, black and blue histograms in left and middle columns; OCI, early delay: simple (high): 0.78 ± 0.04, simple (low): 0.73 ± 0.04; OCI, late delay: simple (high): 0.95 ± 0.02, simple (low): 0.90 ± 0.04; mean ± s.d., n=50 for each variant) and that were virtually always higher than those of r-RNNs in the early part of the delay period (Fig. 7a, first column; r-RNN: 0.22 ± 0.11, n=240 instances, simple-regime f-RNN: 0.76 ± 0.05, n=100 instances; mean ± s.d.) (note that early vs. late delay were defined as the first vs. last quarter of the delay period, respectively). These results establish that the OCI – a measure that is evaluable for any neural system performing TI over a delay period – can be used to distinguish between alternative implementations.
Measuring the OCI yielded further insight into activity geometry across implementations. In examining OCIs at different time points during the delay (Fig. 7a, left and middle columns), we noticed that OCI could increase dramatically across the delay period (e.g. ∼0.1 to ∼1 for r-RNNs trained in the simple (high) regime, Fig. 7a, black histograms in bottom row; also previously observable as an “unfolding” of activity trajectories during the delay period in Fig. 6a; see Fig. S6 for example time traces of OCI). This observation led us to examine change in OCI across RNN variants (Fig. 7a, right column). Intriguingly, f-RNNs exhibited systematic differences depending on training regime: in particular, simple- and intermediate-regime networks invariably increased OCI during the delay (Fig. 7a, black, blue, and green histograms in top row; OCI change: simple (high): 0.17 ± 0.03, simple (low): 0.17 ± 0.05, intermediate: 0.08 ± 0.06; mean ± s.d., n=50 instances for each variant), whereas complex-regime networks consistently decreased OCI during the delay (Fig. 7a, both complex (high) and complex (low), red and orange histograms, respectively, in top row; OCI change: complex (high): -0.13 ± 0.10, complex (low): -0.09 ± 0.07; mean ± s.d., n=50 instances for each variant). These qualitatively different patterns of activity (increasing vs. decreasing OCI) were consistent with alternative underlying implementations in these network variants, as implied previously by the alternative patterns of behavior expressed by these variants (Fig. 4). More broadly, these findings raised the possibility that, besides ordered collinearity (which characterizes the subtractive implementation of TI, Fig. 5f), there might exist other geometric arrangements of activity, expressed during the delay period, that distinguish between alternative implementations of TI and thus constitute additional neural predictions.
Leveraging this insight, we identified one such predictive geometric activity pattern. We were struck by the fact that alternative implementations of TI – namely, the subtractive solution (Fig. 5f) vs. the solution in simple-regime r-RNNs (Fig. 6; lacking ordered collinearity from feedforward input) – nonetheless both appeared to depend upon a similar dynamical pattern: a single oscillation enabling transitive comparison over time. We conjectured that a single oscillation could be a relatively simple case in a putative wider set of dynamical patterns enabling transitive comparison over time, and, further, that the essential operation of this wider set of dynamics is rotation.
Given this possibility, we hypothesized that activity geometry based on angular rather than distance relationships (i.e. angles (vs. distances) between activity states in A vs. B trials, A vs. C, B vs. C, etc.) might effectively distinguish between alternative implementations of TI. To test this possibility, we evaluated two geometric measures – mean angle (Fig. 7b) and mean distance (Fig. 7c) – that were similar to the OCI in that they quantify geometric relationships of neural activity across trial types (see Methods), but different in that they do not measure conformity specifically to ordered collinearity. Strikingly, we found that change in mean angle (Fig. 7b, left column), but not mean distance (Fig. 7b, right column), categorized RNN variants (values <0 vs. >0) in overall accordance with the two types of alternative implementations previously implied: (1) the subtractive implementation exhibited in simple-regime f-RNNs vs. the implementation in r-RNNs (which do and do not rely on learned feedforward connectivity, respectively) (Fig. 7b, left column, black histogram in top row vs. histograms in bottom row; simple-regime f-RNN: -0.043 ± 0.010 (mean ± s.d., n=50 instances), vs. 0, p < 10^−9; r-RNN, all variants: 0.015 ± 0.022 (mean ± s.d., n=240 instances), vs. 0, p < 10^−22; signed-rank tests), and (2) the alternative implementations of simple- vs. complex-regime f-RNNs (as implied from these networks’ alternative behaviors, Fig. 4) (Fig. 7b, left column and top row, values <0 vs. >0 in black and blue vs. orange and red histograms; simple-regime f-RNNs: -0.025 ± 0.022 (mean ± s.d., n=100 instances), vs. 0, p < 10^−13; complex-regime f-RNNs: 0.018 ± 0.021 (mean ± s.d., n=100 instances), vs. 0, p < 10^−11; signed-rank tests). In this way, mean angle change constitutes a geometric signature of alternative implementations of TI, a prediction that is directly testable on population-level activity from any neural system performing delay-format TI.
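To make the contrast between angular and distance-based geometry concrete, the sketch below computes the change in mean pairwise angle and mean pairwise distance between an early-delay and a late-delay set of activity states. The exact definitions are given in the Methods and are not reproduced here, so this is only one plausible reading: it assumes one activity vector per trial type, angles measured between vectors from the origin, and change computed as late minus early. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def mean_pairwise_angle(states):
    """Mean angle (radians) between all pairs of activity-state vectors.

    Assumption: angles are measured from the origin of activity space.
    """
    angles = []
    for a, b in combinations(states, 2):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return float(np.mean(angles))

def mean_pairwise_distance(states):
    """Mean Euclidean distance between all pairs of activity-state vectors."""
    return float(np.mean([np.linalg.norm(a - b) for a, b in combinations(states, 2)]))

def geometry_change(early, late):
    """Return (change in mean angle, change in mean distance), late minus early."""
    d_angle = mean_pairwise_angle(late) - mean_pairwise_angle(early)
    d_dist = mean_pairwise_distance(late) - mean_pairwise_distance(early)
    return d_angle, d_dist

# Toy data: two states that start orthogonal and end up aligned.
early = np.array([[1.0, 0.0], [0.0, 1.0]])
late = np.array([[2.0, 0.0], [1.0, 0.0]])
d_angle, d_dist = geometry_change(early, late)
```

In this toy case the mean angle decreases (the states become aligned) while the distance changes only modestly, which illustrates how the two measures can dissociate.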
8 Behavior systematically predicts activity geometry
The observation that RNN variants performing TI showed signature differences in activity geometry (Fig. 7) was reminiscent of the alternative behavioral patterns shown by these network variants (Fig. 4). This parallel raised the possibility that solely behavioral results (whether from living subjects or neural models) could be used to make predictions regarding underlying activity geometry. More specifically, explicit quantification of the relationship between behavior and activity geometry would clarify joint neural-behavioral predictions across the RNNs, thus helping adjudicate between alternative neural implementations given empirical data. We therefore examined whether and how activity geometry corresponded with behavior, focusing on the end order effect (Fig. 3a, last column), for which RNN variants expressed alternative response-time behavior (Fig. 4).
We found that, across RNNs, end order behavior varied systematically with activity geometry (Fig. 8). In particular, two patterns of activity geometry (ordered collinearity and mean angle change; Fig. 8a, first and fourth columns, respectively) were predicted by behavior across all RNN variants, i.e. all networks regardless of differences in learnable connectivity (f-RNN and r-RNN) or training regime (simple to complex) (Fig. 8b). Importantly, specific quantitative ranges of behavior predicted, jointly, both activity geometry and alternative implementations of TI. Behavior values (end order index) >0.25 were uniquely associated with maximal ordered collinearity, thereby constituting evidence for the subtractive implementation. In contrast, behavior values <0.25 were associated with minimal ordered collinearity (and also r-RNNs, see also Fig. 4b), thereby constituting evidence for alternative implementations. Further, across all RNNs, alternative behavioral patterns (first- vs. second-faster, corresponding to behavioral values >0 vs. <0, respectively) corresponded to the sign of mean angle change (>0 vs. <0, indicating angular alignment and anti-alignment, respectively), suggesting that networks showing second-faster behavior (<0, networks below y-axis in Fig. 8b, right) may have similar underlying implementations. Together, these findings clarify the structure of a predictive relationship between an observable behavior and underlying neural activity.
Since particular patterns of activity geometry (OCI and mean angle change, Fig. 7) distinguished between alternative neural implementations of TI, explicit characterization of the relationship between behavior and activity geometry (Fig. 8) thus provides additional grounds by which to adjudicate between alternative neural implementations.
9 Discussion
In this study, we generated and investigated a collection of neural models of transitive inference (TI), a classical cognitive task that distills relational inference into a simple yet essential form. We found that trained recurrent neural networks (RNNs) not only could perform TI — i.e. generalized fully to all novel (untrained) combinations of inputs (Fig. 2 and Table 2) — but also showed behavioral patterns that have long been documented in living subjects performing TI (Fig. 3). In addition to these behaviors, these networks also showed a previously unstudied pattern of behavior that manifested in two alternative versions (“end order” effect, Figs. 3 and 4). Importantly, RNNs optimized to be efficient (via training regularization and low initial connectivity strength; ‘simple’ training regime) expressed a solution to TI characterized by relatively simple dynamics and geometry in population activity space (population “subtraction,” Fig. 5). Lastly, we found that RNNs expressing this simple solution showed systematically different behavior and neural activity patterns compared to other RNNs that also performed TI (Figs. 4, 6, 7, 8; summary of predictions in Table 4).
Prior work on relational inference in the brain has often focused on task paradigms that rely on stimuli that are challenging to isolate (e.g. spatial tasks [21, 24, 47, 122, 123]), test multiple relations at once (e.g. tasks with linguistic responses and/or episodic elaboration, e.g. [124, 125, 126, 127]), or do not require behavioral report of inference. Possibly as a result, there are relatively few hypotheses and available models that clarify or explain how neural systems accomplish relational inference (i.e. generalize according to the relation) — most notably at the critical explanatory levels of population-level neural activity and behavior. We therefore adopted a task paradigm (TI) and a neural approach (task-trained neural networks) suited to meet these challenges.
Crucial to our approach was to identify, where possible, multiple putative implementations of TI. To do so, we investigated whether and how NNs performed TI when two neurobiologically relevant factors varied — learnable connectivity (i.e. fully-trainable RNNs (f-RNNs) vs. recurrent-trainable (r-RNNs)) and training regime (regularization and initial synaptic strength; simple vs. complex training regimes) (Fig. 1d, Table 1). We found that each of these NN variants could perform TI (though differing in how often they generalized, Table 2), and identified three characteristic types: (1) simple-regime f-RNNs, (2) simple-regime r-RNNs, and (3) complex-regime RNNs, each of which made different testable predictions (summarized in Table 4). Behaviorally, these NN variants expressed alternative patterns of response times (RTs) (Fig. 4, Table 4, second column). Neurally, we found that simple-regime f-RNNs expressed two prominent patterns of neural activity that were jointly sufficient for a population-level implementation of TI (population “subtraction”, Fig. 5e): first, a rank-ordered linearly arranged set of activity states — which we refer to as “ordered collinearity” — and, second, a single oscillation appropriately oriented in activity space (Fig. S8). By investigating these (and other) patterns across networks, we clarified neural activity predictions that distinguished NN variants (Fig. 7) and also systematically varied with behavior (Fig. 8).
At the same time, the neural activity expressed in different networks performing TI suggested a common principle across networks: namely, rotation. This was at first suggested by the implementation of TI expressed in simple-regime f-RNNs (Fig. 5, diagram in Fig. 5e; “subtractive” solution), in which a single oscillation serves to rotate activity states in activity space. In this solution, the function of the oscillation is to transform the initially input-driven internal representation of item 1 – a 1-D arrangement of activity states in these networks (ordered collinear arrangement, seen variously in Fig. 5, quantified in Fig. 7a) – in the direction opposite to that of the input-driven activity shift (corresponding to the presentation of item 1). This re-orientation, which is angular by virtue of the oscillation (in contrast to the input-driven activity shift, which is rectilinear), effectively results in a “subtraction” of item 1 from item 2 along a linear axis in activity space. In light of this solution, we were struck by the fact that a single oscillation also looked to be essential to implementation of transitive comparison in simple-regime r-RNNs (Fig. 6), even though these latter networks do not show the same (ordered collinear) arrangement of activity states upon presentation of item 1 (compare Figs. 5a and 6a; quantified in Fig. 7a). This suggested that there were additional input-driven activity geometries, besides ordered collinearity, that could nonetheless be transformed by dynamics in a manner enabling generalization. It further suggested that a single oscillation could be a simple case of a wider set of dynamical patterns enabling transitive generalization, and that the essential operation of this putative wider set of dynamics is rotation. Indeed, analysis of dynamics in other RNN variants performing TI suggests a diversity of dynamics (Fig. S7). While a non-rotational solution may also be possible, constraints relevant to neurobiology (e.g. metabolic efficiency or the dimensionality of outputs) may favor rotational solutions [103, 128, 117, 129]. Notably, we used the generalized idea of rotational transformation during the delay to identify an additional neural prediction distinguishing RNN variants (mean angle change, Fig. 7b; summarized in Table 4).
Apart from this broader implication, we note that each of the predictions – behavioral or neural – established in this study can be readily tested in living subjects, and moreover can be used to adjudicate between different neural models of TI (Table 4). Surprisingly, the delay-based task format (Fig. S1; adopted to enable direct investigation of the interrelationship between WM and relational inference [74, 75, 76, 77]) has not, to our knowledge, been investigated in any prior study of TI. (Indeed it remains possible that prior behavioral and neural findings in TI are ultimately derived from an intrinsically WM-based process in the brain.) It will therefore first be necessary to establish that living subjects can perform TI in a delay format, after which particular behavioral patterns can be assessed. It is worth noting that the present models express behavioral patterns that have been consistently observed in past TI studies, namely the symbolic distance and end item effects (Fig. 3). Beyond these patterns, solely behavioral investigation can test the present neural models, specifically whether and to what degree living subjects show the models’ novel behavioral prediction (“end order” effect, Fig. 4). Further, the alternative versions of this behavioral pattern (first- vs. second-faster, Fig. 4a) raise the possibility that this behavior, if observed, may help account for behavioral differences – and by extension differences in underlying neural implementation – between individual subjects. Notably, recent work focusing on other task paradigms also seeks to use behavior to infer subject-specific latent variables [130, 131, 132].
Complementing behavior, the present models also make a set of population-level neural activity predictions (Table 4, columns 3-5). These predictions are most immediately relevant to neural data from brain regions previously linked to relational inference and/or WM, most notably prefrontal cortex (PFC) and hippocampus [133, 134, 135, 136, 137, 138, 139], both required for TI performance [61, 140]. Importantly, testing the predictions we have established has the potential to provide insight not only into how the brain implements TI, but also how these brain regions contribute to the ability to perform other cognitive tasks, particularly those involving WM and relational inference. In particular, each of the three classes of neural models we have identified is associated with different properties with regard to learnable connectivity, efficiency, and dynamics (Table 4, Implications). In this way, distinguishing between these models (by way of neural and behavioral data) would suggest that these properties pertain to the neural system in the brain responsible for TI. For example, empirical support for simple-regime r-RNNs would suggest that the neural system in the brain underlying TI relies on learned recurrent dynamics rather than learned feedforward input – two fundamentally different implementations of relational knowledge (Fig. 5e). The findings in the present study may also relate to the neural basis of other cognitive functions. Of direct interest are neural activity patterns relevant to abstraction – whether at the level of single cells (e.g. place and grid cell firing [14, 55] and other firing having abstract correlates [141, 142]) or neural populations (e.g. activity geometries [45, 107, 143, 144, 145] and dimensionalities [146, 147, 148, 149] suitable for generalization). In our approach to TI, we deliberately chose not to seek to fit or capture these neural activity patterns, and instead stipulated relatively unconstrained neural models.
Indeed there may exist important relationships between such activity patterns and those that we find in NNs performing TI. It is also worth emphasizing that our findings do not directly address learning processes, for which prior studies have proposed various models and mechanisms [60, 150, 151, 152, 153, 154] (including for explicit variants of TI, where human subjects are informed of the transitive hierarchy [155, 156, 157]). Further, our analyses and neural activity predictions focus on delay period activity, leaving open the question of whether and how neural activity following presentation of both items may contribute to transitive generalization. More broadly, it is important to point out that there exist any number of other types of relational inferences (e.g. spatial navigation), and that the relevant brain regions linked to TI and relational inference also support or pertain to cognitive capacities such as structure learning, episodic and semantic memory, and imagination. This convergence of diverse cognitive functions indicates that, toward understanding their biological basis, there is a major need to synthesize approaches.
Author contributions
K.K. conceived the study in discussion with X.W., V.F., and L.F.A. All authors contributed to study design. K.K. implemented models and performed analyses. K.K. wrote the manuscript with input from all authors.
Code availability
Code and trained networks will be made available upon publication.
Methods
Task
Transitive inference (TI) is a classic cognitive task that requires subjects to infer an abstract ordered relation – here, transitivity (Fig. 1a) – between items not previously observed together, i.e. using A > B and B > C to infer A > C (Fig. 1b). TI defines test cases expressly as novel recombinations of training inputs, thus primarily testing relational rather than statistical inference. We focused on a 7-item version of TI, in which there are 12 training trials and 30 test trials (training: A vs. B, B vs. C, etc.; test: A vs. C, B vs. D, etc.; see Fig. 1b for diagram of trial types and correct responses), though in pilot work we found that our approach could be generalized to fewer or more items with qualitatively similar results.
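The trial counts above follow directly from enumerating ordered pairs of distinct items. The sketch below does this for the 7-item case; it assumes that A is the highest-ranked item and that "choice 1" means item 1 ranks higher, which is a labeling convention adopted here for illustration.

```python
from itertools import permutations

items = "ABCDEFG"
rank = {name: i for i, name in enumerate(items)}  # assumption: A ranks highest (index 0)

# All ordered pairs of distinct items (item 1, item 2).
all_pairs = list(permutations(items, 2))

# Training trials: items adjacent in the hierarchy (A vs. B, B vs. C, ...), both orders.
train = [(a, b) for a, b in all_pairs if abs(rank[a] - rank[b]) == 1]

# Test (inference) trials: all non-adjacent pairs, both orders.
test = [(a, b) for a, b in all_pairs if abs(rank[a] - rank[b]) > 1]

def correct_choice(item1, item2):
    """Return 1 if item 1 ranks higher than item 2, else 2 (assumed convention)."""
    return 1 if rank[item1] < rank[item2] else 2

print(len(train), len(test))  # 12 training and 30 test trial types
```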
Given the interrelationship between relational inference (exemplified by TI) and working memory (WM) [75, 76], we investigated TI in a task format that explicitly imposes a delay that necessitates WM (Fig. S1a, delay format). Further, for comparison to previous modeling work [60, 111] and for potential insight, we also studied a no-delay task format, for which the presentation of task items (A, B, C, etc.) is simultaneous (diagram in Fig. 1c). It is worth noting that in some prior TI studies, a delay between stimuli is implicit in the free-exploration afforded to subjects (e.g. [52]).
Besides WM, an important difference between the delay vs. no-delay formats is that the no-delay version requires twice as many input parameters as the delay version (e.g. twice as many input connections in a neural system). This difference may make the delay version not only more difficult to perform, but also more neurobiologically accurate with respect to neural systems underlying abstract cognition: these systems look to receive extremely diverse inputs, implying that the extent of input connectivity is relatively constrained [158].
Input stimuli (items)
A panel of input stimuli corresponding to items A, B, C, etc. was constructed for each model instance. In TI in living subjects, an item is an arbitrary sensory object (e.g. image, odor) with no features that are significant a priori. To capture this property, items were represented as randomly generated input vectors u (uA, uB, etc.), modeling sensory-driven activity in upstream neurons. For simplicity, each u was drawn from a multivariate standard normal distribution of dimension Nin.
The identity of items 1 and 2 varied by trial type (e.g. AB, BA, AC, BC, etc.; see Fig. 1b for all trial types). Nin was chosen to be 100, matching the size of the hidden layer of the neural models (MLP and RNN), which was set to N = 100 (see below). This ensured that input corresponding to each item presentation elicited patterns of activation that would be uncorrelated in the activity space of the neural models – at least prior to training – thereby simulating arbitrary sensory stimuli. Note that TI (and relational inference more generally) is not defined in terms of particular stimulus features, and indeed is most rigorously tested in the absence of stimulus features indicating items’ rank in the transitive hierarchy [159].
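As an illustration of this construction, the sketch below draws one random input vector per item from a standard normal distribution and checks that distinct items are only weakly correlated. The seed, the dictionary `U`, and the helper `cosine` are choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility of this example only
N_in = 100
items = "ABCDEFG"

# One random input vector per item, drawn from a standard normal distribution.
U = {name: rng.standard_normal(N_in) for name in items}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

names = list(items)
C = np.array([[cosine(U[a], U[b]) for b in names] for a in names])

# Cosine similarities between distinct items; for N_in = 100 these are
# expected to be small (on the order of 1/sqrt(N_in)).
off_diag = C[~np.eye(len(names), dtype=bool)]
print(np.abs(off_diag).max())
```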
Model architectures
Three model architectures were studied: a recurrent neural network (RNN), logistic regression (LR), and a multilayer perceptron (MLP). Each was implemented in Python using the NumPy and PyTorch [160] packages, in addition to custom code for RNNs and all subsequent analyses.
Recurrent neural network (RNN)
To investigate population-level neural dynamics, we studied the standard continuous-time RNN:

$$\tau \frac{dx_i}{dt} = -x_i + \sum_{j=1}^{N} J_{ij} r_j + \sum_{k=1}^{N_{in}} B_{ik} u_k + \eta_i(t)$$

where x_i are the activities of the recurrent units, r_i are the corresponding rates, N is the number of recurrent units (100 for all networks), N_in is the number of input units, and τ is the time constant. The rates r_i derive from the activations x_i via a tanh nonlinearity, r_i = tanh(x_i).
The tanh non-linearity was chosen because we found it to be the most effective for generating and analyzing network dynamics; further, in pilot work we found that other non-linearities (e.g. rectified tanh) yielded RNNs that exhibited unrealistic behavioral patterns when simulated (see below for description of behavioral simulations; rectified-tanh RNNs exhibited an unrealistic bias in choice 1 vs. choice 2). We note that RNNs in the present study are intended to model neural activity (generate testable predictions) expressly at the population level.
The network units interact via the recurrent synaptic weight matrix J. The input to the system is u, the activity of the set of N_in input units that influence the network through input weights B. The output of the system is z, the activity of a set of N_out output units, each defined to be a linear readout of network activity:

$$z_i = \sum_{j=1}^{N} W_{ij} r_j + b_i$$

Each output unit z_i is a weighted sum of network rates with weights W and a constant bias b_i. In all models, three output units were implemented (N_out = 3), corresponding to three alternative behavioral actions (see below, Model outputs). All analyses of neural activity were performed on x_i. Note that analyses of r_i would yield similar predictions.
Network simulations were performed using Euler’s method with discrete time step Δt = τ/10. Intrinsic single-unit noise η_i(t) was generated at each time step by drawing values from a Gaussian random variable with zero mean and s.d. of 0.2.
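The simulation procedure can be sketched as follows. This is a minimal illustration rather than the study's code: it assumes the standard rate dynamics τ dx/dt = −x + J tanh(x) + B u + η with no recurrent bias, a zero initial state, and noise added inside the Euler update exactly as drawn (s.d. 0.2 per step). The function names `simulate_rnn` and `readout` are hypothetical.

```python
import numpy as np

def simulate_rnn(J, B, u, tau=1.0, noise_sd=0.2, rng=None):
    """Euler integration of a continuous-time rate RNN.

    J: (N, N) recurrent weights; B: (N, N_in) input weights;
    u: (T, N_in) input time series. Returns activations x with shape (T, N).
    Assumptions: zero initial state, no recurrent bias, time step tau / 10.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = tau / 10.0
    N = J.shape[0]
    x = np.zeros(N)
    xs = np.empty((u.shape[0], N))
    for t in range(u.shape[0]):
        r = np.tanh(x)                               # rates from activations
        eta = noise_sd * rng.standard_normal(N)      # single-unit noise for this step
        x = x + (dt / tau) * (-x + J @ r + B @ u[t] + eta)
        xs[t] = x
    return xs

def readout(W, b, x):
    """Linear readout of rates: z = W tanh(x) + b."""
    return W @ np.tanh(x) + b

# Noise-free check with no recurrence and constant unit input:
# each unit relaxes toward 1 as 1 - 0.9**n after n Euler steps.
N = 3
xs = simulate_rnn(np.zeros((N, N)), np.eye(N), np.ones((50, N)), noise_sd=0.0)
z = readout(np.zeros((3, N)), np.array([1.0, 2.0, 3.0]), xs[-1])
```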
Prior to training, the parameters of the model were initialized as follows. The entries of J were initialized as draws from a Gaussian random variable with zero mean and variance set by g0. The entries of B were initialized as draws from a Gaussian random variable with zero mean and variance set by h0. The elements of W, and all bias terms, were initialized to 0. Both h0 and g0 were hyperparameters that were systematically varied across RNNs (see Table 1 and RNN variants below).
Logistic regression (LR)
The LR model (schematic in Fig. S2a) was studied to clarify the possible role of feedforward connectivity in performing TI. Each LR consisted of two linear readouts (corresponding to choice 1 and 2), each of which had coefficients for every input dimension Nin for each of the two items presented. In each simulation of the model, Gaussian noise (zero mean, s.d. of 0.2) was added to each of the readouts.
Multi-layer perceptron (MLP)
In addition to the LR model, single-layer MLPs (schematic in Fig. S2a; N = 100 hidden units, fully connected; see also [111] for study of MLPs of smaller size, i.e. three hidden units solving five-item TI) were studied to clarify the possible role of feedforward connectivity in TI. Entries of the input weight matrix were initialized as Gaussian entries with zero mean and variance set by h0, with h0 = 1; entries of the output weight matrix were initialized to 0; all biases were initialized to 0. In each simulation of the model, Gaussian noise (zero mean, s.d. of 0.2) was added to each of the hidden units.
Model input
RNN
The input to the RNN in trial m, u(t, m), consisted of the presentation of items 1 and 2 with an intervening delay, dividing each trial into three periods: rest, delay, and choice (Fig. S1a), with durations 0.5 τ, 2 τ, and 2 τ, respectively. These durations were chosen because they were sufficiently long to yield differing neural implementations of TI in trained networks, and because this (delay) duration was the minimal duration relevant to WM [161, 162]; in pilot work, we found that RNNs trained with longer durations yielded similar results. Item presentations were modeled as instantaneous pulses δ(t), as TI (and relational inference more generally) is not dependent on sensory input of a particular duration. The input over the course of the trial is diagrammed in Fig. S1b.
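The trial structure can be sketched as a discrete-time input array. The sketch assumes a time step of τ/10, item 1 presented at the rest/delay boundary, item 2 at the delay/choice boundary, and each instantaneous pulse approximated by a single time step of height 1/Δt. The pulse scaling and the function name `trial_input` are assumptions made for illustration.

```python
import numpy as np

def trial_input(u1, u2, tau=1.0, steps_per_tau=10):
    """Build the input time series for one delay-format trial.

    u1, u2: input vectors for items 1 and 2 (length N_in).
    Returns (u, t1, t2), where u has shape (T, N_in) and t1, t2 are the
    time-step indices at which items 1 and 2 are presented.
    """
    dt = tau / steps_per_tau
    n_rest = int(round(0.5 * steps_per_tau))   # rest period: 0.5 tau
    n_delay = 2 * steps_per_tau                # delay period: 2 tau
    n_choice = 2 * steps_per_tau               # choice period: 2 tau
    T = n_rest + n_delay + n_choice

    u = np.zeros((T, u1.shape[0]))
    t1 = n_rest                # item 1 ends the rest period
    t2 = n_rest + n_delay      # item 2 ends the delay period
    u[t1] = u1 / dt            # one-step pulse approximating a delta function (assumption)
    u[t2] = u2 / dt
    return u, t1, t2

u, t1, t2 = trial_input(np.ones(3), 2 * np.ones(3))
print(u.shape, t1, t2)
```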
MLP and LR
The input to the feedforward models (LR and MLP) in trial m, u(m), consisted of the joint (simultaneous) presentation of items 1 and 2, requiring twice the input dimensionality of RNNs (i.e. given a fixed dimensionality for individual input stimuli, i.e. items); thus the feedforward models had Nin = 200 rather than Nin = 100 as in the RNNs (diagrammed in Fig. S2a).
Model output
RNN
The output from the RNN was composed of three output units z1, z2, z3 corresponding to three behaviors: choice 1, choice 2, and rest, respectively. In training, the target output ẑ(t, m) was defined for every time point ts and for each trial type m such that the correct output unit was activated above resting levels during the period following the presentation of the second item (target values: resting: 0, activated: 5; diagram of target output in Fig. S1b).
The target values of output units are given by (diagrammed in Fig. S1b): where M is the number of training trial types (12 total, see Fig. 1b), T is the number of timesteps in each trial, and Nout is the number of output units. In example outputs (Fig. 2, top row, and Fig. S3a), the choice value plotted was the difference in the readout values for z1 and z2 averaged over the last half of the choice period and normalized to the magnitude of the largest such difference value across all trial types.
RNN behavior
The behavior – i.e. the choice response and response time (RT) – of an RNN in a trial was defined using an established criterion (see [112]). The z1 (choice 1) and z2 (choice 2) output units were passed through a simple monotonic saturating function ranging in value from 0 to 1: where is the target value of the output unit.
The response (i.e. the choice in a trial) was defined by the identity of the output unit (choice 1 vs. choice 2, see above) that first reached a fixed threshold value of 85% in the choice period.
The RT was defined as the time of the response, measured as the time elapsed from t2 (the time of presentation of item 2) and normalized to the duration of the choice period (0 to 1, where 1 is ‘max’ in plots of RT). Under certain conditions (i.e. when additional noise was added to RNNs (see further below), or when the RNN was presented with same-item stimuli, e.g. AA, BB, CC, etc., Fig. 2; these trial types were not evaluated in this study), the threshold was not reached by either output unit. These trials did not count as correct and are shown in plots as ‘no response’ trials (Fig. 2).
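The response criterion can be sketched as a first-threshold-crossing rule. This sketch assumes the two choice outputs have already been passed through the saturating function (so that values lie between 0 and 1) and are sampled over the choice period starting at t2; the function name `response_and_rt` and the tie-breaking rule are hypothetical.

```python
THRESHOLD = 0.85  # fixed response threshold (85%)

def response_and_rt(s1, s2, threshold=THRESHOLD):
    """s1, s2: saturated outputs (0 to 1) of the choice 1 and choice 2 units,
    sampled over the choice period, with index 0 at the time of item 2.
    Returns (choice, rt), with rt normalized to the choice-period duration,
    or (None, None) for a 'no response' trial."""
    n = len(s1)
    for k in range(n):
        hit1, hit2 = s1[k] >= threshold, s2[k] >= threshold
        if hit1 or hit2:
            # If both units cross on the same step, take the larger (assumed).
            choice = 1 if (hit1 and (not hit2 or s1[k] >= s2[k])) else 2
            return choice, k / (n - 1)
    return None, None

s1 = [0.0, 0.2, 0.5, 0.9, 0.95]
s2 = [0.0, 0.1, 0.1, 0.1, 0.10]
choice, rt = response_and_rt(s1, s2)
no_resp = response_and_rt([0.1] * 5, [0.2] * 5)
```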
MLP and LR
For both feedforward models, the output was composed of two output units z1, z2, corresponding to choice 1 and choice 2, respectively. In training, the target output was defined for each trial type m such that the correct output unit for each trial type (choice 1 vs. 2, Fig. 1b) was activated (value for active: 1, value for not active: 0). The response for a given trial was defined by the identity of the output unit which had the higher activity value. In example outputs (Fig. S2b), the choice value plotted was the difference in the readout values for z1 and z2 normalized to the magnitude of the largest such difference value across all trial types.
Model training
Models were optimized (trained) solely on training trials and not test (inference) trials. The ability of trained models to respond correctly to inference trials thereby mirrors that of living subjects that have only experienced or learned from training trials. In this way, models that respond correctly on test trials (i.e. perform inference) can be analyzed to identify putative neural implementations in the brain. Parameter updates were performed for batches of training trials, where each batch consisted of 128 trials randomly sampled from the training trial types defined by the task (diagrammed in Fig. 1b).
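Batch construction can be sketched as follows. The sketch assumes that the 12 training trial types are the adjacent-rank pairs in both presentation orders (the standard TI design) and that trial types are drawn uniformly with replacement; the function name `sample_batch` is hypothetical.

```python
import random

ITEMS = "ABCDEFG"
# Training trial types: adjacent items in both presentation orders (12 total).
TRAIN_TYPES = [a + b for a, b in zip(ITEMS, ITEMS[1:])] + \
              [b + a for a, b in zip(ITEMS, ITEMS[1:])]

def sample_batch(batch_size=128, rng=random):
    """Draw trial types uniformly with replacement (sampling scheme assumed)."""
    return [rng.choice(TRAIN_TYPES) for _ in range(batch_size)]

batch = sample_batch(rng=random.Random(0))
```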
RNN
RNNs were trained to minimize Etask, the average squared difference between z(t, m), the readout of the network on trial m, and ẑ(t, m), the target output for that trial:
$$E_{task} = \frac{1}{M\,T\,N_{out}} \sum_{m=1}^{M} \sum_{t=1}^{T} \sum_{k=1}^{N_{out}} \left( z_k(t, m) - \hat{z}_k(t, m) \right)^2$$
where m indexes the training trials (M in total), T is the length of the trial (in time steps), and Nout is the number of readout units. Etask stipulates that the optimization procedure generate networks that respond correctly in training trials. The overall error function E comprised Etask and two additional terms that implement regularization, which has been found to promote neurobiologically accurate solutions in trained RNNs [Sussillo, Cuevas, Beiran]. The two terms were RL2, a standard L2 regularization on input and output synaptic weights, and RFR, a regularization on the network rates.
The overall error function was
$$E = E_{task} + \alpha R_{L2} + \beta R_{FR}$$
where the α and β hyperparameters set the strength of each type of regularization. The first regularization term, RL2, is a standard L2 penalty on input and output synaptic weights. The second regularization term, RFR, is a metabolic penalty on rates in the network. Both terms have been found to promote neurobiologically realistic responses in trained RNNs [48, 103]. The objective of training was to minimize E by modifying the network parameters J, B, W, x(t=0), and constant bias terms.
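A minimal numerical sketch of such an error function is given below. It assumes that the task error is combined additively with the two penalties, that α scales the weight penalty and β the rate penalty, and that both penalties are normalized as means of squared values; these normalizations, the toy array sizes, and the function name `total_error` are assumptions rather than the exact definitions used in training.

```python
import numpy as np

def total_error(z, z_hat, B, W, rates, alpha, beta):
    """z, z_hat: (M, T, N_out) readouts and targets; B, W: input and output
    weight matrices; rates: (M, T, N) unit rates."""
    e_task = np.mean((z - z_hat) ** 2)          # average squared error
    r_l2 = np.mean(B ** 2) + np.mean(W ** 2)    # L2 penalty (assumed normalization)
    r_fr = np.mean(rates ** 2)                  # rate penalty (assumed form)
    return e_task + alpha * r_l2 + beta * r_fr

# Toy values chosen so that each term is easy to check by hand.
z = np.zeros((2, 3, 3))
z_hat = np.ones((2, 3, 3))          # e_task = 1
B = 2.0 * np.ones((4, 2))           # mean square = 4
W = np.ones((3, 4))                 # mean square = 1
rates = 3.0 * np.ones((2, 3, 4))    # mean square = 9
E = total_error(z, z_hat, B, W, rates, alpha=0.1, beta=0.01)
```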
Training was implemented with the Adam optimizer [163], with updates to the network parameters calculated using backpropagation through time [164, 165]. Parameter updates were performed for training sets (batches) of 128 trials, where the trials in each batch were randomly sampled from the task-defined training trials (diagrammed in Fig. 1b). Training was stopped when the RNN responded correctly on all trial types (training and test; see above for response criteria) in the absence of noise (ηi set to 0), or when Etask fell below 0.1. Up to 30,000 training epochs were run.
MLP and LR
Each feedforward model was trained using the Adam optimizer with a cross-entropy loss. For the LR models, training was performed to convergence (i.e. until loss did not improve for 1000 training epochs); for MLPs, training was performed until the network responded correctly on all trial types (training and test; see above for response criteria) in the absence of noise (ηi set to 0). For both models, a weight-decay term (given by the L2-norm of all parameters) scaled by hyperparameter α was included in the loss function, with similar results across a range of values (results presented use 0.1 for LR and 0.001 for MLP).
RNN variants
To identify multiple biologically relevant neural implementations of TI, two classes of RNN variants were studied: First, RNN variants that differed in learnable connectivity: f-RNN and r-RNN. f-RNNs were RNNs where all connection weights (feedforward and recurrent) were trainable; r-RNNs were RNNs for which feedforward weights (B and W, see above) were not allowed to be modified from their initial random values (Gaussian draws) in training. In addition, we separately found that RNNs for which feedforward input weights (B) were not allowed to change showed similar results to r-RNNs. Since these latter RNNs were a relatively closer point of comparison to f-RNNs, these models were analyzed in Fig. 5 and Fig. S7 (third row).
Second, RNN variants that differed with respect to initial connectivity strength (prior to training) and regularization (during training), jointly termed “training regime” and defined for five classes (Table 1): simple (high), simple (low), intermediate, complex (low), and complex (high). In general we use the terms “simple-regime” and “complex-regime” to refer to the simple (high) and complex (high) training regimes, respectively, though in some cases we use “simple-regime” to refer to both simple (high) and simple (low) (and similarly for “complex-regime”). Where this is the case, it is stated explicitly. Ten RNN variants were studied in all: two types of learnable connectivity variants (f-RNN and r-RNN) by five types of training regime variants.
In addition to these two types of variants, for each RNN variant (e.g. simple-regime f-RNN) a collection of individual instances (randomly initialized networks, given initial connectivity strength parameters in Table 1) was studied. In particular, we trained 100 instances of each RNN variant, subsequently studying only those models that performed TI perfectly (correct responses to all trial types) under noise-free conditions (ηi set to 0). This subset of instances was then subjected to behavioral and neural analyses.
Behavior simulation and behavioral patterns
To investigate behavioral patterns across models (Fig. 3, 4, 8, S2d, S4), models were simulated on all 42 trial types (12 training and 30 test, Fig. 1b). To simulate average levels of performance that were realistic to living subjects performing TI and showing characteristic behavioral patterns (>50% performance in training trials and <100% performance on trial types with large symbolic distance; see example monkey data in Fig. S4a), we took the following approach.
Simulation approach
For each model instance that performed perfectly (correct responses on all trial types) under noise-free conditions, we added progressively larger amounts of intrinsic noise to model units (s.d. of ηi, increased from 0.5 in increments of 0.1; LR: readout units, MLP: hidden-layer units, RNN: hidden-layer (recurrent) units) until the performance of the model (averaged over 500 simulations across all trial types) satisfied these basic performance criteria: >50% training performance (moreover for both choice 1 and 2 training trials) and <95% performance on the largest symbolic distance trials (AG and GA).
With the addition of noise, a subset of simulated trials (∼20%) did not meet the output activity threshold criterion (fixed at 85%) for a response (no response trials, see above). For simplicity, we designated the choice in these trials to be randomly either 1 or 2 in calculating average performance, and did not include these trials in RT analysis (note also that alternative approaches to defining the model response can be employed [148]). The first 50 RNN instances (where available) to pass the basic performance criteria were subsequently analyzed. For subsequent behavioral analysis, 5000 simulations of all trial types at the identified noise level were run.
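The noise-titration procedure can be sketched as a simple search over noise levels. Here `evaluate` stands in for simulating a model at a given noise s.d. and returning average accuracies; it, the cap `max_sd`, and the toy accuracy curves are hypothetical placeholders, not part of the models studied.

```python
def find_noise_level(evaluate, start=0.5, step=0.1, max_sd=10.0):
    """evaluate(sd) -> (train_acc_choice1, train_acc_choice2, acc_largest_distance),
    each averaged over simulations (500 in the text). Returns the first noise
    s.d. meeting the criteria, or None if max_sd (an assumed cap) is exceeded."""
    k = 0
    while True:
        sd = round(start + k * step, 10)
        if sd > max_sd:
            return None
        acc1, acc2, acc_far = evaluate(sd)
        # >50% on both kinds of training trial, <95% on AG and GA.
        if acc1 > 0.5 and acc2 > 0.5 and acc_far < 0.95:
            return sd
        k += 1

# Toy stand-in for a model: accuracy decays linearly with noise (illustrative only).
def toy_evaluate(sd):
    train = max(0.5, 1.0 - 0.1 * sd)
    far = max(0.5, 1.0 - 0.04 * sd)
    return train, train, far

noise_sd = find_noise_level(toy_evaluate)
```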
Symbolic distance effect
Trial types differing by the magnitude of the difference in rank between items (i.e. distance 1: AB, BA, BC, CB, CD, DC, DE, ED, EF, FE, FG, GF; 2: AC, CA, BD, DB, CE, EC, DF, FD, EG, GE; 3: AD, DA, BE, EB, CF, FC, DG, GD; 4: AE, EA, BF, FB, CG, GC; 5: AF, FA, BG, GB; 6: AG, GA). Schematic of trial types in Fig. 3a, second column.
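The grouping by symbolic distance can be checked by enumeration. The sketch below assumes the seven items are ranked A through G and that each ordered pair of distinct items is one trial type; the function name is hypothetical.

```python
from itertools import permutations

ITEMS = "ABCDEFG"
RANK = {name: i for i, name in enumerate(ITEMS)}

def trial_types_by_distance():
    """Group all ordered pairs of distinct items by rank difference."""
    groups = {}
    for a, b in permutations(ITEMS, 2):
        d = abs(RANK[a] - RANK[b])
        groups.setdefault(d, []).append(a + b)
    return groups

groups = trial_types_by_distance()
counts = {d: len(g) for d, g in sorted(groups.items())}
```

The counts per distance (12, 10, 8, 6, 4, and 2 for distances 1 through 6) sum to the 42 trial types of the task.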
End item effect
Trial types differing by whether or not they contain end items A or G. Schematic of trial types in Fig. 3a, third column.
Lexical marking effect
Trial types containing end item A vs. end item G. Schematic of trial types in Fig. 3a, third column. Note that trial types containing both A and G (i.e. AG and GA) were not included.
“End order” effect
Trial types for which end items (A or G) were presented 1st (item 1) vs. 2nd (item 2). Schematic of trial types in Fig. 3a, third column. Note that trial types containing both end items (i.e. AG and GA) were not included. The end order effect refers to the (sequential) order of item presentation, and is therefore specific to the delay task format of TI (Fig. 1c, Fig. S1a).
The end order index (Figs. 4, 8) was defined as: where RT1st is the average RT over trials where end items were presented first (item 1), and RT2nd is the average RT over trials where end items were presented second (item 2).
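The trial-type groupings used for the end item, lexical marking, and end order effects can be enumerated as in the following sketch. Variable names are hypothetical, and the sketch covers only the groupings, not the end order index itself.

```python
from itertools import permutations

ITEMS = "ABCDEFG"
ALL_TYPES = [a + b for a, b in permutations(ITEMS, 2)]
ENDS = {"A", "G"}
BOTH_ENDS = {"AG", "GA"}   # excluded from lexical marking and end order

# End item effect: trial types with vs. without an end item.
end_item = [t for t in ALL_TYPES if ENDS & set(t)]
no_end_item = [t for t in ALL_TYPES if not ENDS & set(t)]

# Lexical marking effect: trial types containing A vs. containing G.
with_A = [t for t in ALL_TYPES if "A" in t and t not in BOTH_ENDS]
with_G = [t for t in ALL_TYPES if "G" in t and t not in BOTH_ENDS]

# End order effect: end item presented first vs. second.
end_first = [t for t in ALL_TYPES if t[0] in ENDS and t not in BOTH_ENDS]
end_second = [t for t in ALL_TYPES if t[1] in ENDS and t not in BOTH_ENDS]
```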
Visualization of population activity
To clarify the neural implementation of TI in RNNs that performed the task, we visualized population-level neural activity by performing PCA of activity vectors across all time points and trial types (12 training and 30 test, Fig. 1b) under noise-free conditions (ηi set to 0) (% variance explained in Table 3). Activity was plotted in the top 3 PCs (Fig. 5a, 6a).
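A minimal sketch of this projection, using a singular value decomposition of mean-centered activity, is shown below. The array sizes and the random placeholder activity are assumptions for illustration; in practice the rows would be activity vectors pooled across all time points and trial types.

```python
import numpy as np

def pca(activity, n_components=3):
    """activity: (n_samples, N) array of activity vectors pooled over all
    time points and trial types. Returns (projections, components, var_ratio)."""
    centered = activity - activity.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    var_ratio = s ** 2 / np.sum(s ** 2)      # fraction of variance per PC
    components = vt[:n_components]           # rows are principal components
    return centered @ components.T, components, var_ratio[:n_components]

rng = np.random.default_rng(0)
# Placeholder activity: 42 trial types x 90 time points, N = 100 units (sizes assumed).
x = rng.standard_normal((42 * 90, 100))
proj, pcs, var_ratio = pca(x)
```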
Task–relevant activity axes
Visualization of population activity in RNNs performing TI (Figs. 5, 6) suggested that the underlying neural implementations were characterized by specific arrangements (in activity space) of activity states with respect to two directions (axes), each defined on the basis of different trial types in the task (visual schematic of each axis in Fig. S8a).
The first was the choice axis, defined as the direction pointing from the activity states of Choice 1 trial types to those of Choice 2 trial types (21 trial types each; red vs. blue, respectively, in Fig. 1b) during the choice period (period following the delay, Fig. S1a). The choice axis was calculated as a unit-normalized vector pointing from the mean of activity vectors across Choice 1 trial types to the mean of activity vectors across Choice 2 trial types, both for activity averaged over the first quarter of the choice period.
The cross-condition mean (XCM; yellow line in Figs. 5a, 5b; [118, 119, 120]), was defined and calculated as the average trajectory across all trial types. The XCM was moreover calculated in two ways: for visualizations, the XCM was calculated for every time point (Fig. 5a, 5b); for quantifications, the XCM was calculated by taking the mean neural activity across a given time window (e.g. first or last quarter of the delay period). The XCM axis, defined as the direction pointing from the activity states at the beginning of the delay period to the end of the delay period, was calculated as a unit-normalized vector pointing from the XCM of the first quarter of the delay period to the XCM of the last quarter of the delay period.
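The two axes can be sketched as follows, assuming activity is stored as an array of shape (trial types, time steps, units). The function names, the index conventions, and the small synthetic array are assumptions for illustration.

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

def choice_axis(x, is_choice1, choice_idx):
    """x: (n_types, T, N) activity; is_choice1: boolean array (n_types,);
    choice_idx: array of time indices spanning the choice period."""
    q = choice_idx[: max(1, len(choice_idx) // 4)]    # first quarter
    m1 = x[is_choice1][:, q].mean(axis=(0, 1))
    m2 = x[~is_choice1][:, q].mean(axis=(0, 1))
    return unit(m2 - m1)          # points from Choice 1 to Choice 2

def xcm_axis(x, delay_idx):
    """Direction from the early-delay XCM to the late-delay XCM."""
    n = max(1, len(delay_idx) // 4)
    xcm = x.mean(axis=0)          # cross-condition mean trajectory, (T, N)
    early = xcm[delay_idx[:n]].mean(axis=0)
    late = xcm[delay_idx[-n:]].mean(axis=0)
    return unit(late - early)

# Synthetic check: choice separates along unit 0, time drifts along unit 1.
x = np.zeros((4, 8, 3))
x[:2, :, 0] = 1.0
x[2:, :, 0] = -1.0
x[:, :, 1] += np.arange(8)
is_choice1 = np.array([True, True, False, False])
c_axis = choice_axis(x, is_choice1, np.arange(4, 8))
m_axis = xcm_axis(x, np.arange(0, 4))
```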
Inference of linear dynamics
To identify dynamical components expressed in RNNs performing TI, we fit neural activity from the delay period to an unconstrained linear dynamics model (with dynamics matrix A; least-squares fit) from noise-free simulations (ηi set to 0) across all time points during the delay and across all trial types. In the delay period there are 7 trial types corresponding to each possible item 1 (A through G). The fit was performed for the top 10 PCs of delay-period activity. R2 values of the fit were relatively high (0.5-0.9; see Fig. S7, first column, for values across RNNs). Eigenvalues of the fitted matrix A were subsequently plotted for each type of RNN variant (Fig. S7, second through sixth columns).
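One way to carry out such a fit is sketched below. It assumes a continuous-time model of the form dx/dt = A x, with derivatives estimated by finite differences; the exact model form, the function name `fit_linear_dynamics`, and the synthetic test data are assumptions.

```python
import numpy as np

def fit_linear_dynamics(x, dt):
    """x: (n_types, T, k) delay-period activity in the top-k PCs.
    Fits dx/dt = A x by least squares; returns (A, r_squared)."""
    dx = (x[:, 1:] - x[:, :-1]) / dt              # finite-difference derivative
    X = x[:, :-1].reshape(-1, x.shape[-1])
    dX = dx.reshape(-1, x.shape[-1])
    M, *_ = np.linalg.lstsq(X, dX, rcond=None)    # solves X @ M ~ dX
    A = M.T
    resid = dX - X @ M
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((dX - dX.mean(axis=0)) ** 2)
    return A, r2

# Synthetic data from a known damped rotation, one trajectory per item 1.
A_true = np.array([[-0.1, -2.0], [2.0, -0.1]])
dt = 0.01
rng = np.random.default_rng(0)
x = np.zeros((7, 200, 2))
x[:, 0] = rng.standard_normal((7, 2))
for t in range(199):
    x[:, t + 1] = x[:, t] + dt * x[:, t] @ A_true.T
A_hat, r2 = fit_linear_dynamics(x, dt)
eigvals = np.linalg.eigvals(A_hat)
```

On this synthetic example the fitted matrix matches the generating matrix, and its eigenvalues form a complex-conjugate pair, i.e. an oscillatory mode.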
Fixed point analysis and linearization
To identify dynamical components of RNNs performing TI, we used fixed-point analysis and linear approximation methods [115, 121, 116].
Fixed-point finding was implemented using custom code in PyTorch, following an established method [115, 121]. Optimization via gradient descent (Adam optimizer) was used to identify activity states in which the speed of RNN dynamics was minimized (mean-squared error loss). The optimization was seeded using activity states from noise-free simulations of trial types in the TI task (specifically at these time points: 0, time of item 1 presentation, time of item 2 presentation, halfway through delay, the last time point, and the last timestep when trials were simulated with 100 additional timesteps), in addition to 5 batches of 50 activity state seeds, each of which was drawn randomly from activity states of trials in which item 1 and item 2 were randomly jittered in time across the trial. Each batch was optimized for up to 50,000 epochs, with optimization stopped early after 5,000 epochs with no improvement in loss. Candidate FPs were those activity states for which the speed was lower than 10^-5. Redundant candidate FPs were eliminated by requiring that, between candidate FPs, the activity of every recurrent unit differed by more than 10^-5.
To obtain the Jacobian matrix A of linearization, two methods were used. The first was analytic (based on the weight matrix [115]); the second was numerical [92], using the function grad() in the PyTorch autograd library: at each fixed point, the function grad() was used to calculate the entries of A for the trained (and frozen) RNN (with no external input). Each approach yielded the same results.
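The agreement between analytic and numerical linearization can be illustrated on a toy network. This sketch assumes dynamics of the form dx/dt = -x + J tanh(x) + b, which may differ from the RNN equations used here, and it substitutes central finite differences for automatic differentiation; the state at which the Jacobian is evaluated is arbitrary rather than a true fixed point.

```python
import numpy as np

def velocity(x, J, b):
    """Assumed dynamics: dx/dt = -x + J @ tanh(x) + b (tau = 1, no input)."""
    return -x + J @ np.tanh(x) + b

def jacobian_analytic(x, J):
    # d/dx_j of J @ tanh(x) scales column j of J by 1 - tanh(x_j)^2.
    return -np.eye(len(x)) + J * (1.0 - np.tanh(x) ** 2)

def jacobian_numeric(x, J, b, eps=1e-6):
    """Central finite differences, one column per perturbed unit."""
    n = len(x)
    A = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        A[:, j] = (velocity(x + e, J, b) - velocity(x - e, J, b)) / (2 * eps)
    return A

rng = np.random.default_rng(0)
n = 5
J = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
x_star = rng.standard_normal(n)   # arbitrary state, not a true fixed point
A_an = jacobian_analytic(x_star, J)
A_num = jacobian_numeric(x_star, J, b)
```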
Oscillation of transitive comparison
To identify the putative oscillation associated with transitive comparison (comparison oscillation), for each RNN performing TI we detected the oscillatory mode (mode with eigenvalue having a non-zero imaginary component) of highest frequency (e.g. highlighted in arrowheads in Fig. 5b and 6b for example networks), as this was the mode associated with transitive comparison in simple-regime RNNs (modes with frequency ∼0.5 cycles/delay, Figs. 5d, 6d). This detection was performed from the inferred linear dynamics matrix (with eigenvalue spectra across RNNs in Fig. S7), as this can also be equivalently performed for experimental neural data. The 2D linear subspace (plane) of the identified oscillatory mode was defined by the eigenvectors, which were orthogonalized and unit-normalized prior to being used to visualize population activity (Fig. 5b, right) and to quantify model predictions regarding activity geometry (angles of task-relevant axes with respect to the oscillation, Fig. S8).
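Detection of the highest-frequency oscillatory mode and its plane can be sketched as follows, assuming a continuous-time dynamics matrix so that frequency is proportional to the magnitude of the imaginary part of the eigenvalue. The function name `comparison_plane`, the tolerance, and the example matrix are assumptions.

```python
import numpy as np

def comparison_plane(A):
    """Return (eigenvalue, plane) for the oscillatory mode of highest frequency.
    plane is a (2, k) array with orthonormal rows; returns (None, None) if no
    eigenvalue has a non-zero imaginary part."""
    vals, vecs = np.linalg.eig(A)
    osc = np.where(np.abs(vals.imag) > 1e-12)[0]
    if osc.size == 0:
        return None, None
    i = osc[np.argmax(np.abs(vals.imag[osc]))]
    v = vecs[:, i]
    basis = np.stack([v.real, v.imag], axis=1)   # (k, 2), spans the mode's plane
    q, _ = np.linalg.qr(basis)                   # orthogonalize and unit-normalize
    return vals[i], q.T

# Two damped rotations: a fast one in dims 0-1 and a slow one in dims 2-3.
A = np.array([[-0.1, -2.0, 0.0, 0.0],
              [2.0, -0.1, 0.0, 0.0],
              [0.0, 0.0, -0.1, -0.5],
              [0.0, 0.0, 0.5, -0.1]])
val, plane = comparison_plane(A)
```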
Measuring activity geometry
To quantify patterns of population-level neural activity characteristic of different neural implementations of TI, we calculated the following geometric measures (indices). These indices were calculated for delay period activity, and were based on the following groupings of task items: S = 7 (all items: A, B, C, D, E, F, G). Shigh = 3 (high-rank items: A, B, C). Slow = 3 (low-rank items: E, F, G). Note that each item defines a trial type during the delay period; thus for delay period activity there were 7 different trial types. All indices were calculated in the top 10 PCs of delay period activity. Note that the geometric indices were measured in the full activity space of networks (N-D ambient space, reduced to top 10 PCs) to avoid assumptions or biases incurred from an intermediate step of estimating activity subspaces (such as the subspace of an oscillation).
Ordered collinearity index (OCI)
The ordered collinearity index (OCI) measures the degree to which activity is rank-ordered and linearly arranged in activity space (schematic in Fig. 7a; characteristic of the “subtractive” solution, Fig. 5f), defined as Activity state vectors (v) were network activity states (x) measured with respect to the cross-condition mean (XCM; yellow circle in Fig. 7a; see also Fig. 5a). Note that the negative sign results in value +1 for activity corresponding to rank-ordered collinearity. OCI was measured during two periods of time (time-averaged over the period): over the 1st quarter of the delay (early delay OCI; Fig. 7a, left column) and over the last quarter of the delay (late delay OCI; Fig. 7a, middle column). OCI change (Fig. 7a, right column) was defined as the late delay OCI - early delay OCI.
Mean angle
To generalize OCI change to other possible angular arrangements of neural activity, we measured mean angle change (schematic of mean angle in Fig. 7b), defined as Activity state vectors (v) were network activity states (x) measured with respect to the cross-condition mean (XCM), and averaged over either the first (early) or last (late) quarter of the delay period.
Mean distance
To help distinguish between angular vs. non-angular rearrangements of activity states in activity space (during the delay period), we measured mean distance change (schematic of mean distance in Fig. 7b), defined as
Axis angles
To quantify predicted angular relationships between activity patterns (Fig. S8), we calculated the relevant cosine angles (dot products). In particular, simple-regime RNNs (Figs. 5 and 6) predict two angular relationships: (1) orthogonality (cosine angle: 0) between the XCM and the plane of the oscillation associated with transitive comparison (plane of the comparison oscillation), and (2) alignment (cosine angle: ± 1) between the choice axis and the plane of the comparison oscillation. The plane of the comparison oscillation was defined by the two eigenvectors of the comparison oscillation (see above, Linear dynamics), which were orthogonalized and unit-normalized.
For (1) (Fig. S8b, left column), the dot product was calculated between the XCM and each of these plane vectors. As a conservative estimate, the cosine angle was defined as the dot product having the larger magnitude.
For (2) (Fig. S8b, right column), the dot product was calculated between the choice axis and each of these plane vectors. As a conservative estimate, the cosine angle was taken to be the dot product having the smaller magnitude.
All angles were calculated after first reducing neural activity to the top PCs (PCs calculated from delay period activity; either top 3 or 10 PCs, indicated in Fig. S8b).
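The conservative cosine estimates can be sketched as below, given an axis and two orthonormal vectors spanning the oscillation plane. The function name `cosine_with_plane` and the example vectors are hypothetical.

```python
import numpy as np

def cosine_with_plane(axis, plane, conservative="larger"):
    """axis: (k,) vector; plane: (2, k) array of orthonormal rows spanning
    the oscillation plane. Returns the dot product (cosine) of larger or
    smaller magnitude among the two plane vectors."""
    a = axis / np.linalg.norm(axis)
    dots = plane @ a
    pick = np.argmax if conservative == "larger" else np.argmin
    return dots[pick(np.abs(dots))]

plane = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
xcm = np.array([0.0, 0.0, 1.0])        # orthogonal to the plane
choice = np.array([1.0, 0.2, 0.0])     # lies within the plane

cos_xcm = cosine_with_plane(xcm, plane, conservative="larger")
cos_choice = cosine_with_plane(choice, plane, conservative="smaller")
```

Taking the larger magnitude for the orthogonality prediction and the smaller magnitude for the alignment prediction biases each estimate against the predicted relationship, which is the sense in which the estimates are conservative.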
Statistical tests
All statistical tests were non-parametric and two-sided.
Supplementary Figures
Acknowledgements
We thank J. Johnston, L. Tian, S. Lippl, M. Triplett, C. Monfredo, N. Biderman, B. Antin, J. Cunningham, and members of the Columbia Center for Theoretical Neuroscience for comments and discussion, and R. Yang for guidance on model training. This work was supported by the Simons Collaboration for the Global Brain (521921 and 542981), NIH grants (R01 MH090188; R01 MH105174; R01 MH111703, V.F.), the NSF NeuroNex award (DBI-1707398), the Gatsby Charitable Foundation, and an NIMH K99 (MH126158-01A1, KK).