Abstract
According to standard models of synaptic plasticity, correlated activity between connected neurons drives changes in synaptic strengths to store associative memories. Here we tested this hypothesis in vivo by manipulating the activity of hippocampal place cells and measuring the resulting changes in spatial selectivity. We found that the spatial tuning of place cells was rapidly reshaped via bidirectional synaptic plasticity. To account for the magnitude and direction of plasticity, we evaluated two models – a standard model that depended on synchronous pre- and post-synaptic activity, and an alternative model that depended instead on whether active synaptic inputs had previously been potentiated. While both models accounted equally well for the data, they predicted opposite outcomes of a perturbation experiment, which ruled out the standard correlation-dependent model. Finally, network modeling suggested that this form of bidirectional synaptic plasticity enables population activity, rather than pairwise neuronal correlations, to drive plasticity in response to changes in the environment.
Main Text
Activity-dependent changes in synaptic strength can flexibly alter the selectivity of neuronal firing for particular features of the environment, providing a cellular substrate for learning and memory. Various forms of Hebbian synaptic plasticity have been considered for decades to be the main or even only synaptic plasticity mechanisms present within most brain regions of a number of species. The core feature of such plasticity mechanisms is that they are autonomously driven by the repeated presence of correlated presynaptic and postsynaptic activity that leads to either increases or decreases in synaptic strength depending on the exact temporal coincidence (1–4).
The hippocampus plays an important role in many forms of learning and memory, and the spatial firing rates of hippocampal place cells have been shown to change with alterations in environmental context or the locations of salient features, like reward (5–11). Furthermore, in CA1 neurons, place cell activity can emerge in a single trial following a dendritic calcium spike (also called a plateau potential) (12–14). The form of synaptic plasticity responsible for this rapid change in selectivity, termed behavioral timescale synaptic plasticity (BTSP), modifies synaptic inputs active within a multi-second time window around the plateau potential. That BTSP strengthens many synaptic inputs whose activation did not cause or even coincide with postsynaptic activity suggests that it might be a fundamentally different form of plasticity from classical correlation-driven Hebbian plasticity (1–3). Such a plasticity rule could enable representation learning in cortical brain regions like the hippocampus to be guided by delayed behavioral outcomes, rather than by short timescale associations of neuronal input and output. However, it was not clear from previous experiments whether short timescale correlations modulate the changes in synaptic strength induced by BTSP, which might reveal similarities with other correlative forms of plasticity.
In the current study, we sought to directly determine the dependence of BTSP on the correlation of presynaptic activity and postsynaptic depolarization in individual place cells. Intracellular voltage recordings from CA1 place cells were established in head-fixed mice trained to run for a water reward on a circular treadmill decorated with visual and tactile cues to distinguish spatial positions (187 cm in length). We began by examining how the induction of BTSP changes the membrane potential (Vm) dynamics in neurons already exhibiting location-specific firing (i.e., place cells). To do so, we injected brief step currents (300 ms duration) through the intracellular electrode for a small number of consecutive laps (5–8 laps; Fig. 1, A to C) to evoke plateau potentials at a second location along the track some distance from the initial place field (labeled “Induction 2” in Fig. 1, A to H). We observed that the plasticity induced by experimentally-evoked dendritic plateaus both increased Vm ramp amplitude near the plateau induction position and decreased ramp amplitude at the peak location of the original place field (Fig. 1, B and D). The time course of the Vm changes showed that decreases in ramp amplitude occurred at positions in space that were traversed multiple seconds before or after induced plateaus (Fig. 1, E and J). Furthermore, the magnitude of decreases in Vm ramp amplitude was greatest at spatial positions where initial ramp amplitude was largest (Fig. 1I). Interestingly, in a subset of recordings the ramp amplitude at the original place field location was not reduced (Fig. 1G). Inspection of the animals’ run trajectories during such instances revealed that long pauses in running just before the plateau induction position on multiple laps “protected” the original place field from depression by excluding the underlying location-selective inputs from the plasticity time window (Fig. 1, F to H).
When both initial ramp amplitude and relative input timing are considered, it is apparent that the preferred conditions for large synaptic depression are that spatial inputs 1) have already been strengthened by previous plasticity, resulting in elevated postsynaptic depolarization at the time of presynaptic spikes, and 2) are activated within a time window ∼2–4 seconds away from a plateau (Fig. 1J, trace color indicates initial ramp amplitude; Fig. 1K, two-dimensional interpolation from data, trace color indicates change in ramp amplitude; see Materials and Methods). In summary, BTSP can either strengthen or weaken synapses in a small number of trials, providing a bidirectional learning mechanism capable of both rapid memory storage and erasure.
(A) Spatial firing of a CA1 pyramidal cell recorded intracellularly from a mouse running laps on a circular treadmill. Dendritic plateau potentials evoked by intracellular current injection first induce a place field at ∼120 cm (Induction 1), then induce a second place field at ∼10 cm and suppress the first (Induction 2). (B) Black: intracellular Vm traces from individual laps in (A); blue: example low-pass filtered Vm ramp superimposed on unfiltered trace, and duplicated with expanded scale (inset). (C) Animal position vs. time during laps in which place fields were induced by evoked plateaus (locations marked with colored circles). Δt refers to the inter-event intervals between traversals of the initial place field peak location and evoked plateaus. (D) Spatially binned subthreshold Vm ramp depolarizations averaged across laps after plasticity induction. Colored dashes mark the average locations of evoked plateaus. (E) For each position, induced changes in Vm ramp amplitude are plotted against the time interval between plateau onset and traversal of that position during plasticity induction laps. (F – H) Same as (C – E) for a different example cell in which the amplitude of the original place field was not reduced following the second plasticity induction. (I) For all recorded neurons with a pre-existing place field in which plasticity was induced at a second location (n=13), changes in ramp amplitude are compared to initial ramp amplitude for each spatial bin (1.87 cm). Induced changes in ramp amplitude are negatively correlated with initial ramp amplitude. Explained variance (R²) and statistical significance (p < 0.05) reflect Pearson’s correlation and a 2-tailed null hypothesis test. (J) For the same neurons as in (I), changes in ramp amplitude are compared to time to plateau onset. Trace color indicates initial ramp amplitude before plasticity induction.
(K) Two-dimensional Gaussian regression and interpolation of data from all recorded plasticity inductions (20 inductions from 13 cells) was used to estimate the plasticity rule that relates initial ramp amplitude and time to plateau onset to induced changes in ramp amplitude (trace color).
The above analysis revealed a relationship between initial Vm ramp amplitude and bidirectional changes in Vm depolarization induced by BTSP (Fig. 1, I to K), though it was inverted compared to most common formulations of correlative plasticity in which small depolarizations induce synaptic depression and large depolarizations induce synaptic potentiation (3, 15–17). We next sought to investigate this possible causal relationship between postsynaptic depolarization and plasticity induced by BTSP. First, we formulated a set of mathematical models of the underlying synaptic learning rule to generate testable hypotheses and to predict changes in synaptic strength given the following quantities: presynaptic spike times, postsynaptic depolarization, postsynaptic plateaus, and the strengths of each synaptic input before each plateau.
We compared two classes of plasticity models – a standard model dependent on coincident presynaptic spiking and postsynaptic depolarization (correlative), and an alternative model dependent instead on the timing of presynaptic spikes and the strengths of each synaptic input at the time of activation (non-correlative) (see Materials and Methods for details). We refer to these as the voltage-dependent and weight-dependent models, respectively. To account for the long time course of BTSP, both models required temporal filters of synaptic activity to generate slow biochemical intermediate signals marking synapses as eligible for either synaptic potentiation or depression (local synaptic eligibility; Fig. 2, A to C; fig. S1, C and D; fig. S2, C and D) (1, 3, 15, 16, 18). Biologically, these traces could correspond to the enzymatic activity of calcium-dependent kinases and phosphatases, and post-translational modification and synaptic localization of proteins that regulate synaptic function (18–23). In the voltage-dependent model, the amplitudes of these eligibility signals were modulated by the value of postsynaptic depolarization at the time of presynaptic activation (Fig. 2A and fig. S1, A, C and D), whereas in the weight-dependent model (Fig. 2C and fig. S2, C and D), eligibility for plasticity depended only on presynaptic firing rate. Both models also required a temporal filter of the plateau potential to generate a second intermediate signal that extended in time long enough to interact with synaptic activity occurring up to seconds after a plateau potential (global dendritic instructive signal; Fig. 2, A to C; fig. S1, C and D; fig. S2, C and D). This plateau-related “instructive signal” was broadcast globally to all synapses, and was required for plasticity. Changes in synaptic weight in the models occurred only during periods of temporal overlap between localized eligibility signals and the global instructive signal (Fig. 2B; fig. S1, C and D; fig. S2, C and D).
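As an illustration of the signal cascade described above, the following minimal Python sketch filters a presynaptic firing rate and a plateau with kernels that have an exponential rise and decay, and then integrates the product of the resulting eligibility and instructive signals over time. It is a sketch under stated assumptions, not the fitted model: the time step, time constants, firing rates, and all function names are hypothetical values chosen for illustration only.

```python
import math

DT = 0.02        # s, simulation time step (assumed)
T_TOTAL = 20.0   # s, length of the simulated window (assumed)
N = int(round(T_TOTAL / DT))


def rise_decay_kernel(tau_rise, tau_decay, duration=10.0):
    """Causal filter with exponential rise and decay, scaled to a peak of 1."""
    n = int(round(duration / DT))
    k = [(1.0 - math.exp(-i * DT / tau_rise)) * math.exp(-i * DT / tau_decay)
         for i in range(n)]
    peak = max(k)
    return [v / peak for v in k]


def causal_filter(signal, kernel):
    """Discrete causal convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j in range(min(len(kernel), len(signal) - i)):
            out[i + j] += s * kernel[j] * DT
    return out


def gaussian_rate(center, sigma=0.2, peak_hz=20.0):
    """Presynaptic firing rate: a Gaussian bump in time (assumed shape)."""
    rates = []
    for i in range(N):
        r = peak_hz * math.exp(-0.5 * ((i * DT - center) / sigma) ** 2)
        rates.append(r if r > 1e-6 else 0.0)
    return rates


def plateau(onset, duration=0.3):
    """Binary plateau trace lasting 300 ms, as for the evoked plateaus."""
    start = int(round(onset / DT))
    stop = start + int(round(duration / DT))
    return [1.0 if start <= i < stop else 0.0 for i in range(N)]


# Assumed time constants (s); the eligibility filter is slower than the
# instructive filter here purely for illustration.
ELIGIBILITY_KERNEL = rise_decay_kernel(tau_rise=0.1, tau_decay=1.5)
INSTRUCTIVE_KERNEL = rise_decay_kernel(tau_rise=0.05, tau_decay=0.5)


def signal_overlap(pre_center, plateau_onset=10.0):
    """Time integral of eligibility signal times instructive signal."""
    eligibility = causal_filter(gaussian_rate(pre_center), ELIGIBILITY_KERNEL)
    instructive = causal_filter(plateau(plateau_onset), INSTRUCTIVE_KERNEL)
    return sum(e * s for e, s in zip(eligibility, instructive)) * DT


overlap_before = signal_overlap(8.5)   # input active 1.5 s before the plateau
overlap_after = signal_overlap(11.5)   # input active 1.5 s after the plateau
overlap_far = signal_overlap(2.0)      # input active 8 s before the plateau
```

With these assumed time constants, the input active 1.5 s before the plateau overlaps with the instructive signal more than the input active 1.5 s after it, and the input active 8 s earlier overlaps very little. The size and asymmetry of the window depend entirely on the filter time constants, which in the models are free parameters fit to the data.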
(A) Voltage-sensitivity of synaptic eligibility signals for potentiation and depression. (B) Model presynaptic firing rates and postsynaptic Vm during simulation of one lap of treadmill running during plasticity induction. The timing of a dendritic plateau potential is indicated with a black dash. Example traces depict two presynaptic inputs, one that potentiates (left, blue) and one that depresses (right, red). (C) Temporal filters with exponential rise and decay (synaptic eligibility signal filter, grey; dendritic instructive signal filter, black). (D) Presynaptic input firing rates (B) were convolved with the eligibility signal filter (C) and multiplied by a voltage modulation factor (A) to generate long duration synaptic eligibility signal traces for potentiation (solid blue and red lines) and depression (dashed blue and red lines) (potentiating input example from (B), left; depressing input example from (B), right). The dendritic plateau potential was convolved with the instructive signal filter (C) to generate a long duration dendritic instructive signal (black). Periods of temporal overlap of eligibility and instructive signals that drive plasticity are shaded. (E) Sigmoidal gain functions sensitive to the amplitude of synaptic eligibility signals regulate the rates of synaptic potentiation (blue) and depression (red). (F) Net rates of change in synaptic weight (potentiating input example from (B) and (D), blue, left; depressing input example from (B) and (D), red, right). (G) Spatially-binned Vm ramp depolarization predicted by the model (before Induction 2, grey; after Induction 2; black). (H) Weighted contributions of individual presynaptic inputs to the postsynaptic ramp (before Induction 2, left; after Induction 2, right). The potentiating (blue) and depressing (red) input examples shown in (B), (D) and (F) are highlighted in color.
(A) State diagram for two-state kinetic model describing the flow of finite synaptic resources. (B – H) Same as fig. S1, B to H for simulation of the weight-dependent model of bidirectional BTSP. (D) Same as fig. S1D, except only a single synaptic eligibility signal is shown (red and blue lines). In the weight-dependent model, postsynaptic voltage does not modulate the amplitude of synaptic eligibility traces, so eligibility for potentiation and depression are both marked by a single trace generated by convolving presynaptic firing rates (B) with the synaptic eligibility signal filter (C).
(A) Diagram depicts a “voltage-dependent” model of bidirectional BTSP. Three factors influence changes in synaptic strength at each input: 1) presynaptic firing rate and timing, 2) postsynaptic Vm depolarization at the time of presynaptic spiking, and 3) postsynaptic plateau timing and duration. The product (degree of correlation) of presynaptic firing rate and postsynaptic depolarization determines the amplitude of long duration “synaptic eligibility signals” that mark each synapse as eligible for later synaptic potentiation or depression. Synaptic eligibility signals, following an additional nonlinear transformation, are later converted into changes in synaptic strength when in the presence of a second required “instructive signal” generated downstream of postsynaptic plateaus. (B) Example traces depict the signals described in (A) for a single presynaptic input onto a neuron that exhibited a pre-existing place field before plateau induction. Shown is a single lap on the circular treadmill for a trial in which a plateau was evoked by intracellular current injection. Top: while presynaptic firing at this input (red) does not overlap in time at all with the postsynaptic plateau (black), it does coincide with the spatially-tuned depolarization underlying the cell’s initial place field (grey). Middle: this generates long duration eligibility signals (blue: potentiation eligibility; red: depression eligibility) that overlap in time with the delayed instructive signal (black) (shading marks area of signal overlap). Bottom: at this input a large rate of synaptic depression and a small rate of synaptic potentiation result in a net decrease in synaptic strength. (C) Diagram depicts an alternative “weight-dependent” model of bidirectional BTSP. Three factors influence changes in synaptic strength at each input: 1) presynaptic firing rate and timing, 2) postsynaptic plateau timing and duration, and 3) the current synaptic weight of each input before each evoked plateau. 
In this model, synaptic eligibility signals depend only on presynaptic firing. Like in (A), a plateau-related instructive signal is required to convert synaptic eligibility signals into changes in synaptic weight. However, in this model, the current weight of each input influences the magnitude and direction of synaptic plasticity such that weak synapses favor potentiation, and strong synapses favor depression. (D) The voltage-dependent model was optimized to generate predicted ramp depolarizations for each neuron in the experimental dataset (20 inductions from 13 neurons). Changes in ramp amplitude are compared to time to plateau onset, with color indicating initial ramp amplitude (compare to Fig. 1J). (E) Estimate of plasticity rule obtained by regression and interpolation of model simulation data (compare to Fig. 1K). (F – G) Same as (D – E) for the weight-dependent model.
Model parameters (fig. S3, A to K and fig. S4, A to K, see Materials and Methods) were optimized by minimizing the difference between measured and predicted Vm ramp depolarizations. Both model variants generated predictions in good agreement with experimental data (fig. S3, L to O; fig. S4, L to O; and fig. S5), which underscores the importance of the components common to both models – long timescale intermediate signals downstream of synaptic activation that are transformed into bidirectional changes in synaptic weight by dendritic plateau potentials. However, the two models made qualitatively different predictions about the causal role of the activation state of the postsynaptic neuron in controlling the magnitude and direction of plasticity. While in the voltage-dependent model, correlation between presynaptic activity and postsynaptic depolarization influences the sign of plasticity, in the alternative model, the sign of plasticity is independent of postsynaptic voltage, and is modulated instead by current synaptic weight such that weak synapses tend to potentiate and strong synapses tend to depress (3, 24–29).
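The weight dependence described above can be sketched with a two-state kinetic scheme in which potentiation draws on unused synaptic capacity and depression acts on the current weight. The Python below is a hedged illustration only: the functional form, the rescaled sigmoidal gains, the thresholds, slopes, and rate constants, and the names `scaled_sigmoid` and `weight_dependent_rate` are all assumptions, not the fitted model. Note that postsynaptic voltage does not appear as an argument.

```python
import math


def scaled_sigmoid(x, threshold, slope):
    """Sigmoid rescaled so that x = 0 maps to 0 and x = 1 maps to 1 (assumed)."""
    def s(v):
        return 1.0 / (1.0 + math.exp(-(v - threshold) / slope))
    return (s(x) - s(0.0)) / (s(1.0) - s(0.0))


def weight_dependent_rate(weight, overlap, w_max=1.0, k_pot=1.0, k_dep=1.0):
    """Net rate of weight change for one synapse.

    weight  : current synaptic weight, between 0 and w_max
    overlap : normalized overlap of eligibility and instructive signals (0 to 1)
    The gain thresholds and slopes below are hypothetical values.
    """
    q_pot = scaled_sigmoid(overlap, threshold=0.6, slope=0.1)
    q_dep = scaled_sigmoid(overlap, threshold=0.2, slope=0.05)
    potentiation = (w_max - weight) * k_pot * q_pot  # scales with unused capacity
    depression = weight * k_dep * q_dep              # scales with current weight
    return potentiation - depression


weak_rate = weight_dependent_rate(weight=0.1, overlap=0.5)
strong_rate = weight_dependent_rate(weight=0.9, overlap=0.5)
no_overlap_rate = weight_dependent_rate(weight=0.9, overlap=0.0)
```

In this sketch, the same signal overlap produces a net increase at a weak synapse and a net decrease at a strong one, and no change occurs without overlap between the eligibility and instructive signals.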
(A – K) The free parameters of the voltage-dependent model of bidirectional BTSP were optimized for each cell in the experimental dataset. Shown here are the distributions of parameter values across cells. (L – O) Model predictions and experimental data are compared for features measured from spatially binned Vm ramp depolarizations. In 7/13 neurons with a second place field induced (Induction 2, red), the first place field was also experimentally-induced (Induction 1, blue), so the model was fit to predict both place field inductions with the same set of parameters. Explained variance (R²) and statistical significance (p < 0.05) reflect Pearson’s correlation and 2-tailed null hypothesis tests. (L) Peak ramp amplitude. (M) Ramp width. (N) Shift of ramp peak location relative to mean location of plateau onset. (O) Minimum ramp amplitude across spatial bins.
(A – O) Same as fig. S3, A to O for parameters and predictions of the weight-dependent model of bidirectional BTSP.
(A – G) Additional model variants of varying complexity were also evaluated for their capability to predict experimentally measured bidirectional changes in Vm ramp amplitude by BTSP. The plasticity rule was estimated by two-dimensional interpolation from model data (see Materials and Methods). (A – C) Model predictions from variants of the voltage-dependent model with 7 (A), 11 (B), or 13 (C) free parameters (see Materials and Methods). (D) Plasticity rule estimated from the experimental data. (E – G) Model predictions from variants of the weight-dependent model with 7 (E), 11 (F), or 13 (G) free parameters (see Materials and Methods). (H) Residual error of ramp depolarizations predicted by each model is
We next sought to distinguish between the two model classes with an in vivo perturbation experiment in which neuronal Vm was experimentally depolarized by intracellular current injection during plasticity induction trials. In otherwise silent CA1 neurons exhibiting no spatial tuning or spiking during treadmill running, we injected current through the intracellular recording pipette to depolarize the neuron at steady-state by ∼10 mV, which often exceeded threshold for spiking (Fig. 3A; see fig. S6, A to C for simulation results supporting significant dendritic depolarization during this manipulation). Then, for 4–5 consecutive laps, plateau potentials were experimentally induced by an additional large, brief step current (∼300 ms) at a fixed location along the track. In all neurons tested, this procedure resulted in the emergence of a large place field near the plateau induction site, as evidenced by spiking and a large amplitude Vm ramp depolarization (Fig. 3, A and C). Consistent with previous control experiments converting silent cells to place cells without Vm depolarization (Fig. 3B) (13), we observed only increases, and no decreases, in ramp amplitude at spatial positions surrounding the plateau location (Fig. 3C and fig. S6D). This absence of synaptic depression is inconsistent with the prediction of the standard voltage-dependent model (Fig. 3D) and instead favors the alternative model, which predicted that only previously potentiated inputs would be eligible for synaptic depression, independent of postsynaptic voltage.
Postsynaptic voltage perturbation experiment: simulation of a biophysically-detailed CA1 pyramidal cell model with realistic morphology and dendritic ion channel distributions (39) to estimate the effect of steady-state somatic depolarization on distal dendritic Vm. Three conditions are compared: a silent cell with uniform input weights (black), a silent cell with ∼10 mV of steady-state depolarization induced by somatic current injection (purple), and a place cell receiving potentiated inputs at the peak of its place field (blue). Under conditions of somatic current injection, a combination of attenuated propagating depolarization and back-propagating action potentials amplifies local synaptic input by activating dendritic voltage-gated ion channels, resulting in a level of dendritic depolarization comparable to the place field condition. (A) Simulated Vm traces recorded from soma (left), distal apical oblique dendrite (center), and a distal apical dendritic spine (right). (B) Mean low-pass filtered Vm at simulated dendritic recording sites at varying distances from the soma. (C) Same as (B) for simulated recordings from dendritic spines. (D – F) Quantification of experimental ramp depolarizations induced by BTSP under conditions of somatic depolarization (Fig. 3). (D) Peak ramp amplitude (p < 0.015). (E) Ramp width (p > 0.143). (F) Shift of ramp peak location relative to mean location of plateau onset (p > 0.255). p-values reflect two-sided Mann-Whitney U tests.
(A) Intracellular Vm traces from individual laps in which plasticity was induced by experimentally-evoked plateau potentials in an otherwise silent CA1 cell. During plasticity induction laps, the neuron was experimentally depolarized by ∼10 mV at steady-state with an intracellular current injection. On the background of this elevated depolarization at every spatial position, step current injections (300 ms) evoked plateau potentials at the same spatial position for five consecutive laps and induced a place field. (B – C) Place field ramp depolarizations induced by experimentally-evoked plateaus (individual cells in grey). Similar to control neurons that were converted from silent cells to place cells without steady-state depolarization (n=25, average in black) (B), neurons that underwent plasticity induction during steady-state depolarization (n=5, average in purple) (C) exhibited only synaptic potentiation, and no synaptic depression, at all spatial positions. (D) The data regression and interpolation in Fig. 1K was used to predict the changes in ramp amplitude that would result if BTSP was either dependent on (red) or independent of (blue) postsynaptic voltage.
These results strongly indicate that bidirectional BTSP is fundamentally different from other previously characterized forms of associative synaptic plasticity that depend on three factors – presynaptic spiking, postsynaptic voltage, and a delayed reinforcement signal (1, 18, 30, 31). A particular advantage of a voltage-independent plasticity rule (Fig. 2C and fig. S2) is that changes in strength at each synapse are determined independently by signals generated locally, whereas plasticity rules that depend on the global activation state of the postsynaptic neuron may not allow independent credit to be assigned to the subset of synapses that contributed to a desired outcome (32).
We next aimed to explore how this form of plasticity could impact memory storage at the network level. During goal-directed navigation, hippocampal neurons have been shown to preferentially acquire new place fields near behaviorally-relevant locations, and to translocate existing place fields towards those locations (8–10, 33). Based on previous evidence that plateau probability in CA1 is facilitated by long-range feedback inputs onto distal CA1 dendrites from entorhinal cortex (12, 34, 35), and diminished by dendrite-targeting inhibition (35–39), we constructed a network model of the CA1 microcircuit where the probability of plateau initiation and thus BTSP induction was regulated by feedback inhibition and an instructive input from entorhinal cortex (Fig. 4, A and B) (10, 40–44).
(A) Diagram depicts components of a hippocampal network model. A population of CA1 pyramidal neurons receives spatially tuned excitatory input from a population of CA3 place cells and an instructive input from entorhinal cortex (EC) that signals the presence of a behavioral goal. The output of CA1 pyramidal neurons recruits feedback inhibition from a population of interneurons. (B) The probability that model CA1 neurons emit plateau potentials and induce bidirectional plasticity is negatively modulated by feedback inhibition. As the total number of active CA1 neurons increases (labeled “normalized population activity”), feedback inhibition increases, and plateau probability decreases until a target level of population activity is reached, after which no further plasticity can be induced (black). An instructive input signaling the presence of a goal increases plateau probability, resulting in a higher target level of population activity inside the goal region (red). (C) Each row depicts the summed activity of the population of model CA1 pyramidal neurons across spatial positions during a lap of simulated running. Laps 1-10 reflect exploration of a previously unexplored circular track. During laps 11-20, a goal is added to the environment at a fixed location (90 cm). During laps 21-25, the goal is removed for additional exploration of the now familiar environment. (D – E) Activity of individual model CA1 pyramidal neurons during simulated exploration as described in (C). (D) Neurons are sorted by the peak location of their spatial activity following 10 laps of novel exploration. A fraction of the population remains inactive and untuned. (E) Neuronal activity after 10 laps of goal-directed search is first sorted by original peak location (left), and then re-sorted by peak location following exposure to the fixed goal (right). An increased fraction of neurons express place fields near the goal position.
(F) Histogram depicts neurons recruited to express new place fields in each spatial bin (epochs: novel explore, blue; fixed goal, red; familiar explore, grey). (G) Histogram depicts absolute distance of translocated place fields. (H) Histogram depicts relative distance of translocated place fields to the goal location.
We simulated a virtual animal running at a constant velocity on a circular treadmill for three separate phases of exploration (Fig. 4C). The first phase (ten laps) simulated exploration of a novel environment. The next phase (ten laps) simulated a goal-directed search for a target placed at a single fixed location (90 cm). Finally, the stability of the acquired spatial representations was assessed by five additional laps with the goal removed (Fig. 4C). At each time step (10 ms), instantaneous plateau probabilities were computed for each cell (Fig. 4B), determining which neurons would initiate a dendritic plateau and undergo plasticity, following the experimentally validated bidirectional synaptic learning rule from Fig. 2C.
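The regulation of plateau probability by population activity can be sketched as follows. This is a minimal Python illustration under stated assumptions: the linear decline of probability with normalized population activity, the maximum probability per 10 ms time step, the two target levels, and the function names are all hypothetical and stand in for the relationship shown in Fig. 4B.

```python
import random


def plateau_probability(population_activity, goal_present=False,
                        p_max=0.01, base_target=1.0, goal_target=1.5):
    """Plateau probability per time step for one model CA1 neuron.

    Probability falls as normalized population activity (and therefore
    feedback inhibition) rises, and reaches zero at a target activity level.
    The instructive goal input raises that target. All values are assumed.
    """
    target = goal_target if goal_present else base_target
    return p_max * max(0.0, 1.0 - population_activity / target)


def sample_plateaus(n_cells, population_activity, goal_present, rng):
    """Draw which cells initiate a plateau in the current time step."""
    p = plateau_probability(population_activity, goal_present)
    return [rng.random() < p for _ in range(n_cells)]


rng = random.Random(0)

# Population activity at the baseline target: no further plateaus outside the goal.
plateaus_outside_goal = sample_plateaus(1000, 1.0, goal_present=False, rng=rng)

# The same activity inside the goal region still permits plateaus.
p_outside = plateau_probability(1.2, goal_present=False)
p_inside = plateau_probability(1.2, goal_present=True)
```

Because the probability reaches zero once population activity meets the target, plasticity stops on its own as place fields accumulate, and a higher target inside the goal region allows additional cells to be recruited there.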
During the first few laps of simulated exploration, CA1 pyramidal neurons rapidly acquired place fields that, as a population, uniformly tiled the track (Fig. 4; C, D, and F). As neurons increased their activity over time, feedback inhibition increased proportionally and prevented further plasticity (Fig. 4, A to C). During laps with a goal presented at a fixed location, an additional population of silent neurons acquired place fields near the goal location (Fig. 4, E and F), while a separate population shifted their place field positions towards the goal (Fig. 4; E, G, and H). Overall, this resulted in an increased proportion of place cells with place fields near the goal position (Fig. 4E). Simulated place cell activity remained stable in the final phase of exploration of the now familiar environment (Fig. 4, C and F to H). These network modeling results recapitulate experimentally observed statistics of CA1 place cell translocation during goal-directed behavior (9).
This model provides a proof of principle that bidirectional BTSP can enable populations of place cells to rapidly adapt their spatial representations to changes in the environment without any compromise to spatial selectivity. An important feature of this model is that plasticity is regulated by the global activity of populations of neurons, rather than by pairwise correlations between single neurons and their inputs. This allows the network to rapidly acquire population-level representations of previously unencountered stimuli, as well as to modify outdated representations to better reflect changes to behaviorally relevant stimuli like goal location. Interestingly, bidirectional BTSP is asymmetric, tending to potentiate inputs more strongly when they are active before a plateau rather than during or after a plateau. In the network model, this caused the population representation of the goal to actually peak before the goal location itself (see also (6)), producing a predictive memory representation that could potentially be used by an animal to recall the path leading to the goal (45).
In summary, we found that dendritic plateaus could induce both potentiation and depression of subsets of synaptic inputs, resulting in translocation of a cell’s place field towards the position where the plateaus were evoked. Quantitative inference of the underlying learning rule from the experimental data revealed that the direction and magnitude of changes in synaptic strength depended on the current strength of each input at the time of a plateau, but not the degree of correlation between presynaptic and postsynaptic activity. In addition, bidirectional BTSP exhibited saturability (fig. S7) and state-dependence, two important features for stability of learned neuronal representations (3, 17, 27, 46, 47).
Peak Vm ramp amplitudes are not correlated with the total accumulated duration of plateau potentials across laps during plasticity induction, indicating a saturating nonlinearity. Shown are data from both silent cells in which de novo place fields were induced (light grey, n=25), and place cells in which a second plasticity induction translocated a pre-existing place field (dark grey, n=13). Explained variance (R²) and statistical significance (p < 0.05) reflect Pearson’s correlation and a 2-tailed null hypothesis test.
Together our experimental and modeling results establish bidirectional BTSP as a non-correlative mechanism for rapid and reversible learning. Rather than acting to autonomously reinforce pre-existing short timescale correlations between pre- and post-synaptic activity like standard Hebbian learning, bidirectional BTSP is capable of completely reshaping pairwise neuronal correlations in response to instructive input signals that promote dendritic plateau potentials. While it could be argued that BTSP is still a correlative form of plasticity due to its requirement that presynaptic spikes occur within a time window surrounding a plateau, this long timescale correlation is not between presynaptic spiking and the activation or output state of the postsynaptic neuron, but rather between presynaptic spiking and activation of a separate instructive input pathway that drives the postsynaptic plateau (12, 34, 35). Furthermore, since this long timescale coincidence between two inputs is only permissive for BTSP, but does not determine the sign of the plasticity, it cannot be classified as correlative in the same sense as classical Hebbian or even anti-Hebbian forms of plasticity. As suggested by our network model, if plateau potentials are generated by mismatch between a target instructive input and the output of the local circuit, as reflected by dendritically-targeted feedback inhibition, bidirectional BTSP can implement objective-based learning (48, 49). In addition to providing insight into the fundamental mechanisms of spatial memory formation in the hippocampus, these findings suggest new directions for general theories of biological learning and the development of artificial learning systems (44, 50).
Materials and Methods
In vivo intracellular electrophysiology
All experimental methods were approved by the Janelia Institutional Animal Care and Use Committee (Protocol 12-84 & 15-126). All experimental procedures in this study, including animal surgeries, behavioral training, treadmill and rig configuration, and intracellular recordings, were performed identically to a previous detailed report (13) in an overlapping set of experiments, and are briefly summarized here.
In vivo experiments were performed in 6-12 week-old mice of either sex. Craniotomies above the dorsal hippocampus for simultaneous whole-cell patch clamp and local field potential (LFP) recordings, as well as affixation of head bar implants were performed under deep anesthesia. Following a week of recovery, animals were prepared for behavioral training with water restriction, handling by the experimenter, and addition of running wheels to their home cages. Mice were trained to run on the cue-enriched linear treadmill for a dilute sucrose reward delivered through a licking port once per lap (∼187 cm). A MATLAB GUI interfaced with a custom microprocessor-controlled system for behavioral tracking and control. Position-dependent reward delivery and intracellular current injection were triggered by photoelectric sensors, and animal run velocity was measured by an encoder attached to one of the wheel axles. In a subset of experiments (Fig. 3), in addition to position-dependent step current to evoke plateau potentials, steady-state current was injected to depolarize neurons beyond threshold for axosomatic action potentials during plasticity induction laps. While steady-state depolarization of the soma is expected to attenuate along the path to distal dendrites (51), the pairing of back-propagating action potentials with synaptic inputs has been shown to significantly amplify dendritic depolarization by inactivating A-type potassium channels and activating voltage-gated sodium channels and NMDA-type glutamate receptors (52–55). Simulations of a biophysically-detailed CA1 place cell model with realistic morphology and distributions of dendritic ion channels (39) suggest that steady-state somatic depolarization of a silent CA1 pyramidal cell in vivo results in levels of distal dendritic depolarization comparable to place cells at the peak of their place field (fig. S6, A to C).
To establish whole-cell recordings from CA1 pyramidal neurons, an extracellular LFP electrode was lowered into the dorsal hippocampus using a micromanipulator until prominent theta-modulated spiking and increased ripple amplitude were detected. Then a glass intracellular recording pipette was lowered to the same depth while applying positive pressure. The intracellular solution contained (in mM): 134 K-Gluconate, 6 KCl, 10 HEPES, 4 NaCl, 0.3 MgGTP, 4 MgATP, 14 Tris-phosphocreatine, and in some recordings, 0.2% biocytin. Current-clamp recordings of intracellular membrane potential (Vm) were amplified and digitized at 20 kHz, without correction for liquid junction potential.
Place field analysis
To analyze subthreshold Vm ramps, action potentials were first removed from raw Vm traces and linearly interpolated, then the resulting traces were low-pass filtered (<3 Hz). For each of 100 equally sized spatial bins (1.87 cm), Vm ramp amplitudes were averaged across periods of 5-10 minutes of running laps on the treadmill. The spatially-binned ramp traces were then smoothed with a Savitzky-Golay filter with wrap-around. Ramp amplitude was quantified as the difference between the peak and the baseline (average of the 10% most hyperpolarized bins). For cells with a second place field induced, the same baseline Vm value determined from the period before the second induction was also used to quantify ramp amplitude after the second induction. Ramp width was quantified as the peak-normalized area under the curve. Plateau duration was estimated as the duration of intracellular step current injections, or as the full width at half maximum Vm in the case of spontaneous, naturally occurring plateaus. For each spatial bin, the elapsed time between traversal of that position and onset of a plateau was variable across induction laps, depending on lap-by-lap differences in run velocity (e.g. Fig. 1, C and F). To analyze the relationship between this time interval and changes in ramp amplitude, time relative to plateau onset (Fig. 1, E, H, J and K) was conservatively estimated as the minimum time delay across all induction laps. Since not all possible pairs of initial ramp amplitude and time delay relative to plateau onset were sampled in the experimental dataset, expected changes in ramp amplitude (Fig. 1K; Fig. 2, E and G; Fig. 3D; fig. S5; and fig. S8) were predicted from the sampled experimental or model data points by a two-dimensional Gaussian process regression and interpolation procedure using a rational quadratic covariance function, implemented in the open-source Python package scikit-learn (sklearn) (56, 57).
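The baseline, amplitude, and width definitions above can be illustrated with a minimal Python sketch. It assumes the ramp has already been spike-removed, low-pass filtered, spatially binned, and smoothed; the function name `quantify_ramp` and the reading of "peak-normalized area under the curve" as baseline-subtracted area divided by peak amplitude are assumptions made for illustration, not the authors' code.

```python
def quantify_ramp(ramp_mv, bin_width_cm=1.87, baseline_fraction=0.10):
    """Return (baseline, amplitude, width_cm) for a spatially binned Vm ramp.

    ramp_mv: list of mean Vm values (mV), one per spatial bin.
    """
    n_baseline = max(1, round(len(ramp_mv) * baseline_fraction))
    # Baseline: average of the most hyperpolarized (lowest) 10% of bins.
    baseline = sum(sorted(ramp_mv)[:n_baseline]) / n_baseline
    # Amplitude: peak minus baseline.
    amplitude = max(ramp_mv) - baseline
    if amplitude <= 0:
        return baseline, 0.0, 0.0
    # ASSUMPTION: width = area under the baseline-subtracted ramp / peak.
    area = sum((v - baseline) * bin_width_cm for v in ramp_mv)
    return baseline, amplitude, area / amplitude


# Illustrative input: a flat -60 mV trace with a 20-bin, 5 mV rectangular ramp.
example_ramp = [-60.0] * 40 + [-55.0] * 20 + [-60.0] * 40
baseline, amplitude, width_cm = quantify_ramp(example_ramp)
```

For a rectangular ramp this width equals the spatial extent of the ramp (20 bins of 1.87 cm), which is a quick sanity check on the definition.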
(A – D) Additional model variants with alternative values of place field width for presynaptic CA3 place cell inputs were also evaluated for their capability to predict experimentally measured bidirectional changes in Vm ramp amplitude by BTSP. Plasticity rules were estimated by two-dimensional interpolation from model data (see Materials and Methods). (A – C) Model predictions from variants of the weight-dependent model with 75 cm (A), 90 cm (B), or 105 cm (C) CA3 input place field widths. (D) Residual error of ramp depolarizations predicted by each model is averaged across spatial bins and cells.
Computational modeling
Two classes of mathematical models of the synaptic learning rule underlying bidirectional BTSP were built and optimized to predict the spatially tuned Vm ramp depolarizations of experimentally recorded CA1 place cells. All code (Python) necessary to reproduce the modeling results is open-source and publicly available (58, 59). The following components and notation were shared across all model variants. CA1 place cell ramp depolarization ΔV as a function of position x was modeled as a weighted sum of the spatial firing rates of a population of 200 CA3 place cell inputs with place fields spaced uniformly across a 187 cm circular track, with a background level of depolarization Vb subtracted. This background is equivalent to the level of activation that would be produced if all inputs had a uniform weight of 1.
The firing rate Ri of CA3 place cell i with place field peak location yi was modeled as a Gaussian function of position x, accounting for wraparound of the circular track with length ℓ:

The place field widths of CA3 inputs, controlled by σ, were set to have a full floor width (3 · √2 · σ) of 90 cm (half-width of ∼34 cm) throughout the study (60), though models tuned with alternative values of input field widths generated quantitatively similar predictions (fig. S8). Initial synaptic weights in silent cells before the first place field induction were assumed to have a value of 1. Initial synaptic weights before the second place field induction in neurons already expressing a place field were estimated from the recorded ramp depolarization by least squares approximation. The scaling factor a was calibrated such that if the synaptic weights of CA3 place cell inputs varied between 1 and 2.5 as a Gaussian function of their place field locations, the postsynaptic CA1 cell would express a Vm ramp with 90 cm width and 6 mV amplitude, consistent with previous measurements of place field properties and the degree of synaptic potentiation by BTSP (13).
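A minimal Python sketch of the shared input and ramp components described above follows. The track length, number of inputs, and the relation between full floor width and σ come from the text. The exact Gaussian form is an assumption: it is written here as exp(-(d/σ)²), which gives a full width at half maximum of roughly 35 cm, near the quoted ∼34 cm, whereas exp(-d²/(2σ²)) would give about 50 cm. The scaling factor `a` is left as a free argument rather than calibrated, and all function names are hypothetical.

```python
import math

TRACK_LENGTH_CM = 187.0
N_INPUTS = 200
INPUT_FIELD_WIDTH_CM = 90.0
# From the text: full floor width = 3 * sqrt(2) * sigma.
SIGMA_CM = INPUT_FIELD_WIDTH_CM / (3.0 * math.sqrt(2.0))
# Place field peaks spaced uniformly around the circular track.
PEAK_LOCS = [i * TRACK_LENGTH_CM / N_INPUTS for i in range(N_INPUTS)]


def circular_distance(x, y, track_length=TRACK_LENGTH_CM):
    """Shortest distance between two positions on a circular track."""
    d = abs(x - y) % track_length
    return min(d, track_length - d)


def ca3_rate(x, peak_loc, sigma=SIGMA_CM):
    """Normalized firing rate of one CA3 input at position x.

    ASSUMPTION: Gaussian written as exp(-(d / sigma)^2).
    """
    return math.exp(-(circular_distance(x, peak_loc) / sigma) ** 2)


def ramp(x, weights, a=1.0):
    """Weighted sum of input rates minus the background for uniform weights of 1."""
    drive = sum(w * ca3_rate(x, y) for w, y in zip(weights, PEAK_LOCS))
    background = sum(ca3_rate(x, y) for y in PEAK_LOCS)
    return a * (drive - background)
```

With all weights equal to 1 the ramp is zero at every position, matching the statement that silent cells start with uniform weights of 1.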
For each experimental recording, the position of an animal as a function of time during a plasticity induction lap j determined the firing rates of model CA3 inputs as a function of time Rj,i(t) = Rj,i(xj(t)) (Fig. 2B, fig. S1B, and fig. S2B). In accordance with experimental data (12, 39), the firing rates of model place cell inputs decreased to zero during periods when the animal stopped running. Postsynaptic dendritic plateau potentials during each induction lap were modeled as binary functions of time Pj(t). To generate long duration plasticity eligibility signals specific to each synaptic input i and instructive plasticity signals shared by all synapses, input firing rates and plateaus were convolved with causal temporal filters with exponential rise and decay (Fig. 2; fig. S1, C and D; fig. S2, C and D). The filtered signals are denoted by a raised tilde.
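The filtering step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the kernel is taken to be a difference of exponentials scaled to a peak of 1, the time constants (0.1 s rise, 1.0 s decay) are arbitrary illustrative values rather than fitted parameters, and the function names are hypothetical.

```python
import math


def rise_decay_kernel(tau_rise, tau_decay, dt, duration):
    """Causal kernel exp(-t/tau_decay) - exp(-t/tau_rise), scaled to peak 1.

    Requires tau_decay > tau_rise so that the kernel is non-negative.
    """
    n = int(round(duration / dt))
    k = [math.exp(-i * dt / tau_decay) - math.exp(-i * dt / tau_rise)
         for i in range(n)]
    peak = max(k)
    return [v / peak for v in k]


def causal_filter(signal, kernel):
    """Discrete causal convolution: output at t depends only on signal up to t."""
    out = [0.0] * len(signal)
    for t, s in enumerate(signal):
        if s == 0.0:
            continue
        for lag, kv in enumerate(kernel):
            if t + lag >= len(signal):
                break
            out[t + lag] += s * kv
    return out


# Illustrative use: a binary 300 ms plateau starting 2 s into a 6 s lap.
DT_S = 0.01
plateau = [0.0] * 200 + [1.0] * 30 + [0.0] * 370
kernel = rise_decay_kernel(tau_rise=0.1, tau_decay=1.0, dt=DT_S, duration=5.0)
instructive_signal = causal_filter(plateau, kernel)
```

Because the filter is causal, the filtered plateau is zero before plateau onset and outlasts the plateau itself, which is what lets a brief event interact with inputs active seconds later.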
Wj,i corresponds to the synaptic weight of each input i prior to each plasticity induction lap j. Changes in synaptic weight ΔWj,i were calculated once per lap by integrating a net rate of change of synaptic weight (defined separately for each model below) over the duration of lap j. For all models, additional nonlinear gain functions q± transformed the synaptic eligibility signals and contributed to the rate of change of synaptic weight. These scaled and rectified sigmoidal functions were parameterized separately for potentiation and depression (gainth,± and gainpeak,± in the parameter lists below).



Voltage-dependent model
The voltage-dependent model (Fig. 2A, fig. S1, and fig. S3) contained the following 11 free parameters:
1) eligibility signal τrise, 2) eligibility signal τdecay, 3) instructive signal τrise, 4) instructive signal τdecay, 5) ΔVmax, 6) k+, 7) k-, 8) gainth,+, 9) gainpeak,+, 10) gainth,-, 11) gainpeak,-.
In this model, distinct eligibility signals for synaptic potentiation and synaptic depression were oppositely sensitive to postsynaptic voltage. The instantaneous postsynaptic depolarization amplitude ΔVj(t) during plasticity induction lap j, including the large brief depolarization produced by the dendritic plateau potential, was normalized to a saturating amplitude ΔVmax and rectified. This normalized depolarization influenced the synaptic eligibility signals for potentiation and depression. The sigmoidal gain functions q+ and q-, as well as the plateau-related instructive signal, contributed to the net rate of change of synaptic weight, where k+ and k- are scalar learning rate constants. Synaptic weights were bounded such that Wj,i ≥ 0.
Weight-dependent model
The weight-dependent model (Fig. 2C, fig. S2, and fig. S4) contained the following 11 free parameters:
1) eligibility signal τrise, 2) eligibility signal τdecay, 3) instructive signal τrise, 4) instructive signal τdecay, 5) ΔWmax, 6) k+, 7) k-, 8) gainth,+, 9) gainpeak,+, 10) gainth,-, 11) gainpeak,-.
This model was formulated with the aim of obtaining a first order dependence of changes in synaptic weights on the current value of synaptic weight just before each plasticity-inducing plateau potential. We chose a two-state non-stationary kinetic model of the form shown in fig. S2A as a concrete example of a model that satisfies this dependency. Independent and finite synaptic resources at each synapse could occupy either an inactive or an active state, and the synaptic weight of each input Wj,i was defined as proportional to the occupancy of the active state. Since the occupancy of each state in a kinetic model constrains the flow of finite resources between states, the net change in synaptic weight at each input naturally depended on the current value of synaptic weight Wj,i. This occupancy could be equivalently interpreted as a proportion of synapses that have been potentiated among a subpopulation of inputs with shared place field locations and binary weights. Synaptic weights were constrained such that Wj,i = Wmax · wj,i, where 0 ≤ wj,i ≤ 1, and Wmax depends on a parameter ΔWmax that specifies a maximum change in weight above a baseline of 1: Wmax = 1 + ΔWmax.
In the absence of any voltage sensitivity of synaptic eligibility signals, both eligibility for potentiation and eligibility for depression depended only on presynaptic firing rate and were therefore equivalent in the formulation of this model variant (fig. S3, C and D). The sigmoidal gain functions q+ and q-, as well as the plateau-related instructive signal and the current normalized synaptic weight wj,i, contributed to the net rate of change of synaptic weight, where k+ and k- are scalar learning rate constants.
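The two-state kinetic idea can be sketched in Python as below. The functional form is an assumption consistent with the description (potentiation draws on the unoccupied fraction 1 − w, depression on the occupied fraction w, both gated by the instructive signal) and is not taken from the authors' equations. The mapping W = w · (1 + ΔWmax), the function names, and the argument names are likewise hypothetical.

```python
def weight_dependent_rate(w, elig, instructive, k_pot, k_dep, q_pot, q_dep):
    """ASSUMED net rate of change of the normalized weight w (0 <= w <= 1).

    Potentiation moves resources from the inactive state (1 - w) to the
    active state; depression moves them back from the active state (w).
    """
    potentiation = (1.0 - w) * k_pot * q_pot(elig)
    depression = w * k_dep * q_dep(elig)
    return (potentiation - depression) * instructive


def update_weight_over_lap(w, elig_trace, instr_trace, dt,
                           k_pot, k_dep, q_pot, q_dep):
    """Integrate the rate over one lap and apply the change once, at lap end."""
    dw = 0.0
    for elig, instr in zip(elig_trace, instr_trace):
        dw += weight_dependent_rate(w, elig, instr,
                                    k_pot, k_dep, q_pot, q_dep) * dt
    return min(1.0, max(0.0, w + dw))


def to_synaptic_weight(w, delta_w_max):
    """ASSUMPTION: W = w * W_max with W_max = 1 + delta_w_max."""
    return w * (1.0 + delta_w_max)
```

Under this form, the same eligibility and instructive signals potentiate a weak synapse and depress a strong one, with a fixed point at w = k+·q+ / (k+·q+ + k-·q-), which is the weight dependence the text describes.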
For both of the above model variants, the values of the bounded free parameters were automatically explored using a population-based version of the simulated annealing algorithm (59) to minimize an objective error function based on the difference between target and predicted ramp waveforms for each cell in the experimental dataset. Additional model variants with either fewer (7) or more (13) free parameters were also tested (fig. S5). In the simpler model variants, the nonlinear gain functions q± were not applied to the eligibility signals and the change in synaptic weights instead depended linearly on the amplitude of the synaptic eligibility signals. These models were unable to account for the depression component of bidirectional BTSP (fig. S5). In the more complex model variants, eligibility signals for potentiation and depression were filtered with distinct time constants (potentiation eligibility signal τrise and τdecay; depression eligibility signal τrise and τdecay). The additional complexity of these model variants did not result in improved predictions relative to the original set of models (fig. S5).
Goal-directed spatial learning model
To investigate the implications of bidirectional BTSP for reward learning by a population of CA1 place cells (Fig. 4), we constructed a network model composed of 1000 CA1 pyramidal cells, each receiving input from a population of 200 CA3 place cells with place fields spaced at regular intervals spanning the 187 cm circular track. Lap running was simulated at a constant run velocity of 30 cm/s. The synaptic weights at inputs from model CA3 place cells to model CA1 cells were controlled by the weight-dependent model of bidirectional BTSP described above. For this purpose, the 11 free parameters of that model were tuned to match synthetic data under the following constraints: 1) 5 consecutive induction laps with one 300 ms duration plateau per lap at a fixed location resulted in a place field ramp depolarization that peaked 10 cm before the location of plateau onset, had an asymmetric shape (80 cm rise, 40 cm decay), and had a peak amplitude of 8 mV; 2) subsequent plasticity induction at a location 90 cm away from the initial peak for 5 consecutive laps resulted in a 5 mV decrease in ramp amplitude at the initial peak location, and an 8 mV peak ramp amplitude at the new translocated peak position.
Before simulated exploration, all synaptic weights were initialized to a value of 1, which resulted in zero ramp depolarization in all model CA1 cells. Under these baseline conditions, each model CA1 neuron k had a probability pk(t) = pbasal = 0.0075 of emitting a single dendritic plateau potential in 1 second of running. During each 10 ms time step, this instantaneous probability pk(t) was used to weight biased coin flips to determine which cells would emit a plateau. If a cell emitted a plateau, it persisted for a fixed duration of 300 ms, and was followed by a 500 ms refractory period during which pk(t) was transiently set to zero.
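The stochastic plateau generation described above can be sketched in Python as follows. The 10 ms time step, 300 ms plateau duration, and 500 ms refractory period come from the text. Converting the per-second probability to a per-step probability by multiplying by the step duration is an assumption, and the function name `simulate_plateaus` is hypothetical.

```python
import random

DT_S = 0.010            # 10 ms time step
PLATEAU_STEPS = 30      # 300 ms plateau duration
REFRACTORY_STEPS = 50   # 500 ms refractory period
P_BASAL_PER_S = 0.0075  # basal plateau probability per second of running


def simulate_plateaus(n_steps, p_per_second, rng):
    """Return a binary plateau trace for one cell with a constant probability."""
    plateau = [0] * n_steps
    t = 0
    while t < n_steps:
        # ASSUMPTION: per-step probability = per-second probability * step size.
        if rng.random() < p_per_second * DT_S:
            end = min(t + PLATEAU_STEPS, n_steps)
            for s in range(t, end):
                plateau[s] = 1
            # Skip the refractory period, during which no plateau can start.
            t = end + REFRACTORY_STEPS
        else:
            t += 1
    return plateau


# Illustrative use: one lap of about 6.2 s (187 cm at 30 cm/s) for one cell.
rng = random.Random(0)
lap_steps = int(round(187.0 / 30.0 / DT_S))
lap_trace = simulate_plateaus(lap_steps, P_BASAL_PER_S, rng)
```

In the full model the probability is not constant but is recomputed at each step from the population activity, so `p_per_second` would become a function of time.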
After the first lap, CA1 neurons that had emitted at least one plateau and had induced synaptic potentiation produced nonzero ramp depolarizations (Fig. 4C). The output firing rates Rk(t) of each CA1 neuron k were considered to be proportional to their ramp depolarizations Vk(t). The activity Rinh(t) of a single inhibitory feedback element was set to be a normalized sum of the activity of the entire population of CA1 pyramidal neurons:
Rinh(t) = b · Σk Rk(t),
where the normalization constant b was chosen such that the activity of the inhibitory feedback neuron would be 1 if every CA1 pyramidal neuron expressed a single place field and as a population their place field peak locations uniformly tiled the track. Then, the probability that any CA1 neuron k would emit a plateau pk(t) was negatively regulated by the total population activity via the inhibitory feedback term Rinh(t):
where γbasal defined a target normalized population activity (set to 0.3) and f is a descending sigmoid (Fig. 4B).
In some laps, a specific location was assigned as the target of a goal-directed search. To mimic the activation of an instructive input from entorhinal cortex signaling the presence of the goal, for a period of 500 ms starting at the goal location, the probability that a CA1 neuron would emit a plateau potential pk(t) was transiently increased. Within the goal region, the relationship between pk(t) and Rinh(t) was instead:
where pgoal is an elevated peak plateau probability of 0.03 per second, γgoal is an elevated target normalized population activity (set to 0.5) and f is a descending sigmoid (Fig. 4B).
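One way to picture the feedback regulation of plateau probability is the Python sketch below. Only the constants (pbasal, pgoal, γbasal, γgoal) and the statement that f is a descending sigmoid come from the text. The specific sigmoid shape, its slope, and the way the population activity is divided by the target activity before being passed to f are assumptions made for illustration, and all names are hypothetical.

```python
import math

P_BASAL, P_GOAL = 0.0075, 0.03      # peak plateau probabilities per second
GAMMA_BASAL, GAMMA_GOAL = 0.3, 0.5  # target normalized population activity


def descending_sigmoid(x, slope=10.0):
    """ASSUMED shape: logistic rescaled to equal 1 at x = 0 and 0 at x >= 1."""
    def raw(v):
        return 1.0 / (1.0 + math.exp(slope * (v - 0.5)))
    x = min(1.0, max(0.0, x))
    return (raw(x) - raw(1.0)) / (raw(0.0) - raw(1.0))


def inhibitory_activity(ca1_rates, b):
    """Normalized sum of CA1 population activity."""
    return b * sum(ca1_rates)


def plateau_probability(r_inh, in_goal_region):
    """ASSUMED rule: the peak probability is scaled down as population
    activity approaches the target level for the current region."""
    if in_goal_region:
        p_peak, gamma = P_GOAL, GAMMA_GOAL
    else:
        p_peak, gamma = P_BASAL, GAMMA_BASAL
    return p_peak * descending_sigmoid(r_inh / gamma)
```

Under these assumptions, plateaus stop once population activity reaches the target, and the higher target inside the goal region allows additional cells to acquire place fields there.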
Acknowledgements
We are grateful to Nicolas Brunel and Karel Svoboda for discussions and comments on the manuscript, Grace Ng for contributing to software development, Ivan Raikov for technical assistance with high-performance computing, and Kristopher Bouchard at LBNL for sharing large-scale computing resources provided by the National Energy Research Scientific Computing Center, a Department of Energy Office of Science User Facility (DE-AC02-05CH11231). This work was also made possible by computing allotments from NSF (XSEDE Comet and NCSA Blue Waters) and supported by HHMI and NIH (BRAIN U19 award NS104590).