Abstract
We describe a framework where a biologically plausible spiking neural network mimicking hippocampal layers learns a cognitive map known as the successor representation. We show analytically how, on the algorithmic level, the learning follows the TD(λ) algorithm, which emerges from the underlying spike-timing dependent plasticity rule. We then analyze the implications of this framework, uncovering how behavioural activity and experience replays can play complementary roles when learning the representation of the environment, how we can learn relations over behavioural timescales with synaptic plasticity acting on the range of milliseconds, and how the learned representation can be flexibly encoded by allowing state-dependent delay discounting through neuromodulation and altered firing rates.
Competing Interest Statement
The authors have declared no competing interest.