PT - JOURNAL ARTICLE
AU - Ruggero Basanisi
AU - Andrea Brovelli
AU - Emilio Cartoni
AU - Gianluca Baldassarre
TI - A spiking neural-network model of goal-directed behaviour
AID - 10.1101/867366
DP - 2019 Jan 01
TA - bioRxiv
PG - 867366
4099 - http://biorxiv.org/content/early/2019/12/06/867366.short
4100 - http://biorxiv.org/content/early/2019/12/06/867366.full
AB - In mammals, goal-directed and planning processes support flexible behaviour that can cope with new situations or changed conditions which cannot be tackled through more efficient but rigid habitual behaviours. Within the Bayesian modelling approach to brain and behaviour, probabilistic models have been proposed that perform planning as probabilistic inference. Recently, some models have started to address an important challenge for this approach: grounding such processes in the computations implemented by spiking networks in the brain. Here we propose a model of goal-directed behaviour that has a probabilistic interpretation and is centred on a recurrent spiking neural network representing the world model. The model builds on previous proposals of spiking neurons and plasticity rules with a probabilistic interpretation, and presents these novelties at the system level: (a) the world model is learnt in parallel with its use for planning, and an arbitration mechanism decides whether to exploit the world-model knowledge through planning or to explore, on the basis of an entropy-based measure of confidence in the world-model knowledge; (b) the world model is a hidden Markov model (HMM) able to simulate sequences of states and actions, so planning selects actions through the same neural generative process used to predict states; (c) the world model learns the hidden causes of observations, and their temporal dependencies, through a biologically plausible unsupervised learning mechanism.
The model is tested on a visuomotor learning task and validated by comparing its behaviour with the performance and reaction times of human participants solving the same task. The model represents a further step towards the construction of an autonomous architecture bridging goal-directed behaviour as probabilistic inference to brain-like computations.
Author summary - Goal-directed behaviour relies on brain processes that support planning actions based on the prediction of their consequences before performing them in the environment. An important computational modelling approach to these processes sees the brain as a probabilistic machine in which goal-directed processes rely on probability distributions and operations on them. An important challenge for this approach is to explain how these distributions and operations might be grounded in the brain's spiking neurons and learning processes. Here we propose a hypothesis of how this might happen by presenting a computational model of goal-directed processes based on artificial spiking neural networks. The model presents three main novelties. First, it can plan even while it is still learning the consequences of actions, by deciding whether to plan or to explore the environment based on how confident it is in its predictions. Second, it is able to ‘think’ of alternative possible actions, and their consequences, by relying on the low-level stochasticity of neurons. Third, it can learn to anticipate the consequences of actions in an autonomous fashion, based on experience. Overall, the model represents a novel hypothesis on how goal-directed behaviour might rely on the stochastic spiking processes and plasticity mechanisms of brain neurons.