Abstract
The cerebellum has long been thought to perform error-based supervised learning via long-term depression (LTD) at synapses between parallel fibers and Purkinje cells (PCs). Since the discovery of multiple forms of synaptic plasticity other than LTD, recent studies have suggested that synergistic plasticity mechanisms could enhance the learning capability of the cerebellum. We have proposed that these mechanisms allow the cerebellar circuit to perform reinforcement learning (RL). However, a detailed spike-based implementation is still missing. In this study, we implemented a cerebellar spiking network as an actor-critic model based on known anatomical properties of the cerebellum. We confirmed that our model successfully learned a state value and solved the mountain car task, a simple RL benchmark. Furthermore, our model solved the delay eyeblink conditioning task using biologically plausible internal dynamics. This study presents the first cerebellar spiking network model capable of performing RL.
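For readers unfamiliar with the actor-critic scheme named above, the following is a minimal tabular sketch of one temporal-difference (TD) actor-critic update. It is not the spiking network described in this study: the discrete states, the softmax policy, the function names (`softmax`, `actor_critic_step`), and the learning rates and discount factor are all assumptions made for illustration only.

```python
import math


def softmax(prefs):
    """Convert a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(V, prefs, s, a, r, s_next, done,
                      alpha_v=0.1, alpha_p=0.1, gamma=0.9):
    """Apply one TD(0) actor-critic update in place and return the TD error.

    V      : list of state values (the critic), indexed by state
    prefs  : list of per-state action preference lists (the actor)
    s, a   : state visited and action taken
    r      : reward received
    s_next : resulting state; ignored for bootstrapping when done is True
    All hyperparameter values are illustrative assumptions.
    """
    # Critic: TD error is the gap between the bootstrapped target and V[s].
    target = r + (0.0 if done else gamma * V[s_next])
    delta = target - V[s]
    V[s] += alpha_v * delta

    # Actor: move preferences along the gradient of log pi(a|s), scaled by delta.
    probs = softmax(prefs[s])
    for b in range(len(prefs[s])):
        grad = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += alpha_p * delta * grad
    return delta
```

In this sketch a single TD error drives both updates: the critic moves its state-value estimate toward the bootstrapped target, and the actor raises the preference for the chosen action when the outcome was better than predicted and lowers it otherwise.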
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Reformatted for submission. Edits were made specifically to the Abstract, Introduction, and Discussion.