Learning of sequential movements by neural network model with dopamine-like reinforcement signal

Exp Brain Res. 1998 Aug;121(3):350-4. doi: 10.1007/s002210050467.

Abstract

Dopamine neurons appear to code an error in the prediction of reward. They are activated by unpredicted rewards, are not influenced by predicted rewards, and are depressed when a predicted reward is omitted. After conditioning, they respond to reward-predicting stimuli in a similar manner. With these characteristics, the dopamine response strongly resembles the predictive reinforcement teaching signal of neural network models implementing the temporal difference learning algorithm. This study explored a neural network model that used a reward-prediction error signal strongly resembling dopamine responses for learning movement sequences. A different stimulus was presented in each step of the sequence and required a different movement reaction, and reward occurred at the end of the correctly performed sequence. The dopamine-like predictive reinforcement signal efficiently allowed the model to learn long sequences. By contrast, learning with an unconditional reinforcement signal required synaptic eligibility traces of longer and biologically less-plausible durations for obtaining satisfactory performance. Thus, dopamine-like neuronal signals constitute excellent teaching signals for learning sequential behavior.
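
The reward-prediction error described in the abstract can be illustrated with a minimal tabular temporal difference sketch. This is not the network used in the study: it omits movement selection and synaptic eligibility traces, and the names `td_errors`, `train`, and `onset_error`, along with the step count, learning rate `alpha`, and discount factor `gamma`, are assumptions chosen only for illustration. It shows the three response properties listed above: a positive error for an unpredicted reward, an error near zero for a fully predicted reward, and a negative error when a predicted reward is omitted.

```python
def td_errors(values, rewards, gamma=1.0):
    """Return the TD error at each step of one trial.

    values[t] is the predicted future reward at step t. The value after
    the final step is taken to be 0 (end of the sequence).
    """
    errors = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        errors.append(reward + gamma * next_value - values[t])
    return errors


def train(n_steps=5, n_trials=200, alpha=0.2, gamma=1.0):
    """Learn step values for a sequence rewarded only at its last step.

    Hypothetical setup: one distinct stimulus (state) per step, with all
    movements assumed correct, so only the prediction is learned.
    """
    values = [0.0] * n_steps
    rewards = [0.0] * (n_steps - 1) + [1.0]
    for _ in range(n_trials):
        # Each value moves a fraction alpha toward its TD target.
        for t, delta in enumerate(td_errors(values, rewards, gamma)):
            values[t] += alpha * delta
    return values, rewards


def onset_error(values, gamma=1.0):
    """TD error at the first stimulus, assuming a zero-valued baseline
    before it (the stimulus itself is assumed to be unpredictable)."""
    return gamma * values[0] - 0.0


if __name__ == "__main__":
    n_steps = 5
    values, rewards = train(n_steps=n_steps)

    # Before learning: the error appears only at the unpredicted reward.
    print("naive:   ", td_errors([0.0] * n_steps, rewards))
    # After learning: the reward is predicted, so errors are near zero.
    print("trained: ", td_errors(values, rewards))
    # Omitted reward: the error at the final step is negative.
    print("omitted: ", td_errors(values, [0.0] * n_steps))
    # The positive error has moved to the earliest predictive stimulus.
    print("onset:   ", onset_error(values))
```

In this sketch the error signal migrates from the reward to the first stimulus of the sequence as learning proceeds, which is the property that lets a predictive signal credit early steps of a long sequence without long-lasting traces.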

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Basal Ganglia / chemistry
  • Basal Ganglia / physiology
  • Dopamine / physiology*
  • Humans
  • Learning / physiology
  • Movement / physiology*
  • Neural Networks, Computer*
  • Neuronal Plasticity / physiology
  • Signal Transduction / physiology
  • Time Factors

Substances

  • Dopamine