Abstract
Animals’ choice behavior is characterized by two main tendencies: taking actions that led to rewards and repeating past actions. Theory suggests these strategies may be reinforced by different types of dopaminergic teaching signals: reward prediction error (RPE) to reinforce value-based associations and movement-based action prediction errors to reinforce value-free repetitive associations. Here we use an auditory-discrimination task in mice to show that movement-related dopamine activity in the tail of the striatum encodes the hypothesized action prediction error signal. Causal manipulations reveal that this prediction error serves as a value-free teaching signal that supports learning by reinforcing repeated associations. Computational modeling and experiments demonstrate that action prediction errors cannot support reward-guided learning, but when paired with the RPE circuitry they serve to consolidate stable sound-action associations in a value-free manner. Together, we show that there are two types of dopaminergic prediction errors that work in tandem to support learning.
Competing Interest Statement
The authors have declared no competing interest.