Abstract
Humans and animals make predictions about the rewards they expect to receive in different situations. In formal models of behavior, these predictions are known as "value representations", and they play two very different roles. First, they drive choice: the expected values of available options are compared to one another, and the best is selected. Second, they support learning: expected values are compared to the rewards actually received, and future expectations are updated accordingly. A fundamental unanswered question is whether these two functions are mediated by the same or by different neural mechanisms. Here we employ a recently developed multi-step task for rats that cleanly separates learning from choosing. We address the role of the orbitofrontal cortex (OFC), a key player in value-based cognition. Electrophysiological recordings and optogenetic perturbations indicate that, contrary to prominent theories, the OFC does not directly drive choices. Instead, it supplies value information to a learning process that updates choice mechanisms elsewhere in the brain. This result places important constraints on neural architectures for learning and choosing.