RT Journal Article SR Electronic T1 Value Representations in Orbitofrontal Cortex Drive Learning, but not Choice JF bioRxiv FD Cold Spring Harbor Laboratory SP 245720 DO 10.1101/245720 A1 Kevin J. Miller A1 Matthew M. Botvinick A1 Carlos D. Brody YR 2018 UL http://biorxiv.org/content/early/2018/01/17/245720.abstract AB As humans and animals experience the world, they learn to associate states and actions with the expected values of the reward that is likely to follow [1–3]. Neural correlates of expected value are found in many brain regions, including the orbitofrontal cortex (OFC) [4–9]. While OFC value representations have been identified across many tasks and species [10–15], their computational role remains controversial [16–18]. One influential hypothesis holds that they drive value-based choosing: The OFC represents the expected values of available options, and choices are made by comparing these values to one another [4, 7, 9]. A contrasting hypothesis holds that they drive learning: The OFC represents the expected values of immediately impending outcomes, which are compared to rewards actually received, so as to learn and adapt expectations to match the world [5, 6, 19, 20]. In common laboratory tasks the items to be decided between are also the items to be learned about, making the two hypothesized roles difficult to distinguish. Here, we use a recently developed multi-step task for rats [21] that separates choosing from learning. In a first step, rats choose one of two ports (“choice ports”) whose expected values are computed using planning, and are not learned. In the second step, rats are led to one of two other ports (“outcome ports”) which are not chosen between, but whose expected values are learned based on reward history. We found relatively weak OFC encoding of choice port values, needed for choosing but not learning, but far stronger encoding of outcome port values, needed for learning but not choosing.
Moreover, temporally specific silencing of OFC during outcome port entry was sufficient to disrupt behavior, and the nature of this disruption was consistent with impairment of a value learning process, but was not consistent with impairment of a choice process. We therefore suggest that value representations in the OFC directly drive learning, but do not directly drive choice.