PT - JOURNAL ARTICLE
AU - Kevin J. Miller
AU - Matthew M. Botvinick
AU - Carlos D. Brody
TI - Value Representations in Orbitofrontal Cortex Drive Learning, but not Choice
AID - 10.1101/245720
DP - 2018 Jan 01
TA - bioRxiv
PG - 245720
4099 - http://biorxiv.org/content/early/2018/01/17/245720.short
4100 - http://biorxiv.org/content/early/2018/01/17/245720.full
AB - As humans and animals experience the world, they learn to associate states and actions with the expected values of the reward that is likely to follow [1–3]. Neural correlates of expected value are found in many brain regions, including the orbitofrontal cortex (OFC) [4–9]. While OFC value representations have been identified across many tasks and species [10–15], their computational role remains controversial [16–18]. One influential hypothesis holds that they drive value-based choosing: The OFC represents the expected values of available options, and choices are made by comparing these values to one another [4, 7, 9]. A contrasting hypothesis holds that they drive learning: The OFC represents the expected values of immediately impending outcomes, which are compared to rewards actually received, so as to learn and adapt expectations to match the world [5, 6, 19, 20]. In common laboratory tasks the items to be decided between are also the items to be learned about, making the two hypothesized roles difficult to distinguish. Here, we use a recently-developed multi-step task for rats [21] that separates choosing from learning. In a first step, rats choose one of two ports (“choice ports”) whose expected values are computed using planning, and are not learned. In the second step, rats are led to one of two other ports (“outcome ports”) which are not chosen between, but whose expected values are learned based on reward history. We found relatively weak OFC encoding of choice port values, needed for choosing but not learning, but far stronger encoding of outcome port values, needed for learning but not choosing. Moreover, temporally-specific silencing of OFC during outcome port entry was sufficient to disrupt behavior, and the nature of this disruption was consistent with impairment of a value learning process, but was not consistent with impairment of a choice process. We therefore suggest that value representations in the OFC directly drive learning, but do not directly drive choice.