Abstract
To make decisions, animals must evaluate outcomes of candidate choices by accessing memories of relevant experiences. Recent theories suggest that phenomena such as habits and compulsion can be reinterpreted as selectively omitting such computations. Yet little is known about the more granular question of which specific experiences are considered or ignored during deliberation, a selection that ultimately governs decisions. Here, we propose a normative theory to predict not just whether but which memories should be accessed at each time to enable the most rewarding future decisions. Using nonlocal “replay” of spatial locations in hippocampus as a window into memory access, we simulate a spatial navigation task where an agent accesses memories of locations sequentially, in order of the expected utility of the computation: that is, how much more reward would be earned due to the better choices it enables. We show that our theory offers a unifying account of a large range of hitherto disconnected findings in the place cell literature, such as the balance of forward and reverse replay, biases in the replayed content, and effects of experience. We suggest that the various types of nonlocal events during behavior and rest reflect different instances of a single choice evaluation operation. This view unifies seemingly disparate proposed functions of replay, including planning, learning and consolidation, and suggests that dysfunction of this operation may explain issues like rumination and craving.
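The idea of accessing memories in order of expected utility can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the text above: the five-state linear track, the tabular value estimates `Q`, the greedy policy, and the helper names `backup_gain` and `prioritized_replay`. Here the utility of replaying a stored experience is approximated as the improvement in estimated return at the replayed location once its value is updated.

```python
import numpy as np

def backup_gain(Q, experience, gamma=0.9):
    """Illustrative utility of replaying one stored experience (s, a, r, s_next).

    Assumption: utility is the improvement in estimated return at state s
    when the agent acts greedily on the updated values instead of the old ones.
    """
    s, a, r, s_next = experience
    target = r + gamma * Q[s_next].max()   # one-step value update for (s, a)
    q_new = Q[s].copy()
    q_new[a] = target
    old_action = Q[s].argmax()             # greedy choice before the update
    return q_new.max() - q_new[old_action]

def prioritized_replay(Q, memory, gamma=0.9, tol=1e-9, max_steps=100):
    """Replay stored experiences one at a time, highest utility first.

    Q is updated in place. Returns the sequence of replayed states.
    """
    order = []
    for _ in range(max_steps):
        gains = [backup_gain(Q, e, gamma) for e in memory]
        best = int(np.argmax(gains))
        if gains[best] <= tol:             # no replay would improve choices
            break
        s, a, r, s_next = memory[best]
        Q[s, a] = r + gamma * Q[s_next].max()
        order.append(s)
    return order

# Hypothetical linear track: states 0..4, actions 0 (left) and 1 (right),
# reward 1.0 on entering the final state. All values start at zero.
Q = np.zeros((5, 2))
memory = [(0, 1, 0.0, 1), (1, 1, 0.0, 2), (2, 1, 0.0, 3), (3, 1, 1.0, 4)]
order = prioritized_replay(Q, memory)
print(order)  # [3, 2, 1, 0]
```

In this toy setting, the replayed locations run backward from the rewarded end of the track, because each update makes the preceding location's update newly useful. The sketch omits any weighting by how soon the agent expects to visit each location, so it should be read as a simplified illustration of utility-ordered memory access rather than as the simulation described above.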