Abstract
Much research focuses on how the basal ganglia (BG) and dopamine (DA) contribute to reward-driven behavior. But BG circuitry is notoriously complex, with two opponent pathways interacting via several disinhibitory mechanisms, which are in turn modulated by DA. Building on earlier models, we propose a new model, OpAL*, to assess the normative advantages of such circuitry in cost-benefit decision making. OpAL* dynamically modulates DA as a function of learned reward statistics, differentially amplifying the striatal pathway most specialized for the environment. OpAL* exhibits robust advantages over traditional and alternative BG models across a range of environments, particularly those with sparse reward. These advantages depend on opponent and nonlinear Hebbian plasticity mechanisms previously thought to be pathological. Finally, OpAL* captures patterns of risky choice arising from manipulations of DA and environmental richness across species, suggesting that such choice patterns result from a normative biological mechanism.
Competing Interest Statement
The authors have declared no competing interest.