Abstract
Recent experiments and theories of human decision-making suggest that positive and negative prediction errors are processed and encoded differently by serotonin and dopamine, with serotonin possibly serving to oppose dopamine and protect against risky decisions. We introduce a temporal difference (TD) model of human decision-making to account for these features. Our model involves two opposing counsels, an optimistic learning system and a pessimistic learning system, whose predictions are integrated over time to control how potential decisions compete to be selected. Our model predicts that human decision-making can be decomposed along two dimensions: the degree to which the individual is sensitive to (1) risk and (2) uncertainty. In addition, we demonstrate that the model can learn about reward expectations and uncertainty, and can provide information about reaction time despite not modeling these variables directly. Lastly, we simulate a recent experiment to show how updates of the two learning systems could relate to dopamine and serotonin transients, thereby providing a mathematical formalism for serotonin’s hypothesized role as an opponent to dopamine. This new model should be useful for future experiments on human decision-making.
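As a purely illustrative sketch of the two-counsel idea, and not the model specified in this paper, the snippet below tracks an optimistic and a pessimistic value estimate for a single option. It assumes, for illustration only, that the two systems differ in how strongly they weight positive versus negative prediction errors; the function names, learning rates, and reward sequence are all hypothetical.

```python
def update(value, reward, lr_pos, lr_neg):
    """One TD-style update for a single-state task.

    Assumption (not from the paper): the learning rate depends on the
    sign of the prediction error.
    """
    delta = reward - value                  # prediction error
    lr = lr_pos if delta > 0 else lr_neg    # asymmetric weighting
    return value + lr * delta


def run(rewards, lr_fast=0.3, lr_slow=0.1):
    """Run an optimistic and a pessimistic learner on the same rewards."""
    optimistic = pessimistic = 0.0
    for r in rewards:
        # Optimistic learner: weights positive errors more heavily.
        optimistic = update(optimistic, r, lr_pos=lr_fast, lr_neg=lr_slow)
        # Pessimistic learner: weights negative errors more heavily.
        pessimistic = update(pessimistic, r, lr_pos=lr_slow, lr_neg=lr_fast)
    return optimistic, pessimistic


# Hypothetical reward sequence: alternating outcomes with mean 0.5.
rewards = [1.0, 0.0] * 200
opt, pes = run(rewards)
```

Under these assumptions the optimistic estimate settles above the mean reward and the pessimistic estimate below it, so the spread between the two carries information about outcome variability as well as expected value.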
Author summary
Computational models have helped researchers understand how humans make decisions. Errors in model-derived predictions have long been known to relate to dopamine, but recent experiments support adding nuance to how prediction errors are modeled. In particular, prediction errors appear to be multidimensional and related to serotonin and dopamine in nonlinear ways. Accordingly, we introduce a simple model of decision-making to capture these features and explore the implications of such a model. This model should provide a conceptual framework for interpreting behavioral and neural data from decision-making experiments.
Competing Interest Statement
The authors have declared no competing interest.