Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing

PLoS Comput Biol. 2017 Aug 11;13(8):e1005684. doi: 10.1371/journal.pcbi.1005684. eCollection 2017 Aug.

Abstract

Previous studies suggest that factual learning, that is, learning from obtained outcomes, is biased, such that participants preferentially take into account positive, as compared to negative, prediction errors. However, whether prediction error valence also affects counterfactual learning, that is, learning from forgone outcomes, is unknown. To address this question, we analysed the performance of two groups of participants on reinforcement learning tasks using a computational model that was adapted to test whether prediction error valence influences learning. We carried out two experiments: in the factual learning experiment, participants learned from partial feedback (i.e., the outcome of the chosen option only); in the counterfactual learning experiment, participants learned from complete feedback information (i.e., the outcomes of both the chosen and the unchosen options were displayed). In the factual learning experiment, we replicated previous findings of a valence-induced bias, whereby participants learned preferentially from positive, relative to negative, prediction errors. In contrast, for counterfactual learning, we found the opposite valence-induced bias: negative prediction errors were preferentially taken into account, relative to positive ones. When considering valence-induced bias in the context of both factual and counterfactual learning, it appears that people tend to preferentially take into account information that confirms their current choice.
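To make the modelling idea concrete, below is a minimal sketch of a valence-dependent Rescorla-Wagner update of the general kind the abstract describes, with separate learning rates for positive and negative prediction errors on the chosen (factual) and unchosen (counterfactual) options. The function name, the two-option task structure, the ±1 outcome coding, and the numerical learning rates are illustrative assumptions, not the paper's fitted parameters.

```python
import numpy as np

# Illustrative (not fitted) learning rates: factual learning weights positive
# prediction errors more heavily; counterfactual learning weights negative ones more.
ALPHA_POS_FACTUAL = 0.30
ALPHA_NEG_FACTUAL = 0.10
ALPHA_POS_COUNTERFACTUAL = 0.10
ALPHA_NEG_COUNTERFACTUAL = 0.30


def update_values(q, choice, outcome_chosen, outcome_unchosen=None):
    """One trial of a valence-dependent value update for a two-option task.

    The chosen option's value moves toward its obtained outcome (factual
    learning); when the forgone outcome is shown, the unchosen option's
    value moves toward it (counterfactual learning). The learning rate
    applied to each update depends on the sign of the prediction error.
    """
    q = q.copy()
    unchosen = 1 - choice  # two-option task assumed

    # Factual update: larger step for positive prediction errors.
    pe = outcome_chosen - q[choice]
    q[choice] += (ALPHA_POS_FACTUAL if pe > 0 else ALPHA_NEG_FACTUAL) * pe

    # Counterfactual update (complete-feedback condition only):
    # larger step for negative prediction errors.
    if outcome_unchosen is not None:
        pe_cf = outcome_unchosen - q[unchosen]
        q[unchosen] += (ALPHA_POS_COUNTERFACTUAL if pe_cf > 0
                        else ALPHA_NEG_COUNTERFACTUAL) * pe_cf

    return q


# Example trial: option 0 was chosen and paid off; option 1 would have lost.
q_values = update_values(np.zeros(2), choice=0,
                         outcome_chosen=1.0, outcome_unchosen=-1.0)
print(q_values)  # [ 0.3 -0.3]: both updates use the larger, choice-confirming rate
```

Under this parameterisation, the updates that confirm the current choice (a positive factual prediction error or a negative counterfactual one) receive the larger learning rate, which is the confirmation-bias interpretation offered in the abstract's final sentence.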

MeSH terms

  • Adult
  • Computational Biology
  • Decision Making / physiology*
  • Feedback, Psychological / physiology*
  • Female
  • Humans
  • Learning / physiology*
  • Male
  • Reinforcement, Psychology*
  • Task Performance and Analysis
  • Young Adult