RT Journal Article
SR Electronic
T1 Exploration and recency as the main proximate causes of probability matching: a reinforcement learning analysis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 104752
DO 10.1101/104752
A1 Carolina Feher da Silva
A1 Camila Gomes Victorino
A1 Nestor Caticha
A1 Marcus Vinícius Chrysóstomo Baldo
YR 2017
UL http://biorxiv.org/content/early/2017/01/31/104752.abstract
AB Researchers have not yet reached a consensus on why human participants perform suboptimally, matching probabilities instead of maximizing, in a probability learning task. The most influential explanation is that participants search for patterns in the random sequence of outcomes, but it is unclear how pattern search produces probability matching. Other explanations do not take into account how reinforcement learning shapes people's choices. This study aimed to investigate probability matching from a reinforcement learning perspective. We collected behavioral data from 84 young adult participants who performed a probability learning task wherein the most frequent outcome was rewarded with 0.7 probability. We then analyzed the data using a reinforcement learning model that searches for patterns. The model predicts that pattern search may slow down learning, and that exploration (making random choices to learn more about the environment) and recency (discounting early experiences to account for a changing environment) may also impair performance. Our analysis estimates that 85% (95% HDI [76, 94]) of participants searched for patterns and believed that each trial outcome depended on one or two previous ones. The estimated impact of pattern search on performance was, however, only 6%, while those of exploration and recency were 19% and 13%, respectively. This suggests that probability matching is caused by uncertainty about how outcomes are generated, which leads to pattern search, exploration, and recency.
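
The abstract defines exploration (random choices made to learn about the environment) and recency (discounting early experiences). The sketch below is not the authors' pattern-search model; it is a minimal, assumed illustration of how those two mechanisms, here modeled as an epsilon-greedy rule and a constant learning rate, can pull a simple reinforcement learner away from always choosing the better option in a 0.7/0.3 probability learning task. All parameter names and values are hypothetical.

    # Minimal sketch (assumed, illustrative only): a simple RL agent on a
    # two-option probability learning task. "Exploration" is epsilon-greedy
    # choice; "recency" is a constant learning rate alpha, which exponentially
    # discounts older outcomes.
    import random

    def simulate(n_trials=1000, p_reward=0.7, alpha=0.3, epsilon=0.2, seed=0):
        rng = random.Random(seed)
        q = [0.0, 0.0]            # value estimates for the two options
        best_choices = 0
        for _ in range(n_trials):
            # exploration: with probability epsilon, choose at random
            if rng.random() < epsilon:
                choice = rng.randrange(2)
            else:
                choice = 0 if q[0] >= q[1] else 1
            # option 0 is the more frequently rewarded option (p = 0.7)
            p = p_reward if choice == 0 else 1 - p_reward
            reward = 1.0 if rng.random() < p else 0.0
            # recency: constant alpha weights recent outcomes more heavily
            q[choice] += alpha * (reward - q[choice])
            best_choices += (choice == 0)
        return best_choices / n_trials

    if __name__ == "__main__":
        # More exploration and stronger recency tend to lower the rate of
        # maximizing choices (the optimal strategy picks option 0 every trial).
        for eps, a in [(0.0, 0.1), (0.2, 0.1), (0.2, 0.5)]:
            rate = simulate(epsilon=eps, alpha=a)
            print(f"epsilon={eps}, alpha={a}: best-option rate = {rate:.2f}")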