PT - JOURNAL ARTICLE
AU - James, Ashwin
AU - Bethus, Ingrid
AU - Muzy, Alexandre
TI - Capturing Optimal and Suboptimal Behavior of Agents via Structure Learning of Their Internal Model
AID - 10.1101/2024.09.30.615767
DP - 2024 Jan 01
TA - bioRxiv
PG - 2024.09.30.615767
4099 - http://biorxiv.org/content/early/2024/10/01/2024.09.30.615767.short
4100 - http://biorxiv.org/content/early/2024/10/01/2024.09.30.615767.full
AB - This study introduces a novel framework for understanding the cognitive underpinnings of individual behavior, which often deviates from rational decision-making aimed at maximizing rewards in real-life scenarios. We propose a structure learning approach to infer an agent’s internal model, composed of a learning rule and an internal representation of the environment. Crucially, the combined contribution of these components — rather than their individual contribution — determines the overall optimality of the agent’s behavior. By exploring various combinations of learning rules and environment representations, we identify the most probable agent model structure for each individual. We apply this framework to analyze rats’ learning behavior in a free-choice task within a T-maze with return arms, evaluating different internal models based on optimal and suboptimal learning rules, along with multiple possible representations of the T-maze decision graph. Identifying the most likely agent model structure from the rats’ behavioral data reveals that slow-learning rats employ either a suboptimal or a moderately optimal agent model, whereas fast learners employ an optimal agent model. Using the inferred agent models for each rat, we explore the qualitative differences in their individual learning processes. Policy entropy, derived from the inferred agent models, also highlights variations in the balance between exploration and exploitation strategies among the rats.
Traditional reinforcement learning approaches to addressing suboptimal behavior focus separately on either suboptimal learning rules or flawed environment representations. Our approach jointly models these components, revealing that suboptimal decisions can arise from complex interactions between learning rules and environment representations within an agent’s internal model. This provides deeper insights into the cognitive mechanisms underlying real-world decision-making.
Competing Interest Statement: The authors have declared no competing interest.
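The abstract's use of policy entropy as a marker of the exploration–exploitation balance can be illustrated with a minimal sketch. This is not the paper's actual model: the softmax policy, inverse-temperature parameter, and the Q-values below are illustrative assumptions standing in for whatever policies the inferred agent models produce.

```python
import math

def softmax_policy(q_values, beta=1.0):
    """Map action values to choice probabilities (softmax; beta = inverse temperature)."""
    exps = [math.exp(beta * q) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def policy_entropy(probs):
    """Shannon entropy of a policy: high entropy = exploratory, low = exploitative."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical Q-values for the two T-maze arms.
explorer = softmax_policy([0.5, 0.5])              # indifferent between arms
exploiter = softmax_policy([2.0, 0.1], beta=3.0)   # strong preference for one arm

print(policy_entropy(explorer))   # near log(2), the maximum for two actions
print(policy_entropy(exploiter))  # near 0, indicating exploitation
```

An agent whose inferred policy stays near the log(2) ceiling across trials is still exploring; a policy whose entropy collapses toward zero has committed to one arm, which is one way the abstract's inter-individual differences could be quantified.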