TY - JOUR
T1 - When models matter
T2 - Environmental demand guides the arbitration between model-based and model-free control
AU - Held, Leslie K.
AU - Lesage, Elise
AU - Kool, Wouter
AU - Braem, Senne
N1 - Publisher Copyright: © The Psychonomic Society, Inc. 2025.
PY - 2025
Y1 - 2025
AB - As humans, we often repeat previously rewarded actions without thinking, but we also possess the ability to plan ahead and simulate actions based on an internal model of the environment. These two types of control are commonly conceptualized as model-free versus model-based control. While there is a body of research on interindividual differences in using either strategy, we aimed to test whether people can learn to regulate which strategy to use based on environmental demand. We used a two-stage decision-making task where participants tracked the drifting rewards associated with two second-stage states. Each trial started with one of two possible first-stage states, each offering two choices that deterministically led to one of the second-stage states. Successful generalization between first-stage options indicated model-based control, while mere repetition of previously rewarded choices reflected model-free behavior. We manipulated how often participants (n = 140) were exposed to alternations versus repetitions of first-stage states. When these states frequently repeat, there is a reduced need to consult the transition structure, because it pays off to adopt model-free control and simply retake previously rewarded actions. Conversely, when first-stage states frequently alternate, it is more beneficial to adopt model-based control, considering the transition structure and generalizing reward outcomes between them. In line with our hypothesis, we show that participants exposed to more first-stage state alternations were more model-based in a test phase than participants exposed to more first-stage state repetitions. These findings suggest that people learn to arbitrate between different reinforcement-learning strategies consistent with a cost–benefit analysis sensitive to environmental demands.
KW - Dual-system RL
KW - Model-based
KW - Model-free
KW - Reinforcement learning
KW - Two-step task
UR - https://www.scopus.com/pages/publications/105018935248
U2 - 10.3758/s13415-025-01350-9
DO - 10.3758/s13415-025-01350-9
M3 - Article
C2 - 41083652
AN - SCOPUS:105018935248
SN - 1530-7026
JO - Cognitive, Affective and Behavioral Neuroscience
JF - Cognitive, Affective and Behavioral Neuroscience
ER -