TY - JOUR
T1 - R-learning in actor-critic model offers a biologically relevant mechanism for sequential decision-making
AU - Shuvaev, Sergey
AU - Starosta, Sarah
AU - Kvitsiani, Duda
AU - Kepecs, Adam
AU - Koulakov, Alexei
N1 - Funding Information:
We thank Aubrey Siebels for technical assistance with conducting the animal experiments. Funding in direct support of this work: The Swartz Foundation; DFG Grant STA 1544/1-1. Additional revenues related to this work: Travel support by The Simons Center for Quantitative Biology, The Gatsby Charitable Foundation, Burroughs Wellcome Fund, Google DeepMind, and Simons Foundation.
Publisher Copyright:
© 2020 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2020
Y1 - 2020
AB - When should you continue with your ongoing plans and when should you instead decide to pursue better opportunities? We show in theory and experiment that such stay-or-leave decisions are consistent with deep R-learning both behaviorally and neuronally. Our results suggest that real-world agents leave depleting resources when their reward rate falls below its exponential average, which, we argue, is a Bayes optimal rule in dynamic natural environments. Our work links reinforcement learning, the marginal value theorem and Bayesian inference approaches to offer a learning algorithm and a decision rule for making sequential stay-or-leave choices.
UR - http://www.scopus.com/inward/record.url?scp=85108452393&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85108452393
SN - 1049-5258
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
Y2 - 6 December 2020 through 12 December 2020
ER -