TY - CONF
T1 - Model-Based Reinforcement Learning via Imagination with Derived Memory
AU - Mu, Yao
AU - Zhuang, Yuzheng
AU - Wang, Bin
AU - Zhu, Guangxiang
AU - Liu, Wulong
AU - Chen, Jianyu
AU - Luo, Ping
AU - Li, Shengbo Eben
AU - Zhang, Chongjie
AU - Hao, Jianye
N1 - Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Model-based reinforcement learning aims to improve the sample efficiency of policy learning by modelling the dynamics of the environment. Recently, latent dynamics models have been further developed to enable fast planning in a compact space. Such a model summarizes the high-dimensional experiences of an agent, mimicking the memory function of humans. Learning policies via imagination with the latent model shows great potential for solving complex tasks. However, considering only memories from true experiences during imagination could limit its advantages. Inspired by the memory prosthesis proposed by neuroscientists, we present a novel model-based reinforcement learning framework called Imagining with Derived Memory (IDM). It enables the agent to learn a policy from enriched, diverse imagination with prediction-reliability weighting, thus improving sample efficiency and policy robustness. Experiments on various high-dimensional visual control tasks in the DMControl benchmark demonstrate that IDM outperforms previous state-of-the-art methods in terms of policy robustness and further improves the sample efficiency of the model-based method.
AB - Model-based reinforcement learning aims to improve the sample efficiency of policy learning by modelling the dynamics of the environment. Recently, latent dynamics models have been further developed to enable fast planning in a compact space. Such a model summarizes the high-dimensional experiences of an agent, mimicking the memory function of humans. Learning policies via imagination with the latent model shows great potential for solving complex tasks. However, considering only memories from true experiences during imagination could limit its advantages. Inspired by the memory prosthesis proposed by neuroscientists, we present a novel model-based reinforcement learning framework called Imagining with Derived Memory (IDM). It enables the agent to learn a policy from enriched, diverse imagination with prediction-reliability weighting, thus improving sample efficiency and policy robustness. Experiments on various high-dimensional visual control tasks in the DMControl benchmark demonstrate that IDM outperforms previous state-of-the-art methods in terms of policy robustness and further improves the sample efficiency of the model-based method.
UR - http://www.scopus.com/inward/record.url?scp=85131830224&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85131830224
T3 - Advances in Neural Information Processing Systems
SP - 9493
EP - 9505
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural information processing systems foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -