Context-aware policy reuse

  • Siyuan Li
  • Fangda Gu
  • Guangxiang Zhu
  • Chongjie Zhang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Transfer learning can greatly speed up reinforcement learning for a new task by leveraging policies of relevant tasks. Existing works of policy reuse either focus on selecting a single best source policy for reuse without considering contexts, or fail to guarantee learning an optimal policy for a target task. To improve transfer efficiency and guarantee optimality, we develop a novel policy reuse method, called Context-Aware Policy reuSe (CAPS), that enables multi-policy reuse. Our method learns when and which source policy is best for reuse, as well as when to terminate its reuse. CAPS provides theoretical guarantees in convergence and optimality for both source policy selection and target task learning. Empirical results on a grid-based navigation domain and the Pygame Learning Environment demonstrate that CAPS significantly outperforms other state-of-the-art policy reuse methods.

Original language: English
Title of host publication: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 989-997
Number of pages: 9
ISBN (Electronic): 9781510892002
State: Published - 2019
Event: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 - Montreal, Canada
Duration: May 13 2019 to May 17 2019

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 2
ISSN (Print): 1548-8403
ISSN (Electronic): 1558-2914

Conference

Conference: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Country/Territory: Canada
City: Montreal
Period: 05/13/19 to 05/17/19

Keywords

  • Policy reuse
  • Reinforcement learning
  • Transfer learning
