Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

  • Duc Thien Nguyen
  • William Yeoh
  • Hoong Chuin Lau
  • Shlomo Zilberstein
  • Chongjie Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Scopus citations

Abstract

Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a Dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the one preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in the other time steps, an assumption that might not hold in some applications. In this paper, we introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), in which the DCOP in the next time step is a function of the value assignments in the current time step. We also introduce a distributed reinforcement learning algorithm that balances exploration and exploitation to solve MD-DCOPs in an online manner.
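To make the abstract's two ideas concrete, the sketch below pairs a toy MD-DCOP (a two-agent, binary-variable problem whose reward table at the next time step is a function of the current joint assignment) with independent ε-greedy Q-learners, written in Python. This is a minimal illustration under assumed dynamics, not the paper's algorithm: the reward tables, the mode-transition rule, and all identifiers (REWARDS, Agent, run) are hypothetical, and the learners use a discounted update as a generic stand-in for the paper's average-reward criterion.

import random
from collections import defaultdict

# Toy MD-DCOP: two agents with binary variables share one constraint
# whose reward table ("mode") at the next time step is a function of
# the current joint assignment. All numbers are made up for illustration.
REWARDS = {
    0: {(0, 0): 5, (0, 1): 1, (1, 0): 1, (1, 1): 3},
    1: {(0, 0): 2, (0, 1): 6, (1, 0): 0, (1, 1): 1},
}

def next_mode(mode, joint):
    # Markovian dynamics: agreeing assignments keep the current mode,
    # disagreeing assignments flip it.
    return mode if joint[0] == joint[1] else 1 - mode

class Agent:
    # Independent epsilon-greedy Q-learner over one variable; a generic
    # decentralized baseline, not the coordinated algorithm in the paper.
    def __init__(self, epsilon=0.1, alpha=0.1, gamma=0.95):
        self.q = defaultdict(float)  # Q[(mode, value)]
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def act(self, mode):
        if random.random() < self.epsilon:
            return random.randint(0, 1)                       # explore
        return max((0, 1), key=lambda v: self.q[(mode, v)])   # exploit

    def update(self, mode, value, reward, new_mode):
        best_next = max(self.q[(new_mode, v)] for v in (0, 1))
        td_error = reward + self.gamma * best_next - self.q[(mode, value)]
        self.q[(mode, value)] += self.alpha * td_error

def run(steps=20000, seed=0):
    random.seed(seed)
    agents = [Agent(), Agent()]
    mode, total = 0, 0.0
    for _ in range(steps):
        joint = tuple(a.act(mode) for a in agents)
        reward = REWARDS[mode][joint]
        new_mode = next_mode(mode, joint)
        for agent, value in zip(agents, joint):
            agent.update(mode, value, reward, new_mode)
        mode, total = new_mode, total + reward
    print(f"empirical average reward: {total / steps:.3f}")

if __name__ == "__main__":
    run()

Even on a toy problem like this, independent learners can miscoordinate when good joint assignments require simultaneous exploration; balancing exploration and exploitation across agents is part of what the paper's distributed algorithm targets.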

Original language: English
Title of host publication: 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 1341-1342
Number of pages: 2
ISBN (Electronic): 9781634391313
State: Published - 2014
Event: 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014 - Paris, France
Duration: May 5, 2014 - May 9, 2014

Publication series

Name: 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014
Volume: 2

Conference

Conference: 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014
Country/Territory: France
City: Paris
Period: 05/5/14 - 05/9/14

Keywords

  • DCOP
  • Dynamic DCOP
  • MDP
  • Reinforcement Learning
