Multi-Agent Learning with Policy Prediction

  • Chongjie Zhang
  • Victor Lesser

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

33 Scopus citations

Abstract

Learning in multi-agent systems is a challenging problem because, from each agent's perspective, the environment is non-stationary: the other agents are adapting at the same time. This paper first introduces a new gradient-based learning algorithm that augments the basic gradient ascent approach with policy prediction. We prove that this augmentation yields a stronger notion of convergence than basic gradient ascent: strategies converge to a Nash equilibrium within a restricted class of iterated games. Motivated by this result, we then propose a new practical multi-agent reinforcement learning (MARL) algorithm that exploits approximate policy prediction. Empirical results show that it converges faster and in a wider variety of situations than state-of-the-art MARL algorithms.
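The augmentation described in the abstract — each player takes a gradient step on its expected payoff evaluated at a one-step forecast of the opponent's policy, rather than at the opponent's current policy — can be sketched for two-player, two-action matrix games as follows. This is a minimal illustrative sketch, not the paper's exact algorithm; the function and parameter names (`iga_pp`, `eta`, `gamma`) are assumptions, not taken from the paper.

```python
def iga_pp(A, B, alpha, beta, eta=0.01, gamma=0.01, steps=20000):
    """Gradient ascent with policy prediction for a two-player,
    two-action matrix game (illustrative sketch only).

    A, B  : 2x2 payoff matrices for the row and column player.
    alpha : row player's probability of playing action 0.
    beta  : column player's probability of playing action 0.
    eta   : learning step size; gamma : prediction step size.
    """
    def clip(p):                      # project back onto [0, 1]
        return min(1.0, max(0.0, p))

    def grad_row(a, b):               # dV_row/dalpha at (a, b)
        return b * (A[0][0] - A[0][1] - A[1][0] + A[1][1]) + (A[0][1] - A[1][1])

    def grad_col(a, b):               # dV_col/dbeta at (a, b)
        return a * (B[0][0] - B[0][1] - B[1][0] + B[1][1]) + (B[1][0] - B[1][1])

    for _ in range(steps):
        # Each player forecasts the other's next policy by a short
        # gradient step, then ascends its own gradient at that forecast.
        beta_pred = clip(beta + gamma * grad_col(alpha, beta))
        alpha_pred = clip(alpha + gamma * grad_row(alpha, beta))
        alpha = clip(alpha + eta * grad_row(alpha, beta_pred))
        beta = clip(beta + eta * grad_col(alpha_pred, beta))
    return alpha, beta
```

On matching pennies, where plain gradient ascent cycles around the mixed equilibrium, the prediction term damps the oscillation and the strategies spiral in toward (0.5, 0.5), which is the qualitative behavior the paper's convergence result describes.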

Original language: English
Title of host publication: Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010
Publisher: AAAI Press
Pages: 927-934
Number of pages: 8
ISBN (Electronic): 9781577354642
State: Published - Jul 15 2010
Event: 24th AAAI Conference on Artificial Intelligence, AAAI 2010 - Atlanta, United States
Duration: Jul 11 2010 - Jul 15 2010

Publication series

Name: Proceedings of the 24th AAAI Conference on Artificial Intelligence, AAAI 2010

Conference

Conference: 24th AAAI Conference on Artificial Intelligence, AAAI 2010
Country/Territory: United States
City: Atlanta
Period: 07/11/10 - 07/15/10
