TY - JOUR
T1 - LINDA
T2 - multi-agent local information decomposition for awareness of teammates
AU - Cao, Jiahan
AU - Yuan, Lei
AU - Wang, Jianhao
AU - Zhang, Shaowei
AU - Zhang, Chongjie
AU - Yu, Yang
AU - Zhan, De Chuan
N1 - Publisher Copyright: © 2023, Science China Press.
PY - 2023/8
Y1 - 2023/8
AB - In cooperative multi-agent reinforcement learning (MARL), where agents only have access to partial observations, efficiently leveraging local information is critical. During long-time observations, agents can build awareness for teammates to alleviate the restriction of partial observability. However, previous MARL methods usually neglect awareness learning from local information for better collaboration. To address this problem, we propose a novel framework, multi-agent local information decomposition for awareness of teammates (LINDA), with which agents learn to decompose local information and build awareness for each teammate. We model the awareness as stochastic random variables and perform representation learning to ensure the informativeness of awareness representations by maximizing the mutual information between awareness and the actual trajectory of the corresponding agent. LINDA is agnostic to specific algorithms and can be flexibly integrated with different MARL methods. Sufficient experiments show that the proposed framework learns informative awareness from local partial observations for better collaboration and significantly improves the learning performance, especially on challenging tasks.
KW - StarCraft II
KW - centralized training with decentralized execution (CTDE)
KW - multi-agent system
KW - reinforcement learning
KW - teammates awareness
UR - https://www.scopus.com/pages/publications/85166224141
U2 - 10.1007/s11432-021-3479-9
DO - 10.1007/s11432-021-3479-9
M3 - Article
AN - SCOPUS:85166224141
SN - 1674-733X
VL - 66
JO - Science China Information Sciences
JF - Science China Information Sciences
IS - 8
M1 - 182101
ER -