Multiagent Adversarial Collaborative Learning via Mean-Field Theory
Luo, Guiyang1; Zhang, Hui2,3; He, Haibo4; Li, Jinglin1; Wang, Fei-Yue3,5,6
Journal: IEEE TRANSACTIONS ON CYBERNETICS
ISSN: 2168-2267
Publication Date: 2021-10-01
Volume: 51, Issue: 10, Pages: 4994-5007
Corresponding Author: Luo, Guiyang (luoguiyang@bupt.edu.cn)
Abstract: Multiagent reinforcement learning (MARL) has recently attracted considerable attention from both academics and practitioners. Core issues, e.g., the curse of dimensionality due to the exponential growth of agent interactions and nonstationary environments due to simultaneous learning, hinder the large-scale proliferation of MARL, and these problems worsen as the number of agents increases. To address these challenges, we propose an adversarial collaborative learning method for mixed cooperative-competitive environments, exploiting friend-or-foe Q-learning and mean-field theory. We first treat the neighbors of agent i as two coalitions (agent i's friend coalition and opponent coalition, respectively) and convert the Markov game into a two-player zero-sum game with an extended action set. By exploiting mean-field theory, this new game reduces the interactions to those between a single agent and the mean effects of its friends and opponents. A neural network is employed to learn the optimal mean effects of these two coalitions, trained via adversarial max and min steps. In the max step, with the opponents' policies fixed, the friends' mean action is optimized to maximize their rewards. In the min step, with the friends' policies frozen, the opponents' mean action is trained to minimize the friends' rewards. These two steps are proved to converge to a Nash equilibrium. Another neural network is then applied to learn each agent's best response to the mean effects, and the adversarial max and min steps jointly optimize the two networks. Experiments on two platforms demonstrate the learning effectiveness and strength of our approach, especially with many agents.
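The adversarial max and min steps summarized in the abstract can be pictured with a short sketch. The following is a minimal, hypothetical PyTorch illustration (not the authors' implementation): a Q-network scores a state, an agent's own action, and the mean actions of the friend and opponent coalitions; one mean-action network is updated to maximize this Q-value (max step) while the other is updated to minimize it (min step). All class names, network shapes, and dimensions are assumptions for illustration only, and target updates for the Q-network itself are omitted.

import torch
import torch.nn as nn

STATE_DIM, ACT_DIM = 8, 4  # assumed toy dimensions

class MeanActionNet(nn.Module):
    # Maps a state to a coalition's mean action (a distribution over discrete actions).
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM))

    def forward(self, s):
        return torch.softmax(self.net(s), dim=-1)

class MeanFieldQ(nn.Module):
    # Q(s, a_i, mu_friend, mu_opponent) for a single representative agent.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + 3 * ACT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, s, a, mu_f, mu_o):
        return self.net(torch.cat([s, a, mu_f, mu_o], dim=-1))

q_net = MeanFieldQ()
friend_net, opponent_net = MeanActionNet(), MeanActionNet()
opt_friend = torch.optim.Adam(friend_net.parameters(), lr=1e-3)
opt_opponent = torch.optim.Adam(opponent_net.parameters(), lr=1e-3)

def adversarial_step(state, own_action):
    # Max step: freeze the opponents' mean action and update the friends'
    # mean-action network to increase the friends' Q-value.
    mu_o = opponent_net(state).detach()
    q_max = q_net(state, own_action, friend_net(state), mu_o).mean()
    opt_friend.zero_grad()
    (-q_max).backward()
    opt_friend.step()

    # Min step: freeze the friends' mean action and update the opponents'
    # mean-action network to decrease the friends' Q-value.
    mu_f = friend_net(state).detach()
    q_min = q_net(state, own_action, mu_f, opponent_net(state)).mean()
    opt_opponent.zero_grad()
    q_min.backward()
    opt_opponent.step()

# Toy usage on a random batch of states and action distributions.
states = torch.randn(32, STATE_DIM)
actions = torch.softmax(torch.randn(32, ACT_DIM), dim=-1)
adversarial_step(states, actions)

Alternating these two updates mirrors the paper's description of driving the friend and opponent mean actions toward the Nash equilibrium of the induced two-player zero-sum game, after which a separate network learns each agent's best response.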
Keywords: Games; Training; Collaborative work; Task analysis; Nash equilibrium; Sociology; Statistics; Adversarial collaborative learning (ACL); friend-or-foe Q-learning; mean-field theory; multiagent reinforcement learning (MARL)
DOI: 10.1109/TCYB.2020.3025491
Keywords (WOS): COMPREHENSIVE SURVEY; CONTROL SCHEME; SYSTEM; DESIGN
Indexed By: SCI
Language: English
Funding Project: Natural Science Foundation of China [61876023]; National Science Foundation [ECCS 1917275]
Funding Organization: Natural Science Foundation of China; National Science Foundation
WOS Research Area: Automation & Control Systems; Computer Science
WOS Subject: Automation & Control Systems; Computer Science, Artificial Intelligence; Computer Science, Cybernetics
WOS Record ID: WOS:000706832000023
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation Statistics
Cited Times (WOS): 16
Document Type: Journal Article
Identifier: http://ir.ia.ac.cn/handle/173211/46228
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems, Parallel Intelligence Technology and Systems Team
Affiliations:
1.Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100088, Peoples R China
2.Tencent Res, Technol & Engn Grp, Beijing 100193, Peoples R China
3.Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
4.Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
5.Qingdao Acad Intelligent Ind, Innovat Ctr Parallel Vis, Qingdao 266109, Peoples R China
6.Macau Univ Sci & Technol, Inst Syst Engn, Macau, Peoples R China
Recommended Citation:
GB/T 7714
Luo, Guiyang, Zhang, Hui, He, Haibo, et al. Multiagent Adversarial Collaborative Learning via Mean-Field Theory[J]. IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51(10): 4994-5007.
APA: Luo, Guiyang, Zhang, Hui, He, Haibo, Li, Jinglin, & Wang, Fei-Yue. (2021). Multiagent Adversarial Collaborative Learning via Mean-Field Theory. IEEE TRANSACTIONS ON CYBERNETICS, 51(10), 4994-5007.
MLA: Luo, Guiyang, et al. "Multiagent Adversarial Collaborative Learning via Mean-Field Theory". IEEE TRANSACTIONS ON CYBERNETICS 51.10 (2021): 4994-5007.
Files in This Item:
There are no files associated with this item.