CASIA OpenIR
A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents
Zhang, Zhen1; Wang, Dongqing1; Zhao, Dongbin2,3; Han, Qiaoni1; Song, Tingting1,4
Source Publication: IEEE ACCESS
ISSN: 2169-3536
Year: 2018
Volume: 6, Pages: 70223-70235
Corresponding Author: Zhang, Zhen (tbsunshine8@163.com)
Abstract: Multi-agent reinforcement learning (MARL) can be used to design intelligent agents for solving cooperative tasks. Within the MARL category, this paper proposes the probability of maximal reward based on infinitesimal gradient ascent (PMR-IGA) algorithm to reach the maximal total reward in repeated games. Theoretical analyses show that in a finite-player, finite-action repeated game with two pure optimal joint actions that share no common component action, both optimal joint actions are stable critical points of the PMR-IGA model. Furthermore, we apply the Q-value function to estimate the gradient and derive the probability of maximal reward based on estimated gradient ascent (PMR-EGA) algorithm. Theoretical analyses and simulations of case studies of repeated games show that the maximal total reward can be achieved under any initial conditions. The PMR-EGA extends naturally to cooperative stochastic games. Two stochastic games, i.e., box pushing and a distributed sensor network, are used as test beds. The simulations show that the PMR-EGA consistently displays excellent performance on both stochastic games.
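The gradient-ascent idea underlying PMR-IGA can be illustrated with a small sketch; this is not the paper's code, and the coordination payoff matrix, learning rate, and function names here are illustrative assumptions.

```python
# A minimal sketch (not the paper's implementation) of infinitesimal gradient
# ascent (IGA) in a 2-player, 2-action cooperative repeated game with two pure
# optimal joint actions and no common component action.
import numpy as np

# Assumed payoff: both agents receive 1 when they coordinate on (0,0) or (1,1).
R = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def iga(p, q, R, eta=0.05, steps=500):
    """p, q: probabilities that agents 1 and 2 play action 1.
    Each agent ascends the gradient of the expected joint reward
    V(p,q) = (1-p)(1-q)R[0,0] + (1-p)q R[0,1] + p(1-q)R[1,0] + pq R[1,1]
    with respect to its own action probability, projected onto [0, 1]."""
    for _ in range(steps):
        dV_dp = (1 - q) * (R[1, 0] - R[0, 0]) + q * (R[1, 1] - R[0, 1])
        dV_dq = (1 - p) * (R[0, 1] - R[0, 0]) + p * (R[1, 1] - R[1, 0])
        p = min(1.0, max(0.0, p + eta * dV_dp))  # project back to [0, 1]
        q = min(1.0, max(0.0, q + eta * dV_dq))
    return p, q
```

Starting from (0.7, 0.7) the dynamics converge to the joint action (1, 1); starting from (0.3, 0.3) they converge to (0, 0). Both limits are optimal joint actions, illustrating the abstract's claim that both pure optimal joint actions are stable critical points of the gradient model.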
Keywords: Multi-agent reinforcement learning; gradient ascent; Q-learning; cooperative tasks
DOI: 10.1109/ACCESS.2018.2878853
WOS Keywords: EVOLUTIONARY GAME-THEORY; POLICY GRADIENT; SYSTEMS
Indexed By: SCI
Language: English
Funding Project: Shandong Provincial Natural Science Foundation of China [ZR2017PF005]; National Natural Science Foundation of China [61873138]; National Natural Science Foundation of China [61803218]; National Natural Science Foundation of China [61573353]; National Natural Science Foundation of China [61533017]; National Natural Science Foundation of China [61573205]
Funding Organization: Shandong Provincial Natural Science Foundation of China; National Natural Science Foundation of China
WOS Research Area: Computer Science; Engineering; Telecommunications
WOS Subject: Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications
WOS ID: WOS:000453261200001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation Statistics
Cited Times (WOS): 1
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/25666
Collection: Institute of Automation, Chinese Academy of Sciences
Affiliations:
1.Qingdao Univ, Sch Automat, Qingdao 266071, Peoples R China
2.Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
3.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
4.Qingdao Metro Grp Co Ltd, Operating Branch, Qingdao 266000, Peoples R China
Recommended Citation
GB/T 7714
Zhang, Zhen, Wang, Dongqing, Zhao, Dongbin, et al. A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents[J]. IEEE ACCESS, 2018, 6: 70223-70235.
APA Zhang, Zhen, Wang, Dongqing, Zhao, Dongbin, Han, Qiaoni, & Song, Tingting. (2018). A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents. IEEE ACCESS, 6, 70223-70235.
MLA Zhang, Zhen, et al. "A Gradient-Based Reinforcement Learning Algorithm for Multiple Cooperative Agents". IEEE ACCESS 6 (2018): 70223-70235.
Files in This Item:
There are no files associated with this item.
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.