Knowledge Commons of the Institute of Automation, CAS
Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics | |
Zhang, Qichao (1,2); Zhao, Dongbin (1,2) | |
Journal | IEEE TRANSACTIONS ON CYBERNETICS |
ISSN | 2168-2267 |
Publication date | 2019-08-01 |
Volume | 49, Issue: 8, Pages: 2874-2885 |
Abstract | This paper is concerned with the nonlinear optimization problem of nonzero-sum (NZS) games with unknown drift dynamics. The data-based integral reinforcement learning (IRL) method is proposed to approximate the Nash equilibrium of NZS games iteratively. Furthermore, we prove that the data-based IRL method is equivalent to the model-based policy iteration algorithm, which guarantees the convergence of the proposed method. For implementation purposes, a single-critic neural network structure for the NZS games is given. To enhance the applicability of the data-based IRL method, we design the updating laws of the critic weights based on offline and online iterative learning methods, respectively. Note that the experience replay technique is introduced in the online iterative learning, which can improve the convergence rate of the critic weights during the learning process. The uniform ultimate boundedness of the critic weights is guaranteed using the Lyapunov method. Finally, the numerical results demonstrate the effectiveness of the data-based IRL algorithm for nonlinear NZS games with unknown drift dynamics. |
Keywords | Integral reinforcement learning (IRL) ; neural network (NN) ; nonzero-sum (NZS) games ; off-policy ; single-critic ; unknown drift dynamics |
DOI | 10.1109/TCYB.2018.2830820 |
Keywords [WOS] | H-INFINITY CONTROL ; NONLINEAR-SYSTEMS ; ALGORITHM |
Indexed by | SCI |
Language | English |
Funding | National Key Research and Development Plan[2016YFB0101000] ; National Natural Science Foundation of China[61573353] ; National Natural Science Foundation of China[61533017] |
WOS research areas | Automation & Control Systems ; Computer Science |
WOS categories | Automation & Control Systems ; Computer Science, Artificial Intelligence ; Computer Science, Cybernetics |
WOS accession number | WOS:000467561700005 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Seven major directions: sub-direction classification | Reinforcement and evolutionary learning |
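Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/24568 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Deep Reinforcement Learning |
Corresponding author | Zhao, Dongbin |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China ; 2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China |
First author affiliation | Institute of Automation, Chinese Academy of Sciences |
Corresponding author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation GB/T 7714 | Zhang, Qichao, Zhao, Dongbin. Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics[J]. IEEE TRANSACTIONS ON CYBERNETICS, 2019, 49(8): 2874-2885. |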
APA | Zhang, Qichao, & Zhao, Dongbin. (2019). Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics. IEEE TRANSACTIONS ON CYBERNETICS, 49(8), 2874-2885. |
MLA | Zhang, Qichao, et al. "Data-Based Reinforcement Learning for Nonzero-Sum Games With Unknown Drift Dynamics". IEEE TRANSACTIONS ON CYBERNETICS 49.8 (2019): 2874-2885. |
Files in this item |
File name/size | Document type | Version type | Access type | License |
CYB-qichao.pdf (1021KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |