Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs

doi:10.1016/j.neucom.2017.01.076


Authors	Zhang, Qichao (1,2) ; Zhao, Dongbin (1,2) ; Zhu, Yuanheng (1,2)
Journal	NEUROCOMPUTING
Publication date	2017-05-17
Volume	238; Issue: *; Pages: 377-386
Article type	Article
Abstract	In this paper, the fully cooperative game with partially constrained inputs in the continuous-time Markov decision process environment is investigated using a novel data-driven adaptive dynamic programming method. First, the model-based policy iteration algorithm with one iteration loop is proposed, where the knowledge of system dynamics is required. Then, it is proved that the iteration sequences of value functions and control policies can converge to the optimal ones. In order to relax the exact knowledge of the system dynamics, a model-free iterative equation is derived based on the model-based algorithm and the integral reinforcement learning. Furthermore, a data-driven adaptive dynamic programming is developed to solve the model-free equation using generated system data. From the theoretical analysis, we prove that this model-free iterative equation is equivalent to the model-based iterative equations, which means that the data-driven algorithm can approach the optimal value function and control policies. For the implementation purpose, three neural networks are constructed to approximate the solution of the model-free iteration equation using the off-policy learning scheme after the available system data is collected in the online measurement phase. Finally, two examples are provided to demonstrate the effectiveness of the proposed scheme. (C) 2017 Published by Elsevier B.V.
Keywords	Adaptive Dynamic Programming ; Optimal Control ; Neural Network ; Fully Cooperative Games ; Data-driven ; Constrained Input
WOS headings	Science & Technology ; Technology
DOI	10.1016/j.neucom.2017.01.076
Keywords [WOS]	ZERO-SUM GAMES ; H-INFINITY CONTROL ; DIFFERENTIAL GRAPHICAL GAMES ; NONLINEAR-SYSTEMS ; LEARNING SOLUTION ; UNKNOWN DYNAMICS ; MULTIAGENT SYSTEMS ; EXPERIENCE REPLAY ; CONTROL DESIGN ; ALGORITHM
Indexed by	SCI
Language	English
Funding	National Natural Science Foundation of China (NSFC) (61273136 ; 61573353 ; 61533017 ; 61603382) ; National Key Research and Development Plan (2016YFB0101000)
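WOS research area	Computer Science
WOS category	Computer Science, Artificial Intelligence
WOS accession number	WOS:000397372100033
Citation statistics	Times cited: 52 [WOS]
Document type	Journal article
Item identifier	http://ir.ia.ac.cn/handle/173211/14336
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems / Deep Reinforcement Learning
Corresponding author	Zhao, Dongbin
Author affiliations	1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation	Institute of Automation, Chinese Academy of Sciences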
Recommended citation (GB/T 7714)	Zhang, Qichao, Zhao, Dongbin, Zhu, Yuanheng. Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs[J]. NEUROCOMPUTING, 2017, 238(*): 377-386.
APA	Zhang, Qichao, Zhao, Dongbin, & Zhu, Yuanheng. (2017). Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs. NEUROCOMPUTING, 238(*), 377-386.
MLA	Zhang, Qichao, et al. "Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs." NEUROCOMPUTING 238.* (2017): 377-386.

Files in this item
File name/size	Document type	Version	Access type	License
neurocomputing.pdf (1508KB)	Journal article	Author accepted manuscript	Open access	CC BY-NC-SA