Error Bound Analysis of Q-Function for Discounted Optimal Control Problems With Policy Iteration
Yan, Pengfei (1); Wang, Ding (1); Li, Hongliang (2); Liu, Derong (3)
Journal: IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS
Publication date: 2017-07-01
Volume: 47, Issue: 7, Pages: 1207-1216
Article type: Article
Abstract: In this paper, we present an error bound analysis of the Q-function for action-dependent adaptive dynamic programming, used to solve discounted optimal control problems of unknown discrete-time nonlinear systems. The convergence of the Q-functions derived by a policy iteration algorithm under ideal conditions is given. Considering the approximation errors of the Q-function and the control policy in the policy evaluation and policy improvement steps, we establish error bounds for the approximate Q-functions in each iteration. Under the given boundedness conditions, the approximate Q-function converges to a finite neighborhood of the optimal Q-function. To implement the presented algorithm, two three-layer neural networks are employed to approximate the Q-function and the control policy, respectively. Finally, a simulation example is used to verify the validity of the presented algorithm.
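The policy evaluation and policy improvement steps named in the abstract can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the five-state deterministic system, the quadratic utility, the discount factor `GAMMA`, and the uniform noise of magnitude `eps` that stands in for approximation error. The paper itself uses neural-network approximators on an unknown nonlinear system, and its error bounds are not reproduced here.

```python
import numpy as np

GAMMA = 0.9            # discount factor (assumed)
N_STATES = 5           # toy state space 0..4 (assumed)
ACTIONS = (-1, 0, 1)   # move left, stay, move right (assumed)

def step(s, a):
    """Toy deterministic dynamics: move and clip to the state range."""
    return min(max(s + a, 0), N_STATES - 1)

def utility(s, a):
    """Toy stage cost: distance from state 2 plus a small control penalty."""
    return (s - 2) ** 2 + 0.1 * a ** 2

def evaluate(policy, sweeps=500):
    """Policy evaluation: iterate Q(s,a) = U(s,a) + GAMMA * Q(s', policy(s'))."""
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(sweeps):
        q_new = np.empty_like(q)
        for s in range(N_STATES):
            for i, a in enumerate(ACTIONS):
                s2 = step(s, a)
                q_new[s, i] = utility(s, a) + GAMMA * q[s2, policy[s2]]
        q = q_new
    return q

def policy_iteration(eps=0.0, iters=20, seed=0):
    """Policy iteration on Q; eps > 0 injects a bounded evaluation error."""
    rng = np.random.default_rng(seed)
    policy = np.full(N_STATES, 1)  # start from "stay" (index 1 in ACTIONS)
    for _ in range(iters):
        q = evaluate(policy)
        q = q + rng.uniform(-eps, eps, size=q.shape)  # stand-in for approximation error
        policy = q.argmin(axis=1)  # policy improvement: greedy w.r.t. cost
    return q, policy
```

Calling `policy_iteration()` returns the exact Q-table and greedy policy for this toy problem, while `policy_iteration(eps=0.05)` returns a perturbed Q-table whose distance from the exact one can be inspected with `np.abs(q_eps - q_star).max()`. In this small example the action gaps are much larger than the injected error, so the perturbed iterate stays close to the exact Q-function; that is only a qualitative illustration of the "finite neighborhood" behavior described in the abstract.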
Keywords: Adaptive Dynamic Programming (ADP); Error Analysis; Nonlinear Systems; Policy Iteration; Q-function
WOS headings: Science & Technology ; Technology
DOI: 10.1109/TSMC.2016.2563982
Keywords [WOS]: TIME NONLINEAR-SYSTEMS ; APPROXIMATE VALUE-ITERATION ; UNKNOWN INTERNAL DYNAMICS ; ADAPTIVE OPTIMAL-CONTROL ; OPTIMAL-CONTROL DESIGN ; H-INFINITY CONTROL ; ZERO-SUM GAMES ; INPUT CONSTRAINTS ; HJB SOLUTION ; REINFORCEMENT
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China (61233001 ; 61273140 ; 61304086 ; 61374105 ; 61533017 ; U1501251) ; Beijing Natural Science Foundation (4162065) ; Early Career Development Award of SKLMCCS
WOS research areas: Automation & Control Systems ; Computer Science
WOS categories: Automation & Control Systems ; Computer Science, Cybernetics
WOS accession number: WOS:000404354600014
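Citation statistics
Times cited: 23 [WOS]
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/15223
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems, Complex Systems Intelligent Mechanisms and Parallel Control Team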
Author affiliations:
1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
2. IBM Res China, Beijing 100193, Peoples R China
3. Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended citation
GB/T 7714: Yan, Pengfei, Wang, Ding, Li, Hongliang, et al. Error Bound Analysis of Q-Function for Discounted Optimal Control Problems With Policy Iteration[J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2017, 47(7): 1207-1216.
APA: Yan, Pengfei, Wang, Ding, Li, Hongliang, & Liu, Derong. (2017). Error Bound Analysis of Q-Function for Discounted Optimal Control Problems With Policy Iteration. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 47(7), 1207-1216.
MLA: Yan, Pengfei, et al. "Error Bound Analysis of Q-Function for Discounted Optimal Control Problems With Policy Iteration". IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS 47.7 (2017): 1207-1216.
Files in this item
File name/size: [00-J-2017-TSMC] Err (625KB) ; Document type: Journal article ; Version: Author's accepted manuscript ; Access: Open access ; License: CC BY-NC-SA
File name: [00-J-2017-TSMC] Error bound analysis of Q-function for discounted optimal control problems with policy iteration.pdf
Format: Adobe PDF