Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems
Luo, Biao (1); Liu, Derong (2); Huang, Tingwen (4); Yang, Xiong (3); Ma, Hongwen (1)
2017-10-01
Journal: INFORMATION SCIENCES
Volume: 411; Issue: 0; Pages: 66-83
Article type: Article
Abstract

Policy iteration and value iteration are the two main iterative adaptive dynamic programming frameworks for solving optimal control problems. Policy iteration converges quickly but requires an initial stabilizing control policy, which is a strict constraint in practice. Value iteration avoids the requirement of an initial admissible control policy but converges much more slowly. This paper aims to combine the advantages of policy iteration and value iteration while avoiding their drawbacks. To this end, a multi-step heuristic dynamic programming (MsHDP) method is developed for solving the optimal control problem of nonlinear discrete-time systems. MsHDP speeds up value iteration and, at the same time, removes the requirement of an initial admissible control policy found in policy iteration. The convergence theory of MsHDP is established by proving that it converges to the solution of the Bellman equation. For implementation, an actor-critic neural network (NN) structure is developed. The critic NN estimates the value function, and its weight vector is computed with a least-squares scheme. The actor NN estimates the control policy, and a gradient descent method is proposed for updating its weight vector. Comparative simulation studies on two examples verify the effectiveness and advantages of MsHDP. (C) 2017 Elsevier Inc. All rights reserved.
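
The trade-off described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm or its neural-network implementation. It assumes a scalar linear system x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 and a quadratic value function V(x) = p*x^2, and it reads "multi-step" as: take the policy that is greedy with respect to the current value function, accumulate its cost over K steps, then bootstrap on the old value function. With K = 1 this reduces to ordinary value iteration. All names (`greedy_gain`, `multi_step_update`, `iterate`, `riccati_solution`) and all numeric values are hypothetical, and the large initial value `p0` is chosen only so that the iterates decrease monotonically in this toy setting.

```python
import math


def greedy_gain(p, a, b, r):
    """Gain g of the policy u = -g*x minimizing r*u^2 + p*(a*x + b*u)^2."""
    return a * b * p / (r + b * b * p)


def multi_step_update(p, K, a, b, q, r):
    """Evaluate the greedy policy for K steps, then bootstrap on the old p."""
    g = greedy_gain(p, a, b, r)
    c = a - b * g              # closed-loop dynamics: x_{k+1} = c * x_k
    stage = q + r * g * g      # one-step cost per unit x^2 under u = -g*x
    new_p = p
    for _ in range(K):
        new_p = stage + c * c * new_p
    return new_p


def iterate(K, p0=10.0, a=0.9, b=1.0, q=1.0, r=1.0, tol=1e-10, max_iter=10_000):
    """Repeat the K-step update until successive values differ by less than tol."""
    p = p0
    for i in range(1, max_iter + 1):
        new_p = multi_step_update(p, K, a, b, q, r)
        if abs(new_p - p) < tol:
            return new_p, i
        p = new_p
    return p, max_iter


def riccati_solution(a, b, q, r):
    """Positive root of b^2*p^2 + (r - a^2*r - b^2*q)*p - q*r = 0."""
    coeff = r - a * a * r - b * b * q
    return (-coeff + math.sqrt(coeff * coeff + 4.0 * b * b * q * r)) / (2.0 * b * b)


if __name__ == "__main__":
    p_star = riccati_solution(0.9, 1.0, 1.0, 1.0)
    for K in (1, 5):
        p, n = iterate(K)
        print(f"K={K}: p={p:.8f} after {n} iterations (reference {p_star:.8f})")
```

In this toy setting, both choices of K reach the same fixed point, and the larger K needs fewer outer iterations because each update carries more of the greedy policy's cost before falling back on the old estimate.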

Keywords: Optimal Control ; Multi-step Heuristic Dynamic Programming ; Adaptive Dynamic Programming ; Nonlinear Systems ; Discrete-time ; Neural Networks
WOS headings: Science & Technology ; Technology
DOI: 10.1016/j.ins.2017.05.005
Keywords [WOS]: Spatially Distributed Processes ; Optimal Tracking Control ; Horizon Optimal-control ; Neural-network Control ; Optimal-control Scheme ; H-infinity Control ; Policy Iteration ; Control Design ; Feedback-control ; Algorithm
Indexed by: SCI
Language: English
Funding: National Natural Science Foundation of China (61533017 ; U1501251 ; 61374105 ; 61503377 ; 61233001) ; Early Career Development Award of SKLMCCS ; NPRP from the Qatar National Research Fund (a member of Qatar Foundation) (NPRP 9 166-1-031)
WOS research area: Computer Science
WOS category: Computer Science, Information Systems
WOS accession number: WOS:000404197200005
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/15245
Collection: State Key Laboratory of Management and Control for Complex Systems_Parallel Control
Author affiliations:
1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
2. Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China
3. Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
4. Texas A&M Univ Qatar, POB 23874, Doha, Qatar
Recommended citation
GB/T 7714
Luo, Biao,Liu, Derong,Huang, Tingwen,et al. Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems[J]. INFORMATION SCIENCES,2017,411(0):66-83.
APA Luo, Biao,Liu, Derong,Huang, Tingwen,Yang, Xiong,&Ma, Hongwen.(2017).Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems.INFORMATION SCIENCES,411(0),66-83.
MLA Luo, Biao,et al."Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems".INFORMATION SCIENCES 411.0(2017):66-83.
Files in this item
File name/size: 2017-10-INS-Multi-st (1092KB); Document type: journal article; Version: author accepted manuscript; Access: open access; License: CC BY-NC-SA
File name: 2017-10-INS-Multi-step heuristic dynamic programming for optimal control of nonlinear discrete-time systems.pdf
Format: Adobe PDF