A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward
Liang, Mingming (1,2); Wei, Qinglai (2)
Journal: NEUROCOMPUTING
ISSN: 0925-2312
Publication date: 2021-02-01
Volume: 424; Pages: 23-34
Corresponding author: Liang, Mingming (liangmingming@gdut.edu.cn)
Abstract: This paper constructs a partial policy iteration adaptive dynamic programming (ADP) algorithm to solve the optimal control problem of nonlinear systems with discounted total reward. Compared with the traditional policy iteration ADP algorithm, the approach updates the iterative control law only in a local region of the global system state space. This feature significantly reduces the computational burden on processing units at each iteration, which enables the algorithm to be executed on low-performance devices such as smartphones, smartwatches and Internet of Things (IoT) objects. We provide a convergence analysis showing that the generated sequence of value functions is monotonically nonincreasing and can finally reach a local optimum. In addition, the corresponding local policy space is developed theoretically for the first time. Moreover, when the sequence of local system state spaces is chosen properly, we prove that the developed algorithm is capable of finding the global optimal performance index function for the nonlinear systems. Finally, we present a numerical simulation to demonstrate the effectiveness of the proposed algorithm. (c) 2020 Elsevier B.V. All rights reserved.
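The core idea in the abstract, improving the control law only on a local region of the state space at each iteration, can be illustrated with a minimal tabular sketch. This is not the authors' neural-network implementation: the six-state deterministic system, the cost function, the discount factor 0.9, the region schedule, and the names `evaluate` and `partial_policy_iteration` are all assumptions made for illustration. The "reward" is treated as a cost to be minimized, which matches the nonincreasing value sequence described in the abstract.

```python
import numpy as np

def evaluate(policy, nxt, cost, gamma):
    """Policy evaluation: solve V = c_pi + gamma * P_pi V exactly."""
    n = len(policy)
    idx = np.arange(n)
    P = np.zeros((n, n))
    P[idx, nxt[idx, policy]] = 1.0  # deterministic transitions under the policy
    return np.linalg.solve(np.eye(n) - gamma * P, cost[idx, policy])

def partial_policy_iteration(nxt, cost, gamma, regions, policy):
    """At iteration i, improve the policy only on regions[i]; keep it elsewhere."""
    policy = policy.copy()
    values = [evaluate(policy, nxt, cost, gamma)]
    for region in regions:
        q = cost + gamma * values[-1][nxt]  # Q(x, u) under the current value function
        policy[region] = np.argmin(q[region], axis=1)  # local improvement only
        values.append(evaluate(policy, nxt, cost, gamma))
    return policy, values

# Hypothetical toy system: states 0..5 on a line, actions = move left / stay / move right.
n, gamma = 6, 0.9
x = np.arange(n)
moves = np.array([-1, 0, 1])
nxt = np.clip(x[:, None] + moves[None, :], 0, n - 1)  # next-state table, shape (n, 3)
cost = (x[:, None] ** 2 + 0.5 * (moves[None, :] != 0)).astype(float)  # state cost + move cost

regions = [[0, 1, 2], [3, 4, 5]] * 2   # local regions, swept twice (an assumed schedule)
policy0 = np.ones(n, dtype=int)        # initial policy: "stay" in every state
final_policy, values = partial_policy_iteration(nxt, cost, gamma, regions, policy0)
```

Each iteration computes the improvement step for only three of the six states, which is where the per-iteration saving comes from. Because the region schedule here covers every state repeatedly, the sketch reaches the same values as full policy iteration on this toy problem; a schedule that never visits some states would in general only give a local result.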
Keywords: Adaptive critic designs; Adaptive dynamic programming; Policy iteration; Neural networks; Neuro-dynamic programming; Nonlinear systems; Optimal control
DOI: 10.1016/j.neucom.2020.11.014
Keywords [WOS]: LINEAR-SYSTEMS; ROBUST-CONTROL; GAMES
Indexed by: SCI
Language: English
WOS research area: Computer Science
WOS subject category: Computer Science, Artificial Intelligence
WOS accession number: WOS:000611084200003
Publisher: ELSEVIER
Seven major directions, sub-direction classification: Intelligent control
Citation statistics
Times cited: 11 [WOS]
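Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/43116
Collection: State Key Laboratory of Management and Control for Complex Systems, Complex Systems Intelligent Mechanism and Parallel Control Team
Corresponding author: Liang, Mingming
Author affiliations:
1. Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
2. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
First author affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation: Institute of Automation, Chinese Academy of Sciences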
Recommended citation
GB/T 7714: Liang, Mingming, Wei, Qinglai. A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward[J]. NEUROCOMPUTING, 2021, 424: 23-34.
APA: Liang, Mingming, & Wei, Qinglai. (2021). A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward. NEUROCOMPUTING, 424, 23-34.
MLA: Liang, Mingming, et al. "A partial policy iteration ADP algorithm for nonlinear neuro-optimal control with discounted total reward". NEUROCOMPUTING 424 (2021): 23-34.