A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems
Wei QingLai (1); Liu DeRong (2); Derong Liu
Source Publication: SCIENCE CHINA-INFORMATION SCIENCES
2015-12-01
Volume: 58, Issue: 12, Pages: 122203:1–122203:15
Subtype: Article
Abstract: In this paper, a novel iterative Q-learning algorithm, called the "policy iteration based deterministic Q-learning algorithm", is developed to solve optimal control problems for discrete-time deterministic nonlinear systems. The idea is to use an iterative adaptive dynamic programming (ADP) technique to construct an iterative control law that optimizes the iterative Q function. Once the optimal Q function is obtained, the optimal control law can be found by directly minimizing the optimal Q function, without requiring a mathematical model of the system. The convergence property is analyzed to show that the iterative Q function is monotonically non-increasing and converges to the solution of the optimality equation. It is also proven that each of the iterative control laws is a stable control law. Neural networks are employed to implement the algorithm by approximating the iterative Q function and the iterative control law, respectively. Finally, two simulation examples are presented to illustrate the performance of the developed algorithm.
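The alternation between evaluating the iterative Q function and improving the control law can be illustrated with a small tabular sketch. This is not the paper's neural-network implementation: the finite state grid, the action set, the toy dynamics in `step`, the quadratic `utility`, and all function names are assumptions made only for illustration. The policy evaluation step computes Q_i(x, u) = U(x, u) + Q_i(x', v_i(x')) by rolling out the current control law, and the improvement step picks the control that minimizes Q_i, using only the Q table.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): integer states, bounded moves.
STATES = np.arange(-5, 6)
ACTIONS = np.array([-2, -1, 0, 1, 2])


def step(x, u):
    """Assumed deterministic dynamics x' = clip(x + u); used only as a simulator."""
    return int(np.clip(x + u, STATES[0], STATES[-1]))


def utility(x, u):
    """Assumed quadratic utility, zero only at the origin with zero control."""
    return float(x * x + u * u)


def evaluate(policy):
    """Policy evaluation: Q_i(x, u) = U(x, u) + Q_i(x', v_i(x')).

    Assumes the policy keeps the origin at rest (policy[0] == 0), so the
    accumulated cost stops growing once the state reaches 0.
    """
    value = {}
    for x0 in STATES:
        x, total = int(x0), 0.0
        # An admissible policy on this grid reaches 0 without revisiting a state.
        for _ in range(len(STATES)):
            if x == 0:
                break
            u = policy[x]
            total += utility(x, u)
            x = step(x, u)
        if x != 0:
            raise ValueError("policy is not admissible (never reaches the origin)")
        value[int(x0)] = total
    return {(int(x), int(u)): utility(x, u) + value[step(x, u)]
            for x in STATES for u in ACTIONS}


def improve(q):
    """Policy improvement: v_{i+1}(x) = argmin_u Q_i(x, u); uses only the Q table."""
    return {int(x): int(min(ACTIONS, key=lambda u: q[(int(x), int(u))]))
            for x in STATES}


def policy_iteration(policy, max_iters=20):
    """Alternate evaluation and improvement until the control law stops changing."""
    history = []
    for _ in range(max_iters):
        q = evaluate(policy)
        history.append(q)
        new_policy = improve(q)
        if new_policy == policy:
            break
        policy = new_policy
    return policy, history


# Initial admissible control law: move one step toward the origin.
initial_policy = {int(x): int(-np.sign(x)) for x in STATES}
final_policy, q_history = policy_iteration(initial_policy)
```

In this toy setting the successive Q tables in `q_history` can be compared entry by entry to observe the monotonically non-increasing behaviour described in the abstract. The evaluation step here relies on a simulator of the dynamics to generate transitions, whereas the improvement step needs only the Q table.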
Keyword: Adaptive Critic Designs; Adaptive Dynamic Programming; Approximate Dynamic Programming; Q-learning; Policy Iteration; Neural Networks; Nonlinear Systems; Optimal Control
WOS Headings: Science & Technology ; Technology
DOI: 10.1007/s11432-015-5462-z
WOS Keyword: OPTIMAL TRACKING CONTROL ; DYNAMIC-PROGRAMMING ALGORITHM ; CONTROL SCHEME ; APPROXIMATION ERRORS ; REINFORCEMENT
Indexed By: SCI
Language: English
Funding Organization: National Natural Science Foundation of China (61374105, 61233001, 61273140) ; Beijing Natural Science Foundation (4132078)
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems
WOS ID: WOS:000368790400015
Citation statistics
Cited Times: 5 [WOS]
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/10670
Collection: State Key Laboratory of Management and Control for Complex Systems_Parallel Control
Corresponding Author: Derong Liu
Affiliation:
1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
2. Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Recommended Citation
GB/T 7714
Wei QingLai, Liu DeRong, Derong Liu. A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems[J]. SCIENCE CHINA-INFORMATION SCIENCES, 2015, 58(12): 122203:1–122203:15.
APA Wei QingLai, Liu DeRong, & Derong Liu. (2015). A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems. SCIENCE CHINA-INFORMATION SCIENCES, 58(12), 122203:1–122203:15.
MLA Wei QingLai, et al. "A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems". SCIENCE CHINA-INFORMATION SCIENCES 58.12 (2015): 122203:1–122203:15.
Files in This Item:
File Name/Size | DocType | Version | Access | License
2015_SCIS_A novel po (1215KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA
File name: 2015_SCIS_A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.