Knowledge Commons of Institute of Automation, CAS
Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis | |
Wei, Qinglai1; Lewis, Frank L.2,3; Sun, Qiuye4; Yan, Pengfei1; Song, Ruizhuo5 | |
Journal | IEEE TRANSACTIONS ON CYBERNETICS |
2017-05-01 | |
Volume | 47; Issue: 5; Pages: 1224-1237 |
Article type | Article |
Abstract | In this paper, a novel discrete-time deterministic Q-learning algorithm is developed. In each iteration of the developed Q-learning algorithm, the iterative Q function is updated over the whole state and control spaces, rather than for a single state and a single control as in traditional Q-learning algorithms. A new convergence criterion is established to guarantee that the iterative Q function converges to the optimum, which simplifies the learning-rate convergence criterion of traditional Q-learning algorithms. In the convergence analysis, the upper and lower bounds of the iterative Q function are analyzed to obtain the convergence criterion, instead of analyzing the iterative Q function itself. For convenience of analysis, the convergence properties for the undiscounted case of the deterministic Q-learning algorithm are developed first. Then, taking the discount factor into account, the convergence criterion for the discounted case is established. Neural networks are used to approximate the iterative Q function and to compute the iterative control law, respectively, to facilitate the implementation of the deterministic Q-learning algorithm. Finally, simulation results and comparisons are given to illustrate the performance of the developed algorithm. |
Keywords | Adaptive Critic Designs ; Adaptive Dynamic Programming (ADP) ; Approximate Dynamic Programming ; Neural Networks (NNs) ; Neuro-dynamic Programming ; Optimal Control ; Q-learning |
WOS headings | Science & Technology ; Technology |
DOI | 10.1109/TCYB.2016.2542923 |
Keywords [WOS] | OPTIMAL TRACKING CONTROL ; ZERO-SUM GAMES ; H-INFINITY CONTROL ; INPUT-OUTPUT DATA ; DEAD-ZONE INPUT ; NONLINEAR-SYSTEMS ; ALGORITHM ; DESIGN ; REPRESENTATION ; APPROXIMATION |
Indexed by | SCI |
Language | English |
Funding | National Natural Science Foundation (NNSF) of China (61374105, 61304079, 61273140) ; Fundamental Research Funds for the Central Universities (FRF-TP-15-056A3) ; Open Research Project from SKLMCCS (20150104) ; National Science Foundation (ECCS-1405173, IIS-1208623) ; Office of Naval Research, Arlington, VA, USA (N00014-13-1-0562, N000141410718) ; U.S. Army Research Office (W911NF-11-D-0001) ; China NNSF (61120106011) ; China Education Ministry Project 111 (B08015) |
WOS research area | Computer Science |
WOS category | Computer Science, Artificial Intelligence ; Computer Science, Cybernetics |
WOS accession number | WOS:000399797000009 |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/13630 |
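Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Intelligent Mechanism of Complex Systems and Parallel Control Team |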
Author affiliations | 1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China ; 2. Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA ; 3. Northeastern Univ, Shenyang 110036, Peoples R China ; 4. Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110036, Peoples R China ; 5. Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China |
First author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation GB/T 7714 | Wei, Qinglai,Lewis, Frank L.,Sun, Qiuye,et al. Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis[J]. IEEE TRANSACTIONS ON CYBERNETICS,2017,47(5):1224-1237. |
APA | Wei, Qinglai,Lewis, Frank L.,Sun, Qiuye,Yan, Pengfei,&Song, Ruizhuo.(2017).Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis.IEEE TRANSACTIONS ON CYBERNETICS,47(5),1224-1237. |
MLA | Wei, Qinglai,et al."Discrete-Time Deterministic Q-Learning: A Novel Convergence Analysis".IEEE TRANSACTIONS ON CYBERNETICS 47.5(2017):1224-1237. |
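The abstract's central idea, updating the iterative Q function over the whole state and control spaces in every iteration, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 5-state chain, the quadratic utility, the constant learning rate `alpha`, and the update form Q_{i+1}(x,u) = (1 - alpha) Q_i(x,u) + alpha [U(x,u) + gamma min_{u'} Q_i(F(x,u), u')]. A lookup table stands in for the neural-network approximators mentioned in the abstract.

```python
# Hypothetical toy problem (not from the paper): a 5-state chain whose target is state 0.
STATES = range(5)
CONTROLS = (-1, 0, 1)


def step(x, u):
    """Deterministic dynamics x_{k+1} = F(x_k, u_k), clipped to the chain [0, 4]."""
    return min(max(x + u, 0), 4)


def utility(x, u):
    """Assumed quadratic stage cost U(x, u); it is zero only at x = 0, u = 0."""
    return x * x + u * u


def q_sweep(q, alpha, gamma):
    """One iteration: update Q for every (state, control) pair, not a single pair."""
    new_q = {}
    for x in STATES:
        for u in CONTROLS:
            x_next = step(x, u)
            target = utility(x, u) + gamma * min(q[(x_next, v)] for v in CONTROLS)
            new_q[(x, u)] = (1 - alpha) * q[(x, u)] + alpha * target
    return new_q


def deterministic_q_learning(alpha=0.5, gamma=1.0, sweeps=300):
    """Run full-sweep iterations from Q_0 = 0 (gamma = 1.0 is the undiscounted case)."""
    q = {(x, u): 0.0 for x in STATES for u in CONTROLS}
    for _ in range(sweeps):
        q = q_sweep(q, alpha, gamma)
    return q


q_final = deterministic_q_learning()
# Greedy value and control law recovered from the converged table.
value = [min(q_final[(x, u)] for u in CONTROLS) for x in STATES]
policy = [min(CONTROLS, key=lambda u, x=x: q_final[(x, u)]) for x in STATES]
```

On this toy chain the greedy control law moves one step toward state 0 from every other state, and `value` approaches the cost-to-go `[0, 2, 7, 17, 34]`. Setting `gamma` below 1 gives the discounted case under the same assumed update.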