CASIA OpenIR
(本次检索基于用户作品认领结果)

浏览/检索结果: 共14条,第1-10条 帮助

限定条件            
已选(0)清除 条数/页:   排序方式:
Empirical Policy Optimization for n-Player Markov Games 期刊论文
IEEE Transactions on Cybernetics, 2022, 页码: doi={10.1109/TCYB.2022.3179775}
作者:  Yuanheng Zhu;  Weifan Li;  Mengchen Zhao;  Jianye Hao;  Dongbin Zhao
Adobe PDF(1739Kb)  |  收藏  |  浏览/下载:90/36  |  提交时间:2023/04/26
Optimal Pedestrian Evacuation in Building with Consecutive Differential Dynamic Programming 会议论文
, Budapest, Hungary, 2019-7-14
作者:  Zhu YH(朱圆恒);  Haibo He;  Dongbin Zhao;  Zhongsheng Hou
Adobe PDF(679Kb)  |  收藏  |  浏览/下载:55/28  |  提交时间:2023/05/22
Comprehensive comparison of online ADP algorithms for continuous-time optimal control 期刊论文
ARTIFICIAL INTELLIGENCE REVIEW, 2018, 卷号: 49, 期号: 4, 页码: 531-547
作者:  Zhu, Yuanheng;  Zhao, Dongbin
Adobe PDF(766Kb)  |  收藏  |  浏览/下载:407/180  |  提交时间:2017/09/13
Adaptive Dynamic Programming  Policy Iteration  Integral Reinforcement Learning  Experience Replay  Off-policy  
Policy Iteration for H infinity Optimal Control of Polynomial Nonlinear Systems via Sum of Squares Programming 期刊论文
IEEE TRANSACTIONS ON CYBERNETICS, 2018, 卷号: 48, 期号: 2, 页码: 500-509
作者:  Zhu, Yuanheng;  Zhao, Dongbin;  Yang, Xiong;  Zhang, Qichao
Adobe PDF(892Kb)  |  收藏  |  浏览/下载:291/37  |  提交时间:2018/10/10
Adaptive Dynamic Programming (Adp)  h Infinity Optimal Control  Policy Iteration (Pi)  Polynomial Nonlinear Systems  Sum Of Squares (Sos)  
深度强化学习进展: 从 AlphaGo 到 AlphaGo Zero 期刊论文
控 制 理 论 与 应 用, 2017, 卷号: 34, 期号: 12, 页码: 1529-1546
作者:  唐振韬;  邵 坤;  赵冬斌;  朱圆恒
Adobe PDF(8232Kb)  |  收藏  |  浏览/下载:218/33  |  提交时间:2021/07/05
深度强化学习  AlphaGo Zero  深度学习  强化学习  人工智能  
Policy Iteration for Hinfinity Optimal Control of Polynomial Nonlinear Systems via Sum of Squares Programming 期刊论文
IEEE Transactions on Cybernetics, 2017, 期号: PP, 页码: 1-9
作者:  Yuanheng Zhu;  Zhao DB(赵冬斌)
Adobe PDF(894Kb)  |  收藏  |  浏览/下载:335/161  |  提交时间:2017/09/13
Adaptive Dynamic Programming (Adp)  H∞ Optimal Control  Policy Iteration (Pi)  Polynomial Nonlinear Systems  Sum Of Squares (Sos)  
深度强化学习综述:兼论计算机围棋的发展 期刊论文
控制理论与应用, 2016, 卷号: 33, 期号: 6, 页码: 701-717
作者:  赵冬斌;  邵坤;  朱圆恒;  李栋;  陈亚冉;  王海涛;  刘德荣;  周彤;  王成红
Adobe PDF(2816Kb)  |  收藏  |  浏览/下载:1730/635  |  提交时间:2017/09/13
深度强化学习  初弈号  深度学习  强化学习  人工智能  
Thermal Comfort Control Based on MEC Algorithm for HVAC System 会议论文
, Killarney, Ireland, 12-17 July 2015
作者:  Li, Dong;  Zhao, Dongbin;  Zhu, Yuanheng;  Xia, Zhongpu
浏览  |  Adobe PDF(895Kb)  |  收藏  |  浏览/下载:189/75  |  提交时间:2017/12/28
A data-based online reinforcement learning algorithm satisfying probably approximately correct principle 期刊论文
NEURAL COMPUTING & APPLICATIONS, 2015, 卷号: 26, 期号: 4, 页码: 775-787
作者:  Zhu, Yuanheng;  Zhao, Dongbin
Adobe PDF(1331Kb)  |  收藏  |  浏览/下载:253/60  |  提交时间:2015/09/21
Reinforcement Learning  Probably Approximately Correct  Kd-tree  
Convergence analysis and application of fuzzy-HDP for nonlinear discrete-time HJB systems 期刊论文
NEUROCOMPUTING, 2015, 卷号: 149, 页码: 124-131
作者:  Zhu, Yuanheng;  Zhao, Dongbin;  Liu, Derong
浏览  |  Adobe PDF(860Kb)  |  收藏  |  浏览/下载:267/99  |  提交时间:2015/10/13
Discrete-time Nonlinear System  T-s Fuzzy System  Hdp