CASIA OpenIR

Browse/Search Results: 7 items in total, showing items 1-7

Adaptive bias-variance trade-off in advantage estimator for actor-critic algorithms (Journal article)
NEURAL NETWORKS, 2024, Volume: 169, Pages: 764-777
Authors: Chen, Yurou; Zhang, Fengyi; Liu, Zhiyong
Views/Downloads: 57/0 | Submitted: 2024/02/22
Reinforcement Learning  Policy gradient  Actor-critic  Value function  Bias-variance trade-off  
Omnidirectional Drift Control of an Underwater Biomimetic Vehicle-Manipulator System via Reinforcement Learning (Conference paper)
Suzhou, China, May 14-16, 2021
Authors: Ma, Ruichen; Wang, Yu; Wang, Rui; Wang, Shuo
Adobe PDF(855Kb) | Views/Downloads: 118/42 | Submitted: 2023/08/02
Omnidirectional Drift Control  Undulating Fin  Underwater Biomimetic Vehicle-manipulator System (UBVMS)  Reinforcement Learning  Twin Delayed Deep Deterministic policy gradient (TD3)  
Advantage Constrained Proximal Policy Optimization in Multi-Agent Reinforcement Learning (Conference paper)
Queensland, 2023-6
Authors: Li WF(李伟凡); Zhu YH(朱圆恒); Zhao DB(赵冬斌)
Adobe PDF(4104Kb) | Views/Downloads: 256/81 | Submitted: 2023/06/29
multi-agent  reinforcement learning  policy gradient  
Deep Deterministic Policy Gradient for High-Speed Train Trajectory Optimization (Journal article)
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2021, Pages: 13
Authors: Ning, Lingbin; Zhou, Min; Hou, Zhuopu; Goverde, Rob M. P.; Wang, Fei-Yue; Dong, Hairong
Views/Downloads: 202/0 | Submitted: 2022/01/27
Rail transportation  Training  Heuristic algorithms  Resistance  Optimal control  Trajectory optimization  Switches  High-speed railway  train trajectory optimization  deep deterministic policy gradient  energy efficiency  
Conservative Policy Gradient in Multi-critic Setting (Conference paper)
Hangzhou, China, 2019.11.22-24
Authors: Xi, Bao; Wang, Rui; Wang, Shuo; Lu, Tao; Cai, Yinghao
Adobe PDF(379Kb) | Views/Downloads: 236/82 | Submitted: 2021/02/02
inconsistency  stability  Q learning  policy gradient  
Image captioning via hierarchical attention mechanism and policy gradient optimization (Journal article)
SIGNAL PROCESSING, 2020, Volume: 167, Pages: 12
Authors: Yan, Shiyang; Xie, Yuan; Wu, Fangyu; Smith, Jeremy S.; Lu, Wenjin; Zhang, Bailing
Views/Downloads: 211/0 | Submitted: 2020/03/30
Image captioning  Hierarchical attention mechanism  Generative adversarial network  Reinforcement learning  Policy gradient  
Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control (Journal article)
IEEE TRANSACTIONS ON CYBERNETICS, 2017, Volume: 47, Issue: 10, Pages: 3341-3354
Authors: Luo, Biao; Liu, Derong; Wu, Huai-Ning; Wang, Ding; Lewis, Frank L.
Adobe PDF(3217Kb) | Views/Downloads: 620/218 | Submitted: 2016/11/09
Adaptive Control  Adaptive Dynamic Programming (ADP)  Data-based  Off-policy Learning  Optimal Control  Policy Gradient