CASIA OpenIR
(This search is based on users' claimed works)

Browse/search results: 38 items in total, showing items 1-10

Advantage Constrained Proximal Policy Optimization in Multi-Agent Reinforcement Learning (Conference paper)
Queensland, 2023-6
Authors:  Li WF (李伟凡);  Zhu YH (朱圆恒);  Zhao DB (赵冬斌)
Adobe PDF(4104Kb)  |  Views/Downloads: 202/68  |  Submitted: 2023/06/29
multi-agent  reinforcement learning  policy gradient  
Enhanced Rolling Horizon Evolution Algorithm With Opponent Model Learning: Results for the Fighting Game AI Competition (Journal article)
IEEE TRANSACTIONS ON GAMES, 2023, Volume: 5, Issue: 1, Pages: 5-15
Authors:  Zhentao Tang;  Yuanheng Zhu;  Dongbin Zhao;  Simon M. Lucas
Adobe PDF(7686Kb)  |  Views/Downloads: 221/61  |  Submitted: 2021/07/05
Rolling horizon evolution  opponent model  reinforcement learning  supervised learning  fighting game  
A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat (Journal article)
IEEE Transactions on Systems, Man and Cybernetics: Systems, 2023, DOI: 10.1109/TSMC.2023.3270444
Authors:  Jiajun Chai;  Wenzhang Chen;  Yuanheng Zhu;  Zong-xin Yao;  Dongbin Zhao
Adobe PDF(9249Kb)  |  Views/Downloads: 199/108  |  Submitted: 2023/04/26
Online Minimax Q Network Learning for Two-Player Zero-Sum Markov Games (Journal article)
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, Volume: 33, Issue: 3, Pages: 1228-1241
Authors:  Zhu, Yuanheng;  Zhao, Dongbin
Views/Downloads: 196/0  |  Submitted: 2022/06/10
Games  Nash equilibrium  Mathematical model  Markov processes  Convergence  Dynamic programming  Training  Deep reinforcement learning (DRL)  generalized policy iteration (GPI)  Markov game (MG)  Nash equilibrium  Q network  zero sum  
Soft Contrastive Learning with Q-irrelevance Abstraction for Reinforcement Learning (Journal article)
IEEE Transactions on Cognitive and Developmental Systems, 2022, DOI: 10.1109/TCDS.2022.3218940
Authors:  Minsong Liu;  Luntong Li;  Shuai Hao;  Yuanheng Zhu;  Dongbin Zhao
Adobe PDF(12013Kb)  |  Views/Downloads: 68/18  |  Submitted: 2023/04/26
Empirical Policy Optimization for n-Player Markov Games (Journal article)
IEEE Transactions on Cybernetics, 2022, DOI: 10.1109/TCYB.2022.3179775
Authors:  Yuanheng Zhu;  Weifan Li;  Mengchen Zhao;  Jianye Hao;  Dongbin Zhao
Adobe PDF(1739Kb)  |  Views/Downloads: 89/36  |  Submitted: 2023/04/26
UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios (Journal article)
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, Pages: 12
Authors:  Chai, Jiajun;  Li, Weifan;  Zhu, Yuanheng;  Zhao, Dongbin;  Ma, Zhe;  Sun, Kewu;  Ding, Jishiyu
Adobe PDF(3402Kb)  |  Views/Downloads: 230/24  |  Submitted: 2022/01/27
Multi-agent systems  Training  Task analysis  Reinforcement learning  Sun  Learning systems  Semantics  Centralized training with decentralized execution (CTDE)  multiagent  reinforcement learning  StarCraft II  
Optimal Feedback Control of Pedestrian Flow in Heterogeneous Corridors (Journal article)
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2021, Volume: 18, Issue: 3, Pages: 1097-1108
Authors:  Zhu, Yuanheng;  Zhao, Dongbin;  He, Haibo
Views/Downloads: 169/0  |  Submitted: 2021/08/15
Microscopy  Feedback control  Mathematical model  Data models  Dynamic programming  Psychology  Computational modeling  Adaptive dynamic programming (ADP)  heterogeneous corridors  macroscopic pedestrian dynamics  optimal feedback control  pedestrian flow  
Invariant Adaptive Dynamic Programming for Discrete-Time Optimal Control (Journal article)
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, Volume: 50, Issue: 11, Pages: 3959-3971
Authors:  Zhu, Yuanheng;  Zhao, Dongbin;  He, Haibo
Views/Downloads: 162/0  |  Submitted: 2021/01/07
Optimal control  Discrete-time systems  Heuristic algorithms  Dynamic programming  Convergence  Artificial intelligence  Nonlinear systems  Adaptive dynamic programming  discrete-time systems  invariant admissibility  optimal control  policy iteration  sum of squares  
Optimal Pedestrian Evacuation in Building with Consecutive Differential Dynamic Programming (Conference paper)
Budapest, Hungary, 2019-7-14
Authors:  Zhu YH (朱圆恒);  Haibo He;  Dongbin Zhao;  Zhongsheng Hou
Adobe PDF(679Kb)  |  Views/Downloads: 55/28  |  Submitted: 2023/05/22