CASIA OpenIR

Browse/Search results: 10 items in total, showing items 1-10

Token-level Direct Preference Optimization (conference paper)
Vienna, Austria, 2024/7/21-27
Authors:  Zeng, Yongcheng;  Liu, Guoqing;  Ma, Weiyu;  Yang, Ning;  Zhang, Haifeng;  Wang, Jun
Adobe PDF(883Kb)  |  Views/Downloads: 20/5  |  Submitted: 2024/06/05
Advancing Air Combat Tactics with Improved Neural Fictitious Self-Play Reinforcement Learning (conference paper)
Advanced Intelligent Computing Technology and Applications, Zhengzhou, China, 2023-8
Authors:  He SQ(何少钦);  Gao Y(高阳);  Zhang BF(张保丰);  Chang H(常惠);  Zhang XC(张鑫辰)
Adobe PDF(1496Kb)  |  Views/Downloads: 15/7  |  Submitted: 2024/05/31
Air Combat, Reinforcement Learning, Neural Fictitious Self-Play.  
Mingling Foresight with Imagination: Model-Based Cooperative Multi-Agent Reinforcement Learning (conference paper)
New Orleans, LA, USA, November 28 - December 9, 2022
Authors:  Zhiwei Xu;  Dapeng Li;  Bin Zhang;  Yuan Zhan;  Yunpeng Bai;  Guoliang Fan
Adobe PDF(4367Kb)  |  Views/Downloads: 8/2  |  Submitted: 2024/05/28
MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning (conference paper)
Shenzhen, China, 18-22 July 2021
Authors:  Zhiwei Xu;  Dapeng Li;  Yunpeng Bai;  Guoliang Fan
Adobe PDF(3892Kb)  |  Views/Downloads: 8/3  |  Submitted: 2024/05/28
Potential Driven Reinforcement Learning for Hard Exploration Tasks (conference paper)
Online, 2020-4
Authors:  Zhao EM(赵恩民);  Deng SH(邓诗弘);  Zang YF(臧一凡);  Kang YX(康永欣);  Li K(李凯);  Xing JL(兴军亮)
Adobe PDF(1999Kb)  |  Views/Downloads: 91/34  |  Submitted: 2023/06/29
An Approximate Neuro-Optimal Solution of Discounted Guaranteed Cost Control Design (journal article)
IEEE TRANSACTIONS ON CYBERNETICS, 2022, Volume: 52, Issue: 1, Pages: 77-86
Authors:  Wang, Ding;  Qiao, Junfei;  Cheng, Long
Views/Downloads: 252/0  |  Submitted: 2022/03/17
Control design; Cost function; Optimal control; Nonlinear systems; Adaptive systems; Switches; Adaptive learning system; discount factor; guaranteed cost function; neuro-optimal control; uncertainty
Spiking Adaptive Dynamic Programming Based on Poisson Process for Discrete-Time Nonlinear Systems (journal article)
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, Pages: 11
Authors:  Wei, Qinglai;  Han, Liyuan;  Zhang, Tielin
Adobe PDF(2904Kb)  |  Views/Downloads: 201/3  |  Submitted: 2022/01/27
Maximum likelihood estimation (MLE); Nonlinear systems; Optimal control; Poisson process; Spike train; Spiking Adaptive dynamic programming (SADP)
Time-sequence Action-Decision and Navigation Through Stage Deep Reinforcement Learning in Complex Dynamic Environments (conference paper)
Xiamen, 2019.12
Authors:  Huimu, Wang;  Tenghai, Qiu;  Zhen, Liu;  Zhiqiang, Pu;  Jianqiang, Yi;  Zhaoyang, Liu
Adobe PDF(2178Kb)  |  Views/Downloads: 182/52  |  Submitted: 2021/06/24
Mixing Update Q-value for Deep Reinforcement Learning (conference paper)
Budapest, Hungary, 2019/7/14-19
Authors:  Li Zhunan;  Hou Xinwen
Adobe PDF(468Kb)  |  Views/Downloads: 179/73  |  Submitted: 2020/06/10
Inverse reinforcement learning-based time-dependent A* planner for human-aware robot navigation with local vision (journal article)
ADVANCED ROBOTICS, 2020, Pages: 14
Authors:  Sun Shiying;  Zhao Xiaoguang;  Li Qianzhong;  Tan Min
Adobe PDF(3726Kb)  |  Views/Downloads: 302/47  |  Submitted: 2020/06/02
Human-aware navigation; inverse reinforcement learning; path planning; service robot