CASIA OpenIR

浏览/检索结果: 共31条,第1-10条 帮助

限定条件    
已选(0)清除 条数/页:   排序方式:
Learning Superior Cooperative Policy in Competitive Multi-team Reinforcement Learning 会议论文
, Gold Coast, Australia, 2023-6
作者:  Qingxu Fu;  Tenghai Qiu;  Zhiqiang Pu;  Jianqiang Yi;  Xiaolin Ai;  Wanmai Yuan
Adobe PDF(25675Kb)  |  收藏  |  浏览/下载:52/13  |  提交时间:2024/06/05
Learning Heterogeneous Agent Cooperation via Multiagent League Training 期刊论文
IFAC World Congress, 2023, 页码: IFAC PapersOnLine 56-2 (2023) 3033-3040
作者:  Qingxu, Fu;  Xiaolin Ai;  Jianqiang Yi;  Tenghai Qiu;  Wanmai Yuan;  Zhiqiang Pu
Adobe PDF(996Kb)  |  收藏  |  浏览/下载:50/15  |  提交时间:2024/06/05
Reward Estimation with Scheduled Knowledge Distillation for Dialogue Policy Learning 期刊论文
Connection Science, 2023, 卷号: 35, 期号: 1, 页码: 2174078
作者:  Qiu JY(邱俊彦);  Haidong Zhang;  Yiping Yang
Adobe PDF(831Kb)  |  收藏  |  浏览/下载:64/23  |  提交时间:2024/05/29
reinforcement learning  dialogue policy learning  curriculum learning  knowledge distillation  
Constrained-cost adaptive dynamic programming for optimal control of discrete-time nonlinear systems 期刊论文
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 卷号: 35, 期号: 3, 页码: 3251 - 3264
作者:  Wei, Qinglai;  Li, Tao
Adobe PDF(8471Kb)  |  收藏  |  浏览/下载:70/26  |  提交时间:2024/05/28
Adaptive dynamic programming  approximate dynamic programming  constrained cost  optimal control  reinforcement learning  
Dual Self-Awareness Value Decomposition Framework without Individual Global Max for Cooperative MARL 会议论文
, New Orleans, LA, USA, December 10-16, 2023
作者:  Zhiwei Xu;  Bin Zhang;  Dapeng Li;  Guangchong Zhou;  Zeren Zhang;  Guoliang Fan
Adobe PDF(8700Kb)  |  收藏  |  浏览/下载:59/17  |  提交时间:2024/05/28
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism 会议论文
, Washington, DC, USA, February 7-14, 2023
作者:  Zhiwei Xu;  Yunpeng Bai;  Bin Zhang;  Dapeng Li;  Guoliang Fan
Adobe PDF(3345Kb)  |  收藏  |  浏览/下载:43/12  |  提交时间:2024/05/28
航天器威胁规避智能自主控制技术研究综述 期刊论文
自动化学报, 2023, 卷号: 49, 期号: 2, 页码: 229-245
作者:  袁利;  姜甜甜
Adobe PDF(2092Kb)  |  收藏  |  浏览/下载:69/21  |  提交时间:2024/05/09
轨道威胁感知  自主决策规划  “感知-决策-执行”一体化  航天器智能自主控制  
安全强化学习综述 期刊论文
自动化学报, 2023, 卷号: 49, 期号: 9, 页码: 1813-1835
作者:  王雪松;  王荣荣;  程玉虎
Adobe PDF(1356Kb)  |  收藏  |  浏览/下载:69/30  |  提交时间:2024/04/24
安全强化学习  约束马尔科夫决策过程  学习过程  学习目标  离线强化学习  
Federated Learning on Multimodal Data: A Comprehensive Survey 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 4, 页码: 539-553
作者:  Yi-Ming Lin;   Yuan Gao;  Mao-Guo Gong;  Si-Jia Zhang;  Yuan-Qiao Zhang;  Zhi-Yuan Li
Adobe PDF(1253Kb)  |  收藏  |  浏览/下载:35/15  |  提交时间:2024/04/23
Federated learning, multimodal learning, heterogeneous data, edge computing, collaborative learning  
A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-oriented Dialogue Policy Learning 期刊论文
Machine Intelligence Research, 2023, 卷号: 20, 期号: 3, 页码: 318-334
作者:  Wai-Chung Kwan;  Hong-Ru Wang;  Hui-Min Wang;  Kam-Fai Wong
Adobe PDF(2211Kb)  |  收藏  |  浏览/下载:25/9  |  提交时间:2024/04/23
Dialogue policy learning (DPL), task-oriented dialogue system (TOD), reinforcement learning (RL), dialogue system, Markov decision process