CASIA OpenIR

浏览/检索结果: 共30条,第1-10条 帮助

限定条件    
已选(0)清除 条数/页:   排序方式:
Latent Landmark Graph for Efficient Exploration-Exploitation Balance in Hierarchical Reinforcement Learning 期刊论文
Machine Intelligence Research, 2023, 页码: 158
作者:  Zhang Qingyang;  Zhang Hongming;  Xing Dengpeng;  Bo Xu
Adobe PDF(9639Kb)  |  收藏  |  浏览/下载:7/5  |  提交时间:2024/06/25
LEGO: A Multi-agent Collaborative Framework with Role-playing and Iterative Feedback for Causality Explanation Generation 会议论文
, Singapore, 2023-12
作者:  Zhitao He;  Pengfei Cao;  Yubo Chen;  Kang Liu;  Jun Zhao
Adobe PDF(1153Kb)  |  收藏  |  浏览/下载:4/2  |  提交时间:2024/06/25
M3: Modularization for Multi-task and Multi-agent Offline Pre-training 会议论文
, London, United Kingdom, 2023.5.29-2023.6.2
作者:  Meng Linghui;  Ruan Jingqing;  Xiong Xuantang;  Li Xiyun;  Zhang Xi;  Xing Dengpeng;  Xu Bo
Adobe PDF(1302Kb)  |  收藏  |  浏览/下载:18/4  |  提交时间:2024/06/11
Resizemix: Mixing data with preserved object information and true labels 期刊论文
Computational Visual Media, 2023, 页码: --
作者:  Jie Qin;  Jiemin Fang;  Qian Zhang;  Wenyu Liu;  Xingang Wang;  Xinggang Wang
Adobe PDF(9105Kb)  |  收藏  |  浏览/下载:12/4  |  提交时间:2024/06/04
Reward Estimation with Scheduled Knowledge Distillation for Dialogue Policy Learning 期刊论文
Connection Science, 2023, 卷号: 35, 期号: 1, 页码: 2174078
作者:  Qiu JY(邱俊彦);  Haidong Zhang;  Yiping Yang
Adobe PDF(831Kb)  |  收藏  |  浏览/下载:28/10  |  提交时间:2024/05/29
reinforcement learning  dialogue policy learning  curriculum learning  knowledge distillation  
Explicitly Learning Policy Under Partial Observability in Multiagent Reinforcement Learning 会议论文
, Queensland, Australia, 2023-6
作者:  Yang, Chen;  Yang, Guangkai;  Chen, Hao;  Zhang, Junge
Adobe PDF(3027Kb)  |  收藏  |  浏览/下载:38/15  |  提交时间:2024/05/29
Constrained-cost adaptive dynamic programming for optimal control of discrete-time nonlinear systems 期刊论文
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 卷号: 35, 期号: 3, 页码: 3251 - 3264
作者:  Wei, Qinglai;  Li, Tao
Adobe PDF(8471Kb)  |  收藏  |  浏览/下载:34/12  |  提交时间:2024/05/28
Adaptive dynamic programming  approximate dynamic programming  constrained cost  optimal control  reinforcement learning  
HAVEN: Hierarchical Cooperative Multi-Agent Reinforcement Learning with Dual Coordination Mechanism 会议论文
, Washington, DC, USA, February 7-14, 2023
作者:  Zhiwei Xu;  Yunpeng Bai;  Bin Zhang;  Dapeng Li;  Guoliang Fan
Adobe PDF(3345Kb)  |  收藏  |  浏览/下载:23/6  |  提交时间:2024/05/28
未知非线性零和博弈最优跟踪的事件触发控制设计 期刊论文
自动化学报, 2023, 卷号: 49, 期号: 1, 页码: 91-101
作者:  王鼎;  胡凌治;  赵明明;  哈明鸣;  乔俊飞
Adobe PDF(1996Kb)  |  收藏  |  浏览/下载:38/15  |  提交时间:2024/05/09
自适应评判设计  事件触发控制  神经网络  最优跟踪控制  稳定性分析  零和博弈  
多智能体博弈、学习与控制 期刊论文
自动化学报, 2023, 卷号: 49, 期号: 3, 页码: 580-613
作者:  王龙;  黄锋
Adobe PDF(2088Kb)  |  收藏  |  浏览/下载:18/5  |  提交时间:2024/05/09
博弈论  多智能体学习  控制论  强化学习  人工智能