CASIA OpenIR

Browse/Search results: 52 items in total, showing items 1-10

A Survey of Recent Advances in Commonsense Knowledge Acquisition: Methods and Resources  Journal article
Machine Intelligence Research, 2024, Pages: 1
Authors: Wang, Chenhao; Li, Jiachun; Chen, Yubo; Liu, Kang; Zhao, Jun
Adobe PDF(1228Kb)  |  Views/Downloads: 5/1  |  Submitted: 2024/06/25
Bidirectional Sentence Ordering with Interactive Decoding  Journal article
ACM Transactions on Asian and Low-Resource Language Information Processing, 2023, Volume: 22, Issue: 2, Pages: 1-15
Authors: Guirong Bai; Shizhu HE; Kang Liu; Jun Zhao
Adobe PDF(1080Kb)  |  Views/Downloads: 13/4  |  Submitted: 2024/06/20
MoDE-CoTD: Chain-of-Thought Distillation for Complex Reasoning Tasks with Mixture of Decoupled LoRA-Experts  Conference paper
Torino (Italia), 2024.5.20 - 2024.5.25
Authors: Xiang Li; Shizhu He; Jiayu Wu; Zhao Yang; Yao Xu; Yang Jun; Haifeng Liu; Kang Liu; Jun Zhao
Adobe PDF(1062Kb)  |  Views/Downloads: 7/3  |  Submitted: 2024/06/20
Teaching Small Language Models to Reason for Knowledge-Intensive Multi-Hop Question Answering  Conference paper
Bangkok, Thailand, 2024.08.11-2024.08.16
Authors: Xiang Li; Shizhu HE; Fangyu Lei; Jun Yang; Tianhuang Su; Kang Liu; Jun Zhao
Adobe PDF(873Kb)  |  Views/Downloads: 13/4  |  Submitted: 2024/06/20
M3: Modularization for Multi-task and Multi-agent Offline Pre-training  Conference paper
London, United Kingdom, 2023.5.29-2023.6.2
Authors: Meng Linghui; Ruan Jingqing; Xiong Xuantang; Li Xiyun; Zhang Xi; Xing Dengpeng; Xu Bo
Adobe PDF(1302Kb)  |  Views/Downloads: 12/3  |  Submitted: 2024/06/11
A New Pre-Training Paradigm for Offline Multi-Agent Reinforcement Learning with Suboptimal Data  Conference paper
Seoul, Korea, 2024.4.14-2024.4.19
Authors: Meng Linghui; Zhang Xi; Xing Dengpeng; Xu Bo
Adobe PDF(964Kb)  |  Views/Downloads: 13/5  |  Submitted: 2024/06/11
Continuous Exploration via Multiple Perspectives in Sparse Reward Environment  Conference paper
Xiamen International Conference Center, 2023-10-13
Authors: Chen ZP (陈忠鹏); Guan Q (关强)
Adobe PDF(2260Kb)  |  Views/Downloads: 22/7  |  Submitted: 2024/06/04
Reinforcement Learning · Exploration Strategy · Sparse Reward · Intrinsic Motivation  
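Research on Intelligent Air Combat Algorithms Based on a Self-Play Framework in Sparse Reward Environments (稀疏奖励环境下基于自博弈框架的智能空战算法研究)  Thesis
2024
Author: He Shaoqin (何少钦)
Adobe PDF(4570Kb)  |  Views/Downloads: 26/1  |  Submitted: 2024/05/30
Reinforcement Learning · Offline Reinforcement Learning · Air Combat · Intelligent Decision-Making · Curiosity Mechanism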
Leros: Learning Explicit Reasoning on Synthesized Data for Commonsense Question Answering  Conference paper
Torino, Italia, 2024-5
Authors: Wang, Chenhao; Cao, Pengfei; Li, Jiachun; Chen, Yubo; Liu, Kang; Jiang, Xiaojian; Xu, Jiexin; Li, Qiuxia; Jun Zhao
Adobe PDF(909Kb)  |  Views/Downloads: 24/6  |  Submitted: 2024/05/30
Consensus Learning for Cooperative Multi-Agent Reinforcement Learning  Conference paper
Washington, DC, USA, February 7-14, 2023
Authors: Zhiwei Xu; Bin Zhang; Dapeng Li; Zeren Zhang; Guangchong Zhou; Hao Chen; Guoliang Fan
Adobe PDF(4141Kb)  |  Views/Downloads: 21/7  |  Submitted: 2024/05/28