CASIA OpenIR

Browse/search results: 18 items in total, showing items 1-10

Adaptive Multi-Agent Coordination among Different Team Attribute Tasks via Contextual Meta-Reinforcement Learning (conference paper)
Kaifeng, Henan, May 17-19, 2024
Authors: Huang, Shangjing; Zhao, Zijie; Zhu, Yuanheng; Zhao, Dongbin
Adobe PDF (15515 Kb) | Views/Downloads: 32/11 | Submitted: 2024/06/26
User Response Modeling in Reinforcement Learning for Ads Allocation (conference paper)
Singapore, May 13-17, 2024
Authors: Zhang, Zhiyuan; Zhang, Qichao; Wu, Xiaoxu; Shi, Xiaowen; Liao, Guogang; Wang, Yongkong; Wang, Xingxing; Zhao, Dongbin
Adobe PDF (2077 Kb) | Views/Downloads: 42/18 | Submitted: 2024/06/25
Keywords: Ads Allocation; Reinforcement Learning; User Response Modeling
MULFE: A Multi-Level Benchmark for Free Text Model Editing (conference paper)
Bangkok, Thailand, 2024-08
Authors: Wang, Chenhao; Cao, Pengfei; Jin, Zhuoran; Chen, Yubo; Zeng, Daojian; Liu, Kang; Zhao, Jun
Adobe PDF (571 Kb) | Views/Downloads: 26/11 | Submitted: 2024/06/25
Review on Peg-in-Hole Insertion Technology Based on Reinforcement Learning (conference paper)
Chongqing, China, 2023-11
Authors: Shen Liancheng; Su Jianhua; Zhang Xiaodong
Adobe PDF (254 Kb) | Views/Downloads: 46/20 | Submitted: 2024/06/24
Keywords: Robot Peg-in-hole Insertion; Reinforcement Learning; Meta-Reinforcement Learning
MoDE-CoTD: Chain-of-Thought Distillation for Complex Reasoning Tasks with Mixture of Decoupled LoRA-Experts (conference paper)
Torino (Italia), 2024.5.20 - 2024.5.25
Authors: Xiang Li; Shizhu He; Jiayu Wu; Zhao Yang; Yao Xu; Yang Jun; Haifeng Liu; Kang Liu; Jun Zhao
Adobe PDF (1062 Kb) | Views/Downloads: 40/11 | Submitted: 2024/06/20
Teaching Small Language Models to Reason for Knowledge-Intensive Multi-Hop Question Answering (conference paper)
Bangkok, Thailand, 2024.08.11-2024.08.16
Authors: Xiang Li; Shizhu He; Fangyu Lei; Jun Yang; Tianhuang Su; Kang Liu; Jun Zhao
Adobe PDF (873 Kb) | Views/Downloads: 45/16 | Submitted: 2024/06/20
Bridging the Gap between Different Vocabularies for LLM Ensemble (conference paper)
Mexico City, Mexico, June 16–21, 2024
Authors: 徐杨一帆; Lu JL (陆金梁); Zhang JJ (张家俊)
Adobe PDF (1982 Kb) | Views/Downloads: 70/23 | Submitted: 2024/06/13
Interpretable Autonomous Driving Model Based on Cognitive Reinforcement Learning (conference paper)
Jeju, Korea, Jun. 02-05, 2024
Authors: Yijia Li; Hao Qi; Fenghua Zhu; Yisheng Lv; Peijun Ye
Adobe PDF (87 Kb) | Views/Downloads: 51/25 | Submitted: 2024/06/06
TIM: An Efficient Temporal Interaction Module for Spiking Transformer (conference paper)
Jeju, Korea, 2024-08
Authors: Shen, Sicheng; Zhao, Dongcheng; Shen, Guobin; Zeng, Yi
Adobe PDF (717 Kb) | Views/Downloads: 37/6 | Submitted: 2024/06/06
Token-level Direct Preference Optimization (conference paper)
Vienna, Austria, 2024/7/21-27
Authors: Zeng, Yongcheng; Liu, Guoqing; Ma, Weiyu; Yang, Ning; Zhang, Haifeng; Wang, Jun
Adobe PDF (883 Kb) | Views/Downloads: 71/24 | Submitted: 2024/06/05