CASIA OpenIR

Browse/Search results: 5 items in total, showing items 1-5

Learning State-Specific Action Masks for Reinforcement Learning (Journal article)
Algorithms, 2024, Volume: 17, Issue: 2, Page: 60
Authors:  Wang ZY (王梓薏);  Li XR (李欣然);  Sun LY (孙罗洋);  Zhang HF (张海峰);  Liu HL (刘华林);  Jun Wang
Adobe PDF (2976Kb)  |  Views/Downloads: 30/12  |  Submitted: 2024/07/05
reinforcement learning  exploration efficiency  space reduction  
MapGuide: A Simple yet Effective Method to Reconstruct Continuous Language from Brain Activities (Conference paper)
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Mexico City, Mexico, 2024-6
Authors:  Xinpei, Zhao;  Jingyuan, Sun;  Shaonan, Wang;  Jing, Ye;  Xiaohan, Zhang;  Chengqing, Zong
Adobe PDF (843Kb)  |  Views/Downloads: 26/7  |  Submitted: 2024/06/27
neural decoding  
Review on Peg-in-Hole Insertion Technology Based on Reinforcement Learning (Conference paper)
, Chongqing, China, 2023-11
Authors:  Shen Liancheng;  Su Jianhua;  Zhang Xiaodong
Adobe PDF (254Kb)  |  Views/Downloads: 35/18  |  Submitted: 2024/06/24
Robot Peg-in-hole Insertion  Reinforcement Learning  Meta-Reinforcement Learning  
Training Large Language Models to Follow System Prompt with Self-Supervised Fine-Tuning (Conference paper)
, YOKOHAMA, JAPAN, 2024-07
Authors:  Junyan Qiu;  Haitao Wang;  Yiping Yang
Adobe PDF (1596Kb)  |  Views/Downloads: 42/17  |  Submitted: 2024/06/17
large language models  supervised fine-tuning  instruct tuning  stylized generation  
Token-level Direct Preference Optimization (Conference paper)
, Vienna, Austria, 2024/7/21-27
Authors:  Zeng,Yongcheng;  Liu,Guoqing;  Ma,Weiyu;  Yang,Ning;  Zhang,Haifeng;  Wang,Jun
Adobe PDF (883Kb)  |  Views/Downloads: 66/22  |  Submitted: 2024/06/05