Offline reinforcement learning with representations for actions
Lou, Xingzhou1,2; Yin, Qiyue1; Zhang, Junge1; Yu, Chao3; He, Zhaofeng4; Cheng, Nengjie5; Huang, Kaiqi1
Journal: INFORMATION SCIENCES
ISSN: 0020-0255
2022-09-01
Volume: 610, Pages: 746-758
Corresponding author: Zhang, Junge
Abstract: Prevailing offline reinforcement learning (RL) methods limit the policy to the area supported by the offline dataset to avoid the distributional shift problem. But potential high-reward actions, which are out of the distribution of the dataset, are neglected in these methods. To address this issue, we propose a new method, which generalizes from the offline dataset to out-of-distribution (OOD) actions. Specifically, we design a novel action embedding model to help infer the effect of actions. As a result, our value function achieves better generalization over the action space, and further alleviates the distributional shift caused by overestimation of OOD actions. Theoretically, we give an information-theoretic explanation of the improvement of the value function's generalization over the action space. Experiments on D4RL demonstrate that our model improves the performance compared to previous offline RL methods, especially when the experience in the offline dataset is good. We conduct a further study and validate that the value function's generalization on OOD actions is improved, which reinforces the effectiveness of our proposed action embedding model. (c) 2022 Published by Elsevier Inc.
Keywords: Offline reinforcement learning; Action embedding
DOI: 10.1016/j.ins.2022.08.019
Indexed by: SCI
Language: English
Funding project: National Natural Science Foundation of China [61876181]; Beijing Nova Program of Science and Technology [Z191100001119043]; Youth Innovation Promotion Association, CAS
Funding organization: National Natural Science Foundation of China; Beijing Nova Program of Science and Technology; Youth Innovation Promotion Association, CAS
WOS research area: Computer Science
WOS subject category: Computer Science, Information Systems
WOS accession number: WOS:000860782400007
Publisher: ELSEVIER SCIENCE INC
Citation statistics
Times cited: 2 [WOS]
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/50377
Collection: Intelligent Systems and Engineering
Corresponding author: Zhang, Junge
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
3. Sun Yat Sen Univ, Guangzhou, Peoples R China
4. Beijing Univ Posts & Telecommun, Beijing, Peoples R China
5. Nanchang Univ, Nanchang, Peoples R China
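First author affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding author affiliation: Institute of Automation, Chinese Academy of Sciences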
Recommended citation
GB/T 7714
Lou, Xingzhou,Yin, Qiyue,Zhang, Junge,et al. Offline reinforcement learning with representations for actions[J]. INFORMATION SCIENCES,2022,610:746-758.
APA Lou, Xingzhou.,Yin, Qiyue.,Zhang, Junge.,Yu, Chao.,He, Zhaofeng.,...&Huang, Kaiqi.(2022).Offline reinforcement learning with representations for actions.INFORMATION SCIENCES,610,746-758.
MLA Lou, Xingzhou,et al."Offline reinforcement learning with representations for actions".INFORMATION SCIENCES 610(2022):746-758.