SADRL: Merging human experience with machine intelligence via supervised assisted deep reinforcement learning
Li, Xiaoshuang (1,2); Wang, Xiao (1,3); Zheng, Xinhu (4); Jin, Junchen (5); Huang, Yanhao (6); Zhang, Jun Jason (1,7); Wang, Fei-Yue (1,3)
Source Publication: NEUROCOMPUTING
ISSN: 0925-2312
2022-01-07
Volume: 467, Pages: 300-309
Abstract

Deep Reinforcement Learning (DRL) has proven its capability to learn optimal policies in decision-making problems by directly interacting with environments. Meanwhile, supervised learning methods are highly effective at learning from data. However, combining DRL with supervised learning so that additional knowledge and data can assist the DRL agent remains difficult. This study proposes a novel Supervised Assisted Deep Reinforcement Learning (SADRL) framework that integrates deep Q-learning from dynamic demonstrations with a behavioral cloning model (DQfDD-BC). Specifically, the proposed DQfDD-BC method uses historical demonstrations to pre-train a behavioral cloning (BC) model and continually updates it by learning from the dynamically updated demonstrations. A supervised expert loss function compares actions generated by the DRL model with those obtained from the BC model to provide advantageous guidance for policy improvements. Experimental results in several OpenAI Gym environments show that the proposed approach accelerates learning and adapts to demonstrations of different performance levels. As illustrated in an ablation study, the dynamic demonstration and expert loss mechanisms using a BC model improve learning convergence compared with the baseline models. We believe that SADRL provides an elegant framework and that the proposed method can promote the integration of human experience and machine intelligence. (c) 2021 Elsevier B.V. All rights reserved.
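
The abstract does not give the form of the supervised expert loss. As a rough illustration only, the sketch below assumes a large-margin loss in the style of deep Q-learning from demonstrations, with the expert action taken from the BC model rather than from a stored demonstration. The function name `expert_margin_loss`, the `margin` value, and the exact formula are assumptions for illustration, not the authors' implementation.

```python
def expert_margin_loss(q_values, bc_actions, margin=0.8):
    """Assumed large-margin expert loss (illustrative, not from the paper).

    q_values:   list of per-state lists of Q-values from the DRL model.
    bc_actions: list of action indices proposed by the BC model.
    margin:     hypothetical penalty added to every non-BC action.
    """
    losses = []
    for q_row, a_bc in zip(q_values, bc_actions):
        # Add the margin to every action except the one the BC model chose.
        augmented = [
            q + (0.0 if a == a_bc else margin)
            for a, q in enumerate(q_row)
        ]
        # max_a [Q(s, a) + l(a_bc, a)] - Q(s, a_bc); never negative.
        losses.append(max(augmented) - q_row[a_bc])
    return sum(losses) / len(losses)


# The loss is zero when the BC action already beats all others by the margin,
# and positive when the DRL model prefers a different action.
agree = expert_margin_loss([[1.0, 2.0], [3.0, 0.0]], [1, 0])
disagree = expert_margin_loss([[1.0, 2.0], [3.0, 0.0]], [0, 1])
```

Under this assumed form, minimizing the loss pushes the Q-value of the BC model's action above the others by at least the margin, which is one plausible way for a BC model to guide policy improvement.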

Keyword: Deep reinforcement learning; Behavioral cloning; Dynamic demonstration; Double DQN
DOI: 10.1016/j.neucom.2021.09.064
WOS Keyword: LEVEL CONTROL; ROBOT; GAME; GO
Indexed By: SCI
Language: English
Funding Project: National Key R&D Program of China [2018AAA0101500]; National Key R&D Program of China [2018AAA0101502]
Funding Organization: National Key R&D Program of China
WOS Research Area: Computer Science
WOS Subject: Computer Science, Artificial Intelligence
WOS ID: WOS:000709984900012
Publisher: ELSEVIER
Sub direction classification: Human-machine hybrid simulation and decision-making
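Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/46292
Collection: State Key Laboratory of Management and Control for Complex Systems_Parallel Intelligent Technology and Systems Team
Corresponding Author: Wang, Fei-Yue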
Affiliation:
1. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
3. Qingdao Acad Intelligent Ind, Parallel Intelligence Res Ctr, Qingdao 266109, Peoples R China
4. Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
5. PCITECH, PCI Intelligent Bldg, 2 Xincen Fourth Rd, Guangzhou 510653, Peoples R China
6. China Elect Power Res Inst, State Key Lab Power Grid Safety & Energy Conserva, Beijing 100192, Peoples R China
7. Wuhan Univ, Sch Elect Engn & Automat, Wuhan 430072, Peoples R China
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714: Li, Xiaoshuang, Wang, Xiao, Zheng, Xinhu, et al. SADRL: Merging human experience with machine intelligence via supervised assisted deep reinforcement learning[J]. NEUROCOMPUTING, 2022, 467: 300-309.
APA: Li, Xiaoshuang., Wang, Xiao., Zheng, Xinhu., Jin, Junchen., Huang, Yanhao., ... & Wang, Fei-Yue. (2022). SADRL: Merging human experience with machine intelligence via supervised assisted deep reinforcement learning. NEUROCOMPUTING, 467, 300-309.
MLA: Li, Xiaoshuang, et al. "SADRL: Merging human experience with machine intelligence via supervised assisted deep reinforcement learning". NEUROCOMPUTING 467 (2022): 300-309.
Files in This Item:
File Name/Size | DocType | Version | Access | License
Li et al_2022_SADRL.(1244KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA
File name: Li et al_2022_SADRL.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.