Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces
Wang JP(王军平); You Kang Shi; Wen Sheng Zhang; Ian Thomas; Shi Hui Duan
Journal: IEEE Transactions on Industrial Informatics
Year: 2019
Volume: 15, Issue: 4, Pages: 2395-2404
Abstract
The sequential decision-making problem with large-scale state spaces is an important and challenging topic in multitask reinforcement learning (MTRL). Training near-optimal policies across tasks suffers from a lack of prior knowledge in discrete-time nonlinear environments, especially under continuous task variations, so scalable approaches are needed to transfer prior knowledge to new tasks when the number of tasks is large. This paper proposes a multitask policy adversarial learning (MTPAL) method for learning a nonlinear feedback policy that generalizes across multiple tasks, bringing the cognitive ability of a robot much closer to human-level decision making. The key idea is to construct a parametrized policy model directly from large, high-dimensional observations using deep function approximators, and then to train an optimal sequential decision policy for each new task through an adversarial process in which two models are trained simultaneously: a multitask policy generator, which transforms samples drawn from a prior distribution into samples from a complex data distribution of higher dimensionality, and a multitask policy discriminator, which decides whether a given sample comes from the empirically derived human-level prior distribution or from the generator. All of the related empirically derived human-level knowledge is integrated into the sequential decision policy, transferring human-level policy at every layer of a deep policy network. Extensive experimental results on four different WeiChai Power manufacturing data sets show that the approach can surpass human performance simultaneously across tasks ranging from cart-pole to production assembly control.
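The adversarial process described in the abstract can be illustrated with a deliberately tiny sketch. It is not the MTPAL method from the paper: it is a generic one-dimensional generator/discriminator loop written only to show the two-model training pattern. Everything in it is an assumption made for illustration, including the linear generator `G(z) = a*z + b`, the logistic discriminator `D(x) = sigmoid(w*x + c)`, the Gaussian stand-in for human-derived samples, the weight-decay term, and the function name `train_adversarial`.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def train_adversarial(steps=2000, lr=0.01, decay=0.1, seed=0):
    """Alternate one discriminator step and one generator step per sample."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters (hypothetical toy model)
    w, c = 0.0, 0.0  # discriminator parameters (hypothetical toy model)
    for _ in range(steps):
        # Assumed stand-in for a sample derived from human experience.
        x_real = rng.gauss(4.0, 0.5)
        # The generator maps a prior sample z to a synthetic sample.
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # Discriminator step: minimize -log D(x_real) - log(1 - D(x_fake)),
        # plus a small weight decay on w (an added stabilizer, assumed here).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - d_real) * x_real + d_fake * x_fake + decay * w
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(x_fake) (non-saturating loss).
        d_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - d_fake) * w
        a -= lr * grad_x * z
        b -= lr * grad_x
    return {"a": a, "b": b, "w": w, "c": c}


if __name__ == "__main__":
    print(train_adversarial())
```

In this sketch the discriminator is pushed to score the stand-in human samples high and generated samples low, while the generator is pushed in the opposite direction on its own samples. A deep policy network operating on high-dimensional observations, as the abstract describes, would replace both toy models.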
Indexed by: SCI
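Seven major directions, sub-direction classification: Machine learning
State Key Laboratory planned direction classification: Cognitive mechanisms and brain-inspired learning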
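Document type: Journal article
Item identifier: http://ir.ia.ac.cn/handle/173211/51634
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Artificial Intelligence and Machine Learning (杨雪冰) - Technical Team
Corresponding author: Wang JP(王军平)
Author affiliation: Institute of Automation, Chinese Academy of Sciences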
Recommended citation
GB/T 7714: Wang JP, You Kang Shi, Wen Sheng Zhang, et al. Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces[J]. IEEE Transactions on Industrial Informatics, 2019, 15(4): 2395-2404.
APA: Wang JP, You Kang Shi, Wen Sheng Zhang, Ian Thomas, & Shi Hui Duan. (2019). Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces. IEEE Transactions on Industrial Informatics, 15(4), 2395-2404.
MLA: Wang JP, et al. "Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces." IEEE Transactions on Industrial Informatics 15.4 (2019): 2395-2404.
Files in this item
File name: Multitask_Policy_Adv (2547KB); Document type: Journal article; Version: Author accepted manuscript; Access: Open access; License: CC BY-NC-SA