Knowledge Commons of Institute of Automation, CAS
Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces
Wang JP (王军平)
Journal | IEEE Transactions on Industrial Informatics |
Publication year | 2019 |
Volume | 15; Issue: 4; Pages: 2395-2404 |
Abstract | Sequential decision making in large-scale state spaces is an important and challenging topic in multitask reinforcement learning (MTRL). Training near-optimal policies across tasks suffers from a lack of prior knowledge in discrete-time nonlinear environments, especially under continuous task variations, so scalable approaches are needed to transfer prior knowledge to new tasks when the number of tasks is large. This paper proposes a multitask policy adversarial learning (MTPAL) method for learning a nonlinear feedback policy that generalizes across multiple tasks, bringing a robot's cognitive ability much closer to human-level decision making. The key idea is to construct a parametrized policy model directly from large, high-dimensional observations using deep function approximators, and then to train an optimal sequential decision policy for each new task through an adversarial process in which two models are trained simultaneously: a multitask policy generator, which transforms samples drawn from a prior distribution into samples from a complex, higher-dimensional data distribution, and a multitask policy discriminator, which decides whether a given sample comes from the empirically derived human-level prior or from the generator. All relevant empirically derived human-level knowledge is integrated into the sequential decision policy, transferring the human-level policy at every layer of a deep policy network. Extensive experiments on four different WeiChai Power manufacturing data sets show that our approach can surpass human performance on tasks ranging from cart-pole to production assembly control. |
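Indexed by | SCI |
Representative paper | Yes |
Seven major research directions: sub-direction | Machine learning |
State Key Laboratory planned research direction | Cognitive mechanisms and brain-inspired learning |
Associated dataset requiring deposit | No |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/51634 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems_Artificial Intelligence and Machine Learning (Yang Xuebing, 杨雪冰) Technical Team |
Corresponding author | Wang JP (王军平) |
Author affiliation | Institute of Automation, Chinese Academy of Sciences |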
Recommended citation: GB/T 7714 | Wang JP, You Kang Shi, Wen Sheng Zhang, et al. Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces[J]. IEEE Transactions on Industrial Informatics, 2019, 15(4): 2395-2404. |
APA | Wang JP, You Kang Shi, Wen Sheng Zhang, Ian Thomas, & Shi Hui Duan. (2019). Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces. IEEE Transactions on Industrial Informatics, 15(4), 2395-2404. |
MLA | Wang JP, et al. "Multitask Policy Adversarial Learning for Human-Level Control With Large State Spaces." IEEE Transactions on Industrial Informatics 15.4 (2019): 2395-2404. |
Files in this item |
File name/size | Document type | Version | Access type | License |
Multitask_Policy_Adv(2547KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |
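
Note on the adversarial process described in the abstract: this record does not give the paper's loss function, so the following is only a sketch under the assumption that training follows the standard generative adversarial (GAN) minimax objective. The symbols are illustrative and not taken from the paper: $G$ is the multitask policy generator, $D$ the multitask policy discriminator, $p_h$ the distribution of human-derived samples, and $p_z$ the prior from which the generator's inputs are drawn.

```latex
% Standard GAN minimax objective, used here as an assumed stand-in
% for the paper's (unstated) adversarial loss.
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{h}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Under this reading, $D(x)$ is the discriminator's estimated probability that $x$ came from the human-derived samples rather than from $G$. How the actual method conditions $G$ and $D$ on the task is not stated in this record.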