Research on Imitation-Based Robot Manipulation Skill Learning Techniques (基于模仿的机器人操作技能学习技术研究)
Li Boyao (李博遥)
2020-08-14
Pages: 106
Degree type: Master's
Abstract

In recent years, the rise of artificial intelligence has driven rapid growth in the robotics industry, and robots of many kinds are now widely used in industrial production, medical assistance, home life, and other fields. As application scenarios grow more complex, the demand for intelligent robot manipulation skills keeps increasing. Constrained by the current state of environmental perception and control policy learning, robot manipulation skill learning still faces great challenges. This thesis studies robot manipulation skill learning from the perspective of imitation learning, focusing on exploration during policy learning and on hierarchical imitation of long-horizon, multi-step tasks. The main contributions are as follows:

(1) To address the low exploration efficiency of deep reinforcement learning methods and the difficulty of learning manipulation skills under sparse rewards, an augmented curiosity-driven experience replay mechanism is proposed. The method introduces an intrinsic curiosity-driven mechanism and hindsight experience replay into policy exploration. By defining a task-relevant factor, it encourages the robot to explore unknown, novel states that are relevant to the task, which improves the policy's search capability and speeds up convergence. Experimental results show that the method performs well in both policy convergence speed and learning performance.

(2) To address manipulation skill learning for long-horizon, multi-step tasks, a hierarchy-based imitation learning method for robot manipulation skills is proposed. The method uses a hierarchical approach to build an object-centric task decomposition mechanism, and combines expert demonstration data with the augmented curiosity-driven exploration method to learn the skill for each subtask by imitation. A parallel training scheme that updates the policies at different levels simultaneously is also proposed; it accelerates network training and improves the overall performance of the policy model. Experimental results show that the method can effectively learn complex multi-step manipulation tasks.

(3) An imitation learning system for robot manipulation skills is designed. A teacher robot first demonstrates the skill in the real world, through either programming or teleoperation. The demonstration data are then converted from the real environment to the virtual one and used to train the manipulation policy in simulation. Finally, the trained policy model is deployed on a student robot to reproduce the skill. The system integrates "skill demonstration, simulation training, skill reproduction" into a single imitation learning workflow and provides a verification and test platform for skill teaching and shared skill learning among robots.
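
Contribution (1) combines hindsight experience replay with a curiosity bonus weighted by a task-relevant factor. The abstract does not give the thesis's formulas, so the following Python is only a minimal sketch of the general idea under stated assumptions: the "future" relabeling strategy, the squared forward-model error used as the curiosity bonus, the exponential form of `task_relevance`, and all names and parameters (`beta`, `k`, `tol`) are illustrative choices, not the author's definitions.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Vec = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    state: Vec
    action: Vec
    next_state: Vec
    achieved_goal: Vec  # goal actually reached after this step
    goal: Vec           # goal the policy was conditioned on
    reward: float


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sparse_reward(achieved: Vec, goal: Vec, tol: float = 0.05) -> float:
    # Common sparse convention: 0 on success, -1 otherwise.
    return 0.0 if distance(achieved, goal) <= tol else -1.0


def curiosity_bonus(predicted_next: Vec, next_state: Vec) -> float:
    # Squared forward-model prediction error: large for novel transitions.
    return sum((p - s) ** 2 for p, s in zip(predicted_next, next_state))


def task_relevance(achieved: Vec, goal: Vec, scale: float = 1.0) -> float:
    # Hypothetical factor in (0, 1]: states closer to the goal count as
    # more task-relevant. The thesis may define this quite differently.
    return math.exp(-distance(achieved, goal) / scale)


def hindsight_relabel(
    episode: List[Transition],
    forward_model: Callable[[Vec, Vec], Vec],
    beta: float = 0.1,
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Relabel each transition with k goals achieved later in the episode
    and add a relevance-weighted curiosity bonus to the sparse reward."""
    rng = rng or random.Random(0)
    relabeled = []
    for t, tr in enumerate(episode):
        future = episode[t:]  # "future" strategy: this step or any later one
        bonus = curiosity_bonus(forward_model(tr.state, tr.action), tr.next_state)
        for _ in range(k):
            new_goal = rng.choice(future).achieved_goal
            extrinsic = sparse_reward(tr.achieved_goal, new_goal)
            intrinsic = beta * task_relevance(tr.achieved_goal, new_goal) * bonus
            relabeled.append(
                Transition(tr.state, tr.action, tr.next_state,
                           tr.achieved_goal, new_goal, extrinsic + intrinsic)
            )
    return relabeled
```

In this sketch, `hindsight_relabel` would be called once per collected episode and its output appended to the replay buffer alongside the original transitions; `forward_model` stands in for a learned dynamics predictor and can be any callable that maps a state and an action to a predicted next state.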

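Keywords: robot learning; imitation learning; deep reinforcement learning; curiosity-driven exploration; hierarchical mechanism
Language: Chinese
Research direction (one of seven major directions; sub-direction classification): intelligent robots
Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/40231
Collection: Graduates_Master's theses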
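Recommended citation
GB/T 7714
Li Boyao (李博遥). Research on Imitation-Based Robot Manipulation Skill Learning Techniques [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2020.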
Files in this item
File name/size: 李博遥_基于模仿的机器人操作技能学习技术 (6247KB)
Document type: Thesis
Access type: Restricted access
License: CC BY-NC-SA