Robot Skill Imitation Learning Based on Demonstration Teaching

	Robot Skill Imitation Learning Based on Demonstration Teaching (基于演示示教的机器人技能模仿学习)
	夏鹏程
	2021-05-27
Pages	75
Degree type	Master's
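Abstract (translated from Chinese)	Micro-assembly robots capable of autonomous recognition and decision-making can significantly reduce the reliance on manual programming, development, and debugging, and effectively improve the autonomy and intelligence of robots. This thesis introduces domain adaptation, dynamic movement primitives (DMP), and deep reinforcement learning into micro-assembly robot research, focusing on three key technologies: micro-part recognition and localization, approach-trajectory imitation, and coordinated force/position assembly control. The main contents are as follows. 1. To address the high cost of manually annotating part features and the poor adaptability of the resulting models, this thesis proposes a part feature extraction method based on domain adaptation, which extracts micro-part features against backgrounds of different styles and textures. First, a synthetic semantic segmentation dataset of micro-parts is created, with image annotations generated automatically. Then style transfer is used to generate micro-part images with different texture backgrounds, increasing the diversity of the training images. Finally, domain-adaptive semantic segmentation is combined with pixel-wise adversarial training to extract micro-part features automatically. 2. To address the fixed start and end positions of conventionally taught approach trajectories, this thesis proposes an imitation learning method for micro-part approach trajectories based on dynamic movement primitives, which generates the approach trajectory autonomously when the starting point of the micro-part changes. First, a virtual reality device is used to collect approach demonstration data, and the data are aligned with the dynamic time warping (DTW) algorithm. Then Gaussian mixture model (GMM) and Gaussian mixture regression (GMR) algorithms are used to learn and generate the optimal demonstration trajectory. Finally, a DMP model learns and generalizes the approach trajectory, so that it can be generated autonomously when the starting position of the micro-part changes. 3. To address the difficulty of satisfying assembly force and position requirements at the same time when assembling transition-fit micro-parts, this thesis first builds a robot assembly control model based on deep reinforcement learning and uses demonstration data to shorten model training time. Then the continuous-control algorithm TD3 (twin delayed deep deterministic policy gradient) is used to determine the optimal assembly action for the current assembly state, so that both the assembly force and the position of the micro-parts are taken into account.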
English abstract	Micro-assembly robots with automatic identification and decision-making capabilities can significantly reduce the dependence on manual programming and tuning, and effectively improve the assembly efficiency of micro-parts. This thesis introduces domain adaptation, dynamic movement primitives, and deep reinforcement learning into micro-assembly robot research in order to achieve precision assembly skill learning, focusing on the key technologies of micro-part recognition and localization, trajectory imitation, and force/position assembly control. The main contents of the thesis are as follows: 1. Considering the high cost of manual part annotation and poor model adaptability, this thesis proposes an automatic identification method based on domain adaptation to localize micro-parts under different textures. First, a synthetic semantic segmentation dataset of micro-parts is built, in which image annotations are generated automatically with the help of computer graphics. Then style transfer is applied to generate micro-part images with different texture backgrounds, improving the diversity of the micro-part training data. Finally, semantic segmentation is adopted to extract the features of micro-parts through pixel-wise domain adaptation. 2. This thesis proposes a micro-part approach trajectory model based on dynamic movement primitives, which enables micro-parts to approach automatically when the starting point changes. First, virtual reality equipment is used to collect approach demonstration data for the micro-parts, and the DTW algorithm is applied for time alignment. Then the GMM and GMR algorithms are used to learn and generate the optimal demonstration trajectory. Finally, the approach trajectory is generated so that the micro-parts can automatically approach the assembly station when the starting position changes. 3. Since the assembly force and position cannot both be taken into account when assembling transition-fit micro-parts, this thesis first establishes a robot assembly control model based on deep reinforcement learning, and shortens the training time of the control model by fine-tuning with the demonstration data. Then the continuous-control algorithm TD3 is used to determine the optimal assembly action in the current assembly state, taking both the assembly force and the position of the micro-parts into consideration.
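Keywords	demonstration teaching; domain adaptation; trajectory imitation; deep reinforcement learning
Language	Chinese
Seven major directions: sub-direction classification	Intelligent robots
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44822
Collection	Research Center of Precision Sensing and Control; Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	夏鹏程. 基于演示示教的机器人技能模仿学习[D]. 智能化大厦. 中科院自动化所, 2021.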

Files in this item
File name/size	Document type	Version type	Access type	License
夏鹏程-学位论文.pdf (16930KB)	Thesis		Open access	CC BY-NC-SA
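
Illustrative note on the second contribution in the abstract: the record does not describe the thesis's implementation, so the following is only a minimal sketch of a generic one-dimensional discrete dynamic movement primitive, assuming the standard spring-damper formulation with a phase-driven forcing term. The class name `DMP1D`, the helper `_diff`, the gains, the number of basis functions, and the minimum-jerk stand-in for a demonstration are all hypothetical choices made for illustration. It fits one demonstrated trajectory and then rolls it out from a different starting position, which is the kind of generalization the abstract refers to.

```python
import math


def _diff(v, dt):
    """Finite-difference derivative (central inside, one-sided at the ends)."""
    n = len(v)
    out = []
    for k in range(n):
        lo, hi = max(k - 1, 0), min(k + 1, n - 1)
        out.append((v[hi] - v[lo]) / ((hi - lo) * dt))
    return out


class DMP1D:
    """One-dimensional discrete DMP (hypothetical helper, illustration only)."""

    def __init__(self, n_basis=20, alpha_z=25.0, alpha_x=4.0):
        self.alpha_z = alpha_z
        self.beta_z = alpha_z / 4.0  # critically damped spring-damper
        self.alpha_x = alpha_x
        # Centres evenly spaced in time, hence exponentially spaced in phase x.
        self.centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
        self.widths = []
        for i in range(n_basis):
            j = min(i + 1, n_basis - 1)
            self.widths.append(1.0 / (self.centers[j] - self.centers[j - 1]) ** 2)
        self.weights = [0.0] * n_basis
        self.tau = 1.0

    def _basis(self, x):
        return [math.exp(-h * (x - c) ** 2) for h, c in zip(self.widths, self.centers)]

    def fit(self, y, dt):
        """Learn forcing-term weights from one demonstration y sampled every dt."""
        n = len(y)
        self.tau = (n - 1) * dt
        y0, g = y[0], y[-1]
        yd = _diff(y, dt)
        ydd = _diff(yd, dt)
        num = [0.0] * len(self.weights)
        den = [1e-10] * len(self.weights)
        for k in range(n):
            x = math.exp(-self.alpha_x * k * dt / self.tau)
            # Forcing term that would reproduce the demonstration exactly.
            f_target = self.tau ** 2 * ydd[k] - self.alpha_z * (
                self.beta_z * (g - y[k]) - self.tau * yd[k])
            s = x * (g - y0)
            # Locally weighted regression, one weight per basis function.
            for i, psi in enumerate(self._basis(x)):
                num[i] += s * psi * f_target
                den[i] += s * s * psi
        self.weights = [a / b for a, b in zip(num, den)]

    def rollout(self, y0, g, dt):
        """Integrate the DMP (explicit Euler) from start y0 towards goal g."""
        y, z, x = y0, 0.0, 1.0
        traj = [y]
        for _ in range(round(self.tau / dt)):
            psi = self._basis(x)
            f = sum(p * w for p, w in zip(psi, self.weights)) / (sum(psi) + 1e-10)
            f *= x * (g - y0)
            zd = (self.alpha_z * (self.beta_z * (g - y) - z) + f) / self.tau
            yd = z / self.tau
            x += -self.alpha_x * x / self.tau * dt
            z += zd * dt
            y += yd * dt
            traj.append(y)
        return traj


# Stand-in demonstration: minimum-jerk motion from 0 to 1 over 1 s (assumed data).
dt = 0.01
demo = [10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5 for s in (k / 100 for k in range(101))]

dmp = DMP1D()
dmp.fit(demo, dt)
repro = dmp.rollout(y0=0.0, g=1.0, dt=dt)     # reproduce the demonstration
new_traj = dmp.rollout(y0=0.3, g=1.0, dt=dt)  # same skill, shifted start
```

Because the forcing term is scaled by the start-to-goal distance and decays with the phase variable, the rollout from the shifted start keeps the shape of the demonstration while still converging to the same goal. A multi-axis trajectory would, under the same assumptions, use one such system per axis driven by a shared phase.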