Research on Grasp Detection Algorithms Based on Adaptive Deep Convolutional Neural Networks


	顾启鹏
	2021-06
Pages	96
Degree type	Master's
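Abstract (translated from the Chinese)	Grasping is an essential robot skill, and grasping objects stably and reliably is a basic requirement for assembly, handling, and sorting tasks. Traditional robot grasping methods build a contact model between the robot gripper and the object from the object's geometric properties. In some complex scenes these properties are hard to obtain, so the robot must establish a grasping mode autonomously, based on its understanding of and reasoning about the scene and on the functional attributes (affordances) of the objects in it. In recent years, grasp detection based on deep learning has become one of the research hotspots in robotic grasping. Most autonomous grasping methods learn from large-scale data, and the heavy workload of collecting and annotating samples lowers the efficiency of autonomous robot learning. Because transfer learning needs fewer samples and learns quickly, it has become an important research direction for autonomous robot grasping. Targeting industrial robot application scenarios, this thesis studies techniques for robot grasp detection using deep learning. The main work of the thesis is as follows:
1) A grasp detection model based on a fully convolutional neural network. To meet the real-time and accuracy requirements of robotic grasp detection, a pixel-wise, lightweight network model is proposed. Because fully convolutional networks have a simple structure and few parameters, one is adopted as the base architecture, which satisfies the real-time requirement. An attention mechanism is introduced to ensure that salient features are not lost, which improves the accuracy of the model's predictions. Experimental results show that the method performs well in both convergence speed and learning performance.
2) An object affordance detection model with an attention mechanism. In complex or dynamic environments, robotic grasp detection needs to understand the affordances of objects, which also lays a foundation for multi-task robot learning. The thesis proposes an affordance detection model based on an encoder-decoder architecture. The encoder extracts features with a dilated residual network, which retains more spatial information, and an effective attention mechanism models long-range, multi-level dependencies to improve feature representation. The decoder uses up-sampling layers to map low-resolution features to a high-resolution, pixel-wise affordance output. Experimental results show that the method predicts affordance labels for an input image from a global perspective and achieves good performance.
3) A grasp detection model based on unsupervised domain adaptation. Because grasping scenes change, and because collecting and annotating real datasets consumes substantial human and material resources, the thesis proposes a grasp detection method based on unsupervised domain adaptation. Labeled public datasets serve as the source domain, and a small amount of collected, unlabeled data serves as the target domain. From the perspective of entropy, the gap between the source and target domains is narrowed by minimizing the entropy of the target domain. In addition, a feature alignment module is introduced to strengthen cross-domain consistency. Experimental results show that the proposed model predicts well on the target domain.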
English abstract	Grasping is an important skill for robots. Grasping objects stably and reliably is a basic requirement for robots to complete assembly, handling, and sorting tasks. Traditional robot grasping methods need to establish a contact model between the robot hand and the object according to the object's geometric properties. In some complex scenes it is difficult to obtain these geometric attributes, so the robot needs to establish the grasping mode autonomously, based on its understanding of and reasoning about the scene and on the functional attributes of the objects in the scene. In recent years, grasp detection based on deep learning has become one of the research hotspots in the field of robot grasping. Most autonomous grasping methods rely on large-scale data learning, and the heavy workload of sample collection and annotation reduces the efficiency of autonomous robot learning. Because it requires fewer samples and learns quickly, transfer learning has become an important research direction for autonomous robot grasping. This thesis uses deep learning to study robot grasp detection in industrial robot application scenarios. The main work of this thesis is as follows:
1) Robotic grasp detection based on a fully convolutional neural network. The thesis proposes an attention-based grasping network model, a pixel-wise, lightweight network structure that meets the real-time and accuracy requirements of robotic grasp detection. The attention mechanism is introduced to ensure that salient features are not lost, thereby improving prediction accuracy. Experimental results show that the method performs well in terms of model convergence speed and learning performance.
2) Object affordance detection using an attention mechanism. The thesis proposes an object affordance detection model based on an encoder-decoder architecture for robot scene understanding and reasoning. The encoder network uses a dilated residual network to extract features, which retains more spatial information. To improve feature representation, an effective attention mechanism is used to model long-range, multi-level dependencies. The decoder network uses an up-sampling layer to map low-resolution features to a high-resolution, pixel-wise affordance map. Experimental results show that the proposed method predicts affordance labels from a global perspective and achieves good performance.
3) An unsupervised domain adaptation method for robotic grasp detection. Because grasping scenes change, and because collecting and annotating real data consumes substantial human and material resources, the thesis proposes an unsupervised domain adaptation method for robotic grasp detection. Labeled public datasets are used as the source domain, and a small amount of collected, unlabeled data is used as the target domain. From the perspective of entropy, the gap between the source domain and the target domain is narrowed by minimizing the entropy of the target domain. In addition, a feature alignment module is introduced to enhance consistency across domains. Experimental results show that the proposed model has good prediction performance in the target domain.
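Keywords	grasp detection; deep learning; affordance detection; domain adaptation
Language	Chinese
Seven major directions, sub-direction classification	Machine learning
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/44948
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Robot Theory and Applications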
Recommended citation (GB/T 7714)	顾启鹏. 基于自适应深度卷积神经网络的抓取检测算法研究[D]. 北京: 中国科学院自动化研究所, 2021.
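
Illustrative note on the abstract's third contribution: the abstract says the domain gap is narrowed by minimizing the entropy of the target domain, but this record does not give the loss itself. The sketch below is only a generic illustration of that idea, assuming per-pixel class scores of shape (H, W, C) and a plain Shannon entropy averaged over pixels. The function names `softmax` and `mean_pixel_entropy`, the array shapes, and the normalization by log(C) are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def mean_pixel_entropy(logits):
    """Mean Shannon entropy of per-pixel predictions, scaled to [0, 1].

    logits: array of shape (H, W, C) holding unnormalized class scores
    for an unlabeled target-domain image (hypothetical layout).
    """
    probs = softmax(logits, axis=-1)
    # Small constant avoids log(0) for fully confident pixels.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    # Divide by log(C), the maximum possible entropy for C classes.
    return float(entropy.mean() / np.log(logits.shape[-1]))


# Uniform scores: every pixel is maximally uncertain.
uncertain = np.zeros((4, 4, 3))

# One class dominates at every pixel: predictions are confident.
confident = np.zeros((4, 4, 3))
confident[..., 0] = 20.0

print(mean_pixel_entropy(uncertain))
print(mean_pixel_entropy(confident))
```

In an entropy-minimization setup, a quantity of this kind would be computed on unlabeled target-domain predictions and added to the supervised source-domain loss, so that training pushes target predictions toward the confident case. How the thesis weights or combines the terms is not stated in this record.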

Files in this item
File name/size	Document type	Version type	Access type	License
基于自适应深度卷积神经网络的抓取检测算法（9182KB）	Thesis		Open access	CC BY-NC-SA