Research on Navigation and Grasp Detection for Service Robots

	Research on Navigation and Grasp Detection for Service Robots
	于莹莹 (Yu Yingying)
	2020-08
Pages	122
Degree type	Doctoral
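Abstract (translated from Chinese)	Good autonomous navigation is an important prerequisite for a service robot to complete its tasks with high quality, and grasp detection enables better interaction between the robot and its environment; both have significant theoretical value and broad application prospects. This thesis studies navigation and grasp detection for service robots. Its main contents are as follows. First, the research background and significance of service robot navigation and grasp detection are introduced, the state of the art is reviewed from three aspects (robot navigation, object detection and segmentation, and robot grasp detection), and the contents and structure of the thesis are outlined. Second, based on the definitions of path and path perception quantity, a robot navigation method based on path situational awareness is proposed. The whole environment is described by paths, and a topological map with paths as edges is constructed. A situational awareness value is used to describe the robot's position along a path implicitly, and a situational awareness fitting network is designed to map scene perception information to the situational awareness value; its network structure is determined by the path perception quantity. On this basis, the result of global path planning on the topological map is combined with the robot's current motion situation, and motion decisions are made through a laser-based local collision avoidance algorithm. Because the proposed method makes motion decisions based on situational awareness values, it does not depend on a global position expressed in Cartesian coordinates; its effectiveness is verified by experiments. Third, because existing simultaneous detection and segmentation methods struggle to achieve both real-time performance and high detection and segmentation accuracy, a simultaneous detection and segmentation method based on an improved BlitzNet is proposed. A channel-wise attention mechanism is added to the original BlitzNet architecture to adjust the channel weights of each feature map output by the backbone network, and encoder feature maps at different scales are fused into the segmentation branch. During training, the weights of the multi-task loss are learned automatically, which optimizes the weighting of the bounding box classification, bounding box regression, and image segmentation losses, and an image segmentation loss function based on background suppression is designed to address the imbalance between background pixels and object pixels in the training images. Furthermore, to handle target objects that are severely occluded during detection, image inpainting is introduced into the robot grasping task, inspired by image inpainting research in computer vision. A method for recognizing occluded target objects based on the image inpainting and recognition network IRNet is proposed. IRNet consists of a three-stage image inpainting network (coarse, intermediate, and refinement stages) and a recognition network that operates on the inpainted image. The output of the intermediate stage is taken as the inpainting result, which provides fairly detailed texture information while effectively improving robustness to interference from the surrounding image content; the final output is a recognition result expressed as the TOP3 most likely classes. These methods improve detection and segmentation accuracy while maintaining real-time performance, and offer a solution to the occlusion of target objects during grasping. Experiments on datasets and in real scenes demonstrate the effectiveness of the proposed methods. Fourth, taking the depth map and grayscale map of the target object as input, a pixel-level two-stream grasp detection convolutional neural network, TsGNet, is designed to output the best grasp rectangle for the target object. The encoder of the network uses depthwise separable convolution, and the decoder uses a GDN module based on asymmetric deconvolution. On this basis, using the camera's intrinsic and extrinsic matrices, the best grasp rectangle output by TsGNet is converted into the desired grasp pose of the manipulator's end gripper, which is then used to control the manipulator to perform the grasp. The method improves grasp detection accuracy while keeping the number of network parameters small and processing fast, and it is validated on the Cornell grasping dataset and in real scenes. Fifth, a navigation and grasping software architecture for service robots is designed, comprising a situational awareness layer, a navigation planning layer, a target detection layer, a grasp detection layer, and a grasp control layer. The proposed navigation method, target object detection based on simultaneous detection and segmentation and occlusion inpainting, and the grasp detection method are integrated under the ROS framework. Navigation and grasping experiments in an indoor office environment verify its effectiveness. Finally, the work of the thesis is summarized and directions for further research are identified.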
English abstract	Autonomous navigation is an important prerequisite for service robots to complete tasks with high quality, and grasp detection provides an important way to realize better robot-environment interaction. Both are significant for research and for applications. This thesis studies navigation and grasp detection for service robots. The main contents are as follows. Firstly, the research background and significance of this thesis are presented. The state of research on robot navigation, object detection and segmentation, and grasp detection is reviewed. The contents and structure of this thesis are also introduced. Secondly, based on the definitions of path and path perception quantity, a navigation method based on path situational awareness is proposed. The environment is described by paths, and a topological map is constructed by treating each path as a topological edge. The situational awareness value is used to describe the robot's position along a path implicitly. A situational awareness fitting network is then designed to map scene perception information to the situational awareness value, and its network structure is determined by the path perception quantity. On this basis, the result of global path planning on the topological map is combined with the current motion situation of the robot, and the motion decision is made by a laser-based local collision avoidance algorithm. The robot makes decisions based on the situational awareness value and therefore does not depend on a global position represented in Cartesian coordinates. The effectiveness of the proposed navigation method is verified by experiments. Thirdly, because existing simultaneous detection and segmentation methods have difficulty meeting real-time requirements while maintaining high accuracy, a simultaneous detection and segmentation method based on an improved BlitzNet is proposed. 
A channel-based attention mechanism is added to the original BlitzNet network to adjust the channel weights of each output feature map of the backbone network, and the multi-scale feature maps in the encoder are merged into the segmentation branch. Moreover, the weights of the loss functions for bounding box classification, bounding box regression, and image segmentation are optimized by self-learning of the multi-task loss weights during training. An image segmentation loss function based on background suppression is also designed to address the imbalance between background pixels and object pixels in the training images. Furthermore, inspired by image inpainting work in computer vision, image inpainting is introduced into the grasping task to handle cases where the target object is severely occluded during detection. A method for recognizing occluded target objects based on the image inpainting and recognition network IRNet is proposed. IRNet consists of a three-stage inpainting network with coarse, intermediate, and refinement stages, together with a recognition network that operates on the inpainted image. It takes the output of the intermediate stage as the inpainting result, which provides fairly detailed texture while being more robust to interference from the surrounding image content, and it outputs the TOP3 most likely classes as the recognition result. These methods improve the accuracy of detection and segmentation while maintaining real-time performance, and provide a solution to the occlusion of the target object during grasping, which is verified by experiments on datasets and in real scenes. Fourthly, a pixel-level two-stream grasp detection convolutional neural network, TsGNet, is designed to output the best grasp rectangle for the target object, taking the depth map and grayscale map of the target object as the network input. 
The encoder of TsGNet adopts depthwise separable convolution, and its decoder uses the global deconvolution module GDN. On this basis, using the intrinsic and extrinsic matrices of the camera, the best grasp rectangle output by TsGNet is converted into the desired grasp pose of the manipulator's end gripper, which is used to drive the manipulator to execute the grasping operation. The proposed method improves the accuracy of grasp detection while keeping the number of network parameters small and processing fast, and its effectiveness is verified through experiments on the Cornell grasping dataset and in real scenes. Fifthly, a navigation and grasping software architecture for service robots is designed, which includes the situational awareness layer, navigation planning layer, target detection layer, grasp detection layer, and grasp control layer. Under the ROS framework, the above-mentioned navigation, target detection, and grasp detection methods are integrated, and the effectiveness is verified by navigation and grasping experiments in an indoor office environment. Finally, the conclusions are given and future work is addressed.
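Keywords	service robot; path situational awareness navigation; simultaneous detection and segmentation; occlusion inpainting; grasp detection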
Language	Chinese
Seven major directions: sub-direction classification	Intelligent robots
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/40398
Collection	Laboratory of Cognition and Decision-Making for Complex Systems_Advanced Robotics
Recommended citation (GB/T 7714)	于莹莹. 服务机器人导航与抓取检测研究[D]. 中国科学院自动化研究所. 中国科学院大学,2020.
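
The abstract states that the TsGNet encoder uses depthwise separable convolution to keep the parameter count small. The sketch below only illustrates the general arithmetic behind that claim; the layer sizes are hypothetical and are not taken from the thesis, and the function names are made up for this example.

```python
def standard_conv_params(k, c_in, c_out, bias=False):
    """Weights in a standard k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(k, c_in, c_out, bias=False):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 convolution mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(standard, separable)
```

For this hypothetical layer the standard convolution has 73,728 weights and the separable version 8,768, roughly an eightfold reduction.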

Files in this item
File name/size	Document type	Version type	Access type	License
博士学位论文-于莹莹0827最终版.pd (22709KB)	Thesis		Open access	CC BY-NC-SA
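
The abstract mentions converting the best grasp rectangle output by TsGNet into a gripper pose using the camera's intrinsic and extrinsic matrices, without giving the procedure. The following is a minimal sketch of one common way to do this with the pinhole camera model, not the thesis's actual implementation. It assumes the grasp is given by its image center `(u, v)`, an in-plane angle, and a depth reading in meters, and that the extrinsics are a 4 x 4 camera-to-robot-base transform; all function and variable names are hypothetical.

```python
import math


def pixel_to_camera(u, v, depth, K):
    """Back-project pixel (u, v) at the given depth into the camera frame (pinhole model)."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]


def camera_to_base(p_cam, T_base_cam):
    """Apply the homogeneous camera-to-base transform to a 3D point."""
    x, y, z = p_cam
    return [r[0] * x + r[1] * y + r[2] * z + r[3] for r in T_base_cam[:3]]


def rotate_to_base(d_cam, T_base_cam):
    """Rotate a direction vector into the base frame (no translation)."""
    x, y, z = d_cam
    return [r[0] * x + r[1] * y + r[2] * z for r in T_base_cam[:3]]


def grasp_to_base(u, v, angle, depth, K, T_base_cam):
    """Return the grasp position and gripper closing direction in the base frame."""
    position = camera_to_base(pixel_to_camera(u, v, depth, K), T_base_cam)
    # The grasp angle is measured in the image plane, so the closing
    # direction has no component along the camera's optical axis.
    direction = rotate_to_base([math.cos(angle), math.sin(angle), 0.0], T_base_cam)
    return position, direction


# Hypothetical calibration values, for illustration only.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
T_base_cam = [[1.0, 0.0, 0.0, 0.1],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.2],
              [0.0, 0.0, 0.0, 1.0]]
position, direction = grasp_to_base(420, 240, 0.0, 0.5, K, T_base_cam)
print(position, direction)
```

In a real system the extrinsic transform would come from hand-eye calibration, and the gripper opening width would be derived from the rectangle width in a similar way.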