Representation Learning and Pose Estimation of Robotic Grasping Targets

	Representation Learning and Pose Estimation of Robotic Grasping Targets
	李晓灿 (Li Xiaocan)
	2020-08-14
Pages	104
Degree type	Master's
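Chinese abstract (translated)	Object pose estimation is an important research topic in intelligent robotics. The problem can be stated as follows: how to obtain an object's 3D translation and 3D rotation relative to the camera coordinate system from the data captured by the camera. The pose of an object relative to the camera is crucial for robotic grasping. Object pose estimation still faces many theoretical and technical difficulties; for example, its accuracy is often affected by illumination changes, noise, cluttered backgrounds, and object occlusion. In addition, acquiring object pose datasets is laborious, since every object to be detected requires heavy pose annotation, and the homogeneous features shared by objects of the same category are not considered. Unsupervised methods for learning object rotation representations have not yet considered constraints on the rotation representation in the representation space. This thesis therefore focuses on pose estimation methods for robotic grasping targets, to provide effective support for improving robots' autonomous grasping ability. To address the problems above, it carries out research on both methods and applications. The main work is as follows: 1. To address the dependence of pose estimation algorithms on pose annotations, an unsupervised rotation representation learning method based on a denoising autoencoder is presented, and simulation experiments are carried out on a dataset of random 2D shape images. The results show that, without pose annotations, the denoising autoencoder can learn rotation representations of ellipses, hearts, squares, and rectangles, and is capable of category-level rotation representation extraction on rectangles. 2. To address the fact that instance-level methods ignore the homogeneous features of objects in the same category, a method is presented that uses a denoising autoencoder to learn rotation representations of objects in 3D scenes at both the instance level and the category level. The method needs only a single 3D object model for category-level rotation representation learning. Experimental results show that the category-level method effectively learns rotation representations of different instances of the same category and achieves good pose estimation results. In addition, robot grasping experiments in a real environment verify the effectiveness of the category-level pose estimation method. 3. To address the fact that the denoising autoencoder ignores pose constraints between different rotation representations, a method is presented that applies metric learning constraints to the rotation representation at the bottleneck layer of the denoising autoencoder. Experimental results show that imposing a metric learning constraint with a Euclidean distance loss on the bottleneck-layer rotation representation improves pose estimation accuracy.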
English abstract	Object pose estimation is an important research topic in intelligent robotics. It is defined as obtaining an object's 3D translation and 3D rotation in the camera coordinate system from color or depth images. Object pose estimation is crucial for robot grasping. However, theoretical and practical issues remain unsolved, such as the influence of illumination variation, sensor noise, cluttered backgrounds, and occlusion, all of which reduce the accuracy of pose estimation. In addition, pose estimation datasets take great effort to acquire, since each object needs pose labelling. Furthermore, homogeneous features, such as shape, shared by objects of the same category are not considered. Orientation representation learning methods based on unsupervised learning do not consider constraints on the embedding space. Therefore, this thesis focuses on pose estimation of objects to be grasped and provides technical solutions for improving robots' grasping skills. Research and experiments are conducted on the problems mentioned above. The main work is summarized below: 1. To address the dependence of training on pose-annotated datasets, unsupervised orientation representation learning based on a denoising autoencoder is proposed and tested on 2D shapes with random scales and in-plane translations. With the proposed training pipeline, the autoencoder learns the orientations of various shapes without pose annotations, and the method shows the ability to learn category-level orientation representations. 2. To address the neglect of homogeneous features among 3D objects of the same category, a denoising autoencoder is used to learn orientation representations. Only one 3D object model is used as the category representative to learn the category-level orientation representation. The results show that the denoising autoencoder is effective for orientation representation learning at both the instance and category levels. Robot grasping experiments demonstrate the effectiveness of the category-level pose estimation method. 3. To address the neglect of relations between training samples, a deep metric learning method is introduced. Metric learning constraints are applied to the bottleneck layer of the denoising autoencoder. The results show that applying a Euclidean metric loss to the bottleneck layer of the denoising autoencoder can improve the accuracy of pose estimation.
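Keywords	pose estimation; autoencoder; representation learning; metric learning; robot grasping
Language	Chinese
Funding	National Natural Science Foundation of China [61773378]
Seven major directions: sub-direction classification	Intelligent robots
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/40230
Collection	复杂系统认知与决策实验室_先进机器人 (Laboratory of Cognition and Decision for Complex Systems, Advanced Robotics)
Recommended citation (GB/T 7714)	李晓灿. 机器人抓取目标的表征学习与位姿估计[D]. 中国科学院自动化研究所. 中国科学院大学,2020.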

Files in this item
File name/size	Document type	Version type	Access type	License
李晓灿硕士学位论文.pdf (39060KB)	Thesis		Open access	CC BY-NC-SA