Human-like Dexterous Robotic Grasping in Cluttered Scenes (杂乱场景下机器人仿人灵巧抓取)


	Human-like Dexterous Robotic Grasping in Cluttered Scenes
	李一鸣
	2022-05-18
Pages	70
Degree type	Master's
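Abstract (translated from the Chinese)	Robotic dexterous grasping is one of the most fundamental and representative tasks in robot manipulation, with great application potential in industry, the service sector, defense, and aerospace. In these unstructured scenes, how a robot can grasp objects accurately and efficiently is a very challenging problem that places high demands on visual perception, decision making, and low-level control, and it merits in-depth study. The main difficulties of robotic dexterous grasping in cluttered scenes include (1) the diversity of objects, (2) the stacking and occlusion of objects in the scene, and (3) the high-dimensional grasp planning space of the robot, all of which make efficient and accurate grasping very challenging. Inspired by human grasping behavior, this thesis focuses on two topics: multi-task grasp learning in cluttered scenes, and dexterous grasp learning with a five-finger anthropomorphic hand. To improve the robot's grasping ability in cluttered scenes, the thesis jointly optimizes object instances, grasp configurations, and collisions within a multi-task learning framework, achieving collision-free grasping of a specified object from a single-view observation. For dexterous grasping with the five-finger anthropomorphic hand, the thesis divides the hand's grasps into multiple grasp types, trains a deep network to predict fine-grained grasp configurations in cluttered scenes, and at the same time learns semantic segmentation of the functional grasp regions of objects. The main work and contributions of the thesis are summarized as follows:
1. Multi-task learning for two-finger parallel-jaw gripper grasping in cluttered scenes. To address the lack of semantic scene understanding during robotic grasping, the thesis proposes a multi-task learning framework for parallel-jaw grasping that combines instance segmentation and collision detection. It jointly optimizes object semantics, robot grasp poses, and potential collisions between the gripper and objects to obtain object-level, collision-free parallel-jaw grasp configurations. Experiments show that the method accurately and efficiently produces a large number of executable grasp configurations and supports grasping of a specified object, achieving a grasp success rate of over 76% in real-world robot grasping experiments.
2. Multi-type dexterous grasp learning with a five-finger anthropomorphic hand in cluttered scenes. Two-finger parallel-jaw grippers have a simple structure and few degrees of freedom; although they can grasp objects to some extent, they fall far short of the human hand in dexterity. The thesis studies robotic dexterous grasping with a five-finger anthropomorphic hand. To address the lack of benchmarks for five-finger hand grasping in cluttered scenes, it builds a synthetic dataset of anthropomorphic hand grasps in cluttered scenes and proposes a grasp-type-based method for learning hand grasp configurations. The method generates high-quality dexterous grasp configurations accurately and efficiently, directly from a single-view point cloud, and achieves a completion rate of over 78% on unknown objects in real-world robot grasping experiments.
3. Fine-grained functional grasp learning with a five-finger anthropomorphic hand in cluttered scenes. To further exploit the human-like dexterous grasping capability of the five-finger hand, the thesis studies fine-grained functional grasping in cluttered scenes. It first builds a semantic segmentation network that identifies the functional grasp regions of objects, and then optimizes the wrist grasp pose and the finger joint angles based on the contact distance between the object and the hand, achieving fine-grained grasping with the anthropomorphic hand. Experiments show that the method generates more refined grasp configurations with a degree of functionality, achieving over 70% grasp success rate and 80% completion rate in real-world robot grasping experiments.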
Abstract (English)	As one of the most fundamental and representative tasks in robot manipulation, robotic dexterous grasping has shown great potential for applications in industry, service, defense, and aerospace. Grasping objects effectively and efficiently in unstructured clutter remains a very challenging problem for robots: it places high demands on visual perception, decision making, and low-level control. Robotic dexterous grasping in clutter is therefore an important topic for in-depth research. The main challenges of robotic dexterous grasping in cluttered scenes are (1) the wide variety of objects, (2) the stacking and occlusion of objects in the scene, and (3) the high-dimensional planning space of robot grasping, all of which make it difficult for robots to grasp objects steadily and effectively. Inspired by human grasping behavior, this research focuses on multi-task robot grasp learning in cluttered scenes and on dexterous grasp learning with an anthropomorphic hand. Specifically, to improve grasping ability in cluttered scenes, we jointly optimize object instances, grasp poses, and collisions within a multi-task learning framework to achieve target-driven, collision-free grasping from a single-view observation. For dexterous grasping with the anthropomorphic hand, we present a grasp-type-based deep neural network that predicts precise hand grasp configurations in clutter. In addition, we propose to jointly learn functional grasping regions through a semantic segmentation network. The main work and contributions of this thesis are summarized as follows:
1. Multi-task parallel-jaw gripper grasp learning in cluttered scenes. To tackle the lack of semantic understanding of the scene during robot grasping, we propose a simultaneous semantic and collision learning framework for robotic grasping that jointly optimizes object instances, 6-degrees-of-freedom grasp poses, and potential collisions between the gripper and objects, in order to obtain object-level, collision-free grasps. Experiments show that the proposed method generates a large number of executable robot grasp configurations effectively and efficiently, supports target-driven grasping tasks, and achieves a success rate of over 76% in real-world robot grasping experiments.
2. Grasp-taxonomy-based dexterous grasp learning with a five-finger anthropomorphic hand in cluttered scenes. Although parallel-jaw grippers are widely used in robot grasping tasks, they have a simple structure and few degrees of freedom, and are far from the human hand in terms of dexterity. We therefore study anthropomorphic hand grasping in cluttered scenes. To address the lack of relevant benchmarks, we build a large-scale synthetic anthropomorphic hand grasping dataset and propose a single-shot network that learns grasp configurations based on predefined grasp types. The network predicts high-quality dexterous hand grasps effectively and efficiently and achieves a completion rate of over 78% in real-world robot grasping experiments.
3. Precise and functional hand grasp learning in cluttered scenes. To further exploit the human-like dexterous grasping capability of the anthropomorphic hand, we study functional and precise grasp prediction in cluttered scenes. We first use a functional grasp point segmentation network to directly predict functional grasping areas in cluttered scenes, and then optimize the 6-degrees-of-freedom wrist pose as well as the finger joints by measuring contacts between the anthropomorphic hand and the scene objects. Experiments show that the proposed method generates functional and more precise hand grasp configurations and achieves over 70% success rate and 80% completion rate in real-world robot grasping experiments.
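Keywords	robot learning, dexterous grasping, five-finger anthropomorphic hand, scene understanding, multi-task learning
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/48707
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	李一鸣. 杂乱场景下机器人仿人灵巧抓取[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2022.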

Files in this item
File name/size	Document type	Version type	Access type	License
杂乱场景下机器人仿人灵巧抓取_签名.pd (11381KB)	Thesis		Restricted access	CC BY-NC-SA