Research on Key Technologies for Robotic Open-Ended Object Pick-and-Place Based on Incremental Learning

	Research on Key Technologies for Robotic Open-Ended Object Pick-and-Place Based on Incremental Learning
	邓杰仁
	2024-05-16
Pages	136
Degree type	Doctoral
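Chinese abstract (translated)	With the deepening of China's 2035 innovation strategy, intelligent robots have become a direction for future development. Sorting objects of many categories is one of the important topics in intelligent robotics research. In scenarios that involve robotic pick-and-place, such as unmanned supermarkets, unmanned warehouses, and unmanned logistics, the targets are often open-ended: the categories of objects to be sorted keep changing and being updated. Most existing robotic pick-and-place systems are built on traditional pattern recognition or deep learning, follow the closed-world assumption, and cannot cope with continually updated objects. Targeting open-ended object pick-and-place, this research builds on incremental learning and studies incremental object classification, incremental object detection, incremental embodied self-bootstrapping, and robotic incremental object pick-and-place, aiming to provide theoretical guidance and technical support for robotic open-ended object pick-and-place. The main work of this thesis is as follows: 1. To address catastrophic forgetting and semantic confusion in incremental object classification, an incremental object classification method based on unified parameter-additional tuning and dual memory replay is proposed. The goal of incremental object classification is to learn to distinguish continually arriving new objects or categories without forgetting those already learned. Built on a pre-trained foundation model, unified parameter-additional tuning combines prompt tuning with adapter tuning to achieve a better balance between stability and plasticity and to mitigate catastrophic forgetting. On top of this model structure, memory-efficient exemplar prototype replay and compositional memory replay are introduced at the high semantic level and the low semantic level, respectively, providing memory-efficient cross-stage supervision that further mitigates catastrophic forgetting and semantic confusion. Experimental results show that the proposed method, using only a small memory budget, effectively reduces the negative effects of catastrophic forgetting and semantic confusion and improves the accuracy of incremental object classification. 2. To address the problem that the coupling of localization and classification amplifies forgetting in incremental object detection, an incremental object detection method based on multi-modal unified parameter-additional tuning and zero-interference reparameterization adaptation is proposed. Beyond learning to distinguish continually arriving new objects or categories, incremental object detection must also localize the target objects. Vision-language pre-trained foundation detection models have category-agnostic features. Building on such a model, multi-modal unified parameter-additional tuning decouples category-specific knowledge from category-agnostic knowledge, so that the coupling of classification and localization does not amplify the forgetting suffered by the detector. On this basis, zero-interference reparameterization adaptation is proposed to constrain multi-modal unified parameter-additional tuning, further preventing forgetting and preserving the model's zero-shot generalization ability. Experimental results show that the proposed method effectively prevents the detection model from forgetting on both downstream tasks and the pre-training task, maintaining detection accuracy on learned categories and generalization to unlearned categories. 3. To address the weak robustness of existing incremental embodied self-bootstrapping methods to mislabeling, an incremental embodied self-bootstrapping method based on adaptive mislabel suppression and self-supervised redundant label enhancement is proposed. The goal of incremental embodied self-bootstrapping is for the robot to actively collect samples in its working environment, label them automatically, and learn autonomously, thereby reducing dependence on manual annotation. To strengthen robustness to mislabeling, adaptive mislabel suppression is proposed. It comprises branch output suppression and exemplar prototype constraints, which respectively use explicit correct prototype memory and implicit correct parameter memory to impose consistency constraints on new samples and thus suppress the negative effect of mislabeling on model learning. Building on this robustness, self-supervised redundant label enhancement introduces more correct samples through embodied multi-view point cloud matching, multi-view classification voting, and cross-view redundant labeling, increasing the gains that incremental embodied self-bootstrapping can deliver. 4. To better realize open-ended object pick-and-place, a novel incremental pick-and-place algorithm framework that decouples recognition from grasping is proposed; a robotic incremental intelligent pick-and-place system is developed based on the proposed methods, and the methods of the four main contributions are validated on this system. To prevent the coupling of object recognition and object grasping from amplifying possible forgetting, the framework splits incremental object pick-and-place into incremental object detection and category-agnostic grasp pose estimation. Incremental recognition of target objects is handled by incremental object detection, while category-agnostic grasp pose estimation provides general-purpose grasp pose prediction. Within this framework, a center-selection grasp pose estimation method is proposed to improve grasp stability, and multi-view fusion of grasping and detection is proposed to achieve complete pick-and-place. Combining the proposed incremental object detection method, incremental embodied self-bootstrapping method, and incremental pick-and-place framework, a self-bootstrapping robotic incremental pick-and-place system is implemented. Experiments on a real robot system show that the proposed methods help the robot complete pick-and-place tasks more accurately and allow it to self-bootstrap in the environment without extensive manual annotation, which validates the effectiveness of the proposed incremental object detection method, incremental embodied self-bootstrapping method, and incremental pick-and-place framework for open-ended object pick-and-place scenarios.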
English abstract	With the deepening of China's 2035 innovation strategy, intelligent robotics has emerged as a direction for future development. Within this field, pick-and-place of many types of objects is an important research topic. In scenarios that involve robotic pick-and-place, such as unmanned supermarkets, warehouses, and logistics operations, the targets are often open-ended: the types of objects to be picked and placed keep changing and being updated. However, most existing robot pick-and-place systems rely on traditional pattern recognition or deep learning, follow the closed-world assumption, and lack the ability to adapt to continuously updated objects. This study addresses open-ended object pick-and-place. Building on incremental learning, it focuses on incremental object classification, incremental object detection, incremental embodied self-bootstrapping, and incremental robot pick-and-place, aiming to provide theoretical guidance and technical support for robotic open-ended object pick-and-place. The main contributions of this thesis are as follows:
1. To address catastrophic forgetting and semantic confusion in incremental object classification, a method based on Unified Parameter-Additional Tuning and dual-memory replay is proposed. The goal of incremental object classification is to learn to differentiate continuously incoming new objects or categories without forgetting previously learned ones. Leveraging a pre-trained foundation model, Unified Parameter-Additional Tuning integrates prompt tuning and adapter tuning to achieve a better balance between model stability and plasticity, alleviating catastrophic forgetting. Building on this model structure, two memory-efficient replay mechanisms are introduced, exemplar prototype replay at the high semantic level and compositional memory replay at the low semantic level, to provide cross-stage supervision that further mitigates catastrophic forgetting and semantic confusion. Experimental results demonstrate that the proposed approach effectively alleviates the negative impact of catastrophic forgetting and semantic confusion with only a small amount of memory, thereby improving the accuracy of incremental object classification.
2. To address the forgetting amplified by the coupling of localization and classification in incremental object detection, an incremental object detection method based on Multi-Modal Unified Parameter-Additional Tuning and Zero-Interference Reparameterization Adaptation is proposed. Incremental object detection involves not only learning to differentiate continuously incoming new objects or categories but also localizing the target objects. Foundation object detection models obtained from vision-language pre-training possess category-agnostic features. Built on such a model, Multi-Modal Unified Parameter-Additional Tuning decouples category-specific knowledge from category-agnostic knowledge, preventing the coupling of classification and localization from amplifying forgetting in object detection. On this basis, Zero-Interference Reparameterization Adaptation is introduced to constrain Multi-Modal Unified Parameter-Additional Tuning, further preventing forgetting and preserving the model's zero-shot generalization capability. Experimental results demonstrate that the proposed method effectively prevents forgetting on both downstream tasks and the pre-training task, maintaining detection accuracy on learned categories and preserving generalization to unseen categories.
3. To address the weak robustness to mislabeling of existing incremental embodied self-bootstrapping methods, an incremental embodied self-bootstrapping method based on Adaptive Mislabel Suppression and Self-Supervised Redundant Label Enhancement is proposed. The goal of incremental embodied self-bootstrapping is to enable robots to actively collect samples, label them automatically, and learn autonomously in their operating environments, reducing reliance on manual labeling. To enhance robustness to mislabeling, Adaptive Mislabel Suppression is introduced. It comprises branch output suppression and exemplar prototype constraints, which use explicit correct prototype memory and implicit correct parameter memory to impose consistency constraints on new samples, suppressing the negative impact of mislabeling on model learning. Building on the robustness provided by Adaptive Mislabel Suppression, Self-Supervised Redundant Label Enhancement introduces more correct samples through embodied multi-view point cloud matching, multi-view classification voting, and cross-view redundant labeling, thereby increasing the benefit of incremental embodied self-bootstrapping.
4. To better realize open-ended object pick-and-place, a novel incremental pick-and-place algorithm framework that decouples recognition from grasping is proposed. Based on the proposed methods, a robot incremental intelligent pick-and-place system is developed, and the methods of the four main contributions are validated on this system. To keep the coupling of object recognition and grasping from amplifying possible forgetting, the framework divides incremental pick-and-place into incremental object detection and category-agnostic grasp pose estimation. Incremental object recognition is handled by incremental object detection, while category-agnostic grasp pose estimation provides general-purpose grasp pose prediction. On top of this framework, a center-selection-based grasp pose estimation method is proposed to improve the stability of object grasping, and multi-view fusion of grasping and detection is proposed to achieve complete pick-and-place. Combining the proposed incremental object detection method, incremental embodied self-bootstrapping method, and incremental pick-and-place framework, a self-bootstrapping robot incremental pick-and-place system is implemented. Experimental results on a real robot system demonstrate that the proposed methods help robots complete pick-and-place tasks more accurately and allow them to self-bootstrap in the environment without extensive manual labeling, validating the effectiveness of the proposed incremental object detection method, incremental embodied self-bootstrapping method, and incremental pick-and-place framework for open-ended object pick-and-place scenarios.
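Keywords	robot pick-and-place; incremental object classification; incremental object detection; incremental embodied self-bootstrapping; incremental object pick-and-place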
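Indexed by	Other
Language	Chinese
Representative paper	Yes
Seven major directions: sub-direction	Intelligent robots
State Key Laboratory planned direction	Perception and cognition for embodied AI systems
Associated dataset to be deposited	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/57374
Collection	Graduates_Doctoral Theses
Recommended citation (GB/T 7714)	邓杰仁. 基于增量学习的机器人开放式目标分拣关键技术研究 [Research on Key Technologies for Robotic Open-Ended Object Pick-and-Place Based on Incremental Learning][D],2024.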

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (33172 KB)	Thesis		Restricted access	CC BY-NC-SA