Research on Open-Environment Semantic Object Navigation Integrating Hierarchical Object Relation Graphs


	许涛
	2023-05
Pages	78
Degree type	Master's
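Chinese abstract (translated)	Autonomous navigation is a key research topic in embodied intelligence and has important application value for the autonomous operation of unmanned systems. As artificial intelligence advances, the environment perception capability of unmanned systems is moving from recognition toward semantic understanding of targets. Vision-based semantic object navigation can now use a monocular camera as the only sensor input and predict navigation actions end to end with deep learning models, which greatly reduces the deployment cost of unmanned systems. However, because the technique relies only on local visual input and lacks global information such as maps and GPS, an agent struggles to complete navigation tasks quickly and accurately in unknown environments. Vision-based semantic object navigation therefore remains both a focus and a difficulty of current embodied intelligence research. This thesis addresses the lack of global information in semantic object navigation. It uses graph neural networks to embed navigation-related prior information and so compensate for the missing global information, with the aim of raising the task success rate while lowering the deployment cost and implementation complexity of the navigation system. The main work and contributions are summarized as follows:
1. An open-environment perception algorithm based on class-balanced experience replay. In vision-based semantic object navigation, perception algorithms mostly use deep learning models to process first-person visual input and extract rich visual feature representations from which the downstream decision model learns a navigation policy. To address catastrophic forgetting in perception algorithms under open environments, this thesis proposes a class-incremental continual learning algorithm based on class-balanced experience replay. The algorithm divides continual learning into two parts: incremental learning and learning-capability improvement. The incremental learning part uses the K-Means algorithm to select old-class data to retain and applies a class-balanced replay strategy to train on a mixture of new-class and old-class data. The learning-capability improvement part introduces data augmentation, batch normalization, and label smoothing to improve the backbone network, which improves the robustness and generalization of the model and reduces overfitting. The algorithm is validated experimentally on image classification and extended to object detection.
2. A navigation decision algorithm integrating a hierarchical object relation graph. For vision-based semantic object navigation, this thesis proposes an end-to-end navigation framework, trained with reinforcement learning in the AI2-iTHOR embodied environment, that implements the task end to end. On this basis, to address the weak reasoning ability of navigation models built on a single-layer object relation graph, the thesis builds a hierarchical graph convolutional neural network. By stacking graph convolution blocks and graph pooling blocks, the network generates a "region-object" hierarchical object relation graph, so that the agent reasons more efficiently, achieves a higher navigation success rate with fewer navigation steps, and reaches the target faster and more accurately. To address the sparse reward problem in semantic object navigation, the thesis proposes a new dense reward function that densifies the reward signal while encouraging the agent to explore unknown environments; it guides reinforcement learning in indoor environments and improves both the learning speed and the policy quality of deep reinforcement learning. To address the inadequate episode-termination judgment in existing end-to-end semantic object navigation models, the thesis proposes a termination decision algorithm based on a rejection mechanism, which improves the robustness and accuracy of the model and avoids invalid navigation actions, thereby improving the training efficiency of the navigation model.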
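The incremental learning step described in contribution 1 (K-Means selection of old-class samples, then class-balanced mixing of old and new data) can be illustrated with a minimal sketch. This is not the thesis implementation: the helper names `kmeans`, `select_exemplars`, and `class_balanced_batch`, the rule of keeping the sample nearest each cluster center, and sampling with replacement are all assumptions made for illustration.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members;
        # keep the old center if a cluster ends up empty.
        centers = [
            tuple(sum(xs) / len(members) for xs in zip(*members)) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return centers


def select_exemplars(features, k):
    """Assumed selection rule: keep the sample closest to each of k centers."""
    chosen = []
    for center in kmeans(features, k):
        candidates = [i for i in range(len(features)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: dist(features[i], center)))
    return chosen


def class_balanced_batch(memory, new_data, per_class, rng):
    """Draw the same number of samples from every old and new class.

    memory and new_data map a class label to a list of samples. Sampling is
    with replacement because the retained old-class sets are small.
    """
    batch = []
    for label, samples in {**memory, **new_data}.items():
        batch.extend((x, label) for x in rng.choices(samples, k=per_class))
    rng.shuffle(batch)
    return batch
```

In this sketch, `select_exemplars` would be run once per old class on that class's feature vectors to fill `memory`, and `class_balanced_batch` would then build each training batch so that old classes are not swamped by the larger new-class data.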
English abstract	Autonomous navigation is a research focus in embodied intelligence and has significant practical value for the autonomous operation of unmanned systems. With the continuous advancement of artificial intelligence, the environment perception capability of unmanned systems is gradually moving from recognition to semantic understanding of targets. Object navigation can now use a single camera as the only sensor input and predict navigation actions end to end with deep learning models, which significantly reduces the deployment cost of unmanned systems. However, the technique uses only local visual input and lacks global information such as maps and GPS, which makes it difficult for the agent to complete navigation tasks quickly and accurately in unknown environments. Object navigation therefore remains an important and challenging research area in embodied intelligence. This thesis addresses the lack of global information in object navigation by using graph neural networks to embed navigation-related prior information, with the aim of improving the success rate of navigation tasks while reducing the deployment cost and complexity of navigation systems. The main contributions are summarized as follows: 1. An open-environment perception algorithm based on class-balanced experience replay. In object navigation, the perception algorithm typically uses a deep learning model to process first-person visual input and extract rich visual feature representations from which the navigation decision model learns a navigation policy.
To address catastrophic forgetting in environment perception under open environments, this thesis proposes a class-incremental continual learning algorithm based on class-balanced experience replay. The learning process is divided into incremental learning and learning-capability improvement. First, the incremental learning part uses the K-Means algorithm to select old-class data to retain and trains on a mixture of new-class and old-class data under a class-balanced replay strategy. Second, the learning-capability improvement part introduces data augmentation, batch normalization, and label smoothing to improve the backbone, which improves the robustness and generalization of the model and reduces overfitting. The algorithm is validated experimentally on image classification and extended to object detection. 2. An object navigation algorithm that integrates a hierarchical object relation graph. This thesis proposes an end-to-end navigation framework for object navigation, trained with reinforcement learning in the AI2-iTHOR embodied environment. First, a hierarchical graph convolutional neural network is constructed to address the weak reasoning ability of navigation models based on single-layer object relation graphs. By stacking graph convolution blocks and graph pooling blocks, the network generates a "region-object" hierarchical object relation graph, which lets the agent reason more efficiently, raises the navigation success rate, and reduces the number of navigation steps, so that the agent reaches the target faster and more accurately. Second, a new dense reward function is proposed to address the sparse reward problem in object navigation; it densifies the reward signal while encouraging the agent to explore unknown environments during reinforcement learning in indoor environments.
Finally, a rejection-based termination algorithm is proposed to address the inadequate episode-termination judgment in existing end-to-end object navigation models. This algorithm improves the robustness and accuracy of the model and avoids invalid navigation actions, which improves the training efficiency of the navigation model.
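The stacking of graph convolution and graph pooling blocks described in contribution 2 can be illustrated with a minimal sketch. It assumes the standard symmetric-normalized GCN layer and an assignment-matrix pooling step (X_region = S^T X, A_region = S^T A S); the abstract does not specify the layer or pooling form used in the thesis, and here the assignment matrix is hand-written rather than learned. All function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of a standard GCN layer."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]


def graph_conv(adj, feats, weight):
    """One graph convolution block: ReLU(A_norm @ X @ W)."""
    out = matmul(matmul(normalize_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def graph_pool(adj, feats, assign):
    """Coarsen an object graph into a region graph.

    assign is an (objects x regions) matrix S. Returns the region adjacency
    S^T A S and the region features S^T X.
    """
    s_t = transpose(assign)
    return matmul(matmul(s_t, adj), assign), matmul(s_t, feats)


# Illustrative usage: four objects in a chain, grouped into two regions.
object_adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
object_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for illustration only
assign = [[1, 0], [1, 0], [0, 1], [0, 1]]  # hard, hand-written assignment

hidden = graph_conv(object_adj, object_feats, weight)
region_adj, region_feats = graph_pool(object_adj, hidden, assign)
```

With a hard assignment, the off-diagonal entries of `region_adj` count edges between objects in different regions, which is one way a "region-object" hierarchy can carry coarse spatial context above the object level.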
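Keywords	semantic object navigation; hierarchical object relation graph; embodied intelligence; experience replay
Language	Chinese
Seven major directions: sub-direction classification	Unmanned systems
State Key Laboratory planned direction classification	Virtual-real fusion and transfer learning
Associated dataset requiring deposit	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52111
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	许涛. 融合层级目标关系图的开放环境语义目标导航研究[D],2023.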

Files in this item
File name/size	Document type	Version type	Access type	License
毕业论文最终版_许涛.pdf (8787KB)	Thesis		Restricted access	CC BY-NC-SA