Research on Visual Object Detection Methods for Open Underwater Environments


	Research on Visual Object Detection Methods for Open Underwater Environments
	武志亨
	2023-05-23
Pages	94
Degree type	Master's
Abstract	As ocean development draws growing attention worldwide, governments have introduced a range of policies to balance environmental protection and economic growth. Underwater robots, as high-technology equipment with broad application prospects, play an important strategic role in marine scientific research, resource exploitation, and environmental monitoring. Most underwater operations require the robot to perceive its surroundings. In recent years, deep learning has achieved remarkable results in computer vision and has opened the way for advances in underwater visual perception. However, because underwater image data are difficult to acquire, deep learning-based visual perception techniques, which depend on large amounts of training data, face significant obstacles underwater. Research on underwater visual perception from limited data is therefore important for autonomous underwater exploration by robots. This thesis addresses visual perception when underwater images are scarce, with the aim of improving the performance of object detection algorithms in open underwater environments. Object detection relies heavily on large amounts of training data, yet large volumes of high-quality training data are hard to obtain underwater, which limits detection performance in underwater vision tasks. The scarcity of underwater data also means that few underwater categories are available for training detectors, which further degrades underwater object detection and recognition. To address these problems, this thesis studies underwater data augmentation, underwater category expansion, and improvement of object detection performance. The main contributions are as follows.
(1) To address the lack of underwater datasets, this thesis proposes SUG, an underwater image synthesis method based on pixel-level self-supervised learning. First, an underwater image synthesis model is constructed that uses the physical principles of underwater image formation (attenuation, scattering, and the camera model) as prior knowledge and converts terrestrial images to an underwater style by simulating the changes in the light signal. Second, a pixel-level self-supervised training strategy is designed in which pixel-level loss functions supervise the change in the light signal at each image location, yielding high-quality synthetic underwater images. The strategy requires only underwater images for training and no additional input, which simplifies the collection of training data. Finally, experiments on underwater datasets of different styles show that SUG can synthesize underwater images suited to different underwater environments and lighting conditions. Applying SUG to terrestrial test images also yields satisfactory results.
(2) To address the limited number of detection categories caused by the scarcity of underwater datasets, this thesis introduces the Unknown-Classifiable Open World Object Detection (UC-OWOD) problem, which aims to detect unknown objects and classify them into distinct unknown categories. A two-stage object detector, UC-Det, is proposed to solve the UC-OWOD problem. First, the unknown label-aware proposal (ULP) and unknown-discriminative classification head (UCH) modules are designed to detect known and unknown objects. Second, the similarity-based unknown classification (SUC) and unknown clustering refinement (UCR) modules are constructed to distinguish among multiple unknown categories. In addition, two new evaluation metrics, UC-mAP and UC-Recall, are designed to assess unknown object detection performance. Finally, extensive experiments on public terrestrial datasets demonstrate the effectiveness of the proposed method. Moreover, UC-Det, trained only on terrestrial datasets, successfully detects objects in a self-built underwater dataset and identifies them as distinct unknown objects.
(3) To address the fact that most visual perception algorithms are limited in underwater vision tasks by the lack of underwater datasets, this thesis proposes UDP, an underwater domain pre-training method based on synthetic underwater data augmentation. First, the problem is formalized from a transfer learning perspective, and an analysis of the open world object detection task leads to the conclusion that converting source domain data to the target domain distribution improves performance on open tasks. Second, building on this analysis, an underwater domain pre-training scheme is proposed: the model is pre-trained on synthetic underwater images produced by SUG and then fine-tuned on a small amount of real underwater data to acquire the ability to detect underwater object categories. Finally, the method is validated across multiple underwater styles and multiple underwater visual perception tasks. The results show that, in the underwater UC-OWOD task, UDP improves detection performance for known objects by 21.26% mAP and for unknown objects by 4.97% UC-Recall. Catastrophic forgetting after incremental learning is also substantially mitigated, with detection performance on previously learned categories improving by 11.86% mAP.
(4) To verify how the proposed methods perform in a real environment, this thesis designs and validates a visual perception system for open underwater environments. First, the system is built by integrating SUG, UC-Det, and UDP. Second, object detection experiments are carried out in real waters, using a microcomputer running the visual perception system and an underwater vehicle as the verification platform. Finally, the experimental results show that the system can locate and distinguish unknown objects in real open underwater environments during underwater robot exploration, and that it can incrementally learn these unknown categories.
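The attenuation and scattering principles named in contribution (1) can be illustrated with the widely used simplified underwater image formation model, I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d)), where J is the clear (terrestrial) image, d the scene distance, beta_c a per-channel attenuation coefficient, and B_c the background (veiling) light. The sketch below is only an illustration under that assumption and is not the SUG model from the thesis: the function name `synthesize_underwater` and the default values of `beta` and `background` are hypothetical, chosen so that red light attenuates faster than blue.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]  # RGB values in [0, 1]


def synthesize_underwater(
    image: Sequence[Sequence[Pixel]],
    depth: Sequence[Sequence[float]],
    beta: Pixel = (0.8, 0.25, 0.1),          # hypothetical per-channel attenuation (1/m)
    background: Pixel = (0.05, 0.35, 0.45),  # hypothetical blue-green veiling light
) -> List[List[Pixel]]:
    """Apply the simplified formation model to every pixel.

    Each channel is the attenuated scene radiance plus backscattered
    background light; the transmission exp(-beta * d) weights the two terms.
    """
    result: List[List[Pixel]] = []
    for pixel_row, depth_row in zip(image, depth):
        out_row: List[Pixel] = []
        for pixel, d in zip(pixel_row, depth_row):
            r, g, b = (
                j * math.exp(-bc * d) + bg * (1.0 - math.exp(-bc * d))
                for j, bc, bg in zip(pixel, beta, background)
            )
            out_row.append((r, g, b))
        result.append(out_row)
    return result


if __name__ == "__main__":
    # A white pixel seen at 0 m, 5 m, and a very large distance.
    white_row = [[(1.0, 1.0, 1.0)] * 3]
    distances = [[0.0, 5.0, 1000.0]]
    for d, px in zip(distances[0], synthesize_underwater(white_row, distances)[0]):
        print(d, tuple(round(v, 4) for v in px))
```

In this sketch a pixel at zero distance is unchanged, and a pixel at a very large distance converges to the background colour, which matches the qualitative behaviour of real underwater images. A learned method such as SUG would estimate the corresponding quantities from data rather than fix them by hand.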
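Keywords	underwater visual perception; underwater image synthesis; open world object detection; transfer learning
Language	Chinese
Seven major directions: sub-direction classification	Object detection, tracking, and recognition
State Key Laboratory planned direction classification	Underwater bionic robots
Associated dataset requiring deposit	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52175
Collection	Graduates_Master's theses
Recommended citation (GB/T 7714)	武志亨. 面向水下开放环境的视觉目标检测方法研究 [Research on Visual Object Detection Methods for Open Underwater Environments][D],2023.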

Files in this item
File name/size	Document type	Version type	Access type	License
2020E8014682033武志亨.p (35766KB)	Thesis		Restricted access	CC BY-NC-SA