Research on Key Problems in One-Stage Object Detection

	Research on Key Problems in One-Stage Object Detection
	谌强
	2021-05-28
Pages	140
Degree type	Doctoral
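Abstract (translated from the Chinese)	Object detection recovers the locations of objects in an image and is essential to image content analysis and understanding. It is widely applied in autonomous driving, intelligent transportation, unmanned retail, human-computer interaction, and other fields. In recent years, deep learning has become the mainstream approach to object detection, and methods fall into two categories according to the detection pipeline: two-stage and one-stage. One-stage methods are favored in practical deployment because their pipeline is simple, inference is fast, and accuracy and speed are easy to balance. This dissertation therefore focuses on one-stage object detection. However, application scenarios differ widely in computing resources, sample data, and model tasks, which places higher demands on the computational efficiency, detection accuracy, and overall functionality of detection models and calls for further research. Starting from model structure design, model compression, and model task extension, this dissertation analyzes and studies more efficient and more general one-stage object detection methods. The main results and contributions are summarized as follows:
1. A one-stage object detection method based on single-level features. Existing one-stage detectors rely on multi-level features, which complicates the model structure, increases computation, and slows inference. To address this, the dissertation proposes a one-stage detection method based on single-level features. First, single-level features replace multi-level features, eliminating the negative effects of multi-scale features. Second, a dilated encoder and a uniform matching method are introduced to raise the accuracy of detection with single-level features. The scheme simplifies the model structure, reduces computation, and increases running speed. Experiments on public datasets show that, compared with other contemporaneous methods, the method achieves a better balance of speed and accuracy.
2. Post-training quantization and binarization for one-stage object detection. To address the excessive accuracy loss of post-training quantization, the dissertation first proposes a post-training quantization method based on bit-split and stitching. The method uses greedy optimization so that the quantized model retains the accuracy of the original model even under low-bit settings. Second, to compress one-stage detectors further, the dissertation proposes a one-stage detection method based on binary networks. It introduces a set of training techniques that effectively improve the accuracy of the binary-network detector. Experiments on several public datasets show that the method effectively reduces quantization error and achieves significant performance gains over baseline methods.
3. A robust one-stage object detection method based on a strong classifier. The classifier in one-stage detectors is weak, which leads to poor classification and a lack of robustness to background changes. To address this, the dissertation proposes a robust one-stage detection method based on a strong classifier, built on a location-aware multi-branch dilated convolution module that it also proposes. First, to remedy the shortcomings of the parallel sub-network design in one-stage methods, the module introduces object location information into the classifier, improving its robustness to perturbations of the predicted boxes. Second, the method enlarges the classifier's receptive field to bring in more background information, improving its robustness across different backgrounds. Experiments on public datasets show that the method effectively improves detection accuracy and obtained the best results among contemporaneous methods.
4. Panoptic segmentation based on one-stage object detection and spatial information flow. Existing one-stage detectors serve a single function. To address this, the dissertation proposes a panoptic segmentation method based on one-stage object detection. By introducing a flow of object spatial location information, the method lets a one-stage detection model handle both object detection and more complex scene understanding tasks. First, the method decomposes panoptic segmentation into four parallel sub-tasks and designs corresponding parallel sub-networks. Second, the sub-networks use object location information to connect all sub-tasks of panoptic segmentation, making them aware of object locations. Experimental results show that the method understands image scenes more accurately than other contemporaneous methods and achieved the best results of its time on several scene datasets.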
English abstract	Object detection is essential to image content analysis and scene understanding because it locates the objects in an image. Detection methods are widely applied in autonomous driving, intelligent transportation, intelligent retail, and human-computer interaction. In recent years, deep learning-based methods have come to dominate object detection. These methods can be roughly divided into two categories: two-stage methods and one-stage methods. Compared with two-stage detectors, one-stage detectors adopt a simpler pipeline, run faster, and more easily balance speed and accuracy. Because of these merits, one-stage detectors are favored in industrial deployment, and this dissertation therefore focuses on them. Application scenarios differ considerably in computing resources, dataset sizes, and tasks, which places higher demands on the detectors' computing efficiency, detection performance, and overall functionality. This dissertation considers these key problems and aims to design efficient and versatile one-stage detectors. Specifically, it explores three aspects: network structure design, model quantization, and model task generalization. The main findings and contributions are as follows.
1. A simple one-stage detector based on single-level features is designed. Most successful one-stage detectors currently rely on multi-level features, which makes the network complex, adds memory burden, and slows down the detector. This dissertation proposes a novel method based on a single-level feature setting to avoid the side effects of multi-level features. With the proposed dilated encoder and uniform matching, the detection performance of the single-level detector is substantially improved. The design simplifies the network structure, reduces the model's computational cost, and speeds up inference. Experimental results show that the proposed detector achieves a better balance between accuracy and speed than contemporaneous methods.
2. A post-training quantization method and a binary one-stage detector are presented. This dissertation proposes a novel quantization method named bit-split and stitching to address the excessive accuracy loss in post-training quantization. The method adopts a greedy optimization pipeline and can maintain the full-precision model's accuracy under low-bit quantization. In addition, the dissertation explores binary networks for detection and proposes a binary one-stage detector, which compresses the model further. Several proposed training methods significantly improve the performance of the binary one-stage detector. Experimental results on various datasets show that the proposed methods incur only a slight loss of accuracy when models are quantized.
3. A robust one-stage detector based on a strong classifier is proposed. By analyzing detection results, this dissertation finds that the weak classifier in one-stage detectors limits classification performance and robustness under varying backgrounds. Considering the usefulness of background information and the parallel head design of one-stage detectors, it proposes a location-aware multi-dilation module. By enlarging the receptive field of the classifier and adding object location information to it, the module effectively improves the classifier's robustness to perturbations of the predicted boxes and to varying backgrounds. Experimental results on various datasets show that the proposed method outperforms other one-stage detectors.
4. A new panoptic segmentation method based on a one-stage detector is introduced. The proposed method enables the object detection model to perform both object detection and more complex vision tasks. Spatial information flows deliver the object location to all sub-tasks of panoptic segmentation, bridging the sub-tasks and allowing effective information exchange between them. Experimental results on various datasets show that the proposed method achieves better scene understanding performance than other panoptic segmentation methods.
Keywords	object detection; network structure design; model quantization; robustness; panoptic segmentation
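Language	Chinese
Seven major directions: sub-direction classification	Image and video processing and analysis
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/44905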
Collection	Zidong Taichu (紫东太初) Large Model Research Center_Image and Video Analysis
Corresponding author	谌强
Recommended citation (GB/T 7714)	谌强. 单阶段目标检测中的关键问题研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2021.

Files in this item
File name / size	Document type	Version type	Access type	License
毕业论文-谌强.pdf (17553 KB)	Dissertation		Open access	CC BY-NC-SA