Research on Key Problems in One-Stage Object Detection
Chen Qiang (谌强)
2021-05-28
Pages: 140
Degree type: Doctoral
Chinese Abstract

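Object detection obtains the locations of objects in an image and is essential to image content analysis and understanding. It is widely applied in autonomous driving, intelligent transportation, unmanned retail, human-computer interaction, and other fields. In recent years, deep learning has become the mainstream approach to object detection. By detection pipeline, methods fall into two categories: two-stage and one-stage. Because one-stage methods have a simple pipeline, fast inference, and an easily maintained balance between accuracy and speed, they are preferred in practical deployment. This dissertation therefore focuses on one-stage object detection methods.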

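However, application scenarios differ, and conditions such as computing resources, sample data, and model tasks vary widely. This places higher demands on the computational efficiency, detection accuracy, and overall functionality of detection models and calls for further research. Starting from model structure design, model compression, and model task extension, this dissertation analyzes and studies more efficient and more general one-stage object detection methods.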

The main research results and contributions of this dissertation are summarized as follows:

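1. A one-stage object detection method based on single-level features. Existing one-stage detection algorithms rely on multi-level features, which makes the model structure complex, increases computation, and slows inference. To address this, this dissertation proposes a one-stage object detection method based on single-level features. First, single-level features replace multi-level features in the one-stage detector, eliminating the negative effects of multi-scale features. Second, a dilated encoder and a uniform matching method are introduced to improve the accuracy of detection with single-level features. The scheme simplifies the model structure, reduces computation, and increases running speed. Experimental results on public datasets show that, compared with other contemporaneous methods, the method achieves a better balance between speed and accuracy.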

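2. Post-training quantization and binarization for one-stage object detection. To address the excessive accuracy loss of models under post-training quantization, this dissertation first proposes a post-training quantization method based on bit splitting and stitching. The method uses greedy optimization so that the quantized model retains the accuracy of the original model even under low-bit settings. Second, to compress one-stage detection models further, this dissertation proposes a one-stage object detection method based on binary networks. The method introduces a series of training techniques that effectively improve the accuracy of the binary-network detector. Experimental results on multiple public datasets show that the method effectively reduces quantization error and achieves significant performance gains over baseline methods.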

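3. A robust one-stage object detection method based on a strong classifier. The classifier in one-stage detection algorithms is weak, which leads to poor classification and a lack of robustness to background changes. To address this, this dissertation proposes a robust one-stage object detection method based on a strong classifier, built on the location-aware multi-branch dilated convolution module proposed in this dissertation. First, to remedy the shortcomings of the parallel sub-network design in one-stage methods, the module introduces object location information into the classifier, improving its robustness to perturbations of the predicted boxes. Second, the method enlarges the classifier's receptive field to bring in more background information, improving robustness across different backgrounds. Experimental results on public datasets show that the method effectively improves detection accuracy and achieved the best results among contemporaneous methods.

4. Panoptic segmentation based on one-stage object detection and spatial information flow. Existing one-stage detection algorithms serve a single function. To address this, this dissertation proposes a panoptic segmentation method based on one-stage object detection. By introducing a flow of object spatial location information, the method lets a one-stage detection model handle both the object detection task and more complex scene understanding tasks. First, the method decomposes panoptic segmentation into four parallel sub-tasks and designs corresponding parallel sub-networks. Second, object location information from the image is used within the sub-networks to connect all sub-tasks of panoptic segmentation, making each of them location-aware. Experimental results show that the method understands image scenes more accurately than other contemporaneous methods and achieved the best contemporaneous results on multiple scene datasets.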
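Contribution 2 above names a post-training quantization method based on bit splitting and stitching, but the abstract does not spell out the procedure. The following is only a minimal sketch of the arithmetic such a scheme could rest on, under stated assumptions: weights are quantized symmetrically to signed integers, each integer is split into digits in {-1, 0, 1} (one per magnitude bit), and the digits are stitched back by a weighted sum. The function names `quantize`, `split_bits`, and `stitch_bits` are hypothetical, and the greedy per-bit optimization mentioned in the abstract is not reproduced here.

```python
def quantize(weights, bits):
    """Symmetric uniform quantization (an assumed baseline, not the thesis method).

    Maps floats to signed integers in [-(2**(bits-1) - 1), 2**(bits-1) - 1]
    and returns the integers together with the scale factor.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def split_bits(q, bits):
    """Split a signed integer into (bits - 1) digits in {-1, 0, 1}, lowest bit first."""
    sign = -1 if q < 0 else 1
    magnitude = abs(q)
    return [sign * ((magnitude >> i) & 1) for i in range(bits - 1)]


def stitch_bits(digits):
    """Recombine the digits into the original signed integer."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


# Illustrative values only; they do not come from the thesis.
weights = [0.42, -0.17, 0.05, -0.60]
q, scale = quantize(weights, bits=4)
planes = [split_bits(v, bits=4) for v in q]
restored = [stitch_bits(p) * scale for p in planes]
```

Splitting followed by stitching is lossless on the integers, so the only error in `restored` comes from the rounding step in `quantize`. A per-bit optimization, such as the greedy one the abstract refers to, would act on the individual digit planes before they are stitched.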

English Abstract

Object detection is essential to image content analysis and scene understanding because it locates objects in images. Detection methods are widely applied in autonomous driving, intelligent transportation, intelligent retail, and human-computer interaction. In recent years, deep learning-based methods have dominated object detection. These methods can be roughly divided into two categories: two-stage methods and one-stage methods. Compared with two-stage detectors, one-stage detectors adopt a simpler pipeline, run faster, and more easily balance speed and accuracy. Because of these merits, one-stage detectors are favored in industrial deployment. This dissertation therefore focuses on one-stage detectors.

Application scenarios differ considerably in computing resources, dataset sizes, and tasks, which places higher demands on detectors’ computing efficiency, detection performance, and overall functionality. This dissertation addresses these key problems and aims to design efficient and versatile one-stage detectors. Specifically, it explores three aspects: network structure design, model quantization, and model task generalization.

The main findings and contributions are as follows.

1. A simple one-stage detector based on single-level features is designed. Most successful one-stage detectors currently rely on multi-level features, which make the network complex, add memory burden, and slow the detector down. This dissertation proposes a novel method based on a single-level feature setting to avoid the side effects of multi-level features. With the proposed dilated encoder and uniform matching, the detection performance of the single-level detector improves substantially. The design simplifies the network structure, reduces the model’s computational cost, and speeds up inference. Experimental results show that the proposed detector achieves a better balance between speed and accuracy than other contemporaneous methods.

2. A post-training quantization method and a binary one-stage detector are presented. This dissertation proposes a novel quantization method named bit-split and stitching to address the excessive accuracy loss in post-training quantization. The method adopts a greedy optimization pipeline and can maintain the full-precision model’s accuracy under low-bit quantization. In addition, this dissertation explores binary networks for detection and proposes a binary one-stage detector, which compresses the model further. Several proposed training techniques significantly improve the performance of the binary one-stage detector. Experimental results on various datasets show that the proposed methods incur only a slight loss of accuracy when quantizing models.

3. A robust one-stage detector based on a strong classifier is proposed. By analyzing detection results, this dissertation finds that the weak classifier of one-stage detectors limits classification performance and robustness under varying backgrounds. Considering the usefulness of background information and the parallel head design of one-stage detectors, it proposes a location-aware multi-dilation module. By enlarging the classifier’s receptive field and adding object location information to the classifier, the module effectively improves the classifier’s robustness to perturbations of the predicted boxes and to varying backgrounds. Experimental results on various datasets show that the proposed method outperforms other one-stage detectors.

4. A new panoptic segmentation method based on a one-stage detector is introduced. The proposed method enables the object detection model to perform both object detection and more complex vision tasks. In the model, spatial information flows deliver object locations to all sub-tasks of panoptic segmentation, bridging the sub-tasks and allowing effective information exchange between them. Experimental results on various datasets show that the proposed method achieves better scene understanding performance than other panoptic segmentation methods.
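Contributions 1 and 3 both rely on dilated convolutions: a dilated encoder in the first, and a multi-dilation module that enlarges the classifier's receptive field in the third. The sketch below shows only the standard receptive-field arithmetic for stride-1 convolutions, to illustrate why dilation widens the view without adding layers. The kernel size and dilation rates are illustrative assumptions, not values taken from the dissertation, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Four stacked 3x3 convolutions, without and with dilation (assumed rates).
plain = receptive_field(3, [1, 1, 1, 1])    # 9
dilated = receptive_field(3, [2, 4, 6, 8])  # 41

# Parallel branches with different dilations each see a different extent,
# which is the general idea behind a multi-branch dilated module.
branches = [receptive_field(3, [d]) for d in (1, 2, 3)]  # [3, 5, 7]
```

With the same number of layers and parameters, the dilated stack covers 41 positions per axis instead of 9, which is how dilation can bring in more surrounding context at little extra cost.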

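Keywords: object detection; network structure design; model quantization; robustness; panoptic segmentation
Language: Chinese
Seven major directions, sub-direction classification: Image and video processing and analysis
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/44905
Collection: Zidong Taichu Large Model Research Center (紫东太初大模型研究中心), Image and Video Analysis
Corresponding author: Chen Qiang (谌强)
Recommended citation (GB/T 7714):
Chen Qiang (谌强). Research on Key Problems in One-Stage Object Detection (单阶段目标检测中的关键问题研究)[D]. Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所), 2021.
Files in this item
File name/size: 毕业论文-谌强.pdf (17553KB)
Document type: Thesis
Open access type: Open access
License: CC BY-NC-SA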