Research on Deep Learning-Based Object Detection in Remote Sensing Images
黄河
2019-05-21
Pages: 96
Degree type: Master's
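Abstract (translated from the Chinese)

Object detection is an important research direction in remote sensing image processing and analysis, widely used in military and civilian applications such as resource exploration, port monitoring, and traffic management. As earth observation technology advances, there are more and more ways to acquire high-resolution remote sensing images, and the need to extract valuable information from massive volumes of such images is increasingly urgent, which poses a great challenge for remote sensing object detection. Traditional detection methods based on hand-crafted features suffer from poor robustness and low detection accuracy and can no longer meet today's complex image processing requirements. In recent years, deep learning, built on deep neural networks, has shown strong feature representation ability by stacking multiple layers that extract abstract features layer by layer. On many natural image recognition tasks, deep learning methods have demonstrated strong generalization, with performance far exceeding that of methods based on traditional features. However, because of the particular nature of remote sensing images and the complexity of their objects (for example, large differences in object scale and highly variable rotation), deep learning-based remote sensing object detection still has many unsolved difficulties, which limits practical application. Against this background, this thesis takes the characteristics of remote sensing images and the practical needs of object detection as its starting point and studies deep learning-based remote sensing object detection in depth, focusing on two bottlenecks: the scale variability and the rotation variability of objects. The main research contents and contributions are as follows:


1. Instance scale normalization is proposed to address the large scale variation of objects in remote sensing images. The method normalizes all objects into a preset, relatively small scale range for training and testing, which reduces the effect of scale variation on detection accuracy. To preserve the feature diversity of objects in the image and to speed up training on large remote sensing images, the thesis combines an image pyramid with a greedy image patch generation method to achieve flexible instance scale normalization. The effectiveness of the method is verified on remote sensing object detection, and its generalization ability is also verified on several instance-related tasks on natural images. Based on instance scale normalization and a feature pyramid network, the thesis implements multi-scale remote sensing object detection and achieves leading accuracy on a public dataset.
2. Two methods are proposed for the rotation variability of objects in remote sensing images. The first, a rotation-invariant detection method based on deformable convolution, replaces ordinary convolution with deformable convolution to achieve better horizontal bounding box detection. Deformable convolution adaptively adjusts the convolution sampling positions according to the image content and extracts rotation-invariant features. During training, the sampling offsets of the deformable convolution can be initialized to 0 to ease network training. Experiments show that using deformable convolution in the higher layers of the network is more effective for remote sensing object detection. The second, a rotated bounding box detection method based on optimal matching (FRCNN-OBB), detects rotated boxes directly. Compared with a horizontal box, a rotated box localizes an object's position and orientation more precisely. In the first stage of FRCNN-OBB, the regression target is the minimum enclosing horizontal rectangle of the ground-truth object, and the region proposal network generates candidate boxes in this process. In the second-stage regression, to address the overly long regression path after candidate boxes are matched with ground-truth boxes, an optimal matching method is proposed to shorten the regression path and ease regressor training; experiments verify its effectiveness.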
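The core idea of instance scale normalization in contribution 1 can be illustrated with a minimal Python sketch. It only shows how an image pyramid lets each object be handled at a resize factor where its scale falls inside a preset range; it does not reproduce the greedy patch generation step or the thesis's actual implementation. The pyramid factors, the target range, the use of sqrt(area) as the scale measure, and the function names are all assumptions made for illustration.

```python
from math import sqrt

# Assumed values for illustration only; the abstract does not state them.
PYRAMID_FACTORS = [0.25, 0.5, 1.0, 2.0]   # image resize factors
TARGET_RANGE = (32.0, 128.0)              # allowed object scale, in pixels


def object_scale(box):
    """Scale of an axis-aligned box (x1, y1, x2, y2), taken here as sqrt(area)."""
    x1, y1, x2, y2 = box
    return sqrt((x2 - x1) * (y2 - y1))


def valid_factors(box, factors=PYRAMID_FACTORS, target=TARGET_RANGE):
    """Return the pyramid factors at which the resized object fits the target range."""
    lo, hi = target
    scale = object_scale(box)
    return [f for f in factors if lo <= scale * f <= hi]


if __name__ == "__main__":
    # A small object is only usable after upsampling, a large one after downsampling.
    for name, box in [("small", (0, 0, 20, 20)), ("large", (0, 0, 400, 400))]:
        print(name, valid_factors(box))
```

Under these assumed settings, a 20-pixel object is kept only at the 2.0 level and a 400-pixel object only at the 0.25 level, so the detector sees both at comparable scales.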
 

English abstract

Object detection is an important research direction in remote sensing image processing and analysis. It is widely used in resource exploration, port monitoring, traffic management, and other military and civilian fields. As earth observation technology progresses, there are more and more ways to obtain high-resolution remote sensing images. At the same time, the need to extract valuable information from massive volumes of remote sensing images is increasingly urgent, which poses a great challenge to remote sensing object detection. Traditional object detection methods based on manually designed features can no longer meet current complex image processing requirements because of their poor robustness and low detection accuracy. In recent years, deep learning, built on deep neural networks, has shown strong feature representation ability by stacking multiple layers and extracting abstract features layer by layer. On many natural image recognition tasks, deep learning methods have shown strong generalization ability, and their performance is much better than that of methods based on traditional features. However, because of the complexities of remote sensing images, such as large differences in object size and rotation variations, conventional deep learning-based object detection approaches are inadequate for remote sensing images, and practical applications are hindered as a result. In this context, this dissertation studies deep learning-based object detection with a focus on the characteristics of remote sensing images. The main contents and contributions of this dissertation are as follows:

1. An instance scale normalization method is proposed to address the large scale variation of objects. By normalizing all objects into a predefined, smaller scale range for training and testing, the method reduces the impact of large scale variations on detection performance. To preserve the diversity of object features and accelerate training on large remote sensing images, this dissertation combines an image pyramid with greedy patch generation to achieve flexible instance scale normalization. The effectiveness of the proposed method is verified on the remote sensing object detection task, and its generalization is also verified on several instance-related recognition tasks on natural images. A multi-scale remote sensing object detection algorithm is built on instance scale normalization and a feature pyramid network, and it achieves state-of-the-art accuracy on a public dataset.

2. To address the rotation variability of objects in remote sensing images, two approaches are proposed from different perspectives. The first is a rotation-invariant object detection method based on deformable convolution, in which ordinary convolution is replaced by deformable convolution to improve horizontal bounding box detection in remote sensing images. Deformable convolution adaptively adjusts the convolution sampling positions according to the image content and extracts rotation-invariant features. During training, the sampling offsets of the deformable convolution are initialized to 0 to reduce the difficulty of network training. Experiments show that using deformable convolution in the higher layers of ResNet is more useful for remote sensing object detection. The second, proposed to obtain a more accurate object location and orientation, is the FRCNN-OBB algorithm for rotated bounding box detection. In the first stage, the region proposal network regresses only to the minimum enclosing horizontal rectangle of the object and generates the candidate boxes. In the second-stage regression, because the regression path is too long after a candidate box is matched with the ground-truth object, an optimal matching technique is proposed to shorten the regression path and reduce the training difficulty of the regressor. Experiments verify the effectiveness of the proposed method.
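The first-stage regression target of FRCNN-OBB described in contribution 2, the minimum enclosing horizontal rectangle of a rotated object, can be computed as in the following minimal Python sketch. The rotated-box parameterization (center, width, height, angle in radians) and the function names are assumptions for illustration; the abstract does not state which representation the thesis uses, and the sketch does not cover the optimal matching step.

```python
from math import cos, sin


def rotated_box_corners(cx, cy, w, h, theta):
    """Corners of a rotated box given its center, size, and angle in radians.

    This (cx, cy, w, h, theta) parameterization is an assumption for illustration.
    """
    half_w, half_h = w / 2.0, h / 2.0
    c, s = cos(theta), sin(theta)
    offsets = [(-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h)]
    # Rotate each corner offset about the center, then translate.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def enclosing_horizontal_box(cx, cy, w, h, theta):
    """Minimum axis-aligned box (x1, y1, x2, y2) containing the rotated box."""
    xs, ys = zip(*rotated_box_corners(cx, cy, w, h, theta))
    return (min(xs), min(ys), max(xs), max(ys))


if __name__ == "__main__":
    # An unrotated 4 x 2 box centered at (10, 10) is its own enclosing box.
    print(enclosing_horizontal_box(10, 10, 4, 2, 0.0))
```

Rotating the same box by a quarter turn swaps the extent along the two axes, which is why a horizontal box alone cannot recover the object's orientation and a second, rotated-box regression stage is needed.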

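Keywords: object detection; optical remote sensing images; deep learning; scale invariance; rotation invariance
Language: Chinese
Research direction and sub-direction: Object detection, tracking, and recognition
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/23935
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems, Advanced Spatio-Temporal Data Analysis and Learning
Recommended citation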
GB/T 7714
黄河. 基于深度学习的遥感图像目标检测技术研究[D]. 中科院自动化所. 中科院自动化所,2019.
Files in this item
File name/size: 黄河-毕业论文.pdf (33964 KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA