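Research on Key Technologies for Vehicle Detection in Complex Urban Areas from High-Resolution Aerial Remote Sensing Images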


	李非墨
	2017-05-23
Degree type	Doctor of Engineering
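Abstract (translated from Chinese)	For traffic information extraction from high-resolution aerial remote sensing images, estimating three elements of each vehicle target (location, orientation angle, and class) is the basis and prerequisite for a series of subsequent intelligent information extraction tasks. Common high-resolution aerial remote sensing images today are typically 3K to 4K in size. This sharply increased image size provides more useful texture detail, but it also aggravates interference from other ground objects. Moreover, vehicles still occupy only a very small share of the whole image. Owing to small relative scale, uncertain orientation, and numerous distracting ground objects, estimating these three elements remains difficult. As a preliminary study for the vehicle detection module of an urban police unmanned aerial vehicle (UAV) system, this thesis focuses on two aspects. The first is computational load and complexity, namely how to search remote sensing images of up to 4K within a reasonable amount of computation and time. The second is improving the existing "localize first, then classify" estimation scheme to reduce the mutual interference between location and angle, and between location and class. Because of differences in algorithm design requirements and in the nature of the experimental data used, many existing detection methods for generic objects or for remote-sensing-specific targets do not solve these two problems well and need further optimization before they meet the application requirements of this thesis. This research therefore takes three topics as its entry points: vehicle salient region extraction, joint estimation of vehicle location and orientation, and vehicle classification. Drawing on vehicle saliency feature design and on shallow and deep detection models, it investigates the following:
(1) To address the excessive frame size, long processing time, and high computational cost of high-resolution aerial remote sensing images, a method is proposed that uses a remote sensing reference image of the same area to detect salient regions in the current aerial image. Because high-resolution aerial images have complex content and many distracting ground objects, existing vehicle salient region extraction methods often rely on manually restricted search ranges or on external reference information that is hard to obtain. This thesis exploits the feature-space similarity of reference images of the same area: a saliency model trained on the reference image is applied to the current aerial image, yielding a relatively accurate estimate of vehicle-containing regions without any range restriction. To obtain a more stable and effective saliency feature representation, a segmentation-based inner-outer ring descriptor is used to extract image object attributes, together with a non-saliency feature encoding method built on hierarchical clustering, to detect and exclude high-purity non-vehicle regions, achieving the best salient region compression efficiency.
(2) To remedy the weak location and orientation estimation of ensembles of single-angle classifiers, and to strengthen the correlation between the angle classes handled by the component classifiers, a design is proposed that replaces the group of single-angle detectors with multi-angle detectors. Under this design, the correlation between previously isolated angles is markedly strengthened, and the severe imbalance between positive and negative samples in the original single-angle detectors is effectively alleviated. With the number of groups fixed, and in order to maximize each multi-class classifier's ability to discriminate among its assigned angles, the thesis starts from the correlation between local feature positions and orientation angles and assigns to each multi-class classifier the subset of angles most strongly correlated with a particular group of local features. Easily confused angles are thereby bound together, which significantly improves the overall localization and orientation estimation performance of the ensemble detector.
(3) To address class imbalance in vehicle classification, a bipartite network extension method with a favorable cost-benefit profile is proposed, which clearly improves the original network's classification performance on minority classes. In vehicle detection from high-resolution aerial remote sensing images, the very small scale of vehicle targets limits the image detail available for fine-grained classification, and the diverse, overlapping vehicle types produce abnormal inter-class and intra-class variation and an uneven distribution of samples across classes. The result is a typical class imbalance that degrades classification accuracy. A convolutional neural network used as the vehicle classifier can largely compensate for the lack of feature information through its robust appearance model, but it remains weak against class imbalance. Based on the semantic texture correlation of convolutional kernels, this thesis proposes a network-extension approach to class imbalance. The approach uses a new class saliency measure to select convolutional kernels in the lower convolutional layers and attaches them to the original network through a single fully connected layer, which corrects the original class likelihood estimates and greatly reduces the convolution and connection costs of the extension. In the final experiments, the extended network is compared with networks of similar size and is shown to obtain an equal or better classification improvement at a smaller extension cost, and thus a higher cost-benefit ratio.
Through the study of these key technologies, this thesis provides preliminary research on a set of feasible solutions for traffic information extraction systems based on high-resolution aerial remote sensing images. The areas covered are processing efficiency for large frames, efficient joint estimation of location and orientation, accurate pairing of location and class for densely packed vehicles, and class imbalance in vehicle classification. Using public aerial remote sensing datasets as experimental samples, the proposed schemes are validated in test settings close to real conditions, providing a research basis for later integration into a practical algorithm system.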
Abstract (English)	To extract traffic information for an Intelligent Traffic Information Collection System based on high-resolution airborne images, estimating vehicle location, orientation, and type is the theoretical and practical basis of subsequent applications. Most high-resolution airborne images in use today range between 3K and 4K pixels in size. This helps with fine-grained vehicle detection and categorization but introduces more disturbance from other ground objects. Moreover, despite the enlarged image scale, the target vehicles are still too small for efficient recognition. As part of the preliminary research for an urban police unmanned aerial vehicle (UAV) system, this thesis focuses on two issues. The first is the computational time and cost of vehicle detection in 4K aerial images. The second is the separated recognition scheme, in which orientations and types are estimated only after vehicle locations have been obtained; the appearance diversity caused by different vehicle orientations and types then degrades localization accuracy. Because of differences in experimental settings, applying many existing object detection algorithms directly introduces new challenges, and modifications are needed to make them fit our requirements. In this thesis, vehicle salient region extraction, joint vehicle localization and orientation estimation, and vehicle categorization are selected as the central problems. To address them, further studies are made using theories from saliency, shallow hand-crafted feature design, and deep feature learning.
The content of this research is arranged as follows:
(1) To make the computational cost of vehicle detection in large airborne images more reasonable, a novel vehicle salient region detection algorithm is proposed, which uses historical remote sensing images of the same region as a reference to generate the prediction model. Because of the high texture complexity and heavy disturbance in high-resolution airborne images, many existing vehicle salient region detection methods rely on manual assistance or expensive external information. To cut the cost of the reference, this thesis uses the feature-space similarity between remote sensing images of the same region as the auxiliary information for salient region estimation. Because of this similarity, the estimation model trained on the reference image turns out to be accurate and robust. The prediction model uses a center-surround image object sampling descriptor and a hierarchical-clustering-based vehicle insignificance codebook learning (VICL) algorithm. Used together, these two techniques help eliminate image regions without vehicles, and a high region compression ratio is achieved with respect to the average loss in recall.
(2) To remedy the deficiency in localization and orientation estimation, and to improve the correlation between the orientation predictions of the component detectors, a novel detector composition based on multi-class detectors is proposed to replace the one based on binary detectors. With this design, correlations between the previously isolated angles are greatly enhanced, and the class imbalance issue in the binary detectors is alleviated. To maximize the orientation discrimination power of each component detector, orientation-correlated local features are used to distribute the predicted angles optimally, so that the overall localization and orientation prediction performance is maximized.
(3) To address the class imbalance issue in vehicle categorization, a bipartite network extension scheme is proposed to improve prediction accuracy on minority classes. Vehicles in high-resolution aerial images are considerably small, with a limited amount of image information for fine-grained classification. Furthermore, because vehicle types overlap, the already abnormal categorical distribution of vehicles is further skewed, making the classifier more likely to be biased towards the majority classes. Based on the semantic texture modeling pattern of convolutional kernels, a highly cost-effective network extension scheme is proposed. In the proposed method, feature maps from lower convolutional layers are filtered and reused by a single convolutional layer, and further optimized by a class-imbalance-sensitive main-side loss function. With this arrangement, the convolution and connection costs of the extension are largely reduced. The resulting network achieves comparable or better performance on minority classes compared with its similarly structured counterparts.
With the studies above, this thesis can be taken as preliminary research for a near-real-life Intelligent Traffic Information Collection System. The problems addressed include efficient location and orientation estimation, accurate pairing of vehicle type and location, and class imbalance alleviation. The effectiveness of these methods has been verified experimentally on public aerial image datasets, making them a theoretical and practical basis for real-life application in the future.
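As a rough illustration of the computational-cost concern raised in part (1) of the abstract, the following sketch counts how many window evaluations an exhaustive sliding-window search would need on a large image, and how many remain if a saliency stage discards part of the image. It is not the method of the thesis. The function names, the 4000 x 3000 image size, the 48-pixel window, the 8-pixel stride, the 12 orientation bins, and the 20% retained share are all hypothetical values chosen only for the arithmetic.

```python
def count_windows(image_w, image_h, win, stride, n_orientations=1):
    """Window evaluations for an exhaustive sliding-window search.

    All arguments are illustrative assumptions, not values from the thesis.
    One evaluation is counted per window position and orientation bin.
    """
    if win > image_w or win > image_h:
        return 0
    nx = (image_w - win) // stride + 1  # window positions along x
    ny = (image_h - win) // stride + 1  # window positions along y
    return nx * ny * n_orientations


def windows_after_pruning(total, kept_percent):
    """Windows left when only kept_percent of them survive a saliency stage.

    Integer arithmetic with rounding up avoids floating-point surprises.
    """
    return (total * kept_percent + 99) // 100


# Hypothetical example: 4000 x 3000 image, 48-pixel window, stride 8,
# 12 orientation bins, and a saliency stage that keeps 20% of the windows.
full = count_windows(4000, 3000, 48, 8, n_orientations=12)
kept = windows_after_pruning(full, 20)
print(full, kept)
```

Under these assumed numbers, the exhaustive search evaluates roughly 2.2 million windows, which is why discarding vehicle-free regions before detection matters for images of this size.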
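Keywords	high-resolution aerial remote sensing; complex urban areas; vehicle detection; orientation estimation; vehicle classification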
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14750
Collection	Graduates_Doctoral Dissertations
Author affiliation	University of Chinese Academy of Sciences
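Recommended citation (GB/T 7714)	李非墨. Research on Key Technologies for Vehicle Detection in Complex Urban Areas from High-Resolution Aerial Remote Sensing Images[D]. Beijing: Graduate University of Chinese Academy of Sciences, 2017.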

Files in this item
File name/size	Document type	Version type	Access type	License
毕业论文最终版_20170612.pdf (16120KB)	Thesis		Restricted access	CC BY-NC-SA