Object Detection and Recognition in Images (图像目标检测与识别技术研究)


Title	图像目标检测与识别技术研究
Alternative title	Object Detection and Recognition in Images
Author	夏晓珍
Date	2010-12-04
Degree type	Doctor of Engineering
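Chinese abstract (translated)	With the rapid development of computer, multimedia, and network technology, and the steady emergence of compression and high-capacity storage technologies, multimedia information consisting mainly of images, audio, and video has quickly become the mainstream of information exchange and services. How to manage and use such a vast body of multimedia data effectively, and how to quickly find the data and resources users need within it, are the key problems that current content-based image retrieval systems must solve. Object detection and recognition is one of the important research topics in content-based image retrieval, and its results have significant application value for intelligent video surveillance, image and video retrieval, and information security. This thesis studies object detection and recognition in images in depth. The work falls into the following areas:
1) To address curved-mirror deformation, non-rigid deformation, affine deformation, and intra-class deformation during object matching, this thesis proposes an object matching algorithm based on a local appearance-context feature. The feature has two parts, a local appearance feature and a local context feature. The former describes the appearance of a local region of the object. The latter describes both the spatial position of the non-central patches in the region relative to the central patch and the appearance similarity between the non-central patches and the central patch. Object matching requires neither large-scale training samples nor learned prior knowledge. In experiments comparing this feature with other features, it performs better at object matching.
2) Most current object detection methods train on the object as a whole and ignore information about object parts and the spatial relations between them. This thesis therefore proposes an object detection and localization algorithm that combines a part-based structure model with Boost cascade classifiers. The position of each part is annotated by hand, a cascade classifier is built for each part, and the spatial relations between parts follow a star structure. Compared with previous algorithms, the proposed algorithm further improves the accuracy of object detection and localization. In addition, because hand-annotating the part positions of the training samples introduces human interference, this thesis also proposes a semi-supervised variant that combines the part-based structure model with Boost cascade classifiers: given only the position of the whole object, the positions of its parts are located automatically during training. Experimental results show that the algorithm is comparable in performance to other algorithms.
3) Considering the influence of contextual information on recognition performance, this thesis proposes an object category recognition algorithm based on a multi-information fusion strategy. The algorithm fuses semantic context into object classification, specifically the semantic context relations between scene and object and those between object and object. Experimental results show that fusing semantic context further improves object category recognition and more effectively helps traditional algorithms overcome the difficulties they encounter when objects are deformed.
4) A monitoring and management system for online advertising content was designed, and the object matching technique was successfully applied in it to classify online advertisements and monitor non-compliant content. Specifically, the proposed object matching algorithm is used to detect trademarks in online advertisements. Detection results on an online advertisement database show that, compared with other algorithms, the proposed object matching algorithm effectively improves the accuracy of trademark detection.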
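The star-structured spatial model described in contribution 2 can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `star_model_score`, the quadratic displacement penalty, the `penalty` weight, and the candidate scores (stand-ins for per-part cascade classifier outputs) are all assumptions made for illustration.

```python
def star_model_score(center, parts, penalty=0.01):
    """Score an object hypothesis under a star-structured part model.

    center  -- (x, y) of the hypothesized object center (the hub of the star)
    parts   -- list of dicts, each with:
               "offset":     expected (dx, dy) of the part relative to the center
               "candidates": list of (x, y, score) part detections; the score is
                             a stand-in for a per-part classifier response
    penalty -- assumed weight of the quadratic displacement cost
    """
    total = 0.0
    for part in parts:
        expected_x = center[0] + part["offset"][0]
        expected_y = center[1] + part["offset"][1]
        # Each part depends only on the center, so its best candidate
        # can be chosen independently of the other parts.
        best = max(
            score - penalty * ((x - expected_x) ** 2 + (y - expected_y) ** 2)
            for (x, y, score) in part["candidates"]
        )
        total += best
    return total


# Hypothetical toy data: two parts expected above and below the center.
toy_parts = [
    {"offset": (0, -10), "candidates": [(50, 40, 1.0), (80, 80, 1.5)]},
    {"offset": (0, 10), "candidates": [(50, 60, 0.8)]},
]
score_good = star_model_score((50, 50), toy_parts)
score_bad = star_model_score((80, 90), toy_parts)
```

Because every part is connected only to the center, the maximization decomposes per part, which is the main practical appeal of a star structure over a fully connected part model.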
English abstract	With the rapid development of computer, multimedia, and network technology, multimedia resources such as images, flashes, and videos have become the mainstream of information exchange. How to manage and use this large volume of multimedia resources effectively, and how to quickly find the data users need within it, are key problems in content-based image retrieval systems. Object detection and recognition is one of the most important research topics in content-based image retrieval, and has significant application value for smart video surveillance, image and video retrieval, and information security. This thesis studies object detection and recognition in images in depth. Its main contributions are as follows.
First, we present a novel approach to measuring similarity between objects by matching a local “appearance contextual descriptor”, which is robust across a substantial range of lens deformation, non-rigid deformation, local affine deformation, and intra-class deformation. The descriptor has two components: a histogram of oriented gradients feature representing local patch appearance, and a contextual descriptor capturing both the spatial distribution of the non-reference patches relative to the reference patch and the appearance similarities between the reference patch and the non-reference patches in the region. We treat recognition in a nearest-neighbor classification framework and match objects in regions with no prior learning. We compare our method with commonly used methods and demonstrate its applicability to object matching.
Second, we propose a new method for object detection that integrates a part-based model with cascades of boosted classifiers. The parts are labeled in a supervised manner. For each part, we construct a boosted cascade by selecting the most discriminative features from a large set and combining more complex classifiers. Then we learn a model of the spatial relations between those parts. The experimental results demonstrate that training a cascade of boosted classifiers for each part and adding spatial constraints among parts improves detection and localization performance. In addition, to avoid the noise that hand-labeling the training images may add, we learn the part models in a weakly supervised manner, where object labels are provided but part labels are produced by training. The experimental results ...
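The nearest-neighbor matching framework mentioned in the first contribution can be sketched as follows. This is only an illustration under stated assumptions: the descriptors are toy vectors rather than the thesis's appearance contextual descriptor, Euclidean distance is an assumed similarity measure, and the names `l2_distance` and `nearest_neighbor_label` are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_label(query, references):
    """Return (label, distance) of the reference descriptor closest to query.

    references -- list of (label, descriptor) pairs. No training step is
                  involved: the labeled references are compared directly.
    """
    best_label, best_dist = None, math.inf
    for label, descriptor in references:
        dist = l2_distance(query, descriptor)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical toy descriptors standing in for two reference objects.
toy_references = [
    ("logo_a", [1.0, 0.0, 0.0]),
    ("logo_b", [0.0, 1.0, 0.0]),
]
label, dist = nearest_neighbor_label([0.9, 0.1, 0.0], toy_references)
```

A nearest-neighbor scheme of this kind needs no learned model, which is consistent with the abstract's statement that matching is done with no prior learning; its cost grows with the number of reference descriptors.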
Keywords	Object Detection and Recognition; Category Recognition; Image Retrieval; Local Feature; Object Matching; Part-based Structure; Cascade Classifier; Multi-information Fusion; Semantic Context; Ads Classification
Language	Chinese
Document type	Dissertation
Identifier	http://ir.ia.ac.cn/handle/173211/6314
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	夏晓珍. 图像目标检测与识别技术研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2010.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20071801462806 (2846 KB)			Not yet open	CC BY-NC-SA