Object Image Recognition and Near-Duplicate Image Retrieval Based on the Visual Codebook Model


	Object Image Recognition and Near-Duplicate Image Retrieval Based on the Visual Codebook Model
Alternative Title	object recognition and near-duplicated image retrieval based on bag of visual words model
	刘廷麟
	2010-05-23
Degree Type	Master of Engineering
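Abstract (translated from Chinese)	Object image recognition and near-duplicate image retrieval have long been classic problems in computer vision. They are also key technologies for human-computer interaction, intelligent robotics, and digital media retrieval, so research on them is of considerable importance. In recent years, the visual codebook (bag of visual words) model has been widely applied in computer vision and has become a new research focus, owing to properties such as computational simplicity and robustness to external interference. This thesis centers on the visual codebook model: it summarizes and analyzes related research and concrete applications of the model in object image recognition and near-duplicate image retrieval, examines the problems that may arise when the model is applied in these two areas, and proposes improved algorithms to address them. The main work and contributions are as follows: 1. A comprehensive survey of visual codebook image representation is given, with an analysis of the strengths and weaknesses of the different methods used at each step. Particular attention is paid to the concrete pipelines for applying the representation to object image recognition and to near-duplicate image retrieval. 2. In object image recognition, the visual codebook representation ignores the spatial positions of local feature points in the image, which weakens its discriminative power. To address this, a visual codebook image representation combined with query expansion is proposed. It makes it easy to mine pairwise co-occurrence relationships between visual words and uses them to revise the representation in a manner analogous to query expansion in text retrieval, improving the robustness and discriminative power of the final image representation. 3. The thesis shows that when the conventional visual codebook model is applied to near-duplicate image retrieval, matched feature points are not mapped consistently to the same visual word. To address this, a codebook generation method based on feature point matching information is proposed. It effectively improves the matching of visual words between near-duplicate images and thereby the discriminative ability of retrieval. 4. A real-time optical disc (CD) retrieval system based on a hierarchical visual codebook model is studied and implemented. All work was carried out on a Windows PC platform using the Visual C++ 2008 development tools. On a database of 70,000 disc images, the system retrieves, in near real time (retrieval time under 1 second) and with high accuracy, the disc information that the user inputs through a camera. Overall, this thesis makes a useful exploration of object image recognition and near-duplicate image retrieval based on the visual codebook model.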
English Abstract	Object recognition and near-duplicated image retrieval have long been classical and challenging problems in computer vision. Their extensive applications in human-computer interfaces, intelligent robots, digital media retrieval, etc. make the relevant research meaningful and useful. In recent years, the bag of visual words model has attracted much attention due to its simplicity and robustness to environmental noise. This thesis focuses on the bag of visual words model and on its applications and related research in object recognition and near-duplicated image retrieval. In addition, we analyze some existing problems in these two fields under the bag of visual words model and propose novel solutions accordingly. The main contents and contributions of this thesis include: 1. A comprehensive review of image representation methods under the bag of visual words model is presented, together with a discussion of the respective advantages and disadvantages of each method. Emphasis is also placed on the specific operation flows for object recognition and near-duplicated image retrieval under the bag of visual words model. 2. To address the problem in object recognition that the image representation suffers from ignoring the spatial relationship between local key-points, we propose an expanded bag of visual words representation for object recognition. In this method, the classical bag of visual words representation is updated by a Query Expansion algorithm that uses the mined mutual co-occurrence relationships between visual words, and we demonstrate a significant improvement in the robustness and discriminativeness of the new representation. 3. We uncover a problem in near-duplicated image retrieval based on the bag of visual words: matching key-point pairs are not always assigned to the same visual word. Accordingly, another novel algorithm is proposed in the thesis to improve the matching probability of the relevant images and thus refine the final retrieval results. 4. We implement a real-time CD retrieval system based on the hierarchical bag of visual words model. The system runs on the Windows PC platform and was developed with Visual C++ 2008. The real-time (less than 1 sec) and accurate retrieval results illustrate the success of the system. In a word, in this thesis, we have made a lot of fruitful...
Keywords	Visual Codebook Model; Object Image Recognition; Near-Duplicate Image Retrieval; Bag Of Visual Words; Object Recognition; Near-duplicated Image Retrieval
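Language	Chinese
Document Type	Thesis
Item Identifier	http://ir.ia.ac.cn/handle/173211/7508
Collection	Graduates_Master's Theses
Recommended Citation (GB/T 7714)	刘廷麟. Object Image Recognition and Near-Duplicate Image Retrieval Based on the Visual Codebook Model[D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2010.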
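
Illustrative note: the abstract describes representing an image as a histogram over visual words and then revising that histogram using pairwise co-occurrence between words, in the spirit of query expansion. The record does not give the thesis's actual formulas, so the following is only a minimal sketch under stated assumptions: nearest-codeword assignment by squared Euclidean distance, a hand-written toy codebook, and a hypothetical expansion rule (`expand_histogram` with weight `alpha`) that adds co-occurrence-weighted mass and renormalizes. All function names, parameters, and data here are invented for illustration and are not taken from the thesis.

```python
from collections import Counter


def nearest_word(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (squared Euclidean)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))


def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors and return an L1-normalized visual-word histogram."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in range(len(codebook))]


def expand_histogram(hist, cooccurrence, alpha=0.5):
    """Hypothetical expansion rule (an assumption, not the thesis's method):
    add alpha-weighted mass from co-occurring words, then renormalize."""
    n = len(hist)
    raw = [
        hist[i] + alpha * sum(cooccurrence[i][j] * hist[j] for j in range(n))
        for i in range(n)
    ]
    total = sum(raw)
    return [v / total for v in raw] if total else raw


# Toy 2-D "descriptors" and a 3-word codebook (made-up values).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (0.0, 0.2)]
hist = bovw_histogram(descriptors, codebook)

# Assumed symmetric co-occurrence matrix: words 0 and 2 tend to appear together.
cooccurrence = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
expanded = expand_histogram(hist, cooccurrence, alpha=0.5)
print(hist)
print(expanded)
```

In this toy case the unexpanded histogram puts no mass on word 2, while the expanded one does because word 2 co-occurs with word 0. In practice the codebook would typically be learned by clustering local descriptors, and the co-occurrence statistics would be estimated from training images.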

Files in This Item
File Name/Size	Document Type	Version Type	Access Type	License
CASIA_20072801462803 (7444KB)			Restricted Access	CC BY-NC-SA