通用视觉对象分类方法与系统
Alternative Title: Generic Visual Object Categorization Method and System
程刚
Subtype: Doctor of Engineering
Thesis Advisor: 王春恒 (Wang Chunheng)
Date: 2010-05-29
Degree Grantor: Graduate University of the Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Control Theory and Control Engineering
Keyword: Visual Object Categorization; Multi-feature Fusion; Adaptive Binarized Data Transformation; Scene Image Categorization; Discriminative Dictionary Learning
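Abstract: With the widespread use of image acquisition devices such as digital cameras, webcams, and ultra-high-speed scanners, and with the rapid development of the Internet, the number of digital images is growing exponentially. How to make computers automatically understand image content and acquire the human ability of visual categorization is an active research topic in computer vision. This thesis addresses generic visual object categorization and studies the design of the categorization system, feature extraction and transformation, classifier fusion, and dictionary training. The main contributions are as follows.
First, for generic visual object categorization, a visual object categorization system based on multi-feature fusion is proposed, designed, and implemented. The system is built on the Bag-of-Features model. Two detectors and five descriptors are combined to form several kinds of local features, and spatial pyramid partitioning adds structural information, yielding multi-channel Bag-of-Features histogram features. Each feature then undergoes an adaptive binarized feature transformation, and the transformed multi-channel features are fused through a kernel function to improve categorization accuracy. The system was applied to the international visual object classification competition, The PASCAL Visual Object Classes Challenge 2009 (VOC2009), and achieved satisfactory results.
Second, an important application of visual object categorization is classifying and managing images according to their semantics, and scene image classification is one of the most common such problems. For this problem, the thesis proposes a scene image classification method that fuses structural and texture features. Scene images are classified by a two-stage classifier. The first-stage classifier uses global structural information to obtain candidate categories and determines pairs of similar categories from the classification results. The second-stage classifier uses local texture information to distinguish the similar categories. Cascading the classifiers makes combined use of the global structure and the local texture of scene images. Experiments show that the method classifies different scene categories robustly and distinguishes similar scene categories effectively, reaching the best classification accuracy known at the time on the 15-category scene classification task.
Third, dictionary learning is particularly important in the Bag-of-Features model. Traditional methods generally train the dictionary under the criterion of minimum reconstruction error. To improve the discriminative power of Bag-of-Features features, the thesis proposes a discriminative dictionary learning method for sparse representation. Unlike other learning methods that add discriminative information, and in view of the characteristics of Bag-of-Features, the method uses not only the class information of image patches but also the Fisher discriminant information of whole images to make the dictionary more discriminative. Experiments show that the method achieves better classification performance than traditional dictionary learning methods.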
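The kernel fusion step in the first contribution can be illustrated with a minimal sketch. The record does not give the kernel's exact form, so the code below assumes a commonly used multi-channel construction: a chi-square distance per channel, normalized by that channel's mean pairwise distance, summed and passed through an exponential. The names `chi2_distance` and `fused_kernel` are hypothetical and are not taken from the thesis.

```python
import math


def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two non-negative histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def fused_kernel(channels):
    """Fuse several feature channels into one kernel matrix.

    channels: list of channels; each channel is a list with one
    histogram per image (same image order in every channel).
    Returns K with K[i][j] = exp(-sum_c D_c(i, j) / A_c), where A_c is
    the mean pairwise distance of channel c (an assumed normalizer).
    """
    n = len(channels[0])
    total = [[0.0] * n for _ in range(n)]
    for hists in channels:
        dist = [[chi2_distance(hists[i], hists[j]) for j in range(n)]
                for i in range(n)]
        pairs = [dist[i][j] for i in range(n) for j in range(i + 1, n)]
        # Fall back to 1.0 when the channel has no usable scale.
        mean_d = sum(pairs) / len(pairs) if pairs and sum(pairs) > 0 else 1.0
        for i in range(n):
            for j in range(n):
                total[i][j] += dist[i][j] / mean_d
    return [[math.exp(-total[i][j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Two toy channels over three images (illustrative data only).
    channel_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    channel_b = [[0.2, 0.8], [0.8, 0.2], [0.5, 0.5]]
    for row in fused_kernel([channel_a, channel_b]):
        print(row)
```

A matrix of this kind could be passed to a kernel classifier that accepts precomputed kernels. Normalizing each channel by its mean distance keeps any single channel from dominating the sum.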
Other Abstract: With the development of image capturing devices such as digital cameras, video recorders and scanners, the number of digital images is increasing explosively. A person can recognize thousands of object categories; however, it is difficult for a computer to achieve this level of performance, and more and more researchers are working on the problem. This thesis studies system design, feature extraction and transformation, classifier fusion, and dictionary learning. First, a multi-feature fusion system for visual object categorization is proposed. The system is based on the Bag-of-Features model. Two detectors and five descriptors are combined to capture local features. To incorporate structure information, spatial pyramid matching is used. Each feature channel is then transformed by an adaptive binarized data transformation. Finally, all channels are fused by an extended Gaussian kernel. The system is evaluated on The PASCAL Visual Object Classes Challenge 2009 (VOC2009). Second, scene classification is one of the most important applications of visual object categorization. A scene image categorization method based on the fusion of structure and texture is proposed. Structure information is the input to the first classifier, and a few pairs of similar categories are computed from the results of this first classification. The second classifier uses texture information to distinguish the similar categories. Experiments show that the proposed method achieves state-of-the-art results. Finally, dictionary learning is important to the performance of the Bag-of-Features model. Most existing methods are built on reconstruction error. To improve the discriminative capacity of local features, a discriminative dictionary learning approach is introduced. A discrimination measure inspired by linear discriminant analysis is incorporated into traditional dictionary learning, and experiments validate the method.
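As a rough illustration of a discrimination measure inspired by linear discriminant analysis, the sketch below computes a Fisher-style ratio of between-class to within-class scatter over image-level feature vectors. This is an assumption about the general form only. The record does not state the thesis's actual objective or how the measure enters dictionary learning, and the function names are hypothetical.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fisher_score(features, labels, eps=1e-12):
    """Ratio trace(S_b) / trace(S_w) for labeled feature vectors.

    Larger values mean class means are far apart relative to the
    spread inside each class, i.e. the features are more discriminative.
    """
    overall = mean_vector(features)
    by_class = {}
    for vec, label in zip(features, labels):
        by_class.setdefault(label, []).append(vec)
    within = 0.0
    between = 0.0
    for members in by_class.values():
        mu = mean_vector(members)
        within += sum(squared_distance(vec, mu) for vec in members)
        between += len(members) * squared_distance(mu, overall)
    return between / (within + eps)


if __name__ == "__main__":
    # Toy image-level codes (illustrative data only).
    codes = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
    print(fisher_score(codes, [0, 0, 1, 1]))  # classes well separated
    print(fisher_score(codes, [0, 1, 0, 1]))  # classes heavily mixed
```

In a discriminative dictionary learning setting, a term of this kind would typically be combined with the reconstruction error so that the learned dictionary yields codes that score higher on it.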
Shelf Number: XWLW1486
Other Identifier: 200718014628031
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6259
Collection: Graduates_Doctoral Dissertations
Recommended Citation
GB/T 7714
程刚. 通用视觉对象分类方法与系统[D]. 中国科学院自动化研究所. 中国科学院研究生院,2010.
Files in This Item:
File Name/Size: CASIA_20071801462803 (5550KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.