结构稀疏学习及其在图像检索中的应用研究
Alternative Title: Structured Sparse Representation and Its Application to Image Retrieval
康翠翠
Subtype: Doctor of Engineering
Thesis Advisor: 向世明
2015-05-21
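Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: Sparse Learning; Sparse Representation; Lasso; Kernel Methods; Metric Learning; Face Recognition; Cross-modal Matching; Text-Image Retrieval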
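Abstract: In digital signal processing and machine learning, sparse representation is a fundamental and important research problem, and it has shown excellent performance in a large number of practical applications. Sparse learning is a class of machine learning methods based on sparse representation. The classical sparse learning model is a linear regression model with an \ell_0-norm constraint, which is an NP-hard problem. The introduction of the LASSO turned sparse learning from a non-convex problem into a non-smooth convex one, laying the foundation for sparse learning to become an active research direction. Although many improved LASSO models exist, they still cannot handle nonlinearly distributed data well, nor can they fully exploit the internal structure of the data. To address this, the dissertation carries out theoretical and algorithmic research on two fronts: building kernel (nonlinear) sparse learning models and building structured sparse learning models. In addition, the rapid development of intelligent electronic terminal devices has steadily increased the demand for intelligent applications based on digital media data, making intelligent image retrieval a core research topic in modern Internet applications. On the one hand, face recognition, as a special case of image retrieval, still needs further research on learning algorithms that are robust to perturbations such as occlusion. On the other hand, being "not restricted by data modality" is a basic requirement for the next generation of image retrieval systems, and retrieval techniques based on cross-modal learning are an important route to this goal. The aim of cross-modal learning is to match data of different modalities directly; however, the differences between modalities make such direct matching difficult. Therefore, on the application side, the dissertation studies intelligent image retrieval methods based on structured sparse learning and cross-modal learning. The main contributions are as follows:
1. A kernel sparse learning method based on kernel coordinate descent and one based on kernel homotopy are proposed. The core idea of kernel coordinate descent is to update each coordinate component in turn while holding the other components fixed. The kernel homotopy algorithm maintains a support set throughout the optimization, repeatedly adding active atoms to it and removing inactive atoms from it. Building on these two optimization methods, the dissertation proposes a Hamming kernel construction for face recognition, whose core idea is to use local image features to build a kernel learning model based on a non-Euclidean distance metric. Comparative studies show that the proposed methods better handle small training sets, random noise, partial occlusion, and severe illumination changes in difficult face recognition problems, and are more robust.
2. A discriminative subspace learning algorithm based on feature structure learning is proposed, with the aim of making the learned target subspace features more robust to local perturbations such as partial occlusion. The method uses the local structure of the features to impose sparsity constraints on the target subspace, so that the target subspace is built on basis vectors that carry local parts. As a result, the target feature representation of a sample consists of the coefficients of basis vectors with different local parts, which handles local image perturbations such as partial occlusion better. Based on this mathematical model, the dissertation also proposes a method that learns multiple parallel discriminative subspaces and fuses their features, so that more accurate local parts can be learned. Experimental comparisons on two well-known international face databases show that the algorithm is more robust. In particular, on face image subsets with partial occlusion and local illumination, the recognition performance improves markedly.
3. A cross-modal collaborative linear regression method is proposed. By learning, in the regression target space, an information association matrix that links the different modalities, the algorithm lets cross-modal data exchange and complement information, making the target features more robust. In terms of model construction, the method therefore differs markedly from conventional methods that use a latent subspace to relate information across modalities. In addition, the...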
Other Abstract: Sparse representation is a fundamental research topic in signal processing and machine learning, and it has shown impressive performance in many practical applications. Sparse learning is a family of machine learning methods based on sparse representation. In its original form, sparse learning is a linear regression problem regularized by the \ell_0 norm, which is NP-hard. The LASSO model replaces the \ell_0 norm with the \ell_1 norm, turning sparse learning from a non-convex problem into a convex optimization problem, and this laid the foundation for the later popularity of sparse learning. Although many extensions of LASSO have been proposed, sparse representation still cannot handle nonlinearly distributed data well or take advantage of data structure. This dissertation therefore focuses on kernel (nonlinear) sparse learning and structured sparse learning algorithms. Beyond sparse learning, image retrieval has become an important research area with the rapid development of electronic devices and the growth of multimedia data in both number of modalities and volume. On the one hand, face recognition, a special case of image retrieval, needs further research on algorithms that are robust to local distortions such as occlusion. On the other hand, being "transparent to modalities" is a new requirement for image retrieval systems, and cross-modal matching is a critical way to meet it. The goal of cross-modal matching is to compute distances between modalities directly; however, the divergence between modalities makes this difficult. Thus, for applied research, the dissertation targets image retrieval algorithms based on structured sparse representation. Specifically, the contributions of the dissertation are as follows: 1. The Kernel Coordinate Descent (KCD) and Kernel Homotopy (KHomotopy) algorithms are proposed to find the optimal solution to the kernel LASSO problem. The KCD algorithm iteratively updates each coordinate while holding the other coordinates fixed. The KHomotopy algorithm maintains a support set, adding active elements to it or removing inactive elements from it in each iteration. Based on the two algorithms, we further propose a Hamming kernel that takes better advantage of local image features for face recognition. Extensive experiments show that the proposed algorithm is more robust to the small-sample-s...
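The coordinate-wise update described for KCD can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes the kernel LASSO objective takes the common form 0.5 * a^T K a - k_y^T a + lam * ||a||_1, where K is the Gram matrix of the dictionary atoms and k_y holds the kernel values between each atom and the query sample. The names `kernel_lasso_cd`, `soft_threshold`, `K`, `k_y`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Scalar soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)


def kernel_lasso_cd(K, k_y, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for an assumed kernel LASSO objective.

    Minimizes 0.5 * a^T K a - k_y^T a + lam * ||a||_1 over a.
    K   : (n, n) Gram matrix of the atoms, with K[j, j] > 0 (assumption).
    k_y : (n,) kernel values between each atom and the query sample.
    """
    n = K.shape[0]
    alpha = np.zeros(n)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(n):
            # Correlation of atom j with the residual, with the
            # contribution of coordinate j itself added back.
            rho = k_y[j] - K[j] @ alpha + K[j, j] * alpha[j]
            # Closed-form minimizer over coordinate j, others held fixed.
            new_value = soft_threshold(rho, lam) / K[j, j]
            max_change = max(max_change, abs(new_value - alpha[j]))
            alpha[j] = new_value
        if max_change < tol:
            break
    return alpha


# Toy usage with a linear kernel and an orthonormal dictionary,
# where the solution reduces to soft-thresholding of k_y.
K_demo = np.eye(3)
k_y_demo = np.array([3.0, -0.5, 1.5])
alpha_demo = kernel_lasso_cd(K_demo, k_y_demo, lam=1.0)
```

Each inner step solves a one-dimensional problem in closed form, which is why only the Gram matrix and the kernel vector are needed and the feature map never has to be computed explicitly.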
Other Identifier: 201318014628001
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6673
Collection: Graduates_Doctoral Dissertations (毕业生_博士学位论文)
Recommended Citation
GB/T 7714
康翠翠. 结构稀疏学习及其在图像检索中的应用研究[D]. 中国科学院自动化研究所. 中国科学院大学,2015.
Files in This Item:
File Name/Size: CASIA_20131801462800 (6814KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.