Robust Subspace Clustering Methods (鲁棒子空间聚类方法)
Alternative Title: Robust Subspace Clustering
Author: 张迎亚 (Zhang Yingya)
Subtype: Master of Engineering
Thesis Advisor: 孙哲南 (Sun Zhenan)
2014-05-24
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Computer Technology
Keyword: Subspace Clustering; Correntropy; Half-quadratic Minimization; Sparse Subspace Clustering; Low-rank Representation
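Abstract: High-dimensional data are ubiquitous in many fields, including computer vision, machine learning, pattern recognition, bioinformatics, and signal and image processing. Such data not only increase an algorithm's processing time and memory requirements but can also degrade its performance, owing to noise and to the shortage of samples relative to the dimensionality of the space. Fortunately, high-dimensional data usually have a low-dimensional intrinsic structure. Recovering this structure reduces time and space complexity and lessens the influence of high-dimensional noise, which improves the performance of learning and recognition algorithms. Subspace clustering, a data analysis method that has become popular in recent years, is attracting increasing attention because of its wide application in machine learning and computer vision. The subspace clustering problem is to partition data according to the subspaces in which they lie: it finds a low-dimensional representation of the original high-dimensional data and projects the data onto the corresponding low-dimensional subspaces. In practice, however, data are affected by various kinds of noise, and partitioning the data correctly when they contain a large amount of noise is the focus of this thesis. The thesis aims to improve the robustness of existing subspace clustering algorithms, especially their robustness to non-Gaussian noise. The main work is as follows:
1) The basic concepts of subspace clustering are presented, and existing subspace clustering algorithms are categorized according to their underlying principles. Each category is briefly introduced, the strengths and weaknesses of each algorithm are summarized, and the problems of existing algorithms are pointed out.
2) A subspace clustering algorithm based on half-quadratic optimization is proposed. The sparse subspace clustering model (SSC) finds a sparse representation of the data and applies spectral clustering to the sparse representation matrix to obtain the partition of the data. The original SSC, however, is easily affected by noise. To improve its robustness, correntropy is introduced into the sparse representation model, and an iterative algorithm based on half-quadratic optimization is proposed to optimize the model. Experimental results on two applications, motion segmentation and face clustering, confirm the effectiveness of the method.
3) A low-rank representation algorithm based on correntropy is proposed. The low-rank representation model (LRR) captures the global structure of the data by finding the lowest-rank representation of the original data. To improve the robustness of LRR, the $l_{2,1}$ norm in the original LRR is replaced with a correntropy loss function, giving the correntropy-based low-rank representation model CLRR. To handle structured noise in the data, a low-rank representation model based on column-wise correntropy, c-CLRR, is also proposed. An iterative algorithm based on the alternating direction method is then proposed to solve these models. Results on the Hopkins 155 database and the Extended Yale B database show that the methods effectively improve the robustness of the original LRR.
In summary, by introducing the concept of correntropy from information-theoretic learning into subspace clustering and combining it with half-quadratic optimization, this thesis improves both the sparse subspace clustering model (SSC) and the low-rank representation model (LRR), greatly increasing the robustness of the original models to noise.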
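For readers unfamiliar with the two central terms, the block below states correntropy and the half-quadratic identity in their standard textbook form. This record does not show the thesis's own formulation, so the notation here ($X$, $Y$, $\sigma$, $z$, $p$, $\varphi$, $e_i$) is an assumption, and the correntropy definition is given only up to a normalizing constant.

```latex
% Correntropy of two random variables X and Y under a Gaussian kernel
% of width \sigma (normalizing constant omitted):
V_\sigma(X, Y) = \mathbb{E}\!\left[ \exp\!\left( -\frac{(X - Y)^2}{2\sigma^2} \right) \right]

% Half-quadratic identity: the exponential is a supremum of functions
% that are linear in z, using an auxiliary variable p < 0:
\exp(-z) = \sup_{p < 0} \bigl\{ p\,z - \varphi(p) \bigr\},
\qquad \varphi(p) = -p \log(-p) + p,
\qquad p^{\ast} = -\exp(-z).

% With z_i = e_i^2 / (2\sigma^2) for residuals e_i, maximizing
% \sum_i \exp(-z_i) can therefore alternate between
%   (a) p_i \leftarrow -\exp\!\left( -e_i^2 / (2\sigma^2) \right), and
%   (b) minimizing the weighted quadratic \sum_i (-p_i)\, e_i^2 .
```

Step (b) is an ordinary weighted least-squares problem, which is why large residuals (outliers) receive weights close to zero and have little influence on the solution.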
Other Abstract: Huge volumes of high-dimensional data have been created with the development of information and network technology. Processing and archiving such data require much more computation and storage, and increasing dimensionality greatly degrades the performance of information processing and analysis. Fortunately, high-dimensional data usually have low-dimensional inner structures. Recovering these structures can reduce the computational complexity and storage requirements of information processing algorithms and improve the performance of machine learning and pattern recognition tasks. Subspace clustering aims to divide high-dimensional data points into multiple groups and, simultaneously, to find a low-dimensional subspace that fits each group. As a new and powerful data analysis tool, it has attracted great attention because of its promising applications in computer vision and machine learning. However, learning low-dimensional subspace structures is challenging because of errors (e.g., noise and corruptions) that may exist in the high-dimensional data. This thesis aims to improve the robustness of subspace clustering algorithms in the presence of large corruptions and outliers. The main work and contributions are as follows:
1) The thesis first presents a survey of subspace clustering. Existing subspace clustering methods are classified into four categories, and the strengths and weaknesses of each method are summarized.
2) A novel optimization model based on half-quadratic minimization is introduced for robust subspace clustering. Sparse subspace clustering (SSC) finds a sparse representation of the data and applies spectral clustering to the representation matrix to obtain the final segmentation. However, the original SSC is sensitive to noise. A correntropy loss function is therefore integrated into the sparse representation model to improve the robustness of SSC, and half-quadratic minimization is used as an efficient solver for the proposed formulation. Experimental results on two real-world applications, face clustering and motion segmentation, demonstrate that the method outperforms state-of-the-art subspace clustering methods.
3) A novel subspace clustering method based on correntropy is proposed in the framework of ...
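The idea of solving a correntropy loss by half-quadratic minimization can be illustrated on a much simpler problem than SSC. The sketch below is not the thesis's algorithm: it swaps the sparse self-representation model for a plain ridge-regularized linear regression, and the function name `correntropy_regression` and the parameters `sigma`, `lam`, and `n_iter` are hypothetical names chosen for this illustration. It shows only the alternation between a weighted least-squares step and a closed-form weight update.

```python
import numpy as np


def correntropy_regression(A, b, sigma=1.0, lam=1e-6, n_iter=30):
    """Illustrative half-quadratic solver for a correntropy (Gaussian-kernel) loss.

    Alternates between a weighted ridge regression with the weights fixed and
    a closed-form weight update. `lam` is the ridge weight used in the
    reweighted quadratic subproblem. All names are hypothetical.
    """
    n, d = A.shape
    w = np.ones(n)   # auxiliary weights, one per sample
    x = np.zeros(d)
    for _ in range(n_iter):
        # Quadratic step: solve (A^T W A + lam I) x = A^T W b with W = diag(w).
        Aw = A * w[:, None]
        x = np.linalg.solve(A.T @ Aw + lam * np.eye(d), Aw.T @ b)
        # Auxiliary step: weights decay with the squared residual, so
        # grossly corrupted samples get weights close to zero.
        r = A @ x - b
        w = np.exp(-(r ** 2) / (2.0 * sigma ** 2))
    return x, w


if __name__ == "__main__":
    # Synthetic data (an assumption for the demo): 200 samples, 20 gross outliers.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(200, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    b = A @ x_true + 0.01 * rng.normal(size=200)
    b[:20] += 10.0

    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    x_hq, w = correntropy_regression(A, b)
    print("least-squares error:", np.linalg.norm(x_ls - x_true))
    print("half-quadratic error:", np.linalg.norm(x_hq - x_true))
    print("largest outlier weight:", w[:20].max())
```

The kernel width `sigma` controls how quickly a sample is discounted; it has to be chosen relative to the scale of the inlier residuals, and the thesis's own choice is not stated in this record.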
Shelf Number: XWLW2077
Other Identifier: 2011e8014661100
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/7706
Collection: Graduates_Master's Theses
Recommended Citation
GB/T 7714
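Zhang Yingya (张迎亚). Robust Subspace Clustering Methods (鲁棒子空间聚类方法) [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2014.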
Files in This Item:
File Name/Size: CASIA_2011e801466110 (3852KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.