Approximate pairwise clustering for large data sets via sampling plus extension
Wang, Liang [1]; Leckie, Christopher [2]; Kotagiri, Ramamohanarao [2]; Bezdek, James [2]
Source Publication: PATTERN RECOGNITION
2011-02-01
Volume: 44, Issue: 2, Pages: 222-235
Subtype: Article
Abstract: Pairwise clustering methods have shown great promise for many real-world applications. However, the computational demands of these methods make them impractical for use with large data sets. The contribution of this paper is a simple but efficient method, called eSPEC, that makes clustering feasible for problems involving large data sets. Our solution adopts a "sampling, clustering plus extension" strategy. The methodology starts by selecting a small number of representative samples from the relational pairwise data using a selective sampling scheme; then the chosen samples are grouped using a pairwise clustering algorithm combined with local scaling; and finally, the label assignments of the remaining instances in the data are extended as a classification problem in a low-dimensional space, which is explicitly learned from the labeled samples using a cluster-preserving graph embedding technique. Extensive experimental results on several synthetic and real-world data sets demonstrate that the method both makes approximate clustering of large data sets feasible and accelerates clustering of loadable data sets. (C) 2010 Elsevier Ltd. All rights reserved.
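The "sampling, clustering plus extension" strategy described in the abstract can be illustrated with a minimal sketch. This is not the eSPEC implementation. It makes several simplifying assumptions that are not taken from the paper: the data are feature vectors rather than relational pairwise data, uniform random sampling stands in for the selective sampling scheme, and a nearest-centroid rule in the original feature space stands in for classification in a learned graph embedding. The local scaling step follows the common self-tuning form, in which each point's scale is its distance to its `knn`-th nearest neighbour. All function and parameter names (`sample_cluster_extend`, `knn`, `n_samples`) are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X, Y):
    # Squared Euclidean distances between rows of X and rows of Y.
    d = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.maximum(d, 0.0)

def kmeans(Z, k, rng, n_iter=50):
    # Plain Lloyd's k-means with farthest-point initialisation.
    centers = [Z[rng.integers(len(Z))]]
    for _ in range(1, k):
        d = pairwise_sq_dists(Z, np.array(centers)).min(axis=1)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        labels = pairwise_sq_dists(Z, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return pairwise_sq_dists(Z, centers).argmin(axis=1)

def spectral_local_scaling(S, k, rng, knn=7):
    # Spectral clustering of the sample S with a locally scaled affinity.
    D2 = pairwise_sq_dists(S, S)
    # Local scale: distance to the knn-th nearest neighbour (index 0 is self).
    sigma = np.sqrt(np.sort(D2, axis=1)[:, knn])
    A = np.exp(-D2 / (sigma[:, None] * sigma[None, :] + 1e-12))
    np.fill_diagonal(A, 0.0)
    inv_sqrt_deg = 1.0 / np.sqrt(A.sum(axis=1) + 1e-12)
    L = inv_sqrt_deg[:, None] * A * inv_sqrt_deg[None, :]
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    V = vecs[:, -k:]                     # k leading eigenvectors
    V = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    return kmeans(V, k, rng)

def sample_cluster_extend(X, k, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1 (assumption): uniform random sampling, not selective sampling.
    idx = rng.choice(len(X), size=n_samples, replace=False)
    S = X[idx]
    # Step 2: cluster only the small sample.
    sample_labels = spectral_local_scaling(S, k, rng)
    # Step 3 (assumption): extend labels with a nearest-centroid classifier.
    centroids = np.array([S[sample_labels == j].mean(axis=0) for j in range(k)])
    return pairwise_sq_dists(X, centroids).argmin(axis=1)
```

In this sketch the eigendecomposition is computed only on the `n_samples` by `n_samples` affinity matrix, and each remaining point costs one distance computation per cluster, which is the source of the savings the abstract refers to. Calling `sample_cluster_extend(X, k=2, n_samples=40)` on an array `X` of shape `(n, d)` returns one integer label per row.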
Keyword: Pairwise Data ; Selective Sampling ; Spectral Clustering ; Graph Embedding ; Out-of-sample Extension
WOS Headings: Science & Technology ; Technology
WOS Keyword: ALGORITHMS
Indexed By: SCI
Language: English
WOS Research Area: Computer Science ; Engineering
WOS Subject: Computer Science, Artificial Intelligence ; Engineering, Electrical & Electronic
WOS ID: WOS:000284446200005
Citation statistics
Cited Times: 31 [WOS]
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/9735
Collection: Center for Research on Intelligent Perception and Computing
Affiliation: 1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
2. Univ Melbourne, Dept Comp Sci & Software Engn, Parkville, Vic 3010, Australia
First Author Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Recommended Citation
GB/T 7714: Wang, Liang, Leckie, Christopher, Kotagiri, Ramamohanarao, et al. Approximate pairwise clustering for large data sets via sampling plus extension[J]. PATTERN RECOGNITION, 2011, 44(2): 222-235.
APA: Wang, Liang, Leckie, Christopher, Kotagiri, Ramamohanarao, & Bezdek, James. (2011). Approximate pairwise clustering for large data sets via sampling plus extension. PATTERN RECOGNITION, 44(2), 222-235.
MLA: Wang, Liang, et al. "Approximate pairwise clustering for large data sets via sampling plus extension." PATTERN RECOGNITION 44.2 (2011): 222-235.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.