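Data Mining in Social Tagging Systems and Social Networks (社会标签系统和社会网络中的数据挖掘)
Alternative title: Knowledge Discovery from Social Collaborative Tagging Systems and Social Networks
Author: 李慧倩
Degree type: Master of Engineering
Advisors: 王飞跃; 曾大军
2008-01-14
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree discipline: Control Theory and Control Engineering
Keywords: Tag; Social Networks; Complex Networks; Recommendation; Bayes; Percolation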
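Abstract: As typical applications of Web 2.0 technology, tagging systems and social networks have attracted wide attention from both industry and the research community. Tags are information that users contribute spontaneously and arbitrarily to classify and annotate online content, and they are widely believed to greatly improve the efficiency of information retrieval and to draw more users into deeper use of Internet resources. This self-organized data, however, poses great challenges for researchers. After a first round of preliminary work, from data analysis, through mining the semantic structure of tags, to using tag information to support recommendation and retrieval, researchers found that their understanding of the data in tagging systems was still incomplete, so the use of tags has been limited to simple methods. For social networks, basic research in sociology and physics is already rich. However, two questions have not yet been matched directly with methods: how to extract the motives and mechanisms behind connections between people from the activity information that virtual social networks provide, and how social networks help information spread.
Building on existing research, this thesis addresses these problems through three explorations:
1. Tagging data and social network data are studied with the statistical analysis methods of complex networks. These methods are adapted to the tripartite graph that is specific to tagging data. Based on the statistical results, the thesis discusses in depth the latent regularities that tagging data exhibits and their value for applications.
2. The role of tags in recommender systems is evaluated comprehensively from two angles: tags as features and tags as intermediaries in user browsing. A user browsing model based on the tripartite graph is proposed. Using similarity computed by random walks, it incorporates into recommendation the different behavior patterns by which users look for potential resources through tags, and it achieves good results.
3. A percolation model is used to verify that tags propagate in social networks and to characterize how they propagate. An importance parameter based on information conduction is defined and used to find key users in the social network. Social networks are introduced into recommendation through a graph model for the first time.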
Alternative abstract: Collaborative tagging systems and social networks, both representative Web 2.0 applications, have attracted attention from industry and academia alike. Both communities regard tags as a user-contributed classification method and believe that tags can be used to improve information retrieval on the Web and to attract more people to information sharing. Nevertheless, the complex nature of tagging data poses great challenges for research. After a first round of preliminary work, ranging from data analysis and semantic structure mining to the application of tags in Web page recommendation and ranking, researchers found that more work is needed for a deep understanding of user tagging activity. Moreover, the application of tagging data remains at a preliminary level because of the lack of insight into the data. In the social network area, research in social science and physics has established a solid background and frameworks. However, these methodologies cannot directly answer how to define the mechanism of social network construction from users' open online activities, or how to make user connections improve knowledge diffusion. Building on existing work on collaborative tagging and social networks, we try to answer these questions from three aspects:
1. We analyze the data of social collaborative tagging systems and social networks; in particular, a new measure is proposed on the tripartite graph of tagging data. The statistical results reveal potential underlying principles of user activity and their usefulness for further applications.
2. We apply tagging information to Web page recommendation from two angles: tags as features and tags as user navigation terminals. A tripartite-graph-based user navigation model is proposed, which serves as a framework for random-walk-based similarity calculation that brings tagging activity into recommendation. This framework is flexible in that its structure can be changed to model different types of user navigation.
3. We evaluate the diffusion of tags in the social network using models from percolation and complex network theory. Based on diffusion ability, we define importance measures in the social network to identify influential users. Furthermore, we take a first step toward introducing social networks into graph-model-based Web page recommendation, to evaluate how efficiently user networks improve information discovery.
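The random-walk-based similarity on the tripartite graph mentioned in item 2 can be illustrated with a small sketch. This is not the thesis's model, which the abstract does not specify. It is a generic random walk with restart, under the assumption that each (user, tag, resource) annotation links the user to the tag and the tag to the resource. The function names, the restart probability of 0.15, and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_tripartite_graph(annotations):
    """Build an undirected graph from (user, tag, resource) triples.

    Assumption: users connect to tags and tags connect to resources,
    so a walker can reach a resource only by passing through a tag.
    """
    graph = defaultdict(set)
    for user, tag, resource in annotations:
        u, t, r = ("user", user), ("tag", tag), ("resource", resource)
        graph[u].add(t)
        graph[t].add(u)
        graph[t].add(r)
        graph[r].add(t)
    return graph


def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Iterate the visiting probabilities of a walker that jumps back
    to `start` with probability `restart` at every step."""
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in graph}
        for node, mass in scores.items():
            if mass == 0.0:
                continue
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart) * mass / len(graph[node])
            for neighbour in graph[node]:
                updated[neighbour] += share
        updated[start] += restart
        scores = updated
    return scores


def recommend(annotations, user, top_n=3):
    """Rank resources the user has not tagged by their walk score."""
    graph = build_tripartite_graph(annotations)
    scores = random_walk_with_restart(graph, ("user", user))
    seen = {r for u, _, r in annotations if u == user}
    candidates = [
        (score, name)
        for (kind, name), score in scores.items()
        if kind == "resource" and name not in seen
    ]
    return [name for _, name in sorted(candidates, reverse=True)[:top_n]]


# Hypothetical sample data, for illustration only.
annotations = [
    ("alice", "python", "page1"),
    ("alice", "graphs", "page2"),
    ("bob", "python", "page3"),
    ("bob", "python", "page1"),
    ("carol", "cooking", "page4"),
]

print(recommend(annotations, "alice"))
```

In this sample, page3 ranks ahead of page4 for alice, because page3 shares the tag "python" with her, while page4 sits in a disconnected part of the graph and receives no probability mass. Changing which node types are linked is one way a sketch like this could model different navigation behaviors.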
Call number: XWLW1340
Other identifier: 200528014628011
Language: Chinese
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/7469
Collection: Graduates_Master's Theses
Recommended citation
GB/T 7714
李慧倩. 社会标签系统和社会网络中的数据挖掘[D]. 中国科学院自动化研究所. 中国科学院研究生院,2008.
Files in this item
File name/size: CASIA_20052801462801 (4757KB)
Access type: Not yet open
License: CC BY-NC-SA