Research on Key Technologies for Information Retrieval and Browsing in User-Interactive Websites (面向用户交互网站的信息检索与浏览关键技术研究)
Alternative Title: Research On Information Retrieval and Browsing in User interactive Website
Lu Dongyuan (路冬媛)
Subtype: Doctor of Engineering
Thesis Advisor: Dai Ruwei (戴汝为)
2012-05-29
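Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems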
Keyword: Human Computer Interaction; Prestige Prediction; Hierarchical Structure Construction; Mood Analysis; Personalized Search
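Abstract: With the rapid development of Internet technology, websites centered on user interaction have gradually become the mainstream platform for exchanging information. User interaction includes interaction with web entities, such as sharing information, collecting information as favorites, adding tags, and casting mood votes, as well as interaction between users, such as establishing friend relationships. These interactions greatly enrich Internet content, but they also leave users facing information overload. Therefore, in user-interactive websites (such as news sites that allow user interaction, and Flickr), a new challenge for the field of information retrieval is to fully mine the new elements provided by user interaction, study key techniques based on the features of these elements, and provide effective retrieval and browsing strategies that help users obtain the entity information they need (such as groups, news, and photos) quickly and accurately.

When users tag photos in a group, they also enrich the group's semantic information. Mining the topic structure in this semantic information and organizing groups into a hierarchy that is refined topic by topic, level by level, helps users clarify their query intent step by step through hierarchical browsing and thus locate the desired group quickly and accurately. Users' mood votes on news further enrich the sentiment information of news, which reflects the effect of news content on readers' moods. Mining the reader-mood features of news content and studying news retrieval methods that jointly consider mood, semantics, and other factors helps meet users' diverse retrieval needs. Interaction between a user and friends reflects their similar interests. Mining the associations between a user and friends makes it possible to predict the user's preferences from the friends' preferences, thereby meeting users' increasingly personalized retrieval needs. Combining multiple kinds of user interaction to mine and analyze the behavioral characteristics they reflect, and in particular to discover and predict influential users, helps users browse high-quality information selectively via influential users, thereby enriching and improving the browsing experience.

At present, research on retrieving and browsing web content mostly relies on the content itself or on users' search logs. Few studies provide retrieval and browsing strategies from the perspective of exploiting user interaction information and combining multiple elements. Taking news websites that allow user interaction and Flickr as the research background, and web data mining as the means, this thesis studies the above characteristics of user interaction. The main research content includes:

1. A method for mining and constructing a hierarchical semantic structure for Flickr groups is proposed. It organizes groups into a hierarchy refined topic by topic, so that users can progressively identify topics of interest through hierarchical browsing and quickly locate the desired group. The method is based on a hierarchical topic model: it extracts the latent, hierarchically related topic structure from the collection of groups and maps the groups onto the constructed topic hierarchy, forming a hierarchical organization of groups. Experimental results on the dataset show that the method can organize groups effectively to meet users' browsing needs.

2. A news retrieval method that incorporates reader sentiment is proposed to meet users' diverse retrieval needs. The method focuses on a news ranking algorithm based on reader mood, and also considers the semantic relevance between news content and query terms and the way a news item's importance changes over time, yielding a news retrieval method that meets user needs from multiple angles. Based on the proposed method, we designed a news retrieval system and verified the method's effectiveness and practicality.

3. A personalized retrieval model based on mining Flickr users' interests is proposed. By mining the interest associations between a user and friends, it uses friends' preferences to predict the user's preferences, thereby meeting users' personalized retrieval needs. The model uses a graph partitioning method to represent user interests in a unified latent feature space and uses a discriminative model for feature selection, so that the preferences of the current searching user are predicted from friends' preferences. Experimental analysis on the dataset shows that the method can improve user satisfaction with retrieval results. ...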
Other Abstract: With the rapid development of Web 2.0 techniques, websites built around user interaction have become a new platform for sharing information. Users interact with web objects (e.g., sharing objects, collecting objects as favorites, tagging, and casting mood votes) and with other users (e.g., adding others as friends). These actions enrich web content but also expose users to information overload. Therefore, in user-interactive websites such as Sina News and Flickr, fully exploiting the new features introduced by user interaction and investigating key methods based on these features will provide users with novel information retrieval and browsing mechanisms. With these mechanisms, users can reach the desired information (e.g., news, groups, and photos) more quickly and efficiently.

Tagging photos in groups enriches the semantic information of those groups. Mining the hidden hierarchical topic structure underlying this semantic information and organizing the groups into a hierarchy guides users to browse from broader topics to more specific ones until they reach the desired group. Casting mood votes further enriches the sentiment information of news, which reflects how the news content influences readers' moods. Incorporating mood, content, and time factors into a unified news retrieval framework would satisfy users' diverse retrieval requirements. Interactions among users reflect their similar interests. Mining the interest relationship between a user and his or her friends helps predict the user's preferences from the friends' preferences, which would satisfy users' increasingly personalized needs. Using multiple interactive actions together to detect and predict prestigious users helps people choose high-quality information, which could improve their browsing experience.

Current studies on information retrieval and browsing mainly use the content of objects or users' click-through data; using data from user interactions and incorporating multiple factors has received less attention. Therefore, based on web data mining techniques, this thesis takes Sina News and Flickr as its research background, and its main research focuses are summarized as follows. 1. We propose a novel approach on hie...
Shelf Number: XWLW1738
Other Identifier: 200918014628037
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6442
Collection: Graduates_Doctoral Dissertations
Recommended Citation
GB/T 7714
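Lu Dongyuan (路冬媛). Research on Key Technologies for Information Retrieval and Browsing in User-Interactive Websites (面向用户交互网站的信息检索与浏览关键技术研究) [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2012.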
Files in This Item:
File Name/Size: CASIA_20091801462803 (3951KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.