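Research on Key Technologies of Collaborative Question Answering Systems (协作式问答系统关键技术研究)
Alternative title: Research on Collaborative Question Answering System
Author: Liu Mingrong (刘明荣)
Degree type: Doctor of Engineering
Advisor: Yang Qing (杨青)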
2010-05-30
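Degree-granting institution: Graduate University of the Chinese Academy of Sciences
Place of degree conferral: Institute of Automation, Chinese Academy of Sciences
Degree discipline: Pattern Recognition and Intelligent Systems
Keywords: Collaborative Question Answering; Question Retrieval; Answer Ranking; Answer Quality; User Modeling; User Search; Question Categorization; Prototype System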
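Abstract: A collaborative question answering (CQA) system is a web-based question answering system that serves users' needs for information exchange and knowledge sharing. CQA-related work has attracted growing attention in recent years, but many key technical problems remain to be solved. This thesis studies CQA systems in depth on large-scale real question-answer data. The main contributions are:
(1) Similar question retrieval. A word relevance model is built from a large collection of real questions. On top of this model, both the query question and the candidate questions are expanded with related words, and a method for computing the semantic similarity between questions is proposed and applied to similar question retrieval.
(2) Answer ranking. A statistical translation model measures the relevance between an answer and its question, and a relevance assumption among different candidate answers is proposed. The manifold ranking method is improved so that the answer-question relationship and the relationships among candidate answers are incorporated into the manifold ranking framework, which then ranks the candidate answers within a question-answer page.
(3) Answer quality determination. The thesis considers shallow textual features of the answer, features of the relationship between the answer and the question, features of the answer provider, and answer context features. These four feature types are incorporated into a linear regression model that determines whether an answer is a high-quality answer to the question.
(4) Best answerer search for questions. The thesis combines a language model with the LDA topic model to model user interests, analyzes user prior information such as user authority and user activity, and integrates them into a unified probabilistic framework to search for the best answerers of new questions in a CQA system.
(5) CQA prototype system. The thesis designs a new CQA prototype system, studies question classification methods, and uses large-scale question-answer data to implement two modules of the prototype: question-answer search and user search.
Keywords: collaborative question answering, similar question retrieval, answer ranking, answer quality determination, user modeling, user search, question classification, prototype system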
Alternative abstract: Collaborative question answering (CQA) services such as Yahoo! Answers and Sina Iask have become increasingly popular in recent years as platforms for people to share knowledge and search for information online. However, relatively little work has been done on CQA systems compared with other information retrieval systems. In this thesis, we investigate several key problems in CQA systems. The main contributions are as follows:
(1) Similar question retrieval. A word relevance model is trained on the whole question archive, which consists of millions of natural language questions posted by users on the web. A novel method for calculating similarities between questions is then proposed, using the word relevance model for question expansion.
(2) Answer ranking within a question-answering thread. Relations between a question and its candidate answers are built using the statistical translation model, and inter-answer similarities are also calculated. Manifold ranking is used to propagate ranking scores among the question and its answers. After propagation, each answer has a ranking score, and the candidate answers are sorted by these scores.
(3) Quality determination of user-generated answers. Four types of features are extracted to describe answers: surface linguistic patterns, question-answer relationships, the answer provider's features, and structural context features. These features are incorporated into a linear regression model to determine the quality of the answers.
(4) User search for new questions. The interests of answerers are modeled by tracking their answering history. The relationship between an answerer and a new question is measured with a language model and the LDA topic model. User authority and user activity are also taken into consideration. A probabilistic framework combines all of this user information to predict the best answerers for new questions.
(5) A new CQA prototype system. A CQA prototype system is designed, and several of its key modules are implemented, including question-answer retrieval and user search for new questions. We also study the problem of question categorization by comparing two classification models.
Key Words: collaborative question answering, question retrieval, answer ranking, answer quality, user modeling, user search, question categorization, prototype system
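The rank propagation described in contribution (2) can be illustrated with a minimal sketch of standard manifold ranking. It is not the improved variant developed in the thesis, whose details the abstract does not give. The function names, the damping value `alpha`, and the toy affinity matrix are assumptions made for illustration; in the thesis, question-answer relevance comes from a statistical translation model rather than hand-set numbers.

```python
import math


def manifold_rank(affinity, query_index=0, alpha=0.85, iterations=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y and return the scores.

    affinity: symmetric matrix (list of lists) of non-negative similarities
    with a zero diagonal. Node `query_index` is the question; all other
    nodes are candidate answers. (Hypothetical interface for illustration.)
    """
    n = len(affinity)
    # Node degrees; an isolated node gets degree 1 to avoid division by zero.
    degree = [sum(row) or 1.0 for row in affinity]
    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    s = [[affinity[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
         for i in range(n)]
    # Initial scores: only the question node carries a score.
    y = [1.0 if i == query_index else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


def rank_answers(affinity, query_index=0):
    """Return answer node indices sorted by descending propagated score."""
    scores = manifold_rank(affinity, query_index)
    answers = [i for i in range(len(affinity)) if i != query_index]
    return sorted(answers, key=lambda i: scores[i], reverse=True)


# Toy data (made up): node 0 is the question, nodes 1-3 are candidate answers.
toy_affinity = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.5, 0.1],
    [0.2, 0.5, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
print(rank_answers(toy_affinity))  # prints [1, 2, 3]
```

In this toy graph, answer 2 is only weakly tied to the question directly but benefits from its similarity to the strongly relevant answer 1, which is the effect that inter-answer similarities are meant to capture.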
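Call number: XWLW1498
Other identifier: 200718014628051
Language: Chinese
Document type: Dissertation
Item identifier: http://ir.ia.ac.cn/handle/173211/6263
Collection: Graduates / Doctoral dissertations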
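Recommended citation
GB/T 7714
Liu Mingrong. Research on Key Technologies of Collaborative Question Answering Systems [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of the Chinese Academy of Sciences, 2010.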
Files in this item
File name/size: CASIA_20071801462805 (2399KB)
Access type: Not currently open
License: CC BY-NC-SA