Research on Machine Reading Comprehension Methods Based on Adaptive Knowledge Selection
李泽政
2021-05-27
Pages: 76
Degree type: Master's
Chinese abstract

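Machine reading comprehension (MRC) is an important task in natural language processing and has attracted great attention from researchers in recent years. Researchers now regard it as one of the key means of measuring how well machines understand human natural language, and it is widely applied in Internet products such as next-generation search engines and intelligent customer service. Research on it therefore has significant academic and practical value.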

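In recent years, with the rapid development of deep learning and pre-trained language models, MRC has made considerable progress. Although MRC models perform well on tasks that require only simple pattern matching, they still rely on a data-driven learning paradigm and lack the common sense that humans use for reasoning. As a result, their question-answering performance drops sharply on complex reasoning questions. How to use external knowledge to strengthen the semantic understanding and reasoning ability of MRC has therefore become one of the difficult problems of the task and has drawn the attention of many researchers. Researchers have already developed a variety of knowledge-enhanced MRC models. However, these models do not examine how different types of external knowledge affect an MRC model, nor how to select adaptively among knowledge from multiple sources. This thesis proposes two solutions to these problems. Its main contributions are as follows: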

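1. To address the problem that the characteristics of external knowledge sources are ignored, this thesis proposes an MRC method that adaptively distinguishes between external knowledge sources. First, the method retrieves explicit knowledge from each external knowledge base separately and scores the knowledge from the different knowledge bases with an attention mechanism. Second, we encode the external knowledge and fuse it into the MRC model using a hard method and a soft method, respectively, and then infer the answer. A comparison with the baseline model on the ROCStories dataset demonstrates the importance and effectiveness of distinguishing between external knowledge sources.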

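2. To address the problem that knowledge from different external sources overlaps, this thesis proposes an MRC method that adaptively filters external knowledge. First, the method retrieves explicit knowledge from each external knowledge base separately and scores this knowledge with a knowledge-graph embedding model. Second, we divide the knowledge into an intersecting set and a non-intersecting set, fuse the knowledge of each set into the MRC model separately, and then infer the answer. A comparison with the baseline model on the ROCStories dataset demonstrates the necessity and effectiveness of filtering external knowledge.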
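The first contribution above (attention-based scoring of knowledge sources, followed by hard or soft fusion) can be illustrated with a minimal sketch. The abstract does not define the hard and soft methods, so the code assumes one common reading: hard fusion keeps only the highest-scoring source, and soft fusion takes the attention-weighted sum of all sources. The knowledge-base names, the vectors, and every function name are hypothetical and are not taken from the thesis.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def score_sources(context_vec, source_vecs):
    """Dot-product attention of the context over one vector per source."""
    names = list(source_vecs)
    weights = softmax([dot(context_vec, source_vecs[n]) for n in names])
    return dict(zip(names, weights))

def soft_fusion(weights, source_vecs):
    """Assumed 'soft' method: attention-weighted sum of all source vectors."""
    dim = len(next(iter(source_vecs.values())))
    fused = [0.0] * dim
    for name, w in weights.items():
        for i, x in enumerate(source_vecs[name]):
            fused[i] += w * x
    return fused

def hard_fusion(weights, source_vecs):
    """Assumed 'hard' method: keep only the highest-scoring source."""
    best = max(weights, key=weights.get)
    return list(source_vecs[best])

# Toy data: hypothetical knowledge bases, each summarised by one vector.
context = [1.0, 0.0]
sources = {"kb_a": [2.0, 0.0], "kb_b": [0.0, 2.0]}

weights = score_sources(context, sources)
soft = soft_fusion(weights, sources)
hard = hard_fusion(weights, sources)
```

In a real model the fused vector would be passed on to the answer-prediction layer; here it only shows how the two fusion styles differ for the same attention weights.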

English abstract

As an important task in natural language processing, machine reading comprehension has received great attention from researchers in recent years. Researchers now regard it as one of the important means of measuring how well machines understand human natural language, and it is widely applied in next-generation search engines, intelligent customer service, and other Internet products. Research on it therefore has important academic significance and application value.

In recent years, with the rapid development of deep learning and pre-trained language models, machine reading comprehension has made considerable progress. Although machine reading comprehension models perform well on tasks that involve simple pattern matching, they are still based on the data-driven learning paradigm and do not have the common sense that humans rely on for reasoning. Therefore, when they face complex reasoning questions, their question-answering performance drops sharply. How to use external knowledge to enhance the semantic understanding and reasoning ability of machine reading comprehension has become one of the difficult problems of this task, and researchers have developed a variety of knowledge-enhanced machine reading comprehension models. However, these models do not discuss how different types of external knowledge influence the machine reading comprehension model, or how to select adaptively among knowledge from multiple sources. This thesis proposes two solutions to these problems. The main contributions are as follows:

1. To address the neglect of the characteristics of external knowledge sources, this thesis proposes a machine reading comprehension method that adaptively distinguishes between external knowledge sources. First, the method retrieves explicit knowledge from the different external knowledge bases and then uses an attention mechanism to score the knowledge from each knowledge base. Second, we use a hard method and a soft method to incorporate the encoded external knowledge into the machine reading comprehension model and then infer the answer. A comparison with the baseline model on the ROCStories dataset demonstrates the importance and effectiveness of distinguishing between external knowledge sources.

2. To address the overlap between knowledge drawn from different external knowledge bases, this thesis proposes a machine reading comprehension method that adaptively filters external knowledge. First, the method retrieves explicit knowledge from each external knowledge base and then scores the knowledge with a knowledge-graph embedding model. Second, we divide the knowledge into an intersecting set and a non-intersecting set, incorporate the knowledge of each set into the machine reading comprehension model separately, and then infer the answer. A comparison with the baseline model on the ROCStories dataset demonstrates the necessity and effectiveness of filtering external knowledge.
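The second contribution above (scoring retrieved knowledge with a graph embedding model, then splitting it into an intersecting and a non-intersecting set) can be sketched as follows. The abstract names neither the embedding model nor the exact set definitions, so this sketch makes two assumptions: triples are scored with a TransE-style distance, and the intersecting set holds the triples found in every knowledge base. The knowledge-base names, triples, and embeddings are hypothetical toy values.

```python
import math

def transe_score(head, rel, tail):
    """TransE-style plausibility (assumed model): the closer head + rel
    is to tail, the higher (nearer to zero) the score."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, rel, tail)))

def partition(triples_by_source):
    """Split triples into those present in every source and all the others."""
    sets = [set(triples) for triples in triples_by_source.values()]
    shared = set.intersection(*sets)
    non_shared = set.union(*sets) - shared
    return shared, non_shared

def rank(triples, emb):
    """Order triples from most to least plausible under the embedding."""
    return sorted(
        triples,
        key=lambda t: transe_score(emb[t[0]], emb[t[1]], emb[t[2]]),
        reverse=True,
    )

# Toy data: two hypothetical knowledge bases with one overlapping triple.
retrieved = {
    "kb_a": [("rain", "causes", "wet"), ("umbrella", "used_for", "rain")],
    "kb_b": [("rain", "causes", "wet"), ("rain", "causes", "umbrella")],
}
# Hypothetical 2-dimensional embeddings for entities and relations.
emb = {
    "rain": [0.0, 0.0],
    "wet": [1.0, 0.0],
    "umbrella": [0.0, 1.0],
    "causes": [1.0, 0.0],
    "used_for": [0.0, -1.0],
}

shared, non_shared = partition(retrieved)
ranked_non_shared = rank(non_shared, emb)
```

Each set could then be encoded and fused into the reading comprehension model separately, as the abstract describes; the ranking shows how an embedding score might be used to prioritise triples within a set.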

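Keywords: machine reading comprehension; knowledge enhancement; adaptive selection
Language: Chinese
Seven major directions (sub-direction classification): Natural language processing
Document type: Thesis
Item identifier: http://ir.ia.ac.cn/handle/173211/44813
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Natural Language Processing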
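Recommended citation
GB/T 7714
李泽政. Research on Machine Reading Comprehension Methods Based on Adaptive Knowledge Selection [D]. Institute of Automation, Chinese Academy of Sciences, 2021.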
Files in this item
File name/size: 李泽政硕士大论文.pdf (4562KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA