Institutional Repository of Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
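|Place of Conferral||Institute of Automation, Chinese Academy of Sciences|
|Keyword||Machine Reading Comprehension; Document Modeling; Sentence Relations; Document Scene; Document Topic|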
Language understanding is an important manifestation of cognitive intelligence and a long-standing, challenging goal of natural language processing. To evaluate a system's language comprehension ability more flexibly and comprehensively, researchers proposed the Machine Reading Comprehension (MRC) task. Formally, the task is text-based question answering: given a document, the machine must answer questions related to that document.
Thanks to advances in deep learning and the growing scale of annotated data, the MRC task has developed rapidly over the past few years. Models have begun to acquire basic text comprehension skills, especially word understanding and sentence-level matching. However, because they focus only on modeling words or sentences, existing methods remain severely limited in scenarios that require modeling the overall information of the given document. The experimental manifestation is that in certain test
1. An MRC method based on sentence relation modeling
Existing MRC methods often ignore sentence relations, one kind of document-level information, during modeling. As a result, the overall semantics of the document are modeled insufficiently, which in turn impairs answer reasoning. To address this, this paper proposes a graph-based method that models sentence relations from multiple perspectives. The method takes the sentences of the document as nodes and uses graph structure to describe the relationships between them. On the one hand, it builds relation graphs statically, from the perspectives of topic relevance, semantic similarity, and distance within the document. On the other hand, to capture relationships between sentences that these pre-designed perspectives do not cover, it also introduces a dynamic graph-building method. Experiments are conducted on answer sentence selection, a subtask of MRC. On top of the sentence relation modeling module, two types of answer sentence selection model are built, one using a weak underlying representation and one using a strong underlying representation. The results demonstrate the effectiveness and versatility of the proposed sentence relation modeling method. Moreover, an adversarial test shows the robustness of the method.
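The static graph construction described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the bag-of-words encoder, the `threshold` and `window` parameters, and all function names are assumptions chosen for illustration, and the topic-relevance and dynamic graphs are omitted because the abstract does not say how they are built.

```python
import math
from collections import Counter


def bow(sentence):
    """Lowercased bag-of-words counts (assumed stand-in for a learned sentence encoder)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_graph(sentences, threshold=0.3):
    """Adjacency matrix linking sentences whose similarity reaches the (hypothetical) threshold."""
    vecs = [bow(s) for s in sentences]
    n = len(sentences)
    return [[1 if i != j and cosine(vecs[i], vecs[j]) >= threshold else 0
             for j in range(n)] for i in range(n)]


def distance_graph(sentences, window=1):
    """Adjacency matrix linking sentences at most `window` positions apart in the document."""
    n = len(sentences)
    return [[1 if i != j and abs(i - j) <= window else 0
             for j in range(n)] for i in range(n)]


sentences = [
    "The cat sat on the mat",
    "The cat slept on the mat",
    "Stocks fell sharply today",
]
print(similarity_graph(sentences))  # one relation graph per perspective
print(distance_graph(sentences))
```

Each perspective yields its own adjacency matrix over the same sentence nodes, so a downstream graph model could consume them separately or combine them.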
2. An MRC method based on document scene modeling
Narrative is a common text genre, and its comprehension should not be ignored in Machine Reading Comprehension. A narrative document is composed of a series of interrelated events, which is the key feature distinguishing it from other forms of text. It is therefore necessary to model narratives from the perspective of events at the document level, but existing methods generally pay little attention to this. In response, this paper proposes a document-level event scene modeling method for narrative documents. The method is inspired by human reading behavior. By introducing
Social media text is an important form of text in today's Internet era. Authors usually post messages on the assumption that readers have specific background knowledge, so the messages are generally short. As a result, such documents are weakly self-contained. An MRC model therefore struggles to understand the topics a document describes, which makes answering questions based on the document even more difficult. In such scenarios, the first problem the model needs to solve is topic modeling, but existing methods have seldom addressed it. To this end, this paper proposes an MRC method that introduces external knowledge to model the document topic. The method starts from the "topic information clustering" characteristic of social media, treats other relevant texts on social media platforms as knowledge sources from which to acquire and refine topic knowledge, and finally incorporates the topic knowledge of the document. Experimental results on a relevant public dataset show that this method improves document understanding and answer reasoning through effective document topic modeling.
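The idea of treating other relevant posts as a knowledge source can be sketched as a simple retrieval step. This is only an illustration under stated assumptions: the abstract does not specify how relevant texts are found or refined, so the Jaccard token-overlap scoring, the parameter `k`, and the function names below are all hypothetical.

```python
def tokens(text):
    """Lowercased token set with basic trailing punctuation stripped (assumed tokenizer)."""
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def jaccard(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_topic_context(document, candidates, k=2):
    """Return up to k candidate posts that share the most tokens with the document.

    Posts with no overlap are dropped. Token overlap is an assumed proxy for
    topical relevance, not the retrieval method used in the thesis.
    """
    doc = tokens(document)
    ranked = sorted(candidates, key=lambda c: jaccard(doc, tokens(c)), reverse=True)
    return [c for c in ranked[:k] if jaccard(doc, tokens(c)) > 0]


document = "#WorldCup final tonight!"
candidates = [
    "The #WorldCup final is Argentina vs France",
    "New phone released today",
    "Watching the final tonight with friends",
]
print(retrieve_topic_context(document, candidates))
```

The retrieved posts could then be passed to the reader alongside the short document as additional topic context.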
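|Tian Zhixing (田志兴). Research on Machine Reading Comprehension Technology Based on Document Modeling (基于篇章建模的机器阅读理解技术研究) [D]. Institute of Automation, Chinese Academy of Sciences, 2021.|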