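Research on Textual Entailment Recognition Methods Integrating Multi-Level Linguistic Information (融合多层次语言信息的文本蕴涵识别方法研究)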


	Research on Textual Entailment Recognition Methods Integrating Multi-Level Linguistic Information
	杜倩龙
	2020-11-25
Pages	128
Degree type	Doctoral
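Abstract (translated from the Chinese)	Textual entailment recognition aims to understand the semantics of texts through various analytical means and to infer the semantic entailment relation between texts. The task effectively addresses the problems of diversity and ambiguity in language expression, and it is widely used in natural language processing tasks such as question answering, information extraction, automatic summarization, and machine translation. Research on textual entailment recognition therefore has important theoretical significance and research value.

Early methods mainly relied on text transformation, logical inference, and traditional statistical methods to judge the semantic relation between texts. These methods, however, require constructing a large number of entailment rules and features, which greatly limited progress in the field. In recent years, as deep learning research has advanced, researchers have proposed textual entailment recognition methods based on deep neural networks. Because such methods do not require manually building large numbers of entailment rules and features, they greatly reduce the complexity of textual entailment recognition systems while achieving good results, and so they have attracted growing attention.

Most existing neural methods focus on pre-training models and word embeddings on larger-scale data, or on modeling text semantics with more complex network architectures. They overlook, however, how multi-level linguistic information can help textual entailment recognition. Such information includes word sense information, dependency syntactic structure information, and inter-sentence semantic similarity information. These different levels of linguistic information can improve the semantic representation of texts and the inference of semantic relations between them, and thereby improve recognition. Accordingly, this dissertation studies how to use different levels of linguistic information to improve the inference performance of textual entailment recognition systems, focusing on how to integrate lexical-level word sense information, dependency structure information between words, and inter-sentence semantic similarity information. The main contributions and innovations are as follows:

1. A textual entailment recognition method that integrates word sense information. Textual entailment recognition is directed inference over the semantic relation between two texts, and word senses play an important role in understanding text semantics and inferring entailment between texts. This dissertation therefore first analyzes how different types of information affect word sense and proposes a word sense disambiguation model that integrates syntactic structure information and contextual information, so as to better extract word senses. Building on this model, it proposes a textual entailment recognition method that integrates word sense information. The method uses the word sense disambiguation model to determine the sense of each word in the input text, converting the original word sequence into the corresponding sense sequence. It then uses the word sense information and the associations between senses across the two texts to improve the semantic representation of the texts and the inference of their semantic relation. Experiments show that the method effectively improves the inference performance of the textual entailment recognition system and thus raises recognition accuracy.

2. A textual entailment recognition method that integrates dependency syntactic structure information. Existing neural methods often ignore the structural dependency relations between words, or introduce a large amount of irrelevant information because they adopt tree-structured networks. To avoid these problems, this dissertation proposes a method that integrates dependency syntactic structure information. For each word, the method extracts its head word and the corresponding dependency relation from the dependency structure to form a word-dependency triplet, so that the input text is represented as a set of word-dependency triplets. It then uses the triplets in these sets to align and compare semantic information between the texts. In addition, unlike existing models, which compare each word in one text with the overall representation of the other text, this dissertation proposes comparing the word-dependency triplets of the two texts directly and independently. On this basis, it further proposes using local context information to improve the semantic representation of the word nodes in the triplets, which further improves inference performance. Experiments show that the proposed method makes effective use of dependency syntactic structure information to improve system performance and offers good interpretability.

3. A textual entailment recognition method that integrates inter-sentence semantic similarity. When aligning words, existing methods tend to focus only on improving the vector representations of words and ignore the effect of contextual semantic similarity between the texts on word alignment. To address this, this dissertation proposes a method that integrates inter-sentence semantic similarity. During word alignment, the method considers both word sense similarity and the similarity of contextual semantics between the texts, which improves the alignment of semantic information between texts. Furthermore, existing methods fuse the local entailment information associated with different words using equal weights. In practice, the local entailment information of different words influences the final result differently. To weaken the influence of local entailment information from noisy words on the final prediction while strengthening that of key words, this dissertation further proposes a selection gate mechanism that filters the local entailment information of all words. Experiments show that the proposed method is highly robust and offers good interpretability.

In summary, this dissertation investigates in depth how to use different levels of linguistic information to improve textual entailment recognition. It separately studies the roles of word sense information, dependency syntactic structure information between words, and inter-sentence semantic similarity information in semantic representation and in the inference of semantic relations between texts, and it proposes a series of related models that effectively improve the accuracy of textual entailment recognition.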
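The word-dependency triplet representation described in the second contribution above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `build_triplets` and `triplet_overlap`, the hand-written dependency parses, and above all the exact-match comparison, which is only a stand-in for the learned neural comparison the dissertation describes. It shows the data structure, not the author's model.

```python
from typing import List, Tuple

# (word, dependency relation, head word); a hypothetical layout for illustration
Triplet = Tuple[str, str, str]


def build_triplets(tokens: List[str], heads: List[int],
                   relations: List[str]) -> List[Triplet]:
    """Build one triplet per word from a dependency parse.

    heads[i] is the 1-based index of the head of tokens[i]; 0 marks the root.
    """
    if not (len(tokens) == len(heads) == len(relations)):
        raise ValueError("tokens, heads and relations must have equal length")
    triplets = []
    for word, head, rel in zip(tokens, heads, relations):
        head_word = "<ROOT>" if head == 0 else tokens[head - 1].lower()
        triplets.append((word.lower(), rel, head_word))
    return triplets


def triplet_overlap(premise: List[Triplet], hypothesis: List[Triplet]) -> float:
    """Fraction of hypothesis triplets found verbatim in the premise.

    Exact matching is an assumption of this sketch; a neural model would
    compare triplet embeddings instead.
    """
    if not hypothesis:
        return 0.0
    premise_set = set(premise)
    matched = sum(1 for t in hypothesis if t in premise_set)
    return matched / len(hypothesis)


# Hand-written parses (hypothetical); a real system would call a parser.
premise = build_triplets(["A", "man", "plays", "guitar"],
                         [2, 3, 0, 3], ["det", "nsubj", "root", "obj"])
hypothesis = build_triplets(["A", "man", "plays", "music"],
                            [2, 3, 0, 3], ["det", "nsubj", "root", "obj"])
print(triplet_overlap(premise, hypothesis))
```

Three of the four hypothesis triplets occur in the premise, so the overlap is 0.75; the unmatched triplet `("music", "obj", "plays")` points at the word pair a model would need to reason about, which is the sense in which triplet-level comparison is easy to inspect.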
English abstract	Textual entailment recognition is the task of detecting whether a given text passage can be inferred from another text passage. As a pivotal and fundamental task in natural language processing, it has applications as varied as question answering, information extraction, automatic text summarization, and machine translation. Studying textual entailment recognition therefore has important theoretical significance and application value.

Most earlier approaches adopted traditional rule-based methods or statistical frameworks for this task. Because those methods need many entailment rules and features, they greatly limited progress in textual entailment recognition. In recent years, with the advance of deep learning, many neural network based approaches have been proposed. These approaches do not require humans to construct entailment rules and features, which greatly reduces the difficulty of building textual entailment recognition systems, and they have reached state-of-the-art performance. As a result, neural network based approaches have quickly become the most active research area in textual entailment recognition.

Most current neural network based approaches either construct complicated architectures for representing the given passage pairs or pre-train models and word embeddings on larger datasets. However, they ignore the importance of word sense information, syntactic structural information, and inter-sentence semantic similarity. This linguistic information is very important for understanding the passages and for predicting the relationship between the passages of a pair. Therefore, this dissertation focuses on how to use different linguistic information to improve the performance of textual entailment recognition. The main contributions of this dissertation are summarized as follows:

1. Incorporating word sense information into textual entailment recognition. Textual entailment recognition determines whether the meaning of one passage can be inferred from the meaning of another passage. During inference, the senses of words play an important role in understanding the meaning of the passages and predicting the relationship of the passage pair. Consequently, this dissertation first analyzes the effects of different linguistic information on inferring word sense, and then proposes a word sense disambiguation approach to better extract word sense information for textual entailment recognition. Based on this work, we propose a textual entailment recognition approach that incorporates word sense information. In this approach, we first use a word sense disambiguation system to generate the sense of each word, and then use the word sense information to improve the representations of the passages and the inference over the passage pairs. Experimental results show that our approach improves performance effectively.

2. Incorporating syntactic structural information into textual entailment recognition. Most previous neural network based approaches either ignore the syntactic dependencies among words or directly use tree-structured neural networks to generate passage representations, which introduces irrelevant information. To overcome these problems, we propose a new approach that incorporates syntactic structural information into textual entailment recognition. In this approach, we first build a word-dependency triplet for each word in the passages by extracting its head word and the corresponding dependency relation from the syntactic structural information. In this way, we convert the input passage pair into two sets of word-dependency triplets and then directly compare these two sets. Specifically, instead of comparing each triplet from one passage with the merged information of the other passage, we propose to perform the comparison directly among the triplets of the given passage pair, which makes the judgment more interpretable. Furthermore, we propose to enhance the pre-trained word embeddings in the triplets with their associated local contexts. Experimental results show that our approach performs better than most approaches that use tree structures and is more interpretable.

3. Incorporating inter-sentence semantic similarity into textual entailment recognition. To align the tokens of the passage pairs, existing textual entailment recognition approaches usually focus on improving the word representations but ignore the importance of the inter-sentence semantic similarity of contexts. Furthermore, most of them weight the various local decisions uniformly when aggregating them into the global judgment, although local decisions related to different tokens can influence the final decision differently. To handle these problems, we propose an enhanced alignment mechanism that jointly considers token content similarity and the inter-sentence semantic similarity of the contexts. We also propose a selection gate mechanism that weights local decisions differently. Experimental results show that our performance is comparable to state-of-the-art approaches while better mimicking human behavior, which makes the approach more interpretable.

In summary, this dissertation focuses on how to make use of different linguistic information to improve the performance of textual entailment recognition. We separately adopt word sense information, syntactic structural information, and inter-sentence semantic similarity to improve the representations of the passages and the comparisons among passages. The results show that our approaches significantly improve the performance of textual entailment recognition.
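The enhanced alignment and selection gate ideas in the third contribution can be sketched as follows. This is a hedged illustration only: the abstract does not say how token similarity and context similarity are combined or how the gate is parameterized, so the convex combination with weight `alpha`, the sigmoid gate, and all function names (`alignment_weights`, `gated_aggregate`) are assumptions of this sketch, not the dissertation's formulation.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def alignment_weights(token_vec, context_vec, other_token_vecs,
                      other_context_vecs, alpha: float = 0.5) -> List[float]:
    """Attention weights of one token over the tokens of the other passage.

    Each score mixes token similarity with context similarity; the mixing
    weight alpha is a hypothetical parameter introduced for this sketch.
    """
    scores = [alpha * cosine(token_vec, t) + (1.0 - alpha) * cosine(context_vec, c)
              for t, c in zip(other_token_vecs, other_context_vecs)]
    return softmax(scores)


def gated_aggregate(local_scores: Sequence[float],
                    gate_logits: Sequence[float]) -> float:
    """Gate-weighted average of local entailment scores.

    A sigmoid gate near 0 suppresses a noisy token's local decision and a
    gate near 1 keeps it, in contrast to uniform weighting.
    """
    gates = [1.0 / (1.0 + math.exp(-g)) for g in gate_logits]
    return sum(g * s for g, s in zip(gates, local_scores)) / sum(gates)
```

With equal gate logits the aggregate reduces to the uniform average that the abstract attributes to existing approaches; raising the logit of a key token and lowering that of a noisy one pulls the global score toward the key token's local decision.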
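Keywords	textual entailment recognition ; word sense disambiguation ; dependency syntactic structure information ; natural language inference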
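Subject area	Computer science and technology ; Artificial intelligence ; Natural language processing
Language	Chinese
Seven major directions: sub-direction classification	Natural language processing
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/42482
Collection	State Key Laboratory of Multimodal Artificial Intelligence Systems_Natural Language Processing
Corresponding author	杜倩龙
Recommended citation (GB/T 7714)	杜倩龙. 融合多层次语言信息的文本蕴涵识别方法研究[D]. 中国科学院自动化研究所. 中国科学院自动化研究所,2020.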

Files in this item
File name/size	Document type	Version type	Access type	License
Doctoral dissertation, 杜倩龙, 2020-12-7 (7524KB)	Dissertation		Open access	CC BY-NC-SA