Research on Key Technologies of Event Detection for Unstructured Text


	Research on Key Technologies of Event Detection for Unstructured Text
	Liu Shulin (刘树林)
	2017-05-24
Degree type	Doctor of Engineering
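Chinese abstract	In contemporary society, the Internet has become an indispensable part of most people's daily lives, bringing great convenience to work, study and life. The Internet holds a large amount of unstructured electronic text, such as news, blogs, emails and chat logs. With the rapid development of Internet technology, the total amount of information online is growing explosively. Against this background, helping people understand these data and quickly discover target knowledge in massive text collections, so as to reduce their learning cost, has become an urgent problem. Information extraction was proposed to solve exactly this problem; it extracts entities, relations and events. Among these, events are the research target of this dissertation. Event detection is an important step of event extraction, and its quality directly affects the overall performance of event extraction. The goal of the task is to identify, from unstructured text, the types of the events described and to mark the word or phrase that best indicates the occurrence of the target event (such words or phrases are called event triggers). Event detection plays an important role in helping people manage and use massive Internet information, and has potential applications in information retrieval, event tracking, public opinion monitoring, question answering, text summarization and other areas. Event detection has therefore received wide attention in academia and industry. Because of the diversity and flexibility of natural language expression, event detection remains a challenging problem: (1) The ambiguity of event triggers is an important factor limiting detection performance. For example, "离开" (leave) can trigger either a "Movement" event or an "End-Position" event; (2) Because event structures are complex, manually annotating event information in text is expensive. Publicly available event detection data sets are therefore generally small, and data sparseness is another important challenge; (3) Events have complex structures, and how to model their structural information effectively is an important challenge for this task. This dissertation studies the above problems, and its main contributions are as follows:
1. It proposes an event detection method that fuses deep local and global information based on a Probabilistic Soft Logic model, effectively alleviating the ambiguity of event triggers. Ambiguity is a challenge common to natural language processing tasks, and event detection is no exception. To address it, this dissertation proposes a Probabilistic Soft Logic based event detection method that fuses multi-level features to disambiguate candidate triggers. The method consists of a local module and a global module, which model different kinds of features: the local module uses a classical classification model (for example, a maximum entropy model) to model high-dimensional local features, including lexical, syntactic and semantic features; the global module uses a Probabilistic Soft Logic model to model structured global features, in the form of first-order logic predicate formulas, including event-event and topic-event associations. First, the local module performs preliminary event detection on the target text with the classification model; then, the global module collects statistics from the training corpus on the correlations between events and between topics and events; finally, the model fuses the information from these modules and performs global inference to obtain the final detection result. Experimental results on a public data set verify the effectiveness of the method.
2. It proposes a method that uses the external resource FrameNet to improve event detection, effectively alleviating the data sparseness problem faced by this task. The structural similarity between frames and events motivates us to explore potential mappings between them. In this work, we use the event information expressed by all annotated example sentences under a frame to assess whether the frame can be mapped to an event type, so how to detect events in FrameNet becomes the key question. To solve it, this dissertation puts forward three hypotheses about the relation between frames and events and, based on them, proposes a method that can effectively detect events in the FrameNet corpus. Finally, the event samples detected in FrameNet are added to the original training corpus for event detection; experimental results show that the additional samples significantly alleviate data sparseness and substantially improve the recall of the model. In addition, based on these results, the dissertation analyzes in detail the mappings between specific frames and events.
3. It proposes an event detection method that incorporates event argument information through a supervised attention mechanism, significantly improving detection performance. An event is a complex structure composed of a trigger and event arguments. Argument information can provide important clues for event detection, but existing work has not modeled it effectively. To address this, the dissertation proposes a method that applies argument information directly to event detection; its basic idea is to pay particular attention to argument information during detection. To this end, it proposes a method that incorporates argument information through a supervised attention mechanism and explores how different attention strategies affect the model. In the training phase, gold attention vectors are first built from the event arguments annotated in the training set and then used as supervision to train the model's attention mechanism. Both the event detection model and the attention mechanism are thus trained under supervision. In the testing phase, the learned attention mechanism and event detection model are used to detect events in the test text. A series of experiments on a public data set verify the effectiveness of the method.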
English abstract	Today, the Internet has become an essential part of people's daily lives and brings great convenience to work, study and life. There are large numbers of electronic texts (news, blogs, emails, etc.) on the Internet, and the total amount is still growing explosively. Against this background, there is an urgent need for technologies that facilitate the acquisition of target knowledge from large-scale unstructured texts on the Internet. Information extraction aims to solve this problem by extracting entities, relations and events from texts. This dissertation focuses only on events. Event detection is a crucial part of event extraction and strongly influences its overall performance. The goal of event detection is to detect events of specified types and to identify the word or phrase that most clearly expresses the occurrence of the target event (called the event trigger). Event detection helps to manage and exploit the large-scale information on the Internet. It is potentially useful in information retrieval, event tracking, question answering, text summarization, and so on. Thus, event detection has received widespread attention in both academia and industry and is becoming an increasingly active research topic. Event detection is extremely challenging because of the diversity and flexibility of natural language: (1) The ambiguity of event triggers is one of the most important challenges of this task. For example, the word "fire" can trigger either an "Attack" event or an "End-Position" event, depending on the context; (2) The complexity of events makes it very expensive to label event instances in texts manually, so all existing data sets are small. Data sparseness is therefore another challenge of event detection; (3) An event is a complicated structure, which makes it difficult to model the structured information of events effectively.
This dissertation focuses on the above challenges, and the main achievements are as follows: 1) The ambiguity of event triggers is one of the most important challenges of event detection. To tackle this problem, we propose a Probabilistic Soft Logic based approach that exploits latent and global information in event detection. Global information, such as event-event association, and latent local information, such as fine-grained entity types, are crucial to event detection. However, existing methods typically focus on sophisticated local features such as part-of-speech tags and fully or partially ignore this information. By contrast, we focus on fully employing it for event detection. We notice that some global information, such as event-event association, is difficult for previous methods to encode. To resolve this problem, we propose a feasible approach that encodes global information in the form of logic using a Probabilistic Soft Logic model. Experimental results show that the proposed approach outperforms state-of-the-art methods. 2) Data sparseness is another challenge of event detection. To alleviate this problem, we propose to leverage FrameNet (FN) to improve event detection. Frames defined in FN share highly similar structures with events in the ACE event extraction program. An event in ACE is composed of an event trigger and a set of arguments. Analogously, a frame in FN is composed of a lexical unit and a set of frame elements, which play roles similar to those of the triggers and arguments of ACE events, respectively. Besides having similar structures, many frames in FN actually express certain types of events.
These observations motivate us to explore whether there is a good mapping from frames to event types and whether FN can be used to improve the performance of event detection. To achieve this goal, we propose a global inference based approach to detect events in FN. Further, based on the detected results, we analyze possible mappings from frames to event types. Finally, by using the events automatically detected from FN, we improve the performance of event detection and achieve a new state-of-the-art result. 3) Event arguments can provide significant clues for event detection; however, existing approaches to this task fail to use this information effectively. To make use of the annotated argument information in the training corpus, we propose a neural network based approach with supervised attention mechanisms for event detection. The basic idea of this approach is that argument words should receive more attention than common words when detecting events in texts. To achieve this goal, we introduce supervised attention mechanisms into our detection model. Moreover, we systematically investigate the proposed model under the supervision of different attention strategies. Specifically, in the training procedure, we first construct gold attentions for each trigger candidate based on the annotated arguments. Then, treating the gold attentions as supervision for the attention mechanism, we learn the attention mechanism and the event detector jointly, both in a supervised manner. In the testing procedure, we use the learned detector and attention mechanism to detect events. Experimental results show that our approach outperforms state-of-the-art methods, which demonstrates the effectiveness of the proposed method.
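The supervised attention idea in item 3 can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names are hypothetical, the gold attention is taken to be uniform over annotated argument positions (only one of several possible strategies), the attention penalty is a squared error, and the joint objective is the detector's negative log-likelihood plus a weighted attention term with an assumed weight `lam`.

```python
import math


def gold_attention(num_tokens, argument_positions):
    """Build a gold attention vector for one trigger candidate.

    Assumed strategy: spread weight uniformly over the annotated
    argument positions and give zero weight to all other tokens.
    """
    gold = [0.0] * num_tokens
    positions = [p for p in set(argument_positions) if 0 <= p < num_tokens]
    if not positions:
        return gold  # no annotated arguments: nothing to supervise
    for p in positions:
        gold[p] = 1.0 / len(positions)
    return gold


def softmax(scores):
    """Turn raw attention scores into a distribution over tokens."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_loss(gold, predicted):
    """Squared error between gold and predicted attention (assumed form)."""
    return sum((g - p) ** 2 for g, p in zip(gold, predicted))


def joint_loss(event_type_probs, gold_type, gold, predicted, lam=1.0):
    """Detection negative log-likelihood plus weighted attention penalty."""
    nll = -math.log(event_type_probs[gold_type])
    return nll + lam * attention_loss(gold, predicted)
```

For a five-token context with annotated arguments at positions 1 and 3, `gold_attention(5, [1, 3])` puts half of the weight on each argument token. A model whose predicted attention matches the gold vector pays only the detection term in `joint_loss`.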
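Keywords	Natural language processing; information extraction; event detection; event extraction; neural networks
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/14640
Collection	Graduates_Doctoral dissertations
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Liu Shulin. Research on Key Technologies of Event Detection for Unstructured Text [D]. Beijing: University of Chinese Academy of Sciences, 2017.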

Files in this item
File name/size	Document type	Version type	Access type	License
forrest_final.pdf (5102 KB)	Dissertation		Restricted access	CC BY-NC-SA