Research on Implicit Semantic Expression Analysis Methods for Event Dissemination
苑敏洁
2024-05-18
Pages: 94
Subtype: Master's
Abstract

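The rapid growth of social media platforms has given users more convenient and open channels for spreading event information. Event dissemination is trending toward more diverse forms of expression and more complex forms of feedback, which gives rise to implicit semantic expressions such as sarcasm and metaphor. In event dissemination analysis, mining and understanding implicit semantic expressions and their fine-grained associated elements helps predict event influence more precisely and provides accurate, efficient decision support for uncovering the latent stances and deeper demands of user groups. Drawing on recent advances in knowledge-enhanced semantic representation, heterogeneous information network modeling, and multi-task learning, this thesis studies methods for analyzing implicit semantic expressions in event dissemination from three angles: implicit expression topic and clue discovery, topic-object pair identification, and multi-dimensional associated element mining. The main work and contributions are summarized as follows: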

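1. A knowledge-enhanced prompt method for discovering implicit expression topics and clues. Implicit expressions carry opinions that are inconsistent with their surface semantics, so understanding and explaining their deeper motivations requires combining domain topics with contextual clues. To address the challenge of effectively modeling external knowledge and discovering topics and clues in an unsupervised setting, this thesis proposes a knowledge-enhanced prompt learning method. The method first masks implicit semantic sentences and clue sentences using syntactic knowledge and a sentiment lexicon. Guided by topic prompts, it then captures the semantic association between clue sentences and implicit expression sentences and uses a pre-trained model to generate the masked words. Finally, an interactive similarity matching mechanism models the differences between the original and generated texts and produces a topic-clue mutual-influence matrix, from which the topics and clues of implicit expressions are predicted without supervision. Experiments show that fusing external knowledge and topic information with the knowledge-enhanced modeling of a pre-trained model improves the model's understanding of implicit semantic topics and explanatory clues.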

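2. A context-aware network-driven method for topic-object pair identification. Across different event backgrounds and contexts, features of implicit expressions such as domain vocabulary, writing style, and syntactic symbols are diverse and overlapping, and the implied objects and aspects are both distinct and related. To address the challenge of modeling the associations and differences among multi-view features, this thesis proposes a context-aware network-driven method for topic-object pair identification. The method first builds a multi-view heterogeneous network that fuses knowledge and semantics. It then applies a network representation learning method fused with pre-training to characterize the global associations of each feature, and designs an attention-fused aggregation mechanism to capture interactions among features. Finally, a context-aware negative sampling mechanism learns differentiated representations of different topic-object pairs, and a contrastive loss and a cross-entropy loss are jointly optimized to predict topic-object pairs. Experiments show that the context-aware network captures the interactions and differences of fine-grained features and accurately mines the topic-object pairs of implicit expressions.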

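3. A multi-dimensional associated element mining method based on bias-corrected multi-turn question answering. To mine the cascading relationships among fine-grained elements such as organizations and topics in implicit expressions while reducing error accumulation, this thesis proposes a multi-dimensional associated element mining method based on bias-corrected multi-turn question answering. The method converts the logical relationships among event elements into a multi-turn query-and-answer process and introduces a query correction mechanism to mitigate error accumulation across turns. It designs a task-aware BERT encoder to enable joint multi-task learning. Finally, cross-entropy losses are optimized together to guide the model in selecting the event topic and extracting topic content elements and key organization elements, which form multi-dimensional associated element tuples. Experiments show that modeling the cascading relationships between topic and organization elements in depth and correcting bias perturbations helps mine fine-grained associated elements.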

Other Abstract

The rapid development of social media platforms has provided users with more convenient and open channels for disseminating event information. Event dissemination is becoming more diverse in expression and more complex in its forms of feedback, giving rise to implicit semantic expressions such as sarcasm and metaphor. In event dissemination analysis, mining and understanding implicit semantic expressions and their fine-grained related elements helps predict the impact of events more accurately and provides precise, efficient decision support for identifying users' potential stances and underlying demands. Building on recent results in knowledge-enhanced semantic representation, heterogeneous information network modeling, and multi-task learning, this thesis studies methods for analyzing implicit semantic expressions in event dissemination from three perspectives: implicit topic and clue discovery, topic-object pair identification, and multi-dimensional associated element mining. The major work and contributions of this thesis are summarized as follows:

1. Knowledge-enhanced prompt method for implicit expression topic and clue discovery. Implicit expressions contain opinions that are inconsistent with their surface-level semantics, so understanding and explaining their underlying motivations requires integrating domain topics and contextual clues. To tackle the challenge of effectively modeling external knowledge and discovering topics and clues in an unsupervised manner, this thesis proposes a knowledge-enhanced prompt learning method. The method first masks implicit semantic sentences and clue sentences based on syntactic knowledge and sentiment lexicons. Then, guided by topic prompts, it captures the semantic correlation between clue sentences and implicit expression sentences and generates the masked words with a pre-trained model. Finally, an interactive similarity matching mechanism models the differences between the original and generated sentences and produces a topic-clue co-influence matrix, from which the topics and clues of implicit expressions are predicted without supervision. Experiments demonstrate that integrating external knowledge and topic information into the knowledge-enhanced modeling of a pre-trained model improves the model's ability to comprehend implicit semantic topics and interpretative clues.
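The final matching step can be illustrated with a small sketch. The abstract does not give the similarity function or the decision rule, so everything here is an assumption: bag-of-words cosine similarity stands in for the interactive similarity matching, the regenerated texts are supplied as plain strings (in the thesis they would come from a pre-trained model), and the topic-clue pair whose regenerated text is closest to the original is selected. The function names `cosine`, `co_influence_matrix`, and `predict` are hypothetical.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors (a stand-in for learned embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def co_influence_matrix(original: str, regenerated: list[list[str]]) -> list[list[float]]:
    """regenerated[i][j] is the masked sentence refilled under topic i with clue j (assumed input)."""
    return [[cosine(original, text) for text in row] for row in regenerated]


def predict(matrix: list[list[float]], topics: list[str], clues: list[str]) -> tuple[str, str]:
    """Assumed decision rule: pick the (topic, clue) cell with the highest similarity."""
    cells = ((i, j) for i in range(len(topics)) for j in range(len(clues)))
    i, j = max(cells, key=lambda ij: matrix[ij[0]][ij[1]])
    return topics[i], clues[j]


if __name__ == "__main__":
    original = "great another delay what a reliable airline"
    topics = ["travel", "food"]
    clues = ["flight cancelled twice", "sunny weather"]
    # Toy regenerated sentences; a real system would obtain these from a masked language model.
    regenerated = [
        ["great another delay what a terrible airline", "great another day what a nice airline"],
        ["great another meal what a tasty dinner", "great another snack what a nice lunch"],
    ]
    matrix = co_influence_matrix(original, regenerated)
    print(predict(matrix, topics, clues))
```

Whether the thesis selects the most similar or the most divergent regeneration is not stated in the abstract; the argmax rule above is only one plausible reading.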

2. Context-aware network-driven method for topic-object pair identification. Across different events and contextual scenarios, implicit expressions exhibit diversity and overlap in domain vocabulary, writing style, and syntactic features, and the implied objects and aspects are both distinct and related. To tackle the challenge of modeling the associations and differences among multi-perspective features, this thesis proposes a context-aware network-driven topic-object pair identification method. The method first constructs a multi-perspective heterogeneous network integrating knowledge and semantics. A network representation learning approach combined with pre-training is then employed to capture the global associations of each feature, and an attention-fused aggregation mechanism is designed to capture the interactions among features. Finally, a context-aware negative sampling mechanism is adopted to learn differentiated representations of the various topic-object pairs, and a contrastive loss and a cross-entropy loss are jointly optimized to predict topic-object pairs. Experimental results demonstrate that the context-aware network captures the interactions and differences of fine-grained features and accurately mines the associations between the topics and objects of implicit expressions.
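The joint objective mentioned above can be sketched as follows. The abstract does not specify the form of the contrastive loss or how the two terms are weighted, so this sketch assumes an InfoNCE-style contrastive term over one positive and several sampled negatives, added to a standard softmax cross-entropy with a hypothetical weight `weight`. The negatives are passed in directly; the context-aware sampler that would choose them is not reproduced here.

```python
import math


def dot(u: list[float], v: list[float]) -> float:
    return sum(x * y for x, y in zip(u, v))


def info_nce(anchor, positive, negatives, temperature: float = 0.1) -> float:
    """InfoNCE-style loss: -log softmax score of the positive among positive + negatives."""
    scores = [dot(anchor, positive) / temperature]
    scores += [dot(anchor, n) / temperature for n in negatives]
    top = max(scores)  # subtract the max for a numerically stable log-sum-exp
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - scores[0]


def cross_entropy(logits: list[float], label: int) -> float:
    """Softmax cross-entropy for a single example."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[label]


def joint_loss(anchor, positive, negatives, logits, label,
               weight: float = 0.5, temperature: float = 0.1) -> float:
    """Assumed combination: classification loss plus a weighted contrastive term."""
    return cross_entropy(logits, label) + weight * info_nce(anchor, positive, negatives, temperature)


if __name__ == "__main__":
    anchor, positive = [1.0, 0.0], [0.9, 0.1]
    easy_negative, hard_negative = [[0.0, 1.0]], [[0.8, 0.2]]
    logits, label = [2.0, 0.5], 0
    print(joint_loss(anchor, positive, easy_negative, logits, label))
    print(joint_loss(anchor, positive, hard_negative, logits, label))
```

In this toy setup a negative that lies close to the anchor yields a larger loss than a distant one, which is the usual motivation for sampling negatives from the same context.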

3. Multi-dimensional associated element mining method based on bias-corrected multi-turn question answering. To mine the cascading relationships among fine-grained elements in implicit expressions, such as organizations and topics, and to reduce error accumulation, this thesis proposes a bias-corrected multi-turn question answering method. The method transforms the logical relationships among event elements into a multi-turn query-and-answer process and incorporates a query correction mechanism to alleviate error accumulation across turns. A task-aware BERT encoder is designed to support joint multi-task learning. Finally, cross-entropy loss is employed for collaborative optimization, guiding the model to select event topics and to extract topic content elements and key organizational elements, which together form multi-dimensional associated element tuples. Experimental results show that capturing the cascading relationships between topics and organizational elements in depth and correcting bias helps mine fine-grained associated elements.
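The cascaded query process can be sketched in a few lines. The abstract does not describe how the query correction mechanism works, so the rule below is purely an assumed stand-in: when the previous answer's confidence falls below a threshold, the next query drops that answer and falls back to a generic question, so a likely error is not propagated. The `Answerer` callable, the query wording, and the threshold are all hypothetical; in the thesis the answers would come from a task-aware BERT encoder.

```python
from typing import Callable, Tuple

# Hypothetical interface: (query, text) -> (answer, confidence in [0, 1]).
Answerer = Callable[[str, str], Tuple[str, float]]


def build_query(template: str, generic: str, prev_answer: str,
                prev_conf: float, threshold: float) -> str:
    """Condition the next query on the previous answer only if that answer looks reliable."""
    if prev_conf >= threshold:
        return template.format(prev=prev_answer)
    return generic


def extract_tuple(text: str, answer: Answerer, threshold: float = 0.5) -> Tuple[str, str, str]:
    """Three-turn cascade: topic, then topic content, then organization."""
    topic, topic_conf = answer("Which topic does the text concern?", text)
    content_query = build_query("What content about {prev} is mentioned?",
                                "What content is mentioned?",
                                topic, topic_conf, threshold)
    content, content_conf = answer(content_query, text)
    org_query = build_query("Which organization is tied to {prev}?",
                            "Which organization is mentioned?",
                            content, content_conf, threshold)
    organization, _ = answer(org_query, text)
    return topic, content, organization


if __name__ == "__main__":
    # Toy lookup table standing in for a trained question answering model.
    canned = {
        "Which topic does the text concern?": ("pricing", 0.9),
        "What content about pricing is mentioned?": ("fare increase", 0.3),
        "Which organization is mentioned?": ("the airline", 0.8),
    }
    print(extract_tuple("toy input text", lambda query, _text: canned[query]))
```

In the toy run the second answer has low confidence, so the third turn uses the generic organization query instead of one conditioned on "fare increase".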

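Keyword: event dissemination; pre-trained text representation and generation; implicit expression clues; topic-object pairs; multi-dimensional elements
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/56504
Collection: Graduates_Master's Theses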
Recommended Citation
GB/T 7714
苑敏洁. 面向事件传播的隐式语义表达分析方法研究[D],2024.
Files in This Item:
File Name/Size | DocType | Version | Access | License
苑敏洁-面向事件传播的隐式语义表达分析方 (2023KB) | Thesis | | Restricted access | CC BY-NC-SA

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.