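Thesis Advisor: 毛文吉
Degree Grantor: University of Chinese Academy of Sciences (中国科学院大学)
Place of Conferral: Beijing
Keyword: social media analysis; sentiment analysis; standpoint mining; personalized information; topic-based modeling and refinement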
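1. We investigate the role of personality in sentiment analysis of social media and propose, for the first time, a sentiment polarity classification method based on the Big Five personality model. The method groups social media texts by their authors' personality dimensions, mines the personalized sentiment features associated with each dimension, and uses ensemble learning to fuse personalized and general sentiment classification results, improving on existing sentiment classification methods. Experiments verify the effectiveness of the proposed personalized sentiment analysis method for social media.
2. In standpoint mining, we carry out the first study of standpoint mining toward multiple entities and propose a multiple standpoint mining method that integrates two layers of topic information. The method uses standpoint-related topic information in social media texts to characterize the lexical features of each standpoint at a fine granularity, and mines standpoint-independent general topics to further improve standpoint classification. Experiments verify the effectiveness of the proposed multiple standpoint mining method.
3. To reduce the manually annotated data required for multiple standpoint mining while preserving classification performance, we propose a semi-supervised multiple standpoint mining method based on user-level standpoint consistency and topic information. The method uses self-training to iteratively train a standpoint classification model on a small number of labeled texts and a large number of unlabeled texts, and selects high-confidence classified samples, according to user-level standpoint consistency and topic information, to expand the training set. Experiments verify the effectiveness of the proposed semi-supervised multiple standpoint mining method.
4. Building on the proposed semi-supervised method, we further propose a weakly supervised multiple standpoint mining method based on topic modeling. The method first uses sentiment analysis to automatically label the standpoints of a small number of texts, then exploits the intrinsic semantic relatedness among large-scale texts to improve robustness to noisy labels. It extends a topic model to obtain standpoint-discriminative topics and determines the standpoint of each text by topic similarity. Experiments verify the effectiveness of the proposed weakly supervised method for multiple standpoint mining.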
Other Abstract: With the growth and popularization of the Internet, social media has permeated all aspects of social life and become one of the major channels through which people disseminate information, share emotions, and express desires. Internet users express their sentiments and standpoints toward entities, events, or topics by publishing content and comments on social media platforms. Sentiment analysis and standpoint mining of social media texts help people explore public opinion and track shifts in popular feeling in a timely manner, so they have considerable research and application value in areas such as business and security. In this thesis, we focus on sentiment analysis and standpoint mining of social media texts. Specifically, we investigate the use of users' personalized information in sentiment analysis and explore the effect of topic information in multiple standpoint mining. We carry out experimental studies to evaluate the effectiveness of the proposed sentiment analysis and standpoint mining methods on social media datasets such as Sina Weibo and Twitter.
The major works and contributions of this thesis are summarized as follows:
1) We investigate the role of users' personality in sentiment analysis of social media texts and propose, for the first time, a sentiment polarity classification method based on the Big Five personality model. This method groups social media texts according to their authors' personality dimensions and mines the corresponding personalized sentiment features. In addition, it employs ensemble learning to merge the results of personalized and general sentiment classification, improving the performance of current sentiment classification methods. Finally, we conduct experimental studies to verify the effectiveness of the proposed personalized sentiment classification method for social media texts.
2) In the area of standpoint mining, we are among the first to study standpoint mining concerning multiple entities, and we propose a method for multiple standpoint mining that incorporates two layers of topic information. This method leverages standpoint-related topic information in social media texts to capture the lexical features of different standpoints in a fine-grained way, and mines standpoint-independent general topics to further improve standpoint classification performance. Finally, we conduct experimental studies to verify the effectiveness of the proposed method for multiple standpoint mining.
3) To reduce the need for manually annotated data in multiple standpoint mining while maintaining classification performance, we propose a semi-supervised method for multiple standpoint mining based on user-level standpoint consistency and topic information. This method leverages a small number of labeled texts and a large number of unlabeled texts to train a standpoint classification model iteratively through self-training. To expand the set of training texts, it selects high-confidence classified samples according to user-level standpoint consistency and topic information. Finally, we conduct experimental studies to verify the effectiveness of the proposed semi-supervised method for multiple standpoint mining.
4) On the basis of the proposed semi-supervised method for multiple standpoint mining, we further propose a weakly supervised multiple standpoint mining method based on topic modeling. This method first employs sentiment analysis to annotate the standpoints of a small number of texts automatically, and then leverages the intrinsic semantic relevance among massive texts to improve its robustness to noisy labels. It extends a topic model to acquire topics that discriminate between different standpoints, and determines the standpoints of texts based on topic similarity. Finally, we conduct experimental studies to verify the effectiveness of the proposed weakly supervised method for multiple standpoint mining.
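The self-training loop described in contribution 3 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the tiny Naive Bayes classifier, the 0.8 confidence threshold, the majority-vote reading of user-level standpoint consistency, and all function names are hypothetical, and the topic-information criterion is left out because the abstract does not specify it.

```python
import math
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Fit a tiny multinomial Naive Bayes model on whitespace-tokenized texts."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = text.split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return {stance: posterior probability} using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in text.split():
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(exp.values())
    return {k: v / z for k, v in exp.items()}


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Iteratively expand the training set with confident, user-consistent texts.

    labeled:   list of (user, text, stance)
    unlabeled: list of (user, text)
    The threshold and the majority-vote consistency check are assumptions.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb([t for _, t, _ in labeled], [s for _, _, s in labeled])
        preds, votes = [], defaultdict(Counter)
        for user, text in pool:
            proba = predict_proba(model, text)
            stance = max(proba, key=proba.get)
            preds.append((user, text, stance, proba[stance]))
            votes[user][stance] += 1
        accepted, remaining = [], []
        for user, text, stance, conf in preds:
            # Assumed consistency rule: the prediction must agree with the
            # stance predicted most often for the same user in this round.
            majority = max(votes[user], key=votes[user].get)
            if conf >= threshold and stance == majority:
                accepted.append((user, text, stance))
            else:
                remaining.append((user, text))
        if not accepted:  # nothing confident enough: stop early
            break
        labeled.extend(accepted)
        pool = remaining
    final_model = train_nb([t for _, t, _ in labeled], [s for _, _, s in labeled])
    return final_model, labeled
```

Calling `self_train(labeled, unlabeled)` returns the final model together with the expanded training set; texts that never reach the threshold, or that contradict their author's majority stance, stay unlabeled.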
Document Type: Thesis
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
林俊杰. 面向社交媒体的个性化情感分析与立场挖掘方法研究 [Research on Personalized Sentiment Analysis and Standpoint Mining Methods for Social Media][D]. 北京: 中国科学院大学, 2018.
Files in This Item:
File Name/Size: 面向社交媒体的个性化情感分析与立场挖掘方 (6281KB); DocType: Thesis; Access: Not yet open; License: CC BY-NC-SA