Thesis Advisor: 郝红卫
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Beijing
Keywords: Sentiment Classification, LSTM, AutoEncoder, Attention Mechanism
To address the problems above, this thesis builds on the traditional Recurrent Neural Network (RNN) model by introducing Long Short-Term Memory (LSTM) units and improves it in three respects: incorporating future sequence information, strengthening semantic learning, and adding an attention mechanism. It proposes a bidirectional multi-layer LSTM classification model, an LSTM AutoEncoder classification model, and an attention-based LSTM AutoEncoder joint learning model. The main research contents include:
1) A word-vector-based multi-layer bidirectional LSTM framework. Traditional bag-of-words Support Vector Machine (SVM) methods and Convolutional Neural Network (CNN) classifiers do not consider temporal order or history information. The LSTM-RNN converts a short text, word by word, into word vectors fed into the network, learns a deep semantic representation of the text, and then classifies it. A bidirectional LSTM partially resolves the imbalance between the weights assigned to the beginning and end of a sequence, and fuses history with future information, so the resulting sentence representation carries more complete semantics, although it still cannot fully capture the semantics and label information of the whole sentence. Experiments show that the bidirectional LSTM network captures more information, produces more discriminative semantic representations, and substantially outperforms SVM, CNN, and RNN classifiers.
2) To strengthen semantic learning in the classification model and obtain a more complete deep semantic representation of a sentence, this thesis proposes a joint learning model of LSTM and an AutoEncoder: alongside supervised training, an AutoEncoder network encodes and decodes the sentence, producing a representation that contains both semantic and label information. Experiments show that adding the AutoEncoder improves classification on both Chinese and English datasets, and that joint training outperforms offline training.
3) When the human brain takes in text or images, after grasping the overall information it focuses attention on certain words or image fragments to deepen the impression. Sentiment classification works similarly: once the semantics and label information of the whole sentence are captured, certain key words, such as adversatives and sentiment words, must be attended to in order to determine the sentiment of the whole sentence. Building on the joint learning model, this thesis therefore proposes three forms of attention mechanism (Attention Mechanism) that let the network automatically focus on the semantic representations of key words or clauses, yielding deep semantic feature representations that are richer in information and highlight the key words. In the experiments, all three attention mechanisms classify better than the LSTM + AutoEncoder joint learning model, with the attention mechanism based on hidden-layer outputs performing best.
Other Abstract: With the arrival of the information era, advances in information technology, and the popularization of the mobile Internet, social networks and e-commerce have boomed, and the Internet has become the most important channel for people to obtain and exchange information and to communicate. Information carrying personal sentiment on social networks such as BBS, Microblog, and Twitter is expanding rapidly. Because of its limited length, concise expression, and concentrated emotion, such content is called short text. Owing to the conciseness and irregularity of short texts, methods that rely heavily on handcrafted features and pre-processing tools are limited; moreover, traditional methods based on handcrafted features do not take word order into account, which hampers the extraction of sentiment information from text.
To address these drawbacks, this thesis introduces Long Short-Term Memory (LSTM) cells into the traditional RNN and improves the model in three aspects: adding future sequence information, enhancing semantic learning, and introducing an attention mechanism. It proposes a Bidirectional Multi-layer LSTM network, an LSTM AutoEncoder network, and an Attention-based LSTM AutoEncoder joint learning model. The main research contents include:
  1. A multi-layer bidirectional LSTM network. Traditional Support Vector Machine (SVM) methods based on bag-of-words and Convolutional Neural Network (CNN) methods do not take temporal order or history information into account. The proposed model maps words to vectors and learns deep semantic representations of sentences. The bidirectional LSTM alleviates the imbalance between the weights of the beginning and ending words of a sentence, and makes the semantic representation of the sentence more complete by merging history and future information. Experiments show that the bidirectional LSTM network substantially outperforms SVM, CNN, and RNN in fine-grained classification.
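The bidirectional reading described above can be sketched in plain Python. This is a minimal toy with scalar inputs and hidden states, and the weight names (`wi`, `ui`, `bi`, and so on) are illustrative placeholders, not the thesis's actual parameterization; a real model would use vector states, learned weight matrices, and multiple layers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, w):
    # One scalar LSTM cell update: gates computed from input x and previous hidden h.
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])   # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])   # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])   # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"]) # candidate memory
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def run_lstm(seq, w):
    # Unroll the cell over the sequence, collecting hidden states.
    h, c, hs = 0.0, 0.0, []
    for x in seq:
        h, c = lstm_step(x, h, c, w)
        hs.append(h)
    return hs

def bidirectional_encode(seq, w):
    # Forward pass over the sequence, backward pass over its reverse,
    # then pair the two hidden states at each time step (concatenation).
    fwd = run_lstm(seq, w)
    bwd = run_lstm(list(reversed(seq)), w)[::-1]
    return list(zip(fwd, bwd))

# Toy usage: shared illustrative weights, a three-step "sentence" of word scores.
w = {k: 0.5 for k in ["wi", "ui", "bi", "wf", "uf", "bf",
                      "wo", "uo", "bo", "wg", "ug", "bg"]}
states = bidirectional_encode([1.0, -1.0, 0.5], w)
```

Each element of `states` fuses what the network has read up to that word (forward) with what is still to come (backward), which is the sense in which the bidirectional model merges history and future information.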
  2. To enhance the learning of semantic information and obtain an integrated deep semantic representation of a sentence, this thesis proposes a joint learning model of LSTM and an AutoEncoder. During supervised learning, the AutoEncoder encodes and decodes the sentence without supervision, yielding a more comprehensive representation that includes both semantic and label information. The joint model outperforms the bidirectional LSTM network on both English and Chinese datasets, and joint learning also beats offline training in the experiments.
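The joint objective can be illustrated as a weighted sum of a supervised classification loss and an unsupervised reconstruction loss. This is a sketch of the general technique only; the mixing weight `alpha` and the choice of cross-entropy plus mean-squared reconstruction error are assumptions, not details taken from the thesis.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of class logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Supervised loss: negative log-probability of the true class.
    return -math.log(softmax(logits)[label])

def reconstruction_mse(original, decoded):
    # Unsupervised loss: how well the AutoEncoder rebuilt its input.
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)

def joint_loss(logits, label, original, decoded, alpha=0.5):
    # Training both objectives together: the classifier shapes the
    # representation toward label information while the AutoEncoder
    # keeps it faithful to the sentence's content.
    return cross_entropy(logits, label) + alpha * reconstruction_mse(original, decoded)
```

Minimizing this single scalar jointly, rather than pre-training the AutoEncoder offline and then training the classifier, is what the abstract reports as the better-performing regime.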
  3. When the human brain receives information from text or images, it focuses on specific words or parts of an image after grasping the overall features; this is called the attention mechanism. Similarly, sentiment analysis needs to focus on key words such as emotional words and adversatives after capturing the information of the entire sentence. This thesis therefore proposes three kinds of attention mechanism, built on the foundation of the LSTM and AutoEncoder joint learning model, that focus on key words and sub-sentences to obtain richer semantic representations with the key words highlighted. In the experiments, all three attention models outperform the LSTM AutoEncoder network, and the one based on the hidden units' outputs achieves the best fine-grained classification result.
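The core of any such attention mechanism is scoring each hidden state against a query, normalizing the scores with softmax, and pooling the states by those weights. The sketch below uses a simple dot-product score with an assumed query vector; the thesis's three variants differ in what is scored (this is only the hidden-output flavor, and the dot-product scoring function is an assumption).

```python
import math

def softmax(scores):
    # Numerically stable softmax over attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    # Score each per-word hidden state against the query (dot product),
    # turn scores into a distribution, and return the weighted sum.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * state[d] for w, state in zip(weights, hidden_states))
              for d in range(dim)]
    return pooled, weights

# Toy usage: two word states; the query aligns with the first word,
# so that word should receive the larger attention weight.
states = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(states, query=[1.0, 0.0])
```

Words whose hidden states align with the query, such as sentiment words or adversatives in a trained model, dominate the pooled representation, which is how the network "focuses" on them.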
Document Type: Master's Thesis
Recommended Citation
GB/T 7714
施伟. 基于深度语义特征表示的短文本情感分析研究[D]. 北京: 中国科学院大学, 2016.
