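Methods for Understanding and Analyzing User Behavior in Social Media (社会媒体中的用户行为理解及分析方法)
Wang Can (王灿) 1,2
Degree type: Master of Engineering
Advisor: Li Qiudan (李秋丹)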
2018-05-27
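Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Beijing
Keywords: social media; user behavior; retweet prediction; news quoting prediction; rating prediction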
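Abstract: In recent years, social media has become an important platform for sharing opinions and views and for spreading information. Users can freely retweet messages on these platforms and post reviews and ratings of products. In-depth analysis of the information diffusion and rating mechanisms in social media helps management departments understand user interests and supports decision-making. How to model users' retweeting, quoting, and rating behaviors and predict their interests has therefore become an important and challenging research topic.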
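Drawing on research in deep learning and data mining, this thesis mines user behavior patterns and predicts user interests by fusing deep representations of text content, users, and social influence. The main work covers the following three aspects: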
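1) To address the sparsity of post content and behavior data, a user retweeting prediction model is proposed that integrates user-content interaction information, user information, and social influence information. The model first computes the association similarity between contents by fusing content co-occurrence information, user information, and word-vector-based semantic representations of content; then, based on a co-factorization model, it jointly factorizes the user-content interaction matrix and the content similarity matrix to predict users' retweeting interests. Experiments on a real-world microblog dataset show that, by effectively representing user and post information, the model can accurately analyze users' retweeting behavior patterns. The mined behavior patterns can provide feedback to consumers and regulators.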
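2) To address the reliance of traditional diffusion analysis methods on an explicit network structure, a news diffusion model based on heat diffusion theory that incorporates content category features is proposed. The model first uses a heat diffusion process to model news diffusion, mapping the news sites in the news diffusion network into a continuous latent space; it then integrates semantic features of the news category into the model, using them as an offset vector for the news source site in the latent space; finally, it obtains the quoting sequence of a news item from the distances between each news site and the source site in the latent space. Experimental results show that modeling information diffusion in a continuous space effectively captures implicit associations among diffusion nodes in the training data, and that integrating news category information helps to understand and analyze diffusion patterns in depth.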
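3) To effectively integrate content and aspect rating information, an attention-based rating prediction method is proposed. The method first learns a deep semantic representation of the content with a bidirectional long short-term memory network; it then integrates users' aspect ratings into the attention mechanism to capture the important features of the content that relate to those aspect ratings. Experiments on a real-world review dataset show that capturing the deep semantic associations between review content and aspect ratings improves the model's predictive performance.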
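The co-factorization step in item 1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the objective (squared error on R ≈ U Vᵀ plus a weighted squared error on S ≈ V Zᵀ with shared content factors V), the per-entry SGD updates, the toy matrix `R`, and every name and hyperparameter (`cofactorize`, `alpha`, `reg`, `lr`) are assumptions made for illustration. The similarity matrix here uses only content co-occurrence, whereas the thesis also fuses user information and word-vector semantics.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cooccurrence_similarity(R):
    """Cosine similarity between the content columns of the user-content matrix R."""
    n_items = len(R[0])
    cols = [[row[j] for row in R] for j in range(n_items)]
    norms = [math.sqrt(dot(c, c)) or 1.0 for c in cols]  # guard all-zero columns
    return [[dot(cols[j], cols[k]) / (norms[j] * norms[k])
             for k in range(n_items)] for j in range(n_items)]


def objective(R, S, U, V, Z, alpha, reg):
    """Error on R ~ U V^T, plus alpha * error on S ~ V Z^T, plus an L2 penalty."""
    n_users, n_items = len(U), len(V)
    fit_r = sum((R[i][j] - dot(U[i], V[j])) ** 2
                for i in range(n_users) for j in range(n_items))
    fit_s = sum((S[j][k] - dot(V[j], Z[k])) ** 2
                for j in range(n_items) for k in range(n_items))
    l2 = sum(x * x for M in (U, V, Z) for row in M for x in row)
    return fit_r + alpha * fit_s + reg * l2


def cofactorize(R, S, rank=2, alpha=0.5, reg=0.01, lr=0.05, epochs=200, seed=0):
    """Jointly factorize R and S by per-entry SGD; V is shared by both terms."""
    rng = random.Random(seed)
    n_users, n_items = len(R), len(R[0])

    def init(n):
        return [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]

    U, V, Z = init(n_users), init(n_items), init(n_items)
    history = [objective(R, S, U, V, Z, alpha, reg)]
    for _ in range(epochs):
        # Pass over the user-content interaction matrix.
        for i in range(n_users):
            for j in range(n_items):
                err = R[i][j] - dot(U[i], V[j])
                for f in range(rank):
                    u, v = U[i][f], V[j][f]
                    U[i][f] += lr * (err * v - reg * u)
                    V[j][f] += lr * (err * u - reg * v)
        # Pass over the content similarity matrix, weighted by alpha.
        for j in range(n_items):
            for k in range(n_items):
                err = S[j][k] - dot(V[j], Z[k])
                for f in range(rank):
                    v, z = V[j][f], Z[k][f]
                    V[j][f] += lr * (alpha * err * z - reg * v)
                    Z[k][f] += lr * (alpha * err * v - reg * z)
        history.append(objective(R, S, U, V, Z, alpha, reg))
    return U, V, Z, history


# Toy data (hypothetical): 4 users x 5 posts, 1 = the user retweeted the post.
R = [[1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 1, 1, 0],
     [0, 0, 0, 1, 1]]
S = cooccurrence_similarity(R)
U, V, Z, history = cofactorize(R, S)
scores = [[dot(u, v) for v in V] for u in U]  # predicted retweet interest
```

Sharing `V` between the two terms is what lets the similarity matrix compensate for sparse interactions: a post with few retweets still receives a useful factor vector through its similarity to better-observed posts. The L2 shrinkage is applied at every visited entry, as is common in SGD matrix factorization, so the updates only approximate the gradient of `objective`.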
Alternative abstract: In recent years, social media has become a prevalent platform for sharing and spreading information, where users can retweet messages and provide reviews and ratings for products. A better understanding of the diffusion and rating mechanisms enables organizations to estimate why and how a user will be interested in a message or product, and thus helps them make better decisions. How to model users' retweeting, quoting, and rating behaviors and predict their future interests is therefore an important and challenging research topic.
Based on recent techniques in deep learning and data mining, this thesis aims to mine users' behaviors and predict users' interests by jointly learning deep representations of content, user, and social influence information. The main contributions are as follows:
1) It proposes a hybrid model for learning users' retweeting behavior, which addresses the sparsity problem by unifying user-content interaction information, user information, and social influence information. The model first computes content similarity from content co-occurrence, user information, and a word2vec-based low-dimensional representation of content; it then jointly decomposes the user-content matrix and the content similarity matrix using a co-factorization model. The model is evaluated on real-world Weibo datasets. Experimental results show that taking dense representations of user and content information into account allows more accurate analysis of users' retweeting patterns. The mined patterns can serve as a feedback channel for both consumers and management departments.
2) It puts forward a news diffusion model based on heat diffusion theory, which takes semantic category into account and does not need strong assumptions about an explicit diffusion structure. The model first maps the observed diffusion process among news sites to a heat diffusion process in a continuous latent space; the semantic category information of the news is then integrated as an offset to the source site's location in that space. Finally, the news quoting sequence is obtained from each site's distance to the source site in the latent space. Experimental results show that a diffusion model based on a continuous space can capture implicit relationships that are not observed in the training dataset, and that news category information helps learn more detailed propagation patterns.
3) It studies an attention-based rating prediction method, which encodes content and aspect ratings into an integrated representation. The method first constructs a bidirectional long short-term memory network (BiLSTM) to obtain a deep semantic representation of the content; an attention mechanism then mines the features in the review content that are important to the aspect ratings. Experimental results on data from real-world review sites show that the performance advantage of the proposed approach mainly comes from the identified deep semantic associations between review content and aspect ratings.
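The attention step in item 3) can be sketched as follows. The abstract does not specify the form of the attention, so this sketch assumes an additive scoring function, score_t = v · tanh(W_h h_t + W_a a), where h_t is a hidden state and a is the vector of aspect ratings. The hidden states are random stand-ins for BiLSTM outputs, and all names (`aspect_attention`, `W_h`, `W_a`, `v`, `w_out`) and sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def aspect_attention(H, aspect, W_h, W_a, v):
    """Additive attention over hidden states H, conditioned on aspect ratings.

    Returns the attention-weighted context vector and the attention weights.
    """
    aspect_term = matvec(W_a, aspect)  # identical for every time step
    scores = []
    for h in H:
        pre = [p + a for p, a in zip(matvec(W_h, h), aspect_term)]
        scores.append(sum(vi * math.tanh(p) for vi, p in zip(v, pre)))
    weights = softmax(scores)
    dim = len(H[0])
    context = [sum(w * h[d] for w, h in zip(weights, H)) for d in range(dim)]
    return context, weights


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
T, d, m = 6, 4, 5                            # tokens, hidden size, attention size
H = rand_matrix(rng, T, d)                   # stand-in for BiLSTM hidden states
aspect = [r / 5.0 for r in (4, 2, 5)]        # three aspect ratings scaled to [0, 1]
W_h = rand_matrix(rng, m, d)
W_a = rand_matrix(rng, m, len(aspect))
v = [rng.uniform(-1.0, 1.0) for _ in range(m)]

context, weights = aspect_attention(H, aspect, W_h, W_a, v)

# Linear read-out of an overall rating; the weights are untrained, so the
# value itself carries no meaning and is shown only to complete the pipeline.
w_out = [rng.uniform(-1.0, 1.0) for _ in range(d)]
predicted = sum(w * c for w, c in zip(w_out, context))
```

Because the aspect ratings enter the score through `W_a`, different rating profiles can shift attention toward different tokens of the same review. In a trained model, `W_h`, `W_a`, `v`, and the read-out would be learned jointly with the BiLSTM.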
Document type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/21072
Collection: Graduates_Master's Theses
Affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
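Recommended citation
GB/T 7714
Wang Can. Methods for Understanding and Analyzing User Behavior in Social Media [D]. Beijing. Graduate University of Chinese Academy of Sciences, 2018.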
Files in this item
File name/size: 社会媒体中的用户行为理解及分析方法.pd (4633KB)
Document type: Thesis
Access type: Not yet open
License: CC BY-NC-SA
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.