Deep Context Modeling for User Behavior Sequences


	Deep Context Modeling for User Behavior Sequences (面向用户行为序列的深度上下文建模)
	Cui Qiang (崔强)
	2019-05-30
Pages	108
Degree type	Doctoral
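Chinese abstract (translated)	Technology changes lives. As technology, and Internet technology in particular, has developed and spread, a wide variety of online applications have become ever more closely woven into people's daily lives. These applications accumulate large volumes of user behaviors toward items, such as purchases and clicks, and ordering them by time yields the corresponding user behavior sequences. User behaviors are often accompanied by rich contextual information, such as time, location, images, and text descriptions. This information helps describe users, items, and behaviors from multiple angles; it not only enriches feature representations and improves model performance, but also helps alleviate problems such as cold start and data sparsity. Deep learning has achieved major breakthroughs in computer vision, natural language processing, and other fields, and applying it to user behavior sequence modeling is promising. Focusing on user behavior sequences, this dissertation uses deep learning methods based on the Recurrent Neural Network (RNN) and related models to study modeling strategies for several forms of context, and proposes a corresponding new approach for each. The specific research contents are as follows:
(1) When modeling a user's behavior at the current time step, the behaviors at other time steps in the sequence can be regarded as context, which this dissertation calls temporal context. To exploit the user behaviors adjacent to the current time step, this dissertation proposes a hierarchical contextual attention network that applies contextual attention modeling at both the input layer and the hidden layer of the RNN. An RNN processes only one input and one hidden state at each time step; the proposed model treats multiple adjacent inputs and multiple adjacent hidden states as context. At the input layer, multiple inputs preceding the current time step are collected into a context matrix, an attention mechanism extracts the important features to form a context input vector, and this vector is fed into the RNN together with the original input at that time step. At the hidden layer, multiple hidden states preceding the current time step are likewise collected into a context matrix, an attention mechanism forms a context hidden vector, and this vector is combined with the original hidden state at that time step through a nonlinear activation to produce the final hidden state for the current time step. The method achieves excellent performance in scenarios such as user purchases and clicks, and captures users' short-term interests well.
(2) Using the behaviors themselves as context is effective, but the modeling has limitations; for example, it cannot handle the cold-start problem. Current context can enrich the modeling from multiple views. Current context is information that lies outside the behavior sequence and accompanies the behavior at the current time step, such as the weather. To explore ways of modeling diverse current contexts, we propose a multi-view recurrent neural network model. Taking online shopping as an example, this dissertation introduces the images and text descriptions of items to predict the next purchase. The modeling process has two steps: multi-feature modeling and user interest modeling. At the feature level, latent features of item IDs are learned from the original user sequences, and image features and text features are obtained once images and text are introduced. Three ways of combining the multiple features are designed: concatenation, image-text fusion, and image-text fusion followed by reconstruction. Because the combined features are, in essence, still concatenated to form the model input, two further ways of processing the input are designed to obtain user interests: feeding the multiple features directly into a single RNN to obtain an overall user interest, or feeding them separately into multiple RNNs to obtain individual user features that are then combined. The method achieves the best results on both of two user purchase datasets, and the image and text features of items help alleviate the cold-start problem.
(3) Once context has been introduced, it deserves deeper exploration: fusing multiple kinds of information can further uncover the influences and connections among them. To explore different ways of fusing context, this dissertation proposes a personalized spatial preference model. It takes location prediction, a classic problem in recommender systems, models the locations a user has visited, and introduces the distance intervals between pairs of locations in order to predict the user's next location more accurately. On the one hand, an RNN models the basic sequence of user locations to obtain the user's sequential preference. On the other hand, modeling the distance interval of each of the user's moves yields the user's spatial preference over the next distance. Linear and nonlinear ways of combining the user's sequential and spatial preferences at each time step are then designed. The method greatly improves prediction accuracy; the linear combination makes the contribution of each kind of information directly visible, while the nonlinear fusion adapts better to data diversity and captures complex relationships in the data.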
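The input-layer contextual attention described in item (1) of the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the dissertation's implementation: the additive scoring form (a tanh projection followed by a dot product) and the names `context_vector`, `W`, and `v` are assumptions introduced here, since the abstract only says that an attention mechanism pools a matrix of preceding inputs into one context vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(window, W, v):
    """Attention-pool a window of preceding vectors into one context vector.

    window: list of k vectors of length d (the "context matrix")
    W:      a x d projection matrix, given as a list of rows (hypothetical)
    v:      scoring vector of length a (hypothetical)
    Returns (pooled vector of length d, attention weights of length k).
    """
    scores = []
    for x in window:
        # Project each context vector, squash with tanh, then score it against v.
        hidden = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    d = len(window[0])
    # Weighted sum of the context vectors, one coordinate at a time.
    pooled = [sum(a * x[j] for a, x in zip(weights, window)) for j in range(d)]
    return pooled, weights


# Toy window of three preceding 2-dimensional inputs.
window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W = [[0.5, -0.5], [0.25, 0.75]]
v = [1.0, -1.0]
pooled, weights = context_vector(window, W, v)
```

In a full model the pooled vector would be passed to the recurrent unit alongside the original input at that time step, and the same pooling could be applied to a window of preceding hidden states.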
English abstract	Science and technology change lives. With the in-depth development and widespread use of technology, especially the Internet, a wide variety of online applications have become more and more closely integrated with people's lives. These applications accumulate a large number of user behaviors toward items, such as purchases and clicks, and ordering them chronologically yields the corresponding user behavior sequences. User behavior is often accompanied by rich context information, such as time, images, and text descriptions. Deep learning has made great breakthroughs in computer vision and natural language processing, and applying it to user behavior sequence modeling is promising. This dissertation focuses on user behavior sequences, uses deep learning methods based on the Recurrent Neural Network (RNN) to study modeling strategies for various forms of context, and proposes corresponding new solutions. The specific research contents are as follows:
(1) When modeling the user's current behavior, the behaviors at other moments in the sequence can be regarded as context, which is called the temporal context. To use the user behaviors close to the current moment, this dissertation proposes a hierarchical contextual attention-based network that applies contextual attention modeling to the input layer and the hidden layer of the RNN. The model treats multiple inputs and multiple hidden states as context. At the input layer, multiple inputs before the current moment are collected to form a context matrix, and an attention mechanism forms a context input vector, which is then fed into the RNN together with the original input. At the hidden layer, multiple hidden states before the current moment are likewise collected to form a context hidden vector, and the final hidden state is then obtained through a nonlinear activation. The method achieves excellent performance in scenarios such as user purchases and clicks, and captures the user's short-term interest well.
(2) Using the behavior itself as context is effective, but the modeling is limited, and current context can be used to enrich the modeling from multiple views. Current context is information that lies outside the behavior sequence and accompanies the behavior at the current moment, such as the weather. To explore ways of modeling diverse current contexts, we propose a multi-view recurrent neural network. This dissertation takes online shopping as an example, introduces the images and text descriptions of items, and predicts the next purchase. The modeling process is divided into two steps: multi-feature modeling and user interest modeling. There are three kinds of features: latent, visual, and textual. Three ways are designed to combine them: concatenation, fusion, and fusion with reconstruction. Because the combined features are still concatenated to form the model input, two methods are designed to process the input and obtain user interest: the multiple features are either fed directly into a single RNN, or fed into multiple RNNs whose outputs are then combined. The method achieves the best results on both user purchase datasets, and the visual and textual features of items help alleviate the cold-start problem.
(3) Once context has been introduced, it is worth exploring in depth, and fusion can further reveal the influences and connections among different kinds of information. To explore different context fusion methods, this dissertation proposes a personalized spatial preference model. It takes the classic problem of location prediction, models the locations the user has visited, and introduces the distance intervals between pairs of locations to predict the user's next location more accurately. On the one hand, an RNN models the sequence of user locations to obtain the user's sequential preference. On the other hand, modeling the distance interval at each step yields the user's spatial preference for the next distance. Linear and nonlinear methods are then designed to combine the two preferences at each moment. This method greatly improves prediction accuracy. The linear combination makes the role of each kind of information directly visible, and the nonlinear fusion adapts better to data diversity and captures complex relationships in the data.
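The linear and nonlinear combinations of sequential and spatial preference mentioned in item (3) can be illustrated with a small sketch. Everything here is hypothetical: the abstract does not give the formulas, so the weighted sum, the one-hidden-layer fusion, the function names, and the toy candidate scores are assumptions chosen only to show the contrast between the two styles.

```python
import math


def linear_fusion(seq_score, spatial_score, w_seq, w_spa):
    """Weighted sum; the weights show how much each signal contributes."""
    return w_seq * seq_score + w_spa * spatial_score


def nonlinear_fusion(seq_score, spatial_score, W1, b1, w2):
    """One tanh hidden layer over the two scores (hypothetical parameters)."""
    inputs = [seq_score, spatial_score]
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden))


def rank(candidates, fuse):
    """Sort candidate locations by fused score, highest first."""
    return sorted(candidates, key=lambda name: fuse(*candidates[name]), reverse=True)


# Toy candidates: name -> (sequential preference score, spatial preference score).
candidates = {"cafe": (0.9, 0.1), "gym": (0.5, 0.8), "park": (0.2, 0.3)}

linear_rank = rank(candidates, lambda s, d: linear_fusion(s, d, 0.5, 0.5))
print(linear_rank)  # ['gym', 'cafe', 'park']

nonlinear_rank = rank(
    candidates,
    lambda s, d: nonlinear_fusion(s, d, [[1.0, 1.0], [2.0, -1.0]], [0.0, 0.1], [0.7, 0.3]),
)
```

With equal weights, the spatial score lifts "gym" above "cafe"; setting `w_spa` to zero would recover the purely sequential ranking, which is the kind of direct readout the abstract attributes to the linear combination.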
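Keywords	contextual information; deep learning; user behavior sequence; recurrent neural network; attention mechanism
Language	Chinese
Research direction (seven major directions, sub-direction)	Data mining
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/23886
Collection	Pattern Recognition Laboratory (模式识别实验室)
Corresponding author	Cui Qiang (崔强)
Recommended citation (GB/T 7714)	Cui Qiang. Deep Context Modeling for User Behavior Sequences (面向用户行为序列的深度上下文建模) [D]. Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, 2019.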

Files in this item
File name / size	Document type	Version type	Access type	License
博士学位论文崔强，4本，谭老师组.pd（12117KB）	Dissertation		Open access	CC BY-NC-SA