Research on Recommender Systems Based on Heterogeneous Information

	Research on Recommender Systems Based on Heterogeneous Information
Alternative title	Recommender Systems with Heterogenous Information
	袁婷
	2015-05-27
Degree type	Doctor of Engineering
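Chinese abstract	With the development of Internet technology, information is growing explosively. Recommender systems emerged to address the resulting information overload. A recommender system analyzes users' interests and historical behavior and proactively recommends information they are likely to find interesting. The key to recommendation quality is modeling user preferences correctly. The most widely used approach today is collaborative filtering, which rests on the assumption that similar users behave similarly toward similar items and models user preferences by mining the historical interactions between users and items. Traditional methods, however, suffer from data sparsity: when the historical interactions between users and items are very sparse, traditional collaborative filtering struggles to extract user preferences from them. Meanwhile, as Internet applications of all kinds have become popular, the data available for processing has grown increasingly diverse: content information about users and items, friendships between users, and users' many kinds of online behavior, such as purchasing, reading, and joining groups. This information is diverse in form and heterogeneous in nature. If it can be mined and exploited efficiently and thoroughly for a specific recommendation scenario, the data sparsity problem of traditional methods can be effectively alleviated and recommendation quality improved. This dissertation therefore focuses on personalized recommender systems based on heterogeneous information and, for different recommendation scenarios, proposes corresponding solutions that effectively fuse heterogeneous information. The main content and contributions are as follows: 1. For the implicit feedback data common in recommendation problems (clicks, purchases, favorites, etc.), we propose a weighted one-class collaborative filtering algorithm based on content topic features. The method addresses the one-class problem in implicit-feedback recommendation by incorporating the rich content information available on the web. Specifically, it extracts content topic features for each user and item to help identify potential negative examples among the missing data, and it extends the traditional matrix factorization model with a weighting scheme based on content similarity, so that the prior supplied by content information assists in mining users' implicit feedback. Experiments on real-world datasets show that the method can incorporate rich web content information into implicit-feedback recommendation and helps overcome the difficulty caused by missing negative examples in this setting. 2. For recommendation scenarios that incorporate social relationships, we propose a recommendation method based on social influence analysis. By identifying each user's close friends (friends with a strong influence on the user's behavior) and susceptibility (the degree to which the user is willing to accept friends' influence), it effectively incorporates social relationships between users into recommendation. To identify each user's close friends and susceptibility, the method builds a unified factor graph model that captures multiple factors affecting social relationship analysis, and it introduces the Social Influence Propagation (SIP) algorithm, which learns the model by passing two kinds of messages related to influence strength through the social network. Finally, guided by close friends and susceptibility, the method considers both long-term and short-term social influence to improve recommendation accuracy. Experiments show that the method incorporates social relationships between users into recommendation more effectively and helps address data sparsity. 3. For recommendation scenarios that incorporate multiple types of user behavior, we propose two recommendation methods based on multi-behavior analysis. By jointly modeling the correlation and heterogeneity among a user's different behaviors, they enable effective information transfer between behaviors and thus effectively fuse multiple types of behavioral information for recommendation. The two methods are a latent factor model based on category grouping and a matrix factorization model based on group sparsity. The former factorizes the rating matrices of a user's various behaviors into shared and independent ...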
English abstract	The development of information technology has caused data to be generated at an unprecedented rate, which brings the problem of information overload. Recommender systems emerged to address this problem. They suggest information that users are likely to enjoy by analyzing users' characteristics and historical behavior. Modeling each user's individual preferences properly is crucial. The most widely used methods are Collaborative Filtering (CF) approaches. Based on the assumption that similar users behave similarly toward similar items, CF methods predict users' interests by mining their behavior history. However, they suffer from the data sparsity problem: behavioral data is typically very sparse, and it is hard for traditional CF methods to make accurate recommendations with such insufficient data. With the prevalence of web applications, diverse kinds of data have become available, for example, content information about users and items, social relationships between users, and users' multiple types of behavior on the Internet (such as shopping history, reading history, and rating history). Their forms are diverse and their attributes are heterogeneous. If this heterogeneous information can be mined effectively for different recommendation scenarios, the data sparsity problem will be relieved and recommendation quality improved. This dissertation studies recommender systems with heterogeneous information. For different recommendation problems, we present different solutions that integrate heterogeneous information effectively. The main contributions are summarized as follows. 1. We propose a novel method, named Content Topic Feature weighted One-Class Collaborative Filtering, to deal with implicit feedback data in recommender systems. It attempts to solve the one-class problem of implicit feedback by exploiting rich content information. Specifically, we obtain a content topic feature for each user and item to help distinguish potential negative examples from missing data, and extend the Matrix Factorization model by incorporating a content-similarity-based weighting scheme. Experiments on real-world data show that the proposed method outperforms state-of-the-art algorithms, which suggests that our method can incorporate content information into implicit feedback effectively and help overcome the one-class problem in thi...
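The content-similarity-based weighting described in contribution 1 can be illustrated with a minimal sketch. This is not the dissertation's algorithm; the abstract does not give its formulas. The sketch assumes that each user and item already has a non-negative topic vector, that a missing entry is weighted by one minus the cosine similarity between the user's and the item's topic vectors (so a missing item whose content is unlike the user's profile is trusted more strongly as a negative), and that the factors are fitted by plain stochastic gradient descent. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def negative_weights(R, user_topics, item_topics):
    """Assumed scheme: weight 1.0 for observed entries, and
    max(0, 1 - content similarity) for missing entries."""
    return [
        [1.0 if r == 1 else max(0.0, 1.0 - cosine(user_topics[u], item_topics[i]))
         for i, r in enumerate(row)]
        for u, row in enumerate(R)
    ]


def weighted_loss(R, W, P, Q, reg):
    """Weighted squared error over all entries plus L2 regularization."""
    err = sum(
        W[u][i] * (R[u][i] - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
        for u in range(len(R)) for i in range(len(R[0]))
    )
    return err + reg * sum(x * x for row in P + Q for x in row)


def train(R, W, k=2, reg=0.01, lr=0.05, epochs=200, seed=0):
    """Fit user factors P and item factors Q by SGD on the weighted loss."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(len(R))]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(len(R[0]))]
    for _ in range(epochs):
        for u in range(len(R)):
            for i in range(len(R[0])):
                pred = sum(a * b for a, b in zip(P[u], Q[i]))
                e = W[u][i] * (R[u][i] - pred)  # weighted residual
                for f in range(k):
                    pu, qi = P[u][f], Q[i][f]
                    P[u][f] += lr * (e * qi - reg * pu)
                    Q[i][f] += lr * (e * pu - reg * qi)
    return P, Q


if __name__ == "__main__":
    # Toy data (hypothetical): 2 users, 3 items, 2 topics; 1 = observed feedback.
    R = [[1, 0, 0], [0, 1, 0]]
    user_topics = [[0.9, 0.1], [0.1, 0.9]]
    item_topics = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.2]]
    W = negative_weights(R, user_topics, item_topics)
    P, Q = train(R, W)
    print("weights:", W)
    print("final weighted loss:", weighted_loss(R, W, P, Q, 0.01))
```

In this toy setup, the first user's missing entry for the third item gets a small weight because their topic vectors are similar, so the model is only weakly pushed to treat it as a negative. Other weighting functions would fit the same description equally well.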
Keywords	Recommender Systems; Heterogeneous Information; Matrix Factorization; Collaborative Filtering
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/6704
Collection	Graduates_Doctoral Dissertations
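Recommended citation (GB/T 7714)	袁婷. Research on Recommender Systems Based on Heterogeneous Information [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2015.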

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20121801462911 (2357 KB)			Not yet open	CC BY-NC-SA