Social media, as a new form of online media, has a great impact on people's daily lives, and it has become an active subject of current research. Communities, as a natural structure in social media, have recently attracted the interest of many researchers. This paper investigates community-based data mining and its applications in social media, using large-scale real social media data. The main contributions of our work are as follows.

First, community detection based on user interests and the social network topology. Because communities in social media are defined by both social interactions and common interests among users, we detect communities by taking user interests and social connections into account together. We first compute the interest similarity between users from a variety of textual and social features, then run a random walk on the interest-weighted social network to obtain the distance between users, and finally derive communities by clustering. Experimental results show that performance improves when user interests are taken into account.

Second, popularity prediction in communities and content recommendation. Content that is popular in a community is retweeted or shared by the majority of its members, and we propose a feature-weighted model to predict popularity in communities. We present a set of features and measure their importance using information gain, then propose a feature-weighting mechanism so that important features have a greater impact on classification. Experimental results show that the feature-weighted model achieves the best performance in popularity prediction.

Third, influencer identification in communities and friend recommendation.
To learn the influence probability between two directly connected users, we propose the Read-Retweet Model, in which a set of underlying factors is investigated to characterize a user's reading and retweeting behavior. We then present the Multi-Path Non-Linear Threshold Model to simulate information propagation, taking all possible diffusion paths into account. Experimental results show that our method performs best in the task of identifying influential users in a community.

Fourth, a prototype of a community-based recommendation system in social media. We design a prototype of a community-based recommendation system in social media and complete the community-based recommendation en...
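
The feature-weighting step described for popularity prediction can be illustrated with a minimal sketch. It computes the information gain of each discrete feature with respect to a binary popularity label and normalizes the gains into weights. The feature names (`has_url`, `author_active`), the toy data, and the choice to normalize the gains so they sum to 1 are all assumptions made for illustration; the abstract does not state which features, classifier, or weighting formula were actually used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def feature_weights(rows, labels):
    """Normalize per-feature information gain so the weights sum to 1 (an assumed scheme)."""
    gains = {name: information_gain([r[name] for r in rows], labels)
             for name in rows[0]}
    total = sum(gains.values())
    if total == 0:
        # No feature is informative: fall back to uniform weights.
        return {name: 1 / len(gains) for name in gains}
    return {name: g / total for name, g in gains.items()}


# Toy, made-up data: each row is one post; label 1 means "popular in the community".
rows = [
    {"has_url": 1, "author_active": 1},
    {"has_url": 1, "author_active": 0},
    {"has_url": 0, "author_active": 1},
    {"has_url": 0, "author_active": 0},
]
labels = [1, 1, 0, 0]

weights = feature_weights(rows, labels)
print(weights)
```

In this toy data the hypothetical `has_url` feature separates the labels perfectly while `author_active` carries no information, so all of the weight falls on `has_url`. A weighted classifier could then scale each feature's contribution by its weight.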