Geo-data Processing and Application for Social Media (社会媒体下地理数据的处理与应用)

	社会媒体下地理数据的处理与应用
Alternative title	Geo-data Processing and Application for Social Media
	闵巍庆
	2015-05-31
Degree type	Doctor of Engineering
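Abstract (translated from Chinese)	With the rapid development of Web 2.0 technology and the widespread use of GPS-enabled handheld smart devices, people share their geographic locations in many forms while using social media, producing large amounts of geo-data. For example, users share pictures of well-known landmarks on Flickr and share the venue they are currently at on Foursquare. Besides geographic location, such geo-data is usually associated with other context information, such as time and textual tags. This rich, diverse, and voluminous geo-data can serve a variety of location-based applications, such as social media organization and retrieval, travel recommendation, and media visualization. How to process geo-data from social media effectively is therefore key to building location-based applications. Compared with traditional web multimedia, geo-data from social media has more prominent characteristics of its own: it usually carries location information, is huge in volume, and is heterogeneous. Although researchers have done a great deal of work, several key problems in processing geo-data from social media remain open, such as the effective fusion and unified modeling of heterogeneous multimodal information and the association of information across platforms. To address these problems, we study both the processing and the application of geo-data in social media. The main research content and contributions are as follows: (1) Scene- and viewpoint-based landmark summarization. Considering the diversity of landmark scenes and shooting viewpoints, we propose a scene-viewpoint topic model to summarize landmarks. Built on a set of viewpoint clusters, the model learns a scene topic subspace shared across different viewpoint clusters, as well as scene-viewpoint topic subspaces specific to the individual viewpoint clusters within the same scene. We use the two kinds of learned topic subspaces to obtain representative images of each landmark for different scenes and viewpoints. (2) Spatio-temporal topic-based landmark analysis. Besides visual information, landmark images from social media usually carry other associated information, such as text and time. Taking this heterogeneous information fully into account, this work proposes a spatio-temporal topic model that learns three kinds of topic subspaces for landmarks: a global topic subspace shared by all landmarks, a location topic subspace related to only one landmark, and a temporal topic subspace corresponding to certain moments at a given landmark. In addition, considering the association between landmarks and location topics and the association between landmark-time pairs and temporal topics, we introduce an optimization objective with mutual-information-based regularization. Finally, we use Bayes' theorem to analyze the discovered topics from both the temporal and the spatial aspects of landmarks. (3) Social event detection based on heterogeneous metadata. One important application of geo-data in social media and its associated information (such as time) is detecting the social events in such media data, which enables event-based media organization and search. This work takes full account of the heterogeneous information in social media, including location, time, text, and visual information, for social event detection. To this end, we propose a robust high-order co-clustering method. On the one hand, it builds a star-structured K-partite graph to model the dependency between the social media items themselves and each type of information, achieving effective fusion of the heterogeneous information. On the other hand, it considers the relations among timestamps in the temporal space and introduces them as a global regularization term into the overall objective function, further improving the accuracy of social event detection. (4) Location-context-based cross-platform personalized recommendation. Building on geo-data processing and analysis, this work designs a location-context-based cross-platform collaborative application: given a location context and two different social media platforms, Flickr and Foursquare, it lets Flickr users enjoy, for their current location, Foursquare's ...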
English abstract	The rapid development of Web 2.0 and the widespread use of GPS-equipped mobile smart devices empower people to use location data from different social networks in various ways, which fosters the emergence of geo-data. For example, people can upload landmark images to Flickr and share their current venue information on Foursquare. Besides location information, this geo-data is associated with other context information, e.g., time-stamps and textual metadata. Large-scale geo-data presents rich content in different modalities and thus serves as a handy resource for various location-based applications (e.g., social media organization and recommendation, travel recommendation, and media visualization). Therefore, how to process this kind of geo-data effectively becomes the key problem for location-based applications. Compared with traditional multimedia, geo-data has distinctive characteristics: it is generally associated with a geo-location and presents content with heterogeneous metadata. Although researchers have done a lot of work in recent years, several key technical issues remain, such as the fusion of heterogeneous metadata and the correlation of information across different platforms. To cope with these issues, we conduct research on geo-data processing and applications for social media. The main contributions of this dissertation can be summarized as follows: (1) Scene and viewpoint based summarization for landmarks. Considering the diversity in both scenes and viewpoints, and in order to summarize landmarks better visually, we propose a scene-viewpoint based theme model that models both scenes and viewpoints. This model is capable of learning the subspaces of both the shared scene themes and the viewpoint-specific scene-viewpoint themes. We obtain representative images of different scenes and viewpoints via the two kinds of learned subspaces. (2) Spatio-temporal theme based landmark analysis. The landmark images from social networks are generally associated with other information, such as time and text. We propose a probabilistic topic model that uses this multimodal information to learn three kinds of theme subspaces, i.e., global themes shared by many landmarks, local themes capturing the characteristics of one landmark, and temporal themes occurring at a specific moment for one landmark. In addition, we consider the correlation between the local theme and landmarks, ...
Keywords	Social Media; Geo-data; Co-clustering; Topic Model; Cross-platform; Summarization; Recommendation
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/6739
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	闵巍庆. 社会媒体下地理数据的处理与应用[D]. 中国科学院自动化研究所. 中国科学院大学,2015.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20101801462804 (10593 KB)			Not yet open	CC BY-NC-SA