With the rapid development of computer and internet technology, multimedia information such as images and video plays an important and indispensable role in people's everyday lives. The huge amount of multimedia data makes it very hard for users to efficiently access what they need. An effective approach to multimedia content analysis and retrieval is the key to solving the information overload problem. In this dissertation, we study the content analysis and personalized retrieval of images and blog videos. Based on the intrinsic interrelation between multimedia data, we further incorporate the useful knowledge embedded in large-scale web resources to bridge the gap between low-level features and high-level semantic concepts. The goal is to better understand the content of multimedia data and to capture user preferences more precisely. The main contributions of this dissertation are as follows:

(1) We show that active learning is of great help for relevance feedback in content-based image retrieval. Inspired by related work, we propose a novel dynamic batch mode for selective sampling in active learning. Through one-by-one labeling and batch training, the selection of unlabeled examples is no longer determined solely by the existing classification boundary but also depends on the previously labeled examples. Based on the dynamic batch mode, we further present three sample selection strategies: a boundary moving strategy, a certainty propagation strategy, and a dynamic version space reduction strategy. These strategies can effectively guide the selection of informative samples.

(2) Multi-label image classification is a very challenging task because of the large demand for human annotation of multi-label samples. We propose a multi-view two-dimensional active learning algorithm, which integrates the mechanisms of active learning and multi-view learning.
On the one hand, we explore the sample and label uncertainties within each view; on the other hand, we capture the uncertainty across different views based on multi-view fusion. The overall uncertainty along the sample, label, and view dimensions is obtained to detect the most informative sample-label pairs. The combination of multi-view learning and active learning proves effective for redundancy reduction.

(3) We propose an effective approach to video blog (vlog) content analysis, including semantic annotation and sentiment analysis. In order to acquire high-quality annotation for a vlog, we first e...