Learning motion patterns from video data is a key technique for video content understanding. It can be used in visual surveillance to help people discover typical activities and detect anomalies in scenes. It can also help people summarize, annotate and retrieve video content. Because video data is typically large and hard to annotate, there is a pressing need for unsupervised or weakly supervised algorithms that cluster data for motion pattern learning. With traditional clustering algorithms, however, it is hard to determine automatically a number of clusters that avoids both underfitting and overfitting. Bayesian nonparametric models offer a good solution to this problem, because their complexity grows with the data. In this thesis we study algorithms for learning motion patterns from video data using Bayesian nonparametric models.
The main contributions of this thesis include:
1. We propose the dual hierarchical Dirichlet process hidden Markov model (Dual-HDP-HMM) and apply it to trajectory modeling and retrieval. Dual-HDP-HMM is a new topic model for document analysis. It jointly performs three tasks: topic discovery, document clustering, and learning of topic transition rules, with the numbers of topics and document clusters determined automatically. We represent trajectory features as visual words and trajectories as visual documents, and use Dual-HDP-HMM to analyze them for trajectory clustering and retrieval. Because our algorithm exploits both the temporal and the spatial structure of trajectories, it obtains higher clustering and retrieval accuracy than traditional algorithms on two trajectory datasets.
2. We propose the sticky multimodal hierarchical Dirichlet process hidden Markov model (SMD-HDP-HMM) and use it to learn motion patterns from complicated time series. This model is developed by introducing a sticky prior and multimodal emission distributions into Dual-HDP-HMM.
It keeps all the merits of Dual-HDP-HMM while offering a broader range of applications and better robustness. It can model complicated time series that have large intra-cluster variation and multiple observations per time step. Our experiments on the ASL trajectory dataset and the KTH human action video dataset confirm its advantages.
3. We propose the distance dependent Chinese restaurant franchise (ddCRF) mixture model for learning motion patterns in crowded scenes. The ddCRF mixture model can be used to extract topics from documents...
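The statement above that the complexity of Bayesian nonparametric models grows with the data can be illustrated with a Chinese restaurant process (CRP), the partition prior that underlies Dirichlet process mixtures. The following is only a minimal sketch and not one of the models proposed in this thesis; the function name `crp_table_counts`, the concentration parameter `alpha` and the seed are illustrative assumptions.

```python
import random


def crp_table_counts(n_customers, alpha, seed=0):
    """Simulate a Chinese restaurant process and return the table sizes.

    Hypothetical helper for illustration: each "table" plays the role of
    a cluster, and `alpha` is the concentration parameter.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] = number of customers seated at table k
    for n in range(n_customers):
        # With n customers already seated, the next one joins table k with
        # probability tables[k] / (n + alpha) and opens a new table with
        # probability alpha / (n + alpha).
        r = rng.uniform(0, n + alpha)
        cumulative = 0.0
        for k, size in enumerate(tables):
            cumulative += size
            if r < cumulative:
                tables[k] += 1
                break
        else:
            # No existing table was chosen, so a new cluster is created.
            tables.append(1)
    return tables


if __name__ == "__main__":
    # The number of occupied tables is not fixed in advance; it is
    # determined by the sampled seating arrangement.
    for n in (100, 1000, 10000):
        print(n, len(crp_table_counts(n, alpha=1.0)))
```

In this sketch the number of clusters is never specified by the user. New clusters appear as more data arrives, at a rate controlled by `alpha`, which is the property that removes the need to choose the number of clusters beforehand.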