Learning Motion Patterns Using Bayesian Nonparametric Models (基于贝叶斯非参数模型的运动模式学习)


Title	基于贝叶斯非参数模型的运动模式学习
Alternative title	Learning Motion Patterns Using Bayesian Nonparametric Models
Author	田国栋 (Tian Guodong)
	2014-11-28
Degree type	Doctor of Engineering
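Abstract (translated from Chinese)	Learning the motion patterns of objects from video data is a key technique for understanding video content and has broad application value. It can be used in visual surveillance to help people discover typical behavior patterns in a scene and detect abnormal behavior. It can also help people summarize, annotate, and retrieve video content. Because video data is large in volume and hard to annotate, unsupervised or weakly supervised motion pattern learning algorithms that cluster the data are urgently needed, but traditional clustering algorithms have difficulty automatically determining a reasonable number of clusters so as to avoid underfitting or overfitting. Fortunately, Bayesian nonparametric models offer a good solution, because their model complexity can grow with the data. This thesis studies algorithms that learn motion patterns from video data using Bayesian nonparametric models. The main contributions are:
1. We propose the dual hierarchical Dirichlet process hidden Markov model (Dual-HDP-HMM) and apply it to trajectory modeling and retrieval. Dual-HDP-HMM is a new document topic model. It jointly performs three tasks (topic discovery, document clustering, and learning of topic transition relations) and automatically determines the numbers of topics and document clusters. We first represent trajectory features as visual words and trajectories as visual documents, then analyze them with Dual-HDP-HMM to perform trajectory clustering and retrieval. Because the model exploits both the temporal and the spatial structure of trajectories, the algorithm achieves higher accuracy than traditional trajectory clustering and retrieval algorithms on two trajectory datasets.
2. We propose the sticky multimodal hierarchical Dirichlet process hidden Markov model (SMD-HDP-HMM) and apply it to motion pattern learning in complex time series. The model is obtained by introducing a sticky prior and multimodal observation distributions into Dual-HDP-HMM. It retains all the advantages of Dual-HDP-HMM but has a wider range of application and greater robustness, and it can model complex time series with large intra-class variation and multiple observations per time step. Experiments on the ASL sign language trajectory dataset and the KTH human action video dataset demonstrate the effectiveness of the algorithm.
3. We propose the distance dependent Chinese restaurant franchise (ddCRF) mixture model and use it for motion pattern learning in surveillance scenes crowded with pedestrians. The model can extract shared topics from documents that depend on one another and automatically determine the number of topics. Only incomplete trajectories (tracklets) can be extracted from surveillance video of crowded pedestrian scenes, so we build a ddCRF mixture model that includes the spatio-temporal dependencies between tracklets. We represent tracklets as visual documents and use the ddCRF mixture model to learn visual topics that represent motion patterns. Experiments show that the algorithm can discover typical motion patterns from tracklets fairly accurately.
4. We propose an algorithm for behavior pattern learning and semantic description based on hierarchical clustering of trajectories. We define a behavior pattern as a class of trajectories with similar origins, motion processes, and destinations. The algorithm discovers the typical behavior patterns in a scene with fairly high accuracy and generates semantic descriptions that approximate natural language. Experiments on a traffic trajectory dataset verify its advantages.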
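The abstract's claim that the complexity of a Bayesian nonparametric model can grow with the data can be illustrated with the Chinese restaurant process, which is listed among the record's keywords. The following is only a minimal sketch of the basic process, not the thesis's Dual-HDP-HMM or ddCRF models; the function name `sample_crp` and the parameters `alpha` and `seed` are illustrative assumptions.

```python
import random


def sample_crp(num_customers, alpha, seed=0):
    """Seat customers one by one with a Chinese restaurant process.

    Returns the list of table sizes. `alpha` (> 0) is the concentration
    parameter; all names here are illustrative, not from the thesis.
    """
    rng = random.Random(seed)
    table_sizes = []
    for _ in range(num_customers):
        # An existing table k is chosen with weight table_sizes[k];
        # a new table is opened with weight alpha.
        weights = table_sizes + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)  # open a new table (a new cluster)
        else:
            table_sizes[choice] += 1
    return table_sizes


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "customers ->", len(sample_crp(n, alpha=1.0)), "tables")
```

Because the number of tables is not fixed in advance, the number of clusters is an outcome of sampling rather than a parameter set by hand, which is the property the abstract relies on.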
English abstract	Learning motion patterns from video data is a key technique for video content understanding. It can be used in visual surveillance to help people discover typical activities and detect anomalies in scenes. It can also be used to help people summarize, annotate, and retrieve video content. Because video data is usually large and hard to annotate, there is an urgent need for unsupervised or weakly supervised algorithms that cluster data for motion pattern learning. However, traditional clustering algorithms have difficulty automatically determining a reasonable number of clusters so as to avoid underfitting and overfitting. Fortunately, Bayesian nonparametric models provide a good solution to this problem, because their complexity grows with the data. In this thesis we study algorithms for learning motion patterns from video data using Bayesian nonparametric models. The main contributions of this thesis include:
1. We propose the dual hierarchical Dirichlet process hidden Markov model (Dual-HDP-HMM) and apply it to trajectory modeling and retrieval. Dual-HDP-HMM is a new topic model for document analysis. It jointly performs the three tasks of topic discovery, document clustering, and learning of topic transition rules, with the numbers of topics and document clusters determined automatically. We represent trajectory features as visual words and trajectories as visual documents, and use Dual-HDP-HMM to analyze them for trajectory clustering and retrieval. Because our algorithm exploits both the temporal and the spatial structure of trajectories, it obtains higher clustering and retrieval accuracy than traditional algorithms on two trajectory datasets.
2. We propose the sticky multimodal hierarchical Dirichlet process hidden Markov model (SMD-HDP-HMM) and use it to learn motion patterns from complicated time series. This model is developed by introducing a sticky prior and multimodal emission distributions into Dual-HDP-HMM. It keeps all the merits of Dual-HDP-HMM but has a broader application range and better robustness. It can be used to model complicated time series that have large intra-cluster variation and multiple observations per time step. Our experiments on the ASL trajectory dataset and the KTH human action video dataset confirm its advantages.
3. We propose the distance dependent Chinese restaurant franchise (ddCRF) mixture model for learning motion patterns in crowded scenes. The ddCRF mixture model can be used to extract topics from documents...
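The abstract states that trajectory features are represented as visual words and trajectories as visual documents, without giving the encoding in this record. One common way to do this is to quantize each step's position into a grid cell and its motion direction into a small number of bins. The sketch below assumes that scheme; the function name `trajectory_to_words` and the parameters `cell_size`, `grid_width`, and `num_directions` are hypothetical, and coordinates are assumed to be non-negative with x < cell_size * grid_width.

```python
import math


def trajectory_to_words(points, cell_size=10.0, grid_width=32, num_directions=4):
    """Turn a trajectory (list of (x, y) points) into integer visual-word ids.

    Hypothetical encoding, not taken from the thesis: each moving step
    becomes one word combining the grid cell of its start point with a
    quantized motion direction.
    """
    words = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # a stationary step has no direction, so it yields no word
        col = int(x0 // cell_size)
        row = int(y0 // cell_size)
        # Direction bin d covers angles in [d, d + 1) * 2*pi / num_directions.
        angle = math.atan2(dy, dx) % (2 * math.pi)
        direction = int(angle / (2 * math.pi) * num_directions) % num_directions
        cell = row * grid_width + col
        words.append(cell * num_directions + direction)
    return words


if __name__ == "__main__":
    track = [(5, 5), (6, 6), (15, 25), (14, 26)]
    print(trajectory_to_words(track))
```

The resulting list of word ids plays the role of a visual document, which a topic model can then treat like a bag or sequence of words.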
Keywords	Bayesian Nonparametric Models; Topic Models; Dirichlet Process; Chinese Restaurant Process; Motion Pattern Learning
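Language	Chinese
Document type	Thesis
Identifier	http://ir.ia.ac.cn/handle/173211/6658
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Tian Guodong. Learning Motion Patterns Using Bayesian Nonparametric Models [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2014.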

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20101801462805 (5661KB)			Not currently open	CC BY-NC-SA