With the rapid growth of multimedia and network technology, the number of videos on the Internet and in our daily lives is increasing rapidly, and videos of human actions make up a large percentage of them. It is therefore urgent to recognize and locate human actions in these videos efficiently. Human action recognition (HAR) is an active topic in computer vision: it aims to recognize and analyze individual actions, interactive actions, and group actions in images or videos using computer vision techniques. The essence of the problem is how to analyze and represent human actions efficiently, so as to bridge the gap between low-level features and high-level semantics. Besides its research value, HAR has many potential applications, such as smart surveillance, automatic analysis of sports events, and tracking.

Recently, with the development of HAR and the widespread use of the bag-of-visual-words (BOVW) model, the vital role of local features has become obvious. Most previous work uses only the local features themselves, or simple additional geometric information such as their coordinates; little attention has been paid to the relations among the features. In this thesis we focus on the relations among local features, propose strategies to describe these relations, and combine them with local feature vectors to mine local information more comprehensively. The main contributions of our work, both based on the BOVW framework, are summarized as follows:

(1) We propose a new measurement to evaluate the relations among local features and use it to compute the visual similarity between each pair of actions. In our framework, each local feature is indexed by the action label of its source video; we extract the visual similarity of each feature and assign it a weight. Combining all the similarities with the global feature distribution of each action, we compute the visual similarity between each pair of actions.
Inspired by metric learning, this similarity is embedded into the Euclidean space so as to enlarge the distance between two features when they come from different but similar actions. Thus we obtain a more discriminative visual vocabulary. Experiments on the Weizmann and KTH datasets show that our approach outperforms the traditional vocabulary-based approach by about 5%.

(2) We propose a novel representation of the local feature neighborhood. We find several linearly independent nearest features around each local feature and utilize their spatial or temporal information to co...
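As background for both contributions, the BOVW representation they build on can be sketched as follows. This is a minimal illustrative sketch only: the function name `build_bovw_histogram`, the random toy descriptors, and the fixed 4-word vocabulary are assumptions for demonstration, not the thesis's actual pipeline (which would learn the vocabulary from training features, e.g. by k-means clustering).

```python
import numpy as np

def build_bovw_histogram(descriptors, vocabulary):
    """Quantize local feature descriptors against a visual vocabulary
    and return the normalized bag-of-visual-words histogram."""
    # Pairwise Euclidean distances: (num_descriptors, num_words).
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    # Assign each descriptor to its nearest visual word (hard assignment).
    words = dists.argmin(axis=1)
    # Count word occurrences and L1-normalize into a histogram.
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

# Toy data: 50 random 3-D local descriptors and a 4-word vocabulary.
rng = np.random.default_rng(0)
descriptors = rng.standard_normal((50, 3))
vocabulary = rng.standard_normal((4, 3))
hist = build_bovw_histogram(descriptors, vocabulary)
```

Each video is thus summarized by one fixed-length histogram, which is what a standard classifier (e.g. an SVM) consumes; the thesis's contributions modify how the vocabulary is formed and how neighborhood relations enrich the descriptors before this quantization step.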