基于中层特征和时空上下文的行为识别研究
Alternative Title: Mid-level Features and Spatial-Temporal Context Based Activity Recognition
Yuan Fei (袁飞)
Subtype: Doctor of Engineering
Thesis Advisor: Ma Songde (马颂德)
2012-12-10
Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Automation, Chinese Academy of Sciences
Degree Discipline: Control Theory and Control Engineering
Keyword: Mid-level Activity Components; Spatio-temporal String Features; Spatio-temporal Relationships; Spatio-temporal Context Kernel
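Abstract: Activity analysis and recognition in video sequences is an important frontier research direction in pattern recognition and computer vision. Progress in this area helps to build intelligent systems and networks, such as intelligent robots, intelligent video surveillance systems, and Internet-of-Things networks carrying massive visual data.
Activity recognition means having a computer automatically recognize activity events of interest from video data recorded by cameras. It involves two fundamental problems in pattern recognition and computer vision: (i) the visual representation of activity data, and (ii) the spatio-temporal modeling and learning of activity patterns. The former is an essential question in pattern recognition: what exactly is the pattern of an activity, and how can effective activity patterns be extracted from video data? The latter relates to the structural and dynamic properties of activity data; the key problem it addresses is how to learn discriminative activity class models from complex activity data.
In recent years, many researchers have done a great deal of work on activity analysis and recognition. Representative work includes local spatio-temporal interest point features (e.g., STIPs, Cuboids features) and activity representations based on the Bag-of-Features model. Local spatio-temporal features avoid some pre-processing operations at the feature extraction stage, such as background extraction, body modeling, and motion estimation, and they are robust to some degree to camera motion and illumination changes. They can also form sparse representations of activities (e.g., using the Bag-of-Features model) that embed effectively into high-level machine learning frameworks such as SVM. They are therefore widely used in activity recognition and have achieved good recognition results in some artificial and realistic scenes.
However, these methods have two serious problems: (i) local spatio-temporal features describe only local information within a limited region, leaving a large semantic gap between them and complex activity categories that span different semantic levels; (ii) representations based on local spatio-temporal features, such as the Bag-of-Features model, usually discard the spatial and temporal dependencies among features. These spatio-temporal context dependencies provide very important cues for activity recognition and cannot be ignored.
This thesis studies these problems in depth and makes the following contributions. First, for the visual representation of activity data, the thesis proposes two kinds of mid-level spatio-temporal features:
1. An activity feature based on mid-level activity components. The activity component feature is a mid-level feature designed to overcome the insufficient descriptive power of local spatio-temporal features. The thesis defines an activity component feature as a spatio-temporal component with appearance consistency in the spatial domain and motion consistency in the temporal domain; it can describe mid-level sub-activity events with certain semantic attributes, such as "kicking" and "waving". We adopt a bottom-up strategy, clustering layer by layer from low-level features to extract higher-level features: first, keypoint features are extracted from each video frame; then the keypoint features are tracked between adjacent frames to obtain a series of motion trajectory features; finally, according to the similarity of the trajectories in appearance and motion, the trajectory features are grouped into different cluster centers. We take each trajectory cluster as a mid-level activity component feature, used to describe a spatio-temporal component with structural consistency and temporal consistency. In addition, we propose an appearance descriptor, a shape descriptor, and a motion descriptor to describe the appearance, shape, and motion information of activity component features. Compared with other methods, activity component features have the following differences and advantages: (i) compared with local spatio-temporal features (e.g., STIPs, Cuboids features), activity component features are more discriminative; they can not only describe the body parts' ...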
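The final step of the bottom-up procedure in the abstract (grouping motion trajectories by similarity in appearance and motion) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not say which clustering algorithm the thesis uses, so a simple greedy grouping stands in for it, and the names `mean_velocity`, `group_trajectories`, `motion_tol`, and `appearance_tol`, as well as the scalar appearance values, are hypothetical.

```python
from math import hypot

def mean_velocity(track):
    """Average per-frame displacement (dx, dy) of a list of (x, y) points."""
    steps = len(track) - 1
    dx = (track[-1][0] - track[0][0]) / steps
    dy = (track[-1][1] - track[0][1]) / steps
    return dx, dy

def group_trajectories(tracks, appearances, motion_tol=1.0, appearance_tol=0.5):
    """Greedy grouping (an illustrative stand-in, not the thesis's method).

    A track joins the first group whose seed track has a similar mean
    velocity and a similar appearance value; otherwise it starts a new group.
    """
    groups = []  # each group is a list of track indices; group[0] is its seed
    for i, track in enumerate(tracks):
        vx, vy = mean_velocity(track)
        for group in groups:
            seed = group[0]
            sx, sy = mean_velocity(tracks[seed])
            close_motion = hypot(vx - sx, vy - sy) <= motion_tol
            close_look = abs(appearances[i] - appearances[seed]) <= appearance_tol
            if close_motion and close_look:
                group.append(i)
                break
        else:
            # No existing group matched: this track seeds a new component.
            groups.append([i])
    return groups

# Toy data: two tracks moving right with similar appearance, one moving up.
tracks = [
    [(0, 0), (2, 0), (4, 0)],
    [(1, 1), (3, 1), (5, 1)],
    [(9, 9), (9, 7), (9, 5)],
]
appearances = [0.2, 0.3, 0.9]  # hypothetical scalar appearance per track
components = group_trajectories(tracks, appearances)
```

In this toy setting each resulting group plays the role of one mid-level activity component; a real system would use richer appearance, shape, and motion descriptors than a single scalar and a mean velocity.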
Other Abstract: Activity analysis and recognition is an important and active area of Pattern Recognition and Computer Vision research. Advances in this field contribute to the development of intelligent systems and networks such as, but not limited to, autonomous robots, intelligent video surveillance systems, and the Internet of Things with massive visual data. The goal of activity recognition is to automatically analyze and recognize ongoing activities of interest in an unknown video. It involves two fundamental issues in Pattern Recognition and Computer Vision research: (i) the visual representation of activity data, and (ii) the spatio-temporal modeling and learning of activity patterns. The former is one of the essential questions in Pattern Recognition: what is the pattern of an activity, and how can effective activity patterns be extracted from a video? The latter relates to the structural and dynamic properties of activity data, and aims to solve the key problem of learning discriminative activity models from complicated activity data. Over the last decade, a large body of work has been dedicated to activity analysis and recognition. Representative work includes local spatio-temporal interest points (e.g., STIPs, Cuboids features) and Bag-of-Features based activity representations. They form sparse and effective action representations, usually coupled with machine learning techniques such as SVM. Their success is also due to their avoidance of pre-processing (such as background subtraction, body modeling, and motion estimation) and their robustness to camera motion and illumination changes. Impressive results have indeed been reported in both synthetic and realistic scenarios. However, these methods have two serious problems: (i) local spatio-temporal features describe only the local information in a spatio-temporal volume, leaving a large semantic gap between these local features and complicated activity classes with different levels of semantics; (ii) representations based on local spatio-temporal features and their variants (e.g., bag-of-features) usually discard the geometric and temporal relationships among features. These context relationships provide an important cue for activity recognition and cannot be ignored. In this thesis, we deal with the above issues through the following works and contributions. First, for the visual representation of activity data, we prop...
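Limitation (ii) can be made concrete with a small sketch, assuming each local feature has already been quantized to a visual-word index. The function name `bag_of_features` and the toy word sequences are hypothetical; the sketch only shows that a normalized word histogram is unchanged when the temporal order of the features is reversed.

```python
from collections import Counter

def bag_of_features(word_ids, vocabulary_size):
    """Normalized histogram of visual-word occurrences (order is ignored)."""
    counts = Counter(word_ids)
    total = len(word_ids)
    return [counts[w] / total for w in range(vocabulary_size)]

# Two toy videos with the same visual words in opposite temporal order.
video_a = [0, 0, 1, 2]
video_b = [2, 1, 0, 0]

hist_a = bag_of_features(video_a, vocabulary_size=3)
hist_b = bag_of_features(video_b, vocabulary_size=3)

# The two histograms are identical, so the representation cannot
# distinguish the two temporal orderings.
same_representation = hist_a == hist_b
```

Because the histogram keeps only word counts, any cue carried by where or when the features occur is lost, which is the gap that spatio-temporal context modeling is meant to fill.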
Shelf Number: XWLW1833
Other Identifier: 200818014628071
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6496
Collection: Graduates_Doctoral Dissertations
Recommended Citation
GB/T 7714
袁飞. 基于中层特征和时空上下文的行为识别研究[D]. 中国科学院自动化研究所. 中国科学院大学,2012.
Files in This Item:
File Name/Size: CASIA_20081801462807 (3212KB)
Access: Not yet open
License: CC BY-NC-SA