Activity analysis and recognition is an active research area in Pattern Recognition and Computer Vision. Advances in this field contribute to the development of intelligent systems and networks such as, but not limited to, autonomous robots, intelligent video surveillance systems, and the Internet of Things with massive visual data. The goal of activity recognition is to automatically analyze and recognize ongoing activities of interest in an unknown video. It involves two fundamental issues in Pattern Recognition and Computer Vision research: (\romannumeral1) the visual representation of activity data, and (\romannumeral2) the spatio-temporal modeling and learning of activity patterns. The former is one of the essential questions in Pattern Recognition: what is the pattern of an activity, and how can effective activity patterns be extracted from a video? The latter concerns the structural and dynamic properties of activity data, and it targets the key problem of learning discriminative activity models from complicated activity data.

Over the last decade, a large body of work has been dedicated to activity analysis and recognition. Representative work includes local spatio-temporal interest points~(e.g. STIPs, Cuboids features) and Bag-of-Features based activity representations. They form sparse and effective action representations, usually coupled with machine learning techniques such as SVM. Their success is also due to their avoidance of pre-processing (such as background subtraction, body modeling and motion estimation) and their robustness to camera motion and illumination changes. Impressive results have indeed been reported in both synthetic and realistic scenarios. However, these approaches have two serious problems: (\romannumeral1) local spatio-temporal features describe only the local information in a spatio-temporal volume, so there is a large semantic gap between these local features and complicated activity classes with different levels of semantics; (\romannumeral2) representations based on local spatio-temporal features and their variants (e.g. bag-of-features) usually discard the geometric and temporal relationships. These contextual relationships afford an important cue for activity recognition and cannot be ignored.

In this thesis, we deal with the above issues through the following work and contributions. First, for the visual representation of activity data, we prop...