Human action recognition is a central problem in computer vision and pattern recognition, and many applications, intelligent video surveillance in particular, depend on it. The problem is usually addressed by extracting features that capture spatial and temporal information from video sequences. This thesis analyzes how humans visually recognize and learn actions in order to identify the information needed to represent actions visually. Directional statistics is applied to gradients and optical flow for action representation and recognition. The contributions of this thesis are: 1. We propose three kinds of information for representing actions visually: state, the process of state transition, and the sequence of state transitions. Related work on action recognition is organized according to this taxonomy to put our work in context. 2. We propose a new descriptor for state based on directional statistics and histograms of oriented gradients, together with a corresponding similarity measure. 3. We develop a compact descriptor for the process of state transition based on directional statistics and optical flow. The descriptor is robust to irregular activities and view changes, and it is efficient enough for real-time applications. 4. We create an action recognition dataset that is large, multi-view, and recorded in real scenes. We also propose a method for organizing multiple action recognition datasets.
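To make the role of directional statistics more concrete, the following minimal sketch summarizes a set of orientations by their circular mean direction and mean resultant length, the two basic quantities of directional statistics. It is an illustration under stated assumptions, not the descriptor proposed in this thesis: the function names `circular_summary` and `gradient_direction_summary` are hypothetical, the image is assumed to be a 2-D grayscale array, and weighting each orientation by gradient magnitude is one plausible choice among several.

```python
import numpy as np


def circular_summary(angles, weights=None):
    """Return (mean direction in radians, mean resultant length in [0, 1]).

    Hypothetical helper for illustration; not the thesis's descriptor.
    """
    angles = np.asarray(angles, dtype=float).ravel()
    if weights is None:
        weights = np.ones_like(angles)
    else:
        weights = np.asarray(weights, dtype=float).ravel()

    total = weights.sum()
    if total == 0:
        # No directional evidence at all (for example, a flat image).
        return 0.0, 0.0

    # Treat each angle as a unit vector and average the vectors.
    c = np.sum(weights * np.cos(angles)) / total
    s = np.sum(weights * np.sin(angles)) / total

    mean_direction = float(np.arctan2(s, c))
    resultant_length = float(np.hypot(c, s))  # near 1: concentrated, near 0: spread
    return mean_direction, resultant_length


def gradient_direction_summary(image):
    """Summarize the gradient orientations of a 2-D grayscale image.

    Assumption: orientations are weighted by gradient magnitude.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    angles = np.arctan2(gy, gx)
    magnitudes = np.hypot(gx, gy)
    return circular_summary(angles, magnitudes)
```

An arithmetic mean of angles fails near the wrap-around point (the average of 350° and 10° would come out as 180°), whereas the vector-based mean above gives a direction close to 0°. The same summary could in principle be applied to optical flow directions instead of gradient orientations.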