Learning Relevance Restricted Boltzmann Machine for Unstructured Group Activity and Event Understanding
Zhao, Fang (1); Huang, Yongzhen (1,3); Wang, Liang; Xiang, Tao; Tan, Tieniu
Source Publication | INTERNATIONAL JOURNAL OF COMPUTER VISION
Date Issued | 2016-09-01
Volume | 119
Issue | 3
Pages | 329-345
Subtype | Article |
Abstract | Analyzing unstructured group activities and events in uncontrolled web videos is a challenging task due to (1) the semantic gap between class labels and low-level visual features, (2) the demanding computational cost of high-dimensional low-level feature vectors and (3) the lack of labeled training data. These difficulties can be overcome by learning a meaningful and compact mid-level video representation. To this end, this paper develops a novel supervised probabilistic graphical model, termed the Relevance Restricted Boltzmann Machine (ReRBM), to learn a low-dimensional latent semantic representation for complex activities and events. Our model is a variant of the Restricted Boltzmann Machine (RBM) with a number of critical extensions: (1) sparse Bayesian learning is incorporated into the RBM to learn features which are relevant to video classes, i.e., discriminative; (2) binary stochastic hidden units in the RBM are replaced by rectified linear units in order to better explain complex video contents and make variational inference tractable for the proposed model; and (3) an efficient variational EM algorithm is formulated for model parameter estimation and inference. We conduct extensive experiments on two recent challenging benchmarks: the Unstructured Social Activity Attribute dataset and the Event Video dataset. The experimental results demonstrate that the relevant features learned by our model provide better semantic and discriminative descriptions for videos than a number of alternative supervised latent variable models, and achieve state-of-the-art performance in terms of classification accuracy and retrieval precision, particularly when only a few labeled training samples are available.
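The abstract's second extension replaces the RBM's binary stochastic hidden units with rectified linear units. The sketch below makes that idea concrete: a plain RBM whose hidden units are noisy rectified linear units (in the sense of Nair and Hinton, 2010), trained with one-step contrastive divergence. This is a minimal illustrative sketch, not the authors' ReRBM: it omits the sparse Bayesian relevance prior and the variational EM algorithm the paper proposes, and the `ReluRBM` class name, hyperparameters and toy data are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nrelu_sample(x):
    # Noisy rectified linear unit (Nair & Hinton, 2010):
    # h = max(0, x + eps), with eps ~ N(0, sigmoid(x)).
    return np.maximum(0.0, x + rng.normal(0.0, np.sqrt(sigmoid(x))))

class ReluRBM:
    """Illustrative RBM with ReLU hidden units and Gaussian visible units.

    NOT the paper's ReRBM: no sparse Bayesian relevance prior and no
    variational EM; trained here with one-step contrastive divergence.
    """

    def __init__(self, n_vis, n_hid, lr=0.01):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b_vis = np.zeros(n_vis)
        self.b_hid = np.zeros(n_hid)
        self.lr = lr

    def hid_input(self, v):
        # Total input driving the hidden units.
        return v @ self.W + self.b_hid

    def reconstruct(self, h):
        # Mean reconstruction for unit-variance Gaussian visible units.
        return h @ self.W.T + self.b_vis

    def cd1_step(self, v0):
        # Positive phase: expected hidden activity is the rectified input.
        h0 = np.maximum(0.0, self.hid_input(v0))
        # Negative phase: one Gibbs step using sampled noisy-ReLU hiddens.
        v1 = self.reconstruct(nrelu_sample(self.hid_input(v0)))
        h1 = np.maximum(0.0, self.hid_input(v1))
        # CD-1 gradient approximation, averaged over the batch.
        n = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_vis += self.lr * (v0 - v1).mean(axis=0)
        self.b_hid += self.lr * (h0 - h1).mean(axis=0)

# Toy usage: random vectors stand in for low-level video descriptors.
rbm = ReluRBM(n_vis=128, n_hid=32)
X = rng.standard_normal((64, 128))
for _ in range(100):
    rbm.cd1_step(X)
codes = np.maximum(0.0, rbm.hid_input(X))  # compact mid-level representation
```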
Keyword | Representation Learning ; Video Analysis ; Restricted Boltzmann Machine ; Sparse Bayesian Learning
WOS Headings | Science & Technology ; Technology |
DOI | 10.1007/s11263-016-0896-3 |
WOS Keyword | LATENT DIRICHLET ALLOCATION ; CLASSIFICATION ; ALGORITHM ; MODELS |
Indexed By | SCI |
Language | English
Funding Organization | National Basic Research Program of China (2012CB316300) ; National Natural Science Foundation of China (61525306, 61573354, 61135002, 61420106015) ; Strategic Priority Research Program of the CAS (XDB02070100)
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence |
WOS ID | WOS:000380270000008 |
Document Type | Journal article
Identifier | http://ir.ia.ac.cn/handle/173211/12170 |
Collection | Intelligent Perception and Computing
Affiliation | 1. Chinese Acad Sci, Inst Automat, Ctr Res Intelligent Percept & Comp, Beijing, Peoples R China
2. Queen Mary Univ London, Sch Elect Engn & Comp Sci, London, England
3. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing, Peoples R China
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences
Recommended Citation GB/T 7714 | Zhao, Fang, Huang, Yongzhen, Wang, Liang, et al. Learning Relevance Restricted Boltzmann Machine for Unstructured Group Activity and Event Understanding[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 119(3): 329-345.
APA | Zhao, Fang, Huang, Yongzhen, Wang, Liang, Xiang, Tao, & Tan, Tieniu. (2016). Learning Relevance Restricted Boltzmann Machine for Unstructured Group Activity and Event Understanding. INTERNATIONAL JOURNAL OF COMPUTER VISION, 119(3), 329-345.
MLA | Zhao, Fang, et al. "Learning Relevance Restricted Boltzmann Machine for Unstructured Group Activity and Event Understanding". INTERNATIONAL JOURNAL OF COMPUTER VISION 119.3 (2016): 329-345.
Files in This Item:
File Name/Size | DocType | Version | Access | License
Learning Relevance R(6286KB) | Journal article | Author's accepted manuscript | Open access | CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.