Knowledge Commons of Institute of Automation, CAS
Title | Find Who to Look at: Turning From Action to Saliency |
Authors | Xu, Mai (1); Liu, Yufan (1,2); Hu, Roland (3); He, Feng (1) |
Journal | IEEE TRANSACTIONS ON IMAGE PROCESSING |
ISSN | 1057-7149 |
Publication Date | 2018-09-01 |
Volume | 27 |
Issue | 9 |
Pages | 4529-4544 |
Corresponding Author | He, Feng (robinleo@buaa.edu.cn) |
Abstract | The past decade has witnessed the use of high-level features in saliency prediction for both videos and images. Unfortunately, the existing saliency prediction methods only handle high-level static features, such as face. In fact, high-level dynamic features (also called actions), such as speaking or head turning, are also extremely attractive to visual attention in videos. Thus, in this paper, we propose a data-driven method for learning to predict the saliency of multiple-face videos, by leveraging both static and dynamic features at high-level. Specifically, we introduce an eye-tracking database, collecting the fixations of 39 subjects viewing 65 multiple-face videos. Through analysis on our database, we find a set of high-level features that cause a face to receive extensive visual attention. These high-level features include the static features of face size, center-bias and head pose, as well as the dynamic features of speaking and head turning. Then, we present the techniques for extracting these high-level features. Afterwards, a novel model, namely multiple hidden Markov model (M-HMM), is developed in our method to enable the transition of saliency among faces. In our M-HMM, the saliency transition takes into account both the state of saliency at previous frames and the observed high-level features at the current frame. The experimental results show that the proposed method is superior to other state-of-the-art methods in predicting visual attention on multiple-face videos. Finally, we shed light on a promising implementation of our saliency prediction method in locating the region-of-interest, for video conference compression with high efficiency video coding. |
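The abstract states that the M-HMM propagates saliency among faces by combining the saliency state at previous frames with the high-level features (e.g., speaking, head turning) observed at the current frame. The following is a minimal, hypothetical sketch of such a forward filtering step, not the paper's implementation; the function name, transition matrix, and all numbers are assumptions for illustration.

```python
import numpy as np

def mhmm_saliency_step(prev_saliency, transition, feature_scores):
    """One transition step of an M-HMM-style saliency update.

    prev_saliency  : (F,) saliency distribution over F faces at frame t-1
    transition     : (F, F) base transition probabilities between faces
    feature_scores : (F,) likelihood of each face given the high-level
                     features observed at frame t (size, pose, speaking, ...)
    Returns the (F,) saliency distribution at frame t.
    """
    predicted = transition.T @ prev_saliency   # prior carried over from t-1
    posterior = predicted * feature_scores     # reweight by observed features
    return posterior / posterior.sum()         # renormalize to a distribution

# Toy example: 3 faces; face 2 starts speaking at frame t.
prev = np.array([0.6, 0.3, 0.1])
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
obs = np.array([0.2, 0.2, 0.6])  # speaking face gets a high feature score
print(mhmm_saliency_step(prev, A, obs))
```

In this sketch, the speaking face's saliency rises sharply between frames even though it started low, which mirrors the abstract's point that saliency transitions depend on both the previous state and current observations.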
Keywords | Video analysis; saliency prediction; face |
DOI | 10.1109/TIP.2018.2837106 |
Keywords [WOS] | VIDEO CODING HEVC ; VISUAL-ATTENTION ; SPATIOTEMPORAL SALIENCY ; MODEL ; FACE ; EFFICIENCY ; IMAGE ; SCENE ; GAZE ; COMPRESSION |
Indexed By | SCI |
Language | English |
Funding Project | National Key R&D Program of China [2017YFB1002400]; NSFC [61573037]; Fok Ying-Tong Education Foundation [151061]; Zhejiang Public Welfare Research Program [2016C31062]; Natural Science Foundation of Zhejiang Province [LY16F010004] |
Funder | National Key R&D Program of China; NSFC; Fok Ying-Tong Education Foundation; Zhejiang Public Welfare Research Program; Natural Science Foundation of Zhejiang Province |
WOS Research Area | Computer Science; Engineering |
WOS Category | Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic |
WOS Record ID | WOS:000435518500008 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Document Type | Journal article |
Identifier | http://ir.ia.ac.cn/handle/173211/27996 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems / Video Content Security |
Affiliations | 1. Beihang Univ, Sch Elect & Informat Engn, Beijing 100191, Peoples R China; 2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China; 3. Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Zhejiang, Peoples R China |
Recommended Citation (GB/T 7714) | Xu, Mai, Liu, Yufan, Hu, Roland, et al. Find Who to Look at: Turning From Action to Saliency[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27(9): 4529-4544. |
APA | Xu, Mai, Liu, Yufan, Hu, Roland, & He, Feng. (2018). Find Who to Look at: Turning From Action to Saliency. IEEE TRANSACTIONS ON IMAGE PROCESSING, 27(9), 4529-4544. |
MLA | Xu, Mai, et al. "Find Who to Look at: Turning From Action to Saliency." IEEE TRANSACTIONS ON IMAGE PROCESSING 27.9 (2018): 4529-4544. |
Files in This Item | There are no files associated with this item. |
Unless otherwise noted, all content in this repository is protected by copyright, with all rights reserved.