Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation

doi:10.1109/TNNLS.2021.3054769


Title	Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation
Authors	Yin, Yingjie (1,2,3); Xu, De (1,3); Wang, Xingang (1,3); Zhang, Lei (2)
Journal	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
ISSN	2162-237X
Publication date	2021-02-12
Pages	11
Corresponding author	Yin, Yingjie (haidaying.jie@163.com)
Abstract	Most recent semisupervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module (GDMM), which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective appearance model based on directional statistics is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the GDMM and the directional appearance model learning module, DDEAL learns static cues from the labeled first frame and dynamically updates cues of the subsequent frames for object segmentation. Our method achieves state-of-the-art VOS performance without using online fine-tuning. Specifically, it achieves a J & F mean score of 74.8% on the DAVIS 2017 data set and an overall score G of 71.3% on the large-scale YouTube-VOS data set, while retaining a speed of 25 fps with a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs at 31 fps with only a small loss in accuracy.
Keywords	Feature extraction; Kernel; Object segmentation; Faces; Probabilistic logic; Learning systems; Image segmentation; Deep appearance learning; directional deep embedding learning; directional statistics-based learning; video object segmentation (VOS)
DOI	10.1109/TNNLS.2021.3054769
Indexed by	SCI
Language	English
Funding projects	National Natural Science Foundation of China[61703398] ; Science and Technology Program of Beijing Municipal Science and Technology Commission[Z191100008019004] ; Hong Kong Research Grants Council (RGC) General Research Fund (GRF)[PolyU 152135/16E] ; Hong Kong Scholars Program[XJ2017031]
Funding organizations	National Natural Science Foundation of China ; Science and Technology Program of Beijing Municipal Science and Technology Commission ; Hong Kong Research Grants Council (RGC) General Research Fund (GRF) ; Hong Kong Scholars Program
WOS research areas	Computer Science ; Engineering
WOS categories	Computer Science, Artificial Intelligence ; Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic
WOS accession number	WOS:000733511700001
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Seven major directions: sub-direction classification	Image and video processing and analysis
Citation statistics	Times cited: 9 [WOS]
Document type	Journal article
Item identifier	http://ir.ia.ac.cn/handle/173211/47110
Collection	CAS Engineering Laboratory of Industrial Vision and Intelligent Equipment_Precision Sensing and Control
Corresponding author	Yin, Yingjie
Author affiliations	1. Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China; 2. Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China; 3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
First author affiliation	Research Center of Precision Sensing and Control
Corresponding author affiliation	Research Center of Precision Sensing and Control
Recommended citation (GB/T 7714)	Yin, Yingjie, Xu, De, Wang, Xingang, et al. Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021: 11.
APA	Yin, Yingjie, Xu, De, Wang, Xingang, & Zhang, Lei. (2021). Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 11.
MLA	Yin, Yingjie, et al. "Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021): 11.