Knowledge Commons of Institute of Automation, CAS
Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation | |
Yin, Yingjie1,2,3 | |
Journal | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS |
ISSN | 2162-237X |
2021-02-12 | |
Pages | 11 |
Corresponding author | Yin, Yingjie (haidaying.jie@163.com) |
Abstract | Most recent semisupervised video object segmentation (VOS) methods rely on fine-tuning deep convolutional neural networks online using the given mask of the first frame or predicted masks of subsequent frames. However, the online fine-tuning process is usually time-consuming, limiting the practical use of such methods. We propose a directional deep embedding and appearance learning (DDEAL) method, which is free of the online fine-tuning process, for fast VOS. First, a global directional matching module (GDMM), which can be efficiently implemented by parallel convolutional operations, is proposed to learn a semantic pixel-wise embedding as an internal guidance. Second, an effective directional appearance model-based statistics is proposed to represent the target and background on a spherical embedding space for VOS. Equipped with the GDMM and the directional appearance model learning module, DDEAL learns static cues from the labeled first frame and dynamically updates cues of the subsequent frames for object segmentation. Our method exhibits the state-of-the-art VOS performance without using online fine-tuning. Specifically, it achieves a J & F mean score of 74.8% on DAVIS 2017 data set and an overall score G of 71.3% on the large-scale YouTube-VOS data set, while retaining a speed of 25 fps with a single NVIDIA TITAN Xp GPU. Furthermore, our faster version runs 31 fps with only a little accuracy loss. |
Keywords | Feature extraction ; Kernel ; Object segmentation ; Faces ; Probabilistic logic ; Learning systems ; Image segmentation ; Deep appearance learning ; directional deep embedding learning ; directional statistics-based learning ; video object segmentation (VOS) |
DOI | 10.1109/TNNLS.2021.3054769 |
Indexed by | SCI |
Language | English |
Funding project | National Natural Science Foundation of China[61703398] ; Science and Technology Program of Beijing Municipal Science and Technology Commission[Z191100008019004] ; Hong Kong Research Grants Council (RGC) General Research Fund (GRF)[PolyU 152135/16E] ; Hong Kong Scholars Program[XJ2017031] |
Funding organization | National Natural Science Foundation of China ; Science and Technology Program of Beijing Municipal Science and Technology Commission ; Hong Kong Research Grants Council (RGC) General Research Fund (GRF) ; Hong Kong Scholars Program |
WOS research area | Computer Science ; Engineering |
WOS subject category | Computer Science, Artificial Intelligence ; Computer Science, Hardware & Architecture ; Computer Science, Theory & Methods ; Engineering, Electrical & Electronic |
WOS accession number | WOS:000733511700001 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Seven major research directions: sub-direction classification | Image and video processing and analysis |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/47110 |
Collection | CAS Engineering Laboratory for Industrial Vision and Intelligent Equipment_Precision Sensing and Control |
Corresponding author | Yin, Yingjie |
Author affiliations | 1.Chinese Acad Sci, Res Ctr Precis Sensing & Control, Inst Automat, Beijing 100190, Peoples R China 2.Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China 3.Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China |
First author affiliation | Research Center of Precision Sensing and Control |
Corresponding author affiliation | Research Center of Precision Sensing and Control |
Recommended citation (GB/T 7714) | Yin, Yingjie,Xu, De,Wang, Xingang,et al. Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS,2021:11. |
APA | Yin, Yingjie,Xu, De,Wang, Xingang,&Zhang, Lei.(2021).Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation.IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS,11. |
MLA | Yin, Yingjie,et al."Directional Deep Embedding and Appearance Learning for Fast Video Object Segmentation".IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021):11. |
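Files in this item | There are no files associated with this item. |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.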