Knowledge Commons of Institute of Automation, CAS
Temporal Context Enhanced Feature Aggregation for Video Object Detection
He, Fei [1,2]; Gao, Naiyu [1,2]; Li, Qiaozhe [1,2]; Du, Senyao [3]; Zhao, Xin [1,2]; Huang, Kaiqi [1,2,4]
2020-02
Conference Name | AAAI Conference on Artificial Intelligence |
Proceedings Title | The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20) |
Conference Date | 2020-02 |
Conference Location | New York |
Conference Country | US |
Abstract | Video object detection is a challenging task because of the presence of appearance deterioration in certain video frames. One typical solution is to aggregate neighboring features to enhance per-frame appearance features. However, such a method ignores the temporal relations between the aggregated frames, which is critical for improving video recognition accuracy. To handle the appearance deterioration problem, this paper proposes a temporal context enhanced network (TCENet) to exploit temporal context information by temporal aggregation for video object detection. To handle the displacement of the objects in videos, a novel DeformAlign module is proposed to align the spatial features from frame to frame. Instead of adopting a fixed-length window fusion strategy, a temporal stride predictor is proposed to adaptively select video frames for aggregation, which facilitates exploiting variable temporal information and requiring fewer video frames for aggregation to achieve better results. Our TCENet achieves state-of-the-art performance on the ImageNet VID dataset and has a faster runtime. Without bells-and-whistles, our TCENet achieves 80.3% mAP by only aggregating 3 frames. |
Indexed By | EI |
Language | English |
Document Type | Conference paper |
Identifier | http://ir.ia.ac.cn/handle/173211/48736 |
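Collection | Laboratory of Cognition and Decision for Complex Systems: Intelligent Systems and Engineering (复杂系统认知与决策实验室_智能系统与工程) |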
Author Affiliations | 1. CRISE, Institute of Automation, Chinese Academy of Sciences; 2. University of Chinese Academy of Sciences; 3. Horizon Robotics, Inc.; 4. CAS Center for Excellence in Brain Science and Intelligence Technology |
First Author Affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended Citation (GB/T 7714) | He, Fei,Gao, Naiyu,Li, Qiaozhe,et al. Temporal Context Enhanced Feature Aggregation for Video Object Detection[C],2020. |
Files in This Item |
File Name/Size | Document Type | Version Type | Access Type | License |
TCENet-AAAI2020.pdf (952KB) | Conference paper | | Open access | CC BY-NC-SA |