An End-to-end Video Text Detector with Online Tracking
Yu HY(俞宏远)1,2; Zhang, Chengquan3; Li, Xuan3; Han, Junyu3; Ding, Errui3; Wang L(王亮)1,2,4
2019
Conference name: International Conference on Document Analysis and Recognition
Conference date: 2019
Conference location: Sydney
Place of publication: IEEE
Publisher: IEEE
Abstract

Video text detection is considered one of the most difficult tasks in document analysis because of two challenges: 1) the difficulties caused by video scenes, i.e., motion blur, illumination changes, and occlusion; 2) the properties of text, including variations in font, language, orientation, and shape. Most existing methods attempt to improve video text detection by combining it with video text tracking, but treat the two tasks separately. In this work, we propose an end-to-end video text detection model with online tracking to address these two challenges. Specifically, in the detection branch, we adopt ConvLSTM to capture spatial structure information and motion memory. In the tracking branch, we convert the tracking problem into text instance association and propose an appearance-geometry descriptor with a memory mechanism to generate robust representations of text instances. Integrating the two branches into one trainable framework lets them promote each other and significantly reduces the computational cost. Experiments on existing video text benchmarks, including ICDAR2013 Video, Minetto, and YVT, demonstrate that the proposed method significantly outperforms state-of-the-art methods. Our method improves the F-score by about 2% on all datasets and runs in real time at 24.36 fps on a TITAN Xp.
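
The idea of treating tracking as text instance association can be illustrated with a minimal sketch. This is not the model from the paper: the descriptor there is learned, whereas the sketch below assumes a hand-made cost that mixes cosine similarity of an appearance vector with the IoU of axis-aligned boxes, a greedy matcher, and an exponential moving average standing in for the memory mechanism. The function names and the parameters `w_app`, `max_cost`, and `momentum` are all hypothetical.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def associate(tracks, detections, w_app=0.5, max_cost=0.6, momentum=0.8):
    """Assign a track id to every detection of the current frame.

    tracks:     dict mapping track id -> {"box": ..., "feat": ...}, updated in place
    detections: list of {"box": ..., "feat": ...} for the current frame
    Returns a list with one track id per detection.
    """
    # Score every track/detection pair with an appearance + geometry cost
    # (an assumed, hand-made cost, not the learned descriptor of the paper).
    pairs = []
    for tid, t in tracks.items():
        for j, d in enumerate(detections):
            app_cost = 1.0 - cosine(t["feat"], d["feat"])
            geo_cost = 1.0 - iou(t["box"], d["box"])
            cost = w_app * app_cost + (1.0 - w_app) * geo_cost
            if cost <= max_cost:
                pairs.append((cost, tid, j))

    # Greedy one-to-one matching, cheapest pairs first.
    pairs.sort()
    assigned = [None] * len(detections)
    used = set()
    for _, tid, j in pairs:
        if tid in used or assigned[j] is not None:
            continue
        assigned[j] = tid
        used.add(tid)

    # Update matched tracks; start a new track for each unmatched detection.
    next_id = max(tracks, default=-1) + 1
    for j, d in enumerate(detections):
        tid = assigned[j]
        if tid is None:
            tracks[next_id] = {"box": d["box"], "feat": list(d["feat"])}
            assigned[j] = next_id
            next_id += 1
        else:
            t = tracks[tid]
            t["box"] = d["box"]
            # Moving average as a simple stand-in for a memory mechanism.
            t["feat"] = [momentum * a + (1.0 - momentum) * b
                         for a, b in zip(t["feat"], d["feat"])]
    return assigned
```

Calling `associate` once per frame with the same `tracks` dictionary keeps identities across frames: detections that resemble an existing track in both appearance and position inherit its id, and the rest start new tracks. A greedy matcher is used only to keep the sketch short; an optimal assignment method could be substituted.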
 

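Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/48517
Collection: Center for Research on Intelligent Perception and Computing
Institute of Automation, Chinese Academy of Sciences
Corresponding author: Wang L (王亮)
Author affiliations: 1. University of Chinese Academy of Sciences
2. Institute of Automation, Chinese Academy of Sciences, NLPR, CASIA
3. Department of Computer Vision Technology (VIS), Baidu Inc.
4. Chinese Academy of Sciences Artificial Intelligence Research
First author affiliation: National Laboratory of Pattern Recognition
Corresponding author affiliation: National Laboratory of Pattern Recognition
Recommended citation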
GB/T 7714
Yu HY,Zhang, Chengquan,Li, Xuan,et al. An End-to-end Video Text Detector with Online Tracking[C]. IEEE:IEEE,2019.
Files in this item
File name/size: An_End-to-End_Video_(4286KB)
Document type: Conference paper
Access type: Open access
License: CC BY-NC-SA
File name: An_End-to-End_Video_Text_Detector_with_Online_Tracking.pdf
Format: Adobe PDF