A multimodal scheme for program segmentation and representation in broadcast video streams
Wang, Jinqiao (1); Duan, Lingyu (2); Liu, Qingshan (1); Lu, Hanqing (1); Jin, Jesse S. (3)
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
2008-04-01
Volume: 10, Issue: 3, Pages: 393-408
Article type: Article
Abstract: With the advance of digital video recording and playback systems, there is an evident need to manage recorded TV programs efficiently so that users can readily locate and browse their favorite programs. In this paper, we propose a multimodal scheme to segment and represent TV video streams. The scheme aims to recover the temporal and structural characteristics of TV programs using visual, auditory, and textual information. For visual cues, we develop a novel concept named program-oriented informative images (POIM) to identify candidate points correlated with the boundaries of individual programs. For audio cues, a multiscale Kullback-Leibler (K-L) distance is proposed to locate audio scene changes (ASC), and ASC is then aligned with video scene changes to represent candidate program boundaries. In addition, latent semantic analysis (LSA) is adopted to calculate the textual content similarity (TCS) between shots to model the inter-program similarity and intra-program dissimilarity in terms of speech content. Finally, we fuse the multimodal features of POIM, ASC, and TCS to detect the boundaries of programs, including individual commercials (spots). To support an effective program guide and attractive content browsing, we propose a multimodal representation of individual programs that summarizes each program with POIM images, key frames, and textual keywords. Extensive experiments are carried out on an open benchmark dataset, the TRECVID 2005 corpus, and promising results have been achieved. Compared with the electronic program guide (EPG), our solution provides a more generic approach to determining the exact boundaries of diverse TV programs, even including dramatic spots.
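The abstract names a multiscale K-L distance for locating audio scene changes but does not give its formulation. The following is only an illustrative sketch under stated assumptions, not the authors' method: it assumes per-frame audio features, fits a diagonal Gaussian to the frames on each side of a candidate point, and averages the symmetric K-L distance over several window sizes. The function names, the `scales` values, and the diagonal-Gaussian model are all hypothetical choices made for this example.

```python
import numpy as np

def symmetric_kl_diag_gauss(x, y, eps=1e-6):
    """Symmetric K-L distance between diagonal Gaussians fitted to x and y.

    x, y: arrays of shape (n_frames, n_dims) of per-frame audio features.
    The diagonal-Gaussian model is an assumption made for this sketch.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    var_x, var_y = x.var(axis=0) + eps, y.var(axis=0) + eps
    diff2 = (mu_x - mu_y) ** 2
    # KL(p||q) + KL(q||p); the log-variance terms cancel in the sum.
    per_dim = 0.5 * (var_x / var_y + var_y / var_x - 2.0) \
        + 0.5 * diff2 * (1.0 / var_x + 1.0 / var_y)
    return float(per_dim.sum())

def multiscale_kl(features, t, scales=(10, 20, 40)):
    """Average the symmetric K-L distance around frame t over several window sizes.

    scales: hypothetical window lengths in frames, not values from the paper.
    """
    dists = []
    for w in scales:
        left, right = features[max(0, t - w):t], features[t:t + w]
        if len(left) < 2 or len(right) < 2:
            continue  # too few frames to estimate statistics at this scale
        dists.append(symmetric_kl_diag_gauss(left, right))
    return float(np.mean(dists)) if dists else 0.0
```

In a sketch like this, frames where `multiscale_kl` peaks would be treated as candidate audio scene changes; how the paper thresholds such a score and aligns it with video scene changes is not specified in the abstract.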
Keywords: Broadcast Video; Latent Semantic Analysis; Multimodal Fusion; TV Program Segmentation
WOS headings: Science & Technology; Technology
Keywords [WOS]: RETRIEVAL
Indexed by: SCI
Language: English
WOS research area: Computer Science; Telecommunications
WOS subject category: Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications
WOS accession number: WOS:000258767100009
Citation statistics
Times cited: 17 [WOS]
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/9571
Collection: Pre-2009 achievements
Corresponding author: Wang, Jinqiao
Affiliations:
1. Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing 100080, Peoples R China
2. Inst Infocomm Res, Singapore 119613, Singapore
3. Univ Newcastle, Sch Design Commun & Informat Technol, Callaghan, NSW 2308, Australia
First author affiliation: National Laboratory of Pattern Recognition
Corresponding author affiliation: National Laboratory of Pattern Recognition
Recommended citation
GB/T 7714: Wang, Jinqiao, Duan, Lingyu, Liu, Qingshan, et al. A multimodal scheme for program segmentation and representation in broadcast video streams[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2008, 10(3): 393-408.
APA: Wang, Jinqiao, Duan, Lingyu, Liu, Qingshan, Lu, Hanqing, & Jin, Jesse S. (2008). A multimodal scheme for program segmentation and representation in broadcast video streams. IEEE TRANSACTIONS ON MULTIMEDIA, 10(3), 393-408.
MLA: Wang, Jinqiao, et al. "A multimodal scheme for program segmentation and representation in broadcast video streams." IEEE TRANSACTIONS ON MULTIMEDIA 10.3 (2008): 393-408.
Files in this item
File name/size: A Multimodal Scheme (2319KB)
Document type: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: A Multimodal Scheme for Program Segmentation and Representation in Broadcast Video Streams.pdf
Format: Adobe PDF