广播视频的结构分析和语义检索
Alternative Title: Structure Analysis and Semantic Retrieval for Broadcast Videos
王金桥
Subtype: Doctor of Engineering
Thesis Advisor: 卢汉清
2008-06-04
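Degree Grantor: Graduate University of the Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Pattern Recognition and Intelligent Systems
Keyword: Video Retrieval, Structure Analysis, Video Segmentation, Multimodal Fusion, Scene Detection, Ads Classification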
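Abstract: This thesis presents an in-depth study of the structure and semantic understanding of broadcast video. It touches on many basic problems in video processing and content retrieval, including shot boundary detection, program segmentation, program summarization, program classification, and program retrieval for broadcast video. The main work and contributions are: (1) Starting from the structure and production characteristics of broadcast video, the thesis analyzes the problems in current video segmentation, classification, and retrieval, and proposes a program segmentation and representation framework based on multimodal fusion. Three mid-level features are proposed to bridge the "gap" between low-level features and high-level semantics. Programs are represented multimodally with visual and textual features, so that users can browse and search video programs more conveniently. (2) The relationship between logo distribution and program boundaries in broadcast video is studied in depth. The gradient operation for multi-valued images is extended to video processing, and a generalized-gradient-based framework of logo processing algorithms for video is proposed, which can detect, track, and remove static, animated, and semi-transparent logos. (3) By combining POIM image retrieval with video key frame sequence matching, a fast coarse-to-fine video program retrieval algorithm is proposed. Compared with traditional video retrieval algorithms, it overcomes the effects of color distortion, bitrate changes, and resolution changes, and thus makes video program retrieval more robust. (4) Advertisement videos within broadcast video are analyzed, and an ad video summarization system covering ad segmentation, classification, and retrieval is implemented. Ad boundaries are detected with a method based on FMPI images that also combines visual scene changes, audio scene changes, and ad-domain features such as black frames and silence. Latent semantic analysis is used to automatically mine visual and textual concepts related to products and services, and video ads are classified by product and service. Ad retrieval based on FMPI images and key frame sequence matching meets the needs of ad monitoring and search.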
Other Abstract: This dissertation studies the structure and semantic understanding of broadcast video, which involves many basic issues in video processing and content retrieval, including shot boundary detection, program segmentation, program summarization, program classification, and program indexing. The main work and contributions of this thesis are as follows: (1) Starting from the structure and production characteristics of broadcast video, we analyze the existing problems of video segmentation, classification, and retrieval, and propose a multimodal fusion framework for broadcast video analysis. Three middle-level features are proposed to bridge the semantic gap between low-level features and high-level semantics. We further propose a program segmentation and representation framework based on visual and textual features, which makes it easy for users to browse and index video programs. (2) Through an in-depth study of the relationship between program boundaries and logo presence in video programs, we extend the gradient of multi-valued images to video processing and propose a logo processing framework based on the generalized gradient, which handles detection, tracking, and removal of static, animated, and semi-transparent logos. (3) We propose a fast coarse-to-fine video program retrieval algorithm based on POIM images and key frame sequence matching. Compared with key frame sequence matching and clip-based retrieval approaches, our approach overcomes the influence of color distortion, encoding changes, and resolution changes, and increases the robustness of program retrieval. (4) We develop a video ads digesting system that includes ad segmentation, categorization, and recognition. We study the characteristics of the boundaries between individual ads, and propose to use FMPI images, video scene changes, audio scene changes, and ad-domain features such as black frames and silence to detect ad boundaries. To classify video ads by product and service, we employ latent semantic analysis to mine the latent visual and textual concepts related to products and services. Video ad retrieval with FMPI images and key frame sequence matching meets the requirements of ad monitoring and recognition.
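As an illustration of one cue named in contribution (4), the following is a minimal sketch of how black frames and silence might be combined to propose candidate ad boundaries. It is not the thesis's method, which also uses FMPI images and visual and audio scene changes. The function name, the per-frame inputs (mean luminance and audio energy), and all thresholds are assumptions made for this example.

```python
from typing import List, Optional, Sequence


def candidate_ad_boundaries(
    frame_luma: Sequence[float],
    audio_energy: Sequence[float],
    luma_threshold: float = 16.0,    # hypothetical: mean luma below this counts as a black frame
    energy_threshold: float = 0.01,  # hypothetical: audio energy below this counts as silence
    min_run: int = 3,                # hypothetical: minimum run length, in frames
) -> List[int]:
    """Return the center frame index of every run of at least `min_run`
    consecutive frames that are both black and silent."""
    if len(frame_luma) != len(audio_energy):
        raise ValueError("frame_luma and audio_energy must have the same length")

    boundaries: List[int] = []
    run_start: Optional[int] = None  # index where the current black-and-silent run began
    for i, (luma, energy) in enumerate(zip(frame_luma, audio_energy)):
        is_candidate = luma < luma_threshold and energy < energy_threshold
        if is_candidate and run_start is None:
            run_start = i
        elif not is_candidate and run_start is not None:
            # The run covers frames run_start .. i-1.
            if i - run_start >= min_run:
                boundaries.append((run_start + i - 1) // 2)
            run_start = None

    # Close a run that extends to the end of the sequence.
    n = len(frame_luma)
    if run_start is not None and n - run_start >= min_run:
        boundaries.append((run_start + n - 1) // 2)
    return boundaries
```

In a fuller system, candidates like these would be only one input to a boundary detector and would be fused with the other cues the abstract lists.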
Shelf Number: XWLW1276
Other Identifier: 200418014628032
Language: Chinese
Document Type: Thesis
Identifier: http://ir.ia.ac.cn/handle/173211/6119
Collection: Graduates_Doctoral Dissertations (毕业生_博士学位论文)
Recommended Citation
GB/T 7714
王金桥. 广播视频的结构分析和语义检索[D]. 中国科学院自动化研究所. 中国科学院研究生院,2008.
Files in This Item:
File Name/Size: CASIA_20041801462803 (3635KB)
Access: Not yet open
License: CC BY-NC-SA
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.