Research on Structured Analysis and Summarization Techniques for Television Programs

	Research on Structured Analysis and Summarization Techniques for Television Programs
Alternative title	Research on Television Programmes Structured Analysis and Abstract
	徐夙
	2013-05-30
Degree type	Doctor of Engineering
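Abstract (translated from the Chinese)	Structuring and summarizing television programs is a central problem in multimedia content analysis, with broad practical and commercial value for browsing and retrieving video data. Although years of research have brought considerable progress, many problems remain before a general-purpose system for structuring and summarizing television programs can be built. This thesis takes television program structuring and summarization as its subject. It designs a general structuring framework that segments different types of programs into logical units and, building on those logical units, designs still-image abstract and visualization methods for different program types. The main work and contributions are as follows: 1. The proposed shot detection algorithm uses the UniformLBP feature as the basic image descriptor. Compared with other features, it is more sensitive to gradual transitions between shots while remaining comparably stable under motion within a shot. The dissimilarity measure is built on a graph model, which emphasizes the differences between images while reducing the effect of anomalous disturbances. Finally, an SVM classifier is used to classify the shots. 2. Based on an analysis of the semantic structure of logical units, the thesis proposes a general logical unit segmentation framework. By defining four shot types, it converts logical unit segmentation into a label identification problem. This brings the two kinds of segmentation problem that arise in logical unit segmentation into a single framework and makes the algorithm easier to extend to different types of programs. To identify the labels of consecutive shots, the thesis introduces conditional random fields and selects four semantic features: the shot difference signal, the scene transition graph, the theme shot, and the audio type. Because a conditional random field takes contextual information into account during label estimation, such as the transition probabilities between labels and the statistics of the training data, it can effectively improve the accuracy of label estimation. 3. Building on the segments produced by logical unit segmentation, the thesis uses semantic information such as shot clustering, theme shots, and camera motion direction to design still-image abstract algorithms for three types of television programs: films and TV dramas, news programs, and documentaries. A still-image abstract at the logical unit level is more concise than one at the shot level and more representative than one at the video segment level, which makes it suitable for previewing video content. On top of the still-image abstracts, comic-style storyboard methods are designed according to the characteristics of each of the three program types to present the abstracts, which makes browsing more engaging.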
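Contribution 1 relies on a uniform LBP image descriptor. The following is a minimal sketch of such a descriptor, assuming the common 8-neighbour, radius-1 variant with 59 histogram bins; this record does not state which variant the thesis uses. The plain L1 histogram distance is only a stand-in for the graph-model dissimilarity and SVM classifier described in the abstract, and all names in the code are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities

# Offsets (dy, dx) of the 8 neighbours, in circular order around the centre.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def transitions(code: int) -> int:
    """Count 0/1 transitions in the circular 8-bit pattern."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))


# A pattern is "uniform" if it has at most two transitions (58 such codes).
# Each uniform code gets its own bin; all other codes share one extra bin.
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}
NUM_BINS = len(UNIFORM_CODES) + 1


def uniform_lbp_histogram(img: Image) -> List[float]:
    """Normalised uniform-LBP histogram over the interior pixels of img."""
    hist = [0] * NUM_BINS
    height, width = len(img), len(img[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            centre = img[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                if img[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[BIN_OF.get(code, NUM_BINS - 1)] += 1
    total = sum(hist) or 1  # guard against images with no interior pixels
    return [v / total for v in hist]


def histogram_distance(a: List[float], b: List[float]) -> float:
    """L1 distance between two normalised histograms (assumed stand-in measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

In a shot detector along these lines, the distance between histograms of neighbouring frames would form the difference signal that a classifier then inspects for boundaries.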
English abstract	Structuring and summarizing television programs are two fundamental problems in multimedia content analysis. They are of great value to a broad range of applications, including content-based retrieval and non-linear browsing. Although many years of research have brought great progress, many open problems remain before a general method is available. This thesis studies methods for structuring and summarizing television programs. In contrast to previous approaches that handle scenes and topic units separately, the proposed method treats them within one general framework. Based on the resulting logical units, new approaches to still-image abstracts and their visualization are presented for different genres of TV program. The main contributions of this thesis are as follows: 1. The UniformLBP feature is employed to describe the basic information of images. This feature is more sensitive to gradual transitions than other features, while being as robust as other features within a shot. A graph model is used to calculate the difference between images in a sequence, which emphasizes the difference signal and suppresses the influence of local disturbances. Finally, shot boundaries are identified by an SVM classifier. 2. Based on the structural characteristics of logical units, a general framework for logical unit segmentation is presented. By defining four types of shots, the problem of logical unit segmentation is formulated as the problem of identifying the type of each shot. Such a framework is easily applicable to different genres of TV program. To identify the labels, a conditional random field is employed, and each shot is represented by four semantic features: shot difference, scene transition, shot theme, and audio type. Because the conditional random field makes full use of the context information of neighboring shots, it delivers significantly more accurate results than previous methods. 3. A still-image abstract method is presented, based on the result of logical unit segmentation. Using information from shot clustering, shot theme, and camera motion, the method is designed for movies and TV series, news, and documentaries. These abstracts are appropriate for non-linear browsing. Moreover, a comic-like layout method is employed to show the still-image abstracts according to the production rule of movie and TV-serie...
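Contribution 2 casts logical unit segmentation as labelling a sequence of shots with a conditional random field. The sketch below illustrates only the decoding step of a linear-chain model (Viterbi search over per-shot and transition scores). The label names `T0` to `T3` are hypothetical placeholders for the four shot types, which this record does not name, and the scores are assumed to come from a model trained elsewhere; the thesis's actual features and training procedure are not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholders for the four shot types defined in the thesis.
LABELS = ["T0", "T1", "T2", "T3"]


def viterbi(
    unary: List[Dict[str, float]],
    pairwise: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring label sequence for a chain of shots.

    unary[t][y]        score of label y for shot t (missing labels score 0.0)
    pairwise[(p, y)]   score of moving from label p to label y (missing pairs 0.0)
    """
    if not unary:
        return []
    # best[t][y]: score of the best path that ends in label y at shot t.
    best = [{y: unary[0].get(y, 0.0) for y in LABELS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(unary)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            # Best previous label given that shot t takes label y.
            prev = max(LABELS, key=lambda p: best[-1][p] + pairwise.get((p, y), 0.0))
            scores[y] = best[-1][prev] + pairwise.get((prev, y), 0.0) + unary[t].get(y, 0.0)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Shot 1 slightly prefers T2 on its own, but transition scores favour T1.
unary_scores = [{"T0": 2.0}, {"T1": 1.0, "T2": 1.1}, {"T3": 2.0}]
transition_scores = {("T0", "T1"): 1.0, ("T1", "T3"): 1.0}
decoded = viterbi(unary_scores, transition_scores)
```

The usage lines show the role of context: a per-shot decision would pick `T2` for the middle shot, whereas decoding over the whole chain can prefer `T1` because of the transition scores on either side.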
Keywords	Logical Unit Segmentation; Conditional Random Field; Still-image Abstract; Still-image Visualization
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/6541
Collection	Graduates_Doctoral dissertations
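Recommended citation (GB/T 7714)	徐夙. Research on Structured Analysis and Summarization Techniques for Television Programs [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2013.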

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20091801462806 (13971KB)			Restricted access	CC BY-NC-SA