Motion Analysis for Video Coding

	Motion Analysis for Video Coding
Alternative title	Video analysis for video compression
	Wang Haifeng (王海峰)
	2007-05-09
Degree type	Doctor of Engineering
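Abstract (translated from Chinese)	Visual information is the main source from which humans acquire information. Video coding plays a vital role in storing and transmitting video information. Its main function is to remove the information redundancy in image sequences, and this redundancy appears mainly as correlation between consecutive frames. Exploiting inter-frame correlation more deeply is therefore important for raising the compression ratio, and this is done mainly through motion estimation. The H.264 standard uses a motion estimation algorithm based on variable-size block matching, which divides the image into blocks of different sizes and computes one motion vector for each block. The global-motion-compensation-based coding mode of MPEG-4 divides the scene into background and objects and describes the whole background region with a single motion model. This thesis focuses on frame-to-frame motion estimation, mainly the motion estimation algorithm of H.264 and the global motion estimation algorithm of MPEG-4. The main work and contributions are: (1) For the Variable-Size Block Matching algorithm in H.264, we propose an improved bottom-up, high-precision variable-size block matching algorithm. It has three advantages. First, no threshold needs to be computed: for each small block we keep all motion vectors that attain the minimum matching error and use them as the block's candidate vectors. Second, a Macro-Mode Prediction method is proposed. Third, because deciding the subsequent merging process from candidate vectors is easily affected by illumination change, a post-processing method that removes the influence of Illumination Change is proposed. (2) For the global-motion-compensation-based coding mode in MPEG-4, we propose a global motion estimation algorithm based on Progressive Model Refinement (PMR). The algorithm first uses a Static Background Detection method to detect the case where the background is stationary. The main feature of the PMR-based algorithm is that it adaptively selects the motion model according to the complexity of the camera motion. The algorithm is built on a three-layer pyramid, and the adaptive model selection takes place between layers. The resulting model is corrected at the bottom layer. To speed up the algorithm, we introduce two mechanisms: first, a sparse feature point selection method based on outlier block prediction; second, an intermediate-layer model prediction method that speeds up model selection and computation. (3) Global motion estimation with a single model is suitable only when depth varies little across the scene. For scenes with large depth differences, we propose a Layered Global Motion Estimation algorithm. Scene regions at the same depth share the same global motion, so the algorithm describes regions at different depths with different global motion models. After global motion compensation, the regions that cannot be described by any layer form the object region, which includes objects and occlusion regions.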
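The candidate-vector idea in contribution (1) can be illustrated with a small sketch. This is not the thesis's implementation: the SAD cost, the exhaustive search window, the 4x4 block size, and the function names `candidate_vectors` and `shared_vectors` are all assumptions made for illustration. It only shows how keeping every motion vector that ties for the minimum matching error avoids choosing a threshold.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and the
    block of ref displaced by (dx, dy). SAD is an assumed cost."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def candidate_vectors(cur, ref, bx, by, size=4, search=2):
    """Return (min_cost, sorted list of every (dx, dy) that attains it)."""
    h, w = len(ref), len(ref[0])
    best, cands = None, []
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would leave the reference frame.
            if not (0 <= by + dy and by + dy + size <= h
                    and 0 <= bx + dx and bx + dx + size <= w):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best:
                best, cands = cost, [(dx, dy)]   # new minimum: restart the set
            elif cost == best:
                cands.append((dx, dy))           # tie: keep it as a candidate
    return best, sorted(cands)


def shared_vectors(cands_a, cands_b):
    """Hypothetical merge test: vectors common to two neighboring blocks.
    A non-empty result means the blocks could share one motion vector."""
    return sorted(set(cands_a) & set(cands_b))
```

A textured block typically yields a single candidate, while a flat block yields many, which is what makes a later bottom-up merge of neighboring blocks possible without a tuned threshold.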
English abstract	Most of the information humans obtain is visual. Video compression plays an important role in the storage and transmission of video information. Its major task is to remove information redundancy, which appears mainly as correlation between neighboring frames. Research on inter-frame correlation is therefore important for improving the compression ratio, and this task is mainly performed by motion estimation. In H.264, the Variable-Size Block Matching (VSBM) based motion estimation algorithm represents the image with blocks of different sizes and calculates a motion vector for each block. The coding method based on global motion compensation in MPEG-4 segments the scene into background and objects and uses one motion model to describe the whole background region. This thesis focuses on motion information between frames. The motion estimation algorithm in H.264 and the global motion estimation algorithm in MPEG-4 are discussed in detail. The main contributions of this thesis are: (1) For the VSBM algorithm in H.264, an improved "bottom-up" high-precision VSBM algorithm is proposed, with three major improvements. First, no threshold is needed beforehand: for each small block, we keep all the motion vectors that result in the minimum residual and store them in its candidate set. Second, a Macro-Mode Prediction (MMP) method is proposed. Third, because merging according to candidate motion vectors suffers considerably from illumination change, an Illumination Change Removal (ICR) method is proposed. (2) For the global motion compensation based coding method in MPEG-4, a Progressive Model Refinement based global motion estimation (PMRGME) algorithm is proposed. A Static Background Detection (SBD) method is used to detect a stationary background. The major advantage of PMRGME is that it adaptively selects an appropriate motion model according to the complexity of the camera motion. PMRGME is built on a three-layer pyramid structure, and the adaptive selection of models happens between layers. A model rectification step at the bottom layer is used to check the effectiveness of the final model. Two schemes are used to accelerate the algorithm. First, a sparse feature point selection method based on outlier block prediction is proposed. Second, an intermediate-level model prediction method is proposed to improve the speed of model selection and calculation. (3) GME algorithms based on a single motion model are suitable only for scenes with little depth variation. A Layered Global Motion Estimation (LGME) algorithm is proposed to deal with scenes with large depth variation. Scene regions at the same depth share the same global motion, and the proposed LGME algorithm uses different global models to describe regions at different depths. After global motion compensation, the regions not supported by any layer are the object regions, which include objects and occlusion regions.
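The layering idea in contribution (3) can be sketched in a few lines. This is a simplified illustration, not the LGME algorithm itself: it assumes a translation-only global model estimated as the per-component median of block motion vectors, and the names `global_translation` and `layered_models` as well as the `tol` and `min_support` parameters are hypothetical. The thesis works with richer, adaptively selected motion models.

```python
from statistics import median


def global_translation(vectors):
    """Translation-only global motion, taken as the per-component median
    of the block motion vectors (an assumed, outlier-tolerant estimator)."""
    return (median(v[0] for v in vectors), median(v[1] for v in vectors))


def layered_models(vectors, max_layers=2, tol=1.0, min_support=3):
    """Fit one global model per layer; return (layers, unsupported).

    layers      -- list of (model, indices of blocks the model supports)
    unsupported -- indices of blocks that no layer describes; these
                   stand in for the object and occlusion regions
    tol and min_support are hypothetical tuning parameters.
    """
    remaining = list(range(len(vectors)))
    layers = []
    while remaining and len(layers) < max_layers:
        model = global_translation([vectors[i] for i in remaining])
        supported = [i for i in remaining
                     if abs(vectors[i][0] - model[0]) <= tol
                     and abs(vectors[i][1] - model[1]) <= tol]
        if len(supported) < min_support:
            break  # too few blocks agree: stop adding layers
        layers.append((model, supported))
        remaining = [i for i in remaining if i not in supported]
    return layers, remaining
```

Each pass explains the largest remaining group of blocks with one model and hands the rest to the next layer, so blocks left over at the end correspond to regions that no layer describes.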
Keywords	Motion Estimation; Variable-Size Block Matching Algorithm; Global Motion Estimation; Layered Global Motion Estimation; Object Segmentation
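Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/5963
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Wang Haifeng. Motion Analysis for Video Coding [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences, 2007.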

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20031801460302 (8183KB)			Not currently open	CC BY-NC-SA