Research on Image and Video Mosaicing Algorithm (图像与视频拼接算法研究)


	图像与视频拼接算法研究
Alternative title	Research on Image and Video Mosaicing Algorithm
Author	苗立刚 (Miao Ligang)
	2007-06-05
Degree type	Doctor of Engineering
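Chinese abstract	Automatically creating panoramic images with image mosaicing is an active research direction in photogrammetry, computer vision, image processing, and computer graphics. Image mosaicing combines multiple partially overlapping images into one large, high-resolution image; it overcomes the field-of-view limit of ordinary imaging devices and yields higher resolution and a wider field of view than a single image. Because image mosaicing can conveniently create panoramas of almost arbitrary resolution, it is now widely used in aerial photography, virtual reality, video compression, video retrieval, and video surveillance. Mosaicing algorithms differ markedly across application domains. This dissertation studies image mosaicing for three different scenarios: mosaicing of multilayer microscope images, mosaicing of document images captured with a hand-held camera, and image mosaicing in video surveillance. First, for the large-scale microscope images acquired in integrated-circuit reverse analysis, an automatic mosaicing algorithm for multilayer microscope images is proposed. For the image acquisition stage, a microscope autofocusing algorithm based on linear prediction is proposed. To eliminate error accumulation in mosaicing, the dissertation represents the spatial overlap relations of the multilayer microscope images with a 3D grid graph and builds a global alignment model for the mosaic. It also studies the effect of image mismatches on mosaicing and proposes a minimum-cycle consistency method to eliminate the alignment errors that mismatches cause. Second, a mosaicing algorithm for document images captured with a hand-held camera is proposed; it places few restrictions on camera pose and motion and does not require the intrinsic or extrinsic camera parameters to be calibrated in advance. The dissertation corrects radial lens distortion using feature point correspondences between two document images, then estimates the vanishing point coordinates with mathematical morphology and proposes a hierarchical method for correcting perspective distortion; finally, it builds a global alignment model for document image mosaicing from the feature point correspondences of all overlapping images. A nonlinear optimization algorithm solves for the global alignment parameters of all images simultaneously, which effectively eliminates error accumulation. Finally, video surveillance commonly uses multiple static cameras or pan-tilt (PTZ) cameras to monitor areas with a large field of view. For the multi-camera case, the dissertation models the background of each video sequence with a Gaussian mixture model and computes the geometric relations between cameras from the background images, which removes the influence of multimodal backgrounds and moving objects. For the case of a pan-tilt camera following a target, a key-frame-based video mosaicing method is proposed. Key frames are selected according to the size of the overlap region and the richness of texture information, and every frame is aligned to the nearest preceding key frame. Because key frames generally have high alignment accuracy with respect to one another, the method can accurately reconstruct the background model of the scene through which the moving object passes.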
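The loop-consistency idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes a single-layer 2D grid of tiles and a translation-only alignment model, and the names `loop_residual`, `inconsistent_loops`, and the tolerance `tol` are hypothetical. The point it shows is that pairwise offsets summed around a closed loop of tiles should cancel, so a loop with a large residual contains at least one mismatched pair.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]        # (row, col) of a tile in the grid (assumed layout)
Offset = Tuple[float, float]  # measured translation (dx, dy) from one tile to the next


def loop_residual(offsets: Dict[Tuple[Node, Node], Offset],
                  loop: List[Node]) -> Offset:
    """Sum the measured offsets around a closed loop of tiles."""
    sx = sy = 0.0
    for a, b in zip(loop, loop[1:] + loop[:1]):
        if (a, b) in offsets:
            dx, dy = offsets[(a, b)]
        else:
            # Edge stored in the opposite direction: negate it.
            dx, dy = offsets[(b, a)]
            dx, dy = -dx, -dy
        sx += dx
        sy += dy
    return sx, sy


def inconsistent_loops(offsets: Dict[Tuple[Node, Node], Offset],
                       rows: int, cols: int,
                       tol: float = 1.0) -> List[List[Node]]:
    """Return the 4-tile loops whose residual exceeds tol (hypothetical threshold)."""
    bad = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            loop = [(r, c), (r, c + 1), (r + 1, c + 1), (r + 1, c)]
            sx, sy = loop_residual(offsets, loop)
            if max(abs(sx), abs(sy)) > tol:
                bad.append(loop)
    return bad


# Synthetic 2 x 3 grid: tiles 100 px apart horizontally, 80 px vertically.
offsets: Dict[Tuple[Node, Node], Offset] = {}
for r in range(2):
    for c in range(3):
        if c + 1 < 3:
            offsets[((r, c), (r, c + 1))] = (100.0, 0.0)
        if r + 1 < 2:
            offsets[((r, c), (r + 1, c))] = (0.0, 80.0)
offsets[((0, 2), (1, 2))] = (0.0, 95.0)  # simulated mismatch on one pair

print(inconsistent_loops(offsets, rows=2, cols=3))
```

In this synthetic example only the loop that contains the corrupted pair is flagged; an interior mismatch would flag both loops that share the bad edge, which is what lets the edge be localized.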
English abstract	The construction of high-quality panoramas by image mosaicing is an active area of research in photogrammetry, computer vision, image processing, and computer graphics. To obtain a broader view of a scene than a single view provides, multiple overlapping images are captured and registered into a large, high-resolution panoramic mosaic. Image mosaicing can create panoramas of almost arbitrary resolution. Mosaicing methods have distinct characteristics in different fields. This dissertation investigates mosaicing algorithms for three kinds of applications: multilayer microscopic image mosaicing, document image mosaicing for hand-held cameras, and video mosaicing for surveillance. First, IC reverse analysis requires scanning large-scale microscopic images of the multilayer interconnection structure. A fast autofocusing algorithm based on prediction is proposed for microscopic image acquisition. A 3D grid graph represents the neighborhood relations of overlapping images, and a global image alignment model is constructed for the multilayer image mosaics to obtain an accurate 3D representation of the multilayer structure. Based on the global model, a minimum cycle method is proposed to eliminate large alignment errors caused by image mismatches. Second, an image mosaicing method is proposed for document images captured with a hand-held camera. It does not need to restrict the camera position or calibrate the intrinsic/extrinsic camera parameters in advance, and it allows greater flexibility than approaches using scanners or fixed cameras. The dissertation corrects lens distortion using two-view point correspondences. It then estimates the vanishing points and shear angle by mathematical morphology and proposes a hierarchical approach to perspective rectification. It uses the feature correspondences of all overlapping image pairs to construct a global alignment model, obtains globally consistent alignment parameters for all images using nonlinear optimization, and thereby eliminates error accumulation effectively. Finally, multiple static cameras or PTZ cameras are often used to monitor activity over a wide area in video surveillance systems. The dissertation uses mixtures of Gaussians to model the background of each video and then computes the homography from the most probable background images. This approach avoids alignment errors caused by moving objects and multimodal backgrounds. The dissertation also proposes a key-frame-based video mosaicing approach for PTZ cameras. Key frames are selected based on the amount of overlap and the abundance of texture information, and all frames are matched to their latest neighboring key frames. Because key frames usually have very high alignment accuracy, the approach can create an accurate background model of the scenes that moving objects have traversed.
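The key-frame selection described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's method: frames are assumed to differ from the current key frame by a pure translation, each frame is assumed to carry a precomputed texture score in [0, 1], and the names `overlap_ratio`, `select_key_frames`, `min_overlap`, and `min_texture` are hypothetical. A frame is promoted to key frame once its overlap with the current key frame falls below a threshold, provided it has enough texture to be matched reliably.

```python
from typing import List, Tuple


def overlap_ratio(offset: Tuple[float, float], width: float, height: float) -> float:
    """Fraction of the frame area shared by two frames related by a translation."""
    dx, dy = offset
    ox = max(0.0, width - abs(dx))
    oy = max(0.0, height - abs(dy))
    return (ox * oy) / (width * height)


def select_key_frames(positions: List[Tuple[float, float]],
                      textures: List[float],
                      width: float, height: float,
                      min_overlap: float = 0.5,
                      min_texture: float = 0.2) -> List[int]:
    """Return indices of key frames; frame 0 is always a key frame.

    positions[i] is the (assumed known) position of frame i in mosaic
    coordinates; textures[i] is a hypothetical texture score for frame i.
    """
    keys = [0]
    for i in range(1, len(positions)):
        kx, ky = positions[keys[-1]]
        x, y = positions[i]
        ratio = overlap_ratio((x - kx, y - ky), width, height)
        # Promote the frame when it has drifted far enough from the
        # current key frame and is textured enough to align well.
        if ratio < min_overlap and textures[i] >= min_texture:
            keys.append(i)
    return keys


# A camera panning right by 40 px per frame over 200 x 100 px frames.
positions = [(40.0 * i, 0.0) for i in range(6)]
textures = [0.5] * 6
print(select_key_frames(positions, textures, width=200.0, height=100.0))
```

Every non-key frame would then be aligned to the most recent entry in the returned list, which keeps the chain of pairwise alignments short and limits error accumulation.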
Keywords	图像拼接; 视频拼接; 显微镜图像; 误差分析; 文档图像; 镜头失真; 透视校正; 视频监控; Image Mosaicing; Video Mosaicing; Microscopic Image; Error Analysis; Document Image; Lens Distortion; Perspective Rectification; Video Surveillance
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/5998
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	苗立刚. 图像与视频拼接算法研究[D]. 中国科学院自动化研究所. 中国科学院研究生院,2007.

Files in this item
File name/size	Document type	Version type	Access type	License
CASIA_20041801462803 (7557KB)			Not yet open	CC BY-NC-SA