Acquisition and Display of Computational Light Fields

	Acquisition and Display of Computational Light Fields (计算光场的采集与显示)
	曹煊 (Cao Xuan)
	2017-05-24
Degree type	Doctor of Engineering
Abstract	In recent years, with the development of 3D movies and virtual reality, light field technology has attracted considerable attention from researchers. A light field is the set of light rays in space; capturing and displaying a light field visually reproduces the three-dimensional world. However, a light field is high-dimensional data, so its acquisition and display are more challenging than for conventional 2D images. The main subject of this thesis is how to acquire and display light fields at low cost and with high quality. We establish a unified theoretical framework based on the 4D light field model, study the system principles of light field acquisition and display within it, propose novel algorithms and systems, and build physical prototypes for verification.
For acquisition, to capture high-quality light fields at lower hardware cost, we propose a sparse camera array with clear advantages: it requires no optical modification of the cameras, offers a higher signal-to-noise ratio, needs far fewer cameras, and lowers hardware cost. The method first uses sparse coding to learn an overcomplete light field dictionary and maps the original light field to sparse representation coefficients in the dictionary space. Because these coefficients are sparse, we use compressed sensing to reconstruct all views of the light field. During reconstruction, some light field pixel blocks are recovered with poor quality; we call these the "disaster areas" of the reconstruction. We find that the PSNR of the reconstructed samples and the sparsity of the representation coefficients predict the PSNR of the reconstructed light field, so disaster areas can be located accurately in advance. Processing them specifically both improves reconstruction quality and reduces reconstruction time. We built two sparse camera arrays, with five and nine cameras, and demonstrated the feasibility of the algorithm and system on several light field data sets.
For display, we first built a light field display prototype based on multi-layer LCD panels, which serves as the physical platform for testing our algorithms. Modeling the light field as a tensor, we solve for the values of all pixels on the LCD layers by non-negative tensor factorization. We design an iteratively updated light field decomposition algorithm that uses less memory and parallelizes well. A multi-layer LCD is essentially a pixel-multiplexing scheme: each pixel modulates several light rays and cannot satisfy all of them at once, which degrades display quality. We define the concept of a "load factor", analyze the load of a multi-layer LCD intuitively by setting up an over-determined system of equations, and propose a "multi-zone, multi-layer joint optimization" strategy for decomposing the light field. This strategy lowers the load factor, improves display quality without adding LCD layers, and yields higher display brightness. Besides display quality, we also study how to speed up light field decomposition. We exploit the intra-frame redundancy (spatial similarity) of light field video to build a light field resolution pyramid that supplies optimized initial values for single-frame decomposition, and we exploit inter-frame redundancy (temporal similarity) to supply optimized initial values for decomposing light field video. This reduces the number of iterations, accelerates convergence, and greatly improves computational efficiency. In light field video decomposition experiments, our algorithm is 5.9 times faster than the non-optimized algorithm.
Finally, we summarize the work of the thesis and discuss future trends in light field technology. Building on the completed research, we outline future work in several directions, including combining light field acquisition with deep learning, eliminating moiré fringes in multi-layer LCDs, and calibrating multi-layer LCDs.
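As a rough illustration of the factorization idea described in the abstract, the sketch below handles only a simplified two-layer case, in which the light field is a non-negative matrix and each time-multiplexed frame contributes one rank-1 term. It uses the standard Lee-Seung multiplicative updates, which is an assumption on our part and not necessarily the update rule used in the thesis. The names `factorize_two_layer` and `frobenius_error`, the synthetic data, and all parameter values are hypothetical; clamping to the physical transmittance range and frame averaging are omitted.

```python
import numpy as np


def frobenius_error(target, front, rear):
    """Squared Frobenius error between the target and the emitted light field."""
    return float(np.sum((target - front @ rear.T) ** 2))


def factorize_two_layer(target, frames, iters=200, init=None, seed=0, eps=1e-9):
    """Approximate target[i, j] by sum_t front[i, t] * rear[j, t].

    target : non-negative (n_front, n_rear) array; entry (i, j) is the
             radiance of the ray through front pixel i and rear pixel j.
    frames : number of time-multiplexed frames (the factorization rank).
    init   : optional (front, rear) pair used as a warm start.
    """
    if init is None:
        rng = np.random.default_rng(seed)
        front = rng.uniform(0.1, 1.0, size=(target.shape[0], frames))
        rear = rng.uniform(0.1, 1.0, size=(target.shape[1], frames))
    else:
        # Copy so the caller's arrays are not modified in place.
        front, rear = (np.array(m, dtype=float) for m in init)

    for _ in range(iters):
        # Multiplicative updates keep both factors non-negative and do not
        # increase the squared error.
        front *= (target @ rear) / (front @ (rear.T @ rear) + eps)
        rear *= (target.T @ front) / (rear @ (front.T @ front) + eps)
    return front, rear


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic non-negative data of exact rank 3, for illustration only.
    target = rng.uniform(size=(16, 3)) @ rng.uniform(size=(3, 12))
    front, rear = factorize_two_layer(target, frames=3)
    print(frobenius_error(target, front, rear))
```

The `init` argument is where a warm start would enter: passing the factors from a coarser pyramid level or from the previous video frame, instead of random values, corresponds loosely to the initialization strategy the abstract describes.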
Keywords	light field; glasses-free 3D display; computational photography; multi-layer LCD; compressed sensing
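Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14675
Collection	Graduates_Doctoral dissertations
Author affiliation	University of Chinese Academy of Sciences
Recommended citation (GB/T 7714)	曹煊 (Cao Xuan). 计算光场的采集与显示 [Acquisition and Display of Computational Light Fields][D]. Beijing: Graduate University of Chinese Academy of Sciences, 2017.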

Files in this item
File name/size	Document type	Version type	Access type	License
计算光场的采集与显示.pdf (9716 KB)	Thesis		Restricted access	CC BY-NC-SA