Edge-Preserving Smoothing and Summarization of Visual Data (可视数据的保边平滑与总结)


	Edge-Preserving Smoothing and Summarization of Visual Data
	袁梦轲
	2019-05
Pages	146
Degree type	Doctoral
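Chinese abstract (translated)	Visual data is the type of data that carries visual information, including images, videos, and 3D models. As acquisition devices such as cameras and 3D scanners continue to develop and spread, visual data has become easier to obtain. Advances in Internet technology have likewise put visual data within easy reach. Vivid and intuitive visual data gives users an ample source of information, enriches their lives, and improves their work efficiency. At the same time, effective acquisition, efficient computation, and content analysis have become key problems in visual data computing. First, acquiring and transmitting visual data often introduces noise, which lowers data quality. Second, the resolution of images, videos, meshes, and other visual data keeps increasing, yet users usually want to process visual data efficiently in real time or interactively. Finally, viewed at the level of a whole visual data collection, the elements lack organization and carry redundant information, which makes it hard for users to grasp the overall content of the collection and to retrieve usable information. To extract target information faster and better from large, unstructured visual data, users' demand for efficient visual data processing and analysis techniques has become more pressing. For an individual visual data element, edge-preserving smoothing separates edges and sharp features from noise and details, and designing fast filters with excellent edge-preserving ability helps achieve efficient, high-quality visual data processing. For a whole visual data collection, organizing its elements effectively and presenting diverse, representative elements on demand as a summary of the collection helps users obtain target information and deepens their understanding of the collection. This thesis studies three problems for images and 3D meshes: edge-preserving filter acceleration, edge-preserving filter design, and visual data collection summarization. The main contributions are as follows: (1) A high-accuracy, linear-complexity bilateral filter acceleration method based on weighted variable projection is proposed. Existing acceleration methods for the image bilateral filter approximate the color kernel with poor accuracy. This thesis reduces the color kernel approximation problem to a nonlinear joint optimization over the basis functions and their corresponding coefficients, takes the color distribution of the image into account in the optimization objective, and uses weighted variable projection to obtain a high-accuracy color kernel approximation and thus a linear-complexity acceleration algorithm. Experiments verify that weighted variable projection yields a better color kernel approximation and therefore more accurate bilateral filtering results in less time. (2) A space-variant bilateral filter with better edge-preserving ability is proposed, together with an error-controlled, linear-complexity acceleration method. The bilateral filter uses a fixed weighted-averaging window, which leads to unsatisfactory edge preservation and causes "gradient reversal" artifacts in image enhancement. This thesis proposes a new space-variant bilateral filter that better preserves edge structures of different scales and orientations during image smoothing. Two error-controlled approximation methods are used to obtain an accurate linear-complexity acceleration method. Numerical experiments and three applications (image denoising, image enhancement, and image focus editing) demonstrate the excellent edge-preserving performance of the space-variant bilateral filter and the advantages of the proposed acceleration method in filtering accuracy and running speed. (3) A fast guided 3D mesh filtering method based on a minimum spanning tree and multi-output linear ridge regression is proposed. Existing two-stage edge-preserving mesh filtering methods use a small, fixed local neighborhood when filtering normals, which leads to poor edge-preserving smoothing results. This thesis proposes a fast guided 3D mesh filtering algorithm that constructs a minimum spanning tree from face centroid and normal information, obtains an implicit neighborhood aware of global structure from tree similarity weights, estimates normals accurately with a multi-output linear ridge regression model, and achieves linear-time guided normal filtering with the help of a fast minimum spanning tree aggregation algorithm. Experimental results show that the method outperforms the best current results in denoising quality and running time. (4) A user-definable method for summarizing visual data collections is proposed. In computer graphics, visual data summarization is a common tool for visualizing algorithm results, but it has rarely been discussed comprehensively or in a targeted way. This thesis reduces visual data collection summarization to a constrained integer programming problem that balances element diversity, user preference, and element attribute information when selecting representative elements, and proposes a solver based on $\ell_2$-box ADMM that quickly produces high-quality summaries. An interactive interface is also designed to help obtain summaries that satisfy user constraints. Comparisons with existing related methods verify the advantages of the proposed method in speed and quality.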
English abstract	Visual data is the type of data that carries visual information, including images, videos, and geometric models. With the development and popularization of visual data acquisition equipment, such as digital cameras and 3D scanners, users can obtain images, videos, 3D models, and other visual data more easily. Advances in Internet technology have also made more visual data accessible. Vivid and intuitive visual data provides users with ample information, enriching their lives and improving their work efficiency. At the same time, effective acquisition, efficient computing, and content analysis have become the key issues in visual data computing. First, the acquisition and transmission of visual data often introduce noise, which degrades data quality. Second, the resolution of visual data, such as images, videos, and 3D meshes, keeps increasing, which makes it harder for users to process visual data in real time. Finally, at the level of the whole visual data collection, the lack of organization among data elements and the redundancy of information make it difficult for users to understand the content of the collection and to retrieve valuable information. Effective and efficient visual data processing and analysis technologies are therefore in high demand for extracting the desired information from large, unstructured visual data. For individual visual data elements, edge-preserving smoothing can separate edges and sharp structures in images and meshes from noise and details. Designing fast filters with excellent edge-preserving ability helps achieve high-quality and efficient visual data processing. For the whole visual data collection, organizing the elements well and displaying diverse, preferred, and representative elements as a summary of the collection makes it convenient for users to obtain the desired information and understand the collection.
To process and analyze visual data effectively and efficiently, three problems are studied in this thesis: accelerating and designing edge-preserving filters for images and meshes, and summarizing the visual data collection according to user preference. The main contributions are as follows: (1) An accurate bilateral filter acceleration method that uses weighted variable projection to achieve linear time complexity is proposed. Previous bilateral filter acceleration methods approximate the bilateral filtering results with poor accuracy. In the thesis, the range kernel approximation problem is formulated as a nonlinear joint optimization over the basis functions and their corresponding coefficients. The color distribution of the image is also taken into account in the optimization objective. The weighted variable projection technique is used to solve this problem, yielding a high-precision approximation and a linear-time acceleration algorithm. Experiments demonstrate that the proposed weighted variable projection method obtains accurate range kernel approximations and therefore produces more accurate filtering results efficiently. (2) A space-variant bilateral filter with better edge-preserving ability and its error-bounded linear-time acceleration method are proposed. The traditional space-invariant isotropic kernel used by the bilateral filter frequently leads to blurry edges and "gradient reversal" artifacts in image enhancement. The thesis presents a space-variant bilateral filter that preserves edge structures of different scales and directions in image smoothing. Two error-bounded approximation methods are also used to obtain an accurate linear-time acceleration method for the space-variant bilateral filter. The advantages of the proposed filter are validated in three applications: image denoising, image enhancement, and image focus editing.
Experimental results demonstrate that our fast and error-bounded space-variant bilateral filter is superior to state-of-the-art methods. (3) A fast guided mesh filtering method based on a minimum spanning tree and multi-output linear ridge regression is proposed. To reduce the computational cost, existing two-stage mesh filtering methods employ a small, fixed local neighborhood when filtering normals, which blurs edges and shape features in mesh smoothing. The proposed fast guided mesh filter constructs a minimum spanning tree from face centroids and normals to obtain a feature-aware implicit neighborhood defined by tree similarity weights, and uses multi-output linear ridge regression to accurately estimate the uncontaminated normals. With the help of a fast minimum spanning tree aggregation method, linear-time guided normal filtering is achieved. Experimental results show that the proposed method is superior to state-of-the-art methods in mesh denoising. (4) A customized summarization method for visual data collections is proposed. In the field of computer graphics, visual data collection summarization is a commonly used tool for visualizing algorithm output. While selecting samples for visual exploration is a component of many existing shape-space exploration systems, it has not been systematically explored. Customized summarization of a visual data collection is formulated as an integer programming problem with user constraints that takes into account element diversity, user preference, and element attribute information. A solver based on the $\ell_2$-box ADMM method is proposed to efficiently obtain high-quality summaries. A user interface is also designed to facilitate the generation of the desired summary. Experiments verify the superiority of the proposed method in speed and quality compared with state-of-the-art summarization methods.
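For readers unfamiliar with the filter that contributions (1) and (2) build on, the following is a minimal brute-force sketch of the classical bilateral filter in Python with NumPy. It is not the thesis's accelerated method: it is the direct definition, whose cost grows with the window area, which is the cost that linear-time acceleration methods aim to avoid. The function name `bilateral_filter` and the parameters `sigma_s`, `sigma_r`, and `radius` are illustrative assumptions, not names taken from the thesis.

```python
import numpy as np

def bilateral_filter(img, sigma_s=2.0, sigma_r=0.1, radius=None):
    """Brute-force bilateral filter for a 2-D grayscale image (illustrative sketch).

    sigma_s: standard deviation of the spatial Gaussian kernel (pixels).
    sigma_r: standard deviation of the range (intensity) Gaussian kernel.
    radius:  half-width of the square window; defaults to ceil(3 * sigma_s).
    """
    img = np.asarray(img, dtype=np.float64)
    if radius is None:
        radius = int(np.ceil(3 * sigma_s))
    h, w = img.shape
    # Replicate border pixels so every window is fully defined.
    padded = np.pad(img, radius, mode="edge")
    num = np.zeros_like(img)  # weighted sum of neighbor intensities
    den = np.zeros_like(img)  # sum of weights (normalizer)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbor image shifted by (dy, dx), aligned with img.
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            # Spatial weight depends only on the pixel offset.
            spatial = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            # Range weight depends on the intensity difference,
            # which is what keeps averaging from crossing strong edges.
            rng = np.exp(-((shifted - img) ** 2) / (2.0 * sigma_r ** 2))
            weight = spatial * rng
            num += weight * shifted
            den += weight
    # den >= 1 everywhere because the center pixel always has weight 1.
    return num / den
```

The range weight is the term that couples every pixel to its own intensity and prevents a single fast convolution; approximating that kernel with a small set of basis functions is the general idea behind the linear-time methods the abstract refers to.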
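Keywords	edge-preserving filter acceleration; edge-preserving filter design; image denoising; mesh denoising; visual data collection summarization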
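Language	Chinese
Seven major research directions: sub-direction classification	Computer Graphics and Virtual Reality
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/23941
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	袁梦轲. 可视数据的保边平滑与总结[D]. 智能化大厦3层. 中国科学院自动化研究所,2019.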

Files in this item
File name/size	Document type	Version type	Access type	License
Thesis.pdf (197791 KB)	Thesis		Restricted access	CC BY-NC-SA