Salient Object Detection in Images: Algorithms and Applications

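	Salient Object Detection in Images: Algorithms and Applications (图像显著物体检测算法与应用)
	Peng Houwen (彭厚文)
	2016-05
Degree type	Doctor of Engineering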
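Abstract	With the spread of digital cameras and smartphones and the rapid growth of social networks, images have become ever more closely tied to daily life. The rapid spread of images brings convenience, but it also poses great challenges for image processing, analysis, and understanding. In recent years, salient object detection, one of the low-level research topics in image processing, has attracted the attention of many researchers. Its goal is to detect and extract from an image the objects that attract human visual attention. It can serve as an effective preprocessing step for image compression, retrieval, editing, recognition, and many other problems, and it also has broad application prospects in human-computer interaction, visual navigation, autonomous surveillance, video analysis, augmented reality, and related fields. This thesis focuses on salient object detection algorithms and their applications. It proposes several effective detection algorithms for ordinary RGB images and for RGBD images with depth information, and it applies the idea of saliency detection to image memorability prediction, broadening the application range of salient object detection. The main work and contributions are summarized as follows.
For ordinary RGB images, two salient object detection algorithms based on matrix decomposition are proposed. The first is based on low-rank and structured-sparse matrix decomposition. It decomposes the feature matrix of an image into a low-rank matrix and a structured-sparse matrix, where the low-rank matrix represents the redundant background information and the structured-sparse matrix represents the foreground information. This decomposition separates the salient foreground from the non-salient background, so that salient objects can be detected. Compared with existing algorithms based on low-rank matrix decomposition, the method takes the contextual relations within an image into account, namely the spatial adjacency of image regions and the similarity and dissimilarity of feature patterns, and thereby achieves robust detection. The second algorithm is based on structured matrix decomposition and extends the first: it keeps the strengths of the first algorithm and overcomes its weaknesses. It considers similarity propagation within homogeneous regions and dissimilarity enhancement between different regions, which enlarges the projected distance between salient objects and background regions in feature space and makes the two easier to separate. The algorithm offers an initial solution to two cases that existing algorithms handle poorly: when the salient object looks similar to the background, and when the background is complex. In addition, the method integrates high-level perceptual priors into the decomposition through a parameter factor that guides the decomposition process, yielding more robust detection. We evaluated both algorithms on several public datasets; the experimental results show that they outperform previous low-rank matrix decomposition algorithms.
For RGBD images with depth information, two fusion-based saliency detection algorithms are proposed. The first processes the color information (RGB) and the depth information (Depth) separately: an existing salient object detection algorithm for RGB images computes saliency on the RGB image, the multi-contextual contrast method proposed in this thesis computes saliency on the depth information, and the two saliency results are fused to produce the final result. The advantage is that existing RGB detection algorithms are used directly, so previously designed algorithms can be applied to RGBD images without modification. However, because this method processes color and depth separately, it ignores their correlation and complementarity, so we designed a second algorithm. The second is a multi-stage fusion detection algorithm that jointly uses low-level region contrast, mid-level region similarity aggregation, and high-level prior knowledge enhancement. At the low level, the method uses color, position, depth, and other information to compute the dissimilarity between regions and obtains an initial region saliency estimate. Next, it uses the initial result to generate saliency seed regions and obtains the mid-level detection result through region aggregation. Finally, it fuses the results of the two levels and further integrates high-level prior knowledge to obtain the final detection result. We built a relatively large-scale RGBD salient object detection dataset and evaluated both algorithms on it, verifying that (1) the depth-based multi-contextual contrast method effectively improves the accuracy of existing RGB salient object detection algorithms, and (2) the multi-stage detection algorithm effectively detects salient objects in RGBD images and is more robust than existing methods.
The idea of salient object detection is applied to image memorability prediction, broadening the application range of salient object detection. Image memorability, an intrinsic property of images, is related to image saliency; it can be regarded as a kind of saliency of an image within an image sequence. We propose a multi-view data-adaptive regression model and apply it to image memorability prediction. We evaluated this prediction algorithm, which incorporates the idea of saliency detection, on a public dataset, compared and analyzed it against related methods, and verified its effectiveness. In addition, we give a preliminary summary of recent applications of image saliency detection.
A relatively large-scale RGBD salient object detection dataset is released, and current salient object detection algorithms are comprehensively evaluated. As part of the research work and contributions of this thesis, we release an RGBD salient object detection dataset to promote research on salient object detection for images with depth information; the dataset contains 1000 RGBD images and the corresponding manual annotations. It can be downloaded from our project website: https://sites.google.com/site/rgbdsaliency/. In addition, using multiple evaluation criteria, we comprehensively evaluated 24 algorithms on 5 widely used datasets and made the evaluation results, the evaluation methods, and the outputs of the algorithms public on our project website: http://www.dabi.temple.edu/~hbling/SMD/SMDSaliency.html. This evaluation is a comprehensive analysis of current salient object detection algorithms and helps deepen understanding of the state of research in this direction.
The ubiquity of digital cameras and smartphones, together with the rapid development of social networks, has tied images more closely to our lives. Although the rapid spread of images has brought much convenience to people's lives, it also poses challenging research questions. In recent years, salient object detection, a fundamental research problem in low-level image processing, has attracted considerable attention from researchers. Salient object detection is the task of localizing and segmenting the most conspicuous foreground objects in a scene. It has a wide range of applications in computer vision, such as object detection and recognition, content-based image retrieval, and context-aware image resizing, as well as in industrial systems, such as human-computer interaction, visual navigation, automated surveillance, video analysis, and augmented reality. In this thesis, we propose multiple salient object detection algorithms for RGB images and RGBD images, and we also apply saliency detection algorithms and ideas to image memorability prediction, which widens the application range of salient object detection. The main work and contributions of this thesis are summarized as follows. (1) We propose two matrix-decomposition-based salient object detection algorithms for RGB images. The first is based on low-rank and structured-sparse matrix decomposition. It models the separation of salient foreground from non-salient background as a low-rank and structured-sparse matrix decomposition problem: it decomposes the feature matrix of an image into a low-rank matrix and a structured-sparse matrix. The low-rank matrix represents the redundant background information, while the structured-sparse matrix identifies the salient objects in the image.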
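As a rough illustration of the low-rank plus sparse split described above, the following minimal sketch uses plain robust PCA (nuclear norm plus entrywise l1 norm, solved with an inexact augmented Lagrangian loop) as a stand-in. It is not the thesis's structured-sparse formulation: the spatial and pattern structure is left out, and the function names, the default `lam`, and the step schedule for `mu` are assumptions made for this example only.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage: proximal operator of tau * l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(x, tau):
    """Shrink singular values: proximal operator of tau * nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def low_rank_sparse_split(features, lam=None, n_iter=100, tol=1e-7):
    """Split a nonzero feature matrix (d x n, one column per region) into L + S.

    Plain robust PCA: min ||L||_* + lam * ||S||_1  s.t.  features = L + S.
    """
    d, n = features.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(d, n))  # common default; an assumption here
    norm_f = np.linalg.norm(features)
    mu = 1.25 / np.linalg.norm(features, 2)  # initial penalty (assumed schedule)
    low_rank = np.zeros(features.shape)
    sparse = np.zeros(features.shape)
    dual = np.zeros(features.shape)
    for _ in range(n_iter):
        # Background update: low-rank part absorbs what is shared by regions.
        low_rank = singular_value_threshold(features - sparse + dual / mu, 1.0 / mu)
        # Foreground update: sparse part keeps the large deviations.
        sparse = soft_threshold(features - low_rank + dual / mu, lam / mu)
        residual = features - low_rank - sparse
        dual = dual + mu * residual
        mu *= 1.5
        if np.linalg.norm(residual) <= tol * norm_f:
            break
    return low_rank, sparse


def region_saliency(sparse):
    """Score each region by the l1 norm of its column in the sparse part."""
    return np.abs(sparse).sum(axis=0)
```

Under these assumptions, calling `low_rank_sparse_split` on a region-by-feature matrix and passing the sparse part to `region_saliency` gives one non-negative score per region; regions whose features deviate from the shared background pattern would be expected to score higher.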
Compared with previous salient object detection algorithms based on low-rank matrix recovery, our method takes into account the spatial contiguity and pattern consistency of image regions, resulting in robust detection of salient objects. The second algorithm, named structured matrix decomposition, extends the first. It not only preserves the merits of the first algorithm but also remedies its weaknesses. It likewise decomposes the image feature matrix into a low-rank part and a structured-sparse part. The algorithm encourages patches within the same semantic region to share similar or identical representations, and patches from heterogeneous regions to have different representations. Thus, it can detect salient objects in cluttered scenes, even when the salient objects have a similar appearance to the background. Moreover, the structured matrix decomposition algorithm seamlessly incorporates low-level visual features and high-level human perception priors, and thus achieves robust detection. We evaluate our model on multiple challenging datasets, including single-object, multiple-object, and complex-scene images, and show competitive results compared with state-of-the-art methods. (2) We propose two fusion-based salient object detection algorithms for RGBD images. The first combines existing RGB-produced saliency with new depth-induced saliency to identify salient objects in RGBD images. RGB-produced saliency is estimated by existing saliency models designed for RGB images, while depth-induced saliency is computed by the proposed multi-contextual contrast model. The merit of this algorithm is that existing RGB-based saliency models remain usable in RGBD scenarios. However, it treats the appearance and depth cues independently, ignoring the strong complementarity between them. 
Therefore, we propose a second salient object detection algorithm for RGBD images: a novel multi-stage RGBD saliency estimation algorithm that takes into account low-level feature contrast, mid-level region grouping, and high-level object-aware priors. It combines depth information and appearance cues in a coupled manner. In low-level feature contrast, we extend the multi-contextual contrast method proposed above to the RGBD case and produce an initial saliency map. In mid-level region grouping, we threshold the initial saliency map to obtain saliency seeds, which are diverse regions with high saliency values. Starting from any one of the saliency seeds, region grouping is performed on a weighted graph using Prim's algorithm to select candidate regions that have a high probability of belonging to the foreground object. This procedure is repeated until all the seeds have been traversed. A visually consistent saliency map is generated at the end of this stage. Finally, the saliency maps generated by the previous two stages are combined through a Bayesian fusion strategy, and a high-level object-aware prior is integrated to boost performance. We build a large-scale RGBD image dataset containing 1,000 images and use it to test the two proposed fusion-based saliency detection algorithms. Experimental results verify that (1) the multi-contextual contrast method applied to depth images is effective and can improve the precision of RGB saliency models; and (2) the multi-stage saliency fusion method is robust and can accurately identify salient objects in RGBD images. (3) We apply saliency detection algorithms and ideas to image memorability prediction, which widens the application range of saliency detection. Image memorability is an inherent property of individual images and is also related to image saliency. We can characterize image memorability as a kind of saliency over an image collection. 
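Returning to the mid-level region grouping stage of the multi-stage RGBD algorithm described above, the seed selection and Prim-style growth can be sketched roughly as follows. The graph representation (a dict of region adjacency lists with dissimilarity weights), the helper names, and the stopping rule (stop when the cheapest frontier edge exceeds `max_weight`) are all assumptions for illustration; the abstract does not specify how candidate regions are accepted.

```python
import heapq


def pick_seeds(saliency, threshold):
    """Return indices of regions whose initial saliency reaches the threshold."""
    return [i for i, s in enumerate(saliency) if s >= threshold]


def grow_from_seed(adjacency, seed, max_weight):
    """Grow a region group from one seed in Prim's order.

    adjacency maps a region index to a list of (neighbor, weight) pairs,
    where weight is a dissimilarity between adjacent regions.
    """
    grouped = {seed}
    frontier = [(w, nbr) for nbr, w in adjacency[seed]]
    heapq.heapify(frontier)
    while frontier:
        weight, node = heapq.heappop(frontier)
        if weight > max_weight:
            break  # every remaining frontier edge is at least this dissimilar
        if node in grouped:
            continue
        grouped.add(node)
        for nbr, w in adjacency[node]:
            if nbr not in grouped:
                heapq.heappush(frontier, (w, nbr))
    return grouped


def group_regions(adjacency, saliency, seed_threshold, max_weight):
    """Repeat the growth for every seed and return the union of the groups."""
    grouped = set()
    for seed in pick_seeds(saliency, seed_threshold):
        grouped |= grow_from_seed(adjacency, seed, max_weight)
    return grouped
```

In this sketch the min-heap always expands the most similar neighboring region first, which is the ordering Prim's algorithm uses when building a minimum spanning tree; a real implementation would replace the fixed `max_weight` cut-off with whatever foreground-probability criterion the method defines.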
We propose a multi-view regression model for image memorability prediction. Experimental results on a public benchmark show the superiority of the proposed model over existing image memorability prediction methods. We also summarize related applications of saliency detection. (4) We provide a comprehensive evaluation of existing salient object detection methods and release a large-scale RGBD salient object detection dataset. As one of the contributions of this thesis, we release a large-scale RGBD salient object detection benchmark of 1,000 images; this dataset should help stimulate further research in the area. The dataset can be downloaded from http://sites.google.com/site/rgbdsaliency. Moreover, we provide a comprehensive evaluation of existing salient object detection methods on five challenging datasets, including single-object, multiple-object, and complex-scene images, and report the results of comparing 24 state-of-the-art methods in terms of multiple performance metrics. The evaluation and results can be downloaded from http://www.dabi.temple.edu/~hbling/SMD/SMDResult.html.
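The abstract does not name the performance metrics used in the evaluation. As a hedged illustration only, the sketch below computes two measures that are commonly reported for salient object detection, the thresholded precision/recall/F-measure and the mean absolute error, on flat lists of per-pixel values. The function names and the weighting `beta_sq=0.3` are assumptions, not details taken from the thesis.

```python
def precision_recall_f(saliency, ground_truth, threshold, beta_sq=0.3):
    """Binarize saliency at threshold and compare with a binary ground truth."""
    predicted = [s >= threshold for s in saliency]
    true_pos = sum(1 for p, g in zip(predicted, ground_truth) if p and g)
    n_pred = sum(predicted)
    n_true = sum(1 for g in ground_truth if g)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_true if n_true else 0.0
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def mean_absolute_error(saliency, ground_truth):
    """Average absolute difference between saliency values and the 0/1 mask."""
    total = sum(abs(s - float(g)) for s, g in zip(saliency, ground_truth))
    return total / len(saliency)
```

For example, `precision_recall_f([0.9, 0.6, 0.4, 0.1], [1, 0, 1, 0], 0.5)` marks the first two pixels as salient, of which one is correct, so precision and recall are both 0.5.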
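Keywords	Salient object detection; matrix decomposition; structured sparsity; RGBD images; image memorability prediction
Language	Chinese
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/11775
Collection	Graduates_Doctoral dissertations
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Peng Houwen. Salient Object Detection in Images: Algorithms and Applications[D]. Beijing. University of Chinese Academy of Sciences, 2016.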

Files in this item
File name/size	Document type	Version type	Access type	License
图像显著物体检测算法与应用.pdf (59649 KB)	Thesis		Restricted access	CC BY-NC-SA