Visual Aesthetic Quality Assessment Based on Hierarchical Semantic Information

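Title	Visual Aesthetic Quality Assessment Based on Hierarchical Semantic Information
Author	考月英
Date	2017-05-25
Degree type	Doctor of Engineering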
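Chinese abstract (translated)	Visual aesthetic quality assessment is one of the most challenging problems in computer vision. It is a high-level semantic understanding task that spans several disciplines and has significant theoretical value. Its ultimate goal is to enable computers to perceive, analyze, and make decisions about the aesthetic quality of images as humans do, and it has broad application prospects in areas such as image retrieval, image enhancement, and machine emotion. Research on aesthetic quality assessment has advanced rapidly over the past decade or so, but because the aesthetic quality of an image is a highly subjective visual attribute, it remains an extremely challenging and active research topic. Assessing aesthetic quality is usually accompanied by an understanding of the hierarchical semantic information in the image. In this thesis, we exploit hierarchical semantic information, following a line of research that moves from no semantic information to some and from coarse to fine, to study in depth the difficult problems of feature representation and modeling in aesthetic quality assessment. We propose effective aesthetic quality assessment methods and further apply them to automatic image cropping. The thesis makes the following contributions:
1. We propose a visual aesthetic quality assessment method based on a deep regression model. Because aesthetic quality is a highly subjective attribute of an image, existing handcrafted aesthetic features tend to be incomplete and hard to quantify. We therefore use a deep convolutional neural network to learn aesthetic features automatically. In addition, most previous methods model image aesthetic quality assessment as a simplified binary classification problem. To mimic the way the human visual system scores the aesthetic quality of an image, we instead model it as a regression problem that predicts a continuous aesthetic score. This method compensates for the limited representational power of traditional features and for the overly coarse aesthetic information provided by classification models.
2. We propose a hierarchical visual aesthetic quality assessment method. Most existing methods treat all images identically, without considering the diversity of image content, type, or spatial layout. Since images with different spatial layouts are judged by different aesthetic criteria, we first divide all images at the spatial-layout level into three types, “scene”, “object”, and “texture”; then, at the model level, we use a separate deep convolutional neural network for each type to learn its aesthetic features automatically while training the quality assessment model. Besides achieving good results, the method makes full use of the hierarchical information in the spatial layout of images and thus effectively reduces the impact of image diversity.
3. We propose a visual aesthetic quality assessment method based on semantic information. Existing studies have found that when humans assess the aesthetic quality of an image, they tend to understand its semantic content at the same time. We propose to learn the aesthetic feature representation jointly with a semantic recognition task, that is, to build a multi-task deep convolutional neural network that learns aesthetic assessment and semantic recognition simultaneously. Compared with the previous method, this method takes into account finer-grained hierarchical semantic information. Moreover, to explore the relationship between the two tasks, we add a correlation constraint between the aesthetic task and the semantic recognition task to the network, and we propose a strategy for balancing the tasks during optimization. Experiments show that the auxiliary semantic task substantially strengthens the aesthetic feature representation and yields very good results; they also examine the differences between the aesthetic task and different semantic tasks, which improves the interpretability of aesthetic analysis.
4. Building on the three assessment methods above, we apply aesthetic quality assessment to automatic image cropping. Automatic image cropping aims to remove unwanted regions and keep regions of high aesthetic quality, thereby improving the composition and aesthetic quality of the image. In essence it is also a form of aesthetic quality assessment: assessing the aesthetic quality of different regions of the same image. We propose an automatic image cropping method based on an aesthetic response map, which reveals the discriminative regions that influence aesthetic quality. Based on the aesthetic response map and the gradient energy map, we build a composition model that learns the composition rules of images, and we propose an aesthetic preservation model that retains as much of the image's aesthetic quality as possible. Experiments demonstrate the effectiveness of the proposed method. In addition, the field of automatic image cropping has only a few public datasets, and those are small; this lack of data has seriously hindered the discovery of problems and the development of effective methods. We therefore construct a relatively large-scale dataset to alleviate this shortage, analyze it in detail, and provide corresponding experimental baselines.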
English abstract	Visual aesthetic quality assessment is a challenging problem in computer vision. As a high-level visual perception task, it involves multiple disciplines and is of theoretical significance for the study of aesthetics. Its goal is to give computers the ability to perceive, analyze, and assess the aesthetic quality of images as humans do. Aesthetic quality assessment has been shown to be useful in many applications, e.g., image retrieval, image enhancement, and machine emotion. Over the past decade or so, many data-driven approaches have been proposed to address this problem. However, since aesthetics is a very subjective attribute of images, aesthetic quality assessment remains very challenging. The challenges lie in designing aesthetic features and in modeling the problem. Aesthetic quality assessment is often accompanied by an understanding of the hierarchical semantic information in images. In this thesis, we address this problem using hierarchical semantic information, with the following contributions:
1. We investigate visual aesthetic quality assessment with a regression model. Many handcrafted features in early work were designed from common intuitions about how people perceive the aesthetic quality of images. Since aesthetics is very subjective, it is difficult to design features that are both exhaustive and computable. We therefore use a convolutional neural network to learn aesthetic features. Moreover, we formulate aesthetic quality assessment as a regression problem. Unlike the classification models in most existing work, which can only predict an aesthetic class (high or low), the regression model predicts a continuous aesthetic score. To some extent, the proposed method overcomes the limited representational power of traditional features and the coarseness of classification models.
2. We investigate visual aesthetic quality assessment with hierarchical content information. Inspired by the different ways in which humans make aesthetic judgements and by the adoption of particular photographic techniques depending on the nature of the images, we propose a novel framework that divides images into three categories: “scene”, “object”, and “texture”. Three category-specific networks are constructed to learn aesthetic features automatically. Experiments show that our method effectively reduces the effects of image diversity.
3. We investigate visual aesthetic quality assessment with semantic information. For human beings, aesthetic quality assessment is always coupled with the identification of the semantic content of images. We propose to exploit semantic recognition, which provides finer-grained hierarchical information, to jointly assess aesthetic quality with a single multi-task convolutional neural network. Furthermore, we develop an effective strategy for keeping the two tasks balanced while optimizing the parameters of the framework. We also propose to learn the correlations between the aesthetic and semantic tasks automatically by incorporating inter-task relationship learning into the multi-task framework. Extensive experiments validate the importance of semantic recognition in aesthetic quality assessment and verify the effectiveness of the proposed method.
4. We investigate visual aesthetic quality assessment for automatic image cropping. Building on the three assessment methods above, we propose an automatic image cropping method based on an aesthetic map. The aesthetic map highlights the discriminative image regions for a given aesthetic quality category. A composition model is then learned from pyramid features of the aesthetic map and the gradient energy map. An aesthetic preservation model is also presented to preserve the aesthetic regions when cropping an image. Experiments show the effectiveness of our image cropping approach. In addition, we create a large-scale dataset for automatic image cropping, since there are few datasets in this field. Datasets play a big role in the development of automatic image cropping methods, especially deep learning methods. We provide analysis and baseline experiments for our dataset.
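Keywords	visual aesthetic quality assessment; automatic image cropping; hierarchical semantic information; multi-task learning; deep convolutional neural networks
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/14661
Collection	Graduates_Doctoral Theses
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	考月英. 基于层次化语义信息的视觉美感质量评估[D]. 北京. 中国科学院大学,2017.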

Files in this item
File name/size	Document type	Version type	Access type	License
kao_最终版.pdf (13624KB)	Thesis		Restricted access	CC BY-NC-SA