|jia geng yun|
Image cropping is a basic image editing operation with several uses. On the one hand, cropping can improve the compositional and aesthetic quality of an image, so it is widely applied in many areas. Research on cropping for aesthetic enhancement and related problems can broaden and deepen these applications and help machines better understand human perception, so it has both practical and theoretical value. On the other hand, cropping can alter the content of an image. This capability can be abused, which calls for research on automatic image cropping detection. Although related work has developed rapidly over the past ten years, many deficiencies and open problems remain. For the aesthetic problems, the key challenge is that aesthetics is both objective and subjective and is coupled with real-world scenarios, so different problems arise in different aspects. First, most methods ignore some scene context factors, which leaves the information available to models inconsistent with what humans perceive. Second, most image cropping methods rely on idealized assumptions about the number and locations of crops, which makes them inapplicable in some scenarios. Third, little attention has been paid to the objective and subjective uncertainty in the human aesthetic perception process. As for the detection problem, although image cropping leaves many traces, current work does not sufficiently analyze the differences and relations among them, so the diverse traces are used neither effectively nor efficiently. In response to these problems, this thesis carries out the following research:
1. A theme-aware aesthetic quality assessment model for full-resolution photos is proposed. To deal with the inconsistency between the information the model receives and the information humans observe, this thesis proposes a method that combines image padding with RoM (region of image) pooling and introduces shape information as a complement. For the inconsistency in aesthetic criteria, the influence of theme information is analyzed in detail. This thesis proposes a theme-aware aesthetic evaluation method that integrates theme information with image features through an attention module, thereby introducing a theme-dependent criterion bias. Experimental results show that the proposed model achieves strong results in three tasks: aesthetic distribution learning, aesthetic score regression, and aesthetic classification.
2. An image cropping model that targets both globality and diversity is proposed. Existing methods fix a prior on the number or the positions of crops under ideal conditions. To address this, the thesis proposes a model that achieves both globality and diversity for the first time. A set of crops regressed from multiple learnable anchors is matched with the ground-truth crops, and a classifier is trained on the matching results to select a valid subset of all predictions, so any number of crops can be regressed. Furthermore, two label smoothing strategies, quality guidance and self-distillation, are introduced to deal with the inconsistency between validity probability and crop quality. Experimental analysis shows that the model can produce multiple cropping results of high aesthetic quality from the entire coordinate space and achieves the best performance on multiple metrics.
3. An image cropping model based on uncertainty is proposed to address the uncertainty in human aesthetic perception. This thesis analyzes the uncertainty in image cropping from two aspects and models each separately. For the uncertainty in coordinate space, the model treats the candidate crop coordinates as samples from a triangular distribution whose expectations are the coordinates given in the datasets. For the uncertainty in pixel space, the model uses a multi-dimensional Gaussian distribution in the embedding space of a deep neural network to integrate the various uncertainty factors in pixel space. In addition, this thesis introduces an ordinal constraint on the feature distributions. Combined with image-guided feature normalization, this constraint promotes ordinal consistency between crop quality scores and features. Experimental results show that this model improves the performance of crop quality assessment.
4. An image cropping detection model based on multi-scale features is proposed. To handle the diverse and multi-scale traces left by cropping, this thesis proposes a hybrid Transformer model with multi-scale features for image cropping detection. The model extracts the optical traces that cropping leaves in pixel details through a convolution module and extracts the large-scale photographic composition traces through a Transformer network. An auxiliary patch location classification task is introduced to avoid the loss of pixel details. Experiments show that the model can effectively detect cropped images.
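The matching step in the second contribution can be illustrated with a small sketch. The abstract does not say which matching algorithm or cost the thesis uses, so everything here is an assumption: boxes are `(x1, y1, x2, y2)` tuples, the cost is IoU, and the one-to-one assignment is found by exhaustive search (practical only for a handful of crops). The names `iou` and `match_crops` are hypothetical. The resulting 0/1 labels are the kind of target a validity classifier could be trained on.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_crops(predictions, ground_truth):
    """One-to-one matching of predicted crops to ground-truth crops.

    Assumption: the assignment maximizes total IoU and is found by brute
    force. Returns (labels, assignment), where labels[i] is 1 if prediction
    i was matched and assignment maps ground-truth index -> prediction index.
    """
    n_pred, n_gt = len(predictions), len(ground_truth)
    if n_gt > n_pred:
        raise ValueError("need at least as many predictions as ground-truth crops")

    best_score, best = -1.0, ()
    # Each candidate is an ordered choice of n_gt distinct predictions.
    for chosen in permutations(range(n_pred), n_gt):
        score = sum(iou(predictions[p], ground_truth[g]) for g, p in enumerate(chosen))
        if score > best_score:
            best_score, best = score, chosen

    labels = [0] * n_pred
    for p in best:
        labels[p] = 1
    return labels, {g: p for g, p in enumerate(best)}


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (50, 50, 100, 100), (0, 0, 100, 100)]
    gts = [(0, 0, 12, 12), (48, 48, 100, 100)]
    print(match_crops(preds, gts))
```

Because unmatched predictions simply receive label 0, the number of ground-truth crops per image can vary freely, which is the property the abstract highlights.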
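The coordinate-space uncertainty in the third contribution can be sketched as follows. The abstract states only that coordinates are treated as samples from a triangular distribution whose expectation is the annotated coordinate. The symmetric shape, the `spread` half-width, and the function name `sample_crop` are assumptions made for illustration. A symmetric triangular distribution has its mean at its mode, so centering it on the annotation satisfies the stated expectation.

```python
import random


def sample_crop(crop, spread, rng=random):
    """Draw a perturbed crop around an annotated one.

    crop   : annotated (x1, y1, x2, y2) coordinates from the dataset
    spread : assumed half-width of the triangular support (hypothetical)
    rng    : the random module or a random.Random instance

    Each coordinate is sampled from a symmetric triangular distribution on
    [c - spread, c + spread] with mode c, so its expectation equals c.
    """
    return tuple(rng.triangular(c - spread, c + spread, c) for c in crop)


if __name__ == "__main__":
    rng = random.Random(0)
    annotated = (40.0, 30.0, 200.0, 150.0)
    for _ in range(3):
        print(sample_crop(annotated, spread=8.0, rng=rng))
```

Averaging many such samples recovers the annotated coordinates, while any single sample reflects the assumption that nearby crops are nearly as plausible as the labeled one.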
|Keyword||image cropping; aesthetic assessment; aesthetic enhancement; image cropping detection|
|jia geng yun. Research on Aesthetic and Detection Problems in Image Cropping [D]. University of Chinese Academy of Sciences, 2022.|