Research on Methods for Improving the Quality of Generated Images Based on Diffusion Models


Title	Research on Methods for Improving the Quality of Generated Images Based on Diffusion Models
Author	殷月琴
Date	2023-05-17
Pages	82
Degree type	Master's
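Abstract (translated from Chinese)	With the development of deep learning, image generation has become an active research direction in computer vision and is widely applied to tasks such as image enhancement, image translation, image super-resolution, and image restoration. It also plays an important role in scenarios such as face generation, virtual reality, computer games, medical image analysis, and autonomous driving. However, the quality and realism of generated images directly affect how well image generation works in practice. Traditional image generation methods are mainly based on generative adversarial networks, whose adversarial training leads to generated images that lack diversity. In contrast, diffusion-based methods model the learning of the data distribution as a Markov chain and generate images by gradually adding and removing noise; they have become one of the most popular classes of generative models. Although diffusion models have shown good potential in image generation, under the challenges of real-world scenarios they still suffer from low image quality, insufficient realism, and distortion. How to improve the generative capability of diffusion models so that they produce high-quality, realistic images across application scenarios has therefore been a research focus in both academia and industry. To address these problems, this thesis studies diffusion models and explores ways to improve generated image quality from two angles: the post-processing stage and the generation process. The main innovations and contributions are as follows. 1. A generative-model artifact restoration method based on conditional diffusion models. This thesis studies the problem of artifact images sampled from generative models and proposes a post-processing restoration method to improve image quality. Artifacts are unrealistic flaws or distortions that arise during generation and degrade the realism and quality of an image. Restoring such images in post-processing improves their quality and makes them more realistic and natural. Specifically, the thesis proposes a unified restoration method for artifact images produced by different types of generative models. For three mainstream classes of generative models (generative adversarial networks, autoregressive models, and diffusion models), it simulates different mechanisms to produce image-artifact data pairs for training the restoration model. The restoration model itself is built on a continuous diffusion model, exploiting its strong ability to fit data distributions. Restoration experiments on both synthetic and real artifact images show that the proposed model achieves good results on both. 2. An image generation algorithm based on a hierarchical discrete diffusion model. Discrete diffusion models are the other branch of diffusion models, alongside continuous diffusion models. They use vector quantization to model image data as a sequence of discrete tokens, an approach that extends well to text-to-image generation. However, vector quantization models usually lose too much information during first-stage image compression, which leads to poor generated image quality. To better fit the prior distribution of image data (that is, to learn the joint distribution of high-dimensional data), the thesis approaches the problem from the design of the loss function and extends the discrete diffusion model from a single-layer Markov chain to a hierarchical discrete diffusion model based on a two-layer Markov chain. Specifically, it introduces an additional lightweight mapping model that learns the mapping from low-resolution tokens to high-resolution tokens. Adding one more layer of supervised loss to the single-layer diffusion model allows it to learn a better image prior, which improves the quality of the generated images.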
English abstract	With the development of deep learning, image generation has become an active research topic in computer vision and has found extensive applications, such as image enhancement, image translation, image super-resolution, and image restoration. It also plays an important role in scenarios including face generation, virtual reality, computer games, medical image analysis, and autonomous driving. However, the quality and realism of generated images directly affect how well image generation works in practice. Traditional image generation methods rely primarily on generative adversarial networks (GANs), whose adversarial training causes the generated images to lack diversity. In contrast, diffusion models, which model the learning of the data distribution as a Markov chain and generate images by progressively adding and removing noise, have become one of the most popular classes of generative models. Although diffusion models have shown promising potential in image generation, under the challenges of real-world scenarios they still suffer from low image quality, insufficient realism, and distortion. Improving the generative capability of diffusion models so that they produce high-quality, realistic images across application scenarios therefore remains a research focus in both academia and industry. To address these issues, this thesis studies diffusion models and explores methods for improving generated image quality from two angles: the post-processing stage and the generation process. The main innovations and contributions are as follows:
1. Generative artifact restoration based on a conditional diffusion model. This work studies the problem of artifacts in images sampled from generative models and proposes a post-processing restoration method to improve image quality. Artifacts are unrealistic defects or distortions that arise during generation and degrade the realism and quality of an image. Restoring such images in post-processing improves their quality and makes them more realistic and natural. The thesis presents a unified restoration method for artifacts produced by different types of generative models. Three mainstream classes of generative models (GANs, autoregressive models, and diffusion models) are considered, and different mechanisms are simulated to produce image-artifact pairs for training the restoration model. The restoration model is built on a continuous diffusion model, exploiting its strong ability to fit data distributions. Experiments on both synthetic and real artifact images show that the proposed model achieves good restoration performance on both.
2. Image generation algorithm based on a hierarchical discrete diffusion model. Discrete diffusion models are the other branch of diffusion models, alongside continuous diffusion models. They use vector quantization to model image data as a sequence of discrete tokens, an approach that extends well to text-to-image generation. However, vector quantization models often lose too much information during first-stage image compression, resulting in poor generated image quality. To better capture the prior distribution of image data, this work approaches the problem from the design of the loss function and extends the single-layer Markov chain discrete diffusion model to a hierarchical discrete diffusion model based on a two-layer Markov chain. Specifically, it introduces an additional lightweight mapping model that learns the mapping from low-resolution tokens to high-resolution tokens. Adding one more layer of supervised loss to the single-layer diffusion model allows it to learn a better image prior, thereby improving the quality of the generated images.
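Keywords	generative models; image generation; diffusion models
Language	Chinese
Seven major directions: sub-direction classification	Image and video processing and analysis
State Key Laboratory planned direction classification	Visual information processing
Associated dataset requiring deposit	No
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/52129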
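Collection	Graduates_Master's Theses; Laboratory of Cognition and Decision-Making for Complex Systems_Intelligent Systems and Engineering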
Recommended citation (GB/T 7714)	殷月琴. 基于扩散模型的生成图像质量改善方法研究[D],2023.

Files in this item
File name/size	Document type	Version type	Access type	License
202028014628058 殷月琴.（28050KB）	Thesis		Restricted access	CC BY-NC-SA