面向图像分类的领域泛化方法研究 (Research on Domain Generalization Methods for Image Classification)
林建鑫
2023-05-21
Pages: 100
Degree Type: Master's
Chinese Abstract

With the development of intelligent technology, intelligent systems centered on machine learning have been widely deployed across industries and have become powerful tools in social production and daily life. However, data-driven machine learning models and algorithms frequently encounter distribution shift in practice, i.e., a mismatch between the training and test data distributions. Models learned by empirical risk minimization, which rests on the assumption of independent and identically distributed data, often degrade under distribution shift, reducing the robustness and reliability of the system.

In recent years, domain generalization has attracted considerable attention. It aims to use multiple source-domain datasets with different distributions to learn a model that is robust to distribution shift, so that it maintains good performance in unseen test scenarios without degradation. Domain generalization is an important complement to classical empirical risk minimization and has significant theoretical and practical value. Current research has made substantial progress, and the proposed methods have greatly improved the out-of-distribution generalization ability of models, but several research difficulties and key issues remain to be resolved, namely unreasonable assumptions, overfitting to source domains, prior knowledge, and distribution residuals. Targeting these issues, this thesis explores domain generalization from two perspectives: general-purpose methods and techniques oriented toward practical scenarios. The main research contents and contributions are summarized as follows:

1. General domain-invariant classifier learning without prior knowledge: To address the unreasonable assumptions and source-domain overfitting in existing work, this thesis proposes a domain-invariant classifier learning method based on a constrained maximum cross-domain likelihood optimization problem; it introduces no unreasonable assumptions and imposes only a single marginal-distribution constraint, avoiding the source-domain overfitting caused by excessive regularization. Specifically, domain-invariant classifier learning is first formalized as an optimization problem that minimizes the KL divergence between the conditional distributions of different domains. Second, to counter the increase in conditional-distribution entropy induced by minimizing the KL divergence, a maximum intra-domain likelihood term is designed to improve the discriminability of the feature space. Then, a source-domain marginal-distribution alignment constraint is introduced, so that the aligned source marginal distribution approximates the real-world marginal distribution, and the expected KL divergence is minimized over this distribution to improve the out-of-domain generalization of the domain-invariant classifier. The result is a constrained maximum cross-domain likelihood optimization problem whose solution aligns the joint distributions in the feature space while learning the domain-invariant classifier (a hedged sketch of such an objective is given after this list). In addition, an effective alternating optimization strategy is designed to solve this constrained problem. Because the method does not rely on prior knowledge about the data, it is a general-purpose domain generalization method. Its effectiveness is thoroughly verified on four public datasets.

2. A domain generalization framework combining visual priors and uncertainty quantification: To address how visual prior knowledge can be exploited, and given that most inter-domain covariate shift in image data manifests as differences in style, this thesis designs a plug-and-play style distribution normalization module that, assisted by visual priors, efficiently mitigates covariate shift in image data. Specifically, the module represents image style with feature statistics and normalizes the multi-domain distributions of these statistics to a single Gaussian distribution, implicitly normalizing the style distributions (a code sketch of such a module follows this list). To address the distribution residual between the distribution fitted by the model and the test distribution, a multi-domain decision fusion mechanism based on uncertainty quantification is designed: the prediction uncertainty of multiple domain-specific classifiers is first quantified with subjective logic and evidence theory, and the predictions are then dynamically fused according to their uncertainties via Dempster-Shafer evidence theory, yielding a prediction distribution with lower uncertainty. In other words, the true conditional distribution of a given sample is approximated by a dynamic combination of multi-domain predictive conditional distributions, which effectively alleviates the distribution residual between the two. The style distribution normalization module can be viewed as a prior-knowledge-based marginal-distribution alignment scheme, and the multi-domain decision fusion as conditional-distribution alignment via dynamic combination; together they form a domain generalization framework that achieves joint-distribution alignment. The favorable properties of the proposed method are verified on four public datasets.
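The following is a minimal sketch, under assumed notation, of what a constrained maximum cross-domain likelihood objective of the kind described in the first contribution can look like; the feature extractor F, the per-domain feature-space distributions P_i, and the weight lambda are illustrative symbols, not the thesis's own formulation.

```latex
% Minimal sketch of a constrained maximum cross-domain likelihood objective.
% F: feature extractor; P_i(z, y): feature-space joint distribution of source
% domain i; lambda: trade-off weight. All notation is illustrative.
\[
\begin{aligned}
\max_{F}\;\; & \underbrace{\sum_{i \neq j} \mathbb{E}_{(z,y)\sim P_i}\!\left[\log P_j(y \mid z)\right]}_{\text{cross-domain likelihood}}
\;+\; \lambda \underbrace{\sum_{i} \mathbb{E}_{(z,y)\sim P_i}\!\left[\log P_i(y \mid z)\right]}_{\text{intra-domain likelihood}} \\
\text{s.t.}\;\; & P_i(z) = P_j(z) \quad \forall\, i, j
\qquad \text{(marginal-distribution alignment constraint)}
\end{aligned}
\]
```

Minimizing the expected KL divergence between per-domain conditionals amounts to maximizing the cross-domain likelihood term while lowering the intra-domain likelihood; the added intra-domain term counteracts exactly the entropy increase mentioned above, and the constraint lets the aligned source marginals stand in for the unknown real-world marginal.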
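To make the style distribution normalization idea in the second contribution concrete, here is a minimal PyTorch-style sketch, assuming channel-wise feature means and standard deviations are taken as the style statistics; the class name, the running buffers, and the way normalized statistics are re-injected are assumptions for illustration, not the thesis's implementation.

```python
# Illustrative sketch only: a plug-and-play layer that whitens per-image
# feature statistics (channel mean/std, read as "style") against running
# estimates pooled over all source domains, so that every domain's style
# statistics roughly follow one shared Gaussian.
import torch
import torch.nn as nn

class StyleDistributionNorm(nn.Module):
    def __init__(self, num_channels: int, momentum: float = 0.1, eps: float = 1e-6):
        super().__init__()
        self.momentum, self.eps = momentum, eps
        # Running mean/std of the per-image channel means and stds.
        self.register_buffer("mean_of_mu", torch.zeros(num_channels))
        self.register_buffer("std_of_mu", torch.ones(num_channels))
        self.register_buffer("mean_of_sigma", torch.ones(num_channels))
        self.register_buffer("std_of_sigma", torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        mu = x.mean(dim=(2, 3))                # per-image channel means
        sigma = x.std(dim=(2, 3)) + self.eps   # per-image channel stds
        if self.training and x.size(0) > 1:
            m = self.momentum
            self.mean_of_mu.mul_(1 - m).add_(m * mu.mean(0).detach())
            self.std_of_mu.mul_(1 - m).add_(m * mu.std(0).detach())
            self.mean_of_sigma.mul_(1 - m).add_(m * sigma.mean(0).detach())
            self.std_of_sigma.mul_(1 - m).add_(m * sigma.std(0).detach())
        # Standardize each image's style statistics under the pooled Gaussian.
        mu_n = (mu - self.mean_of_mu) / (self.std_of_mu + self.eps)
        sigma_n = (sigma - self.mean_of_sigma) / (self.std_of_sigma + self.eps)
        # Strip the original style, then re-stylize with the normalized
        # statistics (exp keeps the new scale positive).
        x = (x - mu[:, :, None, None]) / sigma[:, :, None, None]
        return x * torch.exp(sigma_n)[:, :, None, None] + mu_n[:, :, None, None]
```

A layer like this would typically be inserted after an early backbone stage; because it only rewrites first- and second-order channel statistics, the semantic content of the feature maps is preserved up to a per-channel affine change.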

English Abstract

With the development of intelligent technology, intelligent systems based on machine learning have been widely used in various industries and have become powerful tools in social production and daily life. However, data-driven machine learning models and algorithms often encounter the problem of distribution shift in practical applications, that is, the distribution of the training data differs from that of the test data. The performance of models trained via empirical risk minimization, which rests on the assumption of independent and identically distributed data, often degrades under distribution shift, undermining the robustness and reliability of such systems.

In recent years, domain generalization has attracted the attention of many scholars. Domain generalization aims to utilize multiple source-domain datasets with different distributions to learn a model that is robust to distribution shift, so that it maintains good performance in unseen test scenarios without degradation. Domain generalization is an important complement to the classical empirical risk minimization method and has significant theoretical and practical value. Current domain generalization research has made considerable progress, and the proposed methods have greatly improved the out-of-distribution generalization ability of models, but several research difficulties and key issues remain to be addressed, namely unreasonable assumptions, overfitting to source domains, prior knowledge, and distribution residuals. Aiming at these difficulties and key issues, this thesis explores domain generalization from two perspectives, i.e., general domain generalization methods and domain generalization techniques for practical scenarios. The main research contents and innovative achievements can be summarized as follows:

1. General domain-invariant classifier learning without prior knowledge: To address the unreasonable assumptions and source-domain overfitting in existing works, a constrained maximum cross-domain likelihood optimization problem is proposed for domain-invariant classifier learning. Without introducing unreasonable assumptions, only a marginal-distribution constraint is imposed, which effectively avoids the overfitting to source domains caused by excessive regularization. Specifically, domain-invariant classifier learning is first formalized as an optimization problem that minimizes the KL divergence between the conditional distributions of different domains. Second, to counter the entropy increase of the conditional distributions caused by minimizing the KL divergence, a maximum intra-domain likelihood term is designed to improve the discriminability of the feature space. Then, a marginal-distribution alignment constraint is introduced to approximate the real-world marginal distribution with the aligned marginal distribution of the source domains, and the expected KL divergence is minimized on this distribution to improve the out-of-domain generalization of the learned domain-invariant classifier. Finally, a constrained maximum cross-domain likelihood optimization problem is obtained; solving it aligns the joint distributions in the feature space while learning a domain-invariant classifier. In addition, an effective alternating optimization strategy is designed to solve this constrained problem (a sketch of one possible alternating scheme appears after this list). Since the method does not rely on prior knowledge about the data, it is a generic approach for domain generalization. Its effectiveness is fully verified on four public datasets.

2. A domain generalization framework based on visual prior knowledge and uncertainty quantification: To exploit visual prior knowledge, and because most inter-domain covariate shift in image data appears as differences in image style, a plug-and-play style distribution normalization module is designed. Assisted by visual prior knowledge, the module efficiently alleviates covariate shift in image data. Specifically, feature statistics are used to represent image style information, and the module normalizes the distributions of multi-domain image feature statistics to the same Gaussian distribution, implicitly realizing the normalization of style distributions. To address the distribution residual between the distribution fitted by the model and the test distribution, a multi-domain decision fusion mechanism based on uncertainty quantification is designed (a sketch of such a fusion appears after this list). First, the prediction uncertainty of multiple domain-specific classifiers is quantified based on subjective logic and evidence theory. Second, based on Dempster-Shafer evidence theory, the predictions of the domain-specific classifiers are dynamically fused according to their uncertainties, yielding a prediction distribution with lower uncertainty. Approximating the true conditional distribution of a given sample with a dynamic combination of multi-domain conditional distributions effectively alleviates the distribution residual problem. The style distribution normalization module can be regarded as a marginal-distribution alignment scheme based on visual prior knowledge, and the multi-domain decision fusion as conditional-distribution alignment via dynamic combination. The two can be combined into a domain generalization framework that achieves joint-distribution alignment. The excellent properties of the proposed method are verified on four public datasets.
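As a rough illustration of how an alternating optimization strategy for the first contribution could be organized, the sketch below assumes diagonal-Gaussian class-conditional distributions in feature space; the function names and the Gaussian assumption are illustrative and not taken from the thesis.

```python
# Illustrative sketch of one possible alternating scheme: step A fits
# per-(domain, class) Gaussians to frozen features; step B updates the
# feature extractor by maximizing the likelihood of one domain's features
# under another domain's class-conditional Gaussians.
import torch

def estimate_class_gaussians(feats, labels, num_classes, eps=1e-4):
    """Step A: fit a diagonal Gaussian to each class of one domain; features
    are detached so this step does not update the feature extractor."""
    stats = []
    for c in range(num_classes):
        fc = feats[labels == c].detach()
        stats.append((fc.mean(0), fc.var(0) + eps))
    return stats

def cross_domain_nll(feats_i, labels_i, gaussians_j):
    """Step B loss: negative log-likelihood of domain-i features under
    domain-j class-conditional Gaussians. Minimizing it maximizes the
    cross-domain likelihood and pulls the two domains' conditionals together."""
    nll, used = feats_i.new_zeros(()), 0
    for c, (mu, var) in enumerate(gaussians_j):
        fc = feats_i[labels_i == c]
        if fc.numel() == 0:
            continue
        nll = nll + 0.5 * (((fc - mu) ** 2 / var) + torch.log(var)).sum(dim=1).mean()
        used += 1
    return nll / max(used, 1)
```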
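To illustrate the uncertainty-quantified fusion in the second contribution, the following is a minimal sketch of Dirichlet-based subjective-logic opinions combined with Dempster's rule; the function names and the reduced combination rule used here are assumptions, and the thesis's exact formulation may differ.

```python
# Illustrative sketch of uncertainty-aware fusion of domain-specific
# classifiers via subjective logic and Dempster's rule of combination.
import torch

def dirichlet_opinion(evidence: torch.Tensor):
    """Turn non-negative class evidence (B, K) from one domain-specific
    classifier into a subjective-logic opinion: per-class belief masses
    plus a single uncertainty mass, via Dirichlet parameters alpha = e + 1."""
    K = evidence.size(-1)
    S = (evidence + 1.0).sum(dim=-1, keepdim=True)  # Dirichlet strength
    return evidence / S, K / S                      # beliefs (B, K), uncertainty (B, 1)

def ds_combine(b1, u1, b2, u2):
    """Combine two opinions with Dempster's rule: conflicting belief mass is
    discarded and the rest renormalized, so the lower-uncertainty opinion
    dominates the fused prediction."""
    conflict = (b1.sum(1) * b2.sum(1) - (b1 * b2).sum(1)).unsqueeze(-1)
    norm = (1.0 - conflict).clamp_min(1e-8)
    return (b1 * b2 + b1 * u2 + b2 * u1) / norm, (u1 * u2) / norm

def fuse_domain_classifiers(evidences):
    """Sequentially fuse the evidence of several domain-specific classifiers."""
    b, u = dirichlet_opinion(evidences[0])
    for ev in evidences[1:]:
        b, u = ds_combine(b, u, *dirichlet_opinion(ev))
    return b, u  # fused per-class beliefs and residual uncertainty
```

At test time the fused beliefs can be renormalized into a predictive distribution, and a large residual uncertainty can be read as a sign that no source-domain classifier supplies strong evidence for the sample.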

Keywords: Domain Generalization; Out-of-Distribution Generalization; Model Robustness; Transfer Learning; Computer Vision
Subject Area: Artificial Intelligence Theory; Pattern Recognition
Discipline: Engineering :: Control Science and Engineering
Indexing Category: Other
Language: Chinese
Representative Paper:
Seven Major Directions / Sub-direction Classification: Fundamentals of Pattern Recognition
State Key Laboratory Planned Research Direction: Fundamental and Frontier Theory of Artificial Intelligence
Associated Dataset to Be Deposited:
Document Type: Degree thesis
Identifier: http://ir.ia.ac.cn/handle/173211/51886
Collection: Graduates / Master's Theses
Recommended Citation:
GB/T 7714
林建鑫. 面向图像分类的领域泛化方法研究[D], 2023.
Files in This Item:
File Name/Size: 面向图像分类的领域泛化方法研究 - 林建 (13974 KB)
Document Type: Degree thesis
Version Type:
Access Type: Restricted
License: CC BY-NC-SA