Research on Uncertainty in Neural Machine Translation Based on Interpretability Analysis


Title	Research on Uncertainty in Neural Machine Translation Based on Interpretability Analysis
Author	Lu Yu (卢宇)
Date	2022-05
Pages	108
Degree type	Doctoral
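Chinese abstract (translated)	In recent years, machine translation technology has developed rapidly and translation quality has steadily improved. Neural machine translation in particular performs well and has become the mainstream approach in the field. As a "black box" technique, a neural network contains hundreds of millions of parameters and a complex architecture, which gives it very strong learning capacity and lets it reach its final decisions by automatically detecting relevant features. However, these feature representations are distributed in a high-dimensional continuous space and are hard for humans to understand; developers also find it hard to trace how the model produces its output, and that process is easily affected by various sources of uncertainty. The uncertainty shows up as follows: fluctuations in the data distribution and in the model parameters make model outputs unpredictable, threaten the model's stability, and leave users unable to judge when the outputs can be trusted. It is therefore of considerable research and practical value to study how each module of a neural machine translation model operates, locate the sources of uncertainty, analyze their effects, and reduce the interference they cause. This thesis focuses on three key stages of neural machine translation: data collection, cross-lingual knowledge modeling, and translation probability prediction. It studies data uncertainty, attention mechanism uncertainty, and predictive uncertainty, uses interpretability techniques to analyze how uncertainty affects model outputs, improves translation quality by reducing uncertainty, and anticipates errors the model may make.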
English abstract	In recent years, machine translation technology has developed rapidly, and the quality of translated text has improved continuously. Neural machine translation in particular has made significant progress and is now the leading method in the field. As a "black box" tool, a neural network contains hundreds of millions of parameters and a complex structure, which gives it strong learning capabilities and allows the model to make final decisions by automatically detecting relevant features. However, these feature representations are distributed in a high-dimensional continuous space and are difficult for humans to understand. Researchers find it hard to examine the decision-making process, which is easily affected by various uncertainties. This uncertainty is reflected in the fact that fluctuations in data distribution and model parameters make model results unpredictable, threaten the stability of the model, and leave users unsure when they can trust the model results. Therefore, studying the operating mechanisms of the modules in a neural machine translation model, locating the sources of uncertainty, analyzing the impact of uncertainty, and reducing the interference it causes have significant research and application value. This thesis focuses on three key stages of neural machine translation, namely data collection, cross-lingual knowledge modeling, and translation probability prediction, and studies data uncertainty, attention uncertainty, and predictive uncertainty. It uses interpretability techniques to evaluate the impact of uncertainty on model outputs, reduces that uncertainty to improve translation quality, and predicts potential errors in the model.
Keywords	neural machine translation; data uncertainty; attention mechanism uncertainty; predictive uncertainty
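Subject area	Natural language processing
Discipline category	Engineering ; Engineering::Computer Science and Technology (degrees in engineering or science may be conferred)
Language	Chinese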
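Seven major research directions: sub-direction classification	Natural language processing
State Key Laboratory planned direction classification	Explainable artificial intelligence
Associated dataset to be deposited	No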
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/51867
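Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Lu Yu (卢宇). Research on Uncertainty in Neural Machine Translation Based on Interpretability Analysis [D], 2022.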

Files in this item
File name/size	Document type	Version type	Access type	License
基于可解释性分析的神经机器翻译不确定性研（3886KB）	Thesis		Restricted access	CC BY-NC-SA