融合多语言信息的神经机器翻译方法研究 (Research on Neural Machine Translation Methods Incorporating Multilingual Information)
王亦宁
2020-05-28
Pages: 126
Degree type: Doctoral
Chinese Abstract (translated)

In recent years, machine translation technology has made remarkable progress. In particular, neural machine translation (NMT) based on end-to-end models has greatly improved translation quality, and this transformative development has made NMT the new paradigm of machine translation. Major Internet companies have deployed NMT models as their online translation services. However, existing NMT models usually occupy considerable storage and computational resources, and training and deploying separate models for many language pairs consumes a great deal of resources. At the same time, different languages share some similar general information. Completing the translation tasks between these language pairs with a single model would not only be far more convenient, but would also fully exploit the commonalities among the languages, which helps considerably in improving translation performance. Therefore, studying NMT methods that incorporate multilingual information has important theoretical significance and application value for promoting the application of machine translation and improving translation quality. This thesis studies methods and implementation techniques for incorporating multilingual information into NMT, so as to better exploit the general information and the independent information among languages and improve translation quality.

The main contributions and innovations of the thesis are summarized as follows:

1. A multilingual neural machine translation method incorporating the independent information of languages

NMT that incorporates multilingual information can compress the model and save resources. However, when there are multiple target languages, generating sentences in different languages with a single set of model parameters overloads the model and degrades translation quality. To address this problem, this thesis proposes a multilingual NMT method that incorporates the independent information of each language, which distinguishes the target languages well and thus improves translation quality while keeping the model size unchanged. The method designs three different strategies: language-dependent language tags, language-dependent positional encodings, and language-dependent hidden-layer vectors. Considering these three factors together captures the independent information of different languages more effectively and improves translation quality. Experiments show that all three proposed strategies outperform existing multilingual NMT methods.

2. A multilingual neural machine translation method based on a language-sensitive representer

In current multilingual NMT, the general information shared by different languages is not fully utilized. At the same time, balancing the sharing and the independence of translation knowledge has become the core challenge of multilingual NMT. To address these problems, this thesis proposes a multilingual NMT method based on a language-sensitive representer. The method ties the encoder and decoder of the NMT model together, which both significantly reduces the number of model parameters and makes better use of the general information among languages. To further strengthen the model, the method uses the semantic representations of the input layer, the middle layers, and the output layer to reinforce the independent information of each language. Experiments show that the proposed method significantly improves translation quality on both large-scale and small-scale datasets, achieving the best translation results. Moreover, the method is particularly well suited to low-resource and even zero-resource translation.

3. A multilingual neural machine translation method with synchronous interactive decoding

Current NMT decoding methods generally lack the ability to exploit future information on the target side, yet different languages order their words very differently when expressing the same content. This thesis therefore proposes an NMT method that generates multiple languages synchronously with interactive decoding, so as to exploit the information provided by the other target languages. When generating a word in one target language, the decoder fuses the decoding histories of that language and of the other target languages, realizing the interaction of multilingual information and compensating for the information missing when decoding a single language. Experiments show that on both large-scale and small-scale datasets the proposed method outperforms both NMT models trained only on bilingual parallel corpora and baseline multilingual NMT models, effectively improving translation quality.

In summary, this thesis conducts an in-depth study of the limited translation quality of multilingual NMT. Starting from model representation and from translation generation, it focuses on methods that can both integrate the general information shared among languages and distinguish the independent information of each language. Experiments show that the proposed methods effectively improve the performance of multilingual NMT and enrich the research and application of NMT methods that incorporate multilingual information.

English Abstract

In recent years, research on machine translation has made remarkable achievements. In particular, the end-to-end neural machine translation (NMT) approach greatly improves translation quality, and this revolutionary development has made NMT the new paradigm of machine translation. Most major Internet companies have deployed NMT models as their online translation services. However, existing NMT models usually require large storage and computational resources, and training and deploying separate models for multiple language pairs causes great resource consumption. At the same time, different languages contain similar general information. If we can use a single set of parameters to complete the translation tasks between multiple language pairs, it will not only bring great convenience but also make full use of the general information shared between these languages, which is of great help in improving translation performance. Therefore, studying approaches that incorporate multilingual information into NMT is of great theoretical significance and application value for promoting the application of NMT and enhancing translation quality. This thesis studies and implements approaches for incorporating multilingual information into NMT, so as to make better use of the general information and the independent information between languages.

The main contributions of this thesis are summarized as follows:

1. A Multilingual Neural Machine Translation Method Incorporating the Independent Information of Languages

Multilingual NMT methods that incorporate multilingual information can compress the model and save resources. However, when there are multiple target languages, using only one set of parameters to represent these languages overloads the model and reduces translation quality. To solve this problem, this thesis proposes a multilingual neural machine translation method that integrates the independent information of languages, which improves translation quality while keeping the model size unchanged. The method designs three different strategies: language-dependent language tags, language-dependent positional encodings, and language-dependent hidden units in each layer. Taking these three factors into consideration captures the independent information of different languages better and improves translation quality. Experiments demonstrate that all three proposed strategies surpass the existing baseline systems.
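To make the three strategies more concrete, here is a minimal PyTorch sketch, an illustrative reading of the abstract rather than the thesis implementation; every name in it (LanguageSensitiveEmbedding, lang_tag, lang_proj, and so on) is invented for this example.

```python
# Hedged sketch of the three language-dependent strategies:
# (1) a learned target-language tag, (2) a language-dependent
# positional table, (3) language-dependent hidden-unit transforms.
import torch
import torch.nn as nn

class LanguageSensitiveEmbedding(nn.Module):
    def __init__(self, vocab_size, d_model, n_langs, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.lang_tag = nn.Embedding(n_langs, d_model)        # strategy (1)
        self.pos = nn.Embedding(n_langs * max_len, d_model)   # strategy (2)
        self.max_len = max_len
        self.lang_proj = nn.ModuleList(                       # strategy (3)
            [nn.Linear(d_model, d_model) for _ in range(n_langs)])

    def forward(self, tokens, lang_id):
        # tokens: (batch, seq_len); lang_id: index of the target language
        _, t = tokens.shape
        # each language reads from its own slice of the positional table
        positions = torch.arange(t, device=tokens.device) + lang_id * self.max_len
        h = self.tok(tokens) + self.pos(positions)
        h = h + self.lang_tag.weight[lang_id]   # bias the sequence toward one language
        return self.lang_proj[lang_id](h)       # language-dependent hidden units

# usage: emb = LanguageSensitiveEmbedding(32000, 512, n_langs=4)
#        h = emb(torch.randint(0, 32000, (2, 10)), lang_id=1)
```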

2. A Compact and Language-Sensitive Multilingual Translation Method

At present, the general information shared by different languages is not fully utilized in multilingual neural machine translation systems. At the same time, balancing the sharing and independence of translation knowledge has become the major challenge in multilingual NMT. To solve these problems, this thesis proposes a compact and language-sensitive multilingual neural machine translation method. The method ties the encoder and decoder of the NMT model together, which not only significantly reduces the number of model parameters but also makes better use of the general information between languages. To further enhance the model, the method makes full use of the semantic representations of the input layer, the middle layers, and the output layer. Experiments show that the proposed method significantly improves translation quality on both large-scale and small-scale training corpora, achieving the best results among the compared systems. Moreover, the method is particularly suitable for low-resource and zero-shot translation.
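As a rough illustration of the parameter-tying idea only, the sketch below reuses one stack of Transformer layers for both the source and target passes, so the parameter count stays flat. SharedRepresenter and its methods are hypothetical names, and letting a decoder layer attend over the sequence itself during encoding is a simplification of this sketch, not a detail taken from the thesis.

```python
# Hedged sketch: a single weight-shared layer stack serves as both
# encoder and decoder ("representer"), one way to tie the two sides.
import torch
import torch.nn as nn

class SharedRepresenter(nn.Module):
    def __init__(self, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        # one stack of layers, reused for the source and target sides
        self.layers = nn.ModuleList(
            [nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
             for _ in range(n_layers)])

    def encode(self, h_src):
        # source pass: the sequence attends over itself (no causal mask)
        for layer in self.layers:
            h_src = layer(h_src, memory=h_src)
        return h_src

    def decode(self, h_tgt, memory, causal_mask):
        # target pass: the SAME parameters, now with causal self-attention
        # plus cross-attention over the encoded source
        for layer in self.layers:
            h_tgt = layer(h_tgt, memory, tgt_mask=causal_mask)
        return h_tgt

# usage:
# rep = SharedRepresenter()
# mem = rep.encode(torch.randn(2, 7, 512))
# mask = nn.Transformer.generate_square_subsequent_mask(5)
# out = rep.decode(torch.randn(2, 5, 512), mem, mask)
```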

3. Synchronously Generating Multiple Languages with Interactive Decoding

The current decoding methods of NMT usually cannot utilize future information on the target side, yet there are great differences in the order of words used by different languages when expressing the same content. Therefore, this thesis proposes a method that synchronously generates multiple languages with interactive decoding. When generating a word in a particular target language, the method fuses the decoding histories of this language and of the other target languages, realizing the interaction of multiple languages and making up for the information missing when decoding a single language. Experimental results show that the proposed synchronous and interactive decoding method is superior to the NMT baseline models on both large-scale and small-scale datasets, which confirms the effectiveness of the method.
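The lock-step idea can be sketched in a few lines: two greedy decoders advance together, and at every step each one conditions on the other's partial hypothesis in addition to its own. The step functions below are hypothetical placeholders for single-step decoder calls; beam search and the actual fusion mechanism of the thesis are omitted.

```python
# Hedged sketch of synchronous interactive decoding for two target
# languages; step_l1 / step_l2 are assumed one-step decoder functions
# mapping (own history, other history, source memory) -> next-token logits.
import torch

def interactive_decode(step_l1, step_l2, memory, bos_l1, bos_l2, max_len=8):
    ys1, ys2 = [bos_l1], [bos_l2]                 # partial hypotheses
    for _ in range(max_len):
        h1 = torch.tensor([ys1])                  # history, language 1
        h2 = torch.tensor([ys2])                  # history, language 2
        # each language's next word is predicted from BOTH histories
        logits1 = step_l1(h1, h2, memory)
        logits2 = step_l2(h2, h1, memory)
        ys1.append(int(logits1.argmax(-1)))
        ys2.append(int(logits2.argmax(-1)))
    return ys1, ys2

if __name__ == "__main__":
    # toy stand-ins: random logits over a 10-word vocabulary
    toy = lambda own, other, mem: torch.randn(1, 10)
    print(interactive_decode(toy, toy, memory=None, bos_l1=1, bos_l2=1))
```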

In summary, this thesis studies in depth the problem of limited translation quality in multilingual neural machine translation. The proposed methods focus on integrating the general information and distinguishing the independent information of different languages, from the perspectives of representation and generation respectively. The experiments prove that the proposed methods effectively improve the performance of multilingual neural machine translation and enrich the research and application of NMT methods that incorporate multilingual information.

Keywords: neural machine translation, independent information, general information, synchronous interactive decoding
Language: Chinese
Sub-direction classification: Natural Language Processing
Document type: Dissertation
Identifier: http://ir.ia.ac.cn/handle/173211/39236
Collection: Graduates / Doctoral Dissertations
Recommended citation (GB/T 7714):
王亦宁. 融合多语言信息的神经机器翻译方法研究[D]. 远程答辩. 中国科学院大学,2020.
Files in this item:
File name (size) | Document type | Access | License
博士论文-王亦宁_终版.pdf (7972KB) | Dissertation | Restricted access | CC BY-NC-SA