Research on Neural Network-Based Machine Translation Techniques

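	Research on Neural Network-Based Machine Translation Techniques (基于神经网络的机器翻译技术研究)
	Chen Wei (陈炜)
	2016-05
Degree type	Doctor of Engineering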
Abstract	Machine translation has high practical value as a feasible alternative to costly human translation. However, traditional statistical machine translation (SMT) models face serious challenges: they struggle with linearly non-separable problems, lose much of the global context, ignore semantics, and suffer from error propagation. In recent years, neural network models based on deep learning have achieved remarkable results in many fields, attracting wide attention from academia and industry and offering machine translation a new opportunity to break through its current performance bottleneck. At present, neural network models are applied to machine translation in two main ways: improving key components of existing SMT models, and building end-to-end translation models that replace the statistical translation framework.
This thesis studies the application of neural network models to machine translation. Its main contributions are as follows. First, a bilingual Chinese word segmentation (CWS) algorithm based on a cascaded log-linear model is proposed. The model fuses multi-level features drawn from several information sources: Chinese grammar, bilingual semantics, bilingual transliteration, and bilingual alignment. It keeps the segmentation grammatical, which suits the computation of neural word embeddings, keeps the perplexity of the bilingual alignment low, and mitigates the word-granularity mismatch that arises because Chinese and English belong to different language families. Second, a quality evaluation algorithm for bilingual sentence pairs based on neural network perplexity is proposed for the first time. Traditional methods rely on hand-crafted heuristic features, lose non-local context, and ignore semantics. The proposed algorithm requires neither context-independence assumptions nor hand-crafted heuristic features. Because the neural network model makes good use of word semantics, it reduces the interference that semantic similarity (for example, synonyms) causes in quality evaluation, and it copes well with the non-literal meanings that are common between Chinese and English. Third, a bilingually-constrained recursive neural network model is proposed to introduce syntactic and chunk category information into the hierarchical phrase-based translation model. Unlike traditional methods that add syntactic or chunk information to the translation model, this method uses both kinds of knowledge at once (source-side parsing and shallow parsing) and introduces them as a much weaker constraint, which avoids the data sparseness and error propagation that overly strong constraints cause. Fourth, a bilingual named entity alignment and translation model based on an attention-based neural network is proposed for the first time. Compared with traditional named entity alignment models, it exploits global context and avoids the probability underestimation that results from deriving posterior probabilities by maximum likelihood estimation, so it aligns and translates bilingual named entities more accurately.
Finally, an attention-based end-to-end neural machine translation system is built. Asynchronous stochastic gradient descent (ASGD) and hierarchical decomposition are used to improve training efficiency, and the results above are applied to Chinese word segmentation, the large-scale bilingual training corpus, and named entity recognition to improve translation performance. The system is tested, compared, and analyzed on translation tasks in multiple domains.
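The perplexity-based sentence pair quality evaluation summarized in the abstract can be illustrated with a minimal sketch. The record does not describe the thesis's actual network or scoring procedure, so everything below is an assumption made for illustration: the names `perplexity`, `filter_pairs`, `toy_log_probs`, and `TOY_LEXICON` are hypothetical, and the toy scorer merely stands in for a trained neural model that would return per-token log-probabilities of the target sentence given the source sentence. The sketch shows only the generic idea of scoring a pair by perplexity, exp(-mean log p), and discarding pairs above a threshold.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[List[str], List[str]]


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities: exp(-mean log p)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def filter_pairs(
    pairs: Sequence[Pair],
    score_fn: Callable[[List[str], List[str]], List[float]],
    max_perplexity: float,
) -> List[Pair]:
    """Keep the sentence pairs whose perplexity does not exceed the threshold."""
    kept = []
    for src, tgt in pairs:
        if perplexity(score_fn(src, tgt)) <= max_perplexity:
            kept.append((src, tgt))
    return kept


# Hypothetical stand-in for a neural scorer (NOT the thesis's model):
# a target token gets probability 0.5 if the toy lexicon links it to a
# source token, and 0.01 otherwise.
TOY_LEXICON = {"猫": "cat", "狗": "dog"}


def toy_log_probs(src: List[str], tgt: List[str]) -> List[float]:
    expected = {TOY_LEXICON[w] for w in src if w in TOY_LEXICON}
    return [math.log(0.5 if w in expected else 0.01) for w in tgt]


if __name__ == "__main__":
    corpus = [(["猫"], ["cat"]), (["猫"], ["dog"])]
    for src, tgt in filter_pairs(corpus, toy_log_probs, max_perplexity=10.0):
        print(src, tgt)
```

In a real pipeline, `score_fn` would wrap a trained model, and the threshold would be tuned on held-out data; both choices are left open here.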
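Keywords	machine translation; neural networks; bilingual word segmentation; corpus filtering; syntactic constraints; named entities
Document type	Thesis
Item identifier	http://ir.ia.ac.cn/handle/173211/11807
Collection	Graduates_Doctoral Dissertations
Author affiliation	Institute of Automation, Chinese Academy of Sciences
First author affiliation	Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Chen Wei. Research on Neural Network-Based Machine Translation Techniques (基于神经网络的机器翻译技术研究) [D]. Beijing: Graduate University of Chinese Academy of Sciences, 2016.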

Files in This Item
File name/size	Document type	Version type	Access type	License
陈炜-毕业论文-final.pdf (2115KB)	Thesis		Restricted access	CC BY-NC-SA