CASIA OpenIR > Graduates > Master's Theses
Alternative Title: The Design and Implementation of Decoder for Statistical Machine Translation
Thesis Advisor: Zong Chengqing (宗成庆)
Degree Grantor: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of Conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Degree Discipline: Computer Application Technology
Keywords: Statistical Machine Translation; Decoder; Minimum Error Rate Training; Spoken Language Translation
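Abstract: Machine translation is an important research direction in natural language processing. In recent years, statistical machine translation has achieved great success: phrase-based translation systems have dominated machine translation evaluations and obtained leading results. The maximum entropy model makes it easy to add different knowledge sources and has become the mainstream framework for statistical translation. This thesis studies the design and implementation of a statistical machine translation decoder and the construction of an experimental platform for statistical machine translation. The main contents are summarized as follows:
(1) A minimum error rate parameter training method was implemented. Minimum error rate training of the maximum entropy translation model parameters directly uses the evaluation criterion for translation output as the optimization criterion, which can improve the quality of parameter training to a certain extent. The implementation provides a flexible and convenient tool module for experimental system development and platform construction.
(2) A beam-search-based decoder was designed and implemented. The execution efficiency and extensibility of the algorithm were fully considered during implementation, laying a foundation for building statistical translation systems.
(3) An experimental platform for statistical translation systems was established. Building on the above work and existing techniques, an experimental platform for statistical translation systems was built. The platform provides a rich set of functional options and interfaces, which facilitates in-depth research on statistical translation systems.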
Other Abstract: Statistical machine translation (SMT) is one of the most important research fields in natural language processing. In recent years, SMT has shown considerable success, and recent empirical evaluations suggest that phrase-based translation models are the state of the art. Most SMT systems are now based on the maximum entropy (ME) model. This thesis covers the design and implementation of an SMT decoder and the building of an SMT experiment platform. The main work is summarized as follows:
(1) Minimum Error Rate Training in Statistical Machine Translation. Minimum Error Rate (MER) training improves the performance of the SMT system by directly using the evaluation criterion as the training criterion. The MER implementation provides a tool for the experiment platform.
(2) The Design and Implementation of a Statistical Machine Translation Decoder. Efficiency and extensibility are considered in the design and implementation of the SMT decoder. The decoder is the basis of the experiment platform for SMT.
(3) The Experiment Platform for Statistical Machine Translation. An SMT experiment platform is built on the above work and previous technologies. The platform provides a wide range of functions and a convenient environment for SMT researchers.
Other Identifier: 200428014628066
Document Type: Thesis
Recommended Citation
GB/T 7714
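Chai Chunguang (柴春光). 统计机器解码器的设计与实现 [The Design and Implementation of Decoder for Statistical Machine Translation] [D]. Institute of Automation, Chinese Academy of Sciences. Graduate University of Chinese Academy of Sciences, 2007.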
Files in This Item:
File Name/Size: CASIA_20042801462806 (981KB)
Access: Not yet open
License: CC BY-NC-SA