Statistical methods, including phrase-based and hierarchical phrase-based systems, currently dominate the field of machine translation. The language model plays an important role in a statistical translation system: it makes the output conform to the grammar of the target language. This dissertation investigates, through a series of experiments, how the scale of the Chinese language model and its n-gram order affect English-Chinese machine translation systems. The main contributions are as follows:
1. A study of the framework of the phrase-based system and each of its functional modules: language model training, translation model training, the decoder, minimum error rate training, and post-processing.
2. A description of the implementation of a hierarchical phrase-based statistical translation system.
3. A study of how to process a Chinese-English parallel corpus for machine translation, taking it from its raw state to a usable one, together with the development of a form for preprocessing the corpus.
4. A study of the effects of the Chinese language model's scale and n-gram order on English-Chinese machine translation systems.
Experiments show that, with the same language model, the hierarchical phrase-based MT system outperforms the phrase-based MT system, while for a given MT system the scale and n-gram order of the language model noticeably affect the BLEU score. However, a larger, higher-order language model does not necessarily give a better result. In general, this dissertation focuses on preprocessing of the training data, implementation of the machine translation systems, and the scale of Chinese language models for statistical machine translation systems, all of which greatly improved the translation results.
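As a rough illustration of what "n-gram order" means for the language model discussed above, the following is a minimal Python sketch of a count-based n-gram model with add-one smoothing. It is not the toolkit or configuration used in this work: the function names, the toy pinyin corpus, and the smoothing choice are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, order):
    """Count n-grams of the given order and their (order - 1)-word contexts.

    Hypothetical helper for illustration; `sentences` is a list of token lists.
    """
    ngrams, contexts, vocab = Counter(), Counter(), set()
    for words in sentences:
        # Pad so that the first real word also has a full-length context.
        padded = ["<s>"] * (order - 1) + list(words) + ["</s>"]
        vocab.update(padded)
        for i in range(order - 1, len(padded)):
            context = tuple(padded[i - order + 1:i])
            ngrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return ngrams, contexts, vocab


def log_prob(words, model, order):
    """Add-one smoothed log-probability of a tokenized sentence."""
    ngrams, contexts, vocab = model
    padded = ["<s>"] * (order - 1) + list(words) + ["</s>"]
    total = 0.0
    for i in range(order - 1, len(padded)):
        context = tuple(padded[i - order + 1:i])
        numerator = ngrams[context + (padded[i],)] + 1
        denominator = contexts[context] + len(vocab)
        total += math.log(numerator / denominator)
    return total


# Toy target-side corpus (assumed data, romanized for simplicity).
corpus = [["wo", "xihuan", "mao"], ["wo", "xihuan", "gou"]]
bigram_model = train_ngram_counts(corpus, order=2)

fluent = log_prob(["wo", "xihuan", "mao"], bigram_model, order=2)
scrambled = log_prob(["mao", "xihuan", "wo"], bigram_model, order=2)
print(fluent > scrambled)
```

In this sketch the model scores the word order seen in training higher than the scrambled one, which is how a language model steers a decoder toward fluent output. Raising `order` lengthens the context each word is conditioned on, and adding sentences to `corpus` increases the model's scale; these are the two factors varied in the experiments.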