1. A bidirectionally smoothed word-based translation model. The translation model is the key component of machine translation. For the word-based baseline SMT system, the thesis proposes a word-based translation model smoothed by the bidirectional translation probabilities of the source and target sides. This smoothing reduces the noise introduced during training of the IBM translation models. Compared with the original word-based model, it raises the BLEU score by about 0.009 in experiments.
2. A “null-expanding” beam search algorithm. In phrase-based statistical translation, differences in expression habits between languages mean that some target words which occur frequently but have zero fertility in the IBM models need to be supplied during decoding. We call them F-zerowords, and the path expansion corresponding to these words is called “null-expanding”. The thesis proposes a “null-expanding” beam search algorithm that lets the F-zerowords amend the output, gaining a BLEU increase of at least 0.01. Moreover, the thesis traces back the result by searching through the final several stacks rather than only the final stack, as the traditional method does, which further improves the translation result.
3. Multi-feature translation model training, with phrase templates extracted to address the data sparseness and reordering problems. The thesis proposes training the translation model with four features so that each offsets the shortcomings of the others. The features are combined in a log-linear form, and their weights are trained by a minimum error rate training procedure. The thesis also proposes extracting N_template, which corresponds to named-entity phrases, and X_template, which corresponds to non-named-entity phrases, to deal with the data sparseness and distortion problems.
4. A string-to-tree alignment template-based translation model. The thesis proposes solving the translation and reordering problems by extracting tree-structured string-to-tree alignment templates.
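To illustrate the log-linear combination mentioned in contribution 3, the following is a minimal sketch rather than the thesis's implementation. The abstract does not name the four features, so the feature names, probability values, and weights below are all hypothetical; in practice the weights would come from the minimum error training step rather than being set by hand.

```python
import math


def log_linear_score(features, weights):
    """Score one translation candidate as sum_i weight_i * log(feature_i).

    `features` maps a feature name to a probability in (0, 1];
    `weights` maps the same names to their log-linear weights.
    """
    return sum(weights[name] * math.log(value) for name, value in features.items())


# Hypothetical weights for four hypothetical features (not from the thesis).
weights = {
    "p_e_given_f": 1.0,    # assumed: target-given-source phrase probability
    "p_f_given_e": 1.0,    # assumed: source-given-target phrase probability
    "lex_e_given_f": 0.5,  # assumed: target-given-source lexical weight
    "lex_f_given_e": 0.5,  # assumed: source-given-target lexical weight
}

# Two made-up candidate translations with made-up feature values.
candidates = {
    "candidate A": {"p_e_given_f": 0.5, "p_f_given_e": 0.4,
                    "lex_e_given_f": 0.3, "lex_f_given_e": 0.2},
    "candidate B": {"p_e_given_f": 0.2, "p_f_given_e": 0.6,
                    "lex_e_given_f": 0.4, "lex_f_given_e": 0.1},
}

# The decoder would prefer the candidate with the highest combined score.
best = max(candidates, key=lambda name: log_linear_score(candidates[name], weights))
print(best)
```

Working in log space keeps the combination a weighted sum, so a feature that is weak for one candidate can be compensated by the others, which is the sense in which the features offset each other's shortcomings.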