Spoken language translation (SLT) is an important application of speech and language technology that draws on linguistics, computer science, speech recognition, communication, and other technologies, which makes SLT research highly significant. This thesis presents research on spoken language parsing, the automatic extraction of phrase translations, and the construction of an SLT experiment platform, all of which are important parts of SLT research. The main work is summarized as follows:

1. We propose a new approach to spoken Chinese parsing based on semantic classification trees (SCTs). In this approach, the SCTs, which are built from semantic rules learned automatically from the training data, are used to disambiguate the key words that bear on a sentence's shallow semantic meaning, and a statistical model is used to extract the shallow semantic meaning of the whole sentence, namely its domain action. The approach has the following strengths: (1) it is robust and easy to implement, because the rules are learned automatically from the training corpus; (2) its efficiency is improved, because it uses more context information than the HMM-based approach; (3) the different parts of a domain action can be combined freely with one another. The experimental results show that the approach performs well and is feasible for restricted-domain Chinese spoken language understanding at the shallow semantic level.

2. Statistical translation is a very important approach to spoken language translation. Phrase-based statistical translation models improve translation quality because they handle the relationships between words in a sentence better than word-based translation models do. One phrase-based approach integrates phrase translations into the system as knowledge sources, so the system's performance depends greatly on the quality of those phrase translations.
In this thesis, we propose a new approach to phrase translation extraction based on HMM-based word alignment. First, the bilingual sentences are word-aligned using an HMM; then the phrase translations are extracted from the alignment result and processed. The experimental results show that the phrase translations extracted by this approach are of high quality.

3. An SLT experiment platform is built on the above work and on existing technologies. The platform integrates a speech recognition module, a text-to-speech module, and three different translation approaches, and it provides a good environment for SLT research.
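
The extraction step summarized in item 2 (word alignment first, phrase translations second) can be illustrated with a minimal Python sketch. The abstract does not state which extraction criterion the thesis uses, so the sketch assumes the widely used alignment-consistency criterion: a source span and a target span form a phrase pair only if no word inside either span is aligned to a word outside the other. The function name `extract_phrase_pairs`, the `max_len` limit, and the example sentence pair with its alignment are all hypothetical; in practice the alignment would come from the HMM aligner rather than being written by hand.

```python
def extract_phrase_pairs(src, tgt, alignment, max_len=4):
    """Return phrase pairs that are consistent with a word alignment.

    src, tgt  : lists of tokens
    alignment : set of (i, j) pairs meaning src[i] is aligned to tgt[j]
    max_len   : maximum phrase length on either side (an assumed limit)

    Only the tightest target span is kept for each source span; extending
    spans over unaligned boundary words is left out for brevity.
    """
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any source word in src[i1..i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency check: no word in tgt[j1..j2] may be aligned
            # to a source word outside src[i1..i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            pairs.add((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs


# Hypothetical sentence pair and hand-written alignment, for illustration only.
src = ["我", "想", "预订", "房间"]
tgt = ["I", "want", "to", "book", "a", "room"]
alignment = {(0, 0), (1, 1), (2, 3), (3, 5)}

pairs = extract_phrase_pairs(src, tgt, alignment)
for s, t in sorted(pairs):
    print(s, "|||", t)
```

Unaligned target words such as "to" and "a" are absorbed only when they fall between aligned words, which is why the sketch yields "book a room" for "预订 房间" but never "to book". A fuller implementation would also score and filter the extracted pairs, which corresponds to the processing step mentioned in item 2.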