Research on Text Normalization and End-to-End Modeling Methods for Speech Translation

Title	Research on Text Normalization and End-to-End Modeling Methods for Speech Translation
Author	董倩倩 (Dong Qianqian)
Date	2021-05
Pages	124
Degree type	Doctoral
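Chinese abstract (translated)	Speech translation is the process by which a machine automatically translates a speech signal in a source language into text in a target language; the basic idea is for the computer to serve, as a human interpreter would, as the translator between speakers of different languages. With the broad trend of globalization, international exchange and information transfer keep growing, and daily life, foreign trade, healthcare, e-commerce and other fields have a large and pressing demand for speech translation technology. Mainstream speech translation systems train a speech recognition module and a machine translation module separately and independently, and then simply concatenate the two. Recently, with the development of deep learning and the explosive growth of data, researchers have begun to explore end-to-end techniques. Against this background, this dissertation takes deep learning as its theoretical foundation. It first starts from the cascaded system and studies text normalization within it, and then investigates model architectures and training methods for end-to-end speech translation. The main research results of this dissertation are summarized as follows.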
English abstract	Speech translation is the task of having a machine automatically translate an audio signal in a source language into text in a target language. The basic idea is to let the computer act, like a human interpreter, as the translator between speakers of different languages. As globalization advances, international communication and information exchange are increasing, and there is a large and urgent need for speech translation technology in daily life, foreign trade, medical treatment, e-commerce and other fields. Mainstream speech translation systems build the speech recognition module and the machine translation module independently and then simply connect the two. Recently, with the development of deep learning and the explosive growth of data, researchers have begun to explore end-to-end approaches. Against this background, and building on deep learning, this dissertation first studies the text normalization method in the cascaded system, and then explores model structures and training methods for end-to-end speech translation. The main research results of this dissertation are summarized as follows.
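Keywords	speech translation; cascaded system; text normalization; end-to-end model
Language	Chinese
Document type	Dissertation
Item identifier	http://ir.ia.ac.cn/handle/173211/44968
Collection	复杂系统认知与决策实验室_听觉模型与认知计算
Recommended citation (GB/T 7714)	董倩倩. 面向语音翻译的文本规范化和端到端建模方法研究[D]. 中科院自动化所. 中科院自动化所,2021.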

Files in this item
File name/size	Document type	Version type	Access type	License
面向语音翻译的文本规范化和端到端建模方法 (4379 KB)	Dissertation		Open access	CC BY-NC-SA