Integrating an external language model into a sequence-to-sequence speech recognition system is non-trivial. Previous works use linear interpolation or a fusion network to integrate external language models. However, these approaches introduce external components and increase decoding computation. In this paper, we instead propose a knowledge-distillation-based training approach for integrating external language models into a sequence-to-sequence model. A recurrent neural network language model, trained on large-scale external text, generates soft labels to guide the training of the sequence-to-sequence model; the language model thus plays the role of the teacher. This approach adds no external component to the sequence-to-sequence model during testing, and it can be flexibly combined with the shallow fusion technique for decoding. Experiments are conducted on the public Chinese datasets AISHELL-1 and CLMAD. Our approach achieves a character error rate of 9.3%, an 18.42% relative reduction compared with the vanilla sequence-to-sequence model.
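For concreteness, below is a minimal sketch (written in PyTorch, which the abstract does not specify) of a distillation-style training objective of the kind described above: the sequence-to-sequence decoder is trained against an interpolation of the hard transcription labels and the soft labels produced by the pretrained RNN language model teacher. The names distillation_loss, lambda_kd, and pad_id are illustrative assumptions, not taken from the paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_probs, targets, lambda_kd=0.5, pad_id=0):
    # student_logits: (batch, time, vocab) raw decoder outputs of the seq2seq model
    # teacher_probs:  (batch, time, vocab) soft labels from the RNN language model
    # targets:        (batch, time) ground-truth character ids
    log_probs = F.log_softmax(student_logits, dim=-1)

    # Standard cross-entropy against the hard transcription labels.
    hard_ce = F.nll_loss(log_probs.transpose(1, 2), targets, ignore_index=pad_id)

    # Cross-entropy against the teacher's soft label distribution, masking padding.
    mask = (targets != pad_id).unsqueeze(-1).float()
    soft_ce = -(teacher_probs * log_probs * mask).sum() / mask.sum()

    # lambda_kd (assumed name) balances LM knowledge against the hard labels.
    return (1.0 - lambda_kd) * hard_ce + lambda_kd * soft_ce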
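The shallow fusion technique mentioned above can be sketched in the same spirit (again an assumed illustration, not the paper's code): at each beam-search step, the sequence-to-sequence log-probabilities are linearly combined with the external language model's log-probabilities, with an assumed fusion weight beta.

import torch.nn.functional as F

def fused_step_scores(s2s_logits, lm_logits, beta=0.3):
    # s2s_logits, lm_logits: (beam, vocab) logits for the next character.
    # Returns log-domain scores used to rank candidate beam extensions.
    s2s_logp = F.log_softmax(s2s_logits, dim=-1)
    lm_logp = F.log_softmax(lm_logits, dim=-1)
    return s2s_logp + beta * lm_logp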
Ye Bai, Jiangyan Yi, Jianhua Tao, et al. Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition. In Proc. Interspeech, 2019.