Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese
Shiyu Zhou1,2; Linhao Dong1,2; Shuang Xu1; Bo Xu1
2018
Conference name: Interspeech
Proceedings title: Interspeech
Issue: 2018
Conference date: 2018
Conference location: Hyderabad, India
Abstract

Sequence-to-sequence attention-based models, which integrate acoustic,
pronunciation and language models into a single neural network, have
recently shown very promising results on automatic speech recognition
(ASR) tasks. Among these models, the Transformer, a new
sequence-to-sequence attention-based model relying entirely on
self-attention without using RNNs or convolutions, achieves a new
single-model state-of-the-art BLEU on neural machine translation (NMT)
tasks. Given the outstanding performance of the Transformer, we extend
it to speech and adopt it as the basic architecture of a
sequence-to-sequence attention-based model for Mandarin Chinese ASR
tasks. Furthermore, we compare a syllable-based model with a
context-independent phoneme (CI-phoneme) based model, both using the
Transformer, in Mandarin Chinese. Additionally, a greedy cascading
decoder with the Transformer is proposed for mapping CI-phoneme
sequences and syllable sequences into word sequences. Experiments on
HKUST datasets demonstrate that the syllable-based model with the
Transformer performs better than its CI-phoneme based counterpart and
achieves a character error rate (CER) of 28.77%, which is competitive
with the state-of-the-art CER of 28.0% achieved by the joint
CTC-attention based encoder-decoder network.
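
For readers unfamiliar with the metric, the character error rate quoted above is conventionally the character-level edit distance between hypothesis and reference transcripts, divided by the total number of reference characters. The following is a minimal sketch under that standard definition; it is not the authors' scoring script, and the function names `edit_distance` and `cer` and the example sentences are hypothetical illustrations.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total character edits / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total = sum(len(r) for r in references)
    if total == 0:
        raise ValueError("references contain no characters")
    return errors / total


if __name__ == "__main__":
    # Hypothetical example: one substituted character out of six.
    print(cer(["今天天气很好"], ["今天天汽很好"]))
```

Each Chinese character counts as one unit here, which is why CER rather than word error rate is the usual metric for Mandarin ASR.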

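Keywords: ASR; Multi-head Attention; Syllable Based Acoustic Modeling; Sequence-to-sequence
Discipline: Engineering::Computer Science and Technology (degrees in engineering or science may be conferred)
Indexed by: EI
Language: English
Document type: Conference paper
Item identifier: http://ir.ia.ac.cn/handle/173211/22392
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Corresponding author: Shiyu Zhou
Affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences
Recommended citation
GB/T 7714
Shiyu Zhou, Linhao Dong, Shuang Xu, et al. Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese[C], 2018.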
Files in this item
File name/size: [2018 Interspeech]Sy(416KB); Document type: Conference paper; Access type: Open access; License: CC BY-NC-SA
File name: [2018 Interspeech]Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese.pdf
Format: Adobe PDF