Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese
Shiyu Zhou (1,2); Linhao Dong (1,2); Shuang Xu (1); Bo Xu (1)
2018
Conference Name: Interspeech
Source Publication: Interspeech
Issue: 2018
Conference Date: 2018
Conference Place: Hyderabad, India
Abstract

Sequence-to-sequence attention-based models, which integrate the acoustic, pronunciation and language models into a single neural network, have recently shown very promising results on automatic speech recognition (ASR) tasks. Among these models, the Transformer, a sequence-to-sequence attention-based model relying entirely on self-attention without using RNNs or convolutions, has achieved a new single-model state-of-the-art BLEU score on neural machine translation (NMT) tasks. Given the Transformer's outstanding performance, we extend it to speech and adopt it as the basic architecture of a sequence-to-sequence attention-based model for Mandarin Chinese ASR tasks. Furthermore, we compare a syllable-based model and a context-independent phoneme (CI-phoneme) based model with the Transformer in Mandarin Chinese. Additionally, a greedy cascading decoder with the Transformer is proposed for mapping CI-phoneme sequences and syllable sequences into word sequences. Experiments on the HKUST dataset demonstrate that the syllable-based model with the Transformer performs better than its CI-phoneme-based counterpart, achieving a character error rate (CER) of 28.77%, which is competitive with the state-of-the-art CER of 28.0% obtained by the joint CTC-attention based encoder-decoder network.
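
The record's keywords highlight multi-head attention, the core operation of the Transformer encoder-decoder described in the abstract. As a rough, generic illustration (not code from the paper), a minimal NumPy sketch of multi-head scaled dot-product self-attention might look like the following; the sequence length, model size, and head count are arbitrary choices for demonstration.

```python
# Generic multi-head scaled dot-product self-attention in NumPy.
# Illustrative only; not the authors' implementation.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (seq_len, d_model); w_*: (d_model, d_model) projection matrices."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project the inputs and split them into heads: (num_heads, seq_len, d_head)
    def split(m):
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)

    # Scaled dot-product attention computed independently per head
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                      # (heads, seq, d_head)

    # Concatenate the heads and apply the output projection
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

# Toy usage: 10 acoustic frames with a 64-dimensional model size
rng = np.random.default_rng(0)
x = rng.normal(size=(10, 64))
w = [rng.normal(size=(64, 64)) * 0.1 for _ in range(4)]
out = multi_head_self_attention(x, *w, num_heads=4)
print(out.shape)  # (10, 64)
```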
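
The "greedy cascading decoder" in the abstract can be pictured as two greedy Transformer decoding passes run back to back: acoustic features are decoded into a syllable (or CI-phoneme) sequence, and that sequence is then decoded into Chinese characters. The sketch below is only a schematic of that pipeline under assumed interfaces: `acoustic_step` and `syllable_to_char_step` are hypothetical stand-ins for trained models (they return random logits so the example runs), and the token ids and vocabulary sizes are made up.

```python
# Schematic of a two-stage greedy cascade: audio -> syllables -> characters.
# The step functions are stubs, not the paper's trained Transformers.
import numpy as np

rng = np.random.default_rng(1)
SOS, EOS = 0, 1  # assumed start/end-of-sequence token ids

def acoustic_step(features, prefix):
    """Stub: logits over a pretend syllable vocabulary given audio + decoded prefix."""
    return rng.normal(size=100)

def syllable_to_char_step(syllables, prefix):
    """Stub: logits over a pretend character vocabulary given syllables + prefix."""
    return rng.normal(size=4000)

def greedy_decode(step_fn, source, max_len=20):
    """Pick the argmax token at every step until EOS or max_len is reached."""
    prefix = [SOS]
    while len(prefix) < max_len:
        token = int(np.argmax(step_fn(source, prefix)))
        if token == EOS:
            break
        prefix.append(token)
    return prefix[1:]

features = rng.normal(size=(50, 40))                          # fake acoustic frames
syllables = greedy_decode(acoustic_step, features)            # stage 1: audio -> syllables
characters = greedy_decode(syllable_to_char_step, syllables)  # stage 2: syllables -> characters
print(len(syllables), len(characters))
```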

Keyword: ASR; Multi-head Attention; Syllable-Based Acoustic Modeling; Sequence-to-sequence
MOST Discipline Catalogue: Engineering::Computer Science and Technology (degrees may be conferred in engineering or science)
Indexed By: EI
Language: English
Document Type: Conference Paper
Identifier: http://ir.ia.ac.cn/handle/173211/22392
Collection: Research Center for Digital Content Technology and Services _ Auditory Models and Cognitive Computing
Corresponding Author: Shiyu Zhou
Affiliation: 1. Institute of Automation, Chinese Academy of Sciences; 2. University of Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Recommended Citation
GB/T 7714
Shiyu Zhou, Linhao Dong, Shuang Xu, et al. Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese[C], 2018.
Files in This Item:
File Name: [2018 Interspeech]Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese.pdf (416 KB)
DocType: Conference Paper | Format: Adobe PDF | Access: Open Access | License: CC BY-NC-SA
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.