SSCFormer: Push the Limit of Chunk-Wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution
Wang FY(王方圆); Xu B(徐博); Xu B(徐波)
Source Publication: IEEE SIGNAL PROCESSING LETTERS
2024
Pages: 421-425
Abstract

Chunk-wise schemes are currently often used to make Automatic Speech Recognition (ASR) models support streaming deployment. However, existing approaches are unable to capture the global context, lack support for parallel training, or exhibit quadratic complexity in the computation of multi-head self-attention (MHSA). On the other hand, causal convolution, which uses no future context, has become the de facto module in streaming Conformer. In this letter, we propose SSCFormer to push the limit of chunk-wise Conformer for streaming ASR using the following two techniques: 1) a novel cross-chunk context generation method, named the Sequential Sampling Chunk (SSC) scheme, which re-partitions chunks from regularly partitioned chunks to facilitate efficient long-term contextual interaction within local chunks; 2) the Chunked Causal Convolution (C2Conv), which is designed to concurrently capture the left context and chunk-wise future context. Evaluations on AISHELL-1 show that an End-to-End (E2E) CER of 5.33% can be achieved, which even outperforms a strong time-restricted baseline, U2. Moreover, the chunk-wise MHSA computation in our model enables it to train with a large batch size and perform inference with linear complexity.
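
The linear-complexity claim for chunk-wise MHSA can be illustrated with a simple count of query-key pairs. The sketch below is not taken from the letter: it assumes non-overlapping chunks of a fixed, hypothetical size (`chunk_size`) with attention restricted to each chunk, and it ignores any additional context frames that the SSC scheme contributes. All function names are illustrative.

```python
def full_attention_scores(num_frames: int) -> int:
    """Query-key pairs scored when every frame attends to every frame."""
    return num_frames * num_frames


def chunkwise_attention_scores(num_frames: int, chunk_size: int) -> int:
    """Query-key pairs scored when frames attend only within their own chunk.

    Assumption: non-overlapping chunks; the last chunk may be shorter.
    """
    full_chunks, remainder = divmod(num_frames, chunk_size)
    return full_chunks * chunk_size * chunk_size + remainder * remainder


if __name__ == "__main__":
    chunk_size = 16  # hypothetical value, not from the letter
    for num_frames in (1024, 2048, 4096):
        print(
            num_frames,
            full_attention_scores(num_frames),
            chunkwise_attention_scores(num_frames, chunk_size),
        )
```

Under these assumptions, doubling the number of frames doubles the chunk-wise count (roughly `num_frames * chunk_size`) but quadruples the full-attention count (`num_frames ** 2`).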

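Indexed By: SCI
Language: English
Sub-direction classification: Speech Recognition and Synthesis
Planning direction of the State Key Laboratory: Speech and Language Processing
Paper associated data
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/57380
Collection: Laboratory of Cognition and Decision-Making for Complex Systems, Auditory Models and Cognitive Computing (复杂系统认知与决策实验室_听觉模型与认知计算)
Corresponding Author: Xu B(徐波)
Affiliation: Institute of Automation, Chinese Academy of Sciences
First Author Affiliation: Institute of Automation, Chinese Academy of Sciences
Corresponding Author Affiliation: Institute of Automation, Chinese Academy of Sciences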
Recommended Citation
GB/T 7714: Wang FY, Xu B, Xu B. SSCFormer: Push the Limit of Chunk-Wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution[J]. IEEE SIGNAL PROCESSING LETTERS, 2024: 421-425.
APA: Wang FY, Xu B, & Xu B. (2024). SSCFormer: Push the Limit of Chunk-Wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution. IEEE SIGNAL PROCESSING LETTERS, 421-425.
MLA: Wang FY, et al. "SSCFormer: Push the Limit of Chunk-Wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution." IEEE SIGNAL PROCESSING LETTERS (2024): 421-425.
Files in This Item:
File Name/Size: final_SSCFormer_Push (1843KB)
DocType: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: final_SSCFormer_Push_the_Limit_of_Chunk-Wise_Conformer_for_Streaming_ASR_Using_Sequentially_Sampled_Chunks_and_Chunked_Causal_Convolution.pdf
Format: Adobe PDF

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.