Knowledge Commons of Institute of Automation, CAS
Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR
Wang FY (王方圆)
2022
Conference name | ICONIP 2022 |
Conference date | 2022.11.28 |
Conference location | Indore, India |
Abstract | Currently, there are three main kinds of Transformer-encoder-based streaming end-to-end (E2E) automatic speech recognition (ASR) approaches: time-restricted methods, chunk-wise methods, and memory-based methods. Generally, each of them has limitations in linear computational complexity, global context modeling, or parallel training. In this work, we aim to build a streaming Transformer ASR model that offers all three of these advantages. In particular, we propose a shifted chunk mechanism for the chunk-wise Transformer that provides cross-chunk connections. As a result, the global context modeling ability of chunk-wise models is significantly enhanced while all of their original merits are retained. We integrate this scheme with the chunk-wise Transformer and Conformer, and name the resulting models SChunk-Transformer and SChunk-Conformer, respectively. Experiments on AISHELL-1 show that the SChunk-Transformer and SChunk-Conformer achieve CERs of 6.43% and 5.77%, respectively. Their linear complexity makes it possible to train with large batches and to run inference more efficiently. Our models significantly outperform their conventional chunk-wise counterparts while remaining competitive with U2, which has quadratic complexity, with an absolute CER gap of only 0.22. They also achieve a better CER than existing chunk-wise or memory-based methods, such as HS-DACS and MMA. Code is released at https://github.com/wangfangyuan/SChunk-Encoder. |
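Seven major directions: sub-direction classification | Speech recognition and synthesis |
State Key Laboratory planned direction classification | Speech and language processing |
Associated dataset to be deposited | No |
Document type | Conference paper |
Item identifier | http://ir.ia.ac.cn/handle/173211/57382 |
Collection | Laboratory of Cognition and Decision Intelligence for Complex Systems_Auditory Models and Cognitive Computing |
Corresponding author | Wang FY (王方圆) |
Author affiliation | Institute of Automation, Chinese Academy of Sciences |
First author affiliation | Institute of Automation, Chinese Academy of Sciences |
Corresponding author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Wang FY, Xu B. Shifted Chunk Encoder for Transformer Based Streaming End-to-End ASR[C], 2022. |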
Files in this item |
File name/size | Document type | Version type | Access type | License |
published-iconip2022 (1374KB) | Conference paper | | Open access | CC BY-NC-SA |
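
The abstract describes chunk-wise attention with cross-chunk connections added by shifting the chunk boundaries. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a shifted-window style partition in which alternate layers offset the chunk boundaries by half a chunk. The function name `chunk_mask`, the half-chunk shift, and the toy sizes are all illustrative assumptions, and latency or look-ahead handling for streaming is ignored here.

```python
import numpy as np

def chunk_mask(num_frames: int, chunk_size: int, shift: int = 0) -> np.ndarray:
    """Boolean [T, T] mask: True where frame i may attend to frame j.

    Frames attend only within their own chunk. A non-zero `shift`
    (hypothetical parameter) moves the chunk boundaries.
    """
    chunk_ids = (np.arange(num_frames) + shift) // chunk_size
    return chunk_ids[:, None] == chunk_ids[None, :]

T, C = 8, 4                                  # toy sizes: 8 frames, chunks of 4
regular = chunk_mask(T, C)                   # chunks {0..3}, {4..7}
shifted = chunk_mask(T, C, shift=C // 2)     # chunks {0,1}, {2..5}, {6,7}

# Receptive field after a regular layer followed by a shifted layer:
# frame i sees input k if some frame j is in i's shifted chunk
# and k is in j's regular chunk.
reach = (shifted.astype(int) @ regular.astype(int)) > 0

print(regular[3, 4])   # frames 3 and 4 are separated by a regular boundary
print(shifted[3, 4])   # the shifted partition puts them in the same chunk
print(reach[3])        # inputs visible to frame 3 after both layers
```

In this toy setup each frame attends to at most `chunk_size` frames per layer, so the number of allowed attention pairs grows as `T * C` rather than `T * T`, which is the sense in which chunk-wise attention is linear in sequence length.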