Knowledge Commons of Institute of Automation, CAS
Deep Neural Network-based Generalized Sidelobe Canceller for Dual-channel Far-field Speech Recognition | |
Li GJ(李冠君) | |
Journal | Neural Networks |
2021 | |
Issue | Volume 141, Pages 225-237 |
Article type | Journal |
Abstract | The traditional generalized sidelobe canceller (GSC) is a common speech enhancement front end to improve the noise robustness of automatic speech recognition (ASR) systems in the far-field cases. However, the traditional GSC is optimized based on the signal level criteria, causing it not to guarantee the optimal ASR performance. To address this issue, we propose a novel dual-channel deep neural network (DNN)-based GSC structure, called nnGSC, which is optimized by using the objective of maximizing the ASR performance. Our key idea is to make each module of the traditional GSC fully learnable and use the acoustic model to perform joint optimization with GSC. We use the coefficients of the traditional GSC to initialize nnGSC, so that both traditional signal processing knowledge and large amounts of data can be used to guide the network learning. In addition, nnGSC can automatically track the target direction-of-arrival (DOA) frame-by-frame without the need for additional localization algorithms. In the experiments, nnGSC achieves a relative character error rate (CER) improvement of 23.7% compared to the microphone observation, 13.5% compared to the oracle direction-based super-directive beamformer, 12.2% compared to the oracle direction-based traditional GSC and 5.9% compared to the oracle mask-based minimum variance distortionless response (MVDR) beamformer. Moreover, we can improve the robustness of nnGSC against array geometry mismatches by training with multi-geometry data. |
Keywords | Deep neural network; Generalized sidelobe canceller; Dual-channel; Far-field speech recognition |
WOS accession number | WOS:000681162400002 |
Seven major research directions: sub-direction classification | Speech recognition and synthesis |
Document type | Journal article |
Item identifier | http://ir.ia.ac.cn/handle/173211/44846 |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems, Intelligent Interaction |
Author affiliation | Institute of Automation, Chinese Academy of Sciences |
Recommended citation, GB/T 7714 | Li GJ. Deep Neural Network-based Generalized Sidelobe Canceller for Dual-channel Far-field Speech Recognition[J]. Neural Networks, 2021, 141: 225-237. |
APA | Li GJ. (2021). Deep Neural Network-based Generalized Sidelobe Canceller for Dual-channel Far-field Speech Recognition. Neural Networks, 141, 225-237. |
MLA | Li GJ. "Deep Neural Network-based Generalized Sidelobe Canceller for Dual-channel Far-field Speech Recognition." Neural Networks 141 (2021): 225-237. |
Files in this item |
File name/size | Document type | Version type | Access type | License |
20interspeech_DNN广义旁(1911KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA |
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.