CASIA OpenIR

Browse/Search results: 11 items in total, showing items 1-10

Multi-Cue Guided Semi-Supervised Learning Toward Target Speaker Separation in Real Environments (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, Volume: 32, Pages: 151-163
Authors: Xu, Jiaming; Cui, Jian; Hao, Yunzhe; Xu, Bo
Views/Downloads: 84/0  |  Submitted: 2024/02/22
Cocktail party problem  target speaker separation  multi-cue guided separation  semi-supervised learning  
Towards Unified Multi-Domain Machine Translation With Mixture of Domain Experts (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, Volume: 31, Pages: 3488-3498
Authors: Lu, Jinliang; Zhang, Jiajun
Adobe PDF(2882Kb)  |  Views/Downloads: 148/8  |  Submitted: 2023/12/21
Machine Translation  Multi-domain  Mixture-of-expert  
Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, Volume: 31, Pages: 2534-2547
Authors: Li, Xingfeng; Shi, Xiaohan; Hu, Desheng; Li, Yongwei; Zhang, Qingchen; Wang, Zhengxia; Unoki, Masashi; Akagi, Masato
Views/Downloads: 92/0  |  Submitted: 2023/11/17
Affective computing  speech emotion recognition  acoustic representation  music theory and speech analysis  
F0-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 3375-3383
Authors: Li, Yongwei; Tao, Jianhua; Erickson, Donna; Liu, Bin; Akagi, Masato
Views/Downloads: 149/0  |  Submitted: 2021/12/28
Speech recognition  Iterative methods  Production  Estimation  Brain modeling  Shape  Low-frequency noise  Glottal source  vocal tract  source-filter model  ARX-LF model  
Medical Term and Status Generation From Chinese Clinical Dialogue With Multi-Granularity Transformer (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 3362-3374
Authors: Li, Mei; Xiang, Lu; Kang, Xiaomian; Zhao, Yang; Zhou, Yu; Zong, Chengqing
Adobe PDF(3036Kb)  |  Views/Downloads: 306/69  |  Submitted: 2021/12/28
Medical diagnostic imaging  Transformers  Task analysis  Medical services  Computational modeling  Semantics  Data mining  Medical dialogue  multi-granularity  attention mechanism  natural language understanding  sequence to sequence learning  
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 1340-1351
Authors: Bai, Ye; Yi, Jiangyan; Tao, Jianhua; Wen, Zhengqi; Tian, Zhengkun; Zhang, Shuai
Views/Downloads: 195/0  |  Submitted: 2021/06/07
End-to-End  language modeling  speech recognition  teacher-student learning  transfer learning  
CTNet: Conversational Transformer Network for Emotion Recognition (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 985-1000
Authors: Lian, Zheng; Liu, Bin; Tao, Jianhua
Adobe PDF(2230Kb)  |  Views/Downloads: 391/62  |  Submitted: 2021/05/06
Emotion recognition  Context modeling  Feature extraction  Fuses  Speech processing  Data models  Bidirectional control  Context-sensitive modeling  conversational transformer network (CTNet)  conversational emotion recognition  multimodal fusion  speaker-sensitive modeling  
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 198-209
Authors: Fan, Cunhang; Yi, Jiangyan; Tao, Jianhua; Tian, Zhengkun; Liu, Bin; Wen, Zhengqi
Adobe PDF(2534Kb)  |  Views/Downloads: 428/55  |  Submitted: 2021/03/08
Speech enhancement  Speech recognition  Training  Noise measurement  Logic gates  Acoustic distortion  Task analysis  Gated recurrent fusion  robust end-to-end speech recognition  speech distortion  speech enhancement  speech transformer  
End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, Volume: 28, Issue: 28, Pages: 1303-1314
Authors: Fan, Cunhang; Tao, Jianhua; Liu, Bin; Yi, Jiangyan; Wen, Zhengqi; Liu, Xuefei
Adobe PDF(1344Kb)  |  Views/Downloads: 333/69  |  Submitted: 2020/06/22
Feature extraction  Training  Interference  Speech enhancement  Clustering algorithms  Spectrogram  Speech separation  end-to-end post-filter  deep attention fusion features  deep clustering  permutation invariant training  
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS (journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, Volume: 27, Issue: 12, Pages: 2067-2079
Authors: Zheng, Yibin; Tao, Jianhua; Wen, Zhengqi; Yi, Jiangyan
Views/Downloads: 369/0  |  Submitted: 2020/03/30
Decoding  Training  Speech processing  Linguistics  Acoustics  Speech recognition  Forward-backward  regularization  encoder-decoder with attention  end-to-end  joint-training  TTS