CASIA OpenIR

Browse/Search results: 15 items in total, showing items 1-10

Subband fusion of complex spectrogram for fake speech detection (Journal article)
SPEECH COMMUNICATION, 2023, Volume: 155, Pages: 8
Authors: Fan, Cunhang; Xue, Jun; Dong, Shunbo; Ding, Mingming; Yi, Jiangyan; Li, Jinpeng; Lv, Zhao
Views/Downloads: 8/0 | Submitted: 2024/03/26
Automatic speaker verification  Complex spectrogram  Fake speech detection  Phase information  Subband  
A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition (Journal article)
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2023, Volume: E106A, Issue: 6, Pages: 876-885
Authors: Liu, Yang; Xia, Yuqi; Sun, Haoqin; Meng, Xiaolei; Bai, Jianxiong; Guan, Wenbo; Zhao, Zhen; Li, Yongwei
Views/Downloads: 67/0 | Submitted: 2023/11/17
speech emotion recognition  non-personalized features  cascaded attention network  multitask learning  self-adaption loss  
Two-stage deep spectrum fusion for noise-robust end-to-end speech recognition (Journal article)
APPLIED ACOUSTICS, 2023, Volume: 212, Pages: 10
Authors: Fan, Cunhang; Ding, Mingming; Yi, Jiangyan; Li, Jinpeng; Lv, Zhao
Views/Downloads: 25/0 | Submitted: 2023/11/16
Robust end-to-end ASR  Speech enhancement  Masking and mapping  Speech distortion  Deep spectrum fusion  
SMIN: Semi-Supervised Multi-Modal Interaction Network for Conversational Emotion Recognition (Journal article)
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, Volume: 14, Issue: 3, Pages: 2415-2429
Authors: Lian, Zheng; Liu, Bin; Tao, Jianhua
Views/Downloads: 83/0 | Submitted: 2023/11/15
Emotion recognition  Feature extraction  Training  Acoustics  Semisupervised learning  Benchmark testing  Hidden Markov models  Semi-supervised multi-modal interaction network (SMIN)  conversational emotion recognition  semi-supervised learning  intra-modal interaction  cross-modal interaction  
SpecMNet: Spectrum mend network for monaural speech enhancement (Journal article)
APPLIED ACOUSTICS, 2022, Volume: 194, Pages: 9
Authors: Fan, Cunhang; Zhang, Hongmei; Yi, Jiangyan; Lv, Zhao; Tao, Jianhua; Li, Taihao; Pei, Guanxiong; Wu, Xiaopei; Li, Sheng
Views/Downloads: 218/0 | Submitted: 2022/07/25
Monaural speech enhancement  Speech distortion  Spectrum mend network  SI-SNR  BLSTM  
F0-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 3375-3383
Authors: Li, Yongwei; Tao, Jianhua; Erickson, Donna; Liu, Bin; Akagi, Masato
Views/Downloads: 118/0 | Submitted: 2021/12/28
Speech recognition  Iterative methods  Production  Estimation  Brain modeling  Shape  Low-frequency noise  Glottal source  vocal tract  source-filter model  ARX-LF model  
Semantic-diversity transfer network for generalized zero-shot learning via inner disagreement based OOD detector (Journal article)
KNOWLEDGE-BASED SYSTEMS, 2021, Volume: 229, Pages: 11
Authors: Liu, Bo; Dong, Qiulei; Hu, Zhanyi
Adobe PDF(1224Kb) | Views/Downloads: 330/67 | Submitted: 2021/11/04
Zero-shot learning  Visual-semantic embedding  Out-of-distribution detection  
Self-supervised graph representation learning via bootstrapping (Journal article)
NEUROCOMPUTING, 2021, Volume: 456, Pages: 88-96
Authors: Che, Feihu; Yang, Guohua; Zhang, Dawei; Tao, Jianhua; Liu, Tong
Adobe PDF(1379Kb) | Views/Downloads: 353/58 | Submitted: 2021/11/03
Graph representation learning  Self-supervised  Bootstrapping  Graph neural network  
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 1340-1351
Authors: Bai, Ye; Yi, Jiangyan; Tao, Jianhua; Wen, Zhengqi; Tian, Zhengkun; Zhang, Shuai
Views/Downloads: 163/0 | Submitted: 2021/06/07
End-to-End  language modeling  speech recognition  teacher-student learning  transfer learning  
CTNet: Conversational Transformer Network for Emotion Recognition (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 985-1000
Authors: Lian, Zheng; Liu, Bin; Tao, Jianhua
Adobe PDF(2230Kb) | Views/Downloads: 339/58 | Submitted: 2021/05/06
Emotion recognition  Context modeling  Feature extraction  Fuses  Speech processing  Data models  Bidirectional control  Context-sensitive modeling  conversational transformer network (CTNet)  conversational emotion recognition  multimodal fusion  speaker-sensitive modeling