CASIA OpenIR

Browse/Search Results: 17 items in total, showing items 1-10

A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition [Journal article]
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2023, Volume: E106A, Issue: 6, Pages: 876-885
Authors: Liu, Yang; Xia, Yuqi; Sun, Haoqin; Meng, Xiaolei; Bai, Jianxiong; Guan, Wenbo; Zhao, Zhen; Li, Yongwei
Views/Downloads: 64/0  |  Submitted: 2023/11/17
speech emotion recognition  non-personalized features  cascaded attention network  multitask learning  self-adaption loss  
Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, Volume: 31, Pages: 2534-2547
Authors: Li, Xingfeng; Shi, Xiaohan; Hu, Desheng; Li, Yongwei; Zhang, Qingchen; Wang, Zhengxia; Unoki, Masashi; Akagi, Masato
Views/Downloads: 54/0  |  Submitted: 2023/11/17
Affective computing  speech emotion recognition  acoustic representation  music theory and speech analysis  
Train from scratch: Single-stage joint training of speech separation and recognition [Journal article]
COMPUTER SPEECH AND LANGUAGE, 2022, Volume: 76, Pages: 15
Authors: Shi, Jing; Chang, Xuankai; Watanabe, Shinji; Xu, Bo
Views/Downloads: 200/0  |  Submitted: 2022/07/25
Cocktail party problem  Speech separation  Multi-speaker speech recognition  End-to-end  Joint-training  
Adversarial-Metric Learning for Audio-Visual Cross-Modal Matching [Journal article]
IEEE TRANSACTIONS ON MULTIMEDIA, 2022, Volume: 24, Pages: 338-351
Authors: Zheng, Aihua; Hu, Menglan; Jiang, Bo; Huang, Yan; Yan, Yan; Luo, Bin
Views/Downloads: 222/0  |  Submitted: 2022/03/17
Visualization  Task analysis  Measurement  Speech recognition  Videos  Location awareness  Image recognition  Adversarial learning  audio-visual matching  cross-modal learning  metric learning  
On Learning Semantic Representations for Large-Scale Abstract Sketches [Journal article]
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, Volume: 31, Issue: 9, Pages: 3366-3379
Authors: Xu, Peng; Huang, Yongye; Yuan, Tongtong; Xiang, Tao; Hospedales, Timothy M.; Song, Yi-Zhe; Wang, Liang
Views/Downloads: 176/0  |  Submitted: 2021/11/03
Semantics  Visualization  Task analysis  Games  Feature extraction  Quantization (signal)  Speech recognition  Practical sketch-based application  semantic representation  hashing  retrieval  zero-shot recognition  edge-map dataset  
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 198-209
Authors: Fan, Cunhang; Yi, Jiangyan; Tao, Jianhua; Tian, Zhengkun; Liu, Bin; Wen, Zhengqi
Adobe PDF (2534Kb)  |  Views/Downloads: 362/47  |  Submitted: 2021/03/08
Speech enhancement  Speech recognition  Training  Noise measurement  Logic gates  Acoustic distortion  Task analysis  Gated recurrent fusion  robust end-to-end speech recognition  speech distortion  speech enhancement  speech transformer  
CTNet: Conversational Transformer Network for Emotion Recognition [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 985-1000
Authors: Lian, Zheng; Liu, Bin; Tao, Jianhua
Adobe PDF (2230Kb)  |  Views/Downloads: 325/58  |  Submitted: 2021/05/06
Emotion recognition  Context modeling  Feature extraction  Fuses  Speech processing  Data models  Bidirectional control  Context-sensitive modeling  conversational transformer network (CTNet)  conversational emotion recognition  multimodal fusion  speaker-sensitive modeling  
Integrating Knowledge Into End-to-End Speech Recognition From External Text-Only Data [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 1340-1351
Authors: Bai, Ye; Yi, Jiangyan; Tao, Jianhua; Wen, Zhengqi; Tian, Zhengkun; Zhang, Shuai
Views/Downloads: 157/0  |  Submitted: 2021/06/07
End-to-End  language modeling  speech recognition  teacher-student learning  transfer learning  
F0-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 3375-3383
Authors: Li, Yongwei; Tao, Jianhua; Erickson, Donna; Liu, Bin; Akagi, Masato
Views/Downloads: 112/0  |  Submitted: 2021/12/28
Speech recognition  Iterative methods  Production  Estimation  Brain modeling  Shape  Low-frequency noise  Glottal source  vocal tract  source-filter model  ARX-LF model  
Forward-Backward Decoding Sequence for Regularizing End-to-End TTS [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2019, Volume: 27, Issue: 12, Pages: 2067-2079
Authors: Zheng, Yibin; Tao, Jianhua; Wen, Zhengqi; Yi, Jiangyan
Views/Downloads: 315/0  |  Submitted: 2020/03/30
Decoding  Training  Speech processing  Linguistics  Acoustics  Speech recognition  Forward-backward  regularization  encoder-decoder with attention  end-to-end  joint-training  TTS