CASIA OpenIR
(This search is based on the user's claimed works)

Browse/Search results: 92 items in total, items 1-10

Spatial reconstructed local attention Res2Net with F0 subband for fake speech detection (Journal article)
NEURAL NETWORKS, 2024, Volume: 175, Pages: 11
Authors:  Fan, Cunhang;  Xue, Jun;  Tao, Jianhua;  Yi, Jiangyan;  Wang, Chenglong;  Zheng, Chengshi;  Lv, Zhao
Views/Downloads: 22/0  |  Submitted: 2024/07/04
ASVspoof  Fake speech detection  Fundamental frequency  Res2Net  
SceneFake: An initial dataset and benchmarks for scene fake audio detection (Journal article)
PATTERN RECOGNITION, 2024, Volume: 152, Pages: 12
Authors:  Yi, Jiangyan;  Wang, Chenglong;  Tao, Jianhua;  Zhang, Chu Yuan;  Fan, Cunhang;  Tian, Zhengkun;  Ma, Haoxin;  Fu, Ruibo
Views/Downloads: 16/0  |  Submitted: 2024/07/04
Scene manipulation  Fake audio detection  Speech enhancement  SceneFake dataset  
WavDepressionNet: Automatic Depression Level Prediction via Raw Speech Signals (Journal article)
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, Volume: 15, Issue: 1, Pages: 285-296
Authors:  Niu, Mingyue;  Tao, Jianhua;  Li, Yongwei;  Qin, Yong;  Li, Ya
Views/Downloads: 5/0  |  Submitted: 2024/07/03
Assessment block  depression level prediction  representation block  speech signals  WavDepressionNet  
Emotion selectable end-to-end text-based speech editing (Journal article)
ARTIFICIAL INTELLIGENCE, 2024, Volume: 329, Pages: 16
Authors:  Wang, Tao;  Yi, Jiangyan;  Fu, Ruibo;  Tao, Jianhua;  Wen, Zhengqi;  Zhang, Chu Yuan
Views/Downloads: 11/0  |  Submitted: 2024/07/03
Emotion selectable  Text-based speech editing  Emotion decoupling  Mask prediction  Few-shot learning  Text-to-speech  
Efficient multimodal transformer with dual-level feature restoration for robust multimodal sentiment analysis (Journal article)
IEEE Transactions on Affective Computing, 2023, Volume: 15, Issue: 1, Pages: 1-17
Authors:  Licai Sun;  Zheng Lian;  Bin Liu;  Jianhua Tao
Adobe PDF(2371Kb)  |  Views/Downloads: 65/18  |  Submitted: 2024/05/31
Transformers  Robustness  Semantics  Data models  Computational modeling  Videos  Training  Multimodal sentiment analysis  unaligned and incomplete data  efficient multimodal Transformer  dual-level feature restoration  robustness  
HiCMAE: Hierarchical Contrastive Masked Autoencoder for self-supervised Audio-Visual Emotion Recognition (Journal article)
Information Fusion, 2024, Volume: 108, Pages: 1-20
Authors:  Licai Sun;  Zheng Lian;  Bin Liu;  Jianhua Tao
Adobe PDF(2281Kb)  |  Views/Downloads: 52/12  |  Submitted: 2024/05/31
Audio-Visual Emotion Recognition  Self-supervised learning  Masked autoencoder  Contrastive learning  
GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition (Journal article)
Information Fusion, 2024, Pages: 1-12
Authors:  Zheng Lian;  Licai Sun;  Haiyang Sun;  Kang Chen;  Zhuofan Wen;  Hao Gu;  Bin Liu;  Jianhua Tao
Adobe PDF(6888Kb)  |  Views/Downloads: 63/9  |  Submitted: 2024/05/31
VLP2MSA: Expanding vision-language pre-training to multimodal sentiment analysis (Journal article)
KNOWLEDGE-BASED SYSTEMS, 2024, Volume: 283, Pages: 9
Authors:  Yi, Guofeng;  Fan, Cunhang;  Zhu, Kang;  Lv, Zhao;  Liang, Shan;  Wen, Zhengqi;  Pei, Guanxiong;  Li, Taihao;  Tao, Jianhua
Views/Downloads: 108/0  |  Submitted: 2024/02/22
Multimodal sentiment analysis  Vision-language  Multimodal fusion  
Spatial-temporal knowledge graph network for event prediction (Journal article)
NEUROCOMPUTING, 2023, Volume: 553, Pages: 11
Authors:  Huai, Zepeng;  Zhang, Dawei;  Yang, Guohua;  Tao, Jianhua
Views/Downloads: 111/0  |  Submitted: 2023/11/17
Multi-event prediction  Knowledge graph  Dynamic graph embedding  
Adversarial Multi-Task Learning for Mandarin Prosodic Boundary Prediction With Multi-Modal Embeddings (Journal article)
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2023, Volume: 31, Pages: 2963-2973
Authors:  Yi, Jiangyan;  Tao, Jianhua;  Fu, Ruibo;  Wang, Tao;  Zhang, Chu Yuan;  Wang, Chenglong
Views/Downloads: 80/0  |  Submitted: 2023/11/17
Adversarial training  multi-task learning  prosodic boundaries  speech synthesis  multi-modal embeddings