CASIA OpenIR

Browse/Search Results:  1-10 of 298

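Research on Optimization Modeling Methods for Multi-channel Speech Enhancement [Thesis]
Institute of Automation, Chinese Academy of Sciences: University of Chinese Academy of Sciences, 2021
Authors:  Li Guanjun (李冠君)
Adobe PDF(5732Kb)  |  Submit date:2021/06/07
Multi-channel speech enhancement  Non-point-source noise scenarios  Point-source noise scenarios  Complex noise scenarios  Automatic speech recognition  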
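Research on Speaker-Independent Speech Separation Methods Based on Multi-Domain Auditory Feature Modeling [Thesis]
Beijing: Institute of Automation, Chinese Academy of Sciences, 2021
Authors:  Fan Cunhang (范存航)
Adobe PDF(3377Kb)  |  Submit date:2021/06/01
Speaker-independent speech separation  Auditory feature modeling  Deep embedding features  Deep attention fusion features  Gated recurrent fusion  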
Object Reconstruction Based on Attentive Recurrent Network from Single and Multiple Images [Journal article]
NEURAL PROCESSING LETTERS, 2021, Issue: 53, Pages: 18
Authors:  Gao, Zishu;  Li, En;  Wang, Zhe;  Yang, Guodong;  Lu, Jiwu;  Ouyang, Bo;  Xu, Dawei;  Liang, Zize
Adobe PDF(1338Kb)  |  Submit date:2021/03/01
Object reconstruction  Convolutional LSTM  Visual attention  Robotic application  
Deep Audio-Visual Learning: A Survey [Journal article]
International Journal of Automation and Computing, 2021, Volume: 18, Issue: 3, Pages: 351-376
Authors:  Hao Zhu;  Man-Di Luo;  Rui Wang;  Ai-Hua Zheng;  Ran He
Adobe PDF(1864Kb)  |  Submit date:2021/05/24
Deep audio-visual learning  audio-visual separation and localization  correspondence learning  generative models  representation learning  
CTNet: Conversational Transformer Network for Emotion Recognition [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Volume: 29, Pages: 985-1000
Authors:  Lian, Zheng;  Liu, Bin;  Tao, Jianhua
Submit date:2021/05/06
Emotion recognition  Context modeling  Feature extraction  Fuses  Speech processing  Data models  Bidirectional control  Context-sensitive modeling  conversational transformer network (CTNet)  conversational emotion recognition  multimodal fusion  speaker-sensitive modeling  
Hybrid Approach to Document Anomaly Detection: An Application to Facilitate RPA in Title Insurance [Journal article]
International Journal of Automation and Computing, 2021, Volume: 18, Issue: 1, Pages: 55-72
Authors:  Abhijit Guha;  Debabrata Samanta
Adobe PDF(1485Kb)  |  Submit date:2021/02/23
Anomaly detection  title insurance  autoencoder  one-class support vector machine (OSVM)  term frequency – inverse document frequency (TF-IDF)  robotic process automation  dimensionality reduction  
A time-frequency channel attention and vectorization network for automatic depression level prediction [Journal article]
Neurocomputing, 2021, Issue: 450, Pages: 208-218
Authors:  Niu MY(牛明月)
Adobe PDF(2001Kb)  |  Submit date:2021/06/01
Sphere embedding normalization  DenseNet  Transition layer  Time-frequency channel attention block  Time-frequency vectorization block  Depression detection  
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [Conference paper]
Hong Kong, 24-27 Jan. 2021
Authors:  Fan, Cunhang;  Liu, Bin;  Tao, Jianhua;  Yi, Jiangyan;  Wen, Zhengqi;  Song, Leichao
Adobe PDF(934Kb)  |  Submit date:2021/06/01
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition [Journal article]
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, Issue: 29, Pages: 198-209
Authors:  Fan, Cunhang;  Yi, Jiangyan;  Tao, Jianhua;  Tian, Zhengkun;  Liu, Bin;  Wen, Zhengqi
Adobe PDF(2534Kb)  |  Submit date:2021/03/08
Speech enhancement  Speech recognition  Training  Noise measurement  Logic gates  Acoustic distortion  Task analysis  Gated recurrent fusion  robust end-to-end speech recognition  speech distortion  speech enhancement  speech transformer  
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-channel Speech Separation with Deep Embedding Representations [Conference paper]
Shanghai, China, October 25–29, 2020
Authors:  Fan, Cunhang;  Tao, Jianhua;  Liu, Bin;  Yi, Jiangyan;  Wen, Zhengqi
Adobe PDF(260Kb)  |  Submit date:2021/06/01