Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition
Li, Xingfeng (1); Shi, Xiaohan (2); Hu, Desheng (3); Li, Yongwei (4); Zhang, Qingchen (1); Wang, Zhengxia (5); Unoki, Masashi (6); Akagi, Masato (6)
Journal: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
ISSN: 2329-9290
Year: 2023
Volume: 31; Pages: 2534-2547
Corresponding author: Li, Xingfeng (lixingfeng@hainanu.edu.cn)
Abstract: This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to address improved speech emotion recognition. The recognition of emotion in speech and music is developed in parallel, yet a relatively limited understanding of MTAR for interpreting speech emotions is involved. In the present study, we use music theory to study representative acoustics associated with emotion in speech from vocal emotion expressions and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, it shows promising performance compared with extensively used features for emotion recognition based on the spectrogram, Mel-spectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory.
Keywords: Affective computing; speech emotion recognition; acoustic representation; music theory and speech analysis
DOI: 10.1109/TASLP.2023.3289312
Keywords [WOS]: PERCEPTION ; EXPRESSION ; PATTERNS ; FEATURES ; PITCH ; PERSPECTIVE ; MODALITIES ; KNOWLEDGE ; INTERVALS ; COGNITION
Indexed by: SCI
Language: English
Funding projects: Key Research and Development Program of Hainan Province[ZDYF2021GXJS017] ; National Natural Science Foundation of China[82160345] ; National Natural Science Foundation of China[62201571] ; Key Science and Technology Plan Project of Haikou[2021-016]
Funders: Key Research and Development Program of Hainan Province ; National Natural Science Foundation of China ; Key Science and Technology Plan Project of Haikou
WOS research areas: Acoustics ; Engineering
WOS categories: Acoustics ; Engineering, Electrical & Electronic
WOS accession number: WOS:001025466100003
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Citation statistics
Times cited: 1 [WOS]
Document type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/53769
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems_Intelligent Interaction
Corresponding author: Li, Xingfeng
Affiliations:
1. Hainan Univ, Grad Sch Comp Sci & Technol, Haikou 570288, Peoples R China
2. Nagoya Univ, Sch Informat Sci, Nagoya 4648601, Japan
3. Taiyuan Univ Technol, Coll Informat & Comp, Taiyuan 030024, Peoples R China
4. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
5. Hainan Univ, Sch Comp Sci & Technol, Haikou 570288, Peoples R China
6. Japan Adv Inst Sci & Technol, Sch Informat Sci, Nomi 9231292, Japan
Recommended citation
GB/T 7714
Li, Xingfeng,Shi, Xiaohan,Hu, Desheng,et al. Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,2023,31:2534-2547.
APA Li, Xingfeng.,Shi, Xiaohan.,Hu, Desheng.,Li, Yongwei.,Zhang, Qingchen.,...&Akagi, Masato.(2023).Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition.IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,31,2534-2547.
MLA Li, Xingfeng,et al."Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition".IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 31(2023):2534-2547.