User behavior fusion in dialog management with multi-modal history cues
Yang, Minghao; Tao, Jianhua; Chao, Linlin; Li, Hao; Zhang, Dawei; Che, Hao; Gao, Tingli; Liu, Bin
Source Publication: MULTIMEDIA TOOLS AND APPLICATIONS
2015-11-01
Volume: 74, Issue: 22, Pages: 10025-10051
Subtype: Article
Abstract: Making a talking avatar sensitive to user behaviors enhances the user experience in human-computer interaction (HCI). In this study, we combine the user's multi-modal behaviors with the behaviors' historical information in dialog management (DM) to improve the avatar's sensitivity not only to the user's explicit behavior (speech commands) but also to the user's supporting expressions (emotion, gesture, etc.). In the dialog management, according to the different contributions of facial expression, gesture and head motion to speech comprehension, we divide the user's multi-modal behaviors into three categories: complementation, conflict and independence. The behavior categories are first obtained automatically from a short-term and time-dynamic (STTD) fusion model with audio-visual input. Different behavior categories lead to different avatar responses in later dialog turns. Usually, conflict behavior reflects an ambiguous user intention (for example, the user says "no" while smiling). In this case, a trial-and-error schema is adopted to eliminate the conversational ambiguity. For the later dialog process, we divide all the avatar dialog states into four types: "Ask", "Answer", "Chat" and "Forget". When complementation and independence behaviors are detected, the user's supporting expressions, as well as the user's explicit behavior, can be used as triggers for topic maintenance or transfer among the four dialog states. In the first part of the experiments, we discuss the reliability of the STTD model for user behavior classification. Based on the proposed dialog management and the STTD model, we then construct a driving route information query system by connecting the user behavior sensitive dialog management (BSDM) to a 3D talking avatar. Practical conversation records between the avatar and different users show that the BSDM enables the avatar to understand and respond to the users' facial expressions, emotional voice and gestures, which improves the user experience in multi-modal human-computer conversation.
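The decision flow outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The polarity encoding, the `categorize` rule standing in for the STTD model's output, and the specific state transitions in `next_action` are all hypothetical assumptions made for illustration. Only the three behavior categories, the four dialog states, and the use of a clarifying trial-and-error step on conflict come from the abstract.

```python
from enum import Enum
from typing import Tuple


class Behavior(Enum):
    # The three behavior categories named in the abstract.
    COMPLEMENTATION = "complementation"
    CONFLICT = "conflict"
    INDEPENDENCE = "independence"


class DialogState(Enum):
    # The four avatar dialog states named in the abstract.
    ASK = "Ask"
    ANSWER = "Answer"
    CHAT = "Chat"
    FORGET = "Forget"


def categorize(speech_polarity: int, expression_polarity: int) -> Behavior:
    """Toy stand-in for the STTD fusion model (an assumption, not the paper's model).

    Polarities are -1 (negative), 0 (neutral or unrelated), +1 (positive).
    """
    if speech_polarity == 0 or expression_polarity == 0:
        return Behavior.INDEPENDENCE
    if speech_polarity == expression_polarity:
        return Behavior.COMPLEMENTATION
    return Behavior.CONFLICT


def next_action(behavior: Behavior, current: DialogState) -> Tuple[DialogState, str]:
    """Pick the next dialog state and a topic action (hypothetical rules)."""
    if behavior is Behavior.CONFLICT:
        # Trial-and-error: ask a clarifying question to resolve the ambiguity.
        return DialogState.ASK, "clarify"
    if behavior is Behavior.COMPLEMENTATION:
        # Assumed rule: agreement between modalities keeps the current topic.
        return current, "maintain_topic"
    # Assumed rule: an unrelated expression may trigger a topic transfer.
    return DialogState.CHAT, "transfer_topic"


# The abstract's example: the user says "no" (negative) while smiling (positive).
category = categorize(speech_polarity=-1, expression_polarity=+1)
state, action = next_action(category, DialogState.ANSWER)
```

In this sketch the abstract's "no while smiling" case is classified as conflict and routed to a clarifying "Ask" turn. A real system would replace `categorize` with a classifier over audio-visual features and their history.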
Keywords: Dialog Management (DM); Multi-modal Data Fusion; Human Computer Interaction (HCI); Emotion Detection
WOS Headings: Science & Technology; Technology
DOI: 10.1007/s11042-014-2161-5
Indexed By: SCI
Language: English
Funding Organization: National Natural Science Foundation of China (NSFC) (61273288, 61233009, 61203258, 61530503, 61332017, 61375027); Major Program for the National Social Science Fund of China (13ZD189)
WOS Research Area: Computer Science; Engineering
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
WOS ID: WOS:000364019400011
Citation statistics
Cited Times: 2 [WOS]
Document Type: Journal article
Identifier: http://ir.ia.ac.cn/handle/173211/10496
Collection: National Laboratory of Pattern Recognition, Speech Interaction
Affiliation: Chinese Acad Sci, Natl Lab Pattern Recognit, Inst Automat, Beijing, Peoples R China
Recommended Citation
GB/T 7714: Yang, Minghao, Tao, Jianhua, Chao, Linlin, et al. User behavior fusion in dialog management with multi-modal history cues[J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74(22): 10025-10051.
APA: Yang, Minghao., Tao, Jianhua., Chao, Linlin., Li, Hao., Zhang, Dawei., ... & Liu, Bin. (2015). User behavior fusion in dialog management with multi-modal history cues. MULTIMEDIA TOOLS AND APPLICATIONS, 74(22), 10025-10051.
MLA: Yang, Minghao, et al. "User behavior fusion in dialog management with multi-modal history cues". MULTIMEDIA TOOLS AND APPLICATIONS 74.22 (2015): 10025-10051.
Files in This Item:
File Name/Size: 2015_MHCI_Multimedia (2008KB)
DocType: Journal article
Version: Author accepted manuscript
Access: Open access
License: CC BY-NC-SA
File name: 2015_MHCI_Multimedia Tools and Applications_SCI.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.