MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation


	Wang Kaisiyuan(王凯思源)1; Song LS(宋林森)2,3; Wu QY(吴潜溢)2,3; Yang ZQ(杨卓谦)4; Wu WY(吴文岩)1; Qian C(钱晨)1; He R(赫然)2,3; Qiao Y(乔宇)5; Loy, Chen Change6
	2020-08-23
Conference name	European Conference on Computer Vision
Conference date	2020-08-23
Conference location	Glasgow
Abstract	The synthesis of natural emotional reactions is an essential criterion in vivid talking-face video generation. This criterion is nevertheless seldom taken into consideration in previous works due to the absence of a large-scale, high-quality emotional audio-visual dataset. To address this issue, we build the Multi-view Emotional Audio-visual Dataset (MEAD), a talking-face video corpus featuring 60 actors and actresses talking with eight different emotions at three different intensity levels. High-quality audio-visual clips are captured at seven different view angles in a strictly controlled environment. Together with the dataset, we release an emotional talking-face generation baseline that enables the manipulation of both emotion and its intensity. Our dataset could benefit a number of different research fields including conditional generation, cross-modal understanding and expression recognition. Code, model and data are publicly available on our project page.
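Indexed by	EI
Language	English
Seven major directions: sub-direction classification	Image and video processing and analysis
State Key Laboratory planned research direction	Visual information processing
Paper-associated dataset requiring deposit	No
Document type	Conference paper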
Item identifier	http://ir.ia.ac.cn/handle/173211/52265
Collection	Pattern Recognition Laboratory
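Author affiliations	1. Beijing SenseTime Technology Co., Ltd. 2. Institute of Automation, Chinese Academy of Sciences 3. University of Chinese Academy of Sciences 4. Carnegie Mellon University 5. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences 6. Nanyang Technological University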
Recommended citation (GB/T 7714)	Wang Kaisiyuan, Song LS, Wu QY, et al. MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation[C], 2020.

Files in this item
File name/size	Document type	Version type	Access type	License
[ECCV2020] MEAD.pdf (8588 KB)	Conference paper		Open access	CC BY-NC-SA