MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation | |
王凯思源1; Song LS(宋林森)2,3 | |
2020-08-23 | |
Conference name | European Conference on Computer Vision |
Conference date | 2020-08-23 |
Conference location | Glasgow |
Abstract | The synthesis of natural emotional reactions is an essential criterion in vivid talking-face video generation. This criterion is nevertheless seldom taken into consideration in previous works due to the absence of a large-scale, high-quality emotional audio-visual dataset. To address this issue, we build the Multi-view Emotional Audio-visual Dataset (MEAD), a talking-face video corpus featuring 60 actors and actresses talking with eight different emotions at three different intensity levels. High-quality audio-visual clips are captured at seven different view angles in a strictly-controlled environment. Together with the dataset, we release an emotional talking-face generation baseline that enables the manipulation of both emotion and its intensity. Our dataset could benefit a number of different research fields including conditional generation, cross-modal understanding and expression recognition. Code, model and data are publicly available on our project page. |
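Indexed by | EI |
Language | English |
Seven major research directions: sub-direction | Image and video processing and analysis |
State Key Laboratory planned research direction | Visual information processing |
Paper-associated dataset requiring deposit | No |
Document type | Conference paper |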
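Item identifier | http://ir.ia.ac.cn/handle/173211/52265 |
Collection | Pattern Recognition Laboratory |
Author affiliations | 1. Beijing SenseTime Technology Co., Ltd. 2. Institute of Automation, Chinese Academy of Sciences 3. University of Chinese Academy of Sciences 4. Carnegie Mellon University 5. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences 6. Nanyang Technological University |
Recommended citation (GB/T 7714) | 王凯思源, Song LS, Wu QY, et al. MEAD: A Large-scale Audio-visual Dataset for Emotional Talking Face Generation[C], 2020. |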
Files in this item | |
File name/size | Document type | Version type | Access type | License |
[ECCV2020] MEAD.pdf (8588KB) | Conference paper | | Open access | CC BY-NC-SA |