A 3-D Audio-Visual Corpus of Affective Communication

Fanelli, Gabriele; Gall, Juergen; Romsdorfer, Harald; Weise, Thibaut; Van Gool, Luc

doi:10.1109/TMM.2010.2052239

research article

2010

IEEE Transactions on Multimedia

Communication between humans relies deeply on the ability to express and recognize feelings. For this reason, research on human-machine interaction needs to focus on the recognition and simulation of emotional states, a prerequisite of which is the collection of affective corpora. Currently available datasets still represent a bottleneck because of the difficulties arising during the acquisition and labeling of affective data. In this work, we present a new audio-visual corpus for what are possibly the two most important modalities used by humans to communicate their emotional states, namely speech and facial expression, the latter in the form of dense dynamic 3-D face geometries. We acquire high-quality data by working in a controlled environment and resort to video clips to induce affective states. The annotation of the speech signal includes transcription of the corpus text into a phonological representation, accurate phone segmentation, fundamental frequency extraction, and signal intensity estimation. We employ a real-time 3-D scanner to acquire dense dynamic facial geometries and track the faces throughout the sequences, achieving full spatial and temporal correspondences. The corpus is a valuable tool for applications such as affective visual speech synthesis or view-independent facial expression recognition.

Name: thumb_jgall_avcorpus_mm10.png
Type: Thumbnail
Access type: openaccess
License Condition: copyright
Size: 109.04 KB
Format: PNG
Checksum (MD5): d75708832b0d8d74766c18d9d3b7f0fb

Name: jgall_avcorpus_mm10.pdf
Type: Postprint
Version: http://purl.org/coar/version/c_ab4af688f83e57aa