Multimodal Integration for Meeting Group Action Segmentation and Recognition

Zhang, Dong

doi:10.1007/11677482_5

conference paper

Al-Hames, Marc

Dielmann, Alfred

Gatica-Perez, Daniel

2005

MLMI 2005: Machine Learning for Multimodal Interaction

We address the problem of segmentation and recognition of sequences of multimodal human interactions in meetings. These interactions can be seen as a rough structure of a meeting, and can be used either as input for a meeting browser or as a first step towards a higher semantic analysis of the meeting. A common lexicon of multimodal group meeting actions, a shared meeting data set, and a common evaluation procedure enable us to compare the different approaches. We compare three different multimodal feature sets and four modelling infrastructures: a higher semantic feature approach, multi-layer HMMs, a multi-stream DBN, as well as a multi-stream mixed-state DBN for disturbed data.

Name

mlmi-05-joint.pdf

Access type

openaccess

Size

141.86 KB

Format

Adobe PDF

Checksum (MD5)

3631ca163aa444bd74f6b23ebebad406