Abstract

In this paper we address the problem of estimating who is speaking in group meetings from automatically extracted, low-resolution visual cues. Traditionally, speech/non-speech detection or speaker diarization finds who speaks and when using audio features alone. Recent work has addressed the problem audio-visually, but often with less emphasis on the visual component. Because the audio stream can easily be lost during video conferences, this work proposes methods for estimating speaking status from low-resolution visual cues alone. We carry out experiments comparing how context, obtained by observing group behaviour and task-oriented activities, can improve estimates of speaking status. We test on 105 minutes of natural meeting data with unconstrained conversations.
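
To illustrate the general idea of visual-only speaking-status estimation, here is a minimal toy baseline, not the authors' method: threshold smoothed motion energy in a low-resolution crop of each participant. The function names, window length, and threshold below are hypothetical assumptions for illustration only.

```python
# Illustrative sketch only; the abstract does not specify the estimator.
# Smoothed motion energy in a low-resolution person crop serves as a
# crude proxy for speaking status. All names and constants are assumed.
import numpy as np

def motion_energy(frames: np.ndarray) -> np.ndarray:
    """Mean absolute frame difference per time step.

    frames: (T, H, W) array of low-resolution grayscale crops of one person.
    Returns an array of length T-1, one activity value per frame pair.
    """
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    return diffs.mean(axis=(1, 2))

def estimate_speaking(frames: np.ndarray, window: int = 15,
                      threshold: float = 2.0) -> np.ndarray:
    """Binary speaking/non-speaking estimate from visual activity alone."""
    energy = motion_energy(frames)
    # Smooth over a short window so brief gestures do not flip the label.
    kernel = np.ones(window) / window
    smoothed = np.convolve(energy, kernel, mode="same")
    return smoothed > threshold

# Example on synthetic data: 300 frames of a 20x20-pixel crop.
rng = np.random.default_rng(0)
frames = rng.normal(size=(300, 20, 20))
labels = estimate_speaking(frames)
print(labels.shape, labels.dtype)
```

In practice such a baseline would be applied per participant and evaluated against ground-truth speaking segments; the paper's contribution concerns how group-level context can improve on purely individual cues.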
