Audiovisual Diarization Of People In Video Content
Audio-Visual People Diarization (AVPD) is an original framework that simultaneously improves audio, video, and audiovisual diarization results. We first review the literature on people diarization for audio and for video content, including our own contributions, and discuss its limitations. We then describe a method that associates audio and video information using co-occurrence matrices, and present experiments conducted on a corpus of TV news, TV debates, and movies. The results show the effectiveness of the overall diarization system and confirm the gains that audio information can bring to video indexing and vice versa.
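To make the idea of a co-occurrence matrix concrete, the following minimal sketch accumulates, for each pair of audio cluster and video cluster, the total time during which both are active, and then links each speaker to the face it overlaps with most. This is an illustration only and not the association procedure of the paper: the segment format `(start, end, label)`, the function names, the toy labels, and the simple best-overlap rule are all assumptions made for this example.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Duration (in seconds) shared by two time intervals, 0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def cooccurrence(audio_segments, video_segments):
    """Build a speaker-by-face matrix of shared durations.

    Both inputs are lists of (start, end, label) tuples. This segment
    format is an assumption of the sketch, not taken from the paper.
    """
    matrix = defaultdict(lambda: defaultdict(float))
    for a_start, a_end, speaker in audio_segments:
        for v_start, v_end, face in video_segments:
            shared = overlap(a_start, a_end, v_start, v_end)
            if shared > 0:
                matrix[speaker][face] += shared
    return matrix


def associate(matrix):
    """Link each speaker to the face with the largest shared duration.

    A deliberately simple rule chosen for illustration.
    """
    return {speaker: max(row, key=row.get) for speaker, row in matrix.items()}


# Hypothetical toy data: two speakers and two faces over 25 seconds.
audio = [(0.0, 10.0, "spk_A"), (10.0, 18.0, "spk_B"), (18.0, 25.0, "spk_A")]
video = [(0.0, 9.0, "face_1"), (9.0, 19.0, "face_2"), (19.0, 25.0, "face_1")]

matrix = cooccurrence(audio, video)
pairs = associate(matrix)
print(pairs)
```

In this toy example the speaker and face boundaries are slightly misaligned, so each speaker co-occurs briefly with the other face, yet the accumulated durations still favor the intended pairing.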
Khoury_MTAP_2012.pdf
openaccess
2.38 MB
Adobe PDF