Automatic extraction of geometric lip features with application to multi-modal speaker identification
In this paper we consider the problem of automatically extracting geometric lip features for multi-modal speaker identification. Visual information from the mouth region can be important for improving speaker identification performance in noisy conditions. We propose a novel method for automated lip feature extraction that uses a color space transformation and a fuzzy c-means clustering technique. Using the obtained visual cues, closed-set audio-visual speaker identification experiments are performed on the CUAVE database [1], showing promising results.
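The abstract does not say which color space transformation, clustering settings, or geometric features the method uses, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a pseudo-hue transform R / (R + G), two clusters, a fuzzifier m = 2, and the bounding-box width and height of the lip region as the geometric features. All function names are hypothetical and none of these choices should be read as the authors' implementation.

```python
import numpy as np


def pseudo_hue(rgb):
    """Assumed color transform: map an (H, W, 3) RGB image to R / (R + G)."""
    rgb = rgb.astype(np.float64)
    r, g = rgb[..., 0], rgb[..., 1]
    return r / (r + g + 1e-9)  # small constant avoids division by zero


def fuzzy_c_means(x, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on 1-D samples.

    Returns (centers, memberships); memberships has shape
    (n_clusters, n_samples) and each column sums to 1.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=np.float64).ravel()
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        um = u ** m
        # Centers are membership-weighted means of the samples.
        centers = um @ x / um.sum(axis=1)
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=0, keepdims=True)
        converged = np.max(np.abs(new_u - u)) < tol
        u = new_u
        if converged:
            break
    return centers, u


def lip_mask(rgb):
    """Label as lip the pixels that mostly belong to the redder cluster."""
    feat = pseudo_hue(rgb)
    centers, u = fuzzy_c_means(feat.ravel(), n_clusters=2)
    lip_cluster = int(np.argmax(centers))  # assumption: lips have higher pseudo-hue
    return (u[lip_cluster] > 0.5).reshape(feat.shape)


def geometric_features(mask):
    """Illustrative geometric features: bounding-box (width, height) in pixels."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return 0, 0
    return int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1)
```

To use the sketch, pass a cropped mouth-region image as an `(H, W, 3)` RGB array to `lip_mask` and feed the resulting boolean mask to `geometric_features`. A real system would also need face and mouth localization and some cleanup of the mask, neither of which is shown here.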
Arsic2006_1477.pdf