Information Fusion and Person Verification Using Speech & Face Information

This report provides an overview of important concepts in the field of information fusion, followed by a review of literature pertaining to audio-visual person identification & verification. Several recent adaptive and non-adaptive techniques for reaching the verification decision (i.e., to accept or reject the claimant), based on audio and visual information, are evaluated in clean and noisy conditions on a common database using a text-independent setup. It is shown that in clean conditions all the non-adaptive approaches provide similar performance; in noisy conditions they exhibit deterioration in their performance. It is also shown that current adaptive approaches are either inadequate or utilize restrictive assumptions. A new category of classifiers is then introduced, where the decision surface is fixed but constructed to take into account the effects of noisy conditions, providing a good trade-off between performance in clean and noisy conditions.

Related material