Comparison of Two Methods for Unsupervised Person Identification in TV Shows

We address the task of identifying people who appear in TV shows. The target persons are all people whose identity is spoken or written, such as journalists and well-known figures (politicians, athletes, celebrities, etc.). In our approach, overlaid names displayed on the images are used to identify persons without any biometric models of speakers or faces. Two identification methods are evaluated as part of the French REPERE evaluation campaign. The first relies on co-occurrence times between overlaid person names and speaker/face clusters, together with rule-based decisions that assign a name to each monomodal cluster. The second uses a Conditional Random Field (CRF) that combines different types of co-occurrence statistics and pairwise constraints to jointly identify speakers and faces.
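
To make the idea behind the first method concrete, here is a minimal sketch of co-occurrence-based naming. It is an illustration under stated assumptions, not the authors' actual rule set: the function names, the segment format, the sample names and the single "longest total co-occurrence wins" rule are all hypothetical.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Return the duration (in seconds) shared by two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def name_clusters(cluster_segments, overlay_segments):
    """Assign each cluster the overlaid name it co-occurs with longest.

    cluster_segments: list of (cluster_id, start, end) tuples
    overlay_segments: list of (name, start, end) tuples
    Returns {cluster_id: name}; clusters with no co-occurrence stay unnamed.
    Assumption: a single max-duration rule stands in for the rule-based
    decisions mentioned in the abstract, whose details are not given here.
    """
    cooc = defaultdict(lambda: defaultdict(float))
    for cluster_id, c_start, c_end in cluster_segments:
        for name, n_start, n_end in overlay_segments:
            duration = overlap(c_start, c_end, n_start, n_end)
            if duration > 0:
                cooc[cluster_id][name] += duration
    return {cid: max(names, key=names.get) for cid, names in cooc.items()}


# Hypothetical speaker clusters and overlaid-name segments (seconds).
clusters = [
    ("spk1", 0.0, 10.0),
    ("spk2", 10.0, 20.0),
    ("spk1", 20.0, 30.0),
    ("spk3", 30.0, 35.0),
]
overlays = [
    ("Alice Martin", 2.0, 6.0),
    ("Bob Durand", 9.0, 15.0),
    ("Alice Martin", 22.0, 25.0),
]
result = name_clusters(clusters, overlays)
```

In this toy input, `spk3` never overlaps an overlaid name and therefore remains unnamed. The same function could be applied to face clusters, since it only depends on time segments.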

Presented at:
12th International Workshop on Content-Based Multimedia Indexing

 Record created 2014-06-19, last modified 2018-03-17
