Abstract

A recent internet-based survey of over 35,000 samples has shown that when different human observers are asked to assign labels to static human facial expressions, different individuals often categorize the same image differently. This results in the lack of a unique ground truth, an assumption made by the large majority of existing classification models. The problem is particularly acute for highly ambiguous expressions, especially in the absence of a dynamic context. In this paper we propose to address this shortcoming by using Discrete Choice Models (DCM) to describe the choice a human observer faces when assigning labels to static facial expressions. Models of increasing complexity are specified to capture the causal relationship between the features of an image and its associated expression, using several combinations of measurements. The sets of measurements we use are largely inspired by the Facial Action Coding System (FACS), but also introduce some new ideas specific to a static framework. These models are calibrated using maximum likelihood techniques and compared with one another using likelihood ratio tests, in order to assess whether adding supplemental features yields a statistically significant improvement. Through a cross-validation procedure we assess the robustness of our approach to overfitting, and we provide a comparison with an alternative model based on Neural Networks for benchmarking purposes.
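As an aside, the model-comparison step mentioned above can be sketched concretely. For two nested models calibrated by maximum likelihood, the likelihood ratio statistic is -2 times the difference of their final log-likelihoods, which under the null hypothesis is asymptotically chi-squared distributed with degrees of freedom equal to the number of added parameters. The snippet below is a minimal illustration of this standard test, not code from the paper; the function name and arguments are hypothetical.

```python
from scipy.stats import chi2

def likelihood_ratio_test(ll_restricted: float, ll_unrestricted: float, df: int):
    """Test whether an unrestricted model (with extra features) fits
    significantly better than a nested, restricted model.

    ll_restricted / ll_unrestricted: log-likelihoods at the maximum
    likelihood estimates; df: number of additional parameters.
    """
    # LR statistic; asymptotically chi^2(df) under the null hypothesis
    lr = -2.0 * (ll_restricted - ll_unrestricted)
    p_value = chi2.sf(lr, df)  # survival function = 1 - CDF
    return lr, p_value

# Illustrative values only: a small p-value would justify keeping the
# supplemental features in the richer model.
lr, p = likelihood_ratio_test(ll_restricted=-1250.4, ll_unrestricted=-1238.9, df=4)
print(f"LR = {lr:.2f}, p = {p:.4f}")
```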

Details