Abstract

In this paper, we investigate the task of addressee estimation in multi-party interactions. For every utterance from a human participant, the robot should know whether it is being addressed, so that it can respond and behave accordingly. Various cues can be used to accomplish this, the most important being the speaker's gaze. In addition, several other cues can serve as contextual variables that improve estimation accuracy, for example the gaze of the other participants and the long-term or short-term dialogue context. In this paper, we investigate the possibility of combining information from such diverse sources to improve addressee estimation. For this study, we use 11 interactions in which the humanoid robot NAO gives a quiz to two human participants.
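As a rough illustration of this kind of cue fusion, the minimal sketch below combines three hypothetical per-utterance features (the speaker's gaze toward the robot, the other participant's gaze, and a simple dialogue-context flag) into a binary "robot is addressed" decision. The feature names, weights, and threshold are illustrative assumptions, not the model or features used in this study.

```python
import numpy as np

# Hypothetical per-utterance features (illustrative only, not the paper's features):
#   speaker_gaze   - fraction of the utterance the speaker spends gazing at the robot
#   other_gaze     - fraction of the utterance the other participant gazes at the robot
#   dialog_context - 1.0 if the robot produced the previous turn, else 0.0
def addressee_score(speaker_gaze, other_gaze, dialog_context,
                    weights=(0.6, 0.2, 0.2)):
    """Fuse the cues into a single 'robot is addressed' score in [0, 1]."""
    cues = np.array([speaker_gaze, other_gaze, dialog_context])
    return float(np.dot(weights, cues))

def is_robot_addressed(speaker_gaze, other_gaze, dialog_context, threshold=0.5):
    """Binary addressee decision obtained by thresholding the fused score."""
    return addressee_score(speaker_gaze, other_gaze, dialog_context) >= threshold

# Example: the speaker looks at the robot for most of the utterance and the
# robot asked the previous quiz question -> likely addressed to the robot.
print(is_robot_addressed(speaker_gaze=0.8, other_gaze=0.1, dialog_context=1.0))
```

In practice the weights and threshold would be learned from annotated interactions rather than fixed by hand; the sketch only shows how gaze and dialogue-context cues can be combined into one decision.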
