Abstract

Humans express their actions and intentions through verbal and/or non-verbal communication. In verbal communication, humans use language to express, in structured linguistic terms, the action they wish to perform. Non-verbal communication refers to the expressiveness of human body movements during interaction with other humans, while manipulating objects, or simply while navigating the world. In a sense, every action requires moving our musculoskeletal system, which in turn contributes to expressing the intention behind that action. Moreover, considering that all humans share a common motor repertoire, i.e. the same degrees of freedom and joint limits, and setting aside cultural or societal influences, all humans express action intentions using a common non-verbal language. From walking along a corridor, to pointing at a painting on a wall, to handing over a cup to someone, communication is provided in the form of non-verbal ``cues'' that express action intentions.

The objective of this thesis is hence threefold: (i) to improve robot imitation of human actions by incorporating human-inspired non-verbal cues into robots; (ii) to explore how humans communicate their goals and intentions non-verbally, and how robots can use the same non-verbal cues to communicate their own goals and intentions to humans; and (iii) to extract latent properties of objects that are revealed by human non-verbal cues during manipulation and to incorporate them into the robot's non-verbal cue system in order to express those properties.

The first contribution is the creation of multiple publicly available datasets of synchronized video, gaze, and body-motion data. We conducted several human-human interaction experiments with three objectives in mind: (i) to study the motion behaviors of both participants' perspectives in human-human interactions; (ii) to understand how participants manage to predict the observed actions of the other; and (iii) to use the collected data to model human eye-gaze and arm behavior. The second contribution is an extension of the legibility concept to include eye-gaze cues. This extension proved that humans can correctly predict a robot's action as early, and from the same cues, as if a human were performing it. The third contribution is a human-to-human synchronized non-verbal communication model, the \textit{Gaze Dialogue}, which captures the inter-personal exchange of motor and gaze cues during action execution and observation, and its application to a human-to-robot experiment. During the interaction, the robot can: (i) adequately infer the human action from gaze cues, (ii) adjust its gaze fixations according to the human eye-gaze behavior, and (iii) signal non-verbal cues that correlate with its own action intentions. The fourth and final contribution is the demonstration that non-verbal cue information extracted from humans can be used by robots to recognize types of actions (individual or in-interaction), types of intentions (to polish or to hand over), and types of manipulations (careful or careless). Overall, the communication tools developed in this thesis contribute to enhancing the human-robot interaction experience by incorporating the non-verbal communication ``protocols'' that humans use when interacting with each other.
