3D Head Pose and Gaze Tracking and Their Application to Diverse Multimodal Tasks
This PhD thesis addresses the problem of 3D head pose and gaze tracking with minimal user cooperation. By exploiting the characteristics of RGB-D sensors, contributions have been made to problems that follow from this lack of cooperation, in particular head pose and inter-person appearance variability, as well as the handling of low resolution. The resulting system has enabled diverse multimodal applications; in particular, recent work combined multiple RGB-D sensors to detect gazing events in dyadic interactions. The research plan consists of: i) improving the robustness, accuracy and usability of the head pose and gaze tracking system; ii) using additional multimodal cues, such as speech and dynamic context, to train and adapt gaze models in an unsupervised manner; and iii) extending 3D gaze estimation to diverse multimodal applications, including visual focus of attention tasks that involve multiple visual targets, e.g. people in a meeting-like setup.