Infoscience

doctoral thesis

Accurate Nod and 3D Gaze Estimation for Social Interaction Analysis

Yu, Yu  
2020

Non-verbal behaviours play an important role in human communication since they can indicate human attention, serve as communication cues in interactions, or even reveal higher-level personal constructs. For instance, the head nod, a common non-verbal behaviour, can express agreement or emphasis when people are listening or speaking. Gaze, another non-verbal behaviour, conveys human attention and can even provide access to thought processes. With the development of the Internet and multimedia, large amounts of visual data, including videos and images, have become accessible, and there is growing demand for video analysis of human behaviour. It is therefore meaningful and important to develop vision-based methods that extract non-verbal behaviours automatically.

In this thesis, we attempt to address the recognition of two subtle yet important non-verbal behaviours, head nod and gaze. The task of head nod detection is to identify a head movement in which the head rotates up and down along the sagittal plane one or several times, while the task of gaze estimation is to infer the 3D Line of Sight with respect to a World Coordinate System. Both tasks have already found applications in areas such as Psychology and Sociology (social analysis by head nod detection, mental health care by analyzing gaze), Human Computer and Human Robot Interaction (behaviour recognition or integration to enable smooth interaction), and Virtual Reality (rendering improvement accounting for the user's gaze directions).
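As an illustration of what a 3D line of sight in a world coordinate system means, here is a minimal sketch that is not taken from the thesis. It assumes the gaze is estimated in camera coordinates as an eye position plus a direction, and that the camera pose in the world is known as a rotation `R_wc` and translation `t_wc`. All names (`line_of_sight_to_world`, `R_wc`, `t_wc`) are hypothetical.

```python
import math

def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def line_of_sight_to_world(R_wc, t_wc, origin_cam, direction_cam):
    """Express a gaze ray given in camera coordinates in world coordinates.

    Assumption: a camera-frame point p maps to the world as R_wc * p + t_wc.
    The ray origin is a point (rotated and translated); the ray direction
    is a free vector (rotated only, then re-normalized).
    """
    rotated_origin = mat_vec(R_wc, origin_cam)
    origin_world = tuple(rotated_origin[i] + t_wc[i] for i in range(3))
    direction_world = normalize(mat_vec(R_wc, direction_cam))
    return origin_world, direction_world

# Hypothetical setup: camera rotated 90 degrees about the world y-axis and
# placed 1 m along the world x-axis; the eye is 0.5 m in front of the camera
# and looks back towards it.
R_wc = [[0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0],
        [-1.0, 0.0, 0.0]]
t_wc = (1.0, 0.0, 0.0)
origin, direction = line_of_sight_to_world(
    R_wc, t_wc, (0.0, 0.0, 0.5), (0.0, 0.0, -2.0))
```

The point of the sketch is only the distinction between the ray origin, which is affected by the camera translation, and the ray direction, which is not.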

To address these two problems, we first investigated head pose estimation, which is a fundamental task for both head nod detection and gaze estimation. We proposed HeadFusion, an approach for robust 360-degree head pose tracking. It is a model-based method that relies on depth information. It mainly addresses the weakness of 3D morphable model (3DMM) based methods, which usually require frontal or mid-profile poses since the 3DMM only covers the face region. Our approach, in contrast, achieves a complete head representation by combining the strengths of a 3DMM fitted online with a prior-free reconstruction of a full 3D head model, providing support for pose estimation from any viewpoint. In addition, we propose a symmetry regularizer for accurate 3DMM fitting under partial observations, and exploit visual tracking to address natural head dynamics with fast accelerations. Extensive experiments show that our method achieves accurate and robust head pose tracking in difficult scenarios.

Based on the estimated head pose, we designed a head nod detection approach. Compared to previous approaches, it makes two contributions: i) the head rotation dynamics are computed in the head coordinate system instead of the camera coordinate system, leading to pose-invariant gesture dynamics; ii) besides the rotation parameters, a feature related to the head rotation axis is proposed so that nod-like false positives due to body movements can be eliminated. The experiments demonstrate the robustness of our approach.
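The pose invariance in contribution i) can be illustrated with a small sketch. This is not the thesis implementation; it assumes each head pose is a rotation matrix mapping head coordinates to camera coordinates, and all function names are hypothetical. The frame-to-frame rotation is computed once in the head frame and once in the camera frame, and its axis and angle are extracted. The exact axis-related feature used in the thesis is not specified in this abstract.

```python
import math

def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def rot_x(a):
    """Rotation by angle a (radians) about the x-axis (pitch, i.e. nodding)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    """Rotation by angle a (radians) about the y-axis (yaw)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def relative_rotation_head_frame(R_prev, R_next):
    """Frame-to-frame rotation expressed in the previous head frame."""
    return matmul(transpose(R_prev), R_next)

def relative_rotation_camera_frame(R_prev, R_next):
    """The same motion expressed in camera coordinates, for comparison."""
    return matmul(R_next, transpose(R_prev))

def axis_angle(R):
    """Return (unit axis, angle) of a rotation matrix; zero axis if angle ~ 0."""
    cos_angle = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    angle = math.acos(cos_angle)
    s = 2.0 * math.sin(angle)
    if abs(s) < 1e-12:
        return (0.0, 0.0, 0.0), angle
    axis = ((R[2][1] - R[1][2]) / s,
            (R[0][2] - R[2][0]) / s,
            (R[1][0] - R[0][1]) / s)
    return axis, angle

# A head turned 60 degrees sideways that pitches 5 degrees about its own x-axis.
R_prev = rot_y(math.radians(60.0))
R_next = matmul(R_prev, rot_x(math.radians(5.0)))

axis_head, angle_head = axis_angle(relative_rotation_head_frame(R_prev, R_next))
axis_cam, angle_cam = axis_angle(relative_rotation_camera_frame(R_prev, R_next))
```

In the head frame the rotation axis stays aligned with the head's own x-axis whatever the yaw, whereas in the camera frame the axis turns with the head. The rotation angle is the same in both frames.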

We then shift our research focus to gaze estimation. To achieve robust remote gaze sensing, we first explore the application of multitask learning to gaze estimation. Concretely, we introduce a Constrained Landmark-Gaze Model (CLGM) modelling the joint variation of eye landmark locations (including the iris center) and gaze directions. By explicitly relating visual information (landmarks) to the more abstract gaze values, we

Type
doctoral thesis
DOI
10.5075/epfl-thesis-7284
Author(s)
Yu, Yu  
Advisors
Odobez, Jean-Marc  
Jury

Prof. Jean-Philippe Thiran (president); Dr Jean-Marc Odobez (thesis director); Prof. Otmar Hilliges, Prof. Michel Valstar, Prof. Yusuke Sugano (examiners)

Date Issued

2020

Publisher

EPFL

Publisher place

Lausanne

Public defense year

2020-03-25

Thesis number

7284

Total of pages

141

Subjects

Non-verbal behaviour • head pose estimation • head nod detection • gaze estimation

EPFL units
LIDIAP  
Faculty
STI  
School
IEL  
Doctoral School
EDEE  
Available on Infoscience
March 24, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/167572

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved