research article

Learning How to Smile: Expression Video Generation With Conditional Adversarial Recurrent Nets

Wang, Wei  
•
Alameda-Pineda, Xavier
•
Xu, Dan
November 1, 2020
IEEE Transactions On Multimedia

While several research studies have focused on analyzing human behavior and, in particular, emotional signals from visual data, the problem of synthesizing face video sequences with specific attributes (e.g., age, facial expressions) has received much less attention. This paper proposes a novel deep generative model able to produce face videos from a given image of a neutral face and a label indicating a specific facial expression, e.g., a spontaneous smile. Our framework consists of two main building blocks: an image generator and a frame sequence generator. The image generator is implemented as a deep neural model that combines generative adversarial networks and variational auto-encoders, while the sequence generator is a label-conditioned recurrent neural network. In the proposed framework, given a neutral face and a label as input, the sequence generator outputs a set of hidden representations with smooth transitions corresponding to video frames. The image generator then decodes the hidden representations into the actual face images. To ensure that the network generates videos consistent with the given label, a novel identity adversarial loss is proposed. Our experimental results demonstrate the effectiveness of the framework and the advantage of introducing an adversarial component into recurrent models for face video generation.
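The data flow described in the abstract (encode a neutral face, unroll a label-conditioned recurrent net over hidden representations, decode each one into a frame) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the linear `encode` and `decode` functions stand in for the VAE/GAN image generator, the GRU gating is the textbook formulation (the record's keywords mention a gated recurrent unit, but the paper's exact conditioning may differ), all weights are random, and adversarial training is omitted. Every name and dimension below is an assumption made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LabelConditionedGRU:
    """Toy GRU cell whose input at every step is the one-hot expression label."""

    def __init__(self, hidden_dim, num_labels, rng):
        # One (input, recurrent) weight pair per gate: update, reset, candidate.
        self.W = [rand_matrix(hidden_dim, num_labels, rng) for _ in range(3)]
        self.U = [rand_matrix(hidden_dim, hidden_dim, rng) for _ in range(3)]

    def step(self, h, label):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W[0], label), matvec(self.U[0], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W[1], label), matvec(self.U[1], h))]
        gated = [ri * hi for ri, hi in zip(r, h)]
        c = [math.tanh(a + b) for a, b in
             zip(matvec(self.W[2], label), matvec(self.U[2], gated))]
        # Interpolating between the old state and the candidate keeps
        # consecutive hidden representations close to each other.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]


def generate_video(neutral_face, label, encode, decode, gru, num_frames):
    """Encode the neutral face, unroll the GRU, and decode one frame per step."""
    h = encode(neutral_face)
    frames = []
    for _ in range(num_frames):
        h = gru.step(h, label)
        frames.append(decode(h))
    return frames


# Hypothetical sizes: a 16-pixel "image", 8 hidden units, 2 expression labels.
rng = random.Random(0)
PIXELS, HIDDEN, LABELS = 16, 8, 2
E = rand_matrix(HIDDEN, PIXELS, rng)   # stand-in encoder weights
D = rand_matrix(PIXELS, HIDDEN, rng)   # stand-in decoder weights


def encode(img):
    return [math.tanh(v) for v in matvec(E, img)]


def decode(h):
    return [sigmoid(v) for v in matvec(D, h)]


gru = LabelConditionedGRU(HIDDEN, LABELS, rng)
neutral_face = [rng.random() for _ in range(PIXELS)]
smile = [1.0, 0.0]                     # one-hot label, assumed encoding
video = generate_video(neutral_face, smile, encode, decode, gru, num_frames=5)
```

Here `video` is a list of five flattened frames with pixel values in (0, 1). In the framework the abstract describes, the decoder would be a trained image generator and the recurrent net would be trained with the adversarial loss the authors propose.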

Type
research article
DOI
10.1109/TMM.2019.2963621
Web of Science ID

WOS:000584239900005

Author(s)
Wang, Wei  
Alameda-Pineda, Xavier
Xu, Dan
Ricci, Elisa
Sebe, Nicu
Date Issued

2020-11-01

Published in
IEEE Transactions On Multimedia
Volume

22

Issue

11

Start page

2808

End page

2819

Subjects

Computer Science, Information Systems • Computer Science, Software Engineering • Telecommunications • Computer Science • face • generators • generative adversarial networks • manifolds • solid modeling • three-dimensional displays • visualization • video generation • gated recurrent unit • smile

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
CVLAB  
Available on Infoscience
December 16, 2020
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/174098