STC-GAN: Spatio-Temporally Coupled Generative Adversarial Networks for Predictive Scene Parsing

Qi, Mengshi; Wang, Yunhong; Li, Annan; Luo, Jiebo

doi:10.1109/TIP.2020.2983567

research article

January 1, 2020

IEEE Transactions on Image Processing

Predictive scene parsing is the task of assigning pixel-level semantic labels to a future frame of a video. It has many applications in vision-based artificial intelligence systems, e.g., autonomous driving and robot navigation. Although previous work has shown promising performance in semantic segmentation of images and videos, it is still quite challenging to anticipate future scene parsing with limited annotated training data. In this paper, we propose a novel model called STC-GAN, Spatio-Temporally Coupled Generative Adversarial Networks for predictive scene parsing, which employs both convolutional neural networks and convolutional long short-term memory (LSTM) in an encoder-decoder architecture. With STC-GAN, both spatial layout and semantic context can be captured effectively by the spatial encoder, while motion dynamics are extracted accurately by the temporal encoder. Furthermore, a coupled architecture is presented for joint adversarial training, in which weights are shared and features are transformed adaptively between the future frame generation model and the predictive scene parsing model. Consequently, the proposed STC-GAN is able to learn valuable features from unlabeled video data. We evaluate STC-GAN on two public datasets, Cityscapes and CamVid. Experimental results demonstrate that our method outperforms the state of the art.
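The coupling idea in the abstract, two models drawing on shared weights, can be illustrated with a toy sketch. This is not the authors' implementation: the class names `SharedEncoder`, `FrameGenerator` and `SceneParser` are hypothetical, a per-pixel `tanh` stands in for the convolutional spatial encoder, a running average stands in for the convolutional LSTM, and adversarial training is omitted entirely. It only shows that a frame generation head and a scene parsing head can read from one encoder object.

```python
import math


class SharedEncoder:
    """Hypothetical stand-in for the spatial and temporal encoders."""

    def __init__(self, weight=0.5, decay=0.5):
        self.weight = weight  # toy "spatial" parameter (assumption)
        self.decay = decay    # toy "temporal" memory factor (assumption)

    def encode(self, frames):
        # frames: list of equally sized frames, each a flat list of floats.
        state = [0.0] * len(frames[0])
        for frame in frames:
            # Spatial step: per-pixel nonlinearity (stands in for a CNN).
            spatial = [math.tanh(self.weight * p) for p in frame]
            # Temporal step: running state (stands in for a ConvLSTM).
            state = [self.decay * s + (1.0 - self.decay) * f
                     for s, f in zip(state, spatial)]
        return state


class FrameGenerator:
    """Toy future-frame head: maps shared features to pixel intensities."""

    def __init__(self, encoder):
        self.encoder = encoder

    def predict(self, frames):
        return [2.0 * f for f in self.encoder.encode(frames)]


class SceneParser:
    """Toy parsing head: maps shared features to a binary label per pixel."""

    def __init__(self, encoder):
        self.encoder = encoder

    def predict(self, frames):
        return [1 if f > 0.0 else 0 for f in self.encoder.encode(frames)]


# Both heads hold the same encoder object, so its parameters are shared.
encoder = SharedEncoder()
generator = FrameGenerator(encoder)
parser = SceneParser(encoder)

# Two observed frames of three "pixels" each (made-up values).
video = [[-1.0, 0.5, 2.0], [-0.5, 1.0, 2.5]]
future_frame = generator.predict(video)
future_labels = parser.predict(video)
```

Because both heads reference the same `encoder`, any change to its parameters, for example one driven by unlabeled frame prediction, is also seen by the parsing head. That is the sense in which such a design could let unlabeled video inform the labeled task.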

Type

research article

DOI

10.1109/TIP.2020.2983567

Web of Science ID

WOS:000561102200009

Author(s)

Qi, Mengshi

Wang, Yunhong

Li, Annan

Luo, Jiebo

Date Issued

2020-01-01

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Published in

IEEE Transactions on Image Processing

Volume

29

Start page

5420

End page

5430

Subjects

Computer Science, Artificial Intelligence

Engineering, Electrical & Electronic

Computer Science

Engineering

predictive scene parsing

generative adversarial networks

coupled architecture

spatio-temporal features

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

CVLAB

Available on Infoscience

September 4, 2020

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/171356