VIENA(2): A Driving Anticipation Dataset

Aliakbarian, Mohammad Sadegh; Saleh, Fatemeh Sadat; Salzmann, Mathieu; Fernando, Basura; Petersson, Lars; Andersson, Lars

doi:10.1007/978-3-030-20887-5_28

2019

Abstract

Action anticipation is critical in scenarios where one needs to react before the action is finalized. This is, for instance, the case in automated driving, where a car must, for example, avoid hitting pedestrians and respect traffic lights. While solutions have been proposed to tackle subsets of the driving anticipation tasks by making use of diverse, task-specific sensors, there is no single dataset or framework that addresses them all in a consistent manner. In this paper, we therefore introduce a new, large-scale dataset, called VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct action classes. It contains more than 15K full-HD, 5-second videos acquired in various driving conditions, weather conditions, times of day, and environments, complemented with a common and realistic set of sensor measurements. This amounts to more than 2.25M frames, each annotated with an action label, and corresponds to 600 samples per action class. We discuss our data acquisition strategy and the statistics of our dataset, and benchmark state-of-the-art action anticipation techniques, including a new multi-modal LSTM architecture with an effective loss function for action anticipation in driving scenarios.
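
The dataset figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the class count, samples per class, and clip length come from the abstract, while the frame rate of 30 frames per second is an assumption (the abstract does not state it) and the function name `dataset_totals` is hypothetical.

```python
# Figures quoted in the abstract.
NUM_CLASSES = 25          # distinct action classes
SAMPLES_PER_CLASS = 600   # samples per action class
CLIP_SECONDS = 5          # length of each video in seconds

# Assumption: the abstract does not give a frame rate; 30 fps is a guess
# that happens to reproduce the quoted frame count.
ASSUMED_FPS = 30


def dataset_totals(num_classes, samples_per_class, clip_seconds, fps):
    """Return (number of videos, number of frames) implied by the inputs."""
    videos = num_classes * samples_per_class
    frames = videos * clip_seconds * fps
    return videos, frames


videos, frames = dataset_totals(
    NUM_CLASSES, SAMPLES_PER_CLASS, CLIP_SECONDS, ASSUMED_FPS
)
print(f"videos: {videos}, frames: {frames}")
```

Under the assumed frame rate, 25 classes of 600 samples give 15,000 videos, and 15,000 clips of 5 seconds give 2,250,000 frames, which is in line with the "15K" and "2.25M" figures in the abstract.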

Details

Title VIENA(2): A Driving Anticipation Dataset

Author(s) Aliakbarian, Mohammad Sadegh ; Saleh, Fatemeh Sadat ; Salzmann, Mathieu ; Fernando, Basura ; Petersson, Lars ; Andersson, Lars

Published in Computer Vision - ACCV 2018, Part I

Series Lecture Notes in Computer Science

Volume 11361

Pages 449-466

Conference 14th Asian Conference on Computer Vision (ACCV), Dec 02-06, 2018, Perth, Australia

Date 2019-01-01

Publisher Cham, Springer International Publishing AG

ISSN 0302-9743
1611-3349

ISBN 978-3-030-20886-8
978-3-030-20887-5

DOI https://doi.org/10.1007/978-3-030-20887-5_28

Laboratories CVLAB

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > CVLAB - Computer Vision Laboratory
Scientific production and competences > Euler Center for Signal Processing
Peer-reviewed publications
Conference Papers
Work produced at EPFL
Published

Record creation date 2019-11-08
