EPFL-Smart-Kitchen-30 Collected data
Understanding behavior requires datasets that capture humans carrying out complex tasks. The kitchen is an excellent environment for assessing human motor and cognitive function, since many complex actions, from chopping to cleaning, occur there naturally. Here, we introduce the EPFL-Smart-Kitchen-30 dataset, collected with a noninvasive motion-capture platform inside a kitchen environment. Nine static RGB-D cameras, inertial measurement units (IMUs), and one head-mounted HoloLens 2 headset were used to capture 3D hand, body, and eye movements. The EPFL-Smart-Kitchen-30 dataset is a multi-view action dataset with synchronized exocentric and egocentric video, depth, IMU, eye-gaze, and body and hand kinematics data, spanning 29.7 hours of 16 subjects cooking four different recipes. Action sequences were densely annotated, with an average of 33.78 action segments per minute. Leveraging this multi-modal dataset, we propose four benchmarks to advance behavior understanding and modeling through:
- a vision-language benchmark,
- a semantic text-to-motion generation benchmark,
- a multi-modal action recognition benchmark,
- a pose-based action segmentation benchmark.
> ⚠️ 3D pose and action annotations can be found at https://zenodo.org/records/15551913
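The record linked above can also be browsed programmatically through Zenodo's public REST API. Below is a minimal Python sketch (assuming the `requests` package is available and that the response follows Zenodo's current API schema; no assumptions are made about file names or archive layout) that lists the files attached to record 15551913:

```python
import requests

RECORD_ID = "15551913"  # EPFL-Smart-Kitchen-30 poses and annotations record

# Query Zenodo's public REST API for the record metadata.
resp = requests.get(f"https://zenodo.org/api/records/{RECORD_ID}", timeout=30)
resp.raise_for_status()
record = resp.json()

# Print every file attached to the record with its size and download link.
for f in record.get("files", []):
    size_mb = f["size"] / 1e6
    print(f'{f["key"]:60s} {size_mb:10.1f} MB  {f["links"]["self"]}')
```

Each listed link can then be fetched directly (e.g. with `requests` or `wget`) to download the corresponding archive.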
Affiliations: EPFL; Microsoft (Switzerland); ETH Zurich

Publication date: 2025-05-28

Version: 1

License: CC BY
| Funder | Funding | Grant no. | Grant URL |
| --- | --- | --- | --- |
| Swiss National Science Foundation | Joint behavior and neural data modeling for naturalistic behavior | 10000950 | |
| Relation | Related work | URL/DOI |
| --- | --- | --- |
| IsSupplementTo | EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D | |
| IsContinuedBy | EPFL-Smart-Kitchen-30 Annotations and Poses | |
| IsVersionOf | | |