Publication:

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

cris.virtual.department

VILAB

cris.virtual.sciperId

368631

cris.virtual.unitManager

Zamir, Amir

cris.virtualsource.author-scopus

dfbdaafc-1463-4e9c-89f6-9234338a5c1b

cris.virtualsource.department

dfbdaafc-1463-4e9c-89f6-9234338a5c1b

cris.virtualsource.orcid

dfbdaafc-1463-4e9c-89f6-9234338a5c1b

cris.virtualsource.rid

dfbdaafc-1463-4e9c-89f6-9234338a5c1b

cris.virtualsource.sciperId

dfbdaafc-1463-4e9c-89f6-9234338a5c1b

cris.virtualsource.unitManager

8f5de1fa-f78f-4e97-ad79-a09cf0317022

datacite.rights

metadata-only

dc.contributor.author

Ehsani, Kiana

dc.contributor.author

Gupta, Tanmay

dc.contributor.author

Hendrix, Rose

dc.contributor.author

Salvador, Jordi

dc.contributor.author

Weihs, Luca

dc.contributor.author

Zeng, Kuo-Hao

dc.contributor.author

Singh, Kunal Pratap

dc.contributor.author

Kim, Yejin

dc.contributor.author

Han, Winson

dc.contributor.author

Herrasti, Alvaro

dc.contributor.author

Krishna, Ranjay

dc.contributor.author

Schwenk, Dustin

dc.contributor.author

VanderBilt, Eli

dc.contributor.author

Kembhavi, Aniruddha

dc.date.accessioned

2025-01-31T13:21:15Z

dc.date.available

2025-01-31T13:21:15Z

dc.date.created

2025-01-31

dc.date.issued

2024-01-01

dc.date.modified

2025-04-09T23:50:22.131421Z

dc.description.abstract

Reinforcement learning (RL) with dense rewards and imitation learning (IL) with human-generated trajectories are the most widely used approaches for training modern embodied agents. RL requires extensive reward shaping and auxiliary losses and is often too slow and ineffective for long-horizon tasks. While IL with human supervision is effective, collecting human trajectories at scale is extremely expensive. In this work, we show that imitating shortest-path planners in simulation produces agents that, given a language instruction, can proficiently navigate, explore, and manipulate objects both in simulation and in the real world using only RGB sensors (no depth map or GPS coordinates). This surprising result is enabled by our end-to-end, transformer-based SPOC architecture, powerful visual encoders paired with extensive image augmentation, and the dramatic scale and diversity of our training data: millions of frames of shortest-path-expert trajectories collected inside approximately 200,000 procedurally generated houses containing 40,000 unique 3D assets. Our models, data, training code, and newly proposed 10-task benchmarking suite CHORES are available at spoc-robot.github.io.

dc.identifier.doi

10.1109/CVPR52733.2024.01537

dc.identifier.isi

WOS:001342442407060

dc.identifier.uri

https://infoscience.epfl.ch/handle/20.500.14299/246062

dc.language.iso

English

dc.publisher

IEEE

dc.relation.conference

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

dc.relation.doi

10.1109/CVPR52733.2024

dc.relation.isbn

979-8-3503-5300-6

dc.relation.ispartof

2024 IEEE/CVF Conference On Computer Vision And Pattern Recognition (CVPR)

dc.relation.ispartofseries

IEEE Conference on Computer Vision and Pattern Recognition

dc.relation.serieissn

1063-6919

dc.subject

Science & Technology

dc.subject

Technology

dc.title

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

dc.type

text::conference output::conference proceedings::conference paper

dspace.entity.type

Publication

epfl.peerreviewed

REVIEWED

epfl.relation.conferenceType

conference

epfl.workflow.startDateTime

2025-01-31T09:59:53.041Z

epfl.writtenAt

EPFL

local.wos.sourceType

Proceedings Paper

oaire.citation.conferenceDate

2024-06-16 - 2024-06-22

oaire.citation.conferencePlace

Seattle, WA

oaire.citation.edition

WOS.ISTP

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

École Polytechnique Fédérale de Lausanne

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

oairecerif.author.affiliation

Allen Inst AI

person.identifier.rid

X-4345-2019

person.identifier.rid

GCG-4785-2022

person.identifier.rid

CTX-3506-2022

person.identifier.rid

FVV-1899-2022

person.identifier.rid

DZO-0983-2022

person.identifier.rid

GJY-6981-2022

person.identifier.rid

JRV-9018-2023

person.identifier.rid

KNM-9584-2024

person.identifier.rid

CUV-4224-2022

person.identifier.rid

CUJ-0495-2022

person.identifier.rid

MBX-1600-2025

person.identifier.rid

FXP-6346-2022

person.identifier.rid

GGT-4831-2022

person.identifier.rid

CZD-9189-2022

Files

License bundle

Name: license.txt
Size: 865 B
Format: Item-specific license agreed to upon submission