Infoscience

conference paper

Unified Human Localization and Trajectory Prediction with Monocular Vision

Luan, Po-Chien; Gao, Yang; Demonsant, Céline; et al.
2025
2025 IEEE International Conference on Robotics & Automation [Proceedings : Forthcoming publication]
2025 IEEE International Conference on Robotics and Automation

Conventional human trajectory prediction models rely on clean, curated data that requires specialized equipment or manual labeling, which is often impractical for robotic applications. Existing predictors tend to overfit to clean observations, which reduces their robustness on noisy inputs. In this work, we propose MonoTransmotion (MT), a Transformer-based framework that uses only a monocular camera to jointly solve localization and prediction tasks. Our framework has two main modules: Bird's Eye View (BEV) localization and trajectory prediction. The BEV localization module estimates the position of a person from 2D human poses, enhanced by a novel directional loss for smoother sequential localizations. The trajectory prediction module predicts future motion from these estimates. We show that jointly training both tasks within our unified framework makes our method more robust in real-world scenarios with noisy inputs. We validate our MT network on both curated and non-curated datasets. On the curated dataset, MT achieves around 12% improvement over baseline models on BEV localization and trajectory prediction. On the real-world non-curated dataset, experimental results indicate that MT maintains similar performance levels, highlighting its robustness and generalization capability. The code is available at https://github.com/vita-epfl/MonoTransmotion.
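
The abstract mentions a directional loss that encourages smoother sequential localizations but does not give its formula. As a rough illustration only, the sketch below shows one plausible form of such a term: it penalizes the angle between the predicted and reference step directions across consecutive BEV positions. The function name `directional_loss`, the cosine-based formulation, and the `eps` parameter are all assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def directional_loss(pred, target, eps=1e-8):
    """Mean (1 - cosine similarity) between predicted and reference step directions.

    ASSUMPTION: this cosine-based form is illustrative, not the paper's definition.
    pred, target: equal-length sequences of (x, y) BEV positions over consecutive frames.
    """
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    steps = len(pred) - 1
    total = 0.0
    for t in range(steps):
        # Displacement vectors between consecutive frames.
        px, py = pred[t + 1][0] - pred[t][0], pred[t + 1][1] - pred[t][1]
        gx, gy = target[t + 1][0] - target[t][0], target[t + 1][1] - target[t][1]

        dot = px * gx + py * gy
        norm = math.hypot(px, py) * math.hypot(gx, gy)

        # eps avoids division by zero; a zero-length step yields cosine 0.
        cos = dot / (norm + eps)
        total += 1.0 - cos
    return total / steps
```

Under this assumed form, the loss is close to 0 when every predicted step points the same way as the reference step (regardless of step length), about 1 for perpendicular steps, and about 2 for opposite directions. In a joint training setup it would typically be added to a position error term with a weighting factor, which is likewise an assumption here.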

Name: MonoTransmotion_ICRA_2025_Camera_Ready.pdf
Type: Main Document
Version: Accepted version
Access type: openaccess
License Condition: CC BY-NC-SA
Size: 1.06 MB
Format: Adobe PDF
Checksum (MD5): 75389e21b667fb24df9cbc58dde92cfb
