Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Messaoud Ben Amor, Kaouther; Cord, Matthieu; Alahi, Alexandre

conference paper

June 11, 2025

2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition [Forthcoming publication]

Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainty, and complex interactions. These weaknesses often stem from complex architectures tailored to a specific dataset and from inefficient handling of multimodality. We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details. Additionally, reconstructing segment-level trajectories and lane segments from masked inputs with query drop enables effective use of contextual information and improves generalization; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation. PerReg+ sets a new state of the art on nuScenes [5], Argoverse 2 [42], and Waymo Open Motion Dataset (WOMD) [13]. Remarkably, our model reduces the error by 6.8% on smaller datasets, and multi-dataset training enhances generalization. In cross-domain tests, PerReg+ reduces B-FDE by 11.8% compared to its non-pretrained variant.
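To make the idea of segment-level masked reconstruction concrete, the following is a minimal sketch of how a trajectory might be split into segments and partially hidden before a model is asked to reconstruct it. It is not the authors' implementation: the function name `mask_segments`, the segment length of 5, the mask ratio of 0.5, and the use of `None` as a mask token are all assumptions made for illustration.

```python
import random

def mask_segments(trajectory, segment_len=5, mask_ratio=0.5, seed=0):
    """Split a trajectory into fixed-length segments and hide a random subset.

    Hypothetical helper: segment_len, mask_ratio and the None mask token
    are illustrative assumptions, not values taken from the paper.
    Returns (masked, targets): `masked` has None at hidden timesteps and
    `targets` maps each hidden segment index to its original points.
    """
    rng = random.Random(seed)
    # Cut the sequence into contiguous segments (the last may be shorter).
    segments = [trajectory[i:i + segment_len]
                for i in range(0, len(trajectory), segment_len)]
    # Hide at least one segment, but never more than exist.
    n_masked = min(len(segments), max(1, round(mask_ratio * len(segments))))
    hidden = set(rng.sample(range(len(segments)), n_masked))

    masked, targets = [], {}
    for idx, seg in enumerate(segments):
        if idx in hidden:
            masked.extend([None] * len(seg))   # whole segment is hidden
            targets[idx] = list(seg)           # reconstruction target
        else:
            masked.extend(seg)                 # segment stays visible
    return masked, targets

# Toy straight-line trajectory of 20 (x, y) points.
traj = [(float(t), 0.5 * t) for t in range(20)]
masked, targets = mask_segments(traj)
```

Masking whole segments rather than isolated points forces a reconstruction model to rely on the surrounding context instead of interpolating between immediate neighbors, which is the intuition behind the contextual-information claim in the abstract.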

Name: CVPR_2025 (13).pdf
Type: Main Document
Version: Accepted version
Access type: openaccess
License Condition: N/A
Size: 690.98 KB
Format: Adobe PDF
Checksum (MD5): 21bba5bbc93b22585a89dc7cedca0e4e