conference paper

Segment Anything Meets Point Tracking

Rajic, Frano • Ke, Lei • Tai, Yu-Wing • Tang, Chi-Keung • Danelljan, Martin • Yu, Fisher
January 1, 2025
2025 IEEE Winter Conference on Applications of Computer Vision. WACV 2025
2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Foundation models have marked a significant stride toward addressing generalization challenges in deep learning. While the Segment Anything Model (SAM) has established a strong foothold in image segmentation, existing video segmentation methods still require extensive mask labeling for fine-tuning, or otherwise face performance drops on unseen data domains. In this paper, we show how foundation models for image segmentation take a step toward enhancing domain generalizability in video segmentation. We discover that, combined with long-term point tracking, image segmentation models yield state-of-the-art results in zero-shot video segmentation across multiple benchmarks. Surprisingly, point trackers exhibit generalization to domains beyond their synthetic pre-training sequences, which we attribute to the trackers' ability to harness the rich local information in the vicinity of each tracked point. Thus, we introduce SAM-PT, an innovative method for point-centric video segmentation, leveraging the capabilities of SAM alongside long-term point tracking. SAM-PT extends SAM's capability to tracking and segmenting anything in dynamic videos. Unlike traditional video segmentation methods that focus on object-centric mask propagation, our approach uniquely exploits point propagation to utilize local structure information independent of object semantics. The effectiveness of point-based tracking is underscored by direct evaluation on the zero-shot open-world UVO benchmark. Our experiments on popular video object segmentation and multi-object tracking and segmentation benchmarks, including DAVIS, YouTube-VOS, and BDD100K, suggest that a point-based segmentation tracker yields better zero-shot performance and efficient interactions. We release our code at https://github.com/SysCV/sam-pt.
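The abstract describes a point-centric pipeline: query points placed on the first frame are propagated through the video by a long-term point tracker, and the propagated points then serve as prompts for SAM on every frame. The sketch below illustrates that idea only and is not the authors' implementation (available at https://github.com/SysCV/sam-pt). The track_points helper is a hypothetical placeholder for a point tracker such as PIPS or CoTracker; the SAM calls follow the public segment-anything API (sam_model_registry, SamPredictor).

    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry


    def track_points(frames, query_points):
        """Hypothetical placeholder for a long-term point tracker (e.g. PIPS, CoTracker).

        Expected to return `tracks` of shape (num_frames, num_points, 2) with (x, y)
        positions and a boolean `visible` array of shape (num_frames, num_points).
        """
        raise NotImplementedError("plug in a real point tracker here")


    def segment_video(frames, fg_points, bg_points, checkpoint="sam_vit_h_4b8939.pth"):
        """Segment one object in every frame by prompting SAM with tracked points."""
        sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
        predictor = SamPredictor(sam)

        # 1) Propagate first-frame query points (foreground and background) through the video.
        query = np.concatenate([fg_points, bg_points], axis=0)          # (P, 2)
        labels = np.concatenate([np.ones(len(fg_points), dtype=int),    # 1 = positive prompt
                                 np.zeros(len(bg_points), dtype=int)])  # 0 = negative prompt
        tracks, visible = track_points(frames, query)

        # 2) Prompt SAM independently on each frame with the currently visible points.
        masks = []
        for frame, points, vis in zip(frames, tracks, visible):
            predictor.set_image(frame)                                   # HxWx3 uint8 RGB
            mask, _, _ = predictor.predict(
                point_coords=points[vis],
                point_labels=labels[vis],
                multimask_output=False,
            )
            masks.append(mask[0])                                        # (H, W) boolean mask
        return masks

The released SAM-PT code adds further components beyond this minimal per-frame loop; the sketch only shows how point propagation replaces object-centric mask propagation.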

Type
conference paper
DOI
10.1109/WACV61041.2025.00901
Web of Science ID
WOS:001521272600411

Author(s)
Rajic, Frano (École Polytechnique Fédérale de Lausanne)
Ke, Lei (Hong Kong University of Science & Technology)
Tai, Yu-Wing (Hong Kong University of Science & Technology)
Tang, Chi-Keung (Hong Kong University of Science & Technology)
Danelljan, Martin (Swiss Federal Institutes of Technology Domain)
Yu, Fisher (Swiss Federal Institutes of Technology Domain)

Date Issued
2025-01-01
Publisher
IEEE
Publisher place
Los Alamitos
Published in
2025 IEEE Winter Conference on Applications of Computer Vision. WACV 2025
DOI of the book
https://doi.org/10.1109/WACV61041.2025
ISBN of the book
979-8-3315-1084-8
979-8-3315-1083-1
Series title/Series vol.
IEEE Winter Conference on Applications of Computer Vision
ISSN (of the series)
2472-6737
Start page
9302
End page
9311

Subjects
Science & Technology • Technology
Editorial or Peer reviewed
REVIEWED

Written at
EPFL
EPFL units
EPFL
Event name
2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Event acronym
WACV 2025
Event place
Tucson, AZ, USA
Event date
2025-02-26 - 2025-03-06

Available on Infoscience
September 15, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/254044