Semantic Segmentation of Benthic Classes in Reef Environments Using a Large Vision Transformer
Coral reefs are crucial for biodiversity and provide vital resources for humankind. Despite this central role, they face increasing threats from climate change, pollution, and local stressors. Effective conservation requires efficient and scalable monitoring, which in turn calls for automated, large-scale identification of benthic classes and their states through semantic segmentation. However, segmenting underwater videos is challenging: visual similarities between benthic classes, underwater distortions, and the scarcity of annotated datasets make it difficult to build accurate and robust models. In this paper, we present a method for training a semantic segmentation model on a small dataset of video frames of coral scenes by fine-tuning a large transformer model. Our approach applies transfer learning to the Segment Anything Model (SAM), incorporating dedicated training and prediction strategies. We benchmark our model against a CNN for semantic segmentation as a baseline. Our results demonstrate a substantial improvement in model performance, particularly for benthic classes that often appear as small objects and for rarer classes, highlighting the potential of our approach for advancing coral reef mapping and monitoring.
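The abstract does not spell out the paper's training and prediction strategies, so the following is only a minimal illustrative sketch of the general approach it names: transfer learning on SAM for multi-class semantic segmentation. It freezes SAM's ViT image encoder and trains a small convolutional head on top; the `vit_b` variant, checkpoint filename, `NUM_CLASSES`, and head architecture are all assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the authors' code): freeze SAM's ViT image
# encoder and train a lightweight head for per-pixel benthic-class logits.
# Assumes the `segment_anything` package and a SAM checkpoint are available;
# NUM_CLASSES and the data pipeline are hypothetical placeholders.
import torch
import torch.nn as nn
from segment_anything import sam_model_registry

NUM_CLASSES = 10  # hypothetical number of benthic classes

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
for p in sam.image_encoder.parameters():
    p.requires_grad = False  # transfer learning: keep the backbone frozen

head = nn.Sequential(  # small decoder over the 256-channel SAM embedding
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, NUM_CLASSES, kernel_size=1),
)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def training_step(images, labels):
    """images: (B, 3, 1024, 1024) video frames; labels: (B, 1024, 1024) class ids."""
    with torch.no_grad():
        feats = sam.image_encoder(images)  # (B, 256, 64, 64)
    logits = head(feats)                   # (B, NUM_CLASSES, 64, 64)
    logits = nn.functional.interpolate(    # upsample to label resolution
        logits, size=labels.shape[-2:], mode="bilinear", align_corners=False
    )
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the encoder keeps the number of trainable parameters small, which is one plausible way to fine-tune a large transformer on a small frame dataset as the abstract describes; the paper itself should be consulted for the actual strategy.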
2-s2.0-105007132340
2025
978-3-031-92387-6
Lecture Notes in Computer Science; vol. 15624 (LNCS)
1611-3349
0302-9743
160-168
REVIEWED
EPFL
Event name | Event acronym | Event place | Event date |
| | Milan, Italy | 2024-09-29 - 2024-10-04 |