Semantic Segmentation of Benthic Classes in Reef Environments Using a Large Vision Transformer
Coral reefs are crucial for biodiversity and provide vital resources for humankind. Despite this central role, they face increasing threats from climate change, pollution, and local stressors. Effective conservation requires efficient and scalable monitoring, which in turn calls for automated, large-scale identification of benthic classes and their states through semantic segmentation. However, segmenting underwater videos is challenging: visual similarities between benthic classes, underwater distortions, and the scarcity of annotated datasets make it difficult to build accurate and robust models. In this paper, we present a method for training a semantic segmentation model on a small dataset of video frames of coral scenes by fine-tuning a large transformer model. Our approach applies transfer learning to the Segment Anything Model (SAM), incorporating dedicated training and prediction strategies. We benchmark our model against a CNN for semantic segmentation as a baseline. Our results demonstrate a substantial improvement in model performance, particularly for benthic classes that often appear as small objects and for rarer classes, highlighting the potential of our approach for advancing coral reef mapping and monitoring.
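The abstract does not spell out the paper's training and prediction strategies, so the following is only a minimal illustrative sketch of the general approach it names: transfer learning on SAM for multi-class semantic segmentation. It freezes SAM's ViT image encoder and trains a small convolutional head on top; the `vit_b` variant, checkpoint filename, `NUM_CLASSES`, and head architecture are all assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the authors' code): freeze SAM's ViT image
# encoder and train a lightweight head for per-pixel benthic-class logits.
# Assumes the `segment_anything` package and a SAM checkpoint are available;
# NUM_CLASSES and the data pipeline are hypothetical placeholders.
import torch
import torch.nn as nn
from segment_anything import sam_model_registry

NUM_CLASSES = 10  # hypothetical number of benthic classes

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
for p in sam.image_encoder.parameters():
    p.requires_grad = False  # transfer learning: keep the backbone frozen

head = nn.Sequential(  # small decoder over the 256-channel SAM embedding
    nn.Conv2d(256, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(128, NUM_CLASSES, kernel_size=1),
)

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def training_step(images, labels):
    """images: (B, 3, 1024, 1024) video frames; labels: (B, 1024, 1024) class ids."""
    with torch.no_grad():
        feats = sam.image_encoder(images)  # (B, 256, 64, 64)
    logits = head(feats)                   # (B, NUM_CLASSES, 64, 64)
    logits = nn.functional.interpolate(    # upsample to label resolution
        logits, size=labels.shape[-2:], mode="bilinear", align_corners=False
    )
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the encoder keeps the number of trainable parameters small, which is one plausible way to fine-tune a large transformer on a small frame dataset as the abstract describes; the paper itself should be consulted for the actual strategy.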
2-s2.0-105007132340
2025
978-3-031-92387-6
Lecture Notes in Computer Science; vol. 15624 (LNCS)
1611-3349
0302-9743
160-168
REVIEWED
EPFL
Event name | Event acronym | Event place | Event date |
| | Milan, Italy | 2024-09-29 - 2024-10-04 |