CLIP the Gap: A Single Domain Generalization Approach for Object Detection
Single Domain Generalization (SDG) tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain. While this has been well studied for image classification, the literature on SDG object detection remains almost non-existent. To address the challenges of simultaneously learning robust object localization and representation, we propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts. We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss. Our experiments demonstrate the benefits of our approach, which outperforms the only existing SDG object detection method, Single-DGOD [52], by 10% on its own diverse-weather driving benchmark.
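The abstract names two ingredients, a semantic augmentation of backbone features driven by textual prompts and a text-based classification loss, without giving their details. The toy sketch below illustrates one plausible reading of each, under loudly stated assumptions: the hand-written vectors stand in for embeddings that a vision-language model would produce, the function names (`semantic_augment`, `text_classification_loss`), the `strength` and `temperature` parameters, and the simple "shift along the prompt difference" rule are all hypothetical, and none of this should be read as the paper's actual procedure.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def semantic_augment(feature, source_prompt_emb, target_prompt_emb, strength=1.0):
    """Shift a feature along the direction from a source-domain prompt
    embedding to a target-domain prompt embedding.

    Assumption: a plain additive shift; the paper may do something different.
    """
    src = l2_normalize(source_prompt_emb)
    tgt = l2_normalize(target_prompt_emb)
    return [x + strength * (t - s) for x, s, t in zip(feature, src, tgt)]


def text_classification_loss(feature, class_text_embs, label, temperature=0.07):
    """Cross-entropy over cosine similarities between one region feature
    and one text embedding per class (hypothetical formulation)."""
    f = l2_normalize(feature)
    logits = [
        sum(a * b for a, b in zip(f, l2_normalize(t))) / temperature
        for t in class_text_embs
    ]
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Stand-in embeddings (made up for illustration, not model outputs).
class_text_embs = [[1.0, 0.0, 0.0],   # e.g. a prompt for class 0
                   [0.0, 1.0, 0.0]]   # e.g. a prompt for class 1
source_prompt = [0.0, 0.0, 1.0]       # e.g. a source-domain description
target_prompt = [0.0, 0.6, 0.8]       # e.g. an unseen-domain description

region_feature = [0.9, 0.1, 0.0]
augmented = semantic_augment(region_feature, source_prompt, target_prompt)

loss_correct = text_classification_loss(augmented, class_text_embs, label=0)
loss_wrong = text_classification_loss(augmented, class_text_embs, label=1)
```

In this toy setup the augmented feature still lies closer to the class-0 text embedding, so `loss_correct` is smaller than `loss_wrong`. Setting `strength=0.0` returns the feature unchanged, which makes it easy to compare the loss with and without the shift.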
WOS:001058542603049
2023-01-01
979-8-3503-0129-8
Los Alamitos
3219
3229
REVIEWED
Event name | Event place | Event date |
 | Vancouver, CANADA | JUN 17-24, 2023 |
Funder | Grant Number |
Swiss National Science Foundation | |
Swiss Innovation Agency (Innosuisse) via the BRIDGE Discovery grant | 40B2-0 194729 |