Infoscience
conference paper

Adversarially Robust CLIP Models Can Induce Better (Robust) Perceptual Metrics

Croce, Francesco • Schlarmann, Christian • Singh, Naman Deep • Hein, Matthias
May 22, 2025
2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)

Measuring perceptual similarity is a key tool in computer vision. In recent years, perceptual metrics based on features extracted from neural networks with large and diverse training sets, e.g. CLIP, have become popular. At the same time, metrics extracted from neural-network features are not adversarially robust. In this paper we show that adversarially robust CLIP models, called R-CLIPF, obtained by unsupervised adversarial fine-tuning, induce a better and adversarially robust perceptual metric that outperforms existing metrics in a zero-shot setting, and further matches the performance of state-of-the-art metrics while being robust after fine-tuning. Moreover, our perceptual metric achieves strong performance on related tasks such as robust image-to-image retrieval, which becomes especially relevant when applied to “Not Safe for Work” (NSFW) content detection and dataset filtering. While standard perceptual metrics can be easily attacked by a small perturbation that completely degrades NSFW detection, our robust perceptual metric maintains high accuracy under attack while performing similarly on unperturbed images. Finally, perceptual metrics induced by robust CLIP models are more interpretable: feature inversion can show which images are considered similar, while text inversion can find what images are associated with a given prompt. This also allows us to visualize the very rich visual concepts learned by a CLIP model, including memorized persons, paintings and complex queries.
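The abstract describes perceptual metrics built from features extracted by an image encoder such as CLIP: two images are compared by embedding each and measuring the distance between their feature vectors. A minimal sketch of that idea, assuming features have already been extracted (here, plain Python lists stand in for encoder outputs; the function names are illustrative, not from the paper):

```python
# Sketch of a feature-based perceptual metric: embed both images with an
# encoder (e.g. a robust CLIP image encoder), then compare the feature
# vectors with cosine distance. The vectors below are stand-ins for
# actual encoder outputs; `perceptual_distance` is a hypothetical name.
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def perceptual_distance(feat_a, feat_b):
    """Distance in [0, 2]: 0 for identical directions, 2 for opposite ones."""
    return 1.0 - cosine_similarity(feat_a, feat_b)

# Identical features -> distance 0; orthogonal features -> distance 1.
print(perceptual_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0
print(perceptual_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

The paper's point is that when the encoder itself is adversarially robust, a small input perturbation can no longer move the feature vector far, so this distance stays meaningful under attack.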

Details
Type
conference paper
DOI
10.1109/satml64287.2025.00041
Author(s)
Croce, Francesco (École Polytechnique Fédérale de Lausanne)
Schlarmann, Christian
Singh, Naman Deep
Hein, Matthias
Date Issued
2025-05-22
Publisher
IEEE
Published in
2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
ISBN of the book
979-8-3315-1711-3
Start page
636
End page
660

Subjects
perceptual metrics • adversarial robustness • NSFW detection • content filtering

Editorial or Peer reviewed
REVIEWED
Written at
EPFL

EPFL units
TML  
Event name: 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)
Event acronym: SaTML 2025
Event place: Copenhagen, Denmark
Event date: 2025-04-09 to 2025-04-11

Funder: International Max Planck Research School for Intelligent Systems
Funder: Deutsche Forschungsgemeinschaft (grant numbers 390727645, 464101476)
Funder: Open Philanthropy

Available on Infoscience
May 26, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/250448
  • Contact
  • infoscience@epfl.ch


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.