Image-Guided Topic Modeling for Interpretable Privacy Classification
Predicting and explaining the private information contained in an image in human-understandable terms is a complex and contextual task. This task is challenging even for large language models. To facilitate the understanding of privacy decisions, we propose to predict image privacy based on a set of natural language content descriptors. These content descriptors are associated with privacy scores that reflect how people perceive image content. We generate descriptors with our novel Image-guided Topic Modeling (ITM) approach. ITM leverages, via multimodal alignment, both vision information and image textual descriptions from a vision-language model. We use the ITM-generated descriptors to learn a privacy predictor, Priv×ITM, whose decisions are interpretable by design. Our Priv×ITM classifier outperforms the reference interpretable method by 5 percentage points in accuracy and performs comparably to the current non-interpretable state-of-the-art model.
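The idea of a descriptor-based predictor that is interpretable by design can be illustrated with a minimal sketch. This is not the authors' Priv×ITM implementation: it assumes a plain linear (logistic) model over image-to-descriptor similarity scores, and the descriptor names, weights, bias, and similarity values below are all hypothetical and chosen only for illustration. The learned weight of each descriptor plays the role of its privacy score, so every decision can be traced back to the descriptors that contributed most.

```python
import math

# Hypothetical descriptors and weights, invented for illustration only.
# A positive weight pushes the prediction toward "private".
DESCRIPTORS = [
    "passport or ID document",
    "people at a private party",
    "mountain landscape",
    "city street",
]
PRIVACY_WEIGHTS = [2.0, 1.2, -1.5, -0.4]
BIAS = -0.2


def predict_privacy(similarities):
    """Return (probability of 'private', per-descriptor contributions)."""
    contributions = [w * s for w, s in zip(PRIVACY_WEIGHTS, similarities)]
    logit = BIAS + sum(contributions)
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


def explain(similarities, top_k=2):
    """Return the probability and the top_k descriptors by absolute contribution."""
    probability, contributions = predict_privacy(similarities)
    ranked = sorted(
        zip(DESCRIPTORS, contributions),
        key=lambda pair: abs(pair[1]),
        reverse=True,
    )
    return probability, ranked[:top_k]


# Made-up similarity scores between one image and each descriptor
# (for example, cosine similarities between image and text embeddings).
example_similarities = [0.9, 0.1, 0.05, 0.2]
prob, top = explain(example_similarities)
print(f"P(private) = {prob:.2f}")
for name, contribution in top:
    print(f"{name}: {contribution:+.2f}")
```

Because the model is linear in the descriptor scores, the explanation is exact rather than post hoc: the listed contributions, plus the bias, sum to the logit that determines the prediction.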
Institut Dalle Molle D'intelligence Artificielle Perceptive
EPFL
2025
Lecture Notes in Computer Science; 15643 LNCS
1611-3349
0302-9743
200
217
REVIEWED
EPFL
| Event name | Event acronym | Event place | Event date |
| --- | --- | --- | --- |
| | ECCV 2024 | Milan, Italy | 2024-09-29 - 2024-10-04 |