Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Hein, Matthias; Andriushchenko, Maksym

2017

Abstract

Recent work has shown that state-of-the-art classifiers are quite brittle: a small adversarial change to an input that was originally classified correctly with high confidence leads to a wrong classification, again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their use in safety-critical systems. In this paper we give, for the first time, formal guarantees on the robustness of a classifier in the form of instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods and neural networks improves the robustness of the classifier with little or no loss in prediction performance.
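
To make the idea of an instance-specific bound on the input manipulation concrete, the sketch below covers only the simplest special case, a linear multi-class classifier f(x) = Wx + b, where the l2 distance from x to the decision boundary between the predicted class c and another class j is (f_c(x) - f_j(x)) / ||w_c - w_j||_2. This is an illustration under that assumption, not the general bound derived in the paper; the function name `linear_robustness_radius` and the toy weights are hypothetical.

```python
import math

def linear_robustness_radius(W, b, x):
    """Return (predicted class, l2 radius) for a linear classifier f(x) = W x + b.

    Illustrative special case only: no perturbation with l2 norm smaller than
    the returned radius can change the predicted class.
    """
    # Class scores f_j(x) = <w_j, x> + b_j.
    scores = [sum(w_k * x_k for w_k, x_k in zip(w, x)) + b_j
              for w, b_j in zip(W, b)]
    c = max(range(len(scores)), key=scores.__getitem__)

    radius = math.inf
    for j, w_j in enumerate(W):
        if j == c:
            continue
        margin = scores[c] - scores[j]  # how far class c leads class j
        gap = math.sqrt(sum((a - d) ** 2 for a, d in zip(W[c], w_j)))
        if gap > 0:  # identical weight rows never flip the ranking
            radius = min(radius, margin / gap)
    return c, radius

# Toy example with three classes in two dimensions (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x = [2.0, 0.5]
predicted, radius = linear_robustness_radius(W, b, x)
```

For nonlinear classifiers the gradient difference is not constant, so a guarantee of this kind has to control how it varies in a neighbourhood of x rather than use a single fixed weight difference.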

Details

Title Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Author(s) Hein, Matthias ; Andriushchenko, Maksym

Presented at Advances in Neural Information Processing Systems 30

Date 2017

Laboratories IINFCOM

This document appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > UNATTRIBUTED-IINFCOM - IINFCOM - Unattributed publications
Peer-reviewed publications
Work outside EPFL
Conference papers
Published

Record creation date 2019-12-06