Hold me tight! Influence of discriminative features on deep network boundaries

Ortiz Jimenez, Guillermo; Modas, Apostolos; Moosavi Dezfooli, Seyed Mohsen; Frossard, Pascal

conference paper

2020

Advances in Neural Information Processing Systems 33

Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020)

Important insights towards the explainability of neural networks reside in the characteristics of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness, and propose a new perspective that relates dataset features to the distance of samples to the decision boundary. This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets. We use this framework to reveal some intriguing properties of CNNs. Specifically, we rigorously confirm that neural networks exhibit a high invariance to non-discriminative features, and show that very small perturbations of the training samples in certain directions can lead to sudden invariances in the orthogonal ones. This is precisely the mechanism that adversarial training uses to achieve robustness.

Name: 2002.06349v4.pdf

Type: Publisher's version

Access type: openaccess

License Condition: CC BY

Size: 4.53 MB

Format: Adobe PDF

Checksum (MD5): 869b51ec4b0c104acbef66c2e227407b