doctoral thesis

Understanding Deep Neural Networks using Adversarial Attacks

Nakka, Krishna Kanth  
2022

Deep Neural Networks (DNNs) have achieved great success in a wide range of applications, such as image recognition, object detection, and semantic segmentation. Even though the discriminative power of DNNs is nowadays unquestionable, serious concerns have arisen ever since DNNs were shown to be vulnerable to adversarial examples crafted by adding imperceptible perturbations to clean images. The implications of these malicious attacks are even more significant for DNNs deployed in real-world systems, e.g., autonomous driving and biometric authentication. Consequently, an intriguing question that we aim to answer is how DNNs behave under adversarial attacks.
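To make the notion of an imperceptible perturbation concrete, the following is a minimal sketch (in PyTorch, not taken from the thesis) of the classic Fast Gradient Sign Method: a single gradient step, bounded in L-infinity norm, is often enough to flip the prediction of a standard classifier. The pretrained ResNet-18, the epsilon value, and the random input tensor are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

# Illustrative pretrained classifier (assumption: torchvision weights available).
model = resnet18(weights="IMAGENET1K_V1").eval()

def fgsm_attack(x, y, eps=2.0 / 255):
    """One L-infinity-bounded gradient step that increases the loss (FGSM)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each pixel by +/- eps in the direction that hurts the model most.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

# Hypothetical usage: a random image stands in for a real, preprocessed input.
x = torch.rand(1, 3, 224, 224)
y = model(x).argmax(dim=1)          # treat the clean prediction as the label
x_adv = fgsm_attack(x, y)
print(model(x).argmax(1).item(), "->", model(x_adv).argmax(1).item())
```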

This thesis contributes to a better understanding of the mechanisms of adversarial attacks on DNNs. Our main contributions lie broadly in two directions: (1) we propose interpretable architectures, first to understand why adversarial attacks succeed and then to improve the robustness of DNNs; (2) we design intuitive adversarial attacks, both to mislead DNNs and to serve as a tool for expanding our present understanding of DNNs' internal workings and their limitations.

In the first direction, we introduce deep architectures that allow humans to interpret the reasoning behind DNN predictions. Specifically, we incorporate Bag-of-visual-words representations from the pre-deep-learning era into DNNs using an attention scheme. We identify key reasons for the success of adversarial attacks and use these insights to propose an adversarial defense that maximally separates the latent features of discriminative regions while minimizing the contribution of non-discriminative regions to the final prediction. The second direction deals with the design of adversarial attacks to understand DNNs' limitations in real-world environments. To begin with, we show that existing state-of-the-art semantic segmentation networks, which achieve superior performance by exploiting context, are highly susceptible to indirect local attacks. Furthermore, we demonstrate the existence of universal directional perturbations that are quasi-independent of the input template but still successfully fool unknown Siamese-based visual object trackers. We then identify that the mid-level filter banks of different backbones bear strong similarities and can thus serve as common ground for an attack. We therefore learn a generator that disrupts mid-level features with high transferability across target architectures, datasets, and tasks. In short, our attacks highlight critical vulnerabilities of DNNs that make their deployment in real-world environments challenging, even in the extreme case where the attacker knows neither the target architecture nor the data used to train it.
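The mid-level feature attack mentioned above can be pictured with a simple feature-space objective: push the perturbed image's mid-level activations as far as possible from their clean values, under an L-infinity budget. The sketch below is a hedged illustration of that idea, not the generator-based method of the thesis; the VGG-16 backbone, the cut point at layer 16, and the step sizes are assumptions.

```python
import torch
from torchvision.models import vgg16

# Illustrative mid-level feature extractor (assumption: VGG-16 cut at layer 16).
midlevel = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in midlevel.parameters():
    p.requires_grad_(False)

def feature_disruption(x, eps=8.0 / 255, lr=1.0 / 255, steps=10):
    """Iteratively maximize the distance to the clean mid-level features."""
    with torch.no_grad():
        clean_feat = midlevel(x)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = -torch.norm(midlevel((x + delta).clamp(0, 1)) - clean_feat)
        loss.backward()                  # minimizing -distance = maximizing distance
        with torch.no_grad():
            delta -= lr * delta.grad.sign()
            delta.clamp_(-eps, eps)      # keep the perturbation imperceptible
            delta.grad.zero_()
    return (x + delta).clamp(0, 1).detach()

# Hypothetical usage on a random image.
x_adv = feature_disruption(torch.rand(1, 3, 224, 224))
```

Because this objective never consults any classifier's output, the same perturbation tends to hurt any downstream model that relies on similar mid-level filters, which is the intuition behind the transferability claim.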

Furthermore, we go beyond fooling networks and demonstrate the usefulness of adversarial attacks for studying the internal disentangled representations of self-supervised 3D pose estimation networks. We observe that adversarially manipulating the appearance information in the input image alters the pose output, indicating that the pose code contains appearance information and that disentanglement is far from complete. Beyond the above contributions, an underlying theme that recurs throughout this thesis is counteracting adversarial attacks by detecting them.
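The disentanglement probe can be sketched as follows: adversarially push the appearance code away from its clean value while leaving the pose code out of the objective, then measure how much the pose code drifts anyway. The encoder below is a toy stand-in (an assumption, not the thesis's pose network); any nonzero drift signals appearance information leaking into the pose code.

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for a self-supervised pose network with a split latent."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 64))
        self.pose_head = nn.Linear(64, 16)
        self.app_head = nn.Linear(64, 16)

    def forward(self, x):
        h = self.backbone(x)
        return self.pose_head(h), self.app_head(h)

def appearance_probe(enc, x, eps=4.0 / 255, lr=1.0 / 255, steps=20):
    """Attack only the appearance code, then report the pose-code drift."""
    pose_clean, app_clean = (t.detach() for t in enc(x))
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        pose, app = enc((x + delta).clamp(0, 1))
        loss = -torch.norm(app - app_clean)      # push the appearance code away
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.grad.zero_()
    pose_adv, _ = enc((x + delta).clamp(0, 1))
    return torch.norm(pose_adv - pose_clean).item()  # nonzero drift = leakage

# Hypothetical usage with a random image; a real probe would use the trained network.
print(appearance_probe(ToyEncoder().eval(), torch.rand(1, 3, 64, 64)))
```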

Type
doctoral thesis
DOI
10.5075/epfl-thesis-9259
Author(s)
Nakka, Krishna Kanth  
Advisors
Fua, Pascal • Salzmann, Mathieu
Jury
Prof. Nicolas Henri Bernard Flammarion (president); Prof. Pascal Fua, Dr Mathieu Salzmann (directors); Prof. Pascal Frossard, Dr Jan Hendrik Metzen, Prof. Giorgio C. Buttazzo (examiners)
Date Issued
2022
Publisher
EPFL
Publisher place
Lausanne
Public defense date
2022-08-15
Thesis number
9259
Number of pages
235
Subjects
Deep Neural Networks • Adversarial Attacks • Black-box Attacks • Adversarial Defense • Image Recognition • Semantic Segmentation • Object Tracking • Disentanglement
EPFL units
CVLAB  
Faculty
IC  
School
IINFCOM  
Doctoral School
EDIC  
Available on Infoscience
August 4, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/189769