Abstract

We propose the Square Attack, a new score-based black-box $l_2$ and $l_\infty$ adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. The Square Attack is based on a randomized search scheme in which we select localized square-shaped updates at random positions so that the $l_\infty$- or $l_2$-norm of the perturbation is approximately equal to the maximal budget at each step. Our method is algorithmically transparent, robust to the choice of hyperparameters, and significantly more query-efficient than the more complex state-of-the-art methods. In particular, on ImageNet we improve the average query efficiency for various deep networks by a factor of at least $2$ and up to $7$ compared to the recent state-of-the-art $l_\infty$-attack of Meunier et al., while achieving a higher success rate. The Square Attack can even be competitive with gradient-based white-box attacks in terms of success rate. Moreover, we show its utility by breaking a recently proposed defense based on randomization. The code for our attack is available at https://github.com/max-andr/square-attack
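To make the randomized search scheme concrete, the following is a minimal illustrative sketch of an $l_\infty$ variant, not the released implementation. Several details are assumptions made here for illustration: the score oracle `loss_fn` (lower is better for the attacker), the random $\pm\epsilon$ initialization, the fixed square side length `side`, and the greedy acceptance rule. The function name `square_attack_linf` and its parameters are likewise hypothetical.

```python
import numpy as np


def square_attack_linf(loss_fn, x, eps, n_queries=200, side=4, seed=0):
    """Random search with square-shaped l_inf updates (illustrative sketch).

    loss_fn: hypothetical score oracle; lower values are better for the attacker.
    x: image of shape (H, W, C) with values in [0, 1].
    eps: l_inf budget; side: fixed square side length (an assumption).
    """
    rng = np.random.default_rng(seed)
    h, w, c = x.shape

    # Assumed initialization: every entry of the perturbation is +eps or -eps,
    # so the perturbation already sits on the boundary of the l_inf ball.
    delta = rng.choice([-eps, eps], size=(h, w, c))
    x_adv = np.clip(x + delta, 0.0, 1.0)
    best = loss_fn(x_adv)

    for _ in range(n_queries):
        # Pick the top-left corner of a square window at a random position.
        row = rng.integers(0, h - side + 1)
        col = rng.integers(0, w - side + 1)

        # Inside the window, set each channel to x + eps or x - eps,
        # keeping the perturbation norm at the maximal budget.
        signs = rng.choice([-eps, eps], size=(1, 1, c))
        cand = x_adv.copy()
        window = x[row:row + side, col:col + side, :]
        cand[row:row + side, col:col + side, :] = np.clip(window + signs, 0.0, 1.0)

        # Greedy acceptance: keep the candidate only if the score improves.
        val = loss_fn(cand)
        if val < best:
            x_adv, best = cand, val

    return x_adv, best
```

In this sketch each iteration costs exactly one query to `loss_fn`, and no gradient of the model is ever requested, which is the property that makes such a search insensitive to gradient masking.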
