Block-Sparse Adversarial Attack To Fool Transformer-Based Text Classifiers

Sadrizadeh, Sahar; Dolamic, Ljiljana; Frossard, Pascal

doi:10.1109/ICASSP43922.2022.9747475

conference paper

January 1, 2022

2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Recent work has shown that deep neural networks, despite their strong performance in many fields, are vulnerable to adversarial examples. In this paper, we propose a gradient-based adversarial attack against transformer-based text classifiers. Our method constrains the adversarial perturbation to be block-sparse, so that the resulting adversarial example differs from the original sentence in only a few words. Because textual data is discrete, we use gradient projection to find the minimizer of the proposed optimization problem. Experimental results show that the attack preserves the semantics of the sentence while reducing the accuracy of GPT-2 to less than 5% on different datasets (AG News, MNLI, and Yelp Reviews). Furthermore, the block-sparsity constraint of the proposed optimization problem results in small perturbations in the adversarial example.
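The record does not give the optimization problem itself, so the following is only a toy sketch of the two ideas named in the abstract, not the authors' algorithm. It assumes one block per token (one row of the perturbation matrix), enforces block sparsity by simple hard thresholding (keeping the k rows with the largest L2 norm), and handles discreteness by mapping each perturbed embedding to its nearest vocabulary embedding. The names `block_sparsify`, `project_to_vocab`, `block_sparse_attack`, and `grad_fn` are hypothetical.

```python
import numpy as np

def block_sparsify(delta, k):
    """Keep the k rows (one row = one token's block) with the largest L2 norm."""
    norms = np.linalg.norm(delta, axis=1)
    keep = np.argsort(norms)[-k:]          # indices of the k largest-norm rows
    out = np.zeros_like(delta)
    out[keep] = delta[keep]
    return out

def project_to_vocab(x, emb):
    """Return, for each row of x, the id of the nearest embedding (Euclidean)."""
    d2 = ((x[:, None, :] - emb[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def block_sparse_attack(token_ids, emb, grad_fn, k=1, step=0.5, iters=20):
    """Toy attack loop (an assumption, not the paper's method).

    grad_fn(x) must return the gradient of the classifier loss with respect
    to the (n_tokens, dim) embedding matrix x.
    """
    x0 = emb[token_ids]
    delta = np.zeros_like(x0)
    for _ in range(iters):
        # Gradient ascent on the loss, then re-impose block sparsity.
        delta = block_sparsify(delta + step * grad_fn(x0 + delta), k)
    # Discrete step: snap every perturbed embedding back to a real token.
    return project_to_vocab(x0 + delta, emb)
```

Because rows with zero perturbation project back to their own token, at most k tokens can change in the returned sequence. A real attack would obtain `grad_fn` from the target classifier by automatic differentiation.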

Type

conference paper

DOI

10.1109/ICASSP43922.2022.9747475

Web of Science ID

WOS:000864187908029

Authors

Sadrizadeh, Sahar; Dolamic, Ljiljana; Frossard, Pascal

Publication date

2022-01-01

Publisher

IEEE

Published in

2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

ISBN of the book

978-1-6654-0540-9

Publisher place

New York

Series title/Series vol.

International Conference on Acoustics Speech and Signal Processing ICASSP

Start page

7837

End page

7841

Subjects

Acoustics

Computer Science, Art...

Engineering, Electric...

Computer Science

Engineering

adversarial attack

block sparse

deep neural network

natural language proc...

text classification

Peer reviewed

REVIEWED

EPFL units

LTS4

Event name	Event place	Event date
47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)	Singapore, SINGAPORE	May 22-27, 2022

Available on Infoscience

January 16, 2023

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/193727