Infoscience

conference paper not in proceedings

Revisiting Character-level Adversarial Attacks for Language Models

Abad Rocamora, Elias; Wu, Yongtao; Liu, Fanghui; et al.
2024
41st International Conference on Machine Learning (ICML 2024)

Adversarial attacks in Natural Language Processing apply perturbations at the character or token level. Token-level attacks, which have gained prominence for their use of gradient-based methods, are susceptible to altering sentence semantics, leading to invalid adversarial examples. While character-level attacks easily maintain semantics, they have received less attention because they cannot easily adopt popular gradient-based methods and are thought to be easy to defend against. Challenging these beliefs, we introduce Charmer, an efficient query-based adversarial attack capable of achieving a high attack success rate (ASR) while generating highly similar adversarial examples. Our method successfully targets both small (BERT) and large (Llama 2) models. Specifically, on BERT with SST-2, Charmer improves the ASR by 4.84 percentage points and the USE similarity by 8 percentage points with respect to the previous state of the art. Our implementation is available at github.com/LIONS-EPFL/Charmer.
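
To make the terms "character-level", "query-based", and "attack success rate" concrete, here is a minimal sketch of a greedy single-character attack against a toy classifier. It is not the Charmer algorithm and does not reproduce its results. The function names (`single_char_edits`, `greedy_char_attack`, `attack_success_rate`), the keyword-based `toy_positive_prob` classifier, and the 0.5 decision threshold are all assumptions made for illustration. The attacker only queries the model's output probability and never uses gradients.

```python
import string
from typing import Callable, Iterable, List, Optional


def single_char_edits(text: str, alphabet: str = string.ascii_lowercase) -> Iterable[str]:
    """Yield every string obtained from `text` by one character deletion,
    substitution, or insertion (Levenshtein distance at most 1)."""
    for i in range(len(text)):
        yield text[:i] + text[i + 1:]                # deletion
        for c in alphabet:
            if c != text[i]:
                yield text[:i] + c + text[i + 1:]    # substitution
    for i in range(len(text) + 1):
        for c in alphabet:
            yield text[:i] + c + text[i:]            # insertion


def greedy_char_attack(
    text: str,
    true_label_prob: Callable[[str], float],
    max_edits: int = 2,
) -> Optional[str]:
    """Greedily apply up to `max_edits` single-character edits, each time
    keeping the candidate that most lowers the true-label probability.
    Returns the adversarial string, or None if the label never flips."""
    current = text
    for _ in range(max_edits):
        best = min(single_char_edits(current), key=true_label_prob)
        if true_label_prob(best) >= true_label_prob(current):
            return None                              # no edit helps; give up
        current = best
        if true_label_prob(current) < 0.5:           # assumed decision threshold
            return current
    return None


def attack_success_rate(
    texts: List[str],
    true_label_prob: Callable[[str], float],
    max_edits: int = 2,
) -> float:
    """Fraction of correctly classified inputs for which the attack succeeds."""
    attackable = [t for t in texts if true_label_prob(t) >= 0.5]
    if not attackable:
        return 0.0
    wins = sum(
        greedy_char_attack(t, true_label_prob, max_edits) is not None
        for t in attackable
    )
    return wins / len(attackable)


def toy_positive_prob(text: str) -> float:
    """Hypothetical stand-in for a sentiment model: confident 'positive'
    only when the exact word 'good' appears."""
    return 0.9 if "good" in text.split() else 0.1


adversarial = greedy_char_attack("a good movie", toy_positive_prob)
asr = attack_success_rate(["a good movie", "good", "bad film"], toy_positive_prob)
```

The toy classifier is deliberately brittle, so a single edit is enough to flip it. A real evaluation would query an actual model such as BERT and would also measure similarity between the original and perturbed sentences, as the abstract describes.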

Name: ICML_2024_Abad_CharacterLevelAttack.pdf
Type: Postprint
Access type: openaccess
License Condition: copyright
Size: 6.66 MB
Format: Adobe PDF
Checksum (MD5): 9e786d03e3ec45d71bc92d4ab2cbe0a4


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved