Boosting adversarial transferability of vision transformers

Yajie Wang, Chuan Zhang, Huipeng Zhou, Zuobin Ying, Zehui Xiong, Wanlei Zhou, Liehuang Zhu

Research output: Contribution to journal › Article › peer-review

Abstract

Vision Transformers (ViTs) have emerged as a dominant backbone architecture for a variety of visual tasks; however, their vulnerability to adversarial examples continues to pose a significant challenge. Unlike Convolutional Neural Networks (CNNs), ViTs fundamentally rely on self-attention mechanisms, leading to a distinct architectural design. The limited transferability of existing adversarial attacks on ViTs can be attributed to the neglect of these unique features. To address this, we introduce a novel self-attention-oriented Adversarial Block Skip (ABS) method specifically designed to generate transferable adversarial examples. ABS creates a diverse range of structures by applying skip connections to blocks within the transformer encoder, thereby activating the uncertainty of the attention mechanism. This disrupts the global interaction between the features captured by ViTs and confounds the model's decision-making process. The results demonstrate that ABS not only establishes a versatile and effective attack mechanism but also supports transfer attacks across a diverse array of ViTs and CNNs. This finding highlights how well adversarial examples crafted on ViTs generalize, suggesting that their cross-architecture transferability may surpass previous assumptions. Comprehensive empirical evaluations of prominent transformer models on the ImageNet dataset confirm that ABS markedly surpasses existing baseline methods in effectiveness. Furthermore, ABS is highly compatible with prevailing adversarial attack frameworks, augmenting their efficacy upon integration. This versatility makes ABS a valuable component of the toolkit for executing advanced and effective adversarial attacks in machine learning security.
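The abstract describes ABS as randomly applying skip connections to transformer encoder blocks, so that each attack iteration sees a structurally different sub-network. A minimal sketch of that idea, using hypothetical names (`forward_with_block_skip`, toy `blocks`) rather than the authors' implementation:

```python
import random

def forward_with_block_skip(blocks, x, skip_prob=0.3, rng=None):
    """Pass x through a list of encoder blocks, randomly replacing
    each block with its identity (skip) path. Each call therefore
    samples a structurally different sub-network, which is the core
    idea ABS uses to perturb the attention mechanism."""
    rng = rng or random.Random()
    out = x
    for block in blocks:
        if rng.random() < skip_prob:
            continue  # take the skip connection: identity mapping
        out = block(out)
    return out

# Toy stand-ins for transformer encoder blocks: each adds a fixed
# offset so the effect of skipping a block is easy to observe.
blocks = [(lambda v, d=d: v + d) for d in (1, 2, 3)]

full = forward_with_block_skip(blocks, 0, skip_prob=0.0)   # no skips -> 6
empty = forward_with_block_skip(blocks, 0, skip_prob=1.0)  # all skipped -> 0
```

In an actual attack loop, the gradient of the loss would be taken through such a randomly thinned network at every iteration, accumulating a perturbation that does not overfit any single fixed architecture.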
Original language: English
Pages (from-to): 329-342
Journal: IEEE Transactions on Dependable and Secure Computing
Volume: 23
Issue number: 1
Early online date: 03 Sept 2025
DOIs
Publication status: Published - Jan 2026
Externally published: Yes

