An information-theoretic perspective of physical adversarial patches

Bilel Tarchoun*, Anouar Ben Khalifa, Mohamed Ali Mahjoub, Nael Abu-Ghazaleh, Ihsen Alouani

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Real-world adversarial patches have been shown to compromise state-of-the-art models in various computer vision applications. Most existing defenses rely on analyzing input- or feature-level gradients to detect the patch. However, these methods have been defeated by recent GAN-based attacks that generate naturalistic patches. In this paper, we propose a new perspective on defending against adversarial patches based on the entropy carried by the input, rather than on its saliency. We present Jedi, a new defense against adversarial patches that tackles the patch localization problem from an information-theoretic perspective: it leverages the high entropy of adversarial patches to identify potential patch zones, and uses an autoencoder to complete patch regions from high-entropy kernels.
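The entropy-based localization idea in the abstract can be illustrated with a minimal sketch: compute the Shannon entropy of pixel values in each sliding window, then flag windows whose entropy exceeds a threshold as candidate patch zones. This is only an illustration of the general principle; the function names, window size, and threshold below are our assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def local_entropy(image, win=4):
    """Shannon entropy (bits) of pixel values in each win x win window.

    `image` is a 2D list of ints (e.g., 0-255 grayscale values).
    Returns a (H - win + 1) x (W - win + 1) entropy heatmap.
    """
    h, w = len(image), len(image[0])
    heat = []
    for y in range(h - win + 1):
        row = []
        for x in range(w - win + 1):
            # Histogram of pixel values inside the window.
            counts = Counter(image[y + dy][x + dx]
                             for dy in range(win) for dx in range(win))
            n = win * win
            # Shannon entropy: -sum(p * log2(p)) over the value histogram.
            ent = -sum((c / n) * math.log2(c / n) for c in counts.values())
            row.append(ent)
        heat.append(row)
    return heat

def candidate_patch_zones(image, win=4, thresh=2.0):
    """Flag window positions whose entropy exceeds `thresh` (hypothetical cutoff)."""
    heat = local_entropy(image, win)
    return [(y, x) for y, row in enumerate(heat)
            for x, e in enumerate(row) if e > thresh]
```

For example, a flat gray region yields zero entropy everywhere (no candidates), while a window of 16 distinct pixel values yields the maximum 4 bits and is flagged. A real defense along these lines would then hand the flagged regions to an inpainting model, as the abstract describes with the autoencoder stage.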

Jedi achieves high-precision adversarial patch localization and removal, detecting on average 90% of adversarial patches across different benchmarks and recovering up to 94% of successful patch attacks. Since Jedi relies on an input entropy analysis, it is model-agnostic and can be applied to off-the-shelf models without changes to their training or inference. Moreover, we present a comprehensive qualitative analysis of the cases where Jedi fails, in comparison with related methods. Interestingly, we find that a significant core of failure cases shared among the different defenses has one common property: high entropy. We believe this work offers a new perspective for understanding the adversarial effect in physical-world settings. We also leverage these findings to enhance Jedi’s handling of entropy outliers by introducing Adaptive Jedi, which boosts performance by up to 9% on challenging images.
Original language: English
Article number: 106590
Journal: Neural Networks
Volume: 179
Early online date: 19 Aug 2024
DOIs
Publication status: Published - Nov 2024

Publications and Copyright Policy

This work is licensed under Queen’s Research Publications and Copyright Policy.
