Generating sparse explanations for malicious Android opcode sequences using hierarchical LIME

Jeff Mitchell*, Niall McLaughlin, Jesus Martinez-del-Rincon

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

In malware analysis, understanding the reasons behind a classifier's decision is important for building trust in the system. When standard explanation methods, such as LIME, are applied to opcode-sequence-based classifiers, the resulting explanation may provide little insight into the salient parts of the input sequence. This is because LIME treats each opcode as an independent feature, and perturbing a single opcode does not cause a significant change in the output, so the resulting explanation tends to look like random noise. In this paper, we introduce a novel method, Hierarchical-LIME (H-LIME), to address this issue. H-LIME takes into consideration the hierarchical structure of the program, which is composed of classes and methods. We show that when H-LIME is applied at the level of classes and methods, the resulting explanation is sparser, which greatly improves its interpretability. We conduct extensive experiments, evaluating our proposed method against criteria for accuracy, completeness, sparsity, stability and efficiency. We show that our method significantly improves on all the evaluation criteria compared to other explainability methods.
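
The contrast between per-opcode and group-level perturbation can be illustrated with a minimal sketch. This is not the authors' H-LIME implementation: it is a single-level, LIME-style surrogate that masks whole groups of opcodes (for example, one group per method) and fits a weighted linear model to the classifier's responses. The function name `explain_groups`, the `(start, end)` group layout, the padding opcode `0`, the kernel width, and the toy classifier are all assumptions made for illustration.

```python
import numpy as np

def explain_groups(predict, sequence, groups, n_samples=500, pad=0, seed=0):
    """LIME-style surrogate over groups of opcodes rather than single opcodes.

    predict: callable mapping a 1-D int array to a malware score (float).
    groups:  list of (start, end) index ranges, e.g. one per method
             (hypothetical layout, assumed for this sketch).
    Returns one weight per group.
    """
    rng = np.random.default_rng(seed)
    sequence = np.asarray(sequence)
    k = len(groups)
    # Binary masks: 1 keeps a group, 0 replaces it with the padding opcode.
    Z = rng.integers(0, 2, size=(n_samples, k))
    Z[0] = 1  # include the unperturbed input
    y = np.empty(n_samples)
    for i, z in enumerate(Z):
        x = sequence.copy()
        for keep, (s, e) in zip(z, groups):
            if not keep:
                x[s:e] = pad
        y[i] = predict(x)
    # Proximity kernel: samples closer to the original get more weight.
    dist = 1.0 - Z.mean(axis=1)
    w = np.exp(-(dist ** 2) / 0.25)
    # Weighted ridge regression with an intercept, solved in closed form.
    X = np.hstack([Z, np.ones((n_samples, 1))])
    A = X.T @ (w[:, None] * X) + 1e-3 * np.eye(k + 1)
    b = X.T @ (w * y)
    coef = np.linalg.solve(A, b)
    return coef[:k]

# Toy classifier (assumed): flags the sequence only if opcodes 4..6 are all 7.
toy_predict = lambda x: float(np.all(x[4:7] == 7))
seq = [1, 2, 3, 4, 7, 7, 7, 5, 6, 1, 2, 3]
methods = [(0, 4), (4, 8), (8, 12)]  # three hypothetical methods
weights = explain_groups(toy_predict, seq, methods)
```

In this toy setting, masking any single opcode outside positions 4 to 6 leaves the score unchanged, whereas masking the second group removes the pattern entirely, so the surrogate concentrates its weight on that one group. A hierarchical variant would presumably repeat the same procedure inside the highest-weighted group at a finer granularity, but that step is not shown here.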

Original language: English
Article number: 103637
Number of pages: 15
Journal: Computers & Security
Volume: 137
Early online date: 12 Dec 2023
Publication status: Early online date - 12 Dec 2023
