TY - JOUR
T1 - Multi-view Deep Learning for Zero-day Android Malware Detection
AU - Millar, Stuart
AU - McLaughlin, Niall
AU - Martinez-del-Rincon, Jesus
AU - Miller, Paul
PY - 2021/1/13
Y1 - 2021/1/13
N2 - Zero-day malware samples pose a considerable danger to users as implicitly there are no documented defences for previously unseen, newly encountered behaviour. Malware detection therefore relies on past knowledge to attempt to deal with zero-days. Often such insight is provided by a human expert hand-crafting and pre-categorising certain features as malicious. However, tightly coupled feature-engineering based on previous domain knowledge risks not being effective when faced with a new threat. In this work we decouple this human expertise, instead encapsulating knowledge inside a deep learning neural net with no prior understanding of malicious characteristics. Raw input features consist of low-level opcodes, app permissions and proprietary Android API package usage. Our method makes three main contributions. Firstly, a novel multi-view deep learning Android malware detector with no specialist malware domain insight used to select, rank or hand-craft input features. Secondly, a comprehensive zero-day scenario evaluation using the Drebin and AMD benchmarks, with our model achieving weighted average detection rates of 91% and 81% respectively, an improvement of up to 57% over the state-of-the-art. Thirdly, a 77% reduction in false positives on average compared to the state-of-the-art, with excellent F1 scores of 0.9928 and 0.9963 for the general detection task again on the Drebin and AMD benchmark datasets respectively.
U2 - 10.1016/j.jisa.2020.102718
DO - 10.1016/j.jisa.2020.102718
M3 - Article
VL - 58
JO - Journal of Information Security and Applications
JF - Journal of Information Security and Applications
SN - 2214-2126
ER -