Abstract
This chapter investigates the potential of deep learning architectures for Android malware detection, specifically convolutional neural networks (CNNs) that borrow concepts from natural language processing (NLP). The proposed solution is based on static analysis of raw opcode sequences from disassembled programs, together with complementary features such as API calls and permissions; the network automatically learns the features indicative of malware. This removes the need for hand-engineered malware features when performing classification. Using the Drebin and AMD benchmark datasets, we demonstrate the benefits of a multi-view architecture that combines multiple feature sources. We conclude that the use of deep learning architectures enables state-of-the-art results in automatic malware detection while reducing the dependency on feature engineering and domain expertise. Compared with single-view architectures, multi-view architectures improve performance because the network is exposed to several sources of information simultaneously and so learns a more effective set of features. The model achieves state-of-the-art detection performance in a challenging zero-day scenario, reducing false positives by 77% in relative terms on average, an important metric for potential real-world deployment.
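To make the phrase "in relative terms" concrete, the following minimal sketch shows how a relative reduction in false positive rate is computed. The function name and both rates are hypothetical values chosen for illustration only; they are not results from the chapter.

```python
def relative_reduction(baseline_rate: float, new_rate: float) -> float:
    """Return the fraction by which new_rate is lower than baseline_rate."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - new_rate) / baseline_rate


# Hypothetical false positive rates (assumed for illustration, not from the chapter).
baseline_fpr = 0.0100    # e.g. a single-view model flags 1.00% of benign apps
multi_view_fpr = 0.0023  # e.g. a multi-view model flags 0.23% of benign apps

reduction = relative_reduction(baseline_fpr, multi_view_fpr)
print(f"Relative reduction in false positives: {reduction:.0%}")
```

Under these assumed rates the absolute change is only 0.77 percentage points, while the relative reduction is 77%, which is why the distinction matters when comparing detectors that already have low false positive rates.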
Original language | English |
---|---|
Title of host publication | Artificial intelligence and cybersecurity theory and applications |
Editors | Tuomo Sipola, Tero Kokkonen, Mika Karjalainen |
Publisher | Springer |
Pages | 209–246 |
ISBN (Electronic) | 9783031150302 |
ISBN (Print) | 9783031150296, 9783031150326 |
DOIs | |
Publication status | Published - 01 Aug 2022 |