Navigating concept drift and packing complexity in malware family classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the rapidly evolving landscape of cybersecurity, classification of malware families presents significant challenges due to the dynamic nature of malware, a phenomenon known as concept drift. In this research, we classify Windows PE malware families using static analysis of raw opcode sequences. By leveraging Convolutional Neural Networks (CNNs) to extract unique features from these sequences, our approach achieves high classification accuracy rates of 98.20% and 89.55% on the Microsoft Malware Classification Challenge and BODMAS datasets, respectively. We also conducted a temporal analysis on BODMAS over a 13-month period to observe the evolution of malware families and identify periods where our model’s accuracy decreases. We implemented a retraining strategy, allowing us to observe how retraining the model with new data helps it adapt to new malware patterns. The study also examined the impact of packed malware and different types of packers on the model’s performance. Our findings indicate that packed malware significantly affects the model’s accuracy, with some packers having a more pronounced impact than others. These results underscore the importance of regular model updates and specialized handling of packed malware to maintain robust detection capabilities.
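The temporal analysis and retraining strategy described above can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup-table model stands in for the CNN, and the names `train_lookup`, `predict_lookup`, `temporal_evaluation`, and the toy data are all hypothetical. It only shows the general shape of the loop: train on the earliest month, score each later month in order, and optionally fold the newly scored month back into the training set.

```python
from collections import Counter, defaultdict


def train_lookup(samples):
    """Placeholder model (NOT the paper's CNN): map each feature to its most common label."""
    votes = defaultdict(Counter)
    for feature, label in samples:
        votes[feature][label] += 1
    return {feature: counts.most_common(1)[0][0] for feature, counts in votes.items()}


def predict_lookup(model, feature):
    """Return the stored label, or None for a feature never seen in training."""
    return model.get(feature)


def temporal_evaluation(months, retrain):
    """Train on months[0], then score every later month in chronological order.

    months  -- list of monthly batches, each a list of (feature, label) pairs
    retrain -- if True, add each scored month to the training set and refit
    """
    train_set = list(months[0])
    model = train_lookup(train_set)
    accuracies = []
    for batch in months[1:]:
        correct = sum(predict_lookup(model, feature) == label for feature, label in batch)
        accuracies.append(correct / len(batch))
        if retrain:
            # Labels for the scored month are assumed to be available afterwards.
            train_set.extend(batch)
            model = train_lookup(train_set)
    return accuracies


# Toy data (assumed): feature "c" is a new variant of fam1 that first appears in month 2.
months = [
    [("a", "fam1"), ("b", "fam2")],
    [("a", "fam1"), ("c", "fam1")],
    [("c", "fam1"), ("b", "fam2")],
]

static_acc = temporal_evaluation(months, retrain=False)
adaptive_acc = temporal_evaluation(months, retrain=True)
print("no retraining:  ", static_acc)
print("with retraining:", adaptive_acc)
```

In this toy setting the static model keeps misclassifying the new variant, while the retrained model recognizes it from the following month onward, which is the kind of drift-and-recovery pattern a month-by-month evaluation is meant to expose.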

Original language: English
Title of host publication: Conference on Applied Machine Learning for Information Security (CAMLIS 2024): Proceedings
Publisher: CEUR-WS
Pages: 129-144
Number of pages: 15
Volume: 3920
Publication status: Published - 09 Feb 2025
Event: Conference on Applied Machine Learning for Information Security - Arlington, United States
Duration: 24 Oct 2024 to 25 Oct 2024
Conference number: 2024
https://www.camlis.org/

Publication series

Name: CEUR Workshop Proceedings
Volume: 3920
ISSN (Print): 1613-0073

Conference

Conference: Conference on Applied Machine Learning for Information Security
Abbreviated title: CAMLIS
Country/Territory: United States
City: Arlington
Period: 24/10/2024 to 25/10/2024

Keywords

  • concept drift
  • packing complexity
  • malware family classification
