Network-based advanced malware detection using multi-classifier machine learning

  • Ahmad Almashhadani

Student thesis: Doctoral Thesis, Doctor of Philosophy


Over the past decade, cyber threats have evolved significantly in persistence and sophistication. Malware has been the primary weapon of choice for carrying out various cyberattacks. Host-based malware detection, as the primary line of defence, has become the "Achilles heel". In particular, security-aware targeted attacks, which comprise reconnaissance and delivery phases and are increasingly common, can identify deployed security tools and disable them without being detected. Hence, the deployment of an advanced, network-based Intrusion Detection System (IDS) has become an inevitable line of defence that assists host-based malware detection. Ransomware is a kind of advanced malware that has spread rapidly in recent years, causing massive financial losses for a broad range of victims, such as healthcare facilities, companies, and individuals. Modern host-based detection methods require the host to be infected before they can identify anomalies and detect the malware. By the time of infection, it may be too late, as some of the system's assets may already have been encrypted or exfiltrated by the malware. Conversely, the network-based approach can be an effective detection method, as most ransomware families attempt to contact command and control (C&C) servers before their harmful payloads are executed. Also, some recent ransomware families have evolved to incorporate the propagation properties of computer worms, enabling them to spread across networks. A network-based ransomware detection approach, which complements well-established host-based ransomware detection methods, can be one of the essential means of detecting ransomware attacks effectively. It can overcome the limitations of current ransomware defences while enabling early detection and timely deployment of countermeasures. The state of the art presents little research that focuses on network-based approaches to ransomware detection.
This thesis investigates the use of machine learning techniques for detecting the network activities of crypto ransomware. First, a thorough dynamic analysis of crypto ransomware network traffic is carried out using a dedicated malware testbed. A set of 18 network-based features is extracted from several network protocols used by Locky, one of the well-established ransomware families. A new classification scheme is introduced to classify the features into four types. A multi-feature and multi-classifier intrusion detection system is proposed and implemented for detecting the communications between ransomware and its C&C server. This new approach employs two independent classifiers working in parallel on two levels: packet and flow. The experimental evaluation of the presented detection system demonstrates that it offers high detection accuracy at each level: 97.92% and 97.08%, respectively. Second, machine learning techniques are used to detect covert C&C channels established using a Domain Generation Algorithm (DGA). DGA is one of the main techniques deployed by ransomware and botnets to connect with attackers by generating many pseudorandom domain names. A malicious domain name detection system, called MaldomDetector, is introduced. The MaldomDetector prototype can detect DGA-based communications before the malware is able to establish a successful connection with the C&C server, based only on the characters used in the domain name. MaldomDetector deploys a deterministic algorithm and easy-to-compute features extracted from the domain name characters. It is not based on any probabilistic language model, i.e., it is language-independent, and it does not use any data from an external site or wait for a DNS response packet, which significantly reduces the time and computation required to classify the domain names. The evaluation results demonstrate that MaldomDetector provides a high accuracy of 98% in detecting different types of DGA-based domains.
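As a rough illustration of what deterministic, easy-to-compute character features of a domain name might look like, the Python sketch below derives a few statistics from the queried name alone, without a language model or a DNS response. The abstract does not list the actual MaldomDetector features, so the function name `domain_char_features`, the specific features, and the choice to inspect only the leftmost label are all assumptions made for this example.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def domain_char_features(domain: str) -> dict:
    """Compute simple character statistics for one domain name.

    Hypothetical feature set for illustration only; it is not the
    feature set used by MaldomDetector.
    """
    # Assumption: only the leftmost label is inspected (e.g. "ab12" in "ab12.net").
    label = domain.lower().rstrip(".").split(".")[0]
    n = len(label)
    if n == 0:
        raise ValueError("empty domain label")

    counts = Counter(label)
    # Shannon entropy (bits per character) of the label's character distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return {
        "length": n,
        "entropy": entropy,
        "digit_ratio": sum(ch.isdigit() for ch in label) / n,
        "vowel_ratio": sum(ch in VOWELS for ch in label) / n,
        "unique_chars": len(counts),
    }
```

A feature vector of this kind could be computed as soon as a DNS query is observed and then passed to any trained classifier; which classifier and which features work well is an empirical question that this sketch does not address.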
MaldomDetector can be employed as an early warning system to raise alarms about potentially malicious DNS communications. Finally, a multi-feature and multi-classifier network-based system (MFMCNS) is presented for detecting ransomware propagation activities. A comprehensive analysis of ransomware traffic is performed, and two sets of features are extracted based on two independent flow levels: session-based and time-based. Two individual classifiers are then built, one for each feature set. The experimental results demonstrate high detection accuracy for the session-based and time-based classifiers, 99.88% and 99.66%, respectively, validating the effectiveness of the extracted features. MFMCNS employs these classifiers in parallel on different levels, and the classifiers' decisions are combined using a fusion rule. Experimental results confirm that this combination enhances the overall detection accuracy and reliability of MFMCNS.
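The abstract does not say which fusion rule MFMCNS uses. Purely as an illustration of how the outputs of two parallel classifiers can be combined, the Python sketch below applies a weighted average of the two malicious-class probabilities followed by a threshold. The function name `fuse_scores`, the weight, and the threshold are hypothetical and are not taken from the thesis.

```python
def fuse_scores(
    p_session: float,
    p_time: float,
    w_session: float = 0.5,
    threshold: float = 0.5,
) -> bool:
    """Combine two classifier probabilities into one malicious/benign decision.

    Illustrative weighted-average rule only; the fusion rule actually
    used by MFMCNS is not specified in the abstract.
    """
    for value in (p_session, p_time, w_session, threshold):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities, weight and threshold must lie in [0, 1]")

    # Weighted average of the session-based and time-based scores.
    fused = w_session * p_session + (1.0 - w_session) * p_time
    # Flag the traffic as malicious when the fused score reaches the threshold.
    return fused >= threshold
```

With equal weights, a confident session-based score can outvote a weak time-based score, for example `fuse_scores(0.9, 0.3)` flags the traffic, while `fuse_scores(0.2, 0.4)` does not. Other rules, such as majority voting or taking the maximum score, would fit the same interface.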
Date of Award: Jul 2021
Original language: English
Awarding Institution
  • Queen's University Belfast
Supervisor: Sakir Sezer (Supervisor), Philip O'Kane (Supervisor) & Mustafa Kaiiali (Supervisor)


  • Network security
  • network traffic analysis
  • intrusion detection
  • machine learning
  • malware analysis
  • ransomware
  • domain generation algorithm (DGA)
