Voice spoofing detection for multiclass attack classification using deep learning

Jason Boyd, Muhammad Fahim, Oluwafemi Olukoya*

*Corresponding author for this work

Research output: Contribution to journal (article, peer-reviewed)

Abstract

Voice biometric authentication is increasingly being adopted by organisations that perform high-volume identity verification and for controlling access to physical and virtual spaces. In this form of authentication, the user's identity is verified with their voice. However, these systems are susceptible to voice spoofing: malicious actors use speech synthesis, voice conversion or imitation, and recorded replays to deceive the Automatic Speaker Verification (ASV) system or to conduct spam communications. In this work, we formulate a voice spoofing countermeasure both as a binary classification problem that distinguishes real from fake audio and as a multiclass classification problem that detects voice conversion, synthesis and replay attacks. We investigated numerous audio features and examined each feature's capability alongside state-of-the-art deep learning algorithms, including convolutional neural networks (CNN), WaveNet, and two recurrent neural network variants: Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) models. Using a large dataset of 419,426 audio files, we evaluated the effectiveness of the deep learning models against voice spoofing attacks. The binary-class CNN achieved a false positive rate (FPR) of 0.0216, while the multiclass solutions using CNN, WaveNet, LSTMs and GRUs achieved FPRs of 0.003, 0.0260, 0.0302 and 0.0358, respectively. We extended the evaluation by including real-time classification of microphone audio and user-uploaded audio to demonstrate the practical implications and deployability of the models.
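
The abstract reports results as a false positive rate (FPR) for both the binary and the multiclass setting. As a minimal sketch of how such a figure can be computed, the Python below uses the standard definition FPR = FP / (FP + TN) and, for the multiclass case, averages the one-vs-rest FPR over classes. The averaging scheme, the class names (including a separate `bonafide` class) and the toy labels are assumptions made for illustration; the record does not state how the authors aggregated the multiclass FPR.

```python
def false_positive_rate(y_true, y_pred, positive):
    """FPR = FP / (FP + TN), treating `positive` as the positive class."""
    # False positives: samples that are not `positive` but were predicted as it.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    # True negatives: samples that are not `positive` and were not predicted as it.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


def macro_fpr(y_true, y_pred, labels):
    """Unweighted mean of the one-vs-rest FPR over all classes (an assumed scheme)."""
    return sum(false_positive_rate(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical class names and toy labels, for illustration only.
LABELS = ["bonafide", "conversion", "synthesis", "replay"]
y_true = ["bonafide", "bonafide", "conversion", "synthesis",
          "replay", "replay", "synthesis", "conversion"]
y_pred = ["bonafide", "replay", "conversion", "synthesis",
          "replay", "bonafide", "synthesis", "synthesis"]

if __name__ == "__main__":
    for label in LABELS:
        print(label, false_positive_rate(y_true, y_pred, label))
    print("macro FPR:", macro_fpr(y_true, y_pred, LABELS))
```

In the binary setting the same `false_positive_rate` function applies directly once one of the two classes is designated as positive; which class plays that role is a convention that the abstract does not specify.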

Original language: English
Article number: 100503
Number of pages: 16
Journal: Machine Learning with Applications
Volume: 14
Early online date: 13 Oct 2023
Publication status: Published - 15 Dec 2023
