
Hierarchical transformer gated graph neural network approach for efficient multi-modal hand gesture recognition

  • Nahla Majdoub Bhiri*
  • Safa Ameur
  • Hajer Chtioui
  • Ihsen Alouani
  • Anouar Ben Khalifa

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Hand Gesture Recognition (HGR) has gained significant attention as a natural and intuitive means of human-computer interaction, driven by advances in machine learning, sensor technologies, and computational power. Among these, the Leap Motion Controller (LMC) stands out for its ability to capture precise hand motion with high spatial accuracy and multiple modalities (skeletal and depth). In this work, we demonstrate that representing these complex multimodal outputs as graph structures is not only appropriate but also a significant innovation that fully utilizes sophisticated neural network architectures. To improve accuracy and generalization in HGR tasks, we present a hierarchical transformer gated graph neural network that has been improved with an intermediate fusion technique. Our architecture has been designed to overcome several limitations of current graph-based methods, including inadequate multi-modal feature integration, insufficient temporal modeling, and limited generalization across users. Using two benchmark datasets, 2MLMD and MMHGD, we examine our approach and find consistent and substantial enhancements in performance compared to the most advanced baselines. Superior accuracy, robustness, and generalizability are demonstrated by the results, confirming the efficacy of our architecture. Ablation experiments demonstrate the significance of each component in improving recognition performance.

Original language: English
Article number: 140
Journal: Signal, Image and Video Processing
Volume: 20
Issue number: 3
Publication status: Published - 09 Mar 2026

Keywords

  • Fusion
  • Graph neural network
  • Hand gesture recognition
  • LMC
  • Multi-modal

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
