Enhancing gait recognition with 3D markerless motion capture

  • James Rainey

Student thesis: Doctoral Thesis (Doctor of Philosophy)


In recent years gait has grown in popularity as a biometric, and gait recognition has matured into a larger field that is attracting international interest. Most current state-of-the-art gait recognition systems use appearance-based methods, which are fast and achieve high performance on gait datasets. With the emergence of deep learning, the focus of appearance-based methods has shifted from the development of suitable gait representations to the optimisation of deep learning architectures using existing representations, such as Gait Energy Images (GEIs). However, advances in deep learning have also enabled more reliable joint detection algorithms, allowing older model-based techniques, which previously relied on unreliable joint estimation, to become more effective for gait recognition. Model-based methods are potentially more robust than appearance-based methods, and ideally a gait recognition system would combine both to exploit the strengths of each.
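For readers unfamiliar with the representation, a GEI is the per-pixel average of size-normalised, aligned binary silhouettes over a gait cycle. A minimal NumPy sketch (function name and toy data are illustrative, not from the thesis):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Per-pixel mean of aligned binary silhouettes over one gait cycle.

    `silhouettes` is an iterable of equal-sized 2D binary arrays that
    have already been cropped and centre-aligned on the subject.
    """
    stack = np.stack([np.asarray(s, dtype=np.float32) for s in silhouettes])
    return stack.mean(axis=0)

# Toy example: two 4x4 silhouettes. Pixels present in both frames come
# out bright (1.0); pixels present in only one frame come out grey (0.5).
a = np.zeros((4, 4)); a[1:3, 1:3] = 1
b = np.zeros((4, 4)); b[1:4, 1:3] = 1
gei = gait_energy_image([a, b])
```

Static body regions therefore appear bright in a GEI while swinging limbs blur into grey, which is why the representation compresses an entire gait cycle into a single image suitable for CNN input.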

In this thesis a model-based gait recognition method is developed which makes use of state-of-the-art markerless motion capture and automated machine learning (AutoML) methods. Advances in 3D markerless motion capture enable full 3D body poses to be estimated from unconstrained video sources. The motion capture algorithm is used to produce 3D body poses from the CASIA-B gait dataset, which in turn are used to train an AutoML classifier. This is the first gait recognition approach to use modern markerless motion capture from unconstrained videos. It is also the first use of AutoML in gait recognition, which provides a consistent classification approach and enables comparison of the contribution of individual features to recognition accuracy, rather than the aggregate contribution of the classifiers. The approach achieves competitive performance against other motion-capture-based methods.
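The pose-based pipeline can be pictured as reducing each tracked sequence of 3D joints to a fixed-length feature vector before classification. The feature scheme below is an illustrative assumption, not the thesis's actual feature set:

```python
import numpy as np

def pose_sequence_features(poses):
    """Flatten a (T, J, 3) sequence of 3D joint positions into a fixed
    2*J*3 vector: per-joint mean positions plus per-joint standard
    deviations (a crude proxy for how much each joint moves).
    """
    poses = np.asarray(poses, dtype=np.float32)  # T frames, J joints, xyz
    means = poses.mean(axis=0).ravel()
    stds = poses.std(axis=0).ravel()
    return np.concatenate([means, stds])

# 30 frames of a 17-joint skeleton -> 17 * 3 * 2 = 102 features,
# ready to feed to any tabular classifier (e.g. one chosen by AutoML).
rng = np.random.default_rng(0)
features = pose_sequence_features(rng.normal(size=(30, 17, 3)))
```

A fixed-length vector of this kind is what lets an AutoML system search over conventional classifiers while holding the input representation constant, so accuracy differences can be attributed to the features rather than to classifier tuning.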

A state-of-the-art appearance-based method using Convolutional Neural Networks (CNNs) is investigated and a number of enhancements are made to improve its performance. These include a modified pre-processing step, the training of multiple CNNs on subsets of the data, and the combination of GEI representations with poses fitted using markerless motion capture. An AutoML classifier is trained to determine the view angle of the input sequence and select the best-performing model for that view. This hybrid method achieves performance competitive with state-of-the-art approaches on the CASIA-B dataset.
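The model-selection step can be sketched as routing each sequence to whichever per-view CNN the view classifier scores highest. The names below (`select_model`, the dict of per-view models) are hypothetical, not from the thesis; only the CASIA-B view angles are factual:

```python
import numpy as np

# CASIA-B records 11 camera views, 0 to 180 degrees in 18-degree steps.
CASIA_B_VIEWS = list(range(0, 181, 18))

def select_model(view_scores, per_view_models):
    """Return (view, model) for the view the classifier scored highest.

    `view_scores` is an 11-element array of view-classifier scores;
    `per_view_models` maps each view angle to its specialised model.
    """
    view = CASIA_B_VIEWS[int(np.argmax(view_scores))]
    return view, per_view_models[view]

# Toy usage: a one-hot score vector pointing at the 90-degree view.
models = {v: f"cnn_{v}deg" for v in CASIA_B_VIEWS}
scores = np.eye(11)[CASIA_B_VIEWS.index(90)]
view, model = select_model(scores, models)
```

Dispatching on predicted view lets each CNN specialise on one camera angle instead of forcing a single network to be view-invariant.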

Finally, to gain a deeper understanding of how the CNN-based methods function, a detailed evaluation and visual analysis is performed. A pre-trained transfer learning network is fine-tuned using the same data and pre-processing as the previous custom networks, and its performance is compared to that of the custom network to highlight the impact of the CNN architecture on the accuracy of the system. A visual analysis of the features learned by the CNN-based method is then performed. The visualisation technique produces adversarial images by iteratively modifying input images to brighten or darken important features until the classification changes, allowing the features that contribute to the final classification to be visualised. This provides insight into which body parts contribute most to correct classifications and failure cases, and how the contributing parts differ as the view angle changes. The head, waist and upper-leg areas are found to be the most important, and the waist and arm areas are the most consistent across view angles.
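The visualisation loop described above can be sketched roughly as follows. The saliency signal, step size and toy classifier are placeholder assumptions standing in for whatever gradient signal and CNN the real system uses:

```python
import numpy as np

def flip_visualisation(image, classify, saliency, step=0.1, max_iter=100):
    """Iteratively brighten/darken pixels along the sign of a saliency
    signal until the predicted class changes; the returned per-pixel
    delta highlights the features that drove the original decision.
    """
    x = image.copy()
    original = classify(x)
    for _ in range(max_iter):
        x = np.clip(x + step * np.sign(saliency(x)), 0.0, 1.0)
        if classify(x) != original:
            break
    return x - image

# Toy stand-ins: a "classifier" thresholding mean brightness, and a
# saliency signal that treats every pixel as equally important.
classify = lambda img: int(img.mean() > 0.55)
saliency = lambda img: np.ones_like(img)
delta = flip_visualisation(np.zeros((4, 4)), classify, saliency)
```

In a real system the delta image is non-uniform: pixels needing little change to flip the decision mark the regions (head, waist, upper legs in the analysis above) on which the classification most depends.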
Date of Award: Jul 2021
Original language: English
Awarding Institution:
  • Queen's University Belfast
Sponsors: Northern Ireland Department for the Economy
Supervisors: John Bustard (Supervisor) & Seán McLoone (Supervisor)


  • Gait recognition
  • biometrics
  • computer vision
  • deep learning
  • machine learning
  • markerless motion capture
  • convolutional neural networks
  • human identification
