A sequence models-based real-time multi-person action recognition method with monocular vision

Aolei Yang*, Wei Lu, Wasif Naeem, Ling Chen, Minrui Fei

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


In intelligent video surveillance of complex scenes, it is vital to identify the current actions of multiple people accurately and in real time. This paper proposes a real-time multi-person action recognition method with monocular vision based on sequence models. First, the key points of each human body skeleton in the video are extracted using the OpenPose algorithm. Human action features are then constructed, including limb direction vectors and the skeleton height-width ratio. Multi-target human body tracking is then achieved using a tracking algorithm. Next, the tracking results are matched with the action features, and the action recognition model is constructed; it includes a spatial branch based on deep neural networks and a temporal branch based on bi-directional RNN and bi-directional long short-term memory networks. After pre-training, the model can recognize human body actions from the action features, and a recognition stabilizer is designed to minimize false alarms. Finally, extensive evaluations on the JHMDB dataset validate the effectiveness and superiority of the proposed approach.
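The feature construction and stabilization steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: keypoints are given as a list of (x, y) tuples with None for undetected joints, the limb list LIMBS is a hypothetical set of index pairs, the height-width ratio is taken from the bounding box of the detected keypoints, and the stabilizer is a plain sliding-window majority vote standing in for the paper's recognition stabilizer, whose actual design is not given here.

```python
import math
from collections import Counter, deque

# Hypothetical limb definition: pairs of keypoint indices (an assumption,
# not the limb list used in the paper).
LIMBS = [(0, 1), (1, 2), (2, 3)]


def limb_direction(p, q):
    """Unit vector pointing from keypoint p to keypoint q.

    Returns (0.0, 0.0) when either keypoint is missing or both coincide.
    """
    if p is None or q is None:
        return (0.0, 0.0)
    dx, dy = q[0] - p[0], q[1] - p[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return (0.0, 0.0)
    return (dx / norm, dy / norm)


def height_width_ratio(keypoints):
    """Height divided by width of the bounding box of detected keypoints.

    Returns 0.0 when no keypoint is detected or the box has zero width.
    """
    pts = [p for p in keypoints if p is not None]
    if not pts:
        return 0.0
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return height / width if width > 0 else 0.0


def build_features(keypoints, limbs=LIMBS):
    """Concatenate all limb direction vectors, then append the ratio."""
    feats = []
    for a, b in limbs:
        feats.extend(limb_direction(keypoints[a], keypoints[b]))
    feats.append(height_width_ratio(keypoints))
    return feats


class MajorityVoteStabilizer:
    """Sliding-window majority vote over per-frame labels.

    This is an assumed stand-in for illustration, not the paper's stabilizer.
    """

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, label):
        # Store the newest label and return the most frequent one in the window.
        self.history.append(label)
        return Counter(self.history).most_common(1)[0][0]
```

In a multi-person setting, one stabilizer instance would be kept per tracked identity, so that a single misclassified frame for one person does not change the label reported for that person or for anyone else.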

Original language: English
Journal: Journal of Ambient Intelligence and Humanized Computing
Early online date: 21 Jul 2021
Publication status: Early online date - 21 Jul 2021

Bibliographical note

Funding Information:
This work was supported by the Natural Science Foundation of Shanghai (18ZR1415100) and the National Natural Science Foundation of China (61703262).

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.


Keywords

  • Action recognition
  • Computer vision
  • Feature construction
  • Human body skeleton
  • Sequence models

ASJC Scopus subject areas

  • Computer Science (all)

