3D Human Pose Estimation using Iterative Conditional Squeeze and Excitation Networks

Research output: Contribution to journal › Article › peer-review



We propose a new method for single-camera, real-world 3D human pose estimation. Our method combines multi-task training with iterative pose refinement based on a novel conditional attention mechanism. For iterative pose refinement, the output of each convolutional layer is conditioned on the latest pose estimate, using a Conditioned Squeeze-and-Excitation network architecture that incorporates novel feedback connections. Multi-task training on both an in-the-wild 2D pose dataset and a controlled 3D pose dataset allows for real-world 3D pose estimation without the need for a large-scale in-the-wild 3D pose dataset, which is unavailable. Experiments are performed on several real-world datasets, as well as the Human 3.6 Million and HumanEva-I datasets, to show that the combined attention mechanism, iterative refinement scheme and multi-task training allow us to achieve robust and competitive performance with only a simple network architecture. In addition, we show that our method is efficient enough to run on commodity hardware, producing pose estimates in real time.
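As a rough illustration of the idea described in the abstract, the following is a minimal NumPy sketch of a squeeze-and-excitation gate conditioned on the latest pose estimate, wrapped in a short refinement loop. It is not the authors' implementation: the function names (`conditioned_se`, `refine_pose`), the weight shapes, the choice to concatenate the pose with the pooled features, the linear pose regressor, the 17-joint layout and the number of iterations are all assumptions made here for illustration.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def conditioned_se(features, pose, w1, w2):
    """Hypothetical conditioned squeeze-and-excitation gate.

    features: (C, H, W) feature map from a convolutional layer
    pose:     (D,) flattened current pose estimate (the feedback signal)
    w1:       (hidden, C + D) weights of the first excitation layer
    w2:       (C, hidden) weights of the second excitation layer
    """
    # Squeeze: global average pooling over the spatial dimensions.
    squeezed = features.mean(axis=(1, 2))            # (C,)
    # Condition: append the latest pose estimate (assumed concatenation).
    context = np.concatenate([squeezed, pose])       # (C + D,)
    # Excitation: small two-layer network, ReLU then sigmoid.
    hidden = np.maximum(0.0, w1 @ context)           # (hidden,)
    gates = sigmoid(w2 @ hidden)                     # (C,)
    # Scale: re-weight each channel by its gate.
    return features * gates[:, None, None], gates


def refine_pose(features, w1, w2, w_pose, num_joints=17, iterations=3):
    """Iteratively re-estimate the pose, feeding each estimate back in.

    w_pose: (num_joints * 3, C) weights of an assumed linear pose regressor.
    """
    pose = np.zeros(num_joints * 3)                  # initial estimate
    for _ in range(iterations):
        gated, _ = conditioned_se(features, pose, w1, w2)
        pooled = gated.mean(axis=(1, 2))             # (C,)
        pose = w_pose @ pooled                       # new pose estimate
    return pose.reshape(num_joints, 3)
```

In this sketch, the only place the pose estimate influences the features is through the channel gates, which keeps the feedback path cheap; whether the published architecture conditions the gates in the same way cannot be determined from the abstract alone.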
Original language: English
Pages (from-to): 1-13
Journal: IEEE Transactions on Cybernetics
Early online date: 05 Feb 2020
Publication status: Early online date - 05 Feb 2020

