Refining the Pose: Training and use of Deep Recurrent Autoencoders for Improving Human Pose Estimations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, a discriminative human pose estimation system based on deep learning is proposed for monocular video sequences. Our approach combines a simple but efficient Convolutional Neural Network (CNN) that directly regresses the 3D pose with a recurrent denoising autoencoder that refines the pose using the temporal information contained in the sequence of previous frames. Our architecture also supports integrated training of both parts in order to better model the space of activities: noisy but realistic poses produced by the partially trained CNN are used to enhance the training of the autoencoder. The system has been evaluated on two standard datasets, HumanEva-I and Human3.6M, comprising more than 15 different activities. We show that our simple architecture can provide state-of-the-art results.
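
The integrated training idea in the abstract, where poses from the partially trained CNN serve as noisy inputs to the denoising autoencoder, can be illustrated with a small data-preparation sketch. This is not the authors' implementation: the function name `make_denoising_pairs`, the window length, and the choice of the last frame's ground-truth pose as the clean target are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Pose = List[float]  # flattened 3D joint coordinates, e.g. [x1, y1, z1, x2, ...]


def make_denoising_pairs(
    cnn_poses: Sequence[Pose],
    true_poses: Sequence[Pose],
    window: int,
) -> List[Tuple[List[Pose], Pose]]:
    """Pair each window of noisy CNN estimates with a clean target pose.

    Hypothetical helper: the target is assumed to be the ground-truth pose
    of the last frame in the window (the abstract does not specify this).
    """
    if len(cnn_poses) != len(true_poses):
        raise ValueError("need one ground-truth pose per CNN estimate")
    if window < 1:
        raise ValueError("window must be at least 1")
    pairs = []
    for end in range(window, len(cnn_poses) + 1):
        # Noisy input: the CNN estimates for the `window` frames ending here.
        noisy_window = [list(p) for p in cnn_poses[end - window:end]]
        # Clean target: ground truth for the most recent frame in the window.
        pairs.append((noisy_window, list(true_poses[end - 1])))
    return pairs


# Toy sequence: 4 frames, one joint (3 coordinates) per pose.
true_poses = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
cnn_poses = [[0.1, 0.0, 0.0], [0.9, 0.1, 0.0], [2.2, 0.0, -0.1], [2.8, 0.0, 0.1]]
pairs = make_denoising_pairs(cnn_poses, true_poses, window=3)
```

Each resulting pair could then be fed to a recurrent model that reads the noisy window frame by frame and is trained to output the clean pose. Using the CNN's own errors as the corruption, rather than synthetic noise, is what the abstract describes as "noisy but realistic" training input.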
Original language: English
Title of host publication: Proceeding of the X Conference on Articulated Motion and Deformable Objects AMDO 2018
Publisher: Springer
Number of pages: 11
Volume: 10945
ISBN (Print): 978-3-319-94543-9
Publication status: Published - 01 Sep 2018

Publication series

Name: Image Processing, Computer Vision, Pattern Recognition, and Graphics
Publisher: Springer
