University of Oulu

Y. Zhou, J. Deng and S. Zafeiriou, "Improve Accurate Pose Alignment and Action Localization by Dense Pose Estimation," 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, 2018, pp. 480-484. doi: 10.1109/FG.2018.00077

Improve accurate pose alignment and action localization by dense pose estimation

Author: Zhou, Yuxiang1; Deng, Jiankang1; Zafeiriou, Stefanos1,2
Organizations: 1Department of Computing, Imperial College London, United Kingdom
2Centre for Machine Vision and Signal Analysis, University of Oulu, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 5.1 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2018
Publish Date: 2020-03-05


In this work we explore the use of shape-based representations as an auxiliary source of supervision for pose estimation and action recognition. We show that shape-based representations can act as a source of 'privileged information' that complements and extends pure landmark-level annotations. We explore 2D shape-based supervision signals, such as Support Vector Shape. Our experiments show that shape-based supervision signals substantially improve pose alignment accuracy when incorporated in a cascade architecture. We outperform state-of-the-art methods on the MPII and LSP datasets while using substantially shallower networks. For action localization in untrimmed videos, our method introduces additional classification signals based on structured segment networks (SSN) and further improves performance. Specifically, dense human pose and landmark localization signals are incorporated into the detection process. We apply our network to all frames of the videos, alongside the output of SSN, to further improve detection accuracy, especially for pose-related and sparsely annotated videos. Overall, the method achieves state-of-the-art performance on the Activity Detection Task of the ActivityNet Challenge 2017 test set, with remarkable improvements on pose-related and sparsely annotated categories, e.g. sports.
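The auxiliary-supervision idea in the abstract can be sketched as a multi-task loss: each stage of the cascade is trained on a landmark term plus a weighted shape-based term. This is a minimal illustration only; the function names, the MSE losses, and the weighting scheme (`alpha`) are assumptions for exposition, not the paper's actual training objective.

```python
import numpy as np

def landmark_loss(pred_heatmaps, gt_heatmaps):
    # Mean squared error over per-landmark heatmaps (illustrative choice).
    return float(np.mean((pred_heatmaps - gt_heatmaps) ** 2))

def shape_loss(pred_shape_map, gt_shape_map):
    # Auxiliary loss on a dense shape representation (e.g. a
    # Support-Vector-Shape-style map), also taken as MSE here.
    return float(np.mean((pred_shape_map - gt_shape_map) ** 2))

def cascade_stage_loss(pred_heatmaps, gt_heatmaps,
                       pred_shape, gt_shape, alpha=0.5):
    # Each cascade stage is supervised jointly: the landmark term plus
    # a weighted shape-based 'privileged information' term.
    return (landmark_loss(pred_heatmaps, gt_heatmaps)
            + alpha * shape_loss(pred_shape, gt_shape))

def total_cascade_loss(stage_outputs, gt_heatmaps, gt_shape, alpha=0.5):
    # Sum the joint loss over all stages of the cascade; each element of
    # stage_outputs is a (predicted heatmaps, predicted shape map) pair.
    return sum(cascade_stage_loss(h, gt_heatmaps, s, gt_shape, alpha)
               for h, s in stage_outputs)
```

The key design point mirrored here is that the shape term is auxiliary: it shapes training at every stage, while landmark heatmaps remain the primary output.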


ISBN: 978-1-5386-2335-0
ISBN Print: 978-1-5386-2336-7
Pages: 480–484
DOI: 10.1109/FG.2018.00077
Host publication: 13th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2018
Conference: IEEE International Conference on Automatic Face and Gesture Recognition
Type of Publication: A4 Article in conference proceedings
Field of Science: 213 Electronic, automation and communications engineering, electronics
Copyright information: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.