University of Oulu

Linna, M.; Kannala, J. and Rahtu, E. (2018). Real-time Human Pose Estimation with Convolutional Neural Networks. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, ISBN 978-989-758-290-5, pages 335-342. DOI: 10.5220/0006624403350342

Real-time human pose estimation with convolutional neural networks

Authors: Linna, Marko (1); Kannala, Juho (2); Rahtu, Esa (3)
Organizations: (1) University of Oulu, Finland
(2) Aalto University, Finland
(3) Tampere University of Technology, Finland
Format: article
Version: published version
Access: open
Persistent link: http://urn.fi/urn:nbn:fi-fe2019082125000
Language: English
Published: SCITEPRESS Science And Technology Publications, 2018
Publish Date: 2019-08-21
Description:

Abstract

In this paper, we present a method for real-time multi-person human pose estimation from video using convolutional neural networks. Our method is aimed at use-case-specific applications, where good accuracy is essential and variation in background and poses is limited. This enables us to use a generic network architecture that is both accurate and fast. We divide the problem into two phases: (1) pre-training and (2) finetuning. In pre-training, the network is trained on highly diverse input data from publicly available datasets, while in finetuning we train on application-specific data, which we record with Kinect. Our method differs from most state-of-the-art methods in that we consider the whole system, including the person detector, the pose estimator, and an automatic way to record application-specific training material for finetuning. Our method is considerably faster than many state-of-the-art methods and can be thought of as a replacement for Kinect in restricted environments. It can be used for tasks such as gesture control, games, person tracking, action recognition, and action tracking. We achieved an accuracy of 96.8% (PCK@0.2) with application-specific data.
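
The accuracy figure in the abstract is reported as PCK@0.2 (Percentage of Correct Keypoints at a threshold of 0.2). As a rough illustration of how such a score is commonly computed, the sketch below counts a predicted keypoint as correct when it lies within 0.2 times a per-sample reference length of the ground truth. The function name `pck`, the array shapes, and the choice of reference length (for example a torso or bounding-box size) are assumptions made for this example; the abstract does not state which normalisation the authors used.

```python
import numpy as np


def pck(pred, gt, ref_lengths, alpha=0.2):
    """Fraction of keypoints predicted within alpha * ref_length of ground truth.

    pred, gt:     arrays of shape (n_samples, n_joints, 2), pixel coordinates.
    ref_lengths:  array of shape (n_samples,), per-sample normalisation length.
                  Assumption: which length is used (torso, head, bounding box)
                  depends on the benchmark and is not given in the abstract.
    alpha:        threshold factor; 0.2 corresponds to PCK@0.2.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    ref = np.asarray(ref_lengths, dtype=float)

    # Euclidean distance between each predicted and true keypoint.
    dists = np.linalg.norm(pred - gt, axis=-1)  # shape (n_samples, n_joints)

    # A keypoint is correct if its error is at most alpha times the reference length.
    thresholds = alpha * ref[:, None]
    return float(np.mean(dists <= thresholds))


if __name__ == "__main__":
    # Toy data: one sample, four joints, reference length 10 -> threshold 2 px.
    gt = np.zeros((1, 4, 2))
    pred = np.array([[[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 0.0]]])
    print(pck(pred, gt, ref_lengths=[10.0]))  # two of four joints within 2 px
```

With the toy data above, the errors are 0, 1, 3 and 5 pixels against a threshold of 2 pixels, so the function returns 0.5.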


ISBN: 978-989-758-290-5
Volume: 5
Pages: 335-342
DOI: 10.5220/0006624403350342
OADOI: https://oadoi.org/10.5220/0006624403350342
Host publication: Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP
Host publication editors: Imai, Francisco; Tremeau, Alain; Braz, Jose
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Copyright information: © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.