University of Oulu

X. Liu, H. Shi, X. Hong, H. Chen, D. Tao and G. Zhao, "3D Skeletal Gesture Recognition via Hidden States Exploration," in IEEE Transactions on Image Processing, vol. 29, pp. 4583-4597, 2020.

3D skeletal gesture recognition via hidden states exploration

Author: Liu, Xin1,2; Shi, Henglin1; Hong, Xiaopeng3;
Organizations: 1Center for Machine Vision and Signal Analysis, University of Oulu, FI-90014, Finland
2School of Information Technologies, Faculty of Engineering and Information Technologies, The University of Sydney, Australia
3Xi’an Jiaotong University, Xi’an, China
4School of Computer Science, in the Faculty of Engineering, at The University of Sydney, 6 Cleveland St, Darlington, NSW 2008, Australia
5School of Information and Technology, Northwest University, 710069, China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 4.6 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2020
Publish Date: 2020-04-23


Temporal dynamics remains an open issue in modeling human body gestures. One solution is to resort to generative models such as the hidden Markov model (HMM). Nevertheless, most existing work assumes fixed anchors for each hidden state, which makes it hard to describe the explicit temporal structure of gestures. Based on the observation that a gesture is a time series with distinctly defined phases, we propose a new formulation that builds temporal compositions of gestures by low-rank matrix decomposition. The only assumption is that the gesture’s “hold” phases with static poses are linearly correlated with each other. As such, a gesture sequence can be segmented into temporal states with semantically meaningful and discriminative concepts. Furthermore, unlike traditional HMMs, which tend to use a specific distance metric for clustering and ignore temporal contextual information when estimating the emission probability, we use a long short-term memory (LSTM) network to learn probability distributions over the HMM states. The proposed method is validated on multiple challenging datasets. Experiments demonstrate that our approach works effectively on a wide range of gestures and achieves state-of-the-art performance.
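
The linear-correlation assumption in the abstract can be illustrated numerically. The sketch below is not the authors' formulation: it uses synthetic data and a plain truncated SVD as a stand-in for the paper's low-rank decomposition. The sizes (`n_coords`, `n_basis`, `n_frames`) and the helper `low_rank_approx` are hypothetical names chosen for illustration. It shows only that when hold-phase poses are linear combinations of a few static poses, the stacked pose matrix is low rank and a low-rank approximation recovers it from noisy observations.

```python
import numpy as np


def low_rank_approx(X, rank):
    """Best rank-`rank` approximation of X in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_coords = 60   # e.g. 20 joints x 3 coordinates per frame
n_basis = 3     # number of underlying static poses
n_frames = 40   # number of hold-phase frames

# Each column of `holds` is a linear combination of the basis poses,
# so the columns are linearly correlated by construction.
basis_poses = rng.standard_normal((n_coords, n_basis))
mixing = rng.standard_normal((n_basis, n_frames))
holds = basis_poses @ mixing

# Add small sensor-like noise, then recover the low-rank structure.
noisy = holds + 0.01 * rng.standard_normal(holds.shape)
recovered = low_rank_approx(noisy, n_basis)

clean_rank = np.linalg.matrix_rank(holds)
rel_err = np.linalg.norm(recovered - holds) / np.linalg.norm(holds)
print("rank of clean hold matrix:", clean_rank)
print("relative recovery error:", rel_err)
```

In this toy setting the rank equals the number of basis poses. How the paper itself segments a gesture into temporal states from such a decomposition is described in the full text, not here.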


Series: IEEE transactions on image processing
ISSN: 1057-7149
ISSN-E: 1941-0042
ISSN-L: 1057-7149
Volume: 29
Pages: 4583 - 4597
DOI: 10.1109/TIP.2020.2974061
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported in part by the Academy of Finland for project MiGA (grant 316765) and ICT 2023 project (grant 328115), the strategic Funds of the University of Oulu, Finland, the Infotech Oulu, and in part by Endeavour Research Fellowship of Australian Government Department of Education and Training, and in part by Australian Research Council Project FL-170100117. As well, the authors wish to acknowledge CSC - IT Center for Science, Finland, for computational resources.
Academy of Finland Grant Number: 316765
Detailed Information: 316765 (Academy of Finland Funding decision)
328115 (Academy of Finland Funding decision)
Copyright information: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.