H. Chen, X. Liu, J. Shi and G. Zhao, "Temporal Hierarchical Dictionary Guided Decoding for Online Gesture Segmentation and Recognition," in IEEE Transactions on Image Processing, vol. 29, pp. 9689-9702, 2020, doi: 10.1109/TIP.2020.3028962
Temporal hierarchical dictionary guided decoding for online gesture segmentation and recognition
|Author:||Chen, Haoyu1; Liu, Xin1; Shi, Jingang2;|
1Center for Machine Vision and Signal Analysis, University of Oulu, FI-90014, Finland
2School of Software, Xi’an Jiaotong University, China
|Online Access:||PDF Full Text (PDF, 1.5 MB)|
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2020120399293
Institute of Electrical and Electronics Engineers
|Publish Date:|| 2020-12-03
Online segmentation and recognition of skeleton-based gestures are challenging. Compared with the offline case, inference in the online setting can rely only on the current few frames and must complete before the whole temporal movement has been performed. However, incompletely performed gestures are ambiguous, and their early recognition is prone to falling into local optima. In this work, we address the problem with a temporal hierarchical dictionary that guides the hidden Markov model (HMM) decoding procedure. The intuition is that gestures are ambiguous, with high uncertainty, in their early phases and become discriminative only after certain phases. This uncertainty can naturally be measured by entropy. Thus, we propose a measurement called the “relative entropy map” (REM) to encode this temporal context and guide HMM decoding. Furthermore, we introduce a progressive learning strategy with which neural networks can learn robust recognition of HMM states in an iterative manner. The performance of our method is extensively evaluated on three challenging databases, where it achieves state-of-the-art results. Our method is able both to extract the discriminative connotations and to reduce the large redundancy in the HMM transition process. It is verified that our framework can achieve online recognition of continuous gesture streams even when the gestures are only halfway performed.
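The idea that uncertainty in the early phases of a gesture can be measured by entropy can be illustrated with a minimal sketch. This is not the REM defined in the paper: it only computes the Shannon entropy of per-frame class posteriors, normalized by its maximum. The function names and the posterior values are hypothetical and chosen purely for illustration.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def normalized_entropy(probs):
    """Entropy divided by its maximum, log(K), giving a value in [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0
    return shannon_entropy(probs) / math.log(k)


# Hypothetical posteriors over three gesture classes at four successive
# frames (illustrative values, not data from the paper). The distribution
# sharpens as more of the gesture has been observed.
posteriors = [
    [1 / 3, 1 / 3, 1 / 3],
    [0.5, 0.3, 0.2],
    [0.8, 0.1, 0.1],
    [0.98, 0.01, 0.01],
]

uncertainty = [normalized_entropy(p) for p in posteriors]

for t, u in enumerate(uncertainty):
    print(f"frame {t}: normalized entropy = {u:.3f}")
```

In this toy stream the normalized entropy starts at its maximum, when all classes are equally likely, and falls as one class comes to dominate. This matches the intuition that a decoder should trust early frames less than later ones.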
IEEE transactions on image processing
|Pages:||9689 - 9702|
|Type of Publication:||
A1 Journal article – refereed
|Field of Science:||
113 Computer and information sciences
This work is supported by the Academy of Finland through project MiGA (grant 316765) and the ICT 2023 project (grant 328115), by Infotech Oulu, and in part by the China Scholarship Council. The authors also wish to acknowledge CSC IT Center for Science, Finland, for computational resources. Xin Liu is supported by the Academy of Finland through a postdoctoral researcher project (grant 331146).
|Academy of Finland Grant Number:||
316765 (Academy of Finland Funding decision)
328115 (Academy of Finland Funding decision)
331146 (Academy of Finland Funding decision)
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.