University of Oulu

Liu X., Zhao G. (2019) 3D Skeletal Gesture Recognition via Sparse Coding of Time-Warping Invariant Riemannian Trajectories. In: Kompatsiaris I., Huet B., Mezaris V., Gurrin C., Cheng WH., Vrochidis S. (eds) MultiMedia Modeling. MMM 2019. Lecture Notes in Computer Science, vol 11295. Springer, Cham

3D skeletal gesture recognition via sparse coding of time-warping invariant Riemannian trajectories

Author: Liu, Xin (1); Zhao, Guoying (1)
Organizations: (1) Center for Machine Vision and Signal Analysis, University of Oulu, 90014 Oulu, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 0.4 MB)
Language: English
Published: Springer Nature, 2019
Publish Date: 2019-08-06


3D skeleton-based human representation for gesture recognition has attracted increasing attention due to its invariance to camera view and environment dynamics. Existing methods typically use absolute coordinates to represent human motion features. However, gestures are independent of the performer’s location, and the features should be invariant to the performer’s body size. Moreover, temporal dynamics can significantly distort the distance metric when comparing and identifying gestures. In this paper, we represent each skeleton as a point in the product space of the special orthogonal group SO(3), which explicitly models the 3D geometric relationships between body parts. A gesture skeletal sequence can then be characterized by a trajectory on a Riemannian manifold. Next, we generalize the transported square-root vector field to obtain a re-parametrization invariant metric on the product space of SO(3), so that trajectories can be compared in a time-warping invariant manner. Furthermore, we present a sparse coding of skeletal trajectories that explicitly considers the labeling information of each atom to enforce the discriminant validity of the dictionary. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three challenging benchmarks for gesture recognition.
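
As a rough illustration of the representation described in the abstract, the Python sketch below maps one skeleton to a point in a product of SO(3) groups by computing, for chosen pairs of bones, the minimal rotation that aligns one bone direction with the other, and then compares two such points with a geodesic distance. The bone list, the choice of bone pairs, and all function names are assumptions made for this example; the abstract does not specify the paper's exact construction, and the transported square-root vector field and sparse coding steps are not shown.

```python
import numpy as np


def rotation_between(u, v):
    """Minimal rotation in SO(3) taking the direction of u to the direction of v."""
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    axis = np.cross(u, v)
    s = np.linalg.norm(axis)   # sine of the angle between u and v
    c = float(np.dot(u, v))    # cosine of the angle between u and v
    if s < 1e-12:
        if c > 0:
            return np.eye(3)   # directions already coincide
        # Opposite directions: rotate by pi about any axis perpendicular to u.
        helper = np.eye(3)[np.argmin(np.abs(u))]
        k = np.cross(u, helper)
        k /= np.linalg.norm(k)
        return 2.0 * np.outer(k, k) - np.eye(3)
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' rotation formula.
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)


def so3_geodesic(R1, R2):
    """Geodesic distance on SO(3): the angle of the relative rotation R1^T R2."""
    cos_theta = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def skeleton_to_rotations(joints, bones, pairs):
    """Map a (J, 3) joint array to a stack of rotations, one per bone pair.

    bones: list of (start_joint, end_joint) index pairs (hypothetical layout).
    pairs: list of (bone_a, bone_b) index pairs to relate (hypothetical choice).
    """
    vecs = [joints[j] - joints[i] for i, j in bones]
    return np.stack([rotation_between(vecs[a], vecs[b]) for a, b in pairs])


def product_distance(A, B):
    """Distance on the product space: root of summed squared SO(3) geodesics."""
    return float(np.sqrt(sum(so3_geodesic(Ra, Rb) ** 2 for Ra, Rb in zip(A, B))))


# Toy three-joint skeleton with two bones and a single bone pair.
joints = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
bones = [(0, 1), (1, 2)]
pairs = [(0, 1)]
point = skeleton_to_rotations(joints, bones, pairs)
```

Because the sketch uses only differences between joints and normalizes each bone vector, translating or uniformly scaling the skeleton leaves the resulting rotations unchanged, which mirrors the location and body-size invariance the abstract asks for.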


Series: Lecture notes in computer science
ISSN: 0302-9743
ISSN-E: 1611-3349
ISSN-L: 0302-9743
ISBN: 978-3-030-05710-7
ISBN Print: 978-3-030-05709-1
Issue: 11295
Pages: 678 - 690
DOI: 10.1007/978-3-030-05710-7_56
Host publication: MultiMedia Modeling. MMM 2019
Host publication editor: Kompatsiaris, Ioannis; Huet, Benoit; Mezaris, Vasileios; Gurrin, Cathal; Cheng, Wen-Huang; Vrochidis, Stefanos
Conference: International Conference on Multimedia Modeling
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Funding: This work is supported by the Academy of Finland, the Tekes Fidipro Program, Infotech, Tekniikan Edistämissäätiö, and the Nokia Foundation.
Copyright information: © Springer Nature Switzerland AG 2019. This is a post-peer-review, pre-copyedit version of an article published in Lecture Notes in Computer Science, vol 11295. The final authenticated version is available online at: