University of Oulu

X. Liu and G. Zhao, "3D Skeletal Gesture Recognition via Discriminative Coding on Time-Warping Invariant Riemannian Trajectories," in IEEE Transactions on Multimedia, vol. 23, pp. 1841-1854, 2021, doi: 10.1109/TMM.2020.3003783

3D skeletal gesture recognition via discriminative coding on time-warping invariant Riemannian trajectories

Author: Liu, Xin (1, 2); Zhao, Guoying (2, 3)
Organizations: (1) School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
(2) Center for Machine Vision and Signal Analysis, University of Oulu, FI-90014, Finland
(3) School of Information and Technology, Northwest University, 710069, China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 1 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2021
Publish Date: 2021-10-14


Learning 3D skeleton-based representations for gesture recognition has become increasingly prominent because such representations are invariant to the viewpoint and background dynamics of video. Existing techniques typically use absolute coordinates to derive human motion features. Gesture recognition, however, should not depend on the position of the performer, and the extracted features should be invariant to body size. In addition, when gestures are compared and classified, variations in temporal dynamics can greatly distort the distance metric. In this paper, we represent a 3D skeleton as a point in a product space of the special orthogonal group SO(3) that explicitly models the 3D geometric relationships between body parts. A skeletal gesture sequence can thus be described by a trajectory on a Riemannian manifold. We then generalize the transported square-root vector field to obtain a time-warping invariant metric for comparing these trajectories, and hence for identifying gestures. Moreover, by explicitly incorporating label information into the encoding, we present a sparse coding scheme for skeletal trajectories that enforces the discriminative power of the dictionary atoms. Experimental results indicate that the proposed approach achieves state-of-the-art performance on many challenging gesture recognition benchmarks.
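
The abstract's idea of describing a skeleton by rotations between body parts, rather than by absolute joint coordinates, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`rotation_between`, `geodesic_distance`, `skeleton_to_rotations`), the toy three-joint skeleton, and the choice of the rotation that aligns one bone direction with another are all assumptions made for illustration. It only shows why such a representation does not change when the performer is translated or uniformly scaled.

```python
import numpy as np


def rotation_between(u, v):
    """Rotation in SO(3) taking the direction of u onto the direction of v.

    Uses Rodrigues' formula; only directions matter, so bone length is ignored.
    """
    a = u / np.linalg.norm(u)
    b = v / np.linalg.norm(v)
    axis = np.cross(a, b)
    s = np.linalg.norm(axis)   # sin(theta)
    c = float(np.dot(a, b))    # cos(theta)
    if s < 1e-12:
        if c > 0:
            return np.eye(3)   # directions already aligned
        # Antiparallel case: rotate by pi about any axis perpendicular to a.
        helper = np.eye(3)[np.argmin(np.abs(a))]
        k = np.cross(a, helper)
        k /= np.linalg.norm(k)
        return 2.0 * np.outer(k, k) - np.eye(3)
    k = axis / s
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)


def geodesic_distance(R1, R2):
    """Geodesic (angular) distance between two rotations on SO(3), in radians."""
    cos_theta = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def skeleton_to_rotations(joints, bones):
    """Map one skeleton frame to a stack of rotations, one per ordered bone pair.

    joints: (J, 3) array of 3D joint positions.
    bones:  list of (start_joint, end_joint) index pairs (hypothetical layout).
    The stacked result is one point in a product of copies of SO(3).
    """
    vecs = [joints[j] - joints[i] for i, j in bones]
    return np.stack([rotation_between(vecs[m], vecs[n])
                     for m in range(len(vecs))
                     for n in range(len(vecs)) if m != n])


# Toy skeleton: two bones meeting at a right angle.
joints = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
bones = [(0, 1), (1, 2)]
frame = skeleton_to_rotations(joints, bones)

# The same pose, moved elsewhere and performed by a "larger" body.
moved = skeleton_to_rotations(2.5 * joints + np.array([3.0, -1.0, 4.0]), bones)
print(np.allclose(frame, moved))
```

Applying `skeleton_to_rotations` to every frame of a sequence yields a trajectory of such points; comparing trajectories in a time-warping invariant way, as the paper does with the transported square-root vector field, is beyond this sketch.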

Series: IEEE transactions on multimedia
ISSN: 1520-9210
ISSN-E: 1941-0077
ISSN-L: 1520-9210
Volume: 23
Pages: 1841 - 1854
DOI: 10.1109/TMM.2020.3003783
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported by the Academy of Finland through a postdoctoral researcher project (grant 331146), project MiGA (grant 316765), and an ICT 2023 project (grant 328115), and by the strategic funds of the University of Oulu and Infotech Oulu, Finland. The authors also wish to acknowledge CSC - IT Center for Science, Finland, for computational resources.
Academy of Finland Grant Number: 331146
Detailed Information: 331146 (Academy of Finland Funding decision)
316765 (Academy of Finland Funding decision)
328115 (Academy of Finland Funding decision)
Copyright information: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.