University of Oulu

L. Zafeiriou, Y. Panagakis, M. Pantic and S. Zafeiriou, "Nonnegative Decompositions for Dynamic Visual Data Analysis," in IEEE Transactions on Image Processing, vol. 26, no. 12, pp. 5603-5617, Dec. 2017. doi: 10.1109/TIP.2017.2735186

Nonnegative decompositions for dynamic visual data analysis

Author: Zafeiriou, Lazaros¹; Panagakis, Yannis²; Pantic, Maja²; Zafeiriou, Stefanos²,³
Organizations: ¹AimBrain, London E14 5AB, U.K.
²Imperial College London, London SW7 2AZ, U.K.
³University of Oulu, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 11.5 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2017
Publish Date: 2019-10-03


The analysis of high-dimensional, possibly temporally misaligned, and time-varying visual data is a fundamental task in disciplines such as image, vision, and behavior computing. In this paper, we focus on dynamic facial behavior analysis, and in particular on the analysis of facial expressions. In contrast to previous approaches, where sets of facial landmarks are used for face representation, raw pixel intensities are exploited for: 1) unsupervised analysis of the temporal phases of facial expressions and facial action units (AUs) and 2) temporal alignment of a certain facial behavior displayed by two different persons. To this end, slow features nonnegative matrix factorization (SFNMF) is proposed in order to learn slowly varying, parts-based representations of time-varying sequences that capture the underlying dynamics of temporal phenomena, such as facial expressions. Moreover, SFNMF is extended to handle two temporally misaligned data sequences depicting the same visual phenomenon. To do so, dynamic time warping is incorporated into SFNMF, allowing the temporal alignment of the data sets onto the subspace spanned by the estimated nonnegative latent features shared by the two visual sequences. Extensive experimental results on two video databases demonstrate the effectiveness of the proposed methods in: 1) unsupervised detection of the temporal phases of posed and spontaneous facial events and 2) temporal alignment of facial expressions, outperforming by a large margin the state-of-the-art methods they are compared with.
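
The two ingredients named in the abstract, a nonnegative factorization whose latent features vary slowly over time and dynamic time warping on those features, can be illustrated with a minimal Python sketch. This is not the SFNMF algorithm of the paper, whose formulation is not given in this record. As a stand-in, it assumes a generic temporally smoothed NMF (graph-regularized multiplicative updates on a chain graph over frames) followed by textbook dynamic time warping. The names `smooth_nmf`, `objective`, `dtw_path`, and the parameters `k`, `lam`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def smooth_nmf(X, k, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix X (pixels x frames) as W @ H.

    Illustrative stand-in, not the paper's SFNMF: it minimizes
    ||X - W H||_F^2 + lam * sum_t ||h_t - h_{t-1}||^2
    with graph-regularized multiplicative updates, so the columns
    of H (one per frame) are encouraged to change slowly over time.
    """
    rng = np.random.default_rng(seed)
    d, T = X.shape
    W = rng.random((d, k)) + eps
    H = rng.random((k, T)) + eps
    # Chain graph over time: frame t is linked to frames t-1 and t+1.
    A = np.eye(T, k=1) + np.eye(T, k=-1)
    Deg = np.diag(A.sum(axis=1))
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        H *= (W.T @ X + lam * H @ A) / (W.T @ W @ H + lam * H @ Deg + eps)
    return W, H


def objective(X, W, H, lam):
    """Reconstruction error plus the temporal smoothness penalty."""
    diffs = np.diff(H, axis=1)
    return np.sum((X - W @ H) ** 2) + lam * np.sum(diffs ** 2)


def dtw_path(a, b):
    """Classic dynamic time warping between sequences a (Ta x k) and b (Tb x k).

    Returns the accumulated cost and the warping path as (i, j) index pairs.
    """
    Ta, Tb = len(a), len(b)
    # Pairwise Euclidean distances between frames of a and frames of b.
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    acc = np.full((Ta + 1, Tb + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first to recover the path.
    i, j = Ta, Tb
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return acc[Ta, Tb], path[::-1]
```

In this sketch, two videos would each be flattened to a pixels-by-frames matrix, factorized separately, and the transposed latent matrices (`H.T`, one row per frame) passed to `dtw_path`. The paper instead incorporates the warping into the factorization and estimates latent features shared by both sequences, which this two-stage simplification does not reproduce.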


Series: IEEE transactions on image processing
ISSN: 1057-7149
ISSN-E: 1941-0042
ISSN-L: 1057-7149
Volume: 26
Issue: 12
Pages: 5603 - 5617
DOI: 10.1109/TIP.2017.2735186
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported by the EPSRC Project under Grant EP/N007743/1 (FACER2VM). The work of L. Zafeiriou was supported by the EPSRC Project under Grant EP/N007743/1 (FACER2VM). The work of Y. Panagakis and M. Pantic was supported by the European Community Horizon 2020 [H2020/2014-2020] under Grant 645094 (SEWA). The work of S. Zafeiriou was supported in part by the EPSRC Project under Grant EP/J017787/1 (4D-FAB) and in part by the FiDiPro Program of Tekes under Project 1849/31/2015.
Copyright information: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.