A. Fabris, M. A. Nicolaou, I. Kotsia and S. Zafeiriou, "Dynamic Probabilistic Linear Discriminant Analysis for video classification," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, 2017, pp. 2781-2785. doi: 10.1109/ICASSP.2017.7952663
Dynamic probabilistic linear discriminant analysis for video classification
|Authors:||Fabris, Alessandro1; Nicolaou, Mihalis A.1,2; Kotsia, Irene3,4; Zafeiriou, Stefanos1,5|
1Department of Computing, Imperial College London, UK
2Department of Computing, Goldsmiths, University of London, UK
3Middlesex University London
4International Hellenic University
5Center for Machine Vision and Signal Analysis, University of Oulu, Finland
|Online Access:||PDF Full Text (PDF, 0.3 MB)|
|Persistent link:||http://urn.fi/urn:nbn:fi-fe201902276428|
|Publisher:||Institute of Electrical and Electronics Engineers (IEEE)|
|Publish Date:||2019-02-27|
Component Analysis (CA) comprises statistical techniques that decompose signals into latent components relevant to a task at hand (e.g., clustering, segmentation, classification). Recently, an explosion of research in CA has been witnessed, with several novel probabilistic models proposed (e.g., Probabilistic Principal Component Analysis, Probabilistic Linear Discriminant Analysis (PLDA), Probabilistic Canonical Correlation Analysis). PLDA is a popular generative probabilistic CA method that incorporates knowledge of class labels and furthermore introduces class-specific and sample-specific latent spaces. While PLDA has been shown to outperform several state-of-the-art methods, it is nevertheless a static model: any feature-level temporal dependencies that arise in the data are ignored. As has been repeatedly shown, appropriate modelling of temporal dynamics is crucial for the analysis of temporal data (e.g., videos). In this light, we propose the first, to the best of our knowledge, probabilistic LDA formulation that models dynamics, the so-called Dynamic-PLDA (DPLDA). DPLDA is a generative model suitable for video classification; it jointly models label information (e.g., face identity, which is consistent across videos of the same subject) as well as the dynamic variations of each individual video. Experiments on video classification tasks such as face and facial expression recognition show the efficacy of the proposed method.
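To make the abstract's model structure concrete, the sketch below samples from a PLDA-style generative model in which each observation is a class-specific component plus a sample-specific component plus noise, and the sample-specific latent evolves over frames via a simple AR(1) process as a stand-in for the temporal dynamics DPLDA introduces. All dimensions, loadings, and the AR(1) choice are illustrative assumptions, not the authors' formulation or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions (illustrative choices, not from the paper)
D, Q_h, Q_w = 10, 3, 2   # observation, class-latent, sample-latent dims
T = 5                    # frames per video

# PLDA-style factor loadings: F spans the class (between-subject) subspace,
# G spans the sample (within-subject) subspace.
mu = rng.normal(size=D)
F = rng.normal(size=(D, Q_h))
G = rng.normal(size=(D, Q_w))
sigma = 0.1              # isotropic observation-noise scale

def sample_video(h, A=0.9):
    """Sample T frames of one video for a subject with class latent h.

    The class latent h is shared across all frames (identity is constant
    over the video), while the frame-specific latent w_t follows an AR(1)
    process -- a hypothetical stand-in for DPLDA's temporal dynamics.
    """
    w = rng.normal(size=Q_w)          # initial frame-specific latent
    frames = []
    for _ in range(T):
        x = mu + F @ h + G @ w + sigma * rng.normal(size=D)
        frames.append(x)
        w = A * w + np.sqrt(1.0 - A**2) * rng.normal(size=Q_w)  # AR(1) step
    return np.stack(frames)           # shape (T, D)

h_subject = rng.normal(size=Q_h)      # one class latent per subject
video = sample_video(h_subject)
print(video.shape)                    # (5, 10)
```

Setting `A=0` recovers a static PLDA-like model in which frames are conditionally independent given the subject, which is exactly the limitation the abstract attributes to standard PLDA.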
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing
|Pages:||2781 - 2785|
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
|Type of Publication:||A4 Article in conference proceedings|
|Field of Science:||113 Computer and information sciences|
This work was partially funded by the FiDiPro program of Tekes (project number: 1849/31/2015), as well as by the European Community Horizon 2020 [H2020/2014-2020] under grant agreement no. 688520 (TeSLA).
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.