U. Muhammad, J. Zhang, L. Liu and M. Oussalah, "An Adaptive Spatio-temporal Global Sampling for Presentation Attack Detection," in IEEE Transactions on Circuits and Systems II: Express Briefs, doi: 10.1109/TCSII.2022.3169435

An adaptive spatio-temporal global sampling for presentation attack detection

Author: Muhammad, Usman1; Zhang, Jiehua1; Liu, Li2,3; Oussalah, Mourad1
Organizations: 1Centre for Machine Vision and Signal Analysis (CMVS), Faculty of Information Technology and Electrical Engineering (ITEE), University of Oulu, Finland
2National University of Defense Technology, China
3Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Oulu, Finland
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 0.8 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2022082956579
Language: English
Published: Institute of Electrical and Electronics Engineers, 2022
Publish Date: 2022-08-29
Abstract

Without dedicated countermeasures, facial biometric systems can be spoofed with printed photos, replay attacks, silicone masks, or even a 3D mask of a targeted person. Thus, the threat of presentation attacks needs to be addressed to strengthen the security of biometric systems. Since a 2D convolutional neural network (CNN) captures static features from video frames, camera motion might hinder the performance of modern CNNs for video-based presentation attack detection (PAD). Inspired by egomotion theory, we introduce an adaptive spatiotemporal global sampling (ASGS) technique to compensate for the camera motion and use the resulting estimation to encode the appearance and dynamics of the video sequences into a single RGB image. This is achieved by adaptively splitting the video into small segments and capturing the global motion within each segment. The proposed global motion is estimated in four key steps: dense sampling, FREAK feature extraction and matching, similarity transformation, and an aggregation function. This allows using deep models pre-trained on images for video-based PAD. Moreover, the interpretation of ASGS reveals that the most important parts supporting the decision on PAD are consistent with motion cues associated with the artifacts, i.e., hand movement, material reflection, and expression changes. Extensive experiments on four standard face PAD databases demonstrate its effectiveness and encourage further study in this domain.
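
For readers who want a concrete picture of the four estimation steps, the Python sketch below maps them onto standard OpenCV primitives. It is a hypothetical illustration under stated assumptions, not the authors' released implementation: the helper names (dense_keypoints, pairwise_similarity, segment_to_rgb, video_to_rgb_images), the grid stride, the fixed segment length standing in for the paper's adaptive splitting, and the element-wise mean used as the aggregation function are all assumptions; FREAK requires the opencv-contrib-python package.

import cv2
import numpy as np

def dense_keypoints(shape, step=8, size=14.0):
    """Step 1: dense sampling -- keypoints on a regular grid (stride is assumed)."""
    h, w = shape[:2]
    return [cv2.KeyPoint(float(x), float(y), size)
            for y in range(step, h - step, step)
            for x in range(step, w - step, step)]

def pairwise_similarity(prev_gray, gray, extractor, matcher):
    """Steps 2-3: FREAK extraction and matching, then a 2x3 similarity fit."""
    kp1, des1 = extractor.compute(prev_gray, dense_keypoints(prev_gray.shape))
    kp2, des2 = extractor.compute(gray, dense_keypoints(gray.shape))
    if des1 is None or des2 is None:
        return None
    matches = matcher.match(des1, des2)  # Hamming distance suits binary descriptors
    if len(matches) < 4:
        return None
    src = np.float32([kp1[m.queryIdx].pt for m in matches])
    dst = np.float32([kp2[m.trainIdx].pt for m in matches])
    # Similarity transform: rotation + uniform scale + translation, RANSAC-robust.
    M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    return M

def segment_to_rgb(frames):
    """Step 4: aggregate pairwise transforms and fuse one segment into one RGB image."""
    extractor = cv2.xfeatures2d.FREAK_create()  # needs opencv-contrib-python
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    transforms = [M for prev, cur in zip(grays, grays[1:])
                  if (M := pairwise_similarity(prev, cur, extractor, matcher)) is not None]
    # Assumed aggregation function: element-wise mean of the pairwise transforms.
    M_global = np.mean(transforms, axis=0) if transforms else np.eye(2, 3)
    h, w = frames[0].shape[:2]
    stabilized = [cv2.warpAffine(f, M_global.astype(np.float32), (w, h)) for f in frames]
    # Encode the segment's appearance and dynamics as a single RGB image.
    return np.mean(stabilized, axis=0).astype(np.uint8)

def video_to_rgb_images(frames, segment_len=8):
    """Fixed-length segments stand in here for the paper's adaptive splitting."""
    return [segment_to_rgb(frames[i:i + segment_len])
            for i in range(0, len(frames), segment_len)
            if len(frames[i:i + segment_len]) >= 2]

Each fused image can then be fed to a 2D CNN pre-trained on still images, which is the motivation for encoding a whole video segment as a single RGB frame.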

Series: IEEE transactions on circuits and systems. II, Express briefs
ISSN: 1549-7747
ISSN-E: 1558-3791
ISSN-L: 1549-7747
Issue: Online first
DOI: 10.1109/TCSII.2022.3169435
OADOI: https://oadoi.org/10.1109/tcsii.2022.3169435
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Copyright information: © The Author(s) 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0.