Usman Muhammad, Zitong Yu, Jukka Komulainen, Self-supervised 2D face presentation attack detection via temporal sequence sampling, Pattern Recognition Letters, Volume 156, 2022, Pages 15-22, ISSN 0167-8655, https://doi.org/10.1016/j.patrec.2022.03.001
Self-supervised 2D face presentation attack detection via temporal sequence sampling
|Author:||Muhammad, Usman¹; Yu, Zitong¹; Komulainen, Jukka¹|
¹ Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Finland
|Online Access:||PDF Full Text (PDF, 1.3 MB)|
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2022041929477
|Publish Date:|| 2022-04-19
Conventional 2D face biometric systems are vulnerable to presentation attacks performed with different face artefacts, e.g., printouts, video replays and wearable 3D masks. The research focus in face presentation attack detection (PAD) has recently been shifting towards end-to-end learning of deep representations directly from annotated data rather than designing hand-crafted (low-level) features. However, even state-of-the-art deep-learning-based face PAD models have shown unsatisfactory generalization performance when facing unknown attacks or acquisition conditions, due to the lack of representative training and tuning data in the existing public benchmarks. To alleviate this issue, we propose a video pre-processing technique called Temporal Sequence Sampling (TSS) for 2D face PAD, which removes the estimated inter-frame 2D affine motion in the view and encodes the appearance and dynamics of the resulting smoothed video sequence into a single RGB image. Furthermore, we leverage the features of a Convolutional Neural Network (CNN) by introducing a self-supervised representation learning scheme in which the labels are generated automatically by the TSS method: the stabilized frames accumulated over video clips of different temporal lengths provide the supervision. The learnt feature representations are then fine-tuned for the downstream task using labelled face PAD data. Our extensive experiments on four public benchmarks, namely Replay-Attack, MSU-MFSD, CASIA-FASD and OULU-NPU, demonstrate that the proposed framework provides promising generalization capability and encourage further study in this domain.
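The self-supervised labelling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each pseudo-label is simply the index of the clip length used, and that frames are accumulated by plain pixel-wise averaging. The affine motion removal step is omitted entirely, so the input frames are taken to be already stabilized. The names `Frame`, `accumulate` and `make_pretext_samples` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical frame type: one grayscale frame as rows of pixel values.
Frame = List[List[float]]


def accumulate(frames: Sequence[Frame]) -> Frame:
    """Average frames pixel-wise (an assumed stand-in for the accumulation step)."""
    n = len(frames)
    if n == 0:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(f[r][c] for f in frames) / n for c in range(width)]
        for r in range(height)
    ]


def make_pretext_samples(
    video: Sequence[Frame], clip_lengths: Sequence[int]
) -> List[Tuple[Frame, int]]:
    """Build (image, pseudo-label) pairs; the label is the index of the clip length."""
    samples = []
    for label, length in enumerate(clip_lengths):
        # Non-overlapping clips of the given temporal length.
        for start in range(0, len(video) - length + 1, length):
            clip = video[start:start + length]
            samples.append((accumulate(clip), label))
    return samples


# Toy video: 8 frames of size 2x2, where frame t is filled with the value t.
video = [[[float(t)] * 2 for _ in range(2)] for t in range(8)]
samples = make_pretext_samples(video, clip_lengths=[2, 4, 8])
labels = [label for _, label in samples]
```

In this toy setting no manual annotation is involved: the clip length alone determines the label, which is the sense in which the supervision is generated automatically. A network pre-trained to predict such labels would then be fine-tuned on labelled PAD data, as the abstract describes.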
Pattern Recognition Letters
|Pages:||15 - 22|
|Type of Publication:||A1 Journal article – refereed|
|Field of Science:||113 Computer and information sciences|
The financial support of the Tauno Tönning Foundation is gratefully acknowledged.
© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).