University of Oulu

Z. Wang et al., "Deep Spatial Gradient and Temporal Depth Learning for Face Anti-Spoofing," 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2020, pp. 5041-5050, doi: 10.1109/CVPR42600.2020.00509

Deep spatial gradient and temporal depth learning for face anti-spoofing

Author: Wang, Zezheng1; Yu, Zitong2; Zhao, Chenxu3;
Organizations: 1AIBEE
2CMVS, University of Oulu
3Academy of Sciences, Mininglamp Technology
5Northwestern Polytechnical University
6JD Digits
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 4.9 MB)
Persistent link:
Language: English
Published: Institute of Electrical and Electronics Engineers, 2020
Publish Date: 2020-12-17


Face anti-spoofing is critical to the security of face recognition systems. Depth supervised learning has proven to be one of the most effective methods for face anti-spoofing. Despite this success, most previous works still formulate the problem as a single-frame multi-task one by simply augmenting the loss with depth, while neglecting the detailed fine-grained information and the interplay between facial depths and moving patterns. In contrast, we design a new approach to detect presentation attacks from multiple frames based on two insights: 1) detailed discriminative clues (e.g., spatial gradient magnitude) between living and spoofing faces may be discarded through stacked vanilla convolutions, and 2) the dynamics of 3D moving faces provide important clues for detecting spoofing faces. The proposed method is able to capture discriminative details via the Residual Spatial Gradient Block (RSGB) and to efficiently encode spatio-temporal information via the Spatio-Temporal Propagation Module (STPM). Moreover, a novel Contrastive Depth Loss is presented for more accurate depth supervision. To assess the efficacy of our method, we also collect a Double-modal Anti-spoofing Dataset (DMAD) which provides actual depth for each sample. The experiments demonstrate that the proposed approach achieves state-of-the-art results on five benchmark datasets including OULU-NPU, SiW, CASIA-MFSD, Replay-Attack, and the new DMAD. Codes will be available at
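
As an illustration of the "spatial gradient magnitude" clue named in the abstract, the following is a minimal sketch that computes a per-pixel gradient magnitude with the standard 3x3 Sobel kernels. It is not the authors' RSGB implementation: the function name `gradient_magnitude`, the list-of-rows image format, and the edge-replicating border handling are all assumptions made for this example.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers (hypothetical format).
    Borders are handled by replicating the nearest edge pixel, which is an
    illustrative choice rather than anything stated in the paper.
    """
    height, width = len(image), len(image[0])

    def pixel(r, c):
        # Clamp coordinates so out-of-range reads reuse the nearest edge pixel.
        r = min(max(r, 0), height - 1)
        c = min(max(c, 0), width - 1)
        return image[r][c]

    out = []
    for r in range(height):
        row = []
        for c in range(width):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    value = pixel(r + i - 1, c + j - 1)
                    gx += SOBEL_X[i][j] * value
                    gy += SOBEL_Y[i][j] * value
            # Magnitude combines both directional responses.
            row.append(math.sqrt(gx * gx + gy * gy))
        out.append(row)
    return out
```

On a flat image the response is zero everywhere, while a vertical step edge produces a strong response only in the columns adjacent to the step. Fine intensity transitions of this kind are the sort of detail the abstract says stacked vanilla convolutions may discard.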


ISBN: 978-1-7281-7168-5
ISBN Print: 978-1-7281-7169-2
Pages: 5041 - 5050
Article number: 9157185
DOI: 10.1109/CVPR42600.2020.00509
Host publication: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
Conference: IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
213 Electronic, automation and communications engineering, electronics
Funding: This work has been partially supported by the Chinese National Natural Science Foundation Projects #61876178, #61806196, #61976229.
Copyright information: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.