University of Oulu

Z. Yu, X. Li, J. Shi, Z. Xia and G. Zhao, "Revisiting Pixel-Wise Supervision for Face Anti-Spoofing," in IEEE Transactions on Biometrics, Behavior, and Identity Science, vol. 3, no. 3, pp. 285-295, July 2021, doi: 10.1109/TBIOM.2021.3065526

Revisiting pixel-wise supervision for face anti-spoofing

Author: Yu, Zitong1; Li, Xiaobai1; Shi, Jingang2;
Organizations: 1Center for Machine Vision and Signal Analysis, University of Oulu, 90014 Oulu, Finland
2School of Software Engineering, Xi’an Jiaotong University, Xi’an 710049, China
3School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710072, China
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 2.3 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2021
Publish Date: 2021-11-17


Face anti-spoofing (FAS) plays a vital role in securing face recognition systems against presentation attacks (PAs). As increasingly realistic PAs of novel types emerge, it is necessary to develop robust algorithms that detect unknown attacks even in unseen scenarios. However, deep models supervised by a traditional binary loss (e.g., '0' for bona fide vs. '1' for PAs) are weak at describing intrinsic and discriminative spoofing patterns. Recently, pixel-wise supervision has been proposed for the FAS task, with the aim of providing more fine-grained pixel/patch-level cues. In this paper, we first give a comprehensive review and analysis of existing pixel-wise supervision methods for FAS. We then propose a novel pyramid supervision, which guides deep models to learn both local details and global semantics from multi-scale spatial context. Extensive experiments on five FAS benchmark datasets show that, without bells and whistles, the proposed pyramid supervision not only improves performance beyond existing pixel-wise supervision frameworks but also enhances the model's interpretability (i.e., it locates the patch-level positions of PAs more reasonably). Furthermore, detailed studies explore the efficacy of different architecture configurations with two kinds of pixel-wise supervision (binary mask and depth map supervision), which provide useful insights for future architecture/supervision design.
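
The idea of supervising a pixel-wise prediction at several spatial scales can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation: it assumes that "multi-scale" means average-pooling both the predicted binary mask and its label by a few factors and summing a binary cross-entropy term per scale. The function names (`avg_pool`, `bce`, `pyramid_loss`), the pooling factors, and the 4 x 4 map size are all hypothetical choices made for illustration.

```python
import math


def avg_pool(mask, k):
    """Average-pool a square 2D list with a non-overlapping k x k window."""
    n = len(mask) // k
    return [
        [
            sum(mask[i * k + di][j * k + dj]
                for di in range(k) for dj in range(k)) / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two equally sized 2D lists."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def pyramid_loss(pred, target, scales=(1, 2, 4)):
    """Sum the BCE over several pooled resolutions (assumed scheme).

    Factor 1 keeps the full map (local detail); the largest factor
    reduces the map to a single value (global decision).
    """
    return sum(bce(avg_pool(pred, k), avg_pool(target, k)) for k in scales)


# Label convention from the abstract: 0 = bona fide, 1 = presentation attack.
spoof_label = [[1.0] * 4 for _ in range(4)]
uncertain_pred = [[0.5] * 4 for _ in range(4)]
loss = pyramid_loss(uncertain_pred, spoof_label)
```

With the factors `(1, 2, 4)` on a 4 x 4 map, the loss combines a per-pixel term, a per-patch term, and a single global term, which is one simple way to expose a model to both local and global supervision signals.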


Series: IEEE transactions on biometrics, behavior, and identity science
ISSN: 2637-6407
ISSN-E: 2637-6407
ISSN-L: 2637-6407
Volume: 3
Issue: 3
Pages: 285 - 295
DOI: 10.1109/TBIOM.2021.3065526
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work was supported in part by the Academy of Finland for project MiGA under Grant 316765; in part by ICT 2023 Project under Grant 328115; and in part by Infotech Oulu. The work of Jingang Shi was supported by the National Natural Science Foundation of China under Grant 62002283.
Academy of Finland Grant Number: 316765
Detailed Information: 316765 (Academy of Finland Funding decision)
328115 (Academy of Finland Funding decision)
Copyright information: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.