University of Oulu

Shim, H.-j., Gonzalez Hautamäki, R., Sahidullah, M., Kinnunen, T. (2023) How to Construct Perfect and Worse-than-Coin-Flip Spoofing Countermeasures: A Word of Warning on Shortcut Learning. Proc. INTERSPEECH 2023, 785-789, doi: 10.21437/Interspeech.2023-1901

How to construct perfect and worse-than-coin-flip spoofing countermeasures : a word of warning on shortcut learning

Author: Shim, Hye-jin¹; Gonzalez Hautamäki, Rosa²; Sahidullah, Md³;
Organizations: ¹University of Eastern Finland, Finland
²University of Oulu, Finland
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 0.3 MB)
Language: English
Published: International Speech Communication Association, 2023
Publish Date: 2023-09-27


Shortcut learning, or the ‘Clever Hans effect’, refers to situations where a learning agent (e.g., a deep neural network) learns spurious correlations present in the data, resulting in a biased model. We focus on finding shortcuts in deep-learning-based spoofing countermeasures (CMs) that predict whether a given utterance is spoofed or not. While prior work has addressed specific data artifacts, such as silence, no general normative framework has been explored for analyzing shortcut learning in CMs. In this study, we propose a generic approach to identifying shortcuts by introducing systematic interventions on the training and test sides, including the boundary cases of ‘near-perfect’ and ‘worse than coin flip’ (label flip). Using three different models, ranging from classic to state-of-the-art, we demonstrate the presence of shortcut learning in five simulated conditions. We also analyze the results using a regression model to understand how biases affect the class-conditional score statistics.


Series: Interspeech
ISSN: 1990-9772
ISSN-L: 1990-9772
Pages: 785 - 789
DOI: 10.21437/Interspeech.2023-1901
Host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Conference: Annual Conference of the International Speech Communication Association
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Funding: This work was partially supported by Academy of Finland (Decision No. 349605, project “SPEECHFAKES”).
Copyright information: © ISCA