How to construct perfect and worse-than-coin-flip spoofing countermeasures : a word of warning on shortcut learning |
|
Author: | Shim, Hye-jin (1); Gonzalez Hautamäki, Rosa (2); Sahidullah, Md (3); |
Organizations: |
(1) University of Eastern Finland, Finland; (2) University of Oulu, Finland; (3) TCG CREST, India |
Format: | article |
Version: | published version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.3 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe20230927137604 |
Language: | English |
Published: |
International Speech Communication Association, 2023 |
Publish Date: | 2023-09-27 |
Description: |
Abstract: Shortcut learning, or ‘Clever Hans effect’, refers to situations where a learning agent (e.g., deep neural networks) learns spurious correlations present in data, resulting in biased models. We focus on finding shortcuts in deep learning based spoofing countermeasures (CMs) that predict whether a given utterance is spoofed or not. While prior work has addressed specific data artifacts, such as silence, no general normative framework has been explored for analyzing shortcut learning in CMs. In this study, we propose a generic approach to identifying shortcuts by introducing systematic interventions on the training and test sides, including the boundary cases of ‘near-perfect’ and ‘worse than coin flip’ (label flip). By using three different models, ranging from classic to state-of-the-art, we demonstrate the presence of shortcut learning in five simulated conditions. We also analyze the results using a regression model to understand how biases affect the class-conditional score statistics.
|
Series: |
Interspeech |
ISSN: | 1990-9772 |
ISSN-L: | 1990-9772 |
Pages: | 785 - 789 |
DOI: | 10.21437/Interspeech.2023-1901 |
OADOI: | https://oadoi.org/10.21437/Interspeech.2023-1901 |
Host publication: |
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Conference: |
Annual Conference of the International Speech Communication Association |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences |
Funding: |
This work was partially supported by Academy of Finland (Decision No. 349605, project “SPEECHFAKES”). |
Copyright information: |
© ISCA |