The INTERSPEECH 2017 Computational Paralinguistics Challenge: Addressee, Cold & Snoring
Author: | Schuller, Björn (1,2); Steidl, Stefan (3); Batliner, Anton (2,3); et al. |
Organizations: |
1 Department of Computing, Imperial College London, UK
2 Chair of Complex & Intelligent Systems, University of Passau, Germany
3 Pattern Recognition Lab, FAU Erlangen-Nuremberg, Germany
4 Psychology and Neuroscience, Duke University, USA
5 University of Wuppertal, Germany
6 Technische Universität München, Germany
7 Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
8 Speech, Language, and Hearing Sciences, Purdue University, USA
9 Psychology, University of Manitoba, Canada
10 Cognitive and Information Sciences, University of California, Merced, USA
11 Clinic for ENT Medicine, Head and Neck Surgery, Alfried Krupp Krankenhaus, Essen, Germany
12 Clinic for ENT Medicine, Head and Neck Surgery, Carl-Thiem-Klinikum, Cottbus, Germany
13 University of Oulu, Finland |
Format: | article |
Version: | accepted version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.2 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe2019042613366 |
Language: | English |
Published: | International Speech Communication Association, 2017 |
Publish Date: | 2019-04-26 |
Description: | Abstract: The INTERSPEECH 2017 Computational Paralinguistics Challenge addresses three different problems for the first time in a research competition under well-defined conditions: in the Addressee sub-challenge, it has to be determined whether speech produced by an adult is directed towards another adult or towards a child; in the Cold sub-challenge, speech under a cold has to be told apart from ‘healthy’ speech; and in the Snoring sub-challenge, four different types of snoring have to be classified. In this paper, we describe these sub-challenges, their conditions, and the baseline feature extraction and classifiers, which include data-learnt feature representations by end-to-end learning with convolutional and recurrent neural networks, and bag-of-audio-words, for the first time in the challenge series. |
Series: | Interspeech |
ISSN: | 1990-9772 |
ISSN-L: | 1990-9772 |
Pages: | 3442–3446 |
DOI: | 10.21437/Interspeech.2017-43 |
OADOI: | https://oadoi.org/10.21437/Interspeech.2017-43 |
Host publication: | Interspeech 2017: 20–24 August 2017, Stockholm |
Conference: | Interspeech |
Type of Publication: | A4 Article in conference proceedings |
Field of Science: | 113 Computer and information sciences |
Funding: | This research has received funding from the EU’s Framework Programme HORIZON 2020 under Grants No. 115902 (RADAR-CNS) and No. 645378 (ARIA-VALUSPA), from the EU’s 7th Framework Programme under ERC Starting Grant No. 338164 (iHEARu), as well as from SSHRC Insight Grant #435-2015-0628, ERC Advanced Grant INTERACT (269484), NIH DP5-OD019812, and NSF grants SBE-1539129 and BCS-1529127. Further, the support of the EPSRC Centre for Doctoral Training in High Performance Embedded and Distributed Systems (HiPEDS, Grant Reference EP/L016796/1) is gratefully acknowledged. The authors thank the research assistants who provided HB-CHAAC labels, Kelsey Dyck for help developing the labelling protocol, and the sponsors of the Challenge: audEERING GmbH and the Association for the Advancement of Affective Computing (AAAC). |
Copyright information: | © 2017 ISCA. This is a post-peer-review, pre-copyedit version of an article published in Interspeech 2017: 20–24 August 2017, Stockholm. The final authenticated version is available online at: https://doi.org/10.21437/Interspeech.2017-43. |