University of Oulu

W. Carneiro de Melo, E. Granger and M. Bordallo Lopez, "MDN: A Deep Maximization-Differentiation Network for Spatio-Temporal Depression Detection," in IEEE Transactions on Affective Computing, doi: 10.1109/TAFFC.2021.3072579.

MDN : a deep maximization-differentiation network for spatio-temporal depression detection

Author: Carneiro de Melo, Wheidima; Granger, Eric; Bordallo Lopez, Miguel
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 2.4 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2022081855732
Language: English
Published: Institute of Electrical and Electronics Engineers, 2021
Publish Date: 2022-08-18
Description:

Abstract

Deep learning (DL) models have been successfully applied in video-based affective computing, making it possible to recognize emotions and mood, or to estimate the intensity of pain or stress, from facial expressions. Despite advances in state-of-the-art DL models for spatio-temporal recognition of facial expressions associated with depression, some challenges remain in the cost-effective application of 3D-CNNs: (1) 3D convolutions employ structures with a fixed temporal depth, which reduces their potential to extract discriminative representations because spatio-temporal variations usually differ only slightly across depression levels; and (2) the computational complexity of these models makes them susceptible to overfitting. To address these challenges, we propose a novel DL architecture called the Maximization and Differentiation Network (MDN) to effectively represent facial expression variations that are relevant for depression assessment. The MDN operates without 3D convolutions, exploring multiscale temporal information with a maximization block that captures smooth facial variations and a difference block that encodes sudden facial variations. Extensive experiments show that the proposed MDN improves performance while reducing the number of parameters by more than 3× compared with 3D-ResNet models. Our model also outperforms other 3D models and achieves state-of-the-art results for depression detection. Code is available at: https://github.com/wheidima/MDN.
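
The two temporal operations named in the abstract can be illustrated with a small, hedged sketch. It is not taken from the paper or from the linked repository: it assumes each video frame has already been reduced to a short feature vector, and the names `temporal_max`, `temporal_diff`, `multiscale_features` and the window sizes are hypothetical choices made only for illustration. A sliding element-wise maximum stands in for the "maximization" idea (persistent, smooth activations survive), and a frame-to-frame difference stands in for the "difference" idea (abrupt changes stand out).

```python
from typing import Dict, List, Sequence

Frame = List[float]  # hypothetical per-frame feature vector (an assumption)


def temporal_max(frames: Sequence[Frame], window: int) -> List[Frame]:
    """Element-wise maximum over each sliding window of consecutive frames."""
    if not 1 <= window <= len(frames):
        raise ValueError("window must be between 1 and the number of frames")
    return [
        [max(values) for values in zip(*frames[start:start + window])]
        for start in range(len(frames) - window + 1)
    ]


def temporal_diff(frames: Sequence[Frame], step: int = 1) -> List[Frame]:
    """Element-wise difference between frames that are `step` apart."""
    if not 1 <= step < len(frames):
        raise ValueError("step must be between 1 and the number of frames - 1")
    return [
        [later - earlier for earlier, later in zip(frames[i], frames[i + step])]
        for i in range(len(frames) - step)
    ]


def multiscale_features(
    frames: Sequence[Frame], windows: Sequence[int] = (2, 3)
) -> Dict[str, List[Frame]]:
    """Collect max features at several temporal scales plus frame differences."""
    features = {f"max_w{w}": temporal_max(frames, w) for w in windows}
    features["diff"] = temporal_diff(frames)
    return features


# Toy clip: 4 frames, 2 features each (made-up values, not real data).
clip = [[0.0, 1.0], [0.5, 1.0], [0.25, 3.0], [0.75, 1.0]]
features = multiscale_features(clip)
```

In this toy clip the second feature spikes in the third frame; the spike shows up as a large positive and then negative entry in `features["diff"]`, while the windowed maxima in `features["max_w2"]` and `features["max_w3"]` change more gradually. Neither operation adds learnable weights, which is one intuition for how avoiding 3D convolutions could keep the parameter count low; the actual MDN blocks operate on convolutional feature maps and may differ substantially from this sketch.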


Series: IEEE transactions on affective computing
ISSN: 2371-9850
ISSN-E: 1949-3045
ISSN-L: 2371-9850
DOI: 10.1109/TAFFC.2021.3072579
OADOI: https://oadoi.org/10.1109/TAFFC.2021.3072579
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Copyright information: This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/