University of Oulu

Lee, Y., Chen, H., Zhao, G., & Specht, M. (2022). WEDAR: Webcam-based attention analysis via attention regulator behavior recognition with a novel e-reading dataset. Proceedings of the 2022 International Conference on Multimodal Interaction, 319–328.

WEDAR: Webcam-based attention analysis via attention regulator behavior recognition with a novel e-reading dataset

Author: Lee, Yoon1; Chen, Haoyu2; Zhao, Guoying2; Specht, Marcus1
Organizations: 1Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Netherlands and Leiden-Delft-Erasmus Centre for Education and Learning, Netherlands
2Center for Machine Vision and Signal Analysis, University of Oulu, Finland
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 1.5 MB)
Persistent link:
Language: English
Published: ACM, 2022
Publish Date: 2023-03-31


Human attention is a critical yet challenging cognitive process to measure, owing to its diverse definitions and the lack of standardized evaluation. In this work, we focus on learners’ attention self-regulation, which commonly occurs as an effort to regain focus, in contrast to attention loss. We concentrate on easy-to-observe behavioral signs in a real-world setting to capture learners’ attention during e-reading. We collected a novel dataset of 30 learners that provides clues to learners’ attentional states through various metrics, such as learner behaviors, distraction self-reports, and questionnaires measuring knowledge gain. To enable automatic attention regulator behavior recognition, we annotated 931,440 frames into six behavior categories at one-second intervals in short-clip form, using attention self-regulation behaviors identified in the literature as our labels. A preliminary Pearson correlation coefficient analysis indicates certain correlations between distraction self-reports and unimodal attention regulator behaviors. Baseline models were trained to recognize the attention regulator behaviors by applying classical neural networks to our WEDAR dataset, with the highest prediction accuracies of 75.18% and 68.15% in subject-dependent and subject-independent settings, respectively. Furthermore, we present a baseline for using attention regulator behaviors to recognize attentional states, showing a promising accuracy of 89.41% (leave-five-subject-out). Our work informs the design of detection-and-feedback loops for attentive e-reading, connecting multimodal interaction, learning analytics, and affective computing.


ISBN Print: 978-1-4503-9390-4
Pages: 319–328
DOI: 10.1145/3536221.3556619
Host publication: ICMI '22: Proceedings of the 2022 International Conference on Multimodal Interaction
Conference: International Conference on Multimodal Interaction
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Funding: This research was supported by the Leiden-Delft-Erasmus Centre for Education and Learning, the Academy of Finland Academy Professor project EmotionAI (grants 336116, 345122), and project MiGA (grant 316765).
Academy of Finland Grant Number: 336116
Detailed Information: 336116 (Academy of Finland Funding decision)
345122 (Academy of Finland Funding decision)
316765 (Academy of Finland Funding decision)
Copyright information: © 2022 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution 4.0 International License.