University of Oulu

P. Kostakos, A. Pandya, O. Kyriakouli and M. Oussalah, "Inferring Demographic Data of Marginalized Users in Twitter with Computer Vision APIs," 2018 European Intelligence and Security Informatics Conference (EISIC), Karlskrona, Sweden, 2018, pp. 81-84. doi: 10.1109/EISIC.2018.00022

Inferring demographic data of marginalized users in Twitter with computer vision APIs

Author: Kostakos, Panos (1); Pandya, Abhinay (1); Kyriakouli, Olga (2);
Organizations: (1) Center for Ubiquitous Computing, University of Oulu, Oulu, Finland
(2) Dept. of Informatics and Telematics, Harokopio University, Athens, Greece
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 0.5 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2019
Publish Date: 2019-08-20


Inferring demographic intelligence from unlabeled social media data is an actively growing area of research, challenged by the low availability of ground-truth annotated training corpora. High-accuracy approaches for labeling the demographic traits of social media users rely on various heuristics that do not scale and often discount non-English texts and marginalized users. First, we present a framework for inferring the demographic attributes of Twitter users from their profile pictures (avatars) using the Microsoft Azure Face API. Second, we measure the inter-rater agreement between the annotations produced by our framework and two pre-labeled samples of Twitter users (N1=1163; N2=659) whose age labels were manually annotated. Our results indicate that the strength of the inter-rater agreement between the gold standard and our approach (Gwet’s AC1=0.89 and 0.90) is ‘very good’ for labeling the age group of users. The paper provides a use case of computer vision for enabling the development of large cross-sectional labeled datasets, and further advances novel solutions in the field of demographic inference from short social media texts.
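
The agreement statistic named in the abstract can be illustrated with a short sketch. The code below computes Gwet's AC1 for two raters over categorical labels, using the standard two-rater formula AC1 = (pa - pe) / (1 - pe), where pa is the observed agreement and pe is the chance agreement derived from the average category prevalence. It is not the authors' code: the function name `gwet_ac1`, the age-group labels, and the toy data are assumptions made for illustration only.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b, categories=None):
    """Gwet's AC1 agreement coefficient for two raters (illustrative sketch)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    if categories is None:
        categories = sorted(set(rater_a) | set(rater_b))
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")

    # Observed agreement: share of items given the same label by both raters.
    p_a = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: based on the mean prevalence of each category
    # across the two raters, normalised by (q - 1).
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = 0.0
    for k in categories:
        pi_k = (counts_a[k] + counts_b[k]) / (2 * n)
        p_e += pi_k * (1 - pi_k)
    p_e /= (q - 1)

    return (p_a - p_e) / (1 - p_e)


# Hypothetical toy data: manual (gold) age groups vs. groups derived from
# an automatic age estimate. Labels and values are invented for this example.
gold = ["under_30"] * 8 + ["30_plus"] * 2
predicted = ["30_plus"] + ["under_30"] * 7 + ["30_plus"] * 2

print(round(gwet_ac1(gold, predicted), 2))  # 0.84
```

In this toy case the raters disagree on one of ten users, giving pa = 0.9 and pe = 0.375, hence AC1 = 0.84. Unlike Cohen's kappa, AC1 stays stable when one category dominates, which matters for skewed age distributions.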


ISBN: 978-1-5386-9400-8
ISBN Print: 978-1-5386-9401-5
Pages: 81 - 84
DOI: 10.1109/EISIC.2018.00022
Host publication: Proceedings of the European Intelligence and Security Informatics Conference (EISIC) 2018, October 24-25, 2018, Blekinge Institute of Technology, Karlskrona, Sweden
Host publication editor: Brynielsson, Joel
Conference: European Intelligence and Security Informatics Conference (EISIC)
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Funding: This work is (partially) funded by the European Commission grants 770469-CUTLER and 645706-GRAGE.
EU Grant Number: (770469) CUTLER - Coastal Urban developmenT through the LEnses of Resiliency
(645706) GRAGE - Grey and green in Europe: elderly living in urban areas
Copyright information: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.