University of Oulu

Abhinay Pandya, Mourad Oussalah, Paola Monachesi, Panos Kostakos, On the use of distributed semantics of tweet metadata for user age prediction, Future Generation Computer Systems, Volume 102, 2020, Pages 437-452, ISSN 0167-739X,

On the use of distributed semantics of tweet metadata for user age prediction

Author: Pandya, Abhinay (1); Oussalah, Mourad (1); Monachesi, Paola (2)
Organizations: (1) Center for Ubiquitous Computing, Faculty of Information Technology and Electrical Engineering, University of Oulu, FI90014, Finland
(2) Department of Linguistics, University of Utrecht, 3512 JK Utrecht, The Netherlands
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 2.9 MB)
Persistent link:
Language: English
Published: Elsevier, 2020
Publish Date: 2020-01-20


Social media data represent an important resource for behavioral analysis of the aging population. This paper addresses the problem of predicting user age from Twitter data, where prediction is framed as a classification task. For this purpose, we devise a model based on a Convolutional Neural Network (CNN) that relies on language-related features and social-media-specific metadata. More specifically, we introduce two features that have not previously been considered in the literature: the content of URLs and the hashtags appearing in tweets. We also employ distributed representations of the words and phrases present in tweets, hashtags, and URLs, pre-trained on appropriate corpora, in order to exploit their semantic information for age prediction. We show that our CNN-based classifier, when compared with baseline models, improves the micro-averaged F1 score by up to 12.3% on the Dutch dataset, 9.8% on the English1 dataset, and 6.6% on the English2 dataset.
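The abstract reports its results as micro-averaged F1. As a minimal sketch of how that metric is generally computed (not the authors' evaluation code), the function below pools true positives, false positives, and false negatives over all classes before computing F1. The function name `micro_f1` and the age-group labels are hypothetical, chosen only for illustration; the record does not state the paper's actual age classes.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


# Hypothetical age-group labels, for illustration only.
y_true = ["under_30", "30_to_50", "over_50", "under_30", "over_50"]
y_pred = ["under_30", "over_50", "over_50", "under_30", "30_to_50"]
score = micro_f1(y_true, y_pred)
print(score)
```

When every user receives exactly one label, as in this sketch, each misclassification counts once as a false positive and once as a false negative, so micro-averaged F1 coincides with plain accuracy.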


Series: Future generation computer systems
ISSN: 0167-739X
ISSN-E: 1872-7115
ISSN-L: 0167-739X
Volume: 102
Pages: 437 - 452
DOI: 10.1016/j.future.2019.08.018
Type of Publication: A1 Journal article – refereed
Field of Science: 113 Computer and information sciences
Funding: This work is (partially) funded by the Marie Skłodowska-Curie, Finland grant (645706-GRAGE) and EU grant (770469-Cutler).
EU Grant Number: (645706) GRAGE - Grey and green in Europe: elderly living in urban areas
(770469) CUTLER - Coastal Urban developmenT through the LEnses of Resiliency
Copyright information: © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (