On the use of URLs and hashtags in age prediction of Twitter users |
|
Author: | Pandya, Abhinay (1); Oussalah, Mourad (1); Monachesi, Paola (2) |
Organizations: |
(1) Center for Ubiquitous Computing, University of Oulu, Finland; (2) University of Utrecht, Netherlands |
Format: | article |
Version: | accepted version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.4 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe2018112849416 |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers, 2018 |
Publish Date: | 2018-11-28 |
Description: |
Abstract: Social media data represent an important resource for behavioral analysis of the ageing population. This paper addresses the problem of age prediction from a Twitter dataset, where prediction is framed as a classification task. For this purpose, an innovative model based on a Convolutional Neural Network (CNN) is devised, relying on language-related features and social-media-specific metadata. More specifically, we introduce two features that have not been previously considered in the literature: the content of URLs and hashtags appearing in tweets. We also employ distributed representations of words and phrases present in tweets, hashtags and URLs, pre-trained on appropriate corpora, in order to exploit their semantic information in age prediction. We show that our CNN-based classifier, when compared with an SVM baseline model, yields an improvement of 12.3% and 6.6% in the micro-averaged F1 score on the Dutch and English datasets, respectively.
|
ISBN: | 978-1-5386-2659-7 |
ISBN Print: | 978-1-5386-2660-3 |
Pages: | 62 - 69 |
DOI: | 10.1109/IRI.2018.00017 |
OADOI: | https://oadoi.org/10.1109/IRI.2018.00017 |
Host publication: |
2018 IEEE International Conference on Information Reuse and Integration (IRI), 7–9 July 2018, Salt Lake City, Utah, USA |
Conference: |
IEEE International Conference on Information Reuse and Integration |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences; 213 Electronic, automation and communications engineering, electronics |
Funding: |
This work is partially supported by EU Marie Skłodowska-Curie grant No 645706 and EU grant 770469-Cutler. This paper is based on results of a project that has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No 645706. |
EU Grant Number: |
(645706) GRAGE - Grey and green in Europe: elderly living in urban areas; (770469) CUTLER - Coastal Urban developmenT through the LEnses of Resiliency |
Copyright information: |
© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
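
Note on the evaluation metric: the abstract reports improvements in the micro-averaged F1 score. As a reading aid, the following minimal Python sketch shows how that metric is commonly computed for a multi-class task: true positives, false positives and false negatives are pooled over all classes before F1 is taken. The function name `micro_f1`, the age-bin labels and the toy predictions are hypothetical illustrations, not the paper's classes, data or code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    # Treat each class one-vs-rest and accumulate the counts globally.
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical age bins and predictions, for illustration only.
y_true = ["18-30", "18-30", "31-50", "51+", "51+", "31-50"]
y_pred = ["18-30", "31-50", "31-50", "51+", "31-50", "31-50"]
score = micro_f1(y_true, y_pred)
print(round(score, 3))
```

In a single-label setting such as this sketch, every misclassification counts once as a false positive and once as a false negative, so micro-averaged F1 coincides with plain accuracy (here 4 correct out of 6).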