University of Oulu

C. Abderrouaf and M. Oussalah, "On Online Hate Speech Detection. Effects of Negated Data Construction," 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019, pp. 5595-5602. doi: 10.1109/BigData47090.2019.9006336

On online hate speech detection : effects of negated data construction

Author: Abderrouaf, Cheniki¹; Oussalah, Mourad¹
Organizations: ¹University of Oulu, Faculty of Information Technology, CMVS, PO Box 4500, Oulu 90014, Finland
Format: article
Version: accepted version
Access: open
Online Access: PDF Full Text (PDF, 0.3 MB)
Language: English
Published: Institute of Electrical and Electronics Engineers, 2019
Publish Date: 2020-02-25


In the era of social media and mobile internet, the design of automatic tools for online detection of hate speech and/or abusive language has become crucial for society and community empowerment. The current state of technology in this respect is still limited, and many service providers still rely on manual checks. This paper aims to advance this topic by leveraging novel natural language processing, machine learning, and feature engineering techniques. The proposed approach advocates a classification-like technique that makes use of a special data design procedure. The latter enforces a balanced training scheme by exploring the negativity of the original dataset. This generates new transfer learning paradigms. Two classification schemes, using convolutional neural network and LSTM architectures that take FastText embeddings as input features, are contrasted with baseline models consisting of logistic regression and Naive Bayes classifiers. The Wikipedia Comment dataset, consisting of Personal Attack, Aggression and Toxicity data, is employed to test the validity and usefulness of the proposal.


ISBN: 978-1-7281-0858-2
ISBN Print: 978-1-7281-0859-9
Pages: 5595 - 5602
DOI: 10.1109/BigData47090.2019.9006336
Host publication: 2019 IEEE International Conference on Big Data (Big Data)
Conference: IEEE International Conference on Big Data
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Copyright information: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.