On online hate speech detection : effects of negated data construction |
|
Author: | Abderrouaf, Cheniki¹; Oussalah, Mourad¹ |
Organizations: |
¹University of Oulu, Faculty of Information Technology, CMVS, PO Box 4500, Oulu 90014, Finland |
Format: | article |
Version: | accepted version |
Access: | open |
Online Access: | PDF Full Text (PDF, 0.3 MB) |
Persistent link: | http://urn.fi/urn:nbn:fi-fe202002256445 |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers, 2019 |
Publish Date: | 2020-02-25 |
Description: |
Abstract: In the era of social media and mobile internet, the design of automatic tools for online detection of hate speech and/or abusive language has become crucial for society and community empowerment. Current technology in this respect is still limited, and many service providers still rely on manual checks. This paper aims to advance this topic by leveraging novel natural language processing, machine learning, and feature engineering techniques. The proposed approach advocates a classification-like technique that makes use of a special data design procedure. The latter enforces a balanced training scheme by exploring the negativity of the original dataset. This generates new transfer learning paradigms. Two classification schemes using convolutional neural network and LSTM architectures that use FastText embeddings as input features are contrasted with baseline models consisting of Logistic Regression and Naive Bayes classifiers. The Wikipedia Comment dataset, comprising Personal Attack, Aggression and Toxicity data, is employed to test the validity and usefulness of the proposal.
|
ISBN: | 978-1-7281-0858-2 |
ISBN Print: | 978-1-7281-0859-9 |
Pages: | 5595 - 5602 |
DOI: | 10.1109/BigData47090.2019.9006336 |
OADOI: | https://oadoi.org/10.1109/BigData47090.2019.9006336 |
Host publication: |
2019 IEEE International Conference on Big Data (Big Data) |
Conference: |
IEEE International Conference on Big Data |
Type of Publication: |
A4 Article in conference proceedings |
Field of Science: |
113 Computer and information sciences |
Copyright information: |
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |