University of Oulu

Pandya, A. and Oussalah, M. (2017). Novel Semantics-based Distributed Representations for Message Polarity Classification using Deep Convolutional Neural Networks. In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, ISBN 978-989-758-271-4, pages 71-82. DOI: 10.5220/0006500800710082

Novel semantics-based distributed representations for message polarity classification using deep convolutional neural networks

Author: Pandya, Abhinay (1); Oussalah, Mourad (1)
Organizations: (1) Center for Ubiquitous Computing, Faculty of Information Technology and Electrical Engineering (ITEE), University of Oulu, Finland
Format: article
Version: published version
Access: open
Online Access: PDF Full Text (PDF, 0.4 MB)
Persistent link: http://urn.fi/urn:nbn:fi-fe2019081924640
Language: English
Published: SciTePress, 2017
Publish Date: 2019-08-19
Description:

Abstract

Unsupervised learning of distributed representations (word embeddings) obviates the need for task-specific feature engineering in various NLP applications. However, such representations, learned from massive text datasets, do not faithfully capture the finer semantic information that specific applications require in the feature space. This is because (a) the models that learn such representations ignore the linguistic structure of sentences, (b) they fail to capture polysemous usages of words, and (c) they ignore pre-existing semantic information from manually created ontologies. In this paper, we propose three semantics-based distributed representations of words and phrases as features for message polarity classification: Sentiment-Specific Multi-Word Expressions Embeddings (SSMWE) are sentiment-encoded distributed representations of multi-word expressions (MWEs); Sense-Disambiguated Word Embeddings (SDWE) are sense-specific distributed representations of words; and WordNet embeddings (WNE) are distributed representations of the hypernym and hyponym of the correct sense of a given word. We examine the effects of incorporating these features into a convolutional neural network (CNN) model, evaluated on the SemEval benchmark dataset. Using these novel features yields a 14.24% improvement in the macro-averaged F1 score on SemEval datasets over existing methods. While we have shown promising results in Twitter sentiment classification, we believe that the method is general enough to be applied to many NLP applications where finer semantic analysis is required.
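
The abstract reports its result as a macro-averaged F1 score. As a minimal sketch of how such a score is generally computed (not the authors' evaluation script), the Python function below averages per-class F1 with equal weight per class. The function name `macro_f1` and the label strings are hypothetical, and which classes enter the average (for example, whether a neutral class is included) is an assumption, since the abstract does not specify it.

```python
def macro_f1(gold, predicted, labels):
    """Return the unweighted mean of per-class F1 scores over `labels`.

    `gold` and `predicted` are equal-length sequences of class labels.
    Which labels are averaged is left to the caller (an assumption here;
    the abstract does not say which classes the reported score covers).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")

    pairs = list(zip(gold, predicted))
    scores = []
    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)

        # Define the ratio as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)

    # Every class contributes equally, regardless of how frequent it is.
    return sum(scores) / len(scores)
```

Because each class carries equal weight, a classifier that performs poorly on a rare polarity class is penalized as much as one that performs poorly on a frequent class.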

Series: IC3K
ISSN: 2184-3228
ISSN-E: 2184-3228
ISSN-L: 2184-3228
ISBN Print: 978-989-758-271-4
Pages: 71 - 82
DOI: 10.5220/0006500800710082
OADOI: https://oadoi.org/10.5220/0006500800710082
Host publication: Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
Host publication editor: Fred, Ana; Filipe, Joaquim
Conference: ACM International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Type of Publication: A4 Article in conference proceedings
Field of Science: 113 Computer and information sciences
Copyright information: © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved. Published in this repository with the kind permission of the publisher.