Multilingual Sentiment Analysis Using Emoticons and Keywords | IEEE Conference Publication | IEEE Xplore

Abstract:

Nowadays the World Wide Web has evolved into a leading communication channel and information exchange medium. Especially after the introduction of the so-called Web 2.0 and the ensuing explosion of user-generated content, the amount of data available over the internet has attracted the interest of both the scientific and business communities. Their efforts focus on identifying the inner structures of the data and the knowledge that can be derived by analyzing them. Web 2.0 is the subject of study and research in a number of areas. One of these areas is sentiment analysis, where the main goal is to study and draw conclusions about the subjectivity, polarity and feeling expressed in user-generated content, which mainly consists of free-text documents. The goal of this paper is to apply sentiment analysis to multilingual data, focusing on documents written in Greek. We developed an integrated framework that accepts user-generated documents and identifies the polarity of the text (neutral, negative or positive) and the sentiment expressed through it (joy, love, anger or sadness). We followed a semi-supervised approach, which led to the development of two techniques for the automatic collection of training data without any human intervention. Our approach involves the detection and use of self-defining features that are available within the data. We take into account two emotionally rich features: a) emoticons and b) lists of emotionally intense keywords. These features are evaluated on data coming from a popular forum, using various classifiers and feature vectors. Our experimental results point to various conclusions about the effectiveness, advantages and limitations of applying such methods to Greek data. Using keywords we achieved 90% mean accuracy in identifying the subjectivity level and 93% in identifying the polarity level, whereas using emoticons the mean accuracy for these levels was 74% and 77%, respectively.
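
As an illustration of how emoticons can act as self-defining features for collecting training data without human labelling, the following is a minimal sketch, not the authors' implementation. The emoticon lists, the function names, the rule of skipping posts with mixed or missing signals, and the step of stripping the emoticons from the stored text are all assumptions made for this example; the abstract does not specify any of them.

```python
# Hypothetical emoticon lists; the lists used in the paper are not given in the abstract.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", ";)")
NEGATIVE_EMOTICONS = (":(", ":-(", ":'(")


def _contains_any(text, markers):
    """Return True if at least one marker occurs in the text."""
    return any(marker in text for marker in markers)


def label_by_emoticons(text):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    has_pos = _contains_any(text, POSITIVE_EMOTICONS)
    has_neg = _contains_any(text, NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        # Either no emoticon at all or both kinds: too ambiguous to use (assumed rule).
        return None
    return "positive" if has_pos else "negative"


def strip_emoticons(text):
    """Remove the emoticons so a classifier cannot simply read off the label source.

    This step is an assumption of the sketch, not something stated in the abstract.
    """
    for marker in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        text = text.replace(marker, " ")
    return " ".join(text.split())


def collect_training_data(posts):
    """Build (text, label) pairs from raw posts using emoticons as the only supervision."""
    examples = []
    for post in posts:
        label = label_by_emoticons(post)
        if label is not None:
            examples.append((strip_emoticons(post), label))
    return examples
```

A keyword-based collector could follow the same shape, replacing the emoticon tuples with lists of emotionally intense words per class; the resulting pairs would then be turned into feature vectors and passed to a classifier.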
Date of Conference: 11-14 August 2014
Date Added to IEEE Xplore: 20 October 2014
Electronic ISBN: 978-1-4799-4143-8
Conference Location: Warsaw, Poland

