
Sentiment classification for Indonesian messages in social media

2 Author(s)
Aqsath Rasyid Naradhipa ; Sch. of Electr. & Inf. Eng., Bandung Inst. of Technol., Bandung, Indonesia ; Ayu Purwarianti

Social media has grown rapidly in recent years. As a result, people increasingly share their activities on social media, including complaints about companies' products or services. This behaviour gives a company a valuable opportunity to learn how customers feel about it. However, classifying the sentiment of social media messages poses several challenges. First, the language used on social media often lacks formal sentence structure: writers use abbreviations, replace letters with numbers, omit punctuation marks, and so on. Second, messages on social media are not restricted to a single domain, which makes them harder to classify. To address these challenges, this paper discusses a method for classifying sentiment in social media messages written in Indonesian. The method classifies each sentence into one of several sentiment classes. Before classification, each sentence is transformed so that its informal words are replaced with formal ones. The transformation consists of deleting punctuation marks, tokenization, converting numbers to letters, reducing repeated letters, applying the Levenshtein distance, and using a corpus to formalize abbreviations. The formalized sentence is the input to the classification model, both for training and for classification. We classify each message into four classes: Neutral (fact, greetings, etc.), Question, Positive Sentiment, and Negative Sentiment. SVM (Support Vector Machine) and Maximum Entropy are used as the classification algorithms, with the counts of positive, negative, and question words in the sentence as features. In our experiments, the best classification method is SVM, which yields 86.66% accuracy.

Published in:

Electrical Engineering and Informatics (ICEEI), 2011 International Conference on

Date of Conference:

17-19 July 2011