We present a machine learning approach to sentiment classification of Twitter messages (tweets). We classify each tweet into one of two categories: polar or non-polar. Tweets that express positive or negative sentiment are considered polar; all others are considered non-polar. Sentiment analysis of tweets can benefit different parties, such as consumers and marketing researchers, who want to obtain opinions on products and services. We present methods for normalizing the noisy text of tweets and for classifying them by polarity. We experiment with a mixture model approach for generating sentiment-bearing words, which are then used as indicator features in the classification model. On a manually annotated gold-standard set of tweets, the new approach yields F-scores that are a relative 10% higher than those of a classification baseline that uses raw word n-gram features.
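As a rough illustration of the kinds of steps named above (normalizing noisy tweets, extracting word n-gram features, and adding an indicator feature for sentiment-bearing words), the following is a minimal sketch. It is not the authors' method: the normalization rules, the placeholder tokens `<url>` and `<user>`, the function names, and the example lexicon are all assumptions made for illustration, and the mixture model that would generate the lexicon is not shown.

```python
import re
from collections import Counter

# Hypothetical normalization rules; the abstract does not specify the real ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character in a row


def normalize(tweet):
    """Lowercase, mask URLs and user mentions, squeeze character runs, tokenize."""
    text = tweet.lower()
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    text = REPEAT_RE.sub(r"\1\1", text)  # "sooooo" becomes "soo"
    return text.split()


def ngram_features(tokens, n_max=2):
    """Count raw word n-grams up to length n_max (the baseline-style features)."""
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def lexicon_indicator(tokens, lexicon):
    """Return 1 if any token is in the (assumed) sentiment lexicon, else 0."""
    return int(any(tok in lexicon for tok in tokens))


if __name__ == "__main__":
    tokens = normalize("Sooooo GOOD @bob http://t.co/x")
    print(tokens)
    print(ngram_features(tokens))
    print(lexicon_indicator(tokens, {"good", "bad"}))  # toy lexicon, assumed
```

The feature dictionary and the indicator value could then be fed to any standard classifier to separate polar from non-polar tweets; which classifier the authors use is not stated in this abstract.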