Inferring the sentiment of social media content, such as blog posts and forum threads, is both of great interest to security analysts and technically challenging to accomplish. This paper presents a new method for estimating social media sentiment that addresses the challenges associated with Web-based analysis. The approach formulates the task as one of learning-based text classification, models the data as a bipartite graph of documents and words, and provides accurate sentiment estimation using only a small lexicon of words of known sentiment orientation; in particular, good performance is obtained without the need for labeled training documents. This capability for effective learning without labeled exemplar documents is realized by (1) exploiting the information present in unlabeled documents and words, which are abundant online, and (2) appropriately smoothing the sentiment polarity estimates for documents and words in the bipartite graph data model. The utility of the proposed algorithm is demonstrated through implementation with a "standard" sentiment analysis task involving online consumer product reviews. Additionally, we illustrate the potential of the method for security informatics by inferring regional public opinion regarding the Egyptian revolution via analysis of Arabic, Indonesian, and Danish blog posts.
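
The general idea of smoothing polarity estimates over a bipartite document-word graph, seeded only by a small lexicon, can be illustrated with a minimal Python sketch. This is not the algorithm of the paper, whose details are not given here; it is a generic label-propagation scheme written under stated assumptions. The function name `propagate_sentiment`, the mixing weight `alpha`, the fixed iteration count, and the simple neighbour-averaging update are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def propagate_sentiment(docs, lexicon, alpha=0.5, iterations=20):
    """Illustrative label propagation on a document-word bipartite graph.

    docs:    list of token lists (unlabeled documents)
    lexicon: dict mapping a few words to a known polarity (+1.0 or -1.0)
    alpha:   assumed weight pulling lexicon words back toward their prior
    """
    # Bipartite adjacency: word -> indices of the documents containing it.
    word_docs = defaultdict(set)
    for i, tokens in enumerate(docs):
        for w in tokens:
            word_docs[w].add(i)

    # Lexicon words start at their known polarity, all other words at 0.
    word_score = {w: lexicon.get(w, 0.0) for w in word_docs}
    doc_score = [0.0] * len(docs)

    for _ in range(iterations):
        # Document step: average the scores of the distinct words it contains.
        for i, tokens in enumerate(docs):
            words = set(tokens)
            if words:
                doc_score[i] = sum(word_score[w] for w in words) / len(words)
            else:
                doc_score[i] = 0.0

        # Word step: average the scores of the documents containing the word;
        # lexicon words are additionally anchored to their known polarity.
        for w, idxs in word_docs.items():
            neighbour_mean = sum(doc_score[i] for i in idxs) / len(idxs)
            if w in lexicon:
                word_score[w] = alpha * lexicon[w] + (1 - alpha) * neighbour_mean
            else:
                word_score[w] = neighbour_mean

    return doc_score, word_score


if __name__ == "__main__":
    # Toy corpus, invented for illustration only.
    docs = [["good", "camera"], ["bad", "screen"], ["camera", "sharp"]]
    doc_score, word_score = propagate_sentiment(docs, {"good": 1.0, "bad": -1.0})
    for tokens, score in zip(docs, doc_score):
        print(tokens, "positive" if score > 0 else "negative")
```

In this toy corpus the third document contains no lexicon word, yet it receives a positive score because it shares "camera" with a document that contains "good". This is the sense in which unlabeled documents and words carry usable information once estimates are smoothed across the graph.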