Inferring the sentiment of social media content, such as blog posts and forum threads, is of great interest to security analysts but technically challenging to accomplish. This paper presents two computational methods for estimating social media sentiment that address the challenges associated with Web-based analysis. Each method formulates the task as one of text classification, models the data as a bipartite graph of documents and words, and assumes that only limited prior information is available regarding the sentiment orientation of the documents and words of interest. The first algorithm is a semi-supervised sentiment classifier that combines knowledge of the sentiment labels for a few documents and words with the information present in unlabeled data, which is abundant online. The second algorithm assumes the existence of a set of labeled documents in a domain related to the domain of interest, and leverages these data to estimate sentiment in the target domain. We demonstrate the utility of the proposed methods by showing that they outperform several standard techniques on the task of inferring the sentiment of online movie and consumer product reviews. Additionally, we illustrate the potential of the methods for security informatics by estimating regional public opinion regarding Egypt's unfolding revolution through analysis of Arabic-, Indonesian-, and Danish-language blog posts.
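
To make the bipartite document-word formulation concrete, the following is a minimal sketch of generic label propagation over such a graph. It is not the algorithm proposed in the paper, whose details are not given in the abstract. The function name `propagate_sentiment`, the averaging update rule, the clamping of seed labels, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_sentiment(docs, seed_words, seed_docs, n_iter=20):
    """Illustrative label propagation on a document-word bipartite graph.

    docs:       list of token lists, one per document
    seed_words: dict word -> +1 or -1 (the few labeled words)
    seed_docs:  dict document index -> +1 or -1 (the few labeled documents)
    Returns one score per document; the sign is the estimated sentiment.
    NOTE: a hypothetical sketch, not the method from the paper.
    """
    # Bipartite adjacency: each word maps to the documents containing it.
    word_to_docs = defaultdict(set)
    for i, tokens in enumerate(docs):
        for w in tokens:
            word_to_docs[w].add(i)

    # Unlabeled nodes start at 0; labeled nodes start at their seed value.
    doc_score = [float(seed_docs.get(i, 0.0)) for i in range(len(docs))]
    word_score = {w: float(seed_words.get(w, 0.0)) for w in word_to_docs}

    for _ in range(n_iter):
        # Unlabeled words take the mean score of their documents.
        for w, idxs in word_to_docs.items():
            if w not in seed_words:
                word_score[w] = sum(doc_score[i] for i in idxs) / len(idxs)
        # Unlabeled documents take the mean score of their distinct words.
        for i, tokens in enumerate(docs):
            if i not in seed_docs and tokens:
                vocab = set(tokens)
                doc_score[i] = sum(word_score[w] for w in vocab) / len(vocab)
    return doc_score


# Toy data (invented for illustration): one labeled document, two labeled words.
docs = [["great", "film"], ["awful", "film"], ["great", "plot"], ["awful", "plot"]]
scores = propagate_sentiment(docs, {"great": 1, "awful": -1}, {0: 1})
labels = ["positive" if s > 0 else "negative" for s in scores]
```

In this sketch, seed labels stay fixed while scores for unlabeled nodes flow back and forth across the graph, which is one simple way to combine a few labels with abundant unlabeled text.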