Micro-blogging sites like Twitter have become a valuable source of information due to their recent upsurge in popularity. The objective of this research is to sense urban phenomena such as people's interest in particular topics or shifts of interest from one topic to another. Sometimes even the proportion of tweets related to a topic in the daily Twitter corpus of a region gives a good indication of people's level of interest in that topic on that day. Unfortunately, most tweets are not explicitly tagged with topic keywords by their authors. In this paper, we propose a method for automatically tagging untagged tweets. Our method is based on identifying important collocations from a large training set of tweets. We then train a multinomial Naïve Bayes classifier on these collocation features to tag untagged tweets. We achieve 88.25% accuracy with high precision and recall.
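The two stages described above, collocation identification followed by multinomial Naïve Bayes classification, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: collocations are taken to be adjacent word pairs ranked by pointwise mutual information (PMI), the classifier uses add-one smoothing, and the names `top_collocations` and `CollocationNB`, the toy tweets, and the topic labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def bigrams(tokens):
    """Return adjacent word pairs from a token list."""
    return list(zip(tokens, tokens[1:]))


def top_collocations(docs, min_count=2, top_k=50):
    """Rank bigrams by PMI (an assumed scoring choice) and keep the top_k."""
    unigrams, pairs = Counter(), Counter()
    for tokens in docs:
        unigrams.update(tokens)
        pairs.update(bigrams(tokens))
    n_uni, n_pair = sum(unigrams.values()), sum(pairs.values())
    scores = {}
    for (a, b), c in pairs.items():
        if c < min_count:  # drop rare pairs, whose PMI is unreliable
            continue
        p_ab = c / n_pair
        p_a, p_b = unigrams[a] / n_uni, unigrams[b] / n_uni
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return set(ranked[:top_k])


class CollocationNB:
    """Multinomial Naive Bayes over collocation counts, add-one smoothed."""

    def fit(self, docs, labels, vocab):
        self.vocab = vocab
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for tokens, y in zip(docs, labels):
            self.feat_counts[y].update(b for b in bigrams(tokens) if b in vocab)
        return self

    def predict(self, tokens):
        feats = [b for b in bigrams(tokens) if b in self.vocab]
        n_docs = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for y in sorted(self.class_counts):
            total = sum(self.feat_counts[y].values())
            lp = math.log(self.class_counts[y] / n_docs)  # log prior
            for f in feats:  # smoothed log likelihood of each collocation
                lp += math.log((self.feat_counts[y][f] + 1) / (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = y, lp
        return best


# Hypothetical toy training set of tagged tweets.
train = [
    ("world cup final tonight", "sports"),
    ("watching the world cup with friends", "sports"),
    ("world cup fever in the city", "sports"),
    ("general election results are out", "politics"),
    ("who will win the general election", "politics"),
    ("general election debate tonight", "politics"),
]
docs = [text.split() for text, _ in train]
labels = [label for _, label in train]

vocab = top_collocations(docs, min_count=2)
model = CollocationNB().fit(docs, labels, vocab)
print(model.predict("excited for the world cup".split()))
```

In this toy setting only the pairs that recur across tweets survive the frequency threshold, so the classifier decides on a small set of topic-bearing phrases rather than on every word. A real system would need proper tweet tokenization, a much larger training set, and whichever collocation measure the paper actually uses.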