Many businesses have adopted Twitter as a marketing channel to promote their products and services. One potentially useful application is to recommend that users follow businesses that match their interests. One possible solution is to apply a classification algorithm that assigns a user's Twitter posts to predefined business categories. Because the posts are short, classifying them is difficult. In this paper, we propose a feature processing framework for constructing text categorization models. A topic model is constructed from a set of terms using the Latent Dirichlet Allocation (LDA) algorithm. We apply the topic model in two feature processing approaches: (1) feature transformation, i.e., using a set of topics as features, and (2) feature expansion, i.e., appending a set of topics to a set of terms. Experimental results show that the feature expansion technique achieves the highest accuracy, 95.7%, an improvement of 18.7% over the Bag of Words (BOW) model.
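
The two feature processing approaches can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes that each post is already tokenized, that a topic is represented by its per-document proportion, and that a small collapsed Gibbs sampler is an acceptable stand-in for whatever LDA inference the authors used. The function names `bow_vector` and `lda_gibbs`, the toy posts, and the hyperparameters `alpha`, `beta`, and `n_topics` are all hypothetical.

```python
import random
from collections import Counter


def bow_vector(tokens, vocab):
    """Term-count (Bag of Words) vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def lda_gibbs(docs, vocab, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative assumption only).

    Returns one topic-proportion vector per document.
    """
    rng = random.Random(seed)
    word_id = {w: i for i, w in enumerate(vocab)}
    n_terms = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens per (doc, topic)
    topic_word = [[0] * n_terms for _ in range(n_topics)]  # tokens per (topic, term)
    topic_total = [0] * n_topics                           # tokens per topic

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + n_terms * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per document.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical tokenized posts, for illustration only.
docs = [
    ["coffee", "latte", "cafe", "coffee"],
    ["espresso", "cafe", "latte"],
    ["laptop", "phone", "tablet", "laptop"],
    ["phone", "tablet", "charger"],
]
vocab = sorted({w for doc in docs for w in doc})
n_topics = 2

bow = [bow_vector(doc, vocab) for doc in docs]
theta = lda_gibbs(docs, vocab, n_topics)

# (1) Feature transformation: the topics replace the terms as features.
transformed = theta
# (2) Feature expansion: the topics are appended to the term features.
expanded = [b + t for b, t in zip(bow, theta)]
```

Either feature matrix could then be passed to a standard text classifier. Transformation reduces each post to `n_topics` features, while expansion keeps every term feature and adds the topic features alongside them.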