This paper proposes a method based on generative language models for sentiment analysis and topic identification in text. First, the sentiment and topic of the training texts are labeled by hand, and sentiment models and topic models are estimated from them. Second, the Kullback-Leibler divergence between a test text and each sentiment model is computed to determine the sentiment of the text. Similarly, the Kullback-Leibler divergence between the test text and each topic model is computed to identify the topic of the text. Word unigrams and bigrams are employed as the model parameters, and these parameters are estimated by maximum likelihood estimation combined with smoothing techniques. Experiments on a corpus of product reviews show that this language modeling approach achieves higher precision than SVM and is also more robust.
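
The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unigram models only, add-one (additive) smoothing as one possible smoothing technique, and the divergence taken from the test-text distribution to each class model. The function names (`unigram_model`, `kl_divergence`, `classify`) and the toy training sentences are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

def unigram_model(texts, vocab, alpha=1.0):
    """Estimate a unigram distribution over vocab with additive smoothing.

    alpha=1.0 gives add-one smoothing (an assumption; the abstract does not
    say which smoothing techniques are used).
    """
    counts = Counter(w for t in texts for w in t.split() if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) = sum_w p(w) * log(p(w) / q(w)), over words with p(w) > 0."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)

def classify(text, class_models, vocab):
    """Return the label whose model has the smallest KL divergence to the text."""
    # Maximum likelihood estimate for the test text; out-of-vocabulary words are ignored.
    counts = Counter(w for w in text.split() if w in vocab)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text shares no words with the training vocabulary")
    p = {w: c / total for w, c in counts.items()}
    return min(class_models, key=lambda label: kl_divergence(p, class_models[label]))

# Hypothetical hand-labeled training texts (toy data, not from the paper).
train = {
    "positive": ["great phone love it", "love the great screen"],
    "negative": ["terrible battery hate it", "hate the terrible screen"],
}
vocab = {w for texts in train.values() for t in texts for w in t.split()}
models = {label: unigram_model(texts, vocab) for label, texts in train.items()}

print(classify("love this great battery", models, vocab))  # positive
```

Topic identification would follow the same pattern, with one model per topic label instead of per sentiment label. Extending the sketch to bigrams would mean counting adjacent word pairs rather than single words, which makes smoothing more important because most pairs are unseen in training.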