This paper presents a web site supervision system that analyzes the sentiment orientation of articles (documents, texts) published on intranet web sites. A scheme based on the Vector Space Model (VSM) is put forward for identifying the sentiment orientation of a document. Because the features of a document depend on how frequently terms appear in the document and in the corpus, we extract document features using the widely used TF-IDF (Term Frequency-Inverse Document Frequency) formula, and the weight of each term is computed from a manually built sentiment lexicon. In the sentiment lexicon, Chinese words are divided into three types according to their sentiment orientation, and each word is assigned a weight according to its polarity. We begin by introducing text classification techniques based on VSM, namely feature representation, feature extraction, and text classifiers. The structure and working process of the web site monitoring system are described at the end of the paper.
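The combination of TF-IDF features with lexicon-based term weights can be illustrated with a minimal Python sketch. The abstract does not specify how the two quantities are combined, so everything here is an assumption made for illustration: the lexicon entries and their weights, the multiplication of each TF-IDF value by the lexicon weight, the summation into a single score, and the names `LEXICON`, `tf_idf`, `sentiment_score`, and `orientation` are hypothetical. Documents are assumed to be already segmented into terms, and English placeholder tokens stand in for Chinese words.

```python
import math
from collections import Counter

# Hypothetical sentiment lexicon (assumption, not from the paper):
# positive words have weight > 0, negative words < 0,
# and neutral or unknown words are treated as 0.
LEXICON = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}


def tf_idf(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # tf = relative frequency in the document, idf = log(N / df)
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def sentiment_score(vector, lexicon=LEXICON):
    """Sum of tf-idf values scaled by lexicon weights (assumed combination rule)."""
    return sum(weight * lexicon.get(term, 0.0) for term, weight in vector.items())


def orientation(score, eps=1e-9):
    """Map a score to one of three orientation labels."""
    if score > eps:
        return "positive"
    if score < -eps:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    docs = [
        ["service", "good", "excellent"],
        ["service", "bad", "terrible"],
        ["service", "report"],
    ]
    for doc, vec in zip(docs, tf_idf(docs)):
        print(doc, orientation(sentiment_score(vec)))
```

In this sketch a term that occurs in every document, such as `service`, receives an IDF of zero and therefore contributes nothing to the score, so the orientation is driven by lexicon words that are distinctive for a document. A trained text classifier operating on the weighted vectors could replace the simple sign-based rule.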