The Internet is becoming a major platform for the spread of public opinion. It is important to capture Internet public opinion (IPO) in a timely manner and to understand its trends correctly. Text classification plays a fundamental role in a number of information management and retrieval tasks, but Web-page classification is much more difficult than pure-text classification because of the large variety of noisy information embedded in Web pages. In this paper, we propose a system scheme for the analysis of IPO. We apply Web-page classification through summarization: the most relevant content is extracted from each Web page and then passed to a standard text classification algorithm, either naive Bayes (NB) or a support vector machine (SVM). We use text classification and text clustering algorithms in combination, each of which has been shown to be efficient and effective when used on its own. Experimental results support the superiority of the proposed system's architecture.
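As an illustration only, the summarize-then-classify step can be sketched as follows. The sketch assumes a simple word-frequency extractive summarizer and a multinomial naive Bayes classifier with Laplace smoothing. Neither is confirmed to be the exact method used in this work, and all names (`summarize`, `NaiveBayes`, `top_k`) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def summarize(html, top_k=2):
    """Assumed summarizer: keep the top_k sentences with the highest
    average word frequency, which tends to drop short noisy fragments."""
    parser = _TextExtractor()
    parser.feed(html)
    text = " ".join(parser.chunks)
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return " ".join(sorted(sentences, key=score, reverse=True)[:top_k])


class NaiveBayes:
    """Multinomial naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for label, n in self.class_counts.items():
            lp = math.log(n / total)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokenize(doc):
                if w in self.vocab:  # ignore words unseen in training
                    lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best


if __name__ == "__main__":
    # Toy training data, invented purely for this illustration.
    model = NaiveBayes().fit(
        ["the team won the football match", "the parliament passed the law"],
        ["sports", "politics"],
    )
    page = "<p>Login. The team won the match. The match was close.</p>"
    print(model.predict(summarize(page)))
```

The classifier sees only the summary, so navigation text, scripts, and similar page noise have less influence on the predicted class. An SVM could take the place of `NaiveBayes` without changing the summarization step.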