Design and Implementation of Chinese Text Clustering System

4 Author(s)
Ying Tan (Coll. of Comput. Sci. & Technol., Jilin Univ., Changchun, China); Lan Huang; Hong Qi; Yandong Zhai

Clustering is the core technology of text mining. Text clustering divides a large collection of texts into several meaningful classes, or clusters. Based on the features of Chinese documents, this paper designs and implements a Chinese Text Clustering System that clusters Chinese documents automatically. First, the system automatically segments the input Chinese document set into words using the reverse maximum matching method. Second, it performs further text preprocessing. Finally, it applies the K-means clustering algorithm to obtain the clustering results. The prototype system can also be used by search engines to cluster Chinese Web pages and search for a user's interest model, which will improve the efficiency of finding the target content.
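The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the `max_len` limit, the raw term-frequency vectors standing in for the unspecified preprocessing step, and the choice of the first k documents as initial centroids are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter


def rmm_segment(text, dictionary, max_len=4):
    """Reverse maximum matching: scan from the end of the text and, at each
    step, take the longest dictionary word that ends at the current position."""
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate first and shrink until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected right to left
    return words


def tf_vector(words, vocabulary):
    """Assumed preprocessing: raw term frequencies over a fixed vocabulary."""
    counts = Counter(words)
    return [float(counts[term]) for term in vocabulary]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means. Assumption: the first k vectors are the initial
    centroids, which keeps this illustration deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Toy dictionary and documents, invented for this example.
    dictionary = {"研究", "研究生", "生命", "起源", "足球", "比赛"}
    vocabulary = ["研究", "生命", "起源", "足球", "比赛"]
    documents = ["研究生命起源", "足球比赛", "生命起源", "足球"]
    segmented = [rmm_segment(doc, dictionary) for doc in documents]
    vectors = [tf_vector(words, vocabulary) for words in segmented]
    print(segmented[0])
    print(kmeans(vectors, k=2))
```

Scanning from the end matters for a string such as 研究生命起源: a forward longest match would first take 研究生, while the reverse scan yields 研究 / 生命 / 起源. A real system would need a full lexicon and a more careful centroid initialization than this sketch uses.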

Published in:

Fifth International Joint Conference on INC, IMS and IDC, 2009 (NCM '09)

Date of Conference:

25-27 Aug. 2009