A multi-label Chinese text categorization system based on boosting algorithm

3 Author(s)
Junli Chen ; Coll. of Comput. Sci., Zhejiang Univ., China ; Xuezhong Zhou ; Zhaohui Wu

This paper presents a multi-label Chinese text categorization system based on Chinese character features and a boosting algorithm. The system has been successfully evaluated on the TCM-MED dataset, provided by the China Academy of Traditional Chinese Medicine (TCM), and on the Reuters-21578 benchmark. We suggest that the TCM-MED dataset can serve as a standard corpus for Chinese text categorization tasks. We have also carried out experiments comparing the boosting algorithm with two other traditional algorithms on the same datasets. The results indicate that, for a multi-label Chinese text categorization system, the boosting algorithm performs well and outperforms the other two algorithms.
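
The abstract does not say which boosting variant or feature encoding the system uses, so the following is only a minimal sketch of the general idea under stated assumptions: each document is reduced to the set of Chinese characters it contains, and one binary AdaBoost classifier with single-character decision stumps is trained per label (a one-vs-rest setup, which may differ from the authors' method). The toy sentences, the label names, and all function names are hypothetical and are not taken from TCM-MED.

```python
import math

def char_features(text):
    """Represent a document as the set of non-space characters it contains."""
    return {ch for ch in text if not ch.isspace()}

def stump_predict(ch, polarity, doc):
    """A stump votes `polarity` if `ch` occurs in the document, else the opposite."""
    return polarity if ch in doc else -polarity

def train_adaboost(docs, labels, rounds=10):
    """Binary AdaBoost over single-character stumps.

    docs: list of character sets; labels: list of +1 / -1.
    Returns a list of (char, polarity, alpha) weak hypotheses.
    """
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*docs))
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for ch in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump_predict(ch, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, ch, polarity)
        err, ch, polarity = best
        if err >= 0.5:
            break  # no stump is better than chance
        perfect = err <= 1e-10
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((ch, polarity, alpha))
        if perfect:
            break  # training data already separated
        # Increase the weight of misclassified documents, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(ch, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, doc):
    """Weighted vote of all stumps; positive means the label applies."""
    return sum(alpha * stump_predict(ch, polarity, doc)
               for ch, polarity, alpha in model)

def train_multilabel(examples, rounds=10):
    """Train one binary booster per label (one-vs-rest)."""
    docs = [char_features(text) for text, _ in examples]
    all_labels = sorted(set().union(*(tags for _, tags in examples)))
    return {label: train_adaboost(
                docs, [1 if label in tags else -1 for _, tags in examples], rounds)
            for label in all_labels}

def predict(models, text):
    """Return every label whose booster gives a positive score."""
    doc = char_features(text)
    return {label for label, model in models.items() if score(model, doc) > 0}

# Hypothetical toy corpus; a document may carry zero, one, or several labels.
TRAIN = [
    ("针灸治疗头痛", {"acupuncture"}),
    ("中药治疗头痛", {"herbal"}),
    ("针灸配合中药治疗失眠", {"acupuncture", "herbal"}),
    ("针灸缓解腰痛", {"acupuncture"}),
    ("中药方剂调理脾胃", {"herbal"}),
    ("推拿按摩缓解疲劳", set()),
]

models = train_multilabel(TRAIN)
print(predict(models, "针灸与中药合用"))
```

Using characters rather than words as features avoids a word segmentation step, which is one plausible reason to prefer character features for Chinese text; the sketch makes no claim about the feature selection or weighting actually used in the paper.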

Published in:

Computer and Information Technology, 2004. CIT '04. The Fourth International Conference on

Date of Conference:

14-16 Sept. 2004