Flatten hierarchies for large-scale hierarchical text categorization

2 Author(s)
Xiao-Lin Wang ; Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China ; Bao-Liang Lu

Hierarchies are widely used to organize documents and web pages, so automated hierarchical classification techniques are in demand. However, the currently dominant hierarchical approach, the top-down method, is less accurate than flat classification because of error propagation and data sparsity at the bottom nodes. In this paper we flatten hierarchies to reduce this accuracy loss in the top-down method. The aim is to make hierarchies effective enough to keep large-scale classification tasks feasible, yet simple enough to ensure high classification accuracy. We propose two flattening strategies, one for each of these two causes of the accuracy loss. Experimental results show that the strategy designed for error propagation is more effective, which suggests that hierarchies with many branches at the top layers can provide high classification accuracy. We also analyze the computational complexity before and after flattening; the analysis approximately agrees with the experimental results.
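The abstract does not spell out the two flattening strategies, so the following is only a minimal illustrative sketch of what flattening a category hierarchy can look like, not the authors' method. It assumes a toy hierarchy stored as a parent-to-children dictionary and removes the layer directly below the root, so the root gains more branches and every leaf sits one decision closer to the top. The names `flatten_top`, `depth`, and the example categories are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: each internal node maps to its children.
# Leaves do not appear as keys.
Hierarchy = Dict[str, List[str]]


def depth(tree: Hierarchy, node: str) -> int:
    """Number of edges on the longest path from node down to a leaf."""
    children = tree.get(node, [])
    if not children:
        return 0
    return 1 + max(depth(tree, child) for child in children)


def flatten_top(tree: Hierarchy, root: str) -> Hierarchy:
    """Remove the layer directly below the root (an assumed strategy).

    Every internal child of the root is deleted and its children are
    attached to the root instead. Leaf children of the root are kept.
    """
    flat = {parent: list(children) for parent, children in tree.items()}
    promoted: List[str] = []
    for child in tree.get(root, []):
        if child in tree:
            # Internal node: promote its children and drop the node.
            promoted.extend(tree[child])
            del flat[child]
        else:
            promoted.append(child)
    flat[root] = promoted
    return flat


# Toy hierarchy, invented for illustration.
tree: Hierarchy = {
    "root": ["science", "sports"],
    "science": ["physics", "biology"],
    "sports": ["soccer", "tennis", "golf"],
    "physics": ["optics", "mechanics"],
}

flat = flatten_top(tree, "root")
print(flat["root"])
print(depth(tree, "root"), depth(flat, "root"))
```

In a top-down classifier, each removed layer is one fewer decision at which an early mistake can propagate to the leaves, at the cost of a larger classification problem at the root. That is the trade-off between accuracy and feasibility the abstract describes.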

Published in:

Digital Information Management (ICDIM), 2010 Fifth International Conference on

Date of Conference:

5-8 July 2010