By Topic

Analysis and Design of a Decision Tree Based on Entropy Reduction and Its Application to Large Character Set Recognition

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$33 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

2 Author(s)
Qing Ren Wang ; Department of Computer Science, Concordia University, Montreal, P. Q., Canada H3G 1M8. ; Ching Y. Suen

Based on a recursive process of reducing the entropy, the general decision tree classifier with overlap has been analyzed. Several theorems have been proposed and proved. When the number of pattern classes is very large, the theorems can reveal both the advantages of a tree classifier and the main difficulties in its implementation. Suppose H is Shannon's entropy measure of the given problem. The theoretical results indicate that the tree searching time can be minimized to the order O(H), but the error rate is also in the same order O(H) due to error accumulation. However, the memory requirement is in the order 0(H exp(H)) which poses serious problems in the implementation of a tree classifier for a large number of classes. To solve these problems, several theorems related to the bounds on the search time, error rate, memory requirement and overlap factor in the design of a decision tree have been proposed and some principles have been established to analyze the behaviors of the decision tree. When applied to classify sets of 64, 450, and 3200 Chinese characters, respectively, the experimental results support the theoretical predictions. For 3200 classes, a very high recognition rate of 99.88 percent was achieved at a high speed of 873 samples/s when the experiment was conducted on a Cyber 172 computer using a high-level language.

Published in:

IEEE Transactions on Pattern Analysis and Machine Intelligence  (Volume:PAMI-6 ,  Issue: 4 )