A statistical-heuristic feature selection criterion for decision tree induction
Zhou, X.J.
Dillon, T.S.
Dept. of Comput. Sci., La Trobe Univ., Bundoora, Vic.
This paper appears in: Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publication Date: Aug 1991
Volume: 13
Issue: 8
On page(s): 834-841
ISSN: 0162-8828
References Cited: 30
CODEN: ITPIDJ
INSPEC Accession Number: 4034327
Digital Object Identifier: 10.1109/34.85676
Current Version Published: 2002-08-06
Abstract
The authors present a statistical-heuristic feature selection
criterion for constructing multibranching decision trees in noisy
real-world domains. Real-world problems often have multivalued features.
For these problems, multibranching decision trees provide a more
efficient and more comprehensible solution than binary decision trees.
The authors propose a statistical-heuristic criterion, the symmetrical
τ, and then discuss its consistency with a Bayesian classifier and
its built-in statistical test. The combination of a
proportional-reduction-in-error measure and a cost-of-complexity heuristic
enables the symmetrical τ to be a powerful criterion with many merits,
including robustness to noise, fairness to multivalued features, the
ability to handle Boolean combinations of logical features, and a
middle-cut preference. The τ criterion also provides a natural basis
for prepruning and dynamic error estimation. Illustrative examples are
also presented.
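
The abstract names the symmetrical τ but does not state its formula. As a rough illustration only, the sketch below assumes the criterion takes the form of the symmetric version of Goodman and Kruskal's τ, computed on a contingency table of feature values (rows) against classes (columns). The function name symmetrical_tau and the table layout are hypothetical choices made for this example, not taken from the paper.

```python
def symmetrical_tau(table):
    """Symmetric Goodman-Kruskal tau for a feature-by-class contingency table.

    table[i][j] is the number of examples with feature value i and class j.
    Assumption: this is the form of the symmetrical tau; the abstract itself
    gives no formula.
    """
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("empty contingency table")

    # Joint probabilities P(ij) and the two sets of marginals.
    p = [[count / total for count in row] for row in table]
    row_marg = [sum(row) for row in p]          # P(i+), feature values
    col_marg = [sum(col) for col in zip(*p)]    # P(+j), classes

    sum_row_sq = sum(r * r for r in row_marg)
    sum_col_sq = sum(c * c for c in col_marg)

    denom = 2.0 - sum_row_sq - sum_col_sq
    if denom == 0.0:
        # Only one feature value and one class occur: nothing to predict.
        return 0.0

    # Sum of P(ij)^2 / P(+j) and of P(ij)^2 / P(i+), skipping empty margins.
    by_col = sum(
        p[i][j] ** 2 / col_marg[j]
        for i in range(len(p))
        for j in range(len(col_marg))
        if col_marg[j] > 0
    )
    by_row = sum(
        p[i][j] ** 2 / row_marg[i]
        for i in range(len(p))
        for j in range(len(col_marg))
        if row_marg[i] > 0
    )
    return (by_col + by_row - sum_row_sq - sum_col_sq) / denom


if __name__ == "__main__":
    print(symmetrical_tau([[5, 0], [0, 5]]))  # feature determines class: 1.0
    print(symmetrical_tau([[5, 5], [5, 5]]))  # feature independent of class: 0.0
```

Under this assumed form, a tree builder would evaluate each candidate feature's table at a node and split on the feature with the largest value; a value of 0 means the feature gives no reduction in prediction error in either direction.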