Data preprocessing is an important data manipulation step that precedes mining. Various techniques, including feature selection and data transformation, have been studied in the past with the aim of producing a compact and efficient decision tree. Each has its respective strengths, but in general they fail to preserve the meanings of the attributes. The concept of Attribute Value Taxonomies (AVT) was originally proposed by Honavar in 2003: an AVT is the value set of a particular attribute, specified at different levels of precision and representable as a tree structure. AVT has the advantage of allowing attributes to be understood naturally and easily across a hierarchy of resolutions. In this paper, we extend the concept of AVT to the domain of data preprocessing, building decision trees from attributes that are abstracted at different levels. The result is a series of decision trees, each built for a specific level of abstraction. We also implemented a visualization tool that shows both the significance of the attributes and the predictive power of each tree. A live dataset of eBay transactions was used as a case study. The experimental results indicate that, by applying appropriate abstraction and aggregation techniques, the decision tree can be made simpler and its accuracy can be improved. The resultant trees can be mapped back to the AVT for easy interpretation by humans.
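To make the abstraction step concrete, the following is a minimal sketch of how attribute values might be rolled up to a chosen level of an AVT before a tree is trained. It is an illustration under stated assumptions, not the implementation used in the paper: the taxonomy in `PARENT`, the "category" attribute, and the helper names `path_from_root`, `abstract`, and `abstract_column` are all hypothetical.

```python
# Hypothetical AVT for an illustrative "category" attribute, stored as
# child -> parent links. The root ("any") has no parent entry.
PARENT = {
    "laptop": "computers",
    "tablet": "computers",
    "novel": "books",
    "textbook": "books",
    "computers": "any",
    "books": "any",
}


def path_from_root(value, parent=PARENT):
    """Return the chain of values from the AVT root down to `value`."""
    path = [value]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def abstract(value, level, parent=PARENT):
    """Map `value` to its ancestor at depth `level` (0 = root).

    A value that already sits at or above that depth is returned unchanged.
    """
    path = path_from_root(value, parent)
    return path[min(level, len(path) - 1)]


def abstract_column(values, level, parent=PARENT):
    """Abstract every value of one attribute column to the same AVT level."""
    return [abstract(v, level, parent) for v in values]


if __name__ == "__main__":
    column = ["laptop", "novel", "tablet", "textbook"]
    # One abstracted copy of the column per AVT level, coarsest first.
    for level in range(3):
        print(level, abstract_column(column, level))
```

Under this sketch, each abstracted copy of the dataset would then be passed to a decision tree learner, giving one tree per abstraction level, and a split on an abstracted value could be traced back to its position in the taxonomy.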