China's rapid urbanization is producing significant changes in urban land use. To obtain land cover information quickly and accurately, many data mining methods have been applied to the classification of remote sensing images. In recent years, decision trees (DTs) have been used increasingly to classify remotely sensed data, because the algorithm runs fast and makes no statistical assumptions. In remotely sensed data, however, the class boundaries of topographic features are often not parallel to the axes of the feature space. A univariate DT, which tests a single feature at each node and splits the instance space with boundaries parallel to the feature axes, therefore tends to produce a large tree that generalizes poorly to unobserved instances. To address this defect, this paper combines a principal component analysis-based approach with a “best-first” method, which is superior to the depth-first method used by standard DT learners, to construct a multivariate DT in which each node test can be based on one or more of the input features. To construct a good multivariate DT, the following issues are considered: calculating features for classification, determining the best feature space dimension, and avoiding overfitting the training data. Separate training and test data sets from multispectral Landsat TM imagery are used to evaluate the performance of univariate and multivariate DTs for land cover classification. The evaluation factors considered are the training data set size, the size of the final tree built by each DT algorithm, and the classification accuracy of each algorithm. Compared with C4.5, a univariate DT algorithm, the experiments confirm that the multivariate DT builds a more compact tree and generally improves classification accuracy over the univariate tree.
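The contrast between axis-parallel and multivariate node tests can be illustrated with a minimal sketch. This is not the algorithm evaluated in the paper: it uses a small synthetic two-band data set rather than Landsat TM pixels, it evaluates only a single node test, and it omits best-first expansion and pruning. The function names `best_split_accuracy` and `principal_axes_2d` are hypothetical helpers introduced for illustration only.

```python
import math


def best_split_accuracy(values, labels):
    """Best accuracy of one threshold test `value <= s` on a single feature."""
    n = len(values)
    cuts = sorted(set(values))
    best = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        s = (lo + hi) / 2.0  # candidate threshold between adjacent values
        left = [y for v, y in zip(values, labels) if v <= s]
        right = [y for v, y in zip(values, labels) if v > s]
        # Each side predicts its majority class (two classes: 0 and 1).
        correct = (max(left.count(0), left.count(1))
                   + max(right.count(0), right.count(1)))
        best = max(best, correct / n)
    return best


def principal_axes_2d(points):
    """Mean and unit principal axes of 2-D points (closed-form 2x2 case)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the first principal axis of a symmetric 2x2 covariance.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    first = (math.cos(theta), math.sin(theta))
    second = (-math.sin(theta), math.cos(theta))
    return (mx, my), [first, second]


# Synthetic data (an assumption for illustration): two classes lying on
# either side of the diagonal x1 = x2, so the class boundary is oblique.
points, labels = [], []
for t in range(-5, 6):
    points.append((t - 0.5, t + 0.5))
    labels.append(0)
    points.append((t + 0.5, t - 0.5))
    labels.append(1)

# Univariate node test: best threshold on any single original feature.
uni_acc = max(
    best_split_accuracy([p[k] for p in points], labels) for k in range(2)
)

# Multivariate node test: best threshold on any principal-component score,
# i.e. on a linear combination of the original features.
(mean_x, mean_y), axes = principal_axes_2d(points)
multi_acc = max(
    best_split_accuracy(
        [(p[0] - mean_x) * a[0] + (p[1] - mean_y) * a[1] for p in points],
        labels,
    )
    for a in axes
)

print("univariate split accuracy:  ", round(uni_acc, 3))
print("multivariate split accuracy:", round(multi_acc, 3))
```

On this toy data a single threshold on a principal-component score separates the two classes, whereas no single axis-parallel threshold does. A univariate tree would need many further splits to approximate the same oblique boundary, which is the tree-size effect the abstract describes.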