Most classification studies use all of the available object data. It is desirable, however, to classify objects using only a subset of the total data. A rough-set-based reduct is a minimal subset of features that has almost the same discernibility power as the entire set of conditional features. Here, we propose a greedy algorithm that computes a set of rough set reducts, which are then used with the k-nearest neighbor (kNN) method to classify documents. To improve classification performance, reducts-kNN with confidence was developed. The proposed reduct-based methods are compared with classification by AdaBoost and SVM (Support Vector Machine). Experiments were conducted on benchmark datasets from the Reuters-21578 collection.
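As an illustration only, the general idea of a greedy reduct followed by kNN can be sketched in Python. This is not the algorithm proposed in the paper, whose details are not given here. It assumes a common textbook heuristic: repeatedly pick the feature that discerns the most still-undiscerned pairs of differently labelled objects. The names `discerned_pairs`, `greedy_reduct`, `knn_predict`, the `excluded` parameter, and the toy term-occurrence data are all hypothetical. A greedy choice yields an approximate reduct and is not guaranteed to be minimal.

```python
from collections import Counter
from itertools import combinations


def discerned_pairs(rows, labels, feature):
    """Pairs of differently labelled objects that `feature` tells apart."""
    return {
        (i, j)
        for i, j in combinations(range(len(rows)), 2)
        if labels[i] != labels[j] and rows[i][feature] != rows[j][feature]
    }


def greedy_reduct(rows, labels, excluded=frozenset()):
    """Greedily pick features until every discernible pair is covered.

    `excluded` is a hypothetical knob: banning features already used
    is one simple way to obtain several different reducts.
    """
    candidates = [f for f in range(len(rows[0])) if f not in excluded]
    per_feature = {f: discerned_pairs(rows, labels, f) for f in candidates}
    remaining = set().union(*per_feature.values())
    reduct = []
    while remaining:
        # Feature covering the most pairs that are still undiscerned.
        best = max(candidates, key=lambda f: len(per_feature[f] & remaining))
        reduct.append(best)
        remaining -= per_feature[best]
    return reduct


def knn_predict(rows, labels, query, features, k=3):
    """Majority vote among the k nearest rows, using only `features`."""
    def distance(row):
        # Hamming distance restricted to the reduct's features.
        return sum(row[f] != query[f] for f in features)

    nearest = sorted(range(len(rows)), key=lambda i: distance(rows[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]


# Toy data (assumed): each row marks whether four terms occur in a document.
rows = [(1, 0, 1, 0), (1, 1, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1)]
labels = ["grain", "grain", "trade", "trade"]

first = greedy_reduct(rows, labels)
second = greedy_reduct(rows, labels, excluded=frozenset(first))
prediction = knn_predict(rows, labels, (1, 1, 1, 1), first, k=3)
```

On this toy data a single feature is enough to separate the two classes, so each reduct contains one feature. Classifying with several reducts and combining their votes would be one way to approach the "set of reducts" idea, though how the paper combines them is not stated here.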
Date of Conference: 9-11 June 2010