Abstract:
The k-nearest neighbor (KNN) algorithm is widely used for classification in data mining. One of the problems faced by the KNN approach is how to determine an appropriate value of k. A single common value of k is usually not optimal for all instances, especially when the instances differ greatly from one another. In this paper, we adopt a proposed training method (PTM) to select the optimal local k value for every instance according to its distribution characteristics, and apply it to class-imbalanced data sets. Then, since neighbors at different distances from an instance affect it differently, we assign different weights to the neighbors, an approach known as weighted k-nearest neighbor (WKNN), and classify each test instance by weighted voting. The proposed PTM-WKNN method combines the advantages of past methods and aims to improve the classification performance on imbalanced data. In addition, we conduct experiments on class-imbalanced data sets from the University of California at Irvine (UCI) machine learning repository, using Recall, G-mean, and F-score as evaluation metrics. The experimental results show that the proposed method achieves better performance on class-imbalanced data sets.
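For illustration, the sketch below shows the distance-weighted voting step described above. The inverse-distance weights and Euclidean metric are assumptions made for the example; the abstract does not specify the paper's exact weighting scheme or how PTM selects each instance's local k.

```python
# Minimal sketch of distance-weighted kNN voting (WKNN).
# Assumptions: Euclidean distance and inverse-distance weights; the paper's
# actual weighting function and per-instance k selection (PTM) may differ.
import numpy as np

def wknn_predict(X_train, y_train, x_test, k):
    # Distances from the test instance to every training instance
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nn_idx = np.argsort(dists)[:k]            # indices of the k nearest neighbors
    weights = 1.0 / (dists[nn_idx] + 1e-8)    # closer neighbors receive larger weights
    # Weighted vote: accumulate weights per class, return the class with the largest total
    votes = {}
    for idx, w in zip(nn_idx, weights):
        label = y_train[idx]
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

# Example usage with toy data
X = np.array([[0.0, 0.0], [0.1, 0.2], [3.0, 3.1], [2.9, 3.0]])
y = np.array([0, 0, 1, 1])
print(wknn_predict(X, y, np.array([2.8, 2.9]), k=3))  # -> 1
```

In this sketch, weighting by inverse distance lets nearby minority-class neighbors outweigh more numerous but more distant majority-class neighbors, which is why weighted voting is relevant to class-imbalanced data.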
Date of Conference: 07-10 December 2018
Date Added to IEEE Xplore: 01 August 2019