Abstract:
Imbalanced datasets are pervasive and arise in many real-world applications, such as medical diagnosis and software fault detection. Because common classifiers assume a balanced distribution of examples in the data, learning from imbalanced datasets presents its own challenges. Sampling techniques play an essential role in aiding classifiers that learn from imbalanced datasets, as these techniques return a more balanced version of the dataset. Given the number of sampling techniques currently available, selecting a technique together with a set of values for its hyper-parameters is a time-consuming task. In this work, we treat this selection problem as a many-objective optimization problem. An evolutionary algorithm was applied to select sampling algorithms and their hyper-parameters for imbalanced datasets, considering multiple performance criteria. In the experiments, we compared the proposed method against the brute-force, default (all sampling algorithms with their default hyper-parameter values), and random search approaches. The experiments revealed that the proposal reached results comparable to those achieved by the brute-force approach, outperformed the techniques with their default hyper-parameters most of the time, and surpassed the random search approach in the majority of the problems.
Published in: 2018 IEEE Congress on Evolutionary Computation (CEC)
Date of Conference: 08-13 July 2018
Date Added to IEEE Xplore: 04 October 2018