Abstract:
Undersampling is a widely adopted method for dealing with imbalanced pattern classification problems. Current methods mainly depend on either random resampling of the majority class or resampling at the decision boundary. Random undersampling fails to take informative samples in the data into account, while resampling at the decision boundary is sensitive to class overlap. Both techniques ignore the distribution information of the training dataset. In this paper, we propose a diversified sensitivity-based undersampling method. Samples of the majority class are clustered to capture the distribution information and enhance the diversity of the resampling. A stochastic sensitivity measure is applied to select samples from both the clusters of the majority class and the minority class. By iteratively clustering and sampling, a balanced set of samples yielding high classifier sensitivity is selected. The proposed method yields good generalization capability on 14 UCI datasets.
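The cluster-then-select idea in the abstract can be illustrated with a minimal single-pass sketch. It is not the paper's algorithm, and several pieces are assumptions made only for illustration: the function names (`kmeans`, `sensitivity`, `undersample`), plain Lloyd's k-means as the clustering step, a Monte Carlo estimate of the expected squared output change under small uniform input perturbations standing in for the stochastic sensitivity measure, the perturbation width `q`, and a fixed toy scoring function `f` in place of a trained classifier. The abstract's iterative re-clustering and the selection of minority samples are omitted; the minority class is simply kept whole.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means (an assumed clustering choice); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def sensitivity(f, x, q, rng, n_draws=50):
    """Monte Carlo estimate of E[(f(x + dx) - f(x))^2], dx uniform in [-q, q] per feature."""
    fx = f(x)
    total = 0.0
    for _ in range(n_draws):
        xp = [v + rng.uniform(-q, q) for v in x]
        total += (f(xp) - fx) ** 2
    return total / n_draws

def undersample(majority, minority, f, q=0.1, seed=0):
    """Pick at most len(minority) majority samples: the most sensitive one per cluster."""
    rng = random.Random(seed)
    k = min(len(minority), len(majority))
    labels = kmeans(majority, k, rng)
    picked = []
    for c in range(k):
        members = [p for p, l in zip(majority, labels) if l == c]
        if members:  # empty clusters contribute nothing
            picked.append(max(members, key=lambda p: sensitivity(f, p, q, rng)))
    return picked

if __name__ == "__main__":
    data_rng = random.Random(1)
    majority = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
    minority = [[data_rng.gauss(2, 0.5), data_rng.gauss(2, 0.5)] for _ in range(5)]
    # Hypothetical stand-in for a trained classifier's output.
    f = lambda x: 1 / (1 + math.exp(-(x[0] + x[1] - 2)))
    balanced_majority = undersample(majority, minority, f)
    print(len(balanced_majority), "majority samples kept alongside", len(minority), "minority samples")
```

Clustering first spreads the selected samples across the majority distribution, while the per-cluster sensitivity ranking favors points where the scoring function changes most under perturbation. Because clusters can become empty, the sketch may return fewer majority samples than there are minority samples.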
Published in: IEEE Transactions on Cybernetics ( Volume: 45, Issue: 11, November 2015)
- IEEE Keywords
- Index Terms
- Classification Problem
- Major Classes
- Classification Of Samples
- Distribution Information
- Decision Boundary
- Minority Class
- Classical Clustering
- Random Resampling
- Neural Network
- Support Vector Machine
- Cluster Sampling
- Multi-label
- Ensemble Method
- Complex Datasets
- Output Neurons
- Incremental Learning
- Imbalance Problem
- Imbalanced Datasets
- Hidden Neurons
- Imbalance Ratio
- Random Undersampling
- Synthetic Minority Oversampling Technique
- Random Oversampling
- Artificial Datasets
- Candidate Samples
- Blue Cross
- Independent Run
- Center Vector
- Classification Algorithms
- Square Root