Handling Imbalanced Datasets by Partially Guided Hybrid Sampling for Pattern Recognition | IEEE Conference Publication | IEEE Xplore

Abstract:

High class imbalance in real-world domains is a direct result of the rarity of interesting events, which leads to skewed datasets. Without dataset rebalancing, the learning algorithm encounters extremely few minority-class samples and therefore becomes biased towards the majority class in classification tasks. Hence, properly handling imbalanced datasets is a crucial issue in the pattern recognition domain. We employ bootstrapping with simultaneous oversampling of the minority class and undersampling of the majority class to build an ensemble of classifiers. Oversampling is partially guided by hidden patterns extracted from the minority class, which prevents its over-generalization and amplifies subtle, vital patterns. The proposed framework is evaluated on four highly imbalanced datasets using a series of classifiers, including support vector machine, logistic regression, nearest neighbor, and Gaussian process classifiers. Experimental results show that pattern classification performance on various tasks improves after rebalancing the datasets with the proposed framework.
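
The hybrid bootstrapping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify how oversampling is guided by minority-class patterns, so plain random oversampling with replacement is used here as a stand-in, and a toy nearest-centroid learner stands in for the classifiers named above. All function names and parameters (`hybrid_bootstrap`, `fit_ensemble`, `n_per_class`, and so on) are hypothetical.

```python
import random
from collections import Counter


def hybrid_bootstrap(X, y, minority_label, n_per_class, rng):
    """Draw one roughly balanced bootstrap sample (assumed scheme, not the paper's)."""
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    # Oversample the minority class: draw indices with replacement.
    over = [rng.choice(minority) for _ in range(n_per_class)]
    # Undersample the majority class: draw indices without replacement.
    under = rng.sample(majority, min(n_per_class, len(majority)))
    idx = over + under
    rng.shuffle(idx)
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Toy base learner: store one mean vector per class."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroids(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist2(model[label]))


def fit_ensemble(X, y, minority_label, n_models=11, n_per_class=20, seed=0):
    """Train one base learner per rebalanced bootstrap sample."""
    rng = random.Random(seed)
    return [
        fit_centroids(*hybrid_bootstrap(X, y, minority_label, n_per_class, rng))
        for _ in range(n_models)
    ]


def predict_ensemble(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Each ensemble member sees a different rebalanced sample, so the majority class is covered across members even though every individual sample discards most of it. Majority voting is only one possible way to combine the members.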
Date of Conference: 24-28 August 2014
Date Added to IEEE Xplore: 06 December 2014
Electronic ISBN: 978-1-4799-5209-0
Print ISSN: 1051-4651
Conference Location: Stockholm, Sweden
