CLUS: A new hybrid sampling classification for imbalanced data


Abstract:

This paper proposes CLUS (CLUSter-based hybrid sampling), a new hybrid sampling approach for improving classifier performance on two-class imbalanced datasets. The objective of this research is to develop an algorithm that can effectively classify two-class imbalanced datasets that have complicated distributions and large overlap between classes, problems that can cause learners to fail in classification. The contribution of CLUS is therefore to alleviate the large overlap between classes and to balance the class distribution. First, all instances are partitioned into k clusters using the k-means algorithm. Next, CLUS creates new subsets, each consisting of instances from different classes that have different characteristics. Then, an oversampling method is applied to each subset. Finally, SVMs are used to classify each training set, with the final decision based on majority vote. CLUS is tested on eight imbalanced benchmark datasets and assessed with two metrics, F-measure and AUC. The experimental results show that CLUS outperforms other methods, especially when the imbalance ratio is high.
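
As an illustration of the two evaluation metrics named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names, the toy labels, and the scores are hypothetical. It assumes that F-measure means the balanced F1 score computed on the minority (positive) class, and it computes AUC by its pairwise-ranking definition.

```python
def f_measure(y_true, y_pred, positive=1):
    """F1 score for the positive class (assumed here to be the minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 minority (1) and 8 majority (0) instances.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print("accuracy :", accuracy)
print("F-measure:", f_measure(y_true, y_pred))
print("AUC      :", auc(y_true, scores))
```

On this toy data only two predictions are wrong, so plain accuracy looks high, while the F-measure reflects that one of the two minority instances was missed. That gap is a common reason to report F-measure and AUC rather than accuracy on imbalanced datasets.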
Date of Conference: 22-24 July 2015
Date Added to IEEE Xplore: 27 August 2015
Conference Location: Songkhla, Thailand
