An Investigation of SMOTE Based Methods for Imbalanced Datasets With Data Complexity Analysis


Abstract:

Many binary-class datasets in real-life applications are affected by the class imbalance problem. Data complexities such as noisy examples, class overlap, and small disjuncts are observed to play a key role in producing poor classification performance, and they tend to occur together with class imbalance. The Synthetic Minority Oversampling Technique (SMOTE) is a well-known method for re-balancing the number of examples in imbalanced datasets. However, SMOTE cannot effectively tackle these data complexities and can even magnify them, and its classification performance is still not satisfactory. Various SMOTE variants have therefore been proposed to overcome these downsides, either by combining SMOTE with other algorithms or by modifying the SMOTE algorithm itself. This paper comparatively reviews the algorithms applied in SMOTE variants and investigates which data complexities each variant addresses. A series of experiments is conducted on 24 binary-class imbalanced datasets to observe how the data complexity measures change after the SMOTE variants are applied to these datasets. Evaluation metrics such as G-Mean and F1-Score are also analyzed to investigate the differences in classification performance between the SMOTE variants.
Published in: IEEE Transactions on Knowledge and Data Engineering (Volume: 35, Issue: 7, 01 July 2023)
Page(s): 6651 - 6672
Date of Publication: 01 June 2022
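
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the basic SMOTE interpolation step and of the two metrics it names. It is not the paper's implementation or any of the reviewed variants. The function names `smote_sample` and `g_mean_and_f1`, the parameters `k`, `n_new`, and `seed`, and the toy data are all hypothetical, and G-Mean is assumed here to be the geometric mean of sensitivity and specificity, which is the usual definition in imbalanced learning but is not spelled out in the abstract.

```python
import math
import random


def smote_sample(minority, k=5, n_new=10, seed=0):
    """Sketch of basic SMOTE: each synthetic point lies on the segment
    between a minority example and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def g_mean_and_f1(y_true, y_pred, positive=1):
    """Return (G-Mean, F1) for binary labels, treating `positive` as the
    minority class. G-Mean = sqrt(sensitivity * specificity) is an assumption."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return g_mean, f1


if __name__ == "__main__":
    # Toy minority class with three 2-D points (illustrative data only).
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(smote_sample(minority, k=2, n_new=3))
    print(g_mean_and_f1([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because every synthetic point is interpolated between existing minority examples, noisy or overlapping minority points are propagated into the new data, which is one way to read the abstract's remark that SMOTE can magnify data complexities.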

