Newton’s Cooling Law-Based Oversampling Technique for Handling Imbalanced Data in Software Defect Prediction


The Graphical Abstract is divided into three layers to introduce the NCLWO algorithm for software defect prediction. The first layer is the motivation, The second layer i...

Abstract:

Class imbalance presents considerable challenges for software defect prediction (SDP). Software defect datasets also exhibit additional complex characteristics, of which class overlap is the most harmful to prediction performance, yet most current techniques focus only on balancing the data distribution. To address this, this study applies the previously proposed Newton’s Cooling Law-Based Oversampling Technique (NCLWO) to software defect prediction. The method first uses density and distance factors to identify hard-to-learn defect instances in SDP, such as boundary samples in overlapping distributions and small disjuncts. The same density and distance factors are then used to measure the initial heat of each defect instance. Newton’s Cooling Law is applied to expand the sampling area, a hypersphere around the instance, until thermal equilibrium is reached, and non-defective instances are moved out of the region to clean the overlapping area. Finally, weighted oversampling is performed within the expanded sampling area to generate defect instances, addressing within-class imbalance. Comparative experiments were conducted between NCLWO and ten state-of-the-art sampling methods on 30 software defect datasets. The experimental results indicate that the proposed method performs strongly and is robust and versatile in handling the joint effects of class overlap and imbalance.
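The abstract gives no formulas, so the following is only an illustrative sketch of two building blocks it mentions, not the NCLWO implementation. It shows the standard form of Newton's law of cooling, T(t) = T_env + (T_0 - T_env) * exp(-k t), and one possible way to do weighted oversampling inside hyperspheres. The function names, the weights, and the radii passed in are all hypothetical. How NCLWO derives heat, radius, and weights from the density and distance factors is not stated here and is not modeled.

```python
import numpy as np

def cooled_temperature(t0, t_env, k, t):
    """Newton's law of cooling: T(t) = T_env + (T0 - T_env) * exp(-k * t)."""
    return t_env + (t0 - t_env) * np.exp(-k * t)

def sample_in_hypersphere(center, radius, n, rng):
    """Draw n points uniformly inside a hypersphere around `center`."""
    d = center.shape[0]
    # Random directions: normalized Gaussian vectors are uniform on the unit sphere.
    directions = rng.normal(size=(n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling by u**(1/d) makes the points uniform in volume, not clustered at the center.
    radii = radius * rng.random(n) ** (1.0 / d)
    return center + directions * radii[:, None]

def weighted_oversample(minority, weights, radii, n_new, seed=0):
    """Generate n_new synthetic points; each minority instance contributes
    in proportion to its weight, sampling inside its own hypersphere.
    Assumption: weights and radii are supplied by the caller."""
    rng = np.random.default_rng(seed)
    probs = weights / weights.sum()
    counts = rng.multinomial(n_new, probs)
    parts = [
        sample_in_hypersphere(x, r, c, rng)
        for x, r, c in zip(minority, radii, counts)
        if c > 0
    ]
    return np.vstack(parts)
```

For example, calling `weighted_oversample` with two defect instances, weights `[1.0, 3.0]`, and radii `[0.5, 1.0]` would place roughly three quarters of the synthetic points around the second instance, which is one way to read "weighted oversampling" for within-class imbalance.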
Published in: IEEE Access ( Volume: 13)
Page(s): 47820 - 47832
Date of Publication: 12 March 2025
Electronic ISSN: 2169-3536
